Optimized Guidelines for Creating a Voice Sample for Audio

levideo

  • Posts: 89
  • Joined: Thu Oct 31, 2019 1:56 pm
  • Location: milan italy
  • Real Name: marco levi

Posted Mon May 12, 2025 9:10 am

Optimized Guidelines for Creating a Voice Sample for Audio Learning in DaVinci Resolve Studio 20

Hello everyone,
Unfortunately, I could not find any guidelines, either on the Blackmagic forum or elsewhere on the web, on how best to prepare a voice sample so that the software can learn and profile a voice, as is now possible with the new AI-based function in DaVinci Resolve Studio 20. I've studied this process by prompting AI assistants for suggestions and by consulting audio and computer experts, combining what each of them told me.
From what can be determined, the quality of the sample must be high, both in terms of the recording quality and its content.
The reading of the text attached here lasted 26 minutes, during which I tried to articulate the words as clearly as possible to allow DaVinci to produce the best possible result.
For the recording, I used a Zoom H4 professional audio recorder with a directional microphone set to a 90-degree recording angle, at a distance of about 30 cm, taking care not to clip the recording by constantly watching the peak LEDs.
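The peak check described above can also be repeated on the finished file. The following is a minimal sketch, assuming the recording is plain 16-bit or 24-bit PCM WAV that Python's standard `wave` module can open; the function name `peak_dbfs` is hypothetical and is not part of DaVinci Resolve or the Zoom recorder.

```python
import math
import wave


def peak_dbfs(path: str) -> float:
    """Return the peak sample level of a PCM WAV file in dBFS (0.0 = full scale)."""
    with wave.open(path, "rb") as wav:
        width = wav.getsampwidth()  # bytes per sample: 2 = 16-bit, 3 = 24-bit
        if width not in (2, 3):
            raise ValueError("this sketch handles 16-bit and 24-bit PCM only")
        full_scale = 2 ** (8 * width - 1)  # 32768 or 8388608
        peak = 0
        while True:
            chunk = wav.readframes(65536)
            if not chunk:
                break
            # Samples are little-endian signed integers, interleaved by channel.
            for i in range(0, len(chunk), width):
                value = int.from_bytes(chunk[i:i + width], "little", signed=True)
                peak = max(peak, abs(value))
    if peak == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(peak / full_scale)
```

A result at or very near 0.0 dBFS would suggest the recording touched full scale at least once. The loop is pure Python, so it is slow on a long 96 kHz file; trying it on a short excerpt first is probably more practical.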
The recording was made at the highest possible quality in PCM WAV format at 96 kHz/24-bit, resulting in a 433 MB file. From what I understand, the audio sample should be at least ten minutes long.
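As a sanity check on those figures, the size of uncompressed PCM audio follows directly from sample rate, bit depth, channel count, and duration. The sketch below is only arithmetic; it ignores the small WAV header and assumes a duration of exactly 26 minutes, which the post gives only as a rounded figure. The channel count is not stated in the post, so both cases are shown.

```python
def pcm_size_bytes(sample_rate_hz: int, bit_depth: int, channels: int, seconds: int) -> int:
    """Size of raw PCM audio: samples per second x bytes per sample x channels x seconds."""
    return sample_rate_hz * (bit_depth // 8) * channels * seconds


seconds = 26 * 60  # assumed duration: exactly 26 minutes
mono = pcm_size_bytes(96_000, 24, 1, seconds)
stereo = pcm_size_bytes(96_000, 24, 2, seconds)

print(f"mono:   {mono / 1e6:.0f} MB ({mono / 2**20:.0f} MiB)")      # mono:   449 MB (428 MiB)
print(f"stereo: {stereo / 1e6:.0f} MB ({stereo / 2**20:.0f} MiB)")  # stereo: 899 MB (857 MiB)
```

The reported 433 MB is close to the mono figure, which suggests (but does not prove) that the sample was recorded as a single channel.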
Once the recording was complete, I imported the audio file of my voice into DaVinci Resolve Studio 20 Beta 3 and started the voice training, choosing the quality option over the speed option.
The PC I used is equipped as follows:

Motherboard: ASRock Steel Legend B850M
CPU: AMD Ryzen 7 7700 8-core
RAM: 32GB DDR5 with CL30 at 7000 MHz
GPU: RTX 4070 12GB VRAM MSI
Storage: WD 850X 2TB NVMe SSD

Processing the new voice took about fifteen minutes with an average CPU usage of around 25%, while the NVIDIA graphics card worked at 100%.
When the processing was finished, DaVinci displayed a notification that the voice had been sampled. I assume we need to refer to this message; I don't think there's any further rendering to wait for. I'd appreciate your input on this.
I then imported another audio clip into DaVinci's timeline and, from there, used the voice converter to replace its voice with mine. I must say the result seems good to me.
In your opinion, did I execute everything correctly? What do you think of the text I created to produce the best possible audio sample to profile?
I'm really very curious about your responses and hope to hear from the program's development team as well. I think the continuous exchange of information can be helpful to everyone.
Best regards,
Marco from Milan

;)

THE TEXT OF THE AUDIO SAMPLE STARTS HERE


Good morning everyone. My name is Marco and I am from Milan, and this is a recording made with a professional Zoom H4 digital audio recorder in wav format at 96kHz 24 bit, to create a complete sample of my voice. I will talk about various topics, using different tones, rhythms, and intonations to provide a representative sample of my natural way of expressing myself. I will try to include a wide range of sounds, words, and expressions of the Italian language.
Personal Introduction
Some information about me: I live in Italy and I am interested in different fields, from technology to art, from literature to cinema. I like exploring new ideas and I am always curious to discover new tools that can help me express my creativity. In my free time I love [insert your hobby], and when I can, I travel to discover new places and different cultures.
In recent years, I have developed a particular interest in new technologies and their creative applications. It's fascinating to see how artificial intelligence is changing our way of working and expressing ourselves. I believe these tools can amplify our creative abilities, allowing us to focus more on the most significant aspects of our work.
Numbers, Dates and Sequences
Let's count together: one, two, three, four, five, six, seven, eight, nine, ten. And now let's continue with: eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty. Let's continue with: twenty-one, twenty-two, twenty-three, twenty-four, twenty-five, twenty-six, twenty-seven, twenty-eight, twenty-nine, thirty.
Counting by tens: ten, twenty, thirty, forty, fifty, sixty, seventy, eighty, ninety, one hundred. And larger numbers: one hundred and one, two hundred and fifty, three hundred and twenty-seven, four hundred and eighty-nine, five hundred, one thousand two hundred, two thousand three hundred and forty-five, one million, two billion.
Today is [insert complete date]. It is [insert exact time]. The temperature is [insert temperature] degrees. In a week it will be [insert future date] and in a month we will be in [insert next month].
Questions and Exclamations
Have you ever wondered what it would be like to live in another century? Which historical period fascinates you the most? Would you prefer to visit ancient Egypt or the Roman Empire? What do you think of new technologies? Do you believe they will radically change our way of living in the next ten years?
Have you ever climbed a mountain? How much time do you dedicate to reading each day? What is your favorite film of all time? If you could meet a historical character, who would you choose and why?
How wonderful! I can't believe my eyes! Incredible how time flies! Fantastic! Exceptional! What an incredible surprise! I would never have imagined such a thing! What a spectacle nature is in spring! Look at that breathtaking sunset!
Descriptions of Landscapes and Places
Italy is an extraordinary country, a true open-air museum. From north to south, from the coast to the mountains, every corner tells a thousand-year-old story.
The Alps dominate the northern panorama with their majestic snow-capped peaks. Mont Blanc, at 4,810 meters, is the highest mountain in Italy and Western Europe. Its perennial glaciers reflect the sunlight, creating dazzling plays of light. In the Alpine valleys, you can find characteristic villages with wooden and stone houses, bell towers that soar towards the sky, and very green meadows dotted with colorful flowers in spring.
The pre-Alpine lakes such as Lake Como, Lake Maggiore, and Lake Garda are jewels set among the mountains. Their crystal-clear waters reflect the surrounding landscape, while elegant villas surrounded by lush gardens stand on the shores. The mild climate favors the growth of Mediterranean plants, creating a fascinating contrast with the snow-capped mountains in the background.
The Po Valley, crossed by the Po River and its tributaries, is one of the most fertile areas in Europe. Cultivated fields extend as far as the eye can see, interspersed with rows of poplars and irrigation canals. Cities rich in history and culture rise in this plain: Milan with its Gothic Cathedral, Bologna with its medieval towers, Parma with its opera house.
Venice is a unique city in the world, built on 118 islands connected by over 400 bridges. Its palaces seem to float on the water of the canals, while gondolas glide silently under the Rialto Bridge and the Bridge of Sighs. St. Mark's Basilica and the Doge's Palace testify to the splendor of the Most Serene Republic that dominated the Mediterranean for centuries.
The Apennines cross Italy like a backbone, from Liguria to Calabria. Among their valleys lie perfectly preserved medieval villages, where time seems to have stood still. Ancient castles stand on strategic hills, once defensive bastions, now silent witnesses of a glorious past.
Tuscany enchants with its landscape that seems to have come out of a Renaissance painting: rolling hills, orderly vineyards, silvery olive groves, and cypresses that stand out against the blue sky. Florence, cradle of the Renaissance, preserves masterpieces such as Michelangelo's David and Botticelli's Primavera. Siena with its shell-shaped Piazza del Campo and San Gimignano with its medieval towers complete this perfect picture.
Rome, the Eternal City, tells over 2,800 years of history through its monuments. The Colosseum, an imposing amphitheater where gladiatorial games took place, the Imperial Forums, political center of ancient Rome, the Trevi Fountain where tourists throw coins and make a wish, and St. Peter's Basilica, heart of Christianity. Every corner of the city hides a treasure to discover.
Naples lives in the shadow of Vesuvius, the volcano that in 79 AD buried Pompeii and Herculaneum under ash and lapilli, preserving them perfectly to this day. The Gulf of Naples, with the islands of Capri, Ischia, and Procida, offers panoramas of incomparable beauty. The Amalfi Coast, with its towns perched on cliffs overlooking the sea, is one of the most photographed landscapes in the world.
Puglia surprises with its trulli, typical dry stone constructions with a conical roof, white masserie (farmhouses) immersed in centuries-old olive groves, and fine sand beaches bathed by a crystal-clear sea. The historic centers of Lecce, with its lavish baroque style, and Alberobello, with its trulli, are UNESCO World Heritage sites.
Sicily, the largest island in the Mediterranean, is a crossroads of cultures: Greek, Roman, Byzantine, Arab, Norman. The temple of Agrigento, the Greek theater of Taormina with a view of Mount Etna, the Cathedral of Palermo with its golden mosaics, tell the millennial history of the island. And then there's Etna, the highest active volcano in Europe, which dominates the eastern landscape of the island.
Sardinia fascinates with its jagged coasts, turquoise waters, and ancient Nuragic constructions, mysterious stone towers dating back to the Bronze Age. The wild interior, with its plateaus and mountains, preserves ancient traditions and a language with pre-Latin roots.
Words with Special Sounds
The Italian language is rich in particular sounds. Here are some words that contain them:
Scienza, coscienza, conoscenza, uscita, scelta, pesce, piscina, usciere, scivolo, sciame. [Science, conscience, knowledge, exit, choice, fish, swimming pool, usher, slide, swarm]
Gnocchi, ragno, cognome, agnello, sogno, ignoto, bagno, legno, ingegnere, vigna. [Gnocchi, spider, surname, lamb, dream, unknown, bathroom, wood, engineer, vineyard]
Zaino, zanzara, zenzero, azzurro, pazienza, zucchero, zona, zoppo, zodiaco, zampa. [Backpack, mosquito, ginger, blue, patience, sugar, zone, lame, zodiac, paw]
Cielo, ciao, cioccolato, cespuglio, cipresso, ciuffo, ciambella, cimice, cinema, ciliegia. [Sky, hello, chocolate, bush, cypress, tuft, doughnut, bug, cinema, cherry]
Ghiaccio, ghepardo, ghirlanda, ghiotto, margherita, funghi, ghiro, ghigno, ghetta, spaghetti. [Ice, cheetah, garland, greedy, daisy, mushrooms, dormouse, sneer, gaiter, spaghetti]
Quadro, quanto, quercia, quindici, liquido, acqua, quaderno, acquario, acquisto, squadra. [Picture, how much, oak, fifteen, liquid, water, notebook, aquarium, purchase, team]
Figlio, foglia, famiglia, migliore, aglio, biglietto, coniglio, giglio, maglia, bottiglia. [Son, leaf, family, better, garlic, ticket, rabbit, lily, sweater, bottle]
Pizza, pozzo, pazzo, ragazzo, pezzo, prezzo, spazzolino, razza, puzza, spruzzare. [Pizza, well, crazy, boy, piece, price, toothbrush, race, stink, spray]
Taxi, extra, xilofono, box, mix, index, texture, Excel, xenofobo, xerografia. [Taxi, extra, xylophone, box, mix, index, texture, Excel, xenophobe, xerography]
Emotions and States of Mind
Happiness is an emotion that illuminates life and makes us feel in harmony with the world. When I am happy, I feel light and full of positive energy. Everything seems brighter and difficulties appear surmountable.
Sadness, on the contrary, is like a veil that obscures reality. Colors seem faded, time flows more slowly, and even simple things require greater effort. Yet, sadness is part of life and helps us appreciate moments of joy more.
Anger is a powerful emotion that can be destructive if not managed correctly, but it can also be a driving force for change. When I am angry, I feel my heart beating faster, muscles tense, and the mind intensely focuses on the reason for my discomfort.
Fear is a defense mechanism that protects us from dangers. It can manifest as a shiver down the spine, a knot in the stomach, labored breathing. Fear can paralyze us, but it can also push us to react promptly in emergency situations.
Surprise is a brief but intense emotion. An unexpected event can leave us open-mouthed, with wide eyes and suspended breath. Surprise can be pleasant or unpleasant, but it is always a moment of intense emotional involvement.
Nostalgia is a bittersweet feeling, a mix of joy for good memories and melancholy for what will never return. I experience nostalgia when I hear a song linked to a particular moment in my life, or when I find an object that takes me back to childhood.
Enthusiasm is a driving force that pushes us to give our best. When I am enthusiastic about a project, time flies and difficulties become stimulating challenges. Enthusiasm is contagious and can inspire the people around us too.
Literature and Quotations
Italian literature boasts world-famous authors such as Dante Alighieri, Francesco Petrarca, Giovanni Boccaccio, Niccolò Machiavelli, Ludovico Ariosto, Torquato Tasso, Alessandro Manzoni, Giacomo Leopardi, Luigi Pirandello, Italo Calvino, Umberto Eco, and many others.
"In the middle of the journey of our life I found myself in a dark forest, for the straight way was lost." This is how Dante's Divine Comedy begins, the epic poem that marked the birth of the modern Italian language.
"Always dear to me was this solitary hill, and this hedge, which from so much of the ultimate horizon excludes the view." The verses of "The Infinite" by Giacomo Leopardi express the sense of infinity that the human mind can conceive even when faced with a limited space.
"This marriage is not to be done, neither tomorrow nor ever." It is one of the most famous phrases from "The Betrothed" by Alessandro Manzoni, spoken by Don Rodrigo's thugs to Don Abbondio.
"The Late Mattia Pascal" by Luigi Pirandello explores the theme of identity through the story of a man who has the opportunity to completely reinvent himself.
"If on a Winter's Night a Traveler" by Italo Calvino is an experimental novel that directly involves the reader in the narrative, challenging the traditional conventions of literature.
Technical and Specific Terms
Here are some technical terms from different fields:
Computing: algorithm, database, interface, programming, software, hardware, cloud computing, artificial intelligence, neural network, cryptography.
Medicine: diagnosis, prognosis, pathology, anamnesis, therapy, cardiology, neurology, pediatrics, surgery.
Economics: inflation, deflation, gross domestic product, stock market, interest rate, currency, investment, budget, taxation, recession.
Art: perspective, chiaroscuro, fresco, sculpture, engraving, oil painting, watercolor, composition, avant-garde, installation, photography.
Music: harmony, melody, rhythm, counterpoint, symphony, sonata, opera, pentatonic, diatonic, chromatic.
Common Phrases
Good morning, how are you today? I hope your day is going well. Sorry, could you repeat that? I didn't understand what you said. Thank you very much for your help, it was really valuable. I'm sorry, I can't attend tomorrow's event, I have another commitment. What time is it? I have an appointment soon. Could I have a glass of water, please? Where is the nearest bus stop? How much does this item cost? Is it available in other colors? What beautiful weather today! Perfect for a walk outdoors. Would you prefer to go to the cinema or the theater this evening? There are various interesting shows.
Seasons and Weather
In spring, nature awakens from winter hibernation. Trees put out their first tender green leaves, flowers bloom in meadows creating colored carpets, and migratory birds return to nest. The days gradually lengthen and the air fills with the scent of flowers. It is the season of rebirth and hope.
Summer brings with it intense heat, clear skies, and long sunny days. It's the time for vacations at the seaside, where the crystal-clear water invites you to dive in to find refreshment, or in the mountains, where the fresh and pure air fills your lungs. Summer thunderstorms arrive suddenly, with loud thunder and intense rain, but they pass quickly, leaving the air fresher and cleaner.
Autumn transforms the landscape with its warm colors: red, orange, yellow, brown. Leaves slowly fall from trees, forming crunchy carpets underfoot. The air becomes crisper, especially early in the morning and in the evening. It is the season of the grape harvest, of collecting chestnuts, of mushrooms in the woods. Morning mists envelop the landscapes in an almost magical atmosphere.
Winter brings cold, sometimes snow that covers everything with its silent white blanket. Breath condenses in the icy air, hands seek refuge in gloves, and noses turn red from the cold. The days are short, but the lights of the end-of-year festivities illuminate the long nights. It is a time of gathering, reflection, domestic warmth.
Travel and Cultures
Traveling is one of the most enriching experiences we can live. Every journey allows us to discover not only new places but also different ways of living, thinking, eating, relating to others.
In France, we can admire the Eiffel Tower, symbol of Paris, walk along the Seine, visit the Louvre with its masterpieces, but also lose ourselves in the alleys of Montmartre or in the villages of Provence with their lavender fields.
Spain welcomes us with its warm climate, its unique architecture such as the Sagrada Familia in Barcelona, its flamenco rhythms, tapas to be enjoyed in crowded bars, the beaches of the Costa del Sol, and the mountains of the Pyrenees.
Japan is a fascinating mix of tradition and modernity. In Tokyo, futuristic skyscrapers coexist with ancient temples, while in Kyoto we can admire Zen gardens, participate in a tea ceremony, or see geishas walking elegantly in the streets of Gion.
India is an assault on the senses: vivid colors of women's saris, the scent of spices in the markets, the sound of horns in chaotic traffic, intense flavors of local cuisine, the touch of silk and cotton in traditional fabrics. The Taj Mahal, the temples of Khajuraho, the backwaters of Kerala are just some of the wonders that this country offers.
The United States surprises with the variety of landscapes: from the skyscrapers of New York to the beaches of California, from the canyons of Arizona to the prairies of the Midwest, from the forests of the Pacific Northwest to the bayous of Louisiana. Each state has its own culture, its own cuisine, its own accent.
Last edited by levideo on Tue May 13, 2025 6:41 am, edited 2 times in total.

levideo

Re: Optimized Guide for Creating a Voice Sample for Audio

Posted Tue May 13, 2025 6:38 am

No one has answered me...
But could I at least know how you created the sample that you gave the program to learn from?
This exchange of information could be really useful; sometimes it takes very little to improve the results.
What do you say, what do you read, and for how long?
It would also be good to know your recording technique, including the hardware and software you use.
A guideline from Blackmagic on the matter would also be welcome and useful.
What do you think?
Regards and thanks,
Marco
Last edited by levideo on Fri May 16, 2025 5:49 am, edited 1 time in total.

CougerJoe

  • Posts: 607
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: Optimized Guidelines for Creating a Voice Sample for Aud

Posted Tue May 13, 2025 8:04 am

My current opinion is that this doesn't work well enough to use, and you should stop putting so much effort into it. Instead, grab some audiobooks or podcasts that you believe have been well produced with good audio, edit out everything but the voice you want to test, and try 5, 10, 20, and 40 minutes to see if you can gauge a sweet spot where quality no longer appears to improve. If 40 minutes sounds best, try 80 minutes.
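The length test suggested above could be prepared with a short script. This is a minimal sketch, assuming the cleaned voice track is a PCM WAV file readable by Python's standard `wave` module; the function names and the `_5min.wav`-style output naming are hypothetical choices for illustration.

```python
import os
import wave


def write_excerpt(src_path: str, dst_path: str, seconds: float) -> float:
    """Copy the first `seconds` of a PCM WAV file; return the seconds actually written."""
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        # Never ask for more frames than the source contains.
        frames_wanted = min(int(seconds * rate), src.getnframes())
        with wave.open(dst_path, "wb") as dst:
            dst.setnchannels(src.getnchannels())
            dst.setsampwidth(src.getsampwidth())
            dst.setframerate(rate)
            remaining = frames_wanted
            while remaining > 0:
                count = min(remaining, 65536)
                chunk = src.readframes(count)
                if not chunk:
                    break
                dst.writeframes(chunk)
                remaining -= count
    return frames_wanted / rate


def make_test_set(src_path: str, minutes=(5, 10, 20, 40)) -> list[str]:
    """Write one excerpt per requested length next to the source file."""
    base, _ = os.path.splitext(src_path)
    outputs = []
    for m in minutes:
        dst_path = f"{base}_{m}min.wav"
        write_excerpt(src_path, dst_path, m * 60)
        outputs.append(dst_path)
    return outputs
```

Calling `make_test_set("voice.wav")` on a hypothetical source file would write `voice_5min.wav` through `voice_40min.wav`, each of which can then be trained separately and compared by ear. If the source is shorter than a requested length, that excerpt simply contains the whole file.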

Also, you might want to look into Tortoise TTS tutorials and other voice training software. Often, after the training, they have another procedure, which I think they call upscaling or similar; it's a way to improve the quality a little more. There might be OFX plugins for Resolve that could do the same. As for Tortoise TTS, what I recall is that the CPU version did an OK job but took forever. There was also a GPU version, but unfortunately it wasn't as good, since it compromised quality for speed.

The AI voices used for many YouTube and TikTok narrations sound heaps better than Tortoise TTS or Resolve's AI voice. Often you only know it's an AI voice because you've heard that voice before, or because it repeats certain words or turns of phrase in such an identical way that it becomes obvious it's AI.

levideo

Re: Optimized Guidelines for Creating a Voice Sample for Aud

Posted Fri May 16, 2025 5:47 am

Thank you for the valuable suggestions. Despite having given the program a sample of my voice at the best possible quality, I noticed that the voice produced by the voice converter does not sound like mine. It is only distantly similar, like the voice of a virtual person rather than a real one, although it is certainly free to use as far as copyright is concerned.
In my opinion, this learning aspect of the program needs to be greatly improved.
It would be nice to understand the correct way to perform the best possible sampling.
It would also be nice if other users reading our posts shared their experiences, including where they get their voices from, which sites, and so on.
Sharing our experiences helps all of us to develop ideas and synergies.
Let me know, and thanks.
Marco

;)

CougerJoe

Re: Optimized Guidelines for Creating a Voice Sample for Aud

Posted Fri May 16, 2025 6:50 am

levideo wrote:I noticed that the voice produced by the voice converter does not sound like mine. It is only distantly similar, like the voice of a virtual person rather than a real one, although it is certainly free to use as far as copyright is concerned.
In my opinion, this learning aspect of the program needs to be greatly improved.
It would be nice to understand the correct way to perform the best possible sampling.

;)


Yeah, as a voice cloner it's not very good (currently), and in its role in re-voicing I am noticing the results often sound like a hearing-impaired person. I think it's to do with the AI not perfectly hearing the voice to be replaced: it can miss syllables, and the re-voice sounds like a deaf person. No offence to people with hearing problems; I just didn't know how else to describe the problem.
