Accidentally Training Tortoise TTS on Crappy Audio Data

  • Published 6 Sep 2024
  • Hardware for my PC:
    Graphics Card - amzn.to/3pcREux
    CPU - amzn.to/43O66Ir
    Cooler - amzn.to/3p98TwX
    RAM - amzn.to/3NBAsIq
    SSD Storage - amzn.to/42NgMFR
    Power Supply (PSU) - amzn.to/430bIhy
    PC Case - amzn.to/447499T
    Motherboard - amzn.to/3CziMXI
    Alternative prebuilds to my PC:
    Corsair Vengeance i7400 - amzn.to/3p64r22
    MSI MPG Velox - amzn.to/42MnJHl
    Cheapest recommended PC:
    Cyberpower 3060 - amzn.to/3XjtZoP
    Come join The Learning Journey!
    Discord - / discord
    Github - github.com/Jar...
    TikTok - / jarodsjourney
    If you found anything helpful, please consider supporting me and the content I am trying to produce!
    www.buymeacoff...

Comments • 18

  • @GraveUypo • 6 months ago +5

    the famous "garbage in, garbage out" saying rearing its ugly head

  • @dougmaisner • 6 months ago +5

    happens to the best of us

  • @nbase2652 • 5 months ago +1

    Have you thought about using IRs to bring back some depth when the dataset is too sterile or thin-sounding after all that UVR stuff? Still won't turn garbage into gold, but if high quality audio just isn't available for whatever reason, you can at least make it a bit better, and even smooth out those dead cutoffs a bit.
    (Impulse Responses are small .wav files that basically capture the characteristics of a recording environment. They're usually used for "convolution reverb", but their uses go way beyond just reverb: an IR can be the frequency response of a certain microphone, how a guitar amp cabinet sounds when recorded from the back, and so on. I recall using a short damp hit on the body of an acoustic guitar to fatten up thin vocals, for example.)
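    A minimal sketch of that idea, applying an IR by convolution with scipy and soundfile (the file names are placeholders, and the wet/dry mix is just a starting point to tweak by ear):

        import numpy as np
        import soundfile as sf
        from scipy.signal import fftconvolve

        # Load the dry (post-UVR) clip and an impulse response at the same sample rate.
        dry, sr = sf.read("voice_clip.wav")
        ir, ir_sr = sf.read("room_ir.wav")
        assert sr == ir_sr, "resample the IR to match the clip first"
        if dry.ndim > 1: dry = dry.mean(axis=1)   # fold to mono for simplicity
        if ir.ndim > 1: ir = ir.mean(axis=1)

        # Convolve with the IR and trim back to the original length.
        wet = fftconvolve(dry, ir)[: len(dry)]

        # Blend a little of the wet signal back in and normalize to avoid clipping.
        mix = 0.85 * dry + 0.15 * wet
        mix /= max(1e-9, float(np.abs(mix).max()))
        sf.write("voice_clip_ir.wav", mix, sr)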

  • @RobAgrees • 5 months ago +1

    Hey Jarod! Been following your channel for a while since you always have the best AI voice content. I'm curious what you think is the highest-fidelity voice clone repo out there currently? Is it still mrq's Tortoise TTS fork?

  • @lo7o7xenpai76 • 3 months ago

    Sounds better than my first model. I just threw a bunch of audio into Colab and it was shit.

  • @shovonjamali7854 • 5 months ago

    How did you segment the audio for Vietnamese? That language isn't supported in whisperx, and I believe you're using whisperx for the segmentation here?

    • @Jarods_Journey • 5 months ago

      You can run whisperx with --no_align. Vietnamese just isn't supported well enough to have an alignment model, but Whisper itself supports the language.
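      A rough sketch of the same thing through the whisperx Python API (model name, device, and file name are assumptions; the point is simply to skip the alignment step, which is what --no_align does):

          import whisperx

          model = whisperx.load_model("large-v2", device="cuda", compute_type="float16")
          audio = whisperx.load_audio("vietnamese_clip.wav")
          result = model.transcribe(audio, language="vi")

          # No whisperx.load_align_model / whisperx.align call here; just use the
          # coarse segment timestamps Whisper itself produces to cut the dataset.
          for seg in result["segments"]:
              print(seg["start"], seg["end"], seg["text"])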

  • @StringerBell • 6 months ago

    Hey, Jarod. I've been failing miserably for months to train a Bulgarian voice model for Tortoise TTS. I have absolutely no issue training RVC models with great success, but for some reason my TTS models are borderline unusable, no matter what I try. My dataset consists of studio-quality voice recordings, so quality is not the issue.
    Is there any way to hire you for a consultation to help me out? Thanks!

  • @Zegur • 6 months ago

    I need some help. I trained a model today on an i9 9900K and an RTX 2070 Super. The training went fine, but actually using the text-to-speech just seems to take ages. I'm trying to do 7 sentences, have been waiting for about 2 hours, and I'm at 4/7, while you seem to get almost instant results.

    • @Jarods_Journey • 6 months ago

      Your samples # is probably too high; reduce that to 2. You may also have too many audio samples it's trying to make latents from: move those audio files to a backup folder, or create a new voice folder and place 2 small audio files there for inference.
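      For reference, a hedged sketch of the same settings with the plain tortoise-tts Python API rather than the web UI (the reference file names are placeholders):

          import torchaudio
          from tortoise.api import TextToSpeech
          from tortoise.utils.audio import load_audio

          tts = TextToSpeech()
          # Only a couple of short reference clips for the conditioning latents.
          voice_samples = [load_audio(p, 22050) for p in ["ref_1.wav", "ref_2.wav"]]

          gen = tts.tts(
              "This is a quick test sentence.",
              voice_samples=voice_samples,
              num_autoregressive_samples=2,   # the "samples" knob; large values are very slow
              diffusion_iterations=30,
          )
          torchaudio.save("out.wav", gen.squeeze(0).cpu(), 24000)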

    • @Zegur • 6 months ago

      @@Jarods_Journey Thanks, I completely missed this step. Hope it works now

  • @user-iv2sp2gl1z • 5 months ago

    Can your method be applied to Chinese?

    • @Jarods_Journey • 5 months ago

      Should be fine, but I'd recommend using Pinyin. The tokenizer isn't wide enough to accept all the Chinese characters.
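      A small sketch of that preprocessing step using the pypinyin package (the sample sentence is arbitrary); the idea is to romanize the transcripts so they stay inside the tokenizer's character set:

          from pypinyin import Style, lazy_pinyin

          text = "你好世界"
          # Tone-numbered Pinyin keeps tonal information in plain ASCII.
          romanized = " ".join(lazy_pinyin(text, style=Style.TONE3))
          print(romanized)  # -> "ni3 hao3 shi4 jie4"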

    • @user-iv2sp2gl1z • 5 months ago

      @@Jarods_Journey Thank you for your reply. A few days ago, I used about 900 hours of Chinese voice data to fine-tune the tortoise-tts model. However, the voice generated by the model ended up with a severe foreign accent, as if a foreigner were speaking Chinese. It's not authentic enough. What could be the reason for this?

    • @user-iv2sp2gl1z • 4 months ago

      @@Jarods_Journey Can you show the mel/text loss on the validation set from when you trained Japanese tortoise-tts on the 840-hour speech corpus? When I train the model on a 720-hour Chinese speech corpus, I see a similar mel/text loss on the training set. However, when I added the counterpart to the validation set, the mel/text loss on the validation set didn't decrease but increased dramatically. Why? Did you observe a similar phenomenon?

  • @mrpokemon517 • 6 months ago

    Anime female girl voice

  • @mrpokemon517 • 6 months ago

    Generate website