Open Source Multimodal LLM for Speech - SpeechGPT

Jarods Journey

4,140 views

Added 6 Sep 2024
SpeechGPT - github.com/0nu...
Examples - 0nutation.gith...
Hardware for my PC:
Graphics Card - amzn.to/3pcREux
CPU - amzn.to/43O66Ir
Cooler - amzn.to/3p98TwX
RAM - amzn.to/3NBAsIq
SSD Storage - amzn.to/42NgMFR
Power Supply (PSU) - amzn.to/430bIhy
PC Case - amzn.to/447499T
Mother Board - amzn.to/3CziMXI
Alternative prebuilds to my PC:
Corsair Vengeance i7400 - amzn.to/3p64r22
MSI MPG Velox - amzn.to/42MnJHl
Cheapest recommended PC:
Cyberpower 3060 - amzn.to/3XjtZoP
Come join The Learning Journey!
Discord - / discord
Github - github.com/Jar...
TikTok - / jarodsjourney
If you found anything helpful, please consider supporting me and the content I am trying to produce!
www.buymeacoff...

Comments • 21

@bomar920 6 months ago ⁺⁶
We are eagerly waiting to see how you prepared the dataset in your previous video
@miladmohseni187 6 months ago ⁺¹
Thank you teacher for the excellent educational videos
Please tell me the name of the most powerful voice changing artificial intelligence that you have tested so far 🙏
@Amandeep-yq7ew 6 months ago ⁺¹
Can you make a video on the installation process?
@seifuishiguro 6 months ago
Hey Jarod. I've recently come across your channel and I learned quite a bit, I love this content. I would like to try out Tortoise + RVC sometime in the near future when I can afford a GPU.
At the moment I am trying out eleven labs, their v2 model is pretty dang good, but I can't get it to clone special voices like Luffy and Usopp from One Piece (dub), even with high quality recordings from sound resource. Usopp is somewhat close but Luffy is far off.
Anyway I'm very curious to see if Tortoise+RVC can do a better job. I saw your short where you compared the two models with Melina's voice, but that was quite a few months ago, any chance you can compare them again soon?
@thekinoreview1515 6 months ago
Jarod, have you seen/tried aero (slp-rl/aero on gh) at all? I am impressed with it for audio super resolution. It did a good job with dialogue from 12 kHz -> 48 kHz for me. I think it could be used in a TTS/voice conversion pipeline either as dataset pre-processing (to get 48k samples for RVC, which are hard to come by) or applied to the output, so you can train on lower quality data but end up with high quality results.
@Jarods_Journey 6 months ago
I haven't seen that one, but I believe voicefixer is an option on tortoise (not sure if it's active or deactivated) which is an "upscaler" for audio as well. I might have to check
@thekinoreview1515 6 months ago
@Jarods_Journey Thanks, I will check out voicefixer also.
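A minimal sketch of where an upsampling step would sit in the dataset pre-processing idea from this thread (12 kHz clips brought to 48 kHz before training). The function name `upsample_linear` is hypothetical, and plain linear interpolation is used only as a stand-in: it changes the sample rate but does not restore missing high frequencies the way a learned super-resolution model such as aero or voicefixer is meant to.

```python
def upsample_linear(samples, src_rate, dst_rate):
    """Naively resample a mono signal by linear interpolation.

    Stand-in for a learned audio super-resolution model (an assumption
    for illustration, not how aero or voicefixer work internally).
    """
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        # Position of output sample i on the input time axis.
        pos = i * src_rate / dst_rate
        left = int(pos)
        right = min(left + 1, len(samples) - 1)  # clamp at the last sample
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


# Two samples at 12 kHz become eight samples at 48 kHz.
clip_48k = upsample_linear([0.0, 1.0], 12000, 48000)
```

In an actual pipeline, the call to this function would be replaced by the chosen super-resolution tool, applied either to the training clips or to the generated output.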
@gotrixf3088 6 months ago
Hello bro, watching your videos, I was very interested in learning more about programming and understanding how to build systems like yours. Could you give a quick guide on where to start?
@Jarods_Journey 6 months ago
You can watch some beginner Python courses on YT, but I mainly developed my skills through doing projects. When ChatGPT came along, it sped up the process even more.
My suggestion is to browse around on YouTube and find relevant tutorials or guides on topics you're interested in, and then start tinkering with it yourself
@WorldLie 6 months ago
Jarod, can you please do a video on how to continue training Tortoise TTS if there's a blackout during the training? I would really appreciate it if you made a quick video on it.
@Jarods_Journey 6 months ago
You might wanna check the latest Japanese tortoise video, I show how to continue training in that one
@tylerchambliss8379 6 months ago
Hey Jarod, so I'm wondering what I'm doing wrong: my tortoise models are repeating, but the audio sounds fine as far as output goes. I manually split my dataset with my audio editor instead of letting Whisper do it, because it leaves breaths and noise at the end of my clips. While that has helped get rid of the artifacts and improve my audio, it's still skipping stuff, repeating, and making gibberish on occasion. I have text LR and mel ratios all the way up, learning rate at 0.01, and between 10 and 50 epochs depending on dataset length, 5 minutes to about an hour. My losses were around 0.6 something at the lowest and 1.3 something at the highest. Pause and repeat penalty are both set to 8 on inference.
@Jarods_Journey 6 months ago
I think your LR is a little too high, I would try with a lower LR, closer to 0.0001 or 0.00001, but repeating words and artifacts are a known issue on tortoise. Better datasets and longer training might help to mitigate this, but it really depends seemingly on the voice.
@tylerchambliss8379 6 months ago
@Jarods_Journey So now Tortoise is complaining about my text length being too long. It was only about 6000 characters and I've put 22000 some odd characters through it on the default autoregressive model and it went fine. What's going on here?
@Hury209 6 months ago
Does anyone know the easiest way to install Mistral LLM to chat with PDFs on Windows, possibly with a web UI?
@vinchenzovarela8039 6 months ago
I'm currently learning some languages, and it would be very interesting to see this type of model implemented in some sort of tutoring app. I'm not sure if it can differentiate languages in the same .wav file, though
@vinchenzovarela8039 6 months ago
You are awesome bro, keep up the good work. I'm using your tortoise tts repo for a project now
@Jarods_Journey 6 months ago
Not as of now, it seems to be English only. But maybe in the future, definitely
@yuyutsurao 6 months ago
How can I contact you?
@GraveUypo 6 months ago
That's cool but i need it to be better than this. Also it might sound better with an RVC pass 😬
