Updated AI Audiobook Maker Installation and Bug Fixes
- Added 6 Sep 2024
- Links referenced in the video:
Audiobook Maker - github.com/Jar...
AI Voice Cloning V3 - • AI Voice Cloning v3 Pa...
Get RVC Voices - • How to Get AI Voice Mo...
RVC Playlist - • RVC (Retrieval-based V...
Hardware for my PC:
Graphics Card - amzn.to/3pcREux
CPU - amzn.to/43O66Ir
Cooler - amzn.to/3p98TwX
RAM - amzn.to/3NBAsIq
SSD Storage - amzn.to/42NgMFR
Power Supply (PSU) - amzn.to/430bIhy
PC Case - amzn.to/447499T
Motherboard - amzn.to/3CziMXI
Alternative prebuilds to my PC:
Corsair Vengeance i7400 - amzn.to/3p64r22
MSI MPG Velox - amzn.to/42MnJHl
Cheapest recommended PC:
Cyberpower 3060 - amzn.to/3XjtZoP
Come join The Learning Journey!
Discord - / discord
Github - github.com/Jar...
TikTok - / jarodsjourney
If you found anything helpful, please consider supporting me and the content I am trying to produce!
www.buymeacoff...
This guy has single-handedly allowed me to work on projects and progress my life and hobbies in a way I NEVER would have been able to otherwise. Thank you, sir!
Great to know, but what is your hobby? What do you do?
Is there a way to download pretrained Tortoise TTS models so that we can just plug them in?
Is there a place to download them?
Love your work, Jarod ❤
Hiya! And thank you for your wonderful work!
Just wanted to say, I upgraded to this version and noticed an odd bug where the generated json will remove all periods except those next to a quotation mark. This causes about half the generated audio to hang at the end of a sentence or mispronounce the final word and wasn't an issue in the previous version. The solution so far is to just manually add back in the missing periods if the audio is bad and regenerate.
Odd! I'll take note of this!
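For anyone applying the manual workaround above, a minimal sketch of how the missing periods could be restored automatically before regenerating. The helper name `ensure_terminal_period` is hypothetical, and it assumes you can get at each sentence as a plain string; the layout of the generated JSON is not described in this thread, so loading and saving it is left out.

```python
import re

# A sentence counts as "closed" if it ends in punctuation,
# optionally followed by closing quotes or brackets.
_TERMINATED = re.compile(r'[.!?…:;,]["\'”’)\]]*$')


def ensure_terminal_period(sentence: str) -> str:
    """Append a period when a sentence has no closing punctuation (hypothetical helper)."""
    stripped = sentence.rstrip()
    if not stripped or _TERMINATED.search(stripped):
        return stripped
    return stripped + "."


print(ensure_terminal_period("It was late"))
print(ensure_terminal_period('She said "go."'))
```

Sentences that already end in punctuation, including a period inside a closing quotation mark, are returned unchanged, so the helper is safe to run over every entry.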
The hero we don't deserve!
Hi Jarod. I want to ask something: why do you still use TortoiseTTS + RVC instead of StyleTTS2?
In terms of performance and quality, isn't StyleTTS2 better? Or is there something else you considered?
Thanks
@Jarods_Journey Have you considered working on the code to enable it to run in the background or multithreaded? That way, it won't freeze the program when you regenerate the audio.
I left that in there as you technically shouldn't be doing anything while it's generating audio. It's similar to how I grey out buttons when the main generation block is running. I never went back to make it non-blocking.
@Jarods_Journey It's never good practice to have a GUI freeze like that. It would be better to disable the rest of the buttons or show a "please wait" message box. By the way, good work in all you do. Just an opinion, thanks.
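A minimal sketch of the non-blocking idea discussed above, using only Python's standard `threading` module. The names `run_in_background`, `task`, and `on_done` are hypothetical and not taken from the Audiobook Maker code. It also assumes the callback is safe to call from a worker thread; most GUI toolkits require the result to be passed back to the UI thread first (for example through a signal or a queue).

```python
import threading
from typing import Callable


def run_in_background(task: Callable[[], str],
                      on_done: Callable[[str], None]) -> threading.Thread:
    """Run `task` on a worker thread and pass its result to `on_done`."""

    def worker() -> None:
        result = task()   # the slow part, e.g. regenerating one sentence
        on_done(result)   # runs on the worker thread, not the UI thread

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


# Usage: disable the buttons, start the job, re-enable them in on_done.
finished = []
job = run_in_background(lambda: "sentence_12.wav", finished.append)
job.join()  # a real GUI would not join; it would keep processing events
```

Disabling the buttons before starting the thread and re-enabling them in the callback keeps the "don't touch anything while generating" behaviour without freezing the window.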
What files do I need to move around to use a voice I trained on the web guide for the audio books?
The pause doesn't seem to work; everything else is fine. Any clue as to why?
How do you create an audiobook with multiple different voices?
Why not combine into M4B? Thank you very much for the update!
Can you integrate pronunciation correction in this (like Balabolka)? 🧐
If this frontend utility becomes as usable as Balabolka [per-word real-time pronunciation check, IPA pronunciation/editing, assigning short forms for phrases (AI for Artificial Intelligence)] with a built-in text editor and pronunciation dictionary,
man, no kidding, I'll become a member for life 🙏🏻
I've been having a lot of fun with Udio lately. Tried it? I've put in some old unfinished music of mine and it's really surprised me with ways to move the song forward.
Thanks for this Jarod :). Will there ever be a working model of your tortoise build set up in Google Colab? I really like the way it runs, but I simply don't have the processing power. Many thanks for everything you do.
RVC makes things sound worse??
Just having a play with this, and the voice pitch is raised/messed around with compared to the source version from Tortoise. How can you stop Audiobook Maker from manipulating what Tortoise generates? I.e., the WAV file in the Tortoise results folder sounds fine, but the WAV in the Audiobook Maker output folder has been manipulated and sounds worse :(
Is it RVC messing it up?
Yes! The RVC manipulation of the WAV files makes them sound worse. I've stopped it from working by overwriting the RVC output with the original file and it sounds much better:-
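import shutil  # needed if the file does not already import it

# Remember the original Tortoise output path before RVC converts it.
orig_audio_path = audio_path
audio_path = rvc_convert(model_path=voice_model_path,
                         f0_up_key=f0_pitch,
                         resample_sr=0,
                         file_index=voice_index_path,
                         index_rate=index_rate,
                         input_path=audio_path)
# Overwrite the RVC result with the original audio, discarding the conversion.
shutil.copy2(orig_audio_path, audio_path)  # LOSE RVC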
How do I make Tortoise stick with one voice instead of picking one randomly? I don't want to make my own voice.
Please review IMS Toucan
So useful! Thank you ❤🎉
this is awesome! thank you
crazy good update!
It works, but it's a bit hard on my machine. I wonder if there's a way to run my own audio through SVC and record it, like playing an audio file and running it through SVC the way you do with a mic, then recording the result.
Wait, is this compatible with RVC?
Yes, it uses RVC models to convert the TTS output from Tortoise.
What do I do with this error?
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 69: invalid start byte
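Byte 0x93 is the left curly quotation mark in the Windows-1252 encoding, so this error usually means the text file was saved as Windows-1252 (often shown as "ANSI") rather than UTF-8. Re-saving the file as UTF-8 in a text editor is the simplest fix. Below is a minimal sketch of a fallback reader; the function names are hypothetical, and it is not how the Audiobook Maker itself opens files.

```python
from pathlib import Path


def decode_with_fallback(data: bytes, encodings=("utf-8", "cp1252")) -> str:
    """Try each encoding in order; replace undecodable bytes as a last resort."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    return data.decode("utf-8", errors="replace")


def read_text_with_fallback(path: str) -> str:
    """Read a text file that may be UTF-8 or Windows-1252 (hypothetical helper)."""
    return decode_with_fallback(Path(path).read_bytes())


# Curly quotes saved as Windows-1252 bytes 0x93 and 0x94.
print(decode_with_fallback(b"He said \x93hi\x94"))
```

The decoded text can then be written back out with `encoding="utf-8"` so the original file no longer triggers the error.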
How do I create subtitles for the converted audio as an .srt file, please? Is there any way to produce the audio and subtitle files together? Thank you.
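A minimal sketch of building an .srt file by hand, assuming you already have each sentence's text and the duration of its audio clip in seconds, in playback order. The thread does not say whether the Audiobook Maker exposes this, so `build_srt` and its input format are hypothetical; durations of WAV clips could be measured with Python's standard `wave` module.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(segments) -> str:
    """Build SRT text from (text, duration_seconds) pairs in playback order."""
    blocks = []
    start = 0.0
    for index, (text, duration) in enumerate(segments, start=1):
        end = start + duration
        blocks.append(
            f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"
        )
        start = end  # the next subtitle begins where this one ends
    return "\n".join(blocks)


print(build_srt([("Hello.", 1.5), ("World.", 2.0)]))
```

If pauses are inserted between sentences in the combined audio, their length would need to be added to `start` as well, otherwise the subtitles drift.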
Absolute hero 🎉
🙏🙏
I'm getting
Error: [WinError 10061] No connection could be made because the target machine actively refused it, retrying... (1/3)
Error: [WinError 10061] No connection could be made because the target machine actively refused it, retrying... (2/3)
Error: [WinError 10061] No connection could be made because the target machine actively refused it, retrying... (3/3)
when I select the text file and then start audiobook generation.
same
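WinError 10061 means the connection was refused: nothing was listening on the address the program tried to reach. A minimal sketch for checking that, assuming the Audiobook Maker is connecting to a local Tortoise web UI on port 7860 (the port mentioned elsewhere in this thread; adjust it if your setup differs). The function name `is_port_open` is hypothetical.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers connection refused and timeouts
        return False


# Assumed address of the local Tortoise web UI.
if is_port_open("127.0.0.1", 7860):
    print("Server is reachable.")
else:
    print("Nothing is listening; start the TTS server first.")
```

If this reports that nothing is listening, the TTS server most likely has not been started yet or has not finished loading.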
Is there a way you can make the UI accessible to screen readers for the blind? The web-based Gradio interfaces are great, but this is almost unusable. As of now, if I wanted to make a book I'd have to manually record each sentence from Tortoise in my digital audio workstation, sentence by sentence, because Tortoise has the voice glitches you are all too aware of when using fine-tuned models. I hope StyleTTS 2 will finally be able to replace Tortoise and sound just as good. Obviously I'd rather something local be as good as ChatGPT 4o's voice or ElevenLabs, but I know that's going to be a few years down the line. Tortoise would be really great if it wasn't susceptible to the voice glitches.
well done!
Hello brother, I always get a message that localhost:7860 can't be accessed (HTTP 1.1 404 Not Found) when running start_package, so I can't access data in the Audiobook Maker.
same
Hello Jarod, did you take a look at Seed-TTS?
Yes, it's really, really good. However, ain't no way ByteDance is releasing their models 😂
Would love a linux version
Will it work with other languages?
Please create an ElevenLabs alternative, sir.
Windows only?
Linux please