Build a Containerized Transcription API using Whisper Model and FastAPI
- Published 28 Jun 2024
- In this exciting tutorial, I'll guide you step by step on how to create your very own Containerized Transcription API using the powerful Whisper AI model and FastAPI as the backend framework.
We'll start by setting up a development environment and configuring FastAPI to build a robust web application. Then, we'll seamlessly integrate the open-source Whisper AI model, a cutting-edge solution for Speech-to-Text (STT) transcription, to enable accurate and efficient audio-to-text conversion.
But that's not all! We'll take it a step further by containerizing our application using Docker, ensuring that it runs consistently and efficiently in any environment. This approach not only simplifies deployment but also allows for scalability and easy management.
By the end of this tutorial, you'll have a fully functional, containerized Transcription API that can effortlessly convert audio files into text. Whether you want to automate your transcription tasks, enhance accessibility for your content, or explore the world of AI-powered applications, this project has you covered.
Don't forget to hit that "Like" button if you find this tutorial helpful, leave your questions and thoughts in the comments section below, and be sure to subscribe for more exciting AI and development tutorials.
GitHub Repo: github.com/AIAnytime
Whisper GitHub: github.com/openai/whisper
#openai #ai #python - Science & Technology
Absolute quality content. So informative and I love how every step is explained in great detail.
Glad you liked it!
Thank you!
It's working perfectly
Thanks for the demo and info, very informative and precise. I truly appreciate it. Easy to deploy. Have a great day.
Glad it was helpful!
Judging solely from the title, this is exactly what I need. I hope it works as I expect :D Gonna keep watching.
Thanks 👍
Great video, thanks very much! I'm looking to deploy Whisper for an app I'm working on which will require multiple transcriptions of small audio chunks to take place concurrently. If I were to deploy your solution on EC2, what sort of specs would I need?
Great explanations, thank you so much for the tutorial!
You're very welcome!
Outstanding!
You are incredible. Can we get more end-to-end projects involving Docker?
Thanks... you can watch this as well. czcams.com/video/7CeAJ0EbzDA/video.html
Hey, you made a video on a text-to-image API in the past. Could we create an API that uses checkpoints from Civitai, i.e. one that can load multiple checkpoints/models and be called as an API? Is that possible?
Do you know if speaker diarization (breaking up the transcription by speaker) can be built into this?
What's the best way to deploy this container? AWS EC2 is kind of expensive... it needs a lot of RAM.
Thank you so much! One question: in the first version of Whisper you couldn't translate from English to Spanish. You could only .transcribe one language or the other, but not translate between them. Do you know if Whisper v3 can now translate from English to Spanish? Or any updated WhisperX, or other options? Honestly, where I'd most like to use it is translating your videos, since the YouTube translator is very bad and it's hard to follow you. If possible, could you make a video? ;)
How can I use async with the code line: result = model.transcribe(temp.name)
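One common pattern (a sketch, not from the video): model.transcribe is a blocking, compute-bound call, so you can push it onto a worker thread with asyncio.to_thread so the FastAPI event loop stays responsive while it runs. The blocking_transcribe stub below stands in for the real Whisper call.

```python
import asyncio
import time

def blocking_transcribe(path):
    # stand-in for model.transcribe(path); assumes Whisper's blocking API
    time.sleep(0.1)  # simulate the slow, blocking transcription work
    return {"text": f"transcript of {path}"}

async def transcribe_async(path):
    # run the blocking call in a worker thread so the event loop stays free
    return await asyncio.to_thread(blocking_transcribe, path)

result = asyncio.run(transcribe_async("audio.wav"))
print(result["text"])
```

Inside an async FastAPI endpoint you would simply write `result = await asyncio.to_thread(model.transcribe, temp.name)`.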
When I run it in Postman, I set Content-Type: multipart/form-data in the headers, and in the Body I set the key to "files" and upload the .wav file as the value. For some reason I get files: undefined.
Maybe on Mac I'm supposed to do something different?
I got the same error. It's because I had used the key "Files", but the form-data key has to match the parameter name declared in the function (per the FastAPI documentation), which here is "file":
file: UploadFile
Then you can access the underlying file object via file.file.
I notice you're pushing the audio file via an HTTP POST. Is there any way to pull the file from a given location instead, e.g. from an AWS S3 bucket, the file system, etc.?
Hi 👋,
Can you do one for Whisper JAX? 😉
Does anyone know how to handle multiple requests and run them on different GPUs?
I have four GPUs in the server, but the model and FastAPI only use one GPU (number 0).
What happens when I pass an 8 GB file?
The requirements file is incomplete. It's not working with the whisper library that I am using from PyPI.
You don't have to install Whisper from PyPI via requirements.txt. The Dockerfile takes care of it, as it builds Whisper directly from Git.
@AIAnytime I finally figured it out.
There were some issues in the newer version of the openai-whisper package.
fastapi==0.78.0
uvicorn[standard]==0.23.2
aiofiles==23.2.1
python-multipart==0.0.6
torch==2.0.1
openai-whisper==20230314
tiktoken==0.3.1
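For reference, the "build from Git" approach mentioned above usually looks something like this Dockerfile excerpt (illustrative, not the exact file from the repo; ffmpeg is a runtime dependency of Whisper):

```dockerfile
# illustrative excerpt: Whisper installed from the Git repo, not from PyPI
RUN apt-get update && apt-get install -y ffmpeg git
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir git+https://github.com/openai/whisper.git
```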
hero
How can I run this with a GPU?
Currently, when I run the container, the line DEVICE = "cuda" if torch.cuda.is_available() else "cpu" sets DEVICE to "cpu", even though my computer has a GPU.
Thanks.
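Regarding GPU visibility inside a container: by default Docker does not expose the host GPU, so torch.cuda.is_available() returns False. A common fix, assuming the host has NVIDIA drivers plus the NVIDIA Container Toolkit installed and the image contains a CUDA-enabled build of torch, is to pass the GPUs through at run time (the image name whisper-api is a placeholder):

```shell
# requires the NVIDIA Container Toolkit on the host
docker run --gpus all -p 8000:8000 whisper-api
```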