Build a Containerized Transcription API using Whisper Model and FastAPI

  • Published 28 Jun 2024
  • In this exciting tutorial, I'll guide you step by step on how to create your very own Containerized Transcription API using the powerful Whisper AI model and FastAPI as the backend framework.
    We'll start by setting up a development environment and configuring FastAPI to build a robust web application. Then, we'll seamlessly integrate the open-source Whisper AI model, a cutting-edge solution for Speech-to-Text (STT) transcription, to enable accurate and efficient audio-to-text conversion.
    But that's not all! We'll take it a step further by containerizing our application using Docker, ensuring that it runs consistently and efficiently in any environment. This approach not only simplifies deployment but also allows for scalability and easy management.
    By the end of this tutorial, you'll have a fully functional, containerized Transcription API that can effortlessly convert audio files into text. Whether you want to automate your transcription tasks, enhance accessibility for your content, or explore the world of AI-powered applications, this project has you covered.
    Don't forget to hit that "Like" button if you find this tutorial helpful, leave your questions and thoughts in the comments section below, and be sure to subscribe for more exciting AI and development tutorials.
    GitHub Repo: github.com/AIAnytime
    Whisper GitHub: github.com/openai/whisper
    #openai #ai #python
  • Science & Technology

Comments • 29

  • @shivamroy1775
    @shivamroy1775 8 months ago +2

    Absolute quality content. So informative and I love how every step is explained in great detail.

  • @anukamithara
    @anukamithara 1 day ago

    Thank you!
    It's working perfectly

  • @chrisumali9841
    @chrisumali9841 5 months ago

    Thanks for the demo and info, very informative and precise. I truly appreciate it. Easy to deploy. Have a great day.

  • @MrZelektronz
    @MrZelektronz 7 months ago +1

    Solely judging from the title, this is exactly what I need. I hope it works as I expect :D Gonna keep watching.

  • @HowayaNowTed
    @HowayaNowTed 7 days ago

    Great video, thanks very much! I'm looking to deploy Whisper for an app I'm working on which will require multiple transcriptions of small audio chunks to take place concurrently. If I were to deploy your solution on EC2, what sort of specs would I need?

  • @shubhbhalla3850
    @shubhbhalla3850 3 months ago

    Great explanations, thank you so much for the tutorial!

  • @nicolassuarez2933
    @nicolassuarez2933 1 month ago

    Outstanding!

  • @kshitizkhandelwal879
    @kshitizkhandelwal879 8 months ago +2

    You are incredible. Can we get more end-to-end projects involving Docker?

    • @AIAnytime
      @AIAnytime  8 months ago

      Thanks... you can watch this as well. czcams.com/video/7CeAJ0EbzDA/video.html

  • @harshkadam3702
    @harshkadam3702 8 months ago +1

    Hey, you made a video on a text-to-image API in the past. Could we build an API that uses checkpoints from Civitai, i.e. one that can load multiple checkpoints/models and be called as an API? Is that possible?

  • @joshmay9531
    @joshmay9531 4 months ago

    Do you know if speaker diarization (breaking up the transcription by speaker) can be built into this?
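
Diarization isn't built into Whisper, but it can be layered on: run a diarization model (pyannote.audio is one commonly used option) to get speaker turns, then label each Whisper segment by overlap. The merge step is library-agnostic; a sketch, where the segment and turn shapes are assumptions modeled on Whisper's result["segments"] fields:

```python
def assign_speakers(segments, turns):
    """Label each transcript segment with the speaker whose diarization
    turn overlaps it the most.
    segments: [{"start": float, "end": float, "text": str}, ...]
    turns:    [(start, end, speaker_label), ...]
    """
    labeled = []
    for seg in segments:
        best, best_overlap = "unknown", 0.0
        for start, end, speaker in turns:
            # Overlap between [seg.start, seg.end] and [start, end]
            overlap = min(seg["end"], end) - max(seg["start"], start)
            if overlap > best_overlap:
                best, best_overlap = speaker, overlap
        labeled.append({**seg, "speaker": best})
    return labeled
```

The diarization turns themselves would come from whatever diarization pipeline you choose; this function only performs the alignment.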

  • @nicolassuarez2933
    @nicolassuarez2933 1 month ago

    What's the best way to deploy this container? AWS EC2 is kind of expensive... it needs a lot of RAM.

  • @SonGoku-pc7jl
    @SonGoku-pc7jl 4 months ago

    Thank you so much! One question: in the first version of Whisper you couldn't translate from English to Spanish. You could only .transcribe one language or the other, not translate between them. Do you know if Whisper v3 can now translate from English to Spanish? Or an updated WhisperX, or any other options? Honestly, where I most want to use it is translating your videos, since the CZcams translator is very bad and it's hard to follow you. If possible, could you make a video? ;)
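
For context on what Whisper itself supports: its built-in task="translate" mode always translates into English regardless of the source language (true as of large-v3, to my knowledge); English→Spanish would need a separate machine-translation step afterwards. A small sketch, where the model object is assumed to come from whisper.load_model:

```python
def translate_to_english(model, path: str) -> str:
    # Whisper's "translate" task always targets English; there is no
    # built-in English -> Spanish direction.
    return model.transcribe(path, task="translate")["text"]
```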

  • @nguyentoanhnt
    @nguyentoanhnt 1 month ago

    How can I use async with the line: result = model.transcribe(temp.name)?
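
model.transcribe is a blocking, compute-bound call, so it can't be awaited directly. A minimal sketch (assuming Python 3.9+ for asyncio.to_thread) that keeps the event loop free for other requests:

```python
import asyncio

async def transcribe_async(model, path: str) -> dict:
    # model.transcribe() blocks while it runs inference, so push it to a
    # worker thread; the event loop stays responsive in the meantime.
    return await asyncio.to_thread(model.transcribe, path)
```

Inside an async FastAPI endpoint this would be called as `result = await transcribe_async(model, temp.name)`.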

  • @rois8888
    @rois8888 8 months ago

    When I run it in Postman, I set Content-Type: multipart/form-data in Headers, and in the Body I set the key to "files" and upload the .wav file as the value. For some reason I get files: undefined.
    Maybe on a Mac I'm supposed to do something different?

    • @josuechacon6240
      @josuechacon6240 5 months ago

      I got the same error. It was because I had named the parameter Files; per the FastAPI documentation, the parameter in the function has to be named "file":
      file: UploadFile
      Then you can access the underlying file object:
      File = file.file

  • @ryanbradbury3745
    @ryanbradbury3745 4 months ago

    I notice you're pushing the audio file via an HTTP POST. Is there any way to pull the file from a given location instead, e.g. from an AWS S3 bucket, the file system, etc.?
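
One option, sketched here with boto3 (assumes boto3 is installed and AWS credentials are configured in the environment; bucket/key names would come from the request): download the object to a temp file first, then transcribe it as before.

```python
import tempfile

def transcribe_from_s3(model, bucket: str, key: str, s3=None):
    # s3 can be injected for testing; by default build a boto3 client.
    if s3 is None:
        import boto3  # assumption: boto3 installed, AWS creds configured
        s3 = boto3.client("s3")
    with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
        s3.download_fileobj(bucket, key, tmp)  # stream the object to disk
        tmp.flush()
        return model.transcribe(tmp.name)
```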

  • @RAVINDRACHOWDARY
    @RAVINDRACHOWDARY 4 months ago +1

    Hi 👋,
    Can you do one for Whisper JAX? 😉

  • @josuechacon6240
    @josuechacon6240 5 months ago

    Does anyone know how to handle multiple requests running on different GPU sockets?
    I have four GPUs in the server, but the model and FastAPI only use one GPU (number 0).
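
One common pattern (a sketch, not from the video): load one model replica per visible GPU and hand requests out round-robin. Model size and per-GPU VRAM headroom are assumptions you'd need to check.

```python
import itertools

def load_replicas(model_name: str = "base"):
    # One Whisper replica per visible GPU; assumes each GPU has enough
    # VRAM for the chosen model size. Falls back to CPU if none found.
    import torch
    import whisper
    n = torch.cuda.device_count()
    devices = [f"cuda:{i}" for i in range(n)] or ["cpu"]
    return [whisper.load_model(model_name, device=d) for d in devices]

def make_round_robin(replicas):
    # Hand successive requests to successive replicas, spreading load
    # across all GPUs instead of always hitting cuda:0.
    it = itertools.cycle(replicas)
    return lambda: next(it)
```

An alternative is running one uvicorn worker process per GPU, each pinned via a different CUDA_VISIBLE_DEVICES value.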

  • @user-jf5ru5ow8u
    @user-jf5ru5ow8u 2 months ago

    What happens when I pass an 8 GB file?
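
Very large uploads shouldn't be read fully into memory, and Whisper itself may struggle with hours-long audio in one pass, so chunking the audio is worth considering too. For the memory side, a sketch that spools the upload to disk in fixed-size chunks (the 1 MiB chunk size is an arbitrary choice):

```python
import shutil
import tempfile

def spool_upload_to_disk(fileobj, chunk_size: int = 1024 * 1024) -> str:
    # Copy the upload to a temp file one chunk at a time, so even an
    # 8 GB request never has to fit in RAM; returns the temp file path.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp:
        shutil.copyfileobj(fileobj, tmp, chunk_size)
        return tmp.name
```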

  • @concaption
    @concaption 8 months ago

    The requirements file is incomplete. It's not working with the whisper library that I'm using from PyPI.

    • @AIAnytime
      @AIAnytime  8 months ago

      You don't have to install Whisper from PyPI via requirements.txt. The Dockerfile takes care of it, since it installs directly from Git.
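
For context, the "installs directly from Git" step typically looks like this in a Dockerfile (a sketch; the base image and package list are assumptions, not the video's exact file):

```dockerfile
FROM python:3.10-slim
# ffmpeg is required by Whisper for audio decoding; git for the pip install below
RUN apt-get update && apt-get install -y --no-install-recommends ffmpeg git \
    && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt \
    && pip install --no-cache-dir git+https://github.com/openai/whisper.git
```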

    • @concaption
      @concaption 8 months ago

      @@AIAnytime I finally figured it out.
      There were some issues in the newer version of the openai-whisper package. These pinned versions worked for me:
      fastapi==0.78.0
      uvicorn[standard]==0.23.2
      aiofiles==23.2.1
      python-multipart==0.0.6
      torch==2.0.1
      openai-whisper==20230314
      tiktoken==0.3.1

  • @datasciencetoday7127
    @datasciencetoday7127 8 months ago

    hero

  • @nguyentoanhnt
    @nguyentoanhnt 26 days ago

    How can I run it with a GPU?
    Currently, when I run the container, the line DEVICE = "cuda" if torch.cuda.is_available() else "cpu" sets DEVICE to "cpu" even though my computer has a GPU.
    Thanks.
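
Two common causes (assumptions, since the exact setup isn't shown): the image lacks a CUDA-enabled torch build, and/or the container isn't started with GPU access. With the NVIDIA Container Toolkit installed on the host, the run command would look like this (the image name whisper-api is a placeholder):

```shell
# Expose the host GPUs to the container; requires the NVIDIA Container
# Toolkit on the host and a CUDA-enabled torch build inside the image
# (e.g. an nvidia/cuda or pytorch/pytorch base rather than python:slim).
docker run --gpus all -p 8000:8000 whisper-api
```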