Google Cloud Speech-To-Text API With Python For Beginners

  • Added 23. 07. 2024
  • With the Google Speech-To-Text API, you can convert speech to text, transcribe videos, and even recognize custom keywords. In this video, we are going to learn how to get started with the Google Speech-to-Text API in Python from scratch. I will show you every single step, from creating a Google Cloud project all the way to writing a Python script that connects to the Google Cloud Speech-To-Text API. So if you are new to Google Cloud services, this is a great introductory video to get started.
    📑 Google Speech-To-Text Models: cloud.google.com/speech-to-te...
    📑 Audio Encoding List: cloud.google.com/speech-to-te...
    📑 Google Speech-To-Text API Reference: cloud.google.com/speech-to-te...
    ► Buy Me a Coffee? Your support is much appreciated!
    -------------------------------------------------------------------------------------------
    ☕ Paypal: www.paypal.me/jiejenn/5
    ☕ Venmo: @Jie-Jenn
    💸 Join Robinhood with my link and we'll both get a free stock: bit.ly/3iWr7LC
    ► Support my channel so I can continue making free content
    ---------------------------------------------------------------------------------------------------------------
    🛒 By shopping on Amazon → amzn.to/2JkGeMD
    👩‍💻 Follow me on LinkedIn: / jiejenn
    🌳 Becoming a Patreon supporter: / jiejenn
    ✉️ Business Inquiries: CZcams@LearnDataAnalysis.org
    00:00 - Intro
    00:50 - Google Speech-To-Text API Pricing
    01:11 - Create Python Virtual Environment
    03:17 - Install Speech-To-Text API Python package
    03:36 - Create a Google Cloud project
    04:50 - Enable Speech-To-Text API
    06:38 - Create a Google Service Account
    07:38 - Download Service Account client file
    08:57 - Python script example 1 (local file)
    17:47 - Python script example 2 (cloud storage)
    #googlecloud #googlespeechtotext #googleapi
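The two script examples listed in the timestamps (local file and Cloud Storage) might look roughly like the sketch below. It is not the code from the video: the function names `audio_kwargs` and `transcribe`, the key file name `service_account.json`, and the `en-US` language code are assumptions for illustration. It relies on the `google-cloud-speech` package and leaves out `encoding` and `sample_rate_hertz`, which the API treats as optional for WAV and FLAC files.

```python
from pathlib import Path


def audio_kwargs(source: str) -> dict:
    """Return keyword arguments for speech.RecognitionAudio."""
    if source.startswith("gs://"):
        # Example 2: the audio already lives in a Cloud Storage bucket.
        return {"uri": source}
    # Example 1: read a local file and send its bytes inline.
    return {"content": Path(source).read_bytes()}


def transcribe(source: str,
               key_file: str = "service_account.json",  # hypothetical file name
               language_code: str = "en-US") -> list:
    """Run synchronous recognition and return one transcript per result."""
    # Imported here so audio_kwargs() works without the package installed.
    from google.cloud import speech  # pip install google-cloud-speech

    client = speech.SpeechClient.from_service_account_file(key_file)
    # encoding and sample_rate_hertz are omitted on the assumption that the
    # input is WAV or FLAC; other formats need them set explicitly.
    config = speech.RecognitionConfig(language_code=language_code)
    audio = speech.RecognitionAudio(**audio_kwargs(source))
    response = client.recognize(config=config, audio=audio)
    return [result.alternatives[0].transcript for result in response.results]
```

With a valid key file in place, a call such as `transcribe("speech.wav")` or `transcribe("gs://your-bucket/speech.wav")` (both names are placeholders) would return a list of transcript strings.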

Comments • 29

  • @frankking5326 6 months ago +2

    After two days fighting with this on my own, your video solved my problem!! Thanks

  • @mariodevelopersantos1102 8 months ago

    this is exactly what I needed, thanks

  • @ashishkumar-eo1tz 1 year ago

    Simply awesome. Keep up the good work bro 👍

  • @TheCloudShepherd 5 months ago

    Excellent. Thank you very much

  • @RicardoCrumbleton 6 months ago

    How could this be adapted for the v2 api with Chirp?

  • @DungLe-rp5vu 6 months ago

    Your instructions are great. Also, how can I get text from an online MP3 link?

  • @user-wr4yl7tx3w 1 year ago +1

    by chance, do you have any idea what Industry Coding Assessment is, as part of an interview? I was told that it is not the same as testing you on algorithms, like in LeetCode.

    • @jiejenn 1 year ago

      Each company is different, so there's no defined answer to be honest.

  • @g_30_sanketpatil62 3 months ago

    Hey bro, I have created a voice bot using Google Dialogflow CX. Now I want to transcribe the ongoing voice call, so can you please tell me how I can achieve that?
    Thanks

  • @aastharathod8786 11 months ago

    How can I get it for Java?

  • @user-wo2xo4du2l 5 months ago

    Where is the user interface?

  • @ashishprakash8430 7 months ago +1

    Hi, I followed your tutorial, but it is not transcribing all of the audio. Please help.

    • @frankking5326 6 months ago

      What error did you get?

    • @ashishprakash8430 6 months ago

      @frankking5326 Found the solution: I was using the default model. The video model worked. Thanks for your tutorial.

    • @jubileudasilva9258 1 month ago

      Speech-to-Text has three main methods to perform speech recognition. These are listed below:
      Synchronous Recognition (REST and gRPC) sends audio data to the Speech-to-Text API, performs recognition on that data, and returns results after all audio has been processed. Synchronous recognition requests are limited to audio data of 1 minute or less in duration.
      Asynchronous Recognition (REST and gRPC) sends audio data to the Speech-to-Text API and initiates a Long Running Operation. Using this operation, you can periodically poll for recognition results. Use asynchronous requests for audio data of any duration up to 480 minutes.
      Streaming Recognition (gRPC only) performs recognition on audio data provided within a gRPC bi-directional stream. Streaming requests are designed for real-time recognition purposes, such as capturing live audio from a microphone. Streaming recognition provides interim results while audio is being captured, allowing results to appear, for example, while a user is still speaking.
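The limits quoted above suggest a simple rule for picking a method from the audio duration. The sketch below is only an illustration of that rule: the function name `pick_method` and the `live` flag are made up here, while the returned strings match the `SpeechClient` method names in the `google-cloud-speech` Python package.

```python
SYNC_LIMIT_SECONDS = 60           # synchronous: 1 minute or less
ASYNC_LIMIT_SECONDS = 480 * 60    # asynchronous: up to 480 minutes


def pick_method(duration_seconds: float, live: bool = False) -> str:
    """Suggest a SpeechClient method name for the given audio."""
    if live:
        # Live audio (for example a microphone) calls for streaming.
        return "streaming_recognize"
    if duration_seconds <= SYNC_LIMIT_SECONDS:
        return "recognize"
    if duration_seconds <= ASYNC_LIMIT_SECONDS:
        return "long_running_recognize"
    raise ValueError("audio is longer than the 480-minute asynchronous limit")
```

A recording that comes back only partly transcribed is worth checking against the 1-minute synchronous limit first, for example with `pick_method(95)`.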

  • @dhoreys 2 months ago

    My .wav file did not convert. Is there a sample .wav file I could use?

    • @jiejenn 2 months ago

      You can search on Google, there are plenty.

    • @dhoreys 2 months ago

      @jiejenn The samples I have aren't working. I tried those. Gemini is saying to make sure that the file is in LINEAR16 format.
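One way to check a .wav file before sending it is to read its header with Python's standard `wave` module. LINEAR16 means uncompressed 16-bit signed little-endian PCM, so a sample width of 2 bytes is the thing to look for. This is a rough sketch, not a full validator; the function names `describe_wav` and `looks_like_linear16` are made up for this example.

```python
import wave


def describe_wav(path: str) -> dict:
    """Read channel count, sample width, and sample rate from a WAV header."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hertz": wav.getframerate(),
        }


def looks_like_linear16(path: str) -> bool:
    """Return True if the file is uncompressed PCM with 16-bit samples."""
    try:
        info = describe_wav(path)
    except (wave.Error, EOFError):
        # The wave module only reads uncompressed PCM WAV files, so a
        # failure here means the file is compressed or not a WAV at all.
        return False
    return info["sample_width_bytes"] == 2
```

The `sample_rate_hertz` value reported here is also the number to pass in the recognition config if the sample rate is set explicitly.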

  • @user-kb6ck7gb6f 4 months ago

    Is it possible to make it recognize in real-time from a microphone with good performance?
    Edit: Another Question: Does it support the Arabic language as AWS doesn't in streaming (real-time)?

    • @jiejenn 4 months ago

      Yeah, it's definitely possible, but it's not going to be cheap.

    • @madhav1527 2 months ago

      Hi, did you find a solution for getting input from a microphone and supporting Arabic in real time? If so, do let me know, as I am having trouble implementing the same.

    • @user-kb6ck7gb6f 2 months ago

      @madhav1527 I used OpenAI Whisper

    • @madhav1527 2 months ago

      @user-kb6ck7gb6f But that has a cost per API call, right? Could you let me know if you found any library that does it without a cost?

    • @madhav1527 2 months ago

      @user-kb6ck7gb6f And one more question: how accurate would you say OpenAI Whisper is?

  • @TheCloudShepherd 5 months ago +1

    Google documentation sucks. Thanks for this clearly explained how-to video

    • @jiejenn 5 months ago

      Glad the video helped.

  • @sadamhussain816 11 months ago +2

    Is it free?

  • @7BlackJack8 10 days ago

    Google couldn't do any better at driving developers away from this. The APIs are an atrocious mess to use.