Building My Own Alexa / Google Home: Detecting the Wake-up Keyword

  • Added 27. 07. 2024
  • In today's video I start a project to build my own voice-controlled Google Home / Alexa-style device. In this video I implement the first piece: detection of the wake-up keyword, "OK, GPT".
    GitHub: github.com/unconv/ok-gpt
    Support: buymeacoffee.com/unconv
    Consultations: www.buymeacoffee.com/unconv/e...
    Memberships: www.buymeacoffee.com/unconv/m...
    00:00 Intro
    01:10 Trying PocketSphinx keyphrase detection
    04:19 Getting past PocketSphinx recognition errors
    22:49 Detecting multiple keyphrases with multiprocessing
    33:58 Separating initialization and recognition
    37:22 Using Queues to perform tasks when keyphrase is detected
    44:55 It works!
  • Science & Technology
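
The chapters on multiprocessing and queues describe running one recognizer per keyphrase and using a queue to trigger a task when a keyphrase is heard. A minimal sketch of that pattern follows. It is not the code from the video or the repository: `fake_detector` is a hypothetical stand-in for a real speech recognizer such as PocketSphinx, and the `KEYPHRASES` mapping and action names are made up for illustration.

```python
import multiprocessing as mp

# Hypothetical keyphrase -> action-name mapping (assumed, not from the video).
KEYPHRASES = {"ok gpt": "start_listening", "stop gpt": "stop_listening"}


def fake_detector(keyphrase, utterances):
    """Stand-in for a speech recognizer: yield utterances containing the keyphrase."""
    for text in utterances:
        if keyphrase in text.lower():
            yield text


def listen(keyphrase, utterances, detections):
    """Worker: put the keyphrase on the queue each time it is detected."""
    for _ in fake_detector(keyphrase, utterances):
        detections.put(keyphrase)
    detections.put(None)  # sentinel: this worker has finished


def dispatch(detections, n_workers):
    """Consume detections until every worker has sent its sentinel."""
    actions = []
    finished = 0
    while finished < n_workers:
        item = detections.get()
        if item is None:
            finished += 1
        else:
            actions.append(KEYPHRASES[item])
    return actions


def main():
    """Run one process per keyphrase and collect the triggered actions."""
    utterances = ["hello there", "OK GPT what time is it", "stop gpt"]
    detections = mp.Queue()
    workers = [
        mp.Process(target=listen, args=(phrase, utterances, detections))
        for phrase in KEYPHRASES
    ]
    for worker in workers:
        worker.start()
    actions = dispatch(detections, len(workers))
    for worker in workers:
        worker.join()
    print(sorted(actions))
```

In a script, `main()` would be called under an `if __name__ == "__main__":` guard, which `multiprocessing` needs on platforms that spawn rather than fork. Because `listen` and `dispatch` only use `put` and `get`, they also work with a plain `queue.Queue`, which makes the logic easy to try without starting any processes.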

Comments • 16

  • @stony42069 (7 months ago)

    I have looked into creating a centralized hub for any bot, with options for real-time listening or voice activation and text-to-speech/speech-to-text, where you can name each bot and call on them one at a time, hold a hierarchical real-time conversation with all of them, or even let them talk amongst themselves. I was considering Android Studio. The only setback is that I have only a self-taught understanding of how code works and no experience of actually coding. But through the tumble method of feeding code back and forth from bot to bot, they can polish and clean up code quite well, and for a different project with GPT-4 and MemGPT I successfully created an XML page. I know how computers operate internally, but I don't have much hands-on experience. This will be a fun video to watch.

  • @mikebledig7208 (3 months ago)

    One thing has me thinking: when you record your voice with just a microphone onto magnetic tape, or nowadays onto some form of digital storage, what you say gets recorded exactly as you said it. However, computers don't seem to be able to pick out exactly what you said. PocketSphinx gets it so wrong. Of course, there are things like Google speech recognition that do a pretty good job nowadays. But it's just amazing that a simple microphone recording your words doesn't get it wrong, yet a computer does.
    Imagine leaving a voice message for someone where you said "See you later" and the microphone got it as "See you never". Bleh...

  • @dawn_of_Artificial_Intellect

    Great video, please continue with the project. I love the way you allow errors to happen and then solve them as you go in real time.

    • @unconv (8 months ago, +1)

      Thanks! I will continue (2nd video is out already!)

  • @zeta_meow_meow (6 months ago)

    cool, but this went totally over my head haha

  • @dreamofeternalhappiness8001 (8 months ago, +2)

    That Sphinx seems like a good way to run scientific experiments with elevated stress levels. 🧔 What speech recognition does YouTube use to generate subtitles?

    • @dreamofeternalhappiness8001 (8 months ago)

      🕳️👀 It seems to be their secret. That was an interesting experiment with a local 'Ok Google'. Maybe OpenAI Whisper would give better results than Sphinx.

    • @unconv (8 months ago)

      Yeah, I thought about using Whisper, but I don't want to send all the audio to OpenAI. There might be a local version of Whisper, but it seemed like a hassle to get working, especially on a Raspberry Pi.

    • @TheVisitorX (8 months ago)

      @unconv If you have a newer model and a 64-bit OS, you could try out whisper.cpp with the tiny model, which seems to work on the Pi (at least the Pi 4). Whisper is really great and much better than any other solution out there.

    • @unconv (8 months ago, +1)

      Thanks! I'll try out Whisper locally. I was just intimidated by the AI-ness of it haha