We can create a voice command system!
- Published 24 May 2023
- This project uses the ESP32-S3 microcontroller and the INMP441 microphone to enable voice command functionality. By leveraging the Espressif Speech Recognition framework, the system can recognize wake-up words like "Hey Siri" and "Ok Google" and execute specific actions associated with those commands. The voice user interface provides a convenient means of interaction in screenless environments and can be applied to a wide range of projects. The implementation involves configuring the ESP32-S3 and integrating the INMP441 microphone to capture audio input, processing the speech recognition using the framework, and linking recognized commands to predefined actions. Overall, this project enables users to control various tasks and functions by speaking voice commands, enhancing usability in scenarios where traditional screen-based interfaces are not available.
[ESP-SR Speech Recognition Framework]
github.com/espressif/esp-sr
[ESP-Skainet for Generic ESP32-S3 DevKit-C]
github.com/0015/esp-skainet/t...
[ESP32-S3-DevKitC-1-N8R8 Development Board]
amzn.to/3FZmfAM
[INMP441 Omnidirectional Microphone]
amzn.to/3Ma8C47
#VUI #Voice #Speech #Recognition #MCU #ESP32 #ThatProject - Science & Technology
Check More Projects - youtube.com/@ThatProject
Github Repository - github.com/0015/
Join FB Group - facebook.com/groups/138965931539175
@ThatProject It would be interesting to send the data to the server (from your previous video) only after detecting the wake word. By connecting several of them on different ports, you can create a voice control network. Thanks for the interesting videos👍.
Thanks, brother... Cool Project!
Hey bro. Can I do this using ESP32 Devkit V1 Normal 30pins board??
Firstly, I would like to congratulate you on your content. You are helping many people all over the world. I myself have learned a lot from you about ESP32. I would like to thank you very much for that. If possible, I would like some guidance on how to do the same thing, but with the activation word in Portuguese and in the Arduino framework on PlatformIO.
Awesome, I'm gonna try it with my ESP32.
Thank you! 🙂
Pretty cool project, I made it work with INMP441 and a MAX98357A. They need to let us train custom wake words.
How did you do it? I'd like to try.
Thank you so much❤
Nice!!!
Hi could you help to get started with ESP IDF? Mini tutorial maybe? Thx in advance.
You're just a few days ahead of me! Did you see any option to have it detect specific sounds, songs or other audio that wasn't voice?
"A sound, song, or other audio" seems too broad. Since it resembles noise, it seems impossible to find any pattern in such audio data. What do you think about it?
Hello, how did you change the code? Did you modify the framework to accept data from the connected microphone, or did you change something else? I want to do the same thing, but with the ESP32-S3 module.
The source code based on ESP32-EYE has been slightly modified to operate on ESP32-S3 with INMP441. Check out my GitHub page.
Thank you for this wonderful work. I need to make my own. Can this project be implemented on the LilyGO ESP32-S3 T3-CAM board? If yes, what changes should I make? Thank you for your support.
I think it works because the T3-CAM board has PSRAM. Please modify the pin connector of the connected microphone to your own as shown in the video and test it.
Great work! Thanks for providing the patch for devkitc.
I was wondering, though, if you were able to get it to work for stereo rather than mono (two INMP441 instead of single INMP441)
That is one thing I cannot do at the time.
I know that the ESP-SR supports stereo input, but I haven't actually tried it. How far have you tried?
@@ThatProject I tried it with two INMP441s and wired one as L (L/R pin grounded) and the other as R (L/R pin connected to VDD). After a few tries I noticed the arguments of i2s_new_channel(.) are hard-coded to mono, so I changed it to stereo and set the bits per channel to 16 and the sample rate to 16000. I also set the total channels to 3, the number of microphones to 2, and the number of refs to 1. When I flashed it, the wake word stopped working.
@@alrostami Wouldn't it be necessary to configure each channel separately in i2s_config_t? Have you set I2S_NUM_0 and I2S_NUM_1? It's hard to find examples of this.
@ThatProject My catch is that these two channels are configured separately when there is microphone and speaker (to play the voice assistant sound) presented in the examples. To make it even more confusing, I found examples that configure both tx (to play sound) and rx (to read mic data) on the I2S_NUM_X channel, e.g. i2s_new_channel(&chan_cfg, &tx_handle, &rx_handle) for esp32s3-korvo-2, and in another example they used both I2S_NUM_0 and I2S_NUM_1 and configured one to tx for playing sound and the other one to rx for reading mic data. It is a real mess.
@@alrostami It seems to me that you have already surpassed my project. I don't think I can help you with this unless I try it myself.
Can it support longer phrases? For example: "turn on the light number 3"
Multinet seems to recognize up to 5 words at a time. Need to do more testing.
Hi, I like your projects very much. I've been studying ESP-IDF for a few months. I've been facing version problems because I'm using 5.1 now. Can you show how to port code from an older version to a new one? Thank you very much.
A few weeks ago it was updated to work with IDF 5.X. Please check again.
@@ThatProject I saw that. But actually I'm trying to compile an ESP-WHO example that hasn't the 5.1 version yet. I don't find a way to set up the environment variables for the new folder I downloaded from git clone command. I'm using Prompt on windows. Thanks
@@anlpereira ESP-WHO also seems to work with ESP-IDF 5.x. Check out this branch. github.com/espressif/esp-who/tree/idfv5.0
How's the sensitivity of the INMP441? Can it hear you properly from across a room, for example?
The sensitivity is so low that you will only hear what is said in front of the INMP441.
@@ThatProject Ah, bummer. Do you know of any mic with a high sensitivity? Looking to put together a local voice assistant to replace my nest devices.
@@ThatProject Finally got around to testing the INMP441; it picked up my voice from across the room clear as day, so sensitivity doesn't seem to be an issue.
@@TheBlackBeltPanda That's interesting. INMP441 is omnidirectional, so in my case, I couldn't catch most of the sound unless I was in front of the microphone. How did you configure your i2s environment?
@@ThatProject I just wired it up to a ESP32-S3, installed ESPHome, and configured the microphone and voice_assistant components per some examples I found.
Hello, thank you for the excellent content of your channel.🙏🌹
Is it possible to use another language like Farsi?
A product like Picovoice supports Farsi, but the ESP32 doesn't yet.
As far as I know, the language models provided by default are only English and Chinese. Please check again.
Is a wake up word necessary or can it listen to commands directly? Or if it needs to wake up then what about using a button to wake it up?
It's divided into two parts: wakeup word recognition and command word recognition. There is no button to trigger the system. See esp-skainet for details. github.com/espressif/esp-skainet
Is it possible to use typical esp32 boards like wroom 32?
Maybe it's possible, but it seems PSRAM must be on that board.
Is it possible to use the wake word feature to start listening and then send following audio to mqtt server? Thinking this would be great way to send audio to home assistants voice assistant.
It seems possible but I need to try to implement it to check if it has any problems.
@@ThatProject There is a project called "Willow" that appears to integrate skainet with home assistant but is only for esp-box.
Possible, I've done that. Note that the new version of esp-skainet seems to have the entire pipeline (noise reduction, echo cancellation, automatic gain control, wake word detection) integrated and uses a ring buffer, so I wasn't able to get it working on an ESP32 without PSRAM. However, the older versions had wake word detection 'exposed', so you could work out the necessary buffers and pass data to the wwd in chunks yourself. Some optimizations are needed when sending the data to the server if you don't have PSRAM; otherwise you might miss chunks of audio and the voice assistant will have trouble understanding the recording.
Another problem you might get if using an analog microphone and the built-in ADC is that when Wi-Fi transmits data, noise appears on the power line and the audio gets distorted, again making it harder to understand for whatever voice assistant or speech-to-text you plan to use.
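The chunked buffering described in this comment can be sketched in plain C. This is only an illustration of the idea, not code from esp-skainet: the names `chunk_ring_t`, `ring_push`, and `ring_pop`, and the sizes `CHUNK_SAMPLES` and `RING_SLOTS`, are all hypothetical. It shows how a fixed-size ring of audio chunks loses data once the consumer (for example, the task sending audio to a server) falls behind the microphone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SAMPLES 512 /* hypothetical chunk size, in 16-bit samples */
#define RING_SLOTS    4   /* hypothetical number of buffered chunks */

typedef struct {
    int16_t slots[RING_SLOTS][CHUNK_SAMPLES];
    size_t head;    /* next slot to write */
    size_t tail;    /* next slot to read */
    size_t count;   /* chunks currently stored */
    size_t dropped; /* chunks lost because the ring was full */
} chunk_ring_t;

void ring_init(chunk_ring_t *rb) {
    assert(rb != NULL);
    memset(rb, 0, sizeof(*rb));
}

/* Producer side (microphone): returns false and counts a drop when full. */
bool ring_push(chunk_ring_t *rb, const int16_t *chunk) {
    assert(rb != NULL && chunk != NULL);
    if (rb->count == RING_SLOTS) {
        rb->dropped++;
        return false;
    }
    memcpy(rb->slots[rb->head], chunk, sizeof(rb->slots[0]));
    rb->head = (rb->head + 1) % RING_SLOTS;
    rb->count++;
    return true;
}

/* Consumer side (network sender): returns false when nothing is buffered. */
bool ring_pop(chunk_ring_t *rb, int16_t *out) {
    assert(rb != NULL && out != NULL);
    if (rb->count == 0) {
        return false;
    }
    memcpy(out, rb->slots[rb->tail], sizeof(rb->slots[0]));
    rb->tail = (rb->tail + 1) % RING_SLOTS;
    rb->count--;
    return true;
}
```

The sketch is not thread-safe; on real firmware the producer and consumer would run in separate tasks and need a lock or a ready-made queue. Watching the `dropped` counter is one way to see whether the available RAM is enough for the network latency.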
Can we alter the wake-up time so it's always on? Meaning I say the wake-up word only once and then it stays on all the time? And can you please tell me how many KB the total code was when you uploaded it? Thank you!
After waking up through the wake-up word, it waits for commands only for a short time. It then returns to waiting mode for the wake-up word. I don't know the size of the built binary file.
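The behavior described in this reply can be sketched as a two-state machine. This is a minimal illustration, not the esp-skainet implementation: `vui_t`, `vui_step`, and the 6-second `COMMAND_WINDOW_MS` are hypothetical, and returning to the wake-word state after one accepted command is an assumption based on what commenters report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define COMMAND_WINDOW_MS 6000u /* hypothetical listening window */

typedef enum { STATE_WAIT_WAKE_WORD, STATE_WAIT_COMMAND } vui_state_t;

typedef struct {
    vui_state_t state;
    uint32_t window_start_ms; /* time at which the wake word was heard */
} vui_t;

void vui_init(vui_t *v) {
    assert(v != NULL);
    v->state = STATE_WAIT_WAKE_WORD;
    v->window_start_ms = 0;
}

/* Advance the state machine; returns true only when a command is accepted. */
bool vui_step(vui_t *v, uint32_t now_ms, bool wake_detected, bool command_detected) {
    assert(v != NULL);
    switch (v->state) {
    case STATE_WAIT_WAKE_WORD:
        /* Commands are ignored until the wake word has been heard. */
        if (wake_detected) {
            v->state = STATE_WAIT_COMMAND;
            v->window_start_ms = now_ms;
        }
        return false;
    case STATE_WAIT_COMMAND:
        if (now_ms - v->window_start_ms >= COMMAND_WINDOW_MS) {
            v->state = STATE_WAIT_WAKE_WORD; /* window expired */
            return false;
        }
        if (command_detected) {
            v->state = STATE_WAIT_WAKE_WORD; /* one command per wake-up */
            return true;
        }
        return false;
    }
    return false;
}
```

In this sketch, an "always on" mode as asked about above would amount to never leaving `STATE_WAIT_COMMAND`, at the cost of reacting to any phrase that resembles a command.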
Sir, I've ordered the same esp32s3 as of yours but I got wroom-2 instead of wroom-1 and now I am trying on it, but it's not working as per your instructions. What should I do in this situation?
I don't think there is much difference between the two, but check the ports and see if anything has changed.
@@ThatProject Thank you it worked
Thanks for your guidance and help
I just have one more query. Can you help me with it?
@@ThatProject I am unable to find the exact place or file where it accesses the GPIOs and LEDs. Can you help me with that? I've searched all the files, but even without a GPIO pin number it still accesses the LED. My code cannot proceed if I force it to change the colour, and it gets stuck in the listening part. Then for every new command I have to give the wake word again. Please guide me.
Thank you
Hi, I like this video, but where can I set up the RGB LED?
search google for "rgb led in {your country name}". Hopefully you will find it online.
When the fork was updated, the led_action.c file was removed. Sorry.
github.com/0015/esp-skainet/blob/ESP32-S3-Devkit-C/examples/en_speech_commands_recognition/main/main.c#L112C11-L112C13
line #114, just set the color of the LED Strip according to the color of the command you added.
/* Set the LED pixel using RGB from 0 (0%) to 255 (100%) for each color */
led_strip_set_pixel(led_strip, 0, 16, 16, 16);
/* Refresh the strip to send data */
led_strip_refresh(led_strip);
@@ThatProject Thank you❤
Can I use it with my ESP32? How can I interface it with an LCD to display the commands?
To use the ESP-SR framework, ESP32-S3 with PSRAM is recommended. Connecting the display is easy via SPI. Check out the projects using displays on my channel.
Can I use this framework for a project that detects speech activity and records to a file until the speech ends? It should be accurate enough to start recording only when speech is detected, not on any sound.
I think you can find a way to do it if you drill down into this framework. But I have no idea about the accuracy you mentioned. This is a generic solution and there is no way to tune it yourself. If you need a very accurate voice user interface, then you need to find other solutions.
@@ThatProject
Can you help me with this project, or can I contact you privately?
@@user-lr4yi5dz4m Sorry, but I can't afford to work on other projects at the moment.
I built and uploaded your project exactly as it is on your GitHub. My board is an esp32s3 devkit c1 N18R8, and I see this message in the terminal:
Guru Meditation Error: Core 0 panic'ed. Exception was unhandled. I don't know what the problem is.
I need more detailed information about the error.
Can I use a NodeMCU ESP8266 instead of the ESP32 in this project?
Unfortunately, it doesn't support ESP8266.
When I run the code, there is an error stating there is no CMake file to read. What should I do?
Please check if your ESP-IDF build environment is working properly.
Hello, thank you for your tutorial. I couldn't make this work on my ESP32. I have an N8R2, so I think I am running out of RAM. Do you think there is a way to solve this without purchasing a new microcontroller?
Apparently, 2 MB of RAM should be enough for WakeNet only, but when I tried that, everything was okay until this error:
assert failed: feed_Task main.c:31 (nch
It says "ESP32-S3 is recommended, which supports AI instructions and larger, high-speed octal SPI PSRAM. The new algorithms will no longer support ESP32 chips." Are you sure your ESP32 is ESP32-S3, not ESP32?
@@ThatProject Thank you for your reply! Yes, I have an ESP32-S3 N8R2.
I followed the README in your github project, using ESP-IDF CMD, because I tried to use VSCode but I got some errors regarding CMake that I couldn't solve.
I also got some errors in this CMD, but I solved it by changing some things in menuconfig (by setting flash size to 8 MB and PSRAM to quad mode).
Then the terminal shows that everything is okay apparently, but then says something about PSRAM and then reboots the ESP and runs the code again and again. Let me find the error text for you...
After setting PSRAM to 8MB, you can build, but the program does not run because your device has 2MB. I'm looking for the official documentation of the ESP-SR framework, but I can't find exactly how many MB of PSRAM is required. Do you have info on it?
Why is the sdkconfig file not created after selecting the target device?
Is ESP-IDF running normally?
When I compile the project, I see "psram not found"! My board is an esp32s3 devkitc v1.
The full board name has a memory suffix after "esp32s3 devkitc v1".
For N8R2, flash is 8MB and PSRAM is 2MB. If there is no PSRAM, the suffix has only the N part. Please check which one you have.
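The naming rule in this reply can be written down as a small parser. This is a sketch under the assumption that the suffix has the form `N<flash MB>` optionally followed by `R<PSRAM MB>`, as the reply describes; the function name `parse_module_suffix` and the struct are hypothetical, and any other trailing letters that real part numbers may carry are deliberately rejected rather than guessed at.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int flash_mb; /* number after 'N' */
    int psram_mb; /* number after 'R', or 0 when there is no 'R' part */
} module_memory_t;

/* Parse a suffix such as "N8R2" or "N8". Returns false on any other form. */
bool parse_module_suffix(const char *s, module_memory_t *out) {
    assert(out != NULL);
    if (s == NULL || *s != 'N') {
        return false;
    }
    s++;
    if (!isdigit((unsigned char)*s)) {
        return false;
    }
    int flash = 0;
    while (isdigit((unsigned char)*s)) {
        flash = flash * 10 + (*s - '0');
        s++;
    }
    int psram = 0;
    if (*s == 'R') {
        s++;
        if (!isdigit((unsigned char)*s)) {
            return false;
        }
        while (isdigit((unsigned char)*s)) {
            psram = psram * 10 + (*s - '0');
            s++;
        }
    }
    if (*s != '\0') {
        return false; /* unknown trailing characters */
    }
    out->flash_mb = flash;
    out->psram_mb = psram;
    return true;
}
```

For example, "N8R2" yields 8 MB flash and 2 MB PSRAM, while "N8" yields 8 MB flash and no PSRAM, which is the case where the "psram not found" message would be expected.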
this is very impressive. I've had the opportunity to follow your master class series with great interest, and I'm truly inspired by the depth of knowledge and practical insights you've shared. To further enhance my understanding and skills, I'm considering developing a case study based on the concepts taught, specifically focusing on the LED control mechanisms you've discussed.
As part of this endeavor, I'm particularly interested in the led_actions.h file and its integration within the project. I understand that it plays a crucial role in managing the LED's behaviors, such as dimming and color setting, via the ESP32-S3's LED PWM Controller (LEDC). However, I find myself seeking more clarity on how exactly to structure this header file and implement its functionalities effectively. Let me know. Thanks a lot for your master class
Control the LED color using the led_strip component. Please take a look at this. github.com/UncleRus/esp-idf-lib/blob/master/components/led_strip/led_strip.c
@@ThatProject This is super interesting. Very advanced for me... Let me know if there is a plan to make this project available. It would be super cool! Thanks.
Does it work offline, or does it need a working Internet connection?
Yes, this is an offline solution.
Good day Sir, kindly guide me on editing the code for RGB LED control. The default code is not responding. I used your same example code. Thank you, Sir.
First, it seems necessary to test why the basic code does not work. Is your mic working properly?
Have you updated this repository since? I cannot find the Base_speech_commands folder or the led_actions.h header file. Please help me, I want to implement it on my ESP32-S3 T3-CAM board. I have IDF 5.0 installed. I am also working with the Espressif IDE. Thanks
Unfortunately, the LED source code was deleted when the original branch was updated. Please use en_speech_commands_recognition.
@@ThatProject Thank you for your help. I still cannot make my LED respond to my voice. What did I miss?
@@amadouroufai2337 czcams.com/video/3XbnzfBjmZk/video.htmlsi=CjHs33NP9_8MPETo&t=396
Here you can see that the set_led_color function is executed according to the index of the command. You can implement this part to change the color of WS2812.
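One way to structure that mapping is a lookup table from command index to color. This is only a sketch: the table contents, `rgb_t`, and `color_for_command` are hypothetical and are not taken from the repository. The fallback of (16, 16, 16) simply mirrors the values used in the led_strip_set_pixel snippet quoted earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t r, g, b; /* each channel from 0 (0%) to 255 (100%) */
} rgb_t;

/* Hypothetical command table; indices and colors are illustrative only. */
static const rgb_t COMMAND_COLORS[] = {
    {16, 0, 0}, /* command 0: red */
    {0, 16, 0}, /* command 1: green */
    {0, 0, 16}, /* command 2: blue */
};
#define NUM_COMMAND_COLORS (sizeof(COMMAND_COLORS) / sizeof(COMMAND_COLORS[0]))

static const rgb_t DEFAULT_COLOR = {16, 16, 16};

/* Look up the color for a recognized command index; unknown indices get the default. */
void color_for_command(int command_id, rgb_t *out) {
    assert(out != NULL);
    if (command_id < 0 || (size_t)command_id >= NUM_COMMAND_COLORS) {
        *out = DEFAULT_COLOR;
        return;
    }
    *out = COMMAND_COLORS[command_id];
}
```

On the device, the result would then be passed to led_strip_set_pixel(led_strip, 0, c.r, c.g, c.b) followed by led_strip_refresh(led_strip), as in the snippet quoted above.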
@@ThatProject Thank you very much!
Is the voice recognition user-dependent?
No, it is not user-dependent. It is for general use.
Will this still work if I use the "ESP32 Development Board | Doit DevKit V1"
instead of the ESP32-S3?
Or would I run into storage issues with it? Or any other possible issues?
Please, anyone, give me feedback if possible.
As far as I know, DOIT ESP32 DEVKIT V1 Board doesn't have PSRAM. At least your board must have PSRAM to run ESP-SR.
@@ThatProject OK Sir, thank you.
And is there any other cheap alternative like the DOIT DevKit V1 instead of this ESP32-S3?
@@omkarbansode6305 ESP32-S3-DevKitC-1-N8R8 is only $15. Sometimes, using the official one can make development very convenient. amzn.to/4b2Wf5E
@@ThatProject Thank you for your help, I will definitely try to buy this one.
Hey, I have a problem: when the wake word is detected, the board resets and doesn't listen for the command.
What board do you use? Is it equipped with PSRAM?
I have the same problem.
The problem was solved. I changed the first PSRAM parameter in sdkconfig and built the project. It did not work. Then I returned this parameter to its previous value, and now it works =)
@@zhdanvadim9536 What did you do? Please explain carefully.
@@yeaboi726 I double-clicked the sdkconfig file. The settings manager appeared. I found the parameters related to PSRAM and changed one to another value. I saved the sdkconfig file, and at this point the settings were rebuilt. Then I returned the parameter to its previous value and saved the config file again. After this the compilation was successful.
Hi, I tried your code, but the ESP keeps rebooting. Can you please provide the exact sdkconfig file that you used?
After I shared this project, esp-skainet was updated a lot. So I think you should refer to the official page. github.com/espressif/esp-skainet/tree/master/examples/en_speech_commands_recognition
Can you make some time to correct the code? I am using an ESP32-S3 DevKitC-1 board and I wish to have this assistant working. Please help me proceed.
Is there a way we can connect and try to resolve this issue? If you're fine with it, I can share my email ID here and then we can connect on any platform for discussion.
Hi, I am new to the ESP32-S3 module and I can't get it to work. Can you help me out?
Please take a look at the ESP-IDF environment configuration and ESP-SR. (Googling)
Hi, I use ESP-IDF 5.1.2 and an ESP32-S3. When I try to specify the device target, it shows a message like " ... doesn't seem Cmake build the directory ...". I tried following other people's steps but it's still the same. Do you think there is another way to solve this problem?
CMake is an essential build system in IDF. Is it possible to build the basic example in your environment?
@@ThatProject I can build hello world from the command line, and it shows the new project folder in the esp-idf directory. I tried to create a new project and move all the files into it, but that doesn't work.
If that's the case, there's no issue with your build environment. I need more clues to figure it out.
@@ThatProject
CMake Warning (dev) in CMakeLists.txt:
No cmake_minimum_required command is present. A line of code such as
cmake_minimum_required(VERSION 3.28)
This shows on cmd when I try to set the target, but I do have CMake 3.28 already.
@@ThatProject After I changed cmake_minimum_required I can set the target without a problem, but I'm still having a problem with sdkconfig: save-defconfig'" terminated with exit code: 2.
The wake-up word is amazing. I am curious: is it possible to use only the wake-up word to open a voice stream to some external destination? The reason is that the ESP supports only English and Chinese. Edit: OK, I think the answer is in your videos... interesting... thx
Here's an example using Wake Up only. Based on this I guess you can add anything you want. Check this out.
github.com/espressif/esp-skainet/tree/master/examples/wake_word_detection
@@ThatProject thank you!
Can I use the Arduino framework?
If you look at the latest Arduino ESP32 core, you can see that ESP-SR has been added. So the basics seem to be possible. I hope you give this a try. github.com/espressif/arduino-esp32/blob/master/libraries/ESP_SR/examples/Basic/Basic.ino
Does it work with the Arduino IDE?
I don't think so.
Hey, why does my ESP32-S3 keep resetting after uploading the code?
Please check your sdkconfig settings again. It also requires the use of PSRAM.
@@ThatProject I enabled all the PSRAM features, and my board also supports 8MB PSRAM.
@@yeaboi726 I'm not sure what the problem is with this limited information.
Can I use an ESP32 DevKit board?
It can be used, but it seems that the skainet source code needs to be modified.
@@ThatProject Please modify it, please.
@@ThatProject How?
Is it possible to use an ESP8266?
Is this online or offline?
It's an offline solution.
Why am I not able to find the file for the DevKitC?
Are you sure you're trying this branch? github.com/0015/esp-skainet/tree/ESP32-S3-Devkit-C
@@ThatProject It gives me this: Permission denied (publickey)
I'm using Ubuntu 22.04 LTS.
@@denisskolcins6766 Just download and use it.
Green button [Code] -> Download Zip
@@ThatProject Thank you for the help, have a nice day.
Is it offline or not?
Thank you
It's offline, standalone.
Can it learn a new language?
Currently, only Chinese and English are supported. You can add new voice commands under these languages.