Voice control system using a Raspberry Pi Zero and Bluetooth microphone.
Uses Pocketsphinx for offline speech recognition and a text file with wake words to run commands such as take a picture or video from an attached USB camera.
On the more powerful Raspberry Pi systems, wake word detection is a lot easier. The Pi Zero has limited computing resources, but wake word detection is still possible with this lightweight speech-to-text library.
Start with a fresh install of Bullseye by downloading the Raspberry Pi Imager from here and selecting ‘Raspberry Pi OS Lite (32-bit)’ from the ‘(other)’ options. Click the gear icon to set up the Wi-Fi, username and password.
Log into the Pi over SSH (see the section ‘SSH Remote Access Setup’ in this tutorial Headless Install of Raspberry Pi OS Lite for more help with this).
The current version of Raspberry Pi OS Lite (Bullseye) doesn’t have BlueALSA (Bluetooth Audio ALSA Backend) available in the repositories but it is available in the next version (Bookworm). The repository can be added to the Bullseye installation and the package installed with the following four commands:
echo "deb http://archive.raspbian.org/raspbian/ bookworm main" | sudo tee /etc/apt/sources.list.d/armbian.list
printf 'Package: *\nPin: release n=bookworm\nPin-Priority: 100\n' | sudo tee --append /etc/apt/preferences.d/limit-bookworm
sudo apt update
sudo apt install bluez-alsa-utils
When the installation has completed, run the following command to start the service:
sudo systemctl start bluetooth.service
You should now be ready to set up your Bluetooth microphone. Make sure the microphone is charged and switch it on. Type the following commands into the shell (replace XX:XX:XX:XX:XX:XX with the device address you see after the scan):
sudo bluetoothctl
scan on
pair XX:XX:XX:XX:XX:XX
trust XX:XX:XX:XX:XX:XX
connect XX:XX:XX:XX:XX:XX
exit
aplay -D bluealsa:DEV=XX:XX:XX:XX:XX:XX,PROFILE=a2dp /usr/share/sounds/alsa/*
You should hear the test sounds play back successfully. Now edit the asoundrc file. Open it with
sudo nano ~/.asoundrc
and paste these two lines (again replace XXXXX with the device ID):
aplay -D bluealsa /usr/share/sounds/alsa/*
Installation of Pocketsphinx
Type the following commands to install Pocketsphinx (https://pypi.org/project/pocketsphinx/). The second command can take a while.
sudo apt-get install -qq python3 python3-dev python3-pip build-essential swig git libpulse-dev libasound2-dev
pip3 install pocketsphinx
Open the init file for Pocketsphinx. The python3.9 directory will depend on the version of python installed:
sudo nano /home/pi/.local/lib/python3.9/site-packages/sphinxbase/__init__.py
and edit to look like this:
from .ad_alsa import *
from .sphinxbase import *
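Because the python3.9 part of the path changes with the installed Python version, it can help to ask Python where the package lives before opening the file. The snippet below is a minimal sketch using only the standard library; the helper name package_init_path is made up for illustration, and it assumes the installed Pocketsphinx release ships a sphinxbase package, as the path above implies.

```python
import importlib.util


def package_init_path(package_name):
    """Return the path to a package's __init__.py, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(package_name)
    except (ImportError, ValueError):
        return None
    if spec is None or spec.origin is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Prints the file to open in nano, or None if sphinxbase is not installed.
    print(package_init_path("sphinxbase"))
```

If this prints None, the package is either not installed for the Python interpreter you ran, or the installed release is laid out differently.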
Installation of USB Webcam Libraries
To install the libraries needed to access the USB webcam, paste the following lines into the shell:
sudo apt install fswebcam
sudo apt install ffmpeg
Setting up the Wake Word Listening
To set up the Pi Zero to listen for wake words (voice commands), first edit the keyphrase list with the hotwords/phrases you want to use. More information on this can be found here: https://cmusphinx.github.io/wiki/faq/#q-how-to-implement-hot-word-listening
Open the keyphrase.list file:
sudo nano ~/keyphrase.list
Add the phrases in the same format as below:
oh mighty computer /1e-40/
hello world /1e-30/
make video /1e-40/
take picture /1e-30/
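Each entry is a phrase followed by a detection threshold between slashes, one entry per line. A formatting slip in this file is easy to miss, so a quick check before starting the listener can save some debugging. The following is a rough sketch only; parse_keyphrases and LINE_PATTERN are names invented for this example and are not part of Pocketsphinx.

```python
import re

# One entry per line: the phrase, then the threshold wrapped in slashes.
LINE_PATTERN = re.compile(r"^(?P<phrase>.+?)\s*/(?P<threshold>[^/]+)/\s*$")


def parse_keyphrases(text):
    """Parse keyphrase list text into a {phrase: threshold} dictionary.

    Raises ValueError if a non-empty line does not look like 'phrase /threshold/'.
    """
    entries = {}
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        match = LINE_PATTERN.match(stripped)
        if match is None:
            raise ValueError(f"line {number}: expected 'phrase /threshold/', got {line!r}")
        entries[match.group("phrase")] = float(match.group("threshold"))
    return entries


if __name__ == "__main__":
    sample = "oh mighty computer /1e-40/\ntake picture /1e-30/\n"
    for phrase, threshold in parse_keyphrases(sample).items():
        print(phrase, threshold)
```

To check the real file, read it with open() and pass its contents to parse_keyphrases.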
Python Code for Wake Word Listening and Actions
The code below continuously listens for the wake words and prints the recognised phrase to the shell. If the recognised phrase is ‘make video’, a 10-second recording is taken from the USB camera. If the recognised phrase is ‘take picture’, a photo is taken by the USB camera.
Create listen.py with:
sudo nano ~/listen.py
and paste the following script. Check the tab formatting – it might need correcting in Nano. Also this works for me but you might need to change some settings depending on the camera.
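import os
from pocketsphinx import LiveSpeech

# Listen on the BlueALSA device in keyword-spotting mode (no language model)
speech = LiveSpeech(audio_device="bluealsa", lm=False, kws="keyphrase.list")

for phrase in speech:
    print(phrase.segments(detailed=True))
    print(phrase)
    # Convert the result to text and drop the trailing space before comparing
    command = str(phrase).strip()
    if command == "take picture":
        # Take a single photo with the USB camera
        os.system(
            "fswebcam -r 640x480 --no-banner --verbose --log output.log /home/pi/image.jpg"
        )
    if command == "make video":
        # Record 10 seconds of video from the USB camera
        os.system(
            "ffmpeg -f v4l2 -video_size 640x480 -i /dev/video0 -t 10 -y -loglevel 0 /home/pi/output.avi"
        )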
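If you add more voice commands, the chain of if statements gets long. One possible alternative, sketched here under my own assumptions, is a lookup table that maps each phrase to its shell command. The names COMMANDS, command_for and run_action are hypothetical, the fswebcam command is trimmed to its basic options, and subprocess.run is swapped in for os.system so the command is passed as an argument list.

```python
import shlex
import subprocess

# Hypothetical lookup table: recognised phrase -> shell command to run.
COMMANDS = {
    "take picture": "fswebcam -r 640x480 --no-banner /home/pi/image.jpg",
    "make video": (
        "ffmpeg -f v4l2 -video_size 640x480 -i /dev/video0 "
        "-t 10 -y -loglevel 0 /home/pi/output.avi"
    ),
}


def command_for(phrase_text):
    """Return the argument list for a recognised phrase, or None if there is no action."""
    command = COMMANDS.get(phrase_text.strip().lower())
    return shlex.split(command) if command else None


def run_action(phrase_text):
    """Run the command mapped to the phrase. Returns True if a command was started."""
    argv = command_for(phrase_text)
    if argv is None:
        return False
    subprocess.run(argv, check=False)
    return True
```

Inside the listening loop you would call run_action(str(phrase)) in place of the if statements. Phrases without an entry in the table, such as ‘hello world’, are simply ignored.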
Attach the USB camera. You might have lost Bluetooth by this stage. Just press the connect button on the device.
In the shell, from the home directory where listen.py and keyphrase.list were saved, paste:
python3 listen.py
Try speaking the wake words and watch the shell for messages. Check the SD card for the recordings.
Using the System After a Reboot
I’ve not worked out how to get the Pi to reconnect to the Bluetooth device during boot. In a previous tutorial (Raspberry Pi Zero W Bluetooth Text to Speech) I used Systemd to start the service, but I’ve not had success with this on Bullseye. However, you can just wait for the Pi to finish booting, then press and hold the connect button on the Bluetooth microphone and it will connect. Paste the following in the shell to check the connection status:
sudo bluetoothctl info XX:XX:XX:XX:XX:XX
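The output of that command includes a line starting with ‘Connected:’ followed by yes or no. If you would rather have a script check this before it starts listening, something like the sketch below may work. The function names is_connected and query_device are made up for this example, and it assumes bluetoothctl is on the path and accepts the info command non-interactively, as in the line above.

```python
import subprocess


def is_connected(info_text):
    """Return True if bluetoothctl info output contains 'Connected: yes'."""
    for line in info_text.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "Connected":
            return value.strip() == "yes"
    return False


def query_device(address):
    """Run 'bluetoothctl info <address>' and report whether the device is connected."""
    result = subprocess.run(
        ["bluetoothctl", "info", address],
        capture_output=True,
        text=True,
        check=False,
    )
    return is_connected(result.stdout)
```

You could call query_device("XX:XX:XX:XX:XX:XX") in a loop with a short sleep, and only start listen.py once it returns True.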
This page helped me a lot with getting BlueALSA running on Bullseye: https://www.sigmdel.ca/michel/ha/rpi/bluetooth_in_rpios_02_en.html