MAJORDOMO CONNECT

Building a voice assistant with the WM8960 Audio HAT and a Raspberry Pi Zero W

We use mdmTerminal2, MPD, and gmediarender. A ready-made image will be published after debugging and testing.

Introduction
Off-the-shelf voice assistants are tied to their manufacturer's infrastructure, which imposes its own limitations. The developers of mdmTerminal2 did an excellent job with their project. It supports the majordroid API, so it can hold a complete dialogue with the user.

For this project we need to:
1) Carefully read https://github.com/Aculeasis/mdmTerminal2
2) Buy a Raspberry Pi Zero WH https://ru.aliexpress.com/item/32835306636.html?sp... 1500 RUB
3) Buy a WM8960_Audio_HAT https://ru.aliexpress.com/item/32957011775.html?sp... 1400 RUB

Drivers and instructions for the audio HAT are here: https://github.com/respeaker/seeed-voicecard/blob/...

They did not install on Ubuntu for ARM, though. They will most likely work on Armbian.

Just in case, the drivers are also here (they did not work on Ubuntu ARM).
Installing the drivers for the WM8960_Audio_HAT:
https://www.waveshare.com/w/upload/5/54/WM8960_Aud...
https://github.com/waveshare/WM8960-Audio-HAT

 git clone https://github.com/waveshare/WM8960-Audio-HAT
 cd WM8960-Audio-HAT
 sudo ./install.sh
 sudo reboot

I did not experiment any further and found a ready-made Raspbian image from the vendor of the WM8960_Audio_HAT board.
Download the Chinese distribution https://v2.fangcloud.com/share/7395fd138a1cab496fd... (sound works on it).

Test recording and playback

Run the command below and say something for 5 seconds:

arecord -d 5 -f S16_LE -r 16000 __.wav && aplay __.wav && rm __.wav

Install mdmTerminal2 https://github.com/Aculeasis/mdmTerminal2

    cd ~/
    git clone https://github.com/Aculeasis/mdmTerminal2
    cd mdmTerminal2
    ./scripts/install.sh

Install mpd

sudo apt-get install mpd

Config file /etc/mpd.conf:

# An example configuration file for MPD.
# Read the user manual for documentation: http://www.musicpd.org/doc/user/
# or /usr/share/doc/mpd/user-manual.html

# Files and directories #######################################################
#
# This setting controls the top directory which MPD will search to discover the
# available audio files and add them to the daemon's online database. This 
# setting defaults to the XDG directory, otherwise the music directory will
# be disabled and audio files will only be accepted over ipc socket (using
# file:// protocol) or streaming files over an accepted protocol.
#
music_directory     "/home/pi/music"
#
# This setting sets the MPD internal playlist directory. The purpose of this
# directory is storage for playlists created by MPD. The server will use 
# playlist files not created by the server but only if they are in the MPD
# format. This setting defaults to playlist saving being disabled.
#
playlist_directory      "/home/pi/playlists"
#
# This setting sets the location of the MPD database. This file is used to
# load the database at server start up and store the database while the 
# server is not up. This setting defaults to disabled which will allow
# MPD to accept files over ipc socket (using file:// protocol) or streaming
# files over an accepted protocol.
#
db_file         "/var/lib/mpd/tag_cache"
# 
# These settings are the locations for the daemon log files for the daemon.
# These logs are great for troubleshooting, depending on your log_level
# settings.
#
# The special value "syslog" makes MPD use the local syslog daemon. This
# setting defaults to logging to syslog, otherwise logging is disabled.
#
log_file            "/var/log/mpd/mpd.log"
#
# This setting sets the location of the file which stores the process ID
# for use of mpd --kill and some init scripts. This setting is disabled by
# default and the pid file will not be stored.
#
pid_file            "/home/pi/mpd/pid"
#
# This setting sets the location of the file which contains information about
# most variables to get MPD back into the same general shape it was in before
# it was brought down. This setting is disabled by default and the server 
# state will be reset on server start up.
#
state_file          "/var/lib/mpd/state"
#
# The location of the sticker database.  This is a database which
# manages dynamic information attached to songs.
#
sticker_file                   "/var/lib/mpd/sticker.sql"
#
###############################################################################

# General music daemon options ################################################
#
# This setting specifies the user that MPD will run as. MPD should never run as
# root and you may use this setting to make MPD change its user ID after
# initialization. This setting is disabled by default and MPD is run as the
# current user.
#
user                "pi"
#
# This setting specifies the group that MPD will run as. If not specified
# primary group of user specified with "user" setting will be used (if set).
# This is useful if MPD needs to be a member of group such as "audio" to
# have permission to use sound card.
#
#group                          "nogroup"
#
# This setting sets the address for the daemon to listen on. Careful attention
# should be paid if this is assigned to anything other than the default, any.
# This setting can deny access to control of the daemon. Choose any if you want
# to have mpd listen on every address. Not effective if systemd socket
# activation is in use.
#
# For network
bind_to_address     "0.0.0.0"
#
# And for Unix Socket
#bind_to_address        "/run/mpd/socket"
#
# This setting is the TCP port that is desired for the daemon to get assigned
# to.
#
port                "6600"
#
# This setting controls the type of information which is logged. Available 
# setting arguments are "default", "secure" or "verbose". The "verbose" setting
# argument is recommended for troubleshooting, though can quickly stretch
# available resources on limited hardware storage.
#
log_level           "verbose"
#
# If you have a problem with your MP3s ending abruptly it is recommended that 
# you set this argument to "no" to attempt to fix the problem. If this solves
# the problem, it is highly recommended to fix the MP3 files with vbrfix
# (available as vbrfix in the debian archive), at which
# point gapless MP3 playback can be enabled.
#
#gapless_mp3_playback           "yes"
#
# Setting "restore_paused" to "yes" puts MPD into pause mode instead
# of starting playback after startup.
#
#restore_paused "no"
#
# This setting enables MPD to create playlists in a format usable by other
# music players.
#
#save_absolute_paths_in_playlists   "no"
#
# This setting defines a list of tag types that will be extracted during the 
# audio file discovery process. The complete list of possible values can be
# found in the mpd.conf man page.
#metadata_to_use    "artist,album,title,track,name,genre,date,composer,performer,disc"
#
# This setting enables automatic update of MPD's database when files in 
# music_directory are changed.
#
#auto_update    "yes"
#
# Limit the depth of the directories being watched, 0 means only watch
# the music directory itself.  There is no limit by default.
#
#auto_update_depth "3"
#
###############################################################################

# Symbolic link behavior ######################################################
#
# If this setting is set to "yes", MPD will discover audio files by following 
# symbolic links outside of the configured music_directory.
#
#follow_outside_symlinks    "yes"
#
# If this setting is set to "yes", MPD will discover audio files by following
# symbolic links inside of the configured music_directory.
#
#follow_inside_symlinks     "yes"
#
###############################################################################

# Zeroconf / Avahi Service Discovery ##########################################
#
# If this setting is set to "yes", service information will be published with
# Zeroconf / Avahi.
#
#zeroconf_enabled       "yes"
#
# The argument to this setting will be the Zeroconf / Avahi unique name for
# this MPD server on the network.
#
#zeroconf_name          "Music Player"
#
###############################################################################

# Permissions #################################################################
#
# If this setting is set, MPD will require password authorization. The password
# setting can be specified multiple times for different password profiles.
#
#password                        "password@read,add,control,admin"
#
# This setting specifies the permissions a user has who has not yet logged in. 
#
#default_permissions             "read,add,control,admin"
#
###############################################################################

# Database #######################################################################
#

#database {
#       plugin "proxy"
#        plugin "upnp"
#       host "other.mpd.host"
#       port "6600"
#}

# Input #######################################################################
#

input {
        plugin "curl"
#       proxy "proxy.isp.com:8080"
#       proxy_user "user"
#       proxy_password "password"
}

#
###############################################################################

# Audio Output ################################################################
#
# MPD supports various audio output types, as well as playing through multiple 
# audio outputs at the same time, through multiple audio_output settings 
# blocks. Setting this block is optional, though the server will only attempt
# autodetection for one sound card.
#
# An example of an ALSA output:
#
audio_output {
    type        "alsa"
    name        "My ALSA Device"
#   device      "hw:0,0"    # optional
    mixer_type      "software"      # optional
#   mixer_device    "default"   # optional
#   mixer_control   "PCM"       # optional
#   mixer_index "0"     # optional
}
#
# An example of an OSS output:
#
#audio_output {
#   type        "oss"
#   name        "My OSS Device"
#   device      "/dev/dsp"  # optional
#   mixer_type      "hardware"      # optional
#   mixer_device    "/dev/mixer"    # optional
#   mixer_control   "PCM"       # optional
#}
#
# An example of a shout output (for streaming to Icecast):
#
#audio_output {
#   type        "shout"
#   encoder     "vorbis"        # optional
#   name        "My Shout Stream"
#   host        "localhost"
#   port        "8000"
#   mount       "/mpd.ogg"
#   password    "hackme"
#   quality     "5.0"
#   bitrate     "128"
#   format      "44100:16:1"
#   protocol    "icecast2"      # optional
#   user        "source"        # optional
#   description "My Stream Description" # optional
#   url             "http://example.com"    # optional
#   genre       "jazz"          # optional
#   public      "no"            # optional
#   timeout     "2"         # optional
#   mixer_type      "software"              # optional
#}
#
# An example of a recorder output:
#
#audio_output {
#       type            "recorder"
#       name            "My recorder"
#       encoder         "vorbis"                # optional, vorbis or lame
#       path            "/var/lib/mpd/recorder/mpd.ogg"
##      quality         "5.0"                   # do not define if bitrate is defined
#       bitrate         "128"                   # do not define if quality is defined
#       format          "44100:16:1"
#}
#
# An example of a httpd output (built-in HTTP streaming server):
#
#audio_output {
#   type        "httpd"
#   name        "My HTTP Stream"
#   encoder     "vorbis"        # optional, vorbis or lame
#   port        "8000"
#   bind_to_address "0.0.0.0"               # optional, IPv4 or IPv6
#   quality     "5.0"           # do not define if bitrate is defined
#   bitrate     "128"           # do not define if quality is defined
#   format      "44100:16:1"
#   max_clients     "0"                     # optional 0=no limit
#}
#
# An example of a pulseaudio output (streaming to a remote pulseaudio server)
# Please see README.Debian if you want mpd to play through the pulseaudio
# daemon started as part of your graphical desktop session!
#
#audio_output {
#   type        "pulse"
#   name        "My Pulse Output"
#   server      "remote_server"     # optional
#   sink        "remote_server_sink"    # optional
#}
#
# An example of a winmm output (Windows multimedia API).
#
#audio_output {
#   type        "winmm"
#   name        "My WinMM output"
#   device      "Digital Audio (S/PDIF) (High Definition Audio Device)" # optional
#       or
#   device      "0"     # optional
#   mixer_type  "hardware"  # optional
#}
#
# An example of an openal output.
#
#audio_output {
#   type        "openal"
#   name        "My OpenAL output"
#   device      "Digital Audio (S/PDIF) (High Definition Audio Device)" # optional
#}
#
## Example "pipe" output:
#
#audio_output {
#   type        "pipe"
#   name        "my pipe"
#   command     "aplay -f cd 2>/dev/null"
## Or if you want to use AudioCompress
#   command     "AudioCompress -m | aplay -f cd 2>/dev/null"
## Or to send raw PCM stream through PCM:
#   command     "nc example.org 8765"
#   format      "44100:16:2"
#}
#
## An example of a null output (for no audio output):
#
#audio_output {
#   type        "null"
#   name        "My Null Output"
#   mixer_type      "none"                  # optional
#}
#
# If MPD has been compiled with libsamplerate support, this setting specifies 
# the sample rate converter to use.  Possible values can be found in the 
# mpd.conf man page or the libsamplerate documentation. By default, this
# setting is disabled.
#
#samplerate_converter       "Fastest Sinc Interpolator"
#
###############################################################################

# Normalization automatic volume adjustments ##################################
#
# This setting specifies the type of ReplayGain to use. This setting can have
# the argument "off", "album", "track" or "auto". "auto" is a special mode that
# chooses between "track" and "album" depending on the current state of
# random playback. If random playback is enabled then "track" mode is used.
# See <http://www.replaygain.org> for more details about ReplayGain.
# This setting is off by default.
#
#replaygain         "album"
#
# This setting sets the pre-amp used for files that have ReplayGain tags. By
# default this setting is disabled.
#
#replaygain_preamp      "0"
#
# This setting sets the pre-amp used for files that do NOT have ReplayGain tags.
# By default this setting is disabled.
#
#replaygain_missing_preamp  "0"
#
# This setting enables or disables ReplayGain limiting.
# MPD calculates actual amplification based on the ReplayGain tags
# and replaygain_preamp / replaygain_missing_preamp setting.
# If replaygain_limit is enabled MPD will never amplify audio signal
# above its original level. If replaygain_limit is disabled such amplification
# might occur. By default this setting is enabled.
#
#replaygain_limit       "yes"
#
# This setting enables on-the-fly normalization volume adjustment. This will
# result in the volume of all playing audio to be adjusted so the output has 
# equal "loudness". This setting is disabled by default.
#
#volume_normalization       "no"
#
###############################################################################

# Character Encoding ##########################################################
#
# If file or directory names do not display correctly for your locale then you 
# may need to modify this setting.
#
filesystem_charset      "UTF-8"
#
# This setting controls the encoding that ID3v1 tags should be converted from.
#
id3v1_encoding          "UTF-8"
#
###############################################################################

# SIDPlay decoder #############################################################
#
# songlength_database:
#  Location of your songlengths file, as distributed with the HVSC.
#  The sidplay plugin checks this for matching MD5 fingerprints.
#  See http://www.c64.org/HVSC/DOCUMENTS/Songlengths.faq
#
# default_songlength:
#  This is the default playing time in seconds for songs not in the
#  songlength database, or in case you're not using a database.
#  A value of 0 means play indefinitely.
#
# filter:
#  Turns the SID filter emulation on or off.
#
#decoder {
#       plugin                  "sidplay"
#       songlength_database     "/media/C64Music/DOCUMENTS/Songlengths.txt"
#       default_songlength      "120"
#       filter "true"
#}
#
###############################################################################

Running MPD as a daemon did not work for me.

I started it with root privileges instead, via

sudo crontab -e

and added the line

@reboot mpd

You can test MPD with the console client mpc:
sudo apt-get install mpc

mpc clear
mpc add https://online.pilotfm.ru/pilot
mpc play
mpc volume 90
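
For readers who want to script the same test without mpc, here is a minimal sketch of what the session above does at the protocol level. It assumes MPD listens on 127.0.0.1:6600 (the port from the mpd.conf listing); the helper names format_command and run_commands are made up for this example and are not part of MPD or mpc.

```python
import socket


def format_command(name, *args):
    """Build one MPD protocol line: the command name plus quoted arguments."""
    quoted = ['"' + str(a).replace("\\", "\\\\").replace('"', '\\"') + '"'
              for a in args]
    return " ".join([name] + quoted) + "\n"


def run_commands(commands, host="127.0.0.1", port=6600, timeout=5.0):
    """Send each command and collect the status line that ends its response."""
    results = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        stream = sock.makefile("rw", encoding="utf-8", newline="\n")
        greeting = stream.readline()  # MPD greets with "OK MPD <version>"
        if not greeting.startswith("OK MPD"):
            raise RuntimeError("not an MPD server: " + greeting.strip())
        for line in commands:
            stream.write(line)
            stream.flush()
            # Skip "key: value" payload lines until "OK" or "ACK ...".
            while True:
                reply = stream.readline()
                if not reply or reply == "OK\n" or reply.startswith("ACK "):
                    break
            results.append(reply.strip())
    return results


# The same sequence as the mpc session above (mpc volume maps to setvol).
RADIO_TEST = [
    format_command("clear"),
    format_command("add", "https://online.pilotfm.ru/pilot"),
    format_command("play"),
    format_command("setvol", 90),
]
```

Calling run_commands(RADIO_TEST) on the Pi should return one status line per command; any entry starting with ACK carries MPD's error message for that step.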

How to play music
First, copy your music (mp3 files) into /var/lib/mpd/music (or ~/music, depending on the path you set as music_directory in mpd.conf).

Playing a file:
Clear the playlist and update the database:
mpc clear
mpc update
Add music to the playlist:
mpc add <folder or file name>
mpc add <folder or file name>
Start playback and set the volume:
mpc play
mpc volume 50
Here 50 means 50% of the maximum volume. You can raise it by 10% with mpc volume +10 or lower it with mpc volume -10. For volume control to work, mixer_type "software" must be set in the audio_output section of mpd.conf.

Restoring the volume level at startup

Save the levels:

alsactl -f /home/pi/.asound.state store

Restore the levels:

alsactl -f /home/pi/.asound.state restore

To restore them automatically, add this line to /etc/rc.local:
sleep 10 && /usr/sbin/alsactl -f /home/pi/.asound.state restore

To save them automatically every 5 minutes, run
sudo crontab -e
and add
*/5 * * * * /usr/sbin/alsactl -f /home/pi/.asound.state store

To control the volume of our sound card, I set the following in settings.ini (the relevant part is the [volume] section):

[settings]
ip = 192.168.1.135
sensitivity = 0.7
alarmkwactivated = on
providertts = yandex
first_love = on
chrome_mode = on
last_love = off
no_background_play = off
lazy_record = off
blocking_listener = on
software_player = 
alarm_recognized = off
lang_check = off
lang = ru
say_stt_error = off
chrome_choke = off
alarmstt = off
optimistic_nonblock_tts = on
providerstt = google
chrome_alarmstt = off
mic_index = -1
quiet = off
phrase_time_limit = 0
audio_gain = 1.0
alarmtts = off
no_hello = off
ask_me_again = 0

[plugins]
blacklist = 
enable = on
blacklist_on_failure = off
whitelist = 

[volume]
line_out = Playback
card = 1

[persons]

[update]
pip = on
fallback = on
turnoff = -1
apt = off
interval = 0

[aws]
speaker = Tatyana
access_key_id = 
boto3 = off
region = eu-central-1
secret_access_key = 

[google]
slow = off

[cache]
tts_priority = 1
tts_size = 100
path = /home/pi/mdmTerminal2/src/tts_cache

[smarthome]
ip = 192.168.1.39
outgoing_socket = 
password = 
terminal = 
unsafe_rpc = off
object_name = 
disable_http = off
username = 
object_method = 
heartbeat_timeout = 0
disable_server = off
allow_addresses = 
token = 

[music]
ip = 192.168.1.135
port = 6611
control = on
smoothly = off
username = pi
pause = off
type = mpd
quieter = 0
lms_player = 
wait_resume = 0
password = 

[listener]
no_listen_music = off
silent_multiplier = 1.0
vad_mode = snowboy
vad_lvl = 0
energy_lvl = 0
vad_chrome = 
energy_dynamic = on
stream_recognition = on
speech_timeout = 3

[yandex]
grpc = off
api = 1
apikeystt = 
apikeytts = 
speaker = alyss
emotion = good
speed = 1.0

[azure]
speaker = EkaterinaRUS
region = westus

[system]
ini_version = 42
ws_token = token_is_unset

[snowboy]
token = d4977cf8ff6ede6efb8d2277c1608c7dbebf18a7
age_group = 30_39
microphone = mic
name = unknown
gender = M
clear_models = off

[noise_suppression]
conservative = off
snowboy_apply_frontend = off
ns_lvl = 0
enable = off

[log]
print_ms = on
print_lvl = debug
file = /home/pi/mdmTerminal2/src/mdmterminal.log
remote_log = on
file_lvl = debug
method = 3

[rhvoice]
speaker = anna

[proxy]
monkey_patching = on
enable = 0
proxy = socks5h://127.0.0.1:9050

[pocketsphinx-rest]
server = http://127.0.0.1:8085

[rhvoice-rest]
speaker = anna
rate = 50
server = http://127.0.0.1:8080
volume = 50
pitch = 50

[models]
allow = 
model1.pmdl = Алиса
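
One thing worth checking in these listings: mpd.conf above sets port "6600", while [music] in settings.ini uses port = 6611. Presumably the two have to agree for mdmTerminal2 to reach MPD. The sketch below compares them; the function names are invented for this example, and the parser only understands the simple `name "value"` lines shown in the listing.

```python
import configparser
import re


def mpd_conf_port(text, default=6600):
    """Return the top-level 'port' from mpd.conf text (6600 is MPD's default)."""
    depth = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.endswith("{"):
            depth += 1  # entering a block such as audio_output { ... }
            continue
        if line == "}":
            depth -= 1
            continue
        match = re.match(r'port\s+"(\d+)"', line)
        if depth == 0 and match:
            return int(match.group(1))
    return default


def music_port(ini_text):
    """Return the [music] port from the text of settings.ini."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return parser.getint("music", "port")


def ports_match(mpd_text, ini_text):
    """True when mdmTerminal2 is pointed at the port MPD is configured for."""
    return mpd_conf_port(mpd_text) == music_port(ini_text)
```

On the Pi you would pass in the contents of /etc/mpd.conf and of settings.ini from the mdmTerminal2 directory; with the values exactly as listed in this post, ports_match returns False.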

List the open ports:

 netstat -lt4n

The output should look something like this:


Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:6611            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:8989            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:7999            0.0.0.0:*               LISTEN
tcp        0      0 192.168.1.135:49152     0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:1900            0.0.0.0:*               LISTEN
pi@raspberrypi:~ $
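
If you want to check this automatically, a small sketch like the following can parse the netstat output and report which expected ports are not listening. The function names are hypothetical, and which ports to expect depends on your own settings (6611 here is the [music] port from settings.ini).

```python
def listening_ports(netstat_output):
    """Collect local TCP ports in LISTEN state from 'netstat -lt4n' output."""
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows look like: tcp 0 0 0.0.0.0:6611 0.0.0.0:* LISTEN
        if len(fields) >= 6 and fields[0].startswith("tcp") and fields[5] == "LISTEN":
            ports.add(int(fields[3].rsplit(":", 1)[1]))
    return ports


def missing_ports(netstat_output, expected):
    """Return the expected ports that nothing is listening on, sorted."""
    return sorted(set(expected) - listening_ports(netstat_output))


# The output shown above, used here as sample input.
SAMPLE = """Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:6611            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:8989            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:7999            0.0.0.0:*               LISTEN
tcp        0      0 192.168.1.135:49152     0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:1900            0.0.0.0:*               LISTEN
"""
```

For example, missing_ports(SAMPLE, [22, 6611]) returns an empty list, which means both SSH and the configured music port are reachable.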

UPnP renderer

Install gmediarender, a UPnP media renderer for POSIX-compliant systems such as Linux or UNIX. It implements the server component that lets UPnP controllers play media content (audio, video, and images) from a UPnP media server.

gmrender-resurrect is a fork of GMediaRender, which was abandoned upstream.

sudo apt install gmediarender

For autostart to work correctly, the daemon's init script has to be changed:

pi@raspberrypi:~ $ cat /etc/init.d/gmediarender

#!/bin/sh

### BEGIN INIT INFO
# Provides: gmediarender
# Required-Start: $remote_fs $syslog $all
# Required-Stop: $remote_fs $syslog
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Start GMediaRender at boot time
# Description: Start GMediaRender at boot time.
### END INIT INFO

# User and group the daemon will be running as. On the Raspberry Pi, let's use
# the default user.
DAEMON_USER="pi:audio"

# Device name as it will be advertised to and shown in the UPnP controller UI.
# Some string that helps you recognize the player, e.g. "Livingroom Player"
UPNP_DEVICE_NAME="mdmPiTerminal"

# Initial volume in decibel. 0.0 is 'full volume', -10 corresponds to '75' on
# the exported volume scale (Note, this does not change the ALSA volume, only
# internal to gmrender. So make sure to leave the ALSA volume always to 100%).
INITIAL_VOLUME_DB=-10

# If you explicitly choose a specific ALSA device here (find them with 'aplay -L'), then
# gmediarenderer will use that ALSA device to play audio.
# Otherwise, whatever default is configured for gstreamer for the '$DAEMON_USER' is
# used.
ALSA_DEVICE="sysdefault"

# Path to the gmediarender binary.
BINARY_PATH=/usr/bin/gmediarender

if [ -n "$ALSA_DEVICE" ] ; then
    GS_SINK_PARAM="--gstout-audiosink=alsasink"
    GS_DEVICE_PARAM="--gstout-audiodevice=$ALSA_DEVICE"
fi

# A simple stable UUID, based on this systems' first ethernet devices MAC address,
# only using tools readily available to generate.
UPNP_UUID=`ip link show | awk '/ether/ {print "salt:)-" $2}' | head -1 | md5sum | awk '{print $1}'`

USER=root
HOME=/root
export USER HOME
case "$1" in
    start)
        echo "Starting GMediaRender"
        start-stop-daemon -x $BINARY_PATH -c "$DAEMON_USER" -S -- -f "$UPNP_DEVICE_NAME" -d -u "$UPNP_UUID" $GS_SINK_PARAM $GS_DEVICE_PARAM --gstout-initial-volume-db=$INITIAL_VOLUME_DB
        ;;

    stop)
        echo "Stopping GMediaRender"
        start-stop-daemon -x $BINARY_PATH -K
        ;;

    *)
        echo "Usage: /etc/init.d/gmediarender {start|stop}"
        exit 1
        ;;
esac

Then run sudo systemctl enable gmediarender
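
The UPNP_UUID line in the script is the least obvious part, so here is a sketch of the same derivation in Python: the MD5 of the string "salt:)-" followed by the MAC address and the newline that awk prints. The function name and the MAC address in the usage note are placeholders, not values from the author's device.

```python
import hashlib


def upnp_uuid(mac_address):
    """Mirror the shell pipeline: awk prints 'salt:)-<MAC>' plus a newline,
    head keeps the first such line, and md5sum hashes it."""
    line = "salt:)-" + mac_address + "\n"
    return hashlib.md5(line.encode("ascii")).hexdigest()
```

For instance, upnp_uuid("b8:27:eb:00:00:01") yields a 32-character hex string that stays the same across reboots as long as the first interface's MAC address does not change, which is why controllers keep recognizing the renderer.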

If you want to control mpd from the web, install ympd https://ympd.org/

wget https://ympd.org/downloads/ympd-1.2.3-armhf.tar.bz2
tar -xvf ympd-1.2.3-armhf.tar.bz2
sudo ./ympd --webport 8080

For autostart, you can add this line to rc.local:
sleep 10 && /home/pi/ympd --webport 8080

The interface will be available on port 8080.

A good mixer for the browser: https://github.com/JiriSko/amixer-webui

Install flask:
pip install flask
cd ~
git clone https://github.com/JiriSko/amixer-webui.git
cd amixer-webui
python alsamixer_webui.py -p 8181 -l 192.168.1.232

You can view the system volume like this:
amixer -c1 sget 'Playback'

Set the volume to 70%:
amixer -c1 sset 'Playback' 70%
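
To read the level from a script, the percentages can be pulled out of the amixer output. This is a sketch only: the sample text below is illustrative and was not captured from the author's board, and the function names are invented for this example.

```python
import re
import subprocess


def playback_percentages(amixer_output):
    """Extract every '[NN%]' value from the output of 'amixer sget'."""
    return [int(p) for p in re.findall(r"\[(\d+)%\]", amixer_output)]


def read_playback_volume(card=1, control="Playback"):
    """Run amixer for the given card and control, return the percentages."""
    result = subprocess.run(
        ["amixer", "-c" + str(card), "sget", control],
        capture_output=True, text=True, check=True,
    )
    return playback_percentages(result.stdout)


# Illustrative output shape (assumed values, not a real capture).
SAMPLE_OUTPUT = """Simple mixer control 'Playback',0
  Playback channels: Front Left - Front Right
  Front Left: Playback 179 [70%]
  Front Right: Playback 179 [70%]
"""
```

The card number 1 and the control name 'Playback' match the [volume] section of settings.ini above; on another sound card both will probably differ.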

Summary:
The following currently runs on the Raspberry Pi 3:
1) mdmTerminal2 https://github.com/Aculeasis/mdmTerminal2
2) MPD
3) gmediarender
4) ympd at http://192.168.1.232:8080 (controls mpd, including the music volume)
5) amixer-webui at http://192.168.1.232:8181 (controls the system volume)
6) control of the voice volume and the media playback volume through the voice commands "громкость 80" ("volume 80") and "громкость музыки 80" ("music volume 80")
7) the volume level is saved every 5 minutes and restored at boot.

directman