Text-to-Speech

EVA ICS can talk. This is a perfect feature for providing e.g. action feedback or warning about alerts.

The speech voice is generated by Logic Manager using the tts macro extension, which provides an interface to the TTSBroker library.

Supported TTS providers: Amazon AWS (Polly), Google Cloud (Google TTS Engine), IBM Watson.

Configure audio hardware

Logic Manager must be installed on a node which has a sound card installed, or which streams audio to another host, e.g. via PulseAudio.

If your hardware host doesn’t have audio output, any modern USB audio card can be used. Connect the card to a free USB slot, then connect it to an (optional) amplifier and a speaker.

Obtain cloud provider key

You must obtain a cloud key from a supported TTS provider. The key is stored in JSON format on the host where Logic Manager is installed.

Google provides cloud keys for service accounts already in JSON format. Amazon provides only API keys; to pack them into JSON, use the following format (don’t forget to set the proper region):

{
  "aws_access_key_id":"mykey",
  "aws_secret_access_key":"mysecretkey",
  "region_name":"us-west-1"
}
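If you prefer to generate the key file programmatically, a minimal sketch follows; the credential values and the output path are placeholders, not real keys:

```python
import json

# Placeholder credentials -- substitute your real AWS API key data
aws_key = {
    "aws_access_key_id": "mykey",
    "aws_secret_access_key": "mysecretkey",
    "region_name": "us-west-1",
}

# Write the key file to a path Logic Manager can read (example path)
with open("aws-cloudkey.json", "w") as f:
    json.dump(aws_key, f, indent=2)
```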

For IBM Watson, the JSON key should look like:

{
  "url": "https://stream.watsonplatform.net/text-to-speech/api",
  "username": "myusername",
  "password": "mypassword"
}

Create TTS configuration

If you don’t like the default TTS settings, you may create an optional configuration file with new defaults. E.g. let’s configure Google TTS to generate voices with voice=en-US-Wavenet-F:

{
  "voice": "en-US-Wavenet-F"
}

Name the file as you wish and put it in a path Logic Manager has access to.

All available options:

  • Google

    • pitch pitch (default: 0)
    • rate speaking rate (default: 1.0)
    • lang language (default: en-US)
    • voice tts voice (default: en-US-Wavenet-A)
  • AWS

    • voice tts voice (default: Joanna)
  • IBM

    • voice tts voice (default: en-US_AllisonVoice)
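For example, a Google TTS configuration file setting all four options might look like the following (the values are illustrative, the option names are those listed above):

```json
{
  "pitch": 0,
  "rate": 1.0,
  "lang": "en-US",
  "voice": "en-US-Wavenet-F"
}
```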

Install TTSBroker Python module

Put EXTRA="soundfile sounddevice ttsbroker oauth2client" into /opt/eva/etc/venv and rebuild the Python venv:

/opt/eva/install/build-venv

The oauth2client module is required by the gcloud provider. If you want to use the polly provider (AWS), add the boto3 module. For IBM Watson, no extra modules are required.

Note

If an external playback command is used, the sounddevice module is not required.

Load tts macro extension

eva -I
lm
ext load t1 tts -c p=gcloud,k=/path-to-cloudkey.json -y

All extension options:

  • p TTS provider (gcloud, polly or watson)
  • k Cloud key file
  • sdir path to pre-generated files
  • cdir path to cache directory
  • cf cache format (wav, or ogg, default: wav)
  • o TTS configuration file
  • g default gain (-10..inf)
  • cmd external playback command (e.g. play %f)
  • d playback device, if no external command provided (list: /opt/eva/python3/bin/python -m sounddevice)

The sdir option is used as a “permanent cache” for audio files: e.g. you may put sdir on a read-only partition, cdir on a RAM drive, and then periodically copy cached files from cdir to sdir.
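The periodic copy can be as simple as the following sketch, run e.g. from cron (the function name and directory layout here are illustrative; the permanent storage must be writable during the sync):

```python
import os
import shutil

def sync_cache(cdir, sdir):
    """Copy audio files cached in cdir into the permanent storage sdir,
    skipping files which are already present there."""
    os.makedirs(sdir, exist_ok=True)
    copied = 0
    for name in os.listdir(cdir):
        src = os.path.join(cdir, name)
        dst = os.path.join(sdir, name)
        # copy only regular files not yet present in sdir
        if os.path.isfile(src) and not os.path.exists(dst):
            shutil.copy2(src, dst)
            copied += 1
    return copied
```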

Warning

Refer to the TTS provider license for the caching, storing, redistributing and playback rights for audio files generated with the TTS engine.

Use loaded function

As soon as the macro extension is loaded, the function <ext_id>_say becomes available in all macros.

Create an optional alias:

eva -I
lm
macro edit common.py
alias('say', 't1_say')

The function arguments are the same as for TTSBroker say:

  • text text to say
  • gain gain control (-10..inf), float, 0 - default volume
  • options audio generation options
  • use_cache set False to skip looking for the data in the local storage/cache
  • store_cache set False to skip saving the data in the local cache
  • cache sets both use_cache and store_cache
  • generate_only set True to skip playback
  • wait block the thread and wait until playback finishes
  • cmd external playback command

Test it:

eva -I
lm
macro run @say -a "'this is a test, I can talk'" -w 5

If there’s no sound, check the controller log files and the hardware connection. If the sound is generated but playback is broken, try changing the extension playback device or use an external playback command.

Combine with macro

Now you can create macros with voice feedback. E.g. let’s create a macro which turns on 2 lamps:

start('unit:lamps/lamp1')
start('unit:lamps/lamp2')
say('lamps are turned on')