How was this made?
Blood, sweat and tears
We have
A pipe server (AACSpeakHelperServer.py)
Reads in a config dict from settings.cfg
Creates a TTS engine object and holds it in memory to reduce cold-start time
Speaks using sounddevice (heavily reliant on py3-tts-wrapper)
Handles translation using various translation providers
Communicates via named pipes with clients
Client - calling executable (client.py)
You can pass it a config file path and text string
If no text is provided, it uses clipboard/pasteboard text
Calls the pipe service to process text
Supports command-line parameters for different configurations
CLI Configuration Tool (cli_config_creator.py)
Interactive command-line interface for configuration
Supports all TTS engines and translation providers
Creates and manages settings.cfg files
Replaces the unreliable GUI configuration tool
Additional Tools
CreateGridset.py - Creates AAC communication grids
migrate_settings.py - Migrates settings between versions
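The server and client roles described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: it uses the standard library's multiprocessing.connection (which uses named pipes on Windows and sockets elsewhere) instead of whatever pipe layer AACSpeakHelperServer.py actually uses, and FakeEngine, serve, send_text and the request format are all hypothetical names invented for the example.

```python
import threading
from multiprocessing.connection import Client, Listener


class FakeEngine:
    """Hypothetical stand-in for a TTS engine that is costly to construct."""

    def __init__(self, voice):
        self.voice = voice
        self.spoken = []

    def speak(self, text):
        # A real engine would synthesize audio; this one only records the text.
        self.spoken.append(text)
        return f"[{self.voice}] {text}"


def serve(listener, engine, max_requests):
    """Answer requests with one long-lived engine instead of rebuilding it per call."""
    for _ in range(max_requests):
        with listener.accept() as conn:
            request = conn.recv()  # assumed request shape: {"text": "..."}
            conn.send(engine.speak(request["text"]))


def send_text(address, text, read_clipboard=lambda: ""):
    """Client side: fall back to clipboard text when no text is given."""
    if not text:
        text = read_clipboard()  # injected so the sketch needs no clipboard library
    with Client(address) as conn:
        conn.send({"text": text})
        return conn.recv()


def demo():
    engine = FakeEngine(voice="demo-voice")  # built once, kept in memory
    listener = Listener()  # platform default address family
    worker = threading.Thread(target=serve, args=(listener, engine, 2))
    worker.start()
    first = send_text(listener.address, "Hello")
    second = send_text(listener.address, "", read_clipboard=lambda: "From clipboard")
    worker.join()
    listener.close()
    return first, second, engine.spoken
```

The point of the design is that the engine is constructed once in the server process, so each client call only pays for a pipe round trip rather than an engine cold start.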
Configuration Architecture
The application uses a multi-layered configuration approach:
settings.cfg - Main configuration file in INI format
Contains all TTS engine settings, translation settings, and application preferences
Located in %AppData%\Ace Centre\AACSpeakHelper for installed versions
Can be customized and distributed to end users
config.enc - Encrypted configuration for sensitive data
Contains API keys and credentials for cloud services
Generated during build process from environment variables
Provides fallback credentials when user hasn't configured their own
Environment Variables (Development only)
Used during development via .envrc files
Automatically encrypted into config.enc during build process
Command-line Parameters
Allow specifying custom configuration files
Enable different configurations for different use cases
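As a rough sketch of how these layers could fit together: a config path given on the command line wins over the default, settings.cfg is read as INI, and a bundled credential is used only when the user has left theirs blank. The section and key names, the helper functions, and the plain dict standing in for the decrypted contents of config.enc are all assumptions made for illustration; the real file layout and decryption step are not shown here.

```python
import configparser

# Hypothetical section and key names; the real settings.cfg layout may differ.
SAMPLE_SETTINGS = """
[TTS]
engine = example-cloud

[ExampleCloudTTS]
key =
region = uksouth
"""


def choose_config_path(cli_path, default_path):
    """A path given on the command line overrides the default location."""
    return cli_path or default_path


def load_settings(text):
    """Parse INI-formatted settings text into a ConfigParser."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def resolve_credential(settings, section, key, fallback_creds):
    """Prefer the user's own value; otherwise use the bundled fallback."""
    user_value = settings.get(section, key, fallback="").strip()
    if user_value:
        return user_value
    # fallback_creds stands in for values decrypted from config.enc.
    return fallback_creds.get((section, key), "")


settings = load_settings(SAMPLE_SETTINGS)
bundled = {("ExampleCloudTTS", "key"): "bundled-key"}
api_key = resolve_credential(settings, "ExampleCloudTTS", "key", bundled)
region = resolve_credential(settings, "ExampleCloudTTS", "region", bundled)
```

Here the blank key falls through to the bundled value, while region comes straight from the user's file.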
Technical Implementation
There is a lot of magic to make this work, though. This includes:
TTS-Wrapper - a unified wrapper to a range of TTS engines. This is needed because we need a unified way to call get_voices, speak, speak_streamed etc.
Sherpa-Onnx - a really nice tooling pipeline to deal with VITS models that run on the edge.
MMS and models readied for Sherpa-Onnx - this work from Meta was a massive help, and we converted their models for (Sherpa-)Onnx. We made some things along the way, like a nice JSON file with details on the voices. Commercial providers: please note the licence these are under.
QT/QT Threading. We had "fun" with threads. Never again will I do it like this.
Encrypting keys and a hideous JSON file from Google in a GitHub Action. That wasted us a week.
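To illustrate why a unified wrapper matters for the TTS-Wrapper item above, here is a minimal sketch of the idea. It is not the py3-tts-wrapper API: only the method names get_voices and speak come from the text, while SpeechEngine, EngineA, EngineB and the voice dictionaries are hypothetical.

```python
from typing import Protocol


class SpeechEngine(Protocol):
    """Hypothetical common interface that every engine wrapper satisfies."""

    def get_voices(self) -> list: ...

    def speak(self, text: str) -> None: ...


class EngineA:
    """Illustrative engine; a real one would call a cloud or local TTS backend."""

    def __init__(self):
        self.log = []

    def get_voices(self):
        return [{"id": "a-1", "language": "en"}]

    def speak(self, text):
        self.log.append(("A", text))


class EngineB:
    """A second illustrative engine with a different voice catalogue."""

    def __init__(self):
        self.log = []

    def get_voices(self):
        return [{"id": "b-7", "language": "cy"}, {"id": "b-8", "language": "en"}]

    def speak(self, text):
        self.log.append(("B", text))


def speak_with_first_voice(engine: SpeechEngine, text: str) -> str:
    """Caller code stays identical regardless of which engine is plugged in."""
    voice = engine.get_voices()[0]
    engine.speak(text)
    return voice["id"]
```

Because the calling code only relies on the shared method names, swapping one engine for another needs a config change rather than a code change.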
Credits
Will Wade (original v1, refactoring v2 several times, dealing with encryption, build scripts and generally pulling my hair out)
Acer Jay Costillo (QT work and refactoring)
Gavin Henderson - for making the call on baking in creds. I hated that and several times threw the idea out.
Simon Poole - CTO at Smartbox for making me aware of MMS.
What's next?
SAPI Bridge. This is really what is needed. C++ developers - we need your help. See Roadmap