
Developer Notes

The app runs on Python 3.10 or 3.11. It has a lot of dependencies, and they aren't well supported on other Python versions.
We use a GitHub Action to build the application (see the workflow here).

# Test
python translatepb.py
# Build for Windows
python -m PyInstaller translatepb.py --noupx --noconsole --onedir -i .\assets\translate.ico --clean
python -m PyInstaller .\GUI_TranslateAndTTS\widget.py --noupx --noconsole --name "Configure TranslateAndTTS" --onefile -i .\assets\configure.ico --clean
# Build the installer.
#    You need to install Inno Setup 6: https://jrsoftware.org/isinfo.php
& "C:\Program Files (x86)\Inno Setup 6\ISCC.exe" .\buildscript.iss

Command Line Flags

The app (client.exe) is designed to be called by an AAC application and relies on some text being available in the clipboard. You can use the flags below to control things such as which settings file is loaded.

General Usage

client.exe [options]

Options

Flag | Description | Type | Required | Default | Example

Using the style flag for Azure voices

You can use the --style flag with Azure voices. Follow it with the name of one of the supported styles. You can change the strength of the style with --styledegree, which takes a value from 0.1 to 2. The default is 1, so 2 doubles the intensity. Be warned: some voices don't support every style.
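As an illustration of the range rule above, here is a minimal sketch of how a command-line parser could accept --style and reject a --styledegree outside 0.1 to 2. The helper names (style_degree, build_parser) are hypothetical and this is not necessarily how client.exe parses its flags; "cheerful" is only an example style name.

```python
import argparse


def style_degree(value: str) -> float:
    """Convert the --styledegree argument and enforce the 0.1 to 2 range."""
    degree = float(value)
    if not 0.1 <= degree <= 2.0:
        raise argparse.ArgumentTypeError(
            f"styledegree must be between 0.1 and 2, got {value}"
        )
    return degree


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the two flags described above are modelled.
    parser = argparse.ArgumentParser(prog="client.exe")
    parser.add_argument("--style", default=None, help="Azure voice style name")
    parser.add_argument(
        "--styledegree",
        type=style_degree,
        default=1.0,
        help="style strength, 0.1 to 2 (default 1)",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--style", "cheerful", "--styledegree", "2"])
    print(args.style, args.styledegree)
```

Validating in the type function means argparse reports an out-of-range value as a normal usage error instead of passing it on to the speech engine.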

Getting keys for Azure or Google

Azure TTS

You first need an Azure subscription. Create one for free.
Create a Speech resource in the Azure portal.
Get your Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys. For more information about Azure AI services resources, see Get the keys for your resource.
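To show where the key and region end up, here is a small sketch that turns them into the URL and headers of an Azure text-to-speech REST request. It does not send anything. The function name and the SPEECH_KEY / SPEECH_REGION environment variable names are assumptions for illustration; the app itself talks to Azure through py3-tts-wrapper rather than building requests like this.

```python
import os


def azure_tts_request_parts(key: str, region: str) -> dict:
    """Build the URL and headers for an Azure TTS REST call (nothing is sent)."""
    if not key or not region:
        raise ValueError("Both a Speech resource key and a region are required")
    return {
        # The region from the portal becomes part of the host name.
        "url": f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1",
        "headers": {
            # The resource key is sent in this header.
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/ssml+xml",
            "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",
        },
    }


if __name__ == "__main__":
    # Placeholder values are used when the (assumed) variables are not set.
    parts = azure_tts_request_parts(
        os.environ.get("SPEECH_KEY", "your-key-here"),
        os.environ.get("SPEECH_REGION", "westeurope"),
    )
    print(parts["url"])
```

Reading the key from the environment rather than hard-coding it keeps it out of source control.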

Google Cloud TTS

Creating a service account for OAuth 2.0 involves generating credentials for a non-human user, often used in server-to-server interactions. Here's how you can create OAuth 2.0 credentials using a service account for Google APIs:

Create a Service Account:

Go to the Google Cloud Console: Visit the Google Cloud Console.
Create a New Project: If you don't already have a project, create a new one in the developer console.
Enable APIs: Enable the APIs that your service account will use. For text-to-speech, that is the Cloud Text-to-Speech API.
Create a Service Account:

In the Google Cloud Console, navigate to "IAM & Admin" > "Service accounts."
Click on "Create Service Account."
Enter a name for the service account and an optional description.
Choose the role for the service account. This determines the permissions it will have.
Click "Continue" to proceed.

Create and Download Credentials:

On the next screen, you can grant the service account a role in your project. You can also skip this step and grant roles later.
Click "Create Key" to create and download the JSON key file. This file contains the credentials for your service account.
Keep this JSON file secure and do not expose it publicly.

Use the Service Account Credentials:

In your code, load the credentials from the JSON key file. The credentials can be used to authenticate and access the APIs on behalf of the service account.
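As a rough sketch of the loading step, the snippet below reads a service account key file with the standard library and checks that it looks like a service account key before it is handed to a TTS library. The function name and the set of checked fields are illustrative choices, and the demo values are fake.

```python
import json
import os
import tempfile
from pathlib import Path

# Fields a service account key file is expected to contain (not exhaustive).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def load_service_account_key(path) -> dict:
    """Load a service account JSON key file and run basic sanity checks."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"Key file is missing fields: {', '.join(missing)}")
    if data["type"] != "service_account":
        raise ValueError(f"Expected a service account key, got type {data['type']!r}")
    return data


if __name__ == "__main__":
    # Demo with a throwaway file holding fake values, so nothing real is needed.
    fake_key = {
        "type": "service_account",
        "project_id": "demo-project",
        "private_key": "not-a-real-key",
        "client_email": "demo@demo-project.iam.gserviceaccount.com",
    }
    with tempfile.TemporaryDirectory() as tmp:
        key_path = os.path.join(tmp, "key.json")
        Path(key_path).write_text(json.dumps(fake_key), encoding="utf-8")
        print(load_service_account_key(key_path)["client_email"])
```

Google's own client libraries can also pick the file up from the GOOGLE_APPLICATION_CREDENTIALS environment variable. Checking the file first simply gives a clearer error when the wrong JSON file is supplied.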

Grant Required Permissions:

If you skipped assigning roles during the service account creation, you can now grant roles to the service account by navigating to "IAM & Admin" > "IAM" and adding the service account's email address with the appropriate roles.

How was this made

Blood, sweat and tears

We have

A pipe server that:
1. Reads in a config dict
2. Creates the TTS engine object and holds it in memory to reduce cold-start time
3. Speaks using sounddevice (heavily reliant on py3-tts-wrapper)
A client - the calling executable:
1. You can pass it a config and a string, or no string, in which case it uses the pasteboard text
2. Calls the pipe service
A GUI configuration editor:
1. QT-based editor
2. Note: it calls client.exe with temporary configs.
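The server and client split above can be sketched with Python's multiprocessing.connection module. This is only an illustration of the pattern, not the project's actual code: FakeEngine stands in for a real TTS engine, the message format and function names are made up, and a local TCP address is used so the sketch runs on any platform (on Windows, Listener also accepts a named pipe address).

```python
import threading
from multiprocessing.connection import Client, Listener


class FakeEngine:
    """Hypothetical stand-in for a TTS engine that is expensive to create."""

    def __init__(self, config):
        self.config = config
        self.spoken = []

    def speak(self, text):
        # A real engine would synthesise and play audio here.
        self.spoken.append(text)


def serve(listener, engine):
    """Reuse one long-lived engine for every request until None is received."""
    while True:
        with listener.accept() as conn:
            message = conn.recv()
            if message is None:
                conn.send("bye")
                break
            engine.speak(message["text"])
            conn.send("ok")


def resolve_text(cli_text, read_clipboard):
    """Use the text passed to the client, else fall back to the clipboard."""
    return cli_text if cli_text else read_clipboard()


def send_text(address, text, authkey):
    """Client side: send one piece of text and wait for the reply."""
    with Client(address, authkey=authkey) as conn:
        conn.send({"text": text})
        return conn.recv()


def stop(address, authkey):
    """Client side: ask the server loop to exit."""
    with Client(address, authkey=authkey) as conn:
        conn.send(None)
        return conn.recv()


if __name__ == "__main__":
    authkey = b"demo-key"
    engine = FakeEngine({"engine": "demo"})  # created once, kept in memory
    listener = Listener(("127.0.0.1", 0), authkey=authkey)  # port 0: pick a free port
    worker = threading.Thread(target=serve, args=(listener, engine))
    worker.start()
    text = resolve_text(None, lambda: "text from the clipboard")
    print(send_text(listener.address, text, authkey))
    stop(listener.address, authkey)
    worker.join()
    listener.close()
    print(engine.spoken)
```

Because the engine object lives in the server process, each client call only pays for a short connection, not for loading the engine again.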

There is a lot of magic to make this work though. This includes

TTS-Wrapper - a unified wrapper for a range of TTS engines. We need this to have one consistent way to call get_voices, speak, speak_streamed, etc.
Sherpa-Onnx - a really nice tooling pipeline for VITS models that run on the edge.
MMS and models readied for Sherpa-Onnx - this work from Meta was a massive help, and we converted their models for (Sherpa-)Onnx. Along the way we made a few things, like a nice JSON file with details on the voices. Commercial providers: please note the licence these models are under.
QT/QT threading - we had "fun" with threads. Never again will I do it like this.
Encryption of keys, and of a hideous JSON file from Google, in a GitHub Action. That wasted us a week.

Credits

Will Wade (original v1, refactoring v2 several times, dealing with encryption, build scripts and generally pulling my hair out)
Acer Jay Costillo (QT work and refactoring)
Gavin Henderson - for making the call on baking in creds. I hated that and several times threw the idea out.
Simon Poole - CTO at Smartbox for making me aware of MMS.

What's next?

SAPI Bridge. This is really what is needed. C++ developers - we need your help. See Roadmap