A short guide to the set-up of languages and voices for meSpeak.
Please note that meSpeak is based on an Emscripten port of eSpeak, so all of the eSpeak grammar also applies to meSpeak.
meSpeak's language-files provide eSpeak's language- and voice-files in a single package.
(Since a voice usually refers to a language and its dictionary, it seems suitable to bundle them together in a single file.)
The language-files are JSON objects with four properties: voice_id, dict_id, dict, and voice.
The values of voice_id and dict_id are actually UNIX-filenames: dict_id is relative to the path of eSpeak's data-directory "espeak-data/", and voice_id is relative to "espeak-data/voices/".
If we were to embed the files for the language "en-en", these would be "en/en-en" for the voice and "en_dict" for the dictionary used by "en-en".
For a standard language-file, you would add a base64-representation of the respective eSpeak-files as the string values of dict and voice.
There is an alternate layout for meSpeak's language-files, which is especially useful for customizing and testing:
Since eSpeak's voice-files are actually plain-text files, you may use a simple string for these, if you provide an additional property "voice_encoding": "text" at the same time.
For dictionaries, which are binary files in eSpeak, see the note at the end of the page.
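The two layouts and the id-to-path convention described above can be sketched as follows. This is only an illustration, assuming Node.js (for Buffer); the helper names resolvePaths and voiceAsText are hypothetical and not part of meSpeak, the voice text is an illustrative stand-in rather than the content of a real eSpeak voice-file, and the dict value is a placeholder.

```javascript
// eSpeak's data directory and its voices sub-directory, as named in the text.
const DATA_DIR = 'espeak-data/';
const VOICES_DIR = DATA_DIR + 'voices/';

// Hypothetical helper: map the ids of a language-file to eSpeak file paths.
function resolvePaths(languageFile) {
  return {
    voicePath: VOICES_DIR + languageFile.voice_id, // relative to espeak-data/voices/
    dictPath: DATA_DIR + languageFile.dict_id      // relative to espeak-data/
  };
}

// Hypothetical helper: return the voice definition as plain text,
// whichever of the two layouts the language-file uses.
function voiceAsText(languageFile) {
  if (languageFile.voice_encoding === 'text') {
    return languageFile.voice; // alternate layout: plain string
  }
  // standard layout: base64 string
  return Buffer.from(languageFile.voice, 'base64').toString('utf8');
}

// Illustrative voice text only (assumption, not a real eSpeak file).
const voiceSource = 'name english\nlanguage en\ngender male\n';

// Standard layout: dict and voice are base64 strings (dict is a placeholder).
const standard = {
  voice_id: 'en/en-en',
  dict_id: 'en_dict',
  dict: '<base64-encoded data>',
  voice: Buffer.from(voiceSource, 'utf8').toString('base64')
};

// Alternate layout: voice is plain text, flagged by voice_encoding.
const alternate = Object.assign({}, standard, {
  voice: voiceSource,
  voice_encoding: 'text'
});

console.log(resolvePaths(standard));
console.log(voiceAsText(standard) === voiceAsText(alternate));
```

Both objects describe the same voice; only the encoding of the voice property differs.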
As an example, we will configure a basic female voice for "en-us", which will be named "en-us-f".
1. Make a copy of a suitable language-file (e.g.: "voices/en/en-us.json").
2. Rename it (e.g.: "en-us-f.json") and open it in a text editor.
3. Locate the corresponding voice-file in eSpeak's "espeak-data/" directory. For this example it is "espeak-data/voices/en-us".
4. Edit the "name" parameter to make it unique (e.g.: "name english-us-f").
5. Change "gender male" to "gender female" for a female voice.
6. Replace all line-breaks by "\n" in order to get a valid JSON-string.
7. Insert this string as the value of the "voice"-property of the JSON-file.
8. Add "voice_encoding": "text" to the JSON to indicate that the voice is plain-text.

Please note that eSpeak is not very graceful with syntax errors in a voice-definition and will just throw an error, which, in the case of meSpeak, will show up in the console-log.
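The editing steps above can be sketched in a few lines of JavaScript. This is a rough illustration under stated assumptions: the voice text is a shortened stand-in and not the actual content of eSpeak's en-us voice-file, the dict and voice values of the copied language-file are placeholders, and the helper names makeFemaleVoice and buildLanguageFile are hypothetical. JSON.stringify takes care of turning line-breaks into "\n".

```javascript
// Illustrative stand-in for the voice text (assumption, not the real file).
const voiceText = [
  'name english-us',
  'language en-us',
  'gender male'
].join('\n');

// Hypothetical helper: give the voice a unique name and switch the gender.
function makeFemaleVoice(text, newName) {
  return text
    .replace(/^name .*$/m, 'name ' + newName)      // unique "name" parameter
    .replace(/^gender male\b/m, 'gender female');  // "gender male" -> "gender female"
}

// Hypothetical helper: put the plain-text voice into a copy of a language-file.
function buildLanguageFile(base, text) {
  return Object.assign({}, base, {
    voice: text,            // plain string instead of base64
    voice_encoding: 'text'  // tells meSpeak that the voice is plain-text
  });
}

// Copy of an existing language-file; the data strings are placeholders.
const base = {
  voice_id: 'en/en-us',
  dict_id: 'en_dict',
  dict: '<base64-encoded data>',
  voice: '<base64-encoded data>'
};

const languageFile = buildLanguageFile(
  base,
  makeFemaleVoice(voiceText, 'english-us-f')
);

// Serializing escapes the line-breaks as "\n", giving a valid JSON-string.
const json = JSON.stringify(languageFile, null, 2);
console.log(json);
```

The printed JSON would then be saved under the new file name (e.g.: "en-us-f.json").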
For further details on voice-parameters and fine-tuning, please refer to the eSpeak-documentation: http://espeak.sourceforge.net/voices.html.
eSpeak's dictionaries are binary files, which must be compiled with eSpeak first.
You would have to install eSpeak and compile a file following the eSpeak documentation.
Next, you would insert a base64-encoded string of the resulting object-file's content as the value of the dict property of a meSpeak-language-file.
Finally, you would set a suitable and unique value for the property dict_id (UNIX file path).
There is no shortcut to this. Sorry.
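Once a dictionary has been compiled with eSpeak, the embedding step itself is small. The following is a minimal sketch, assuming Node.js and its fs module; the helper name embedDictionary and its parameters are hypothetical and not part of meSpeak, and the compilation must still be done with eSpeak beforehand.

```javascript
const fs = require('fs');

// Hypothetical helper: embed a compiled eSpeak dictionary into a language-file.
//   languageFile  parsed meSpeak language-file (plain object)
//   dictFilePath  path to the dictionary file compiled with eSpeak
//   dictId        unique value for dict_id (UNIX file path)
function embedDictionary(languageFile, dictFilePath, dictId) {
  const bytes = fs.readFileSync(dictFilePath); // binary content as a Buffer
  return Object.assign({}, languageFile, {
    dict_id: dictId,
    dict: bytes.toString('base64')             // base64-encoded string
  });
}
```

A call would pass the parsed language-file, the path of the compiled dictionary, and the chosen dict_id; the returned object can then be written back with JSON.stringify.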
Please also see the section on the extended voice format on the main page.
Norbert Landsteiner
Vienna, July 2013