[mimictts] Fix ssml and playing from audiosinks using the audio servlet (#14120)

* [mimictts] Fix ssml and playing from an audiosink using the audio servlet

Fix:

- ssml not working
- add an option to store the audio on a file before sending it to openhab. It enables audiosink based on the audio servlet to play the sound (the servlet requires the getClonedStream method, unavailable with a pure streaming approach). The files are stored in the user data directory and deleted as soon as possible (stream close detection).
- fix error with voice name not encoded

Signed-off-by: Gwendal Roulleau <gwendal.roulleau@gmail.com>
2023-01-14 09:39:59 +01:00
parent 0de87b15d2
commit d497defe34
6 changed files with 157 additions and 8 deletions
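
The workaround described in the commit message (buffer the audio in a file so that several independent streams can be opened, then delete the file once the streams are closed) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the add-on's actual code: the class `TempFileAudio` and its methods `openStream()` and `path()` are hypothetical names, and the sketch uses only `java.io` and `java.nio.file` instead of openHAB's audio API.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: file-backed audio that can hand out several streams. */
public class TempFileAudio {
    private final Path file;
    private int openStreams; // guarded by "this"

    /** Copies the whole source stream into a temporary file in the given directory. */
    public TempFileAudio(InputStream source, Path directory) throws IOException {
        file = Files.createTempFile(directory, "tts-", ".wav");
        try (InputStream in = source) {
            Files.copy(in, file, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public Path path() {
        return file;
    }

    /** Opens a new independent stream over the file (the "cloned stream" idea). */
    public synchronized InputStream openStream() throws IOException {
        InputStream raw = Files.newInputStream(file);
        openStreams++;
        return new FilterInputStream(raw) {
            private boolean closed;

            @Override
            public void close() throws IOException {
                super.close();
                if (!closed) { // count each stream only once, even if closed twice
                    closed = true;
                    release();
                }
            }
        };
    }

    /** Deletes the file as soon as the last open stream has been closed. */
    private synchronized void release() throws IOException {
        if (--openStreams == 0) {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("tts-demo");
        TempFileAudio audio = new TempFileAudio(new ByteArrayInputStream(new byte[] {1, 2, 3}), dir);
        try (InputStream first = audio.openStream(); InputStream clone = audio.openStream()) {
            System.out.println(first.read() == clone.read());
        }
        System.out.println(Files.exists(audio.path()));
    }
}
```

In this sketch a clone must be opened before the last open stream is closed, because the file is removed at that point. A pure streaming approach has no such backing copy, which is why it cannot provide a second stream over the same audio.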
--- a/bundles/org.openhab.voice.mimictts/README.md
+++ b/bundles/org.openhab.voice.mimictts/README.md
@@ -17,6 +17,7 @@ It supports a subset of SSML, and if you want to use it, be sure to start your t
 Using your favorite configuration UI to edit **Settings / Other Services - Mimic Text-to-Speech** and set:

 * **url** - Mimic URL. Default to `http://localhost:59125`
+* **workaroundServletSink** - A boolean activating a workaround for audiosink using the openHAB servlet. It stores audio file temporarily on disk, allowing the servlet to get a cloned stream as needed. Default false.
 * **speakingRate** - Controls how fast the voice speaks the text. A value of 1 is the speed of the training dataset. Less than 1 is faster, and more than 1 is slower.
 * **audioVolatility** - The amount of noise added to the generated audio (0-1). Can help mask audio artifacts from the voice model. Multi-speaker models tend to sound better with a lower amount of noise than single speaker models.
 * **phonemeVolatility** - The amount of noise used to generate phoneme durations (0-1). Allows for variable speaking cadence, with a value closer to 1 being more variable. Multi-speaker models tend to sound better with a lower amount of phoneme variability than single speaker models.