Text To Speech

Speech Synthesis is the part of the system that renders output in an audio-friendly form. This covers both what you hear and how the mouth moves. Multiple systems work together: Text To Speech (TTS) creates the audio, and a phoneme mapping determines how the mouth moves.
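
As a rough illustration of the phoneme-mapping step, the sketch below turns a timed phoneme sequence into mouth-shape keyframes. The phoneme symbols, the mouth-shape names, and the `to_mouth_keyframes` function are all hypothetical and chosen for illustration; they are not this system's actual tables or API.

```python
from dataclasses import dataclass

# Hypothetical phoneme -> mouth-shape table (illustrative only).
MOUTH_SHAPES = {
    "AA": "open",
    "IY": "wide",
    "UW": "round",
    "M": "closed",
    "B": "closed",
    "P": "closed",
    "F": "teeth_on_lip",
    "V": "teeth_on_lip",
}


@dataclass
class Keyframe:
    time_s: float  # when the mouth should reach this shape
    shape: str     # name of the mouth shape to display


def to_mouth_keyframes(timed_phonemes):
    """Convert (phoneme, start_time_s) pairs into mouth-shape keyframes.

    Unknown phonemes fall back to a neutral shape, and consecutive
    duplicates are merged so the same pose is not triggered twice in a row.
    """
    keyframes = []
    for phoneme, start in timed_phonemes:
        shape = MOUTH_SHAPES.get(phoneme, "neutral")
        if keyframes and keyframes[-1].shape == shape:
            continue
        keyframes.append(Keyframe(start, shape))
    return keyframes


frames = to_mouth_keyframes([("M", 0.00), ("AA", 0.08), ("P", 0.20), ("B", 0.26)])
```

In this example the trailing "P" and "B" share a mouth shape, so they collapse into a single keyframe. A real pipeline would take its phoneme timings from the TTS engine so that audio and mouth movement stay in sync.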

info

There is a tradeoff between speed and quality. Higher-quality voices are achievable, but the added latency can feel like a badly lagging video call and be disruptive.

Quality	Latency	Example
Low	Very Fast
Middle	Fast
High	Some
Ultra	High
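
One way to act on this tradeoff is to pick the highest quality tier whose latency still fits a budget. The sketch below assumes made-up latency figures in milliseconds purely for illustration, since the table above gives only qualitative labels; the `pick_tier` helper is likewise hypothetical.

```python
# Tiers ordered from lowest to highest quality. The millisecond figures
# are placeholders for illustration, not measurements of any real voice.
TIERS = [
    ("Low", 100),
    ("Middle", 300),
    ("High", 800),
    ("Ultra", 2000),
]


def pick_tier(latency_budget_ms):
    """Return the highest-quality tier whose latency fits the budget.

    Falls back to the lowest tier when no tier fits, so a voice is
    always available.
    """
    chosen = TIERS[0][0]
    for name, latency_ms in TIERS:
        if latency_ms <= latency_budget_ms:
            chosen = name
    return chosen
```

With these placeholder numbers, a conversational budget of 500 ms would select the Middle tier, while a non-interactive use such as pre-rendered narration could afford Ultra.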