Speech Tokenizers

1 minute read


A speech tokenizer, as the name suggests, converts a continuous speech waveform into discrete tokens (usually called units). It bridges the gap between speech and text representations and simplifies the manipulation of speech signals.

Speech Language Models

5 minute read


With the popularity of language modeling, there have been many advances in speech language models that leverage in-context learning for speech synthesis.

Voice Synthesis

6 minute read


Speech synthesis with controllable voice is a challenging task.