Audio samples for CHiVE

Audio samples for various CHiVE (Clockwork Hierarchical Variational Autoencoder) papers.

Improving Prosody of RNN-based English Text-To-Speech Synthesis by Incorporating a BERT model

These samples accompany "Improving Prosody of RNN-based English Text-To-Speech Synthesis by Incorporating a BERT model", T. Kenter, M. Sharma, R. A. J. Clark, INTERSPEECH 2020.

We also released the 'hard lines' evaluation set used in the paper.

Modelling Intonation in Spectrograms for Neural Vocoder based Text-to-Speech

These samples accompany "Modelling Intonation in Spectrograms for Neural Vocoder based Text-to-Speech", V. Wan, J. Shen, H. Silen and R. A. J. Clark, Speech Prosody 2020.

CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network

See here for audio samples that accompany the original CHiVE paper, "CHiVE: Varying Prosody in Speech Synthesis with a Linguistically Driven Dynamic Hierarchical Conditional Variational Network", V. Wan, C-a Chan, T. Kenter, J. Vit and R. A. J. Clark, ICML 2019.