Audio samples for CHiVE-BERT

Audio samples to go with "Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks", by Lev Finkelstein, Heiga Zen, Norman Casagrande, Chun-an Chan, Ye Jia, Tom Kenter, Alexey Petelin, Jonathan Shen, Vincent Wan, Yu Zhang, Yonghui Wu, Rob Clark (accepted to Interspeech 2022)

This page presents samples of British, Australian, and Indian accents produced by CHiVE-BERT trained on synthetic data. Each is compared with the Tacotron accent transfer system synthesizing the same sentences. Each set of samples covers a short and a long sentence, for male and female speakers.

Short text example: "I'm looking for diet cat food"

Long text example: "Think about how much you read on your phone every day: catching up on the news, scanning a new blog, finally reading the article that everyone is talking about. This may require reading a lot of text, which can be a barrier for people with visual or reading difficulties, or who simply need a little help getting through meatier articles."

US accent

CHiVE-BERT trained on recordings

Samples of the US accent produced by CHiVE-BERT trained on human recordings.

British accent

Tacotron accent transfer | CHiVE-BERT trained on synthetic data

Samples of the British accent produced by CHiVE-BERT trained on synthetic data, compared with the Tacotron accent transfer system synthesizing the same sentences.

Australian accent

Tacotron accent transfer | CHiVE-BERT trained on synthetic data

Samples of the Australian accent produced by CHiVE-BERT trained on synthetic data, compared with the Tacotron accent transfer system synthesizing the same sentences.

Indian accent

Tacotron accent transfer | CHiVE-BERT trained on synthetic data

Samples of the Indian accent produced by CHiVE-BERT trained on synthetic data, compared with the Tacotron accent transfer system synthesizing the same sentences.