Audio Demos for "DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement"



Audio examples

Example 1

Mixture
Speech: Ground-truth, Conformer-8-STFT, iTDCN++, iTasformer, iDF-Conformer-8
Noise: Ground-truth, Conformer-8-STFT, iTDCN++, iTasformer, iDF-Conformer-8

Example 2

Mixture
Speech: Ground-truth, Conformer-8-STFT, iTDCN++, iTasformer, iDF-Conformer-8
Noise: Ground-truth, Conformer-8-STFT, iTDCN++, iTasformer, iDF-Conformer-8

Example 3

Mixture
Speech: Ground-truth, Conformer-8-STFT, iTDCN++, iTasformer, iDF-Conformer-8
Noise: Ground-truth, Conformer-8-STFT, iTDCN++, iTasformer, iDF-Conformer-8

Example 4

Mixture
Speech: Ground-truth, Conformer-8-STFT, iTDCN++, iTasformer, iDF-Conformer-8
Noise: Ground-truth, Conformer-8-STFT, iTDCN++, iTasformer, iDF-Conformer-8

Example 5

Mixture
Speech: Ground-truth, Conformer-8-STFT, iTDCN++, iTasformer, iDF-Conformer-8
Noise: Ground-truth, Conformer-8-STFT, iTDCN++, iTasformer, iDF-Conformer-8

Attention matrix examples

Each example shows spectrograms of the noisy input and the enhanced output (top row), followed by attention matrices for the first and third Conformer blocks (middle row) and the last Conformer block (bottom row). In each attention matrix, the x-axis denotes the key and the y-axis denotes the query.
Example 1
Input

Output


Example 2
Input

Output


Example 3
Input

Output


Example 4
Input

Output


Example 5
Input

Output


Example 6
Input

Output


Example 7
Input

Output


Example 8
Input

Output


Example 9
Input

Output


Example 10
Input

Output