Audio: originally recorded at 44.1 kHz, then downsampled to 16 kHz.
For the speech recordings, we selected the first 450 of the 460 sentences in the MOCHA-TIMIT corpus [link], because the corpus consists of phonetically balanced utterances.
The 450 sentences were then divided into nine sets of 50 sentences each.
SET | LINK |
---|---|
1 | link |
2 | link |
3 | link |
4 | link |
5 | link |
6 | link |
7 | link |
8 | link |
9 | link |
Timestamp file name: "ID0[ID]_00[SET]_[GENDER]_[MODE]_[DEVICE]_[SENTNUM].lab"
Example: "ID06_006_F_Neutral_Headset_11.lab"
Audio file name: "ID0[ID]_00[SET]_[GENDER]_[MODE]_[DEVICE]_[SENTNUM].wav"
Example: "ID06_006_F_Neutral_Headset_11.wav"
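The naming scheme above can be split into its fields programmatically. The following is a minimal sketch, not part of the dataset tooling: the regular expression is inferred from the pattern and the single documented example, so the allowed values of GENDER, MODE, and DEVICE and the digit widths of ID, SET, and SENTNUM are assumptions. The names `RecordingName` and `parse_recording_name` are hypothetical.

```python
import re
from typing import NamedTuple


class RecordingName(NamedTuple):
    """Fields of one file name (hypothetical container for illustration)."""
    speaker_id: int
    set_id: int
    gender: str
    mode: str
    device: str
    sentence_num: int
    extension: str


# Assumption: each bracketed field contains no underscore, and the numeric
# fields are plain decimal digits following the literal "ID0" / "00" prefixes.
_NAME_RE = re.compile(
    r"ID0(?P<speaker_id>\d+)"
    r"_00(?P<set_id>\d+)"
    r"_(?P<gender>[^_]+)"
    r"_(?P<mode>[^_]+)"
    r"_(?P<device>[^_]+)"
    r"_(?P<sentence_num>\d+)"
    r"\.(?P<extension>wav|lab)"
)


def parse_recording_name(filename: str) -> RecordingName:
    """Split a .wav or .lab file name into its fields, or raise ValueError."""
    match = _NAME_RE.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return RecordingName(
        speaker_id=int(match["speaker_id"]),
        set_id=int(match["set_id"]),
        gender=match["gender"],
        mode=match["mode"],
        device=match["device"],
        sentence_num=int(match["sentence_num"]),
        extension=match["extension"],
    )


wav = parse_recording_name("ID06_006_F_Neutral_Headset_11.wav")
lab = parse_recording_name("ID06_006_F_Neutral_Headset_11.lab")
print(wav.speaker_id, wav.set_id, wav.sentence_num)  # 6 6 11
print(wav[:-1] == lab[:-1])  # True: same recording, different extension
```

Comparing all fields except the extension, as in the last line, is one way to pair each audio file with its timestamp file.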