The Chains Corpus

The Chains corpus is a novel speech corpus collected with the primary aim of facilitating research in speaker identification. The corpus features 36 speakers recorded under a variety of speaking conditions, allowing comparison of the same speaker across different well-defined speech styles. Speakers read a variety of texts alone, in synchrony with a dialect-matched co-speaker, in imitation of a dialect-matched co-speaker, in a whisper, and at a fast rate. There is also an unscripted spontaneous retelling of a read fable. Most of the speakers speak Eastern Hiberno-English. The corpus is being made freely available for research purposes. Any errata we discover will be noted here (see bottom of page).

A full description of the corpus can be found in this paper. The main points are summarized below. Instructions for obtaining the corpus are given at the bottom of this page. Address any queries to Fred Cummins.

From December 2008, this corpus will be available from the Linguistic Data Consortium for a nominal fee. It will still be available for free download from here. From May 2013, we no longer offer the corpus for free on DVD.

Corpus description

Speakers

There are 36 speakers. Of those, 28 (14 male, 14 female) are from the eastern part of Ireland and speak Eastern Hiberno-English. The remaining 8 speakers (4 male, 4 female) are from the UK and the USA.

Speaking conditions

There are six speaking conditions:

Solo Speech: Subjects read all texts at a natural rate, after reading them through once silently.
Retelling: Subjects retold the story of the Cinderella text in their own words. There were no constraints on time.
Synchronous Speech: Pairs of subjects read texts together, attempting to remain in synchrony with one another.
Repetitive Synchronous Imitation: Each phrase of the Cinderella text was played in a loop through headphones. Subjects repeated the phrase along with this model, attempting to match the model as best they could.
Fast Speech: Subjects read all texts at an accelerated rate.
Whispered Speech: Subjects read all texts in a whisper.

Texts

Texts used included four short fables (Cinderella, Rainbow text, North Wind and the Sun, Members of the Body) and 33 individual sentences, of which 9 were selected from the CSLU Speaker Identification Corpus, and 24 are from TIMIT.

Recording conditions

The Solo, Synchronous and Retell conditions were recorded in a professional recording studio. Each speaker sat in a sound-treated booth and, in the Synchronous condition, could see the other speaker through a thick glass partition. The remaining conditions (RSI, Whisper, Fast) were recorded with a head-mounted microphone in an office environment that was quiet but not sound-treated.

Errata

A new release of the corpus is being made available from 16th April, 2007. In this release, the labels on speakers irf03 and irf04 in conditions 'fast', 'rsi' and 'whsp' have been switched to correctly reflect the speakers. If further errata are noticed, please inform us!

The new release will be available on DVD from 16th April, and for download by ftp from the same date. It will also include MFCC files for 4, 8 and 16 speaker sets, including all 6 speaking conditions. These are in ARFF format, suitable for use with the Weka data mining software.

Update, Nov 11th, 2008: Some copies of the distribution may be missing one file. The file is irm08_s28_solo.wav, that is, the 28th sentence of the 8th male speaker in the solo condition. If you are missing that file, you can download it here.
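The file name above suggests a naming scheme of the form speaker_text_condition.wav. The following Python sketch splits a name into those parts. It is only an illustration: the general layout, the two-letter prefix on speaker IDs, and the names ChainsFile and parse_chains_file are assumptions drawn from the two file names quoted on this page, not from the corpus documentation.

```python
import re
from typing import NamedTuple, Optional


class ChainsFile(NamedTuple):
    speaker: str                   # e.g. "irm08"
    text: str                      # e.g. "s28"
    condition: str                 # e.g. "solo"
    sex: Optional[str]             # "m" or "f" if the speaker ID encodes it
    speaker_number: Optional[int]  # e.g. 8


# Assumed layout: <speaker>_<text>_<condition>.wav
_FILE = re.compile(r"^([^_]+)_([^_]+)_([^_.]+)\.wav$")
# Assumed speaker ID layout: two letters, then m or f, then digits.
_SPEAKER = re.compile(r"^[a-z]{2}([mf])(\d+)$")


def parse_chains_file(name: str) -> ChainsFile:
    """Split a corpus file name into speaker, text and condition."""
    m = _FILE.match(name)
    if m is None:
        raise ValueError(f"unexpected file name: {name!r}")
    speaker, text, condition = m.groups()
    s = _SPEAKER.match(speaker)
    sex = s.group(1) if s else None
    number = int(s.group(2)) if s else None
    return ChainsFile(speaker, text, condition, sex, number)


print(parse_chains_file("irm08_s28_solo.wav"))
```

Speaker IDs that do not fit the assumed pattern are still returned, with sex and speaker_number left as None, so the sketch does not reject names it cannot fully interpret.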

Obtaining the Corpus

The corpus is being made available to researchers at no charge. There are two ways to obtain it.

FTP: The entire corpus has been divided into multiple archive files. These can be downloaded directly from this page. Full instructions for reassembling and unpacking the archive are given there.

LDC: The corpus can be obtained for a nominal fee from the Linguistic Data Consortium. Further information is available at this web page.

Errata in version 1

These errata should now be corrected in Version 2.0, released April 16th, 2007.

The recording labelled as Sentence 19 by speaker irf11 in the solo condition is, in fact, a copy of her reading of sentence 11. So file irf11_s19_solo.wav is not as advertised.

The Rainbow text is listed in the documentation as text f03, and the North Wind text is listed as text f02. These labels are reversed: the Rainbow text is f02, the North Wind is f03.

Speaker labels for irf03 and irf04 are reversed in the conditions 'fast', 'rsi', and 'whsp'.
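For anyone still working from a version 1 copy, the speaker-label swap can be applied when reading file names. The Python sketch below is a hedged illustration only: it assumes file names follow the speaker_text_condition.wav pattern seen in irf11_s19_solo.wav, and that the condition labels 'fast', 'rsi' and 'whsp' appear in file names exactly as written here. The function name corrected_v1_name and the example file name are hypothetical. It should not be applied to Version 2.0, where the labels are already corrected.

```python
# Speaker labels that are swapped in version 1 (per the erratum above).
SWAPPED_SPEAKERS = {"irf03": "irf04", "irf04": "irf03"}
# Conditions in which the swap occurs.
AFFECTED_CONDITIONS = {"fast", "rsi", "whsp"}


def corrected_v1_name(filename: str) -> str:
    """Return the file name with the irf03/irf04 swap undone.

    Names that do not match the assumed three-part pattern, or that
    are not affected by the erratum, are returned unchanged.
    """
    stem, dot, ext = filename.rpartition(".")
    parts = stem.split("_")
    if len(parts) != 3:
        return filename
    speaker, text, condition = parts
    if condition in AFFECTED_CONDITIONS and speaker in SWAPPED_SPEAKERS:
        speaker = SWAPPED_SPEAKERS[speaker]
    return f"{speaker}_{text}_{condition}{dot}{ext}"


# Hypothetical version 1 file name, used only to show the call.
print(corrected_v1_name("irf03_s01_fast.wav"))
```

The other two errata need no renaming: the f02/f03 mix-up is in the documentation rather than in the file labels, and the irf11 sentence 19 recording is a wrong recording, not a wrong label.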