README.txt 1.11 KB
This directory (sre08) contains example scripts for speaker identification, not speech recognition. The following LDC corpora are required for the testing: SRE 2008 training set: LDC2011S05 SRE 2008 test set: LDC2011S08 (but all the subdirectories require additional data for system development, see the corresponding README.txt files in subdirectories). Note: In the speaker id community the words "train", "test" and "development" are used in a different sense from in the speech recognition community. In speaker-id land, the "development" data is the data you actually use to build the system; the "training" and "test" data are really both used to test how well the systme performs. The "training" data is used as enrolment data for a particular speaker, and the evaluation consists of taking utterances from the "test" data and having the system decide whether they were uttered by various training speakers. The subdirectories "v1" and so on are different versions of the recipe; we don't call them "s1" etc., because they don't really correspond to the speech recognition recipes.