Yannick Estève / ONTRAC-Kaldi

Download zip

Name	Last Update	Last Commit 8dcb6dfcb61 – first commit	history
..
v1	Loading commit data...
README.txt	Loading commit data...

README.txt

This directory (sre08) contains example scripts for speaker identification, not
 speech recognition.  The following LDC corpora are required for the testing:
    
   SRE 2008 training set:  LDC2011S05 
   SRE 2008 test set:      LDC2011S08

 (but all the subdirectories require additional data for system development, see
 the corresponding README.txt files in subdirectories).

 Note:
 In the speaker id community the words "train", "test" and "development"
 are used in a different sense from in the speech recognition community.  In
 speaker-id land, the "development" data is the data you actually use to build
 the system; the "training" and "test" data are really both used to test how
 well the systme performs.  The "training" data is used as enrolment data for a
 particular speaker, and the evaluation consists of taking utterances from the
 "test" data and having the system decide whether they were uttered by various
 training speakers.


 The subdirectories "v1" and so on are different versions of the recipe;
 we don't call them "s1" etc., because they don't really correspond to
 the speech recognition recipes.