README.txt

This recipe replaces the unsupervised GMM-UBM of the v1 recipe with a UBM
 based on a time-delay deep neural network (TDNN).  Posteriors from the TDNN
 are combined with features extracted using a standard approach for speaker
 recognition to create the sufficient statistics for i-vector extraction.
 The recipe also demonstrates a lightweight alternative in which a supervised
 GMM is derived from the TDNN posteriors.  The recipe is based on
 http://www.danielpovey.com/files/2015_asru_tdnn_ubm.pdf.  See run.sh for
 updated results.
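
 As a rough illustration of the idea above, the following Python sketch
 accumulates posterior-weighted statistics from per-frame class posteriors
 and speaker-recognition features, then derives a GMM from those statistics.
 It is not the code used by this recipe: the function names, the toy data,
 the variance floor and the use of diagonal covariances are all assumptions
 made to keep the example short.

```python
# Illustrative sketch only; names and simplifications are assumptions,
# not the recipe's actual implementation.

def accumulate_stats(posteriors, features):
    """Accumulate zeroth-, first- and second-order statistics.

    posteriors: T x C per-frame class posteriors (e.g. from a TDNN)
    features:   T x D per-frame speaker-recognition features
    """
    num_classes = len(posteriors[0])
    dim = len(features[0])
    n = [0.0] * num_classes                         # zeroth order (occupancy)
    f = [[0.0] * dim for _ in range(num_classes)]   # first order
    s = [[0.0] * dim for _ in range(num_classes)]   # second order (diagonal)
    for post, x in zip(posteriors, features):
        for c, p in enumerate(post):
            if p == 0.0:
                continue
            n[c] += p
            for d, xd in enumerate(x):
                f[c][d] += p * xd
                s[c][d] += p * xd * xd
    return n, f, s


def supervised_gmm(n, f, s, var_floor=1e-3):
    """Derive diagonal-covariance GMM parameters from the statistics."""
    total = sum(n)
    weights, means, variances = [], [], []
    for nc, fc, sc in zip(n, f, s):
        weights.append(nc / total)
        if nc <= 0.0:
            # Class never observed: fall back to a zero mean, floored variance.
            means.append([0.0] * len(fc))
            variances.append([var_floor] * len(fc))
            continue
        mean = [fd / nc for fd in fc]
        var = [max(sd / nc - m * m, var_floor) for sd, m in zip(sc, mean)]
        means.append(mean)
        variances.append(var)
    return weights, means, variances


# Toy example: 3 frames, 2 classes, 2-dimensional features.
posteriors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
n, f, s = accumulate_stats(posteriors, features)
weights, means, variances = supervised_gmm(n, f, s)
```

 In this sketch the zeroth- and first-order statistics (n and f) stand in for
 what an i-vector extractor would consume, while the second-order statistics
 are only needed to estimate the GMM variances.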

 The following describes the data required for system development (in addition
 to the test data described in ../README.txt).  We use SWBD and the older
 (pre-2010) SREs to train the supervised GMM and the i-vector extractor.  To
 create an in-domain system, the SREs are also needed to train the PLDA
 backend.  The TDNN is trained on Fisher English.
 
     Corpus              LDC Catalog No.
     SWBD2 Phase 2       LDC99S79
     SWBD2 Phase 3       LDC2002S06
     SWBD Cellular 1     LDC2001S13
     SWBD Cellular 2     LDC2004S07
     SRE2004             LDC2006S44
     SRE2005 Train       LDC2011S01
     SRE2005 Test        LDC2011S04
     SRE2006 Train       LDC2011S09
     SRE2006 Test 1      LDC2011S10
     SRE2006 Test 2      LDC2012S01
     SRE2008 Train       LDC2011S05
     SRE2008 Test        LDC2011S08
     Fisher speech       LDC2004S13, LDC2005S13 
     Fisher transcripts  LDC2004T19, LDC2005T19