egs/wsj/README.txt
About the Wall Street Journal corpus:
    This is a corpus of read sentences from the Wall Street Journal,
    recorded under clean conditions. The vocabulary is quite large.
    There are about 80 hours of training data.
    Available from the LDC as either:
      [ catalog numbers LDC93S6A (WSJ0) and LDC94S13A (WSJ1) ]
    or:
      [ catalog numbers LDC93S6B (WSJ0) and LDC94S13B (WSJ1) ]
    The latter option is cheaper and includes only the Sennheiser
    microphone data (which is all we use in the example scripts).

Each subdirectory of this directory contains the scripts for a
sequence of experiments.
[note: most of the older example scripts have been deleted, but are
 still available at ^/branches/complete].

  s5: This is the current recommended recipe.