About aurora4:
The aurora4 database contains:
 a) clean wsj0 data (Wall Street Journal), and
 b) clean wsj0 data with artificially added noise.
For detailed information, please refer to: http://aurora.hsnr.de/aurora-4.html.
To obtain the data, contact ELRA (see
http://catalog.elra.info/index.php?cPath=37_40) and look for aurora4a and aurora4b.
If you already have the WSJ license from LDC, you should not need any
additional licenses (but they may want to check that you have a license for
WSJ).
Note: we recommend using the chime2 example scripts instead of these.
About the Wall Street Journal corpus:
This is a corpus of read sentences from the Wall Street Journal, recorded
under clean conditions. The vocabulary is quite large, and there are about
80 hours of training data.
Available from the LDC as either: [ catalog numbers LDC93S6A (WSJ0) and LDC94S13A (WSJ1) ]
or: [ catalog numbers LDC93S6B (WSJ0) and LDC94S13B (WSJ1) ]
The latter option is cheaper and includes only the Sennheiser
microphone data (which is all we use in the example scripts).