The LibriSpeech corpus is a large (1000-hour) corpus of read English speech
 derived from audiobooks in the LibriVox project, sampled at 16 kHz.  The
 accents vary and are not labeled, but the majority are US English.  It is
 available for download free of charge.  It was prepared
 as a speech recognition corpus by Vassil Panayotov.

 The recipe is in s5/