egs/apiai_decode/s5/README.md
# Api.ai model decoding example scripts

This directory contains scripts showing how to use a pre-trained English chain model and the Kaldi base code to recognize any number of wav files.

IMPORTANT: wav files must be in 16 kHz, 16-bit little-endian format.

## Model

The English pre-trained model was released by Api.ai under the Creative Commons Attribution-ShareAlike 4.0 International Public License.

- The acoustic data is mostly mobile-recorded data
- The language model is based on Assistant.ai logs and is good for understanding short commands, like "Wake me up at 7 am"

For more details, visit https://github.com/api-ai/api-ai-english-asr-model

## Usage

Ensure Kaldi is compiled and these scripts are inside a kaldi/egs/<subfolder>/ directory, then run:

```sh
$ ./download-model.sh                     # download the pre-trained chain model
$ ./recognize-wav.sh test1.wav test2.wav  # run recognition
```

See the console output for recognition results.

### Using steps/nnet3/decode.sh

You can use Kaldi's steps/nnet3/decode.sh, which will decode the data and calculate its Word Error Rate (WER). Run:

```sh
$ ./recognize-wav.sh test1.wav test2.wav
```

This creates a data dir, computes MFCC features for it, and runs decoding; for this workflow you only need the first two of those steps. If you want WER, edit data/test-corpus/text and replace NO_TRANSCRIPTION with the expected text transcription for every wav file.

Run for decoding:

```sh
$ steps/nnet3/decode.sh --acwt 1.0 --post-decode-acwt 10.0 --cmd run.pl --nj 1 \
    exp/api.ai-model/ data/test-corpus/ exp/api.ai-model/decode/
```

See the exp/api.ai-model/decode/wer* files for WER and the exp/api.ai-model/decode/log/ files for decoding output.

### Online decoder

See http://kaldi-asr.org/doc/online_decoding.html for more information about Kaldi online decoding. Run:

```sh
$ ./local/create-corpus.sh data/test-corpus/ test1.wav test2.wav
```

If you want WER, edit data/test-corpus/text and replace NO_TRANSCRIPTION with the expected text transcription for every wav file.
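Replacing the placeholders can also be scripted. The sketch below is illustrative only: the utterance IDs `test1`/`test2` are assumptions (the actual IDs depend on how create-corpus.sh names them), and the reference transcript reuses the sample phrase from the Model section above.

```sh
#!/bin/sh
# Illustrative sketch: start from a placeholder transcript file like the
# one create-corpus.sh writes (utterance IDs here are assumed, not real).
mkdir -p data/test-corpus
printf 'test1 NO_TRANSCRIPTION\ntest2 NO_TRANSCRIPTION\n' > data/test-corpus/text

# Replace the placeholder for test1 with its reference transcript.
# A temp file is used instead of `sed -i`, which is not portable.
sed 's/^test1 NO_TRANSCRIPTION$/test1 wake me up at 7 am/' \
    data/test-corpus/text > data/test-corpus/text.tmp
mv data/test-corpus/text.tmp data/test-corpus/text

cat data/test-corpus/text
```

Lines still containing NO_TRANSCRIPTION are not usable as references, so fill in every wav file you want scored.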
Make a config file exp/api.ai-model/conf/online.conf with the following content:

```
--feature-type=mfcc
--mfcc-config=exp/api.ai-model/mfcc.conf
```

Then run:

```sh
$ steps/online/nnet3/decode.sh --acwt 1.0 --post-decode-acwt 10.0 --cmd run.pl --nj 1 \
    exp/api.ai-model/ data/test-corpus/ exp/api.ai-model/decode/
```

See the exp/api.ai-model/decode/wer* files for WER and the exp/api.ai-model/decode/log/ files for decoding output.
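If you prefer not to create the config file by hand, it can be written from the shell. This is a minimal sketch that emits exactly the two options shown above:

```sh
#!/bin/sh
# Write the online decoding config described above into the model dir.
mkdir -p exp/api.ai-model/conf
cat > exp/api.ai-model/conf/online.conf <<'EOF'
--feature-type=mfcc
--mfcc-config=exp/api.ai-model/mfcc.conf
EOF

# Show the result; the online decode step above expects this file.
cat exp/api.ai-model/conf/online.conf
```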