###
# MGB-5 corpus: Moroccan Arabic Automatic Speech Recognition
# Created in collaboration between QCRI and ELRA
# More details can be found here: https://arabicspeech.org/mgb5
###
## INTRODUCTION ##
Training data: 10.2 hours from 69 programs
Development data: 1.8 hours from 10 programs
Testing data: 2.0 hours from 14 programs
## KNOWN ISSUES ##
1- The dev data is not aligned consistently across the four annotators
2- Once the alignment is consistent, we can include multi-reference word error rate (WER)
3- Use the MGB-2 corpus as a background model
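
Item 2 above mentions multi-reference word error rate. The sketch below illustrates one simplified way such a score could be computed once every utterance has consistently aligned transcripts from all annotators: for each utterance it picks the reference closest to the hypothesis and accumulates the errors. This is an assumption for illustration only, not the official MGB-5 scoring method; the function names edit_distance and min_reference_wer are hypothetical, and the inputs are assumed to be whitespace-tokenized strings.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def min_reference_wer(hyps, refs_per_utt):
    """Simplified multi-reference WER (illustrative assumption).

    hyps:         list of hypothesis strings, one per utterance
    refs_per_utt: list of lists; refs_per_utt[k] holds the reference
                  strings (one per annotator) for utterance k
    """
    total_errors = 0
    total_ref_words = 0
    for hyp, refs in zip(hyps, refs_per_utt):
        hyp_tokens = hyp.split()
        # Choose the annotator reference with the fewest edits to the hypothesis.
        best = min((r.split() for r in refs),
                   key=lambda r: edit_distance(r, hyp_tokens))
        total_errors += edit_distance(best, hyp_tokens)
        total_ref_words += len(best)
    if total_ref_words == 0:
        raise ValueError("no reference words to score against")
    return total_errors / total_ref_words
```

With the four dev annotators, refs_per_utt would hold four strings per utterance. Because the score is computed per utterance, it only makes sense after the alignment problem in item 1 is resolved.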