.TH sc_stats 1 "" "" "" ""
WORK IN PROGRESS
.br
RELEASED FOR Hub-5NE '97 Eval
NAME
sc_stats - SCLITE's Statistical System Comparison Program
.PP
NOTE: This manual page was created automatically from HTML pages in the sclite/doc directory.
This manual page does not include output file examples.
The author suggests using an HTML browser for reading the sclite documentation.
.PP
SYNOPSIS
sc_stats \*LOPTIONS\*O
.PP
DESCRIPTION
.PP
sc_stats is a program that statistically compares the performance of two or more Automatic Speech Recognition (ASR) systems that have been run on identical test data.
Sc_stats is part of the \*LNIST SCTK\*O Scoring Toolkit.
The program reads alignments generated by sclite and produces \*Lsummary reports\*O and \*Lgraphs\*O, and/or compares the systems using any number of tests of significant differences.
.PP
INPUT ALIGNMENTS
.RS
.RE
OUTPUT REPORTS
.RS
.RE
STATISTICAL TESTS
.RS
.RE
REVISION HISTORY
.RS
.RE
BUGS/COMMENTS
.RS
Please contact Jon Fiscus at NIST with any bug reports or comments at the email address \*Ljonathan.fiscus@nist.gov\*O or by phone, (301)-975-3182.
Please include the version number of rover,
.RE
.RE
.\" $Id: sc_stats.1,v 1.6 2004/08/30 15:10:38 jfiscus Exp $
\*LSc_stats\*O Command Line Options
.PP
The command line options for sc_stats can be broken into four categories:
.LI
\*LInput File Options:\*O
.RS
\*L-p\*O
.RE
.LI
\*LOutput Options:\*O
.RS
\*L-e\*O, \*L-n\*O, \*L-O\*O
.RE
.LI
\*LReport Generation Options:\*O
.RS
\*L-g\*O, \*L-r\*O
.RE
.LI
\*LStatistical Test Options:\*O
.RS
\*L-t\*O, \*L-v\*O, \*L-u\*O, \*L-f\*O
.RE
.LE
.PP
Input File Options:
.RS
These options define the input to sc_stats.
Input must come from stdin, and the -p option must be given.
(Requiring the -p option leaves room for future expansion while maintaining backward compatibility.)
.br
.br
-p
.RS
Alignments are read from 'stdin' as input to sc_stats.
The input must be in the "sgml" output format, created either by '-o sgml' or by piped input from another sctk utility.
.RE
.RE
Output Options:
.RS
-e desc
.RS
Description of the ensemble of hyp files.
.RE
-O output_dir
.RS
Writes all output files into output_dir.
Defaults to the hypfile's directory.
.RE
-n name
.RS
Writes all multiple hypothesis file reports to files beginning with 'name'.
Using '-' writes to stdout.
Default: 'Ensemble'.
.RE
.RE
Report Generation Options:
.RS
-g
.RS
Generate per-speaker range graphs, based on the formula defined by '-f'.
The reports are written to files whose root name begins with the value defined by '-n'.
Two graphs are produced: the first shows speaker performance variability across systems, and the second shows system performance variability across speakers.
.PP
- The 'range' graphs are an ASCII representation of the variability in error rates for a given speaker.
The graph is sorted by the mean of the statistic computed for each speaker.
\*LEXAMPLE\*O
.PP
- The 'grange' graph is a gnuplot version of the same data plotted in 'range'.
Two sets of files are created.
The first set, named '*.grange.spk.plt' and '*.grange.spk.dat', contains the gnuplot command files and data files respectively for the graph of speaker performance variability across systems.
The second set, named '*.grange.sys.plt' and '*.grange.sys.dat', contains the gnuplot command files and data files respectively for the graph of system performance variability across speakers.
\*LEXAMPLE\*O
.PP
- The 'grange2' graph is similar to the 'grange' graph except that each system's per-speaker word error scores are identified by a unique symbol.
\*LEXAMPLE\*O
.RE
.br
-r
.RS
.VL 4m
.LI " prn - \*LExample\*O
.LI " sum - \*LExample\*O
.LI " rsum - \*LExample\*O
.LI " lur - \*LExample\*O
.LI " es - \*LExample\*O
.LI " res - \*LExample\*O
.LI " none - Produce no output reports (default).
.LE
.RE
.RE
Statistical Test Options:
.RS
-t [ mcn | mapsswe | sign | wilc | anovar | std4 ]
.RS
.VL 4m
.LI " mcn - Perform the McNemar Test.
.LI " mapsswe - Perform the Matched Pairs Sentence Segment Word Error Test.
.LI " sign - Perform the Sign Test.
.LI " wilc - Perform the Wilcoxon Signed Rank Test.
.LI " anovar - Perform the Analysis of Variance by Rank Test.
.LI " std4 - Shorthand for the four 'standard' tests: mcn, mapsswe, wilc and sign.
.LE
.RE
.br
-v
.RS
For each test performed on a pair of systems, output a detailed analysis.
.RE
.br
-u
.RS
Rather than creating a comparison matrix for each test, unify the statistical test results into a single comparison matrix.
.RE
.br
-f [ E | R | W ]
.RS
Use the identified formula for the sign, Wilcoxon and anovar statistical tests.
The formulas are:
.AL
.LI
E -> Percentage Word Error
.LI
R -> Percentage Words Correctly Recognized
.LI
W -> Percentage Word Accuracy
.LE
The default is 'E'.
.RE
.RE
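.PP
To make the idea behind the 'sign' test and the 'E' formula concrete, the following Python sketch compares two systems on per-speaker percentage word error.
It is an illustration only and is not taken from sc_stats.
It assumes the usual definition of word error, 100 * (substitutions + deletions + insertions) / reference words, which this page does not spell out, and it uses an exact two-sided binomial sign test, which may differ from the computation sc_stats performs.
The function names and the per-speaker counts are hypothetical.
.nf
```python
from math import comb


def percent_word_error(subs, dels, ins, ref_words):
    """Assumed formula 'E': 100 * (S + D + I) / N."""
    return 100.0 * (subs + dels + ins) / ref_words


def sign_test(scores_a, scores_b):
    """Exact two-sided sign test on paired per-speaker error rates.

    A speaker counts as a win for the system with the lower error.
    Ties are dropped. Returns (wins_a, wins_b, p_value).
    """
    wins_a = sum(1 for a, b in zip(scores_a, scores_b) if a < b)
    wins_b = sum(1 for a, b in zip(scores_a, scores_b) if a > b)
    n = wins_a + wins_b
    if n == 0:
        return wins_a, wins_b, 1.0
    k = min(wins_a, wins_b)
    # Probability of k or fewer wins out of n under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return wins_a, wins_b, min(1.0, 2.0 * tail)


# Hypothetical per-speaker counts: (subs, dels, ins, ref_words).
system_a = [(10, 2, 3, 200), (8, 1, 1, 150), (20, 5, 5, 300), (6, 2, 2, 100)]
system_b = [(12, 3, 3, 200), (9, 2, 1, 150), (25, 5, 6, 300), (7, 2, 3, 100)]

err_a = [percent_word_error(*counts) for counts in system_a]
err_b = [percent_word_error(*counts) for counts in system_b]
wins_a, wins_b, p_value = sign_test(err_a, err_b)
print(wins_a, wins_b, p_value)
```
.fi
.PP
With only four speakers, even a clean sweep by one system gives a two-sided p-value of 0.125, which shows why a speaker-level test of this kind needs a reasonably large speaker set before it can report a significant difference.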