.TH sc_stats 1 "" "" "" ""
WORK IN PROGRESS 
.br
 RELEASED FOR Hub-5NE '97 Eval

NAME
sc_stats - SCLITE's Statistical System Comparison Program
.PP
.PP

NOTE: This manual page was created automatically from
HTML pages in the sclite/doc directory.  This manual page does not
include output file examples.  The author suggests using an HTML browser
for reading the sclite documentation.
.PP
SYNOPSIS
sc_stats \*LOPTIONS\*O
.PP
 
DESCRIPTION
.PP

sc_stats is a program that statistically compares the performance of
two or more Automatic Speech Recognition (ASR) systems that have been
run on identical test data.  sc_stats is part of the \*LNIST SCTK\*O Scoring Toolkit.  The program reads alignments generated by
sclite and produces \*Lsummary reports\*O and \*Lgraphs\*O, and/or compares the
systems using any number of tests of significant differences.
.PP
.PP
INPUT ALIGNMENTS 
.RS
.RE
OUTPUT REPORTS 
.RS
.RE
STATISTICAL TESTS 
.RS
.RE
REVISION HISTORY 
.RS
.RE
BUGS/COMMENTS 
.RS
Please contact Jon Fiscus at NIST with any bug reports or comments at
the email address
\*Ljonathan.fiscus@nist.gov\*O or
by phone, (301)-975-3182.  Please include the version number of sc_stats.
.RE
.RE
.\"  $Id: sc_stats.1,v 1.6 2004/08/30 15:10:38 jfiscus Exp $ 

\*LSc_stats\*O Commandline Options
.PP
The commandline options for sc_stats can be broken into four categories:
.LI
\*L Input File Options: \*O
.RS
\*L-p\*O
.RE
.LI
\*L Output Options: \*O
.RS
\*L-e\*O,
\*L-n\*O,
\*L-O\*O
.RE
.LI
\*L Report Generation Options: \*O
.RS
\*L-g\*O,
\*L-r\*O
.RE
.LI
\*L Statistical Test Options: \*O
.RS
\*L-t\*O,
\*L-v\*O,
\*L-u\*O,
\*L-f\*O
.RE
.LE
.PP
Input File Options: 
.RS
These options define the input to sc_stats.  Input must
come from stdin, and the -p option must be used.  (Requiring the
-p option leaves room for future expansion while maintaining backward
compatibility.)
.br
.br
-p	
.RS
Alignments are read from 'stdin' as input to sc_stats.
The input must be in the "sgml" output
format, created either by '-o sgml' or by piped input
from another sctk utility.
.RE
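.PP
As a rough illustration of how the options on this page combine, the
Python sketch below assembles an sc_stats argument list.  It is not part
of SCTK: the helper name build_sc_stats_argv and its defaults are
hypothetical, and only options documented on this page are used.
.nf
```python
# Hypothetical helper, not part of SCTK: builds an sc_stats command line
# from the options described on this page.
TESTS = ("mcn", "mapsswe", "sign", "wilc", "anovar")

def build_sc_stats_argv(test="mcn", unify=False, verbose=False, name="-"):
    if test not in TESTS:
        raise ValueError("unknown test: " + test)
    argv = ["sc_stats", "-p"]      # -p: read sgml alignments from stdin
    argv += ["-t", test]           # -t: statistical test to perform
    if verbose:
        argv.append("-v")          # -v: detailed analysis per system pair
    if unify:
        argv.append("-u")          # -u: single unified comparison matrix
    argv += ["-n", name]           # -n: report name root, "-" for stdout
    return argv

print(" ".join(build_sc_stats_argv(test="sign", unify=True)))
```
.fi
The sgml alignments themselves would still have to be supplied on stdin
when the command is run; how the alignments of several systems are
combined on stdin is not described here and is left open in this sketch.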
.RE
Output Options: 
.RS
-e desc
.RS
Description of the ensemble of hyp files.
.RE
-O output_dir 
.RS
Writes all output files into output_dir.  Defaults to the
hypfile's directory.
.RE
-n name 
.RS
Writes all multiple hypothesis file reports to files beginning
with 'name'.  Using '-' writes to stdout. Default: 'Ensemble'
.RE
.RE
Report Generation Options: 
.RS
-g
.RS
Generate per-speaker range graphs, based on the formula defined
by '-f'.  The reports are written to files whose root name
begins with the value defined by '-n'.  Two graphs are
produced: one showing speaker performance variability across
systems, and the
second showing system performance variability across speakers.
.PP
- The 'range' graph is an ASCII
representation of
the variability in error rates for a given speaker.  The
graph is sorted by the mean of the statistic computed for each speaker.
\*LEXAMPLE\*O
.PP
 - The 'grange' graph is a gnuplot version of the same data
plotted in 'range'.  Two sets of files are created.
The first set, named '*.grange.spk.plt' and '*.grange.spk.dat',
contains the gnuplot command file and
data file, respectively, for the speaker performance variability
across systems graph.
The second set, named '*.grange.sys.plt' and '*.grange.sys.dat',
contains the gnuplot command file and
data file, respectively, for the system
performance variability across speakers graph.
\*LEXAMPLE\*O
.PP
 - The 'grange2' graph is similar to the 'grange'
graph, except that each system's per-speaker word error scores are
identified by a unique symbol.
\*LEXAMPLE\*O
.RE
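.PP
To make the idea of a 'range' graph concrete, the Python sketch below
draws one text row per speaker, spanning the lowest to the highest error
rate across systems and marking the mean.  It is only an approximation
under stated assumptions: the function name range_rows, the row layout,
the 0 to 100 scale and the ascending sort order are all hypothetical and
are not taken from the sc_stats output format.
.nf
```python
from statistics import mean

def range_rows(err, width=50):
    """err maps a speaker id to its percent error rates, one per system."""
    def col(pct):
        # Map a percentage to a column, clamped to the 0..100 scale.
        return max(0, min(width, round(pct * width / 100)))

    # Sort speakers by their mean error rate (ascending is an assumption).
    rows = sorted((mean(r), min(r), max(r), spk) for spk, r in err.items())
    lines = []
    for avg, lo, hi, spk in rows:
        cells = [" "] * (width + 1)
        for pos in range(col(lo), col(hi) + 1):
            cells[pos] = "-"          # span from best to worst system
        cells[col(avg)] = "+"         # mean across systems
        lines.append(spk.ljust(8) + "|" + "".join(cells) + "|")
    return lines

# Made-up error rates for two speakers scored by two systems.
demo = {"spk_a": [10.0, 20.0], "spk_b": [5.0, 7.0]}
for line in range_rows(demo):
    print(line)
```
.fi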

.br

-r
.RS
.VL 4m

.LI " prn -
\*LExample\*O

.LI " sum -
\*LExample\*O

.LI " rsum -
\*LExample\*O

.LI " lur -
\*LExample\*O

.LI " es -
\*LExample\*O

.LI " res -
\*LExample\*O

.LI " none -
Produce no output reports.  This is the default.
.LE
.RE
.RE
Statistical Test Options: 
.RS
-t [ mcn | mapsswe | sign | wilc | anovar | std4 ]
.RS
.VL 4m

.LI " mcn -
Perform the McNemar Test.

.LI " mapsswe -
Perform the Matched Pairs Sentence Segment Word Error Test

.LI " sign -
Perform the Sign Test

.LI " wilc -
Perform the Wilcoxon Signed Rank Test

.LI " anovar -
Perform the Analysis of Variance by Rank Test

.LI " std4 -
This is a shorthand notation to do the 'standard' four tests:
mcn, mapsswe, wilc and sign.
.LE
.RE
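.PP
For readers unfamiliar with these tests, the Python sketch below shows
the textbook two-sided exact sign test applied to paired scores, such as
per-speaker error rates from two systems.  It illustrates the general
technique only; the function name sign_test is hypothetical, and
sc_stats may compute its statistic differently (for example with a
normal approximation).
.nf
```python
from math import comb

def sign_test(a, b):
    """Two-sided exact sign test on paired scores; ties are dropped."""
    plus = sum(1 for x, y in zip(a, b) if x > y)
    minus = sum(1 for x, y in zip(a, b) if x < y)
    n = plus + minus
    if n == 0:
        return 1.0                    # no informative pairs
    k = min(plus, minus)
    # Probability of k or fewer of the rarer sign under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Made-up per-speaker error rates for two systems.
sys_a = [12.0, 15.5, 9.8, 20.1, 14.2, 11.0]
sys_b = [10.5, 14.0, 9.1, 18.7, 13.9, 10.2]
print(sign_test(sys_a, sys_b))
```
.fi
A small p-value suggests that one system is better than the other on
more speakers than chance alone would explain.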

.br

-v 
.RS
For each test performed on a pair of systems, output a
detailed analysis.
.RE

.br

-u 
.RS
Rather than creating a comparison matrix for each test, unify
the statistical test results into a single comparison matrix.
.RE

.br

-f [ E | R | W ]  
.RS
Use the identified formula for the sign, Wilcoxon and anovar
statistical tests.  The formulas are:
.AL
.LI
 E -> Percentage Word Error
.LI
 R -> Percentage Words Correctly Recognized
.LI
 W -> Percentage Word Accuracy
.LE
The default is 'E'.
.RE
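.PP
This page does not spell out the three formulas.  The Python sketch
below assumes the definitions commonly used in speech recognition
scoring, computed from the number of reference words and the
substitution, deletion and insertion counts.  The function name
word_scores and its argument names are hypothetical; the keys E, R and W
mirror the letters accepted by '-f'.
.nf
```python
def word_scores(n_ref, n_sub, n_del, n_ins):
    """Return the assumed E, R and W percentages for one set of counts."""
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    correct = n_ref - n_sub - n_del
    return {
        "E": 100.0 * (n_sub + n_del + n_ins) / n_ref,   # word error
        "R": 100.0 * correct / n_ref,                   # words correct
        "W": 100.0 * (correct - n_ins) / n_ref,         # word accuracy
    }

# Made-up counts: 100 reference words, 5 subs, 3 dels, 2 ins.
print(word_scores(100, 5, 3, 2))
```
.fi
Under these assumed definitions, W is always 100 minus E, while R
ignores insertions and is therefore never lower than W.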
.RE