<!-- $Id: revis.htm,v 1.6 2004/08/30 15:10:38 jfiscus Exp $ -->
<HTML><HEAD>
<CENTER><TITLE>SCLITE revisions</TITLE>
</HEAD>
<BODY></CENTER><p><hr>
<H1>
<A NAME="revisions_name_0">
<strong>
<A HREF="sclite.htm#sclite_name_0">Sclite</A> Revision.txt
</A>
</strong>
</H1>
<p>
<pre>
sclite 1.0 - Released July 27, 1995

sclite 1.1 - Released September 27, 1995
    - New/modified output options:
        * Added options to '-o': 'none' to not make any reports, 'sgml'
          to create an sgml file for alignments, 'lur' for the labeled
          utterance report.
        * '-p' pipes the output of alignments to other sclite utilities
          in the sgml format.
    - New Input options:
        * '-P' accepts piped sgml format input from other sclite utilities.
        * '-e' identifies the input character encoding.
    - New alignment options:
        * '-S' performs an inferred word segmentation algorithm rather
          than using the word segmentation of the reference and hyp files.
        * '-F' aligns fragments to words with matching substrings and
          scores them as correct.
        * Changed the -c option to include the optional flag "ASCIITOO",
          which also splits ascii words when doing a character alignment.
          Also added another flag, "DH", to delete hyphens from the ref
          and hyp transcripts before alignment.
    - Fixes and Changes:
        * Modified the '-n' option to handle multiple hyp files.
        * Fixed a bug in 'parse_stm_line' to handle empty texts.
        * Modified the read function for a CTM file so that a file of
          any length will be properly read in.
    - Compiled and tested using the HP-UX and DEC OSF1 native cc compilers.

sclite 1.2 - Released March 8, 1996
    - Corrected a bug in the lur report that was activated if a speaker
      had no reference words, but had erroneously hypothesized words.
    - Added the sent, spk, and ovrdtl reports to sclite.
    - Added the option to score CTM to CTM files.
      This is essentially the same code used for the first SWB LVCSR
      evaluation; however, since the new network alignment routines were
      used, unifying the alignment into a single step, alignments will
      differ slightly from those generated with the old scoring package.
    - Added the "-T" option to do time-mediated alignments.
    - Removed the size limitations in the report generation software,
      'rpg.c'.  There is still a hard limit of 200 characters on the
      length of each cell.
    - Standardized program exit codes to be 0 for successful execution
      and 1 for failed execution.
    - Corrected the handling of NULL alternatives in the hypothesis file.
      Scoring reference to hypothesis yields the same error rates as
      scoring hypothesis to reference.  The only difference is that
      insertions are swapped with deletions.
    - The installer now has the option to enable or disable alignments
      via GNU's diff.
    - Added informative error messages when label definitions, which
      are used by the 'lur' report, have been improperly specified.

sclite 1.2a - Released March 15, 1996
    - Forgot one minor file in the distribution, "sclite.c".

sclite 1.3 - Released April 22, 1996
    - Corrected a minor makefile inconsistency.  (One file was compiled
      twice.)
    - Changed Network_dp_align to optionally include NULLS in the output.
    - Changed the -m option to now reduce either the reference or
      hypothesis file, or both, before alignment takes place.
    - Fixed an uninitialized variable in alex.c which became apparent in
      the 'dtl' and 'spk' reports.
    - Corrected an argument passed to fill_STM_structure() in stm2ctm.c
      which caused a warning on some compilers.
    - Added bug report procedures.

Revision 1.4 - Released October 18, 1996
    - Forced confidence values to flow through the entire data pipeline.
    - Added the '-C' option to include 'normalized cross-entropy'
      statistics in all output files.
    - Added algo2 for the inferred segmentation option '-S'.
    - Added "IGNORE_TIME_SEGMENT_IN_SCORING" as an allowable transcript
      for an stm record.
      See the stm file documentation for its use.

Revision 1.4a - Released May 29, 1997
    - Cleaned the distribution to be ISO-9669 compatible.

Released under a different name, sctk Version 1.0
    - Modified the label extraction function 'parse_input_comment_line'
      to ignore duplicate LABEL and CATEGORY lines.
    - Added a sequence number to each PATH in the alignment sequence so
      that the input sequence of alignments can be reconstructed.
    - Added the capability to keep track of reference confidence scores
      when aligning ref ctm's against hyp ctm's.
    - Corrected the .pre dump of the alignment structure when the case
      sensitive flag is set.  The error was introduced by modifications.
    - Fixed a problem in TEXT_strcasecmp().  It failed to handle the
      case where str1 was shorter than str2.
    - Fixed a problem in 'align.c/extract_speaker()': a NULL was not
      terminating each newly extracted speaker id.
    - Revised the reports lut, sum, snt, spkr, ovr to handle speakers
      without any reference tokens.  In the sum report, the speakers
      without reference tokens are ignored when computing the speaker
      mean, sd, and median.
    - Fixed a bug in tcslite.sh which output an error when test 5 was
      run and the use of gnudiff was not compiled in to sclite.
    - Fixed a bug in config.in which was propagated to config.sh.  The
      problem was a missing backquote on "uname -s".
    - Added error checking to the ctm2ctm alignment module.  No checking
      had been performed to make sure the ref and hyp files had the same
      conversations and channels.
    - Fixed a problem in 'expand_words_to_chars()': it was not deleting
      hyphens from single character words due to an incorrect
      conditional.
    - Added a new way to score, 'Optionally Deletable'.  This required a
      major set of modifications and generalizations.
    - Modified the character scoring procedure so that confidence scores
      are imputed to the sub-characters making up the word.
    - Corrected a bug in Compute_ROC:det.c which incorrectly incremented
      pointers.

SCTK Version 1.1 - Released November 13, 1997
    - Utility versions in this release: sclite V2.1, sc_stats V1.1
    - Added the Executive and Raw Executive Summaries to sc_stats.
    - Added the det curve to sc_stats so that combined plots are produced.
    - Modified the mapsswe test to handle an arbitrary number of segments.
    - Corrected a bug in mtchprs.c which was freeing the test confidence
      array prematurely.

SCTK Version 1.2
    - Added the prn report to sc_stats.  Prints N-system alignments
      together.
    - Added optional word-weighted-mediated alignments.
    - Weight inputs include wwl file (-w) and LM file (-L).
    - Added testing scripts and documentation examples.
    - Added the .wws output format.
    - Updated .prf output to include word weights and other information.
    - Added SLM toolkit v2 into the sctk package.
    - Modified config.in, makefile.in and the installation process.
    - Various internal structures modified to handle word weights.
    - Compiles under Linux using gmake.
    - Documentation changes, including additional comments concerning
      the waveform id in the STM and CTM file formats.

SCTK Version 1.2a
    - Fixed an installation problem for Linux involving scfp.

SCTK Version 1.2b - Released October 1, 2000
    - Improved testing code to not report errors under Linux.

SCTK Version 1.2c - Released October 11, 2000
    - Improved installation targets in makefile.

SCTK Version 1.3 - Released July 30, 2004
    - Minor bug fixes for core dumps.
    - Added the ability to pass two tags attached to each word through
      the scorer.  The tags are attached to the words by appending
      ';<string>' to the word's text.  There can be up to two tags, and
      they may be empty.
    - Added a '#' after NCE values in the .sys reports to indicate the
      absence of reference lexemes for a speaker.
    - Expanded the buffers in the rpg.c suite of routines for report
      generation.
    - Expanded the maximum alternation size to 10000 characters.
    - Added a "Lattice" error rate calculation in the .prn reports.
      It's the percent of reference tokens not correct in any system's
      transcript.
</pre>
</body>
</html>