NAME

rover - Recognition Output Voting Error Reduction

SYNOPSIS

rover [ -sT -a alpha -c Nconf -f level -l width ] ( -h hypfile ctm )+ -o outfile -m meth

Input Options

-h hypfile ctm

Define a hypothesis file and its format. This option must be used more than once, once per input system. Currently, only the ctm format is recognized.

Output Options

-o outfile

Define the output file. The output is written in the same format as the hypothesis files.

-f level

Define the feedback mode. The default is 1.

-l width

Define the line width.

Voting Options

-m maxconf

Vote using the maximum confidence score.

-m avgconf

Vote using the average confidence score.

Same as maxconf, but set the confidence score for the NULL transition arcs to be the number of NULL transition arcs in the correspondence set divided by the number of input systems.

-a alpha

Set alpha to 'alpha'. Alpha is the tradeoff between using word occurrence counts and confidence scores. The default is 1.0.

-c Nconf

Set the confidence score associated with NULL transition arcs to 'Nconf'. The default is 0.0.

Alignment Options

-s

Do case-sensitive alignments.

-T

Use time information, if available, to calculate word-to-word distances.
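As an illustration of how the options in the synopsis fit together, the following is a minimal Python sketch that assembles a rover argument list. The function name `rover_command` and the file names in the usage note are hypothetical; only flags that appear in the synopsis are used, and the sketch builds the command without running it.

```python
def rover_command(hyp_files, out_file, method, alpha=None, null_conf=None):
    """Build an argument list for rover from the flags shown in the synopsis.

    hyp_files: paths of the ctm hypothesis files, one per input system.
    """
    if len(hyp_files) < 2:
        # The -h option must be used more than once.
        raise ValueError("rover needs at least two hypothesis files")
    cmd = ["rover"]
    if alpha is not None:
        cmd += ["-a", str(alpha)]
    if null_conf is not None:
        cmd += ["-c", str(null_conf)]
    for path in hyp_files:
        # Each hypothesis file is followed by its format; only ctm is recognized.
        cmd += ["-h", path, "ctm"]
    cmd += ["-o", out_file, "-m", method]
    return cmd
```

For example, `rover_command(["sys1.ctm", "sys2.ctm"], "out.ctm", "avgconf", alpha=1.0, null_conf=0.0)` yields an argument list that could be passed to `subprocess.run`.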

DESCRIPTION

Rover is a tool that combines the hypothesized word outputs of multiple recognition systems and selects the best scoring word sequence. Rover is part of the NIST SCTK Scoring Toolkit. A number of different output formats can be generated and different scoring functions can be specified. A more complete description of the rover system can be found in the paper "A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)".

The ROVER system is implemented in two modules. First, the outputs from two or more ASR systems are combined into a single word transition network. The network is created using a modification of the dynamic programming alignment protocol traditionally used by NIST to evaluate ASR technology. Once the network is generated, the second module evaluates each branching point using a voting scheme, which selects the best scoring word (the one with the highest number of votes) for the new transcription. The following figure depicts the overall system architecture.

The heart of the Rover program is its ability to combine the outputs of multiple recognition systems into a single, composite Word Transition Network (WTN) using an iterative Dynamic Programming alignment protocol. The protocol is fully described in Section 2.1, RECOGNITION OUTPUT ALIGNMENT MODULE, of the paper.
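The iterative alignment idea can be sketched in Python as follows. This is a simplified illustration, not the rover implementation: it assumes uniform edit costs, exact word matching, and no time information, and the names `align_into_wtn`, `build_wtn`, and the "@" label for NULL arcs are hypothetical. The WTN is modelled as a list of correspondence sets, each holding one word per merged system.

```python
NULL = "@"  # label used here for a NULL transition arc (an assumption of this sketch)

def align_into_wtn(wtn, hyp, n_prev):
    """Align the word list `hyp` against `wtn`, which already holds n_prev systems."""
    rows, cols = len(wtn) + 1, len(hyp) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        cost[i][0] = i
    for j in range(1, cols):
        cost[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            # A word matches a slot for free if any merged system proposed it there.
            sub = 0 if hyp[j - 1] in wtn[i - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + sub,
                             cost[i - 1][j] + 1,
                             cost[i][j - 1] + 1)
    # Trace back from the end, emitting one correspondence set per step.
    merged = []
    i, j = len(wtn), len(hyp)
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = 0 if hyp[j - 1] in wtn[i - 1] else 1
            if cost[i][j] == cost[i - 1][j - 1] + sub:
                merged.append(wtn[i - 1] + [hyp[j - 1]])
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            # The new system has no word here: add a NULL arc for it.
            merged.append(wtn[i - 1] + [NULL])
            i -= 1
        else:
            # The new system has an extra word: earlier systems get NULL arcs.
            merged.append([NULL] * n_prev + [hyp[j - 1]])
            j -= 1
    merged.reverse()
    return merged

def build_wtn(hypotheses):
    """Fold the hypotheses into one composite WTN, one system at a time."""
    wtn = [[word] for word in hypotheses[0]]
    for n_prev, hyp in enumerate(hypotheses[1:], start=1):
        wtn = align_into_wtn(wtn, hyp, n_prev)
    return wtn
```

With the three hypotheses "a b c", "a c", and "a b d", this sketch produces three correspondence sets, the second of which contains a NULL arc for the system that omitted "b".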

Once the composite WTN is produced, each correspondence set (CS) is evaluated using the selected scoring function. Section 2.2, WTN VOTING SEARCH MODULE, of the paper describes the voting process in detail. There are three voting schemes described in the paper:

By Word Frequency

To use word frequency as the scoring function, use the options '-m avgconf -a 1.0 -c 0.0'. The -a option sets the tradeoff between word occurrences and confidence scores; setting it to 1.0 means that only the word occurrences are used.

By Average Confidence Scores

The '-m avgconf' option makes the voting function use average confidence scores.

By Word Maximum Confidence Scores

The '-m maxconf' option makes the voting function use the maximum confidence per word as the scoring metric.
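The three schemes above can be sketched for a single correspondence set in Python. The sketch assumes the general scoring form alpha * (occurrences / number of systems) + (1 - alpha) * confidence, which matches the described role of alpha as a tradeoff; the function name `vote`, the "@" label for NULL arcs, and the tie-breaking rule (first word seen wins) are assumptions of this illustration, not taken from the rover source.

```python
from collections import defaultdict

NULL = "@"  # label used here for a NULL transition arc (an assumption of this sketch)

def vote(correspondence_set, method="avgconf", alpha=1.0, null_conf=0.0):
    """Pick the winning word from one correspondence set.

    correspondence_set: list of (word, confidence) pairs, one per input system.
    """
    n_systems = len(correspondence_set)
    confs = defaultdict(list)
    for word, conf in correspondence_set:
        # NULL arcs carry no recognizer confidence, so null_conf stands in.
        confs[word].append(null_conf if word == NULL else conf)

    def score(word):
        values = confs[word]
        frequency = len(values) / n_systems
        if method == "avgconf":
            confidence = sum(values) / len(values)
        elif method == "maxconf":
            confidence = max(values)
        else:
            raise ValueError("unsupported method: " + method)
        return alpha * frequency + (1.0 - alpha) * confidence

    # max() keeps the first best-scoring word, so ties go to the earliest system.
    return max(confs, key=score)
```

With alpha set to 1.0 the confidence term vanishes and the function reduces to voting by word frequency, which mirrors the '-m avgconf -a 1.0 -c 0.0' combination described above.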

REVISION HISTORY

This is the initial release.

BUGS/COMMENTS

jonathan.fiscus@nist.gov