  #---------------#
  # OTMEDIA LIA   #
  # HOWTO         #
  # version 1.0   #
  #---------------#
  1\ Main scripts options
  -----------------------
  
  The OTMEDIA scripts share five main options:
  -h   : display help
  -D   : debug mode
  -v n : verbose mode, from 1 (low) to 3 (high)
  -c   : check results
  -r   : force a script to rerun, without deleting work already done
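
  As an illustration of how a wrapper could accept the same five options, here is a
  minimal sketch using POSIX getopts. It is not taken from the OTMEDIA scripts: the
  function name and the variables DEBUG, VERBOSE, CHECK, RERUN and SHIFT_COUNT are
  assumptions chosen for this example.

```shell
#!/bin/sh
# Hypothetical sketch: parse -h, -D, -v n, -c and -r with POSIX getopts.
# All names below are illustrative assumptions, not the real script internals.
parse_common_opts() {
    DEBUG=0; VERBOSE=0; CHECK=0; RERUN=0
    OPTIND=1    # reset so the function can be called more than once
    while getopts "hDv:cr" opt; do
        case "$opt" in
            h) echo "usage: script [-h] [-D] [-v n] [-c] [-r] args..."; return 0 ;;
            D) DEBUG=1 ;;
            v) case "$OPTARG" in
                   1|2|3) VERBOSE=$OPTARG ;;
                   *) echo "verbose level must be 1, 2 or 3" >&2; return 1 ;;
               esac ;;
            c) CHECK=1 ;;
            r) RERUN=1 ;;
            *) return 1 ;;
        esac
    done
    # Number of arguments consumed by the options; the caller can `shift` by it.
    SHIFT_COUNT=$((OPTIND - 1))
}
```

  A caller would run `parse_common_opts "$@"` and then `shift "$SHIFT_COUNT"` to reach
  the positional arguments (audio file, result directory).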
  
  2\ Main scripts
  ---------------
      2.1\ FirstPass.sh
      -----------------
  
      FirstPass.sh performs speaker diarization and transcription of an audio file. It first converts the file to WAV format (16000 Hz, 16 bits, mono) if that has not already been done.
      If a .SRT file is present in the same directory as the audio file, the script copies it as well.

      $> FirstPass.sh [options] 110624FR2_20002100.wav result_directory
  
      Options:
      -f n : number of forks for speeral
  
      Output : result_directory/110624FR2_20002100/res_p1/
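
      The example above suggests that the output directory is named after the audio
      file without its .wav extension. The following sketch computes that path; the
      function name is hypothetical, and the naming rule is inferred from the example
      rather than confirmed from the script itself.

```shell
#!/bin/sh
# Hypothetical helper: derive the first-pass output directory from the
# audio file name, following the pattern shown in this HOWTO.
# Assumption: the show directory is the file's basename without ".wav".
first_pass_output_dir() {
    wav_file=$1
    result_dir=$2
    show_name=$(basename "$wav_file" .wav)
    # ${result_dir%/} drops one trailing slash, if any, to avoid "//".
    printf '%s/%s/res_p1/\n' "${result_dir%/}" "$show_name"
}
```

      For instance, `first_pass_output_dir 110624FR2_20002100.wav result_directory`
      prints the output path given above.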
  
      2.2\ SecondPass.sh
      ------------------
  
      SecondPass.sh performs speaker adaptation and transcription based on the first pass.
  
      $> SecondPass.sh [options] result_directory/110624FR2_20002100/
  
      Options:
      -f n : number of forks for speeral
  
      Output : result_directory/110624FR2_20002100/res_p2/
  
      2.3\ ConfPass.sh
      ----------------
  
      ConfPass.sh computes confidence measures from the second or third pass.

      $> ConfPass.sh [options] result_directory/110624FR2_20002100/ <res_p2|res_p3>
  
      Output : result_directory/110624FR2_20002100/conf/res_p2/scored_ctm/
              and result_directory/110624FR2_20002100.usf file
  
      2.4\ ExploitConfidencePass.sh
      -----------------------------
  
      It exploits the confidence pass measures to:
      - boost confident zones
      - find alternatives in non-confident zones (using the SOLR DB)
      - extend the lexicon
  
      $> ExploitConfidencePass.sh [options] result_directory/110624FR2_20002100
  
      Output :   result_directory/110624FR2_20002100/trigg/speeral
                 result_directory/110624FR2_20002100/LEX/speeral/_ext
                  
      2.5\ ThirdPass.sh
      ------------------
  
      ThirdPass.sh performs transcription using the SecondPass speaker adaptation together with the ExploitConfidencePass trigg files and new lexicon.
      
      $> ThirdPass.sh [options] result_directory/110624FR2_20002100/
  
      Options :
      -f n : number of forks for speeral
  
      Output : result_directory/110624FR2_20002100/conf/res_p3
      
      2.6\ RecomposePass.sh
      ---------------------

      RecomposePass.sh copies the results that are missing from ThirdPass, taking them from the Second and First Pass.
  
      $> RecomposePass.sh [options] result_directory/110624FR2_20002100/
  
      Output : result_directory/110624FR2_20002100/res_all
      
      2.7\ ScoringRes.sh
      ------------------
  
      ScoringRes.sh runs different scoring tools to score the results against the SRT file, if one exists.
  
      $> ScoringRes.sh [options] result_directory/110624FR2_20002100/
  
      Output : result_directory/110624FR2_20002100/scoring
       
      2.8\ CheckResults.sh
      --------------------
  
      CheckResults.sh parses the result directories to summarize the work already done.
  
      $> CheckResults.sh [options] result_directory
  
      Output : "Directory name      #plp    #res_p1 #treil_p2   #treil_p3   usf_p2  usf_p3"
              #plp number of plp files
              #res_p1 number of .res files at first pass
              #treil_p2 number of .treil files at second pass
              #treil_p3 number of .treil files at third pass
              usf_p2 usf file from confidence pass result on second pass (OK|ERR|NAN)
              usf_p3 usf file from confidence pass result on third pass (OK|ERR|NAN)
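
      To give an idea of what such counts involve, here is a rough sketch that counts
      files by extension with find. It is not how CheckResults.sh is implemented: both
      function names are hypothetical, and the recursive search for .plp files is an
      assumption, since this HOWTO does not say where they are stored.

```shell
#!/bin/sh
# Hypothetical helper: count regular files with a given extension
# anywhere below a directory (prints 0 if the directory is missing).
count_files() {
    dir=$1
    ext=$2
    find "$dir" -type f -name "*.$ext" 2>/dev/null | wc -l | tr -d ' '
}

# Print a tab-separated line: directory name, #plp, #res_p1.
# Only the first two counters of the real output are imitated here.
summarize_show() {
    show_dir=$1
    printf '%s\t%s\t%s\n' \
        "$(basename "$show_dir")" \
        "$(count_files "$show_dir" plp)" \
        "$(count_files "$show_dir/res_p1" res)"
}
```

      Calling `summarize_show result_directory/110624FR2_20002100` would print one
      line for that show.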
  
  3\ OneScriptToRuleThemAll.sh
  ----------------------------
  
      This script runs all the OTMEDIA LIA passes in a single call.
  
      $> OneScriptToRuleThemAll.sh [options] 110624FR2_20002100.wav result_directory 
  
      Options : (default options are available)
      -a Do every pass
      -1 Do First pass
      -2 Do Second pass
      -3 Do Third pass
      -C Do Confidence pass
      -e Do Exploit Confidence pass
      -R Do Recompose pass
      -s Do Scoring pass
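
      For comparison, the sketch below prints the individual commands from section 2
      that would run the passes one by one. The function name is hypothetical, and the
      order is an assumption inferred from the dependencies described in this HOWTO; it
      may differ from what OneScriptToRuleThemAll.sh actually does.

```shell
#!/bin/sh
# Hypothetical dry run: print the per-pass commands for one audio file.
# Assumptions: the pass order below, and a show directory named after
# the audio file without its ".wav" extension.
print_pipeline() {
    wav_file=$1
    result_dir=${2%/}
    show_dir="$result_dir/$(basename "$wav_file" .wav)/"
    echo "FirstPass.sh $wav_file $result_dir"
    echo "SecondPass.sh $show_dir"
    echo "ConfPass.sh $show_dir res_p2"
    echo "ExploitConfidencePass.sh $show_dir"
    echo "ThirdPass.sh $show_dir"
    echo "RecomposePass.sh $show_dir"
    echo "ScoringRes.sh $show_dir"
}
```

      Running `print_pipeline 110624FR2_20002100.wav result_directory` only prints the
      commands, so the list can be reviewed before anything is executed.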

  4\ Modify configuration
  -----------------------
  
      4.1\ Scripts configurations
      4.2\ Speeral configurations
  
  5\ Modify binaries
  ------------------