  #---------------#
  # OTMEDIA LIA   #
  # HOWTO         #
  # version 1.0   #
  #---------------#
  1\ Main scripts options
  2\ Main scripts
      2.1\ FirstPass.sh
      2.2\ SecondPass.sh
      2.3\ ConfPass.sh
      2.4\ ExploitConfidencePass.sh
      2.5\ ThirdPass.sh
      2.6\ RecomposePass.sh
      2.7\ ScoringRes.sh
      2.8\ CheckResults.sh
  3\ OneScriptToRuleThemAll.sh
  4\ Modify configuration
      4.1\ Scripts configurations
      4.2\ Speeral configurations
  5\ Modify binaries
  6\ Examples
  
  
  1\ Main scripts options
  -----------------------
  
  There are five main options for otmedia scripts.
  -h : for help
  -D : Debug mode
  -v n : Verbose mode 1 low to 3 high
  -c : Check results (will check process and create log files)
  -r : force a script to rerun without deleting work already done
  
  2\ Main scripts
  ---------------
  
      Each script has a configuration file in OTMEDIA_HOME/cfg/<scriptname>.cfg.
      The main options can be modified individually, either through the command-line arguments or in the configuration file.
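
      As a minimal sketch of the layout described above, the helper below prints the
      configuration file path for a given script. The function name otmedia_cfg_path is
      hypothetical, and the sketch assumes that <scriptname> is the script file name
      without its .sh suffix, which this document does not confirm.

```shell
# Hypothetical helper: print the configuration file path of a main script.
# Assumption: <scriptname> is the script file name without the ".sh" suffix.
otmedia_cfg_path() {
    script_name=$(basename "$1" .sh)
    # Abort with a message if OTMEDIA_HOME is unset or empty.
    printf '%s/cfg/%s.cfg\n' "${OTMEDIA_HOME:?OTMEDIA_HOME is not set}" "$script_name"
}
```

      $> otmedia_cfg_path FirstPass.sh
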
      2.1\ FirstPass.sh
      -----------------
  
      FirstPass.sh performs speaker diarization and transcription of an audio file. It first converts the file to wav format (16000 Hz, 16 bits, mono) if it is not already in that format.
      If a .SRT file is present in the same directory as the audio file, it is copied as well.
  
      $> FirstPass.sh [options] 110624FR2_20002100.wav result_directory
  
      Options:
      -f n : number of forks for speeral
  
      Output : result_directory/110624FR2_20002100/res_p1/
              and .ctm, .trs and .txt files.
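
      The output location follows from the audio file name. A minimal sketch, assuming
      the per-show directory is named after the audio file without its directory and
      extension (as in the example above); the function name otmedia_show_dir is
      hypothetical.

```shell
# Hypothetical helper: print the per-show result directory for an audio file.
# Assumption: the show name is the audio file name without directory or extension.
otmedia_show_dir() {
    audio_file=$1
    result_dir=${2%/}              # drop one trailing slash, if any
    show=$(basename "$audio_file")
    show=${show%.*}                # strip the extension
    printf '%s/%s\n' "$result_dir" "$show"
}
```

      The printed directory is the one the later passes take as their argument:
      $> ls "$(otmedia_show_dir 110624FR2_20002100.wav result_directory)/res_p1"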
  
      2.2\ SecondPass.sh
      ------------------
  
      SecondPass.sh performs speaker adaptation and transcription based on the first pass.
  
      $> SecondPass.sh [options] result_directory/110624FR2_20002100/
  
      Options:
      -f n : number of forks for speeral
  
      Output : result_directory/110624FR2_20002100/res_p2/
              and .ctm, .trs and .txt files.
  
      2.3\ ConfPass.sh
      ----------------
  
      ConfPass.sh computes confidence measures from the second or third pass.
  
      $> ConfPass.sh [options] result_directory/110624FR2_20002100/ <res_p2|res_p3>
  
      Output : result_directory/110624FR2_20002100/conf/res_p2/scored_ctm/
              and result_directory/110624FR2_20002100.usf file
  
      2.4\ ExploitConfidencePass.sh
      -----------------------------
  
      It uses the confidence pass measures to :
      - boost confident zones
      - find alternatives in non-confident zones (using the SOLR DB)
      - extend the lexicon
  
      $> ExploitConfidencePass.sh [options] result_directory/110624FR2_20002100
  
      Output :   result_directory/110624FR2_20002100/trigg/speeral
                 result_directory/110624FR2_20002100/LEX/speeral/_ext
                  
      2.5\ ThirdPass.sh
      ------------------
  
      ThirdPass.sh performs transcription using the SecondPass speaker adaptation together with the trigg files and the new lexicon from ExploitConfidencePass.
      
      $> ThirdPass.sh [options] result_directory/110624FR2_20002100/
  
      Options :
      -f n : number of forks for speeral
  
      Output : result_directory/110624FR2_20002100/conf/res_p3
              and .ctm, .trs and .txt files.
      
      2.6\ RecomposePass.sh
      --------------------
  
      RecomposePass.sh fills in results that are missing from ThirdPass by copying them from the Second and First Pass.
  
      $> RecomposePass.sh [options] result_directory/110624FR2_20002100/
  
      Output : result_directory/110624FR2_20002100/res_all
              and .ctm, .trs and .txt files.
      
      2.7\ ScoringRes.sh
      ------------------
  
      ScoringRes.sh runs several scoring tools to score the results, using the SRT file if it exists.
  
      $> ScoringRes.sh [options] result_directory/110624FR2_20002100/
  
      Output : result_directory/110624FR2_20002100/scoring
       
      2.8\ CheckResults.sh
      --------------------
  
      CheckResults.sh parses result directories to summarize the work already done.
  
      $> CheckResults.sh [options] result_directory
  
      Output : "Directory name      #plp    #res_p1 #treil_p2   #treil_p3   usf_p2  usf_p3"
              #plp number of plp files
              #res_p1 number of .res files at first pass
              #treil_p2 number of .treil files at second pass
              #treil_p3 number of .treil files at third pass
              usf_p2 usf file from confidence pass result on second pass (OK|ERR|NAN)
              usf_p3 usf file from confidence pass result on third pass (OK|ERR|NAN)
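
      The summary can be filtered to find shows that still need work. A minimal sketch,
      assuming each data row holds the seven whitespace-separated columns listed above
      and that directory names contain no spaces; the real output layout may differ.
      The function name otmedia_incomplete is hypothetical.

```shell
# Hypothetical filter: read CheckResults.sh output on stdin and print the
# directories whose usf_p2 or usf_p3 column is not OK (ERR or NAN).
# Assumptions: seven whitespace-separated columns per data row, no spaces in
# directory names; the header line is skipped.
otmedia_incomplete() {
    awk 'NF == 7 && $1 != "Directory" && ($6 != "OK" || $7 != "OK") { print $1 }'
}
```

      $> CheckResults.sh result_directory | otmedia_incomplete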
  
  3\ OneScriptToRuleThemAll.sh
  ----------------------------
  
      This script runs all the OTMEDIA LIA passes in one call.
  
      $> OneScriptToRuleThemAll.sh [options] 110624FR2_20002100.wav result_directory 
  
      Options : (default options are available)
      -a Do every pass
      -1 Do First pass
      -2 Do Second pass
      -3 Do Third pass
      -C Do Confidence pass
      -e Do Exploit Confidence pass
      -R Do Recompose pass
      -s Do Scoring pass
  4\ Modify configuration
  -----------------------
  
      Most of the main scripts have a configuration file (cfg/ directory).
      There you can change a script's behaviour and the data it uses.
      The Speeral configuration files can also be changed (tools/Speeral/CFG/ directory).
  
      4.1\ Scripts configurations
      ---------------------------
  
          In the script configuration files (OTMEDIA_HOME/cfg/) you can change default options such as architecture, verbosity ...
          Scripts that use Speeral also hold information on the binaries, the model paths and names, and the Speeral configuration file.
      4.2\ Speeral configurations
      ---------------------------

          Speeral configuration files are in the OTMEDIA_HOME/tools/Speeral/CFG directory.
          The .tmp files are used by install.sh to generate the .xml files.
          You can modify the .xml files to suit your needs, but most data settings are passed as arguments when the scripts call speeral.
  5\ Modify binaries
  ------------------
  
      Binaries can be found in the bin and tools directories.
      Some binaries are compiled in both 32 and 64 bits. By default, all binaries are compiled in 32 bits.
      You can update binaries as you need.
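
      Since both 32-bit and 64-bit builds are mentioned, it can help to check which one a
      given binary is before replacing it. A minimal sketch, assuming the binaries are
      Linux ELF files (this document does not say so); elf_bits is a hypothetical helper
      name.

```shell
# Hypothetical helper: print 32 or 64 for an ELF binary.
# Assumption: the binaries are ELF files. Byte 5 of the ELF header (EI_CLASS)
# is 1 for 32-bit objects and 2 for 64-bit objects.
elf_bits() {
    # The first four bytes of an ELF file are 0x7f 'E' 'L' 'F'.
    if [ "$(head -c 4 "$1")" != "$(printf '\177ELF')" ]; then
        echo "not an ELF file: $1" >&2
        return 1
    fi
    # Read the single class byte at offset 4 as an unsigned decimal number.
    case $(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n') in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo "unknown ELF class: $1" >&2; return 1 ;;
    esac
}
```

      $> elf_bits path/to/binary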
  
      To modify tools binaries, you need to download a compatible version.
          lia_ltbox can be found in /labo/Tools/
          Speeral (binaries) can be compiled from the git remote git@gitlia.univ-avignon.fr:vaudriguard/libspeeral.git . Do not modify the Speeral data from OTMEDIA (unless you know what you are doing).
          In PACKAGES_MESURES_V1.0 you can update the icsiboost binary (in bin) from the project page : https://code.google.com/p/icsiboost/
          For QUOTE_FINDER and SIGMUND please contact support.
  6\ Examples
  -----------
  
      Conventional use :
      $> FirstPass.sh 110624FR2_20002100.wav /my/output/directory/ &&
         SecondPass.sh /my/output/directory/110624FR2_20002100 &&
         ConfPass.sh /my/output/directory/110624FR2_20002100 res_p2 &&
         ExploitConfidencePass.sh /my/output/directory/110624FR2_20002100 &&
         ConfPass.sh /my/output/directory/110624FR2_20002100 res_p3 &&
         ThirdPass.sh /my/output/directory/110624FR2_20002100 &&
         RecomposePass.sh /my/output/directory/110624FR2_20002100 &&
         ScoringRes.sh /my/output/directory/110624FR2_20002100
      or
      $> OneScriptToRuleThemAll.sh -a 110624FR2_20002100.wav /my/output/directory/
  
      Rerun SecondPass and ConfPass with verbose level 3 and 4 speeral forks :
      $> SecondPass.sh -r -f 4 -v 3 /my/output/directory/110624FR2_20002100 && ConfPass.sh -r -v 3 /my/output/directory/110624FR2_20002100 res_p2
      or
      $> OneScriptToRuleThemAll.sh -r -2 -C 2 -v 3 -f 4 110624FR2_20002100.wav /my/output/directory/
      Run with log files :
      $> OneScriptToRuleThemAll.sh -a -c -v3 -f4 110624FR2_20002100.wav /my/output/directory/
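
      Several audio files can be processed in a loop. A minimal sketch built on the
      command above; the function name otmedia_batch and the OTMEDIA_RUNNER variable are
      assumptions made for this example (they are not part of OTMEDIA), the latter so
      that the command can be swapped with "echo" for a dry run.

```shell
# Hypothetical batch helper: run every pass on each audio file given.
# OTMEDIA_RUNNER is an assumption of this sketch, not an OTMEDIA variable.
otmedia_batch() {
    out_dir=$1
    shift
    failed=0
    for wav in "$@"; do
        # -a runs every pass, -c checks the process and creates log files.
        if ! "${OTMEDIA_RUNNER:-OneScriptToRuleThemAll.sh}" -a -c "$wav" "$out_dir"; then
            echo "failed: $wav" >&2
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]    # succeed only if every file was processed
}
```

      $> otmedia_batch /my/output/directory/ *.wav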

  Good Luck ! Luke !
  And the force be with you !