Commit 665a8dac322f0a4232d39c379136a945f4d76081

Authored by Jean-François Rey
1 parent b9a54507e8
Exists in master

! follow the white rabbit !

Showing 6 changed files with 232 additions and 28 deletions

#---------------#
#  OTMEDIA LIA  #
#     HOWTO     #
#  version 1.0  #
#---------------#

1\ Main options
---------------

There are five main options common to all otmedia scripts:
-h   : print help
-D   : debug mode
-v n : verbose mode, from 1 (low) to 3 (high)
-c   : check results
-r   : force a script to rerun, without deleting work already done

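A script honoring these flags would typically parse them with `getopts`; a minimal sketch (option letters are the ones listed above, variable names are illustrative, not taken from the real scripts):

```shell
#!/bin/bash
# Illustrative parser for the common otmedia options (-h -D -v n -c -r).
parse_common_opts() {
  DEBUG=0; VERBOSE=0; CHECK=0; RERUN=0
  local OPTIND opt
  while getopts "hDv:cr" opt; do
    case $opt in
      h) echo "usage: script [-h] [-D] [-v n] [-c] [-r] args..." ;;
      D) DEBUG=1 ;;          # -D : debug mode
      v) VERBOSE=$OPTARG ;;  # -v n : verbosity level 1..3
      c) CHECK=1 ;;          # -c : check results
      r) RERUN=1 ;;          # -r : force rerun
    esac
  done
}
parse_common_opts -D -v 2
echo "DEBUG=$DEBUG VERBOSE=$VERBOSE CHECK=$CHECK RERUN=$RERUN"
```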
2\ Main scripts
---------------
2.1\ FirstPass.sh
-----------------

FirstPass.sh does speaker diarization and transcription of an audio file. It converts the input to wav format (16000 Hz, 16 bits, mono) if that is not already the case.
If a .SRT file is present in the same directory as the audio file, it is copied along.

$> FirstPass.sh [options] 110624FR2_20002100.wav result_directory

Options:
-f n : number of forks for speeral

Output : result_directory/110624FR2_20002100/res_p1/

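The exact conversion command FirstPass.sh runs is not shown here; a plausible avconv invocation producing that format would be the following (the input file name is hypothetical, the sketch only builds and prints the command):

```shell
# Hypothetical avconv invocation producing the wav format FirstPass.sh
# expects: 16000 Hz sample rate, mono, 16-bit signed PCM.
in="110624FR2_20002100.mp3"   # example source file, not from the repo
out="${in%.*}.wav"
cmd="avconv -i $in -ar 16000 -ac 1 -acodec pcm_s16le $out"
echo "$cmd"
```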
2.2\ SecondPass.sh
------------------

SecondPass.sh does speaker adaptation and a new transcription based on the first pass.

$> SecondPass.sh [options] result_directory/110624FR2_20002100/

Options:
-f n : number of forks for speeral

Output : result_directory/110624FR2_20002100/res_p2/

2.3\ ConfPass.sh
----------------

ConfPass.sh computes confidence measures using the second or third pass.

$> ConfPass.sh [options] result_directory/110624FR2_20002100/ <res_p2|res_p3>

Output : result_directory/110624FR2_20002100/conf/res_p2/scored_ctm/
         and the result_directory/110624FR2_20002100.usf file

2.4\ ExploitConfidencePass.sh
-----------------------------

It exploits the confidence measures to :
- boost confident zones
- find alternatives in non-confident zones (using the SOLR DB)
- extend the lexicon

$> ExploitConfidencePass.sh [options] result_directory/110624FR2_20002100

Output : result_directory/110624FR2_20002100/trigg/speeral
         result_directory/110624FR2_20002100/LEX/speeral/_ext

2.5\ ThirdPass.sh
-----------------

ThirdPass.sh runs transcription again, using the SecondPass speaker adaptation together with the trigg files and extended lexicon produced by ExploitConfidencePass.

$> ThirdPass.sh [options] result_directory/110624FR2_20002100/

Options :
-f n : number of forks for speeral

Output : result_directory/110624FR2_20002100/conf/res_p3

2.6\ RecomposePass.sh
---------------------

RecomposePass.sh copies results that are missing from the Third Pass out of the Second and First Pass.

$> RecomposePass.sh [options] result_directory/110624FR2_20002100/

Output : result_directory/110624FR2_20002100/res_all

2.7\ ScoringRes.sh
------------------

ScoringRes.sh runs different scoring tools to score the results, using the SRT file if it exists.

$> ScoringRes.sh [options] result_directory/110624FR2_20002100/

Output : result_directory/110624FR2_20002100/scoring

2.8\ CheckResults.sh
--------------------

CheckResults.sh parses the result directories to synthesize the work already done.

$> CheckResults.sh [options] result_directory

Output : "Directory name #plp #res_p1 #treil_p2 #treil_p3 usf_p2 usf_p3"
         #plp      number of plp files
         #res_p1   number of .res files at first pass
         #treil_p2 number of .treil files at second pass
         #treil_p3 number of .treil files at third pass
         usf_p2    usf file from the confidence pass on the second pass (OK|ERR|NAN)
         usf_p3    usf file from the confidence pass on the third pass (OK|ERR|NAN)

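The synthesized line is whitespace-separated, so it is easy to post-process in the shell; a sketch (the sample line below is invented, not real output):

```shell
# Parse one (invented) CheckResults.sh summary line and flag runs whose
# third-pass confidence file is not OK.
line="110624FR2_20002100 12 12 12 12 OK ERR"
read -r dir plp res_p1 treil_p2 treil_p3 usf_p2 usf_p3 <<< "$line"
if [ "$usf_p3" != "OK" ]; then
  echo "$dir: third-pass usf is $usf_p3"
fi
```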
3\ OneScriptToRuleThemAll.sh
----------------------------

The script that runs all the OTMEDIA LIA passes in one call.

$> OneScriptToRuleThemAll.sh [options] 110624FR2_20002100.wav result_directory

Options : (the main options from section 1 are also available)
-a Do every pass
-1 Do First pass
-2 Do Second pass
-3 Do Third pass
-C Do Confidence pass
-e Do Exploit Confidence pass
-R Do Recompose pass
-s Do Scoring pass

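With -a, the driver presumably chains the scripts from section 2 in dependency order; one plausible ordering, collected into a string here just so the sketch is self-contained and checkable:

```shell
# Plausible order in which -a runs the pass scripts (names from section 2;
# the real OneScriptToRuleThemAll.sh may differ).
ORDER=""
for pass in FirstPass.sh SecondPass.sh ConfPass.sh ExploitConfidencePass.sh \
            ThirdPass.sh RecomposePass.sh ScoringRes.sh; do
  ORDER="$ORDER $pass"
done
ORDER="${ORDER# }"   # trim the leading space
echo "$ORDER"
```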
#---------------#
#  OTMEDIA LIA  #
#    INSTALL    #
# version : 1.0 #
#---------------#

OTMEDIA LIA ready to use ? Really ?
No ! You have to do some manual configuration for a few features.
Let's see...

SUMMARY
-------

1\ Before installation
2\ install.sh script
3\ SOLR install


1\ Before installation
----------------------

- Check and install the dependencies.
- On a 64-bit architecture, be sure you can run 32-bit programs.
- Have 300 GB of free space.
- Have access to the network and to the nyx server.

2\ install.sh script
--------------------

The install.sh script will do most of the work.
It checks the dependencies and configures the tools for each pass.
By default it does a complete install (300 GB).

You can modify its behavior by editing install.sh :

To disable lexicon adaptation using the SOLR DB, set EXPLOITCONFPASS to 0 (this skips most of the disk usage, about 290 GB).
To disable confidence measures, set CONFPASS to 0.
To disable the second and third passes, set PASS2 to 0.

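These three switches sit near the top of install.sh; the relevant fragment looks like this (values shown are the defaults):

```shell
# Pass switches at the top of install.sh (1 = enable, 0 = disable).
PASS1=1           # First Pass
PASS2=1           # Second and Third Pass
CONFPASS=1        # Confidence Pass
EXPLOITCONFPASS=1 # SOLR query and trigg (most of the 300 GB)
```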
Run install.sh and follow the white rabbit.

3\ SOLR install
---------------

The install.sh script downloads otmedia-2013-04.tar.gz and untars it in OTMEDIA_HOME/tools/SOLR/ .
See the SOLR.INSTALL file to install the OTMEDIA SOLR DB.
  ___ _____ __  __ _____ ____ ___    _        _     ___    _
 / _ \_   _|  \/  | ____|  _ \_ _|  / \      | |   |_ _|  / \
| | | || | | |\/| |  _| | | | | |  / _ \     | |    | |  / _ \
| |_| || | | |  | | |___| |_| | | / ___ \    | |___ | | / ___ \
 \___/ |_| |_|  |_|_____|____/___/_/   \_\   |_____|___/_/ \_\


#---------------#
#  OTMEDIA LIA  #
#     README    #
#  version 1.0  #
#---------------#

DESCRIPTION
-----------

OTMEDIA means "Observatoire Transmedia"; its main objective is to study the evolution and transformation of the media world.
The scientific objective of the project is the creation of a new generation of media observatory,
based on an interactive (semi-automatic) transmedia analysis system, to understand
the world of information and its developments.

Web Site : http://www.otmedia.fr

The OTMEDIA LIA project is a set of tools to transcribe radio and TV shows.
It does multiple things :
- First pass : default transcription with speeral and speaker diarization.
- Second pass : speaker adaptation and a second transcription pass with speeral.
- Confidence pass : computes confidence measures from the transcription output.
- Exploit Confidence Measure : uses SOLR DB data to extend the lexicon in low-confidence zones and create trigg files.
- Third pass : like the second pass, but using the new lexicon and the trigg files.

DEPENDENCIES
------------

GNU Toolchain
Available from : http://www.gnu.org
and debian packages

Compiling, linking, and building applications.


avconv (libav-tools >= 0.8)
Available from : http://libav.org
and debian package

avconv is a very fast video and audio converter.

JAVA JDK and JRE ( >= 6)
Available from : http://www.oracle.com
and debian packages

JAVA Development Kit and JAVA Runtime Environment.

Python ( >= 2.7.0)
Available from : http://www.python.org/
and debian packages

Python is a programming language.

Perl ( >= 5.0.0)
Available from : http://www.perl.org/
and debian packages

Perl is a programming language.

iconv ( >= 2.0.0)
Available from : http://www.gnu.org
and debian package

Character set conversion.

csh shell (csh)
Available in debian packages.

The C shell was originally written at UCB to overcome limitations in the
Bourne shell. Its flexibility and comfort (at that time) quickly made it
the shell of choice until more advanced shells like ksh, bash, zsh or
tcsh appeared. Most of the latter incorporate features original to csh.

The SRI Language Modeling Toolkit (SRILM >= 1.6.0)
Available from : http://www.speech.sri.com/projects/srilm/

SRILM is a toolkit for building and applying statistical language models.

Tomcat ( >= 7.0.0)
Available from : http://tomcat.apache.org/
and debian packages

Apache Tomcat is an open source software implementation of the Java Servlet and JavaServer Pages technologies.

INSTALLATION
------------

See the INSTALL file for the installation procedure.

Quick install below.

Before launching the installation :

Be certain that all dependencies are satisfied.
Have 300 GB of free space for a complete install.

Issue the following commands to the shell :
$> ./install.sh
$> export OTMEDIA_HOME=path/to/OTMEDIA/directory

Read SOLR.INSTALL part 3 to install the SOLR DB.

RUNNING
-------

See the HOWTO file.

ACKNOWLEDGEMENTS
----------------

Many thanks to Jean-François Rey for useful help and work done.

KNOWN BUGS
----------

Many.
For bug reports, please contact Pascal Nocera at pascal.nocera@univ-avignon.fr

COPYRIGHT
---------

See the COPYING file.

AUTHORS
-------

Jean-François Rey <jean-francois.rey@univ-avignon.fr>
Hugo Mauchrétien <hugo.mauchretien@univ-avignon.fr>
Emmanuel Ferreira <emmanuel.ferreira@univ-avignon.fr>

################
# SOLR INSTALL #
################
#
# Author Jean-François Rey
# Version : 1.0
# Date : 18/07/2013
#

1/ Edit install.sh and set CONFPASS=1

2/ Run install.sh; this will check that tomcat is installed, download and untar the otmedia SOLR DB, and ask for the solr service info.

3/ Configure Tomcat and SOLR

The otmedia-2013-04 SOLR DB is untarred in :
SOLR_OTMEDIA_PATH=OTMEDIA_HOME/tools/SOLR/otmedia-2013-04

3.1/ Set context file
---------------------

- in SOLR_OTMEDIA_PATH/solr/otmedia-document/solr-tomcat-deploy/solr-otmedia-document.xml
  change DocBase to DocBase="SOLR_OTMEDIA_PATH/solr/otmedia-document/apache-solr-3.5.0.war"
  and value to value="SOLR_OTMEDIA_PATH/solr/otmedia-document/"

- in SOLR_OTMEDIA_PATH/solr/otmedia-multimedia/solr-tomcat-deploy/solr-otmedia-multimedia.xml
  change DocBase to DocBase="SOLR_OTMEDIA_PATH/solr/otmedia-multimedia/apache-solr-3.5.0.war"
  and value to value="SOLR_OTMEDIA_PATH/solr/otmedia-multimedia/"

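For orientation, a Solr-3.5-on-Tomcat context file of this kind typically looks as follows once the two substitutions are made. This is a sketch only: the element layout is the stock Solr/Tomcat convention (the docBase attribute and a solr/home Environment entry), the absolute paths are placeholders, and the real file shipped in the archive may differ.

```xml
<!-- Sketch: solr-otmedia-document.xml after editing (paths are placeholders) -->
<Context docBase="/path/to/SOLR_OTMEDIA_PATH/solr/otmedia-document/apache-solr-3.5.0.war"
         debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String"
               value="/path/to/SOLR_OTMEDIA_PATH/solr/otmedia-document/"
               override="true"/>
</Context>
```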
3.2/ SOLR data configuration
----------------------------

- in SOLR_OTMEDIA_PATH/solr/otmedia-document/conf/solrconfig.xml
  change the data dir (solr.data.dir) to SOLR_OTMEDIA_PATH/index/otmedia-document/

- in SOLR_OTMEDIA_PATH/solr/otmedia-multimedia/conf/solrconfig.xml
  change the data dir (solr.data.dir) to SOLR_OTMEDIA_PATH/index/otmedia-multimedia/

3.3/ Add SOLR DB to Tomcat
--------------------------

- in tomcat/Catalina/localhost/ (typically /etc/tomcat/Catalina/localhost or /var/lib/tomcat/conf/Catalina/localhost)
  run : $> ln -s SOLR_OTMEDIA_PATH/solr/otmedia-document/solr-tomcat-deploy/solr-otmedia-document.xml solr-otmedia-document.xml
  run : $> ln -s SOLR_OTMEDIA_PATH/solr/otmedia-multimedia/solr-tomcat-deploy/solr-otmedia-multimedia.xml solr-otmedia-multimedia.xml

4/ Tomcat trouble

4.1/ SOLR uses a lot of memory; you need to increase the java heap space !
--------------------------------------------------------------------------

- in catalina.sh (/usr/share/tomcat/bin)
  add CATALINA_OPTS="$CATALINA_OPTS -Xms256m -Xmx512m"

4.2/ Directory permissions
--------------------------

- SOLR_OTMEDIA_PATH and its subdirectories (and files) need to belong to the tomcat group (and to the tomcat user if the default user doesn't belong to the tomcat group).
  chgrp -R tomcat7 otmedia-2013-04
  chmod -R g+rx otmedia-2013-04

5/ Test

You can test these requests (change the ip and port):
http://localhost:8080/solr-otmedia-multimedia/select?q=test+bonus+&fq=docDate:[2011-12-30T00\:00\:01Z+TO+2012-01-01T23\:59\:59Z]
http://localhost:8080/solr-otmedia-document/select?q=test+bonus+&fq=docDate:[2011-12-30T00\:00\:01Z+TO+2012-01-01T23\:59\:59Z]

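The backslashes escape the ':' characters inside the Solr date-range filter so they are not read as field separators; a sketch that assembles the first test URL from its parts (host and core name are the ones above):

```shell
# Build one of the test URLs; single quotes keep the backslashes that
# escape the ':' characters in the timestamps.
host="localhost:8080"
core="solr-otmedia-multimedia"
from='2011-12-30T00\:00\:01Z'
to='2012-01-01T23\:59\:59Z'
url="http://${host}/${core}/select?q=test+bonus+&fq=docDate:[${from}+TO+${to}]"
echo "$url"
```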
#!/bin/bash

#-------------------#
#    OTMEDIA LIA    #
#  Install script   #
#  version : 1.0.0  #
#-------------------#

# Color variables
txtgrn=$(tput setaf 2) # Green
txtylw=$(tput setaf 3) # Yellow
txtblu=$(tput setaf 4) # Blue
txtpur=$(tput setaf 5) # Purple
txtcyn=$(tput setaf 6) # Cyan
txtwht=$(tput setaf 7) # White
txtrst=$(tput sgr0)    # Text reset.
#/color

#
### Global Variables
#
PWD=$(pwd)
OTMEDIA_HOME=$PWD
test=$(arch)
if [ "$test" == "x86_64" ]; then ARCH=".64"; else ARCH=""; fi
#/Global

#
# Put to 0 to disable dependencies of a pass
# and 1 to enable
#
PASS1=1           # First Pass
PASS2=1           # Second and Third Pass
CONFPASS=1        # Confidence Pass
EXPLOITCONFPASS=1 # SOLR query and trigg

echo -e "\nWill do install for :"
if [ $PASS1 -eq 1 ];then echo "- Pass 1";fi
if [ $PASS2 -eq 1 ];then echo "- Pass 2";fi
if [ $CONFPASS -eq 1 ];then echo "- Confidence Pass";fi
if [ $EXPLOITCONFPASS -eq 1 ];then echo "- Exploit Confidence Pass";fi

#
### CHECK Dependencies ###
#
echo -e "\n\t${txtblu}Check Dependencies${txtrst}\n"

## make
test=$(whereis make)
if [ "$test" == "make:" ]
then
    echo -e "${txtpur}ERROR${txtrst} make not found\n You have to install make\n sudo apt-get install make"
    exit 1;
fi
echo -e "make \t ${txtgrn}OK${txtrst}"

## CC
test=$(whereis cc)
if [ "$test" == "cc:" ]
then
    echo -e "${txtpur}ERROR${txtrst} cc not found\n You have to install cc\n sudo apt-get install gcc"
    exit 1;
fi
echo -e "cc \t ${txtgrn}OK${txtrst}"

## AVCONV
test=$(whereis avconv)
if [ "$test" == "avconv:" ]
then
    echo -e "${txtpur}ERROR${txtrst} avconv not found\n You have to install avconv\n sudo apt-get install libav-tools"
    exit 1;
fi
echo -e "libav-tools : avconv \t ${txtgrn}OK${txtrst}"

## JAVA
test=$(whereis java)
if [ "$test" == "java:" ]
then
    echo -e "${txtpur}ERROR${txtrst} java not found\n You have to install java JRE\n sudo apt-get install openjdk-7-jre"
    exit 1;
fi
echo -e "Java : JRE \t ${txtgrn}OK${txtrst}"
test=$(whereis javac)
if [ "$test" == "javac:" ]
then
    echo -e "${txtpur}ERROR${txtrst} javac not found\n You have to install java JDK\n sudo apt-get install openjdk-7-jdk"
    exit 1;
fi
echo -e "Java : JDK \t ${txtgrn}OK${txtrst}"

if [ $EXPLOITCONFPASS -eq 1 ]
then
    ## Python
    test=$(whereis python)
    if [ "$test" == "python:" ]
    then
        echo -e "${txtpur}ERROR${txtrst} python not found\n You have to install python\n sudo apt-get install python"
        exit 1;
    fi
    echo -e "python : \t ${txtgrn}OK${txtrst}"

    ## csh shell
    test=$(whereis csh)
    if [ "$test" == "csh:" ]
    then
        echo -e "${txtpur}ERROR${txtrst} csh shell not found\n You have to install csh shell\n sudo apt-get install csh"
        exit 1;
    fi
    echo -e "csh shell : \t ${txtgrn}OK${txtrst}"
fi

## Perl
test=$(whereis perl)
if [ "$test" == "perl:" ]
then
    echo -e "${txtpur}ERROR${txtrst} perl not found\n You have to install perl\n sudo apt-get install perl"
    exit 1;
fi
echo -e "perl : \t ${txtgrn}OK${txtrst}"

## iconv
test=$(whereis iconv)
if [ "$test" == "iconv:" ]
then
    echo -e "${txtpur}ERROR${txtrst} iconv not found\n You have to install iconv\n sudo apt-cache search iconv"
    exit 1;
fi
echo -e "iconv : \t ${txtgrn}OK${txtrst}"

## SRI LM
if [ -z "$SRILM" ] && [ -z "$MACHINE_TYPE" ]
then
    echo -e "${txtpur}ERROR${txtrst} SRILM toolkit variables are not defined (SRILM and MACHINE_TYPE)\n You have to install SRILM Toolkit\n"
    exit 1;
fi
export SRILM_BIN=$SRILM/bin/$MACHINE_TYPE
echo -e "SRILM toolkit : \t ${txtgrn}OK${txtrst}"

### Speeral Configuration ###

echo -e "\n\t${txtblu}Speeral configuration${txtrst}\n"
echo -e "Download Speeral bin and data :"
scp -r rey@nyx:~/OTMEDIA_DATA/Speeral $OTMEDIA_HOME/tools/
echo -e "\n\t${txtblu}Generating Speeral configuration files :${txtrst}\n"
cat $PWD/tools/Speeral/CFG/SpeeralFirstPass.xml.tmp | sed -e "s|<nom>[^<]*</nom>|<nom>$PWD/tools/Speeral/LEX/LEXIQUE_V6.speer</nom>|g" \
    | sed -e "s|<ngramme>[^<]*</ngramme>|<ngramme>$PWD/tools/Speeral/LM/ML_4gOTMEDIA_LEXIQUE_V6</ngramme>|g" \
    | sed -e "s|<binode>[^<]*</binode>|<binode>$PWD/tools/Speeral/LEX/LEXIQUE_V6.speer.bin</binode>|g" \
    > $PWD/tools/Speeral/CFG/SpeeralFirstPass.xml
echo $PWD/tools/Speeral/CFG/SpeeralFirstPass.xml
cat $PWD/tools/Speeral/CFG/SpeeralSecondPass.xml.tmp | sed -e "s|<nom>[^<]*</nom>|<nom>$PWD/tools/Speeral/LEX/LEXIQUE_V6.speer</nom>|g" \
    | sed -e "s|<ngramme>[^<]*</ngramme>|<ngramme>$PWD/tools/Speeral/LM/ML_4gOTMEDIA_LEXIQUE_V6</ngramme>|g" \
    | sed -e "s|<binode>[^<]*</binode>|<binode>$PWD/tools/Speeral/LEX/LEXIQUE_V6.speer.bin</binode>|g" \
    > $PWD/tools/Speeral/CFG/SpeeralSecondPass.xml
echo $PWD/tools/Speeral/CFG/SpeeralSecondPass.xml
cat $PWD/tools/Speeral/CFG/SpeeralThirdPass.xml.tmp | sed -e "s|<nom>[^<]*</nom>|<nom>$PWD/tools/Speeral/LEX/LEXIQUE_V6.speer</nom>|g" \
    | sed -e "s|<ngramme>[^<]*</ngramme>|<ngramme>$PWD/tools/Speeral/LM/ML_4gOTMEDIA_LEXIQUE_V6</ngramme>|g" \
    | sed -e "s|<binode>[^<]*</binode>|<binode>$PWD/tools/Speeral/LEX/LEXIQUE_V6.speer.bin</binode>|g" \
    > $PWD/tools/Speeral/CFG/SpeeralThirdPass.xml
echo $PWD/tools/Speeral/CFG/SpeeralThirdPass.xml

if [ $EXPLOITCONFPASS -eq 1 ]
then
    ### LIA ltbox ###
    echo -e "\t${txtblu}Install lia_ltbox${txtrst}\n"
    export LIA_TAGG_LANG="french"
    export LIA_TAGG="$OTMEDIA_HOME/tools/lia_ltbox/lia_tagg/"
    export LIA_PHON_REP="$OTMEDIA_HOME/tools/lia_ltbox/lia_phon/"
    export LIA_BIGLEX="$OTMEDIA_HOME/tools/lia_ltbox/lia_biglex/"

    ### config lia_phon
    cd $LIA_PHON_REP
    make all > /dev/null
    make ressource > /dev/null
    ### config lia_tagg
    cd $LIA_TAGG
    make all > /dev/null
    make ressource.french > /dev/null
    ### config lia_biglex
    cd $LIA_BIGLEX
    make -f makefile.biglex > /dev/null
    cd $OTMEDIA_HOME

187 ### SOLR DB ### 186 ### SOLR DB ###
188 # Tomcat fisrtly 187 # Tomcat fisrtly
189 test=$(dpkg -l | grep "^ii" | grep tomcat) 188 test=$(dpkg -l | grep "^ii" | grep tomcat)
190 if [ "$test" == "" ] 189 if [ "$test" == "" ]
191 then 190 then
192 echo -e "${txtpur}ERROR${txtrst} TOMCAT seems to not be installed)\n You have to install TOMCAT\n" 191 echo -e "${txtpur}ERROR${txtrst} TOMCAT seems to not be installed)\n You have to install TOMCAT\n"
193 exit 1; 192 #exit 1;
194 fi 193 fi
195 echo -e "\nTOMCAT : \t ${txtgrn}OK${txtrst}\n" 194 echo -e "\nTOMCAT : \t ${txtgrn}OK${txtrst}\n"
# SOLR second
echo -e "\t${txtblu}Install SOLR DB${txtrst}\n"
echo -e "You will need 300 GB of free space to install the SOLR DB"
read -p "Continue ? (y/n) " solr
if [ "$solr" == "y" ]
then

    echo -e "Download SOLR DB\r"
    mkdir -p $OTMEDIA_HOME/tools/SOLR 2> /dev/null
    scp -r rey@nyx:~/OTMEDIA_DATA/SOLR/otmedia-2013-04.tar.gz $OTMEDIA_HOME/tools/SOLR
    echo -e "Unzip SOLR DB\r"
    res=0
    #res=$(tar -xvzf "$OTMEDIA_HOME/tools/SOLR/otmedia-2013-04.tar.gz" -C "$OTMEDIA_HOME/tools/SOLR/")
    if [ $res -eq 2 ]; then echo " ${txtpur}NOT OK${txtrst}";
    else echo " ${txtgrn}OK${txtrst}"; fi
else
    echo "Skipping SOLR install"
fi
read -e -p "Configure SOLR DB server ? (y/n) " solr
if [ "$solr" == "y" ]
then
    read -p "Enter SOLR server IP :" ip
    if [ "${ip}" == "" ];then ip="localhost";fi
    echo "machine = \"${ip}\"" > $OTMEDIA_HOME/tools/scripts/solrinfo.py
    read -p "Enter SOLR server port :" port
    if [ "${port}" == "" ]; then port="8080";fi
    echo -e "\n\tSOLR server IP ${ip}"
    echo -e "\tSOLR server port ${port}"
    echo "port = \"${port}\"" >> $OTMEDIA_HOME/tools/scripts/solrinfo.py
else
    echo "Skipping SOLR DB Configuration"
fi
echo -e "\nSee SOLR.INSTALL file for more information\n"
fi

### Set Variables in bashrc ###
cat ~/.bashrc | grep -v "OTMEDIA_HOME" | grep -v "SRILM_BIN" > ~/.bashrc.org
#cat ~/.bashrc | grep -v "OTMEDIA_HOME" | grep -v "SRILM_BIN" | grep -v "LIA_TAGG" | grep -v "LIA_PHON" | grep -v "LIA_BIGLEX" > ~/.bashrc.org
cp ~/.bashrc.org ~/.bashrc
export OTMEDIA_HOME=$PWD
echo "export OTMEDIA_HOME=$PWD" >> ~/.bashrc
echo "export PATH=\$PATH:$PWD/main_tools" >> ~/.bashrc
echo "export SRILM_BIN=$SRILM/bin/$MACHINE_TYPE" >> ~/.bashrc
#echo "export LIA_TAGG_LANG=french" >> ~/.bashrc
#echo "export LIA_TAGG=$OTMEDIA_HOME/tools/lia_ltbox/lia_tagg/" >> ~/.bashrc
#echo "export LIA_PHON_REP=$OTMEDIA_HOME/tools/lia_ltbox/lia_phon/" >> ~/.bashrc
#echo "export LIA_BIGLEX=$OTMEDIA_HOME/tools/lia_ltbox/lia_biglex/" >> ~/.bashrc

# set global configuration file
echo "OTMEDIA_HOME=$PWD" > $OTMEDIA_HOME/cfg/main_cfg.cfg
echo "ARCH=$ARCH" >> $OTMEDIA_HOME/cfg/main_cfg.cfg
echo "VERBOSE=0" >> $OTMEDIA_HOME/cfg/main_cfg.cfg
echo "DEBUG=0" >> $OTMEDIA_HOME/cfg/main_cfg.cfg
echo "CHECK=0" >> $OTMEDIA_HOME/cfg/main_cfg.cfg
echo "RERUN=0" >> $OTMEDIA_HOME/cfg/main_cfg.cfg

echo -e "\n\t${txtgrn}### Install completed ###${txtrst}\n"
echo -e "run : source ~/.bashrc"
echo -e "or set the variable :\n"
echo "export OTMEDIA_HOME=$PWD"
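The installer above writes flat key=value pairs into cfg/main_cfg.cfg. As a minimal sketch of how a pipeline script could read one key back from that format (`get_cfg` is a hypothetical helper, not part of the OTMEDIA tools; the demo uses a temporary file mirroring the layout written above):

```shell
# get_cfg CONFIG_FILE KEY : print the value of KEY, empty if absent
get_cfg() {
    grep "^$2=" "$1" | head -n 1 | cut -d'=' -f2-
}

# demo against a temporary file in the main_cfg.cfg format
cfg=$(mktemp)
printf 'OTMEDIA_HOME=/opt/otmedia\nARCH=x86_64\nVERBOSE=0\n' > "$cfg"
home=$(get_cfg "$cfg" OTMEDIA_HOME)
verbose=$(get_cfg "$cfg" VERBOSE)
rm -f "$cfg"
echo "$home $verbose"   # -> /opt/otmedia 0
```

Taking only the part after the first `=` (`cut -f2-`) keeps values containing `=` intact.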
main_tools/ExploitConfidencePass.sh
#!/bin/bash

#####################################################
# File : ExploitConfidencePass.sh                   #
# Brief : Exploit the ASR confidence pass to :      #
#   -> boost the confident zones                    #
#   -> find alternatives in non-confident zones     #
#   -> dynamically extend the lexicon               #
# Author : Jean-François Rey                        #
#   (based on Emmanuel Ferreira                     #
#   and Hugo Mauchrétien's work)                    #
# Version : 1.0                                     #
# Date : 25/06/13                                   #
#####################################################

echo "### ExploitConfidencePass.sh ###"

# Check OTMEDIA_HOME env var
if [ -z ${OTMEDIA_HOME} ]
then
    OTMEDIA_HOME=$(dirname $(dirname $(readlink -e $0)))
    export OTMEDIA_HOME=$OTMEDIA_HOME
fi

# where is ExploitConfidencePass.sh
MAIN_SCRIPT_PATH=$(dirname $(readlink -e $0))

if [ -z ${SCRIPT_PATH} ]
then
    SCRIPT_PATH=$OTMEDIA_HOME/tools/scripts
fi

# Include scripts
. $SCRIPT_PATH"/Tools.sh"
. $SCRIPT_PATH"/CheckExploitConfPass.sh"

# where is ExploitConfidencePass.cfg
EXPLOITCONFIDENCEPASS_CONFIG_FILE=$OTMEDIA_HOME"/cfg/ExploitConfidencePass.cfg"
if [ -e $EXPLOITCONFIDENCEPASS_CONFIG_FILE ]
then
    . $EXPLOITCONFIDENCEPASS_CONFIG_FILE
else
    echo "ERROR : Can't find configuration file $EXPLOITCONFIDENCEPASS_CONFIG_FILE" >&2
    exit 1
fi

#---------------#
# Parse Options #
#---------------#
while getopts ":hDv:cf:r" opt
do
    case $opt in
        h)
            echo -e "$0 [OPTIONS] <INPUT_DIRECTORY>\n"
            echo -e "\t Options:"
            echo -e "\t\t-h :\tprint this message"
            echo -e "\t\t-D :\tDEBUG mode on"
            echo -e "\t\t-v l :\tVerbose mode, l=(1|2|3) level mode"
            echo -e "\t\t-c :\tCheck process, stop if error detected"
            echo -e "\t\t-f n :\tspecify a speeral forks number (default 1)"
            echo -e "\t\t-r :\tforce rerun without deleting files"
            exit 1
            ;;
        D)
            DEBUG=1
            ;;
        v)
            VERBOSE=$OPTARG
            ;;
        c)
            CHECK=1
            ;;
        f)
            FORKS="--forks $OPTARG"
            ;;
        r)
            RERUN=1
            ;;
        :)
            echo "Option -$OPTARG requires an argument." >&2
            exit 1
            ;;
        \?)
            echo "BAD USAGE : unknown option -$OPTARG"
            #exit 1
            ;;
    esac
done
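The same getopts pattern can be exercised standalone. This is a hypothetical demo function (`parse_opts` is not part of the OTMEDIA tools): it parses the flags accepted above from an explicit argument list and prints what was recognized.

```shell
# Standalone sketch of the option parsing used above
parse_opts() {
    local DEBUG=0 VERBOSE=0 CHECK=0 RERUN=0 FORKS=""
    local OPTIND opt
    while getopts ":hDv:cf:r" opt
    do
        case $opt in
            D) DEBUG=1 ;;
            v) VERBOSE=$OPTARG ;;
            c) CHECK=1 ;;
            f) FORKS="--forks $OPTARG" ;;
            r) RERUN=1 ;;
        esac
    done
    echo "DEBUG=$DEBUG VERBOSE=$VERBOSE CHECK=$CHECK RERUN=$RERUN FORKS=$FORKS"
}

parse_opts -D -v 2 -f 4 result_directory/110624FR2_20002100
# -> DEBUG=1 VERBOSE=2 CHECK=0 RERUN=0 FORKS=--forks 4
```

Making OPTIND local lets the function be called repeatedly; the real script relies on the global OPTIND for the later `shift $((OPTIND-1))`.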

# mode debug enable
if [ $DEBUG -eq 1 ]
then
    set -x
    echo -e "## Mode DEBUG ON ##"
fi

# mode verbose enable
if [ $VERBOSE -gt 0 ]; then echo -e "## Verbose level : $VERBOSE ##" ;fi

# Check USAGE by arguments number
if [ $(($#-($OPTIND-1))) -ne 1 ]
then
    echo "BAD USAGE : ExploitConfidencePass.sh [OPTIONS] <INPUT_DIRECTORY>"
    echo "$0 -h for more info"
    exit 1
fi

shift $((OPTIND-1))
# check input directory - first argument
if [ ! -e $1 ]
then
    print_error "can't open $1"
    exit 1
fi

print_info "[${BASENAME}] => ExploitConfPass start | $(date +'%d/%m/%y %H:%M:%S')" 1

#-------------#
# GLOBAL VARS #
#-------------#
INPUT_DIR=$(readlink -e $1)
OUTPUT_DIR=$INPUT_DIR
BASENAME=$(basename $OUTPUT_DIR)
SHOW_DIR="$OUTPUT_DIR/shows/"
SOLR_RES="$OUTPUT_DIR/solr/"
EXT_LEX="$OUTPUT_DIR/LEX/"
TRIGGER_CONFZONE="$OUTPUT_DIR/trigg/"
LOGFILE="$OUTPUT_DIR/info_exploitconf.log"
ERRORFILE="$OUTPUT_DIR/error_exploitconf.log"

CONFPASS_CONFIG_FILE="$(readlink -e $1)/ConfPass.cfg"
if [ -e $CONFPASS_CONFIG_FILE ]
then
    RES_CONF_DIR=$(cat $CONFPASS_CONFIG_FILE | grep "^RES_CONF_DIR=" | cut -f2 -d"=")
    RES_CONF=$(cat $CONFPASS_CONFIG_FILE | grep "^CONF_DIR=" | cut -f2 -d"=")
    print_info "[${BASENAME}] Use confidence measure from : $RES_CONF" 2
else
    print_error "[${BASENAME}] Can't find $CONFPASS_CONFIG_FILE"
    print_error "[${BASENAME}] -> use res_p2"
    RES_CONF_DIR="$INPUT_DIR/conf/res_p2/scored_ctm"
    RES_CONF="$INPUT_DIR/conf/res_p2"
fi

mkdir -p $SHOW_DIR > /dev/null 2>&1
mkdir -p $SOLR_RES > /dev/null 2>&1
mkdir -p $EXT_LEX > /dev/null 2>&1
mkdir -p $TRIGGER_CONFZONE > /dev/null 2>&1

#------------------#
# Create Workspace #
#------------------#
# Lock directory
if [ -e "$OUTPUT_DIR/EXPLOITCONFPASS.lock" ] && [ $RERUN -eq 0 ]
then
    print_warn "[${BASENAME}] ExploitConfidencePass is locked -> exit" 2
    exit 1
fi
rm "$OUTPUT_DIR/EXPLOITCONFPASS.unlock" > /dev/null 2>&1
touch "$OUTPUT_DIR/EXPLOITCONFPASS.lock" > /dev/null 2>&1

#------#
# Save #
#------#
cp $EXPLOITCONFIDENCEPASS_CONFIG_FILE $OUTPUT_DIR/ExploitConfPass.cfg
echo "TRIGGER_DIR=$TRIGGER_CONFZONE" >> $OUTPUT_DIR/ExploitConfPass.cfg
echo "TRIGGER_SPEERAL=$TRIGGER_CONFZONE/speeral/" >> $OUTPUT_DIR/ExploitConfPass.cfg
echo "LEX_SPEERAL=$EXT_LEX/speeral/${lexname}_ext" >> $OUTPUT_DIR/ExploitConfPass.cfg
echo "LEX_BINODE_SPEERAL=$EXT_LEX/speeral/${lexname}_ext.bin" >> $OUTPUT_DIR/ExploitConfPass.cfg
print_info "[${BASENAME}] Save config in $OUTPUT_DIR/ExploitConfPass.cfg" 1

#---------------#
#  Check Pass   #
#---------------#
if [ $( ls ${RES_CONF_DIR}/*.res 2> /dev/null | wc -l) -eq 0 ]
then
    print_error "[${BASENAME}] No Conf Pass res -> exit ExploitConfPass"
    if [ $CHECK -eq 1 ]; then print_log_file $ERRORFILE "No ConfPass res in ${RES_CONF_DIR}" ;fi
    exit 1
fi

#-----------------------#
# Segmentation by show  #
#-----------------------#
# create txt file from scored res
# tag pos and lemmatization of the txt file
# merge the scored res and taglem file
# segment using the last generated file
# and create a ctm file by show

print_info "[${BASENAME}] Segmentation by show" 1

# -> to txt
print_info "[${BASENAME}] Create txt from scored res" 3
cat ${RES_CONF_DIR}/*.res > $INPUT_DIR/$BASENAME.sctm
cat $INPUT_DIR/$BASENAME.seg | $SIGMUND_BIN/myConvert.pl $INPUT_DIR/$BASENAME.sctm $INPUT_DIR/$BASENAME.tmp
cat $INPUT_DIR/$BASENAME.tmp | $SCRIPT_PATH/BdlexUC.pl $RULES/basic -f | sed -e "s/_/ /g" | sort -nt 'n' -k '2' > $INPUT_DIR/$BASENAME.txt

# -> to tagger + lemme
print_info "[${BASENAME}] Tag pos and lem in txt file" 3
iconv -t ISO_8859-1 $INPUT_DIR/$BASENAME.txt > $INPUT_DIR/$BASENAME.tmp
$SIGMUND_BIN/txt2lem.sh $INPUT_DIR/$BASENAME.tmp $INPUT_DIR/$BASENAME.taglem

# merge sctm and taglem
print_info "[${BASENAME}] Merge scored ctm with tag pos and lem file" 3
cat $INPUT_DIR/$BASENAME.sctm | $SCRIPT_PATH/BdlexUC.pl ${RULES}/basic -f | iconv -t ISO_8859-1 | $SCRIPT_PATH/scoredCtmAndTaggedLem2All.pl $INPUT_DIR/$BASENAME.taglem > $INPUT_DIR/$BASENAME.ctl

# -> new seg
print_info "[${BASENAME}] Create xml file and run Topic Seg" 3
$SIGMUND_BIN/tagLem2xml.pl $INPUT_DIR/$BASENAME.taglem $INPUT_DIR/$BASENAME.doc.xml
rm $INPUT_DIR/$BASENAME.tmp #$INPUT_DIR/$BASENAME.taglem

# Lia_topic_seg : group sentences into shows
cp $INPUT_DIR/$BASENAME.doc.xml 0.xml
java -cp $LIATOPICSEG/bin Test > $INPUT_DIR/show.seg
cat $INPUT_DIR/show.seg | $SIGMUND_BIN/toSegEmiss.pl $INPUT_DIR/$BASENAME.show.seg
rm 0.xml $INPUT_DIR/show.seg

if [ $CHECK -eq 1 ]
then
    if [ ! -s $INPUT_DIR/$BASENAME.show.seg ]
    then
        print_error "[${BASENAME}] No Topic segmentation ! "
        print_error "[${BASENAME}] Check $ERRORFILE "
        print_log_file "$ERRORFILE" "No Topic segmentation in ${BASENAME}.show.seg"
    fi
fi

# Segment ctm into several show files and create a seg list by show
print_info "[${BASENAME}] Segment ctm into show files and a seg list by show" 1
$SCRIPT_PATH/ctm2show.pl $INPUT_DIR/$BASENAME.ctl $INPUT_DIR/$BASENAME.show.seg $SHOW_DIR

#-----------------------------------------------------------#
# SOLR QUERIES                                              #
# -> Create Confident Words                                 #
#    Keep conf words and use Tags                           #
# -> Query SOLR (document & multimedia)                     #
#    concat words + add dates 2 days before/after the show  #
#    query document & multimedia                            #
#-----------------------------------------------------------#
print_info "[${BASENAME}] Create SOLR queries and ask SOLR" 1
for show in $(ls $SHOW_DIR/*.ctm)
do
    bn=$(basename $show .ctm)
    # Remove words with low confidence and keep useful tagger words
    cat $show | $SCRIPT_PATH/KeepConfZone.pl | grep -e "MOTINC\|NMS\|NMP\|NFS\|NFP\|X[A-Z]\{3,5\}" | cut -f3 -d' ' > "$SHOW_DIR/$bn.confzone"
    # Get dates 2 days before and after the show
    datePattern=`$SCRIPT_PATH/daybefore2after.sh $(echo $BASENAME | cut -c1-6)`
    # Create SOLR queries
    cat $SHOW_DIR/$bn".confzone" | $SCRIPT_PATH/GenerateSOLRQueries.pl | iconv -f ISO_8859-1 -t UTF-8 > "$SHOW_DIR/$bn.queries"
    # Ask SOLR DB
    if [ $(wc -w "$SHOW_DIR/$bn.queries" | cut -f1 -d' ') -gt 0 ]; then
        query=$(cat $SHOW_DIR/$bn.queries)"&fq=docDate:[$datePattern]"
        echo $query > $SHOW_DIR/$bn.queries
        print_info "python $SCRIPT_PATH/ProcessSOLRQueries.py $SHOW_DIR/$bn.queries $SOLR_RES/$bn.keywords.tmp $SOLR_RES/$bn.txt.tmp" 3
        python $SCRIPT_PATH/ProcessSOLRQueries.py $SHOW_DIR/$bn.queries $SOLR_RES/$bn.keywords.tmp $SOLR_RES/$bn.txt.tmp
        cat $SOLR_RES/$bn.keywords.tmp | sort -u > $SOLR_RES/$bn.keywords
        cat $SOLR_RES/$bn.txt.tmp | sort -u > $SOLR_RES/$bn.txt
        rm $SOLR_RES/*.tmp > /dev/null 2>&1
    fi

    if [ $CHECK -eq 1 ]
    then
        if [ ! -e $SOLR_RES/$bn.keywords ] || [ ! -e $SOLR_RES/$bn.txt ]
        then
            print_warn "$bn.keywords and $bn.txt are empty !\nMaybe SOLR server is down !" 2
            print_log_file "$LOGFILE" "$bn.keywords and $bn.txt are empty !\nMaybe SOLR server is down !"
        fi
    fi

done

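The diff never shows what daybefore2after.sh actually prints, only that its output is spliced into `fq=docDate:[$datePattern]`. As a hedged sketch of what such a helper could look like, assuming GNU `date -d` and a Solr-style `start TO end` range over the five days centred on the YYMMDD show date:

```shell
# day_before_2_after YYMMDD : print "start TO end" covering 2 days
# before and 2 days after the show date (sketch, not the real script)
day_before_2_after() {
    d="20$(echo "$1" | cut -c1-2)-$(echo "$1" | cut -c3-4)-$(echo "$1" | cut -c5-6)"
    from=$(date -d "$d -2 days" +%Y-%m-%dT00:00:00Z)
    to=$(date -d "$d +2 days" +%Y-%m-%dT23:59:59Z)
    echo "$from TO $to"
}

day_before_2_after 110624
# -> 2011-06-22T00:00:00Z TO 2011-06-26T23:59:59Z
```

Wrapped in `docDate:[...]` by the caller, this yields a standard Solr range filter query.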

#-----------------------------------------------------------------------------------------------
# Build trigger file
# 1) keywords are automatically boosted in the non-confident zones of the current res
#    confident zones are boosted
#    previous words in sensitive zones are penalized
# 2) OOVs are extracted + phonetized
# 3) Try to find OOVs acoustically in the current segment
# 4) Generate the .trigg file
#------------------------------------------------------------------------------------------------
print_info "[${BASENAME}] Build trigger files" 1
for i in `ls $SOLR_RES/*.keywords`
do
    basename=`basename $i .keywords`

    #
    # Tokenize & produce coverage report
    # Use the filter you need
    #
    print_info "[${BASENAME}] Filter keywords and produce coverage report" 3
    # Default filter
    cat $i | $SCRIPT_PATH/CleanFilter.sh | ${SCRIPT_PATH}/ApplyCorrectionRules.pl ${LEXICON}.regex | $SCRIPT_PATH/BdlexUC.pl $RULES/basic -t |\
    $SCRIPT_PATH/CoverageReportMaker.pl --out $SOLR_RES/${basename}_tmp_report $LEXICON.bdlex_tok
    # lighter filter
    #cat $i | $SCRIPT_PATH/BdlexUC.pl $RULES/basic -t | sed -f $RULES/preprocess.regex | sed -f $RULES/lastprocess.regex | $SCRIPT_PATH/CoverageReportMaker.pl --out $SOLR_RES/${basename}_tmp_report $LEXICON.bdlex_tok


    #
    # Extract "real" OOVs and phonetize them
    # -> small ad hoc filtering to avoid too much noise
    #
    print_info "[${BASENAME}] Extract OOV and phonetize them" 3
    ${SCRIPT_PATH}/FindNormRules.pl $SOLR_RES/${basename}_tmp_report/report.oov $LEXICON.bdlex_tok | cut -f3 | grep -v "#" | grep -v "^[A-Z]\+$" | grep -v "^[0-9]" | grep --perl-regex -v "^([a-z']){1,3}$" | $SCRIPT_PATH/BdlexUC.pl $RULES/basic -f | iconv -t ISO_8859-1 -f UTF-8 | ${LIA_LTBOX}/lia_phon/script/lia_lex2phon_variante | grep -v "core dumped" | cut -d"[" -f1 | sort -u | ${SCRIPT_PATH}/PhonFormatter.pl | iconv -f ISO_8859-1 -t UTF-8 | $SCRIPT_PATH/BdlexUC.pl $RULES/basic -t > $SOLR_RES/${basename}.phon_oov

    #
    # Search INVOC & OOV in the current lattice
    #
    print_info "[${BASENAME}] Search INVOC and OOV in the current lattice" 3
    cat $SOLR_RES/${basename}_tmp_report/report.invoc | grep -v "\b0" | cut -f1 | grep --perl-regex -v "^[a-zA-Z']{1,3}$" | grep --perl-regex -v "^[a-zA-Z0-9]{1,3}$" | grep -v "<s>" | grep -v "</s>" | $SCRIPT_PATH/BdlexUC.pl $RULES/basic -t > $TRIGGER_CONFZONE/$basename.tosearch
    cat $SOLR_RES/${basename}.phon_oov | cut -f1 >> $TRIGGER_CONFZONE/$basename.tosearch

    # For each lattice
    for baseseg in $(cat "$SHOW_DIR/$basename.lst")
    do
        $OTMEDIA_HOME/tools/QUOTE_FINDER/bin/acousticFinder ${LEXICON}.speer_phon $RES_CONF/wlat/$baseseg.wlat $TRIGGER_CONFZONE/${basename}.tosearch $SOLR_RES/$basename.phon_oov > $TRIGGER_CONFZONE/$baseseg.acousticlyfound $OUTPUT_REDIRECTION
        #
        # Produce the boost file for the next decoding pass
        #
        print_info "[${BASENAME}] Produce trigg file : $baseseg " 3
        cat $RES_CONF_DIR/$baseseg.res | $SCRIPT_PATH/ScoreCtm2trigg.pl $TRIGGER_CONFZONE/$baseseg.acousticlyfound > $TRIGGER_CONFZONE/$baseseg.trigg
    done

done

329 #----------------------------------------------------------------------------------------------- 325 #-----------------------------------------------------------------------------------------------
330 # Build the extended SPEERAL Lexicon 326 # Build the extended SPEERAL Lexicon
331 # 1) Merge OOVs + LEXICON 327 # 1) Merge OOVs + LEXICON
332 # 1) Related text are collected in order to find the invoc word with maximizing the ppl (LM proba) 328 # 1) Related text are collected in order to find the invoc word with maximizing the ppl (LM proba)
333 # 2) The current lexicon is extended with all the valid OOVs 329 # 2) The current lexicon is extended with all the valid OOVs
334 #----------------------------------------------------------------------------------------------- 330 #-----------------------------------------------------------------------------------------------
335 print_info "[${BASENAME}] Build extended Speeral Lexicon" 1 331 print_info "[${BASENAME}] Build extended Speeral Lexicon" 1
336 mkdir -p $EXT_LEX/final 332 mkdir -p $EXT_LEX/final
337 mkdir -p $EXT_LEX/tmp 333 mkdir -p $EXT_LEX/tmp
338 mkdir -p $EXT_LEX/tmp/txt 334 mkdir -p $EXT_LEX/tmp/txt
339 # 335 #
340 # Collect the acousticly found oov and their phonetisation 336 # Collect the acousticly found oov and their phonetisation
341 # 337 #
342 print_info "[${BASENAME}] Get all OOV and retrieve all phonetisation" 3 338 print_info "[${BASENAME}] Get all OOV and retrieve all phonetisation" 3
for i in `ls $SOLR_RES/*.phon_oov`
do
    basename=`basename $i .phon_oov`

    rm $EXT_LEX/$basename.acousticlyfound 2> /dev/null
    # list the words acoustically found for the show
    for baseseg in $(cat "$SHOW_DIR/$basename.lst")
    do
        cat $TRIGGER_CONFZONE/$baseseg.acousticlyfound | cut -f1 | cut -f2 -d"=" >> $EXT_LEX/$basename.acousticlyfound
    done
    cat $EXT_LEX/$basename.acousticlyfound | sort -u > $EXT_LEX/.tmp
    mv $EXT_LEX/.tmp $EXT_LEX/$basename.acousticlyfound

    #
    # Extract the OOVs actually added
    #
    cat $SOLR_RES/$basename.phon_oov | cut -f1 | sort -u > $EXT_LEX/$basename.oov
    $SCRIPT_PATH/intersec.pl $EXT_LEX/$basename.oov $EXT_LEX/$basename.acousticlyfound > $EXT_LEX/$basename.oov_acousticlyfound
    #
    # Retrieve all phonetisations
    #
    cat $SOLR_RES/${basename}.phon_oov | $SCRIPT_PATH/LexPhonFilter.pl $EXT_LEX/$basename.oov_acousticlyfound > $EXT_LEX/$basename.oov_acousticlyfound_phon
done

#
# Merge the OOVs and their phonetisations
#
print_info "[${BASENAME}] Merge OOVs and their phonetisations" 3
lexname=$(basename $LEXICON)
cat $EXT_LEX/*.oov_acousticlyfound_phon | sort -u > $EXT_LEX/final/all.oov_acousticlyfound_phon
# keep unique words, dropping 3-character tokens (likely noise)
cat $EXT_LEX/*.oov_acousticlyfound | sort -u | grep --perl-regex -v "^([a-z']){3}$" > $EXT_LEX/final/all.oov_acousticlyfound
$SCRIPT_PATH/MergeLexicon.pl $EXT_LEX/final/all.oov_acousticlyfound_phon > $EXT_LEX/final/${lexname}_ext.phon

#
# Collect and clean the retrieved text
#
print_info "[${BASENAME}] Collect and clean SOLR txt answers" 2
# choose a filter
# default
cat $SOLR_RES/*.txt | $SCRIPT_PATH/CleanFilter.sh | $SCRIPT_PATH/ApplyCorrectionRules.pl ${LEXICON}.regex | $SCRIPT_PATH/BdlexUC.pl $RULES/basic -t > $EXT_LEX/final/all.bdlex_txt
# low filter
#cat $SOLR_RES/*.txt | $SCRIPT_PATH/BdlexUC.pl $RULES/basic -t | sed -f $RULES/preprocess.regex | sed -f $RULES/lastprocess.regex > $EXT_LEX/final/all.bdlex_txt

#
# Construct the map file
#
# Notes:
# - Expected format:
#   <WORD1_STRING> <CANDIDATE1_STRING> <PHON_1>
#
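# A hypothetical example entry (the word, candidate, and phonetisation below are
# illustrative only, not taken from real data; fields are tab-separated):
#   kaczynski<TAB>kennedy<TAB>k a t ch i n s k i
# i.e. the OOV "kaczynski" is mapped onto the in-vocabulary candidate "kennedy"
# while keeping the OOV's own phonetisation.
#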
print_info "[${BASENAME}] Construct map file" 3
rm -f $EXT_LEX/final/${lexname}_ext.map 2>/dev/null
rm -f $EXT_LEX/final/${lexname}.unvalid_oov 2>/dev/null

while read oov
do
    oov=`echo $oov | sed "s/\n//g"`
    #
    # Obtain the oov's tag
    #
    #oov_tag=`grep --perl-regex "^$oov\t" $DYNAMIC_TAGSTATS/all.tags | cut -f2`
    #
    # Try to collect text containing the OOV word
    #
    print_info "[${BASENAME}] Collect text containing the OOV" 3
    cat $EXT_LEX/final/all.bdlex_txt | grep --perl-regex " $oov " | $SCRIPT_PATH/NbMaxWordsFilter.pl 40 | uniq > $EXT_LEX/tmp/txt/$oov.bdlex_txt
    if [ -f $EXT_LEX/tmp/txt/$oov.bdlex_txt ]; then
        nbWords=`wc -l $EXT_LEX/tmp/txt/$oov.bdlex_txt | cut -f1 -d" "`
        if [ $nbWords -eq 0 ]; then
            print_warn "[${BASENAME}] INVALID OOV: $oov => $nbWords occurrences" 2
            echo "$oov" >> $EXT_LEX/final/${lexname}.unvalid_oov
        else
            #
            # Find a candidate in a filtered in-vocabulary lexicon, i.e. the candidate which maximizes the LM probability over the collected text
            #
            #echo "$/getCandidate $SPEER_LM_PATH $SPEER_LM_BASENAME $oov $LEXICON.bdlex_tok $EXT_LEX/tmp/txt/$oov.bdlex_txt"
            print_info `$SPEERAL_PATH/bin/getCandidate $SPEER_LM_PATH $SPEER_LM_BASENAME $oov $CANDIDATE_LEXICON $EXT_LEX/tmp/txt/$oov.bdlex_txt | cut -f1 -d" "` 3
            candidate=`$SPEERAL_PATH/bin/getCandidate $SPEER_LM_PATH $SPEER_LM_BASENAME $oov $CANDIDATE_LEXICON $EXT_LEX/tmp/txt/$oov.bdlex_txt | cut -f1 -d" "`
            if [ ! "$candidate" == "" ]; then
                grep --perl-regex "^$oov\t" $EXT_LEX/final/all.oov_acousticlyfound_phon > $EXT_LEX/tmp/$oov.phon
                while read phonLine
                do
                    # <word> <phon> => <word> <candidate> <phon>
                    echo "$phonLine" | sed "s|\t|\t$candidate\t|" >> $EXT_LEX/final/${lexname}_ext.map
                done < $EXT_LEX/tmp/$oov.phon
            else
                print_warn "[${BASENAME}] INVALID OOV: $oov => no available candidate word in LM" 2
                echo "$oov" >> $EXT_LEX/final/${lexname}.unvalid_oov
            fi
        fi
    else
        print_warn "[${BASENAME}] INVALID OOV: $oov" 2
        echo "$oov" >> $EXT_LEX/final/${lexname}.unvalid_oov
    fi
done < $EXT_LEX/final/all.oov_acousticlyfound

#
### Speeral
#

lexname=`basename $LEXICON`
#
# Build the final trigger file
#
print_info "[${BASENAME}] Clean trigg files" 3
mkdir -p $TRIGGER_CONFZONE/speeral/ 2> /dev/null
mkdir -p $EXT_LEX/speeral/ 2> /dev/null
for i in `ls $TRIGGER_CONFZONE/*.trigg`
do
    basename=`basename $i .trigg`
    cat $i | $SCRIPT_PATH/RemoveLineContaining.pl $EXT_LEX/$lexname.unvalid_oov > $TRIGGER_CONFZONE/speeral/$basename.trigg
done
#
# Compile the Speeral extended lexicon
#
print_info "[${BASENAME}] Compile Speeral extended lexicon" 3
print_info "$SPEERAL_PATH/bin/buildmappedbinode $LEXICON.bdlex_phon $EXT_LEX/final/${lexname}_ext.map $AM_SKL $EXT_LEX/speeral/${lexname}_ext" 3
$SPEERAL_PATH/bin/buildmappedbinode $LEXICON.bdlex_phon $EXT_LEX/final/${lexname}_ext.map $AM_SKL $EXT_LEX/speeral/${lexname}_ext
if [ $CHECK -eq 1 ]
then
    check_exploitconfpass_lex_check "${EXT_LEX}/speeral/${lexname}_ext"
    if [ $? -eq 1 ]
    then
        print_error "[${BASENAME}] Building Speeral Lexicon $INPUT_DIR -> exit"
        print_error "[${BASENAME}] Check $ERRORFILE"
        print_log_file $ERRORFILE "ERROR : Building Speeral Lexicon $INPUT_DIR"
        print_log_file $ERRORFILE "ERROR : ${EXT_LEX}/speeral/${lexname}_ext empty after buildmappedbinode?"
        exit 1;
    fi
fi


#-------#
# CLOSE #
#-------#
# Seems OK
print_info "[${BASENAME}] <= ExploitConfidencePass End | $(date +'%d/%m/%y %H:%M:%S')" 1

# unlock directory
mv "$OUTPUT_DIR/EXPLOITCONFPASS.lock" "$OUTPUT_DIR/EXPLOITCONFPASS.unlock"

