Commit 1fa99e8a2be1b5c4c6a6b17107d4c6a23611ac81

Authored by Jean-François Rey
1 parent 0052714e74
Exists in master

add info in INSTALL and README

Showing 2 changed files with 76 additions and 2 deletions

#---------------#
#  OTMEDIA LIA  #
#    INSTALL    #
# version : 1.0 #
#---------------#

OTMEDIA LIA ready to use? Really?
No! You have to configure some features manually.
Let's see...

SUMMARY
-------

1\ Before installation
2\ install.sh script
3\ SOLR install
4\ Install descriptions


1\ Before installation
----------------------

- Check and install dependencies.
- On a 64-bit architecture, make sure you can run 32-bit programs.
- Have 300 GB of free space.
- Have access to the network and the nyx server.
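
The free-space requirement can be checked by hand before running the installer. The snippet below is only a sketch: the helper name has_free_space is hypothetical and is not part of OTMEDIA, and it assumes the install happens on the filesystem that holds the current directory.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with OTMEDIA): succeed if the filesystem
# holding directory $1 has at least $2 gigabytes available.
has_free_space() {
    dir=$1
    needed_gb=$2
    # df -P prints POSIX-format output and -k reports 1024-byte blocks;
    # field 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((needed_gb * 1024 * 1024)) ]
}

if has_free_space . 300; then
    echo "enough free space for a complete install"
else
    echo "less than 300 GB free in $(pwd)"
fi
```

Pass a smaller number as the second argument if the SOLR DB part of the install is going to be disabled.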

2\ install.sh script
--------------------

The install.sh script will do most of the work.
It will check dependencies and configure the pass tools.
By default it will do a complete install (300 GB).

You can modify its behavior by editing install.sh:

To disable lexicon adaptation using the SOLR DB, set EXPLOITCONFPASS to 0 (this is where most of the space goes: 290 GB).
To disable confidence measures, set CONFPASS to 0.
To disable the second and third passes, set PASS2 to 0.
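
As an illustration, the switches might look like the excerpt below once edited. This is an assumption about the form of the assignments inside install.sh, which is not shown here. The value 1 for "enabled" is also assumed; only the value 0 is documented above.

```shell
# Assumed excerpt of the switches inside install.sh (syntax not confirmed).
# 1 is assumed to mean "enabled"; 0 disables the feature, as described above.
EXPLOITCONFPASS=0   # skip lexicon adaptation with the SOLR DB
CONFPASS=1          # keep the confidence measure
PASS2=1             # keep the second and third passes

# Print a short summary of the resulting configuration.
for name in EXPLOITCONFPASS CONFPASS PASS2; do
    eval "value=\$$name"
    echo "$name=$value"
done
```

With these values the installer would skip only the SOLR-based lexicon adaptation, which is the part that needs most of the disk space.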

Run install.sh and follow the white rabbit.

3\ SOLR install
---------------

The install.sh script downloads otmedia-2013-04.tar.gz and untars it into OTMEDIA_HOME/tools/SOLR/.
See the SOLR.INSTALL file to install the OTMEDIA SOLR DB.

4\ Install descriptions
-----------------------

OTMEDIA_HOME
|-> bin/
    |-> aff_mat
    |-> aff_mat.64
    |-> lia_plp_mt
    |-> lia_plp_mt.64
    |-> LIUM_SpkDiarization-4.2.jar
    |-> sclite
|-> cfg/
    |-> ConfidenceMeasure.cfg
    |-> ConfPass.cfg
    |-> ExploitConfidencePass.cfg
    |-> FirstPass.cfg
    |-> main_cfg.cfg
    |-> RecomposePass.cfg
    |-> Scoring.cfg
    |-> Secondass.cfg
    |-> ThirdPass.cfg
|-> data/
    |-> rules/
        |-> asupp
        |-> basic
        |-> lastprocess.regex
        |-> muRules.tab
        |-> numeric_rules
        |-> postprocess.regex
        |-> preprocess.regex
        |-> random_regex.tab
|-> main_tools/
    |-> CheckResults.sh
    |-> ConfidenceMeasure.sh
    |-> ConfPass.sh
    |-> ExploitConfidencePass.sh
    |-> FirstPass.sh
    |-> OneScriptToRuleThemAll.sh
    |-> RecomposePass.sh
    |-> ScoringRes.sh
    |-> SecondPass.sh
    |-> ThirdPass.sh
|-> tools/
    |-> lia_ltbox/
    |-> PACKAGE_MESURES_V1.0/
    |-> QUOTE_FINDER/
    |-> scripts/
        |-> ApplyCorrectionRules.pl
        |-> BdlexUC.pl
        |-> CheckConfPass.sh
        |-> CheckExploitConfPass.sh
        |-> CheckFirstPass.sh
        |-> CheckSecondPass.sh
        |-> CheckThirdPass.sh
        |-> CleanFilter.sh
        |-> CoverageReportMaker.pl
        |-> ctm2show.pl
        |-> Date2txt.pl
        |-> daybefore2after.sh
        |-> ExtractAudioFromTV.sh
        |-> FindNormRules.pl
        |-> formatRES.pl
        |-> GenerateSOLRQueries.pl
        |-> intersec.pl
        |-> KeepConfZone.pl
        |-> LexPhonFilter.pl
        |-> MergeLexicon.pl
        |-> NbMaxWordsFilter.pl
        |-> Number2txt.pl
        |-> perlmod/
            |-> Utils.pm
        |-> PhonFormatter.pl
        |-> ProcessSOLRQueries.py
        |-> RandomRegex.pl
        |-> RemoveLineContaining.pl
        |-> res2out.pl
        |-> ScoreCtm2trigg.pl
        |-> scoredCtmAndTaggedLem2All.pl
        |-> Sentencer.pl
        |-> srt2stm.pl
        |-> Tools.sh
        |-> UrlConverter.pl
    |-> SIGMUND/
|-> COPYING
|-> CorpusOTMedia.txt
|-> HOWTO
|-> INSTALL
|-> README
|-> SOLR.INSTALL
|-> TODO

  ___ _____ __  __ _____ ____ ___    _      _     ___    _
 / _ \_   _|  \/  | ____|  _ \_ _|  / \    | |   |_ _|  / \
| | | || | | |\/| |  _| | | | | |  / _ \   | |    | |  / _ \
| |_| || | | |  | | |___| |_| | | / ___ \  | |___ | | / ___ \
 \___/ |_| |_|  |_|_____|____/___/_/   \_\ |_____|___/_/   \_\


#---------------#
#  OTMEDIA LIA  #
#    README     #
#  version 1.0  #
#---------------#

DESCRIPTION
-----------

OTMEDIA stands for "Observatoire Transmedia". Its main objective is to study the evolution and transformation of the media world.
The scientific objective of the project is to create a new generation of media observatory
based on an interactive, semi-automatic transmedia analysis system, in order to understand
the world of information and its developments.

Web site: http://www.otmedia.fr

The OTMEDIA LIA project is a set of tools to transcribe radio and TV shows.
It works in several passes:
- First pass: default transcription with speeral and speaker diarization.
- Second pass: speaker adaptation and a second transcription pass with speeral.
- Confidence pass: compute confidence measures from the transcription output.
- Exploit Confidence Measure: use SOLR DB data to extend the lexicon where confidence is low, and create trigg files.
- Third pass: repeat the second pass using the new lexicon and trigg files.


DEPENDENCIES
------------

GNU Toolchain
    Available from: http://www.gnu.org
    and Debian packages

    Compiling, linking, and building applications.

avconv (libav-tools >= 0.8)
    Available from: http://libav.org
    and Debian package

    avconv is a very fast video and audio converter.

JAVA JDK and JRE (>= 6)
    Available from: http://www.oracle.com
    and Debian packages

    Java Development Kit and Java Runtime Environment.

Python (>= 2.7.0)
    Available from: http://www.python.org/
    and Debian packages

    Python is a programming language.

Perl (>= 5.0.0)
    Available from: http://www.perl.org/
    and Debian packages

    Perl is a programming language.

iconv (>= 2.0.0)
    Available from: http://www.gnu.org
    and Debian package

    Character set conversion.

csh shell (csh)
    Available in Debian packages.

    The C shell was originally written at UCB to overcome limitations in the
    Bourne shell. Its flexibility and comfort (at that time) quickly made it
    the shell of choice until more advanced shells like ksh, bash, zsh or
    tcsh appeared. Most of the latter incorporate features original to csh.

The SRI Language Modeling Toolkit (SRILM >= 1.6.0)
    Available from: http://www.speech.sri.com/projects/srilm/

    SRILM is a toolkit for building and applying statistical language models.

Tomcat (>= 7.0.0)
    Available from: http://tomcat.apache.org/
    and Debian packages

    Apache Tomcat is an open source software implementation of the Java Servlet and JavaServer Pages technologies.
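
A quick manual pre-check of the command-line dependencies could look like the sketch below. The helper name missing_commands is hypothetical. The command names are assumptions about what each package installs on PATH: gcc and make for the GNU toolchain, and ngram for SRILM. Tomcat is left out because it is not usually invoked as a plain command, and version numbers are not checked.

```shell
#!/bin/sh
# Hypothetical helper: print each command from the argument list
# that cannot be found on PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Assumed command names for the dependencies listed above.
missing=$(missing_commands gcc make avconv java javac python perl iconv csh ngram)
if [ -n "$missing" ]; then
    echo "Missing commands:" $missing
else
    echo "All listed commands were found."
fi
```

The install.sh script performs its own dependency check, so this is only a convenience before starting a long install.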

INSTALL
-------

See the INSTALL file for the installation procedure.

Quick install below.

Before launching the installation:

    Be certain that all dependencies are satisfied.
    Have 300 GB of free space for a complete install.

Issue the following commands to the shell:
    $> ./install.sh
    $> export OTMEDIA_HOME=path/to/OTMEDIA/directory
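
To confirm that OTMEDIA_HOME points at the right place, a check along these lines may help. It is a sketch only: the helper name looks_like_otmedia_home is hypothetical, and it assumes the five top-level directories listed in the INSTALL file (bin, cfg, data, main_tools, tools) are all present after installation.

```shell
#!/bin/sh
# Hypothetical helper: succeed if directory $1 contains the top-level
# directories listed in the INSTALL file.
looks_like_otmedia_home() {
    # An empty path would otherwise be tested against the filesystem root.
    [ -n "$1" ] || return 1
    for sub in bin cfg data main_tools tools; do
        [ -d "$1/$sub" ] || return 1
    done
}

if looks_like_otmedia_home "${OTMEDIA_HOME:-}"; then
    echo "OTMEDIA_HOME looks valid: $OTMEDIA_HOME"
else
    echo "OTMEDIA_HOME is unset or incomplete" >&2
fi
```

The export only lasts for the current shell session, so the variable has to be set again in each new shell unless it is added to a shell startup file.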

Read SOLR.INSTALL part 3 to install the SOLR DB.

RUNNING
-------

See the HOWTO file.

ACKNOWLEDGEMENTS
----------------

Many thanks to Jean-François Rey for useful help and work done.

KNOWN BUGS
----------

Many.
For bug reports, please contact Pascal Nocera at pascal.nocera@univ-avignon.fr

COPYRIGHT
---------

See the COPYING file.

AUTHORS
-------

Jean-François Rey <jean-francois.rey@univ-avignon.fr>
Hugo Mauchrétien <hugo.mauchretien@univ-avignon.fr>
Emmanuel Ferreira <emmanuel.ferreira@univ-avignon.fr>