  // doc/versions.dox
  
  // Copyright     2017 Johns Hopkins University (author: Daniel Povey)
  
  // See ../../COPYING for clarification regarding multiple authors
  //
  // Licensed under the Apache License, Version 2.0 (the "License");
  // you may not use this file except in compliance with the License.
  // You may obtain a copy of the License at
  
  //  http://www.apache.org/licenses/LICENSE-2.0
  
  // THIS CODE IS PROVIDED *AS IS* BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
  // KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED
  // WARRANTIES OR CONDITIONS OF TITLE, FITNESS FOR A PARTICULAR PURPOSE,
  // MERCHANTABLITY OR NON-INFRINGEMENT.
  // See the Apache 2 License for the specific language governing permissions and
  // limitations under the License.
  
  // note: you have to run the file get_version_info.sh in order
  // to generate the HTML files that we include via \htmlinclude.
  // Any time you add a new version you need to edit get_version_info.sh
  
  
  /**
  
    \page versions Versions of Kaldi
  
     \section versions_scheme Versioning scheme
  
       During its lifetime, Kaldi has had three different versioning methods.
       Originally Kaldi was a Subversion (svn)-based project, and was hosted
       on Sourceforge.  Then Kaldi was moved to github, and for some time the
       only version number available was the git hash of the commit.
  
       In January 2017 we introduced a version number scheme.  The first numbered
       version of Kaldi was 5.0.0, in recognition of the fact that the project had
       already existed for quite a long time.  The basic scheme is major/minor/patch,
       but the "patch" version number may also encompass features (usually
       back-compatible ones).  The patch number automatically increases whenever
       a commit to Kaldi is merged on github.
  
      We intend to change the major or minor version number only when making
      relatively large changes, or changes that are not back-compatible.
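
      As a rough illustration of the major/minor/patch scheme, the Python sketch
      below splits a version string such as 5.0.0 into its three parts and compares
      two versions.  The names Version, parse_version and same_series are
      hypothetical and are not part of Kaldi; this is only one way such a check
      might be written.

```python
from typing import NamedTuple


class Version(NamedTuple):
    # Hypothetical container for the three parts of a version number.
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Split a 'major.minor.patch' string such as '5.0.0' into integers."""
    parts = text.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.patch, got {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return Version(major, minor, patch)


def same_series(a: Version, b: Version) -> bool:
    """True when two versions differ at most in their patch number."""
    return (a.major, a.minor) == (b.major, b.minor)


if __name__ == "__main__":
    old = parse_version("5.0.3")
    new = parse_version("5.1.0")
    # Named tuples compare field by field: major, then minor, then patch.
    print(old < new)
    print(same_series(old, new))
```

      Because the fields are compared in order, a change in the minor number
      always outranks any number of patch increments.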
  
      We always plan to recommend that Kaldi users check out the latest version of
      'master', since actively supporting multiple versions would increase our
      workload.
  
     \section versions_versions Versions (and changes)
  
     This section lists the version numbers of Kaldi with the commit messages
     for each patch commit (by "patch commit" we mean a commit that does not
     increase the major or minor version number).
     Each time we add a new major/minor version number we will include a longer
     section explaining the changes involved.
  
   \subsection versions_versions_50 Version 5.0
  
      This is the first major/minor version number after introducing the versioning scheme.
      The latest revision of version 5.0 is saved as branch "5.0" on github.
  
      Below are commits corresponding to patch version numbers 5.0.x.
  
       \htmlinclude 5.0.html
  
  
   \subsection versions_versions_51 Version 5.1
  
     Some of the major changes introduced in version 5.1 are:
       - Kaldi now requires C++11 to compile, and we support only the latest
         version of OpenFst (1.6.0).  (This simplifies Kaldi's code, and will later
         enable the threading code to be
         <a href="https://github.com/kaldi-asr/kaldi/pull/1350" target="_blank">rewritten</a>
         to use C++11's better and more portable mechanisms).
       - The way chunk size and feature context are handled in nnet3 has changed
         to allow variable chunk sizes and shorter context at utterance boundaries.
         See \ref dnn3_scripts_context for more information.
       - A new decoding mechanism, \ref dnn3_scripts_context_looped, is introduced
         in nnet3; this allows faster decoding that is easier to run online for
         recurrent setups (but only unidirectionally recurrent ones, such as LSTMs,
         not BLSTMs).
       - \ref online_decoding_nnet3 has been rewritten; it is faster and it supports
         models like LSTMs.
       - The sequence-training scripts in nnet3 have been refactored; they are now
         simpler and use less disk space.
       - There are new scripts for segmentation of long transcribed audio files.
  
     The latest revision of version 5.1 is saved as branch "5.1" on github.
  
     Below are commits corresponding to patch version numbers 5.1.x.
  
      \htmlinclude 5.1.html
  
   \subsection versions_versions_52 Version 5.2
  
    Some of the changes introduced between 5.1 and 5.2 are:
      - Upgrades to nnet3 to support batch-norm and convolutional components;
        recipes for certain image tasks (like CIFAR).
      - nnet3 training script simplifications and refactoring.
      - Some of the recipes are upgraded to include dropout and
        the --proportional-shrink option (which approximates l2 regularization);
        this improves results.
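
    To see why a proportional shrink can approximate l2 regularization, here is a
    minimal Python sketch.  It assumes, for illustration only, that the shrink
    factor has the form 1 - proportional_shrink * learning_rate; that form and the
    function names below are assumptions, not a description of the Kaldi scripts.
    Under that assumption, scaling a parameter gives the same result as one plain
    SGD step on an l2 penalty whose gradient is l2 times the parameter.

```python
import math


def l2_sgd_step(w: float, learning_rate: float, l2: float) -> float:
    """One plain SGD step on an l2 penalty with gradient l2 * w."""
    gradient = l2 * w
    return w - learning_rate * gradient


def shrink_step(w: float, learning_rate: float, proportional_shrink: float) -> float:
    """Scale the parameter by the assumed factor 1 - proportional_shrink * learning_rate."""
    return w * (1.0 - proportional_shrink * learning_rate)


if __name__ == "__main__":
    w, lr, strength = 0.8, 0.001, 10.0
    # Both updates move w toward zero by the same amount.
    print(math.isclose(l2_sgd_step(w, lr, strength), shrink_step(w, lr, strength)))
```

    The match is exact only for this plain SGD update; with other update rules the
    shrink is an approximation to l2 regularization, as stated above.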
  
    Many changes were made in the commits listed below (i.e. in the patch versions 5.2.x), including:
      - <a href="https://github.com/kaldi-asr/kaldi/pull/1676">Speech/nonspeech segmentation based on nnet3</a>
      - <a href="https://github.com/kaldi-asr/kaldi/pull/1633">Scripts for transfer learning/domain-adaptation of nnet3 models</a>
      - <a href="https://github.com/kaldi-asr/kaldi/pull/1896">Xvectors: DNN Embeddings for Speaker Recognition</a>
  
    The latest revision of version 5.2 is saved as branch "5.2" on github.
  
    Below are commits corresponding to patch version numbers 5.2.x.
  
      \htmlinclude 5.2.html
  
   \subsection versions_versions_53 Version 5.3
  
     Major changes that were made between the end of 5.2.x
     and the start of the 5.3 branch include:
        - A new nnet3-based setup for RNN language models (i.e. language models based on
          recurrent neural nets).
        - Some extensions to the core of the nnet3 framework to support constant values and
          scalar multiplication without dedicated components.
  
     Below are commits corresponding to patch version numbers 5.3.x.
  
     \htmlinclude 5.3.html
  
   \subsection versions_versions_54 Version 5.4
  
  
     The main changes that were made between
     the end of 5.3.x and the start of the 5.4 branch include:
      - Some code changes in the nnet3 codebase, for speed and memory efficiency.
      - Various simplifications and code reorganizations in the nnet3 code.
      - Support for a new kind of factorized TDNN (TDNN-F) which gives substantially better
        results than our old TDNN recipe, and is even better than our old TDNN+LSTM
        recipe.  A good example of this is in egs/swbd/s5c/local/chain/tuning/run_tdnn_lstm_1n.sh.
        Some nnet3 code changes were needed for this as well (mostly: support for constraining
        a matrix to have orthonormal rows).
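
    As an illustration of what "orthonormal rows" means for a matrix M, the Python
    sketch below measures how far M times its transpose is from the identity matrix.
    It is a sketch for checking the property only; the function names are
    hypothetical, and it does not show how nnet3 enforces the constraint during
    training.

```python
from typing import List

Matrix = List[List[float]]


def gram_matrix(m: Matrix) -> Matrix:
    """Return M times its transpose, for a matrix given as a list of rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in m] for r1 in m]


def orthonormality_error(m: Matrix) -> float:
    """Largest absolute entry of (M times M-transpose) minus the identity.

    The result is zero exactly when the rows of M are orthonormal.
    """
    g = gram_matrix(m)
    n = len(g)
    return max(
        abs(g[i][j] - (1.0 if i == j else 0.0))
        for i in range(n)
        for j in range(n)
    )


if __name__ == "__main__":
    # Two orthonormal rows in three dimensions (more columns than rows).
    good = [[1.0, 0.0, 0.0], [0.0, 0.6, 0.8]]
    # The first row has length 2, so the constraint is violated.
    bad = [[2.0, 0.0], [0.0, 1.0]]
    print(orthonormality_error(good) < 1e-12)
    print(orthonormality_error(bad))
```

    Note that the matrix does not need to be square: a matrix with fewer rows than
    columns can still have orthonormal rows.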
  
    Some of the larger changes that were made while 5.4 was the major version number include:
      - Improvements to handwriting recognition and OCR recipes, including BPE (word-piece) encoding.
      - An updated version of the TDNN-F configuration, including ResNet-style bypass,
        which is now the default in many recipes.  (It is called tdnnf-layer in xconfigs.)
      - A rewrite of the CUDA memory allocator to be based on a small number of large regions
        (since with newer drivers and hardware, allocation speed was becoming a bottleneck).
      - A decoder speedup (by making use of OpenFst's NumInputEpsilons() function).
  
  
     Below are commits corresponding to patch version numbers 5.4.x.
  
     \htmlinclude 5.4.html
  
  
  
   \subsection versions_versions_55 Version 5.5
  
  
    Version 5.5 is the current master branch.   The change that was made between the end of
    5.4 and the start of 5.5 is support for \ref grammar grammar decoding; this allows support for things like
    the "contact list scenario" where you want to use a dynamically changing contact list in
    a larger, fixed decoding graph.
  
    Below are commits corresponding to patch version numbers 5.5.x.
  
  
      \htmlinclude 5.5.html
  
  
  */