TODO 9.95 KB
edit raw blame history



1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

38

39

40

41

42

43

44

45

46

47

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

64

65

66

67

68

69

70

71

72

73

74

75

76

77

78

79

80

81

82

83

84

85

86

87

88

89

90

91

92

93

94

95

96

97

98

99

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201


- To Matrix and CuMatrix, add a function 


(Z)
 Need to improve the efficiency of CudaMalloc.  I would suggest
 to have some kind of CPU-hosted map from size (or x,y size for CudaMallocPitch),
 to the address, that when we free stuff we just keep the pointer..


(A)
  Need to improve the efficiency of
  (i)     TraceMatMat (for CuMatrix)
  (ii)    CuMatrix::AddDiagMatMat
  (iii)   CuVector::Sum();

(B)
  
  // Make a matrix symmetric by either copying the lower to upper triangle
  // copying upper to lower, taking the mean, or taking the mean and checking
  // it was already almost symmetric.  You can test it by making sure that
  // it behaves the same as the SpMatrix CopyFromMat function, e.g. first
  // calling Symmetrize with a particular argument "t" and then creating an SpMatrix
  // from the matrix with kTakeMeanAndCheck, should give the same result as
  // initializing the SpMatrix with the original matrix and "t".  kTakeMeanAndCheck
  // means set to the mean, but check they were approximately equal; see the SpMatrix
  // code for how to implement it (in fact, the code can be based on the same code
  // in SpMatrix::CopyFrommat).
  void Symmetrize(SpCopyType symmetrize_type);

  // This function should call dsymm (it can, arbitrarily, add to the lower triangle),
  // and then Symmetrize(kTakeLower).  Before calling dsymm we should check that
  // the original matrix is approximately symmetric: we can do #ifdef KALDI_PARANOID,
  // Symmetrize(kTakeMeanAndCheck). #endif.
  void SymmetricAddMatMat(Real alpha, const Matrix<Real> &A, MatrixTransposeType transA,
                          const Matrix<Real> &B, MatrixTransposeType transB, Real beta);
                   

(C)
The functions in CuMath.h that say "this needs documentation" need documentation
(i.e.  comments saying what the functions do).  Someone would need to look at
the code and figure out what the functions do (it's not my code).  Please look
at cu-matrix.h, for the similar functions CopyRows() and CopyCols(), for what I
consider to be adequate documentation.  Also these functions would need testing
code, if it doesn't already exist.


-
 Help needed:
  - Proofread documentation and make suggestions for 
    changes/clarification.
  - Try to run the scripts and compile on various platforms, and
    report problems.

  - eventually convert the refs to numeric_limits in fstext/lattice-weight.h
    and lat/arctic-weight.h, back to fst::FloatLimits
=====
dan's TODO:

 remove the weighting in my neural net setup-- not helpful

 put informative text in local/score.sh RE how to see results.

 change on-disk formats to make memory mapping easier?
 address roundoff issues RE lattice generation?
 suggestions from Sanjeev: modify decoding scripts to adjust num-jobs;
   improve MMI training stuff or at least its docs, RE sub-split?
 remove reverse stuff from gmm-latgen-faster
  
 fix SGMM w/ resizing spk vecs. [not sure if done]

minor:
 Add something in training tools, to verify that spk-id is prefix of utt-id.
 Refactor UBM-building code "ClusterGaussiansToUbm"
 add combination scripts for wsj/s3
 look into why results not reproducible?   check if reproducible?
 change UBM-building in WSJ to match RM, and test the effect of this.
 Maybe test SGMMs, with Gaussian alignments fixed to UBM for first few iters.
 add void Check(); function to most config classes.
 Put 'using namespace kaldi;' in all main()'s and remove kaldi::
 Consider adding "see also" line to "usage" messages.
 make fstbin/ programs "proper" Kaldi programs, e.g. change
   cerr to KALDI_ERR.  Consider doing same in fstext/
 Modify reading-in function for vectors, matrices, etc., so they
   successfully read in -inf and -nan.
 normalize binaries so they never take summed accs (unless sum-accs program)?
 remove mix-up, mix-down options from gmm-est (have separate program)
 possibly remove gmm-align (modifiy scripts accordingly).
 make questions on disk be in text form (use sym2int.pl etc.)... same
   for transcripts while decoding.
 at kaldi-matrix.cc:1198, make it read in -nan and -inf
 maybe move UBM clustering somewhere else [from am-diag-gmm]?
 document the --cmd options in wsj/s3 scripts
 change sgmm code so gselect is no longer optional [only in discrim/?]
 Make sure scripts don't depend on . being on PATH


lattice-rand-path [not started]  Eventually have a version that computes a random path weighted by probability?  Not sure how useful.


=============
COMPLETED, MAY REQUIRE ADDITIONAL FEATURES

gmm-latgen-simple   [finished]  Create lattices.  Write either state-level traceback or word-level lattice-determinized output, depending on options.

lattice-lmrescore [not started] Given an FST representing an LM, and a scale with which to add the LM score (typically 1 or -1): composes it (on the right) with the lattice (in Lattice format), and then lattice-determinizes it to make there be only one path per word-seq.  Note: if the LM scale (to add) is negative, we would have to negate the weights before lattice-determinizing, and then negate again after.

lm-rescore-lattice-{fst,arpa} [for Gilles; not started]  Replace (or add) graph scores on a lattice with newly computed LM scores.  Normal usage: lm-rescore-lattice <LM FST/arpa> <lattice-in-archive> <lattice-out-archive>.
  Options would include --lm-scale [default = 1]; set to -1 for removing old LM scores; and --old-graph-scale [default = 0] to keep the old graph scores with some scale [e.g. useful if we previously added transition probabilities].

lattice-to-post [not started]  Does forward-backward on the lattice, using its current weights, and converts it to state-level (transition-id-level) posteriors.   Would include --acoustic-scale option for convenience.
   Normal usage: lattice-to-post --acoustic-scale=$acwt <in-archive> <out-archive>.  Another option would be --rand-prune, which would do randomized pruning of state-level posteriors if they are below a threshold, as I already do in some programs that create Gaussian-level posteriors. [the randomization is there to preserve expectations].

lattice-scale [not started] Applies scaling to lattices' scores [ScaleLattice function].... would probably be called normally as lattice-scale --acoustic-scale=0.1 <in-archive> <out-archive>, but
also with options --graph-scale, --acoustic2graph-scale, --graph2acoustic-scale, corresponding to a 2x2 scaling matrix.

gmm-rescore-lattice [not started]  Replace acoustic scores on a lattice with newly computed acoustic scores.  Normal usage: gmm-rescore-lattice model <lattice-in-archive> <feats-in-archive> <lattice-out-archive>.  
  Would have option --old-acoustic-scale [default = 0] to keep the old acoustics with some scale.
  Might add options to add in the transition probabilities to either the acoustic part or the graph part of the weights-- or might create a separate program for this.


=============
OLDER TODOS
=====

 TODO items (Arnab):
  Check the clapack configuration in the configure script... not clear
     what the purpose of CLAPACK_ROOT is.  I think the configure script
     should create a Makefile that doesn't depend on external variables (Dan)
  Add separate min-count at root of tree for regression-tree fMLLR/MLLR
  Add fMLLR scripts for SGMM.
  Add documentation for regression-tree fMLLR (etc.) and possibly expand
   documentation for acoustic modeling code.

 TODO items (Dan):
  Harmonize output style of decoders (e.g. same type of lattices).
  Check kaldi-decoder and add example scripts.
  Look into how better to estimate vectors in training time
    (I think WERs degraded when I fixed a script bug that
    was having the effect of re-estimating spk-vecs from
    zero each time).
  Remove non-Kaldi code from decoder/
  Rename to branches/kaldi-1.0

  Add code to average over multiple models while aligning ...

 documentation:
  add something on dir structure
  add something on doxygen

  -restore table_examples to mainpage.dox when done.
  -Include some examples of advanced usage of tables.

  [? + make sub-directories for different language-models/lexicons in scripts]

   - See if I can still do the reversed decoding if possible, and
     maybe rationalize the graph creation.

   - Implement lattice generation 

--
Minor:

 RE reversing FSTs:  need to reverse Arpa.  
   programs to: convert Arpa->exclusive counts;
   exclusive counts-> inclusive counts; reverse inclusive counts;
   inclusive counts-> exclusive counts; exclusive counts->Arpa.


in future (maybe)
 make sure calls like fopen, strtoll, strtod (check for more) use reentrancy 
 structures.  (also rand_r, snprintf vs spritnf...?) (?)

 Make sure, when we get the SGMM stuff working, that results are fully
  reproducible (last year's code does not
  seem to have been reproducible, due I imagine to rand() issues.) 
 
 Add the stuff RE symmetric SGMMS, from the old to the new  SGMM code. [?]


--------

# I believe these were the commands I used to use the new style of reading
# and writing Kaldi objects.
for x in */*.cc; do perl -e '$/ = "}"; while(<>){  if ( m:\n(\s+){\s+Output (\w+)\(([\w\d]+), (\w+)\);\s+([\w\d]+)\.Write\((\w+)\.Stream\(\), (\w+)\);\s+}:) {
     $indent = $1; $output_name = $2; $filename = $3; $binary_name = $4; $object_name = $5; $output_name2 = $6; $binary_name2 = $7;
     if ($output_name ne $output_name2) { print STDERR "Warning: $ARGV[0]: $output_name ne $output_name2\n"; }
     if ($binary_name ne $binary_name2) { print STDERR "Warning: $ARGV[0]: $binary_name ne $binary_name2\n"; }
     print "$`\n${indent}WriteKaldiObject($object_name, $filename, $binary_name);";
   } else { print; }} ' $x > tmp; mv tmp $x; done

for x in */*.cc; do echo $x; perl -e '$/ = "}"; while(<>){  if ( m:\n(\s+){\s+bool binary(|_in);\s+Input (\w+)\(([\w\d]+), &binary(|_in)\);\s+([\w\d]+)\.Read\((\w+)\.Stream\(\), binary(|_in)\);\s+}:) {
     $indent = $1; $input_name = $3; $filename = $4; $obj_name = $6; $input_name2 = $7;
     if ($input_name ne $input_name2) { print STDERR "Warning: $ARGV[0]: $input_name ne $input_name2\n"; }
     print "$`\n${indent}ReadKaldiObject($filename, &$obj_name);";
   } else { print; }} ' $x > tmp; mv tmp $x; done