// doc/io.dox


// Copyright 2009-2011 Microsoft Corporation
//                2013 Johns Hopkins University (author: Daniel Povey)

// See ../../COPYING for clarification regarding multiple authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at

//  http://www.apache.org/licenses/LICENSE-2.0

// THIS CODE IS PROVIDED *AS IS* BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, EITHER EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED
// WARRANTIES OR CONDITIONS OF TITLE, FITNESS FOR A PARTICULAR PURPOSE,
// MERCHANTABILITY OR NON-INFRINGEMENT.
// See the Apache 2 License for the specific language governing permissions and
// limitations under the License.


namespace kaldi {
/** \page io Kaldi I/O mechanisms

 This page gives an overview of input-output mechanisms in Kaldi.
 This section of the documentation is oriented towards the code-level mechanisms
 for I/O; for documentation more oriented towards the command-line, see \ref io_tut.

 \section io_sec_style The input/output style of Kaldi classes

  Classes defined in Kaldi have a uniform interface for
  I/O.  The standard interface is illustrated here:
 \code
  class SomeKaldiClass {
   public:
     void Read(std::istream &is, bool binary);
     void Write(std::ostream &os, bool binary) const;
  };
 \endcode
 Notice that these return void; errors are indicated via exceptions
 (see \ref error).  The boolean "binary" argument indicates whether the
 object should be written (or read) as binary data or text data.  The calling
 code must know whether we want the object to be written or read
 in binary or text form (see \ref io_sec_files for how it knows this in the
 case of reading).  Note that this "binary" variable is not necessarily the
 same as the binary or text mode the file is opened with (on Windows);
 see \ref io_sec_windows for more explanation.

 The Read and Write functions may have additional optional arguments.
 A common case is to have a Read function of the form:
 \code
  class SomeKaldiClass {
   public:
    void Read(std::istream &is, bool binary, bool add = false);
  };
 \endcode
 If add==true, the Read function would add whatever is on disk (e.g. statistics)
 to the current class's contents, if the class is not currently empty.
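
 As a minimal sketch of this convention, the class below follows the Read/Write
 style with the optional "add" argument.  It uses only standard C++ streams;
 the class name ExampleCounts and its members are hypothetical and are not part
 of Kaldi, and errors are reported with std::runtime_error purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>

// Hypothetical statistics class, for illustration only (not a Kaldi class).
class ExampleCounts {
 public:
  ExampleCounts() : count_(0) {}

  // Reads one count.  If add is true, the value on disk is added to the
  // current contents; otherwise it replaces them.
  void Read(std::istream &is, bool binary, bool add = false) {
    int32_t value = 0;
    if (binary) {
      is.read(reinterpret_cast<char*>(&value), sizeof(value));
    } else {
      is >> value;
    }
    if (is.fail()) throw std::runtime_error("ExampleCounts::Read failed");
    count_ = add ? count_ + value : value;
  }

  // Writes the count as raw bytes (binary) or as a decimal number (text).
  void Write(std::ostream &os, bool binary) const {
    if (binary) {
      os.write(reinterpret_cast<const char*>(&count_), sizeof(count_));
    } else {
      os << count_ << "\n";
    }
    if (os.fail()) throw std::runtime_error("ExampleCounts::Write failed");
  }

  int32_t count() const { return count_; }
  void set_count(int32_t c) { count_ = c; }

 private:
  int32_t count_;
};
```

 Calling Read() with add==true on an object that already holds a count of 4,
 from a stream containing 3, would leave it holding 7.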

 \section io_sec_basic Input/output mechanisms for fundamental types and STL types

   See \ref io_funcs_basic for a list of functions involved in this.  We have
 provided these functions to make it easier to read and write fundamental types;
 they are mostly called from the Read and Write functions of Kaldi classes.
 The Kaldi classes are under no obligation to use
 these functions, as long as they ensure that their Read function can read the
 data that their Write function produces.

 The most important functions in this category are ReadBasicType() and WriteBasicType();
 these are templates that cover bool, float, double, and integer types.  An example of using these
 in Read and Write functions is:
\code
  // we suppose that class_member_ is of type int32.
  void SomeKaldiClass::Read(std::istream &is, bool binary) {
    ReadBasicType(is, binary, &class_member_);
  }
  void SomeKaldiClass::Write(std::ostream &os, bool binary) const {
    WriteBasicType(os, binary, class_member_);
  }
\endcode
  We have assumed that \c class_member_ is of type int32, which is a type of known
  size.  Using types like int with these functions is not safe.  In binary mode,
  these functions actually write a character that encodes the
  size and signedness of integer types, and the read will fail if it doesn't match.  We
  could have decided to attempt to convert them automatically, but we didn't;
  currently, you have to use integer types of known size in I/O (int32 is recommended for
  "normal" use).  Floating-point types, on the other hand, are automatically
  converted.  This is for ease of debugging, so you can compile with
  \c -DKALDI_DOUBLE_PRECISION and still read your binary files that were written without
  that option.  Our I/O routines have no byte swapping; if this is a problem for you,
  use the text formats.
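
  To make the size-and-signedness check concrete, here is a minimal sketch of
  the idea using only standard C++.  The function names are hypothetical, and
  the particular marker encoding (the size in bytes, negated for unsigned
  types) is an assumption of this sketch rather than a statement of what
  WriteBasicType() does internally.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>
#include <stdexcept>

// Marker byte for integer type T.  The encoding is an assumption of this
// sketch: sizeof(T) for signed types, -sizeof(T) for unsigned types.
template <class T>
char SizeMarkerSketch() {
  int size = static_cast<int>(sizeof(T));
  return static_cast<char>(std::numeric_limits<T>::is_signed ? size : -size);
}

// Hypothetical writer: in binary mode, a marker byte followed by the raw
// bytes of the integer (no byte swapping); in text mode, a decimal number.
template <class T>
void WriteIntSketch(std::ostream &os, bool binary, T t) {
  if (binary) {
    os.put(SizeMarkerSketch<T>());
    os.write(reinterpret_cast<const char*>(&t), sizeof(t));
  } else {
    os << static_cast<int64_t>(t) << " ";
  }
}

// Hypothetical reader: throws if the marker on disk does not match T.
template <class T>
void ReadIntSketch(std::istream &is, bool binary, T *t) {
  if (binary) {
    char marker;
    if (!is.get(marker) || marker != SizeMarkerSketch<T>())
      throw std::runtime_error("integer size/signedness mismatch");
    is.read(reinterpret_cast<char*>(t), sizeof(*t));
  } else {
    int64_t tmp = 0;
    is >> tmp;
    *t = static_cast<T>(tmp);
  }
  if (is.fail()) throw std::runtime_error("read failed");
}
```

  With this scheme, data written as int32_t cannot be read back as uint32_t
  or int64_t in binary mode; the reader throws instead of converting silently.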

  There are also the WriteIntegerVector() and ReadIntegerVector() templated functions.
  These are in the same style as the WriteBasicType() and ReadBasicType() functions, but
  work for \c std::vector<I>, where I is some integer type (again, its size should
  be known at compile time, e.g. int32).

  Some other important low-level I/O functions are:
 \code
  void ReadToken(std::istream &is, bool binary, std::string *token);
  void WriteToken(std::ostream &os, bool binary, const std::string & token);
 \endcode
  A token must be a nonempty string with no spaces, typically in practice an XML-looking
  string like "<SomeKaldiClass>" or "<SomeClassMemberName>" or "</SomeKaldiClass>".
  These functions do what they look like they would do.  For convenience, we also
  provide ExpectToken(), which is like ReadToken() except you give it the string
  you expect (and it will throw an exception if it doesn't get it).  Typical lines
  of code invoking these are:
\code
   // in writing code:
   WriteToken(os, binary, "<MyClassName>");
   // in reading code:
   ExpectToken(is, binary, "<MyClassName>");
   // or, if a class has multiple forms:
   std::string token;
   ReadToken(is, binary, &token);
   if(token == "<OptionA>") { ... }
   else if(token == "<OptionB>") { ... }
   ...
\endcode
  There are also the WritePretty() and ExpectPretty() functions.
  These are less frequently used, and they behave like the corresponding Token
  functions except that they only actually read and write in text mode, and they
  accept arbitrary strings (i.e. they allow spaces); the ExpectPretty() function also
  accepts input that differs in whitespace from what was expected.

  The Read functions in Kaldi classes never check for end of file, but are expected
  to read exactly as far as the Write function wrote (in text mode,
  leaving some whitespace unread doesn't matter).  This is so
  that multiple Kaldi objects can be put in the same file, and also allows
  the archive concept (see \ref io_sec_archive) to work.

 \section io_sec_files How Kaldi objects are stored in files

 As we have seen above, the Kaldi reading code needs to know whether it is
 reading in text or binary mode, and we don't want the user to have to keep
 track of whether a given file is text or binary.  For this reason,
 files that contain Kaldi objects need to announce whether they contain
 binary or text data.  A binary Kaldi file will start with the string
 "\0B"; since text files can't contain "\0", they don't need a header.
 If you opened a file using standard C++ mechanisms (and you won't normally
 be doing this, see \ref io_sec_opening), you would have to take care of
 this header before doing anything.  You could do this with
 the functions InitKaldiOutputStream()
 (this also sets the stream precision), and InitKaldiInputStream().
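
 The header convention can be sketched with standard C++ streams as follows.
 The two helper functions are hypothetical stand-ins written for this page;
 they only illustrate the rule "binary data starts with a zero byte followed
 by 'B', text data has no header" and omit everything else (such as setting
 the stream precision).

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes the two-byte binary header if binary is true.
// Text streams get no header at all.
inline void WriteHeaderSketch(std::ostream &os, bool binary) {
  if (binary) {
    os.put('\0');
    os.put('B');
  }
}

// Hypothetical helper: consumes the header if present and sets *binary.
// Returns false if a zero byte is not followed by 'B'.
inline bool ReadHeaderSketch(std::istream &is, bool *binary) {
  if (is.peek() == '\0') {
    is.get();  // consume the zero byte
    if (is.peek() != 'B') return false;
    is.get();  // consume 'B'
    *binary = true;
  } else {
    *binary = false;  // text data: nothing to consume
  }
  return true;
}
```

 After ReadHeaderSketch() returns, the stream is positioned at the start of
 the object itself, and the value of *binary can be passed on to a Read
 function.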

 \section io_sec_opening How to open files in Kaldi

 Suppose you want to load or save a Kaldi object from/to disk,
 and suppose it is something like a speech model (but not something
 that you need many of, like speech features; for that, see \ref io_sec_tables).
 You will typically use the Input and Output classes.  An example is:
 \code
   { // input.
     bool binary_in;
     Input ki(some_rxfilename, &binary_in);
     my_object.Read(ki.Stream(), binary_in);
     // you can have more than one object in a file:
     my_other_object.Read(ki.Stream(), binary_in);
   }
   // output.  note, "binary" is probably a command-line option.
   {
     Output ko(some_wxfilename, binary);
     my_object.Write(ko.Stream(), binary);
   }
  \endcode
  The purpose of the braces is to make the Input and Output objects go out of scope
  as soon as we're done, so the file gets closed immediately.  This might seem
  a bit pointless (why not use a normal C++ stream?).  The reason is so we can
  support various extended types of filename.  It also makes handling errors
  a bit easier (the Input and Output classes will print an informative
  error message and throw an exception on error).  Notice the filenames have "rxfilename"
  and "wxfilename" in them.  We use these types of names a lot, and they are supposed
  to remind the coder that these are extended filenames.  We describe these entities
  in the next section.

  The Input and Output classes have a slightly richer interface than used in the
  example code above.  You can open them with Open(), and you can call Close()
  rather than just letting them go out of scope.  These functions return boolean
  status values rather than throwing exceptions on error the way the constructors
  and destructors will.  The Open() functions (and the constructors) can also be
  called in such a way that they don't handle the Kaldi binary header, in case
  you need to read or write non-Kaldi objects.  You probably won't need any of
  this extra functionality.

  See \ref io_group for classes and functions related to Input and Output,
  and to rxfilenames and wxfilenames (next section).

 \section io_sec_xfilename Extended filenames: rxfilenames and wxfilenames

 The words "rxfilename" and "wxfilename" are not classes; they are descriptors that usually
 appear in variable names, and they indicate the following:
    - an rxfilename is a string that is to be interpreted by the Input class
      as an extended filename for reading
    - a wxfilename is a string that is to be interpreted by the Output class
      as an extended filename for writing

 The types of rxfilename are as follows:

    - "-" or "" means the standard input
    - "some command |" means an input piped command, i.e. we strip off the "|" and give the
          rest of the string to the shell via popen().
    - "/some/filename:12345" means an offset into a file, i.e. we open the file and
       seek to position 12345.
    - "/some/filename" ... anything not matching the patterns above is treated as a normal filename
       (however, some obviously wrong things will be recognized as errors before attempting
        to open them).

 You can find out what type an rxfilename is using ClassifyRxfilename(), but this typically
  won't be necessary.

 The types of wxfilename are as follows:
    - "-" or "" means the standard output
    - "| some command" means an output piped command, i.e. we strip off the "|" and give the
          rest of the string to the shell via popen().
    - "/some/filename" ... anything not matching the patterns above is treated as a normal
       filename (again, barring obvious errors).

  Again, ClassifyWxfilename() tells you the type of a filename.
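
  The classification rules listed above can be sketched as follows.  This is
  an illustrative re-statement of the lists in standard C++, not the code
  behind ClassifyRxfilename() or ClassifyWxfilename(); the enum and function
  names are hypothetical, and the checks for "obviously wrong" filenames are
  left out.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical result type for this sketch.
enum XfilenameTypeSketch { kStdio, kPipe, kOffsetFile, kPlainFile };

// Classifies a string according to the rxfilename rules listed above.
inline XfilenameTypeSketch ClassifyRxSketch(const std::string &s) {
  if (s.empty() || s == "-") return kStdio;
  if (s[s.size() - 1] == '|') return kPipe;  // "some command |"
  std::string::size_type colon = s.rfind(':');
  if (colon != std::string::npos && colon > 0 && colon + 1 < s.size()) {
    bool all_digits = true;
    for (std::string::size_type i = colon + 1; i < s.size(); i++) {
      if (!std::isdigit(static_cast<unsigned char>(s[i]))) all_digits = false;
    }
    if (all_digits) return kOffsetFile;  // "/some/filename:12345"
  }
  return kPlainFile;
}

// Classifies a string according to the wxfilename rules listed above.
inline XfilenameTypeSketch ClassifyWxSketch(const std::string &s) {
  if (s.empty() || s == "-") return kStdio;
  if (s[0] == '|') return kPipe;  // "| some command"
  return kPlainFile;
}
```

  Note the asymmetry: the pipe symbol is at the end of an rxfilename but at
  the start of a wxfilename, and only rxfilenames can carry a byte offset.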

 \section io_sec_tables The Table concept

  A Table is a concept rather than an actual C++ class.  It consists of a collection of
  objects of some known type, indexed by strings.  These strings must be
  tokens (a token is defined as a non-empty string without whitespace).  Typical examples
  of Tables include:

    - A collection of feature files (represented as Matrix<float>) indexed by utterance id
    - A collection of transcriptions (represented as std::vector<int32>), indexed
       by utterance id
    - A collection of Constrained MLLR transforms (represented as Matrix<float>), indexed
       by speaker id.

  We will deal with these types of tables in more detail on the page
  \subpage table_examples; here we just explain the general principles and the
  internal mechanisms.
  A Table can exist on disk (or indeed, in a pipe) in two possible formats: a script
  file, or an archive (see below, \ref io_sec_scp and \ref io_sec_archive).
  For a list of classes and types that relate to Tables, see \ref table_group.

  A Table can be accessed in three ways: using a TableWriter, a
   SequentialTableReader, and a RandomAccessTableReader (there is also
  RandomAccessTableReaderMapped, which is a special case we will introduce later).
  These are all templates; they are templated not on the
  object in the table, but on a Holder type (see below, \ref io_sec_holders) that
  tells the Table code how to read and write that type of object.  To open
  a Table type, you must provide a string called a wspecifier or rspecifier (see below, \ref
  io_sec_specifiers) that tells the Table code how the table is stored on
  disk and gives it various other directives.  We illustrate this with some example code.
  This code reads features, linearly transforms them and writes them out.
\code
  std::string feature_rspecifier = "scp:/tmp/my_orig_features.scp",
     transform_rspecifier = "ark:/tmp/transforms.ark",
     feature_wspecifier = "ark,t:/tmp/new_features.ark";
  // there are actually more convenient typedefs for the types below,
  // e.g. BaseFloatMatrixWriter, SequentialBaseFloatMatrixReader, etc.
  TableWriter<BaseFloatMatrixHolder> feature_writer(feature_wspecifier);
  SequentialTableReader<BaseFloatMatrixHolder> feature_reader(feature_rspecifier);
  RandomAccessTableReader<BaseFloatMatrixHolder> transform_reader(transform_rspecifier);
  for(; !feature_reader.Done(); feature_reader.Next()) {
     std::string utt = feature_reader.Key();
     if(transform_reader.HasKey(utt)) {
        Matrix<BaseFloat> new_feats(feature_reader.Value());
        ApplyFmllrTransform(new_feats, transform_reader.Value(utt));
        feature_writer.Write(utt, new_feats);
     }
  }
\endcode
  The nice thing about this setup is that the code that accesses the tables
  can treat them as generic maps or lists.  The format of the data and
  other aspects of the reading process (e.g., its error tolerance) can be
  controlled by options in the rspecifiers and wspecifiers and does not
  have to be handled by the calling code; in the example above,
  the option ",t" tells it to write the data in text form.

  The Platonic ideal of a Table would probably be a map from a string to the object.
  However, as long as we're not doing random access on a particular table, the
  code will not complain if it contains duplicate entries for a particular string
  (i.e. for writing and sequential access, it behaves more like a list of pairs).

  For a list of typedefs corresponding to Table types to read and write
  specific types, see \ref table_types.

  \section io_sec_scp The Kaldi script-file format

  A script file (perhaps slightly misnamed) is a text file where each line
  will typically contain something like:
 \verbatim
  some_string_identifier /some/filename
 \endverbatim
  Another valid line in a script file would be:
 \verbatim
  utt_id_01002 gunzip -c /usr/data/file_010001.wav.gz |
 \endverbatim
 The general form of these lines is:
 \verbatim
  <key> <rxfilename>
 \endverbatim

 \subsection io_sec_scp_range Ranges in script-file lines (for taking sub-parts of matrices)

 We also allow an optional 'range-specifier' to appear after the rxfilename;
 this is useful for representing parts of matrices, such as row ranges.
 Ranges are currently not supported for any data types other than matrices.
 For example, we can express a row range of a matrix as follows:
 \verbatim
  utt_id_01002 foo.ark:89142[0:51]
 \endverbatim
 which means rows 0 through 51 (inclusive) of the matrix.
 Both row and column ranges may be expressed, e.g.
 \verbatim
  utt_id_01002 foo.ark:89142[0:51,89:100]
 \endverbatim
 and if you just want to express a column range, you can leave the row-range blank, as follows:
 \verbatim
  utt_id_01002 foo.ark:89142[,89:100]
 \endverbatim

 \subsection io_sec_scp_details  How Kaldi processes lines of scp files

  When reading a line of a script file, Kaldi will trim off leading and trailing whitespace,
  and then split the line on the first region of whitespace.  The first part
  becomes the key into the table (e.g. the utterance id, in this case "utt_id_01002"),
  and the second part (after stripping off the optional range-specifier)
  becomes the xfilename (by which we mean a wxfilename or rxfilename, in
  this case "gunzip -c /usr/data/file_010001.wav.gz |").
  An empty line or an empty xfilename is not allowed.  A script file may be
  valid for reading or writing or both, depending on whether the xfilenames are
  valid rxfilenames, or wxfilenames, or both.
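
  The splitting procedure just described can be sketched in standard C++ as
  below.  The function name is hypothetical and this is not the parsing code
  used by the Table classes; in particular, the contents of the range-specifier
  are returned as an uninterpreted string.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: splits one script-file line into key, xfilename and
// optional range (the text between '[' and ']', or empty if absent).
// Returns false for an empty line or a line with no xfilename.
inline bool SplitScpLineSketch(const std::string &line, std::string *key,
                               std::string *xfilename, std::string *range) {
  const char *ws = " \t\r\n";
  std::string::size_type begin = line.find_first_not_of(ws);
  if (begin == std::string::npos) return false;  // empty line
  std::string::size_type end = line.find_last_not_of(ws);
  std::string trimmed = line.substr(begin, end - begin + 1);

  // Split on the first region of whitespace.
  std::string::size_type key_end = trimmed.find_first_of(ws);
  if (key_end == std::string::npos) return false;  // key but no xfilename
  std::string::size_type rest_begin = trimmed.find_first_not_of(ws, key_end);
  std::string rest = trimmed.substr(rest_begin);

  // Strip an optional trailing range-specifier such as "[0:51,89:100]".
  std::string found_range;
  if (rest[rest.size() - 1] == ']') {
    std::string::size_type open = rest.rfind('[');
    if (open == std::string::npos || open == 0) return false;
    found_range = rest.substr(open + 1, rest.size() - open - 2);
    rest.erase(open);
  }
  *key = trimmed.substr(0, key_end);
  *xfilename = rest;
  *range = found_range;
  return true;
}
```

  For the line "utt_id_01002 foo.ark:89142[0:51]" this yields the key
  "utt_id_01002", the xfilename "foo.ark:89142" and the range "0:51".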

 Note: once the optional ranges are stripped off,
 the xfilenames that appear on lines of script files may generally be given
 to any Kaldi program in the same way you'd give a filename.  This is even
 true of rxfilenames that contain byte offsets, like foo.ark:8432.   The byte offsets
 will point to the beginning of the data of the object (not to the key that
 precedes the data in the archive).  For binary data, the byte offset points to
 the "\0B" that precedes the object; this allows the reading code to ascertain
 that the data is binary before it reads the object.

 \section io_sec_archive The Kaldi archive format

  The Kaldi archive format is quite simple.  First recall that a token is defined
  as a whitespace-free string.  The archive format could be described as:
  \verbatim
     token1 [something]token2 [something]token3 [something] ....
  \endverbatim
  We can describe this as zero or more repetitions of: (a token; then a
  space character; then the result of calling the Write function of the Holder).
  Recall that the Holder is an object that tells the Table code how to read or
  write something.

  When writing Kaldi objects, the [something] written by the Holder will consist
  of the binary-mode header (if binary), and then the result of calling the Write
  function of the object.  When writing non-Kaldi objects that are simple (like
  int32 or float or vector<int32>), the Holder classes that we write generally
  ensure that in the text format, the [something] is a newline-terminated string.
  That way, the archive has a nice one-line-per-entry format that looks
  superficially like a script file, for instance:
  \verbatim
    utt_id_1 5
    utt_id_2 7
    ...
  \endverbatim
  is the text archive format we use for storing integers.
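
  As a minimal sketch of this one-line-per-entry layout, the functions below
  write and sequentially read a text archive of integers using standard C++
  streams.  The names are hypothetical; real code would go through the Table
  classes and a Holder rather than parsing the archive by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::pair<std::string, int32_t> > IntEntriesSketch;

// Writes each entry as: key, one space, the integer, newline.
inline void WriteIntArchiveSketch(std::ostream &os,
                                  const IntEntriesSketch &entries) {
  for (std::size_t i = 0; i < entries.size(); i++)
    os << entries[i].first << ' ' << entries[i].second << '\n';
}

// Reads (key, integer) pairs in order until the stream is exhausted.
// Duplicate keys are kept, as in sequential access to a Table.
inline IntEntriesSketch ReadIntArchiveSketch(std::istream &is) {
  IntEntriesSketch entries;
  std::string key;
  int32_t value;
  while (is >> key >> value)
    entries.push_back(std::make_pair(key, value));
  return entries;
}
```

  Writing two such archives to the same stream one after the other produces
  output that reads back as a single archive holding the entries of both.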

  The archive format is such that you can concatenate archives together and they
  will still be a valid archive (assuming they hold the same type of object).  The
  format has been designed to be pipe-friendly, i.e. you can put an archive in a pipe
  and the program reading it won't have to wait till the end of the pipe before
  it can process the data.  For efficient random access into archives it's possible
  to simultaneously write an archive to disk together with a script file that contains
  offsets into the archive.  For this, see the next section.


 \section io_sec_specifiers Specifying Table formats: wspecifiers and rspecifiers

 The Table classes require a string that is passed to the constructor or to the
 Open method.  This string is called a wspecifier if passed to the TableWriter
 class, or a rspecifier if passed to the RandomAccessTableReader or SequentialTableReader
 classes.  Examples of valid rspecifiers and wspecifiers include:
 \code
  std::string rspecifier1 = "scp:data/train.scp"; // script file.
  std::string rspecifier2 = "ark:-"; // archive read from stdin.
  // write to a gzipped text archive.
  std::string wspecifier1 = "ark,t:| gzip -c > /some/dir/foo.ark.gz";
  std::string wspecifier2 = "ark,scp:data/my.ark,data/my.scp";
 \endcode

 Usually, an rspecifier or wspecifier consists of a comma-separated, unordered
 list of one or two-letter options and one of the strings "ark" and "scp",
 followed by a colon, followed by an rxfilename or wxfilename respectively.
 The order of options before the colon doesn't matter.
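
 The general form can be illustrated with a small parser.  This is a sketch
 under simplifying assumptions, not the parsing code used by the Table
 classes: the struct and function names are hypothetical, options are not
 validated against the lists given later on this page, and the combined
 "ark,scp" form for wspecifiers is deliberately rejected rather than handled.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical container for the parts of an rspecifier or wspecifier.
struct SpecifierSketch {
  std::string type;               // "ark" or "scp"
  std::set<std::string> options;  // e.g. "t", "o", "cs"
  std::string xfilename;          // everything after the first colon
};

// Splits "opt,opt,ark:xfilename" into its parts.  Returns false if there is
// no colon, an empty field, or not exactly one of "ark" / "scp".
inline bool ParseSpecifierSketch(const std::string &spec,
                                 SpecifierSketch *out) {
  const std::string::size_type npos = std::string::npos;
  std::string::size_type colon = spec.find(':');
  if (colon == npos) return false;
  out->type.clear();
  out->options.clear();
  out->xfilename = spec.substr(colon + 1);
  const std::string before = spec.substr(0, colon);
  std::string::size_type pos = 0;
  for (;;) {
    std::string::size_type comma = before.find(',', pos);
    std::string field =
        before.substr(pos, comma == npos ? npos : comma - pos);
    if (field.empty()) return false;
    if (field == "ark" || field == "scp") {
      if (!out->type.empty()) return false;  // "ark,scp" form not covered
      out->type = field;
    } else {
      out->options.insert(field);  // order of options is irrelevant
    }
    if (comma == npos) break;
    pos = comma + 1;
  }
  return !out->type.empty();
}
```

 Because the options are stored in a set, "ark,t:-" and "t,ark:-" parse to
 the same result, which mirrors the statement that their order does not matter.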

 \subsection io_sec_specifiers_both Writing an archive and a script file simultaneously

 There is a special case available for wspecifiers: they can have "ark,scp" before the
 colon, and after the colon, a wxfilename for writing the archive, then a comma,
 then a wxfilename (for the script file).  For example,
 \verbatim
  "ark,scp:/some/dir/foo.ark,/some/dir/foo.scp"
 \endverbatim
 This will write an archive, and a
 script file with lines like "utt_id /some/dir/foo.ark:1234" that specify offsets into the
 archive for more efficient random access.  You can then do whatever you like with
 the script file, including breaking it up into segments, and it will behave like
 any other script file.  Note that although the order of options before the colon
 doesn't generally matter, in this particular case the "ark" must come before
 the "scp"; this is in order to prevent confusion about the order of the
 two wxfilenames after the colon (the archive always comes first).  The wxfilename
 that specifies the archive should be a normal filename or otherwise the script file that gets
 written won't be directly readable by Kaldi, but the code doesn't enforce this.

 \subsection io_sec_wspecifiers Valid options for wspecifiers

   The allowable wspecifier options are:
     - "b" (binary) means write in binary mode (currently unnecessary as it's always the default).
     - "t" (text) means write in text mode.
     - "f" (flush) means flush the stream after each write operation.
     - "nf" (no-flush) means don't flush the stream after each write operation (would currently
        be pointless, but calling code can change the default).
     - "p" means permissive mode, which affects "scp:" wspecifiers where the scp
        file is missing some entries: the "p" option will cause it to silently
        not write anything for these files, and report no error.

    Examples of wspecifiers using a lot of options are
    \verbatim
       "ark,t,f:data/my.ark"
       "ark,scp,t,f:data/my.ark,|gzip -c > data/my.scp.gz"
   \endverbatim


  \subsection io_sec_rspecifiers Valid options for rspecifiers

   When reading the options below, bear in mind the code that reads archives can
   never seek in the archive, in case the archive is actually a pipe (and it very
   often is).  If a RandomAccessTableReader is reading an archive, the reading
   code may have to store many objects in memory just in case they are requested
   again later, or it may have to read to the end of an archive while looking for
   a key that was not actually present in the archive.  Some of the options below
   represent ways to prevent this.

   The important rspecifier options are:
      - "o" (once) is the user's way of asserting to the RandomAccessTableReader code
         that each key will be queried only once.  This stops it
         from having to keep already-read objects in memory just in case they are needed again.
      - "p" (permissive) instructs the code to ignore errors and just provide what
         data it can; invalid data is treated as not existing.  In scp files,
         this means that a query to HasKey() forces the load of the corresponding file,
         so the code can know to return false if the file is corrupt. In archives,
         this option
         stops exceptions from being raised if the archive is corrupted or truncated
         (it will just stop reading at that point).
      - "s" (sorted) instructs the code that the keys in an archive being read are in
         sorted string order.  For RandomAccessTableReader, this means that when HasKey() is
         called for some key not in the archive, it can return false as soon as it
         encounters a "higher" key; it won't have to read till the end.
      - "cs" (called-sorted) instructs the code that the calls to HasKey() and Value()
         will be in sorted string order.  Thus, if one of these functions is called for
         some string, the reading code can discard the objects for keys that sort earlier.
         This saves memory.  In effect, "cs" represents the user's assertion that some other
         archive that the program may be iterating over, is itself sorted.

    If the user provides any of these options wrongly, e.g. provides the "s" option for
    an archive that is not actually sorted, the RandomAccessTableReader code will make
    a best-effort attempt to detect this error and crash.
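
    To illustrate why the "s" option helps, the sketch below models random
    access over an archive that can only be read forwards.  It is a toy model
    in standard C++, not the RandomAccessTableReader implementation: the class
    name is hypothetical, the "archive" is an in-memory list of (key, value)
    pairs, and no attempt is made to detect a wrongly asserted sort order.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of key lookup over a forward-only archive.
class ForwardOnlyLookupSketch {
 public:
  ForwardOnlyLookupSketch(
      const std::vector<std::pair<std::string, int> > &archive, bool sorted)
      : archive_(archive), sorted_(sorted), pos_(0) {}

  // Reads forward until the key is found.  Everything read is cached, since
  // it might be requested later and the archive cannot be rewound.
  bool HasKey(const std::string &key) {
    if (cache_.count(key) != 0) return true;
    while (pos_ < archive_.size()) {
      const std::pair<std::string, int> &next = archive_[pos_];
      // With the sorted assertion, a pending key that compares greater than
      // the query proves the query is absent, so we stop without reading on.
      if (sorted_ && next.first > key) return false;
      cache_[next.first] = next.second;
      ++pos_;
      if (next.first == key) return true;
    }
    return false;  // reached the end of the archive
  }

  // Number of archive entries consumed so far.
  std::size_t NumRead() const { return pos_; }

 private:
  std::vector<std::pair<std::string, int> > archive_;
  bool sorted_;
  std::size_t pos_;
  std::map<std::string, int> cache_;
};
```

    For an archive with keys "a", "c", "e", a query for the missing key "b"
    consumes only one entry when the sorted assertion is given, but all three
    entries (which are then all held in memory) when it is not.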

    The following options are included for symmetry and convenience but are
    not very useful at the moment.
      - "no" (not-once) is the opposite of "o" (in current code,
             this would never have any effect).
      - "np" (not-permissive) is the opposite of "p" (in current code,
             this would never have any effect).
      - "ns" (not-sorted) is the opposite of "s" (in current code,
             this would never have any effect).
      - "ncs" (not-called-sorted) is the opposite of "cs" (in current code,
             this would never have any effect).
      - "b" (binary) does nothing but is allowed for scripting convenience.
      - "t" (text) does nothing but is allowed for scripting convenience.

   Typical examples of rspecifiers using a lot of options are:
   \verbatim
     "ark,o,s,cs:-"
     "scp,p:data/my.scp"
   \endverbatim

 \section io_sec_holders Holders as helpers to Table classes

  As mentioned before, the Table classes i.e. TableWriter, RandomAccessTableReader
  and SequentialTableReader, are templated on a Holder class.  Holder is not an actual
  class or base class but describes a category of classes, and these have been given names ending in Holder,
  e.g. TokenHolder or KaldiObjectHolder.  (KaldiObjectHolder is a generic Holder that
  may be templated on any class satisfying the Kaldi I/O style described
  in \ref io_sec_style).  We have written the template class GenericHolder, which is not intended
  to be used, in order to document the properties that the Holder classes must satisfy.

  The type of the class "held" by the Holder class is a typedef Holder::T  (where Holder is
  the name of the actual Holder class in question).
  A list of the available holder types may be found in \ref holders.

 \section io_sec_windows How the binary/text mode relates to the file open mode

 This section is only relevant on the Windows platform.  The general rule is
 that when writing, the file mode will always match the "binary" argument to the
 Write function; when reading binary data, the file mode will always be
 binary, but when reading text data, the file mode may be binary or text (thus
 the text-mode reading functions must always accept the extra '\\r' characters
 that Windows inserts).  This is because we don't always know until we open a
 file, whether its contents are binary or text and so when unsure, we open
 in binary mode.

 \section io_sec_bloat Avoiding memory bloat when reading archives in random-access mode

 When large archives are read in random access mode by the Table code, there is a
 potential for memory bloat.  This potentially occurs whenever an object of type
 RandomAccessTableReader<SomeHolder> reads in an archive.  The Table code is
 written so as to first and foremost ensure correctness, so when reading an
 archive in random access mode, unless you give the Table code some additional
 information (which we will discuss below), it can never throw away any object it
 has read in case you ask for it again.  An obvious question here is: why
 doesn't the Table code simply keep track of the position in the file at which
 each object starts, and fseek() to that location when needed?  We have not
 implemented this, and the reason is as follows: the only situation in which you can
 fseek() is when the archive being read is an actual file (i.e. not a piped
 command or the standard input).  If the archive was an actual file on disk, you
 could have written it out with an attached scp file containing offsets into the
 file (using the "ark,scp:" prefix, see \ref io_sec_specifiers_both), and then
 provided that scp file to the program that needs to read the archive.  This
 would be almost as time-efficient as reading the archive directly, since the
 code that reads in scp files is smart enough to avoid reopening files when not
 needed and calling fseek() unnecessarily.  So treating file archives as a
 special case and caching offsets into the file would not solve any problems.

 There are two separate problems that can happen when you read an archive in random
 access mode; these can both happen if you use just the "ark:" prefix with no
 additional options.
    - If you ask for a key that is not present in the archive, the reading code
      is forced to read till the end of the archive to make sure it is not there.
    - Every time the code reads an object, it is forced to keep it in memory in case
      you ask for it later.

 With regard to the first problem (having to read till the end of the file),
 the way you can avoid this is to assert that the archive is sorted on key (using
 the normal string sorted order that "C" uses, and that the program "sort" uses
 if you do "export LC_ALL=C").  You can do this using the "s" option when reading
 archives: for example, the rspecifier "ark,s:-" instructs the code to read the
 standard input as an archive and expect it to be in sorted order.  The Table code
 checks that what you have asserted is actually true, and will crash if not.
 Of course, you have to set up your scripts in such a way that the archives are
 actually sorted on key (usually this will be done in the initial feature-extraction
 stage).

 With regard to the second problem (being forced to keep things in memory in
 case needed later), there are two solutions.

  - The first solution, which is
    a rather brittle solution, is to provide the "once" option;
    for example, the rspecifier "ark,o:-" reads in from the standard input and asserts
    that you will only ask for each object once.  To be able to assert this you would
    have to know something about how the program in question works and you would probably
    have to know that some other Table provided to the program does not contain any
    repeated keys (yes, Tables can have repeated keys as long as they are only accessed
    in sequential mode).

    If you provide the "o" option the Table can deallocate objects after they have been
    accessed.  However, this only works well if your archives are perfectly synchronized with
    no gaps or missing elements.  For example, suppose you execute the command:
\verbatim
 some-program ark:somedir/some.ark "ark,o:some command|"
\endverbatim
    The program "some-program" will first iterate sequentially over the archive "somedir/some.ark"
    and then for each key it encounters, access the second archive via random access.
    Note that the order of command-line arguments is not arbitrary: we have tried to
    adopt the convention that rspecifiers that will be accessed sequentially appear
    before those that will be accessed via random access.

    Suppose the two archives are mostly synchronized but may have gaps (i.e. missing keys,
    e.g. due to failures in feature extraction, data alignment, and so on).
    Any time there is a gap in the first archive, the reading code will have to cache the
    associated object from the second archive, because it does not know that the object
    will not be asked for later (it can only discard an object once you have read it).
    Gaps in the second
    archive are more serious, because if there is a gap of even one element, when
    the program asks for that key it will have to read right till the end of the
    second archive to look for it, and will have to cache all objects along the way.

  - The second solution, which is more robust, is to use the "called-sorted" (cs) option.
    This asserts that the objects will be requested in sorted order, and again this
    requires knowledge of how the program works, plus that any sequentially accessed
    archives are in sorted order.  The "cs" option is normally most useful in conjunction
    with the "s" option.  Suppose we execute the following command:
\verbatim
 some-program ark:somedir/some.ark "ark,s,cs:some command|"
\endverbatim
    We assume that both archives are in sorted order, and that the program does
    sequential access on the first archive and random access on the second.
    This is now robust to gaps
    in the archives.  First imagine there is a gap in the first archive (e.g., its keys
    are 001, 002, 003, 081, 082, ...).  When the second archive is searched for key 081 right
    after key 003, the code that reads the
    second archive will encounter keys 004, 005, and so on, but it can discard the associated
    objects because it knows that no key before 081 will be asked for again (thanks to the "cs" option).
    If there is a gap in the second archive, it can use the fact that the second archive is sorted
    to avoid searching till the end of the file (this is the job of the "s" option).
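
 The way the "s" and "cs" options work together can be sketched with a toy model.
 This is not Kaldi's implementation: the class SortedCalledSortedReader and its
 members are hypothetical, a std::vector stands in for the stream being read, and
 the objects are plain integers.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a sorted archive read with "ark,s,cs:".
// NOT Kaldi's implementation; all names here are hypothetical.
class SortedCalledSortedReader {
 public:
  typedef std::pair<std::string, int> Entry;  // (key, object)

  explicit SortedCalledSortedReader(const std::vector<Entry> &archive)
      : archive_(archive), pos_(0), num_skipped_(0) {}

  // Looks up "key"; on success copies the object to *value.
  // Assumes callers ask for keys in sorted order (the "cs" assertion).
  bool Lookup(const std::string &key, int *value) {
    // Skip entries whose key is smaller than the requested one.  Thanks
    // to "cs" they can never be requested again, so nothing is cached.
    while (pos_ < archive_.size() && archive_[pos_].first < key) {
      pos_++;
      num_skipped_++;
    }
    // Thanks to "s", reaching a larger key (or the end) proves that the
    // requested key is absent; there is no need to read any further.
    if (pos_ == archive_.size() || archive_[pos_].first != key) return false;
    *value = archive_[pos_].second;
    return true;
  }

  // Number of entries that were read and then discarded.
  std::size_t NumSkipped() const { return num_skipped_; }

 private:
  std::vector<Entry> archive_;  // stands in for the stream being read
  std::size_t pos_;             // only archive_[pos_] is "in memory"
  std::size_t num_skipped_;
};
```

 In this model at most one object is held at any time, and a lookup of a missing
 key stops at the first key that is larger than the one requested.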

 \subsection io_sec_mapped RandomAccessTableReaderMapped

  In order to condense a particular code pattern that was recurring in many programs, we have introduced the template type
 RandomAccessTableReaderMapped.  Unlike RandomAccessTableReader, this takes two initializer arguments, for instance:
\verbatim
   std::string rspecifier, utt2spk_map_rspecifier; // get these from somewhere.
   RandomAccessTableReaderMapped<BaseFloatMatrixHolder> transform_reader(rspecifier,
                                                                         utt2spk_map_rspecifier);
\endverbatim
  If utt2spk_map_rspecifier is the empty string, this will behave just like a
  regular RandomAccessTableReader.  If it is nonempty, e.g. ark:data/train/utt2spk,
  it will read an utterance-to-speaker map from that location and whenever a particular
  string e.g. utt1 is queried, it will use that map to convert the utterance-id
  to a speaker-id (e.g. spk1) and use that as the key to query the table being
  read from rspecifier.  The utterance-to-speaker map is also an archive
  because it happens that the Table code is the easiest way to read in such maps.
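
  The lookup described above can be sketched as follows.  This is a toy model, not
  the code of RandomAccessTableReaderMapped: the class MappedLookup is hypothetical,
  std::map stands in for both tables, an empty map plays the role of an empty
  utt2spk_map_rspecifier, and returning false for an utterance that is missing from
  the map is an assumption of the sketch (real code might treat it as an error).

```cpp
#include <map>
#include <string>

// Toy model of the lookup done by a "mapped" reader; NOT Kaldi's code.
// "table" stands in for the data read from rspecifier (keyed by speaker);
// "utt2spk" stands in for the map read from utt2spk_map_rspecifier.
class MappedLookup {
 public:
  typedef std::map<std::string, std::string> StringMap;

  MappedLookup(const StringMap &table, const StringMap &utt2spk)
      : table_(table), utt2spk_(utt2spk) {}

  // Returns true and sets *value if "utt" resolves to an entry of the table.
  bool Lookup(const std::string &utt, std::string *value) const {
    std::string key = utt;
    if (!utt2spk_.empty()) {  // no map given: behave like a plain reader
      StringMap::const_iterator m = utt2spk_.find(utt);
      // Assumption of this sketch: a missing utterance is "not found".
      if (m == utt2spk_.end()) return false;
      key = m->second;  // e.g. utt1 becomes spk1
    }
    StringMap::const_iterator t = table_.find(key);
    if (t == table_.end()) return false;
    *value = t->second;
    return true;
  }

 private:
  StringMap table_;
  StringMap utt2spk_;
};
```

  With a table containing only the key spk1 and a map sending utt1 and utt2 to spk1,
  querying either utterance returns the object stored under spk1.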


*/

/**
  \defgroup io_funcs_basic "Low-level I/O functions"

 These functions are provided to write fundamental types, strings, and a few STL types
 to and from C++ streams; see \ref io_sec_basic for how this fits into the bigger picture
 of Kaldi-style I/O.

 \defgroup holders "Holder types"

  Holder types are types that are used as template arguments to the Table types
  (see \ref table_group), and which help the Table types to read and write the object of type SomeHolder::T;
  see \ref io_sec_holders for more information.

  \defgroup table_group "Table types and related functions"

 This group is for classes and functions related to Tables; see also
 \ref table_impl_types and \ref table_types, and for a description
 of the Table concept see \ref io_sec_tables.

 \defgroup table_impl_types "Implementation classes for Table types"

 This group is for classes that implement specific ways of reading and
 writing Tables; see also \ref table_group and \ref table_types,
 and for a description of the Table concept see \ref io_sec_tables.

 \defgroup table_types "Specific Table types"

 This group is for typedefs that define specific instantiations of
 Table types, for various kinds of access to collections of various
 kinds of types, indexed by strings;
 for a description of the Table concept see \ref io_sec_tables.

 \defgroup io_group "Classes for opening streams"

 This group contains the Input and Output classes, which are provided
 to open streams for reading and writing in Kaldi code; for an explanation
 of how this fits into the bigger picture of Kaldi I/O, see \ref io_sec_opening.

*/

}