[mlpack-git] [mlpack] HMM simple example (#470)

Fri Nov 13 11:24:48 EST 2015

Hi Davud,

You are right that there is no HMM tutorial.  I'd like to write one at some point, but I haven't had the time.  If you are using the command-line interface, you have four programs available to you: `hmm_generate`, `hmm_loglik`, `hmm_train`, and `hmm_viterbi`.  You can use `--help` with each of these programs to get some information on what they do and how to use them.  Another place you could look, if you are writing C++ (which I think you may need to do for what you're trying to do) is in the tests: the file `src/mlpack/tests/hmm_test.cpp` may be helpful, for example.  The Doxygen documentation for the `HMM` class could also be useful: http://mlpack.org/docs/mlpack-git/doxygen.php?doc=classmlpack_1_1hmm_1_1HMM.html#details

> How can I prepare an "already-trained" hmm?

In C++, assuming you have the transition matrix and emission distribution information, you can set the transition matrix with `HMM::Transition()` (i.e. `hmm.Transition() = myTransitionMatrix`), and then you can set the emission distributions individually.  You mentioned using GMM-HMMs, so if you call `hmm.Emission()[0]` you get back a `GMM` object, which will have weights, means, and variances that you can set individually; see http://mlpack.org/docs/mlpack-git/doxygen.php?doc=classmlpack_1_1gmm_1_1GMM.html#details for more information.

If you are trying to use the command-line programs, there isn't any support for that, so you would have to write the model file by hand... (more on that in a moment)

> How can I connect labels with emissions/observations?

I don't know what you mean.  If you use `hmm_train`, you pass in a sequence of observations with `--input_file` and their corresponding labels with `--labels_file`.  If you use C++, the API documentation should make this clear.

> How does a label-file/model-file/observation-file look like?

For label and observation files, mlpack supports many input file types, but the simplest will just be CSV, with one observation or label per line.  See http://mlpack.org/docs/mlpack-git/doxygen.php?doc=namespacemlpack_1_1data.html#ac3dd73bcdb8c2d990a3bb1aafe65af23 for more information.
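
As a rough illustration of that CSV layout, here is a minimal sketch that uses only the C++ standard library, not mlpack itself.  The helper names (`ObservationsToCsv`, `LabelsToCsv`, `WriteTextFile`) are hypothetical, and it assumes that a multi-dimensional observation is written as comma-separated values on its own line and that each label is an integer state index on its own line.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of mlpack): format a sequence of observations
// as CSV text, one observation per line, dimensions separated by commas.
std::string ObservationsToCsv(
    const std::vector<std::vector<double>>& observations)
{
  std::ostringstream out;
  for (const std::vector<double>& obs : observations)
  {
    for (std::size_t d = 0; d < obs.size(); ++d)
    {
      if (d > 0)
        out << ',';
      out << obs[d];
    }
    out << '\n';
  }
  return out.str();
}

// Hypothetical helper: format the labels as CSV text, one label per line.
// Assumption: labels are integer state indices, one per observation.
std::string LabelsToCsv(const std::vector<std::size_t>& labels)
{
  std::ostringstream out;
  for (std::size_t label : labels)
    out << label << '\n';
  return out.str();
}

// Hypothetical helper: write text to a file; returns false on failure.
bool WriteTextFile(const std::string& path, const std::string& contents)
{
  std::ofstream file(path);
  if (!file)
    return false;
  file << contents;
  return static_cast<bool>(file);
}
```

Calling `WriteTextFile("observations.csv", ObservationsToCsv(obs))` and `WriteTextFile("labels.csv", LabelsToCsv(labels))` (file names made up here) would give two files with matching line counts, which could then be passed to `hmm_train` with `--input_file` and `--labels_file`.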

For model files, `boost::serialization` is used, which currently can output models in XML, binary, or text (depending on the filename suffix you use).

I hope this is helpful.  Let me know if I can clarify anything.

Thanks,

Ryan

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/issues/470#issuecomment-156478968