[mlpack-git] [mlpack] hmm_train number of gaussians (#479)

Ryan Curtin notifications at github.com
Tue Nov 24 13:26:50 EST 2015


Hey Davud,

Thanks for the backtrace; judging by its output, this looks like exactly the bug that was fixed in #481 a few days ago.  I ran the dataset that you linked to against the current git master and had no problem, then ran it against git master from before #481 was merged (specifically, a7d8231) and hit the exact same issue that you had here.  So I believe the issue is solved if you update to the newest git master.

If you do try to keep increasing the number of Gaussians, though, note that you can't increase it past the number of samples in your smallest class.  For instance, in your data, here is the class breakdown:

```
(( ryan @ adam )) ~/src/mlpack/build $ cat labels.csv | sort | uniq -c
    158 0
    101 1
     34 10
     67 11
     68 12
     45 2
     27 3
     54 4
     56 5
    194 6
     26 7
     42 8
    162 9
```

So you can't increase the number of Gaussians above 26, because class 7 has only 26 observations.  If you specify, for instance, `-g 45` but a class has only 26 observations, k-means (which is used before GMM training to initialize the model) will end up with empty clusters no matter what is done, and this will probably cause the program to fail (note that you'll get a warning: `[WARN ] KMeans::Cluster(): more clusters requested than points given.`).  It would be possible to add a check for this to `hmm_train`, but unfortunately I don't have the time at the moment.
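
In the meantime, a check like that can be done outside of mlpack before calling `hmm_train`.  Here is a rough sketch in Python; it is not part of mlpack, the function names `smallest_class` and `gaussians_ok` are made up for illustration, and the class sizes are just copied from the breakdown above:

```python
from collections import Counter

# Class sizes copied from the `uniq -c` output above (label -> observations).
CLASS_COUNTS = {"0": 158, "1": 101, "10": 34, "11": 67, "12": 68, "2": 45,
                "3": 27, "4": 54, "5": 56, "6": 194, "7": 26, "8": 42, "9": 162}


def smallest_class(labels):
    """Return (label, count) for the class with the fewest observations."""
    counts = Counter(labels)
    return min(counts.items(), key=lambda item: item[1])


def gaussians_ok(labels, num_gaussians):
    """True if no class has fewer observations than the requested Gaussians."""
    _, size = smallest_class(labels)
    return num_gaussians <= size


# Rebuild a flat label list from the counts; with a real file you would
# read one label per line instead (hypothetical: open("labels.csv")).
labels = [label for label, n in CLASS_COUNTS.items() for _ in range(n)]

label, size = smallest_class(labels)
print(f"smallest class: {label} ({size} observations)")
for g in (26, 45):
    print(g, "ok" if gaussians_ok(labels, g) else "too many Gaussians")
```

If `gaussians_ok` returns false for the value you want to pass to `-g`, lower it to at most the size of the smallest class.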

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/issues/479#issuecomment-159363098