[mlpack-git] [mlpack/mlpack] Adds a Train() function that only needs dataset statistics (#748)
notifications at github.com
Tue Aug 2 10:28:11 EDT 2016
Instead of using the dataset itself, this Train() function only needs the dataset statistics. It works for 1-dimensional distributions only (you can provide only one value for each statistic).
This is useful for my code in LSHModel: instead of fitting the distribution to data, we build a regression function that predicts the arithmetic and geometric means of the squared distances from the size of the dataset. Since we only have estimates of the distances' statistics, not the actual distances, we can't use the existing Train() function in GammaDistribution.
I modified the Train() that accepts the dataset so that it computes the statistics for each row and then calls this function, which avoids duplicating code.
I also added a test to make sure this produces the same result as giving the Train() function the dataset.
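For readers without the patch at hand, here is a minimal standalone sketch of the idea, not the code from this pull request. It assumes the three statistics are log(mean(x)), mean(log(x)) and mean(x), and that the shape is found with a Newton-type update of the kind described by Minka; the description above does not spell out either. The names GammaParams, Digamma, Trigamma, FitGammaFromStats and FitGammaFromData are hypothetical and are not mlpack's API.

```cpp
#include <cassert>
#include <cmath>
#include <stdexcept>
#include <vector>

// Hypothetical result type for this sketch (not an mlpack type).
struct GammaParams
{
  double alpha; // shape
  double beta;  // scale
};

// Digamma for x > 0: shift x upward with the recurrence, then use the
// asymptotic series.
double Digamma(double x)
{
  assert(x > 0.0);
  double result = 0.0;
  while (x < 6.0) { result -= 1.0 / x; x += 1.0; }
  const double inv = 1.0 / x, inv2 = inv * inv;
  return result + std::log(x) - 0.5 * inv
      - inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0));
}

// Trigamma for x > 0, same approach as Digamma.
double Trigamma(double x)
{
  assert(x > 0.0);
  double result = 0.0;
  while (x < 6.0) { result += 1.0 / (x * x); x += 1.0; }
  const double inv = 1.0 / x, inv2 = inv * inv;
  return result + inv + 0.5 * inv2
      + inv * inv2 * (1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0));
}

// Fit a 1-dimensional gamma distribution from statistics only.
// Assumed inputs: logMeanx = log(mean(x)), meanLogx = mean(log(x)),
// meanx = mean(x).
GammaParams FitGammaFromStats(double logMeanx, double meanLogx, double meanx,
                              double tol = 1e-10)
{
  // By Jensen's inequality the gap is positive unless all points are equal.
  const double gap = logMeanx - meanLogx;
  if (!(gap > 0.0) || !(meanx > 0.0))
    throw std::invalid_argument("statistics do not describe a gamma sample");

  double a = 0.5 / gap; // starting guess for the shape
  for (int iter = 0; iter < 100; ++iter)
  {
    const double num = meanLogx - logMeanx + std::log(a) - Digamma(a);
    const double den = a * a * (1.0 / a - Trigamma(a));
    const double aNew = 1.0 / (1.0 / a + num / den);
    if (!(aNew > 0.0))
      throw std::runtime_error("shape estimate became non-positive");
    const bool converged = std::abs(aNew - a) <= tol * a;
    a = aNew;
    if (converged)
      break;
  }
  return {a, meanx / a}; // the scale follows from mean = alpha * beta
}

// Data-based fit: compute the statistics, then delegate to the function above.
GammaParams FitGammaFromData(const std::vector<double>& x, double tol = 1e-10)
{
  if (x.empty())
    throw std::invalid_argument("empty dataset");
  double sum = 0.0, sumLog = 0.0;
  for (double v : x) { sum += v; sumLog += std::log(v); }
  const double meanx = sum / x.size();
  return FitGammaFromStats(std::log(meanx), sumLog / x.size(), meanx, tol);
}
```

With statistics taken from a gamma distribution with shape 2 and scale 3 (mean 6, expected log equal to digamma(2) + log(3)), FitGammaFromStats should recover approximately alpha = 2 and beta = 3. Because FitGammaFromData only computes the statistics and delegates, both entry points give the same answer on the same data, which is the property the added test checks.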
You can view, comment on, or merge this pull request online at:
-- Commit Summary --
* Adds a Train() function that only needs dataset statistics, not the dataset itself
-- File Changes --
M src/mlpack/core/dists/gamma_distribution.cpp (81)
M src/mlpack/core/dists/gamma_distribution.hpp (15)
M src/mlpack/tests/distribution_test.cpp (22)
-- Patch Links --