[mlpack-git] [mlpack/mlpack] GammaDistribution: Adds functionality to solve #749 (#751)

Sat Aug 6 00:46:57 EDT 2016

> @@ -53,6 +64,13 @@ void GammaDistribution::Train(const arma::mat& rdata, const double tol)
>    Train(logMeanxVec, meanLogxVec, meanxVec, tol);
>  }
>  
> +// Fits an alpha and beta parameter according to observation probabilities.
> +void GammaDistribution::Train(const arma::mat& observations, 
> +                              const arma::vec& probabilities,
> +                              const double tol)
> +{

Here's an idea for this.  We are training the distribution, but we do not know for certain that each training point came from it; we only have an estimate between 0 and 1 of how likely it is that the point did.  So we can train using the probabilities as weights.

We can use the probabilities to construct "weighted" versions of `meanLogxVec`, `meanxVec`, and `logMeanxVec`.  This means we can't use Armadillo's nice one-line syntax, though; we have to compute the weighted sums ourselves.

```
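// Accumulate probability-weighted sums, one entry per dimension (row).
arma::vec meanLogxVec(observations.n_rows, arma::fill::zeros);
arma::vec meanxVec(observations.n_rows, arma::fill::zeros);

for (size_t i = 0; i < observations.n_cols; ++i)
{
  // arma::log() is element-wise; std::log() does not accept a column.
  meanLogxVec += probabilities(i) * arma::log(observations.col(i));
  meanxVec += probabilities(i) * observations.col(i);
}

// Normalize by the total weight to turn the sums into weighted means.
const double totalProbability = arma::accu(probabilities);
meanLogxVec /= totalProbability;
meanxVec /= totalProbability;
const arma::vec logMeanxVec = arma::log(meanxVec);

Train(logMeanxVec, meanLogxVec, meanxVec, tol);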
```
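
For reference, the loop above amounts to the following, where $x_i$ is the $i$-th column of the data, $p_i$ is its probability, $N$ is the number of points, and the logarithm is taken element-wise. The symbols $x_i$, $p_i$, and $N$ are notation introduced here for illustration, not names from the code.

```latex
\begin{aligned}
\texttt{meanxVec}    &= \frac{\sum_{i=1}^{N} p_i \, x_i}{\sum_{i=1}^{N} p_i}, \\
\texttt{meanLogxVec} &= \frac{\sum_{i=1}^{N} p_i \, \log x_i}{\sum_{i=1}^{N} p_i}, \\
\texttt{logMeanxVec} &= \log\left(\texttt{meanxVec}\right).
\end{aligned}
```

With every $p_i = 1$ these reduce to the ordinary unweighted means, and a point with $p_i = 0$ contributes nothing.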

What do you think?

This could be tested by generating points from a Gamma distribution and assigning each a random weight between, e.g., 0.9 and 1, then generating points from a uniform distribution and assigning each of those a random weight between, e.g., 0.0 and 0.02.  Then call `Train()` with the probabilities and ensure the resulting trained Gamma distribution is close to the one that was used to generate the points.
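
A rough, self-contained sketch of that test follows. It deliberately avoids mlpack and Armadillo so it can run standalone: `WeightedGammaFit()` and `RunWeightedFitExperiment()` are hypothetical helpers, and the fit uses a closed-form approximation to the maximum-likelihood shape estimate as a stand-in for `Train()`. It only illustrates the shape of the test, not the actual implementation. The sample counts, the noise range, and the one-dimensional setup are arbitrary assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for the weighted Train(): fits a 1-D gamma
// distribution from weighted points using the closed-form approximation
//   s = log(mean(x)) - mean(log(x)),
//   alpha ~= (3 - s + sqrt((s - 3)^2 + 24 s)) / (12 s),
//   beta = mean(x) / alpha,
// where every mean is weighted by w. Returns {alpha (shape), beta (scale)}.
std::pair<double, double> WeightedGammaFit(const std::vector<double>& x,
                                           const std::vector<double>& w)
{
  assert(x.size() == w.size());
  double sumW = 0.0, sumX = 0.0, sumLogX = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i)
  {
    sumW += w[i];
    sumX += w[i] * x[i];
    sumLogX += w[i] * std::log(x[i]);
  }
  const double meanX = sumX / sumW;
  const double s = std::log(meanX) - sumLogX / sumW;
  const double alpha =
      (3.0 - s + std::sqrt((s - 3.0) * (s - 3.0) + 24.0 * s)) / (12.0 * s);
  return { alpha, meanX / alpha };
}

// Hypothetical test driver: gamma points get weights in [0.9, 1), uniform
// noise points get weights in [0, 0.02), then everything is fit together.
std::pair<double, double> RunWeightedFitExperiment(const double alpha,
                                                   const double beta,
                                                   const unsigned seed)
{
  std::mt19937 rng(seed);
  std::gamma_distribution<double> gammaDist(alpha, beta);
  // Noise is kept strictly positive so that log() is defined.
  std::uniform_real_distribution<double> noiseDist(0.01, 30.0);
  std::uniform_real_distribution<double> highWeight(0.9, 1.0);
  std::uniform_real_distribution<double> lowWeight(0.0, 0.02);

  std::vector<double> x, w;
  for (int i = 0; i < 20000; ++i)
  {
    x.push_back(gammaDist(rng));
    w.push_back(highWeight(rng));
  }
  for (int i = 0; i < 2000; ++i)
  {
    x.push_back(noiseDist(rng));
    w.push_back(lowWeight(rng));
  }
  return WeightedGammaFit(x, w);
}
```

In a real test, the call to `WeightedGammaFit()` would be replaced by the new `Train()` overload, followed by a tolerance check that the fitted alpha and beta are close to the values used to generate the Gamma points.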

What do you think?  I've only outlined ideas here, but I can implement them if you need.

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/751/files/9cb117f671f55186baddf38ce71107a2a3ae027f#r73780816