[mlpack] Degenerate cases and other GMM problems

Tue Apr 9 14:02:24 EDT 2013

On Tue, Apr 09, 2013 at 01:34:15PM -0400, John Demme wrote:
> Hi All-
> 
> I'm trying to use mlpack's GMM with some data I've got. I'm not so familiar
> with the statistical tools used here as I should be, so I've run into some
> problems that I'm having trouble debugging on my own:
> 
> - First, I often get "error: inv(): matrix appears to be singular" during
> estimation. It appears that during estimation, one (or more) rows and
> columns of a covariance matrix become 0, and I think this causes it to
> become non-invertible.
> 
> - Second, in cases when estimation completes, I often end up with means,
> weights and covariances which are all -nan.
> 
> I'm not sure whether I'm mis-using the tool or I've got funny data which
> need to be conditioned. It's six-dimensional, values less than 1.0 and one
> of the features is very often zero. (I'm wondering if that last bit means
> that one good gaussian would be zero mean and zero stdev, resulting in a
> degenerate covariance matrix -- though I don't know enough stat and linear
> algebra to work this out.) Can someone give some advice?
> 
> I've posted a small sub-set of my data which can trigger these problems:
> www.cs.columbia.edu/~jdd/gmm_obs0.csv
> 
> If I run "./gmm -i gmm_obs0.csv -g 5" I can get the first problem. Changing
> the number of gaussians to 8 results in the second problem.

Hello John,

My own work has brought me back to GMMs in recent weeks, and I have run
into the same issues you describe.

Are you using the svn trunk or mlpack 1.0.4?  If you aren't using trunk,
try doing that ('svn co http://svn.cc.gatech.edu/fastlab/mlpack/trunk
mlpack').  I recently committed a fix which adds small values to the
covariance matrices so that they don't end up being entirely
zero-valued.  That situation arises when the initial clustering used to
initialize the Gaussians (K-Means by default) returns a cluster with
only one point in it.
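
As a rough illustration of why a one-point cluster is a problem and why
a small diagonal term helps, here is a minimal Python sketch.  It is not
mlpack's actual code: the helper names (covariance, det2, regularize)
and the 1e-10 value are my own assumptions, and it uses 2-D points to
keep the determinant easy to read.  The second cluster mimics a feature
that is always zero, which degenerates in the same way.

```python
def covariance(points):
    """Maximum-likelihood covariance (divides by n) of a list of points."""
    n = len(points)
    d = len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for p in points:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (p[i] - mean[i]) * (p[j] - mean[j]) / n
    return cov


def det2(m):
    """Determinant of a 2x2 matrix; zero means the matrix is singular."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def regularize(cov, eps=1e-10):
    """Add a small value to the diagonal (eps is an assumed, illustrative value)."""
    d = len(cov)
    return [[cov[i][j] + (eps if i == j else 0.0) for j in range(d)]
            for i in range(d)]


# A cluster that received only one point: every deviation from the mean is 0.
single_point_cluster = [(0.3, 0.0)]
# A cluster whose second feature is always zero: one row and column are 0.
zero_feature_cluster = [(0.1, 0.0), (0.4, 0.0), (0.7, 0.0)]

for name, cluster in [("single point", single_point_cluster),
                      ("zero feature", zero_feature_cluster)]:
    cov = covariance(cluster)
    print(name, "det before:", det2(cov), "det after:", det2(regularize(cov)))
```

In both cases the determinant is exactly zero before the diagonal term
is added and strictly positive afterwards, so the matrix can be
inverted again.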

I'll try to hunt down the -nan issue, although it is worth pointing out
that the log-likelihood of the model is legitimately non-finite if there
exist points which are very far outside all of the Gaussians.  That is,
if the probability of an individual point underflows to 0, then
log(0) = -inf, so the log-likelihood of the whole model is no longer
finite, and it can show up as -nan once non-finite terms are combined.
I'll let you know what I find as I dig into it.  Thanks for attaching
the test case -- this makes debugging much easier.
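
To make the far-away-point case concrete, here is a small Python sketch
of a 1-D Gaussian.  Again, this is only an illustration under my own
assumptions, not what mlpack does internally: gaussian_pdf and safe_log
are hypothetical helpers, and safe_log exists only because Python's
math.log raises on 0 where C and C++ return a non-finite value.

```python
import math


def gaussian_pdf(x, mean, stddev):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / stddev
    return math.exp(-0.5 * z * z) / (stddev * math.sqrt(2.0 * math.pi))


def safe_log(p):
    """Log that returns -inf at 0 (math.log(0.0) raises ValueError in Python)."""
    return math.log(p) if p > 0.0 else float("-inf")


near = gaussian_pdf(0.5, 0.0, 1.0)    # an ordinary point, positive density
far = gaussian_pdf(100.0, 0.0, 1.0)   # exp(-5000) underflows to exactly 0.0

# One zero-probability point makes the whole log-likelihood non-finite.
log_likelihood = safe_log(near) + safe_log(far)

# One way a nan can then appear: a zero weight times a -inf log term.
weighted_term = 0.0 * safe_log(far)

print("far density:", far)
print("log-likelihood:", log_likelihood)
print("weighted term is nan:", math.isnan(weighted_term))
```

Once a single term like that is non-finite, everything computed from it
afterwards (means, weights, covariances) can end up non-finite too.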

Thanks for pointing out these issues.

Ryan

-- 
Ryan Curtin       | "You got to stick with your principles."
ryan at igglybob.com |   - Harry Waters