[mlpack-svn] [MLPACK] #242: LARS produces NaNs when the data matrix has duplicate features
MLPACK Trac
trac at coffeetalk-1.cc.gatech.edu
Wed Jan 22 11:40:40 EST 2014
#242: LARS produces NaNs when the data matrix has duplicate features
----------------------------------+-----------------------------------------
Reporter: niche | Owner: niche
Type: defect | Status: closed
Priority: major | Milestone: mlpack 1.1.0
Component: mlpack | Resolution: fixed
Keywords: lars, sparse coding | Blocking:
Blocked By: |
----------------------------------+-----------------------------------------
Changes (by rcurtin):
* status: new => closed
* resolution: => fixed
Comment:
A fix from Michael Fox in r16154 solves the issue. It turns out that in
the Cholesky case, checking for rank-deficiency is very simple. In the
non-Cholesky case one can call the bool-returning form of arma::solve()
and check its return value (false means no solution was found).
When a linearly dependent feature is discovered, it is then added to a set
of ignored features (which, in the case of a full-rank data matrix, is
empty) and won't be added to the generated model (i.e., the beta parameter
for that feature is 0).
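
To illustrate why the Cholesky case is simple, here is a minimal sketch of
one common way to detect a linearly dependent feature while growing a
Cholesky factor of the Gram matrix. It is not the code from r16154: the
names FindDependentFeatures, Column, and Dot, and the tolerance value, are
all hypothetical, and it uses only the standard library rather than
Armadillo. The idea is that when a new feature is appended, the new diagonal
entry of the factor is the square root of the part of the feature's squared
norm not explained by the features already accepted; if that quantity is
(numerically) zero, the feature is linearly dependent and goes into the
ignored set.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One feature (one column of the data matrix). Hypothetical alias.
using Column = std::vector<double>;

// Plain dot product of two equally sized columns.
static double Dot(const Column& a, const Column& b)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i)
    sum += a[i] * b[i];
  return sum;
}

// Returns the indices of features that are numerically linearly dependent on
// the features accepted before them.  'tol' is an assumed relative tolerance.
std::vector<std::size_t> FindDependentFeatures(
    const std::vector<Column>& features,
    const double tol = 1e-10)
{
  std::vector<std::size_t> active;   // Features kept so far.
  std::vector<std::size_t> ignored;  // Features found to be dependent.
  // Rows of the lower-triangular factor L, where L * L^T is the Gram matrix
  // of the active features.
  std::vector<std::vector<double>> L;

  for (std::size_t j = 0; j < features.size(); ++j)
  {
    const Column& x = features[j];
    const double sqNorm = Dot(x, x);

    // Forward substitution: solve L * w = (active features)^T * x.
    std::vector<double> w(active.size());
    double wSqNorm = 0.0;
    for (std::size_t r = 0; r < active.size(); ++r)
    {
      double v = Dot(features[active[r]], x);
      for (std::size_t c = 0; c < r; ++c)
        v -= L[r][c] * w[c];
      w[r] = v / L[r][r];
      wSqNorm += w[r] * w[r];
    }

    // Squared new diagonal entry of the factor.  If it vanishes, x lies in
    // the span of the active features, so the Gram matrix would be singular.
    const double diagSq = sqNorm - wSqNorm;
    if (diagSq <= tol * sqNorm)
    {
      ignored.push_back(j);
      continue;
    }

    // Otherwise extend the factor by one row and accept the feature.
    w.push_back(std::sqrt(diagSq));
    L.push_back(w);
    active.push_back(j);
  }

  return ignored;
}
```

With the features (1,0,0), (0,1,0), a duplicate (1,0,0), and the sum
(1,1,0), the sketch ignores the last two; for a full-rank set of features
the returned set is empty, matching the behavior described above.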
In r16155 I've adapted the test executable he gave into an actual test
that's integrated with mlpack_test, to help ensure that future
modifications to LARS still work when rank-deficient data is encountered.
Now, in this particular case, the mlpack LARS implementation is consistent
with the Hastie implementation of LARS in R.
--
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/242#comment:2>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.