[mlpack-svn] [MLPACK] #292: PCA failed with scale option.

MLPACK Trac trac at coffeetalk-1.cc.gatech.edu
Thu Jun 20 13:50:36 EDT 2013


#292: PCA failed with scale option.
---------------------+------------------------------------------------------
 Reporter:  marcus   |        Owner:     
     Type:  defect   |       Status:  new
 Priority:  trivial  |    Milestone:     
Component:  mlpack   |     Keywords:     
 Blocking:           |   Blocked By:     
---------------------+------------------------------------------------------
 I tested the PCA code and got the following error when the scale option is
 enabled: "element-wise division: incompatible matrix dimensions: 0x0 and
 0x1".

 Replacing the scaling line with the following avoids the error:
 {{{
 // This line fixes the problem.
 covMat = covMat / (arma::ones<arma::colvec>(covMat.n_cols) *
 trans(stddev(covMat, 1, 1)));
 }}}

 instead of this line:
 {{{
 // This produces an error.
 covMat = covMat / (arma::ones<arma::colvec>(covMat.n_rows) *
 stddev(covMat, 0, 0));
 }}}

 I may have misunderstood something, but I don't think it is correct to
 scale the data after the covariance matrix has been calculated. I've
 written some code that scales the data first and gives the results I
 expect. Could someone tell me whether I've made a wrong assumption?
 {{{
 void PCA::Apply(const arma::mat& data,
                 arma::mat& transformedData,
                 arma::vec& eigVal,
                 arma::mat& coeffs) const
 {
   arma::mat covMat, dataScale;

   if (scaleData)
   {
     arma::colvec means = arma::mean(data, 1);
     arma::mat meanData = data - (means *
         arma::ones<arma::rowvec>(data.n_cols));
     dataScale = trans(meanData) / (arma::ones<arma::colvec>(data.n_cols) *
                                       trans(stddev(data, 1, 1)));

     covMat = cov(dataScale);
   }
   else
   {
     // Centering is built into ccov().
     covMat = ccov(data);
   }

   arma::eig_sym(eigVal, coeffs, covMat);

   // eig_sym() returns eigenvalues in ascending order; reverse them.
   int nEigVal = eigVal.n_elem;
   for (int i = 0; i < nEigVal / 2; i++)
     eigVal.swap_rows(i, (nEigVal - 1) - i);

   coeffs = arma::fliplr(coeffs);


   if (scaleData)
   {
     transformedData = trans(dataScale * coeffs);
   }
   else
   {
     transformedData = trans(coeffs) * data;
     arma::colvec transformedDataMean = arma::mean(transformedData, 1);
     transformedData = transformedData - (transformedDataMean *
         arma::ones<arma::rowvec>(transformedData.n_cols));
   }
 }
 }}}
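
 To make the "scale first, then compute the covariance" idea concrete, here
 is a minimal standard-library sketch. It is not mlpack or Armadillo code:
 the names Matrix, StandardizeRows and RowCovariance are hypothetical, and
 it assumes rows are dimensions, columns are observations, at least two
 observations, and no constant row. Both functions use N - 1 normalization.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dense matrix stored as matrix[row][col]; rows are dimensions and columns
// are observations (an assumption matching the layout used above).
using Matrix = std::vector<std::vector<double>>;

// Center each row and divide it by its sample standard deviation (N - 1).
Matrix StandardizeRows(const Matrix& data)
{
  Matrix out = data;
  for (std::vector<double>& row : out)
  {
    assert(row.size() >= 2);
    const double n = static_cast<double>(row.size());

    double mean = 0.0;
    for (double v : row)
      mean += v;
    mean /= n;

    double sumSq = 0.0;
    for (double v : row)
      sumSq += (v - mean) * (v - mean);
    const double sd = std::sqrt(sumSq / (n - 1.0));
    assert(sd > 0.0);  // A constant row cannot be scaled.

    for (double& v : row)
      v = (v - mean) / sd;
  }
  return out;
}

// Sample covariance (N - 1 normalization) between the rows of data.
Matrix RowCovariance(const Matrix& data)
{
  assert(!data.empty() && data[0].size() >= 2);
  const std::size_t d = data.size();
  const std::size_t n = data[0].size();

  // Per-row means, needed because the input is not assumed to be centered.
  std::vector<double> means(d, 0.0);
  for (std::size_t i = 0; i < d; ++i)
  {
    assert(data[i].size() == n);
    for (double v : data[i])
      means[i] += v;
    means[i] /= static_cast<double>(n);
  }

  Matrix c(d, std::vector<double>(d, 0.0));
  for (std::size_t i = 0; i < d; ++i)
  {
    for (std::size_t j = 0; j < d; ++j)
    {
      for (std::size_t k = 0; k < n; ++k)
        c[i][j] += (data[i][k] - means[i]) * (data[j][k] - means[j]);
      c[i][j] /= static_cast<double>(n - 1);
    }
  }
  return c;
}
```

 Calling RowCovariance(StandardizeRows(data)) gives a matrix with a unit
 diagonal, i.e. the correlation matrix of the original data. Note that the
 sketch uses the same normalization in both steps. In the Armadillo code
 above, stddev(data, 1, 1) normalizes by N while cov() defaults to N - 1, so
 there the diagonal would be N / (N - 1) rather than exactly 1.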

 One more question, about the ccov function: is it used here to save
 memory? As it stands, the PCA code centers the data twice, once at the
 beginning (inside ccov()) and once at the end.
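
 The reason the two centering steps are interchangeable can be written down
 in a couple of lines. The notation is my own assumption: W stands for
 coeffs, x_i for the i-th column of data, and \mu for the mean of the
 columns.

```latex
\[
  \mu = \frac{1}{n}\sum_{i=1}^{n} x_i
  \quad\Longrightarrow\quad
  \frac{1}{n}\sum_{i=1}^{n} W^{T} x_i = W^{T}\mu ,
\]
\[
  W^{T} x_i - W^{T}\mu = W^{T}\,(x_i - \mu) .
\]
```

 So projecting the raw data and subtracting the mean of the result gives the
 same output as centering the data once and then projecting it.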

 Thanks and regards,
 Marcus

-- 
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/292>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.

