[mlpack-git] master, mlpack-1.0.x: Rank is not yet a parameter, but this heuristic from Siddharth improves upon the previous "rank = 2". (8c5140b)

gitdub at big.cc.gt.atl.ga.us
Thu Mar 5 21:44:34 EST 2015


Repository : https://github.com/mlpack/mlpack

On branches: master,mlpack-1.0.x
Link       : https://github.com/mlpack/mlpack/compare/904762495c039e345beba14c1142fd719b3bd50e...f94823c800ad6f7266995c700b1b630d5ffdcf40

>---------------------------------------------------------------

commit 8c5140b196d05cca36eec064c6d8d57517bcc87f
Author: Ryan Curtin <ryan at ratml.org>
Date:   Thu Feb 20 16:34:54 2014 +0000

    Rank is not yet a parameter, but this heuristic from Siddharth improves upon the
    previous "rank = 2".


>---------------------------------------------------------------

8c5140b196d05cca36eec064c6d8d57517bcc87f
 src/mlpack/methods/cf/cf.cpp | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mlpack/methods/cf/cf.cpp b/src/mlpack/methods/cf/cf.cpp
index c8e2487..b9a6ea5 100644
--- a/src/mlpack/methods/cf/cf.cpp
+++ b/src/mlpack/methods/cf/cf.cpp
@@ -97,8 +97,10 @@ void CF::GetRecommendations(arma::Mat<size_t>& recommendations,
 
   // Operations independent of the query:
   // Decompose the sparse data matrix to user and data matrices.
-  // Should this rank be parameterizable?
-  size_t rank = 2;
+  // This is a simple heuristic that picks a rank between 5 and 105 based on the
+  // density (percentage of nonzero entries) of the dataset.
+  const double density = (cleanedData.n_nonzero * 100.0) / cleanedData.n_elem;
+  size_t rank = size_t(density) + 5;
 
   // Presently only ALS (via NMF) is supported as an optimizer.  This should be
   // converted to a template when more optimizers are available.
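
To make the arithmetic of the heuristic in the patch above concrete, here is a
minimal standalone sketch. The function name DensityBasedRank and its plain
integer parameters are hypothetical and not part of mlpack; they stand in for
cleanedData.n_nonzero and cleanedData.n_elem. The guard for an empty matrix is
an assumption added here and does not appear in the commit.

```cpp
#include <cstddef>

// Hypothetical helper (not mlpack API): mirrors the density-based rank
// heuristic from the patch, using plain counts instead of an arma::sp_mat.
std::size_t DensityBasedRank(const std::size_t nonzero,
                             const std::size_t elements)
{
  // Assumption added for this sketch: avoid dividing by zero on an empty
  // matrix and fall back to the minimum rank.
  if (elements == 0)
    return 5;

  // Density as a percentage in [0, 100], as computed in the patch.
  const double density = (nonzero * 100.0) / elements;

  // Truncate the percentage and shift by 5, giving a rank in [5, 105].
  return static_cast<std::size_t>(density) + 5;
}
```

For a 1000 x 500 ratings matrix with 25000 stored entries, the density is 5
percent, so the sketch returns a rank of 10. A matrix with no stored entries
gives 5, and a fully dense one gives 105.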


