[mlpack-svn] [MLPACK] #300: allknn fails for mnist8m dataset

MLPACK Trac trac at coffeetalk-1.cc.gatech.edu
Tue Aug 20 20:04:55 EDT 2013


#300: allknn fails for mnist8m dataset
----------------------+-----------------------------------------------------
  Reporter:  rozyang  |        Owner:  rcurtin 
      Type:  defect   |       Status:  accepted
  Priority:  major    |    Milestone:          
 Component:  mlpack   |   Resolution:          
  Keywords:           |     Blocking:          
Blocked By:           |  
----------------------+-----------------------------------------------------

Comment (by rozyang):

 Hi, rcurtin, thank you very much. Your answer really makes sense. If the
 whole matrix has to be loaded into RAM as DOUBLE, then it is

  8100000 * 784 * 8 = 47.3142 G
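
The arithmetic above can be checked with a small helper. This is only a sketch: the function names `DenseMatrixBytes` and `ToGiB` are made up for illustration and are not part of mlpack, and "G" is assumed to mean GiB (2^30 bytes), which is what reproduces the 47.3142 figure.

```cpp
#include <cstdint>

// Bytes needed to hold a dense rows x cols matrix whose elements each take
// bytesPerElement bytes. (Hypothetical helper, not an mlpack function.)
constexpr std::uint64_t DenseMatrixBytes(std::uint64_t rows,
                                         std::uint64_t cols,
                                         std::uint64_t bytesPerElement)
{
  return rows * cols * bytesPerElement;
}

// Convert a byte count to GiB (2^30 bytes).
constexpr double ToGiB(std::uint64_t bytes)
{
  return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}
```

Calling `ToGiB(DenseMatrixBytes(8100000, 784, sizeof(double)))` gives the size of the matrix when every entry is stored as a double; passing a different element size gives the footprint for other storage types.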

 I will wait for your new code and buy more RAM. Besides your suggestions,
 I tried this dataset because the matrix entries are actually not DOUBLE.
 They are UINT8, i.e. one byte each. So what should really be in memory is

  8100000 * 784 * 1 = 5.9143 G

 which should fit in 16G of RAM, given some extra on-the-fly
 integer-to-double conversion. Of course, it requires modifying the allknn
 code. I am not sure whether that is doable in the near future. It should be
 useful for all applications using UINT8 data, especially in vision.
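
To make the idea of on-the-fly conversion concrete, here is a rough sketch. It is not the allknn code and does not use any mlpack API: `ByteDataset`, `SquaredDistance`, and `NearestNeighbor` are hypothetical names, and the search is a plain brute-force loop rather than the tree-based search allknn uses. The only point is that the data stays in memory as one byte per entry and each value is widened to double at the moment it enters a distance computation.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical container: points are stored back to back as raw bytes,
// so point i occupies values[i * dims] .. values[(i + 1) * dims - 1].
struct ByteDataset
{
  std::size_t dims;
  std::vector<std::uint8_t> values;

  std::size_t NumPoints() const
  {
    return dims == 0 ? 0 : values.size() / dims;
  }

  const std::uint8_t* Point(std::size_t i) const
  {
    return values.data() + i * dims;
  }
};

// Squared Euclidean distance between two byte-valued points. Each byte is
// converted to double only here, so no double copy of the data is kept.
inline double SquaredDistance(const std::uint8_t* a,
                              const std::uint8_t* b,
                              std::size_t dims)
{
  double sum = 0.0;
  for (std::size_t d = 0; d < dims; ++d)
  {
    const double diff = static_cast<double>(a[d]) - static_cast<double>(b[d]);
    sum += diff * diff;
  }
  return sum;
}

// Brute-force nearest neighbor of point `query`, skipping the point itself.
// Returns data.NumPoints() if there is no other point.
inline std::size_t NearestNeighbor(const ByteDataset& data, std::size_t query)
{
  std::size_t best = data.NumPoints();
  double bestDist = std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < data.NumPoints(); ++i)
  {
    if (i == query)
      continue;
    const double dist =
        SquaredDistance(data.Point(query), data.Point(i), data.dims);
    if (dist < bestDist)
    {
      bestDist = dist;
      best = i;
    }
  }
  return best;
}
```

The trade-off in this sketch is extra conversion work in the inner loop in exchange for an eightfold smaller resident dataset; whether that is acceptable inside the real tree-based search would have to be measured.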

-- 
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/300#comment:4>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.

