[mlpack-svn] [MLPACK] #300: allknn fails for mnist8m dataset
MLPACK Trac
trac at coffeetalk-1.cc.gatech.edu
Tue Aug 20 20:04:55 EDT 2013
#300: allknn fails for mnist8m dataset
----------------------+-----------------------------------------------------
Reporter: rozyang | Owner: rcurtin
Type: defect | Status: accepted
Priority: major | Milestone:
Component: mlpack | Resolution:
Keywords: | Blocking:
Blocked By: |
----------------------+-----------------------------------------------------
Comment (by rozyang):
Hi, rcurtin, thank you very much. Your answer really makes sense. If the
whole matrix has to be loaded to RAM in DOUBLE, then it is
8100000 * 784 * 8 = 47.3142 G
I will wait for your new code and buy more RAM. Besides your suggestions,
I tried this dataset because the matrix entries are actually not DOUBLE.
They are UINT8, i.e. one byte each. So what really needs to be in memory is
8100000 * 784 * 1 = 5.9143 G
which should fit in 16G of RAM, given some extra on-the-fly
integer-to-double conversion. Of course, that requires modifying the allknn
code, and I am not sure whether it is doable in the near future. It should
be useful for all applications using UINT8 data, especially in vision.
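
To make the idea concrete, here is a minimal sketch in plain C++. It is not
mlpack code: the names FootprintBytes and SquaredDistance are hypothetical and
chosen only for illustration. It reproduces the two memory figures above and
shows the kind of conversion I mean, where the points stay in memory as UINT8
and each element is widened to double only while a distance is computed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes needed to hold an nPoints x nDims matrix with the given element size.
// (Hypothetical helper, for illustration only.)
constexpr std::uint64_t FootprintBytes(std::uint64_t nPoints,
                                       std::uint64_t nDims,
                                       std::uint64_t bytesPerElement)
{
  return nPoints * nDims * bytesPerElement;
}

// The two figures from this ticket: double storage vs. uint8 storage.
static_assert(FootprintBytes(8100000, 784, 8) == 50803200000ULL,
              "about 47.3 G (2^30 bytes) as double");
static_assert(FootprintBytes(8100000, 784, 1) == 6350400000ULL,
              "about 5.9 G (2^30 bytes) as uint8");

// Squared Euclidean distance between two points stored as uint8.
// Each element is converted to double only at the moment it is used,
// so no double-precision copy of the dataset is ever made.
// (Hypothetical helper, not part of allknn.)
double SquaredDistance(const std::uint8_t* a,
                       const std::uint8_t* b,
                       std::size_t dims)
{
  assert(a != nullptr && b != nullptr);
  double sum = 0.0;
  for (std::size_t i = 0; i < dims; ++i)
  {
    const double diff = static_cast<double>(a[i]) - static_cast<double>(b[i]);
    sum += diff * diff;
  }
  return sum;
}
```

The per-element cast costs a little time on every distance evaluation, but
it is what keeps the dataset at one byte per entry. How this would fit into
the tree-building code in allknn is a separate question that this sketch does
not address.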
--
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/300#comment:4>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.