[mlpack-svn] [MLPACK] #300: allknn fails for mnist8m dataset
MLPACK Trac
trac at coffeetalk-1.cc.gatech.edu
Wed Aug 14 23:29:17 EDT 2013
#300: allknn fails for mnist8m dataset
----------------------+-----------------------------------------------------
Reporter: rozyang | Owner: rcurtin
Type: defect | Status: accepted
Priority: major | Milestone:
Component: mlpack | Resolution:
Keywords: | Blocking:
Blocked By: |
----------------------+-----------------------------------------------------
Comment (by rozyang):
The IDX file format is described on the MNIST website:
http://yann.lecun.com/exdb/mnist/
I used Matlab for the conversion:
X = readidx_uint8('train8m-images-idx3-ubyte', 8100000, 8100000);
csvwrite('mnist8m.csv', X);
The first function is given below. The second function is a Matlab
built-in. I have checked the variable 'X' as well as the contents of
mnist8m.csv. The rows are indeed valid handwritten digit images.
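A check of this kind can also be scripted outside Matlab. The following is a minimal Python sketch, not what was actually run for this ticket; the helper name summarize_csv is hypothetical. It counts the rows, collects the distinct field counts per row, and confirms every value fits in an unsigned byte. For 28x28 MNIST images stored one per row, one would expect a single field count of 784.

```python
import csv

def summarize_csv(path):
    """Return (row_count, set_of_field_counts, all_values_in_byte_range)."""
    rows = 0
    widths = set()
    in_range = True
    with open(path, newline="") as f:
        for record in csv.reader(f):
            rows += 1
            widths.add(len(record))
            # MNIST pixels are unsigned bytes, so every field should be 0..255.
            if not all(0 <= float(v) <= 255 for v in record):
                in_range = False
    return rows, widths, in_range
```

The file is streamed one row at a time, so the sketch does not need to hold the whole CSV in memory.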
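function [arr1,arr2] = readidx_uint8(FILENAME,t1,t2)
% Read items 1..t1 into arr1 and items t1+1..t2 into arr2 (one per column).
  fid = fopen(FILENAME,'r','b');    % IDX files are big-endian
  if fid == -1
    error('cannot open %s', FILENAME);
  end
  magic = fread(fid, 1, 'int32');
  if magic==2051                    % image file: count, rows, cols
    num = fread(fid, 1, 'int32');
    ndim(1) = fread(fid, 1, 'int32');
    ndim(2) = fread(fid, 1, 'int32');
    a = ndim(1)*ndim(2);
  elseif magic==2049                % label file: count only
    num = fread(fid, 1, 'int32');
    a = 1;
  else
    fclose(fid);
    error('unknown magic number');  % a would be undefined below, so stop
  end
  % Allocate uint8 directly; uint8(zeros(...)) builds a double array first.
  arr1 = zeros(a, t1, 'uint8');
  arr2 = zeros(a, t2-t1, 'uint8');
  for i=1:t1
    arr1(:,i) = fread(fid, a, 'uint8');
  end
  for i=t1+1:t2
    arr2(:,i-t1) = fread(fid, a, 'uint8');
  end
  fclose(fid);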
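
For readers without Matlab, the same header handling can be sketched in Python using only the standard library. This is an illustrative equivalent under stated assumptions, not code from the ticket: the function name read_idx_uint8 is reused for convenience, the optional count argument is an addition, and each item is returned as a bytes object rather than a matrix column.

```python
import struct

def read_idx_uint8(stream, count=None):
    """Read `count` items (default: all) from a binary IDX stream.

    Returns (item_size, items); item_size is rows*cols for an image
    file and 1 for a label file.
    """
    header = stream.read(8)
    if len(header) != 8:
        raise ValueError("truncated IDX header")
    # ">ii" = two big-endian int32 values, matching fopen(..., 'b').
    magic, num = struct.unpack(">ii", header)
    if magic == 2051:            # image file: rows and cols follow
        dims = stream.read(8)
        if len(dims) != 8:
            raise ValueError("truncated IDX header")
        rows, cols = struct.unpack(">ii", dims)
        item_size = rows * cols
    elif magic == 2049:          # label file: one byte per item
        item_size = 1
    else:
        raise ValueError(f"unknown magic number: {magic}")
    if count is None:
        count = num
    items = []
    for _ in range(count):
        item = stream.read(item_size)
        if len(item) != item_size:
            raise ValueError("file ends before the requested number of items")
        items.append(item)
    return item_size, items
```

It would be called on a file opened in binary mode, for example open('train8m-images-idx3-ubyte', 'rb'). Unlike the Matlab loop, it raises an error if the file ends before the requested number of items has been read.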
--
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/300#comment:2>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.