[mlpack] Automatic Benchmarking
Praveen
vpraveen99 at gmail.com
Fri Feb 28 05:04:12 EST 2014
Hello,
I am Praveen Venkateswaran, an undergraduate doing Computer Science and
Mathematics in India.
I have worked with various machine learning algorithms as well as on
information retrieval, and I would love to contribute to mlpack,
starting with GSoC 2014.
I am interested in working on improvements to the automatic
benchmarking system that was built last summer. I would like to start
by comparing the accuracy of the implementations in the various
libraries. I have been browsing resources to find a starting point for
this. [0] describes WiseRF's benchmarking of random forest
classification.
The point that strikes me most is that, whenever possible, they first
determined which parameters yielded the best results for each library
and then compared the libraries using those per-library parameters
instead of just the defaults. I fully agree with this approach, as it
yields less biased results. What do you think about this?
I have already spoken to Ryan to clarify the details of the project,
and the crux would be the comparison of parameters. If we adopt the
approach described above, we would have to determine the best
parameters for each library over the database size range, and then run
the individual methods with those parameters.
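As a rough illustration of the tuning step described above, the sketch
below runs a small grid search for each library and keeps that
library's best setting before the libraries are compared. The library
names, the run_library_a / run_library_b functions, the parameter names
and the grid are all hypothetical placeholders, not real bindings; a
real benchmark would call the actual library on held-out data and
return its score.

```python
from itertools import product


def best_params(run, grid):
    """Return (best_score, best_params) for one library.

    run  -- callable taking keyword parameters and returning a score
            where higher is better (e.g. accuracy on held-out data)
    grid -- dict mapping parameter name -> list of candidate values
    """
    names = sorted(grid)
    best = None
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        score = run(**params)
        if best is None or score > best[0]:
            best = (score, params)
    return best


# Hypothetical stand-ins for two libraries' random forest wrappers.
# The formulas are made up so the example runs without any library.
def run_library_a(n_trees, max_depth):
    return 0.80 + 0.001 * n_trees - 0.002 * abs(max_depth - 8)


def run_library_b(n_trees, max_depth):
    return 0.78 + 0.002 * n_trees - 0.001 * abs(max_depth - 12)


grid = {"n_trees": [10, 50], "max_depth": [4, 8, 12]}
libraries = {"library_a": run_library_a, "library_b": run_library_b}

# Tune each library separately, then compare them at their own best setting.
tuned = {name: best_params(run, grid) for name, run in libraries.items()}
for name, (score, params) in sorted(tuned.items()):
    print(name, round(score, 3), params)
```

The design choice here is that each library is compared at its own best
setting rather than at a shared default, which is the idea taken from
[0].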
Then we could score them on that basis (for classification algorithms,
the fraction of correctly classified samples; for regression
algorithms, the mean squared error; and for k-means, the inertia
criterion), or something along those lines. I am not too sure about
this, as I don't have experience with all the libraries being tested.
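To make the three scores mentioned above concrete, here is a minimal
sketch in plain Python. It assumes that "inertia" means the
within-cluster sum of squared distances from each point to its nearest
centroid, which is how the term is commonly used; the function names
and the toy data are illustrative only and are not taken from any of
the benchmarked libraries.

```python
def accuracy(y_true, y_pred):
    """Fraction of correctly classified samples."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def inertia(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    total = 0.0
    for p in points:
        total += min(
            sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids
        )
    return total


# Toy data, made up purely for illustration.
acc = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.5, 2.0])
inr = inertia(
    [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0)],
    [(0.0, 1.0), (10.0, 0.0)],
)
print(acc, mse, inr)
```

Note that accuracy is better when higher, while mean squared error and
inertia are better when lower, so a common report would need to state
the direction for each score.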
Please let me know what you think about this and any further
suggestions would be most welcome.
[0] http://about.wise.io/blog/2013/07/15/benchmarking-random-forest-part-1/
Regards
Praveen Venkateswaran