[mlpack] Patch for Ticket #251 and GSoC Intro

Pararth Shah pararthshah717 at gmail.com
Sat Apr 13 06:11:06 EDT 2013


Hi,

I am interested in working with MLPACK this summer as part of GSoC 2013.
Since this list has already been flooded with introduction emails, I went
ahead and created a patch for a bug, in order to (i) get acquainted with
the code, and (ii) help towards reaching the 1.0.5 milestone.

*Patch for #251: kmeans -f has no test and does not work*

The ticket is here <http://trac.research.cc.gatech.edu/fastlab/ticket/251>.
I found that KMeans::FastCluster() fails because of a problem in the
construction of the MRKDStatistic object. I modified the constructor
implementation (diff attached), which seems to have resolved the issue.

I will go ahead and add a test for FastCluster(), but please confirm that I
am on the right track. If so, I'll assign the ticket to myself and attach
the diff to it (I have already created an account on Trac).

*Brief Bio*

I am a final-year student at IIT-Bombay, majoring in Computer Science. I
participated in GSoC 2011, working with the Point Cloud Library on building an
automated benchmarking framework for 3-D point cloud processing algorithms.
(Here's my blog
<http://www.pointclouds.org/blog/gsoc/pararthshah/index.php> from
that summer). Last year, I interned at Google Los Angeles, working with
their machine learning and distributed processing systems, mainly on
large-scale entity disambiguation and classification for improving ads
relevance. I have developed an interest in machine learning through various
courses, internships and research projects. For more info, please visit my home
page <http://www.cse.iitb.ac.in/~pararth>.

I am excited about MLPACK as I am most comfortable with C++, and I feel
that a high-speed general-purpose library of ML algorithms in C++ will play
a big role in simplifying data mining applications, similar to what OpenCV
has done for Vision. I will be graduating in May, so I'll have a lot of
free time to work on the GSoC project, until I move to Stanford in
mid-September to pursue an MS in CS with a specialization in Machine
Learning.

*Project Interests*

The "Automated Benchmarking of MLPACK Methods" project sounds interesting
to me, as I did similar work during my previous GSoC experience. I am working
on sketching out a proposal for the same, while simultaneously getting a
better understanding of the MLPACK codebase. However, I am also on the
lookout for other project ideas that may interest me, and will email again
once I have something substantial to discuss (two such ideas are
parallelization using OpenMP, and support for graph min-cut based
optimizations).

Thanks,
Pararth
http://www.cse.iitb.ac.in/~pararth
http://www.linkedin.com/profile/view?id=202978774


P.S. @Ryan and other mentors, what would be a good time of the day to catch
one of you on IRC?
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ticket_253.diff
Type: application/octet-stream
Size: 3185 bytes
Desc: not available
URL: <https://mailman.cc.gatech.edu/pipermail/mlpack/attachments/20130413/6a6a4b15/attachment.obj>

