[mlpack] GSoC Project Idea
Ryan Curtin
gth671b at mail.gatech.edu
Thu Apr 11 11:58:01 EDT 2013
On Thu, Apr 11, 2013 at 01:36:55PM +0530, Akshay wrote:
> Hello,
>
> I want to discuss the idea of parallelizing some of the already implemented
> algorithms using OpenCL. I have prior experience with OpenCL and its C++
> bindings, and have developed a parallelized implementation of KMeans
> clustering.
> OpenCL is traditionally for GPU computing, but in my experience it can also
> drive multi-core CPUs to full load for significant speedups. Algorithms
> consuming large data simultaneously (like k-means) can benefit greatly.
> Also, the base implementation of the algorithms could be used as ground
> truth, which should be useful in debugging the parallel version.
>
> I am also well versed in the standard ML techniques and statistics through
> MOOCs and course projects, and I'm willing to go further.
OpenCL is a difficult proposition. One of the goals of mlpack is
parallelization, but not at the cost of code maintainability. This is
why we've preferred OpenMP up to this point; OpenMP code can be
implemented with simple #pragma directives, which a person reading the
code can just as easily ignore if they do not understand parallel
code -- and most people in machine learning do not.
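
To make the point about #pragma directives concrete, here is a minimal
sketch of the assignment step of k-means with a single OpenMP directive.
This is an illustration only, not the mlpack implementation: the names
SquaredDistance and AssignClusters and the use of std::vector for points
are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not mlpack API): squared Euclidean distance
// between two points of equal dimension.
double SquaredDistance(const std::vector<double>& a,
                       const std::vector<double>& b)
{
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d)
  {
    const double diff = a[d] - b[d];
    sum += diff * diff;
  }
  return sum;
}

// Hypothetical function (not mlpack API): for each point, return the
// index of its nearest centroid.
std::vector<std::size_t> AssignClusters(
    const std::vector<std::vector<double>>& points,
    const std::vector<std::vector<double>>& centroids)
{
  assert(!centroids.empty());
  std::vector<std::size_t> assignments(points.size());

  // Each iteration writes only assignments[i], so the iterations are
  // independent. A compiler without OpenMP enabled ignores the pragma
  // and runs the loop serially with the same result.
  #pragma omp parallel for
  for (long i = 0; i < static_cast<long>(points.size()); ++i)
  {
    double best = std::numeric_limits<double>::max();
    std::size_t bestIndex = 0;
    for (std::size_t c = 0; c < centroids.size(); ++c)
    {
      const double dist = SquaredDistance(points[i], centroids[c]);
      if (dist < best)
      {
        best = dist;
        bestIndex = c;
      }
    }
    assignments[i] = bestIndex;
  }

  return assignments;
}
```

With GCC, building with -fopenmp enables the directive; building without
it gives the plain serial loop, so a reader who does not know OpenMP can
skip the one pragma line and still follow the code.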
We do currently have an OpenMP implementation of k-means that is being
merged into trunk. Perhaps a more suitable project would be OpenMP
support for other machine learning methods. Would this be interesting
to you?
--
Ryan Curtin | "And they say there is no fate, but there is: it's
ryan at igglybob.com | what you create." - Minister