[mlpack] GSOC-2013 : Working on Collaborative Filtering

Srijan Kumar srijankedia at gmail.com
Tue Apr 23 18:41:39 EDT 2013


Hi Ryan,

Thanks for pointing out the paper on Sudoku. I will take some time out to read
it.

I have gone through the mailing list archives, and most of my questions have
been answered for now.
I will let you know if I need more help.

Thanks
Srijan

On Tue, Apr 23, 2013 at 12:16 AM, Ryan Curtin <gth671b at mail.gatech.edu> wrote:

> On Sun, Apr 21, 2013 at 09:29:03PM +0530, Srijan Kumar wrote:
> > Hi,
> >
> > My full profile can be found at
> > https://sites.google.com/site/srijankedia/home.
>
> Interesting aside -- I see that you like to solve sudoku puzzles.  Are
> you familiar with the literature on sudoku?  Some years ago it was shown
> that sudoku is NP-complete:
>
> http://www-imai.is.s.u-tokyo.ac.jp/~yato/data2/SIGAL87-2.pdf
>
> > I do have a few questions that would help to decide the CF algorithm to be
> > finally implemented.
> > It would be great if the mentors could please answer the following
> > questions -
> > 1. What kind and size of data would we need to handle? Do
> > you have anything in mind or is it general at the moment?
>
> mlpack is generally used on single systems, not clusters, so datasets of up
> to about 16GB are probably the target.  I suppose you could OpenMP-ize the
> code and then run it on a cluster on a huge amount of data, but I don't
> imagine many people are planning to use mlpack that way.
>
> > 2. What are the other factors that we would need to consider while choosing
> > the algorithm?
>
> Extensibility is helpful.  If we can provide an algorithm that is highly
> modular, this opens up possibilities for other researchers to try
> modifying the algorithm slightly with ease.  For instance, take a look
> at the k-means code in src/mlpack/methods/kmeans/ and note that one of
> the template parameters is a class which defines how to find the
> starting points.  If a researcher wanted to play around with different
> initialization methods for k-means (which actually I have been doing
> this past week), then they can implement their own without having to
> deal with the k-means algorithm at all.
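
The modular design described above, where the initialization strategy is a
template parameter, can be illustrated with a minimal sketch. This is not
mlpack's actual k-means code: ToyKMeans, FirstPointsInit, and EvenlySpacedInit
are hypothetical names invented for this example, and the data is kept
one-dimensional to stay short. It only shows how a new initialization policy
can be swapped in without touching the clustering loop.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical policy: use the first k points as the starting centroids.
struct FirstPointsInit {
  static std::vector<double> Initialize(const std::vector<double>& data,
                                        std::size_t k) {
    assert(k > 0 && k <= data.size());
    return std::vector<double>(data.begin(),
                               data.begin() + static_cast<std::ptrdiff_t>(k));
  }
};

// Hypothetical policy: pick k points spread evenly over the index range.
struct EvenlySpacedInit {
  static std::vector<double> Initialize(const std::vector<double>& data,
                                        std::size_t k) {
    assert(k > 0 && k <= data.size());
    std::vector<double> centroids;
    for (std::size_t i = 0; i < k; ++i)
      centroids.push_back(data[i * data.size() / k]);
    return centroids;
  }
};

// Toy 1-D k-means (Lloyd iterations). The clustering loop never needs to
// know which initialization policy was chosen.
template <typename InitPolicy = FirstPointsInit>
class ToyKMeans {
 public:
  explicit ToyKMeans(std::size_t k, std::size_t maxIterations = 100)
      : k_(k), maxIterations_(maxIterations) {}

  // Returns the final centroids for the given 1-D data.
  std::vector<double> Cluster(const std::vector<double>& data) const {
    std::vector<double> centroids = InitPolicy::Initialize(data, k_);
    for (std::size_t iter = 0; iter < maxIterations_; ++iter) {
      std::vector<double> sums(k_, 0.0);
      std::vector<std::size_t> counts(k_, 0);
      // Assignment step: attach each point to its nearest centroid.
      for (double x : data) {
        std::size_t best = 0;
        for (std::size_t j = 1; j < k_; ++j)
          if (std::fabs(x - centroids[j]) < std::fabs(x - centroids[best]))
            best = j;
        sums[best] += x;
        ++counts[best];
      }
      // Update step: move each non-empty centroid to the mean of its points.
      bool changed = false;
      for (std::size_t j = 0; j < k_; ++j) {
        if (counts[j] == 0) continue;  // keep an empty cluster's old centroid
        const double updated = sums[j] / static_cast<double>(counts[j]);
        if (updated != centroids[j]) changed = true;
        centroids[j] = updated;
      }
      if (!changed) break;  // converged
    }
    return centroids;
  }

 private:
  std::size_t k_;
  std::size_t maxIterations_;
};
```

Under these assumptions, trying a different initialization is a one-line
change at the call site, for example ToyKMeans<EvenlySpacedInit> km(2);
followed by km.Cluster(data), while the ToyKMeans class itself stays untouched.
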
>
> Make sure to take a look at the list archives to find the previous
> discussion on the collaborative filtering projects.  You may find useful
> information there.
>
> https://mailman.cc.gatech.edu/pipermail/mlpack/2013-April/thread.html
>
> If you have more questions, feel free to ask.
>
> Thanks,
>
> Ryan
>
> --
> Ryan Curtin       | "I love it when a plan comes together."
> ryan at igglybob.com |   - Hannibal Smith
>



-- 
Srijan Kumar
Final Year Undergraduate Student
Department of Computer Science and Engineering
Indian Institute of Technology, Kharagpur
India