[mlpack] GSoC 2013, ML Project Idea Query

Ryan Curtin gth671b at mail.gatech.edu
Fri Apr 12 16:00:11 EDT 2013


On Thu, Apr 11, 2013 at 01:46:02AM +0530, satya pradeep wrote:
> Hello,
> 
> I am Pradeep, currently in the pre-final year of a B.Tech in Electrical
> Engineering at the Indian Institute of Technology Kanpur (IITK), with
> research interests in image processing and pattern recognition, machine
> learning, and artificial intelligence programming. I'm new to the world of
> open source: although I've been using Ubuntu and various other open source
> software for the past 3-4 years, I had no idea that it was such a huge
> thing. As I was going through the GSoC 2013 ideas list, the following
> project ideas really struck me:
> 
> 1. collaborative filtering package

Hello Pradeep,

I wrote more about the CF package in a recent email:

https://mailman.cc.gatech.edu/pipermail/mlpack/2013-April/000039.html

Hopefully that contains useful information, but if you have more
questions, let me know.

> 2. fixes to mvu and low-rank semidefinite programs
> 
> I have implemented projects based on machine learning and pattern
> recognition, such as "Identification of digits from handwritten
> numerals" in MATLAB, using learning methods like Manifold model, generative
> model and KNN classifier, a purely discriminative model to classify 60,000
> images of the MNIST dataset. I have uploaded a crude demo on my homepage at
> http://home.iitk.ac.in/~deepu/cs365/hw1/
> and "Manifold Motion Planning to map a robot's arm motion", which used
> Isomap for dimensionality reduction in the control space:
> http://home.iitk.ac.in/~deepu/cs365/hw3/
> I am familiar with the programming languages Java, C, C++, and Python. I
> have taken a basic data structures course and also advanced artificial
> intelligence and machine learning courses. As I am new to this library, I'm
> researching/googling everything I can find on the web about it. Also, any
> help from your side regarding the project or a starting point would be
> great.
> 
> I like the second project, "fixes to mvu and low-rank semidefinite
> programs", more than the first, because of its relation to dimensionality
> reduction techniques, which I have had experience with, but I am a little
> worried that it would be very difficult to approach. It would be great if
> you could provide some more information or a starting point about both
> these projects.

The MVU problem is very difficult.  Usually MVU is expressed as a
semidefinite program [1, 2], which is convex and can be solved with
well-known techniques.  However, solving convex SDPs is often slow.
Fortunately, about ten years ago, some researchers had the idea of
reformulating an SDP in a non-convex low-rank manner (LRSDP), so that
the matrix you are optimizing is low-rank [3].  This provides good
speedup, and would be applicable to something like MVU.  Some
semi-recent work did apply LRSDP to MVU with reported success [4].
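
To make the reformulation concrete, here is a rough sketch of the two
problems in LaTeX notation.  The symbols are notation chosen here for
illustration, following the usual presentation in [1] and [3], and are
not taken from the mlpack code: K is the n x n Gram matrix of the
embedded points, x_i are the input points, N is the set of neighbor
pairs, and R is an n x r factor with rows R_i and r much smaller than n.

```latex
% MVU as a convex SDP over the Gram matrix K:
\[
\begin{aligned}
\max_{K}\quad & \operatorname{tr}(K) \\
\text{s.t.}\quad & K \succeq 0, \\
& \textstyle\sum_{i,j} K_{ij} = 0, \\
& K_{ii} - 2K_{ij} + K_{jj} = \lVert x_i - x_j \rVert^2
  \quad \forall (i,j) \in \mathcal{N}.
\end{aligned}
\]

% Low-rank reformulation: substitute K = R R^T.
\[
\begin{aligned}
\max_{R \in \mathbb{R}^{n \times r}}\quad & \operatorname{tr}(R R^{\top}) \\
\text{s.t.}\quad & \textstyle\sum_{i,j} (R R^{\top})_{ij} = 0, \\
& \lVert R_i - R_j \rVert^2 = \lVert x_i - x_j \rVert^2
  \quad \forall (i,j) \in \mathcal{N}.
\end{aligned}
\]
```

Writing K = R R^T makes K positive semidefinite by construction, so that
constraint disappears and only n*r variables remain instead of n^2.  The
price is that the constraints become quadratic in R, so the problem is
no longer convex.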

However, the current mlpack implementation of MVU with LRSDP does not
converge, in spite of the positive results reported in the paper I
referenced [4].  Debugging numerical optimization algorithms, especially
ones this complex, is *not easy* and tends to be incredibly involved.
This doesn't mean it is impossible.  If you are interested in getting
started with this, I'd suggest reading the four references I've listed
below (and things referenced therein) so that you are comfortable with
the problem of MVU and with LRSDP.  Then, you could look at the existing
codebase to understand it.

[1] K.Q. Weinberger, L.K. Saul. "An introduction to nonlinear
    dimensionality reduction by maximum variance unfolding."
    Proceedings of the National Conference on Artificial Intelligence.
    Vol. 21, No. 2.  2006.

[2] L. Vandenberghe, S. Boyd.  "Semidefinite programming."  SIAM Review
    38.1 (1996): 49-95.

[3] S. Burer, R.D.C. Monteiro.  "A nonlinear programming algorithm for
    solving semidefinite programs via low-rank factorization."
    Mathematical Programming 95.2 (2003): 329-357.

[4] N. Vasiloglou, A.G. Gray, and D.V. Anderson.  "Scalable semidefinite
    manifold learning."  IEEE Workshop on Machine Learning for Signal
    Processing (MLSP), 2008.

Hopefully this information is helpful.  Let me know if you have any more
questions.

Thanks,

Ryan

-- 
Ryan Curtin       | "Are you or are you not the Black Angel of Death?"
ryan at igglybob.com |   - Steve

