[mlpack] GSOC'16 Deep Learning Aspirant

Ryan Curtin ryan at ratml.org
Thu Mar 3 12:06:01 EST 2016


On Thu, Mar 03, 2016 at 02:54:08AM +0530, Shashank Gupta wrote:
> Hello,
> 
> I am Shashank Gupta, pursuing an MS by Research at IIIT-Hyderabad,
> working under the supervision of Dr. Vasudeva Varma (Dean R&D at
> IIIT-H). My research area is applications of Deep Learning in text
> analysis. I have taken a Machine Learning course (intermediate level,
> Deep Learning covered) at IIIT Hyderabad, and I was a Teaching
> Assistant for the Machine Learning course at BITS Pilani during the
> 2014-15 session.
> 
> I have decent knowledge of Deep Learning and I am familiar with basic
> NN architectures like CNN, RNN and RBM. I am familiar with Python,
> C++, Java and Matlab. For Deep Learning I have worked with Theano and
> Keras and I have basic knowledge of TensorFlow. I am looking for a
> Deep Learning project in GSoC this year (as it matches my research
> area) and came across mlpack.
> 
> I have a project proposal. Recently there has been a lot of work in
> the area of representation learning for NLP. Learned representations
> are now heavily used in almost all Machine Learning algorithms for
> text (including Deep Learning methods). Yet they are not given much
> thought in popular libraries. I think there should be one module that
> can learn representations from text in an unsupervised manner. One
> method which I find easy to understand and which works well is GloVe.
> The reference paper is:
> 
> http://www-nlp.stanford.edu/pubs/glove.pdf 
> 
> This would be a good starting point for integrating word embedding
> modules into the existing code base, and other methods could be added
> later (prediction-based methods, etc.). If this proposal is feasible,
> then I would like to contribute to it.
> 
> If the above idea is not feasible for some reason, then I am also
> interested in the idea "Essential Deep Learning Modules". I am very
> much interested in working on such projects, as it would give me a
> chance to understand these models closely and to implement them,
> which is always the best way to learn.
> 
> I would appreciate it if anyone could assess the feasibility of my
> proposal and also point out how to get started with Deep Learning in
> mlpack.

Hi Shashank,

Here is a response to another prospective student about the deep
learning modules project:

https://mailman.cc.gatech.edu/pipermail/mlpack/2016-February/000738.html

As for GloVe, please note that mlpack doesn't really have any support
for features as words; mlpack is built on the assumption that all data
is numeric (or, in some places, categorical, but still represented by
numbers).

So if we were to add something like GloVe, we would need to think
through the abstractions carefully so that whatever you implemented
still acted like the rest of mlpack. I don't really want to add just
one module that takes words as input while the rest of mlpack takes
numeric data; that could be confusing to users.

If you want to continue down that path, please do feel free, but be sure
to detail completely in your proposal how your interface will be like
the other pieces of mlpack, how it will differ, and consider ideas for
how to adapt the existing code (in, for instance, src/mlpack/core/data/)
to handle words.
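
To make the "words represented by numbers" idea concrete, here is a
minimal sketch of one possible abstraction: a dictionary that assigns
each distinct word a numeric index, so that everything downstream only
ever sees numbers. The names WordDictionary, Map(), Unmap(), and
Encode() are hypothetical and made up for illustration; they are not
existing mlpack API, and a real proposal would need to work out how
something like this fits with the existing data-loading code.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (NOT part of mlpack): maps each distinct word to a
// numeric index and back again.
class WordDictionary
{
 public:
  // Return the index of a word, assigning the next unused index if the word
  // has not been seen before.
  std::size_t Map(const std::string& word)
  {
    const auto it = indices.find(word);
    if (it != indices.end())
      return it->second;

    const std::size_t index = words.size();
    indices.emplace(word, index);
    words.push_back(word);
    return index;
  }

  // Recover the original word from its index (throws if out of range).
  const std::string& Unmap(const std::size_t index) const
  {
    return words.at(index);
  }

  // Number of distinct words seen so far.
  std::size_t Size() const { return words.size(); }

 private:
  std::unordered_map<std::string, std::size_t> indices;
  std::vector<std::string> words;
};

// Hypothetical helper: encode a tokenized sentence as numeric values, one
// per token, so that it looks like any other numeric data.
std::vector<double> Encode(WordDictionary& dict,
                           const std::vector<std::string>& tokens)
{
  std::vector<double> encoded;
  encoded.reserve(tokens.size());
  for (const std::string& token : tokens)
    encoded.push_back(static_cast<double>(dict.Map(token)));
  return encoded;
}
```

With something like this, a word-based method could accept numeric
input like everything else, and the dictionary would only be needed at
the boundaries (loading text and printing results).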

I hope this is helpful.

Thanks,

Ryan

-- 
Ryan Curtin    | "Get off my lawn!"
ryan at ratml.org |   - Kowalski

