[mlpack] mlpack's projects in Google Summer of Code 2016

Ryan Curtin ryan at ratml.org
Mon Apr 25 13:30:34 EDT 2016


Hello there,

After much deliberation, we have selected six projects for mlpack for
this year's GSoC, out of a record-breaking 119 applications.  In many
cases, these decisions were quite difficult to make, and we are sorry
that we could not accept all of the exceptional students who applied.  I
want to thank everyone who applied---I know how much work goes into the
application process.

Here are the six projects that will be done as part of this year's
Google Summer of Code:

----

"Neuroevolution Algorithms Implementation"
  by Bang Liu, mentored by Marcus Edel

  Bang will implement neuroevolution algorithms, such as CNE, NEAT, and
HyperNEAT, for mlpack's neural network framework.  These will be tested
on various problems, including applying HyperNEAT to NES games.
In addition, benchmarking will be done to verify that mlpack's
implementations are competitive with---or faster than---other
implementations of these neuroevolution algorithms.
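
  As a rough, mlpack-agnostic sketch of the simplest of these,
conventional neuroevolution (CNE) evolves the weights of a
fixed-topology network with a plain genetic algorithm.  Everything
below is illustrative only (the toy fitness function in particular);
the real implementation will plug into mlpack's neural network
framework, and NEAT/HyperNEAT will also evolve the topology.

    #include <algorithm>
    #include <functional>
    #include <random>
    #include <utility>
    #include <vector>

    // Toy fitness: prefer weights near zero.  In practice this would run
    // the network on the task (say, a NES game) and return its score.
    double EvaluateFitness(const std::vector<double>& weights)
    {
      double sum = 0.0;
      for (double w : weights)
        sum += w * w;
      return -sum;
    }

    int main()
    {
      const size_t populationSize = 50, numWeights = 20, generations = 100;
      std::mt19937 rng(std::random_device{}());
      std::normal_distribution<double> init(0.0, 1.0), mutate(0.0, 0.1);

      // Population of candidate weight vectors for a fixed topology.
      std::vector<std::vector<double>> population(populationSize,
          std::vector<double>(numWeights));
      for (auto& individual : population)
        for (auto& w : individual)
          w = init(rng);

      for (size_t g = 0; g < generations; ++g)
      {
        // Evaluate and rank the population, best first.
        std::vector<std::pair<double, size_t>> ranked(populationSize);
        for (size_t i = 0; i < populationSize; ++i)
          ranked[i] = std::make_pair(EvaluateFitness(population[i]), i);
        std::sort(ranked.begin(), ranked.end(),
            std::greater<std::pair<double, size_t>>());

        // The best half survives; the worst half is replaced by mutated
        // copies of the survivors.
        std::vector<std::vector<double>> next(populationSize);
        for (size_t i = 0; i < populationSize / 2; ++i)
        {
          next[i] = population[ranked[i].second];
          next[i + populationSize / 2] = next[i];
          for (auto& w : next[i + populationSize / 2])
            w += mutate(rng);
        }
        population = std::move(next);
      }
    }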

----

"Dataset and Experimentation Tools"
  by Keon Kim, mentored by Tham Ngap Wei

  Keon will design and develop utilities for dataset management in
mlpack.  Specifically, Keon will create four separate modules for
working with datasets: (1) dataset I/O (convert dataset formats, etc.);
(2) data transformation (join/split data, clean missing data, etc.); (3)
statistical analytics (mean/mode/median, t-test, etc.); (4) mathematical
operators (rounding, timezone handling, etc.).  These will supplement
the existing mlpack machine learning algorithms and can be used for
preprocessing (or postprocessing).
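
  For context, this is roughly what dataset handling looks like today
(a minimal sketch against the mlpack 2.0 API; the filenames and the
80/20 split are made up).  The new utilities would wrap chores like
this behind a consistent set of tools:

    #include <mlpack/core.hpp>

    int main()
    {
      arma::mat dataset;
      // mlpack stores observations as columns; Load() handles the
      // transposition from the row-per-observation CSV layout.
      mlpack::data::Load("dataset.csv", dataset, true /* fatal */);

      // A manual train/test split: the kind of step the planned data
      // transformation module would provide directly.
      arma::mat shuffled = arma::shuffle(dataset, 1);
      const size_t trainCount = size_t(0.8 * shuffled.n_cols);
      arma::mat train = shuffled.cols(0, trainCount - 1);
      arma::mat test = shuffled.cols(trainCount, shuffled.n_cols - 1);

      mlpack::data::Save("train.csv", train);
      mlpack::data::Save("test.csv", test);
    }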

----

"Approximate Nearest Neighbor Search"
  by Marcos Pividori, mentored by Sumedh Ghaisas

  Marcos will extend the existing k-nearest-neighbor and
k-furthest-neighbor (knn/kfn) implementations to support approximation,
and then implement spill trees and the "defeatist" search strategy for
approximate nearest neighbor search.  He will then benchmark mlpack's
approximate knn (aknn) implementation against other aknn strategies
using the mlpack benchmarking system.
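
  For reference, exact search with the current API looks roughly like
this (a minimal sketch; the AllkNN typedef and Search() call are as of
mlpack 2.0, and the filename is made up).  The project adds an
approximation option and a spill tree alternative on top of this
interface:

    #include <mlpack/core.hpp>
    #include <mlpack/methods/neighbor_search/neighbor_search.hpp>

    using namespace mlpack::neighbor;

    int main()
    {
      arma::mat references;
      mlpack::data::Load("references.csv", references, true);

      // Exact all-k-nearest-neighbor search over the default kd-tree.
      AllkNN knn(references);

      arma::Mat<size_t> neighbors;
      arma::mat distances;
      knn.Search(5, neighbors, distances);  // 5 neighbors of each point.
    }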

----

"Implement tree types"
  by Mikhail Lozhnikov, mentored by Ryan Curtin

  Mikhail will extend previous years' tree types projects by
implementing several types of trees: R+ trees, Hilbert R trees, vantage
point trees, random projection trees, and UB trees.  Each of these will
be usable by mlpack's various dual-tree algorithms (such as nearest
neighbor search, range search, FastMKS, EMST, and others).
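
  Each new tree is immediately usable everywhere because the dual-tree
algorithms take the tree as a template parameter.  As a rough sketch
(template arguments as of mlpack 2.0, so details may differ), swapping
in an existing alternative tree looks like this, and the new tree types
will slot in the same way:

    #include <mlpack/core.hpp>
    #include <mlpack/methods/neighbor_search/neighbor_search.hpp>

    using namespace mlpack;
    using namespace mlpack::neighbor;

    int main()
    {
      arma::mat data;
      data::Load("dataset.csv", data, true);

      // The same nearest neighbor search algorithm, but built on a ball
      // tree instead of the default kd-tree.
      NeighborSearch<NearestNeighborSort, metric::EuclideanDistance,
          arma::mat, tree::BallTree> search(data);

      arma::Mat<size_t> neighbors;
      arma::mat distances;
      search.Search(3, neighbors, distances);
    }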

----

"We need to go deeper - GoogLeNeT"
  by Nilay Jain, mentored by Tham Ngap Wei and Marcus Edel

  Nilay will implement the components of the GoogLeNet architecture (the
inception layer, global average pooling, and other pieces), and then
build a GoogLeNet on a sample of ImageNet data.  The pieces of this
architecture will be usable for other neural network applications.
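
  As a small taste of one of those pieces: global average pooling
simply reduces each feature map to its mean, instead of flattening
everything into large fully connected layers.  This is a standalone
Armadillo sketch of the idea only, not the form the actual mlpack layer
will take:

    #include <armadillo>

    // Collapse each channel (cube slice) of the feature maps to its mean.
    arma::vec GlobalAveragePooling(const arma::cube& featureMaps)
    {
      arma::vec pooled(featureMaps.n_slices);
      for (size_t c = 0; c < featureMaps.n_slices; ++c)
        pooled[c] = arma::accu(featureMaps.slice(c)) /
            (featureMaps.n_rows * featureMaps.n_cols);
      return pooled;
    }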

----

"Implementation of Multiprobe LSH and LSH Tuning"
  by Yannis Mentekidis, mentored by Ryan Curtin

  Yannis will significantly improve the existing LSH framework by
implementing multiprobe LSH and LSH tuning (as implemented by the
LSHKIT package), and by adding OpenMP support, since many parts of LSH
are embarrassingly parallel.  If time permits, he will implement more
LSH strategies.
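
  For reference, the single-probe LSH we have today is used roughly
like this (a minimal sketch; the constructor arguments of 10
projections per table and 30 tables are made-up values, and the class
is as of mlpack 2.0).  Multiprobe LSH will also examine neighboring
hash buckets for each query, and the tuning work will choose these
parameters automatically instead of by hand:

    #include <mlpack/core.hpp>
    #include <mlpack/methods/lsh/lsh_search.hpp>

    using namespace mlpack::neighbor;

    int main()
    {
      arma::mat data;
      mlpack::data::Load("dataset.csv", data, true);

      // Current single-probe LSH with hand-picked parameters.
      LSHSearch<> lsh(data, 10, 30);

      arma::Mat<size_t> neighbors;
      arma::mat distances;
      lsh.Search(5, neighbors, distances);
    }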

----

I am very excited about the great projects we have this year, and I am
looking forward to seeing the results.  When the Summer of Code starts
on May 23, students will post progress reports either on the mlpack
blog (http://www.mlpack.org/gsocblog/) or to the mailing list, so you
can follow how the projects are going and comment if you like.

You can find more information on each of the projects on the Summer of
Code website:

http://summerofcode.withgoogle.com/organizations/5376684740050944/

Anyway, congratulations to Bang, Keon, Marcos, Mikhail, Nilay, and
Yannis!

Thanks,

Ryan

-- 
Ryan Curtin    | "I was misinformed."
ryan at ratml.org |   - Rick Blaine

