[mlpack] Enquiry About the GSoC

Ryan Curtin gth671b at mail.gatech.edu
Wed Apr 10 12:14:26 EDT 2013


On Wed, Apr 10, 2013 at 10:34:56PM +0800, Tommy Lin wrote:
> Although it may be ridiculous for a year 1 student to try such a
> difficult project, I don't want to give up before trying.
> 
> Could you please recommend some papers for me?

There are a number of other projects listed there which are not as
difficult and that don't require such specific domain knowledge.  Since
you have no knowledge of the field, I strongly suggest you consider one
of the other ones.  Nevertheless, here are some references that you
could look into...

----

J.H. Friedman, J.L. Bentley, and R.A. Finkel. "An algorithm for finding
best matches in logarithmic expected time." ACM Transactions on
Mathematical Software (TOMS) 3.3 (1977): 209-226.

J.L. Bentley and J.H. Friedman. "Data structures for range searching."
ACM Computing Surveys (CSUR) 11.4 (1979): 397-409.

K.L. Clarkson. "Nearest neighbor queries in metric spaces." Discrete &
Computational Geometry 22.1 (1999): 63-93.

A.W. Moore. "The Anchors Hierarchy: using the triangle inequality to
survive high-dimensional data."  Proceedings of the Sixteenth Conference
on Uncertainty in Artificial Intelligence (2000).

A.G. Gray and A.W. Moore. "N-body problems in statistical learning."
Advances in Neural Information Processing Systems (2001): 521-527.

A.G. Gray and A.W. Moore.  "Nonparametric density estimation: toward
computational tractability."  Proceedings of the Third SIAM
International Conference on Data Mining (2003).

A. Beygelzimer, S. Kakade, and J.L. Langford.  "Cover trees for nearest
neighbor."  Proceedings of the 23rd International Conference on Machine
Learning (2006).

P. Ram, D. Lee, W.B. March, and A.G. Gray.  "Linear-time algorithms for
pairwise statistical problems."  Advances in Neural Information
Processing Systems (2009).

----

I would not expect a first (or second) year undergraduate to be able to
effectively tackle this project.  Instead, a more likely candidate would
be someone near the completion of an undergraduate degree or a graduate
student who is well versed in data structures, low-level code
optimization, and template metaprogramming, and perhaps somewhat
familiar with some of the papers I listed above.

-- 
Ryan Curtin       | "Like, with jetpacks?"
ryan at igglybob.com |   - Scott

