[mlpack] GSoC 2014 : Introduction and Interests

Ryan Curtin gth671b at mail.gatech.edu
Thu Mar 6 11:38:45 EST 2014


On Wed, Mar 05, 2014 at 08:39:10PM +0530, Anand Soni wrote:
> Thanks a lot Ryan!
> 
> I, too, would want to have a single, nice application submitted
> rather than many. It was just out of interest that I was reading up on
> dual trees and yes, most of the literature that I found was from
> gatech. I also came across your paper on dual trees
> (http://arxiv.org/pdf/1304.4327.pdf ). Can you give me some more
> pointers where I can get a better understanding of dual trees?

There are lots of papers on dual-tree algorithms but the paper you
linked to is (to my knowledge) the only one that tries to describe
dual-tree algorithms in an abstract manner.  Here are some links to
other papers, but keep in mind that they focus on particular algorithms
and often don't devote very much space to describing exactly what a
dual-tree algorithm is:

A.G. Gray and A.W. Moore. "N-body problems in statistical learning."
Advances in Neural Information Processing Systems (2001): 521-527.

A.G. Gray and A.W. Moore.  "Nonparametric density estimation: toward
computational tractability."  Proceedings of the Third SIAM
International Conference on Data Mining (2003).

A. Beygelzimer, S. Kakade, and J.L. Langford.  "Cover trees for nearest
neighbor."  Proceedings of the 23rd International Conference on Machine
Learning (2006).

P. Ram, D. Lee, W.B. March, and A.G. Gray.  "Linear-time algorithms for
pairwise statistical problems."  Advances in Neural Information
Processing Systems (2009).

W.B. March, P. Ram, and A.G. Gray.  "Fast Euclidean minimum spanning tree:
algorithm, analysis, and applications."  Proceedings of the 16th ACM
SIGKDD International Conference on Knowledge Discovery and Data Mining
(2010).

R.R. Curtin and P. Ram.  "Dual-tree fast exact max-kernel search."  (This
one hasn't been published yet, but a draft is available at
http://www.ratml.org/pub/pdf/2013fastmks.pdf ).

I know that's a lot of references and probably way more than you want to
read, so don't feel obligated to read all of them, but they will probably
help explain exactly what a dual-tree algorithm is... I hope!  I can
link to more papers too, if you want...

> But, of course, I am more willing to work on automatic benchmarking,
> on which I had a little talk with Marcus and I am brewing ideas.

Ok, sounds good.

Thanks,

Ryan

-- 
Ryan Curtin    | "Somebody dropped a bag on the sidewalk."
ryan at ratml.org |   - Kit

