[mlpack] [GSOC 2016] Contribute for MLPack

Ryan Curtin ryan at ratml.org
Fri Mar 11 09:56:40 EST 2016


On Fri, Mar 11, 2016 at 06:00:28PM +0530, Parijat Dewangan wrote:
> Hi,
> 
> I went through various research papers to get a better understanding of
> dual-tree algorithms and the various tree types. Currently, I am focusing
> on vantage point trees, referring to the following papers:
> 
>    -  "Improving Dual-Tree Algorithms", thesis by Ryan Curtin
>       <http://www.ratml.org/pub/pdf/2015improving.pdf>
>    -  "Data Structures and Algorithms for Nearest Neighbor Search in
>       General Metric Spaces" by Peter N. Yianilos
> 
> I am completely comfortable with the MLPack API after going through
> Ryan Curtin's thesis above. So, I was thinking of coding vantage point
> trees. What do you suggest?

Sure, vantage point trees would be interesting.  Note that vantage point
trees are actually the same as metric trees, so we'll have to provide
some documentation somewhere indicating that they are the same thing.
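
For readers unfamiliar with the structure, here is a minimal sketch of a
vantage point tree in Python. It is not mlpack code (mlpack is C++) and
does not follow the mlpack tree API; the names VPNode, build_vp_tree, and
nearest are hypothetical, and the random choice of vantage point and the
median split are assumptions made to keep the example short.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, ...]
Metric = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance; any metric obeying the triangle inequality works."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


@dataclass
class VPNode:
    # Hypothetical node type for this sketch (not an mlpack class).
    vantage: Point
    radius: float = 0.0                 # split distance from the vantage point
    inside: Optional["VPNode"] = None   # points with distance <= radius
    outside: Optional["VPNode"] = None  # points with distance >= radius


def build_vp_tree(points: Sequence[Point],
                  metric: Metric = euclidean,
                  rng: Optional[random.Random] = None) -> Optional[VPNode]:
    """Recursively split the points by distance to a vantage point."""
    if not points:
        return None
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible

    rest: List[Point] = list(points)
    # Assumption: pick the vantage point uniformly at random.
    vantage = rest.pop(rng.randrange(len(rest)))
    if not rest:
        return VPNode(vantage)

    # Sort the remaining points by distance and split at the median.
    rest.sort(key=lambda p: metric(vantage, p))
    mid = len(rest) // 2
    radius = metric(vantage, rest[mid])
    return VPNode(vantage, radius,
                  build_vp_tree(rest[:mid], metric, rng),
                  build_vp_tree(rest[mid:], metric, rng))


def nearest(node: Optional[VPNode], query: Point,
            metric: Metric = euclidean,
            best: Optional[Tuple[float, Point]] = None
            ) -> Optional[Tuple[float, Point]]:
    """Return (distance, point) of the nearest neighbor of query."""
    if node is None:
        return best
    d = metric(query, node.vantage)
    if best is None or d < best[0]:
        best = (d, node.vantage)

    # Visit the more promising side first, then use the triangle
    # inequality to decide whether the other side can be skipped.
    if d < node.radius:
        best = nearest(node.inside, query, metric, best)
        if d + best[0] >= node.radius:
            best = nearest(node.outside, query, metric, best)
    else:
        best = nearest(node.outside, query, metric, best)
        if d - best[0] <= node.radius:
            best = nearest(node.inside, query, metric, best)
    return best
```

Calling build_vp_tree on a list of coordinate tuples and then nearest on
the returned root gives the same answer as a brute-force scan, but whole
subtrees are skipped whenever the triangle inequality rules them out.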

> Should I provide you with the pseudo code of Vantage Point Trees? Or should
> I try fixing some issues? I was thinking of taking up issue #275:
> https://github.com/mlpack/mlpack/issues/275

If you can do #275 without breaking any of the tests, please feel free
to do so, and I'd be happy to merge the improvement.  Be aware that it
will be a significant refactoring.

Another possibility from there would be to implement a leaf size
parameter for cover trees, which would cause the tree-building process
to terminate once the number of points in a node is small enough.  But I
think that would be a lot more difficult and maybe we can save that for
another day... :)
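
As a rough illustration of what a leaf size parameter does, the Python
sketch below stops splitting once a node holds at most leaf_size points.
It is deliberately not a cover tree: it uses a simple median split along
the widest dimension (kd-tree style) as a stand-in, and the names Node,
build, and leaves are hypothetical.  Doing the same for cover trees would
be more involved.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    # Hypothetical node type: leaves hold points, internal nodes hold children.
    points: List[Point] = field(default_factory=list)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: Sequence[Point], leaf_size: int = 20) -> Node:
    """Build a binary tree, terminating when a node is small enough."""
    if leaf_size < 1:
        raise ValueError("leaf_size must be at least 1")

    # The leaf size check: stop recursing and store the points directly.
    if len(points) <= leaf_size:
        return Node(points=list(points))

    # Stand-in splitting rule: median split on the dimension of widest spread.
    dims = range(len(points[0]))
    split_dim = max(
        dims,
        key=lambda k: max(p[k] for p in points) - min(p[k] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[split_dim])
    mid = len(ordered) // 2
    return Node(left=build(ordered[:mid], leaf_size),
                right=build(ordered[mid:], leaf_size))


def leaves(node: Node) -> List[Node]:
    """Collect all leaf nodes, left to right."""
    if node.left is None and node.right is None:
        return [node]
    return leaves(node.left) + leaves(node.right)
```

A larger leaf_size gives a shallower tree with more brute-force work per
leaf; a smaller one gives a deeper tree with more nodes to traverse.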

Thanks,

Ryan

-- 
Ryan Curtin    | "Like, with jetpacks?"
ryan at ratml.org |   - Scott

