[mlpack-git] [mlpack/mlpack] Random projection trees (#726)

Ryan Curtin notifications at github.com
Tue Aug 16 19:19:05 EDT 2016

Ok, things look pretty good to me.  As for the refactoring, we can do that after spill trees are merged.

I ran timing simulations, and as expected, the RP tree is slower than the kd-tree (after all, it has looser bounds in all cases).  I only have one remaining question and then a few observations.

 * Should we use RPMax as the default?  Nearly all of my simulations indicate that it performs better than RPMean.

 * The random projection tree appears to do almost as well as the kd-tree in higher dimensions; for instance, performance is almost the same on the MNIST dataset.  I wonder if there exist datasets where it may outperform the kd-tree.

 * One of Pari's papers (http://papers.nips.cc/paper/5121-which-space-partitioning-tree-to-use-for-search) compares random projection trees, kd-trees, and some other types of trees for defeatist search.  Pari's results indicate that the rp-tree is generally the worst-performing of the tree types that he tried.  I wonder if we will see the same results in our own surveys for defeatist search.
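
For anyone following the thread who has not looked at the two split styles side by side, here is a minimal sketch in plain Python (not mlpack code) that contrasts an axis-aligned kd-style cut with a cut along a random direction.  The function names `kd_split` and `rp_split` are made up for illustration, and both use a plain median as the split value; this is an assumption for simplicity and does not reproduce either the RPMax or the RPMean rule from the PR.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalized Gaussians)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def median(values):
    """Median of a non-empty list of numbers."""
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else 0.5 * (s[mid - 1] + s[mid])


def kd_split(points):
    """Axis-aligned cut: median along the dimension with the widest spread."""
    dim = len(points[0])
    widths = [max(p[d] for p in points) - min(p[d] for p in points)
              for d in range(dim)]
    axis = widths.index(max(widths))
    cut = median([p[axis] for p in points])
    left = [p for p in points if p[axis] <= cut]
    right = [p for p in points if p[axis] > cut]
    return axis, cut, left, right


def rp_split(points, rng):
    """Oblique cut: median of the projections onto a random unit direction."""
    direction = random_unit_vector(len(points[0]), rng)
    proj = [sum(a * b for a, b in zip(p, direction)) for p in points]
    cut = median(proj)
    left = [p for p, t in zip(points, proj) if t <= cut]
    right = [p for p, t in zip(points, proj) if t > cut]
    return direction, cut, left, right


if __name__ == "__main__":
    rng = random.Random(42)
    data = [[rng.uniform(0.0, 1.0) for _ in range(8)] for _ in range(200)]
    axis, kd_cut, kd_left, kd_right = kd_split(data)
    direction, rp_cut, rp_left, rp_right = rp_split(data, rng)
    print("kd split on axis", axis, "sizes", len(kd_left), len(kd_right))
    print("rp split sizes", len(rp_left), len(rp_right))
```

Both functions partition the same points into two halves; the only difference is whether the cut is perpendicular to a coordinate axis or to a random direction.  How the resulting children are bounded, which is what drives the timing difference mentioned above, is a separate question that this sketch does not cover.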

I am not sure, in the end, how effective the random projection tree will be, but I think that it is good to have it implemented.  It seems there are not many studies of its empirical effectiveness (if you know of any, send some links!  I would be interested to read them).
