[mlpack-git] [mlpack/mlpack] Spill trees (#747)

Ryan Curtin notifications at github.com
Sun Aug 14 16:30:41 EDT 2016

> @@ -375,6 +375,9 @@ inline double NeighborSearchRules<SortPolicy, MetricType, TreeType>::Rescore(
>    if (oldScore == DBL_MAX)
>      return oldScore;
> +  if (oldScore == SortPolicy::BestDistance())
> +    return oldScore;

`SortPolicy::BestDistance()` is `DBL_MAX`, if `SortPolicy = FurthestNeighborSort`, so this statement would end up being meaningless.  I think that what you are meaning to do here is avoid calculating the bound if there is no possibility of pruning, but in that case the score should be `0.0`, so we should probably instead write this:

if (oldScore == DBL_MAX || oldScore == 0.0)
  return oldScore;

Two small notes:

1. I realized, while writing this comment, that the furthest neighbor search actually did not invert the distances before they were returned!  This means that the runtime of furthest neighbor search was far slower than it needed to be, since the traversal would always recurse into the worst node combination first!  I added some methods to `SortPolicy` to convert distances to and from scores in 1092b1c, so now furthest-neighbor-search is much faster.

2. Technically I think it is possible to prune if the score is 0, so this short-circuit may sometimes overlook those situations: for instance, if you have a nearest neighbor with distance 0, you don't need to look at any other nearest neighbors that could have a distance of 0---you already have a candidate with that distance.  But for our implementation here, bounds like B_2() mean that we cannot prune in those situations.  So my little footnote here is inapplicable to our situation, but I still wanted to point it out. :)

You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.cc.gatech.edu/pipermail/mlpack-git/attachments/20160814/29bcfe85/attachment-0001.html>

More information about the mlpack-git mailing list