[mlpack-git] [mlpack/mlpack] Spill trees (#747)
notifications at github.com
Wed Aug 10 15:45:20 EDT 2016
> + }
> + std::vector<size_t> leftPoints, rightPoints;
> + // Split the node.
> + overlappingNode = SplitPoints(tau, rho, points, leftPoints, rightPoints);
> + // We don't need the information in points, so lets clean it.
> + std::vector<size_t>().swap(points);
> + // Now we will recursively split the children by calling their constructors
> + // (which perform this splitting process).
> + left = new SpillTree(this, leftPoints, tau, maxLeafSize, rho);
> + right = new SpillTree(this, rightPoints, tau, maxLeafSize, rho);
> + // Update count number, to represent the number of descendant points.
> + count = left->NumDescendants() + right->NumDescendants();
I see what you mean; I did not understand what you wrote the first time. Sorry about that. I don't think holding a list of unique points in each node is a good solution, and I can't think of any other easy solution to this problem. Any way I can think of to organize the points and still get access to unique descendants from arbitrary levels requires caching potentially very much. So I think unfortunately we should go with your second solution---disable this type of tree for rank-approximate nearest neighbor search or other algorithms that depend on `NumDescendants()` being correct. Let's do that like this:
- add a `TreeTraits` element indicating whether `NumDescendants()` returns the number of unique descendants
- add a `static_assert` in `RASearch` to ensure that with the given tree type, the number of unique descendants is correctly returned by `NumDescendants()`, using that new `TreeTraits` element
What do you think? I think that's the best we can do here, unfortunately.
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the mlpack-git