[mlpack-git] [mlpack/mlpack] Spill trees (#747)

Ryan Curtin notifications at github.com
Mon Aug 8 17:06:36 EDT 2016


> +  }
> +
> +  std::vector<size_t> leftPoints, rightPoints;
> +  // Split the node.
> +  overlappingNode = SplitPoints(tau, rho, points, leftPoints, rightPoints);
> +
> +  // We don't need the information in points, so lets clean it.
> +  std::vector<size_t>().swap(points);
> +
> +  // Now we will recursively split the children by calling their constructors
> +  // (which perform this splitting process).
> +  left = new SpillTree(this, leftPoints, tau, maxLeafSize, rho);
> +  right = new SpillTree(this, rightPoints, tau, maxLeafSize, rho);
> +
> +  // Update count number, to represent the number of descendant points.
> +  count = left->NumDescendants() + right->NumDescendants();

Sometimes you want to sample descendant points from a node; rank-approximate nearest neighbor search (`src/mlpack/methods/rann/`) does this.  To do that, you sample `i` uniformly from [0, `NumDescendants()`) and take `Descendant(i)` as your random point.  But if descendants are not unique (that is, if some are double-counted), then you get a biased random sample.  Here, points in the spill region are counted once in each child, so they are twice as likely to be sampled as the other points.  Let me know if I can clarify further.
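
To make the bias concrete, here is a minimal sketch.  It is not the code in this PR: `ToyNode`, `SampleDescendant()`, and `SamplingProbabilities()` are hypothetical names for illustration, and the node just stores whatever `Descendant(i)` would return.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for a tree node (not the SpillTree in this PR): it
// only stores the point indices that Descendant(i) would return, in order.
struct ToyNode
{
  std::vector<std::size_t> descendants;

  std::size_t NumDescendants() const { return descendants.size(); }
  std::size_t Descendant(const std::size_t i) const { return descendants[i]; }
};

// Draw i uniformly from [0, NumDescendants()) and return Descendant(i).
// Assumes the node has at least one descendant.
template<typename RNG>
std::size_t SampleDescendant(const ToyNode& node, RNG& rng)
{
  std::uniform_int_distribution<std::size_t> dist(0,
      node.NumDescendants() - 1);
  return node.Descendant(dist(rng));
}

// Exact probability that SampleDescendant() returns each point, for point
// indices in [0, numPoints).  Every slot of the descendant list is equally
// likely, so a point listed twice gets twice the probability.
std::vector<double> SamplingProbabilities(const ToyNode& node,
                                          const std::size_t numPoints)
{
  std::vector<double> prob(numPoints, 0.0);
  for (std::size_t i = 0; i < node.NumDescendants(); ++i)
    prob[node.Descendant(i)] += 1.0 / node.NumDescendants();
  return prob;
}
```

Suppose the left child holds points {0, 1, 2, 3} and the right child holds {2, 3, 4, 5}, so the spill region is {2, 3}.  If the parent's descendants are simply the two lists concatenated (8 entries), then points 2 and 3 each come up with probability 2/8 and the rest with 1/8.  If the descendants are the 6 unique points, every point comes up with probability 1/6.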

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/747/files/a71b57caa90311f5542180bc0553449c3691395d#r73954733