<p>In <a href="https://github.com/mlpack/mlpack/pull/747#discussion_r73954733">src/mlpack/core/tree/spill_tree/spill_tree_impl.hpp</a>:</p>
<pre style='color:#555'>&gt; +  }
&gt; +
&gt; +  std::vector&lt;size_t&gt; leftPoints, rightPoints;
&gt; +  // Split the node.
&gt; +  overlappingNode = SplitPoints(tau, rho, points, leftPoints, rightPoints);
&gt; +
&gt; +  // We don&#39;t need the information in points, so lets clean it.
&gt; +  std::vector&lt;size_t&gt;().swap(points);
&gt; +
&gt; +  // Now we will recursively split the children by calling their constructors
&gt; +  // (which perform this splitting process).
&gt; +  left = new SpillTree(this, leftPoints, tau, maxLeafSize, rho);
&gt; +  right = new SpillTree(this, rightPoints, tau, maxLeafSize, rho);
&gt; +
&gt; +  // Update count number, to represent the number of descendant points.
&gt; +  count = left-&gt;NumDescendants() + right-&gt;NumDescendants();
</pre>
<p>Sometimes you want to sample descendant points from a node; rank-approximate nearest neighbor search (<code>src/mlpack/methods/rann/</code>) does this.  To do that, you sample <code>i</code> uniformly from [0, <code>NumDescendants()</code>) and take <code>Descendant(i)</code> as your random point.  But if descendants are not unique (that is, if some points are double-counted), the sample is biased.  Here, a point in the spill region is counted once under <code>left</code> and once under <code>right</code>, so it is twice as likely to be sampled as a point outside the spill region.  Let me know if I can clarify further.</p>
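<p>To make the bias concrete, here is a minimal standalone sketch.  It does not use mlpack's <code>SpillTree</code>; the helper names <code>ConcatDescendants()</code>, <code>UniqueDescendants()</code>, and <code>SampleCounts()</code> are hypothetical, and plain vectors of point indices stand in for the children's descendant lists.  Deduplicating with a <code>std::set</code> is only one possible way to get unique descendants, not necessarily what the PR should do.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <set>
#include <vector>

// Hypothetical stand-in for a node whose descendant count is
// left->NumDescendants() + right->NumDescendants(): the two lists are simply
// concatenated, so a point present in both children appears twice.
std::vector<std::size_t> ConcatDescendants(const std::vector<std::size_t>& left,
                                           const std::vector<std::size_t>& right)
{
  std::vector<std::size_t> all(left);
  all.insert(all.end(), right.begin(), right.end());
  return all;
}

// Hypothetical alternative: keep each descendant exactly once.
std::vector<std::size_t> UniqueDescendants(const std::vector<std::size_t>& left,
                                           const std::vector<std::size_t>& right)
{
  std::set<std::size_t> unique(left.begin(), left.end());
  unique.insert(right.begin(), right.end());
  return std::vector<std::size_t>(unique.begin(), unique.end());
}

// Draw i uniformly from [0, descendants.size()) `draws` times and count how
// often each point index is picked.
std::map<std::size_t, std::size_t> SampleCounts(
    const std::vector<std::size_t>& descendants,
    const std::size_t draws,
    const unsigned seed)
{
  assert(!descendants.empty());
  std::mt19937 gen(seed);
  std::uniform_int_distribution<std::size_t> dist(0, descendants.size() - 1);

  std::map<std::size_t, std::size_t> counts;
  for (std::size_t d = 0; d < draws; ++d)
    ++counts[descendants[dist(gen)]];
  return counts;
}
```

<p>With <code>left = {0, 1, 2}</code> and <code>right = {2, 3}</code>, point 2 plays the role of a spill-region point.  Sampling from the concatenated list should pick it roughly twice as often as point 0, while sampling from the deduplicated list should pick all four points at about the same rate.</p>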

