<p>Hi <a href="https://github.com/sumedhghaisas" class="user-mention">@sumedhghaisas</a> <a href="https://github.com/rcurtin" class="user-mention">@rcurtin</a>,</p>

<p>I have implemented Spill trees with axis-parallel splitting hyperplanes. I have made an effort to avoid duplicating existing code for neighbor search.</p>

<p>I created a new class <code>SpillSearch</code> that provides an interface similar to <code>NeighborSearch</code> but with an extension to properly set the tau parameter. It encapsulates an instance of <code>NeighborSearch</code>.</p>

<p>Also, I have implemented a new version of <code>NeighborSearchRules</code> specialized for Spill Trees, because I needed to modify the methods:</p>

<ul>
<li>
<code>Score()</code> to consider splitting hyperplanes for overlapping nodes.</li>
<li>
<code>CalculateBounds()</code> to ignore B_2 bound (we can not use B_2 bound for Spill Trees).</li>
</ul>

<h2>Single Tree Search:</h2>

<p>The <code>SingleTreeTraverser</code> is similar to the implementation for <code>BinarySpaceTree</code>.<br>
The difference is in the implementation of <code>NeighborSearchRules</code> for SpillTrees.<br>
When calculating the score of a query point and a reference node I consider 2 cases:</p>

<ul>
<li>If the reference node is non-overlapping, I calculate the score the same than before.</li>
<li>If the reference node is overlapping, I analyze the reference node's half space. If it contains the given query point, I return 0 (best score). Else, I return DBL_MAX (prune).</li>
</ul>

<h2>Dual Tree Search:</h2>

<p>The Query tree is built without overlapping.</p>

<p>When calculating the score of a query node and a reference node, I consider 2 cases:</p>

<ul>
<li>If the reference node is a non-overlapping node, I calculate the score the same as before.</li>
<li>If the reference node is a overlapping node, I analyze query node's bounding box. If it intersects the reference node's half space, I return 0 (best score). Else, I return DBL_MAX (prune).</li>
</ul>

<p>The <code>DualTreeTraverser</code> is slightly different to the implementation for <code>BinarySpaceTree</code>. When referenceNode is a overlapping node and we can't decide which child node to traverse, this means that queryNode is at both sides of the splitting hyperplane, we analyse the queryNode:</p>

<ul>
<li>If queryNode is a non-leaf node, I recurse down the query node.</li>
<li>If queryNode is a leaf node, I do single tree search for each point in the query node.</li>
</ul>

<p>The <code>DualTreeTraverser</code> is faster than the <code>SingleTreeTraverser</code>. Specially when the value of tau increases, because we will have more non-overlapping nodes which implies more time involved in backtracking.</p>

<p>The extension was incorporated to existing <em>mlpack_knn</em>. With actual implementation, we can use <em>"-t spill"</em> to consider spill trees and <em>"--tau 0.1"</em> to set different values for the overlapping size (default value is tau=0).</p>

<p>Every feedback is welcome! :)</p>

<p>Thanks</p>

<hr>

<h4>You can view, comment on, or merge this pull request online at:</h4>
<p>&nbsp;&nbsp;<a href='https://github.com/mlpack/mlpack/pull/747'>https://github.com/mlpack/mlpack/pull/747</a></p>

<h4>Commit Summary</h4>
<ul>
  <li>Add Spill Trees, based on binary space trees.</li>
  <li>Fix details.</li>
  <li>Improve expansion of node&#39;s bound. Do not include the overlapping buffer in the bounds of both nodes.</li>
  <li>Fix simple errors.</li>
  <li>Add support for Hybrid SP-Tree Search (Single tree traverser).</li>
  <li>Add support for spill trees in knn search.</li>
  <li>Syntax details.</li>
  <li>Fix error in the order of parameters.</li>
  <li>Add tests for spill tree.</li>
  <li>Add approximate knn search tests for hybrid spill trees.</li>
  <li>Add exact knn search tests for hybrid spill trees.</li>
  <li>Include missing headers.</li>
  <li>Use a simple tree traverser for spill tree (Remove defeatist seach included in single tree traverser).</li>
  <li>Implement defeatist search in the Rescore() method, with a specialization for Spill Trees.</li>
  <li>Remove unnecessary BoundType template parameter.</li>
  <li>Record splitDimension and splitValue.</li>
  <li>Set Overlapping node when percentage greater than rho but not because of</li>
  <li>Add NeighborSearchRules specialization for Spill Trees.</li>
  <li>Add dual tree traverser for Spill Trees.</li>
  <li>Set non-overlapping for spill query tree.</li>
  <li>Avoid calculating bounds when oldScore is the best possible.</li>
  <li>Remove unnecessary copying of dataset in SpillTrees.</li>
  <li>Create a new class SpillSearch that encapsulates an instance of NeighborSearch class, and adds the functionality to deal with spill trees.</li>
  <li>Update NSModel to use SpillSearch class.</li>
  <li>Log Num of BaseCases/Score.</li>
  <li>Add more tests for SpillSearch</li>
  <li>Update documentation.</li>
  <li>Remove B_2 bound for neighbor search with spill trees.</li>
  <li>Change spill rules&#39;s filename.</li>
  <li>Fix SpillTree&#39;s move constructor</li>
</ul>

<h4>File Changes</h4>
<ul>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-0">src/mlpack/core/tree/CMakeLists.txt</a>
    (9)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-1">src/mlpack/core/tree/binary_space_tree/mean_split.hpp</a>
    (16)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-2">src/mlpack/core/tree/binary_space_tree/mean_split_impl.hpp</a>
    (71)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-3">src/mlpack/core/tree/binary_space_tree/midpoint_split.hpp</a>
    (17)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-4">src/mlpack/core/tree/binary_space_tree/midpoint_split_impl.hpp</a>
    (65)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-5">src/mlpack/core/tree/spill_tree.hpp</a>
    (20)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-6">src/mlpack/core/tree/spill_tree/dual_tree_traverser.hpp</a>
    (94)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-7">src/mlpack/core/tree/spill_tree/dual_tree_traverser_impl.hpp</a>
    (384)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-8">src/mlpack/core/tree/spill_tree/single_tree_traverser.hpp</a>
    (63)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-9">src/mlpack/core/tree/spill_tree/single_tree_traverser_impl.hpp</a>
    (108)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-10">src/mlpack/core/tree/spill_tree/spill_tree.hpp</a>
    (440)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-11">src/mlpack/core/tree/spill_tree/spill_tree_impl.hpp</a>
    (683)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-12">src/mlpack/core/tree/spill_tree/traits.hpp</a>
    (60)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-13">src/mlpack/core/tree/spill_tree/typedef.hpp</a>
    (80)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-14">src/mlpack/methods/neighbor_search/CMakeLists.txt</a>
    (4)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-15">src/mlpack/methods/neighbor_search/knn_main.cpp</a>
    (26)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-16">src/mlpack/methods/neighbor_search/neighbor_search.hpp</a>
    (19)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-17">src/mlpack/methods/neighbor_search/neighbor_search_impl.hpp</a>
    (8)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-18">src/mlpack/methods/neighbor_search/neighbor_search_rules.hpp</a>
    (3)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-19">src/mlpack/methods/neighbor_search/neighbor_search_rules_impl.hpp</a>
    (3)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-20">src/mlpack/methods/neighbor_search/ns_model.hpp</a>
    (41)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-21">src/mlpack/methods/neighbor_search/ns_model_impl.hpp</a>
    (92)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-22">src/mlpack/methods/neighbor_search/spill_search.hpp</a>
    (289)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-23">src/mlpack/methods/neighbor_search/spill_search_impl.hpp</a>
    (213)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-24">src/mlpack/methods/neighbor_search/spill_search_rules.hpp</a>
    (228)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-25">src/mlpack/methods/neighbor_search/spill_search_rules_impl.hpp</a>
    (440)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-26">src/mlpack/tests/CMakeLists.txt</a>
    (1)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-27">src/mlpack/tests/aknn_test.cpp</a>
    (41)
  </li>
  <li>
    <strong>M</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-28">src/mlpack/tests/knn_test.cpp</a>
    (117)
  </li>
  <li>
    <strong>A</strong>
    <a href="https://github.com/mlpack/mlpack/pull/747/files#diff-29">src/mlpack/tests/spill_tree_test.cpp</a>
    (122)
  </li>
</ul>

<h4>Patch Links:</h4>
<ul>
  <li><a href='https://github.com/mlpack/mlpack/pull/747.patch'>https://github.com/mlpack/mlpack/pull/747.patch</a></li>
  <li><a href='https://github.com/mlpack/mlpack/pull/747.diff'>https://github.com/mlpack/mlpack/pull/747.diff</a></li>
</ul>

<p style="font-size:small;-webkit-text-size-adjust:none;color:#666;">&mdash;<br />You are receiving this because you are subscribed to this thread.<br />Reply to this email directly, <a href="https://github.com/mlpack/mlpack/pull/747">view it on GitHub</a>, or <a href="https://github.com/notifications/unsubscribe-auth/AJ4bFKx5x6KILwHiNX1peTsbpCYiI6iZks5qbim_gaJpZM4JZzLU">mute the thread</a>.<img alt="" height="1" src="https://github.com/notifications/beacon/AJ4bFETTRbjeusxQrw7ZnxeuUIlR5NBvks5qbim_gaJpZM4JZzLU.gif" width="1" /></p>
<div itemscope itemtype="http://schema.org/EmailMessage">
<div itemprop="action" itemscope itemtype="http://schema.org/ViewAction">
  <link itemprop="url" href="https://github.com/mlpack/mlpack/pull/747"></link>
  <meta itemprop="name" content="View Pull Request"></meta>
</div>
<meta itemprop="description" content="View this Pull Request on GitHub"></meta>
</div>