[mlpack-git] [mlpack/mlpack] Spill Trees implementation. (#728)

sumedhghaisas notifications at github.com
Wed Aug 3 01:54:46 EDT 2016


Hey Marcos,

I will look at yiur code tonight. I will also be available online so we can
discuss about it. Will you be free tonight??

Regards,
Sumedh Ghaisas

On 01-Aug-2016 10:46 PM, "MarcosPividori" <notifications at github.com> wrote:

> Hi @sumedhghaisas <https://github.com/sumedhghaisas> @rcurtin
> <https://github.com/rcurtin>,
>
> I have implemented Spill trees with axis-parallel splitting hyperplanes. I
> have made an effort to avoid duplicating existing code for neighbor search.
>
> I created a new class SpillSearch that provides an interface similar to
> NeighborSearch but with an extension to properly set the tau parameter.
> It encapsulates an instance of NeighborSearch.
>
> Also, I have implemented a new version of NeighborSearchRules specialized
> for Spill Trees, because I needed to modify the methods:
>
>    - Score() to consider splitting hyperplanes for overlapping nodes.
>    - CalculateBounds() to ignore B_2 bound (we can not use B_2 bound for
>    Spill Trees).
>
> Single Tree Search:
>
> The SingleTreeTraverser is similar to the implementation for
> BinarySpaceTree.
> The difference is in the implementation of NeighborSearchRules for
> SpillTrees.
> When calculating the score of a query point and a reference node I
> consider 2 cases:
>
>    - If the reference node is non-overlapping, I calculate the score the
>    same than before.
>    - If the reference node is overlapping, I analyze the reference node's
>    half space. If it contains the given query point, I return 0 (best score).
>    Else, I return DBL_MAX (prune).
>
> Dual Tree Search:
>
> The Query tree is built without overlapping.
>
> When calculating the score of a query node and a reference node, I
> consider 2 cases:
>
>    - If the reference node is a non-overlapping node, I calculate the
>    score the same as before.
>    - If the reference node is a overlapping node, I analyze query node's
>    bounding box. If it intersects the reference node's half space, I return 0
>    (best score). Else, I return DBL_MAX (prune).
>
> The DualTreeTraverser is slightly different to the implementation for
> BinarySpaceTree. When referenceNode is a overlapping node and we can't
> decide which child node to traverse, this means that queryNode is at both
> sides of the splitting hyperplane, we analyse the queryNode:
>
>    - If queryNode is a non-leaf node, I recurse down the query node.
>    - If queryNode is a leaf node, I do single tree search for each point
>    in the query node.
>
> The DualTreeTraverser is faster than the SingleTreeTraverser. Specially
> when the value of tau increases, because we will have more non-overlapping
> nodes which implies more time involved in backtracking.
>
> The extension was incorporated to existing *mlpack_knn*. With actual
> implementation, we can use *"-t spill"* to consider spill trees and *"--tau
> 0.1"* to set different values for the overlapping size (default value is
> tau=0).
>
> I have made a pull request with this implementation in: #747
> <https://github.com/mlpack/mlpack/pull/747>.
>
> If you agree, I plan to work next days in these topics:
>
>    - Implement another version of SpillTrees to consider
>    non-axis-parallel hyperplanes, using BallBound instead of HrectBound,
>    and holding a projection vector in each node.
>    - Add a command line flag *"--get_real_error"* to compare the
>    approximate neighbor search against the naive method and print the real
>    relative error, so we can test with differents values of tau and see the
>    difference.
>
> Every feedback is welcome! :)
>
> Thanks
>
>> You are receiving this because you were mentioned.
> Reply to this email directly, view it on GitHub
> <https://github.com/mlpack/mlpack/issues/728#issuecomment-236645481>, or mute
> the thread
> <https://github.com/notifications/unsubscribe-auth/AFC3QaeOekficTE9vjQxBLZS9OloQoxeks5qbinwgaJpZM4JQY7d>
> .
>


---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/issues/728#issuecomment-237148866
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.cc.gatech.edu/pipermail/mlpack-git/attachments/20160802/2593fa4c/attachment-0001.html>


More information about the mlpack-git mailing list