[mlpack-git] (blog) master: Fix details with items. (1d2b6b2)
gitdub at mlpack.org
gitdub at mlpack.org
Wed Aug 3 09:33:37 EDT 2016
Repository : https://github.com/mlpack/blog
On branch : master
Link : https://github.com/mlpack/blog/compare/8250d76a65e3c9cc8847b2427be2ecbf466d52e9...1d2b6b22d16ca9135d8bc73180ce3531c8a40cdb
>---------------------------------------------------------------
commit 1d2b6b22d16ca9135d8bc73180ce3531c8a40cdb
Author: MarcosPividori <marcos.pividori at gmail.com>
Date: Wed Aug 3 10:33:37 2016 -0300
Fix details with items.
>---------------------------------------------------------------
1d2b6b22d16ca9135d8bc73180ce3531c8a40cdb
content/blog/MarcosWeekTen.md | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/content/blog/MarcosWeekTen.md b/content/blog/MarcosWeekTen.md
index 4415e8c..121b442 100644
--- a/content/blog/MarcosWeekTen.md
+++ b/content/blog/MarcosWeekTen.md
@@ -8,7 +8,9 @@ Last week, I have implemented Spill trees with axis-orthogonal splitting hyperpl
I created a new class `SpillSearch` that provides an interface similar to `NeighborSearch` but with an extension to properly set the tau parameter. It encapsulates an instance of `NeighborSearch`.
Also, I have implemented a new version of `NeighborSearchRules` specialized for Spill Trees, because I needed to modify the methods:
+
+ `Score()` to consider splitting hyperplanes when calculating the score for overlapping nodes.
+
+ `CalculateBounds()` to ignore $B_2$ bound (we can not use $B_2$ bound for Spill Trees).
## Single Tree Search:
@@ -16,19 +18,25 @@ Also, I have implemented a new version of `NeighborSearchRules` specialized for
The `SingleTreeTraverser` is similar to the implementation for `BinarySpaceTree`.
The difference is in the implementation of `NeighborSearchRules` for SpillTrees.
When calculating the score of a query point and a reference node I consider 2 cases:
- + If the reference node is non-overlapping, I calculate the score the same than before.
- + If the reference node is overlapping, I analyze the reference node's half space. If it contains the given query point, I return 0 (best score). Else, I return DBL_MAX (prune). As we consider axis-orthogonal splitting hyperplanes, we can decide which child node to traverse in O(1), analising the appropiate dimension.
+
+ + If the reference node is non-overlapping, I calculate the score the same than before.
+
+ + If the reference node is overlapping, I analyze the reference node's half space. If it contains the given query point, I return `0` (best score). Else, I return `DBL_MAX` (prune). As we consider axis-orthogonal splitting hyperplanes, we can decide which child node to traverse in O(1), analising the appropiate dimension.
## Dual Tree Search:
The Query tree is built without overlapping.
When calculating the score of a query node and a reference node, I consider 2 cases:
- + If the reference node is a non-overlapping node, I calculate the score the same as before.
- + If the reference node is a overlapping node, I analyze query node's bounding box. If it intersects the reference node's half space, I return 0 (best score). Else, I return DBL_MAX (prune). As the reference spill tree is built with axis-orthogonal hyperplanes and the query tree considers hrect bounds, we can make this decision in O(1) time.
+
+ + If the reference node is a non-overlapping node, I calculate the score the same as before.
+
+ + If the reference node is a overlapping node, I analyze query node's bounding box. If it intersects the reference node's half space, I return `0` (best score). Else, I return `DBL_MAX` (prune). As the reference spill tree is built with axis-orthogonal hyperplanes and the query tree considers hrect bounds, we can make this decision in O(1) time.
The `DualTreeTraverser` is slightly different to the implementation for `BinarySpaceTree`. When referenceNode is a overlapping node and we can't decide which child node to traverse, this means that queryNode is at both sides of the splitting hyperplane, we analyse the queryNode:
+
+ If queryNode is a non-leaf node, I recurse down the query node.
+
+ If queryNode is a leaf node, I do single tree search for each point in the query node.
The `DualTreeTraverser` is faster than the `SingleTreeTraverser`, specially when the value of tau increases.
@@ -38,7 +46,9 @@ The extension was incorporated to existing *mlpack_knn*. With actual implementat
I have made a pull request with this implementation in: [[2]](https://github.com/mlpack/mlpack/pull/747).
I plan to work next days in these topics:
+
+ Implement another version of SpillTrees to consider general hyperplanes (not necessarily axis-orthogonal), using `BallBound` instead of `HrectBound`, and holding a projection vector in each node.
+
+ Add a command line flag *"--get_real_error"* to compare the approximate neighbor search against the naive method and print the real relative error, so we can test with differents values of tau and see the difference.
Follow the progress in: [[1]](https://github.com/MarcosPividori/mlpack/tree/spill-trees/src/mlpack/core/tree/spill_tree).
More information about the mlpack-git
mailing list