[mlpack-git] [mlpack/mlpack] Spill trees (#747)

MarcosPividori notifications at github.com
Mon Aug 15 13:36:14 EDT 2016


@rcurtin, Thanks for your feedback!

In response to the comment: https://github.com/mlpack/mlpack/pull/747#issuecomment-239602549
A `defeatist single tree traverser` would perform a `greedy search` that chooses the child with the best score and doesn't do backtracking. We could also call it: `SingleTreeGreedyTraverser`
The problem of using `GetNearestChild()` inside the traverser instead of a `RuleType` template parameter, is that that traverser will only be useful for *neighbor search*.
If we continue using the `Score()` function from the `RuleType` class, we can use it for other purposes such as: *Defeatist Farthest Neighbor search*, or any other algorithm that uses the single tree traverser and we want to provide a greedy alternative.

I think we could solve this problem this way:
+ Add a new method to the `RuleType` class:
```
TreeType& GetChildWithBestScore(
    const size_t queryIndex,
    const TreeType& referenceNode);
```
+ `SingleTreeGreedyTraverser` will always call `GetChildWithBestScore()` and continue with that child until it reaches a leaf node.
+ We could implement `NeighborSearchRules::GetChildWithBestScore()` as :
```
TreeType& NeighborSearchRules::GetChildWithBestScore(
    const size_t queryIndex,
    const TreeType& referenceNode)
{
  SortPolicy::GetBestChild(dataset.col(queryIndex), referenceNode);
}
```
```
TreeType& NearestNeighborSort::GetBestChild(
    const VecType& queryPoint,
    const TreeType& referenceNode)
{
   return referenceNode.GetNearestChild(queryPoint);
}

TreeType& FurthestNeighborSort::GetFurthestChild(
    const VecType& queryPoint,
    const TreeType& referenceNode)
{
   return referenceNode.GetFurthestChild(queryPoint);
}
```

And the trivial implementation of that method for other RuleTypes would be:
```
TreeType& SomeRuleType::GetChildWithBestScore(
    const size_t queryPoint,
    const TreeType& referenceNode)
{
  size_t best_child = size_t() - 1;
  size_t best_score = DBL_MAX;
  for (size_t i = 0; i < referenceNode.NumChildren(); i++)
  {
    double score = Score(queryPoint, referenceNode.Child(i));
    if (score < best_score)
    {
      best_score = score;
      best_child = i;
    }
  }
  return referenceNode.Child(best_child);
}
```

What do you think? This way we can take advantage of different tree types when it is possible to calculate which node a query point is closer to in less time. And, at the same time, we can use the `SingleTreeGreedyTraverser` for other purposes.

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/747#issuecomment-239870358
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.cc.gatech.edu/pipermail/mlpack-git/attachments/20160815/1cad5665/attachment.html>


More information about the mlpack-git mailing list