[mlpack-git] [mlpack/mlpack] Spill trees (#747)

Ryan Curtin notifications at github.com
Wed Aug 17 16:51:18 EDT 2016

> +         typename MatType,
> +         template<typename HyperplaneMetricType> class HyperplaneType,
> +         template<typename SplitBoundT, typename SplitMatT> class SplitType>
> +void SpillSearch<MetricType, MatType, HyperplaneType, SplitType>::
> +Search(const MatType& querySet,
> +       const size_t k,
> +       arma::Mat<size_t>& neighbors,
> +       arma::mat& distances)
> +{
> +  if (Naive() || SingleMode())
> +    neighborSearch.Search(querySet, k, neighbors, distances);
> +  else
> +  {
> +    // For Dual Tree Search on SpillTrees, the queryTree must be built with non
> +    // overlapping (tau = 0).
> +    Tree queryTree(querySet, 0 /* tau */, leafSize, rho);

When I originally designed the `NeighborSearch` class, my intention was to make it as tree-independent as possible (or, well, that's what happened in the end).  So it doesn't hold any `leafSize` parameter, even though `BinarySpaceTree` uses one.  The policy has always been that `NeighborSearch` builds your trees with the default parameters, and if you don't want the defaults, you must build the trees yourself and pass them in.  I think we should continue that policy here, so I don't think it's advisable to store `leafSize`, `rho`, and `tau` in the class.  If the user wants to specify those, they can build the tree themselves.  (This means that `NSModel` will need to store those parameters if they can be specified from the command line.)
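To make that policy concrete, here is a minimal sketch.  `ToyTree` and `ToySearch` are hypothetical stand-ins, not the real mlpack classes, and the default leaf size of 20 is an arbitrary placeholder.  The point is only that the searcher stores a tree and no tree-building parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tree type with one tunable build parameter.
struct ToyTree
{
  std::vector<double> points;
  std::size_t leafSize;

  // The default of 20 is an arbitrary placeholder for this sketch.
  explicit ToyTree(std::vector<double> pts, const std::size_t leaf = 20) :
      points(std::move(pts)), leafSize(leaf) { }
};

// Hypothetical searcher: it holds a tree, but no leafSize/rho/tau members.
template<typename TreeType>
class ToySearch
{
 public:
  // Convenience path: build the reference tree with the tree's defaults.
  explicit ToySearch(std::vector<double> referenceSet) :
      referenceTree(std::move(referenceSet)) { }

  // Custom path: the caller builds the tree with whatever parameters they
  // want and hands it over.
  explicit ToySearch(TreeType tree) : referenceTree(std::move(tree)) { }

  const TreeType& ReferenceTree() const { return referenceTree; }

 private:
  TreeType referenceTree;
};

// Usage: the first searcher gets default parameters, the second gets a
// tree the caller configured.
inline void DemoTreePolicy()
{
  ToySearch<ToyTree> defaults(std::vector<double>{1.0, 2.0, 3.0});
  assert(defaults.ReferenceTree().leafSize == 20);

  ToySearch<ToyTree> custom(ToyTree({1.0, 2.0, 3.0}, 5));
  assert(custom.ReferenceTree().leafSize == 5);
}
```

With this shape, anything that wants to expose tree parameters (a command-line wrapper, for example) has to hold them itself and build the tree before constructing the searcher.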

For the traversers, one idea is to add two more template parameters to `NeighborSearch` that specify the desired single-tree and dual-tree traversers (defaulting to the traversers that the TreeType itself provides).

Then we could typedef `SpillTreeKNN = NeighborSearch<EuclideanDistance, arma::mat, SPTree, SPTree::DefeatistSingleTreeTraverser, SPTree::DefeatistDualTreeTraverser>` (or whatever the names and syntax may be).  This, plus the template specialization I suggested for building the query tree for spill tree search, should be sufficient to avoid the need for the `SpillSearch` class at all, and keep the number of files and classes down.  What do you think?  Have I overlooked a detail?
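A rough sketch of the traverser-parameter idea follows.  Everything here is a toy: `ToyKDTree`, `ToySpillTree`, `ToyRules`, and `ToyNeighborSearch` are hypothetical names, the traversers only report a label, and the real template parameter list of `NeighborSearch` is longer.  It is meant to show the defaulting mechanism, not the actual signature.

```cpp
#include <cassert>
#include <string>

// Placeholder rule type; a real search would put its scoring logic here.
struct ToyRules { };

// Hypothetical tree that only provides the default traversers.
struct ToyKDTree
{
  template<typename RuleType>
  struct SingleTreeTraverser
  {
    static std::string Name() { return "kd-single"; }
  };

  template<typename RuleType>
  struct DualTreeTraverser
  {
    static std::string Name() { return "kd-dual"; }
  };
};

// Hypothetical spill tree: default traversers plus defeatist variants.
struct ToySpillTree
{
  template<typename RuleType>
  struct SingleTreeTraverser
  {
    static std::string Name() { return "spill-single"; }
  };

  template<typename RuleType>
  struct DualTreeTraverser
  {
    static std::string Name() { return "spill-dual"; }
  };

  template<typename RuleType>
  struct DefeatistSingleTreeTraverser
  {
    static std::string Name() { return "defeatist-single"; }
  };

  template<typename RuleType>
  struct DefeatistDualTreeTraverser
  {
    static std::string Name() { return "defeatist-dual"; }
  };
};

// The two traverser parameters default to whatever the tree type provides.
template<typename TreeType,
         template<typename RuleType> class SingleTraverser =
             TreeType::template SingleTreeTraverser,
         template<typename RuleType> class DualTraverser =
             TreeType::template DualTreeTraverser>
struct ToyNeighborSearch
{
  static std::string SingleName() { return SingleTraverser<ToyRules>::Name(); }
  static std::string DualName() { return DualTraverser<ToyRules>::Name(); }
};

// Alias that swaps in the defeatist traversers; no separate class needed.
using ToySpillKNN = ToyNeighborSearch<ToySpillTree,
    ToySpillTree::DefeatistSingleTreeTraverser,
    ToySpillTree::DefeatistDualTreeTraverser>;

// Usage: defaults come from the tree; the alias overrides both traversers.
inline void DemoTraverserDefaults()
{
  assert(ToyNeighborSearch<ToyKDTree>::DualName() == "kd-dual");
  assert(ToySpillKNN::SingleName() == "defeatist-single");
}
```

Existing instantiations that name only the tree type keep compiling unchanged, since the new parameters come last and have defaults.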
