[mlpack-svn] [MLPACK] #228: allknn and allkfn are slower using SingleTreeDepthFirstTraverser
MLPACK Trac
trac at coffeetalk-1.cc.gatech.edu
Wed May 2 12:34:55 EDT 2012
#228: allknn and allkfn are slower using SingleTreeDepthFirstTraverser
---------------------+------------------------------------------------------
Reporter: rcurtin | Owner:
Type: defect | Status: new
Priority: major | Milestone: mlpack 1.1.0
Component: mlpack | Keywords: single tree allknn allkfn traverser
Blocking: | Blocked By:
---------------------+------------------------------------------------------
The introduction of the `SingleTreeDepthFirstTraverser` is a good start in
r12600, but its application to `NeighborSearch` (r12601) is not as good.
Running single-tree kNN on a dataset of size 500x784 before r12601:
{{{
[INFO ] Program timers:
[INFO ] computing_neighbors: 1.034682s
[INFO ] total_time: 1.584380s
[INFO ] tree_building: 0.509508s
}}}
After:
{{{
[INFO ] Program timers:
[INFO ] computing_neighbors: 1.865637s
[INFO ] total_time: 2.412491s
[INFO ] tree_building: 0.507201s
}}}
Repeated runs show that it is a pretty constant factor of increase of
about 1.8 to 1.9. This isn't optimal.
I am somewhat certain this has to do with the choice of order of recursion
that the `SingleTreeDepthFirstTraverser` uses: it just recurses into child
0, then child 1, then child 2, and so forth, whereas the previous single-
tree implementation recursed into the closest child first.
An abstraction should be developed that allows the `RuleType` class to
specify the order in which to recurse, but this also has to be implemented
in an efficient manner so that distance calculations are not being
needlessly re-performed.
--
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/228>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed by the FASTLAB at Georgia Tech under Dr. Alex Gray.
More information about the mlpack-svn
mailing list