[mlpack-svn] [MLPACK] #228: allknn and allkfn are slower using SingleTreeDepthFirstTraverser

MLPACK Trac trac at coffeetalk-1.cc.gatech.edu
Wed May 2 12:34:55 EDT 2012


#228: allknn and allkfn are slower using SingleTreeDepthFirstTraverser
---------------------+------------------------------------------------------
 Reporter:  rcurtin  |        Owner:                                     
     Type:  defect   |       Status:  new                                
 Priority:  major    |    Milestone:  mlpack 1.1.0                       
Component:  mlpack   |     Keywords:  single tree allknn allkfn traverser
 Blocking:           |   Blocked By:                                     
---------------------+------------------------------------------------------
 The introduction of the `SingleTreeDepthFirstTraverser` is a good start in
 r12600, but its application to `NeighborSearch` (r12601) is not as good.

 Running single-tree kNN on a dataset of size 500x784 before r12601:

 {{{
 [INFO ] Program timers:
 [INFO ]   computing_neighbors: 1.034682s
 [INFO ]   total_time: 1.584380s
 [INFO ]   tree_building: 0.509508s
 }}}

 After:

 {{{
 [INFO ] Program timers:
 [INFO ]   computing_neighbors: 1.865637s
 [INFO ]   total_time: 2.412491s
 [INFO ]   tree_building: 0.507201s
 }}}

 Repeated runs show that it is a pretty constant factor of increase of
 about 1.8 to 1.9.  This isn't optimal.

 I am somewhat certain this has to do with the choice of order of recursion
 that the `SingleTreeDepthFirstTraverser` uses: it just recurses into child
 0, then child 1, then child 2, and so forth, whereas the previous single-
 tree implementation recursed into the closest child first.

 An abstraction should be developed that allows the `RuleType` class to
 specify the order in which to recurse, but this also has to be implemented
 in an efficient manner so that distance calculations are not being
 needlessly re-performed.

-- 
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/228>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed by the FASTLAB at Georgia Tech under Dr. Alex Gray.


More information about the mlpack-svn mailing list