[mlpack-svn] [MLPACK] #243: Dual cover tree traverser is quite slow

MLPACK Trac trac at coffeetalk-1.cc.gatech.edu
Mon Feb 10 08:44:04 EST 2014


#243: Dual cover tree traverser is quite slow
------------------------------------------------------------+---------------
  Reporter:  rcurtin                                        |        Owner:  rcurtin     
      Type:  defect                                         |       Status:  closed      
  Priority:  major                                          |    Milestone:  mlpack 1.0.9
 Component:  mlpack                                         |   Resolution:  fixed       
  Keywords:  cover tree, dualtreetraverser, neighborsearch  |     Blocking:              
Blocked By:  264                                            |  
------------------------------------------------------------+---------------
Changes (by rcurtin):

  * status:  accepted => closed
  * resolution:  => fixed


Comment:

 Using the new TraversalInfo concept committed in r16217 through r16230,
 the child-parent prunes are now possible and the results are now (on
 test_data_3_1000.csv):

  * jl-constructed cover tree, jl dual-tree traversal: 86818
  * jl-constructed cover tree, mlpack dual-tree traversal: 86524
  * mlpack-constructed cover tree, mlpack dual-tree traversal: 107561

 And runtime results for the same dataset:

  * jl-constructed cover tree, jl dual-tree traversal: 0.021s
  * jl-constructed cover tree, mlpack dual-tree traversal: 0.091s
  * mlpack-constructed cover tree, mlpack dual-tree traversal: 0.036s

 Then, on a larger dataset, the corel dataset (32x37748) (one duplicate
 point had to be removed):

  * jl-constructed cover tree, jl dual-tree traversal: 157 792 508
  * jl-constructed cover tree, mlpack dual-tree traversal: 124 067 723
  * mlpack-constructed cover tree, mlpack dual-tree traversal: 146 564 813

 And runtimes:

  * jl-constructed cover tree, jl dual-tree traversal: 18.679s
  * jl-constructed cover tree, mlpack dual-tree traversal: 247.655s
  * mlpack-constructed cover tree, mlpack dual-tree traversal: 75.117s

 I am not sure the runtime numbers for the jl-constructed cover tree and
 mlpack dual-tree traversal are reasonable, because the tree needs to be
 reformatted and this is done inefficiently.  Additional slowdown is
 probably present because the bound calculation
 (NeighborSearchRules::CalculateBound()) is slow to calculate and should be
 improved.  But that is outside the scope of this ticket because this
 ticket is only about the traversal, so I can (finally) mark this resolved.

-- 
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/243#comment:13>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.


More information about the mlpack-svn mailing list