[mlpack-svn] [MLPACK] #243: Dual cover tree traverser is quite slow
MLPACK Trac
trac at coffeetalk-1.cc.gatech.edu
Mon Feb 10 08:44:04 EST 2014
#243: Dual cover tree traverser is quite slow
------------------------------------------------------------+---------------
Reporter: rcurtin | Owner: rcurtin
Type: defect | Status: closed
Priority: major | Milestone: mlpack 1.0.9
Component: mlpack | Resolution: fixed
Keywords: cover tree, dualtreetraverser, neighborsearch | Blocking:
Blocked By: 264 |
------------------------------------------------------------+---------------
Changes (by rcurtin):
* status: accepted => closed
* resolution: => fixed
Comment:
Using the new TraversalInfo concept committed in r16217 through r16230,
the child-parent prunes are now possible and the results are now (on
test_data_3_1000.csv):
* jl-constructed cover tree, jl dual-tree traversal: 86818
* jl-constructed cover tree, mlpack dual-tree traversal: 86524
* mlpack-constructed cover tree, mlpack dual-tree traversal: 107561
And runtime results for the same dataset:
* jl-constructed cover tree, jl dual-tree traversal: 0.021s
* jl-constructed cover tree, mlpack dual-tree traversal: 0.091s
* mlpack-constructed cover tree, mlpack dual-tree traversal: 0.036s
Then, on a larger dataset, the corel dataset (32x37748) (one duplicate
point had to be removed):
* jl-constructed cover tree, jl dual-tree traversal: 157 792 508
* jl-constructed cover tree, mlpack dual-tree traversal: 124 067 723
* mlpack-constructed cover tree, mlpack dual-tree traversal: 146 564 813
And runtimes:
* jl-constructed cover tree, jl dual-tree traversal: 18.679s
* jl-constructed cover tree, mlpack dual-tree traversal: 247.655s
* mlpack-constructed cover tree, mlpack dual-tree traversal: 75.117s
I am not sure the runtime numbers for the jl-constructed cover tree and
mlpack dual-tree traversal are reasonable, because the tree needs to be
reformatted and this is done inefficiently. Additional slowdown is
probably present because the bound calculation
(NeighborSearchRules::CalculateBound()) is slow to calculate and should be
improved. But that is outside the scope of this ticket because this
ticket is only about the traversal, so I can (finally) mark this resolved.
--
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/243#comment:13>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.
More information about the mlpack-svn
mailing list