[mlpack-svn] [MLPACK] #371: KNN Search on the Torus
MLPACK Trac
trac at coffeetalk-1.cc.gatech.edu
Mon Nov 17 17:01:39 EST 2014
#371: KNN Search on the Torus
-----------------------------+----------------------------------------------
Reporter: georgehagstrom | Owner:
Type: defect | Status: closed
Priority: major | Milestone:
Component: mlpack | Resolution: fixed
Keywords: Torus | Blocking: 83
Blocked By: |
-----------------------------+----------------------------------------------
Changes (by rcurtin):
* status: new => closed
* resolution: => fixed
* blocking: => 83
Comment:
Okay, I've taken a look at the program and noticed first that LMetricTorus
is not a metric. It returns the squared distance, which may not
necessarily satisfy the triangle inequality. I haven't done the math for
the case of the torus, but at least the squared L2 distance doesn't
satisfy the triangle inequality. The fix is simple: add a call to
`sqrt()` in the return statement. If I'm wrong, this comment can easily
be ignored...
The larger issue of why the algorithm does not work is a little in-depth.
With kd-trees, the pruning during nearest neighbor search is done using a
function MinDistance(a, b) which uses the bounding hyper-rectangle of the
two nodes a and b. The class which implements this, HRectBound, assumes
an Lp-space. For the case of the torus metric you are using, then,
MinDistance() will return incorrect results.
The difficulty in using non-Lp-metrics with trees like the kd-tree is that
the kd-tree must know how to split in each dimension, and it is hard to
make code generic enough to work with a large class of metrics that is
also fast. #83 references this issue, and #30 contains more information
on the PeriodicHRectBound which potentially would have worked in this
case. Unfortunately, the PeriodicHRectBound class did not work so it no
longer exists...
However, the good news is that I got this to work with the cover tree. My
suggestion was not correct; instead the invocation of the NeighborSearch
object should look like this:
{{{
NeighborSearch<NearestNeighborSort,LMetricTorus<&L>,
tree::CoverTree<LMetricTorus<&L>,
tree::FirstPointIsRoot, NeighborSearchStat<NearestNeighborSort> > >
nn(positions);
}}}
and in this case I get the following output:
{{{
Nearest neighbor to (.1,50): 99, x position of near neighbor: 99.9, y
position of near neigbor: 50, distance: 0.2
Nearest neighbor to (99.9,50): 98, x position of near neighbor: 0.1, y
position of near neigbor: 50, distance: 0.2
}}}
So maybe in this case the cover tree will work for your experiments.
I hope this is helpful... if I haven't answered something or explained
something adequately, please let me know.
Thanks for reporting this issue! I am glad to see people using mlpack's
tree-based algorithms in non-L2 space and I'm interested in extending the
support. As a result, I've listed this ticket as blocking #83, which I
will refactor to note that the class of metrics that BinarySpaceTree deals
with is limited and should be generalized somewhat.
--
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/371#comment:2>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.
More information about the mlpack-svn
mailing list