[mlpack-svn] [MLPACK] #371: KNN Search on the Torus

MLPACK Trac trac at coffeetalk-1.cc.gatech.edu
Mon Nov 17 17:01:39 EST 2014


#371: KNN Search on the Torus
-----------------------------+----------------------------------------------
  Reporter:  georgehagstrom  |        Owner:        
      Type:  defect          |       Status:  closed
  Priority:  major           |    Milestone:        
 Component:  mlpack          |   Resolution:  fixed 
  Keywords:  Torus           |     Blocking:  83    
Blocked By:                  |  
-----------------------------+----------------------------------------------
Changes (by rcurtin):

  * status:  new => closed
  * resolution:  => fixed
  * blocking:  => 83


Comment:

 Okay, I've taken a look at the program and noticed first that LMetricTorus
 is not a metric.  It returns the squared distance, which may not
 necessarily satisfy the triangle inequality.  I haven't done the math for
 the case of the torus, but at least the squared L2 distance doesn't
 satisfy the triangle inequality.  The fix is simple: add a call to
 `sqrt()` in the return statement.  If I'm wrong, this comment can easily
 be ignored...

 The larger issue of why the algorithm does not work is a little in-depth.
 With kd-trees, the pruning during nearest neighbor search is done using a
 function MinDistance(a, b) which uses the bounding hyper-rectangle of the
 two nodes a and b.  The class which implements this, HRectBound, assumes
 an Lp-space.  For the case of the torus metric you are using, then,
 MinDistance() will return incorrect results.

 The difficulty in using non-Lp-metrics with trees like the kd-tree is that
 the kd-tree must know how to split in each dimension, and it is hard to
 make code generic enough to work with a large class of metrics that is
 also fast.  #83 references this issue, and #30 contains more information
 on the PeriodicHRectBound which potentially would have worked in this
 case.  Unfortunately, the PeriodicHRectBound class did not work so it no
 longer exists...

 However, the good news is that I got this to work with the cover tree.  My
 suggestion was not correct; instead the invocation of the NeighborSearch
 object should look like this:

 {{{
 NeighborSearch<NearestNeighborSort,LMetricTorus<&L>,
 tree::CoverTree<LMetricTorus<&L>,
 tree::FirstPointIsRoot, NeighborSearchStat<NearestNeighborSort> > >
 nn(positions);
 }}}

 and in this case I get the following output:

 {{{
  Nearest neighbor to (.1,50): 99, x position of near neighbor: 99.9, y
 position of near neigbor: 50, distance: 0.2
  Nearest neighbor to (99.9,50): 98, x position of near neighbor: 0.1, y
 position of near neigbor: 50, distance: 0.2
 }}}

 So maybe in this case the cover tree will work for your experiments.

 I hope this is helpful... if I haven't answered something or explained
 something adequately, please let me know.

 Thanks for reporting this issue!  I am glad to see people using mlpack's
 tree-based algorithms in non-L2 space and I'm interested in extending the
 support.  As a result, I've listed this ticket as blocking #83, which I
 will refactor to note that the class of metrics that BinarySpaceTree deals
 with is limited and should be generalized somewhat.

-- 
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/371#comment:2>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.


More information about the mlpack-svn mailing list