[mlpack-svn] [MLPACK] #235: Templatize splitting procedure for BinarySpaceTree

MLPACK Trac trac at coffeetalk-1.cc.gatech.edu
Sun Mar 9 07:17:33 EDT 2014


#235: Templatize splitting procedure for BinarySpaceTree
----------------------------------------------------------+-----------------
  Reporter:  rcurtin                                      |        Owner:              
      Type:  enhancement                                  |       Status:  new         
  Priority:  minor                                        |    Milestone:  mlpack 1.1.0
 Component:  mlpack                                       |   Resolution:              
  Keywords:  binary space tree, splitting, tree building  |     Blocking:  234         
Blocked By:                                               |  
----------------------------------------------------------+-----------------

Comment (by yashdv):

 I have edited the template for BinarySpaceTree to include an additional
 parameter SplitType.

 The current splitting procedure is now implemented by the HalfSplit class.
 I have removed GetSplitIndex from BinarySpaceTree, but not SplitNode
 because it is used by the constructors when building the tree recursively.
 Additionally, there is some code in SplitNode which sets some tree
 variables, so we don't want that part removed.

 We could move the code in SplitNode into the constructors, but that would
 lead to code duplication. The splitting part of SplitNode is now done via
 SplitType's methods.
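
 As a rough illustration of that division of labour (not the actual patch),
 the sketch below keeps a SplitNode method in the tree that sets the tree's
 own variables and recurses, and delegates the partitioning itself to a
 SplitType template parameter. TreeSketch, MedianSplitSketch, the 1-D
 std::vector dataset and every signature here are assumptions made for the
 example; they are not mlpack's API.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical split policy for 1-D data: reorder the points so that the
// smaller half comes first. It knows nothing about the tree itself.
struct MedianSplitSketch
{
  // Returns false if there are too few points to split; otherwise reorders
  // data[begin, begin + count) and stores the first index of the right half.
  static bool SplitNode(std::vector<double>& data, std::size_t begin,
                        std::size_t count, std::size_t& splitCol)
  {
    if (count < 2)
      return false;
    splitCol = begin + count / 2;
    auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
    auto nth = data.begin() + static_cast<std::ptrdiff_t>(splitCol);
    auto last = first + static_cast<std::ptrdiff_t>(count);
    std::nth_element(first, nth, last);
    return true;
  }
};

// Hypothetical tree: the constructor calls SplitNode, which handles the tree
// bookkeeping and asks SplitType to do the actual partitioning.
template<typename SplitType>
class TreeSketch
{
 public:
  TreeSketch(std::vector<double>& data, std::size_t beginIn,
             std::size_t countIn, std::size_t leafSize)
    : begin(beginIn), count(countIn)
  {
    SplitNode(data, leafSize);
  }

  std::size_t NumLeaves() const
  {
    return left ? left->NumLeaves() + right->NumLeaves() : 1;
  }

 private:
  void SplitNode(std::vector<double>& data, std::size_t leafSize)
  {
    if (count <= leafSize)
      return; // Small enough to stay a leaf.

    std::size_t splitCol = 0;
    if (!SplitType::SplitNode(data, begin, count, splitCol))
      return; // The policy could not split these points.

    // Tree-side work stays here: build the children recursively.
    left.reset(new TreeSketch(data, begin, splitCol - begin, leafSize));
    right.reset(
        new TreeSketch(data, splitCol, begin + count - splitCol, leafSize));
  }

  std::size_t begin;
  std::size_t count;
  std::unique_ptr<TreeSketch> left;
  std::unique_ptr<TreeSketch> right;
};
```

 With this layout, changing the splitting rule means writing another policy
 class with the same static SplitNode signature and instantiating
 TreeSketch with it; the constructors and the recursion stay untouched.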

 SplitType does not need any information about the tree except for the
 dataset it is using. I have also included the bound, because it helps in
 the splitting process (e.g., to calculate splitDimension).

 Regarding the kind of interface SplitType needs to have, I think it is
 essential to provide methods that tell us about the split dimension, the
 split column, and whether the points were actually split.
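
 One possible shape for such an interface is sketched below, purely as an
 illustration. The class name MidpointSplitSketch, the Range/Bound/Dataset
 stand-in types, the accessor names and the midpoint rule are all
 assumptions made for this example; they are not taken from the patch or
 from mlpack.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, not mlpack types: the dataset is a list of points
// and the bound is one [lo, hi] range per dimension.
using Point = std::vector<double>;
using Dataset = std::vector<Point>;
struct Range { double lo; double hi; };
using Bound = std::vector<Range>;

// Hypothetical split policy: split at the midpoint of the widest dimension.
class MidpointSplitSketch
{
 public:
  // Reorders data[begin, begin + count) so that points below the midpoint
  // come first. Returns true only if both sides end up non-empty.
  bool SplitNode(const Bound& bound, Dataset& data,
                 std::size_t begin, std::size_t count)
  {
    // Pick the dimension in which the bound is widest.
    splitDimension = 0;
    double maxWidth = 0.0;
    for (std::size_t d = 0; d < bound.size(); ++d)
    {
      const double width = bound[d].hi - bound[d].lo;
      if (width > maxWidth)
      {
        maxWidth = width;
        splitDimension = d;
      }
    }
    if (count < 2 || maxWidth == 0.0)
      return false; // Nothing to split.

    const std::size_t dim = splitDimension;
    const double mid = (bound[dim].lo + bound[dim].hi) / 2.0;

    // Partition the points in place around the midpoint.
    auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
    auto last = first + static_cast<std::ptrdiff_t>(count);
    auto middle = std::partition(first, last,
        [dim, mid](const Point& p) { return p[dim] < mid; });

    splitCol = static_cast<std::size_t>(middle - data.begin());
    return splitCol > begin && splitCol < begin + count;
  }

  // Dimension chosen by the last call to SplitNode.
  std::size_t SplitDimension() const { return splitDimension; }
  // Index of the first point of the right-hand side after the last call.
  std::size_t SplitCol() const { return splitCol; }

 private:
  std::size_t splitDimension = 0;
  std::size_t splitCol = 0;
};
```

 The boolean return value covers the "were the points actually split"
 question: a caller would only create child nodes when it is true, and
 would then read SplitDimension() and SplitCol() to set them up.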

 Any suggestions on a better interface/implementation are welcome :)

-- 
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/235#comment:1>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.

