[mlpack] #251

Ryan Curtin gth671b at mail.gatech.edu
Fri Jan 17 15:05:44 EST 2014


On Fri, Jan 17, 2014 at 12:07:13PM +0530, Abhinav Pathak wrote:
> Hey,
> Sorry for the delay; I will be continuing with the work now.

No worries, there's no hurry.

> So basically I'm going to move the split node function out of the
> BinarySpaceTree class and put it in a new function. I wanted to ask
> whether I should create a new class which would, for now, split the data
> as it is done now; later we can add more methods for splitting the data to
> this new class.
> But can I leave SplitNode in the BinarySpaceTree class and, inside it,
> remove the splitting code and instead instantiate an object of the new
> class and use its splitting method? Is that the correct approach?

So the approach here is policy-based design [1].  The idea is that we
give the user flexibility by allowing them to define their own policy
for how the splitting occurs.  So the BinarySpaceTree template signature
will look like this...

template<
    typename BoundType,
    typename MetricType,
    typename StatisticType,
    typename SplitType
>
class BinarySpaceTree;

And then the user can choose whatever class they want for SplitType,
provided it has a function

  void SplitNode(...).

Then, in the BinarySpaceTree code, instead of calling
BinarySpaceTree::SplitNode(), call SplitType::SplitNode().  Each
SplitType class should not need to know anything about the tree itself,
just that it needs to split the points it is given into a left and right
set according to some heuristic in a particular dimension.
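
As a rough sketch of the delegation (not the actual mlpack code): the
names Point, ToyTree, and ThresholdAtZeroSplit below are made up for
illustration, the SplitNode() parameters are an assumption, and the toy
tree takes only the SplitType parameter to keep things short.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical point type: one coordinate per dimension.
using Point = std::vector<double>;

// Hypothetical split policy.  It knows nothing about the tree; it only
// partitions the points it is given along one dimension.
struct ThresholdAtZeroSplit
{
  static void SplitNode(const std::vector<Point>& points,
                        const std::size_t dim,
                        std::vector<Point>& left,
                        std::vector<Point>& right)
  {
    for (const Point& p : points)
    {
      assert(dim < p.size());
      // Positive coordinates go right; everything else goes left.
      (p[dim] > 0.0 ? right : left).push_back(p);
    }
  }
};

// Stripped-down stand-in for the tree.  The real class would also take
// BoundType, MetricType, and StatisticType.
template<typename SplitType>
class ToyTree
{
 public:
  ToyTree(const std::vector<Point>& points, const std::size_t dim)
  {
    // The tree delegates to the policy instead of its own SplitNode().
    SplitType::SplitNode(points, dim, left, right);
  }

  std::vector<Point> left;
  std::vector<Point> right;
};
```

A user would then write something like ToyTree<ThresholdAtZeroSplit>,
and swapping in a different policy only means changing that one template
argument.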

Right now the BinarySpaceTree::SplitNode() function implements mean
split: in whichever dimension it is splitting the points on, it finds
the mean of the points in that dimension, then any point above the mean
goes to the right and every other point goes to the left.  So you could
move that into a MeanSplit class.
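
A minimal sketch of what such a class might look like, assuming points
are stored as plain std::vector<double> (the Point alias and the
SplitNode() signature here are assumptions for illustration, not what
BinarySpaceTree uses today):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical point type: one coordinate per dimension.
using Point = std::vector<double>;

// Sketch of a mean-split policy.
class MeanSplit
{
 public:
  static void SplitNode(const std::vector<Point>& points,
                        const std::size_t dim,
                        std::vector<Point>& left,
                        std::vector<Point>& right)
  {
    assert(!points.empty());

    // Find the mean of the points in the splitting dimension.
    double sum = 0.0;
    for (const Point& p : points)
    {
      assert(dim < p.size());
      sum += p[dim];
    }
    const double mean = sum / points.size();

    // Anything above the mean goes right; everything else goes left.
    for (const Point& p : points)
    {
      if (p[dim] > mean)
        right.push_back(p);
      else
        left.push_back(p);
    }
  }
};
```

Note that a point exactly equal to the mean ends up on the left in this
sketch.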

If you wanted, you could also make a MedianSplit class, which splits at
the median instead of the mean and which a user could plug in the same
way.
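
One possible sketch of a median split, using std::nth_element to put the
lower half of the points (by rank in the splitting dimension) on the
left and the rest on the right.  As before, the Point alias and the
SplitNode() signature are assumptions for illustration only, and
splitting by rank is just one way to define it: points tied with the
median may land on either side.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical point type: one coordinate per dimension.
using Point = std::vector<double>;

// Sketch of a median-split policy.
class MedianSplit
{
 public:
  // 'points' is taken by value because nth_element reorders it.
  static void SplitNode(std::vector<Point> points,
                        const std::size_t dim,
                        std::vector<Point>& left,
                        std::vector<Point>& right)
  {
    assert(!points.empty());
    const std::size_t mid = points.size() / 2;

    // After this call, everything before 'mid' is <= points[mid] in
    // dimension 'dim', and everything after it is >= points[mid].
    std::nth_element(points.begin(), points.begin() + mid, points.end(),
        [dim](const Point& a, const Point& b) { return a[dim] < b[dim]; });

    left.assign(points.begin(), points.begin() + mid);
    right.assign(points.begin() + mid, points.end());
  }
};
```

Unlike the mean, this keeps the two sides within one point of each other
in size regardless of outliers.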

The last thing: tests.

Once you've refactored BinarySpaceTree, ensure that it still compiles
and that when you run mlpack_test, there are no new failures.  If you
decide to make a MedianSplit class, you should write tests for that too.

Let me know if I can further clarify anything.

Thanks,

Ryan

-- 
Ryan Curtin    | "Aw, Brian's doing it again, dude.  Brian, you ain't
ryan at ratml.org | no pimp, dude."   "Where's my money?"

