[mlpack-git] [mlpack/mlpack] R+ and R++ trees implementation (#699)

Ryan Curtin notifications at github.com
Wed Jun 29 16:09:15 EDT 2016


> +  std::vector<SortStruct<ElemType>> sorted(node->NumChildren());
> +
> +  for (size_t i = 0; i < node->NumChildren(); i++)
> +  {
> +    sorted[i].d = SplitPolicy::Bound(node->Child(i))[axis].Hi();
> +    sorted[i].n = i;
> +  }
> +  // Sort high bounds of children.
> +  std::sort(sorted.begin(), sorted.end(), StructComp<ElemType>);
> +
> +  size_t splitPointer = fillFactor * node->NumChildren();
> +
> +  axisCut = sorted[splitPointer - 1].d;
> +
> +  // Check if the partition is suitable.
> +  if (!CheckNonLeafSweep(node, axis, axisCut))

Why do you run this check before iterating over the other values of `splitPointer`?  I don't follow the reasoning.  As far as I can tell, this entire block of code only verifies that a valid split exists on this axis, so the first `if` looks like a shortcut that avoids checking every possible cut.  If you want a midpoint split by default, I agree that you can hardcode the `0.5` and remove `fillFactor`.  So is the idea that, when the default split does not work, you try every other possible split until you find one that does?
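
To make the question concrete, here is a rough, self-contained sketch of the "default cut first, then every other candidate" pattern as I understand it.  It is not the code from this PR: `Interval`, `IsUsableCut()` and `FindAxisCut()` are hypothetical names, and `IsUsableCut()` is only a guess at the kind of test `CheckNonLeafSweep()` performs (here: at least one child lies entirely on each side of the cut).  It assumes C++17 for `std::optional` and `std::clamp`.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical stand-in for a child's bound along a single axis.
struct Interval
{
  double lo;
  double hi;
};

// Hypothetical stand-in for CheckNonLeafSweep(): accept a cut only if at
// least one child lies entirely on each side of it.
bool IsUsableCut(const std::vector<Interval>& children, const double cut)
{
  std::size_t left = 0, right = 0;
  for (const Interval& c : children)
  {
    if (c.hi <= cut)
      ++left;
    else if (c.lo >= cut)
      ++right;
  }
  return left > 0 && right > 0;
}

// Try the default cut first; if it is rejected, fall back to every other
// candidate.  Returns std::nullopt if no cut on this axis is usable.
std::optional<double> FindAxisCut(const std::vector<Interval>& children,
                                  const double fillFactor = 0.5)
{
  if (children.size() < 2)
    return std::nullopt;

  // Candidate cuts are the sorted high bounds of the children.
  std::vector<double> highs;
  highs.reserve(children.size());
  for (const Interval& c : children)
    highs.push_back(c.hi);
  std::sort(highs.begin(), highs.end());

  // Clamp so that highs[splitPointer - 1] is always a valid access.
  const std::size_t splitPointer = std::clamp<std::size_t>(
      static_cast<std::size_t>(fillFactor * highs.size()), 1, highs.size());

  // Shortcut: the default cut.
  if (IsUsableCut(children, highs[splitPointer - 1]))
    return highs[splitPointer - 1];

  // Fallback: every other candidate, in sorted order.
  for (std::size_t i = 1; i <= highs.size(); ++i)
  {
    if (i == splitPointer)
      continue;
    if (IsUsableCut(children, highs[i - 1]))
      return highs[i - 1];
  }

  return std::nullopt;
}
```

Written this way, the first check is purely an optimization: dropping it and running only the loop would find a cut in exactly the same cases, though possibly a different one.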

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/699/files/e165d759f9ae612b9965f70fbbf8abdb19dc8d07#r69017699