[mlpack-git] [mlpack/mlpack] R+ and R++ trees implementation (#699)

Ryan Curtin notifications at github.com
Wed Jun 29 16:01:18 EDT 2016


> +      if (node->Dataset().col(node->Point(sorted[i].n))[k] > highBound2[k])
> +        highBound2[k] = node->Dataset().col(node->Point(sorted[i].n))[k];
> +    }
> +  }
> +
> +  // Evaluate the cost of the split i.e. calculate the total coverage
> +  // of two resulting nodes.
> +
> +  ElemType area1 = 1.0, area2 = 1.0;
> +  ElemType overlappedArea = 1.0;
> +
> +  for (size_t k = 0; k < node->Bound().Dim(); k++)
> +  {
> +    area1 *= highBound1[k] - lowerBound1[k];
> +    area2 *= highBound2[k] - lowerBound2[k];
> +  }

I think this function could be simplified a lot by using the native `HRectBound` functions.  Instead of creating `lowerBound1`, `lowerBound2`, `highBound1`, and `highBound2` as `std::vector<ElemType>`s, you could create `bound1` and `bound2` as `HRectBound<MetricType, ElemType>`, use `operator|=()` to expand the bounds as necessary, and then call `Volume()` to compute `area1` and `area2`.
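
To illustrate the shape of that simplification, here is a minimal, self-contained sketch.  It does not use mlpack: `Box` is a hypothetical stand-in that only mimics the `operator|=()` / `Volume()` usage described above, and `SplitCost()` is an illustrative helper name.  Summing the two volumes as the cost is also an assumption, since the quoted diff does not show how `area1` and `area2` are combined.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for an axis-aligned bound.  This is NOT mlpack's
// HRectBound; it only mimics the operator|=() / Volume() usage pattern.
struct Box
{
  std::vector<double> lo, hi;

  // Start out empty: every lower bound is above every upper bound.
  explicit Box(const std::size_t dim) :
      lo(dim, std::numeric_limits<double>::max()),
      hi(dim, std::numeric_limits<double>::lowest())
  { }

  // Expand the box so that it contains the given point.
  Box& operator|=(const std::vector<double>& point)
  {
    assert(point.size() == lo.size());
    for (std::size_t k = 0; k < lo.size(); ++k)
    {
      lo[k] = std::min(lo[k], point[k]);
      hi[k] = std::max(hi[k], point[k]);
    }
    return *this;
  }

  // Product of the side lengths; an empty box has volume 0.
  double Volume() const
  {
    double volume = 1.0;
    for (std::size_t k = 0; k < lo.size(); ++k)
    {
      if (hi[k] < lo[k])
        return 0.0;
      volume *= hi[k] - lo[k];
    }
    return volume;
  }
};

// Cost of cutting a sorted list of points at index `cut`: points [0, cut) go
// to the first box, the rest to the second.  Using the sum of the two volumes
// as the cost is an assumption made for this sketch.
double SplitCost(const std::vector<std::vector<double>>& points,
                 const std::size_t cut)
{
  assert(!points.empty() && cut <= points.size());
  Box bound1(points[0].size()), bound2(points[0].size());
  for (std::size_t i = 0; i < points.size(); ++i)
    (i < cut ? bound1 : bound2) |= points[i];
  return bound1.Volume() + bound2.Volume();
}
```

With real `HRectBound` objects the four `std::vector<ElemType>`s and the explicit per-dimension loop would disappear in the same way.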

Also, it seems like `overlappedArea` is neither computed nor used here, so maybe that is a bug, or maybe it just needs to be removed.  I thought that in the R+ and R++ trees nodes could not overlap, so I am not sure why `overlappedArea` would not be 0 for both leaf nodes and non-leaf nodes.
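
If an overlap term does turn out to be needed, a sketch of how it could be computed for two axis-aligned boxes follows.  `Box` and `OverlapVolume()` are hypothetical names used only for illustration; nothing here is taken from the PR or from mlpack's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical axis-aligned box: lo[k] and hi[k] bound dimension k.
struct Box
{
  std::vector<double> lo, hi;
};

// Volume of the intersection of two boxes.  Returns 0 as soon as the boxes
// are disjoint (or only touch) in any dimension.
double OverlapVolume(const Box& a, const Box& b)
{
  assert(a.lo.size() == b.lo.size());
  double volume = 1.0;
  for (std::size_t k = 0; k < a.lo.size(); ++k)
  {
    const double low = std::max(a.lo[k], b.lo[k]);
    const double high = std::min(a.hi[k], b.hi[k]);
    if (high <= low)
      return 0.0;
    volume *= high - low;
  }
  return volume;
}
```

For a split that really produces non-overlapping nodes, this value would be 0 for every candidate, which is why the variable looks removable.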

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/699/files/e165d759f9ae612b9965f70fbbf8abdb19dc8d07#r69016507