[mlpack-git] master: An optimization to speed up the binary split -- but it's still slow... (41bf107)

gitdub at big.cc.gt.atl.ga.us
Wed Dec 23 11:46:46 EST 2015


Repository : https://github.com/mlpack/mlpack

On branch  : master
Link       : https://github.com/mlpack/mlpack/compare/de9cc4b05069e1fa4793d9355f2f595af5ff45d2...6070527af14296cd99739de6c62666cc5d2a2125

>---------------------------------------------------------------

commit 41bf107cf017d6f6c40640371294e696e95b8fb8
Author: Ryan Curtin <ryan at ratml.org>
Date:   Wed Dec 2 10:56:49 2015 -0800

    An optimization to speed up the binary split -- but it's still slow...


>---------------------------------------------------------------

41bf107cf017d6f6c40640371294e696e95b8fb8
 src/mlpack/methods/hoeffding_trees/binary_numeric_split_impl.hpp | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/mlpack/methods/hoeffding_trees/binary_numeric_split_impl.hpp b/src/mlpack/methods/hoeffding_trees/binary_numeric_split_impl.hpp
index eca9222..d135b17 100644
--- a/src/mlpack/methods/hoeffding_trees/binary_numeric_split_impl.hpp
+++ b/src/mlpack/methods/hoeffding_trees/binary_numeric_split_impl.hpp
@@ -68,14 +68,17 @@ void BinaryNumericSplit<FitnessFunction, ObservationType>::
   // Initialize to the first observation, so we don't calculate gain on the
   // first iteration (it will be 0).
   ObservationType lastObservation = (*sortedElements.begin()).first;
+  size_t lastClass = classCounts.n_elem;
   for (typename std::multimap<ObservationType, size_t>::const_iterator it =
       sortedElements.begin(); it != sortedElements.end(); ++it)
   {
-    // If this value is the same as the last, or if this is the first value,
-    // don't calculate the gain.
-    if ((*it).first != lastObservation)
+    // If this value is the same as the last, or if this is the first value, or
+    // we have the same class as the previous observation, don't calculate the
+    // gain---it can't be any better.  (See Fayyad and Irani, 1991.)
+    if (((*it).first != lastObservation) || ((*it).second != lastClass))
     {
       lastObservation = (*it).first;
+      lastClass = (*it).second;
 
       const double value = FitnessFunction::Evaluate(counts);
       if (value > bestFitness)
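
To see how the removed and added conditions in the hunk above behave, here is a minimal standalone sketch that only counts how many loop iterations would reach the gain calculation. It is not the mlpack implementation: `CountEvaluations`, `SampleElements`, the `useClassCheck` flag, and the sample data are hypothetical names introduced for illustration, and the real loop also updates `counts` and calls `FitnessFunction::Evaluate`.

```cpp
#include <cstddef>
#include <map>

// Hypothetical helper (not part of mlpack): counts how many iterations of the
// loop in the diff would enter the gain calculation.  When useClassCheck is
// false, the removed condition is used; when true, the added condition is used.
std::size_t CountEvaluations(
    const std::multimap<double, std::size_t>& sortedElements,
    const std::size_t numClasses,
    const bool useClassCheck)
{
  if (sortedElements.empty())
    return 0;

  // Mirror the initialization in the diff: the first observation's value, and
  // a class label that no real observation can have.
  double lastObservation = sortedElements.begin()->first;
  std::size_t lastClass = numClasses;
  std::size_t evaluations = 0;

  for (auto it = sortedElements.begin(); it != sortedElements.end(); ++it)
  {
    const bool valueChanged = (it->first != lastObservation);
    const bool classChanged = (it->second != lastClass);

    if (valueChanged || (useClassCheck && classChanged))
    {
      lastObservation = it->first;
      lastClass = it->second;
      ++evaluations; // Stand-in for the fitness evaluation.
    }
  }

  return evaluations;
}

// Hypothetical sample: (observation, class) pairs with two classes, 0 and 1.
std::multimap<double, std::size_t> SampleElements()
{
  return { { 1.0, 0 }, { 2.0, 0 }, { 3.0, 0 }, { 3.0, 1 }, { 4.0, 1 } };
}
```

On this sample, `CountEvaluations(SampleElements(), 2, false)` returns 3 and `CountEvaluations(SampleElements(), 2, true)` returns 5. The added test is a logical OR, so it is true whenever the removed test is true, and it is also true on the first iteration because `lastClass` starts at a value that matches no class.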


