[mlpack-git] master: Compute max cluster distance after failed prune. This gives us another chance at a prune, and gives very minor speedup. (e2be4fb)

gitdub at big.cc.gt.atl.ga.us
Thu Mar 12 16:03:38 EDT 2015


Repository : https://github.com/mlpack/mlpack

On branch  : master
Link       : https://github.com/mlpack/mlpack/compare/eddd7167d69b6c88b271ef2e51d1c20e13f1acd8...70342dd8e5c17e0c164cfb8189748671e9c0dd44

>---------------------------------------------------------------

commit e2be4fb3911823869895eb29a3c72727b7caabf7
Author: Ryan Curtin <ryan at ratml.org>
Date:   Wed Feb 4 15:32:04 2015 -0500

    Compute max cluster distance after failed prune. This gives us another chance at a prune, and gives very minor speedup.


>---------------------------------------------------------------

e2be4fb3911823869895eb29a3c72727b7caabf7
 src/mlpack/methods/kmeans/dtnn_kmeans_impl.hpp | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/src/mlpack/methods/kmeans/dtnn_kmeans_impl.hpp b/src/mlpack/methods/kmeans/dtnn_kmeans_impl.hpp
index f30f85d..6ed0de5 100644
--- a/src/mlpack/methods/kmeans/dtnn_kmeans_impl.hpp
+++ b/src/mlpack/methods/kmeans/dtnn_kmeans_impl.hpp
@@ -330,10 +330,27 @@ void DTNNKMeans<MetricType, MatType, TreeType>::UpdateTree(
           (mcd < 0.5 * closestClusterDistance))
         node.Stat().Pruned() = true;
 
-      // Adjust bounds for next iteration, regardless of whether or not the node
-      // was pruned.  (Does this adjustment need to happen if there is no prune?
-      node.Stat().MaxClusterDistance() += ownerMovement;
-      node.Stat().SecondClusterBound() -= maxMovement;
+      if (!node.Stat().Pruned() && (mcd - ownerMovement) < (scb - maxMovement))
+      {
+        // Calculate the next MCD by hand.
+        const double newDist = node.MaxDistance(centroids.col(owner));
+        ++distanceCalculations;
+        node.Stat().MaxClusterDistance() = newDist;
+
+        if ((newDist < scb - maxMovement) ||
+            (newDist < 0.5 * closestClusterDistance))
+          node.Stat().Pruned() = true;
+        else
+          node.Stat().SecondClusterBound() -= maxMovement;
+      }
+      else
+      {
+        // Adjust bounds for next iteration, regardless of whether or not the
+        // node was pruned.  (Does this adjustment need to happen if there is no
+        // prune?
+        node.Stat().MaxClusterDistance() += ownerMovement;
+        node.Stat().SecondClusterBound() -= maxMovement;
+      }
     }
     else if (childrenPruned && node.NumChildren() > 0 && node.NumPoints() == 0)
     {
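
The bound update in the hunk above can be read in isolation as the following minimal sketch. It is not the mlpack implementation: the NodeBounds struct, the UpdateBounds function, and the exactMaxDistance callback are hypothetical stand-ins for node.Stat(), the body of UpdateTree(), and node.MaxDistance(centroids.col(owner)). It also assumes that mcd and scb are the node's stored max cluster distance and second cluster bound on entry, which the diff context suggests but does not show.

```cpp
#include <functional>

// Hypothetical stand-in for the per-node statistic (names are assumptions).
struct NodeBounds
{
  double maxClusterDistance;  // Bound on the distance to the owning cluster.
  double secondClusterBound;  // Bound on the distance to any other cluster.
  bool pruned;
};

// Sketch of the update after the first prune test has already run.
// Returns the number of exact distance calculations performed (0 or 1).
int UpdateBounds(NodeBounds& b,
                 const double ownerMovement,
                 const double maxMovement,
                 const double closestClusterDistance,
                 const std::function<double()>& exactMaxDistance)
{
  const double mcd = b.maxClusterDistance;
  const double scb = b.secondClusterBound;

  // Only pay for an exact distance when the node is not already pruned and
  // the optimistic value (mcd - ownerMovement) could still pass the test.
  if (!b.pruned && (mcd - ownerMovement) < (scb - maxMovement))
  {
    // Calculate the next max cluster distance by hand.
    const double newDist = exactMaxDistance();
    b.maxClusterDistance = newDist;

    // Second chance at a prune, now with the recomputed distance.
    if ((newDist < scb - maxMovement) ||
        (newDist < 0.5 * closestClusterDistance))
      b.pruned = true;
    else
      b.secondClusterBound -= maxMovement;

    return 1;
  }

  // Otherwise loosen both bounds by the centroid movement, as before.
  b.maxClusterDistance += ownerMovement;
  b.secondClusterBound -= maxMovement;
  return 0;
}
```

For example, with maxClusterDistance = 5, secondClusterBound = 6, ownerMovement = 1, maxMovement = 0.5 and closestClusterDistance = 2, the guard holds (4 < 5.5), so one exact distance is computed. If that distance comes back as 4.5 the node is pruned; if it comes back as 5.8 the node stays unpruned and secondClusterBound drops to 5.5.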


