[mlpack-git] master: Bug fix for Elkan? (3988fc6)

gitdub at mlpack.org
Thu Mar 24 17:02:27 EDT 2016


Repository : https://github.com/mlpack/mlpack
On branch  : master
Link       : https://github.com/mlpack/mlpack/compare/a0f1dd5632004b26cd3592e6ceaa455792df8c3a...9e4e126589d19ddbaaaebec256c77d1f3eb75ce2

>---------------------------------------------------------------

commit 3988fc6987178a006091c98bd046b097f8b412da
Author: Erich Schubert <kno10 at users.noreply.github.com>
Date:   Thu Mar 24 22:02:27 2016 +0100

    Bug fix for Elkan?
    
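    AFAICT, the old code always killed all centroids if one cluster was empty:
    newCentroids.fill(DBL_MAX) overwrites the whole matrix, not just column c.
    Keeping the previous centroid for an empty cluster is much more stable.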
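
A minimal sketch of the behaviour described above, written against plain std::vector rather than Armadillo so that it stands alone. The function name FinishCentroidUpdate, the Point alias, and the hard-coded Euclidean distance are assumptions made for illustration; they are not part of mlpack, where the distance comes from metric.Evaluate(). The sketch only shows the control flow of the fix: a non-empty cluster is averaged and its movement recorded, while an empty cluster keeps its previous centroid and reports zero movement.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

using Point = std::vector<double>;  // Illustrative alias, not an mlpack type.

// Hypothetical helper (not mlpack API). On entry, sums[c] holds the
// coordinate-wise sum of the points assigned to cluster c and counts[c]
// the number of such points. On exit, sums[c] holds the new centroid.
// Returns the sum of squared centroid movements (cNorm in the diff).
double FinishCentroidUpdate(std::vector<Point>& sums,
                            const std::vector<std::size_t>& counts,
                            const std::vector<Point>& oldCentroids,
                            std::vector<double>& moveDistances)
{
  double cNorm = 0.0;
  moveDistances.assign(sums.size(), 0.0);

  for (std::size_t c = 0; c < sums.size(); ++c)
  {
    if (counts[c] > 0)
    {
      // Non-empty cluster: average, then measure how far the centroid moved.
      double squared = 0.0;
      for (std::size_t d = 0; d < sums[c].size(); ++d)
      {
        sums[c][d] /= static_cast<double>(counts[c]);
        const double diff = sums[c][d] - oldCentroids[c][d];
        squared += diff * diff;
      }
      moveDistances[c] = std::sqrt(squared);  // Assumes a Euclidean metric.
      cNorm += squared;
    }
    else
    {
      // Empty cluster: keep the old centroid; its movement stays 0.
      // Only cluster c is touched, unlike a whole-matrix fill.
      sums[c] = oldCentroids[c];
    }
  }

  return cNorm;
}
```

With two clusters where the second has no points, the second centroid is returned unchanged and contributes nothing to the residual, so one empty cluster no longer disturbs the others.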


>---------------------------------------------------------------

3988fc6987178a006091c98bd046b097f8b412da
 src/mlpack/methods/kmeans/elkan_kmeans_impl.hpp | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/src/mlpack/methods/kmeans/elkan_kmeans_impl.hpp b/src/mlpack/methods/kmeans/elkan_kmeans_impl.hpp
index 638e35c..d4a1b1f 100644
--- a/src/mlpack/methods/kmeans/elkan_kmeans_impl.hpp
+++ b/src/mlpack/methods/kmeans/elkan_kmeans_impl.hpp
@@ -153,14 +153,15 @@ double ElkanKMeans<MetricType, MatType>::Iterate(const arma::mat& centroids,
   double cNorm = 0.0; // Cluster movement for residual.
   for (size_t c = 0; c < centroids.n_cols; ++c)
   {
-    if (counts[c] > 0)
+    if (counts[c] > 0) {
       newCentroids.col(c) /= counts[c];
-    else
-      newCentroids.fill(DBL_MAX); // Fill with invalid value.
-
-    moveDistances(c) = metric.Evaluate(newCentroids.col(c), centroids.col(c));
-    cNorm += std::pow(moveDistances(c), 2.0);
-    distanceCalculations++;
+      moveDistances(c) = metric.Evaluate(newCentroids.col(c), centroids.col(c));
+      cNorm += std::pow(moveDistances(c), 2.0);
+      distanceCalculations++;
+    } else {
+      newCentroids.col(c) = centroids.col(c); // Keep old centroid.
+      moveDistances(c) = 0.0;
+    }
   }
 
   for (size_t i = 0; i < dataset.n_cols; ++i)



