[mlpack-git] master: Bug fix for Elkan? (3988fc6)
gitdub at mlpack.org
Thu Mar 24 17:02:27 EDT 2016
Repository : https://github.com/mlpack/mlpack
On branch : master
Link : https://github.com/mlpack/mlpack/compare/a0f1dd5632004b26cd3592e6ceaa455792df8c3a...9e4e126589d19ddbaaaebec256c77d1f3eb75ce2
>---------------------------------------------------------------
commit 3988fc6987178a006091c98bd046b097f8b412da
Author: Erich Schubert <kno10 at users.noreply.github.com>
Date: Thu Mar 24 22:02:27 2016 +0100
Bug fix for Elkan?
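AFAICT, the old code overwrote all centroids with DBL_MAX whenever any cluster was empty, because newCentroids.fill() fills the whole matrix rather than only column c.
Keeping the previous centroid for an empty cluster is much more stable.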
>---------------------------------------------------------------
3988fc6987178a006091c98bd046b097f8b412da
src/mlpack/methods/kmeans/elkan_kmeans_impl.hpp | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/src/mlpack/methods/kmeans/elkan_kmeans_impl.hpp b/src/mlpack/methods/kmeans/elkan_kmeans_impl.hpp
index 638e35c..d4a1b1f 100644
--- a/src/mlpack/methods/kmeans/elkan_kmeans_impl.hpp
+++ b/src/mlpack/methods/kmeans/elkan_kmeans_impl.hpp
@@ -153,14 +153,15 @@ double ElkanKMeans<MetricType, MatType>::Iterate(const arma::mat& centroids,
   double cNorm = 0.0; // Cluster movement for residual.
   for (size_t c = 0; c < centroids.n_cols; ++c)
   {
-    if (counts[c] > 0)
+    if (counts[c] > 0) {
       newCentroids.col(c) /= counts[c];
-    else
-      newCentroids.fill(DBL_MAX); // Fill with invalid value.
-
-    moveDistances(c) = metric.Evaluate(newCentroids.col(c), centroids.col(c));
-    cNorm += std::pow(moveDistances(c), 2.0);
-    distanceCalculations++;
+      moveDistances(c) = metric.Evaluate(newCentroids.col(c), centroids.col(c));
+      cNorm += std::pow(moveDistances(c), 2.0);
+      distanceCalculations++;
+    } else {
+      newCentroids.col(c) = centroids.col(c); // Keep old centroid.
+      moveDistances(c) = 0.0;
+    }
   }
 
   for (size_t i = 0; i < dataset.n_cols; ++i)
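
The behaviour introduced by the patch can be sketched outside mlpack as follows. This is only an illustration under stated assumptions, not the library's code: it uses plain std::vector in place of arma::mat, a fixed Euclidean distance in place of MetricType, and the names Columns, Distance and UpdateCentroids are hypothetical. As in the diff, newCentroids is assumed to hold per-cluster coordinate sums on entry.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for arma::mat: columns[c] is the c-th centroid.
using Columns = std::vector<std::vector<double>>;

// Euclidean distance between two points of equal dimension
// (assumed here; the real code calls metric.Evaluate()).
double Distance(const std::vector<double>& a, const std::vector<double>& b)
{
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d)
    sum += (a[d] - b[d]) * (a[d] - b[d]);
  return std::sqrt(sum);
}

// Converts per-cluster coordinate sums in newCentroids into means.
// An empty cluster keeps its previous centroid and reports zero movement,
// so it cannot disturb the other centroids or the residual.
// Returns the sum of squared centroid movements (cNorm in the diff).
double UpdateCentroids(const Columns& centroids,
                       const std::vector<std::size_t>& counts,
                       Columns& newCentroids,
                       std::vector<double>& moveDistances)
{
  assert(counts.size() == centroids.size());
  assert(newCentroids.size() == centroids.size());
  assert(moveDistances.size() == centroids.size());

  double cNorm = 0.0;
  for (std::size_t c = 0; c < centroids.size(); ++c)
  {
    if (counts[c] > 0)
    {
      // Divide only this cluster's column by its point count.
      for (double& x : newCentroids[c])
        x /= static_cast<double>(counts[c]);
      moveDistances[c] = Distance(newCentroids[c], centroids[c]);
      cNorm += moveDistances[c] * moveDistances[c];
    }
    else
    {
      // Keep old centroid; touch only column c, never the whole matrix.
      newCentroids[c] = centroids[c];
      moveDistances[c] = 0.0;
    }
  }
  return cNorm;
}
```

With two clusters, where the first has two points summing to (6, 8) and the second is empty, the first centroid becomes (3, 4) and the second stays where it was. Under the old code, the fill(DBL_MAX) call would have overwritten both columns.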