[mlpack-svn] r16150 - mlpack/trunk/src/mlpack/methods/kmeans

fastlab-svn at coffeetalk-1.cc.gatech.edu
Tue Jan 14 15:50:04 EST 2014


Author: rcurtin
Date: Tue Jan 14 15:50:03 2014
New Revision: 16150

Log:
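The per-cluster variance was not being calculated correctly: arma::var() was
applied to each point's difference vector from its centroid.  Accumulate the
squared Euclidean distance to the centroid instead, then divide by the number
of points in the cluster.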


Modified:
   mlpack/trunk/src/mlpack/methods/kmeans/max_variance_new_cluster_impl.hpp

Modified: mlpack/trunk/src/mlpack/methods/kmeans/max_variance_new_cluster_impl.hpp
==============================================================================
--- mlpack/trunk/src/mlpack/methods/kmeans/max_variance_new_cluster_impl.hpp	(original)
+++ mlpack/trunk/src/mlpack/methods/kmeans/max_variance_new_cluster_impl.hpp	Tue Jan 14 15:50:03 2014
@@ -30,12 +30,16 @@
 
   // Add the variance of each point's distance away from the cluster.  I think
   // this is the sensible thing to do.
-  for (size_t i = 0; i < data.n_cols; i++)
+  for (size_t i = 0; i < data.n_cols; ++i)
   {
-    variances[assignments[i]] += arma::as_scalar(
-        arma::var(data.col(i) - centroids.col(assignments[i])));
+    variances[assignments[i]] += metric::SquaredEuclideanDistance::Evaluate(
+        data.col(i), centroids.col(assignments[i]));
   }
 
+  // Divide by the number of points in the cluster to produce the variance.
+  for (size_t i = 0; i < clusterCounts.n_elem; ++i)
+    variances[i] /= clusterCounts[i];
+
   // Now find the cluster with maximum variance.
   arma::uword maxVarCluster;
   variances.max(maxVarCluster);
@@ -43,7 +47,7 @@
   // Now, inside this cluster, find the point which is furthest away.
   size_t furthestPoint = data.n_cols;
   double maxDistance = -DBL_MAX;
-  for (size_t i = 0; i < data.n_cols; i++)
+  for (size_t i = 0; i < data.n_cols; ++i)
   {
     if (assignments[i] == maxVarCluster)
     {
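
To make the change in the first hunk concrete, here is a minimal standalone
sketch of the two quantities involved.  It is an illustration under stated
assumptions, not the mlpack code: it uses std::vector in place of Armadillo
types, and the names Point, ComponentVariance, SquaredDistance,
ClusterVariances and MaxVarianceCluster are hypothetical.  Skipping empty
clusters before dividing is also an assumption of the sketch; the committed
loop divides by clusterCounts[i] unconditionally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

using Point = std::vector<double>;  // Hypothetical stand-in for a data column.

// What the removed lines effectively measured: the sample variance (N - 1
// normalization) of the components of one difference vector.
double ComponentVariance(const Point& diff)
{
  if (diff.size() < 2)
    return 0.0;
  double mean = 0.0;
  for (double v : diff)
    mean += v;
  mean /= diff.size();
  double sum = 0.0;
  for (double v : diff)
    sum += (v - mean) * (v - mean);
  return sum / (diff.size() - 1);
}

// Squared Euclidean distance between two points of equal dimension.
double SquaredDistance(const Point& a, const Point& b)
{
  assert(a.size() == b.size());
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d)
    sum += (a[d] - b[d]) * (a[d] - b[d]);
  return sum;
}

// Per-cluster variance as the mean squared distance to the centroid.
std::vector<double> ClusterVariances(
    const std::vector<Point>& data,
    const std::vector<Point>& centroids,
    const std::vector<std::size_t>& assignments)
{
  assert(data.size() == assignments.size());
  std::vector<double> variances(centroids.size(), 0.0);
  std::vector<std::size_t> counts(centroids.size(), 0);
  for (std::size_t i = 0; i < data.size(); ++i)
  {
    const std::size_t c = assignments[i];
    variances[c] += SquaredDistance(data[i], centroids[c]);
    ++counts[c];
  }
  // Assumption of this sketch: leave empty clusters at 0 rather than divide.
  for (std::size_t c = 0; c < variances.size(); ++c)
    if (counts[c] > 0)
      variances[c] /= counts[c];
  return variances;
}

// Index of the cluster with the largest variance.
std::size_t MaxVarianceCluster(const std::vector<double>& variances)
{
  assert(!variances.empty());
  return std::distance(variances.begin(),
      std::max_element(variances.begin(), variances.end()));
}
```

For a point at (3, 3) and a centroid at the origin, the difference vector has
equal components, so ComponentVariance gives 0 even though the point is far
from the centroid, while SquaredDistance gives 18.  That gap is why summing
squared distances and dividing by the cluster size is the more meaningful
measure of spread here.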