[mlpack-svn] r17223 - mlpack/trunk/doc/tutorials/kmeans

fastlab-svn at coffeetalk-1.cc.gatech.edu
Thu Oct 9 14:59:54 EDT 2014


Author: rcurtin
Date: Thu Oct  9 14:59:53 2014
New Revision: 17223

Log:
Remove references to overclustering from tutorial.


Modified:
   mlpack/trunk/doc/tutorials/kmeans/kmeans.txt

Modified: mlpack/trunk/doc/tutorials/kmeans/kmeans.txt
==============================================================================
--- mlpack/trunk/doc/tutorials/kmeans/kmeans.txt	(original)
+++ mlpack/trunk/doc/tutorials/kmeans/kmeans.txt	Thu Oct  9 14:59:53 2014
@@ -67,24 +67,22 @@
  - \ref intro_kmtut
  - \ref toc_kmtut
  - \ref cli_kmtut
-   - \ref cli_ex1_kmtut 
-   - \ref cli_ex2_kmtut 
-   - \ref cli_ex3_kmtut 
-   - \ref cli_ex4_kmtut 
-   - \ref cli_ex5_kmtut 
-   - \ref cli_ex6_kmtut 
+   - \ref cli_ex1_kmtut
+   - \ref cli_ex2_kmtut
+   - \ref cli_ex3_kmtut
+   - \ref cli_ex4_kmtut
+   - \ref cli_ex6_kmtut
  - \ref kmeans_kmtut
-   - \ref kmeans_ex1_kmtut 
-   - \ref kmeans_ex2_kmtut 
-   - \ref kmeans_ex3_kmtut 
-   - \ref kmeans_ex4_kmtut 
-   - \ref kmeans_ex5_kmtut 
-   - \ref kmeans_ex6_kmtut 
-   - \ref kmeans_ex7_kmtut 
+   - \ref kmeans_ex1_kmtut
+   - \ref kmeans_ex2_kmtut
+   - \ref kmeans_ex3_kmtut
+   - \ref kmeans_ex5_kmtut
+   - \ref kmeans_ex6_kmtut
+   - \ref kmeans_ex7_kmtut
  - \ref kmeans_template_kmtut
-   - \ref kmeans_metric_kmtut 
-   - \ref kmeans_initial_partition_kmtut 
-   - \ref kmeans_empty_cluster_kmtut 
+   - \ref kmeans_metric_kmtut
+   - \ref kmeans_initial_partition_kmtut
+   - \ref kmeans_empty_cluster_kmtut
  - \ref further_doc_kmtut
 
 @section cli_kmtut Command-Line 'kmeans'
@@ -97,6 +95,11 @@
 $ kmeans --help
 @endcode
 
+As of October 2014, support for overclustering has been removed due to bugs and
+lack of usage.  If you were using this support, or are interested in it, please
+file a bug or get in touch with the \b mlpack developers so that the support
+can be re-implemented.
+
 Below are several examples demonstrating simple use of the \c kmeans executable.
 
 @subsection cli_ex1_kmtut Simple k-means clustering
@@ -147,26 +150,6 @@
 $ kmeans -c 5 -i dataset.csv -v -o assignments.csv -m 250
 @endcode
 
-@subsection cli_ex5_kmtut Setting the overclustering factor
-
-The \b mlpack k-means implementation allows "overclustering", which is when the
-k-means algorithm is run with more than the requested number of clusters.  Upon
-convergence, the clusters with the nearest centroids are merged until only the
-requested number of centroids remain.  This can provide better clustering
-results.  The overclustering factor, specified with \c -O or \c
---overclustering, determines how many more clusters are found than were
-requested.  For instance, with \c k set to 5 and an overclustering factor of 2,
-10 clusters will be found.  Note that the overclustering factor does not need to
-be an integer.
-
-The following code snippet finds 5 clusters, but with an overclustering factor
-of 2.4 (so 12 clusters are found and then merged together to produce 5 final
-clusters).
-
-@code
-$ kmeans -c 5 -O 2.4 -i dataset.csv -v -o assignments.csv
-@endcode
-
 @subsection cli_ex6_kmtut Using Bradley-Fayyad "refined start"
 
 The method proposed by Bradley and Fayyad in their paper "Refining initial
@@ -278,23 +261,6 @@
 
 Then you can run \c Cluster() as normal.
 
-@subsection kmeans_ex4_kmtut Setting the overclustering factor
-
-For a description of what overclustering is, see
-@ref cli_ex5_kmtut "the command-line interface tutorial about overclustering".
-
-The overclustering factor, which by default is 1.0 (this indicates that no
-overclustering is happening), is specified in the second argument to the
-constructor.
-
-@code
-// We will keep the default maximum iterations of 1000, but set the
-// overclustering factor to 2.5.
-KMeans<> k(1000, 2.5);
-@endcode
-
-Then you can run \c Cluster() as normal.
-
 @subsection kmeans_ex5_kmtut Setting initial cluster assignments
 
 If you have an initial guess for the cluster assignments for each point, you can
@@ -493,9 +459,9 @@
 // The initialized Mahalanobis distance.
 extern mlpack::metric::MahalanobisDistance distance;
 
-// We keep the default arguments for the maximum number of iterations and
-// overclustering factor, but pass our instantiated metric.
-KMeans<mlpack::metric::MahalanobisDistance> k(1000, 1.0, distance);
+// We keep the default arguments for the maximum number of iterations, but pass
+// our instantiated metric.
+KMeans<mlpack::metric::MahalanobisDistance> k(1000, distance);
 @endcode
 
 @note
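
For context on the feature whose documentation is removed above, the following
is a minimal, standalone sketch of the overclustering idea as the deleted text
describes it: search for more clusters than requested, then merge the clusters
with the nearest centroids until only the requested number remain.  It is not
the removed mlpack code.  The names Cluster, OverclusteredCount, and MergeToK
are hypothetical, and the rounding rule and the size-weighted merge are
assumptions; the tutorial text specifies neither.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical type: a centroid and the number of points assigned to it.
struct Cluster
{
  std::vector<double> centroid;
  std::size_t count;
};

// Number of clusters to search for before merging.  Rounding to the nearest
// integer is an assumption; the tutorial only gives the examples 5 * 2 = 10
// and 5 * 2.4 = 12.
std::size_t OverclusteredCount(const std::size_t k, const double factor)
{
  assert(k >= 1 && factor >= 1.0);
  return static_cast<std::size_t>(std::llround(k * factor));
}

// Squared Euclidean distance between two centroids of equal dimension.
double SquaredDistance(const std::vector<double>& a,
                       const std::vector<double>& b)
{
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d)
    sum += (a[d] - b[d]) * (a[d] - b[d]);
  return sum;
}

// Merge the two clusters with the nearest centroids until only k remain.
// The merged centroid is the size-weighted mean (an assumption).
void MergeToK(std::vector<Cluster>& clusters, const std::size_t k)
{
  assert(k >= 1);
  while (clusters.size() > k)
  {
    // Find the closest pair of centroids by exhaustive search.
    std::size_t bestI = 0, bestJ = 1;
    double best = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < clusters.size(); ++i)
    {
      for (std::size_t j = i + 1; j < clusters.size(); ++j)
      {
        const double dist =
            SquaredDistance(clusters[i].centroid, clusters[j].centroid);
        if (dist < best)
        {
          best = dist;
          bestI = i;
          bestJ = j;
        }
      }
    }

    // Fold cluster bestJ into cluster bestI, then drop bestJ.
    Cluster& a = clusters[bestI];
    const Cluster& b = clusters[bestJ];
    const double wa = static_cast<double>(a.count);
    const double wb = static_cast<double>(b.count);
    for (std::size_t d = 0; d < a.centroid.size(); ++d)
    {
      // Fall back to a plain average if both clusters are empty.
      a.centroid[d] = (wa + wb > 0.0)
          ? (wa * a.centroid[d] + wb * b.centroid[d]) / (wa + wb)
          : 0.5 * (a.centroid[d] + b.centroid[d]);
    }
    a.count += b.count;
    clusters.erase(clusters.begin() + static_cast<std::ptrdiff_t>(bestJ));
  }
}
```

Under these assumptions, a caller would run k-means with
OverclusteredCount(k, factor) clusters, collect the resulting centroids and
cluster sizes, and pass them to MergeToK(clusters, k).  The pairwise search is
quadratic in the number of clusters per merge, which is acceptable for a sketch
but may not match what the removed implementation did.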


