[mlpack-git] master: Remove references to overclustering from tutorial. (2c5b482)

gitdub at big.cc.gt.atl.ga.us
Thu Mar 5 22:00:46 EST 2015


Repository : https://github.com/mlpack/mlpack

On branch  : master
Link       : https://github.com/mlpack/mlpack/compare/904762495c039e345beba14c1142fd719b3bd50e...f94823c800ad6f7266995c700b1b630d5ffdcf40

>---------------------------------------------------------------

commit 2c5b482b78c4f5eb6ef9ea6f78005d678b63a320
Author: Ryan Curtin <ryan at ratml.org>
Date:   Thu Oct 9 18:59:53 2014 +0000

    Remove references to overclustering from tutorial.


>---------------------------------------------------------------

2c5b482b78c4f5eb6ef9ea6f78005d678b63a320
 doc/tutorials/kmeans/kmeans.txt | 50 +++++++----------------------------------
 1 file changed, 8 insertions(+), 42 deletions(-)

diff --git a/doc/tutorials/kmeans/kmeans.txt b/doc/tutorials/kmeans/kmeans.txt
index 21a9071..1cac057 100644
--- a/doc/tutorials/kmeans/kmeans.txt
+++ b/doc/tutorials/kmeans/kmeans.txt
@@ -71,13 +71,11 @@ A list of all the sections this tutorial contains.
    - \ref cli_ex2_kmtut
    - \ref cli_ex3_kmtut
    - \ref cli_ex4_kmtut
-   - \ref cli_ex5_kmtut 
    - \ref cli_ex6_kmtut
  - \ref kmeans_kmtut
    - \ref kmeans_ex1_kmtut
    - \ref kmeans_ex2_kmtut
    - \ref kmeans_ex3_kmtut
-   - \ref kmeans_ex4_kmtut 
    - \ref kmeans_ex5_kmtut
    - \ref kmeans_ex6_kmtut
    - \ref kmeans_ex7_kmtut
@@ -97,6 +95,11 @@ be found by typing
 $ kmeans --help
 @endcode
 
+As of October 2014, support for overclustering has been removed due to bugs and
+lack of usage.  If you were using this support, or are interested in it, please
+file a bug or get in touch with the \b mlpack developers so that the support can
+be re-implemented.
+
 Below are several examples demonstrating simple use of the \c kmeans executable.
 
 @subsection cli_ex1_kmtut Simple k-means clustering
@@ -147,26 +150,6 @@ potentially forever.  The example below sets a maximum of 250 iterations.
 $ kmeans -c 5 -i dataset.csv -v -o assignments.csv -m 250
 @endcode
 
-@subsection cli_ex5_kmtut Setting the overclustering factor
-
-The \b mlpack k-means implementation allows "overclustering", which is when the
-k-means algorithm is run with more than the requested number of clusters.  Upon
-convergence, the clusters with the nearest centroids are merged until only the
-requested number of centroids remain.  This can provide better clustering
-results.  The overclustering factor, specified with \c -O or \c
---overclustering, determines how many more clusters are found than were
-requested.  For instance, with \c k set to 5 and an overclustering factor of 2,
-10 clusters will be found.  Note that the overclustering factor does not need to
-be an integer.
-
-The following code snippet finds 5 clusters, but with an overclustering factor
-of 2.4 (so 12 clusters are found and then merged together to produce 5 final
-clusters).
-
-@code
-$ kmeans -c 5 -O 2.4 -i dataset.csv -v -o assignments.csv
-@endcode
-
 @subsection cli_ex6_kmtut Using Bradley-Fayyad "refined start"
 
 The method proposed by Bradley and Fayyad in their paper "Refining initial
@@ -278,23 +261,6 @@ KMeans<> k(500);
 
 Then you can run \c Cluster() as normal.
 
-@subsection kmeans_ex4_kmtut Setting the overclustering factor
-
-For a description of what overclustering is, see
-@ref cli_ex5_kmtut "the command-line interface tutorial about overclustering".
-
-The overclustering factor, which by default is 1.0 (this indicates that no
-overclustering is happening), is specified in the second argument to the
-constructor.
-
-@code
-// We will keep the default maximum iterations of 1000, but set the
-// overclustering factor to 2.5.
-KMeans<> k(1000, 2.5);
-@endcode
-
-Then you can run \c Cluster() as normal.
-
 @subsection kmeans_ex5_kmtut Setting initial cluster assignments
 
 If you have an initial guess for the cluster assignments for each point, you can
@@ -493,9 +459,9 @@ MahalanobisDistance in the constructor.
 // The initialized Mahalanobis distance.
 extern mlpack::metric::MahalanobisDistance distance;
 
-// We keep the default arguments for the maximum number of iterations and
-// overclustering factor, but pass our instantiated metric.
-KMeans<mlpack::metric::MahalanobisDistance> k(1000, 1.0, distance);
+// We keep the default arguments for the maximum number of iterations, but pass
+// our instantiated metric.
+KMeans<mlpack::metric::MahalanobisDistance> k(1000, distance);
 @endcode
 
 @note
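
For readers who relied on the removed feature, the overclustering step described in the deleted tutorial text (fit more clusters than requested, then merge the clusters with the nearest centroids until the requested number remain) could be approximated outside the library along the following lines. This is only a rough sketch and not the removed mlpack code: the names Cluster, OverclusteredCount and MergeNearest are hypothetical, rounding the cluster count to the nearest integer is an assumption, and merging two clusters by a size-weighted mean of their centroids is also an assumption, since the tutorial does not say how the merge was performed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical types and helpers for illustration; none of this is mlpack API.
struct Cluster
{
  std::vector<double> centroid;  // Mean of the points in the cluster.
  std::size_t count;             // Number of points in the cluster.
};

// Number of clusters to fit for k requested clusters and the given factor.
// Assumption: the product is rounded to the nearest integer.
std::size_t OverclusteredCount(const std::size_t k, const double factor)
{
  return static_cast<std::size_t>(std::round(k * factor));
}

// Squared Euclidean distance between two points of equal dimension.
double SquaredDistance(const std::vector<double>& a,
                       const std::vector<double>& b)
{
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d)
    sum += (a[d] - b[d]) * (a[d] - b[d]);
  return sum;
}

// Repeatedly merge the two clusters whose centroids are closest until only k
// remain.  Assumptions: every cluster holds at least one point, and a merged
// centroid is the size-weighted mean of the two centroids it replaces.
void MergeNearest(std::vector<Cluster>& clusters, const std::size_t k)
{
  assert(k >= 1);
  while (clusters.size() > k)
  {
    // Find the closest pair of centroids.
    std::size_t bestI = 0, bestJ = 1;
    double best = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < clusters.size(); ++i)
    {
      for (std::size_t j = i + 1; j < clusters.size(); ++j)
      {
        const double dist =
            SquaredDistance(clusters[i].centroid, clusters[j].centroid);
        if (dist < best)
        {
          best = dist;
          bestI = i;
          bestJ = j;
        }
      }
    }

    // Fold cluster bestJ into cluster bestI, then drop bestJ.
    Cluster& a = clusters[bestI];
    const Cluster& b = clusters[bestJ];
    const double total = static_cast<double>(a.count + b.count);
    for (std::size_t d = 0; d < a.centroid.size(); ++d)
    {
      a.centroid[d] =
          (a.centroid[d] * a.count + b.centroid[d] * b.count) / total;
    }
    a.count += b.count;
    clusters.erase(clusters.begin() + bestJ);
  }
}
```

Under these assumptions, one would run k-means with OverclusteredCount(k, factor) clusters, build one Cluster per resulting centroid, and call MergeNearest(clusters, k). Point assignments would then need to be recomputed against the merged centroids.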


