[mlpack-git] [mlpack/mlpack] mlpack KMeans class much slower than armadillo kmeans() (#514)
cypro666
notifications at github.com
Tue Jun 28 04:46:36 EDT 2016
In fact, the dual-tree and Pelleg-Moore (the same as in x-means) algorithms are really fast enough!
The current naive k-means implementation is not suited to parallelization with OpenMP; it's “textbook style”, though still fine for small datasets :)
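To make the OpenMP point concrete, here is a minimal sketch of one naive (Lloyd) iteration written so that the assignment loop can be parallelized. It is not mlpack's code: `Point`, `SquaredDistance`, and `LloydIterate` are hypothetical names, it uses plain `std::vector` instead of Armadillo matrices, and it assumes squared Euclidean distance. Each thread accumulates into private sums that are merged once at the end; without `-fopenmp` the pragmas are ignored and the code runs serially.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for one column of the dataset.
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
static double SquaredDistance(const Point& a, const Point& b) {
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d) {
    const double diff = a[d] - b[d];
    sum += diff * diff;
  }
  return sum;
}

// One Lloyd iteration. Fills newCentroids and counts, and returns the
// Euclidean norm of the centroid movement (same quantity Iterate() returns).
double LloydIterate(const std::vector<Point>& points,
                    const std::vector<Point>& centroids,
                    std::vector<Point>& newCentroids,
                    std::vector<std::size_t>& counts) {
  const std::size_t k = centroids.size();
  const std::size_t dim = centroids.empty() ? 0 : centroids[0].size();
  newCentroids.assign(k, Point(dim, 0.0));
  counts.assign(k, 0);

  #pragma omp parallel
  {
    // Thread-private accumulators avoid races on the shared centroids.
    std::vector<Point> localSums(k, Point(dim, 0.0));
    std::vector<std::size_t> localCounts(k, 0);

    #pragma omp for nowait
    for (long i = 0; i < static_cast<long>(points.size()); ++i) {
      // Find the closest centroid for point i.
      std::size_t best = 0;
      double bestDist = std::numeric_limits<double>::max();
      for (std::size_t c = 0; c < k; ++c) {
        const double dist = SquaredDistance(points[i], centroids[c]);
        if (dist < bestDist) {
          bestDist = dist;
          best = c;
        }
      }
      for (std::size_t d = 0; d < dim; ++d)
        localSums[best][d] += points[i][d];
      ++localCounts[best];
    }

    // Merge this thread's partial sums into the shared result.
    #pragma omp critical
    {
      for (std::size_t c = 0; c < k; ++c) {
        for (std::size_t d = 0; d < dim; ++d)
          newCentroids[c][d] += localSums[c][d];
        counts[c] += localCounts[c];
      }
    }
  }

  // Turn sums into means; an empty cluster keeps its old centroid.
  double residual = 0.0;
  for (std::size_t c = 0; c < k; ++c) {
    if (counts[c] > 0) {
      for (std::size_t d = 0; d < dim; ++d)
        newCentroids[c][d] /= static_cast<double>(counts[c]);
    } else {
      newCentroids[c] = centroids[c];
    }
    residual += SquaredDistance(centroids[c], newCentroids[c]);
  }
  return std::sqrt(residual);
}
```

The per-thread buffers cost `k * dim` doubles per thread, which is usually small next to the dataset; the single merge at the end keeps the hot loop free of locks.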
**I changed the body of the NaiveKMeans::Iterate method to call arma::kmeans directly:**
```cpp
template<typename MetricType, typename MatType>
double NaiveKMeans<MetricType, MatType>::Iterate(const arma::mat& centroids,
                                                 arma::mat& newCentroids,
                                                 arma::Col<size_t>& counts)
{
  counts.zeros(centroids.n_cols); // Never used, in fact.

  // Start from the current centroids and let Armadillo refine them;
  // keep_existing makes arma::kmeans use newCentroids as the initial means.
  newCentroids = centroids;
  arma::kmeans(newCentroids, dataset, centroids.n_cols, arma::keep_existing,
      10, false);
  Log::Assert(newCentroids.n_cols == centroids.n_cols);

  // arma::kmeans may run up to 10 internal iterations, so this count is only
  // a rough lower bound on the distance calculations actually performed.
  distanceCalculations += centroids.n_cols * dataset.n_cols;

  // Calculate cluster distortion for this iteration (how far the centroids
  // moved).
  double cNorm = 0.0;
  for (size_t i = 0; i < centroids.n_cols; ++i)
  {
    cNorm += std::pow(metric.Evaluate(centroids.col(i), newCentroids.col(i)),
        2.0);
  }
  distanceCalculations += centroids.n_cols;

  return std::sqrt(cNorm);
}
```
Now it's as fast as Armadillo's.
But this is not a correct, good, or final solution *_*
---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/issues/514#issuecomment-228989205