<p>Are you sure you compiled mlpack without debugging symbols? Here is what I get when compiling mlpack with <code>-DDEBUG=OFF</code> and <code>-DPROFILE=OFF</code>. I used this test program for scikit-learn:</p>
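<p>For reference, a typical out-of-tree build with those flags looks like this (the build directory name is just a placeholder):</p>
<pre><code># from the top of the mlpack source tree
mkdir build && cd build
cmake -DDEBUG=OFF -DPROFILE=OFF ..
make
</code></pre>
<p>If <code>DEBUG</code> was left on, the binaries carry debugging symbols and no optimization, which can easily account for a large slowdown.</p>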
<pre><code>#!/usr/bin/python
import numpy
from sklearn.cluster import MeanShift
from sklearn.cluster import estimate_bandwidth
import time
d = numpy.genfromtxt('/home/ryan/datasets/corel.csv', delimiter=',')
bw = estimate_bandwidth(d, quantile=0.2, n_samples=500)
print(bw)
ms = MeanShift(bandwidth=bw, bin_seeding=True)
t1 = time.time()
ms.fit(d)
t2 = time.time()
print(t2 - t1)
print(len(numpy.unique(ms.labels_)))
</code></pre>
<p>This gave me the following output:</p>
<pre><code>0.430335887828
7.03606009483
1
</code></pre>
<p>So, a bandwidth of 0.430336, it took 7.036 seconds, and we got 1 cluster as a result. Then I ran your mlpack implementation:</p>
<pre><code>$ mean_shift -i ~/datasets/corel.csv -r 0.430335887828 -v -C centers.csv
[INFO ] Loading '/home/ryan/datasets/corel.csv' as CSV data. Size is 32 x 37749.
[INFO ] Performing mean shift clustering...
[INFO ] 46511 node combinations were scored.
[INFO ] 37749 base cases were calculated.
[INFO ] Found 1 centroids.
[WARN ] No extension given with filename ''; type unknown. Save failed.
[INFO ] Saving CSV data to 'centers.csv'.
[INFO ]
[INFO ] Execution parameters:
[INFO ] bandwidth: (Unknown data type - )
[INFO ] centroid_file: centers.csv
[INFO ] help: false
[INFO ] in_place: false
[INFO ] info: ""
[INFO ] inputFile: /home/ryan/datasets/corel.csv
[INFO ] max_iterations: 1000
[INFO ] output_file: ""
[INFO ] radius: 0.430336
[INFO ] verbose: true
[INFO ] version: false
[INFO ]
[INFO ] Program timers:
[INFO ] clustering: 3.681358s
[INFO ] computing_neighbors: 0.009845s
[INFO ] loading_data: 0.459559s
[INFO ] range_search/computing_neighbors: 2.075936s
[INFO ] range_search/tree_building: 0.440392s
[INFO ] saving_data: 0.000118s
[INFO ] total_time: 4.143638s
[INFO ] tree_building: 0.487500s
</code></pre>
<p>So, the mlpack implementation appears to be about twice as fast as the scikit implementation. (I'm using Python 2.7.9 with Debian's <code>python-sklearn</code> 0.15.2-3 package.) I wouldn't be surprised if newer versions of scikit are faster, but either way, the timings I'm getting are drastically different from yours, so maybe there is a configuration issue on your end?</p>
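<p>For what it's worth, both tools are computing the same flat-kernel mean shift, so with equal bandwidth/radius they should find comparable centroids. Here is a minimal NumPy sketch of that procedure; the seeding, convergence tolerance, and merge step are simplified assumptions, not either library's exact algorithm:</p>
<pre><code>import numpy as np

def mean_shift(data, radius, max_iterations=1000, tol=1e-3):
    """Flat-kernel mean shift: repeatedly move each seed to the mean
    of the points within `radius`, then merge converged seeds."""
    centers = data.copy()  # seed one center per point (no bin seeding)
    for _ in range(max_iterations):
        new_centers = np.array([
            data[np.linalg.norm(data - c, axis=1) < radius].mean(axis=0)
            for c in centers
        ])
        done = np.abs(new_centers - centers).max() < tol
        centers = new_centers
        if done:
            break
    # merge centers that converged within `radius` of each other
    unique = []
    for c in centers:
        if all(np.linalg.norm(c - u) >= radius for u in unique):
            unique.append(c)
    return np.array(unique)

# two well-separated Gaussian blobs should yield two centroids
rng = np.random.RandomState(0)
data = np.vstack([rng.randn(50, 2), rng.randn(50, 2) + 8.0])
centers = mean_shift(data, radius=3.0)
print(len(centers))  # -> 2
</code></pre>
<p>This is O(n&#178;) per iteration; the tree-based range search in the mlpack log above is exactly what avoids that quadratic neighbor scan.</p>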
<p>With the covertype dataset and a bandwidth of 1524.6535, scikit takes 157.3325s while mlpack takes 42.159s.</p>
<p style="font-size:small;-webkit-text-size-adjust:none;color:#666;">—<br>Reply to this email directly or <a href="https://github.com/mlpack/mlpack/pull/388#issuecomment-95606901">view it on GitHub</a>.</p>