[mlpack-git] [mlpack] Mean shift clustering (#388)

Ryan Curtin notifications at github.com
Mon Feb 23 14:04:58 EST 2015


I'm sorry, I think I may have over-thought this whole thing a bit.  Consider the more general derivation from before, for any kernel K and the Euclidean distance (I'm ignoring the constants on the front of f(x) for convenience.  You can rederive with the constants if you like):

∇f(x) = ∑ K'(∥ x - x_i ∥) d/dx(∥ x - x_i ∥).
= ∑ K'(∥ x - x_i ∥) d/dx((∥ x - x_i ∥^2)^1/2)
= ∑ K'(∥ x - x_i ∥) (1/2 (∥ x - x_i ∥^2)^(-1/2)) (x - x_i)
= ∑ K'(∥ x - x_i ∥) (1 / (2 ∥ x - x_i ∥^2)) (x - x_i)

Now plug in the Gaussian kernel:

K(∥ x - x_i ∥) = exp(-∥ x - x_i ∥^2 / b^2)

which gives by the chain rule

K'(∥ x - x_i ∥) = exp(-∥ x - x_i ∥^2 / b^2) * -b^-2 * 2 (∥ x - x_i ∥^2)

Note that last term, 2 (∥ x - x_i ∥^2).  When we plug that into ∇f(x) above, it cancels with the second term and we get the familiar

∇f(x) = K'(∥ x - x_i ∥) (x - x_i)

So there isn't actually a need for knowing whether or not the kernel uses a squared distance.  We can use this calculation always:

∇f(x) = ∑ K'(∥ x - x_i ∥) (1 / (2 ∥ x - x_i ∥^2)) (x - x_i)

and if the kernel _does_ use the squared distance, then there will exist the term 2 (∥ x - x_i ∥^2) in its gradient.  The `Gradient()` function should be inlined with the rest of the expression, so a smart compiler should be able to cancel out the 2 (∥ x - x_i ∥^2) and the (1 / (2 ∥ x - x_i ∥^2)).  And even if not, this is a few floating point calculations and not an unnecessary distance calculation (because you need to calculate ∥ x - x_i ∥ no matter what), so I don't think it will be incredibly computationally costly.  I'd be interested in comparing the runtime of the more general implementation with your Gaussian-specific implementation.  Ideally, the speed should be the same.

Does that make sense?  I'm sorry for the confusion.

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/388#issuecomment-75608745
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.cc.gatech.edu/pipermail/mlpack-git/attachments/20150223/51595e83/attachment.html>


More information about the mlpack-git mailing list