[mlpack-git] [mlpack] Mean shift clustering (#388)
Ryan Curtin
notifications at github.com
Mon Feb 23 14:04:58 EST 2015
I'm sorry, I think I may have over-thought this whole thing a bit. Consider the more general derivation from before, for any kernel K and the Euclidean distance (I'm ignoring the constants on the front of f(x) for convenience. You can rederive with the constants if you like):
∇f(x) = ∑ K'(∥ x - x_i ∥) d/dx(∥ x - x_i ∥).
= ∑ K'(∥ x - x_i ∥) d/dx((∥ x - x_i ∥^2)^1/2)
= ∑ K'(∥ x - x_i ∥) (1/2 (∥ x - x_i ∥^2)^(-1/2)) (x - x_i)
= ∑ K'(∥ x - x_i ∥) (1 / (2 ∥ x - x_i ∥^2)) (x - x_i)
Now plug in the Gaussian kernel:
K(∥ x - x_i ∥) = exp(-∥ x - x_i ∥^2 / b^2)
which gives by the chain rule
K'(∥ x - x_i ∥) = exp(-∥ x - x_i ∥^2 / b^2) * -b^-2 * 2 (∥ x - x_i ∥^2)
Note that last term, 2 (∥ x - x_i ∥^2). When we plug that into ∇f(x) above, it cancels with the second term and we get the familiar
∇f(x) = K'(∥ x - x_i ∥) (x - x_i)
So there isn't actually a need for knowing whether or not the kernel uses a squared distance. We can use this calculation always:
∇f(x) = ∑ K'(∥ x - x_i ∥) (1 / (2 ∥ x - x_i ∥^2)) (x - x_i)
and if the kernel _does_ use the squared distance, then there will exist the term 2 (∥ x - x_i ∥^2) in its gradient. The `Gradient()` function should be inlined with the rest of the expression, so a smart compiler should be able to cancel out the 2 (∥ x - x_i ∥^2) and the (1 / (2 ∥ x - x_i ∥^2)). And even if not, this is a few floating point calculations and not an unnecessary distance calculation (because you need to calculate ∥ x - x_i ∥ no matter what), so I don't think it will be incredibly computationally costly. I'd be interested in comparing the runtime of the more general implementation with your Gaussian-specific implementation. Ideally, the speed should be the same.
Does that make sense? I'm sorry for the confusion.
---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/388#issuecomment-75608745
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.cc.gatech.edu/pipermail/mlpack-git/attachments/20150223/51595e83/attachment.html>
More information about the mlpack-git
mailing list