[mlpack] GSoC 2015: Fast k-centers algorithm & implementation
Ryan Curtin
ryan at ratml.org
Wed Mar 4 10:00:04 EST 2015
On Wed, Mar 04, 2015 at 03:23:55AM -0300, Lukas Zorich wrote:
> Hi Ryan,
>
> First of all, it's too bad that mlpack hasn't been accepted into GSoC'15 :(
Indeed -- once I learn from Google on Friday why mlpack wasn't accepted,
I'll make a more detailed post to the mailing list.
> I spent a lot of time reading about dual-tree algorithms and diving into
> mlpack code. The thing is that I'm still interested in dual-tree
> algorithms, and I also want to sharpen my C++ skills, so I still want to
> contribute to mlpack :)
Great! :)
> I thought that I could implement KDE (
> https://github.com/mlpack/mlpack/issues/150) to start. I will use the
> various mlpack dual-tree algorithms already implemented as a guide. Do you
> think this is achivable for me?
Yes, KDE should be pretty straightforward. You can use the rules in
``Tree-Independent Dual-Tree Algorithms'', Algorithms 9 and 10 as a
simple implementation. That's relative-value approximate kernel density
estimation, which ensures that the estimate f*(x) is such that
| f*(x) - f(x) | < \epsilon
for every point x in the dataset.
There are more complicated algorithms in other papers, so if you want to
try those, feel free, but if they're not in a tree-independent form
already you may have to do some thinking. The one in ``Nonparametric
density estimation: towards computational tractability'' by Alex Gray
and Andrew Moore is especially crazy. I don't know how well it works in
practice.
> I will let you know if I have questions :)
Sure, I'm more than happy to help.
--
Ryan Curtin | "You can think about it... but don't do it."
ryan at ratml.org | - Sheriff Justice
More information about the mlpack
mailing list