[mlpack] GSOC 2016 Aspirant - Parallel Stochastic Optimisation Methods
Ryan Curtin
ryan at ratml.org
Sun Mar 6 13:40:27 EST 2016
On Sat, Mar 05, 2016 at 07:49:54AM +0530, Aditya Sharma wrote:
> Hi Ryan,
>
> I agree that just implementing Hogwild! would be pretty trivial. Adding
> support for distributed computing to ml-pack along with Hogwild! for
> multithreading on each node, on the other hand, could be a much more
> interesting project.
>
> Your mention of Spark got me wondering whether there are any stable
> frameworks for running C++ programs on Hadoop. Sure enough, after a
> quick internet search, I came across MR4C
> <http://google-opensource.blogspot.in/2015/02/mapreduce-for-c-run-native-code-in.html>,
> developed by Google. From the example programs that I have seen, it appears
> to have a very clean interface, and would surely help to keep the final
> code simple.
>
> I have not explored it in detail, but I think it could be a possible option
> to add support for distributed computing to ml-pack.
>
> The same goes for CMU's Parameter Server model <http://parameterserver.org>.
> I have personally used it, so I know that it offers a really good speedup
> and the methods are pretty straightforward, so the code remains simple.
>
> Do check these out and tell me if any of them is an area worthy of
> exploration for ml-pack.
Hi Aditya,
Like I said, the big issue with either of these frameworks is that
mlpack does not have any support for distributed matrices and is not
traditionally a distributed computing library. So that would be a
prerequisite for even considering a framework like this, unfortunately.
Thanks,
Ryan
--
Ryan Curtin | "I can't believe you like money too. We should
ryan at ratml.org | hang out." - Frito