<div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Hi Ryan,</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">I agree that just implementing Hogwild! would be a pretty trivial. Adding support for distributed computing to ml-pack along with Hogwild! for multithreading on each node, on the other hand, could be a much more interesting project.</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Your mention of Spark got me thinking, if there were any stable frameworks for running C++ programs using the hadoop system. And sure enough after a quick internet search, I came across <a href="http://google-opensource.blogspot.in/2015/02/mapreduce-for-c-run-native-code-in.html">MR4C</a> developed by Google. From the example programs that I have seen, it appears to have a very clean interface, and would surely help to keep the final code simple.</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">I have not explored it in detail, but I think it could be a possible option to add support for distributed computing to ml-pack.</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">The same thing for<a href="http://parameterserver.org"> CMU's Parameter Server model</a>. I have personally used it, so I know that it offers really good speed up and the methods are pretty straightforward, so the code remains simple.</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Do check these out and tell me if any of them is an area worthy of exploration for ml-pack.</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Thank You!</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Best,</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Aditya</div></div><div hspace="streak-pt-mark" style="max-height:1px"><img style="width:0px;max-height:0px;overflow:hidden" src="https://mailfoogae.appspot.com/t?sender=aYWRpc2hhcm1hMDc1QGdtYWlsLmNvbQ%3D%3D&type=zerocontent&guid=7dfac8da-c9d9-44da-86fa-eb4ab775efb0"><font color="#ffffff" size="1">ᐧ</font></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Mar 5, 2016 at 7:16 AM, Ryan Curtin <span dir="ltr"><<a href="mailto:ryan@ratml.org" target="_blank">ryan@ratml.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Thu, Mar 03, 2016 at 11:11:05PM +0530, Aditya Sharma wrote:<br>
> Hi Ryan,
>
> I read the Hogwild! paper, which to my understanding gives theoretical
> convergence guarantees for parallelizing SGD without worrying about
> locking, etc., in a shared-memory model, provided the problem is
> sparse enough and the updates happen atomically.
>
> I also went through your implementations of SGD and mini-batch SGD. I
> think it would be fairly easy to OpenMP-ize the current
> implementations along the lines of Hogwild!.
>
> But, in my opinion, if we just use multi-threading, mlpack might not
> be very attractive to researchers working with truly large-scale data.
>
> I think it would be a good idea to add support for GPU processing to
> the existing optimizers. I have prior experience with CUDA, and I
> think I would be able to add a CUDA version of Hogwild! built on the
> existing SGD implementations in mlpack over the summer, so that
> researchers with little knowledge of CUDA could directly use mlpack to
> speed up their code without worrying about what's under the hood (much
> like what Theano does for Python).
>
> Another direction could be to add support for distributed computing,
> either by linking mlpack to the Parameter Server from CMU
> (http://parameterserver.org) or by integrating the MPI-based Parameter
> Server that I've built, and parallelizing the existing SGD and
> mini-batch code in mlpack along the lines of Downpour SGD (similar to
> the TensorFlow and DistBelief systems developed by Google).
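
For reference, a rough sketch of what a Downpour-style worker loop could
look like against a ps-lite-style KVWorker interface. The Push()/Pull()
calls follow the parameter server project's key/value client API, but
ComputeGradient() and the loop itself are purely illustrative, not
mlpack or parameterserver.org code:

#include <ps/ps.h>   // Parameter server client API (ps-lite style).
#include <vector>

// Placeholder: compute a mini-batch gradient on this worker's data
// shard at the given weights.  Purely hypothetical.
void ComputeGradient(const std::vector<float>& weights,
                     std::vector<float>& gradient);

void WorkerLoop(ps::KVWorker<float>& kv,
                const std::vector<ps::Key>& keys,
                const size_t numIterations)
{
  std::vector<float> weights(keys.size());
  std::vector<float> gradient(keys.size());

  for (size_t t = 0; t < numIterations; ++t)
  {
    // Pull the latest weights from the server; Wait() blocks on the
    // asynchronous request.
    kv.Wait(kv.Pull(keys, &weights));

    // Compute a local gradient on this worker's shard of the data.
    ComputeGradient(weights, gradient);

    // Push the gradient asynchronously; the server applies the update.
    // The staleness between Pull() and Push() is what makes this
    // "Downpour"-style rather than synchronous SGD.
    kv.Push(keys, gradient);
  }
}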
>
> The distributed implementation would be a bit more complicated, but I
> think I should be able to do it over the summer, as that's exactly
> what my research currently focuses on.
>
> I would love to know your thoughts and suggestions.

Hi Aditya,

We could definitely use OpenMP on the current SGD implementations, but
we would have to be careful to ensure that this wouldn't modify the
result. Hogwild! is almost certainly easiest to implement in OpenMP.
(Actually, it's sufficiently simple that a Hogwild! implementation alone
would be too little work for a GSoC project, I think, but it could
definitely be a component of a larger project.)
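
For concreteness, here is a minimal sketch of what a Hogwild!-style
OpenMP loop could look like on top of mlpack's decomposable-function
interface (NumFunctions() plus a per-point Gradient() call). This is an
illustrative sketch, not mlpack's implementation; it omits shuffling,
step-size decay, and convergence checks:

#include <mlpack/core.hpp>
#include <omp.h>

// Hogwild!-style SGD sketch: all threads update the shared iterate
// without any locking; occasional races are tolerated by design.
// FunctionType is assumed to follow mlpack's decomposable-function
// interface: size_t NumFunctions(), and
// void Gradient(const arma::mat&, size_t, arma::mat&).
template<typename FunctionType>
void HogwildSGD(FunctionType& function,
                arma::mat& iterate,
                const double stepSize,
                const size_t maxIterations)
{
  const size_t numFunctions = function.NumFunctions();

  #pragma omp parallel
  {
    // Per-thread gradient buffer; only `iterate` is shared.
    arma::mat gradient(iterate.n_rows, iterate.n_cols);

    #pragma omp for
    for (size_t i = 0; i < maxIterations; ++i)
    {
      // Visit one component function (one data point) per iteration.
      function.Gradient(iterate, i % numFunctions, gradient);

      // Lock-free update of the shared iterate (the Hogwild! part).
      iterate -= stepSize * gradient;
    }
  }
}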

The problem with CUDA is that you would have to ship the data back and
forth to the GPU every iteration, because the optimizer is separate from
the function it is optimizing. The optimizer only makes calls to
function.Evaluate() and function.Gradient(), and it's not reasonable to
expect that every Evaluate() and Gradient() implementation will be
written for GPUs. This means that the only step you could realistically
put on a GPU is the update step, and given the huge overhead of the
communication cost, I'm doubtful that we'd see any speedup.
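
To make that call pattern concrete, the optimizer's inner loop has
roughly the shape below (a simplified sketch, not the exact mlpack
source). Every iteration calls back into user-supplied CPU code, so a
GPU-resident iterate would need a device-to-host round trip per
iteration:

template<typename FunctionType>
void SerialSGD(FunctionType& function, arma::mat& iterate,
               const double stepSize, const size_t maxIterations)
{
  arma::mat gradient(iterate.n_rows, iterate.n_cols);

  for (size_t i = 0; i < maxIterations; ++i)
  {
    // User-defined, CPU-side callback; the optimizer cannot assume a
    // CUDA implementation of it exists.
    function.Gradient(iterate, i % function.NumFunctions(), gradient);

    // The only optimizer-owned work is this update, which is far too
    // cheap to amortize a host-to-device transfer every iteration.
    iterate -= stepSize * gradient;
  }
}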

It's a very hard challenge to support GPUs while still keeping the
algorithms simple enough to be maintained.

I think the same thing is true for MPI; code written for MPI can end up
being very complex and hard to maintain. Here we have another problem:
mlpack has no support for distributed matrices or distributed problems
of any form (and in general isn't aimed at that use case; there may be
better tools for it, like Spark, for instance).
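
Even the simplest synchronous case illustrates what would be involved.
The sketch below shows one distributed SGD step that averages per-rank
gradients with MPI_Allreduce, assuming each rank already holds a shard
of the data; everything around it (data distribution, keeping state in
sync, failure handling) is what makes real MPI code hard to maintain:

#include <mpi.h>
#include <armadillo>

// One synchronous distributed SGD step: every rank computes a gradient
// on its local data shard, the gradients are summed across ranks, and
// all ranks apply the same averaged update.
void DistributedSGDStep(arma::mat& iterate,
                        arma::mat& localGradient,
                        const double stepSize)
{
  int worldSize;
  MPI_Comm_size(MPI_COMM_WORLD, &worldSize);

  // Sum the local gradients across all ranks, in place.
  MPI_Allreduce(MPI_IN_PLACE, localGradient.memptr(),
                (int) localGradient.n_elem, MPI_DOUBLE, MPI_SUM,
                MPI_COMM_WORLD);

  // Average and apply the same update on every rank.
  iterate -= (stepSize / worldSize) * localGradient;
}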

I don't mean to say these ideas are impossible: what you've suggested is
a set of really great improvements and ideas. But we would need to do a
lot of thinking to figure out how they would fit into the core
abstractions of mlpack, how we can preserve the basic interface we have
now, and (maybe most importantly) how we can keep the code simple.

Thanks,

Ryan

--
Ryan Curtin    | "Reprogram him!"
ryan@ratml.org |   - Master Control Program

--
Aditya Sharma
Fourth Year Undergraduate
Department of Electrical and Electronics Engineering,
Birla Institute of Technology and Science, Pilani
Rajasthan, India - 333031

WWW: http://adityasharma.space
E-mail: adisharma075@gmail.com, f2012075@pilani.bits-pilani.ac.in
LinkedIn: https://www.linkedin.com/in/adityabits