<div dir="ltr"><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Hi Ryan,</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">I agree that just implementing Hogwild! would be a pretty trivial. Adding support for distributed computing to ml-pack along with Hogwild! for multithreading on each node, on the other hand, could be a much more interesting project.</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Your mention of Spark got me thinking, if there were any stable frameworks for running C++ programs using the hadoop system. And sure enough after a quick internet search, I came across <a href="http://google-opensource.blogspot.in/2015/02/mapreduce-for-c-run-native-code-in.html">MR4C</a> developed by Google. From the example programs that I have seen, it appears to have a very clean interface, and would surely help to keep the final code simple.</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">I have not explored it in detail, but I think it could be a possible option to add support for distributed computing to ml-pack.</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">The same thing for<a href="http://parameterserver.org"> CMU's Parameter Server model</a>. I have personally used it, so I know that it offers really good speed up and the methods are pretty straightforward, so the code remains simple.</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Do check these out and tell me if any of them is an area worthy of exploration for ml-pack.</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Thank You!</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000"><br></div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Best,</div><div class="gmail_default" style="font-family:verdana,sans-serif;color:#000000">Aditya</div></div><div hspace="streak-pt-mark" style="max-height:1px"><img style="width:0px;max-height:0px;overflow:hidden" src="https://mailfoogae.appspot.com/t?sender=aYWRpc2hhcm1hMDc1QGdtYWlsLmNvbQ%3D%3D&type=zerocontent&guid=7dfac8da-c9d9-44da-86fa-eb4ab775efb0"><font color="#ffffff" size="1">ᐧ</font></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Mar 5, 2016 at 7:16 AM, Ryan Curtin <span dir="ltr"><<a href="mailto:ryan@ratml.org" target="_blank">ryan@ratml.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Thu, Mar 03, 2016 at 11:11:05PM +0530, Aditya Sharma wrote:<br>
> Hi Ryan,
>
> I read the Hogwild! paper, which to my understanding gives theoretical
> convergence guarantees for parallelizing SGD without worrying about
> locking, etc., in a shared-memory model, provided the problem is
> sparse enough and the updates happen atomically.
>
> I also went through your implementations of SGD and mini-batch SGD. I
> think it would be fairly easy to OpenMP-ize the current
> implementations along the lines of Hogwild!.
>
> But, in my opinion, if we just use multi-threading, mlpack might not
> be very attractive to researchers working with truly large-scale data.
>
> I think it would be a good idea to add support for GPU processing to
> the existing optimizers. I have prior experience with CUDA, and I
> think I would be able to add a CUDA version of Hogwild! built on the
> existing SGD implementations in mlpack over the summer, so that
> researchers with little knowledge of CUDA could directly use mlpack to
> speed up their code without worrying about what's under the hood (much
> like what Theano does for Python).
>
> Another direction could be to add support for distributed computing,
> either by linking mlpack to the Parameter Server from CMU
> (http://parameterserver.org) or by integrating the MPI-based Parameter
> Server that I've built, and parallelizing the existing SGD and
> mini-batch code in mlpack along the lines of Downpour SGD (similar to
> the TensorFlow and DistBelief systems developed by Google).
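
For reference, a rough sketch of what a Downpour-style worker loop could
look like against a ps-lite-style KVWorker interface. The Push()/Pull()
calls follow the parameter server project's key/value client API, but
ComputeGradient() and the loop itself are purely illustrative, not
mlpack or parameterserver.org code:

#include <ps/ps.h>   // Parameter server client API (ps-lite style).
#include <vector>

// Placeholder: compute a mini-batch gradient on this worker's data
// shard at the given weights.  Purely hypothetical.
void ComputeGradient(const std::vector<float>& weights,
                     std::vector<float>& gradient);

void WorkerLoop(ps::KVWorker<float>& kv,
                const std::vector<ps::Key>& keys,
                const size_t numIterations)
{
  std::vector<float> weights(keys.size());
  std::vector<float> gradient(keys.size());

  for (size_t t = 0; t < numIterations; ++t)
  {
    // Pull the latest weights from the server; Wait() blocks on the
    // asynchronous request.
    kv.Wait(kv.Pull(keys, &weights));

    // Compute a local gradient on this worker's shard of the data.
    ComputeGradient(weights, gradient);

    // Push the gradient asynchronously; the server applies the update.
    // The staleness between Pull() and Push() is what makes this
    // "Downpour"-style rather than synchronous SGD.
    kv.Push(keys, gradient);
  }
}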
>
> The distributed implementation would be a bit more complicated, but I
> think I should be able to do it over the summer, as that's exactly
> what my research currently focuses on.
>
> I would love to know your thoughts and suggestions.

Hi Aditya,

We could definitely use OpenMP on the current SGD implementations, but
we would have to be careful to ensure that this wouldn't modify the
result. Hogwild! is almost certainly easiest to implement in OpenMP.
(Actually, it's sufficiently simple that a Hogwild! implementation alone
would be too little work for a GSoC project, I think, but it could
definitely be a component of a larger project.)
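
For concreteness, here is a minimal sketch of what a Hogwild!-style
OpenMP loop could look like on top of mlpack's decomposable-function
interface (NumFunctions() plus a per-point Gradient() call). This is an
illustrative sketch, not mlpack's implementation; it omits shuffling,
step-size decay, and convergence checks:

#include <mlpack/core.hpp>
#include <omp.h>

// Hogwild!-style SGD sketch: all threads update the shared iterate
// without any locking; occasional races are tolerated by design.
// FunctionType is assumed to follow mlpack's decomposable-function
// interface: size_t NumFunctions(), and
// void Gradient(const arma::mat&, size_t, arma::mat&).
template<typename FunctionType>
void HogwildSGD(FunctionType& function,
                arma::mat& iterate,
                const double stepSize,
                const size_t maxIterations)
{
  const size_t numFunctions = function.NumFunctions();

  #pragma omp parallel
  {
    // Per-thread gradient buffer; only `iterate` is shared.
    arma::mat gradient(iterate.n_rows, iterate.n_cols);

    #pragma omp for
    for (size_t i = 0; i < maxIterations; ++i)
    {
      // Visit one component function (one data point) per iteration.
      function.Gradient(iterate, i % numFunctions, gradient);

      // Lock-free update of the shared iterate (the Hogwild! part).
      iterate -= stepSize * gradient;
    }
  }
}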

The problem with CUDA is that you would have to ship the data back and
forth to the GPU every iteration, because the optimizer is separate from
the function it is optimizing. The optimizer only makes calls to
function.Evaluate() and function.Gradient(), and it's not reasonable to
expect that every Evaluate() and Gradient() implementation will be
written for GPUs. This means that the only step you could realistically
put on a GPU is the update step, and given the huge overhead of the
communication cost, I'm doubtful that we'd see any speedup.
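
To make that call pattern concrete, the optimizer's inner loop has
roughly the shape below (a simplified sketch, not the exact mlpack
source). Every iteration calls back into user-supplied CPU code, so a
GPU-resident iterate would need a device-to-host round trip per
iteration:

template<typename FunctionType>
void SerialSGD(FunctionType& function, arma::mat& iterate,
               const double stepSize, const size_t maxIterations)
{
  arma::mat gradient(iterate.n_rows, iterate.n_cols);

  for (size_t i = 0; i < maxIterations; ++i)
  {
    // User-defined, CPU-side callback; the optimizer cannot assume a
    // CUDA implementation of it exists.
    function.Gradient(iterate, i % function.NumFunctions(), gradient);

    // The only optimizer-owned work is this update, which is far too
    // cheap to amortize a host-to-device transfer every iteration.
    iterate -= stepSize * gradient;
  }
}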

It's a very hard challenge to support GPUs while still keeping the
algorithms simple enough to be maintained.

I think the same thing is true for MPI; code written for MPI can end up
being very complex and hard to maintain. Here we have another problem:
mlpack has no support for distributed matrices or distributed problems
of any form (and in general isn't aimed at that use case; there may be
better tools for it, like Spark, for instance).
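
Even the simplest synchronous case illustrates what would be involved.
The sketch below shows one distributed SGD step that averages per-rank
gradients with MPI_Allreduce, assuming each rank already holds a shard
of the data; everything around it (data distribution, keeping state in
sync, failure handling) is what makes real MPI code hard to maintain:

#include <mpi.h>
#include <armadillo>

// One synchronous distributed SGD step: every rank computes a gradient
// on its local data shard, the gradients are summed across ranks, and
// all ranks apply the same averaged update.
void DistributedSGDStep(arma::mat& iterate,
                        arma::mat& localGradient,
                        const double stepSize)
{
  int worldSize;
  MPI_Comm_size(MPI_COMM_WORLD, &worldSize);

  // Sum the local gradients across all ranks, in place.
  MPI_Allreduce(MPI_IN_PLACE, localGradient.memptr(),
                (int) localGradient.n_elem, MPI_DOUBLE, MPI_SUM,
                MPI_COMM_WORLD);

  // Average and apply the same update on every rank.
  iterate -= (stepSize / worldSize) * localGradient;
}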

I don't mean to say these ideas are impossible: what you've suggested is
a set of really great improvements and ideas. But we would need to do a
lot of thinking to figure out how they would fit into the core
abstractions of mlpack, how we can preserve the basic interface we have
now, and (maybe most importantly) how we can keep the code simple.

Thanks,

Ryan

--
Ryan Curtin    | "Reprogram him!"
ryan@ratml.org |   - Master Control Program

--
Aditya Sharma
Fourth Year Undergraduate
Department of Electrical and Electronics Engineering,
Birla Institute of Technology and Science, Pilani
Rajasthan, India - 333031

WWW: http://adityasharma.space
E-mail: adisharma075@gmail.com, f2012075@pilani.bits-pilani.ac.in
LinkedIn: https://www.linkedin.com/in/adityabits