<p>Don't worry about the Travis failure; that seems to come from some other test whose tolerance needs adjusting.</p>


<p>The impression I am getting, and correct me if I am wrong, is that you agree it would be nice to have only one implementation of <code>ExtractSplits()</code> that had, e.g., <code>MatType</code> as a template parameter, but that this is not possible due to the current drawbacks of the Armadillo sparse matrix implementation.  I agree that there are definitely problems with the Armadillo CSC implementation, so for now I think it's reasonable to have two versions of <code>ExtractSplits()</code>, but in the long run I think we should aim in the following directions:</p>


<ul>

<li><p>Reimplement <code>SpMat</code> as a coordinate list, because I have not found any cases where CSC gives a significant speedup.  Coordinate lists also make a lot of non-linear-algebraic operations, like sorting and iterating, far easier.  This is something I will do; I am just not sure when I'll have the time.</p></li>

<li><p>Get better sparse matrix file reading in place (and submitted upstream to Armadillo when possible).  I know that Armadillo has coordinate list file loading support, but there is currently no way to use it from any of the mlpack command-line programs.  Maybe that is worth looking into too.  What formats are you thinking of?  The more I think about it, the following <em>should</em> work:</p></li>

</ul>


<pre><code>arma::sp_mat matrix;
data::Load("coordinate_list.csv", matrix);
</code></pre>


<ul>

<li><p>Take your <code>sort()</code> implementation and submit it upstream to Armadillo.  Would you like to do this?  I can help.  Basically we'll just need to write tests for it and then send it to Conrad, who will likely look it over and include it.</p></li>

</ul>
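
<p>To illustrate the coordinate list point above, here is a rough sketch of why sorting gets simpler in that layout.  It uses only the standard library; <code>CooEntry</code> and <code>SortByValue()</code> are hypothetical names made up for this example and are not part of Armadillo or mlpack.  Each nonzero carries its own row and column, so a plain <code>std::sort</code> can reorder entries without rebuilding any column pointers:</p>

<pre><code>#include &lt;algorithm&gt;
#include &lt;cstddef&gt;
#include &lt;vector&gt;

// Hypothetical coordinate-list entry: one nonzero and its position.
// This is an illustration only, not an Armadillo type.
struct CooEntry
{
  std::size_t row;
  std::size_t col;
  double value;
};

// Sort the nonzeros in ascending order of value.  Every entry keeps its own
// (row, col), so no index structure has to be repaired afterwards.
void SortByValue(std::vector&lt;CooEntry&gt;&amp; entries)
{
  std::sort(entries.begin(), entries.end(),
      [](const CooEntry&amp; a, const CooEntry&amp; b) { return a.value &lt; b.value; });
}
</code></pre>

<p>Iterating over the nonzeros is then just a loop over the vector.  The tradeoff, under this layout, is that column-wise access is no longer cheap unless the entries are kept sorted by column.</p>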


<p>Are you otherwise finished with this PR?  If so, I'll go ahead and merge it.  Thanks again for your time on this; it is a definite improvement.</p>




