<p>The problem with CSC for this task is that we want in-row iteration, and we <em>need</em> each step of it to be O(1), which CSC does not give us. At the same time we need a <em>swap_cols</em> operation, which would be much slower (I guess) if we worked on the transposed matrix. If we had dual indices, so that we could "walk" in both directions fast, that would be fine. And, generally, the matrix that I'm testing with is pretty sparse as well: approx. 99.995% of it is empty.</p>
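<p>To illustrate the asymmetry, here is a minimal sketch of CSC storage built on plain <code>std::vector</code>. The <code>Csc</code> struct and both helper functions are hypothetical names made up for this example; this is not Armadillo's <code>SpMat</code> code, only the same layout idea. Walking a column is a direct slice, while this naive row walk has to look through the stored entries of every column.</p>

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, minimal CSC (compressed sparse column) container used only
// for illustration; it is not Armadillo's SpMat.
struct Csc {
  std::size_t n_rows, n_cols;
  std::vector<double> values;        // nonzero values, stored column by column
  std::vector<std::size_t> row_idx;  // row index of each stored value
  std::vector<std::size_t> col_ptr;  // column j spans [col_ptr[j], col_ptr[j + 1])
};

// Walking a column touches only that column's stored entries.
std::vector<double> ColumnNonzeros(const Csc& m, std::size_t col) {
  return std::vector<double>(m.values.begin() + m.col_ptr[col],
                             m.values.begin() + m.col_ptr[col + 1]);
}

// Walking a row (naively) has to search every column for the row index.
std::vector<double> RowNonzeros(const Csc& m, std::size_t row) {
  std::vector<double> out;
  for (std::size_t j = 0; j < m.n_cols; ++j)
    for (std::size_t k = m.col_ptr[j]; k < m.col_ptr[j + 1]; ++k)
      if (m.row_idx[k] == row) out.push_back(m.values[k]);
  return out;
}
```

<p>For a 3x3 matrix with rows (1 0 2), (0 0 3), (4 0 0), the arrays are <code>values = {1, 4, 2, 3}</code>, <code>row_idx = {0, 2, 0, 1}</code> and <code>col_ptr = {0, 2, 2, 4}</code>. A second, row-oriented index over the same entries is what I mean by "dual indices".</p>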


<p>The benefit of <code>ExtractSplit</code> is that for dense matrices it sorts in place, which is not the case with <code>arma::sort</code>. Another thing is that even with <code>row_col_iterator</code> you need to know that you're in a sparse matrix: if the first entry you get is not at the beginning, you need to <em>fake</em> a previous value of zero to get a split between them. The same applies if you happen to jump over indices (with sorted sparse rows/cols the good thing is that all the emptiness is in a single slot, so you have one jump at most), etc. So split extraction can't be generic. That's why I've chosen to make the extraction of split points custom and the iteration over them generic. I'm a big enemy of duplicated code too! :-) Also, I tried avoiding the <code>submat</code> call and iterating directly over the given region to extract splits; it turned out to be <em>way</em> slower for sparse matrices. So now the most expensive step is <code>submat</code>.</p>
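<p>The "fake a zero" point can be sketched roughly like this. <code>SplitCandidates</code> is a hypothetical helper written for this comment, not the code in the PR, and it assumes a split candidate is simply the midpoint between two consecutive distinct values. It receives only the stored entries of one dimension plus the total number of points, and inserts the implicit zeros as a single run at the right place in the sorted order.</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: midpoints between consecutive distinct values of one
// dimension, given only its stored (nonzero) entries and the total number of
// points. All implicit zeros form a single run in the sorted order.
std::vector<double> SplitCandidates(std::vector<double> nonzeros,
                                    std::size_t n_points) {
  std::sort(nonzeros.begin(), nonzeros.end());

  std::vector<double> splits;
  bool have_prev = false;
  double prev = 0.0;
  // If every point is stored, there is no implicit zero to fake.
  bool zero_emitted = nonzeros.size() >= n_points;

  // Emit a midpoint whenever the value changes.
  auto visit = [&](double v) {
    if (have_prev && v != prev) splits.push_back((prev + v) / 2.0);
    prev = v;
    have_prev = true;
  };

  for (double v : nonzeros) {
    // Fake the zero run just before the first positive stored value.
    if (!zero_emitted && v > 0.0) {
      visit(0.0);
      zero_emitted = true;
    }
    visit(v);
  }
  // All stored values were non-positive: the zero run comes last.
  if (!zero_emitted) visit(0.0);
  return splits;
}
```

<p>With stored values <code>{3, -2, 5}</code> out of 6 points, the sorted sequence is effectively -2, 0, 3, 5, so the candidates are -1, 1.5 and 4. A dense vector has no such gap, which is why the extraction step differs per matrix type while the loop over the resulting candidates can stay shared.</p>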


<p>For sparse sorting: wouldn't it be best if you made another branch here, in the master repo, and I made a pull request for the <code>SpMat</code> sort to that branch rather than to <code>master</code>? Or you can get it directly from my repo; it is in the <code>feature/sparse_sort</code> branch. Whichever is easier for you.</p>


<p>And finally, Travis CI is failing on some other tests that are not DET-related :-)</p>




