<p>In <a href="https://github.com/mlpack/mlpack/pull/749#discussion_r75926330">src/mlpack/methods/lsh/lshmodel_impl.hpp</a>:</p>
<pre style='color:#555'>&gt; +  maxKValue = k;
&gt; +
&gt; +  // Save pointer to training set.
&gt; +  this-&gt;referenceSet = &amp;referenceSet;
&gt; +
&gt; +  // Step 1. Select a random sample of the dataset. We will work with only that
&gt; +  // sample.
&gt; +  arma::vec sampleHelper(referenceSet.n_cols, arma::fill::randu);
&gt; +
&gt; +  // Keep a sample of the dataset: We have uniformly random numbers in [0, 1],
&gt; +  // so we expect about N*sampleRate of them to be in [0, sampleRate).
&gt; +  arma::mat sampleSet = referenceSet.cols(
&gt; +        arma::find(sampleHelper &lt; sampleRate));
&gt; +  // Shuffle to be impartial (in case dataset is sorted in some way).
&gt; +  sampleSet = arma::shuffle(sampleSet);
&gt; +  const size_t numSamples = sampleSet.n_cols; // Points in sampled set.
</pre>
<p>Are you sampling with or without replacement?  If you're sampling without replacement (I don't think that's the case based on the code here), you can use <code>math::ObtainDistinctSamples()</code> from somewhere in <code>core/math/</code>.  Otherwise, it might be better to keep only a list of the sampled indices instead of extracting those columns from the original matrix.  Later, you can use that vector of indices to create a non-contiguous matrix subview, like this:</p>

<pre><code>extern arma::uvec indices; // This has already been filled with stuff.
extern arma::mat dataset; // This is our dataset.
dataset.cols(indices); // Returns all the columns we're interested in.
</code></pre>
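
<p>As a rough sketch of the "keep a list of indices" idea, the helper below picks column indices with the same per-column thresholding used in the quoted diff, but returns only the indices and copies no data.  The name SampleIndices and its parameters are hypothetical and not part of mlpack; it uses only the standard library so it can be compiled without Armadillo.  Note that one random draw per column can select each column at most once.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not an mlpack function): keep each column index in
// [0, numCols) independently with probability sampleRate.  An index can never
// appear twice, and the sample size is random, about numCols * sampleRate on
// average.  Indices are returned in increasing order.
std::vector<std::size_t> SampleIndices(const std::size_t numCols,
                                       const double sampleRate,
                                       std::mt19937& rng)
{
  // Precondition: sampleRate is a probability.
  assert(sampleRate >= 0.0 && sampleRate <= 1.0);

  std::bernoulli_distribution keep(sampleRate);
  std::vector<std::size_t> indices;
  for (std::size_t i = 0; i < numCols; ++i)
  {
    if (keep(rng))
      indices.push_back(i);
  }
  return indices;
}
```

<p>The returned indices could then be copied into an <code>arma::uvec</code> and passed to <code>dataset.cols(indices)</code> as in the snippet above.  If the order matters, shuffling this index vector is cheaper than shuffling the extracted matrix.</p>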

<p>This is a pretty low-priority comment, though, so don't worry too much about it; address it only if you want to.  I'd say testing is higher priority. :)</p>

