<p>In <a href="https://github.com/mlpack/mlpack/pull/749#discussion_r74690056">src/mlpack/methods/lsh_model/lshmodel.hpp</a>:</p>
<pre style='color:#555'>&gt; +    *
&gt; +    * @param numProj The number of projections for the LSH scheme for which we
&gt; +    *     want to compute the template perturbation sequence.
&gt; +    * @param hashWidth The hash width for the LSH scheme.
&gt; +    * @param numProbes The number of probes to generate.
&gt; +    */
&gt; +   void GenerateTemplateSequence(size_t numProj, 
&gt; +                                 double hashWidth, 
&gt; +                                 size_t numProbes);
&gt; +
&gt; +   /** Matrix that stores, in each column, the &quot;direction&quot; of the perturbation:
&gt; +    * 0 means no perturbation on that dimension, -1 means reduce dimension value
&gt; +    * by 1, and +1 means increase dimension value by 1.
&gt; +    */
&gt; +   
&gt; +   arma::Mat&lt;short int&gt; templateSequence;
</pre>
<p>Consider <code>std::vector&lt;std::vector&lt;bool&gt;&gt;</code> here; it might be more space-efficient and possibly quicker.  <code>boost::dynamic_bitset</code> is another option in place of <code>std::vector&lt;bool&gt;</code>, but I am not sure it gets you anything over <code>std::vector&lt;bool&gt;</code> here.</p>
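<p>A rough sketch of what the bit-based storage could look like. The comment does not say how a <code>bool</code> would carry the three values -1, 0 and +1, so the layout here is an assumption: each probe gets <code>2 * numProj</code> bits, the first half marking a -1 on a coordinate and the second half marking a +1. The names <code>ProbeBits</code>, <code>MakeProbe</code>, <code>SetPerturbation</code> and <code>GetPerturbation</code> are hypothetical and not part of mlpack.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout (an assumption, not taken from the PR):
//   bits [0, numProj)           -> coordinate is perturbed by -1
//   bits [numProj, 2 * numProj) -> coordinate is perturbed by +1
// A coordinate with neither bit set is left unperturbed (0).
using ProbeBits = std::vector<bool>;

// Create an all-zero probe for a scheme with numProj projections.
ProbeBits MakeProbe(std::size_t numProj)
{
  return ProbeBits(2 * numProj, false);
}

// Record a -1 or +1 perturbation on one coordinate of a probe.
void SetPerturbation(ProbeBits& probe, std::size_t coordinate, int direction)
{
  const std::size_t numProj = probe.size() / 2;
  assert(coordinate < numProj);
  assert(direction == 1 || direction == -1);
  // A coordinate cannot move in both directions, so set one bit and clear
  // the other.
  probe[coordinate] = (direction == -1);
  probe[numProj + coordinate] = (direction == 1);
}

// Read back the perturbation on one coordinate as -1, 0 or +1.
int GetPerturbation(const ProbeBits& probe, std::size_t coordinate)
{
  const std::size_t numProj = probe.size() / 2;
  assert(coordinate < numProj);
  if (probe[coordinate])
    return -1;
  if (probe[numProj + coordinate])
    return 1;
  return 0;
}
```

<p>With this layout the whole template sequence would be a <code>std::vector&lt;ProbeBits&gt;</code> with one element per probe. On typical implementations <code>std::vector&lt;bool&gt;</code> packs one bit per element, so each coordinate costs two bits instead of one <code>short int</code>; whether that is actually quicker would need measuring.</p>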

<p>Yet another thought: these sequences will generally be pretty sparse, so it might be more reasonable to encode each perturbation sequence as a two-row matrix with one column per nonzero entry, where one row holds the nonzero coordinate and the other holds whether the value is positive or negative.  For that matter, you could be tricky and encode both the coordinate and the direction as a single signed number (i.e. direction*coordinate), but then you could only address half as many coordinates as the unsigned version of that data type allows (I don't think that is a problem).</p>
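<p>A minimal sketch of the single-number encoding, under some assumptions. The names <code>SparsePerturbation</code>, <code>EncodeEntry</code>, <code>Coordinate</code>, <code>Direction</code> and <code>ToDense</code> are hypothetical, and the choice of <code>std::int32_t</code> is arbitrary. One detail the plain product direction*coordinate glosses over: coordinate 0 cannot carry a sign, so this sketch stores the coordinate 1-based.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical sparse form of one perturbation vector: only the nonzero
// coordinates are stored, each as direction * (coordinate + 1).
// The sign is the direction; the magnitude is the 1-based coordinate.
using SparsePerturbation = std::vector<std::int32_t>;

// Pack a 0-based coordinate and a direction (-1 or +1) into one signed value.
std::int32_t EncodeEntry(std::size_t coordinate, int direction)
{
  assert(direction == 1 || direction == -1);
  // Only the positive half of the type's range is available for coordinates.
  assert(coordinate < static_cast<std::size_t>(
      std::numeric_limits<std::int32_t>::max()));
  return direction * static_cast<std::int32_t>(coordinate + 1);
}

// Recover the 0-based coordinate from a packed entry.
std::size_t Coordinate(std::int32_t entry)
{
  assert(entry != 0);
  const std::int64_t magnitude =
      entry < 0 ? -static_cast<std::int64_t>(entry) : entry;
  return static_cast<std::size_t>(magnitude - 1);
}

// Recover the direction (-1 or +1) from a packed entry.
int Direction(std::int32_t entry)
{
  return entry < 0 ? -1 : 1;
}

// Expand a sparse perturbation into a dense -1/0/+1 vector of length numProj,
// i.e. the same information as one column of the dense matrix in the diff.
std::vector<short> ToDense(const SparsePerturbation& p, std::size_t numProj)
{
  std::vector<short> dense(numProj, 0);
  for (const std::int32_t entry : p)
  {
    assert(Coordinate(entry) < numProj);
    dense[Coordinate(entry)] = static_cast<short>(Direction(entry));
  }
  return dense;
}
```

<p>A perturbation that lowers coordinate 0 and raises coordinate 3 would be stored as the two entries -1 and 4, regardless of how large <code>numProj</code> is.</p>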
