[mlpack-git] [mlpack/mlpack] Modeling LSH For Performance Tuning (#749)

Ryan Curtin notifications at github.com
Sat Aug 13 13:59:47 EDT 2016


> +    *
> +    * @param numProj The number of projections for the LSH scheme for which we
> +    *     want to compute the template perturbation sequence.
> +    * @param hashWidth The hash width for the LSH scheme.
> +    * @param numProbes The number of probes to generate.
> +    */
> +   void GenerateTemplateSequence(size_t numProj, 
> +                                 double hashWidth, 
> +                                 size_t numProbes);
> +
> +   /** Matrix that stores, in each column, the "direction" of the perturbation:
> +    * 0 means no perturbation on that dimension, -1 means reduce dimension value
> +    * by 1, and +1 means increase dimension value by 1.
> +    */
> +   
> +   arma::Mat<short int> templateSequence;

Consider `std::vector<std::vector<bool>>` here; it might be more space-efficient and possibly quicker.  `boost::dynamic_bitset` is also a possibility instead of `std::vector<bool>`, but I am not sure it gets you anything over `std::vector<bool>` here.
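
As a rough sketch of how that could look: a single `bool` cannot hold the three values -1, 0, and +1, so the layout below assumes two bits per dimension.  The layout and the names `EncodePerturbation` and `DirectionAt` are hypothetical, just for illustration, and not something from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout (an assumption, not taken from the PR): a perturbation
// vector over numProj dimensions is stored in 2 * numProj bits.
//   bit i set            -> -1 on dimension i
//   bit numProj + i set  -> +1 on dimension i
//   neither bit set      ->  0 (no perturbation on dimension i)
std::vector<bool> EncodePerturbation(const std::vector<int>& directions)
{
  const std::size_t numProj = directions.size();
  std::vector<bool> bits(2 * numProj, false);
  for (std::size_t i = 0; i < numProj; ++i)
  {
    // Only -1, 0, and +1 are valid directions.
    assert(directions[i] >= -1 && directions[i] <= 1);
    if (directions[i] < 0)
      bits[i] = true;
    else if (directions[i] > 0)
      bits[numProj + i] = true;
  }
  return bits;
}

// Recover the direction (-1, 0, or +1) stored for dimension i.
int DirectionAt(const std::vector<bool>& bits, const std::size_t i)
{
  const std::size_t numProj = bits.size() / 2;
  assert(i < numProj);
  if (bits[i])
    return -1;
  if (bits[numProj + i])
    return 1;
  return 0;
}
```

With this layout each perturbation vector costs roughly `2 * numProj` bits, compared with `numProj` `short int` values per column in the `arma::Mat<short int>` version.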

Yet another thought: these sequences will generally be pretty sparse, so it might be more reasonable to encode each perturbation sequence as a matrix with two rows, where one row holds the indices of the nonzero coordinates and the other holds whether each value is positive or negative.  For that matter, you could be tricky and encode both the coordinate and the direction as a single signed number (i.e. direction*coordinate), but then the largest coordinate you could represent would be about half of what the unsigned version of that data type allows (I don't think that is a problem).
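
A minimal sketch of the single-number encoding, under one extra assumption: coordinates are stored 1-based, so that dimension 0 does not collapse to 0 and lose its sign.  The names `EncodeSparse` and `DecodeSparse` are hypothetical, and `int` is just a placeholder for whichever signed type ends up being used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Keep only the nonzero entries of a dense direction vector.  Each entry is
// stored as direction * (i + 1): the magnitude is the 1-based coordinate and
// the sign is the direction.  (The 1-based offset is an assumption made here
// so that dimension 0 can still carry a sign.)
std::vector<int> EncodeSparse(const std::vector<int>& directions)
{
  std::vector<int> codes;
  for (std::size_t i = 0; i < directions.size(); ++i)
  {
    // Only -1, 0, and +1 are valid directions.
    assert(directions[i] >= -1 && directions[i] <= 1);
    if (directions[i] != 0)
      codes.push_back(directions[i] * static_cast<int>(i + 1));
  }
  return codes;
}

// Expand the sparse codes back into a dense vector of -1 / 0 / +1 values.
std::vector<int> DecodeSparse(const std::vector<int>& codes,
                              const std::size_t numProj)
{
  std::vector<int> directions(numProj, 0);
  for (const int code : codes)
  {
    assert(code != 0);  // 0 is never produced by EncodeSparse().
    const std::size_t i = static_cast<std::size_t>(std::abs(code)) - 1;
    assert(i < numProj);
    directions[i] = (code > 0) ? 1 : -1;
  }
  return directions;
}
```

For example, the dense vector `{0, -1, 0, 0, 1}` would be stored as the two codes `-2` and `5`, so the storage grows with the number of perturbed dimensions rather than with `numProj`.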

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/749/files/cdcb575826bfb3bd0ef4cafacf465435b3d6d144#r74690056