[mlpack-git] [mlpack/mlpack] Modeling LSH For Performance Tuning (#749)

Ryan Curtin notifications at github.com
Wed Aug 24 11:23:31 EDT 2016


> +  maxKValue = k;
> +
> +  // Save pointer to training set.
> +  this->referenceSet = &referenceSet;
> +
> +  // Step 1. Select a random sample of the dataset. We will work with only that
> +  // sample.
> +  arma::vec sampleHelper(referenceSet.n_cols, arma::fill::randu);
> +
> +  // Keep a sample of the dataset: We have uniformly random numbers in [0, 1],
> +  // so we expect about N*sampleRate of them to be in [0, sampleRate).
> +  arma::mat sampleSet = referenceSet.cols(
> +        arma::find(sampleHelper < sampleRate));
> +  // Shuffle to be impartial (in case dataset is sorted in some way).
> +  sampleSet = arma::shuffle(sampleSet);
> +  const size_t numSamples = sampleSet.n_cols; // Points in sampled set.

I wrote my earlier comment without looking at the code thoroughly; I see now that it samples without replacement. Thanks for the explanation. In that case, I think `ObtainDistinctSamples()` should do what you need.
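
To make the distinction concrete, here is a rough standard-library sketch of the two ways to sample columns without replacement. `BernoulliIndices()` mirrors what the quoted code does (keep each point independently with probability `sampleRate`, so the sample size varies from run to run), while `DistinctIndices()` draws a fixed number of distinct indices. Both names are hypothetical helpers made up for illustration; this is not mlpack's implementation of `ObtainDistinctSamples()`, and it deliberately avoids Armadillo so it compiles on its own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper: keep index i with probability sampleRate.
// No index can repeat, but the number of kept indices is random
// (about n * sampleRate on average).
std::vector<std::size_t> BernoulliIndices(const std::size_t n,
                                          const double sampleRate,
                                          std::mt19937& rng)
{
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  std::vector<std::size_t> kept;
  for (std::size_t i = 0; i < n; ++i)
    if (unif(rng) < sampleRate)
      kept.push_back(i);
  return kept;
}

// Hypothetical helper: draw exactly `count` distinct indices from [0, n)
// with a partial Fisher-Yates shuffle. The result is already in random
// order, so no separate shuffle step is needed.
std::vector<std::size_t> DistinctIndices(const std::size_t n,
                                         const std::size_t count,
                                         std::mt19937& rng)
{
  assert(count <= n);
  std::vector<std::size_t> indices(n);
  std::iota(indices.begin(), indices.end(), std::size_t(0));
  for (std::size_t i = 0; i < count; ++i)
  {
    // Swap position i with a uniformly chosen position in [i, n - 1].
    std::uniform_int_distribution<std::size_t> pick(i, n - 1);
    std::swap(indices[i], indices[pick(rng)]);
  }
  indices.resize(count);
  return indices;
}
```

With the second form, the sample size could be fixed up front (for example, something like `n * sampleRate` rounded down) instead of depending on how many random draws happen to fall below `sampleRate`.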

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/749/files/57c9d5e634d7d3d7e2ca1618353fe37d9e23b34a#r76078363