[mlpack-git] [mlpack/mlpack] Modeling LSH For Performance Tuning (#749)
notifications at github.com
Tue Aug 23 14:40:00 EDT 2016
> + maxKValue = k;
> + // Save pointer to training set.
> + this->referenceSet = &referenceSet;
> + // Step 1. Select a random sample of the dataset. We will work with only that
> + // sample.
> + arma::vec sampleHelper(referenceSet.n_cols, arma::fill::randu);
> + // Keep a sample of the dataset: We have uniformly random numbers in [0, 1],
> + // so we expect about N*sampleRate of them to be in [0, sampleRate).
> + arma::mat sampleSet = referenceSet.cols(
> + arma::find(sampleHelper < sampleRate));
> + // Shuffle to be impartial (in case dataset is sorted in some way).
> + sampleSet = arma::shuffle(sampleSet);
> + const size_t numSamples = sampleSet.n_cols; // Points in sampled set.
Are you sampling with or without replacement? If you're sampling without replacement (I don't think that's the case based on the code here), you can use `math::ObtainDistinctSamples()` from somewhere in `core/math/`. Otherwise, it might be better to simply keep a list of the sample indices and not actually extract the samples from the original matrix. Later you can use that vector of indices to create a non-contiguous matrix subview, like this:
extern arma::uvec indices; // This has already been filled with stuff.
extern arma::mat dataset; // This is our dataset.
dataset.cols(indices); // Returns all the columns we're interested in.
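
For illustration, the index list itself could be built along the following lines. This is only a rough sketch: `SampleColumnIndices` is a hypothetical helper, not something in mlpack, and it uses the standard library instead of Armadillo so that it stands alone. It keeps each column index with probability `sampleRate` (the same idea as the `sampleHelper < sampleRate` test in the patch) and then shuffles the kept indices, so each index appears at most once.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not part of mlpack): keep each column index in
// [0, numCols) independently with probability sampleRate, then shuffle the
// kept indices. Only indices are stored; no data columns are copied.
std::vector<std::size_t> SampleColumnIndices(const std::size_t numCols,
                                             const double sampleRate,
                                             std::mt19937& rng)
{
  assert(sampleRate >= 0.0 && sampleRate <= 1.0);

  std::uniform_real_distribution<double> uniform(0.0, 1.0);
  std::vector<std::size_t> indices;
  for (std::size_t i = 0; i < numCols; ++i)
  {
    // A uniform draw in [0, 1) falls below sampleRate with that probability.
    if (uniform(rng) < sampleRate)
      indices.push_back(i);
  }

  // Shuffle so the order of the sample does not follow the dataset order.
  std::shuffle(indices.begin(), indices.end(), rng);
  return indices;
}
```

With Armadillo, the same indices would live in an `arma::uvec` and be passed to `dataset.cols(indices)` as in the snippet above, so the sampled points are only read through the subview when they are needed.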
This is a pretty low-priority comment, though, so don't worry too much about it; address it only if you want to. I'd say testing is higher priority. :)