[mlpack-git] [mlpack/mlpack] Modeling LSH For Performance Tuning (#749)

Ryan Curtin notifications at github.com
Thu Aug 25 23:46:47 EDT 2016


> +
> +    // Reference set for kNN
> +    arma::mat refMat = sampleSet.cols(refSetStart, refSetEnd);
> +    referenceSizes(i) = refMat.n_cols;
> +
> +    arma::Mat<size_t> neighbors; // Not going to be used but required.
> +    arma::mat kNNDistances; // What we need.
> +    KNN naive(refMat, true); // true: train and use naive kNN.
> +    naive.Search(queryMat, k, neighbors, kNNDistances);
> +
> +    // Store the squared distances (what we need).
> +    kNNDistances = arma::pow(kNNDistances, 2);
> +
> +    // Compute Arithmetic and Geometric mean of the distances.
> +    Ek.row(i) = arma::mean(kNNDistances.t());
> +    Gk.row(i) = arma::exp(arma::mean(arma::log(kNNDistances.t()), 0));

I took a look at the most current code, and I see you are doing `find(kNNDistances > 0)`, but I don't think this will adequately filter duplicates.  Tomorrow I'll try to think of a good way to filter duplicate points; the best time to do that is probably during the calculation of the kNN distances matrix (i.e., if we encounter a zero distance, clear that row/column of the matrix and skip to the next point).  I need to think a little bit more about it...
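
To make the idea concrete, here is a rough sketch of one possible reading of "skip the point when a zero distance turns up".  It is not the code in the PR and not a worked-out proposal: it uses plain `std::vector` instead of Armadillo and mlpack's `KNN` so that it stands alone, and the names `Point`, `SquaredDistance`, `KnnSquaredDistancesSkippingDuplicates` and `GeometricMean` are all made up for illustration.  It also assumes that dropping the whole query point is acceptable, which may not be what we end up wanting.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one column of an arma::mat.
using Point = std::vector<double>;

// Squared Euclidean distance between two points of equal dimension.
double SquaredDistance(const Point& a, const Point& b)
{
  double sum = 0.0;
  for (std::size_t d = 0; d < a.size(); ++d)
  {
    const double diff = a[d] - b[d];
    sum += diff * diff;
  }
  return sum;
}

// Hypothetical helper: brute-force kNN that returns, for each kept query, its
// k smallest squared distances in ascending order.  A query that coincides
// with any reference point (zero distance) is dropped entirely, so no zero
// can reach a later log().
std::vector<std::vector<double>> KnnSquaredDistancesSkippingDuplicates(
    const std::vector<Point>& queries,
    const std::vector<Point>& references,
    const std::size_t k)
{
  std::vector<std::vector<double>> result;
  for (const Point& q : queries)
  {
    std::vector<double> dists;
    dists.reserve(references.size());
    bool duplicate = false;
    for (const Point& r : references)
    {
      const double d = SquaredDistance(q, r);
      if (d == 0.0)
      {
        duplicate = true; // Exact duplicate: abandon this query.
        break;
      }
      dists.push_back(d);
    }

    // Skip duplicates, and queries with fewer than k usable distances.
    if (duplicate || dists.size() < k)
      continue;

    std::partial_sort(dists.begin(), dists.begin() + k, dists.end());
    dists.resize(k);
    result.push_back(dists);
  }
  return result;
}

// Geometric mean via exp(mean(log(x))).  Assumes a non-empty vector of
// strictly positive values, which the filtering above is meant to guarantee.
double GeometricMean(const std::vector<double>& values)
{
  double logSum = 0.0;
  for (const double v : values)
    logSum += std::log(v);
  return std::exp(logSum / static_cast<double>(values.size()));
}
```

Filtering while the distances are being computed means the zero never enters the matrix in the first place, so the result stays rectangular and the per-k arithmetic and geometric means can still be taken over the surviving points.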

-- 
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/749/files/57c9d5e634d7d3d7e2ca1618353fe37d9e23b34a#r76360737