[mlpack-git] [mlpack/mlpack] Implementation of Multiprobe LSH (#691)

Tue Jun 28 07:51:56 EDT 2016

So, I noticed something weird that I hadn't seen before. If you can reproduce it on your system, then there might be something wrong with either our implementation or the cheap LSH run test.

I expected that setting L=1 and K=100 (numTables and numProj, respectively) would never lead to high recall. That is not the case, though: sometimes it returns the entire dataset and has 100% recall.

For example, run this a few times:
```
bin/mlpack_lsh -r ~/Projects/data/csv/iris_r.csv -q ~/Projects/data/csv/iris_q.csv -t ~/Projects/data/csv/iris_t.csv -K 40 -L 1 -H 0.00001 -k 32 -n garbage.csv -v
```

Does it ever return recall==100? Intuitively, it shouldn't... Have we messed something up, or is it just that iris is a weird dataset? I never saw that with sift.
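
To make the intuition concrete, here is a minimal sketch of plain p-stable LSH in Python. It is not mlpack's code: the names `build_table`, `hash_point`, and `lsh_recall` are made up for illustration, it uses the concatenated hash values directly as the bucket key, and it does not model the second-level hashing or bucket-size limits that the real `LSHSearch` applies. Under those assumptions, a single table with many projections and a tiny hash width should put almost every point in its own bucket, so recall should stay near zero.

```python
import math
import random


def build_table(dim, num_proj, hash_width, rng):
    """Draw num_proj Gaussian projection vectors and uniform offsets for one table."""
    projections = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_proj)]
    offsets = [rng.uniform(0.0, hash_width) for _ in range(num_proj)]
    return projections, offsets


def hash_point(point, table, hash_width):
    """Bucket key: floor((a . x + b) / w) for each projection, kept as a tuple."""
    projections, offsets = table
    return tuple(
        math.floor((sum(a * x for a, x in zip(proj, point)) + b) / hash_width)
        for proj, b in zip(projections, offsets)
    )


def lsh_recall(reference, queries, k, num_tables, num_proj, hash_width, seed=0):
    """Fraction of true k-nearest neighbors that appear among the LSH candidates."""
    rng = random.Random(seed)
    dim = len(reference[0])
    tables = [build_table(dim, num_proj, hash_width, rng) for _ in range(num_tables)]

    # Index every reference point in every table.
    indexes = []
    for table in tables:
        index = {}
        for i, point in enumerate(reference):
            index.setdefault(hash_point(point, table, hash_width), []).append(i)
        indexes.append(index)

    found = 0
    for q in queries:
        # Candidates are all reference points sharing a bucket with q in any table.
        candidates = set()
        for table, index in zip(tables, indexes):
            candidates.update(index.get(hash_point(q, table, hash_width), []))
        # Ground truth by brute force.
        true_nn = sorted(range(len(reference)), key=lambda i: math.dist(q, reference[i]))[:k]
        found += len(candidates.intersection(true_nn))
    return found / (k * len(queries))
```

Calling `lsh_recall(reference, queries, k=5, num_tables=1, num_proj=40, hash_width=1e-5)` on random data gives almost no candidates in this sketch, while a very large hash width with a single projection puts nearly everything in one bucket. If the real implementation returns the whole dataset with the small width, the cause is presumably somewhere this sketch does not cover.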

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/691#issuecomment-229027078