[mlpack-git] [mlpack/mlpack] Issue261/lsh test (#605)

Tue Apr 5 17:33:37 EDT 2016

Ah, yeah, what I mean is that we should split the five things you are testing into separate test cases.  Like this:

```
// Test: Run LSH with a varying number of tables, keeping all other parameters
// constant.  Compute the recall, i.e. the number of reported neighbors that
// are real neighbors of the query.
// LSH's property is that (with high probability) increasing the number of
// tables will increase recall.  Epsilon ensures that if noise lightly affects
// the projections, the test will not fail.
// Because the property only holds with high probability, a single run can
// still fail spuriously, so we attempt the test numTries times and only
// declare failure if all of them fail.
BOOST_AUTO_TEST_CASE(RecallTest)
{
  // code here
}

// Test: Run LSH with a varying hash width, keeping all other parameters
// constant.  Compute the recall, i.e. the number of reported neighbors that
// are real neighbors of the query.
// LSH's property is that (with high probability) increasing the hash width
// will increase recall.  Epsilon ensures that if noise lightly affects the
// projections, the test will not fail.
BOOST_AUTO_TEST_CASE(HashWidthTest)
{
  // code here
}
```
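
To make the comments above a bit more concrete, here is a rough sketch of the two pieces they describe: computing recall and retrying a randomized check up to numTries times.  `ComputeRecall` and `PassesInAnyTry` are hypothetical helper names, not existing mlpack functions, and I am assuming recall is normalized by the number of true neighbors; adapt it to whatever the test actually measures.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not part of mlpack): fraction of the true neighbors
// that also appear among the reported neighbors.
double ComputeRecall(const std::vector<std::size_t>& reported,
                     const std::vector<std::size_t>& truth)
{
  // With no true neighbors there is nothing to miss.
  if (truth.empty())
    return 1.0;

  std::size_t found = 0;
  for (const std::size_t t : truth)
  {
    if (std::find(reported.begin(), reported.end(), t) != reported.end())
      ++found;
  }

  return static_cast<double>(found) / truth.size();
}

// Hypothetical helper: run a randomized trial up to numTries times and
// succeed as soon as one attempt passes.  Failure is only declared if every
// attempt fails.
bool PassesInAnyTry(const std::size_t numTries,
                    const std::function<bool()>& trial)
{
  for (std::size_t i = 0; i < numTries; ++i)
  {
    if (trial())
      return true;
  }

  return false;
}
```

Inside a test case, the trial lambda would build the LSH models, compute the recall for each setting, and return whether recall grew (within epsilon); the test would then assert on the result of `PassesInAnyTry(numTries, trial)`.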

I hope I've described that well enough; let me know if not.  The basic idea is to split the tests so that a user can run just one of them at a time.  The disadvantage of this approach is that you will probably have to load the dataset once per test case, but that's not a big deal since the dataset will be relatively small.
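
One way to keep the repeated loading cheap and the cases independent is a small shared helper that each case calls for itself.  The sketch below uses plain functions and a synthetic dataset so that it stands alone; `LoadSmallDataset`, `NumTablesCheck` and `HashWidthCheck` are made-up names for illustration, and the real tests would load the actual dataset instead.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for loading a small dataset from disk.  A fixed seed
// makes every call return the same data, so each check sees identical input.
std::vector<double> LoadSmallDataset(const std::size_t numPoints = 100)
{
  std::mt19937 rng(42);
  std::uniform_real_distribution<double> dist(0.0, 1.0);

  std::vector<double> data(numPoints);
  for (double& x : data)
    x = dist(rng);

  return data;
}

// Each check loads its own copy of the data, so it does not depend on any
// other check having run first.
bool NumTablesCheck()
{
  const std::vector<double> data = LoadSmallDataset();
  // The comparison of recall across numbers of tables would go here.
  return !data.empty();
}

bool HashWidthCheck()
{
  const std::vector<double> data = LoadSmallDataset();
  // The comparison of recall across hash widths would go here.
  return !data.empty();
}
```

With Boost.Test, each of these bodies would live in its own BOOST_AUTO_TEST_CASE, and a single case can then be selected with the `--run_test=<name>` option of the test binary.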

---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/605#issuecomment-205992606