[mlpack-git] [mlpack] more efficient matrix inner product computation (#2)

Wed Jan 14 23:46:19 EST 2015

I created a simple test branch, rcurtin:tracedot, to try to reproduce @zoq's results.  I tested master, which corresponds to tracedot1, against the tracedot branch, which corresponds to tracedot2 (`accu(a % b)`).  On a few systems, I got the following results for `time bin/mlpack_test -t LRSDPTest`.

```
zax.ratml.org, with gcc:
tracedot1     38.071s   38.066s   38.142s
tracedot2     20.614s   20.558s   20.609s

zax.ratml.org, with clang:
tracedot1     40.032s   40.011s   40.052s
tracedot2     21.320s   21.913s   21.704s

beautiful.cc.gt.atl.ga.us, with gcc (Intel i5 650, 5GB RAM):
tracedot1     65.709s   67.128s   66.380s
tracedot2     38.891s   38.940s   39.306s

collar.cc.gt.atl.ga.us (AMD Athlon 64 X2 3800+):
tracedot1     159.355s  158.522s  158.692s
tracedot2     115.019s  114.943s  114.949s
```

I didn't bother with the ARM test, and the sparc64 system is still compiling mlpack, so I figured these results were good enough.  My concern is that Marcus is seeing slower results because he is on OS X, but I don't have an OS X box to test with.  Marcus, would you mind trying the rcurtin:tracedot branch on the same system you ran your earlier benchmarks on?
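
For anyone following along, here is a rough sketch of why an expression like `accu(a % b)` can be cheaper for a matrix inner product.  It assumes the quantity being computed is trace(A^T B) for two matrices of the same size; the exact code behind tracedot1 on master is not shown in this message, so the first function below is only one plausible way of getting the same number, not necessarily what master does.  `Matrix`, `TraceOfProduct`, and `ElementwiseSum` are hypothetical names, and the plain `std::vector` storage is a stand-in for Armadillo so the sketch has no dependencies.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal row-major matrix; a hypothetical stand-in for arma::mat.
struct Matrix {
  std::size_t rows, cols;
  std::vector<double> data;
  Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
  double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
  double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Inner product by forming C = A^T * B explicitly and summing its diagonal.
// Allocates a cols x cols temporary and does rows * cols^2 multiplications.
double TraceOfProduct(const Matrix& a, const Matrix& b) {
  assert(a.rows == b.rows && a.cols == b.cols);
  Matrix c(a.cols, b.cols);
  for (std::size_t i = 0; i < a.cols; ++i)
    for (std::size_t j = 0; j < b.cols; ++j)
      for (std::size_t k = 0; k < a.rows; ++k)
        c(i, j) += a(k, i) * b(k, j);
  double trace = 0.0;
  for (std::size_t i = 0; i < c.rows; ++i)
    trace += c(i, i);
  return trace;
}

// Same inner product as a sum of element-wise products, which is what
// accu(a % b) expresses in Armadillo: rows * cols multiplications, no temporary
// matrix in this hand-written version.
double ElementwiseSum(const Matrix& a, const Matrix& b) {
  assert(a.rows == b.rows && a.cols == b.cols);
  double sum = 0.0;
  for (std::size_t idx = 0; idx < a.data.size(); ++idx)
    sum += a.data[idx] * b.data[idx];
  return sum;
}
```

Both functions should return the same value for any pair of equally sized matrices; only the amount of work differs.  This illustrates the identity only and is not meant to account for the specific timings above.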

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/2#issuecomment-70039384