[mlpack-git] [mlpack] Add covariance factorization caching to gaussian distribution (#390)
Ryan Curtin
notifications at github.com
Mon Jan 26 15:09:04 EST 2015
I spent far too long benchmarking this while trying to track down some anomalously slow results with the new code. In the end I was unable to reproduce any slowdown, but hey, I did a good amount of benchmarking, so here are the results. I used this gist:
https://gist.github.com/rcurtin/daf960aa6ad545f58402
Below are the numbers for each of the timers in that program, in seconds, over three runs each (the `no chol` numbers are where I removed the call to `chol(..., "lower")` and just inverted the covariance matrix directly in `FactorCovariance()`; see the sketch after the results):
```
covertype (54x581012)
master
estimate 2.027 2.027 2.031
gmm_training_imitation 43.095 43.014 42.926
probability_batch 0.838 0.835 0.835
probability_individual 64.599 66.545 66.608
random 3.607 3.608 3.601
stephentu
estimate 2.049 2.012 1.005
gmm_training_imitation 42.363 42.615 42.303
probability_batch 0.823 0.826 0.824
probability_individual 0.746 0.745 0.745
random 0.437 0.436 0.438
stephentu, no chol
estimate 1.976 1.995 1.984
gmm_training_imitation 42.377 42.306 42.323
probability_batch 0.825 0.825 0.823
probability_individual 0.744 0.748 0.747
random 0.437 0.439 0.437
```
```
corel (32x37749)
master
estimate 0.053 0.053 0.053
gmm_training_imitation 1.522 1.548 1.547
probability_batch 0.031 0.031 0.031
probability_individual 1.647 1.656 1.603
random 0.917 0.910 0.915
stephentu
estimate 0.052 0.052 0.051
gmm_training_imitation 1.572 1.559 1.573
probability_batch 0.031 0.031 0.031
probability_individual 0.047 0.047 0.046
random 0.257 0.258 0.257
stephentu, no chol
estimate 0.051 0.052 0.051
gmm_training_imitation 1.575 1.570 1.583
probability_batch 0.031 0.031 0.031
probability_individual 0.047 0.046 0.047
random 0.256 0.257 0.256
```
```
1000000-10-randu (10x1000000)
master
estimate 0.159 0.160 0.159
gmm_training_imitation 16.528 16.606 16.936
probability_batch 0.332 0.333 0.339
probability_individual 3.477 3.483 3.489
random 0.156 0.156 0.151
stephentu
estimate 0.164 0.164 0.164
gmm_training_imitation 16.514 16.492 16.562
probability_batch 0.335 0.331 0.332
probability_individual 0.155 0.163 0.155
random 0.071 0.071 0.071
stephentu, no chol
estimate 0.158 0.159 0.159
gmm_training_imitation 16.892 16.576 16.844
probability_batch 0.341 0.335 0.337
probability_individual 0.165 0.162 0.169
random 0.071 0.071 0.071
```
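For reference, here is a minimal sketch of roughly what the two variants compare (my own reconstruction using Armadillo, not the PR's actual code; function and variable names are hypothetical):

```cpp
// Sketch of the two FactorCovariance() variants benchmarked above.
// Hypothetical names; not mlpack's exact implementation.
#include <armadillo>

// "stephentu" variant: factor once, cache the lower-triangular Cholesky
// factor L (covariance = L * L^T) and the inverse covariance.
void FactorCovarianceChol(const arma::mat& covariance,
                          arma::mat& covLower,
                          arma::mat& invCov)
{
  covLower = arma::chol(covariance, "lower");
  // inv(C) = inv(L)^T * inv(L); exploit the triangular structure.
  const arma::mat invLower = arma::inv(arma::trimatl(covLower));
  invCov = invLower.t() * invLower;
}

// "no chol" variant: skip the Cholesky call and invert the (symmetric
// positive definite) covariance directly.
void FactorCovarianceDirect(const arma::mat& covariance, arma::mat& invCov)
{
  invCov = arma::inv_sympd(covariance);
}
```

Either way, the key point of the PR is that this factorization happens once when the covariance is set, instead of on every call.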
So, we get tons of speedup for calls to `Random()` and `Probability()` for a single point, but not much speedup for the other cases. One might see better speedup for the other methods in very high-dimensional settings. Because of this, I don't see much speedup in the `gmm` program, but it's certainly still an important and valuable contribution. :+1:
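To see why those two calls benefit so much: once the factorization is cached, `Random()` reduces to a matrix-vector product and single-point `Probability()` to a quadratic form, with no per-call decomposition or inversion. A rough illustration (hypothetical signatures, not mlpack's exact API):

```cpp
// Why the cached factorization pays off in Random() and Probability().
// Hypothetical free functions for illustration only.
#include <armadillo>
#include <cmath>

// Draw a sample as mean + L * z with z ~ N(0, I), where covLower is the
// cached lower Cholesky factor -- just one matrix-vector product.
arma::vec Random(const arma::vec& mean, const arma::mat& covLower)
{
  return mean + covLower * arma::randn<arma::vec>(mean.n_elem);
}

// Evaluate the density of one point using the cached inverse covariance
// and cached log-determinant (logDetCov = 2 * sum(log(diag(covLower)))).
double Probability(const arma::vec& x,
                   const arma::vec& mean,
                   const arma::mat& invCov,
                   const double logDetCov)
{
  const arma::vec diff = x - mean;
  const double logProb = -0.5 * (x.n_elem * std::log(2.0 * arma::datum::pi)
      + logDetCov + arma::as_scalar(diff.t() * invCov * diff));
  return std::exp(logProb);
}
```

The batch `Probability()` and training timers barely move because those paths already amortize the factorization across many points.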
---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/390#issuecomment-71528581