[mlpack-svn] [MLPACK] #344: Using Welford method to calculate variance in Naive Bayes Classifier
MLPACK Trac
trac at coffeetalk-1.cc.gatech.edu
Mon Apr 14 16:37:38 EDT 2014
#344: Using Welford method to calculate variance in Naive Bayes Classifier
-----------------------------------+----------------------------------------
   Reporter:  akvah                |        Owner:  rcurtin
       Type:  enhancement          |       Status:  accepted
   Priority:  minor                |    Milestone:
  Component:  mlpack               |   Resolution:
   Keywords:  variance calculation |     Blocking:
 Blocked By:                       |
-----------------------------------+----------------------------------------
Changes (by rcurtin):
* owner: => rcurtin
* status: new => accepted
Comment:
Hi Vahab,
Thank you for the contribution. I refactored it slightly; the updated
patch is attached. If you can make sure I haven't broken anything, I'd
appreciate it.
Unfortunately, what I found was that the patched code, while more robust,
takes approximately 3x as long for training. Here are my results for two
datasets:
 * isolet (617x7797): ~0.01s training pre-patch, ~0.03s training
   post-patch
 * randu (10x1000000): ~0.05s training pre-patch, ~0.15s training
   post-patch
I ran both of those enough times to account for timing variance.
So, can we find a way to speed up the Welford method so that it is closer
to the previous implementation? Alternatively, we could have the user
specify which method should be used; or perhaps an error can be detected
when using the original calculation, and the Welford method used only in
that case. What do you think?
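To make the options concrete, here is a minimal sketch in plain C++. It is
not the attached patch and does not use mlpack or Armadillo types. The
function names are hypothetical, the "original calculation" is assumed to
be the textbook sum-of-squares formula (the ticket does not say which
formula the pre-patch code uses), and the fallback test with its relTol
threshold is only one possible heuristic for detecting cancellation.

```cpp
#include <cstddef>
#include <vector>

// Welford's online algorithm: a single pass that updates the running mean
// and the sum of squared deviations (m2) one sample at a time.
double WelfordVariance(const std::vector<double>& x)
{
  std::size_t n = 0;
  double mean = 0.0;
  double m2 = 0.0;
  for (const double v : x)
  {
    ++n;
    const double delta = v - mean; // Deviation from the old mean.
    mean += delta / n;
    m2 += delta * (v - mean);      // Uses both the old and the new mean.
  }
  return (n > 1) ? m2 / (n - 1) : 0.0; // Unbiased sample variance.
}

// ASSUMPTION: stand-in for the "original calculation". This is the textbook
// formula (sum(x^2) - n * mean^2) / (n - 1), which is cheap but can lose
// precision when the mean is large relative to the spread of the data.
double SumOfSquaresVariance(const std::vector<double>& x)
{
  const std::size_t n = x.size();
  if (n < 2)
    return 0.0;
  double sum = 0.0, sumSq = 0.0;
  for (const double v : x)
  {
    sum += v;
    sumSq += v * v;
  }
  const double mean = sum / n;
  return (sumSq - n * mean * mean) / (n - 1);
}

// HYPOTHETICAL heuristic: use the cheap formula, but if the difference
// sum(x^2) - n * mean^2 is tiny relative to sum(x^2), treat the result as
// unreliable (likely cancellation) and recompute with Welford's method.
double VarianceWithFallback(const std::vector<double>& x,
                            const double relTol = 1e-6)
{
  const std::size_t n = x.size();
  if (n < 2)
    return 0.0;
  double sum = 0.0, sumSq = 0.0;
  for (const double v : x)
  {
    sum += v;
    sumSq += v * v;
  }
  const double mean = sum / n;
  const double diff = sumSq - n * mean * mean;
  if (diff < relTol * sumSq)
    return WelfordVariance(x); // Slow path, taken only when needed.
  return diff / (n - 1);
}
```

With this shape, well-conditioned data only pays for the cheap pass, and
the second pass happens only when the check trips. Whether that recovers
the pre-patch timings on isolet and randu would need to be measured.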
Thanks,
Ryan
--
Ticket URL: <https://trac.research.cc.gatech.edu/fastlab/ticket/344#comment:1>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.