[mlpack-git] [mlpack] [Proposal] Develop a scalable Finetune class to fine-tune the parameters of a deep network (#458)
Ryan Curtin
notifications at github.com
Mon Oct 5 10:16:41 EDT 2015
I like that this uses the existing optimizer API, but I have a couple of comments:
I think that `std::vector<arma::mat*>` is a little bit awkward; could you possibly instead take the instantiated network as the input and extract/modify the parameters from there?
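Something like this rough sketch is what I have in mind (the `FineTuneFunction` name and the constructor arguments are just placeholders, not existing mlpack API): template the class on the network type, so the parameters can be read from and written back to the network itself instead of being passed around as a vector of matrix pointers.

```c++
#include <mlpack/core.hpp>

// Hypothetical sketch: wrap the instantiated network instead of taking
// std::vector<arma::mat*>.
template<typename NetworkType, typename OutputLayerType>
class FineTuneFunction
{
 public:
  FineTuneFunction(NetworkType& network, OutputLayerType& outputLayer) :
      network(network), outputLayer(outputLayer) { }

  // The optimizer-facing API stays the same as the other FunctionTypes.
  double Evaluate(const arma::mat& parameters);
  void Gradient(const arma::mat& parameters, arma::mat& gradient);

 private:
  // Parameters are extracted from / written back into the network directly.
  NetworkType& network;
  OutputLayerType& outputLayer;
};
```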
What does the function `Deriv(arma::mat const&, arma::mat&)` do? How is that different from `Gradient()`?
> Besides, the SoftmaxFunction needs to cache the probability values; otherwise you have to recalculate the probabilities two more times when fine-tuning the parameters. Any suggestions?
This is a common problem in the various optimizers; often an optimizer will make a call to `Evaluate()` followed directly by a call to `Gradient()` with the same parameters. I don't think this can be fixed unless we modify the API for the optimizers. One way to do this might be to add an `EvaluateWithGradient()` function that both calculates the objective and also calculates the gradient. It's relatively straightforward to use template metaprogramming inside of the optimizers in order to call either `Evaluate()` then `Gradient()` or `EvaluateWithGradient()` when available. So this way, the `EvaluateWithGradient()` function would be optional, but when supplied it could accelerate the computation. What do you think of this idea?
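Here is a quick sketch of the kind of template metaprogramming I mean (nothing below exists in mlpack yet; `EvaluateWithGradient()` and the helper names are hypothetical). The optimizer detects at compile time whether the function type provides `EvaluateWithGradient()` and calls it in one pass; otherwise it falls back to `Evaluate()` followed by `Gradient()`.

```c++
#include <mlpack/core.hpp>
#include <type_traits>

// Detect (at compile time) whether FunctionType has a member function with the
// exact signature: double EvaluateWithGradient(const arma::mat&, arma::mat&).
template<typename FunctionType>
struct HasEvaluateWithGradient
{
  template<typename U, double (U::*)(const arma::mat&, arma::mat&)>
  struct Check { };

  template<typename U> static char Test(Check<U, &U::EvaluateWithGradient>*);
  template<typename U> static long Test(...);

  static const bool value = (sizeof(Test<FunctionType>(0)) == sizeof(char));
};

// Single-pass version: used when EvaluateWithGradient() is supplied.
template<typename FunctionType>
typename std::enable_if<HasEvaluateWithGradient<FunctionType>::value, double>::type
EvaluateAndGradient(FunctionType& function,
                    const arma::mat& parameters,
                    arma::mat& gradient)
{
  return function.EvaluateWithGradient(parameters, gradient);
}

// Fallback: the current behavior, Evaluate() followed by Gradient().
template<typename FunctionType>
typename std::enable_if<!HasEvaluateWithGradient<FunctionType>::value, double>::type
EvaluateAndGradient(FunctionType& function,
                    const arma::mat& parameters,
                    arma::mat& gradient)
{
  const double objective = function.Evaluate(parameters);
  function.Gradient(parameters, gradient);
  return objective;
}
```

Inside each optimizer the `Evaluate()`/`Gradient()` pair would just become a call to `EvaluateAndGradient()`, and function types that don't implement `EvaluateWithGradient()` keep working unchanged.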
I can't comment on how well this proposal fits with the rest of the neural network code; I don't know if there are already any plans for fine-tuning.
---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/issues/458#issuecomment-145542827