[mlpack-git] [mlpack] [Proposal] Develop a scalable Finetune class to fine-tune the parameters of a deep network (#458)

Ryan Curtin notifications at github.com
Mon Oct 5 10:16:41 EDT 2015


I like that this uses the existing optimizer API, but I have a couple of comments:

I think that `std::vector<arma::mat*>` is a little bit awkward; could you possibly instead take the instantiated network as the input and extract/modify the parameters from there?
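
For illustration only, a minimal sketch of that alternative is below. The `NetworkType`/`Parameters()` interface, the `Finetune` class shape, and the optimizer call are assumptions made up for the sketch, not an existing mlpack API:

```c++
#include <mlpack/core.hpp>

// A minimal sketch, assuming the instantiated network exposes its flattened
// parameters through a Parameters() accessor (an assumption, not a fixed
// mlpack interface).
template<typename NetworkType, typename OptimizerType>
class Finetune
{
 public:
  // Take the instantiated network instead of std::vector<arma::mat*>.
  explicit Finetune(NetworkType& network) : network(network) { }

  // Run the optimizer directly on the network's parameter matrix, so updates
  // are written back into the network without extra copying or bookkeeping.
  void Tune()
  {
    arma::mat& parameters = network.Parameters();
    OptimizerType optimizer;
    optimizer.Optimize(parameters); // hypothetical optimizer call
  }

 private:
  NetworkType& network;
};
```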

What does the function `Deriv(arma::mat const&, arma::mat&)` do?  How is that different from `Gradient()`?

> Besides, the SoftmaxFunction needs to cache the probability values; otherwise you have to recalculate the probabilities two more times when fine-tuning the parameters. Any suggestions?

This is a common problem in the various optimizers; often an optimizer will make a call to `Evaluate()` followed directly by a call to `Gradient()` with the same parameters.  I don't think this can be fixed unless we modify the API for the optimizers.  One way to do this might be to add an `EvaluateWithGradient()` function that both calculates the objective and also calculates the gradient.  It's relatively straightforward to use template metaprogramming inside of the optimizers in order to call either `Evaluate()` then `Gradient()` or `EvaluateWithGradient()` when available.  So this way, the `EvaluateWithGradient()` function would be optional, but when supplied it could accelerate the computation.  What do you think of this idea?
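
As a rough sketch of that dispatch (the trait and helper names below are made up for illustration; only the `Evaluate()`, `Gradient()`, and proposed `EvaluateWithGradient()` signatures come from the discussion above):

```c++
#include <mlpack/core.hpp>
#include <type_traits>

// Detect whether FunctionType provides
// double EvaluateWithGradient(const arma::mat&, arma::mat&).
template<typename FunctionType>
class HasEvaluateWithGradient
{
 private:
  template<typename T, double (T::*)(const arma::mat&, arma::mat&)>
  struct Check;

  template<typename T>
  static char Test(Check<T, &T::EvaluateWithGradient>*);

  template<typename T>
  static long Test(...);

 public:
  static const bool value = (sizeof(Test<FunctionType>(0)) == sizeof(char));
};

// Used when EvaluateWithGradient() exists: one combined call, so shared
// intermediate results (e.g. softmax probabilities) are computed only once.
template<typename FunctionType>
typename std::enable_if<HasEvaluateWithGradient<FunctionType>::value,
    double>::type
EvaluateAndGradient(FunctionType& f, const arma::mat& params,
                    arma::mat& gradient)
{
  return f.EvaluateWithGradient(params, gradient);
}

// Fallback: call Evaluate() and Gradient() separately, as optimizers do now.
template<typename FunctionType>
typename std::enable_if<!HasEvaluateWithGradient<FunctionType>::value,
    double>::type
EvaluateAndGradient(FunctionType& f, const arma::mat& params,
                    arma::mat& gradient)
{
  const double objective = f.Evaluate(params);
  f.Gradient(params, gradient);
  return objective;
}
```

The optimizer's inner loop could then call `EvaluateAndGradient(function, iterate, gradient)` unconditionally, and functions that supply `EvaluateWithGradient()` would avoid the duplicated computation.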

I can't comment on how well this proposal fits with the rest of the neural network code; I don't know if there are already any plans for fine tuning.

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/issues/458#issuecomment-145542827