[mlpack-git] [mlpack] [Proposal]Enhance the class SparseAutoencoder and SoftmaxRegression (#454)

Ryan Curtin notifications at github.com
Fri Sep 25 10:14:08 EDT 2015


It's probably worth taking a look at the semi-finished https://github.com/mlpack/mlpack/wiki/ProposedNewDesignGuidelines and the style guide at https://github.com/mlpack/mlpack/wiki/StyleGuidelines.

> 1 : allow users to get the trained parameters of SparseAutoencoder and SoftmaxRegression
> ex :
> arma::mat parameters() const;
> void parameters(arma::mat const &parameters);

Fine by me, but make sure to pass by reference:

```
arma::mat& Parameters(); // for modification
const arma::mat& Parameters() const; // for access
```
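
As a rough sketch of how that accessor pair behaves, assuming a hypothetical `ToyModel` class and using `std::vector<double>` as a stand-in for `arma::mat` so that it compiles without Armadillo (none of these names are mlpack API):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SparseAutoencoder / SoftmaxRegression.
// std::vector<double> plays the role of arma::mat here.
class ToyModel
{
 public:
  explicit ToyModel(const std::size_t size) : parameters(size, 0.0) { }

  // Modifiable access: the caller can overwrite the parameters in place.
  std::vector<double>& Parameters() { return parameters; }
  // Read-only access: returns a reference, so no copy is made.
  const std::vector<double>& Parameters() const { return parameters; }

 private:
  std::vector<double> parameters;
};

// Example use: write through the non-const overload, read through the const
// one.
inline double FirstParameterAfterUpdate()
{
  ToyModel model(3);
  model.Parameters()[0] = 1.5;            // non-const overload
  const ToyModel& view = model;
  assert(view.Parameters().size() == 3);  // const overload
  return view.Parameters()[0];
}
```

Both overloads hand back a reference to the same member, so setting the parameters needs no separate setter and reading them costs no copy.
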
> 2 : provide a default constructor for class SparseAutoencoder and SoftmaxRegression
> This allows users to train the SparseAutoencoder or SoftmaxRegression later on, so the users do not need to wrap them in a smart pointer just to delay the initialization time.

We need to be slightly careful with this one.  If we provide a default constructor, the resulting object should be a valid sparse autoencoder or a valid softmax regression model; a user should never be able to end up holding an invalid `SparseAutoencoder` or `SoftmaxRegression` object.  What do you think?  I am still thinking about the right way to do default constructors (or how to handle default-constructed mlpack objects).

> 3 : This one is related to suggestion 2, provide a train function for SparseAutoencoder and SoftmaxRegression, this way the users can train the data whenever they like

Yes, this is a good idea.  The constructors will be able to be refactored to call `Train()`, too.  Some comments on your proposed functions below:

> ex :
> //for SoftMaxRegression

> //this one will use a random initialPoint to train
> void train(arma::mat const &input, arma::vec const &label, size_t numOfLabels);
> //this one can setup the initialPoint
> void train(arma::mat const &input, arma::mat const &initialPoint,
> arma::vec const &label, size_t numOfLabels);

At the point of calling `Train()`, the `SoftmaxRegression` model should already be a valid object.  So setting the initial point for the training shouldn't be necessary; instead, `Train()` should use the current model parameters as the initial point for the training.  As a nice side effect, this means that you can train on multiple datasets sequentially and get a coherent model in the end.
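
To make the warm-start idea concrete, here is a minimal sketch on a deliberately tiny model.  Everything in it is an assumption for illustration: `ToyRegression` is hypothetical, it fits `y = weight * x` by plain gradient descent, and the iteration count and step size are arbitrary defaults.  It is not how `SoftmaxRegression` is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model: fits y = weight * x by gradient descent.
class ToyRegression
{
 public:
  // A default-constructed model is still valid: it predicts 0 everywhere.
  ToyRegression() : weight(0.0) { }

  // Convenience constructor that simply forwards to Train().
  ToyRegression(const std::vector<double>& x, const std::vector<double>& y)
      : weight(0.0)
  {
    Train(x, y);
  }

  // Starts from the current weight; no initial point is passed in.
  void Train(const std::vector<double>& x,
             const std::vector<double>& y,
             const std::size_t iterations = 100,
             const double stepSize = 0.01)
  {
    assert(x.size() == y.size() && !x.empty());
    for (std::size_t it = 0; it < iterations; ++it)
    {
      // Gradient of the mean squared error with respect to the weight.
      double gradient = 0.0;
      for (std::size_t i = 0; i < x.size(); ++i)
        gradient += 2.0 * (weight * x[i] - y[i]) * x[i];
      weight -= stepSize * gradient / x.size();
    }
  }

  double Weight() const { return weight; }

 private:
  double weight;
};
```

Calling `Train()` a second time, on the same or on another dataset, continues from wherever the previous call left off, which is the sequential-training behavior described above.
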

> template<typename OptimizerType>
> void train(OptimizerType& optimizer);
> template<typename OptimizerType>
> void train(OptimizerType& optimizer, arma::mat const &initialPoint);

For this particular overload, it's worth pointing out that `optimizer` holds an `initialPoint` inside of it, so the initial point overload ends up being redundant here.

I'd also consider adding a single-point training function:

```
template<typename VecType>
void Train(const VecType& point, const size_t label);
```
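
One way such a single-point overload could behave is as one stochastic gradient step on the softmax negative log-likelihood.  The sketch below assumes exactly that; `ToySoftmax`, its constructor arguments, `Probabilities()`, and `Classify()` are all hypothetical names, and it uses `std::vector` instead of Armadillo types.  `VecType` only needs `operator[]`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical toy softmax classifier; not mlpack's SoftmaxRegression.
class ToySoftmax
{
 public:
  ToySoftmax(const std::size_t dimension,
             const std::size_t numClasses,
             const double stepSize = 0.1)
      : weights(numClasses, std::vector<double>(dimension, 0.0)),
        stepSize(stepSize)
  {
    assert(dimension > 0 && numClasses > 1);
  }

  // Class probabilities for one point under the current weights.
  template<typename VecType>
  std::vector<double> Probabilities(const VecType& point) const
  {
    std::vector<double> p(weights.size(), 0.0);
    for (std::size_t c = 0; c < weights.size(); ++c)
      for (std::size_t j = 0; j < weights[c].size(); ++j)
        p[c] += weights[c][j] * point[j];

    // Subtract the largest score before exponentiating, for stability.
    const double maxScore = *std::max_element(p.begin(), p.end());
    double sum = 0.0;
    for (double& v : p) { v = std::exp(v - maxScore); sum += v; }
    for (double& v : p) v /= sum;
    return p;
  }

  // One gradient step on a single labeled point, from the current weights.
  template<typename VecType>
  void Train(const VecType& point, const std::size_t label)
  {
    assert(label < weights.size());
    const std::vector<double> p = Probabilities(point);
    for (std::size_t c = 0; c < weights.size(); ++c)
    {
      const double error = p[c] - (c == label ? 1.0 : 0.0);
      for (std::size_t j = 0; j < weights[c].size(); ++j)
        weights[c][j] -= stepSize * error * point[j];
    }
  }

  // Index of the most probable class.
  template<typename VecType>
  std::size_t Classify(const VecType& point) const
  {
    const std::vector<double> p = Probabilities(point);
    return static_cast<std::size_t>(
        std::max_element(p.begin(), p.end()) - p.begin());
  }

 private:
  std::vector<std::vector<double>> weights;
  double stepSize;
};
```

Because each call starts from the current weights, feeding points one at a time fits the same "the model is always valid" picture as the batch `Train()`.
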

> //for SparseAutoencoder

> //this one will use a random parameters to train
> void train(arma::mat const &data, size_t hiddenSize);
> //this one can reuse the trained parameter
> void train(const arma::mat &data, const arma::mat &parameters,
> size_t hiddenSize);

See the comments on `SoftmaxRegression::Train()`; at this point, we should already have a coherent `SparseAutoencoder` model, so we should start our training from the current model parameters.

> 4 : Add "m_" before the data member of the class, by now SoftMaxRegression and SparseAutoencoder will use the same name to initialize the data member.

I disagree with this one; the style guidelines already suggest naming conventions, and access to an internal parameter is simply done through a function named after the member with its first letter capitalized.  For instance, if a class holds a member `matrix`, then this can be accessed/modified with the `Matrix()` function.  Also, it would be a truly tremendous amount of work to refactor all mlpack code to adhere to a new naming scheme. :)

If you want to send pull requests for pieces at a time, I'll merge them in (as long as they're specific to the existing SparseAutoencoder and SoftmaxRegression code, otherwise it may have to wait on someone else's approval too).

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/issues/454#issuecomment-143235049