[mlpack-git] [mlpack] [Proposal]Enhance the class SparseAutoencoder and SoftmaxRegression (#454)

Sat Sep 26 11:07:00 EDT 2015

> void(double value) and void(const double value) are the same thing, compiler treat them as same symbol, and since the copy of primitive would not have any side effect, could I just declare them as non const?like void(double value) rather than void(const double value).

Yeah, these are the same.  I often write 'const double' just to denote that the double won't be modified, but it doesn't really make a difference either way.

> About the default constructor, I cannot find a good way to design a default constructor to fulfill the requirements for mlpack yet, so I think it is better to remain the old way(no default constructor)

In some cases I've made a default constructor that takes the model's size but nothing else.  Take a look at the constructor I made for `NaiveBayesClassifier` yesterday; maybe that is a strategy that could be applied here?

https://github.com/mlpack/mlpack/commit/4a469b2733c122870274ab813528182157b12211#diff-9ce92a606304271ed37a8531bb3de99dR69

I think maybe instead of default constructors, simple constructors that don't take data can be used.

> The last thing is, if you do not care about breaking the old api, I would like to replace the SparseAutoencder and SparseAutoencoderFunction under the folder methods/sparse_autoencoder

Replace it with what?  Do you mean moving it to `src/mlpack/methods/ann/` or something?  I've got no problem changing the API; we're preparing for a release that will not be reverse-compatible (API-wise) anyway.  Lots of the old design abstractions turned out to not be the best ones.

> I forgot to said, rather than add a default constructor, I would suggest adding one more constructors , this way the users can reuse the trained parameters to train the model if they want to.

I didn't completely understand this idea, could you elaborate more please?

> But this also mean these two classes better offer users "save" and "load" functions.
> sm.save("smoke_params_of_sm");
> sm.load("smoke_params_of_sm");

Why not just use the existing `data::Load()` function that works with `boost::serialization` once `Serialize()` is implemented?  That would look like this:

```
data::Save("model_file.xml" /* or .bin or .txt */, "softmax_regression", sm);
data::Load("model_file.xml", "softmax_regression", sm);
```

> This way the users can reuse the trained results easier, I intent to use boost::serialization to store and save the data of the class, what do you think about it?

`boost::serialization` is pretty nice to use; you can take a look at the other classes I've used for examples (for instance, `NaiveBayesClassifier`, which I just did), or refer to their documentation.  The only difference is the serialization shim that means you can write a `Serialize()` method instead of a `serialize()` method (although, as a technical note, if you write a `serialize()` method, it will still work.  If you like template metaprogramming, take a look at `src/mlpack/core/data/serialization_shim.hpp`; it was a lot of fun to write. :) ).

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/issues/454#issuecomment-143461939
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.cc.gatech.edu/pipermail/mlpack-git/attachments/20150926/059c145b/attachment-0001.html>