[mlpack-git] [mlpack] add train test split (#523)

Marcus Edel notifications at github.com
Sat Feb 27 15:46:38 EST 2016


Basically we thought it would be a nice idea to make the ANN code work with the mlpack optimizers, and the other way around: once we have rewritten the ANN optimizers (I have already rewritten the RMSprop and Adam optimizers), we could use them for, e.g., logistic regression. This also solves the serialization problem of all layers that previously held an optimizer. At the same time, we removed the weight initialization rule from each layer, because it's much easier to initialize the weights, e.g., using a Gaussian distribution with zero mean and unit variance. You could actually do the same thing with the previous layer architecture, by creating an initial weight matrix for the whole network and then initializing the weights using that matrix. Another reason for the transition was that we can use the ANN code the same way as we use, e.g., the logistic regression or the naive Bayes classifier. There is an obvious drawback: you can no longer specify a different optimizer for each layer.

So, yes, the plan is to delete all ANN optimizers and the general trainer class in favour of a unified interface and the ability to use the existing mlpack optimizers.

Keep in mind that the optimizer is independent of the underlying data structure and model. The optimizer tells the underlying function, e.g. the convolutional neural network, which set to choose from the complete dataset, and the function/model does the rest. This works because the optimizer calls the following function to evaluate a set:
```
function.Evaluate(weight, i);
```
The interesting parameter is `i`, which tells the function to evaluate the i'th set of the complete dataset. By the way, the dataset is held by the underlying function and not the optimizer, which is great because the function, e.g. the convolutional neural network, knows best how to work with the dataset. So we could tell the network to interpret one set of the dataset as a cube with 3 slices, without having to modify the optimizer or trainer.
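To make the division of labour concrete, here is a minimal toy sketch of that pattern (the names `SquaredErrorFunction`, `OptimizeSGD`, and `NumFunctions` are illustrative assumptions, not actual mlpack code): the function owns the dataset and interprets the index `i`, while the optimizer only loops over indices and never touches the data directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical "decomposable" objective: it owns the dataset, and the
// optimizer only hands it the current weight and the index i of the
// data point to evaluate. The function decides what i means.
struct SquaredErrorFunction
{
  std::vector<double> data;    // inputs, one value per point
  std::vector<double> targets; // desired outputs

  std::size_t NumFunctions() const { return data.size(); }

  // Evaluate the i'th term of the objective: (w * x_i - y_i)^2.
  double Evaluate(double weight, std::size_t i) const
  {
    const double error = weight * data[i] - targets[i];
    return error * error;
  }

  // Gradient of the i'th term with respect to the weight.
  double Gradient(double weight, std::size_t i) const
  {
    return 2.0 * (weight * data[i] - targets[i]) * data[i];
  }
};

// A toy SGD loop in the same spirit: the optimizer is completely
// generic, it only cycles over function indices.
double OptimizeSGD(const SquaredErrorFunction& f, double weight,
                   double stepSize, std::size_t epochs)
{
  for (std::size_t e = 0; e < epochs; ++e)
    for (std::size_t i = 0; i < f.NumFunctions(); ++i)
      weight -= stepSize * f.Gradient(weight, i);
  return weight;
}
```

Because only the function knows the data layout, swapping in a network whose "point" is a cube of slices changes nothing on the optimizer side.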

If you have any questions about the code, do not hesitate to ask me.

Actually I have a preference for solution 1. As I pointed out above, the model holds the data and knows best how to handle it. We could add a parameter to the CNN class that tells the network how to interpret the data, e.g. the number of slices that make up one set of the dataset. Something like this:

```CNN(input, labels, dataSize) { ... }```

The evaluation function of the convolutional neural network, which is called by the optimizer, can then choose the right slices for the requested set.
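The index arithmetic the evaluation function would need is simple; here is a small sketch (the function name `SliceRange` and the parameter `slicesPerPoint` are illustrative assumptions) mapping the set index `i` the optimizer passes in to the slice range the network should read:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical helper: given how many slices make up one data point
// (the dataSize idea suggested above), map the optimizer's set index i
// to the inclusive [first, last] slice range of that point.
std::pair<std::size_t, std::size_t> SliceRange(std::size_t i,
                                               std::size_t slicesPerPoint)
{
  const std::size_t first = i * slicesPerPoint;
  return { first, first + slicesPerPoint - 1 };
}
```

With 3 slices per point, set 0 maps to slices 0..2 and set 1 to slices 3..5, matching the cube layout in the example below.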


I was only bringing this up because I was wondering whether we can use your TrainTestSplit function to shuffle and split a dataset that is basically a cube, but whose data points span more than one slice.
```
// Two 23x23 images, each stored as a cube with 3 slices (channels).
arma::cube image1 = arma::cube(23, 23, 3);
arma::cube image2 = arma::cube(23, 23, 3);

// The complete dataset: 6 slices, where every 3 consecutive slices
// form one data point.
arma::cube data = arma::cube(23, 23, 6);
data.slices(0, 2) = image1;
data.slices(3, 5) = image2;
```

We can't just shuffle the slices of the dataset (data), because one set of the dataset spans 3 slices.
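One way to handle this (a sketch of the idea, not a proposed TrainTestSplit implementation; `ShuffledSliceOrder` is a hypothetical name) is to permute *point* indices and then expand each point back into its contiguous run of slice indices, so that slices belonging to one data point always stay together:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical sketch: shuffle a cube whose points each span several
// slices by shuffling point indices, then expanding each point index
// into its contiguous slice indices. The returned order can then
// drive e.g. data.slice(order[k]) copies into a shuffled cube.
std::vector<std::size_t> ShuffledSliceOrder(std::size_t numPoints,
                                            std::size_t slicesPerPoint,
                                            std::mt19937& rng)
{
  std::vector<std::size_t> points(numPoints);
  std::iota(points.begin(), points.end(), 0);
  std::shuffle(points.begin(), points.end(), rng);

  std::vector<std::size_t> order;
  order.reserve(numPoints * slicesPerPoint);
  for (const std::size_t p : points)
    for (std::size_t s = 0; s < slicesPerPoint; ++s)
      order.push_back(p * slicesPerPoint + s);
  return order;
}
```

A train/test split would then cut the shuffled point order at the desired ratio before expanding to slice indices, so no point is torn across the two partitions.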

> ps : the images are funny:)

I was planning to use the images for the GoogLeNet GSoC idea, but I figured that doesn't make sense at all, so I took the opportunity :)

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/523#issuecomment-189721853