[mlpack-git] [mlpack] add train test split (#523)

stereomatchingkiss notifications at github.com
Thu Feb 25 10:50:15 EST 2016


>I would go with references, to be more consistent with the other mlpack code that has functions like this.

I suggest we add another overload that accepts references and keep the old API, because with references six parameters would have to be passed to the function:

TrainTestSplit(const MatType& input, MatType& train, MatType& test,
               Row<size_t>& trainLabel, Row<size_t>& testLabel,
               double testRatio, unsigned int seed = 0);
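
To make the idea concrete, here is a minimal sketch of how the two overloads could coexist. It rests on several assumptions: the old API is taken to return its results by value in a std::tuple, labels are left out for brevity, and a hypothetical `Matrix` alias built on std::vector stands in for the Armadillo types so the sketch compiles without dependencies. It is not the code in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for an Armadillo matrix: one inner vector per column
// (one data point per column).
using Matrix = std::vector<std::vector<double>>;

// Reference-style overload: results are written into `train` and `test`.
void TrainTestSplit(const Matrix& input, Matrix& train, Matrix& test,
                    const double testRatio, const unsigned int seed = 0)
{
  assert(testRatio >= 0.0 && testRatio <= 1.0);

  // Build the column indices 0..n-1 and shuffle them with a seeded generator.
  std::vector<std::size_t> order(input.size());
  std::iota(order.begin(), order.end(), std::size_t(0));
  std::mt19937 gen(seed);
  std::shuffle(order.begin(), order.end(), gen);

  const std::size_t testCount =
      static_cast<std::size_t>(testRatio * input.size());
  const std::size_t trainCount = input.size() - testCount;

  // The first trainCount shuffled columns go to train, the rest to test.
  train.clear();
  test.clear();
  for (std::size_t i = 0; i < order.size(); ++i)
    (i < trainCount ? train : test).push_back(input[order[i]]);
}

// Value-returning overload (the assumed shape of the old API): it forwards to
// the reference overload, so both share one implementation.
std::tuple<Matrix, Matrix> TrainTestSplit(const Matrix& input,
                                          const double testRatio,
                                          const unsigned int seed = 0)
{
  Matrix train, test;
  TrainTestSplit(input, train, test, testRatio, seed);
  return std::make_tuple(std::move(train), std::move(test));
}
```

The two overloads differ in parameter count, so calls are never ambiguous, and callers who prefer the shorter form can keep using it while the reference form matches the rest of the code base.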

>You can create a random vector of indexes using armadillo instead of using an std::vector.

Thanks, the Armadillo version looks neater and shorter.

>works with data matrices with more than one dimension. Imagine a dataset of images with more than one channel per image.

I beg your pardon, could you give me more details? Do you mean:

a : Treat an arma::Mat as a whole image (1 to n channels) and an arma::Cube as a container of those images?

I think this is not a problem, since the arma::Cube overload already handles it. We could store the pixels of a multi-channel image in an arma::Mat, store those arma::Mat objects (each holding one to n channels) in an arma::Cube, and then split them.

b : You want to split an arma::Mat holding n channels into separate channels?

If it is case b, different libraries may store their pixels in different ways, so I think it is better to let users preprocess their data first.

Or do you mean some other case?

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/523#issuecomment-188849544