[mlpack-git] [mlpack] add train test split (#523)
Marcus Edel
notifications at github.com
Thu Feb 25 09:18:34 EST 2016
hanks for the pull request I think, this is the first step to get the GSOC 'Dataset and experimentation tools' project started.
I took a quick look over the pull request and have some comments:
- Instead of returning a tuple that contains the train and test data, I would go with references, to be more consistent with the other mlpack code that has functions like this.
``TrainTestSplit(const MatType& input, MatType& train, MatType& test)``
- You can create a random vector of indexes using armadillo instead of using an std::vector.
`'arma::Col<size_t> order = arma::shuffle(arma::linspace<arma::Col<size_t> >(0, (details::extractSize(input) - 1), details::extractSize(input)));``
I guess this is just a preference of mine, because your code is absolutely fine. I've seen this code fragment a couple of times, so I guess I'm biased.
- Please use camel casing for all names. Capitalize class and method names.
Another thing that crossed my mind is, can we modify the code in a why that it also works with data matrices with more than one dimension. Imagine a dataset of images with more than one channel per image.
---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/523#issuecomment-188802055
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.cc.gatech.edu/pipermail/mlpack-git/attachments/20160225/7329f326/attachment-0001.html>
More information about the mlpack-git
mailing list