[mlpack-git] [mlpack/mlpack] add cli executable for data_split (#650)

stereomatchingkiss notifications at github.com
Sun May 29 11:28:23 EDT 2016


@keonkim  Looks good for me, only something need to refine

1 : format 
Split functions have some trailing spaces, please remove them

2 : move the train and test data into std::make_tuple
The overload of Split function which return tuple, please move the trainData and testData into the tuple.

`return std::make_tuple(std::move(trainData), std::move(testData));`

3 : `if ((testRatio < 0.0) && (testRatio > 1.0))`

should be

`if ((testRatio < 0.0) || (testRatio > 1.0))`

4 :  Following codes will copy the value, you can

```
mat training = get<0>(value);
mat test = get<1>(value);
mat trainingLabels = get<2>(value);
mat testLabels = get<3>(value);
```

a : move it
```
mat training = std::move(get<0>(value));
mat test = std::move(get<1>(value));
mat trainingLabels = std::move(get<2>(value));
mat testLabels = std::move(get<3>(value));

```
b : use tuple directly
```
data::Save(trainingFile, get<0>(value), false);
data::Save(testFile, get<1>(value), false);
data::Save(trainingLabelsFile, get<2>(value), false);
data::Save(testLabelsFile, get<3>(value), false);
```

c : use another api from the beginning
```
mat training;
mat test;
mat trainingLabels;
mat testLabels;
data::Split(data, labels_row, training, test, trainingLabels, testLabels, testRatio);
```
d : take the reference, with const( I love const :) )
```
mat const &training = std::move(get<0>(value));
mat const &test = std::move(get<1>(value));
mat const &trainingLabels = std::move(get<2>(value));
mat const &testLabels = std::move(get<3>(value));

```

split without parameters part has the same problem

5 : The test of non-labels Split haven't finished

The file only test the size after split, but do not test the data after are split into correct clusters(train and test). To test it, I think we could 

step 1 : generate sequential, unique values to the input data
step 2 : split input data into train data and test data
step 3 : make sure all of the data in input data could be found in train data and test data


---
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/650#issuecomment-222366440
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://mailman.cc.gatech.edu/pipermail/mlpack-git/attachments/20160529/8236fa41/attachment.html>


More information about the mlpack-git mailing list