[mlpack-git] [mlpack/mlpack] DatasetMapper & Imputer (#694)
Ryan Curtin
notifications at github.com
Thu Jul 7 10:58:25 EDT 2016
> + const T& mappedValue,
> + const size_t dimension,
> + const bool transpose = true)
> + {
> + // initiate output
> + output = input;
> + size_t count = 0;
> +
> + if (transpose)
> + {
> + for (size_t i = 0; i < input.n_cols; ++i)
> + {
> + if (input(dimension, i) == mappedValue ||
> + std::isnan(input(dimension, i)))
> + {
> + output.shed_col(i - count);
I think that here it might be faster to collect the list of columns that need to be kept, and then use non-contiguous matrix views to extract them in one pass, rather than calling `shed_col()` once for every dropped column. Something like this:
```
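// Indices of the columns to keep (here: columns 0, 2, and 3).
std::vector<arma::uword> colsToKeep;
colsToKeep.push_back(0);
colsToKeep.push_back(2);
colsToKeep.push_back(3);
// .cols() expects an arma::uvec of indices, so convert the std::vector first.
arma::uvec keep = arma::conv_to<arma::uvec>::from(colsToKeep);
arma::mat output = input.cols(keep);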
```
But before you commit that change, it is probably worth running some quick tests with `mlpack_preprocess_impute` to confirm that it actually speeds things up. I think you will see the biggest speedup with large matrices.
---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/pull/694/files/a8818316a04506530e2269a2e0a32ba2f6a1c83b#r69922436