<p>I hate to be so picky on this point, but I think we can do better than the speedup given by <code>dirtyDimensions</code> by providing an overload of <code>Impute()</code> that looks in all dimensions. This way only one pass over the input matrix is necessary:</p>
<pre><code>// Version for only one matrix that we operate on in-place.
void Impute(arma::Mat<T>& matrix, ...)
{
// Get list of dimensions to check.
arma::Col<size_t> dirtyDimensions = ...; // just like in the main.cpp now
// Iterate over all columns in the matrix.
for (size_t c = 0; c < matrix.n_cols; ++c)
{
// Check each dirty column.
for (size_t d = 0; d < dirtyDimensions.n_elem; ++d)
{
if (matrix(dirtyDimensions[d], c) == missingValue)
matrix(dirtyDimensions[d], c) = newValue;
}
}
}
</code></pre>
<p>And the other overload:</p>
<pre><code>// Version for one input and one output matrix.
void Impute(const arma::Mat<T>& input, arma::Mat<T>& output, ...)
{
output.set_size(input.n_rows, input.n_cols);
for (size_t c = 0; c < input.n_cols; ++c)
{
// No need for dirtyDimensions since we must loop over all dimensions.
for (size_t r = 0; r < input.n_rows; ++r)
{
if (input(r, c) == missingValue)
output(r, c) = newValue;
else
output(r, c) = input(r, c);
}
}
}
</code></pre>
<p>If we do things like that, it should be faster than the current approach. I'm focusing so much on speed here, because other libraries like scikit-learn have this functionality already, and if we are not able to perform the same operations more quickly, there is no compelling reason to use this implementation vs. another one.</p>
<p>Otherwise everything looks fine to me, although I think to fix this issue there will have to be some refactoring.</p>
<p style="font-size:small;-webkit-text-size-adjust:none;color:#666;">—<br />You are receiving this because you are subscribed to this thread.<br />Reply to this email directly, <a href="https://github.com/mlpack/mlpack/pull/694#issuecomment-234574772">view it on GitHub</a>, or <a href="https://github.com/notifications/unsubscribe-auth/AJ4bFDqMh_K8WK71WBKw6zYuTS2beDBzks5qYOE7gaJpZM4I07W-">mute the thread</a>.<img alt="" height="1" src="https://github.com/notifications/beacon/AJ4bFPFL__WMs40Avl3HkfJSnEvzzocCks5qYOE7gaJpZM4I07W-.gif" width="1" /></p>
<div itemscope itemtype="http://schema.org/EmailMessage">
<div itemprop="action" itemscope itemtype="http://schema.org/ViewAction">
<link itemprop="url" href="https://github.com/mlpack/mlpack/pull/694#issuecomment-234574772"></link>
<meta itemprop="name" content="View Pull Request"></meta>
</div>
<meta itemprop="description" content="View this Pull Request on GitHub"></meta>
</div>