<p>Some suggestions after more study</p>
<p>1 : Maybe a little bit late, if I am correct, IncrementPolicy make no sense with Imputer class? Maybe we can use static assert to forbid users specify the mapper policy as IncrementPolicy?</p>
<pre><code>static_assert(!std::is_same<MapperType, DatasetMapper<IncrementPolicy>>::value,
"Mapper type do not support DatasetMapper<IncrementPolicy>");
</code></pre>
<p>2 : I think separate the test of imputation all to another test case is easier to read</p>
<p>3 : Found a "bug"(this one cost me a while to find out) when I try to Impute all data by non columnMajor</p>
<pre><code>BOOST_AUTO_TEST_CASE(DatasetMapperImputeAllTest)
{
fstream f;
f.open("test_file.csv", fstream::out);
f << "a, 2, 3, 4" << endl;
f << "5, 6, a, 8" << endl;
f << "9, 10, 11, 12" << endl;
f.close();
arma::mat allInputColumnWise;
MissingPolicy allPolicy({"a"});
DatasetMapper<MissingPolicy> allInfo(allPolicy);
BOOST_REQUIRE(data::Load("test_file.csv", allInputColumnWise, allInfo) == true);
// convert missing vals to 99.
CustomImputation<double> allCustomStrategy(99);
Imputer<double,
DatasetMapper<MissingPolicy>,
CustomImputation<double>> allImputer(allInfo, allCustomStrategy);
// convert a or nan to 99 for all dimensions.
arma::mat allInputRowWise = allInputColumnWise;
allImputer.Impute(allInputColumnWise, "a");
allInputColumnWise.print();
std::cout<<allInfo.NumMappings(0)<<", "<<allInfo.NumMappings(1)<<", "
<<allInfo.NumMappings(2)<<", "<<allInfo.NumMappings(3)<<std::endl;
// Custom imputation result check
auto requireClose = [](arma::mat const &input)
{
BOOST_REQUIRE_CLOSE(input(0, 0), 99.0, 1e-5);
BOOST_REQUIRE_CLOSE(input(0, 1), 5.0, 1e-5);
BOOST_REQUIRE_CLOSE(input(0, 2), 9.0, 1e-5);
BOOST_REQUIRE_CLOSE(input(1, 0), 2.0, 1e-5);
BOOST_REQUIRE_CLOSE(input(1, 1), 6.0, 1e-5);
BOOST_REQUIRE_CLOSE(input(1, 2), 10.0, 1e-5);
BOOST_REQUIRE_CLOSE(input(2, 0), 3.0, 1e-5);
BOOST_REQUIRE_CLOSE(input(2, 1), 99.0, 1e-5);
BOOST_REQUIRE_CLOSE(input(2, 2), 11.0, 1e-5);
BOOST_REQUIRE_CLOSE(input(3, 0), 4.0, 1e-5);
BOOST_REQUIRE_CLOSE(input(3, 1), 8.0, 1e-5);
BOOST_REQUIRE_CLOSE(input(3, 2), 12.0, 1e-5);
};
requireClose(allInputColumnWise);
allImputer.ColumnMajor() = false;
allImputer.Impute(allInputRowWise, "a");
allInputRowWise.print("\n");
//Do not work as expected, because the results of NumMappings are 1,0,1,0
std::cout<<allInfo.NumMappings(0)<<", "<<allInfo.NumMappings(1)<<", "
<<allInfo.NumMappings(2)<<", "<<allInfo.NumMappings(3)<<std::endl;
requireClose(allInputRowWise);
// Remove the file.
remove("test_file.csv");
}
</code></pre>
<p>Maybe we should change the codes to</p>
<pre><code>void Impute(arma::Mat<T>& input,
const std::string& missingValue)
{
for (size_t i = 0; i < input.n_rows; ++i)
{
if (mapper.NumMappings(i) > 0)
{
T mappedValue = static_cast<T>(mapper.UnmapValue(missingValue, i));
strategy.Impute(input, mappedValue, i, true);
}
}
}
</code></pre>
<p>This should be ok, because dimensions of NumMappings will always equal to the n_rows of input.</p>
<p>4 : What if users want to impute more than one missing values?</p>
<p style="font-size:small;-webkit-text-size-adjust:none;color:#666;">—<br />You are receiving this because you are subscribed to this thread.<br />Reply to this email directly, <a href="https://github.com/mlpack/mlpack/pull/740#issuecomment-235713804">view it on GitHub</a>, or <a href="https://github.com/notifications/unsubscribe-auth/AJ4bFAlC1xUdVt8RVBpQX7xdyCow4yuQks5qZ8JNgaJpZM4JU0It">mute the thread</a>.<img alt="" height="1" src="https://github.com/notifications/beacon/AJ4bFODKWyd5RCxrquwcMifdi82EcBuUks5qZ8JNgaJpZM4JU0It.gif" width="1" /></p>
<div itemscope itemtype="http://schema.org/EmailMessage">
<div itemprop="action" itemscope itemtype="http://schema.org/ViewAction">
<link itemprop="url" href="https://github.com/mlpack/mlpack/pull/740#issuecomment-235713804"></link>
<meta itemprop="name" content="View Pull Request"></meta>
</div>
<meta itemprop="description" content="View this Pull Request on GitHub"></meta>
</div>