<p>The current <code>maps</code> object in DatasetMapper can be described as<br>
<code>map&lt;dimension, pair&lt;bimap&lt;string, MappedType&gt;, numMappings&gt;&gt;</code> (with <code>numMappings</code> usually being a numeric primitive type).</p>

<p>I think this map can be simplified by splitting it into two parts:</p>

<pre><code>// MapType = map&lt;dimension, bimap&lt;string, MappedType&gt;&gt;;
MapType maps;
size_t numMappings;
</code></pre>
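<p>A minimal sketch of how the simplified pair could look in practice. The names <code>SimpleBimap</code> and <code>SimpleMapper</code> are hypothetical and not part of mlpack; two hash maps stand in for the bimap, and <code>numMappings</code> is assumed here to be a total count across all dimensions.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for bimap<string, MappedType>:
// two hash maps that are always updated together.
template<typename MappedType>
struct SimpleBimap
{
  std::unordered_map<std::string, MappedType> forward;
  std::unordered_map<MappedType, std::string> reverse;
};

// Hypothetical mapper holding only `maps` and a single `numMappings`.
template<typename MappedType = std::size_t>
class SimpleMapper
{
 public:
  // Return the numeric value for `s` in `dimension`, creating one if needed.
  MappedType MapString(const std::string& s, const std::size_t dimension)
  {
    SimpleBimap<MappedType>& bm = maps[dimension];
    auto it = bm.forward.find(s);
    if (it != bm.forward.end())
      return it->second;

    // The next unused value is the number of strings already seen
    // in this dimension, so no per-dimension counter is stored.
    const MappedType value = static_cast<MappedType>(bm.forward.size());
    bm.forward.emplace(s, value);
    bm.reverse.emplace(value, s);
    ++numMappings;
    return value;
  }

  // Return the original string for `value` in `dimension`.
  // Throws std::out_of_range if the dimension or value is unknown.
  const std::string& UnmapString(const MappedType value,
                                 const std::size_t dimension) const
  {
    return maps.at(dimension).reverse.at(value);
  }

  // Assumption: total number of mappings across all dimensions.
  std::size_t NumMappings() const { return numMappings; }

 private:
  std::unordered_map<std::size_t, SimpleBimap<MappedType>> maps;
  std::size_t numMappings = 0;
};
```

<p>In this sketch the per-dimension count can be read from the size of each bimap, which is one possible reason a single counter next to <code>maps</code> might be enough.</p>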

<p>For validation and imputation purposes, we could add a second mapping object (I will call it <code>invalidMaps</code> for now), which looks like this:</p>

<pre><code>// InvalidMapType = map&lt;string, std::pair&lt;dimension, point&gt;&gt;
InvalidMapType invalidMaps;
size_t numInvalidMappings;
</code></pre>

<p><code>invalidMaps</code> and <code>maps</code> serve two different purposes.<br>
<code>maps</code> is used as usual (mapping categorical features to numeric values).<br>
<code>invalidMaps</code> is used as a temporary holder for future imputation. Both the x and y coordinates (the dimension and the point) have to be stored in order to track the invalid values, since every invalid value is turned into NaN.</p>
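<p>A rough sketch of the <code>invalidMaps</code> idea, under a few assumptions: <code>InvalidTracker</code>, <code>MarkInvalid</code> and <code>Impute</code> are hypothetical names, a nested <code>std::vector</code> stands in for the data matrix, and a multimap is used so that the same invalid string can be recorded at several positions.</p>

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Plain stand-in for a data matrix, indexed as data[dimension][point].
using Matrix = std::vector<std::vector<double>>;

// Hypothetical holder for invalid entries awaiting imputation.
class InvalidTracker
{
 public:
  using Position = std::pair<std::size_t, std::size_t>;  // (dimension, point)

  // Write NaN into the matrix and remember the raw string and its position.
  void MarkInvalid(Matrix& data, const std::string& raw,
                   const std::size_t dimension, const std::size_t point)
  {
    data[dimension][point] = std::numeric_limits<double>::quiet_NaN();
    invalidMaps.emplace(raw, Position(dimension, point));
    ++numInvalidMappings;
  }

  // Replace every recorded invalid entry in `dimension` with `value`
  // and forget it; returns how many entries were filled.
  std::size_t Impute(Matrix& data, const std::size_t dimension,
                     const double value)
  {
    std::size_t filled = 0;
    for (auto it = invalidMaps.begin(); it != invalidMaps.end(); )
    {
      if (it->second.first == dimension)
      {
        data[dimension][it->second.second] = value;
        it = invalidMaps.erase(it);  // erase returns the next iterator
        ++filled;
      }
      else
      {
        ++it;
      }
    }
    numInvalidMappings -= filled;
    return filled;
  }

  std::size_t NumInvalidMappings() const { return numInvalidMappings; }

 private:
  // Assumption: a multimap, since one raw string may be invalid in many cells.
  std::unordered_multimap<std::string, Position> invalidMaps;
  std::size_t numInvalidMappings = 0;
};
```

<p>Because the stored positions say exactly where each NaN was written, the imputation step does not need to scan the whole matrix for NaNs.</p>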

<p>Ultimately, I think this would let us get by with a single mapping policy instead of many.<br>
What do you think of this idea?</p>
