<p>In <a href="https://github.com/mlpack/mlpack/pull/726#discussion_r75018312">src/mlpack/core/tree/binary_space_tree/rp_tree_max_split_impl.hpp</a>:</p>
<pre style='color:#555'>> + // Find the median of scalar products of the samples and the normal vector.
> + for (size_t k = 0; k < samples.n_elem; k++)
> +   values[k] = arma::dot(data.col(samples[k]), direction);
> +
> + const ElemType maximum = arma::max(values);
> + const ElemType minimum = arma::min(values);
> + if (minimum == maximum)
> +   return false;
> +
> + splitVal = arma::median(values);
> +
> + // Add a random deviation to the median.
> + // This algorithm differs from the method suggested in the
> + // random projection tree paper.
> + splitVal += math::Random((minimum - splitVal) * 0.75,
> +     (maximum - splitVal) * 0.75);
</pre>
<p>Apart from the clear speed improvement, why did you choose this simpler approach? I am not sure it still satisfies the theoretical guarantees given in the random projection tree paper, though I have not spent much time working out the necessary conditions for that.</p>
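<p>For reference, here is a minimal standalone sketch of what the quoted lines appear to compute, written against the standard library rather than Armadillo so that it can be compiled on its own. The names <code>Median</code> and <code>PerturbedMedianSplit</code> are hypothetical, <code>std::uniform_real_distribution</code> stands in for <code>math::Random</code> on the assumption that the latter is a uniform draw between its two arguments, and the median of an even-sized sample is assumed to be the average of the two middle values.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Median of a non-empty vector. For even sizes this averages the two middle
// values (an assumption about how the median is defined in the snippet).
double Median(std::vector<double> v)
{
  std::sort(v.begin(), v.end());
  const std::size_t n = v.size();
  return (n % 2 == 1) ? v[n / 2] : 0.5 * (v[n / 2 - 1] + v[n / 2]);
}

// Hypothetical stand-in for the quoted logic. `values` holds the projections
// of the sampled points onto the random direction. Returns false when all
// projections coincide; otherwise sets splitVal to the median plus a uniform
// offset drawn from [0.75 * (min - median), 0.75 * (max - median)).
bool PerturbedMedianSplit(const std::vector<double>& values,
                          std::mt19937& rng,
                          double& splitVal)
{
  if (values.empty())
    return false;

  const auto mm = std::minmax_element(values.begin(), values.end());
  const double minimum = *mm.first;
  const double maximum = *mm.second;
  if (minimum == maximum)
    return false;

  const double median = Median(values);
  assert(minimum <= median && median <= maximum);

  // The lower bound is <= 0 and the upper bound is >= 0, and they differ
  // because minimum != maximum, so the distribution is well-defined.
  std::uniform_real_distribution<double> offset(
      (minimum - median) * 0.75, (maximum - median) * 0.75);
  splitVal = median + offset(rng);
  return true;
}
```

<p>Under these assumptions the split value always stays inside the range of the projections, and the size of the deviation depends only on how far the median sits from the minimum and maximum. Whether that is enough to keep the guarantees from the paper is exactly the open question here.</p>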