<p>Some performance measurements from loading a 38.8 MB file.</p>
<p>Fast csv : 60ms<br>
IoStream : 490ms</p>
<p>Fast csv is about 8 times faster than IoStream.<br>
That sounds good, but the bottleneck in Load is the parser, not file loading.</p>
<p>Original mapper performance : <br>
transpose : 9616 msec<br>
non transpose : 10131 msec</p>
<p>The parser and the mapper dominate the run time.</p>
<p>String-to-int measurements for different solutions: <a href="http://www.kumobius.com/2013/08/c-string-to-int/">http://www.kumobius.com/2013/08/c-string-to-int/</a></p>
<p>I assume it is quite hard to beat boost::spirit.</p>
<p>I guess the fastest run-time solution is to reuse part of the fast csv reader (I could extract part of the code) to read the file, and to use boost::spirit or a manual converter to parse it.</p>
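<p>As a rough illustration of the "manual converter" option, a hand-written loop might look like the sketch below. <code>ParseIntLine</code> is a hypothetical name, not part of mlpack or the fast csv reader. It assumes a NUL-terminated line (which is what <code>next_line()</code> is assumed to return) that contains only base-10 integers separated by commas, and it does no overflow or whitespace handling.</p>

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from mlpack or fast csv): parse one NUL-terminated,
// comma-separated line of base-10 integers into `out`.
// Returns false on an empty field or any unexpected character.
// Assumption: values fit in int; overflow is not checked.
bool ParseIntLine(const char* s, std::vector<int>& out)
{
  assert(s != nullptr);
  out.clear();
  while (true)
  {
    bool negative = false;
    if (*s == '-')
    {
      negative = true;
      ++s;
    }
    // Every field needs at least one digit.
    if (*s < '0' || *s > '9')
      return false;

    int value = 0;
    while (*s >= '0' && *s <= '9')
    {
      value = value * 10 + (*s - '0');
      ++s;
    }
    out.push_back(negative ? -value : value);

    if (*s == '\0')
      return true;   // end of line
    if (*s != ',')
      return false;  // unexpected character after a field
    ++s;             // skip the separator
  }
}
```

<p>In the read loop, each line returned by the reader could be handed to such a function in place of the generic parser. Whether it actually beats boost::spirit on this data would still have to be measured.</p>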
<p>Code used for the performance measurements:</p>
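<pre><code>// Headers needed by both test cases. "csv.h" is assumed to be the fast csv
// reader's header that provides io::LineReader.
#include <boost/test/unit_test.hpp>
#include <chrono>
#include <fstream>
#include <iostream>
#include <string>
#include "csv.h"

BOOST_AUTO_TEST_CASE(FastCSVSpeed)
{
  io::LineReader reader("big_file.csv");
  size_t line_num = 0;

  // Time only the line-by-line read; no parsing is done.
  auto const t1 = std::chrono::high_resolution_clock::now();
  while (auto *line = reader.next_line())
  {
    ++line_num;
  }
  auto const t2 = std::chrono::high_resolution_clock::now();
  auto const duration =
      std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count();

  std::cout << "line num " << line_num << std::endl;
  std::cout << "duration " << duration << std::endl;
}

BOOST_AUTO_TEST_CASE(IoStream)
{
  std::ifstream in("big_file.csv");
  // Kept from the original measurement. sync_with_stdio only affects the
  // standard streams, and a file stream is not tied by default.
  std::ios_base::sync_with_stdio(false);
  in.tie(nullptr);
  std::string line;
  size_t line_num = 0;

  // Time only the line-by-line read; no parsing is done.
  auto const t1 = std::chrono::high_resolution_clock::now();
  while (std::getline(in, line))
  {
    ++line_num;
  }
  auto const t2 = std::chrono::high_resolution_clock::now();
  auto const duration =
      std::chrono::duration_cast<std::chrono::milliseconds>(t2 - t1).count();

  std::cout << "line num " << line_num << std::endl;
  std::cout << "duration " << duration << std::endl;
}
</code></pre>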