[mlpack-git] [mlpack] segmentation fault with LogisticRegression API (#428)

John Lees notifications at github.com
Mon Mar 30 18:22:31 EDT 2015


I am trying to run a LogisticRegression as follows:
void logisticTest(Kmer& k, const arma::vec& y_train)
{
   // Train classifier
   arma::mat x_train = k.get_x();
   mlpack::regression::LogisticRegression<> fit(x_train, y_train);
...

where the get_x function of the Kmer class returns an arma::mat of size 1x3069 (the same number of elements as y_train).
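Since mlpack treats each column of the predictor matrix as one observation, a 1x3069 matrix is read as 3069 one-dimensional points, so its column count should equal the number of responses. Below is a minimal sketch of a check that could be run just before the constructor call. The names shape_problem, ToyMat and ToyVec are hypothetical and not part of mlpack or Armadillo; the helper assumes only the public n_cols and n_elem size members that Armadillo's matrix and vector types expose. It rules out a shape mismatch and does not by itself explain the crash.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-ins exposing the same public size members as
// arma::mat (n_rows, n_cols) and arma::vec (n_elem), so the sketch
// compiles without Armadillo installed.
struct ToyMat { std::size_t n_rows; std::size_t n_cols; };
struct ToyVec { std::size_t n_elem; };

// Returns an empty string when the predictor/response shapes agree,
// otherwise a short description of the mismatch.
// Assumption: columns of x are observations, as in mlpack's convention.
template <typename MatT, typename VecT>
std::string shape_problem(const MatT& x, const VecT& y)
{
    std::ostringstream msg;
    if (x.n_cols != y.n_elem)
    {
        msg << "predictors have " << x.n_cols
            << " columns (points) but responses have " << y.n_elem
            << " elements";
    }
    return msg.str();
}
```

With the real objects the call would be shape_problem(x_train, y_train); an empty result means the shapes are consistent, so the problem lies elsewhere (for example in how the program is built or linked).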

I include in the header:
#include <iostream>
#include <fstream>
#include <cmath>
#include <cstdlib>
#include <string>
#include <algorithm>
#include <list>
#include <vector>
#include <thread>
#include <exception>
#include <sys/stat.h>

// Boost headers
#include <boost/program_options.hpp>
#include <boost/math/distributions/normal.hpp>

// Armadillo/mlpack headers
#include <mlpack/core.hpp>
#include <mlpack/methods/logistic_regression/logistic_regression.hpp>

The code compiles with:
g++ -Wall -g -O0 -std=c++11  -c -o kmer.o kmer.cpp
g++ -Wall -g -O0 -std=c++11  -c -o Assoc.o Assoc.cpp
g++ -Wall -g -O0 -std=c++11 Assoc.o kmer.o -lmlpack -larmadillo -lboost_program_options -lblas -llapack -lm -o assoc

x and y are vectors of 0 or 1 (as doubles). I've tried with a number of different x and y of this form.

When I run the code, I always get a segmentation fault. Valgrind suggests it occurs in mlpack::regression::LogisticRegressionFunction::Evaluate, but I can't tell where exactly, as the library is stripped of debugging information:

==5679== Invalid read of size 8
==5679==    at 0x4EE7702: mlpack::regression::LogisticRegressionFunction::Evaluate(arma::Mat<double> const&) const (in /usr/local/lib/libmlpack.so.1.0)
==5679==    by 0x439D09: mlpack::optimization::L_BFGS<mlpack::regression::LogisticRegressionFunction>::Evaluate(arma::Mat<double> const&) (lbfgs_impl.hpp:83)
==5679==    by 0x437C26: mlpack::optimization::L_BFGS<mlpack::regression::LogisticRegressionFunction>::Optimize(arma::Mat<double>&, unsigned long) (lbfgs_impl.hpp:364)
==5679==    by 0x435CAF: mlpack::optimization::L_BFGS<mlpack::regression::LogisticRegressionFunction>::Optimize(arma::Mat<double>&) (lbfgs_impl.hpp:331)
==5679==    by 0x434A5A: mlpack::regression::LogisticRegression<mlpack::optimization::L_BFGS>::LogisticRegression(arma::Mat<double> const&, arma::Col<double> const&, double) (logistic_regression_impl.hpp:37)
==5679==    by 0x431DE5: logisticTest(Kmer&, arma::Col<double> const&) (pangwasAssoc.cpp:17)
==5679==    by 0x419FB5: main (pangwasMain.cpp:109)
==5679==  Address 0xfff001000 is not stack'd, malloc'd or (recently) free'd
==5679== 
==5679== 
==5679== Process terminating with default action of signal 11 (SIGSEGV)
==5679==  Access not within mapped region at address 0xFFF001000
==5679==    at 0x4EE7702: mlpack::regression::LogisticRegressionFunction::Evaluate(arma::Mat<double> const&) const (in /usr/local/lib/libmlpack.so.1.0)
==5679==    by 0x439D09: mlpack::optimization::L_BFGS<mlpack::regression::LogisticRegressionFunction>::Evaluate(arma::Mat<double> const&) (lbfgs_impl.hpp:83)
==5679==    by 0x437C26: mlpack::optimization::L_BFGS<mlpack::regression::LogisticRegressionFunction>::Optimize(arma::Mat<double>&, unsigned long) (lbfgs_impl.hpp:364)
==5679==    by 0x435CAF: mlpack::optimization::L_BFGS<mlpack::regression::LogisticRegressionFunction>::Optimize(arma::Mat<double>&) (lbfgs_impl.hpp:331)
==5679==    by 0x434A5A: mlpack::regression::LogisticRegression<mlpack::optimization::L_BFGS>::LogisticRegression(arma::Mat<double> const&, arma::Col<double> const&, double) (logistic_regression_impl.hpp:37)
==5679==    by 0x431DE5: logisticTest(Kmer&, arma::Col<double> const&) (Assoc.cpp:17)
==5679==    by 0x419FB5: main (Main.cpp:109)

Before this error, I also get a few warnings of the form:
==5679== Conditional jump or move depends on uninitialised value(s)
==5679==    at 0x58B68EF: log (w_log.c:30)
==5679==    by 0x4EE770B: mlpack::regression::LogisticRegressionFunction::Evaluate(arma::Mat<double> const&) const (in /usr/local/lib/libmlpack.so.1.0)

However, when I run the logistic_regression binary from mlpack on the same x and y vectors, written out as CSV files, it works just fine:
logistic_regression --input_file x.vec.csv --input_responses y.vec.csv --output_file logit.txt -v
[INFO ] Loading 'x.vec.csv' as CSV data.  Size is 1 x 3069.
[INFO ] Loading 'y.vec.csv' as CSV data.  Size is 1 x 3069.
[INFO ] Training model with L-BFGS optimizer.
[INFO ] LogisticRegression::LogisticRegression(): final objective of trained model is 2030.33.
[INFO ] Saving model to 'logit.txt'.
[INFO ] Saving raw ASCII formatted data to 'logit.txt'.
[INFO ] 
[INFO ] Execution parameters:
[INFO ]   decision_boundary: 0.5
[INFO ]   help: false
[INFO ]   info: ""
[INFO ]   input_file: x.vec.csv
[INFO ]   input_responses: y.vec.csv
[INFO ]   lambda: 0
[INFO ]   max_iterations: 0
[INFO ]   model_file: ""
[INFO ]   optimizer: lbfgs
[INFO ]   output_file: logit.txt
[INFO ]   output_predictions: predictions.csv
[INFO ]   step_size: 0.01
[INFO ]   test_file: ""
[INFO ]   tolerance: 1e-10
[INFO ]   verbose: true
[INFO ]   version: false
[INFO ] 
[INFO ] Program timers:
[INFO ]   loading_data: 0.006700s
[INFO ]   logistic_regression_optimization: 0.067037s
[INFO ]   saving_data: 0.010786s
[INFO ]   total_time: 0.088893s

Could anyone help me debug this? Have I used the API or some pointers incorrectly, or perhaps compiled incorrectly? That seems more likely than a bug in the function, given that the binary works fine.
Apologies if this is the wrong forum for this problem.

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/issues/428