[mlpack-svn] r11641 - mlpack/trunk/doc/tutorials/linear_regression

fastlab-svn at coffeetalk-1.cc.gatech.edu
Tue Feb 28 23:08:00 EST 2012


Author: rcurtin
Date: 2012-02-28 23:08:00 -0500 (Tue, 28 Feb 2012)
New Revision: 11641

Modified:
   mlpack/trunk/doc/tutorials/linear_regression/linear_regression.txt
Log:
Fix references in tutorial; rewrite intro a little bit and clarify some
terminology here and there.  Also unnecessary grammar Nazi strikes again...


Modified: mlpack/trunk/doc/tutorials/linear_regression/linear_regression.txt
===================================================================
--- mlpack/trunk/doc/tutorials/linear_regression/linear_regression.txt	2012-02-28 22:28:25 UTC (rev 11640)
+++ mlpack/trunk/doc/tutorials/linear_regression/linear_regression.txt	2012-02-29 04:08:00 UTC (rev 11641)
@@ -4,64 +4,82 @@
 @author James Cline
 @brief Tutorial for how to use the LinearRegression class.
 
-@page lrtutorial Linear Regression tutorial (linear-regression)
+@page lrtutorial Linear Regression tutorial (linear_regression)
 
-@section intro Introduction
+@section intro_lrtut Introduction
 
-Linear regression is a statistical method which approximates a set of points as a
-linear function. We use a matrix representation of our dataset, called \b 
-predictors, and a vector of \b responses. The method will find the \f$dim+1\f$
-coefficients, \b parameters, for the linear function \f$y=c_0 + \sum_{i=1}^{dim}
-c_i x_i\f$.
+Linear regression is a simple machine learning technique which aims to estimate
+the parameters of a linear model.  Assuming we have \f$n\f$ \b predictor points
+\f$\mathbf{x_i}, 0 \le i < n\f$ of dimensionality \f$d\f$ and \f$n\f$
+responses \f$y_i, 0 \le i < n\f$, we are trying to estimate the best fit for
+\f$\beta_i, 0 \le i \le d\f$ in the linear model
 
+\f[
+y_i = \beta_0 + \displaystyle\sum_{j = 1}^{d} \beta_j x_{ij}
+\f]
+
+for each predictor \f$\mathbf{x_i}\f$ and response \f$y_i\f$.  If we take each
+predictor \f$\mathbf{x_i}\f$ as a row in the matrix \f$\mathbf{X}\f$ and each
+response \f$y_i\f$ as an entry of the vector \f$\mathbf{y}\f$, we can represent
+the model in vector form:
+
+\f[
+\mathbf{y} = \mathbf{X} \mathbf{\beta} + \beta_0
+\f]
+
+The result of this method is the vector \f$\mathbf{\beta}\f$, including the
+offset term (or intercept term) \f$\beta_0\f$.
+
 \b mlpack provides:
 
  - a \ref cli "simple command-line executable" to perform linear regression
    (and predict responses for a second dataset)
  - a \ref linreg "simple C++ interface" to perform linear regression
 
-@section toc Table of Contents
+@section toc_lrtut Table of Contents
 
 A list of all the sections this tutorial contains.
 
- - \ref intro
- - \ref toc
- - \ref cli
-   - \ref cli_ex1
-   - \ref cli_ex2
-   - \ref cli_ex3
- - \ref linreg
-   - \ref linreg_ex1
-   - \ref linreg_ex2
-   - \ref linreg_ex3
-   - \ref linreg_ex4
- - \ref further_doc
+ - \ref intro_lrtut
+ - \ref toc_lrtut
+ - \ref cli_lrtut
+   - \ref cli_ex1_lrtut
+   - \ref cli_ex2_lrtut
+   - \ref cli_ex3_lrtut
+ - \ref linreg_lrtut
+   - \ref linreg_ex1_lrtut
+   - \ref linreg_ex2_lrtut
+   - \ref linreg_ex3_lrtut
+   - \ref linreg_ex4_lrtut
+ - \ref further_doc_lrtut
 
-@section cli Command-Line 'linear_regression'
+@section cli_lrtut Command-Line 'linear_regression'
 
 The simplest way to perform linear regression in \b mlpack is to use the
 linear_regression executable.  This program will perform linear regression and
 place the resultant coefficients into one file.
-The output file holds a vector of coefficients in increasing order, that is,
-the coefficient for \f$x_1\f$ then \f$x_2\f$ as well as the intercept.
-This executable can also predict the \f$y\f$ values of a second dataset based
-on the computed coefficients.
 
+The output file holds a vector of coefficients in increasing order of dimension;
+that is, the offset term (\f$\beta_0\f$), the coefficient for dimension 1
+(\f$\beta_1\f$), then dimension 2 (\f$\beta_2\f$), and so forth.  The offset
+term is also called the intercept.  This executable can also predict the
+\f$y\f$ values of a second dataset based on the computed coefficients.
+
 Below are several examples of simple usage (and the resultant output).  The '-v'
-option is used so that output is given.  Further documentation on each
+option is used so that verbose output is given.  Further documentation on each
 individual option can be found by typing
 
 @code
 $ linear_regression --help
 @endcode
 
-@subsection cli_ex1 One file, generating the function coefficients
+@subsection cli_ex1_lrtut One file, generating the function coefficients
 
 @code
 $ linear_regression --input_file dataset.csv -v
 [INFO ] Loading 'dataset.csv' as CSV data.
 [INFO ] Saving CSV data to 'parameters.csv'.
-[INFO ] 
+[INFO ]
 [INFO ] Execution parameters:
 [INFO ]   help: false
 [INFO ]   info: ""
@@ -71,7 +89,7 @@
 [INFO ]   output_predictions: predictions.csv
 [INFO ]   test_file: ""
 [INFO ]   verbose: true
-[INFO ] 
+[INFO ]
 [INFO ] Program timers:
 [INFO ]   load_regressors: 0.006461s
 [INFO ]   regression: 0.000347s
@@ -97,10 +115,10 @@
 As you can see, the function for this input is \f$f(x)=0+1x_1\f$. Keep in mind
 that in this example, the responses for the dataset are the second column.
 That is, the dataset is one dimensional, and the last column has the \f$y\f$
-values, or responses, for each row. You can specify these responses in a 
+values, or responses, for each row. You can specify these responses in a
 separate file if you want, using the --input_responses, or -r, option.
 
-@subsection cli_ex2 Compute model and predict at the same time
+@subsection cli_ex2_lrtut Compute model and predict at the same time
 
 @code
 $ linear_regression --input_file dataset.csv --test_file predict.csv -v
@@ -108,7 +126,7 @@
 [INFO ] Saving CSV data to 'parameters.csv'.
 [INFO ] Loading 'predict.csv' as CSV data.
 [INFO ] Saving CSV data to 'predictions.csv'.
-[INFO ] 
+[INFO ]
 [INFO ] Execution parameters:
 [INFO ]   help: false
 [INFO ]   info: ""
@@ -119,7 +137,7 @@
 [INFO ]   output_predictions: predictions.csv
 [INFO ]   test_file: predict.csv
 [INFO ]   verbose: true
-[INFO ] 
+[INFO ]
 [INFO ] Program timers:
 [INFO ]   load_regressors: 0.000360s
 [INFO ]   load_test_points: 0.000090s
@@ -151,16 +169,16 @@
 We used the same dataset, so we got the same parameters. The key thing to note
 about the predict.csv dataset is that it has the same dimensionality as the
 dataset used to create the model, one. Generally, if the model generating
-dataset has \f$n\f$ dimensions, so must the dataset we want to predict for.
+dataset has \f$d\f$ dimensions, so must the dataset we want to predict for.
 
-@subsection cli_ex3 Prediction using a precomputed model
+@subsection cli_ex3_lrtut Prediction using a precomputed model
 
 @code
 $ linear_regression --model_file parameters.csv --test_file predict.csv -v
 [INFO ] Loading 'parameters.csv' as CSV data.
 [INFO ] Loading 'predict.csv' as CSV data.
 [INFO ] Saving CSV data to 'predictions.csv'.
-[INFO ] 
+[INFO ]
 [INFO ] Execution parameters:
 [INFO ]   help: false
 [INFO ]   info: ""
@@ -171,22 +189,22 @@
 [INFO ]   output_predictions: predictions.csv
 [INFO ]   test_file: predict.csv
 [INFO ]   verbose: true
-[INFO ] 
+[INFO ]
 [INFO ] Program timers:
 [INFO ]   load_model: 0.009519s
 [INFO ]   load_test_points: 0.000067s
 [INFO ]   prediction: 0.000007s
 [INFO ]   total_time: 0.010081s
 
-$ cat parameters.csv 
+$ cat parameters.csv
 -0.0000000000e+00,1.0000000000e+00
 
-$ cat predict.csv 
+$ cat predict.csv
 2
 3
 4
 
-$ cat predictions.csv 
+$ cat predictions.csv
 2.0000000000e+00
 3.0000000000e+00
 4.0000000000e+00
@@ -194,15 +212,16 @@
 
 Further documentation on options can be found by using the --help option.
 
-@section linreg The 'LinearRegression' class
+@section linreg_lrtut The 'LinearRegression' class
 
 The 'LinearRegression' class is a simple implementation of linear regression.
 
-Using the LinearRegression class is very simple. It has two available constructors,
-one for generating a model from a matrix of predictors and a vector of responses,
-and one for loading an already computed model from a given file.
+Using the LinearRegression class is very simple. It has two available
+constructors: one for generating a model from a matrix of predictors and a
+vector of responses, and one for loading an already computed model from a given
+file.
 
-The class provides one method that does work:
+The class provides one method that performs computation:
 @code
 void Predict(const arma::mat& points, arma::vec& predictions);
 @endcode
@@ -212,61 +231,67 @@
 predictions, will be modified to contain the predicted values corresponding to
 each row of the points matrix.
 
-@subsection linreg_ex1 Generating a model
+@subsection linreg_ex1_lrtut Generating a model
 
 @code
 #include <mlpack/methods/linear_regression/linear_regression.hpp>
 
-using namespace::regression
+using namespace mlpack::regression;
 
-arma::mat data; // The dataset itself
-arma::vec responses; // The responses, one row for each row in data
+arma::mat data; // The dataset itself.
+arma::vec responses; // The responses, one row for each row in data.
 
-// Regress
+// Regress.
 LinearRegression lr(data,responses);
 
-// Get the parameters, or coefficients
-arma vec parameters = lr.Parameters();
+// Get the parameters, or coefficients.
+arma::vec parameters = lr.Parameters();
 @endcode
 
-@subsection linreg_ex2 Setting a model
+@subsection linreg_ex2_lrtut Setting a model
+
 Assuming you already have a model and do not need to create one, this is how
 you would set the parameters for a LinearRegression instance.
 
 @code
-arma::vec parameters; // Your model
+arma::vec parameters; // Your model.
 
-LinearRegression lr(); // Create a new LinearRegression instance or reuse one
-lr.Parameters() = parameters; // Set the model
+LinearRegression lr; // Create a new LinearRegression instance or reuse one.
+lr.Parameters() = parameters; // Set the model.
 @endcode
 
-@subsection linreg_ex3 Load a model from a file
+@subsection linreg_ex3_lrtut Load a model from a file
+
 If you have a generated model in a file somewhere you would like to load and use,
 you can simply pass its filename to the LinearRegression constructor like so.
 
 @code
-std::string filename; // The path and name of your file
+std::string filename; // The path and name of your file.
 
-LinearRegression lr(filename); // Will load the model internally
+LinearRegression lr(filename); // Will load the model internally.
 @endcode
 
-@subsection linreg_ex4 Prediction
+@subsection linreg_ex4_lrtut Prediction
+
 Once you have generated or loaded a model using one of the aforementioned methods,
 you can predict values for a dataset.
 
 @code
 
 LinearRegression lr;
-// Load or generate your model
+// Load or generate your model.
 
-arma::mat points; // The dataset we want to predict on, each row is a data point
-arma::vec predictions; // This will store the predictions, one row for each point
+// The dataset we want to predict on; each row is a data point.
+arma::mat points;
+// This will store the predictions; one row for each point.
+arma::vec predictions;
 
-lr.Predict(points, predictions); // Predict
-// Now, predictions will contain the predicted values.
+lr.Predict(points, predictions); // Predict.
+
+// Now, the vector 'predictions' will contain the predicted values.
 @endcode
 
-@subsection further_doc Further documentation
+@subsection further_doc_lrtut Further documentation
 
 For further documentation on the LinearRegression class, consult the
 \ref mlpack::regression::LinearRegression "complete API documentation".



