[mlpack-git] master: Update documentation. (79bf330)

gitdub at mlpack.org gitdub at mlpack.org
Thu Apr 14 10:11:39 EDT 2016


Repository : https://github.com/mlpack/mlpack
On branch  : master
Link       : https://github.com/mlpack/mlpack/compare/f73d89833d45e0870d97c133ad55094f494c8061...08ffa1b0c6d0a9fa05e2eb3dc9a993ea7fa97d54

>---------------------------------------------------------------

commit 79bf33090e229961a2bd90333bc42e9de69688d3
Author: Ryan Curtin <ryan at ratml.org>
Date:   Wed Apr 13 11:03:20 2016 -0400

    Update documentation.


>---------------------------------------------------------------

79bf33090e229961a2bd90333bc42e9de69688d3
 doc/guide/formats.hpp | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/doc/guide/formats.hpp b/doc/guide/formats.hpp
index 0ce8702..236fce0 100644
--- a/doc/guide/formats.hpp
+++ b/doc/guide/formats.hpp
@@ -29,15 +29,15 @@ following file types:
  - Armadillo binary, denoted by .bin
  - Raw binary, denoted by .bin \b "(note: this will be loaded as"
    \b "one-dimensional data, which is likely not what is desired.)"
- - HDF5, denoted by .hdf, .hdf5, .h5, or .he5 \b "(note: HDF5 must be enabled"
-   \b "in the Armadillo configuration)"
- - ARFF, denoted by .arff \b "(note: this is not supported by all mlpack"
-   \b "command-line programs"; see \ref formatinfo )
+ - HDF5, denoted by .hdf, .hdf5, .h5, or .he5 (<b>note: HDF5 must be enabled
+   in the Armadillo configuration</b>)
+ - ARFF, denoted by .arff (<b>note: this is not supported by all mlpack
+   command-line programs</b>; see \ref formatcat )
 
-Datasets that are loaded by mlpack should be stored with \b "one row for "
-\b "one point" and \b "one column for one dimension".  Therefore, a dataset with
-three two-dimensional points \f$(0, 1)\f$, \f$(3, 1)\f$, and \f$(5, -5)\f$ would
-be stored in a csv file as:
+Datasets that are loaded by mlpack should be stored with <b>one row for 
+one point</b> and <b>one column for one dimension</b>.  Therefore, a dataset
+with three two-dimensional points \f$(0, 1)\f$, \f$(3, 1)\f$, and \f$(5, -5)\f$
+would be stored in a csv file as:
 
 \code
 0, 1
@@ -107,7 +107,6 @@ but also as categorical data (i.e. with numeric but unordered categories).  This
 support is useful for, e.g., decision trees and other models that support
 categorical features.
 
-
 In some machine learning situations, such as, e.g., decision trees, categorical
 data can be used.  Categorical data might look like this (in CSV format):
 
@@ -142,6 +141,10 @@ $ mlpack_hoeffding_tree -t dataset.csv -l dataset.labels.csv -v
 ...
 \endcode
 
+Currently, only the \c mlpack_hoeffding_tree program supports loading
+categorical data, and this is also the only program that supports loading an
+ARFF dataset.
+
 @section formatcatcpp Categorical features and C++
 
 When writing C++, loading categorical data is slightly more tricky: the mappings
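
For readers of the patched guide text, the "one row for one point, one column for one dimension" file layout can be illustrated with a small standard-library sketch. This is not mlpack's loader and makes no claim about how mlpack stores data internally; the helper name ParseCsvPoints is hypothetical and exists only for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of mlpack): parse CSV text in which each
// line holds one point and each comma-separated field holds one dimension.
std::vector<std::vector<double>> ParseCsvPoints(const std::string& text)
{
  std::vector<std::vector<double>> points;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line))
  {
    // Skip lines that contain only whitespace.
    if (line.find_first_not_of(" \t\r") == std::string::npos)
      continue;

    std::vector<double> point;
    std::istringstream fields(line);
    std::string field;
    while (std::getline(fields, field, ','))
      point.push_back(std::stod(field)); // std::stod skips leading spaces.
    points.push_back(point);
  }
  return points;
}
```

Calling ParseCsvPoints("0, 1\n3, 1\n5, -5\n") on the three-point example from the guide should give three entries of two values each, one entry per line of the file.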



