[mlpack-git] (blog) master: Keon Week Seven (2e8554d)
gitdub at mlpack.org
Mon Jul 11 16:05:32 EDT 2016
Repository : https://github.com/mlpack/blog
On branch : master
Link : https://github.com/mlpack/blog/compare/1592519d9df365ee00d8034cc020a288667611f8...2e8554d16793593251b6080720fc3dbef2e2f935
>---------------------------------------------------------------
commit 2e8554d16793593251b6080720fc3dbef2e2f935
Author: Keon Kim <kwk236 at gmail.com>
Date: Tue Jul 12 05:05:32 2016 +0900
Keon Week Seven
>---------------------------------------------------------------
2e8554d16793593251b6080720fc3dbef2e2f935
content/blog/KeonWeekSeven.md | 38 ++++++++++++++++++++++++++++++++++++++
1 file changed, 38 insertions(+)
diff --git a/content/blog/KeonWeekSeven.md b/content/blog/KeonWeekSeven.md
new file mode 100644
index 0000000..458c013
--- /dev/null
+++ b/content/blog/KeonWeekSeven.md
@@ -0,0 +1,38 @@
+Title: Dataset and Experimentation Tools : Week-7 Highlights
+Date: 2016-07-11 16:00:00
+Tags: gsoc, dataset, data
+Author: Keon Kim
+
+This week, I:
+
+DatasetMapper & Imputer
+
+1) Applied the suggested changes, added more comments, and debugged the DatasetMapper & Imputer pull request.
+
+2) Added an overload to every imputation method that takes only one input matrix as a parameter.
+The result is written back into the input matrix, which will hopefully be faster.
+
+3) MedianImputation now excludes user-defined missing values and NaNs while it calculates the median.
+
+4) Switched ListwiseDeletion to a new implementation approach suggested by rcurtin.
+
+Descriptive Statistics
+
+Last week, I said I was going to work on the statistics module.
+As a result, I made a proof of concept in this [commit](https://github.com/keonkim/mlpack/commit/5aed5ba9c78e4584f445217e9c66e52f79d6daec).
+
+I made a class called Statistics and put all the functions inside it.
+I think the Statistics class may be useful for other things, too,
+so I am considering separating the class from the executable and putting it somewhere else independently.
+
+A sample run on iris.csv produces results like the following.
+```
+[INFO ] Loading 'iris.csv' as CSV data. Size is 150 x 4.
+[INFO ] dim  var       mean     std       median  min  max  range  skewness    kurtosis  SE
+[INFO ] 0    0.681122  5.84333  0.825301  5.8     4.3  7.9  3.6    0.175246    1.12569   0.0673856
+[INFO ] 1    0.186751  3.054    0.432147  3       2    4.4  2.4    0.0266889   0.113048  0.0352846
+[INFO ] 2    3.09242   3.75867  1.75853   4.35    1    6.9  5.9    -1.4776     15.3453   0.143583
+[INFO ] 3    0.578532  1.19867  0.760613  1.3     0.1  2.5  2.4    -0.0457392  0.557191  0.0621038
+```
+
+The output of this executable is similar to [this application](http://personality-project.org/r/basics.t.html).
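
As an illustration of the per-dimension summary described in the post, here is a minimal sketch in standard C++. It is not the code from the linked commit: the `Summary` struct and the `Describe` function are hypothetical names. It also assumes the variance is normalized by n and that SE is std / sqrt(n); these conventions are assumptions, not confirmed against the commit. Skewness and kurtosis are left out because the post does not say which definitions are used.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the summary of one dimension.
// This is not mlpack's Statistics class.
struct Summary
{
  double var;
  double mean;
  double stddev;
  double median;
  double min;
  double max;
  double range;
  double se;
};

// Computes summary statistics for a single dimension.
// Assumes the input is non-empty and contains no NaNs.
Summary Describe(std::vector<double> values)
{
  assert(!values.empty());
  const double n = static_cast<double>(values.size());

  // Mean.
  double sum = 0.0;
  for (const double v : values)
    sum += v;
  const double mean = sum / n;

  // Variance, normalized by n (an assumption of this sketch).
  double squaredDev = 0.0;
  for (const double v : values)
    squaredDev += (v - mean) * (v - mean);
  const double var = squaredDev / n;

  // Sort the local copy to get the median, minimum, and maximum.
  std::sort(values.begin(), values.end());
  const std::size_t mid = values.size() / 2;
  const double median = (values.size() % 2 == 1)
      ? values[mid]
      : 0.5 * (values[mid - 1] + values[mid]);

  Summary s;
  s.var = var;
  s.mean = mean;
  s.stddev = std::sqrt(var);
  s.median = median;
  s.min = values.front();
  s.max = values.back();
  s.range = s.max - s.min;
  // Standard error of the mean, assumed here to be std / sqrt(n).
  s.se = s.stddev / std::sqrt(n);
  return s;
}
```

For the values {2, 4, 4, 4, 5, 5, 7, 9}, `Describe` gives a mean of 5, a variance of 4, a standard deviation of 2, a median of 4.5, and a range of 7.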