[mlpack-svn] [MLPACK] #345: Sparse Autoencoder Module
MLPACK Trac
trac at coffeetalk-1.cc.gatech.edu
Mon Apr 14 15:03:50 EDT 2014
#345: Sparse Autoencoder Module
----------------------------+-----------------------------------------------
Reporter: siddharth.950 | Owner:
Type: enhancement | Status: new
Priority: major | Milestone:
Component: mlpack | Resolution:
Keywords: | Blocking:
Blocked By: |
----------------------------+-----------------------------------------------
Comment (by rcurtin):
Thanks for getting back to this so quickly! The tests you've written are
great. I see why you feel the first one is unnecessary, but I think it's
a good thing to have there anyway; a hand-calculated value like that can
really help debugging, if someone else comes along later and breaks the
sparse autoencoder. The random test cases, in the next test, are also
good; they have a chance of covering edge cases. If we can produce the
same type of tests for the gradient, and then some tests for the
SparseAutoencoder object itself (i.e., does it work for a very simple
dataset? How about a random one?), then we're good to go.
I've taken a look through the SparseAutoencoderFunction::Evaluate() code
since it is now tested. There are very few changes I can think of making;
the implementation is good.
{{{
parameters.submat(0, 0, 2*hiddenSize - 1, visibleSize - 1) =
arma::randu<arma::mat>(2*hiddenSize, visibleSize);
}}}
can just be
{{{
parameters.submat(0, 0, 2*hiddenSize - 1, visibleSize - 1).randu();
}}}
Later, there are some terms of the form `arma::sum(arma::sum(...))`; these
could be `arma::accu(...)` and that should work the same. I think
Armadillo will optimize the double call to sum() correctly, so that
concern is merely syntactic.
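To illustrate why the two forms agree, here is a plain-C++ sketch of the identity (std::vector stands in for arma::mat, since the equivalence is just arithmetic and doesn't depend on Armadillo itself):
{{{
#include <cassert>
#include <iostream>
#include <numeric>
#include <vector>

int main()
{
  // A 2x3 matrix stored row by row; stands in for an arma::mat.
  std::vector<std::vector<double>> m = {{1.0, 2.0, 3.0},
                                        {4.0, 5.0, 6.0}};

  // Mimics arma::sum(arma::sum(m)): sum each column, then sum those sums.
  std::vector<double> colSums(3, 0.0);
  for (const auto& row : m)
    for (size_t j = 0; j < row.size(); ++j)
      colSums[j] += row[j];
  const double doubleSum =
      std::accumulate(colSums.begin(), colSums.end(), 0.0);

  // Mimics arma::accu(m): one pass over every element.
  double accu = 0.0;
  for (const auto& row : m)
    for (const double v : row)
      accu += v;

  std::cout << doubleSum << " " << accu << "\n";
  assert(doubleSum == accu);
  return 0;
}
}}}
Both passes produce the same total (21 for this matrix), so the choice between them is readability, not behavior.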
The calculation of w1L2Squared and w2L2Squared could be simplified, too,
given that the only way those are used is `w1L2Squared + w2L2Squared`:
{{{
wNorm = arma::accu(parameters.submat(0, 0, l3 - 1, l2 - 1) %
parameters.submat(0, 0, l3 - 1, l2 - 1));
}}}
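The reason a single `wNorm` suffices is that the squared L2 norm of stacked blocks is the sum of the blocks' squared norms. A small plain-C++ sketch (hypothetical W1/W2 values; `squaredNorm` stands in for `arma::accu(X % X)` on a submatrix):
{{{
#include <cassert>
#include <iostream>
#include <vector>

// Squared L2 (Frobenius) norm of a flat block of values.
double squaredNorm(const std::vector<double>& block)
{
  double s = 0.0;
  for (const double v : block)
    s += v * v;
  return s;
}

int main()
{
  // Hypothetical W1 and W2 weight blocks, flattened.
  std::vector<double> w1 = {1.0, -2.0, 0.5};
  std::vector<double> w2 = {3.0, 0.25, -1.0};

  // Two separate terms, as in the current code.
  const double w1L2Squared = squaredNorm(w1);
  const double w2L2Squared = squaredNorm(w2);

  // One pass over the stacked parameters, as in the suggested wNorm.
  std::vector<double> stacked = w1;
  stacked.insert(stacked.end(), w2.begin(), w2.end());
  const double wNorm = squaredNorm(stacked);

  std::cout << (w1L2Squared + w2L2Squared) << " " << wNorm << "\n";
  assert(w1L2Squared + w2L2Squared == wNorm);
  return 0;
}
}}}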
So if you can write those additional tests, I'll incorporate it into the
codebase. One possible optimization I am thinking about for the future is
the actual use of sparsity in the code. A sparse autoencoder isn't
*actually* restricted to be sparse; the meaning is that most of the hidden
units are inactive for a given input, or that their activations are
small. I suspect that these very small activations can be approximated as
0 without significant loss of generalization ability, but we'd have to get
some kind of test case set up first (speech recognition or image
recognition are easy candidates).
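A minimal sketch of that approximation idea, in plain C++ (the activation values and the threshold are made-up placeholders; a real implementation would tune the threshold against a benchmark like the ones mentioned above):
{{{
#include <cassert>
#include <cmath>
#include <iostream>
#include <vector>

int main()
{
  // Hypothetical hidden-unit activations for one input; in a sparse
  // autoencoder most of these are expected to be near zero.
  std::vector<double> activations = {0.91, 0.002, 0.0005, 0.85, 0.01, 0.003};

  // Approximate near-zero activations as exactly zero.  The threshold is an
  // assumption for illustration only.
  const double threshold = 0.05;
  size_t zeroed = 0;
  for (double& a : activations)
  {
    if (std::abs(a) < threshold)
    {
      a = 0.0;
      ++zeroed;
    }
  }

  std::cout << zeroed << " of " << activations.size() << " zeroed\n";
  assert(zeroed == 4);
  return 0;
}
}}}
Once most activations are exact zeros, downstream products could use sparse storage; whether the accuracy cost is acceptable is exactly what the proposed test cases would measure.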
--
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/345#comment:8>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.