[mlpack-svn] [MLPACK] #345: Sparse Autoencoder Module
MLPACK Trac
trac at coffeetalk-1.cc.gatech.edu
Mon Apr 14 15:03:50 EDT 2014
#345: Sparse Autoencoder Module
----------------------------+-----------------------------------------------
Reporter: siddharth.950 | Owner:
Type: enhancement | Status: new
Priority: major | Milestone:
Component: mlpack | Resolution:
Keywords: | Blocking:
Blocked By: |
----------------------------+-----------------------------------------------
Comment (by rcurtin):
Thanks for getting back to this so quickly! The tests you've written are
great. I see why you feel the first one is unnecessary, but I think it's
a good thing to have there anyway; a hand-calculated value like that can
really help debugging, if someone else comes along later and breaks the
sparse autoencoder. The random test cases, in the next test, are also
good; they have a chance of covering edge cases. If we can produce the
same type of tests for the gradient, and then some tests for the
SparseAutoencoder object itself (i.e., does it work for a very simple
dataset? How about a random one?), then we're good to go.
I've taken a look through the SparseAutoencoderFunction::Evaluate() code
since it is now tested. There are very few changes I can think of making;
the implementation is good.
{{{
parameters.submat(0, 0, 2*hiddenSize - 1, visibleSize - 1) =
arma::randu<arma::mat>(2*hiddenSize, visibleSize);
}}}
can just be
{{{
parameters.submat(0, 0, 2*hiddenSize - 1, visibleSize - 1).randu();
}}}
Later, there are some terms of the form `arma::sum(arma::sum(...))`; these
could be `arma::accu(...)` and that should work the same. I think
Armadillo will optimize the double call to sum() correctly, so that
concern is merely syntactic.
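To illustrate why the two forms agree, here is a plain-C++ sketch of the identity (std::vector stands in for arma::mat, since the equivalence is just arithmetic and doesn't depend on Armadillo itself):
{{{
#include <cassert>
#include <iostream>
#include <numeric>
#include <vector>

int main()
{
  // A 2x3 matrix stored row by row; stands in for an arma::mat.
  std::vector<std::vector<double>> m = {{1.0, 2.0, 3.0},
                                        {4.0, 5.0, 6.0}};

  // Mimics arma::sum(arma::sum(m)): sum each column, then sum those sums.
  std::vector<double> colSums(3, 0.0);
  for (const auto& row : m)
    for (size_t j = 0; j < row.size(); ++j)
      colSums[j] += row[j];
  const double doubleSum =
      std::accumulate(colSums.begin(), colSums.end(), 0.0);

  // Mimics arma::accu(m): one pass over every element.
  double accu = 0.0;
  for (const auto& row : m)
    for (const double v : row)
      accu += v;

  std::cout << doubleSum << " " << accu << "\n";
  assert(doubleSum == accu);
  return 0;
}
}}}
Both passes produce the same total (21 for this matrix), so the choice between them is readability, not behavior.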
The calculation of w1L2Squared and w2L2Squared could be simplified, too,
given that the only way those are used is `w1L2Squared + w2L2Squared`:
{{{
wNorm = arma::accu(parameters.submat(0, 0, l3 - 1, l2 - 1) %
parameters.submat(0, 0, l3 - 1, l2 - 1));
}}}
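The reason a single `wNorm` suffices is that the squared L2 norm of stacked blocks is the sum of the blocks' squared norms. A small plain-C++ sketch (hypothetical W1/W2 values; `squaredNorm` stands in for `arma::accu(X % X)` on a submatrix):
{{{
#include <cassert>
#include <iostream>
#include <vector>

// Squared L2 (Frobenius) norm of a flat block of values.
double squaredNorm(const std::vector<double>& block)
{
  double s = 0.0;
  for (const double v : block)
    s += v * v;
  return s;
}

int main()
{
  // Hypothetical W1 and W2 weight blocks, flattened.
  std::vector<double> w1 = {1.0, -2.0, 0.5};
  std::vector<double> w2 = {3.0, 0.25, -1.0};

  // Two separate terms, as in the current code.
  const double w1L2Squared = squaredNorm(w1);
  const double w2L2Squared = squaredNorm(w2);

  // One pass over the stacked parameters, as in the suggested wNorm.
  std::vector<double> stacked = w1;
  stacked.insert(stacked.end(), w2.begin(), w2.end());
  const double wNorm = squaredNorm(stacked);

  std::cout << (w1L2Squared + w2L2Squared) << " " << wNorm << "\n";
  assert(w1L2Squared + w2L2Squared == wNorm);
  return 0;
}
}}}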
So if you can write those additional tests, I'll incorporate it into the
codebase. One possible optimization I am thinking about for the future is
the actual use of sparsity in the code. A sparse autoencoder isn't
*actually* restricted to be sparse; the meaning is that most of the hidden
units are inactive for a given input, or that their activations are
small. I suspect that these very small activations can be approximated as
0 without significant loss of generalization ability, but we'd have to get
some kind of test case set up first (speech recognition or image
recognition are easy candidates).
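A minimal sketch of that approximation idea, in plain C++ (the activation values and the threshold are made-up placeholders; a real implementation would tune the threshold against a benchmark like the ones mentioned above):
{{{
#include <cassert>
#include <cmath>
#include <iostream>
#include <vector>

int main()
{
  // Hypothetical hidden-unit activations for one input; in a sparse
  // autoencoder most of these are expected to be near zero.
  std::vector<double> activations = {0.91, 0.002, 0.0005, 0.85, 0.01, 0.003};

  // Approximate near-zero activations as exactly zero.  The threshold is an
  // assumption for illustration only.
  const double threshold = 0.05;
  size_t zeroed = 0;
  for (double& a : activations)
  {
    if (std::abs(a) < threshold)
    {
      a = 0.0;
      ++zeroed;
    }
  }

  std::cout << zeroed << " of " << activations.size() << " zeroed\n";
  assert(zeroed == 4);
  return 0;
}
}}}
Once most activations are exact zeros, downstream products could use sparse storage; whether the accuracy cost is acceptable is exactly what the proposed test cases would measure.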
--
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/345#comment:8>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.