[mlpack-svn] [MLPACK] #271: SaveRestoreUtility and hierarchical model support

MLPACK Trac trac at coffeetalk-1.cc.gatech.edu
Thu May 29 09:59:18 EDT 2014


#271: SaveRestoreUtility and hierarchical model support
------------------------------------------------------------------------------+
  Reporter:  rcurtin                                                          |        Owner:  birm        
      Type:  enhancement                                                      |       Status:  accepted    
  Priority:  major                                                            |    Milestone:  mlpack 1.1.0
 Component:  mlpack                                                           |   Resolution:              
  Keywords:  xml, saverestoreutility, hierarchical models, load, save, model  |     Blocking:  177, 255    
Blocked By:                                                                   |  
------------------------------------------------------------------------------+

Old description:

> Currently the SaveRestoreUtility does a great job of loading elements
> from XML files as long as you have the full path to the object.  But
> suppose we have some HMM with Gaussian emissions:
>
> {{{
> <hmm>
>   <dimensionality>5</dimensionality>
>   <emissions>2</emissions>
>   <emission0>
>     <mean> ... </mean>
>     <covariance> ... </covariance>
>   </emission0>
>   <emission1>
>     <mean> ... </mean>
>     <covariance> ... </covariance>
>   </emission1>
>   <transition> ... </transition>
> </hmm>
> }}}
>
> Each emission contains exactly what GaussianDistribution::Save() (which
> is not yet written) would output into a file.  So HMM::Save() should
> internally use GaussianDistribution::Save() (and HMM::Load() should
> internally use GaussianDistribution::Load()) but that means that we need
> to have some XMLNode class (or something similar) so that we can do
> something like
>
> {{{
> HMM<GaussianDistribution>::Load(XMLNode& node)
> {
>   node.LoadParameter(transition, "transition");
>   size_t numEmissions;
>   node.LoadParameter(numEmissions, "emissions");
>   node.LoadParameter(dimensionality);
>
>   emissions.resize(numEmissions, GaussianDistribution(dimensionality));
>   for (size_t i = 0; i < numEmissions; ++i)
>   {
>     emissions[i].Load("emission" + i); // yes, this is just psuedocode
>   }
> }
> }}}
>
> Neil, I cc'ed you on this ticket in case you have any ideas (feel free to
> remove yourself).  I think it should be a fairly simple refactoring.  I
> also think maybe it should be 'ModelNode' not 'XMLNode' so that we can
> later load things that aren't XML.

New description:

 Currently the SaveRestoreUtility does a great job of loading elements from
 XML files as long as you have the full path to the object.  But suppose we
 have some HMM with Gaussian emissions:

 {{{
 <hmm>
   <dimensionality>5</dimensionality>
   <emissions>2</emissions>
   <emission0>
     <mean> ... </mean>
     <covariance> ... </covariance>
   </emission0>
   <emission1>
     <mean> ... </mean>
     <covariance> ... </covariance>
   </emission1>
   <transition> ... </transition>
 </hmm>
 }}}

 Each emission contains exactly what GaussianDistribution::Save() (which is
 not yet written) would output into a file.  So HMM::Save() should
 internally call GaussianDistribution::Save(), and HMM::Load() should
 internally call GaussianDistribution::Load().  That means we need some
 XMLNode class (or something similar) that represents one subtree of the
 file, so that we can do something like

 {{{
 template<typename DistributionType>
 void HMM<DistributionType>::Load(XMLNode& node)
 {
   node.LoadParameter(transition, "transition");
   size_t numEmissions;
   node.LoadParameter(numEmissions, "emissions");
   node.LoadParameter(dimensionality, "dimensionality");

   emissions.resize(numEmissions, DistributionType(dimensionality));
   for (size_t i = 0; i < numEmissions; ++i)
   {
     // Still pseudocode: XMLNode and its Child() lookup do not exist yet.
     std::ostringstream name;
     name << "emission" << i; // "emission" + i would be pointer arithmetic
     emissions[i].Load(node.Child(name.str()));
   }
   }
 }
 }}}

 Neil, I cc'ed you on this ticket in case you have any ideas (feel free to
 remove yourself).  I think it should be a fairly simple refactoring.  I
 also think maybe it should be 'ModelNode' not 'XMLNode' so that we can
 later load things that aren't XML.
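
To make the idea concrete, here is a minimal sketch of what a format-agnostic 'ModelNode' could look like, and how a model could delegate to its sub-models through it.  Everything in it is an assumption for illustration only: ModelNode and its AddChild(), Child(), SaveParameter() and LoadParameter() members are hypothetical and are not part of SaveRestoreUtility, ToyGaussian stands in for GaussianDistribution with a scalar mean and variance, and ToyHMM leaves out the transition matrix.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical format-agnostic tree node (not an existing mlpack class).
class ModelNode
{
 public:
  // Attach a fully built subtree under the given name.
  void AddChild(const std::string& name, const ModelNode& child)
  {
    names.push_back(name);
    nodes.push_back(child);
  }

  // Store a leaf value as text under the given name.
  template<typename T>
  void SaveParameter(const T& value, const std::string& name)
  {
    std::ostringstream out;
    out << value;
    ModelNode leaf;
    leaf.text = out.str();
    AddChild(name, leaf);
  }

  // Look up a direct child by name; throws if it does not exist.
  const ModelNode& Child(const std::string& name) const
  {
    for (std::size_t i = 0; i < names.size(); ++i)
      if (names[i] == name)
        return nodes[i];
    throw std::runtime_error("no child named '" + name + "'");
  }

  // Parse a leaf value stored under the given name.
  template<typename T>
  void LoadParameter(T& value, const std::string& name) const
  {
    std::istringstream in(Child(name).text);
    if (!(in >> value))
      throw std::runtime_error("cannot parse parameter '" + name + "'");
  }

 private:
  std::string text;                // Leaf value; empty for interior nodes.
  std::vector<std::string> names;  // Child names, parallel to nodes.
  std::vector<ModelNode> nodes;    // Child subtrees.
};

// Stand-in for GaussianDistribution with a scalar mean and variance.
struct ToyGaussian
{
  double mean = 0.0;
  double variance = 1.0;

  void Save(ModelNode& node) const
  {
    node.SaveParameter(mean, "mean");
    node.SaveParameter(variance, "covariance");
  }

  void Load(const ModelNode& node)
  {
    node.LoadParameter(mean, "mean");
    node.LoadParameter(variance, "covariance");
  }
};

// Stand-in for HMM; the transition matrix is left out to keep this short.
template<typename DistributionType>
struct ToyHMM
{
  std::size_t dimensionality = 0;
  std::vector<DistributionType> emissions;

  void Save(ModelNode& node) const
  {
    node.SaveParameter(dimensionality, "dimensionality");
    node.SaveParameter(emissions.size(), "emissions");
    for (std::size_t i = 0; i < emissions.size(); ++i)
    {
      ModelNode child;
      emissions[i].Save(child);  // The sub-model fills its own subtree.
      node.AddChild("emission" + std::to_string(i), child);
    }
  }

  void Load(const ModelNode& node)
  {
    std::size_t numEmissions = 0;
    node.LoadParameter(dimensionality, "dimensionality");
    node.LoadParameter(numEmissions, "emissions");

    emissions.assign(numEmissions, DistributionType());
    for (std::size_t i = 0; i < numEmissions; ++i)
      emissions[i].Load(node.Child("emission" + std::to_string(i)));
  }
};
```

The point of the sketch is that ToyHMM never needs the full path to a mean or covariance; it only hands each emission its own subtree.  A real version would still need a reader and writer that convert between this tree and XML (or some other format), which is not shown here.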

--

Comment (by rcurtin):

 Description updated to reflect genericity of HMM::Load().

-- 
Ticket URL: <http://trac.research.cc.gatech.edu/fastlab/ticket/271#comment:5>
MLPACK <www.fast-lab.org>
MLPACK is an intuitive, fast, and scalable C++ machine learning library developed at Georgia Tech.

