[mlpack-git] [mlpack] [Proposal] Develop a scalable Finetune class (#458)

stereomatchingkiss notifications at github.com
Sun Oct 4 04:47:06 EDT 2015


I would like to develop a scalable fine-tune class and need some suggestions. The proposed API is as follows:

    /**
     * Fine-tune a deep network such as a stacked autoencoder.
     *
     * @tparam LayerTypes Types of the layers; each must provide three
     * functions (see SparseAutoencoderFunction for reference):
     * - void Gradient(const arma::mat& parameters, arma::mat& gradient);
     * - double Evaluate(const arma::mat& parameters);
     * - arma::mat& GetInitialPoint();
     * @tparam OutputLayerType Type of the output layer; it must implement the
     * same three functions:
     * - void Gradient(const arma::mat& parameters, arma::mat& gradient);
     * - double Evaluate(const arma::mat& parameters);
     * - arma::mat& GetInitialPoint();
     * @tparam FineTuneGradient Functor that computes the gradient of the last
     * layer; it must implement two functions:
     * - template<typename T> void Gradient(const arma::mat&, const arma::mat&, const T&, arma::mat&);
     * - void Deriv(const arma::mat&, arma::mat&);
     */
    template<typename LayerTypes, typename OutputLayerType,
             typename FineTuneGradient>
    class FineTuneFunction
    {
    public:        
        using ParamArray =
            std::array<arma::mat*, std::tuple_size<LayerTypes>::value + 1>;

        static_assert(std::tuple_size<LayerTypes>::value > 1,
                  "The tuple size of the LayerTypes must greater than 1");

        /**
         * Construct the class with the given data.
         * @param input The input data of the LayerTypes and the OutputLayerType.
         * @param parameters The parameters of the LayerTypes and the OutputLayerType.
         * @param layerTypes The layers (must be a std::tuple); for now only
         *        SparseAutoencoder is supported.
         * @param outLayerType The output layer (e.g., softmax).
         */
        FineTuneFunction(ParamArray &input,
                         ParamArray &parameters,
                         LayerTypes &layerTypes,
                         OutputLayerType &outLayerType)
            : trainData(input),
              paramArray(parameters),
              layerTypes(layerTypes),
              outLayerType(outLayerType),
              LayerTypesParamSize(LayerParamTotalSize<>())
        {
        }

        /**
         * Evaluates the objective function of the networks using the
         * given parameters.
         * @param parameters Current values of the model parameters.
         */
        double Evaluate(const arma::mat& parameters);       

        /**
         * Evaluates the gradient values of the objective function given the current
         * set of parameters. The function performs a feedforward pass and computes
         * the error in reconstructing the data points. It then uses the
         * backpropagation algorithm to compute the gradient values.
         * @param parameters Current values of the model parameters.
         * @param gradient Matrix where gradient values will be stored.
         */
        void Gradient(const arma::mat& parameters, arma::mat& gradient);

        //! Return the initial point for the optimization.
        arma::mat& GetInitialPoint();
    };
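
The SoftmaxFineTune functor used in the example below is not spelled out in this proposal; a rough sketch of what such a FineTuneGradient functor could look like follows. The names and the backpropagation details here are my assumptions, not an existing mlpack API:

    // Hypothetical sketch of a FineTuneGradient functor for a softmax output
    // layer. Gradient() computes the delta to propagate from the output layer
    // back into the last encoder layer; Deriv() is the derivative of the
    // encoder activation (sigmoid), expressed in terms of its outputs.
    struct SoftmaxFineTune
    {
        template<typename T>
        static void Gradient(const arma::mat& input,
                             const arma::mat& parameters,
                             const T& outputLayer,
                             arma::mat& gradient)
        {
            // Assumed computation: gradient = W^T * (probabilities - groundTruth),
            // where the probabilities come from the output layer (see the note
            // about caching them at the end of this proposal).
        }

        static void Deriv(const arma::mat& activations, arma::mat& derivative)
        {
            // Sigmoid derivative in terms of the activations themselves.
            derivative = activations % (1.0 - activations);
        }
    };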

An example of using this class (the initialization of the training data is omitted):

    using namespace mlpack;

    arma::mat sae1_input;
    arma::mat sae2_input;
    nn::SparseAutoencoderFunction<> sae1(sae1_input, 3, 2);
    nn::SparseAutoencoderFunction<> sae2(sae2_input, 2, 2);

    arma::mat sm_input;
    arma::Row<size_t> labels(2);
    labels(0) = 0;
    labels(1) = 1;
    regression::SoftmaxRegressionFunction sm(sm_input, labels, 2);

    arma::mat sae1_params;
    arma::mat sae2_params;
    arma::mat sm_params;
    
    // after training, the class will update these parameters in place
    std::array<arma::mat*, 3> params{
        &sae1_params, &sae2_params, &sm_params
    };
    // the class will change the inputs (except the first one) during training
    std::array<arma::mat*, 3> inputs{
        &sae1_input, &sae2_input, &sm_input
    };
    auto layer_types = std::forward_as_tuple(sae1, sae2);
    FineTuneFunction<
            decltype(layer_types),
            decltype(sm),
            SoftmaxFineTune
            > finetune(inputs, params, layer_types, sm);

    // create an L-BFGS optimizer to fine-tune the parameters
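
To run the fine-tuning itself, the intent is to hand the FineTuneFunction object to one of the existing optimizers. A minimal sketch using mlpack's L_BFGS, assuming the finetune object constructed above:

    #include <mlpack/core/optimizers/lbfgs/lbfgs.hpp>

    using FineTuneType = FineTuneFunction<decltype(layer_types),
                                          decltype(sm),
                                          SoftmaxFineTune>;
    mlpack::optimization::L_BFGS<FineTuneType> lbfgs(finetune);

    // Start from the concatenated initial parameters and let L-BFGS update
    // them in place; the return value is the final objective.
    arma::mat parameters = finetune.GetInitialPoint();
    const double objective = lbfgs.Optimize(parameters);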


Besides, the SoftmaxRegressionFunction needs to cache the probability values; otherwise the probabilities have to be recalculated two more times when fine-tuning the parameters. Any suggestions? Thanks.
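
One possibility, sketched below, is to compute the probabilities once in Evaluate() and keep them in a member so that Gradient() and the fine-tuning functor can reuse them. The class, member, and helper names here are assumptions for illustration, not the existing SoftmaxRegressionFunction API:

    #include <mlpack/core.hpp>

    // Hypothetical sketch of caching the class probabilities inside the
    // softmax function object.
    class CachingSoftmaxFunction
    {
    public:
        double Evaluate(const arma::mat& parameters)
        {
            // Compute the probabilities once per call and cache them.
            probabilities = ComputeProbabilities(parameters);
            return ComputeObjective(probabilities);
        }

        void Gradient(const arma::mat& parameters, arma::mat& gradient)
        {
            // Reuse the cached probabilities, assuming Evaluate() was just
            // called with the same parameters.
            gradient = ComputeGradient(parameters, probabilities);
        }

        //! Expose the cached probabilities to the fine-tuning functor.
        const arma::mat& Probabilities() const { return probabilities; }

    private:
        arma::mat ComputeProbabilities(const arma::mat& parameters) const;
        double ComputeObjective(const arma::mat& probs) const;
        arma::mat ComputeGradient(const arma::mat& parameters,
                                  const arma::mat& probs) const;

        arma::mat probabilities;
    };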

---
Reply to this email directly or view it on GitHub:
https://github.com/mlpack/mlpack/issues/458