Thunderbots Project
GradientDescentOptimizer< NUM_PARAMS > Class Template Reference

#include <gradient_descent_optimizer.hpp>

Public Types

using ParamArray = std::array< double, NUM_PARAMS >
 

Public Member Functions

 GradientDescentOptimizer (ParamArray param_weights=GradientDescentOptimizer< NUM_PARAMS >::ParamArray{1}, double gradient_approx_step_size=DEFAULT_GRADIENT_APPROX_STEP_SIZE, double past_gradient_decay_rate=DEFAULT_PAST_GRADIENT_DECAY_RATE, double past_squared_gradient_decay_rate=DEFAULT_PAST_SQUARED_GRADIENT_DECAY_RATE)
 
ParamArray maximize (std::function< double(ParamArray)> objective_function, ParamArray initial_value, unsigned int num_iters)
 
ParamArray minimize (std::function< double(ParamArray)> objective_function, ParamArray initial_value, unsigned int num_iters)
 

Static Public Attributes

static constexpr double DEFAULT_PAST_GRADIENT_DECAY_RATE = 0.9
 
static constexpr double DEFAULT_PAST_SQUARED_GRADIENT_DECAY_RATE = 0.999
 
static constexpr double DEFAULT_GRADIENT_APPROX_STEP_SIZE = 0.00001
 

Detailed Description

template<size_t NUM_PARAMS>
class GradientDescentOptimizer< NUM_PARAMS >

This class implements a version of Stochastic Gradient Descent (SGD), namely Adam (see links below for details). It provides functionality for both maximizing and minimizing arbitrary functions. For example usage, please see the tests.
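As a concrete illustration, here is a minimal usage sketch (hypothetical, pieced together from the signatures documented below) that minimizes a simple two-parameter bowl function:

#include <array>
#include <gradient_descent_optimizer.hpp>

int main()
{
    // Minimize f(x, y) = (x - 1)^2 + (y + 2)^2, whose minimum is at (1, -2)
    GradientDescentOptimizer<2> optimizer;
    std::array<double, 2> result = optimizer.minimize(
        [](std::array<double, 2> p) {
            return (p[0] - 1) * (p[0] - 1) + (p[1] + 2) * (p[1] + 2);
        },
        {0.0, 0.0},  // initial value
        1000);       // number of iterations
    // result should converge toward {1.0, -2.0}
}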

As this class is templated, it is header-only. To keep declaration and implementation separate, the function implementations have been moved to a .tpp file that is included at the end of this file.

"Weights" are used throughout this class, and hence are documented here as follows: Weights used to normalize parameters, as gradient descent works much better if the function being optimized is homogeneous in direction for example, f = x^2 + y^2 is easier to optimize than f = x^2 + 50*y^2

This class uses an implementation of "Adam" (Adaptive Moment) Gradient Descent:
https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/
https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Adam
https://www.ruder.io/optimizing-gradient-descent/#adam
https://en.wikipedia.org/wiki/Moment_(mathematics)
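For reference, the standard Adam update described in those links keeps exponentially decaying averages of past gradients and past squared gradients; past_gradient_decay_rate and past_squared_gradient_decay_rate correspond to $\beta_1$ and $\beta_2$ below, where $g_t$ is the (approximated) gradient at step $t$:

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1) g_t$$
$$v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2$$
$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}$$
$$\theta_{t+1} = \theta_t - \alpha \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}$$

The defaults $\beta_1 = 0.9$ and $\beta_2 = 0.999$ match DEFAULT_PAST_GRADIENT_DECAY_RATE and DEFAULT_PAST_SQUARED_GRADIENT_DECAY_RATE above.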

NOTE: CLion complains about "Redefinition of GradientDescentOptimizer", but the warning is incorrect; this class compiles just fine.

Template Parameters
NUM_PARAMS: The number of parameters that a given instance of this class will optimize over.

Constructor & Destructor Documentation

◆ GradientDescentOptimizer()

template<size_t NUM_PARAMS>
GradientDescentOptimizer(
    ParamArray param_weights = GradientDescentOptimizer<NUM_PARAMS>::ParamArray{1},
    double gradient_approx_step_size = DEFAULT_GRADIENT_APPROX_STEP_SIZE,
    double past_gradient_decay_rate = DEFAULT_PAST_GRADIENT_DECAY_RATE,
    double past_squared_gradient_decay_rate = DEFAULT_PAST_SQUARED_GRADIENT_DECAY_RATE)

Creates a GradientDescentOptimizer

NOTE: Unless you know what you're doing, you probably don't want to specify the decay rates, as the default values are almost always good.

Parameters
param_weights: The weights used to normalize the parameters (see the class description above)
gradient_approx_step_size: The size of the step taken forward when approximating the gradient of a function (see the sketch after the notes below)
past_gradient_decay_rate: Past gradient knowledge decay rate; see the corresponding class member variable for details
past_squared_gradient_decay_rate: Past squared gradient knowledge decay rate; see the corresponding class member variable for details

NOTE: We do not use "using namespace ..." here, because this is still a header file, and as such anything that includes gradient_descent.h (which includes this file) would also pull in any namespaces we used here.

NOTE: We do not use ParamArray in any of the function signatures here, as ParamArray depends on a template parameter (NUM_PARAMS), so using it here would require quite a complex expression. As such, it was decided that just using std::array<...> was the better option.
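To illustrate what gradient_approx_step_size controls, here is a minimal sketch of a forward-difference gradient approximation of the kind described above. The helper approximate_gradient is hypothetical and not part of this class's public API; it only shows how a step size h trades accuracy against numerical noise.

#include <array>
#include <cstddef>
#include <functional>

// Hypothetical illustration: estimate each component of the gradient by
// stepping one parameter forward by h (gradient_approx_step_size) and
// measuring the resulting change in the objective.
template <std::size_t NUM_PARAMS>
std::array<double, NUM_PARAMS> approximate_gradient(
    const std::function<double(std::array<double, NUM_PARAMS>)>& f,
    std::array<double, NUM_PARAMS> params, double h)
{
    std::array<double, NUM_PARAMS> gradient{};
    const double f_at_params = f(params);
    for (std::size_t i = 0; i < NUM_PARAMS; i++)
    {
        std::array<double, NUM_PARAMS> stepped = params;
        stepped[i] += h;
        gradient[i] = (f(stepped) - f_at_params) / h;
    }
    return gradient;
}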

Member Function Documentation

◆ maximize()

template<size_t NUM_PARAMS>
std::array<double, NUM_PARAMS> maximize(
    std::function<double(ParamArray)> objective_function,
    ParamArray initial_value,
    unsigned int num_iters)

Attempts to maximize the given objective function

Runs gradient descent, starting from the given initial_value and running for num_iters iterations

Parameters
objective_function: The function to maximize
initial_value: The value to start from
num_iters: The number of iterations to run for
Returns
The parameters corresponding to the maximum value of the objective found
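For example, a short hypothetical sketch that maximizes a downward-opening parabola peaking at x = 3:

#include <array>
#include <gradient_descent_optimizer.hpp>

int main()
{
    GradientDescentOptimizer<1> optimizer;
    // f(x) = -(x - 3)^2 has its maximum at x = 3
    std::array<double, 1> result = optimizer.maximize(
        [](std::array<double, 1> p) { return -(p[0] - 3) * (p[0] - 3); },
        {0.0},  // initial value
        1000);  // number of iterations
    // result[0] should converge toward 3.0
}

Maximizing is commonly implemented by minimizing the negated objective, though whether this class does so internally is an implementation detail.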

◆ minimize()

template<size_t NUM_PARAMS>
std::array<double, NUM_PARAMS> minimize(
    std::function<double(ParamArray)> objective_function,
    ParamArray initial_value,
    unsigned int num_iters)

Attempts to minimize the given objective function

Runs gradient descent, starting from the given initial_value and running for num_iters iterations

Parameters
objective_function: The function to minimize
initial_value: The value to start from
num_iters: The number of iterations to run for
Returns
The parameters corresponding to the minimum value of the objective found

The documentation for this class was generated from the following file:
gradient_descent_optimizer.hpp