Thunderbots Project

#include <gradient_descent_optimizer.hpp>
Public Types

    using ParamArray = std::array<double, NUM_PARAMS>

Public Member Functions

    GradientDescentOptimizer(ParamArray param_weights = GradientDescentOptimizer<NUM_PARAMS>::ParamArray{1}, double gradient_approx_step_size = DEFAULT_GRADIENT_APPROX_STEP_SIZE, double past_gradient_decay_rate = DEFAULT_PAST_GRADIENT_DECAY_RATE, double past_squared_gradient_decay_rate = DEFAULT_PAST_SQUARED_GRADIENT_DECAY_RATE)
    ParamArray maximize(std::function<double(ParamArray)> objective_function, ParamArray initial_value, unsigned int num_iters)
    ParamArray minimize(std::function<double(ParamArray)> objective_function, ParamArray initial_value, unsigned int num_iters)
This class implements a version of Stochastic Gradient Descent (SGD), namely Adam (see links below for details). It provides functionality for both maximizing and minimizing arbitrary functions. For example usage, please see the tests.
As this class is templated, it is header-only. To split up the declaration and implementation, the implementation of the functions has been moved to a .tpp file that is included at the end of this file.
"Weights" are used throughout this class, so they are documented once here: the weights are used to normalize the parameters, because gradient descent works much better when the function being optimized is homogeneous in direction. For example, f = x^2 + y^2 is easier to optimize than f = x^2 + 50*y^2.
This class uses an implementation of "Adam" (Adaptive Moment Estimation) gradient descent. For details, see:
https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/
https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Adam
https://www.ruder.io/optimizing-gradient-descent/#adam
https://en.wikipedia.org/wiki/Moment_(mathematics)
NOTE: CLion complains about a "Redefinition of GradientDescentOptimizer", but it is incorrect; this class compiles just fine.
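For a rough sense of how the class is used, the following is a minimal sketch based only on the signatures listed above. The include path and the absence of a wrapping namespace are assumptions; the project's tests remain the authoritative example.

    #include <gradient_descent_optimizer.hpp>

    #include <array>
    #include <iostream>

    int main()
    {
        // Optimize over two parameters, so NUM_PARAMS = 2
        using Optimizer = GradientDescentOptimizer<2>;

        // All constructor arguments are defaulted: weights of 1, and the
        // default gradient approximation step size and decay rates
        Optimizer optimizer;

        // f(x, y) = x^2 + y^2, which has its minimum at (0, 0)
        auto objective = [](Optimizer::ParamArray params) {
            return params[0] * params[0] + params[1] * params[1];
        };

        // Start from (5, -3) and run 1000 iterations of gradient descent
        Optimizer::ParamArray result = optimizer.minimize(objective, {5.0, -3.0}, 1000);

        std::cout << "x = " << result[0] << ", y = " << result[1] << std::endl;
        return 0;
    }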
Template Parameters
    NUM_PARAMS    The number of parameters that a given instance of this class will optimize over.
GradientDescentOptimizer(ParamArray param_weights = GradientDescentOptimizer<NUM_PARAMS>::ParamArray{1},
                         double gradient_approx_step_size = DEFAULT_GRADIENT_APPROX_STEP_SIZE,
                         double past_gradient_decay_rate = DEFAULT_PAST_GRADIENT_DECAY_RATE,
                         double past_squared_gradient_decay_rate = DEFAULT_PAST_SQUARED_GRADIENT_DECAY_RATE)
Creates a GradientDescentOptimizer
NOTE: Unless you know what you're doing, you probably don't want to specify the decay rates, as the default values are almost always good.
Parameters
    param_weights                       The weights used to normalize the parameters (see the class description for details)
    gradient_approx_step_size           The size of the step to take forward when approximating the gradient of a function
    past_gradient_decay_rate            Past gradient knowledge decay rate; see the corresponding class member variable for details
    past_squared_gradient_decay_rate    Past squared gradient knowledge decay rate; see the corresponding class member variable for details
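As a purely illustrative sketch of the param_weights argument (the specific values, and whether a weight should be a parameter's scale or its reciprocal, are assumptions rather than taken from the implementation):

    // Hypothetical: the second parameter naturally varies on a scale roughly
    // 50x larger than the first, so it is down-weighted to bring the objective
    // closer to the homogeneous case described in the class overview. The
    // remaining constructor arguments are left at their defaults, as recommended.
    GradientDescentOptimizer<2> optimizer({1.0, 1.0 / 50.0});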
NOTE: We do not use "using namespace ..." here because this is still a header file, and anything that includes gradient_descent.h (which includes this file) would also pull in any namespaces used here.
NOTE: We do not use ParamArray in any of the function signatures here, as ParamArray depends on a template parameter (NUM_PARAMS), and so a fairly complex expression would be needed to use it here. As such, it was decided that just using std::array<...> was the better option.
std::array<double, NUM_PARAMS> maximize(std::function<double(ParamArray)> objective_function,
                                        ParamArray initial_value,
                                        unsigned int num_iters)
Attempts to maximize the given objective function.
Runs gradient descent, starting from the given initial_value and running for num_iters iterations.
Parameters
    objective_function    The function to maximize
    initial_value         The value to start from
    num_iters             The number of iterations to run for
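A minimal sketch of a maximize call, based only on the signature above (the objective and values are illustrative):

    // f(x) = -(x - 2)^2 has its maximum at x = 2
    GradientDescentOptimizer<1> optimizer;
    std::array<double, 1> best = optimizer.maximize(
        [](std::array<double, 1> params) {
            return -(params[0] - 2.0) * (params[0] - 2.0);
        },
        {0.0},  // initial_value
        500);   // num_iters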
std::array<double, NUM_PARAMS> minimize(std::function<double(ParamArray)> objective_function,
                                        ParamArray initial_value,
                                        unsigned int num_iters)
Attempts to minimize the given objective function.
Runs gradient descent, starting from the given initial_value and running for num_iters iterations.
Parameters
    objective_function    The function to minimize
    initial_value         The value to start from
    num_iters             The number of iterations to run for
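A matching sketch for minimize, using the anisotropic example from the class overview together with the hypothetical weights from the constructor example (the values and the weighting direction are assumptions):

    // f(x, y) = x^2 + 50 * y^2 has its minimum at (0, 0); the weights attempt
    // to compensate for the mismatched scales of the two parameters
    GradientDescentOptimizer<2> optimizer({1.0, 1.0 / 50.0});
    std::array<double, 2> best = optimizer.minimize(
        [](std::array<double, 2> params) {
            return params[0] * params[0] + 50.0 * params[1] * params[1];
        },
        {10.0, 10.0},  // initial_value
        2000);         // num_iters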