Thunderbots Project

#include <gradient_descent_optimizer.hpp>
Public Types

    using ParamArray = std::array<double, NUM_PARAMS>

Public Member Functions

    GradientDescentOptimizer(ParamArray param_weights = GradientDescentOptimizer<NUM_PARAMS>::ParamArray{1}, double gradient_approx_step_size = DEFAULT_GRADIENT_APPROX_STEP_SIZE, double past_gradient_decay_rate = DEFAULT_PAST_GRADIENT_DECAY_RATE, double past_squared_gradient_decay_rate = DEFAULT_PAST_SQUARED_GRADIENT_DECAY_RATE)
    ParamArray maximize(std::function<double(ParamArray)> objective_function, ParamArray initial_value, unsigned int num_iters)
    ParamArray minimize(std::function<double(ParamArray)> objective_function, ParamArray initial_value, unsigned int num_iters)
This class implements a version of Stochastic Gradient Descent (SGD), namely Adam (see links below for details). It provides functionality for both maximizing and minimizing arbitrary functions. For example usage, please see the tests.
As this class is templated, it is header-only. To split up the declaration and implementation, the implementation of the functions has been moved to a .tpp file that is included at the end of this file.
"Weights" are used throughout this class, so they are documented once here: the weights are used to normalize the parameters, because gradient descent works much better when the function being optimized is homogeneous in direction. For example, f = x^2 + y^2 is easier to optimize than f = x^2 + 50*y^2.
This class uses an implementation of "Adam" (Adaptive Moment Estimation) gradient descent. For details, see:
https://machinelearningmastery.com/adam-optimization-algorithm-for-deep-learning/
https://en.wikipedia.org/wiki/Stochastic_gradient_descent#Adam
https://www.ruder.io/optimizing-gradient-descent/#adam
https://en.wikipedia.org/wiki/Moment_(mathematics)
NOTE: CLion complains about a "Redefinition of GradientDescentOptimizer", but it is incorrect; this class compiles just fine.
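For a rough sense of how the class is used, the following is a minimal sketch based only on the signatures listed above. The include path and the absence of a wrapping namespace are assumptions; the project's tests remain the authoritative example.

    #include <gradient_descent_optimizer.hpp>

    #include <array>
    #include <iostream>

    int main()
    {
        // Optimize over two parameters, so NUM_PARAMS = 2
        using Optimizer = GradientDescentOptimizer<2>;

        // All constructor arguments are defaulted: weights of 1, and the
        // default gradient approximation step size and decay rates
        Optimizer optimizer;

        // f(x, y) = x^2 + y^2, which has its minimum at (0, 0)
        auto objective = [](Optimizer::ParamArray params) {
            return params[0] * params[0] + params[1] * params[1];
        };

        // Start from (5, -3) and run 1000 iterations of gradient descent
        Optimizer::ParamArray result = optimizer.minimize(objective, {5.0, -3.0}, 1000);

        std::cout << "x = " << result[0] << ", y = " << result[1] << std::endl;
        return 0;
    }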
Template Parameters
    NUM_PARAMS    The number of parameters that a given instance of this class will optimize over.
GradientDescentOptimizer(ParamArray param_weights = GradientDescentOptimizer<NUM_PARAMS>::ParamArray{1},
                         double gradient_approx_step_size = DEFAULT_GRADIENT_APPROX_STEP_SIZE,
                         double past_gradient_decay_rate = DEFAULT_PAST_GRADIENT_DECAY_RATE,
                         double past_squared_gradient_decay_rate = DEFAULT_PAST_SQUARED_GRADIENT_DECAY_RATE)
Creates a GradientDescentOptimizer
NOTE: Unless you know what you're doing, you probably don't want to specify the decay rates, as the default values are almost always good.
Parameters
    param_weights                       The weights used to normalize the parameters (see the class description for details)
    gradient_approx_step_size           The size of the step to take forward when approximating the gradient of a function
    past_gradient_decay_rate            Past gradient knowledge decay rate; see the corresponding class member variable for details
    past_squared_gradient_decay_rate    Past squared gradient knowledge decay rate; see the corresponding class member variable for details
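As a purely illustrative sketch of the param_weights argument (the specific values, and whether a weight should be a parameter's scale or its reciprocal, are assumptions rather than taken from the implementation):

    // Hypothetical: the second parameter naturally varies on a scale roughly
    // 50x larger than the first, so it is down-weighted to bring the objective
    // closer to the homogeneous case described in the class overview. The
    // remaining constructor arguments are left at their defaults, as recommended.
    GradientDescentOptimizer<2> optimizer({1.0, 1.0 / 50.0});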
NOTE: We do not use "using namespace ..." here because this is still a header file, and anything that includes gradient_descent.h (which includes this file) would also pull in any namespaces used here.
NOTE: We do not use ParamArray in any of the function signatures here, as ParamArray depends on a template parameter (NUM_PARAMS), and so a fairly complex expression would be needed to use it here. As such, it was decided that just using std::array<...> was the better option.
std::array<double, NUM_PARAMS> maximize(std::function<double(ParamArray)> objective_function,
                                        ParamArray initial_value,
                                        unsigned int num_iters)
Attempts to maximize the given objective function.
Runs gradient descent, starting from the given initial_value and running for num_iters iterations.
Parameters
    objective_function    The function to maximize
    initial_value         The value to start from
    num_iters             The number of iterations to run for
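A minimal sketch of a maximize call, based only on the signature above (the objective and values are illustrative):

    // f(x) = -(x - 2)^2 has its maximum at x = 2
    GradientDescentOptimizer<1> optimizer;
    std::array<double, 1> best = optimizer.maximize(
        [](std::array<double, 1> params) {
            return -(params[0] - 2.0) * (params[0] - 2.0);
        },
        {0.0},  // initial_value
        500);   // num_iters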
std::array<double, NUM_PARAMS> minimize(std::function<double(ParamArray)> objective_function,
                                        ParamArray initial_value,
                                        unsigned int num_iters)
Attempts to minimize the given objective function.
Runs gradient descent, starting from the given initial_value and running for num_iters iterations.
Parameters
    objective_function    The function to minimize
    initial_value         The value to start from
    num_iters             The number of iterations to run for
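A matching sketch for minimize, using the anisotropic example from the class overview together with the hypothetical weights from the constructor example (the values and the weighting direction are assumptions):

    // f(x, y) = x^2 + 50 * y^2 has its minimum at (0, 0); the weights attempt
    // to compensate for the mismatched scales of the two parameters
    GradientDescentOptimizer<2> optimizer({1.0, 1.0 / 50.0});
    std::array<double, 2> best = optimizer.minimize(
        [](std::array<double, 2> params) {
            return params[0] * params[0] + 50.0 * params[1] * params[1];
        },
        {10.0, 10.0},  // initial_value
        2000);         // num_iters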