weight_optimizer – Selection of weight optimizers

Description

A weight optimizer is an algorithm that adjusts the synaptic weights in a network during training to minimize the loss function and thus improve the network’s performance on a given task.

Weight optimization is an essential part of plasticity rules such as e-prop plasticity.

Currently two weight optimizers are implemented: gradient descent and the Adam optimizer.

In gradient descent [1] the weights are optimized via:

\[W_t = W_{t-1} - \eta \, g_t \,,\]

where \(\eta\) denotes the learning rate and \(g_t\) the gradient at the current time step \(t\).
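
This update rule can be sketched in a few lines of Python. The sketch is purely illustrative of the equation above; the names gradient_descent_step, w, and grad are chosen here and are not part of the model's interface:

    def gradient_descent_step(w, grad, eta=1e-4):
        """One plain gradient-descent update: W_t = W_{t-1} - eta * g_t."""
        return w - eta * grad

    # Example: one update of a single synaptic weight (arbitrary values).
    w = gradient_descent_step(0.5, 0.2, eta=1e-4)
    print(w)  # 0.49998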

In the Adam scheme [2] the weights are optimized via:

\[\begin{split}
m_0 &= 0, \quad v_0 = 0, \quad t = 1 \,, \\
m_t &= \beta_1 \, m_{t-1} + \left(1-\beta_1\right) \, g_t \,, \\
v_t &= \beta_2 \, v_{t-1} + \left(1-\beta_2\right) \, g_t^2 \,, \\
\hat{m}_t &= \frac{m_t}{1-\beta_1^t} \,, \\
\hat{v}_t &= \frac{v_t}{1-\beta_2^t} \,, \\
W_t &= W_{t-1} - \eta \, \frac{\hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon} \,.
\end{split}\]
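
The Adam update can likewise be written as a short Python sketch that mirrors the equations line by line. Again, this is only an illustration; the function and variable names are chosen for this sketch and are not part of the model's interface:

    import math

    def adam_step(w, grad, m, v, t,
                  eta=1e-4, beta_1=0.9, beta_2=0.999, epsilon=1e-8):
        """One Adam update; t is the 1-based step count, m and v are carried over."""
        m = beta_1 * m + (1.0 - beta_1) * grad
        v = beta_2 * v + (1.0 - beta_2) * grad**2
        m_hat = m / (1.0 - beta_1**t)
        v_hat = v / (1.0 - beta_2**t)
        w = w - eta * m_hat / (math.sqrt(v_hat) + epsilon)
        return w, m, v

    # Example: two consecutive updates of one weight (arbitrary gradients).
    w, m, v = 0.5, 0.0, 0.0
    for t, g in enumerate([0.2, -0.1], start=1):
        w, m, v = adam_step(w, g, m, v, t)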

Parameters

The following parameters can be set in the status dictionary.

Common optimizer parameters

==========  ====  =====================  =======  =================================
Parameter   Unit  Math equivalent        Default  Description
==========  ====  =====================  =======  =================================
batch_size                               1        Size of batch
eta               \(\eta\)               1e-4     Learning rate
Wmax        pA    \(W_{ji}^\text{max}\)  100.0    Maximal value for synaptic weight
Wmin        pA    \(W_{ji}^\text{min}\)  -100.0   Minimal value for synaptic weight
==========  ====  =====================  =======  =================================
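
Wmin and Wmax bound the synaptic weight. A minimal sketch of such a bound, assuming the weight is simply clipped after each optimizer step (the clipping shown here is an assumption based on the parameter descriptions above, not a transcription of the implementation):

    def clip_weight(w, Wmin=-100.0, Wmax=100.0):
        """Keep a synaptic weight within [Wmin, Wmax] (values in pA)."""
        return min(max(w, Wmin), Wmax)

    print(clip_weight(150.0))   # 100.0, clipped to Wmax
    print(clip_weight(-20.0))   # -20.0, unchanged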

Gradient descent parameters (default optimizer)

=========  ====  ===============  ================  ==============
Parameter  Unit  Math equivalent  Default           Description
=========  ====  ===============  ================  ==============
type                              gradient_descent  Optimizer type
=========  ====  ===============  ================  ==============

Adam optimizer parameters

=========  ====  ===============  =======  ==================================================
Parameter  Unit  Math equivalent  Default  Description
=========  ====  ===============  =======  ==================================================
type                              adam     Optimizer type
beta_1           \(\beta_1\)      0.9      Exponential decay rate for first moment estimate
beta_2           \(\beta_2\)      0.999    Exponential decay rate for second moment estimate
epsilon          \(\epsilon\)     1e-8     Small constant for numerical stability
=========  ====  ===============  =======  ==================================================
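
In PyNEST, parameters of this kind are typically passed as a nested dictionary when configuring the learning synapse. The sketch below assumes an e-prop synapse model that exposes the optimizer through an optimizer entry in its defaults; the model name and the optimizer key used here are assumptions, so consult the synapse model documentation of your NEST release for the exact interface:

    import nest

    # Assumed interface: the model name "eprop_synapse_bsshslm_2020" and the
    # "optimizer" key are illustrative and may differ between NEST releases.
    nest.SetDefaults(
        "eprop_synapse_bsshslm_2020",
        {
            "optimizer": {
                "type": "adam",
                "eta": 1e-4,
                "beta_1": 0.9,
                "beta_2": 0.999,
                "epsilon": 1e-8,
                "Wmin": -100.0,
                "Wmax": 100.0,
            }
        },
    )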

The following state variables evolve during simulation.

Adam optimizer state variables for individual synapses

==============  ====  ===============  =============  ==========================
State variable  Unit  Math equivalent  Initial value  Description
==============  ====  ===============  =============  ==========================
m                     \(m\)            0.0            First moment estimate
v                     \(v\)            0.0            Second moment raw estimate
==============  ====  ===============  =============  ==========================

References

See also

E-Prop Plasticity

Examples using this model