eprop_iaf_adapt_bsshslm_2020 – Current-based leaky integrate-and-fire neuron model with delta-shaped or exponentially filtered postsynaptic currents and threshold adaptation for e-prop plasticity

Description

eprop_iaf_adapt_bsshslm_2020 is an implementation of a leaky integrate-and-fire neuron model with delta-shaped postsynaptic currents and threshold adaptation used for eligibility propagation (e-prop) plasticity.

E-prop plasticity was originally introduced and implemented in TensorFlow in [1].

The suffix _bsshslm_2020 follows the NEST convention of naming a model after the paper that introduced it, using the first letters of the authors’ last names and the publication year.

Note

The neuron dynamics of the eprop_iaf_adapt_bsshslm_2020 model (excluding e-prop plasticity and the threshold adaptation) are similar to those of the iaf_psc_delta model, with minor differences such as the propagator of the postsynaptic current and the voltage reset upon a spike.

The membrane voltage time course \(v_j^t\) of the neuron \(j\) is given by:

\[\begin{split}v_j^t &= \alpha v_j^{t-1} + \zeta \sum_{i \neq j} W_{ji}^\text{rec} z_i^{t-1} + \zeta \sum_i W_{ji}^\text{in} x_i^t - z_j^{t-1} v_\text{th} \,, \\ \alpha &= e^{ -\frac{ \Delta t }{ \tau_\text{m} } } \,, \\ \zeta &= \begin{cases} 1 & \text{for regular spike arrival} \\ 1 - \alpha & \text{otherwise} \end{cases} \,, \\\end{split}\]

where \(W_{ji}^\text{rec}\) and \(W_{ji}^\text{in}\) are the recurrent and input synaptic weight matrices, \(z_i^{t-1}\) is the recurrent presynaptic state variable, and \(x_i^t\) represents the input at time \(t\). The factor \(\zeta\) scales the postsynaptic currents and is selected by the regular_spike_arrival parameter.

Descriptions of further parameters and variables can be found in the table below.
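As an illustration of the membrane update above, the following minimal sketch computes the propagators and one voltage step in plain Python. It is not the NEST implementation. The function names and the numerical values are hypothetical, voltages are assumed to be measured relative to the resting potential, and the mapping of regular_spike_arrival to \(\zeta = 1\) is an assumption.

```python
import math


def propagators(dt, tau_m, regular_spike_arrival=True):
    """Return (alpha, zeta) for step size dt and membrane time constant tau_m (ms)."""
    alpha = math.exp(-dt / tau_m)
    # Assumption: zeta = 1 for regular spike arrival, 1 - alpha otherwise.
    zeta = 1.0 if regular_spike_arrival else 1.0 - alpha
    return alpha, zeta


def membrane_step(v_prev, z_prev, weighted_input, alpha, zeta, v_th):
    """One update of v_j^t.

    weighted_input stands for
    sum_i W_ji^rec z_i^{t-1} + sum_i W_ji^in x_i^t (already summed).
    """
    return alpha * v_prev + zeta * weighted_input - z_prev * v_th


# Hypothetical values, chosen only for illustration.
alpha, zeta = propagators(dt=1.0, tau_m=10.0)
v_next = membrane_step(v_prev=0.0, z_prev=0, weighted_input=0.5,
                       alpha=alpha, zeta=zeta, v_th=0.6)
print(alpha, zeta, v_next)
```

The last term subtracts \(v_\text{th}\) only in the step after a spike, which is how the voltage reset enters the update.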

The threshold adaptation is given by:

\[\begin{split}A_j^t &= v_\text{th} + \beta a_j^t \,, \\ a_j^t &= \rho a_j^{t-1} + z_j^{t-1} \,, \\ \rho &= e^{-\frac{ \Delta t }{ \tau_\text{a} }} \,. \\\end{split}\]

The spike state variable is expressed by a Heaviside function:

\[\begin{split}z_j^t = H \left( v_j^t - A_j^t \right) \,. \\\end{split}\]

If the membrane voltage crosses the adaptive threshold voltage \(A_j^t\), a spike is emitted and the membrane voltage is reduced by \(v_\text{th}\) in the next time step. After the time step of the spike emission, the neuron is not able to spike for an absolute refractory period \(t_\text{ref}\).
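The interplay of membrane voltage, adaptive threshold, and refractoriness can be sketched as a small single-neuron loop. This is a hedged illustration rather than the NEST code: the function name, the default values, the choice \(\zeta = 1\), the convention \(H(0) = 1\), and the way the refractory counter is handled are all assumptions, and voltages are taken relative to the resting potential.

```python
import math


def simulate_adaptive_lif(inputs, dt=1.0, tau_m=10.0, tau_a=10.0,
                          beta=1.0, v_th=0.6, t_ref=2.0):
    """Simulate one neuron driven by a list of summed weighted inputs."""
    alpha = math.exp(-dt / tau_m)   # membrane propagator
    rho = math.exp(-dt / tau_a)     # adaptation propagator
    ref_steps = int(round(t_ref / dt))
    v, a, z, r = 0.0, 0.0, 0, 0     # voltage, adaptation, spike state, refractory counter
    spikes, thresholds = [], []
    for x in inputs:
        a = rho * a + z                   # a^t = rho a^{t-1} + z^{t-1}
        v = alpha * v + x - z * v_th      # membrane update with zeta = 1 (assumption)
        threshold = v_th + beta * a       # A^t = v_th + beta a^t
        if r > 0:
            r -= 1                        # refractory: no spike possible
            z = 0
        elif v >= threshold:              # Heaviside with H(0) = 1 (assumption)
            z = 1
            r = ref_steps
        else:
            z = 0
        spikes.append(z)
        thresholds.append(threshold)
    return spikes, thresholds


spikes, thresholds = simulate_adaptive_lif([0.3] * 20)
print(spikes)
print(thresholds[:5])
```

After each spike the threshold jumps by \(\beta\) and then decays back towards \(v_\text{th}\) with time constant \(\tau_\text{a}\), so a constant input produces progressively longer inter-spike intervals until the adaptation saturates.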

An additional state variable and the corresponding differential equation represent a piecewise constant external current.

See the documentation on the iaf_psc_delta neuron model for more information on the integration of the subthreshold dynamics.

The change of the synaptic weight is calculated from the gradient \(g\) of the loss \(E\) with respect to the synaptic weight \(W_{ji}\), \(g = \frac{ \text{d}E }{ \text{d} W_{ji} }\), which depends on the presynaptic spikes \(z_i^{t-1}\), the surrogate gradient or pseudo-derivative of the spike state variable with respect to the postsynaptic membrane voltage \(\psi_j^t\) (the product of which forms the eligibility trace \(e_{ji}^t\)), and the learning signal \(L_j^t\) emitted by the readout neurons.

\[\begin{split}\frac{ \text{d} E }{ \text{d} W_{ji} } &= \sum_t L_j^t \bar{e}_{ji}^t \,, \\ e_{ji}^t &= \psi_j^t \left( \bar{z}_i^{t-1} - \beta \epsilon_{ji,\text{a}}^{t-1} \right) \,, \\ \epsilon_{ji,\text{a}}^{t-1} &= \psi_j^{t-1} \bar{z}_i^{t-2} + \left( \rho - \psi_j^{t-1} \beta \right) \epsilon_{ji,\text{a}}^{t-2} \,. \\\end{split}\]
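The recursion for the eligibility trace can be written as a short loop. The sketch below is illustrative only: the function names are hypothetical, the adaptation component \(\epsilon_{ji,\text{a}}\) is assumed to start at zero, and the filtered inputs \(\bar{z}_i\) and filtered traces \(\bar{e}_{ji}\) are assumed to be supplied by the caller.

```python
def eligibility_traces(psi, z_bar, beta, rho):
    """Return e_ji^t for t = 1 .. len(psi) - 1.

    psi[t]   : surrogate gradient of the postsynaptic neuron at step t
    z_bar[t] : low-pass filtered presynaptic spike train at step t
    """
    eps = 0.0  # epsilon_{ji,a}^{t-1}, assumed to start at zero
    e = []
    for t in range(1, len(psi)):
        # e^t = psi^t (z_bar^{t-1} - beta * eps^{t-1})
        e.append(psi[t] * (z_bar[t - 1] - beta * eps))
        # eps^t = psi^t z_bar^{t-1} + (rho - psi^t beta) eps^{t-1}
        eps = psi[t] * z_bar[t - 1] + (rho - psi[t] * beta) * eps
    return e


def gradient(learning_signal, e_bar):
    """dE/dW_ji = sum_t L_j^t * e_bar_ji^t for aligned sequences."""
    return sum(l * eb for l, eb in zip(learning_signal, e_bar))


# Hypothetical values, chosen only for illustration.
e = eligibility_traces(psi=[0.0, 0.5, 0.5], z_bar=[1.0, 1.0, 1.0], beta=1.0, rho=0.9)
print(e)
```

The second trace value is smaller than the first because the adaptation component \(\epsilon_{ji,\text{a}}\) has built up and is subtracted with weight \(\beta\).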

Surrogate gradients help overcome the challenge of the spiking function’s non-differentiability, facilitating the use of gradient-based learning techniques such as e-prop. The non-existent derivative of the spiking variable with respect to the membrane voltage, \(\frac{\partial z^t_j}{ \partial v^t_j}\), can be effectively replaced with a variety of surrogate gradient functions, as detailed in various studies (see, e.g., [3]). NEST currently provides four different surrogate gradient functions:

  1. A piecewise linear function used among others in [1]:

\[\begin{split}\psi_j^t = \frac{ \gamma }{ v_\text{th} } \text{max} \left( 0, 1-\beta \left| \frac{ v_j^t - v_\text{th} }{ v_\text{th} }\right| \right) \,. \\\end{split}\]
  2. An exponential function used in [4]:

\[\begin{split}\psi_j^t = \gamma \exp \left( -\beta \left| v_j^t - v_\text{th} \right| \right) \,. \\\end{split}\]
  3. The derivative of a fast sigmoid function used in [5]:

\[\begin{split}\psi_j^t = \gamma \left( 1 + \beta \left| v_j^t - v_\text{th} \right| \right)^{-2} \,. \\\end{split}\]
  4. An arctan function used in [6]:

\[\begin{split}\psi_j^t = \frac{\gamma}{\pi} \frac{1}{ 1 + \left( \beta \pi \left( v_j^t - v_\text{th} \right) \right)^2 } \,. \\\end{split}\]
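The four surrogate gradient functions can be compared side by side with a small Python sketch. The function names mirror the option strings of surrogate_gradient_function but are otherwise hypothetical, and voltages are assumed to be relative to the resting potential so that \(v_\text{th} > 0\). The fast sigmoid derivative uses the exponent \(-2\), since the derivative of \(x / (1 + |x|)\) is \(1 / (1 + |x|)^2\).

```python
import math


def piecewise_linear(v, v_th, beta, gamma):
    return gamma / v_th * max(0.0, 1.0 - beta * abs((v - v_th) / v_th))


def exponential(v, v_th, beta, gamma):
    return gamma * math.exp(-beta * abs(v - v_th))


def fast_sigmoid_derivative(v, v_th, beta, gamma):
    return gamma * (1.0 + beta * abs(v - v_th)) ** -2


def arctan(v, v_th, beta, gamma):
    return gamma / math.pi / (1.0 + (beta * math.pi * (v - v_th)) ** 2)


SURROGATES = {
    "piecewise_linear": piecewise_linear,
    "exponential": exponential,
    "fast_sigmoid_derivative": fast_sigmoid_derivative,
    "arctan": arctan,
}

# Hypothetical values: evaluate each function at and away from the threshold.
for name, func in SURROGATES.items():
    at_threshold = func(0.6, v_th=0.6, beta=1.0, gamma=0.3)
    below = func(0.3, v_th=0.6, beta=1.0, gamma=0.3)
    print(name, at_threshold, below)
```

All four functions peak at \(v_j^t = v_\text{th}\) and decay symmetrically on both sides; \(\gamma\) sets the peak height and \(\beta\) the width.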

The eligibility trace and the presynaptic spike trains are low-pass filtered with the following exponential kernels:

\[\begin{split}\bar{e}_{ji}^t &= \mathcal{F}_\kappa \left( e_{ji}^t \right) \,, \\ \kappa &= e^{ -\frac{\Delta t }{ \tau_\text{m,out} }} \,, \\ \bar{z}_i^t &= \mathcal{F}_\alpha(z_i^t) \,, \\ \mathcal{F}_\alpha \left( z_i^t \right) &= \alpha \mathcal{F}_\alpha \left( z_i^{t-1} \right) + z_i^t \,, \\ \mathcal{F}_\alpha \left( z_i^0 \right) &= z_i^0 \,, \\\end{split}\]

where \(\tau_\text{m,out}\) is the membrane time constant of the readout neuron.
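The recursive filter \(\mathcal{F}_\alpha\) defined above amounts to a running exponentially decaying sum. A minimal sketch, with a hypothetical function name, could look as follows. Applying the same recursion with \(\kappa\) in place of \(\alpha\) is an assumption, since only \(\mathcal{F}_\alpha\) is written out explicitly.

```python
def low_pass(signal, decay):
    """Apply F(x^t) = decay * F(x^{t-1}) + x^t with F(x^0) = x^0."""
    filtered = []
    previous = 0.0
    for step, value in enumerate(signal):
        if step == 0:
            previous = value                  # initial condition F(x^0) = x^0
        else:
            previous = decay * previous + value
        filtered.append(previous)
    return filtered


# A spike at steps 0 and 3, filtered with a decay factor of 0.5.
print(low_pass([1.0, 0.0, 0.0, 1.0], decay=0.5))  # [1.0, 0.5, 0.25, 1.125]
```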

Furthermore, a firing rate regularization mechanism keeps the average firing rate \(f^\text{av}_j\) of the postsynaptic neuron close to a target firing rate \(f^\text{target}\). The gradient \(g_\text{reg}\) of the regularization loss \(E_\text{reg}\) with respect to the synaptic weight \(W_{ji}\) is given by:

\[\begin{split}\frac{ \text{d} E_\text{reg} }{ \text{d} W_{ji} } = c_\text{reg} \sum_t \frac{ 1 }{ T n_\text{trial} } \left( f^\text{target} - f^\text{av}_j \right) e_{ji}^t \,, \\\end{split}\]

where \(c_\text{reg}\) is a constant scaling factor and the average is taken over the time that passed since the previous update, that is, the number of trials \(n_\text{trial}\) times the duration of an update interval \(T\).
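The regularization term can be evaluated directly from the formula above. The sketch below is only an illustration: the function name and all numerical values are hypothetical, and \(f^\text{av}_j\) and \(f^\text{target}\) are assumed to be given in the same units.

```python
def regularization_gradient(e, f_av, f_target, c_reg, T, n_trial):
    """dE_reg/dW_ji = c_reg * sum_t (f_target - f_av) * e^t / (T * n_trial)."""
    return c_reg * sum((f_target - f_av) * e_t for e_t in e) / (T * n_trial)


# Hypothetical values: the neuron fires at twice its target rate.
g_reg = regularization_gradient(e=[0.5, 0.25], f_av=0.02, f_target=0.01,
                                c_reg=2.0, T=2, n_trial=1)
g_loss = 0.1                  # placeholder for the gradient of the task loss
g_total = g_loss + g_reg      # overall gradient as the sum of both terms
print(g_reg, g_total)
```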

The overall gradient is the sum of the two gradients.

For more information on e-prop plasticity, see the documentation on the other e-prop models.

Details on the event-based NEST implementation of e-prop can be found in [2].

Parameters

The following parameters can be set in the status dictionary.

Neuron parameters

| Parameter | Unit | Math equivalent | Default | Description |
|---|---|---|---|---|
| adapt_beta | | \(\beta\) | 1.0 | Prefactor of the threshold adaptation |
| adapt_tau | ms | \(\tau_\text{a}\) | 10.0 | Time constant of the threshold adaptation |
| C_m | pF | \(C_\text{m}\) | 250.0 | Capacitance of the membrane |
| E_L | mV | \(E_\text{L}\) | -70.0 | Leak / resting membrane potential |
| I_e | pA | \(I_\text{e}\) | 0.0 | Constant external input current |
| regular_spike_arrival | Boolean | | True | If True, the input spikes arrive at the end of the time step, if False at the beginning (determines PSC scale) |
| t_ref | ms | \(t_\text{ref}\) | 2.0 | Duration of the refractory period |
| tau_m | ms | \(\tau_\text{m}\) | 10.0 | Time constant of the membrane |
| V_min | mV | \(v_\text{min}\) | negative maximum value representable by a double type in C++ | Absolute lower bound of the membrane voltage |
| V_th | mV | \(v_\text{th}\) | -55.0 | Spike threshold voltage |

E-prop parameters

| Parameter | Unit | Math equivalent | Default | Description |
|---|---|---|---|---|
| c_reg | | \(c_\text{reg}\) | 0.0 | Coefficient of firing rate regularization |
| f_target | Hz | \(f^\text{target}\) | 10.0 | Target firing rate of rate regularization |
| beta | | \(\beta\) | 1.0 | Width scaling of surrogate gradient / pseudo-derivative of membrane voltage |
| gamma | | \(\gamma\) | 0.3 | Height scaling of surrogate gradient / pseudo-derivative of membrane voltage |
| surrogate_gradient_function | | \(\psi\) | "piecewise_linear" | Surrogate gradient / pseudo-derivative function ["piecewise_linear", "exponential", "fast_sigmoid_derivative", "arctan"] |
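For orientation, the documented defaults can be collected in a plain Python dictionary of the kind that is typically passed as the status dictionary. This is a sketch, not a tested NEST script: the variable names are hypothetical, V_min is omitted, and the commented call assumes a working PyNEST installation that provides this model.

```python
# Neuron parameters with the default values listed in the tables above.
neuron_params = {
    "adapt_beta": 1.0,    # prefactor of the threshold adaptation
    "adapt_tau": 10.0,    # ms
    "C_m": 250.0,         # pF
    "E_L": -70.0,         # mV
    "I_e": 0.0,           # pA
    "regular_spike_arrival": True,
    "t_ref": 2.0,         # ms
    "tau_m": 10.0,        # ms
    "V_th": -55.0,        # mV
}

# E-prop parameters with their documented defaults.
eprop_params = {
    "c_reg": 0.0,
    "f_target": 10.0,     # Hz
    "beta": 1.0,
    "gamma": 0.3,
    "surrogate_gradient_function": "piecewise_linear",
}

params = {**neuron_params, **eprop_params}

# With PyNEST available, the dictionary could be used along these lines:
# import nest
# neurons = nest.Create("eprop_iaf_adapt_bsshslm_2020", 10, params=params)
print(sorted(params))
```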

Recordables

The following state variables evolve during simulation and can be recorded.

Neuron state variables and recordables

| State variable | Unit | Math equivalent | Initial value | Description |
|---|---|---|---|---|
| adaptation | | \(a_j\) | 0.0 | Adaptation variable |
| V_m | mV | \(v_j\) | -70.0 | Membrane voltage |
| V_th_adapt | mV | \(A_j\) | -55.0 | Adapting spike threshold |

E-prop state variables and recordables

| State variable | Unit | Math equivalent | Initial value | Description |
|---|---|---|---|---|
| learning_signal | pA | \(L_j\) | 0.0 | Learning signal |
| surrogate_gradient | | \(\psi_j\) | 0.0 | Surrogate gradient / pseudo-derivative of membrane voltage |

Usage

This model can only be used in combination with the other e-prop models, and the network architecture requires specific wiring, input, and output. The usage is demonstrated in several supervised regression and classification tasks that reproduce, among others, the original proof-of-concept tasks in [1].

References

Sends

SpikeEvent

Receives

SpikeEvent, CurrentEvent, LearningSignalConnectionEvent, DataLoggingRequest
