Bender.jl
Layers
Bender.GenDense
— Type
Generalized version of Flux's Dense layer. The forward keyword allows you to choose the form of the forward mapping.
GenDense(in=>out, σ=identity;
init = glorot_uniform,
bias=true, α=false, β=false, forward=linear)
Can also be initialized with an additional set of trainable weights
GenDense(in=>out, in_asym=>out_asym, σ = identity;
init = glorot_uniform,
bias=true, α=false, β=false, forward=linear)
The layer has additional keyword arguments α and β, which default to Flux.Zeros. These are useful if you need an extra set of weights for your forward pass (for example, if you wish to anneal an activation function).
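As a concrete illustration of the annealing use case, an extra scalar parameter can gradually sharpen an activation over training. The snippet below is a plain-Julia sketch of that idea only; the name `annealed_tanh` is hypothetical and this is not Bender's actual α/β machinery:

```julia
# Hypothetical sketch of activation annealing (not Bender's API):
# a scalar inverse-temperature β sharpens tanh toward sign(x).
annealed_tanh(x, β) = tanh(β * x)

# With a small β the activation is nearly linear in x;
# as β grows, annealed_tanh(x, β) approaches a hard sign function.
```

Increasing β during training would let a network transition smoothly from a soft to an effectively binary activation.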
Bender.GenConv
— Type
Generalized version of Flux's Conv layer. The forward keyword allows you to choose the form of the forward mapping and defaults to linear. This layer can be initialized with either one or two sets of filters (a second set of filters is useful for feedback alignment experiments).
GenConv((k, k), ch_in=>ch_out, σ=identity; forward=linear)
GenConv((k, k), ch_in=>ch_out, (k_asym, k_asym), ch_in_asym=>ch_out_asym, σ=identity; forward=linear)
The layer has additional keyword arguments α and β, which default to Flux.Zeros. These are useful if you need an extra set of weights for your forward pass (for example, if you wish to anneal an activation function).
Forward mappings
Forward mappings for GenDense layers
Bender.linear
— Function
Multiplies the weight matrix with x and adds the bias.
Bender.linear_asym_∂x
— Function
Behaves identically to linear in the forward pass, but relies on matmul_asym_∂x, which causes errors to be backpropagated through a set of auxiliary weights in the backwards pass. See matmul_asym_∂x.
Bender.linear_blocked_∂x
— Function
Behaves identically to linear in the forward pass, but relies on matmul_blocked_∂x, which prevents the error signal from passing through this layer to earlier layers. This is useful in direct feedback alignment experiments, where you want to pipe errors directly from the output loss to individual layers. See matmul_blocked_∂x.
Bender.radial
— Function
Calls radialSim to compute the negative squared Euclidean distance D between the rows of the layer's weight matrix W and the columns of the input matrix X. See radialSim.
Bender.radial_asym_∂x
— Function
Behaves identically to radial in the forward pass, but relies on radialSim_asym_∂x, which causes errors to be backpropagated through a set of auxiliary weights in the backwards pass. See radialSim_asym_∂x.
Bender.linear_binary_weights
— Function
Regular forward pass (matmul and bias addition) with a binary activation function applied to the weights.
Bender.linear_stoc_binary_weights
— Function
Regular forward pass (matmul and bias addition) with a stochastic binary activation function applied to the weights.
Forward mappings for GenConv layers
Bender.conv_linear
— Function
Forward mapping for a regular convolutional layer.
Bender.conv_linear_asym_∂x
— Function
In the forward pass this behaves identically to conv_linear, but relies on conv_asym_∂x, which causes errors to be backpropagated through a set of auxiliary weights in the backwards pass. See conv_asym_∂x.
Similarity/correlation functions
Bender.matmul
— Function
Regular matrix multiplication.
Bender.matmul_asym_∂x
— Function
Computes a matrix multiplication, but takes an additional matrix B as input. B has the same dimensions as Wᵀ and is used in the backwards pass.
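The mechanism can be sketched in plain Julia as a function returning both the product and a hand-written pullback. The name `matmul_asym_sketch` and the closure-based pullback are illustrative assumptions, not Bender's implementation (which defines a custom rrule):

```julia
# Illustrative sketch (not Bender's code): a matmul whose pullback
# uses an auxiliary matrix B in place of Wᵀ, breaking weight symmetry
# as in feedback alignment.
function matmul_asym_sketch(W, B, x)
    y = W * x
    # ȳ is the incoming gradient: ∂W is the usual outer product,
    # but ∂x is computed with B instead of Wᵀ.
    pullback(ȳ) = (ȳ * x', B * ȳ)
    return y, pullback
end
```

Because B is fixed rather than tied to W, the layer's weight updates no longer require the transport of W to the backward path.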
Bender.matmul_blocked_∂x
— Function
Matrix multiplication with a custom rrule.
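The blocking behaviour described for linear_blocked_∂x can be sketched the same way; again the name and pullback-closure style here are illustrative assumptions rather than Bender's actual rrule:

```julia
# Illustrative sketch (not Bender's code): forward pass is a normal
# matmul, but the pullback returns zero for ∂x, so no error signal
# propagates to earlier layers.
function matmul_blocked_sketch(W, x)
    y = W * x
    pullback(ȳ) = (ȳ * x', zero(x))   # (∂W, ∂x); ∂x is blocked
    return y, pullback
end
```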
Bender.radialSim
— Function
Computes the negative squared Euclidean distance D between the rows of matrix W and the columns of matrix X. Denoting the rows of W by index i and the columns of X by index j, the elements of the output matrix are given by: Dᵢⱼ = -||Wᵢ﹕ - X﹕ⱼ||² = 2Wᵢ﹕X﹕ⱼ - ||Wᵢ﹕||² - ||X﹕ⱼ||².
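The expanded form of that identity vectorizes directly. The sketch below (hypothetical name, assumed semantics) computes the full distance matrix without explicit loops:

```julia
# Sketch of the similarity described above (not Bender's implementation):
# Dᵢⱼ = 2Wᵢ﹕X﹕ⱼ - ||Wᵢ﹕||² - ||X﹕ⱼ||², for W of size (m, k) and
# X of size (k, n), yielding an (m, n) matrix D.
function radial_sim_sketch(W, X)
    return 2 .* (W * X) .- sum(abs2, W; dims=2) .- sum(abs2, X; dims=1)
end
```

The row norms broadcast down columns and the column norms broadcast across rows, so each entry equals the negative squared distance between the corresponding row of W and column of X.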
Bender.radialSim_asym
— Function
In the forward pass this function behaves just like radialSim, but in the backwards pass weight symmetry is broken by using matrix B rather than Wᵀ. See the docstring for radialSim for more details.
Bender.conv_asym_∂x
— Function
Computes the convolution of the image x with the kernel w when called, but uses a different set of weights w_asym to compute the pullback with respect to x. This is typically used in feedback alignment experiments.
Loss functions
Bender.direct_feedback_loss
— Function
Error function which takes a vector of the hidden and output neuron states, as well as a vector of feedback matrices, as arguments.
Activation functions
Bender.sign_STE
— Function
Deterministic straight-through estimator for the sign function. References: https://arxiv.org/abs/1308.3432, https://arxiv.org/abs/1511.00363
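The straight-through idea can be sketched in two lines; the names below are hypothetical and the clipped-identity backward rule is one common variant from the cited references, not necessarily Bender's exact choice:

```julia
# Illustrative straight-through estimator sketch (not Bender's code):
# forward applies sign; backward pretends the function was the
# identity, clipped to |x| ≤ 1 so gradients vanish outside that range.
sign_ste(x) = sign(x)
sign_ste_pullback(x, ȳ) = abs(x) <= 1 ? ȳ : zero(ȳ)
```

This lets gradient descent train through a non-differentiable binarization step.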
Bender.stoc_sign_STE
— Function
A stochastic straight-through estimator version of the sign function. References: https://arxiv.org/abs/1308.3432, https://arxiv.org/abs/1511.00363
Bender.hardσ
— Function
A piecewise linear function: for x < -1 it has value 0, for -1 ≤ x ≤ 1 it has value (x+1)/2, and for x > 1 it has value 1. It is defined as:
hardσ(x) = max(0, min(1, (x+1)/2))
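The formula transcribes directly into Julia; a quick check of the three regions (the name `hardσ_impl` is used here only to avoid shadowing the exported function):

```julia
# Direct transcription of the definition above:
# 0 for x < -1, (x+1)/2 on [-1, 1], and 1 for x > 1.
hardσ_impl(x) = max(0, min(1, (x + 1) / 2))
```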