Bernal Jimenez edited section_Descending_the_Alternate_Sparse__.tex  about 8 years ago

Commit id: e4716d50f5c2dbae46c8b3ffee3bfa738b8baff7

The negative gradient of the local objective function \ref{local} is
\begin{equation}
-\frac{\partial E(a \mid X; \Phi, W, \theta)}{\partial a_i} = \sum_j \Phi_{ji} X_j - \sum_j \Phi_{ji}^2 a_i - \theta_i - 2\sum_{j\ne i} W_{ji} a_j.
\end{equation}
The first term is the same linear filtering term as in SAILnet. The second is the leakiness term, with an additional scaling by the squared length of the dictionary element. The dictionary is commonly normalized to unit length to prevent it from growing without bound, although Oja's rule does not require this. Empirically, the mean norm is of order 1, but it can differ by a small integer factor and has non-zero variance. This makes the interesting prediction that the leakiness of a neuron's membrane should scale with the overall strength of its synapses. The $-\theta_i$ term would be converted into a spike threshold in a LIF version of this analog equation. Finally, the last term is twice the SAILnet value because $W$ is symmetric.
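
To make the update concrete, below is a minimal NumPy sketch of this gradient together with a plain gradient-descent inference loop. The function names, array shapes, step size, iteration count, and the assumption that $W$ has a zero diagonal are illustrative choices, not part of the original text.
\begin{verbatim}
import numpy as np

def activity_gradient(a, X, Phi, W, theta):
    """Negative gradient of the local objective with respect to the
    activities a, following the equation above.

    Assumed shapes: X is (n_pixels,), Phi is (n_pixels, n_neurons),
    W is (n_neurons, n_neurons), symmetric with zero diagonal,
    theta and a are (n_neurons,).
    """
    feedforward = Phi.T @ X               # linear filtering: sum_j Phi_ji X_j
    leak = (Phi ** 2).sum(axis=0) * a     # leak scaled by squared dictionary norm
    inhibition = 2.0 * (W @ a)            # lateral term; factor 2 from symmetric W
    return feedforward - leak - theta - inhibition

def descend_activities(X, Phi, W, theta, eta=0.01, n_steps=100):
    """Gradient-descent inference of the activities (sketch only;
    eta and n_steps are illustrative, not from the source)."""
    a = np.zeros(Phi.shape[1])
    for _ in range(n_steps):
        a += eta * activity_gradient(a, X, Phi, W, theta)
    return a
\end{verbatim}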