Experimental data show that synaptic connections are subject to stochastic processes and that neural codes drift on larger time scales. These data suggest considering, besides maximum likelihood learning, sampling models for network plasticity (synaptic sampling), in which the current network connectivity and parameter values are viewed as a sample from a Markov chain whose stationary distribution captures the invariant properties of network plasticity. However, convergence to this stationary distribution may be rather slow if synaptic sampling carries out Langevin sampling. We show here that data on the molecular basis of synaptic plasticity, specifically on the role of CaMKII in its activated form, support a substantially more efficient Hamiltonian sampling of network configurations. We apply this new conceptual and mathematical framework to the analysis of reward-gated network plasticity, and show in a concrete example based on experimental data that Hamiltonian sampling speeds up the convergence to well-functioning network configurations. We also show that regulating the temperature of the sampling process provides a link between reinforcement learning and global network optimization through simulated annealing.
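The contrast between Langevin and Hamiltonian sampling drawn in the abstract can be illustrated on a toy problem. The Python sketch below is not the model from the paper: the 2-D Gaussian target, the step sizes, the function names, and the Metropolis acceptance step are all assumptions chosen for illustration, and the auxiliary momentum `r` is a generic variable (the abstract does not specify how it maps onto CaMKII). The `temperature` parameter flattens or sharpens the target as p(theta)^(1/T), which is the quantity a simulated-annealing schedule would lower over time.

```python
import numpy as np

# Toy target (an assumption for illustration): a 2-D Gaussian over parameters theta.
TARGET_MEAN = np.array([1.0, -2.0])
TARGET_PREC = np.array([[2.0, 0.5], [0.5, 1.0]])  # precision (inverse covariance)


def log_p(theta):
    """Unnormalized log-density of the toy target."""
    d = theta - TARGET_MEAN
    return -0.5 * d @ TARGET_PREC @ d


def grad_log_p(theta):
    """Gradient of log_p with respect to theta."""
    return -TARGET_PREC @ (theta - TARGET_MEAN)


def langevin_step(theta, rng, step=0.05, temperature=1.0):
    """One unadjusted Langevin step: gradient drift plus Gaussian noise."""
    noise = rng.standard_normal(theta.shape)
    return theta + step * grad_log_p(theta) + np.sqrt(2.0 * step * temperature) * noise


def hamiltonian_step(theta, rng, step=0.1, n_leapfrog=10, temperature=1.0):
    """One Hamiltonian Monte Carlo step (leapfrog integrator + accept/reject)."""
    r = rng.standard_normal(theta.shape)  # fresh auxiliary momentum
    h_old = -log_p(theta) / temperature + 0.5 * r @ r
    new_theta = theta.copy()
    # Half step for the momentum, then alternating full steps.
    r = r + 0.5 * step * grad_log_p(new_theta) / temperature
    for i in range(n_leapfrog):
        new_theta = new_theta + step * r
        if i < n_leapfrog - 1:
            r = r + step * grad_log_p(new_theta) / temperature
    # Final half step for the momentum.
    r = r + 0.5 * step * grad_log_p(new_theta) / temperature
    h_new = -log_p(new_theta) / temperature + 0.5 * r @ r
    # Metropolis correction for the discretization error of the integrator.
    if rng.random() < np.exp(min(0.0, h_old - h_new)):
        return new_theta
    return theta


def run_chain(step_fn, n_samples=3000, seed=0, **kwargs):
    """Run a Markov chain from theta = 0 and return all visited states."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    samples = np.empty((n_samples, 2))
    for t in range(n_samples):
        theta = step_fn(theta, rng, **kwargs)
        samples[t] = theta
    return samples


if __name__ == "__main__":
    for name, fn in [("Langevin", langevin_step), ("Hamiltonian", hamiltonian_step)]:
        samples = run_chain(fn)
        print(name, "sample mean after burn-in:", samples[500:].mean(axis=0))
```

In this sketch each Hamiltonian step follows a multi-step trajectory guided by the momentum, whereas each Langevin step is a single noisy gradient move; passing a `temperature` value above 1 to either step function broadens the sampled distribution.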
Journal: arXiv.org e-Print archive
Publication status: Published - 1 Jun 2016
Yu, Z., Kappel, D., Legenstein, R., Song, S., Chen, F., & Maass, W. (2016). CaMKII activation supports reward-based neural network optimization through Hamiltonian sampling. arXiv.org e-Print archive.