Theta sequences as eligibility traces: a biological solution to credit assignment
Credit assignment problems, such as policy evaluation in reinforcement learning (RL), often require bootstrapping prediction errors through preceding states or maintaining temporally extended memory traces, solutions that are unfavourable or implausible for biological networks of neurons. We propose theta sequences (chains of neural activity during theta oscillations in the hippocampus, thought to represent rapid playthroughs of awake behaviour) as a solution. By analysing and simulating a model of theta sequences, we show that they compress behaviour such that existing but short O(10) ms neuronal memory traces are effectively extended, allowing bootstrap-free credit assignment without long memory traces, equivalent to the use of eligibility traces in TD(λ).
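For readers unfamiliar with the comparison, the following is a minimal sketch of standard tabular TD(λ) with accumulating eligibility traces. It is not the theta-sequence model analysed in the paper. The deterministic linear track, the single terminal reward, and all names and parameter values (`td_lambda_episode`, `n_states`, `alpha`, `gamma`, `lam`) are assumptions made for illustration only.

```python
def td_lambda_episode(n_states=10, alpha=0.5, gamma=1.0, lam=0.9):
    """Run one episode of tabular TD(lambda) on a hypothetical linear track.

    The agent walks deterministically from state 0 to a terminal state and
    receives a reward of 1 on the final transition. Values start at zero.
    """
    values = [0.0] * (n_states + 1)   # last entry is the terminal state (value 0)
    traces = [0.0] * (n_states + 1)   # one eligibility trace per state

    for s in range(n_states):
        s_next = s + 1
        reward = 1.0 if s_next == n_states else 0.0

        # One-step temporal-difference error.
        delta = reward + gamma * values[s_next] - values[s]

        # Decay every trace, then mark the current state as eligible.
        traces = [gamma * lam * e for e in traces]
        traces[s] += 1.0

        # Every state is updated in proportion to its remaining eligibility.
        values = [v + alpha * delta * e for v, e in zip(values, traces)]

    return values[:n_states]


if __name__ == "__main__":
    print("lambda = 0.0:", td_lambda_episode(lam=0.0))
    print("lambda = 0.9:", td_lambda_episode(lam=0.9))
```

With `lam=0.0` the trace vanishes after one step, so a single episode credits only the state immediately preceding the reward, and earlier states must wait for later episodes to bootstrap from it. With `lam=0.9` the decaying trace lets the same reward update every visited state within that one episode, which is the role the abstract attributes to theta-compressed short memory traces.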