Completed on 10 Dec 2017 by Shengchao Alfred Li. Sourced from https://www.biorxiv.org/content/early/2017/11/21/222828.
This is a beautiful study. I noticed a few typos:
1. Page 5, line 15: in "if ... in Area X was sufficient (to) control dopamine release", a "to" is missing.
2. Page 6, caption of Figure 2 A): in "Schematic showing injection of AAV1-CAG-axCHR2 into (VTA).", "VTA" is missing.
3. Page 13, caption of Figure 4 E): in "inhibition ... elicited increases in contingency", "increases" should be "decreases".
4. Page 13, line 2: "or through or through" duplicates "or through".
5. In the supplemental figures, "TH positive" and "TH negative" are used without the full name of TH being introduced first.
I have some thoughts below.
1. Beyond reference 20, there is another paper on cerebellar connections to Area X, "Thalamostriatal and cerebellothalamic pathways in a songbird, the Bengalese finch", which could be cited as well: "https://www.biorxiv.org/con...".
2. The experiment in Figure 6 J) revealed temporal precision in reinforcement learning. That result may also be interpreted another way: the control syllable did not change its pitch simply because the stimulation or inhibition of dopamine delivery was not correlated with its pitch, and not because of the window size or the timing of the stimulation or inhibition.
Fee and Goldberg [reference 6] said "... because of the very sparse activation of HVC (to MSN) synapses, each synapse could carry a memory, such as by synaptic tagging (Redondo and Morris, 2011), or by an ‘eligibility trace’ (Tesauro, 1992, Houk et al., 1994, Suri and Schultz, 1999) of earlier coincident activation of the HVC and LMAN inputs. Such mechanisms could solve the ‘temporal credit assignment problem’ ... ". This hypothesis does not require a precisely timed delivery of the dopamine signal to Area X. However, Vp may serve the purpose of timed or targeted delivery of dopamine from VTA to Area X (about Vp, see Gale and Perkel, "A Basal Ganglia Pathway Drives Selective Auditory Responses in Songbird Dopaminergic Neurons via Disinhibition").
I think either or both of the following two experiments may further reveal the source of the temporal precision in reinforcement learning:
   1. Delay the stimulation or inhibition of dopamine delivery to find the size of the effective window. This could be done in deafened birds.
   2. For a bird with two harmonic syllables, like those in Figure 6 J), target both syllables, but stimulate or inhibit dopamine delivery only at the time of the second syllable (for example, target the situation when both syllables have low pitches).
The authors could carry out both experiments with the same technology they have already mastered.
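The logic of the first experiment can be illustrated with a toy calculation. This is only a sketch under assumptions of mine, not a model from the paper: I assume that the eligibility trace decays exponentially after the syllable, and the function names, the 50 ms time constant, and the learning rule are all hypothetical. Under those assumptions, the pitch change produced by a delayed dopamine signal shrinks with the delay, so sweeping the delay would trace out the effective window.

```python
import math


def eligibility(delay_ms: float, tau_ms: float) -> float:
    """Fraction of the trace left `delay_ms` after the syllable.

    Assumption: exponential decay with time constant `tau_ms`.
    """
    if delay_ms < 0:
        return 0.0  # dopamine arriving before the syllable finds no trace
    return math.exp(-delay_ms / tau_ms)


def pitch_update(delay_ms: float, dopamine: float,
                 tau_ms: float = 50.0, learning_rate: float = 1.0) -> float:
    """Toy rule: change = learning rate * dopamine signal * remaining trace.

    `dopamine` is positive for stimulation and negative for inhibition.
    The default `tau_ms` is an arbitrary illustrative value.
    """
    return learning_rate * dopamine * eligibility(delay_ms, tau_ms)


if __name__ == "__main__":
    # Sweep the delay, as in the proposed experiment.
    for delay in (0, 25, 50, 100, 200):
        print(delay, round(pitch_update(delay, dopamine=1.0), 3))
```

If the measured pitch change fell off with delay in this way, the fitted time constant would estimate the window size. If it did not depend on delay at all, that would argue against a decaying trace of this kind.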
Just my two cents. Thanks.