Prefrontal modulation of midbrain dopamine systems during navigation-based decision tasks
Jo, Yong Sang
Midbrain dopamine (DA) systems are central to reinforcement learning. DA cells phasically encode discrepancies between expected and received rewards. These reward prediction errors may be used as a teaching signal by other brain regions for learning reward-directed behavior. Computing a prediction error requires an estimate of the value of future rewards in a given situation or state. The literature suggests that the prefrontal cortex (PFC) represents reward expectancy in associative learning, so the PFC may be one of the critical structures that convey expected reward values to DA cells for signaling prediction errors. To address this possibility, the current dissertation investigated whether temporary inactivation of each of two major PFC subregions, the orbitofrontal cortex (OFC) and the medial PFC (mPFC), disrupted DAergic prediction errors in the ventral tegmental area (VTA) while rats performed a delay-based decision-making task on a T-maze. Significant alterations in the firing of both DA and non-DA cells after OFC inactivation indicate that the OFC provides the VTA with value signals in the task. In contrast, mPFC inactivation also induced significant changes in DA activity, but non-DA cells remained unaltered. These results suggest that the mPFC modulates DAergic prediction errors by conveying temporal information about time delays rather than value signals. Thus, the OFC, but not the mPFC, is the major prefrontal source of expected reward values to the VTA in the task.
Whether the rodent OFC encodes value signals or state information about the current choice and its consequence has recently become controversial. To examine the nature of neural representations in the OFC, single-unit activity was recorded directly from the OFC of rats performing the same delay discounting task as in the prior experiments. Different groups of OFC cells showed excitatory responses to a series of task-relevant events and periods. In each state, individual neurons signaled a preferred reward condition by firing more strongly than in the other reward conditions. In addition, their population activity reflected outcome values evaluated in each task state. These results provide compelling evidence that the OFC represents specific outcome information at the level of individual neurons and relative outcome values at the population level.
It has long been suggested that the midbrain reticular formation (MRF) encodes motivational value in expectation of future rewards. To determine its functional role in delay-based decision making, single-unit activity was recorded from the MRF during the same delay discounting task. Consistent with previous reports, a large number of MRF cells signaled information about expected rewards during waiting periods by continuously ramping their firing to different levels depending on the size of the upcoming reward. Thus, given the direct projections from the MRF to the VTA, it is likely that the MRF is another source of value signals to midbrain DA systems.
- Psychology