Neural correlates of extrinsic and intrinsic outcome processing during learning in individuals with TBI: a pilot investigation

Outcome processing, the ability to learn from feedback, is an important component of adaptive behavior and rehabilitation. Evidence from healthy adults implicates the striatum and dopamine in outcome processing. Animal research shows that damage to dopaminergic pathways in the brain can lead to a disruption of dopamine tone and transmission. Such evidence thus suggests that persons with TBI experience deficits in outcome processing. However, no research has directly investigated outcome processing and its associated neural mechanisms in TBI. Here, we examine outcome processing in individuals with TBI during learning. Given that TBI negatively impacts striatal and dopaminergic systems, we hypothesize that individuals with TBI exhibit deficits in learning from outcomes. To test this hypothesis, individuals with moderate-to-severe TBI and healthy adults were presented with a declarative paired-associate word learning task. Outcomes indicating performance accuracy were presented immediately during task performance, in the form of either monetary or performance-based feedback. The two types of feedback provided the opportunity to test whether extrinsic and intrinsic motivational aspects of outcome presentation play a role during learning and outcome processing. Our results show that individuals with TBI exhibited impaired learning from feedback compared to healthy participants. Additionally, individuals with TBI exhibited increased activation in the striatum during outcome processing. These results suggest that outcome processing and learning from immediate outcomes are impaired in individuals with TBI and might be related to inefficient use of neural resources during task performance, as reflected by increased activation of the striatum.


Introduction
Approximately 69 million people sustain a traumatic brain injury (TBI) each year worldwide (Dewan et al., 2019), with cognitive difficulties being extremely common (Jourdan et al., 2018). Memory deficit, resulting from impaired acquisition, is a long-lasting and primary cognitive symptom experienced by individuals with TBI (Carlozzi et al., 2013; Coste et al., 2015; DeLuca et al., 2000; Langlois et al., 2006; Vanderploeg et al., 2014). A deeper understanding of impaired learning and memory and of their neural mechanisms will lay the foundation needed to develop novel rehabilitation treatments for these deficits (Horowitz et al., 2018; Sandry, 2015). Difficulty learning from action outcomes (performance feedback), a process mediated by the fronto-striatal network, is one underexplored and potentially critical area for understanding the underlying causes of memory problems in TBI.
Knowing the outcomes of one's actions is a critical component of the learning process. Whether positive or negative, feedback allows learners to modify their behavior in order to achieve a specific goal (Schlund & Pace, 1999) and acquire new skills. Positive and negative feedback indicate an action's efficacy and whether it should be continued or discontinued/replaced, respectively. Feedback-based learning largely relies on the fronto-striatal network, with the ventral striatum and the ventromedial prefrontal cortex (VMPFC) playing critical roles (Tricomi & Depasque, 2017). Neuroimaging studies have shown that activity in the striatum increases during positive feedback presentation and decreases during negative feedback presentation. Similar activation patterns are observed during both nondeclarative and declarative learning tasks (Tricomi & Fiez, 2008). Further, this activation pattern has been detected regardless of whether feedback is extrinsic (such as a monetary gain/loss) (e.g. Delgado et al., 2000) or intrinsic (such as positive/negative performance-related feedback) (Tricomi et al., 2006).
The distinction in feedback type (monetary vs. performance-related) is particularly important in the context of rehabilitation. Monetary feedback represents an objective form of feedback where the value of the feedback is clear. This type of feedback has been frequently used in basic cognitive neuroscience investigations (e.g. Delgado et al., 2000; Tricomi et al., 2004). Performance feedback represents a subjective form of feedback and more closely resembles the type of feedback encountered by individuals in a rehabilitation setting. According to the Cognitive Evaluation Theory (Deci et al., 1999), these two types of feedback differentially impact task performance. Under this theory, monetary feedback leads to decreased learning because it undermines one's self-determination and competence, thus reducing intrinsic motivation (i.e. reducing the perception that the activity is inherently rewarding and not dependent on external rewards). Non-monetary performance feedback relies on intrinsic motivation and would not devalue learning, resulting in better performance.
At the neurochemical level, feedback-based learning has been shown to rely on dopamine (Bromberg-Martin et al., 2010; Schultz, 1998), a modulatory neurotransmitter involved in declarative memory formation (Adcock et al., 2006; Wittmann et al., 2005). Individuals with neurological damage and dopaminergic dysfunction show impaired learning from feedback and altered activation of the striatum and associated regions (du Plessis et al., 2018; Griffiths et al., 2014; van der Vegt et al., 2013). For example, progressive cell death of dopaminergic neurons leads to disrupted learning through feedback in individuals with Parkinson's disease (PD). Specifically, performance accuracy decreases when individuals with PD learn through feedback (Foerde & Shohamy, 2011; Shohamy et al., 2004). Individuals with other clinical conditions characterized by dysfunction of the fronto-striatal dopaminergic system (e.g. schizophrenia and depression) also exhibit deficits in learning from feedback (Griffiths et al., 2014; Pujara & Koenigs, 2014).
Animal and human models suggest a similar neurochemical profile may be evident after TBI. Rats given controlled cortical impact injury, a model of closed-head TBI, show altered dopamine neurotransmission (Wagner et al., 2005) and progressive loss of nigrostriatal neurons (Hutson et al., 2011). Relatedly, dopamine transmission after controlled cortical impact improves with the administration of dopamine agonists such as methylphenidate (Wagner et al., 2009a, 2009b) and bromocriptine (Kline et al., 2002). Administration of these types of medications in individuals with TBI also leads to improvements in cognitive functioning (Bales et al., 2009, 2010; Garcia et al., 2011). In line with these findings, behavioral evidence suggests that individuals with TBI demonstrate impaired outcome anticipation (e.g. Freedman et al., 1987) and impaired learning of action-outcome contingencies (Larson et al., 2007; Schlund, 2002; Schlund et al., 2001), both of which rely on the dopaminergic system.
Additional support for striatal dysfunction in TBI comes from structural magnetic resonance imaging (MRI). Persons with TBI show increased diffusivity in the striatum related to impulsive decision-making (i.e. deciding what action to perform to get a desired outcome) (Newcombe et al., 2011), and compromised striatal white matter tracts are linked to executive dysfunction (Shah et al., 2012). Striatal volume is also reduced in individuals with severe TBI, and striatal abnormalities have been reported in mild TBI (Rangaprakash et al., 2017; Tate et al., 2016). This evidence suggests that individuals with TBI may experience negative cognitive symptoms resulting from striatal dysregulation.
Despite strong parallels between PD and TBI that implicate striatum-mediated learning dysfunction (Jenkins et al., 2020) and consistent previous evidence of learning impairments in TBI (DeLuca et al., 2000; Vanderploeg et al., 2014; Wright et al., 2010), no studies have directly examined the functioning of the striatum in individuals with TBI or whether individuals with TBI exhibit deficits in learning through feedback. To address this gap, we investigated (1) whether individuals with TBI are able to learn from feedback and (2) the role the striatum plays in feedback learning. We hypothesized that individuals with TBI would exhibit worse learning through feedback compared to HCs. In addition, we contrasted two alternative neural hypotheses to account for impairments in learning from feedback in TBI. Specifically, compared to HCs, individuals with TBI could exhibit decreased striatal activation during feedback processing. This would be consistent with the PD literature and with findings of the fronto-striatal network being sensitive to task success (DePasque Swanson & Tricomi, 2014; Dobryakova & Tricomi, 2013; Tricomi et al., 2006). Alternatively, compared to HCs, individuals with TBI could exhibit increased striatal activation during feedback processing. This would be consistent with previous studies that observed increased brain activation in neurological populations and suggested that such hyperactivation results from inefficient use of neural resources during task performance (Manning, 2008; Poston et al., 2016; Turner et al., 2011).
To test the above hypotheses, participants engaged in a paired-associate learning paradigm (Dobryakova & Tricomi, 2013; Tricomi & Fiez, 2008), where they first studied word associations outside of the MRI scanner. Next, during functional MRI (fMRI) data acquisition, participants' memory for the paired associates was tested, with either monetary or performance feedback provided for some of the stimuli. This direct feedback manipulation allowed us to investigate the degree to which different feedback types (intrinsic vs. extrinsic) may modulate striatal activation and learning in TBI.

Participants
Seventeen participants with moderate-to-severe TBI and 25 healthy control (HC) participants consented to participate in this pilot study. The Wechsler Abbreviated Scale of Intelligence (WASI) Vocabulary subtest was used to characterize premorbid verbal ability (not collected for 5 participants with TBI and 1 HC; see Table 1 for demographic information). Participants were recruited through advertisements/word of mouth and the TBI Model System (TBIMS) compiled into a database of individuals interested in TBI research. According to the TBIMS, TBI is defined as "damage to brain tissue caused by an external mechanical force, as evidenced by loss of consciousness due to brain trauma, post-traumatic amnesia, skull fracture, or objective neurological findings that can reasonably be attributed to TBI on physical examination or mental status examination". Moderate-to-severe injury severity is defined as post-traumatic amnesia lasting more than 24 h, loss of consciousness lasting more than 30 min, or a Glasgow Coma Scale (GCS) score of less than 13. Participants had no history of psychiatric or neurological diseases aside from TBI and no reported history of substance abuse or dependence. TBI participants were at least 1 year post-injury (chronic stage of TBI) and able to undergo MRI. TBI severity was determined at participants' acute care hospital, as per medical records, based on the duration of loss of consciousness, duration of post-traumatic amnesia, GCS score, and/or positive neuroimaging findings. Diagnosis was made by a medical care provider in an acute hospital setting (n = 7), a medical care provider with specialized knowledge of TBI (n = 9), or a clinical neuropsychologist (n = 1). The research was approved by the Institutional Review Board of Kessler Foundation.

Scan session
Participants received an fMRI scan of the brain conducted on a Siemens Skyra 3 T scanner (see Appendix for details).

Behavioral paradigm
The learning task was adapted from Dobryakova and Tricomi (2013) and consisted of the Study Phase, the Feedback Phase, and the Test Phase (Fig. 1A).

Study phase
Outside of the scanner, participants learned word associations (180 trials). On each trial, participants were presented with three words: a target word with two alternative options underneath (see Appendix for details). One alternative was highlighted in green, indicating that this option was the correct match to the target word. Participants were instructed to memorize the target word and the highlighted paired associate. Each trial was presented for 4 s in random order.

Feedback phase
Study Phase words were randomly assigned to one of three feedback conditions presented during the fMRI scan: Monetary Feedback, Non-Monetary Feedback, and No Feedback. A cue screen was presented with the target word positioned above two choice words. Participants selected the choice word that matched the target word based on what they remembered from the Study Phase. In the Monetary Feedback condition, a correct match resulted in feedback of a green circle with the monetary gain amount written inside of it ($1.00); an incorrect match resulted in feedback of a red circle with a monetary loss amount written inside of it ($0.50). Participants were informed that they would receive a bonus based on their performance in this condition. The Non-Monetary Feedback condition resembled the Monetary Feedback condition, except that no monetary gains or losses were shown. In the No Feedback condition, participants were presented with a blue circle after making their choice. Conditions were randomly intermixed in blocks of 10 trials, starting with a jittered fixation point (1-5 s) that contained a label informing participants of the current condition. This was followed by a cue (4 s) and feedback screen (1 s) (Fig. 1B). Trial order was also randomized.
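The block structure of the Feedback Phase can be sketched in a few lines of Python. This is an illustrative sketch only, not the presentation code used in the study. It assumes that the 180 words are split evenly across the three conditions and that a uniformly jittered fixation precedes every trial; neither detail is stated in the text. The function name `build_feedback_phase` and the dictionary keys are hypothetical.

```python
import random

CONDITIONS = ("Monetary Feedback", "Non-Monetary Feedback", "No Feedback")
BLOCK_SIZE = 10      # trials per block (from the text)
CUE_S = 4.0          # cue duration in seconds (from the text)
FEEDBACK_S = 1.0     # feedback duration in seconds (from the text)


def build_feedback_phase(words, seed=0):
    """Randomly assign words to conditions and group them into blocks.

    Assumption: words are divided evenly among the three conditions and
    the fixation jitter is drawn uniformly from 1-5 s for every trial.
    """
    rng = random.Random(seed)
    words = list(words)
    rng.shuffle(words)  # random assignment of words to conditions
    per_condition = len(words) // len(CONDITIONS)

    blocks = []
    for i, condition in enumerate(CONDITIONS):
        chunk = words[i * per_condition:(i + 1) * per_condition]
        for start in range(0, len(chunk), BLOCK_SIZE):
            trials = [
                {
                    "word": word,
                    "fixation_s": rng.uniform(1.0, 5.0),
                    "cue_s": CUE_S,
                    "feedback_s": FEEDBACK_S,
                }
                for word in chunk[start:start + BLOCK_SIZE]
            ]
            blocks.append({"condition": condition, "trials": trials})

    rng.shuffle(blocks)  # intermix the conditions at the block level
    return blocks
```

With 180 hypothetical words, this yields 18 blocks of 10 trials, 6 blocks per condition.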

Test phase
The final Test Phase occurred outside of the scanner. Studied paired associates were presented again to participants for 4 s each in random order. Participants were asked to select the correct answer. After finishing, all participants were compensated $150 regardless of their performance.

Post-task Questionnaire
After the MRI scan, participants completed a post-experimental questionnaire to evaluate subjective group differences in task engagement, feedback-related affect, and preference for positive and negative feedback. Specifically, we asked (1) whether participants were engaged while performing the task, and (2) how they felt when presented with positive and negative feedback.

Behavioral data
Accuracy served as the main dependent measure (see Appendix for details). Accuracy data from the Feedback and Test Phases were analyzed with a mixed-effects ANOVA with Group (HC, TBI) as a between-subjects factor and Phase (Feedback, Test) and Feedback condition (Monetary Feedback, Non-Monetary Feedback, No Feedback) as within-subjects factors. Additional analyses included a 2 Phase × 3 Feedback condition repeated measures ANOVA, computed independently for each group. We report Cohen's d as an estimate of effect size alongside independent and paired t-test simple comparisons. Analyses were computed using R v3.3.2.
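For readers unfamiliar with the effect size measure, the following Python sketch shows two common ways of computing Cohen's d. The analyses in the study were run in R; this block is only an illustration. The text does not state which variant of d was used for paired comparisons, so the paired version here (mean difference divided by the standard deviation of the differences) is an assumption, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def cohens_d_independent(a, b):
    """Cohen's d for two independent groups, using the pooled SD."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def cohens_d_paired(x, y):
    """Cohen's d for paired scores (assumed variant: mean diff / SD of diffs)."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    return mean(diffs) / stdev(diffs)
```

For example, `cohens_d_independent(hc_accuracy, tbi_accuracy)` would give the between-group effect size for two hypothetical lists of per-participant accuracy scores.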

fMRI data
Standard fMRI data preprocessing and analysis were carried out using FEAT (FMRI Expert Analysis Tool) v6.00 of FSL (see Appendix for preprocessing details).
The second-level analysis, which averaged contrast estimates over runs per subject, was carried out using a fixed effects model by forcing the random effects variance to zero in FLAME (FMRIB's Local Analysis of Mixed Effects) (Siegel et al., 2014; Smith, 2002; Woolrich et al., 2001). Group analysis was carried out using FLAME stage 1. To correct for multiple comparisons, Z statistical images were thresholded using a cluster threshold of z > 2.3 and a corrected cluster significance threshold of p < 0.05 (Worsley, 2001).
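The idea behind a fixed effects combination across runs can be illustrated with the textbook inverse-variance weighted average below. This is a generic sketch of the principle for a single voxel, not FSL's implementation; the function name and inputs are hypothetical, and it assumes each run supplies a contrast estimate together with its within-run variance.

```python
def fixed_effects_average(estimates, variances):
    """Combine per-run contrast estimates with a fixed effects model.

    With the between-run (random effects) variance set to zero, each run
    is weighted only by the inverse of its own within-run variance.
    Returns the combined estimate and its variance.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    combined = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    return combined, 1.0 / total_weight
```

When all runs have equal variance, the result reduces to the simple mean of the run estimates.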
To address the main research question and examine the role of the striatum during feedback learning in individuals with TBI, we first examined whether there are differences between groups during feedback presentation and performed a contrast that collapsed across feedback type (i.

Behavioral accuracy
TBI participants performed worse and did not learn from feedback compared to the HC group, as indicated by the Feedback condition × Group interaction that approached significance, F(2, 80) = 2.99, p = 0.056 (Fig. 2). The Feedback condition main effect, F(2, 80) = 4.00, p = 0.022, and the Feedback condition × Phase interaction, F(2, 80) = 4.50, p = 0.014, were also significant. No other main effects or interactions were significant. Follow-up simple comparisons of the interaction, collapsing across Phase, also approached significance and exhibited large effect sizes, revealing that the HCs performed better than TBI participants when We further evaluated performance for each group independently using a 2 Phase × 3 Feedback condition repeated measures ANOVA to better understand nuances of the data pattern as a function of Group. For the HCs, the interaction between Phase and Feedback condition was significant, F(2, 48) = 3.70, p = 0.03 (Fig. 3-top).

fMRI results
To determine the regions sensitive to feedback processing during learning in individuals with TBI vs. HC, we first performed a contrast to identify voxels sensitive to positive feedback presentation. That is, while collapsing across feedback conditions, positive feedback presentation was  Since we had a strong a priori hypothesis focused specifically on the striatum (Dobryakova & Tricomi, 2013; Tricomi & Fiez, 2008), we performed a between-group comparison for Monetary and Non-Monetary Feedback conditions on voxels of the right and left anterior caudate (cluster extent of 448 and 40 voxels, respectively) showing sensitivity to positive feedback presentation. Significant differences between the HC and TBI groups were detected bilaterally. Participants with TBI showed greater activation in the right anterior caudate (t(40) = 3.07, p < 0.005) and the left anterior caudate (t(40) = 2.48, p < 0.05) when processing feedback (Fig. 4b, c).
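The reported degrees of freedom, t(40), match an equal-variance independent-samples t-test on 17 TBI and 25 HC participants (17 + 25 - 2 = 40). The Python sketch below shows that computation for two lists of per-participant values. It assumes the pooled-variance (Student) form, which the text implies through its degrees of freedom but does not state; the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance.

    Returns (t, df), where df = n1 + n2 - 2.
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t, df
```

Calling `pooled_t(tbi_values, hc_values)` on hypothetical region-of-interest estimates for the two groups returns a positive t when the TBI mean is larger.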

Post-task Questionnaire
See Appendix for details.

Discussion
Learning impairment is one of the most commonly observed disabilities in individuals with TBI and is identified as a major component in the breakdown of the acquisition process that facilitates memory (DeLuca et al., 2000; Vakil, 2005). One prerequisite for many rehabilitation interventions is for patients to be able to learn a rehabilitation strategy after receiving feedback from a therapist. Implicit in this assumption is that individuals with TBI are able to process the outcomes of their actions (i.e. feedback) so that they can learn from them. This is a requirement for cognitive rehabilitation as well as motor rehabilitation, because individuals rely on feedback or outcomes before an action becomes automatic. Inability to learn from feedback can thus exert a significant negative impact on both physical and cognitive rehabilitation after TBI. The inability to learn from feedback in TBI may contribute to perseveration of incorrect actions and/or inefficacy of compensatory strategy use. Perseveration, or the inability to stop repetition of a specific behavior, is a common consequence of TBI and is part of the inhibition and impulse control symptomatology in TBI (Selzer et al., 2006). In the current task, it can manifest as selecting an incorrect response even after the receipt of negative feedback. Successful acquisition of compensatory strategies during rehabilitation also requires abandoning a strategy that is not working and switching to a strategy that results in positive outcomes. The current findings also have implications for scenarios where a therapist is involved, as the findings potentially suggest that individuals with TBI are not able to fully utilize the feedback provided by a therapist during rehabilitation in order to learn a compensatory strategy.
In the current study, we directly examined whether individuals with TBI exhibit learning deficits due to impaired outcome processing by presenting participants with TBI and HCs with a paired-associate word learning task and different types of feedback intended to improve their learning. Our data show that participants with moderate-to-severe TBI exhibit deficits in learning through feedback compared to HCs. Consistent with past research, learning in the HC group improved with non-monetary (subjective) feedback, with a similar but non-significant pattern of performance for monetary (objective) feedback (Albrecht et al., 2014; Daniel & Pollmann, 2010; Gawlowska et al., 2017). This was not the case in participants with TBI, whose learning did not improve regardless of whether monetary or non-monetary feedback was presented. These results suggest that, while individuals with TBI might be able to acquire information as well as HCs in this paired-associate learning paradigm (as we did not observe group differences in accuracy during the Feedback Phase), feedback does not improve their learning and memory beyond what participants originally learned (i.e. no improvements in accuracy at Test Phase in the TBI group). These data parallel the findings from individuals with PD, who also show deficits in learning from immediate feedback compared to HCs (Foerde et al., 2013).
Our analysis of neuroimaging data revealed greater activation in the TBI group vs. the HC group in several cortical brain regions as well as in the striatum. Such 'overactivity' can be interpreted as disinhibition or as a reflection of increased cognitive demands (Price & Friston, 1999) that individuals with TBI experience during task performance. Such findings are consistent with previous investigations that interpret increased brain activity in other neurological populations (Manning, 2008; Poston et al., 2016; Turner et al., 2011) as well as in TBI (Hillary et al., 2014) as inefficient use of neural resources during task performance.
The functioning of the striatum was of particular relevance in the current study. Contrary to previous findings in the action-outcome literature, in which increased striatal activation accompanies improvements in learning (Dobryakova & Tricomi, 2013; Hiebert et al., 2014; O'Doherty, 2004), we observed increased striatal activation and poor learning in the TBI group compared to HCs. However, greater striatal activation to feedback presentation in the TBI group might reflect the rewarding nature of feedback rather than the learning signal in the striatum. Receiving feedback has been shown to be as rewarding as receiving reinforcers such as money (DePasque & Tricomi, 2015; Tricomi et al., 2006). Indeed, most TBI participants indicated that they felt very happy when presented with positive feedback (see Appendix). Thus, increased activation of the striatum in the absence of learning might suggest that, while the striatum does not carry the learning signal, feedback presentation is still rewarding to individuals with TBI. Increased activation of the striatum may also indicate its role in maintaining the representation of the rewards' value throughout the duration of the task (Beck et al., 2010).
The insula and the anterior cingulate cortex (ACC), which also showed increased activation in the TBI vs. HC groups in the current study, have been implicated in a variety of goal-directed tasks as well as in tasks that require mental effort (Engström et al., 2014) and reward processing (Heilbronner & Hayden, 2016). Activation of these regions in the current study might suggest that, compared to HC participants, participants with TBI exerted more cognitive effort (Spirou et al., 2018) to process immediate positive feedback and to hold on to the information depicted by positive feedback. This interpretation aligns well with other studies showing ACC involvement in information retention and its role in storing information that is required for appropriate action performance (Heilbronner & Hayden, 2016). The insula is frequently coactivated with the ACC and the medial prefrontal cortex (which also showed increased activation in the present study) during studies involving goal-directed behavior and has reciprocal anatomical connections with the ACC as well as the striatum (Gogolla, 2017; Haber, 2011; Haber et al., 2000). The insula is often affected by neurological and psychiatric disorders and is implicated in a variety of symptoms as well as dopamine-dependent learning and memory (Gogolla, 2017).
Given the importance of outcome processing during learning and rehabilitation for individuals with TBI, future studies should examine whether feedback processing can be improved through normalizing striatal activation. Previous studies in non-TBI individuals suggest that striatal activation is malleable to "normalization" (Dahlin et al., 2008; Gavelin et al., 2017). Similarly, the animal literature suggests that pharmacological interventions can normalize striatal functioning through administration of dopamine agonists (Krishna et al., 2020; Wagner et al., 2009a, 2009b).

Limitations
The current study provides important new evidence about learning through feedback and mechanisms involved in outcome processing in TBI. However, it has several limitations. While the effect sizes were large in this study, future replication in a larger sample is necessary to verify the statistical patterns that were observed. One additional limitation is that WASI Vocabulary scores were unavailable for a small subset of participants. It is possible that premorbid verbal ability differed between groups and that this was not captured as a result of missing data. Further, the Monetary Feedback condition in the current task reflects not only performance feedback but also carries an objective value. While previous investigations used a similar design (Daniel & Pollmann, 2010) and we did not observe differences in brain activity between Monetary and Non-Monetary Feedback conditions, future studies should utilize an experimental task that separates the value and the learning signal to further explain striatal activation in the absence of improved learning in TBI.

Conclusions
The current pilot study examined learning through feedback in individuals with TBI and associated brain activity. Results revealed impairments during learning through feedback in individuals with TBI that are similar to previously documented findings in PD (Foerde et al., 2013). In conjunction with poor learning from feedback, we also observed increased brain activation that might suggest inefficient use of neural resources during task performance in individuals with TBI. Such impairment may contribute to observable memory disability and poor rehabilitation outcomes in individuals with TBI. Moreover, neural and cognitive mechanisms of impaired feedback learning may serve as a modifiable treatment target.

Behavioral paradigm
The words used in the experiment contained 4-8 letters and 1-2 syllables, had Kucera-Francis frequencies of 20-650 words per million, and had high imageability ratings (score of over 400 according to the MRC database) (Coltheart, 1981). The words were matched for word length and frequency at the trial level. Words presented on the same trial were not semantically related (a score of less than 0.2 on the Latent Semantic Analysis similarity matrix; Landauer et al., 1998), did not rhyme, and did not begin with the same letter.

Behavioral data
Outliers were operationalized as trials with response times shorter than 300 ms, as such short responses likely indicated anticipation errors. This procedure resulted in removal of 2.43% (Feedback Phase) and 4.74% (Test Phase) of trials. A programming error resulted in a few stimulus items in the Feedback Phase being redrawn, leading to some participants (N = 16) viewing the same word twice. These trials were discarded from all analyses (6.03% of all trials) and the error was corrected for the remaining participants.
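The response-time exclusion rule can be expressed as a short filter. The Python sketch below is illustrative only; the trial representation (a dictionary with an `rt_ms` key) and the function name are assumptions, while the 300 ms cutoff comes from the text.

```python
def drop_anticipations(trials, min_rt_ms=300):
    """Remove trials with response times shorter than min_rt_ms.

    Returns the retained trials and the percentage of trials removed.
    """
    kept = [t for t in trials if t["rt_ms"] >= min_rt_ms]
    if not trials:
        return kept, 0.0
    pct_removed = 100.0 * (len(trials) - len(kept)) / len(trials)
    return kept, pct_removed
```

A response time of exactly 300 ms is retained, because only responses shorter than the cutoff are treated as anticipation errors.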

fMRI data: preprocessing
Registration of the functional data to the high-resolution structural data was performed using the boundary-based registration method (Greve & Fischl, 2009). High-resolution structural to MNI 2 mm standard space registration was performed using FLIRT and further refined with FNIRT nonlinear registration (Greve & Fischl, 2009; Jenkinson et al., 2002). The following pre-statistics processing steps were applied: motion correction using MCFLIRT (Jenkinson et al., 2002), slice-timing correction using Fourier-space time-series phase-shifting, non-brain removal using BET (Smith, 2002), spatial smoothing using a Gaussian kernel of FWHM 6.0 mm, grand-mean intensity normalization of the entire 4D dataset by a single multiplicative factor, and high-pass temporal filtering (Gaussian-weighted least-squares straight line fitting, with sigma = 45.0 s). Time-series statistical analysis was carried out using FILM with local autocorrelation correction (Woolrich et al., 2001). The time series model included regressors corresponding to the 1 s time period of feedback presentation convolved with the double-gamma hemodynamic response function, and their temporal derivatives. A separate regressor was used for each high-motion TR, determined using fsl_motion_outliers according to FD > 0.09 (Siegel et al., 2014). Extended motion parameters were included as regressors of no interest (18 regressors: the 6 original motion parameters, their derivatives, and the squares of the derivatives).
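The nuisance regressors described above can be illustrated with the following Python sketch. It is not the pipeline used in the study (which relied on FSL tools); it only shows how 6 motion parameters expand into 18 columns and how one indicator regressor per high-motion TR can be built from a framewise displacement (FD) series. The function names are hypothetical, the derivative is approximated as a backward difference with zeros for the first volume (an assumption), and the FD values are taken as already computed.

```python
def expand_motion(params):
    """Expand 6 motion parameters per TR into 18 nuisance columns.

    Columns: 6 original parameters, 6 temporal derivatives (backward
    differences, zero for the first TR), and 6 squared derivatives.
    """
    expanded = []
    previous = None
    for row in params:
        if previous is None:
            deriv = [0.0] * len(row)
        else:
            deriv = [cur - prev for cur, prev in zip(row, previous)]
        expanded.append(list(row) + deriv + [d * d for d in deriv])
        previous = row
    return expanded


def spike_regressors(fd, threshold=0.09):
    """Build one indicator regressor per TR whose FD exceeds the threshold.

    The default threshold is the value reported in the text.
    """
    flagged = [i for i, value in enumerate(fd) if value > threshold]
    return [[1 if t == i else 0 for t in range(len(fd))] for i in flagged]
```

Each spike regressor is zero everywhere except at its flagged TR, which effectively removes that volume's influence on the model fit.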

Post-task Questionnaire
The post-task questionnaire that participants completed after the MRI session served as a manipulation check. On the post-task questionnaire, 10 out of 17 participants with TBI indicated that they were engaged while performing the task. More than half indicated feeling happy when presented with positive feedback (12 out of 17) and unhappy when presented with negative feedback (11 out of 17). Among HC participants, 20 out of 25 indicated that they were engaged while performing the task. Most indicated feeling happy when presented with positive feedback (22 out of 25) and unhappy when presented with negative feedback (16 out of 25). These results demonstrate that the manipulation was effective: positive feedback was perceived as more rewarding and negative feedback was perceived as more punishing/aversive to participants.

Author contributions
ED: conceptualization, fMRI data analysis, original manuscript draft preparation; SZ: editing, behavioral data analysis, table and figure preparation; JS: behavioral data analysis, figure preparation, manuscript preparation and editing.

Conflict of interest
The authors report no conflict of interest.

Ethical approval
The research was approved by the Institutional Review Board of Kessler Foundation.

Consent to participate
All participants consented to participate in this research.

Consent for publication
All authors consent for publication of this material.