The present study systematically examined the impact of emotional facial expressions during initial encounters on recognition following a retention interval of > 1.5 years. In line with our hypothesis, we found evidence that individuals recognized threatening faces, particularly angry and fearful ones, better than non-threatening faces following the long-term retention interval. Moreover, the expression-specific recognition advantage was not present directly after encoding, and the long-term advantage of threatening faces was driven by decreased recognition of non-threatening faces over the retention period. Multivariate analyses further supported this finding by showing a separation according to emotional facial expression following the long-term retention interval but not during immediate recognition. Interestingly, the expression-specific face recognition pattern exhibited considerable inter-subject variation, and a data-driven ISC analysis demonstrated that approximately half of the subjects showed discriminative emotional face representation on the behavioral level (discriminators) while the other half did not (non-discriminators). Examination of neural activation differences during encoding revealed that differential activation in response to threatening vs non-threatening faces in the bilateral IOG and vmPFC/mOFC at the whole-brain level, and in the left FFA at the ROI level, was associated with individual differences in the threatening vs non-threatening face recognition pattern after a delay of > 1.5 years. A two-sample t-test on the subgroups of subjects separated by the significance of their ISC scores further confirmed differential neural activity in the medial prefrontal and occipito-temporal cortex during encoding.
Together, the present findings demonstrate that threatening expressions during incidental encounters might facilitate long-term face recognition and that differential encoding in the occipito-temporal cortex and vmPFC/mOFC might contribute to expression-associated differences in recognition patterns.
In contrast to previous studies focusing on subsequent emotional face memory using relatively short retention intervals ranging from minutes to weeks and univariate approaches (e.g., Anderson et al., 2006; Wang, 2013; Xiu et al., 2015), our study examined the enhanced emotional expression effect on face recognition after an extensive delay of at least 1.5 years with both univariate and multivariate approaches. The enhanced/discriminative memory for faces with threatening expressions (i.e., angry/fearful) relative to faces with non-threatening expressions (i.e., happy/neutral) in the present study is consistent with previous findings showing that recognition and recollection of negative (e.g., fearful) faces were better than those of neutral faces after a 24-h delay (Wang, 2013). Importantly, our study extends previous findings by demonstrating that the year-long emotional enhancement effect on memory is not limited to emotional scenes (Dolcos et al., 2005; Erk et al., 2010; Gavazzeni et al., 2012). No differences between facial expression conditions were observed in the immediate test, which is consistent with some previous studies showing a lack of emotional facial expression modulation of immediate memory (Anderson et al., 2006; Grady et al., 2007; Satterthwaite et al., 2009; Xiu et al., 2015). However, other studies reported better memory for negative faces during immediate face recognition (Grady et al., 2007; Pinabiaux et al., 2013; Wang, 2013). Methodological differences between experiments may account for this discrepancy. For example, the specific emotional expressions tested and the number of face stimuli per emotional category vary between studies (Anderson et al., 2006; Xiu et al., 2015; but see Grady et al., 2007; Wang, 2013). In addition, previous studies tested only small numbers of subjects (N < 50), which may have resulted in a lack of statistical power.
Moreover, the mixed findings from previous studies might also be attributable to the heterogeneity of sample characteristics such as age and gender, which have been reported to be related to general face recognition memory (e.g., Grady et al., 1995; Sommer et al., 2013). Additionally, different analytical approaches may have contributed to the divergent results, such that previous studies generally employed univariate analytic approaches which might cause a loss of information on response variability and individual differences (Miendlarzewska et al., 2018; Tian et al., 2019). Indeed, the individual behavioral response pattern (Fig. 2A) indicated individual differences in emotional expression representation, consistent with individual variations that have been reported in face recognition (Tardif et al., 2019; Wang et al., 2012), emotional expression processing (Calder and Young, 2005; Le Grand et al., 2006) and threat-related information processing (Liu et al., 2021). Our study overcame some of these limitations by incorporating data-driven multivariate analyses to examine the emotional expression effect in a large, gender-balanced sample of young adults.
The time-dependent effects of emotional expression on recognition found in our study (i.e., enhanced emotional memory after a long-term delay but not immediately after encoding) are in line with prior studies using non-face emotional stimuli such as words or scenes (Sharot and Phelps, 2004; Sharot and Yonelinas, 2008), which reported enhanced recognition of negative compared to neutral stimuli after a 24-h delay but not immediately after encoding, with recognition of neutral stimuli decreasing over time while recognition of negative stimuli remained stable across the 24-h retention interval. The present observation might point to a role of emotional expression-specific consolidation of face stimuli, similar to other emotional materials (Cahill and McGaugh, 1998; Talmi, 2013; Yonelinas and Ritchey, 2015). From an evolutionary perspective, maintaining recognition of threatening faces over long intervals may represent an adaptive and survival-relevant mechanism (Staugaard, 2010), whereas faces with non-threatening expressions are of lower significance for future encounters and thus prone to being forgotten over time (Dunsmoor et al., 2015). Although emotion may have beneficial effects on immediate recognition, perhaps due to emotion-associated enhanced selective attention during perception or encoding (Feldmann-Wüstefeld et al., 2011; Gable and Harmon-Jones, 2012; Vuilleumier, 2002), the present findings did not support this view but rather reflected an indirect effect on consolidation, showing an increased memory advantage for threatening faces after a long retention interval.
Additional control analyses suggest that the long-term memory-enhancing effect of threatening facial expressions cannot be completely explained by factors such as an expression-specific tendency to classify items as previously seen or by arousal at encoding. While an expression-specific variation in false alarm rates was observed at both immediate and delayed recall, false alarm rates were similar across the emotional expressions when variations in the retention interval were controlled for, and false alarm rates for threatening faces were at chance level in the delayed recognition test. Previous studies investigating the emotional memory advantage emphasized the role of arousal (Bradley et al., 1992; Hamann, 2001; LaBar and Phelps, 1998; Sharot and Phelps, 2004). Indeed, arousal may bias processing toward salient information that gains processing priority and thus contribute to enhanced consolidation (Mickley Steinmetz et al., 2012; Ritchey et al., 2008). However, the enhanced memory for threatening faces in our study is unlikely to be explained by arousal, based on three lines of evidence. First, a moderation analysis confirmed the lack of arousal effects by showing a non-significant interaction of facial expression category by arousal on memory performance. Second, arousal ratings for each expression at encoding did not differ significantly between discriminators and non-discriminators (see Supplementary Results). Third, on the neural level, we did not observe significant activity in brain regions that have been shown to be engaged in processing emotional arousal, such as the amygdala (Adolphs et al., 1999), or in unspecific salience processing, such as the insula (Menon and Uddin, 2010). Together, this suggests that the findings of the present study are unlikely to reflect unspecific arousal or salience effects of emotional stimuli.
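The logic of such a moderation analysis can be illustrated with a minimal sketch: regress memory performance on expression category, arousal, and their interaction, and inspect the interaction coefficient. This is not the study's actual pipeline; the simulated data, sample size and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 120  # hypothetical number of subjects

# Hypothetical per-subject measures (illustration only, not the study's data)
threat = rng.integers(0, 2, n)                        # 1 = threatening, 0 = non-threatening
arousal = rng.normal(5.0, 1.5, n)                     # encoding arousal rating
memory = 0.6 + 0.1 * threat + rng.normal(0, 0.1, n)   # recognition accuracy (no true interaction)

# Design matrix: intercept, expression category, centered arousal, interaction
arousal_c = arousal - arousal.mean()                  # centering eases interpretation
X = np.column_stack([np.ones(n), threat, arousal_c, threat * arousal_c])
beta, *_ = np.linalg.lstsq(X, memory, rcond=None)

# beta[3] is the moderation (interaction) term; a coefficient near zero, as
# reported in the study, indicates arousal does not moderate the expression effect
print(beta)
```

In a full analysis the interaction coefficient would of course be accompanied by a significance test; the sketch only shows where the moderation effect lives in the model.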
The individual differences in the emotional face recognition pattern not only substantiated the multivariate PCA but also allowed us to capture the corresponding neural correlates by employing individual difference approaches. With a multivariate measure capturing individual differences based on BPSA, we observed that subjects with larger recognition differences between threatening and non-threatening faces exhibited lower activity in response to threatening (i.e., angry/fearful) versus non-threatening (i.e., happy/neutral) faces in the bilateral IOG and vmPFC/mOFC. ROI analysis further suggested that the left FFA was also engaged in the long-term emotional expression effects on memory. These identified regions align with the previously reported face network and resemble previous findings on emotional face recognition and emotion-related learning (Kumfor et al., 2013; LaBar and Cabeza, 2006; Rolls, 2019; Sergerie et al., 2005; Xiu et al., 2015). The IOG and FFA are two key regions of the core face network (CFN) which typically support face identity recognition (Haxby et al., 2000; Liu et al., 2002; Tsantani et al., 2021) but are also involved in processing emotional facial expressions (Cohen Kadosh et al., 2010; Harry et al., 2013). The OFC is part of the extended face network (EFN), which processes social-emotional aspects of faces, specifically value-based emotion and reward information (Ishai, 2008; O’Doherty et al., 2003). Lesion studies have demonstrated that IOG and OFC lesions impair emotional memory and learning (Calder and Young, 2005; Kumfor et al., 2013; Rossion et al., 2003). Damage to the inferior fronto-occipital fasciculus (IFOF), which connects the occipital cortex to the OFC, has moreover been reported to predict impaired recognition of facial emotions including anger, fear and sadness (Philippi et al., 2009).
Our findings suggest a behavioral significance of encoding-related activity in these regions in predicting long-term emotional face memory patterns. Previous studies correlating neural activation during fMRI with subsequent face (neutral expression) recognition performance, using independent task paradigms and stimuli, have shown that higher face selectivity in the FFA and IOG was associated with better face recognition ability in both normal subjects (Huang et al., 2014) and individuals with developmental prosopagnosia (Liu et al., 2021), suggesting that the face-selective response in these regions contributes to individual differences in face recognition. However, these studies did not permit an investigation of encoding-related activity in predicting subsequent memory. One study that used an individual difference approach to examine encoding-related neural correlates of emotional face memory reported that the connectivity from the IOG to the OFC was negatively correlated with immediate subsequent memory performance for faces with different emotional expressions (i.e., negative, neutral and positive, respectively) (Xiu et al., 2015), which aligns with the neural substrates of the long-term emotional face memory pattern identified in the present study.
Interestingly, the present results indicated a negative correlation between activity during encoding and the long-term emotional face representation pattern, with higher activity for angry/fearful vs happy/neutral faces being associated with less discriminative recognition performance between faces with different emotional expressions (i.e., more similar performance for threatening vs non-threatening faces). Group comparison based on the similarity index (discriminators vs. non-discriminators) further revealed distinct mechanisms in the IOG and vmPFC/mOFC: the threatening vs non-threatening activation difference in the IOG was driven by an enhanced neural response to threatening faces in non-discriminators, whereas the corresponding difference in the vmPFC/mOFC resulted from a suppressed neural response to threatening faces in discriminators. Previous studies have shown that effective connectivity from the IOG to the OFC is negatively correlated with immediate memory performance for emotional faces (Xiu et al., 2015), suggesting an emotion-specific modulation of the face network. The distinct roles of the IOG and vmPFC/mOFC in predicting long-term emotional face memory, and changes in the trajectory over longer retention intervals, thus remain to be explored in future studies.
The lack of an amygdala effect is consistent with the behavioral results suggesting no modulation of memory performance by arousal and might be partly explained by previous observations that the amygdala responds equally to positive, negative and neutral faces during encoding (Adolphs, 2010; Ball et al., 2009). Although previous studies have indicated that amygdala activation at encoding predicted greater subsequent memory for emotional compared to neutral scenes after a 1-year delay (Dolcos et al., 2005), those findings are based on the neural activity for remembered versus forgotten items for emotional versus neutral scene pictures. Likewise, the present study did not reveal a role of the insula in emotional face memory, suggesting that the emotional face memory effect was not due to higher externally guided attention toward emotional faces during encoding (Yao et al., 2018).
The findings of the present study need to be considered in the context of the strengths and limitations of the study design. First, we used multivariate PCA and BPSA on long-term emotional face recognition performance in a large sample, which differentiates our study from most previous ones comparing experimental conditions by averaging across all stimulus items. The univariate method relies on the assumption that all stimulus items within each condition are homogeneous and that performance differences between items reflect noise fluctuations. The data-driven multivariate PCA allowed us to detect hidden patterns in the confidence ratings in a hypothesis-free manner by capitalizing on the trial-wise responses, which not only complemented the traditional univariate analysis but also informed the fMRI analysis using an individual difference approach. Although the first component retained only 12.4% of the original variance, which might be due to the noise associated with the long retention interval, the PCA did generate separate clusters (see Fig. 3B) and the PC1 separation was statistically significant according to a non-parametric trustworthiness test, suggesting that this approach successfully identified distinguishable latent groups of facial expressions based on the recognition confidence ratings. More importantly, the extracted PC1 scores, representing a discriminative variability that accounted for the facial expression categories, enabled the BPSA by correlating the PC1 scores with each subject’s confidence rating pattern. This multivariate method allowed us for the first time to capture individual differences in emotional face representation patterns by relating the face representation (i.e., multi-item confidence rating pattern) of each subject to a group-level emotion-specific discriminative memory pattern (indexed by the PC1 scores).
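The core of such a trial-wise PCA can be sketched in a few lines: organize confidence ratings as a face-items-by-subjects matrix, extract PC1, and check whether the item scores separate along expression category. This is a minimal illustration with simulated data, not the study's actual analysis; matrix sizes and the built-in group structure are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_faces, n_subjects = 40, 30  # hypothetical sizes

# Hypothetical trial-wise confidence ratings (face items x subjects); the first
# 20 "threatening" items receive higher ratings from a subset of subjects,
# planting a latent expression-category structure for illustration
ratings = rng.normal(2.0, 0.5, (n_faces, n_subjects))
ratings[:20, :15] += 0.8

# PCA via SVD on the column-centered matrix; each face item is one observation
centered = ratings - ratings.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
pc1_scores = U[:, 0] * S[0]              # one PC1 score per face item
explained = S[0] ** 2 / np.sum(S ** 2)   # fraction of variance retained by PC1

# Items from the two planted categories should separate along PC1
print(explained, pc1_scores[:20].mean(), pc1_scores[20:].mean())
```

In the study, the separation along PC1 was additionally validated with a non-parametric trustworthiness test; the sketch stops at inspecting the group means of the item scores.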
Importantly, the present BPSA took the principal component score pattern as the response template in the ISC analysis, whereas previous studies used the mean score pattern as a template (Tian et al., 2020). Thus, our method provides a novel way to distinguish memory performance for faces across different emotional expression conditions and has the potential to support a broader examination of multidimensional data (e.g., behavioral, genetic or fMRI data) comprising complex conditions. Moreover, by calculating the association between individual feature patterns and a pattern template generated by unsupervised dimensionality reduction (e.g., PCA, minimum curvilinear embedding; Miendlarzewska et al., 2018), one can characterize individual differences in the representation pattern and associate them with the underlying neural systems.
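The template-based ISC step can likewise be sketched: correlate each subject's item-wise rating pattern with a group-level template (here a simplified stand-in for the PC1 score pattern) and label subjects whose correlation survives a permutation test as discriminators. The data, subject split and permutation count below are hypothetical, for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n_faces = 40  # hypothetical number of face items

# Simplified stand-in for the PC1 template: positive scores for one expression
# category, negative for the other
template = np.repeat([1.0, -1.0], n_faces // 2)

# Hypothetical subjects: 15 whose ratings track the template, 15 who do not
followers = rng.normal(0, 0.5, (15, n_faces)) + template
non_followers = rng.normal(0, 0.5, (15, n_faces))
ratings = np.vstack([followers, non_followers])

# BPSA-style ISC: one template correlation per subject
isc = np.array([np.corrcoef(r, template)[0, 1] for r in ratings])

def perm_p(r, template, n_perm=1000):
    """One-sided permutation p-value: shuffle the template to build a null."""
    obs = np.corrcoef(r, template)[0, 1]
    null = np.array([np.corrcoef(r, rng.permutation(template))[0, 1]
                     for _ in range(n_perm)])
    return (np.sum(null >= obs) + 1) / (n_perm + 1)

p_vals = np.array([perm_p(r, template) for r in ratings])
discriminators = p_vals < 0.05  # boolean subject labels
print(discriminators.sum())
```

The same correlate-against-template logic extends directly to other multidimensional data (e.g., fMRI activation patterns), which is what makes the approach broadly applicable.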
On the other hand, the comparably low number of hits per expression condition after the long retention interval did not permit a reliable fMRI analysis of subsequent memory effects comparing remembered versus non-remembered items as employed in previous studies (e.g., Becker et al., 2017; Dolcos et al., 2005). Future studies aiming to determine facial expression effects over very long retention intervals may enhance the processing depth of the stimuli or increase the number of stimuli per expression condition as well as the number of subjects (Steele et al., 2016) to further explore the neural basis underlying long-term emotional face memory. Moreover, to examine the long-term memory effect with higher sensitivity, a dimensional confidence rating was used at the long-term retention interval whereas the immediate recognition test implemented a forced-choice response. Although previous studies used a similar approach to convert dimensional confidence ratings into binary responses (Weymar et al., 2011; Xiu et al., 2015), the results of direct comparisons between immediate and delayed recognition should be interpreted with caution. Furthermore, a previous study investigating the immediate recognition of neutral faces that had previously been presented with threatening (angry or fearful) or non-threatening (happy or sad) expressions found no differences in recognizing neutral faces initially shown with threatening versus non-threatening expressions (Satterthwaite et al., 2009), suggesting that the “first impression” created by a facial expression did not affect subsequent recognition of face identity. However, whether such a “first impression” affects long-term face identity recognition despite a neutral expression at recognition remains to be addressed in future studies.
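The conversion of dimensional confidence ratings into binary old/new responses mentioned above amounts to a midpoint split. A minimal sketch, assuming a hypothetical 1-6 scale (1 = "sure new" to 6 = "sure old"), which may differ from the scale actually used:

```python
def binarize(rating, midpoint=3.5):
    """Ratings above the scale midpoint count as an 'old' response,
    the rest as 'new' (hypothetical 1-6 confidence scale)."""
    return "old" if rating > midpoint else "new"

ratings = [1, 2, 3, 4, 5, 6]
responses = [binarize(r) for r in ratings]
print(responses)  # ['new', 'new', 'new', 'old', 'old', 'old']
```

The split discards confidence information, which is one reason direct comparisons between the forced-choice immediate test and the binarized delayed test warrant caution.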
Notably, on the behavioral level no differences were observed during immediate recognition, suggesting that differential encoding-related activation in these regions may be associated with individual variation during memory consolidation. Further studies are needed to explore the neural basis of consolidation to provide a more comprehensive understanding of the formation of long-term emotional face memory. Moreover, given the exploratory nature of the fMRI analysis in the present study, we focused on univariate fMRI analyses to uncover the neural substrates of the behavior. A parallel ISC analysis of activation pattern similarity, with a suitable experimental design, is highly recommended for future studies to offer additional insights into the neural basis of emotional expression effects on face memory (e.g., Nastase et al., 2019; Tian et al., 2020).