Medial prefrontal and occipito-temporal activity at encoding determines enhanced recognition of threatening faces after 1.5 years

Studies have demonstrated that faces with threatening emotional expressions are better remembered than non-threatening faces. However, whether this memory advantage persists over years, and which neural systems underlie such an effect, remains unknown. Here, we employed an individual difference approach to examine whether neural activity during incidental encoding was associated with differential recognition of faces with emotional expressions (angry, fearful, happy, sad and neutral) after a retention interval of > 1.5 years (N = 89). Behaviorally, we found better recognition of threatening (angry, fearful) versus non-threatening (happy and neutral) faces after a delay of > 1.5 years, which was driven by forgetting of non-threatening faces relative to immediate recognition after encoding. A multivariate principal component analysis (PCA) of the behavioral responses further confirmed the discriminative recognition performance between threatening and non-threatening faces. A voxel-wise whole-brain analysis of the functional magnetic resonance imaging (fMRI) data acquired during incidental encoding revealed that neural activity in the bilateral inferior occipital gyrus (IOG) and ventromedial prefrontal/orbitofrontal cortex (vmPFC/OFC) was associated with individual differences in discriminative emotional face recognition performance as measured by an innovative behavioral pattern similarity analysis (BPSA). The left fusiform face area (FFA) was additionally identified using a regionally focused analysis. Overall, the present study provides evidence that threatening facial expressions lead to persistent face recognition over periods of > 1.5 years, and that differential encoding-related activity in the medial prefrontal cortex and occipito-temporal cortex may underlie this effect.


Introduction
For social species, the recognition of previously encountered conspecifics is vital for survival and successful interaction. In humans, faces are presumably the most important stimuli for subsequent recognition. Given the high evolutionary significance of these stimuli, cortical networks specialized in perceiving and recognizing faces develop already during early infancy (Powell et al. 2018; Cohen et al. 2019). Nevertheless, the ability to recognize faces varies greatly in the human population. While some individuals can recognize faces for years following a single exposure, others find it nearly impossible to recognize highly familiar faces (Russell et al. 2009; Tardif et al. 2019). In addition to individual differences, several characteristics of the facial stimuli can affect subsequent recognition, including emotional expressions (Bruce and Young 1986; Haxby et al. 2000). From an evolutionary perspective, emotional expressions may transmit important information: threatening facial expressions (e.g., angry or fearful) can signal danger and may thus relate to harm avoidance in the future (Darwin 1872; Staugaard 2010).
In support of this evolutionary hypothesis, some experimental studies have demonstrated a recognition advantage for threatening facial expressions across a variety of delays (Grady et al. 2007; Jackson et al. 2014; Pinabiaux et al. 2013; Stiernströmer et al. 2016; Thomas et al. 2014; Wang 2013). For example, previous studies consistently reported that faces with threatening expressions (i.e., angry or fearful) are better remembered than non-threatening (e.g., neutral) faces in visual working memory (Öhman et al. 2001; Sessa et al. 2011; Thomas et al. 2014; Vuilleumier 2002). Several studies on short-term memory also found a recognition advantage for threatening faces when memory was tested immediately (i.e., minutes) after encoding (e.g., Grady et al. 2007; Pinabiaux et al. 2013; Wang 2013). Following a longer retention interval of 24 h, recognition memory was better for fearful relative to neutral faces (Wang 2013). These findings broadly align with numerous studies indicating enhanced recognition of non-facial emotional stimuli (e.g., scenes or words), particularly high-arousing negative ones, which are more vividly and accurately remembered over retention intervals ranging from minutes to years (for a review, see Bowen et al. 2018). Meanwhile, other studies failed to observe memory enhancement for threatening faces across a variety of delays (minutes to 2 weeks; Anderson et al. 2006; Grady et al. 2007; Satterthwaite et al. 2009; Xiu et al. 2015). These contradictory results may be accounted for by factors such as lack of statistical power (i.e., small numbers of subjects and face stimuli), different methodological approaches (e.g., memory performance indexes), and heterogeneity of sample characteristics such as age and gender, which may relate to differences in face memory ability (Grady et al. 1995; Sommer et al. 2013).
In particular, the various retention intervals and univariate analysis approaches employed in different studies may also contribute to the mixed findings. Previous studies primarily used hypothesis-driven univariate approaches (e.g., ANOVA) to compare recognition performance for emotional relative to neutral faces. However, this approach discards information related to the response variability of the data due to averaging and dismisses the individual differences that characterize face memory (Miendlarzewska et al. 2018; Tian et al. 2020). Moreover, the effect of emotional expression on face memory after a long retention interval (i.e., years) has not been determined to date. Previous studies have reported better recognition performance for emotional compared to neutral scenes selected from the International Affective Picture System (IAPS) 1 year after incidental encoding (e.g., Dolcos et al. 2005; Erk et al. 2010; Gavazzeni et al. 2012), and suggested that the memory advantage of emotional, particularly negative, materials is mediated by effects on consolidation that facilitate better remembering of emotional relative to neutral materials over time periods ranging from 24 h to several months (Yonelinas and Ritchey 2015). Based on these findings, we expected that this long-term beneficial effect of negative emotion on memory would also be present for face stimuli, which may serve adaptive needs. Thus, the first aim of the present study was to systematically investigate emotional expression effects on face recognition over an extended retention interval (> 1.5 years) in a large sample using both univariate and data-driven multivariate analysis approaches.
A large body of neuroimaging studies suggests that the long-term memory advantage for emotional material (primarily negative visual stimuli such as threatening scenes) relative to neutral material is mediated by medial temporal lobe (MTL) regions, in particular the interaction between the amygdala and the hippocampal formation during encoding and recognition (Dolcos et al. 2004; Mackiewicz et al. 2006; Ritchey et al. 2008; for a review, see Yonelinas and Ritchey 2015). In contrast, the neural substrates of long-term emotional face recognition have not been well characterized. Human functional magnetic resonance imaging (fMRI) investigations suggest that face processing is supported by distributed neural systems (Haxby et al. 2000), including brain regions processing facial features and identity, such as the fusiform face area (FFA) (Kanwisher et al. 1997; McCarthy et al. 1997) and the inferior occipital gyrus (IOG)/occipital face area (OFA) (Gauthier et al. 2000), and regions processing social and emotional information (e.g., facial expression), such as the amygdala (Gobbini and Haxby 2007), inferior frontal gyrus (IFG) (Ishai et al. 2002) and orbitofrontal cortex (OFC) (Ishai 2008). Initial studies examining subsequent memory effects for emotional faces have shown that activation associated with the successful encoding (i.e., remembered > forgotten) of emotional (fearful and happy) versus neutral faces was centered on prefrontal regions such as the IFG, dorsolateral prefrontal cortex (dlPFC) and OFC (Sergerie et al. 2005). More recent fMRI studies using individual difference approaches have additionally revealed that individual variations in immediate memory performance for negative, positive and neutral faces are associated with encoding-related differences in connectivity between occipito-temporal areas (e.g., FFA, IOG) and regions engaged in social-emotional processes such as the OFC (Geiger et al. 2016; Xiu et al. 2015).
However, the encoding-related neural basis underlying long-term emotional face recognition after a lengthy retention interval (i.e., years) has not been systematically examined, especially using an individual difference approach. Compared to subsequent memory paradigms (Paller and Wagner 2002), which are primarily designed to measure within-subject, event-related activity associated with successful memory, the individual difference approach makes it possible to identify regions whose encoding-related activity across condition-specific trials correlates with individual differences in subsequent recognition, rendering this approach highly suitable for the investigation of emotional memory after a long retention interval (Ritchey et al. 2008). Therefore, the second aim was to examine whether brain activation during incidental encoding was associated with the recognition advantage of faces with emotional expressions following a retention interval of > 1.5 years, using an individual difference approach.
Against this background, 225 healthy students underwent incidental encoding of 50 faces with different emotional expressions (angry, fearful, happy, sad and neutral; each face image was taken from a different actor) during fMRI acquisition (for previous studies from this large cohort project, see Li et al. 2019; Liu et al. 2020; Xu et al. 2020; Zhou et al. 2020). Twenty minutes after the scanning session, all subjects completed an immediate recognition test, and a subsample (N = 102) also participated in another surprise face recognition test after a delay of at least 1.5 years. Behaviorally, we examined whether face recognition was modulated by facial expression by means of a univariate approach (ANOVA) on hit rates and a data-driven multivariate approach (principal component analysis, PCA) on raw confidence rating scores. The multivariate PCA in the present study capitalized on the individual- and item-wise behavioral responses of each subject and could thus differentiate recognition across expression conditions in an unsupervised, data-driven manner with higher sensitivity than traditional univariate analysis.
For the fMRI analysis, we employed an individual difference approach by relating neural activation during encoding to recognition performance after a delay of > 1.5 years. Similar approaches have previously been employed successfully to determine the neural substrates of short-term (i.e., minutes) emotional memory for faces (Geiger et al. 2016; Xiu et al. 2015) or long-term (i.e., weeks) emotional memory for scenes (Mackiewicz et al. 2006; Ritchey et al. 2008). Notably, we characterized individual differences in recognition performance using a multivariate measure based on the principal component scores (PC scores) from the PCA, instead of the conventionally used univariate measure (behavioral pattern similarity analysis, BPSA; for details, see "Materials and methods"). Briefly, we calculated the similarity between the PC scores and the recognition confidence pattern of each subject. The similarity scores thus represented the extent of discriminative recognition between expression conditions and were used to explore the neural substrates predicting individual differences in discriminative long-term recognition performance for faces with different emotional expressions.
Based on previous findings, we expected differential memory performance between faces with threatening expressions (i.e., angry and fearful) and faces with non-threatening expressions after a retention interval of > 1.5 years in both univariate and multivariate analyses. On the neural level, we expected that individual differences in the emotional expression effects on long-term face recognition would be accompanied by variations in encoding-related activity in brain regions implicated in emotional memory formation and face processing (e.g., OFC, amygdala, and visual cortex).

Subjects
A total of 102 (53 males, age range 20-32 years) healthy, young, right-handed Chinese students participated in this study, which was part of a large-scale fMRI project (e.g., Li et al. 2019; Liu et al. 2020; Xu et al. 2020; Zhou et al. 2020). Due to incomplete behavioral and fMRI data (N = 7), extremely low hits and false alarms (hits < 1 and false alarms < 1, N = 4), or excessive head motion during fMRI scanning (N = 2), data from 13 subjects were excluded from both behavioral and fMRI analyses, resulting in N = 89 subjects (44 males, mean age = 23.80 ± 2.39 years) in the final analyses. Details on recruitment protocols and quality assessments are provided in the Supporting Information. The study was approved by the local ethics committee at the University of Electronic Science and Technology of China and was conducted in accordance with the latest revision of the Declaration of Helsinki. Written informed consent was obtained from each subject.

Stimuli
A total of 150 face stimuli were selected from two validated Asian facial expression databases, the Chinese Facial Affective Picture System (Gong et al. 2011) and the Taiwanese Facial Expression Image Database (TFEID) (Chen and Yen 2007), and were evenly divided into three sets. Facial expressions included angry, fearful, sad, happy and neutral (30 faces per expression, each from a different actor, 15 males). All facial stimuli were gray-scaled and covered with an oval mask to remove individual features (e.g., hair). The three face sets were used for incidental encoding (set 1), immediate recognition (set 2) and delayed recognition (set 3), respectively. The arousal and valence of each face set (50 faces, 10 per expression category) were rated by an independent sample (n = 20, 10 males, mean age = 21.2 ± 0.70 years) before the experiment; for each face set, the arousal ratings of emotional faces (angry, fearful, happy, sad) were higher than those of neutral faces (all ps < 0.001), whereas arousal ratings did not differ between emotional expressions (all ps > 0.05).

Experimental procedure
The present study employed a multiple-stage procedure including an incidental encoding phase and a subsequent memory phase (Fig. 1). All subjects initially underwent an event-related fMRI paradigm using an emotional face processing task (i.e., incidental encoding) between August 2016 and October 2017 (Time 1, T1). Fifty facial stimuli (set 1) were presented repeatedly over two subsequent runs with different pseudorandom sequences, balanced for facial expression and gender (5 min 12 s per run). Stimuli were shown for 2500 ms, during which the subjects were required to judge the gender of the face by button press. After each trial, a jittered fixation cross was presented for 2000-5600 ms (mean ITI = 3800 ms, see Fig. 1). Stimuli were presented via E-prime 2.0 (Psychology Software Tools, USA, http://www.pstnet.com/eprime.cfm). Twenty minutes after the fMRI acquisition, subjects completed a surprise recognition memory test (immediate test) outside the scanner, in which the 50 previously presented faces (set 1, targets) from the fMRI paradigm were intermixed with 50 new faces (set 2, lures). Subjects were instructed to indicate whether each face had been shown during the fMRI acquisition (forced choice: old versus new). Emotional arousal ratings for each face were additionally assessed after the immediate old/new recognition test using a 9-point Likert scale (1 = very weak to 9 = very strong). After a retention interval of > 1.5 years (range: 653-1113 days), 102 subjects agreed to participate in a surprise recognition test (delayed test) between July 2019 and August 2019 (Time 2, T2), in which the target faces were intermixed with another set of 50 new faces (set 3, lures). In the delayed test, subjects rated their recognition confidence on a six-point scale (old vs. new; 1 = definitely new to 6 = definitely old, see Fig. 1).
The confidence rating approach was employed in the delayed test because it reflects the strength and quality of memory more precisely (Aly and Turk-Browne 2016; Stretch and Wixted 1998), and was thus more sensitive than the categorical old/new judgement approach. Moreover, it allowed us to conduct multivariate analyses of the delayed test data with increased power. The delayed recognition memory test was carried out online via Survey-Coder 3.0 (https://www.surveycoder.com/).

MRI data acquisition
MRI data were obtained on a 3 T GE MRI system (General Electric, Milwaukee, WI, USA). Functional images were acquired with a gradient echo-planar imaging pulse sequence (39 slices; repetition time (TR) = 2000 ms; echo time (TE) = 30 ms; slice thickness = 3.4 mm; spacing = 0.6 mm; field of view (FOV) = 240 × 240 mm²; flip angle = 90°; matrix size = 64 × 64). Each run of the emotional face processing task consisted of 173 volumes. High-resolution whole-brain T1-weighted images were additionally acquired to improve normalization of the functional images (spoiled gradient echo pulse sequence; 156 slices; TR = 6 ms;

Univariate approach
To assess whether subjects generally remembered the faces immediately after encoding as well as > 1.5 years later, the general sensitivity indices A-prime (A′) and d-prime (d′) were initially computed and compared with their respective chance levels (for details, see Supplementary Methods). Hit rates were subjected to a 2 × 5 repeated-measures ANOVA with the factors time of assessment (immediate vs. delayed) and emotional expression (angry vs. fearful vs. happy vs. neutral vs. sad) to examine interaction effects between retention interval and facial expression. For both tests, hit rates were defined as the proportion of target faces correctly identified as old. For the delayed test, ratings of 4, 5 and 6 were considered correct identifications. Post hoc tests for significant interactions with Holm correction were conducted to examine the facial expression effects within the immediate and delayed tests, and planned two-tailed t tests were performed to determine changes in memory between immediate and delayed recognition for each facial expression condition. Notably, the analysis focused primarily on hit rates because this allowed a direct comparison between the immediate and delayed tests, which included different sets of lures. We further compared the false alarm rates with chance level as well as among the expression conditions within each test to control for effects of a higher tendency to judge lure items as previously seen. The ANOVA and post hoc analyses were conducted using the R packages afex (Singmann et al. 2015) and emmeans (Lenth 2019).
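As an illustrative sketch of these performance indices (a Python analogue of the R-based pipeline; the log-linear correction for extreme rates is our assumption, since the paper defers the exact A′/d′ computation to the Supplementary Methods), with hypothetical response data:

```python
import numpy as np
from scipy.stats import norm

def hit_rate(target_ratings):
    """Proportion of target (old) faces correctly identified as old.
    For the delayed test, confidence ratings of 4-6 count as 'old'."""
    return float(np.mean(np.asarray(target_ratings) >= 4))

def d_prime(hits, false_alarms, n_targets, n_lures):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).
    A log-linear correction avoids z(0)/z(1) at extreme rates
    (assumption; the paper's exact computation is in its supplement)."""
    hr = (hits + 0.5) / (n_targets + 1)
    far = (false_alarms + 0.5) / (n_lures + 1)
    return norm.ppf(hr) - norm.ppf(far)

# Hypothetical subject: six-point confidence ratings for 50 targets
ratings = np.random.default_rng(0).integers(1, 7, size=50)
hr = hit_rate(ratings)
dp = d_prime(hits=32, false_alarms=10, n_targets=50, n_lures=50)
```

A d′ reliably above zero indicates above-chance discrimination of targets from lures, which is what the comparison against chance level tests.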

Principal component analysis (PCA)
Given that hypothesis-driven univariate analyses (e.g., ANOVA) might dismiss individual differences related to response variability (see Fig. 2A) due to averaging, we further investigated the robustness of the long-term emotional expression effects using a data-driven, unsupervised multivariate approach. PCA was applied because it is one of the most widely used unsupervised multivariate machine learning algorithms for exploring hidden patterns in multidimensional data (Ringnér 2008), allowing us to detect and distinguish the factors that influence memory performance in an unsupervised manner and to visualize similarities and differences between individual behavioral responses (Fig. 2B). Specifically, the confidence ratings of all target face trials in the delayed test were plotted in a reduced two-dimensional space spanned by principal component 1 (PC1) and PC2 (for details, see Supplementary Methods). PC1 explains the highest variance and represents the most discriminative dimension. We thus expected that trials with the same facial expression (color coded) would dominate separate regions within the reduced space along the PC1 axis, representing a discriminative expression-specific face memory pattern in which threatening expressions (i.e., angry and fearful) would separate from non-threatening expressions (particularly happy and neutral, Fig. 2B). To further assess the significance of this segregation, we employed a recently proposed measure, 'trustworthiness', which compares the indicator of PC1 segregation (area under the ROC curve, AUC) with a null distribution of AUC values derived from 1000 permutations of shuffled emotion labels (minimum two-tailed p value: 0.05, Fig. 2B) (for details on this approach, see Durán et al. 2021). Finally, Wilcoxon signed-rank tests (two-tailed) with Benjamini-Hochberg correction were performed to compare responses between expression conditions.
Multivariate analyses including the PCA, the trustworthiness test of significance and the non-parametric Wilcoxon signed-rank tests were implemented in the PC-corr MATLAB code (https://github.com/biomedical-cybernetics/PC-corr_net), which has been used in previous studies to successfully discriminate behavioral and omic patterns (Miendlarzewska et al. 2018; Ciucci et al. 2017). The multivariate PCA facilitated a sensitive determination of emotional expression-specific face recognition memory over an extended retention interval. In further exploratory analyses, the PCA was also applied to the data from the immediate recognition test to provide a comparison with the delayed test.
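The logic of the PCA and the trustworthiness permutation test can be sketched in Python (the study itself used the PC-corr MATLAB code; the data below are random placeholders, so the "observed" AUC here carries no real effect):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

# Hypothetical data: rows = 50 target faces, columns = 89 subjects,
# entries = confidence ratings (1-6).
ratings = rng.integers(1, 7, size=(50, 89)).astype(float)
# Binary labels: 1 = threatening (angry, fearful), 0 = non-threatening.
labels = np.repeat([1, 1, 0, 0, 0], 10)

# Project the 50 trials into the 2D space spanned by PC1 and PC2.
pc_scores = PCA(n_components=2).fit_transform(ratings)
pc1 = pc_scores[:, 0]

# Segregation along PC1 quantified by AUC (the sign of PC1 is
# arbitrary, so take the better of the two orientations).
observed_auc = max(roc_auc_score(labels, pc1), roc_auc_score(labels, -pc1))

# Trustworthiness-style null distribution: shuffle the expression
# labels and recompute the AUC 1000 times.
null = []
for _ in range(1000):
    perm = rng.permutation(labels)
    null.append(max(roc_auc_score(perm, pc1), roc_auc_score(perm, -pc1)))
p_value = (np.sum(np.asarray(null) >= observed_auc) + 1) / (len(null) + 1)
```

With the real rating matrix, a p value below the threshold would indicate that the threatening/non-threatening segregation along PC1 exceeds what label shuffling alone produces.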

Behavioral pattern similarity analysis (BPSA)
To quantify individual differences in the expression of a discriminative emotional face memory pattern in the delayed test, we devised a behavioral pattern similarity analysis (BPSA) approach that correlates each individual response pattern with the PC1 pattern (Fig. 2). The BPSA integrates the PCA results with pattern similarity analysis, which was originally proposed for multivariate fMRI data analysis (Haxby et al. 2001; Haxby 2012), providing a data-driven alternative for characterizing individual differences in discriminating complex conditions with high sensitivity. The BPSA comprises the following steps: (1) select a method for unsupervised dimensionality reduction of the multivariate data (without loss of generality, we consider PCA here; Fig. 2A, B) and take as the response template a data dimension that offers a discriminative pattern between the samples (Fig. 2C, right). In our study, the projection of the samples onto the first dimension of the PCA embedding (i.e., the PC1 scores of the PCA on confidence ratings for targets of the delayed test) was extracted as the response template, representing a discriminative variability that accounted for the facial expression conditions; (2) compute a measure of similarity (e.g., Pearson's correlation coefficient) between each subject's confidence rating pattern and the PC1 score template to evaluate the extent to which the discriminative group-level, emotional expression-specific face representation manifested at the individual level (Fig. 2C); the resulting similarity scores were then related to brain activation. The BPSA thus provided a similarity score for each subject, which could be used to characterize the individual emotional face representation and to inform individual-difference fMRI analyses aimed at uncovering the neural mechanisms underlying the long-term emotional expression effect on face recognition.
To avoid inflation of the similarity calculation, the PC1 template was constructed using a leave-one-subject-out approach; specifically, the individual whose correlation was calculated was excluded from constructing the template. For a further exploratory analysis at the level of the fMRI data, we additionally capitalized on these values to split the sample into two groups with different behavioral patterns. To this end, we separated the subjects based on similarity, such that those whose confidence rating patterns showed significant correlations (p < 0.05, Pearson correlation) with the PC1 template were considered discriminators, exhibiting a discriminative expression-specific face memory pattern, and those whose similarity did not reach significance (p > 0.05) were considered non-discriminators (Fig. 2D).
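A minimal Python sketch of the leave-one-subject-out BPSA described above (hypothetical data; `bpsa_similarity` is an illustrative helper, not the authors' code):

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.decomposition import PCA

def bpsa_similarity(ratings):
    """Leave-one-subject-out BPSA on a trials x subjects rating matrix:
    for each subject, build the PC1 template from the remaining subjects,
    then Pearson-correlate the held-out subject's rating pattern with it.
    Returns per-subject similarity scores (r) and two-tailed p values."""
    n_trials, n_subjects = ratings.shape
    sims = np.empty(n_subjects)
    pvals = np.empty(n_subjects)
    for s in range(n_subjects):
        others = np.delete(ratings, s, axis=1)
        template = PCA(n_components=1).fit_transform(others)[:, 0]
        r, p = pearsonr(ratings[:, s], template)
        sims[s], pvals[s] = r, p
    return sims, pvals

# Hypothetical data: 50 target faces x 89 subjects, ratings 1-6.
rng = np.random.default_rng(0)
ratings = rng.integers(1, 7, size=(50, 89)).astype(float)
sims, pvals = bpsa_similarity(ratings)
discriminators = pvals < 0.05  # group split used in the exploratory analysis
```

One caveat of this sketch: the sign of PC1 is arbitrary in each fold, so in practice the template orientation would need to be aligned across folds before interpreting the sign of the similarity scores.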

Image preprocessing
The functional MRI data were preprocessed and analyzed using SPM12 (Statistical Parametric Mapping, https://www.fil.ion.ucl.ac.uk/spm/software/spm12/). The first ten volumes were discarded to allow for MR equilibration. The remaining functional images were realigned to correct for head motion, co-registered with the T1-weighted structural images and normalized to the Montreal Neurological Institute (MNI) standard template using a two-step procedure including segmentation of the brain structural images and application of the resultant transformation matrix to the functional time series. The resampled voxel size of the functional data was 3 × 3 × 3 mm. Finally, the images were spatially smoothed using a Gaussian kernel with full-width at half-maximum (FWHM) of 8 mm.

Fig. 2 Schematic presentation of the major steps in the multivariate investigation of long-term emotional face recognition memory. A Representation of individual behavioral response patterns for all subjects. Each column represents the confidence rating pattern for one subject. Rows from top to bottom indicate face images with angry, fearful, sad, neutral and happy expressions. Colors from blue to red designate confidence rating responses from 1 to 6. B PCA dimensionality reduction and expected discriminative pattern of the facial expression conditions, with the sample plotted in the 2D reduced space. The inset shows the assessment of the significance of segregation using a non-parametric permutation test (i.e., trustworthiness). C Behavioral pattern similarity analysis (BPSA) between each subject's response pattern and the PC1 score pattern (template) derived from the PCA. The PC1 template was constructed excluding the individual whose correlation with the template was calculated, repeated N = 89 times. D Grouping of the subjects based on the significance of the similarity scores. The subgroup of subjects exhibiting a significant discriminative response pattern for the facial expression conditions was called discriminators, and the subgroup not exhibiting such a pattern was called non-discriminators.

Statistical analyses
To identify the neural substrates associated with individual differences in the long-term emotional expression effects on face memory, a voxel-wise brain-behavior correlation analysis was conducted at the whole-brain level. To this end, the first-level contrast of interest (i.e., threatening vs. non-threatening) was modeled using separate onset regressors for all trials with threatening versus non-threatening faces, in accordance with the behavioral results, and convolved with the conventional hemodynamic response function (HRF). Six motion parameters were added to the design matrix to control for movement-related artifacts. Next, the first-level contrast images were subjected to a simple regression model with the similarity score as a regressor and interval days as a nuisance covariate using the Statistical nonParametric Mapping toolbox (SnPM13, http://warwick.ac.uk/snpm) based on 5000 permutations, to identify regions in which activity correlated with individual differences in the emotional face recognition representational pattern. As a parallel exploratory (yet not independent) approach, the association analysis was accompanied by a between-group comparison using a whole-brain two-sample t test on the two subgroups of subjects separated by the significance of the similarity scores (i.e., discriminators vs. non-discriminators) to further describe the difference between subjects with "qualitatively different" emotional face representations (this represents a related analysis; details are presented in the Supplementary Methods). Significant clusters at the whole-brain level were determined using a height threshold of p < 0.001 (two-tailed) and an extent threshold of p < 0.05 (two-tailed) with cluster-based family-wise error (FWE) correction (Eklund et al. 2016; Slotnick 2017).
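Conceptually, the brain-behavior association reduces to correlating each voxel's contrast estimate with the similarity scores and deriving p values from permutations. The NumPy sketch below, on toy data, illustrates only this core step; it omits the nuisance covariate, HRF modeling and the cluster-based FWE correction that SnPM performs:

```python
import numpy as np

def voxelwise_perm_corr(betas, scores, n_perm=5000, seed=1):
    """Correlate each voxel's first-level contrast value (threatening >
    non-threatening) with per-subject similarity scores; two-tailed
    p values from a permutation null. betas: (n_subjects, n_voxels)."""
    rng = np.random.default_rng(seed)
    n = len(scores)
    zb = (betas - betas.mean(axis=0)) / betas.std(axis=0)  # z-score voxels
    zs = (scores - scores.mean()) / scores.std()           # z-score behavior
    r_obs = zb.T @ zs / n                # observed Pearson r per voxel
    exceed = np.zeros_like(r_obs)
    for _ in range(n_perm):
        r_perm = zb.T @ rng.permutation(zs) / n
        exceed += np.abs(r_perm) >= np.abs(r_obs)
    p = (exceed + 1) / (n_perm + 1)      # two-tailed permutation p
    return r_obs, p

# Toy data: 89 subjects x 200 voxels (random, so no true effect).
rng = np.random.default_rng(0)
betas = rng.normal(size=(89, 200))
scores = rng.normal(size=89)
r, p = voxelwise_perm_corr(betas, scores, n_perm=500)
```

In the actual analysis these steps are handled inside SnPM13 on the full contrast images, with interval days regressed out and cluster-level FWE correction applied to the resulting map.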
Given that the amygdala has been strongly implicated in long-term (i.e., 1-4 weeks) emotional memory in previous individual difference studies using scene pictures as visual stimuli (Hamann et al. 1999; Mackiewicz et al. 2006; Ritchey et al. 2008), and that the fusiform gyrus plays a pivotal role in face recognition (Kanwisher et al. 1997; McCarthy et al. 1997), we examined effects in the amygdala and fusiform gyrus with increased sensitivity using a priori region-of-interest (ROI) analyses. Bilateral amygdala masks were created from the Brainnetome atlas (http://atlas.brainnetome.org/download.html, Fan et al. 2016). Bilateral fusiform masks were obtained from the probabilistic activation map (PAM) for faces (face vs. objects) from the brain activity atlas (http://www.brainactivityatlas.org, Zhen et al. 2015). Small volume correction (SVC) was performed using FWE correction with a voxel-level threshold of p < 0.05.

Univariate results
The ANOVA revealed a significant main effect of facial expression (F(4, 85) = 7.61, p < 0.001, η² = 0.26) and retention interval (F(1, 88) = 27.35, p < 0.001, η² = 0.24) on the hit rates, as well as a significant interaction effect (F(4, 85) = 4.17, p < 0.005, η² = 0.16, Fig. 3A). Post hoc tests indicated that facial expression modulated delayed recognition (hit rate: F(4, 85) = 10.18, p < 0.001, η² = 0.32), but not immediate recognition (hit rate: F(4, 85) = 2.17, p = 0.08). Within the delayed test, post hoc tests further indicated that hit rates for faces with threatening expressions (angry or fearful) were significantly higher than for faces with non-threatening expressions (sad, happy or neutral, respectively) (angry vs. sad: t(88) = 3.31, p < 0.01; angry vs. happy: t(88) = 5.05, p < 0.001; angry vs. neutral: t(88) = 4.84, p < 0.001; fearful vs. sad: t(88) = 2.79, p < 0.05; fearful vs. happy: t(88) = 5.43, p < 0.001; fearful vs. neutral: t(88) = 4.97, p < 0.001; all Holm corrected). In contrast, no significant differences in recognition performance were observed between angry and fearful (t(88) = 1.09, p = 0.558, Holm corrected) or between neutral and happy faces (t(88) = 0.10, p = 0.923, Holm corrected), whereas sad faces were significantly better remembered than neutral faces (t(88) = −2.63, p = 0.04, Holm corrected) and marginally better remembered than happy faces (t(88) = −2.37, p = 0.06, Holm corrected). To rule out the possibility that this long-term emotional memory advantage for threatening faces was influenced by variations in the retention interval (ranging from 653 to 1113 days), the repeated-measures ANOVA on the hit rates of the delayed test was recomputed with interval days as a covariate, and the results remained stable (F(4, 84) = 2.43, p = 0.05, η² = 0.10).
Moreover, correlation analyses between retention interval (in days) and hit rate in each condition revealed no significant associations (all ps > 0.05, Supplementary Results). To test whether the long-term memory enhancement for threatening faces resulted from a higher tendency to respond "old" to threatening faces, two control analyses were conducted. First, we compared the false alarm rates in the delayed test with chance level: the false alarm rates for both the angry (one-sample t test: t(88) = 0.70, p = 0.49) and fearful (one-sample t test: t(88) = 1.61, p = 0.11; for other expression conditions, see Supplementary Results) conditions were at chance level, whereas the respective hit rates were above chance level (angry: one-sample t test, t(88) = 3.00, p < 0.005; fearful: one-sample t test, t(88) = 2.16, p < 0.05; for other expression conditions, see Supplementary Results). The corresponding comparisons between false alarm rates and chance level in the immediate test are provided in the Supplementary Results. Moreover, the false alarm rates between expression conditions within the delayed test did not differ significantly after including interval days as a covariate (F(4, 84) = 1.08, p = 0.37; although a significant difference emerged without controlling for interval days, F(4, 85) = 9.85, p < 0.001, η² = 0.32), whereas the false alarm rates during immediate recognition showed an emotion-specific pattern (F(4, 85) = 22.15, p < 0.001, η² = 0.51; for post hoc test results, see Supplementary Results). A further control analysis was performed on the arousal of the facial expressions. A repeated-measures ANOVA showed that the arousal ratings of emotional faces (angry, fearful, happy, sad) were significantly higher relative to neutral faces, and that angry, fearful and happy faces elicited greater arousal than sad faces (F(4, 85) = 42.01, p < 0.001, η² = 0.66; for post hoc test results, see Supplementary Fig. S1) in the current sample.
Nevertheless, a hierarchical regression analysis indicated that the interaction between facial expression and arousal had no significant effect on hit rates (ΔR² = 0.068, p = 0.310). Together, these results suggest that the long-term recognition memory advantage for threatening faces was not driven by interval days, false alarm rates, or arousal.
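The Holm-corrected pairwise comparisons reported above can be sketched in a few lines. The data below are simulated and the helper names (`holm_correct`, `hits`) are illustrative assumptions, not the authors' actual analysis code.

```python
# Sketch of the post hoc pairwise paired t-tests with Holm correction
# (simulated hit rates; NOT the study's data or pipeline).
import itertools
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_subj = 89
conditions = ["angry", "fearful", "sad", "happy", "neutral"]
# Hypothetical hit rates: one vector of 89 subjects per condition
hits = {c: rng.uniform(0.3, 0.8, n_subj) for c in conditions}

def holm_correct(pvals):
    """Holm step-down adjustment of raw p values."""
    pvals = np.asarray(pvals, dtype=float)
    order = np.argsort(pvals)
    m = len(pvals)
    adj = np.empty(m)
    running_max = 0.0
    for rank, idx in enumerate(order):
        running_max = max(running_max, (m - rank) * pvals[idx])
        adj[idx] = min(1.0, running_max)
    return adj

pairs = list(itertools.combinations(conditions, 2))
raw_p = [stats.ttest_rel(hits[a], hits[b]).pvalue for a, b in pairs]
adj_p = holm_correct(raw_p)
for (a, b), p in zip(pairs, adj_p):
    print(f"{a} vs {b}: Holm-adjusted p = {p:.3f}")
```

With five conditions this yields ten pairwise tests; the step-down adjustment keeps the family-wise error rate at the nominal level while being uniformly more powerful than Bonferroni.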
To determine changes of memory in each facial expression condition over > 1.5 years, pairwise comparisons on hit rates between the immediate and delayed test were conducted for each facial expression condition. Recognition performance for happy, neutral and sad faces (two-tailed paired t test: happy: t(88) = − 5.25, p < 0.001, Cohen's d = 0.56; neutral: t(88) = − 6.10, p < 0.001, Cohen's d = 0.65; sad: t(88) = − 3.21, p < 0.01, Cohen's d = 0.34, Fig. 3A) significantly declined during the 1.5-year retention interval, whereas recognition for angry (t(88) = − 1.65, p = 0.104) and fearful (t(88) = − 2.04, p = 0.09) faces remained unchanged after Holm correction.

Fig. 3 Behavioral results. A Memory performance for each emotion expression condition in the immediate and delayed memory test as displayed by hit rates and false alarm rates. Error bars depict ± 1 SEM. B PCA on confidence ratings for target faces in the delayed test shows a separation of facial expression conditions (color coded) on memory performance along the PC1 score axis (angry and fearful faces: generally negative scores; happy, neutral, sad faces: generally positive scores). The inset shows the distribution of PC1 scores for each facial expression condition
Together, the results indicate a long-term face recognition advantage for threatening expressions (i.e., angry and fearful), driven by decreased recognition of faces with non-threatening expressions (happy, sad and neutral) over the retention interval of > 1.5 years.

PCA results
Initial inspection of the response patterns (color-coded trial-wise responses, from blue to red with increasing confidence, see Fig. 2A) revealed strong individual variation in confidence ratings for expression-specific face images in the delayed test. PCA was then applied to map the confidence ratings of all 50 target face images into the 2D geometrical space of PC1 and PC2. As expected, PC1 captured a discriminative variability that accounted for facial expression condition with polarity: angry and fearful faces (generally negative scores) versus happy, neutral and sad faces (generally positive scores) (Fig. 3B). In particular, taking the localization of each face trial along the PC1 axis as a visual reference, angry and fearful faces were separated from neutral and happy faces, while sad faces were located in between. PC1 explained 12.40% of the variance. The segregation pattern emerging from the amount of variance explained by PC1 was significant compared with the null distribution (trustworthiness test, p < 0.001). In line with the visual presentation, pairwise non-parametric Wilcoxon signed-rank tests showed that confidence ratings did not differ significantly between angry and fearful faces (z = − 1.33, p = 0.18) or between neutral and happy faces (z = − 0.11, p = 0.91). In contrast, there were significant differences between angry vs. happy (z = − 5.25, p < 0.001), angry vs. neutral (z = − 5.05, p < 0.001), angry vs. sad (z = − 3.82, p < 0.001), fearful vs. happy (z = − 5.20, p < 0.001), fearful vs. neutral (z = − 5.32, p < 0.001), fearful vs. sad (z = − 2.75, p = 0.006), happy vs. sad (z = − 2.11, p = 0.035) and neutral vs. sad (z = − 2.55, p = 0.011) faces. The results remained stable after Benjamini-Hochberg adjustment for multiple comparisons.
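The item-wise PCA can be sketched as follows: each of the 50 target faces is treated as an observation described by all subjects' confidence ratings, so each face receives a PC1 score. The data, shapes and variable names below are simulated assumptions for illustration; with random ratings no real expression separation is expected.

```python
# Sketch of the item-wise PCA on delayed-test confidence ratings
# (simulated ratings; illustrative only, not the study's data).
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
n_items, n_subj = 50, 89
expressions = np.repeat(["angry", "fearful", "sad", "happy", "neutral"], 10)
# items x subjects matrix of confidence ratings (hypothetical 1-4 scale)
ratings = rng.integers(1, 5, size=(n_items, n_subj)).astype(float)

pca = PCA(n_components=2)
scores = pca.fit_transform(ratings)          # (50, 2): PC1/PC2 per face
print("variance explained:", pca.explained_variance_ratio_)

# group-level PC1 pattern, later reused as the BPSA template
pc1_template = scores[:, 0]
for e in np.unique(expressions):
    print(e, round(float(pc1_template[expressions == e].mean()), 3))
```

A permutation scheme (shuffling the expression labels and recomputing the PC1 separation) would correspond to the trustworthiness check against the null distribution described above.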
The same PCA procedure was also applied to the immediate test data and failed to detect a discriminative memory pattern along the PC1 axis (see Supplementary Fig S2). The lack of separation according to facial expression in the PCA was in line with findings from the conventional univariate non-parametric approach (Wilcoxon Signed-Rank tests) which did not reveal significant interaction effects between emotion and recognition performance for the immediate test (Supplementary Results).
To summarize, our univariate analyses suggested a long-term memory advantage of threatening (i.e., angry and fearful) over non-threatening (i.e., particularly happy and neutral, and to a lesser extent sad) faces, and this emotional expression effect was further supported by the results of the data-driven multivariate approach.

Neural basis of individual differences during encoding
Given that the PC1 scores derived from the PCA revealed a group-level emotional expression dimension with polarity, and that there were apparent individual differences in the emotional recognition representation pattern, we next conducted the BPSA by correlating each individual's confidence rating pattern with the PC1 score pattern (i.e., the template), such that a higher similarity score represents a stronger memory difference between threatening and non-threatening faces. We then used the similarity score as a regressor to identify the neural basis at encoding that underlies the individual differences in emotional face memory representation. To this end, the first-level contrast of interest was modeled as threatening (angry + fearful) > non-threatening (happy + neutral) expressions, in accordance with the data-driven behavioral results (the sad condition was excluded from the contrast because of its ambiguous behavioral results and to match trial numbers across the contrast). The second-level voxel-wise whole-brain analyses controlling for interval days revealed significant negative associations between memory performance and neural activity in clusters in the left and right inferior occipital gyrus extending to the occipital pole (left IOG: k = 140, peak MNI coordinates: − 18, − 94, − 16, t = − 4.65; right IOG: k = 280, peak MNI coordinates: 30, − 82, 2, t = − 4.33; both two-tailed p cluster-FWE < 0.05, whole-brain corrected, Fig. 4A) and in the left and right vmPFC/OFC (left vmPFC/OFC: k = 114, peak MNI coordinates: − 39, 23, − 22, t = − 4.60; right vmPFC/OFC: k = 148, peak MNI coordinates: 39, 26, − 19, t = − 4.97; both two-tailed p cluster-FWE < 0.05, whole-brain corrected, Fig. 4A).
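The BPSA step itself reduces to one correlation per subject. A minimal sketch with simulated inputs (the variable names and shapes are assumptions, not the authors' code):

```python
# Sketch of the BPSA: correlate each subject's 50-item confidence
# pattern with the group-level PC1 score pattern; the resulting r
# serves as that subject's similarity score (simulated data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_items, n_subj = 50, 89
ratings = rng.normal(size=(n_subj, n_items))  # subject x item confidence
pc1_template = rng.normal(size=n_items)       # group-level PC1 scores

similarity = np.array([
    stats.pearsonr(ratings[s], pc1_template)[0] for s in range(n_subj)
])
# 'similarity' would then enter the second-level fMRI model as a
# covariate, e.g., alongside retention-interval days.
print(similarity.shape, similarity.min().round(3), similarity.max().round(3))
```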
Because the brain-behavior correlations did not differ significantly between the left and right IOG (Steiger's Z = − 0.11, p = 0.916) or between the left and right vmPFC/OFC (Steiger's Z = 0.88, p = 0.379), neural responses in the IOG and vmPFC/OFC were collapsed across hemispheres for illustration of the correlations (for scatter plots, see Fig. 4B). ROI analysis further indicated that neural responses to threatening vs. non-threatening faces in the left fusiform face area (FFA) were negatively correlated with the similarity scores (k = 59, peak MNI coordinates: − 39, − 55, − 25, t = − 4.17, two-tailed p voxel-FWE < 0.01, SVC; Fig. 4C, D). Against our expectations, the examination of the bilateral amygdala with SVC did not reveal a significant brain-behavior association. These results suggest that differential reactivity to threatening vs. non-threatening faces in the bilateral IOG, bilateral vmPFC/OFC and left FFA during encoding might contribute to the individual differences in the differential recognition of threatening vs. non-threatening faces.

Exploratory analysis: comparison of discriminators and non-discriminators
To further explore the pattern of effects, an exploratory analysis was conducted to determine neural activation differences between 'discriminators' and 'non-discriminators' based on their behavioral pattern and brain activity. The BPSA successfully separated the subjects into discriminators (N = 43, 21 males, rs > 0.28, ps < 0.05) and non-discriminators (N = 46, 23 males, rs < 0.28, ps > 0.05), and a series of additional analyses further confirmed the separable emotional face representations of the two subgroups (for details, see Supplementary Results). To explore neural differences between the groups, a two-sample t test between discriminators and non-discriminators was performed on the contrast (angry/fearful vs. happy/neutral), revealing significant interaction effects in the left IOG (k = 127, peak MNI coordinates: − 18, − 82, − 19, t = − 4.65, two-tailed p cluster-FWE < 0.05, whole-brain corrected) and right vmPFC/OFC (k = 131, peak MNI coordinates: 27, 32, − 16, t = − 3.90, two-tailed p cluster-FWE < 0.05, whole-brain corrected; Fig. 5A). These clusters were consistent with the brain-behavior correlation results, with overlap coefficients (defined as the size of the intersection divided by the smaller of the two clusters; Saygin et al. 2016) of 0.60 and 0.53 in the left IOG and right vmPFC/OFC, respectively. Subsequent extraction of parameter estimates from independent masks from the Brainnetome atlas suggested a distinct pattern of neural responses in the left IOG and right vmPFC/OFC: in the left IOG, non-discriminators showed higher reactivity towards threatening compared to non-threatening faces (t(45) = 4.50, p < 0.001, Holm corrected, Cohen's d = 0.66), while there was no difference in discriminators (t(42) = − 0.49, p = 0.63, Fig. 5B); in the right vmPFC/OFC, discriminators responded less to threatening compared to non-threatening faces (t(42) = − 3.66, p < 0.01, Holm corrected, Cohen's d = 0.56), whereas non-discriminators showed no difference (t(45) = 1.72, p = 0.09, Fig. 5C; for further details, see Supplementary Results and Figs. S3 and S4).

Fig. 4 Brain-behavior correlation results. A, B Whole-brain analysis. A The neural activity to threatening vs. non-threatening faces in bilateral IOG and vmPFC/OFC was negatively correlated with the behavioral similarity between the face representation pattern and the group-level PC1 pattern. The threshold was set at p < 0.05 cluster-level FWE correction with a cluster-forming threshold of p < 0.001. B Scatter plots between similarity scores and neural activity in the collapsed bilateral IOG and vmPFC/OFC are shown for illustration purposes. C, D Region of interest analysis. C The neural activity to threatening vs. non-threatening faces in the left FFA was negatively correlated with the behavioral similarity. The threshold was set at p < 0.05 voxel-level FWE correction. D Scatter plot between similarity scores and neural activity in the left FFA is shown for illustration purposes
Taken together, the findings from the two convergent analytic approaches suggest that the individual differences in long-term threatening vs. non-threatening recognition might be preceded by different encoding-related neural activity towards threatening vs non-threatening faces in the medial prefrontal and occipito-temporal cortex.
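The subgroup split above can be sketched as a per-subject significance test of the similarity score; with 50 items, a two-tailed p < .05 criterion corresponds to the reported r ≈ 0.28 cutoff. Similarity scores are simulated and the helper `r_pvalue` is an illustrative assumption.

```python
# Sketch of splitting subjects by the significance of their
# pattern-template correlation (simulated similarity scores).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n_items, n_subj = 50, 89
similarity = rng.uniform(-0.2, 0.6, n_subj)   # hypothetical r values

def r_pvalue(r, n):
    """Two-tailed p value for a Pearson r based on n observations."""
    t = r * np.sqrt((n - 2) / (1.0 - r ** 2))
    return 2.0 * stats.t.sf(abs(t), df=n - 2)

pvals = np.array([r_pvalue(r, n_items) for r in similarity])
discriminators = (similarity > 0) & (pvals < 0.05)
print("discriminators:", int(discriminators.sum()),
      "non-discriminators:", int((~discriminators).sum()))
```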

Discussion
The present study systematically examined the impact of emotional facial expressions during initial encounters on face recognition following a retention interval of > 1.5 years. In line with our hypothesis, we found evidence that individuals recognized threatening faces, particularly angry and fearful ones, better than non-threatening faces following the long-term retention interval. Moreover, the emotional expression-specific recognition advantage was not present directly after encoding, and the long-term advantage of threatening faces was driven by decreased recognition of non-threatening faces over the retention period. Multivariate analyses further supported this finding by showing a separation according to emotional facial expressions following the long-term retention interval but not during immediate recognition.

Fig. 5 Whole-brain group comparison between discriminators and non-discriminators. A T-statistic map of significant emotional expression (angry/fearful vs. happy/neutral) by group (discriminators vs. non-discriminators) interaction effects in the left inferior occipital gyrus (IOG) and right ventromedial prefrontal/orbitofrontal cortex (vmPFC/OFC). The activation map is displayed at p < 0.05 cluster-level FWE correction with a cluster-forming threshold of p < 0.001, overlaid on the masks from the Brainnetome atlas which were subsequently used to extract parameter estimates. B, C Post hoc tests on the parameter estimates (beta estimates) extracted from the left IOG and right vmPFC/OFC, as defined by independent masks from the Brainnetome atlas, showed higher reactivity towards angry/fearful faces in the left IOG of non-discriminators (B), whereas discriminators responded less to angry/fearful faces in the vmPFC/OFC (C). ***p < 0.001, **p < 0.01, Holm corrected
Interestingly, the expression-specific face recognition pattern exhibited considerable inter-subject variation, and the data-driven BPSA demonstrated that approximately half of the subjects showed a discriminative emotional face representation at the behavioral level (discriminators), while the other half did not (non-discriminators). Examination of neural activation differences during encoding revealed that differential activation in response to threatening vs. non-threatening faces in the bilateral IOG extending to the occipital pole and in the vmPFC/OFC at the whole-brain level, and in the left FFA at the ROI level, was associated with the individual differences in the threatening vs. non-threatening face recognition pattern after a delay of > 1.5 years. A two-sample t test on the subgroups of subjects separated by the significance of their similarity scores further confirmed differential neural activity in the medial prefrontal and occipito-temporal cortex during encoding. Together, the present findings demonstrate that threatening expressions during incidental encounters might facilitate long-term face recognition and that differential encoding in the occipito-temporal cortex and vmPFC/OFC might contribute to expression-associated recognition pattern differences.
In contrast to previous studies using relatively short retention intervals ranging from minutes to weeks and univariate approaches (e.g., Anderson et al. 2006; Wang 2013; Xiu et al. 2015), our study for the first time examined the memory-enhancing effect of emotional expression on face recognition after an extensive delay of at least 1.5 years with both univariate and multivariate approaches. The enhanced/discriminative long-term memory for faces with threatening expressions (i.e., angry/fearful) relative to faces with non-threatening expressions (i.e., happy/neutral) in the present study is consistent with previous findings showing that recognition and recollection of negative (e.g., fearful) faces was better than that of neutral faces after a delay of 24 h (Wang 2013). No differences between facial expression conditions were observed in the immediate test, consistent with some previous studies showing a lack of emotional facial expression modulation of immediate memory (Anderson et al. 2006; Grady et al. 2007; Satterthwaite et al. 2009; Xiu et al. 2015). The time-dependent effects of emotional expression on recognition observed in our study (i.e., enhanced emotional memory after a long-term delay but not immediately after encoding) are in line with prior studies using non-face emotional stimuli such as words or scenes (Sharot and Phelps 2004; Sharot and Yonelinas 2008), which indicated that recognition of emotional material is comparable to neutral material immediately after encoding, and that recognition of neutral stimuli decreases over time while recognition of negative stimuli remains stable. More importantly, our study extends previous findings by demonstrating that the year-long emotional memory enhancement effect is not limited to emotional scenes (Dolcos et al. 2005; Erk et al. 2010; Gavazzeni et al.
2012) and suggests that threatening expressions slow the forgetting of faces via mechanisms that enhance long-term consolidation (Cahill and McGaugh 1998; Talmi 2013; Yonelinas and Ritchey 2015). From an evolutionary perspective, maintaining recognition of threatening faces over long intervals may represent an adaptive and survival-relevant mechanism (Staugaard 2010), whereas faces with non-threatening expressions are of lower significance for future encounters and are thus prone to be forgotten over time (Dunsmoor et al. 2015). Future studies could determine the exact interval after which the memory-enhancing effect of threatening expressions emerges through consolidation by including repeated assessments in separate cohorts.
Some previous studies reported better memory for negative faces during immediate face recognition (Grady et al. 2007; Pinabiaux et al. 2013; Wang 2013). Methodological differences between experiments may account for this discrepancy. For example, the specific emotional expressions tested and the number of face stimuli per emotional category vary between studies (Anderson et al. 2006; Xiu et al. 2015; but see Grady et al. 2007; Wang 2013). Moreover, previous studies tested only small numbers of subjects (N < 50), which may result in a lack of statistical power. The mixed findings from previous studies might also be attributable to the heterogeneity of sample characteristics such as age and gender, which have been reported to relate to general face recognition memory (e.g., Grady et al. 1995; Sommer et al. 2013). Additionally, different analytical approaches may have contributed to the divergent results: previous studies generally employed univariate analytic approaches, which can entail a loss of information on response variability and individual differences (Miendlarzewska et al. 2018; Tian et al. 2020). Indeed, the individual behavioral response patterns (Fig. 2A) indicated individual differences in emotional expression representation, consistent with individual variations that have been reported in face recognition (Tardif et al. 2019; Wang et al. 2012), emotional expression processing (Calder and Young 2005; Le Grand et al. 2006) and threat-related information processing (Liu et al. 2021a). Our study overcame some of these limitations by incorporating data-driven multivariate analyses to examine emotional expression effects in a large sample of young adults with balanced gender.
Additional control analyses suggest that the long-term memory-enhancing effect of threatening facial expressions cannot be completely explained by factors such as an expression-specific tendency to judge items as previously seen or by arousal at encoding. While an expression-specific variation in the false alarm rates was observed at both immediate and delayed recall, false alarm rates were similar across expressions when variations in the retention interval were controlled for, and false alarm rates for threatening faces were at chance level in the delayed recognition test. Previous studies of the emotional memory advantage emphasized the role of arousal (Bradley et al. 1992; Hamann 2001; LaBar and Phelps 1998; Sharot and Phelps 2004). Indeed, arousal may bias processing toward salient information that gains processing priority and thus contribute to enhanced consolidation (Mickley Steinmetz et al. 2012; Ritchey et al. 2008). However, the enhanced memory for threatening faces in our study is unlikely to be explained by arousal, based on three lines of evidence. First, a moderation analysis confirmed the lack of arousal effects by showing a non-significant interaction of facial expression category by arousal on memory performance. Second, arousal ratings for each expression at encoding did not differ significantly between discriminators and non-discriminators (see Supplementary Results). Third, on the neural level, we did not observe significant activity in brain regions engaged in processing emotional arousal, such as the amygdala (Adolphs et al. 1999), or in unspecific salience processing, such as the insula (Menon and Uddin 2010). Together, this suggests that the present findings are unlikely to reflect unspecific arousal or salience effects of emotional stimuli.
The individual differences in the emotional face recognition pattern not only substantiated the multivariate PCA but also allowed us to capture the corresponding neural correlates with an individual difference approach. Using a multivariate measure of individual differences based on the BPSA, we observed that subjects with larger recognition differences between threatening and non-threatening faces exhibited lower activity in response to threatening (i.e., angry/fearful) versus non-threatening (i.e., happy/neutral) faces in the bilateral IOG extending to the occipital pole and in the vmPFC/OFC. ROI analysis further suggested that the left FFA was also engaged in the long-term emotional expression effects on memory. These identified regions align with the previously reported face network and resemble previous findings on emotional face recognition and emotion-related learning (Kumfor et al. 2013; LaBar and Cabeza 2006; Rolls 2019; Sergerie et al. 2005; Xiu et al. 2015). The IOG and FFA are two key regions of the core face network (CFN), which typically support face identity recognition (Haxby et al. 2000; Liu et al. 2002; Tsantani et al. 2021) but are also involved in processing emotional facial expressions (Cohen Kadosh et al. 2010; Harry et al. 2013), whereas the occipital pole is modulated by salient visual stimuli and has been implicated in processing emotional faces in schizophrenia (Taylor et al. 2012). The OFC is part of the extended face network (EFN), which processes social-emotional aspects of faces, specifically value-based emotion and reward information (Ishai 2008; O'Doherty et al. 2003). Lesion studies have demonstrated that IOG and OFC lesions impair emotional memory and learning (Calder and Young 2005; Kumfor et al. 2013; Rossion et al. 2003).
Damage to the inferior fronto-occipital fasciculus (IFOF), which connects the occipital cortex to the OFC, has moreover been reported to predict impaired facial emotion recognition, including anger, fear and sadness (Philippi et al. 2009). Our findings suggest a behavioral significance of encoding-related activity in these regions for predicting long-term emotional face memory patterns. Previous studies correlating neural activation during fMRI with subsequent face (neutral expression) recognition performance in independent task paradigms and stimulus sets have shown that higher face selectivity in the FFA and IOG is associated with better face recognition ability both in normal subjects (Huang et al. 2014) and in developmental prosopagnosia (Liu et al. 2021b), suggesting that the face-selective response in these regions contributes to individual differences in face recognition. However, these studies did not permit an investigation of encoding-related activity as a predictor of subsequent memory. One study that used an individual difference approach to examine encoding-related neural correlates of emotional face memory reported that connectivity from the IOG to the OFC was negatively correlated with immediate subsequent memory performance for faces with different emotional expressions (i.e., negative, neutral and positive, respectively) (Xiu et al. 2015), suggesting an emotion-specific modulation of the face network, which aligns with our findings.
Interestingly, the present results indicated a negative correlation between activity during encoding and the long-term emotional face representation pattern, with higher activity for angry/fearful vs. happy/neutral faces being associated with less discriminative recognition performance between expressions (i.e., more similar performance for threatening and non-threatening faces). The group comparison based on similarity (discriminators vs. non-discriminators) further revealed distinct mechanisms in the IOG and vmPFC/OFC: the threatening vs. non-threatening activation difference in the IOG was driven by an enhanced neural response to threatening faces in non-discriminators, whereas the activation difference in the vmPFC/OFC resulted from a suppressed neural response to threatening faces in discriminators. These differential mechanisms may correspond to the different functions of these regions: preferential early visual processing of threatening faces in visual cortex did not lead to better recognition, while lower implicit threat evaluation and regulation at encoding in prefrontal regions engaged in emotional processing (i.e., vmPFC/OFC) might be linked to enhanced recognition of threatening faces. The distinct roles of the IOG and vmPFC/OFC in predicting long-term emotional face memory remain to be characterized in future studies.
The lack of an amygdala effect was consistent with the behavioral results suggesting no modulation of memory performance by arousal, and might partly be explained by previous observations that the amygdala responds equally to positive, negative and neutral faces during encoding (Adolphs 2010; Ball et al. 2009). Although previous studies indicated that amygdala activation at encoding predicted greater subsequent memory for emotional compared to neutral scenes after a 1-year delay (Dolcos et al. 2005), those findings were based on the neural activity for remembered versus forgotten items for emotional versus neutral scene pictures. Likewise, the present study did not reveal a role of the insula in emotional face memory, suggesting that the emotional face memory effect was not due to higher externally guided attention towards emotional faces during encoding (Yao et al. 2018).
The present study used multivariate PCA and BPSA to examine long-term emotional face recognition performance in a large sample, which differentiates our study from most previous ones that compared experimental conditions by averaging across all stimulus items. The univariate method relies on the assumption that all stimulus items within each condition are homogeneous and that performance differences between items reflect noise fluctuations. The data-driven multivariate PCA allowed us to detect hidden patterns in the confidence ratings in a hypothesis-free manner by capitalizing on the trial-wise responses, which not only complemented the traditional univariate analysis but also informed the fMRI analysis using an individual difference approach. Although the first component retained only 12.4% of the original variance, possibly due to noise associated with the long retention interval, the PCA did generate separate clusters (see Fig. 3B) and the PC1 separation was statistically significant according to a non-parametric trustworthiness test, suggesting that this approach successfully identified distinguishable latent groupings of facial expressions based on the recognition confidence ratings. More importantly, the extracted PC1 scores, representing a discriminative variability that accounted for the facial expression categories, enabled the BPSA, in which the PC1 score pattern was correlated with each subject's confidence rating pattern. This multivariate method allowed us for the first time to capture individual differences in emotional face representation patterns by relating the face representation (i.e., the multi-item confidence rating pattern) of each subject to a group-level emotion-specific discriminative memory pattern (indexed by the PC1 scores). Importantly, the present BPSA took the principal component score pattern as the response template, whereas previous studies typically used a mean score pattern as the template (e.g., Tian et al. 2020).
Thus, our method provides a novel way to distinguish memory performance for faces across emotional expression conditions and has the potential to support a broader examination of multidimensional data (e.g., behavioral, genetic or fMRI data) comprising complex conditions. Moreover, by calculating the association between an individual feature pattern and a pattern template generated by unsupervised dimensionality reduction (e.g., PCA, minimum curvilinear embedding; Miendlarzewska et al. 2018), one can characterize individual differences in the representation pattern and associate them with the underlying neural systems.
Several limitations need to be addressed. First, the comparably low number of hits for each expression condition after the long retention interval did not permit a reliable fMRI analysis of subsequent memory effects comparing remembered versus non-remembered items, as employed in previous studies (e.g., Becker et al. 2017; Dolcos et al. 2005). Future studies aiming to determine facial expression effects over very long retention intervals may enhance the processing depth of the stimuli or increase the number of stimuli per expression condition as well as the number of subjects (Steele et al. 2016) to further explore the neural basis underlying long-term emotional face memory. Moreover, to examine the long-term memory effect with higher sensitivity, a dimensional confidence rating was used at the long-term retention interval, while the immediate recognition test initially implemented a forced-choice response. Although previous studies used a similar approach and converted dimensional confidence ratings into binary responses (Weymar et al. 2011; Xiu et al. 2015), the results of the direct comparison between immediate and delayed recognition should be interpreted with caution. Furthermore, previous work investigating the immediate recognition of neutral faces that had previously been presented with threatening (angry or fearful) or non-threatening (happy or sad) expressions found no differences in recognizing neutral faces associated with threatening versus non-threatening expressions (Satterthwaite et al. 2009), suggesting that the "first impression" created by a facial expression does not affect subsequent recognition of face identity. Whether such a "first impression" affects long-term face identity recognition, however, remains to be addressed in future studies. Finally, despite comparable recognition of angry and fearful faces, grouping them into a single threatening category may obscure encoding differences between angry and fearful faces.
Future studies are needed to investigate the neural differences within the categories of threatening and non-threatening faces with respect to subsequent recognition. Notably, on the behavioral level, no differences were observed during immediate recognition, suggesting that differential encoding-related activation in these regions may be associated with individual variation during memory consolidation. Further studies are necessary to explore the neural basis of consolidation to provide a more comprehensive understanding of the formation of long-term emotional face memory. Moreover, given the exploratory nature of the fMRI analysis in the present study, we focused on univariate fMRI analyses to identify specific brain regions associated with individual differences. Combining suitable fMRI and sample designs with multivariate fMRI analyses (e.g., Zhou et al. 2020, 2021) may offer additional insights into the neurofunctional representation of long-term face memory in future studies. Finally, in line with our research goal, the neural basis of the discriminative face recognition pattern was determined at the level of emotion categories. Future studies may aim to further disentangle the neural systems of item-specific face memory within each emotional expression category, or across emotions, to explore different encoding patterns for face stimuli that are on average highly or poorly memorable.

Conclusions
Our study provides evidence for a recognition advantage of threatening faces after a long-term interval of > 1.5 years. fMRI analyses further suggested that individual differences in the emotional face memory pattern were associated with differential encoding of threatening versus non-threatening faces in the IOG/occipital pole, FFA and vmPFC/OFC, suggesting that encoding-related activity in the occipito-temporal and medial prefrontal cortex may play a role in the formation of long-term emotional face memory. These findings extend theories of long-term emotional memory to facial stimuli and shed new light on the encoding-related neural basis of preserved memory for faces with threatening expressions.