Identifying and modulating neural signatures of stress susceptibility and resilience enables control of anhedonia

Anhedonia is a core aspect of major depressive disorder. Traditionally viewed as a blunted emotional state in which individuals are unable to experience joy, anhedonia also diminishes the drive to seek rewards and the ability to value and learn about them 1–4. The neural underpinnings of anhedonia and how this emotional state drives related behavioral changes remain unclear. Here, we investigated these questions by taking advantage of the fact that when mice are exposed to traumatic social stress, susceptible animals become socially withdrawn and anhedonic, ceasing to seek high-value rewards, while others remain resilient. By performing high-density electrophysiological recordings and comparing neural activity patterns of these groups in the basolateral amygdala (BLA) and ventral CA1 (vCA1) of awake behaving animals, we identified neural signatures of susceptibility and resilience to anhedonia. When animals actively sought rewards, BLA activity in resilient mice showed stronger discrimination between upcoming reward choices. In contrast, susceptible mice displayed a rumination-like signature, where BLA neurons encoded the intention to switch or stay on a previously chosen reward. When animals were at rest, the spontaneous BLA activity of susceptible mice was higher dimensional than in controls, reflecting a greater number of distinct neural population states. Notably, this spontaneous activity allowed us to decode group identity and to infer if a mouse had a history of stress better than behavioral outcomes alone. Finally, targeted manipulation of vCA1 inputs to the BLA in susceptible mice rescued dysfunctional neural dynamics, amplified dynamics associated with resilience, and reversed their anhedonic behavior. This work reveals population-level neural signatures that explain individual differences in responses to traumatic stress, and suggests that modulating vCA1-BLA inputs can enhance resilience by regulating these dynamics.

characterizing the correlated activity of multiple neurons, it is essential to record simultaneously from large numbers of neurons, and to analyze their activity at the population level. This approach can also reveal dynamics of internal state more accurately than single cell recordings. Therefore, we conducted high-density Neuropixels recordings 51 in the vCA1 and BLA and used population decoders to analyze the reward-related and spontaneous dynamics of populations of neurons to identify distinctive neural signatures of susceptibility and resilience to chronic stress. Then, we developed a novel circuit-specific modulation approach to rescue aberrant BLA population dynamics and associated anhedonic behavior in stress-susceptible mice.

Distinct behavioral signatures of resilient and susceptible mice following CSDS
To search for neural correlates of differential emotional and behavioral responses to traumatic social stress, we performed high-density single unit electrophysiology using Neuropixels probes implanted in the BLA and vCA1 of control mice and of mice subjected to chronic social defeat stress (CSDS) (Fig. 1A-E). Activity was recorded during both a stimulus-free pre-task condition and while mice performed a novel head-fixed sucrose preference test (SPT). In this test, mice could freely choose to access either water or sucrose rewards by licking at the respective spout to trigger reward delivery. CSDS produced mice with varying degrees of sucrose preference and social interaction scores, which were highly correlated (Fig. 1F). These behavioral profiles allowed us to classify mice as stress-resilient or susceptible (Fig. 1G). The susceptible mice identified using this classification showed lower lick rates during sucrose consumption, as well as markedly reduced lick rate discrimination between sucrose and water rewards, two behavioral subcomponents that reflect anhedonia 4 (Fig. 1H-I and Extended Data Fig. 1A-B).
Fig. 1. [...] 45), but not control (n = 15), mice showed a significant correlation between sucrose preference and social interaction scores. (G) Unsupervised K-means clustering revealed 2 distinct subgroups of CSDS mice (n = 45). (H) Susceptible mice (n = 12) showed reduced sucrose lick rate during the Post-reward period compared to resilient mice (n = 33, RM-ANOVA, group x time interaction: F2,57 = 5.63, P = 0.0059). (I) Susceptible mice showed reduced lick rate discrimination index compared to control and resilient mice (two-way RM-ANOVA, group x time interaction: F2,57 = 48.47, P < 0.0001). Data are mean ± SEM. # Significantly different from chance; * P < 0.05; ** P < 0.01.

Enhanced discrimination of reward choice in stress-resilient mice
As we observed robust sucrose-seeking behaviors in resilient mice compared to susceptible mice, we looked for specific features of how rewards and reward-seeking behavior were represented across the recorded BLA and vCA1 neuronal populations. We performed recordings as mice freely chose water or sucrose and indicated their choice by licking a spout to trigger reward delivery (Fig. 2A). To assess neural activity patterns both before the mice behaviorally made their choices and after they consumed the reward, we defined a trial (sucrose or water) using the 8-second time window (4 s pre- to 4 s post-reward) around reward delivery.
First, we quantified the proportion of reward-choice-selective neurons in the BLA and vCA1, defined as those that showed differential firing during water vs. sucrose trials. In the BLA, during both the seconds before reward delivery (Pre-reward) and in the reward consumption period (Post-reward), while susceptible and control mice had comparable numbers of reward-choice-selective neurons, resilient mice had significantly more than either (Fig. 2B-C). This difference between previously stressed mice (i.e., resilient and susceptible groups) was specific to the BLA, as in vCA1 we found that stress exposure increased the proportion of reward-choice-selective neurons in all mice.
To investigate differences between groups at the population level, we trained linear classifiers to discriminate trial types (water or sucrose choice), balancing the number of current and past rewards for each trial type (Fig. 2D, see Methods). When analyzing activity during Pre-reward time bins, we again found distinctive signatures of stress resilience. In resilient mice, the upcoming choice of sucrose or water could be decoded from neural activity in BLA better than chance and better than from neurons in control or susceptible mice (Fig. 2E-F). This effect in resilient mice was more robust after a previous low-value water trial than following a high-value sucrose trial, suggesting that resilient mice may be preferentially seeking out sucrose rewards following a previous low-value water reward (Extended Data Fig. 2A-B).
After reward consumption (Post-reward), reward choice could be decoded from BLA activity better in all mice, but decoding was still strongest in resilient mice. In vCA1, there existed a general effect of stress on decoding ability. These results suggest that BLA neurons in resilient mice showed enhanced discrimination of reward choices both before and during reward consumption.

Intention-specific states in BLA as a unique susceptibility signature
We next examined the nature of anhedonic behavior in susceptible mice by analyzing the sequence of reward choices that led them to less frequently choose sucrose rewards, and compared this to the sequences in control and resilient mice. We found that current and previous reward choices were not independent from each other, as the sequence could be described using a Markov model in which the probability of choosing water or sucrose depended on the choice made in the previous trial. The Markov models of control and resilient mice were similar: both switched from water to sucrose and repeated a sucrose choice more often than susceptible mice did (Fig. 3A-B, Extended Data Fig. 2C-F). In contrast, susceptible mice switched more from sucrose to water rewards and made more consecutive water choices.
Given these patterns, we asked if we could use the four possible sequences of consecutive reward choices (water-water; water-sucrose; sucrose-water; sucrose-sucrose) as the basis for identifying unique neural signatures of the intention to switch or stay on the same reward choice as the previous trial.
Strikingly, single neuron analysis revealed that neurons that were differentially modulated based on the intention to switch rewards or to stay on the same one as the previous trial were only present in the BLA of susceptible mice (Fig. 3C-D). In addition, a population decoder could successfully distinguish stay trials from switch trials using neural data from the seconds before reward delivery in the BLA of susceptible mice but not in the other groups (Fig. 3E-F). In contrast, no striking switch vs. stay signatures were observable at the population level in vCA1 (Extended Data Fig. 2G). This led us to hypothesize that specific population activity patterns existed in the BLA of susceptible mice in the seconds preceding the decision to switch or stay. We identified population hidden states in the 4 s Pre-reward period using Hidden Markov Models 52–54 (HMM, Fig. 3G-H, Extended Data Fig. 2H). Then, we confirmed that a linear decoder trained using inferred firing rates in HMM hidden states could most strongly distinguish between stay and switch trials in the BLA of susceptible mice (and both vCA1 CSDS groups) (Extended Data Fig. 2I-J). Accordingly, the population representations of stay vs. switch trials were linearly separable in the BLA of susceptible mice (Fig. 3I-J).
Next, we identified hidden states that uniquely characterized trials in which mice either intended to stay or to switch, which we termed intention-selective states (Fig. 3K-L, Extended Data Fig. 2K, see Methods). We found that the BLA of susceptible mice had a significantly higher fraction of these intention-selective states during the 4 s Pre-reward period than controls (Fig. 3L). Removing trials that contained these states reduced decoding accuracy of stay vs. switch trials to chance levels (Fig. 3M) and altered the geometry of population representations of switch vs. stay trials (Fig. 3N). Considering only trials with intention-selective states improved the switch vs. stay decoding accuracy, while removal of random states did not impact decoding accuracy (Fig. 3M-N).
Altogether, our results indicate that BLA neurons in susceptible mice evaluate future decisions with respect to their past choices (by representing switch/stay states), which may contribute to behavioral strategies that result in a reduced number of sucrose rewards.

Distinct patterns of spontaneous activity in the BLA of susceptible mice
We next asked whether distinct patterns of population activity could be detected in the BLA of susceptible or resilient mice, in the absence of any overt stimuli or task demands. Clinical studies have revealed altered resting state functional connectivity between the AMY and HPC in depressed individuals, but the underlying neural mechanisms remain unknown 55,56.
To mimic the mildly stressful experience of human imaging studies, mice were head-fixed with no task-relevant stimuli provided. We first examined whether the geometry of spontaneous neural activity patterns differed between groups. As the lack of behavioral timestamps made it difficult to align and directly compare neural representations across animals, we focused on the embedding dimensionality using Principal Component Analysis (PCA), which can estimate population geometry without alignment to overt behavior 57,58. This analysis revealed a trend towards higher dimensionality in the BLA population activity of susceptible mice compared to controls (Extended Data Fig. 3A-E), suggesting a larger number of neural population states, with each state spanning a different dimension. Indeed, when we quantified the states using HMM and performed agglomerative clustering of states to identify those that were unique, we found that in the BLA, but not in vCA1, susceptible mice showed a greater number of distinct neural states (Fig. 4A-C, Extended Data Fig. 3F-L). Consistent with this, average correlated BLA population activity across time was lower, and thus more variable, in susceptible mice (Extended Data Fig. 3M-N). The greater number of distinct states in susceptible mice could not be attributed to an increased firing rate, which was lower in the BLA of susceptible mice compared to controls (Extended Data Fig. 4O).
Furthermore, across all mice, the number of distinct states was significantly correlated with behaviors used to assess susceptibility (Fig. 4D), with greater numbers of clusters strongly predicting social avoidance and anhedonic behavior. This suggests that structures of population hidden states in the BLA may reflect anhedonia-related behavior.
We next tested whether we could decode the group identity of individual animals from this resting-state activity, by training a classifier using neural features including firing rates (mean and standard deviation), PCA cumulative variance, and the fraction of clustered neural states. Each feature alone allowed us to distinguish between control and susceptible mice to some extent (Extended Data Fig. 3P).
However, using all the feature sets in the BLA, but not in vCA1, we could decode between all pairs of group identities with the highest accuracy (Fig. 4E, Extended Data Fig. 3Q-R). Notably, cross-validated decoding of susceptible vs. control mice was 100% accurate. When we visualized the geometry of the representations in individual mice, we found the greatest distance between control and susceptible mice in the BLA (Fig. 4F). Applying the same analysis to neural recordings from the task period could also differentiate between control and susceptible mice (Extended Data Fig. 4).
Finally, we found that a decoder using these neural features during the stimulus-free pre-task period in BLA better predicted whether an animal was exposed to stress than a decoder using only behavioral features (Fig. 4G). This suggests that neural activity features in the BLA in the absence of any stimuli or task demands may be a more powerful biomarker for identifying a history of chronic stress than classic behavioral indices such as social avoidance and anhedonia-related behaviors.
Fig. 4. [...] trained on all neural features ("All") could decode group identity better than chance in BLA (n = 1000 cross-validations; chance: n = 100 shuffles). Feature importance in decoding was examined by systematic removal of each feature (subsequent columns). (F) MDS of neural features in BLA showed that controls were most distinct from susceptible mice. (G) Mahalanobis decoder trained on neural features was better at distinguishing control vs. CSDS mice than behavioral features (n = 1000 cross-validations; chance: n = 100 shuffles). Data are mean ± SEM. Chance distributions are 2xSTD around theoretical chance level. # Significantly different from chance; * P < 0.05.

Rescue of dysfunctional vCA1-BLA activity and anhedonia by circuit-specific manipulations
Having established that the BLA displays distinct activity patterns associated with susceptibility to stress, we asked 1) whether we could rescue the neurophysiological responses to stress in susceptible mice and 2) whether rescuing the neural phenotype reversed the maladaptive behavior of these animals.
The strategy we chose was to manipulate inputs to BLA from vCA1. We targeted this pathway because 1) vCA1 provides dense input to BLA 12,59; 2) our data show that CSDS produces changes in representations of reward choice and intended task strategies in vCA1 of all stressed mice, which may reflect a general adaptation to stress (see Fig. 2E-F, Extended Data Fig. 2I); and 3) resilience was positively correlated with the strength of communication between vCA1 and BLA for sucrose vs. water choices during the Pre-reward period (Fig. 5A-B).
To test whether manipulation of vCA1-BLA inputs would modulate signatures of susceptibility in the BLA and/or influence anhedonic behavior, we increased the excitability of vCA1-BLA projection neurons by expressing the excitatory chemogenetic actuator hM3Dq in these cells 60,61 (Fig. 5C-D). We then subjected mice to CSDS and recorded BLA and vCA1 activity and behavior in susceptible mice before and after injection of the hM3Dq activator clozapine-N-oxide (CNO).
CNO increased vCA1 firing rates (Extended Data Fig. 5A-B, K) and modified population activity patterns in vCA1 (Extended Data Fig. 5C-J). During the sucrose preference task, this manipulation enhanced vCA1-BLA correlations for sucrose vs. water choices during the Pre-reward period (Fig. 5E).
In addition, we found that activating the vCA1-BLA pathway increased our ability to decode reward choice Post-reward in both BLA and vCA1 (Fig. 5F-G), a signature of enhanced reward choice representation in naturally resilient mice (see Fig. 2E-F).
We next asked whether this manipulation of the vCA1-BLA pathway would reduce the occurrence of the unique intention-specific states we had observed in the BLA of susceptible mice. Replicating our previous results, we found that during the saline period, we could decode stay vs. switch trials in susceptible mice (Fig. 5H-J, Extended Data Fig. 5L-M). However, activation of the vCA1-BLA pathway brought decoding accuracies to chance levels, changed the geometry of representations in the BLA such that switch and stay trials could no longer be linearly separated (Fig. 5J), and decreased the fraction of intention-specific states (Fig. 5K). In other words, activation of the vCA1-BLA pathway reversed this population-level signature of stress susceptibility in the BLA.
Finally, we found that vCA1-BLA activation rescued behavioral indices of anhedonia. Administering CNO increased sucrose preference (Fig. 5L), increased the lick rate discrimination index (Extended Data Fig), and enhanced the proportion of sucrose stay trials (Extended Data Fig. 5O-Q). No behavioral differences were observed between saline and CNO periods in mice infused with the control mCherry virus (Extended Data Fig. 5R).
In summary, these results show that activating the vCA1-BLA pathway rescued both aberrant population dynamics in the BLA of susceptible mice and associated behavioral hallmarks of anhedonia (Fig. 5M).

Discussion
Our study reveals distinct neural signatures of stress resilience and susceptibility in BLA population activity. Using Neuropixels recordings while mice were either at rest or engaged in a free-reward-choice task and leveraging complementary analytic approaches, we identified novel population dynamics that underlie distinct features of the stress-induced anhedonic state. Critically, when we successfully reversed these neural signatures of anhedonia through targeted modulation of the vCA1-BLA circuit, the behavioral consequences of this maladaptive state were also rescued.
By analyzing population dynamics during the sucrose preference test, we discovered a resilience signature characterized by heightened reward choice representations in the BLA before and during reward consumption. This enhanced reward choice perception or sensitivity may play a crucial role in reinforcing the behavioral processes that lead animals to seek more rewarding options (i.e., choosing the sucrose reward) 62,63. That is, it may serve as a mechanism for adapting to, or coping with, the experience of CSDS, thereby maintaining a robust behavioral preference for sucrose.
In contrast, the BLA of susceptible mice exhibited reduced representations of current and future reward choices, which may result in decreased reinforcement of behaviors associated with the more rewarding outcome, ultimately contributing to reduced preference for high-value rewards 63. Moreover, the BLA of susceptible mice also displayed unique representations that reflected their intention to switch or stay on the previously chosen reward. This heightened evaluation of future choices with respect to the past is reminiscent of rumination-like states commonly observed in individuals with depression, such as repetitive thinking about past choices and upcoming decisions 64,65.
By analyzing the neural activity patterns in the absence of any task or stimuli, we also found an enhanced exploration of distinct neural states in the BLA of stress-susceptible mice. This may reflect the emergence of intrusive activity patterns in the BLA, akin to the intrusive thought patterns observed in depressed patients 66,67. Notably, we found that features of neural activity in the BLA during task-free periods were more effective than classic behavioral readouts in distinguishing between control mice and those with a history of CSDS. This suggests the intriguing possibility that resting-state activity patterns in the BLA may hold significant potential as a novel biomarker for identifying individuals who have experienced stressful life events.
Finally, stress-susceptible mice showed reduced vCA1-BLA correlations during higher-value sucrose reward choice trials, potentially driving anhedonic behavior. When we used chemogenetics to activate BLA-projecting vCA1 neurons in susceptible mice, this increased communication between the two regions; increased representations of current reward choice in both BLA and vCA1; reduced the rumination-like over-representation of the intention to stay or switch in the BLA; and, critically, decreased anhedonia-related behavior.
Our data suggest that vCA1 may encode an adaptive coping response to stress, as reward choice representations were enhanced in both resilient and susceptible mice in comparison to controls. How effectively this information from vCA1 is funneled to the BLA via direct vCA1-BLA projections may help determine crucial activity patterns in the BLA that shape resilient and susceptible outcomes. Specifically, in resilient mice, as the BLA is more effectively interacting with vCA1, BLA may be incorporating this adaptive stress response from vCA1 to generate signatures of resilience, such as enhanced reward choice representations. In susceptible mice, by contrast, this vCA1-BLA interaction is weaker. Thus, BLA may not be efficiently using adaptive information from vCA1, resulting in neural activity patterns that promote susceptibility, such as reduced representations of reward choices and the emergence of intention-selective states.
While dysfunction in dopaminergic systems has been implicated in motivational changes in depression and chronic stress 2,68–72, this work provides crucial evidence for a role of the vCA1-BLA circuit in modulating stress-induced behavioral phenotypes. By demonstrating that boosting vCA1 to BLA communication can normalize neural dynamics associated with susceptibility and promote those associated with resilience in the BLA, our findings shed light on how dysfunction in this circuit may contribute to stress-induced maladaptive states. Moreover, these results highlight the vCA1-BLA circuit as a promising target for neuromodulation in mood disorder treatments and open new avenues for potential therapies to more effectively address stress-induced pathologies.

Chronic social defeat stress (CSDS)
CSDS was conducted similarly to a previously established protocol 6. Briefly, CD1 male mice were singly housed upon arrival for >1 week and were then pre-screened for aggression over 3 consecutive days. Each day, a CD1 mouse was placed in a cage with a new screener BL/6 mouse for 3 min. An aggressive CD1 mouse was defined as one that attacked the BL/6 mouse within the first minute over a minimum of 2 consecutive days. Only aggressive CD1 mice were used in defeats and social interaction tests. Defeats occurred over 10 days, where each day, a BL/6 mouse was introduced to a novel CD1 mouse's cage for 10 min. Defeats were terminated early if severe injuries on BL/6 mice were observed. After 10 min, a clear plastic divider with perforations was placed in the middle of the defeat cage for 24 hours, to physically separate the BL/6 and CD1 mice while allowing visual and odor cues to transmit and reinforce the defeat experience during co-housing. After the 10th day of defeat, BL/6 mice were singly housed in new cages (without CD1s) for 24 hours prior to the social interaction test. For female defeats, female BL/6 mice were first coated with urine from other aggressive CD1 male mice (not used in defeats) prior to being introduced to the defeat CD1 mouse cage 73, in order to minimize mounting behavior and maximize defeats. Female defeats were terminated early if mounting was observed.

Social interaction test
The social interaction test took place 1 day after termination of CSDS. BL/6 mice were habituated to the social interaction test room for 1 hour prior to the test. The test was performed under red light (10 lux) in a test arena (custom made, 42 cm (w) × 42 cm (d) × 42 cm (h)) in a sound attenuation chamber. During the first phase of the test, the BL/6 mouse was introduced to the test arena with an empty enclosure (10 cm (w) × 6.5 cm (d) × 42 cm (h)) at one end for 2.5 min, and its activity patterns were tracked using Ethovision (Noldus Information Technology, Leesburg, VA). At the end of 2.5 min, the mouse was placed back in its home cage, and the empty enclosure was replaced with a second enclosure containing a novel aggressive CD1 that had not been used in defeats. The BL/6 mouse was put back in the test arena for another 2.5 min. Social interaction ratio, as a measure for social avoidance, was calculated as the time spent in the interaction zone (14 cm × 24 cm) with the aggressor present vs. absent. The lower the social interaction ratio, the more socially avoidant the animal was.
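As a minimal sketch of this measure, assuming "present vs. absent" means the time in the interaction zone with the aggressor present divided by the time with the enclosure empty (the function and argument names are illustrative and are not taken from the original analysis code):

```python
def social_interaction_ratio(time_aggressor_present_s, time_aggressor_absent_s):
    """Ratio of interaction-zone time with the aggressor present vs. absent.

    Assumption: the ratio is a simple quotient of the two 2.5 min phases.
    Values below 1 indicate that the mouse avoided the zone once the
    aggressor was introduced.
    """
    if time_aggressor_absent_s <= 0:
        raise ValueError("time with the empty enclosure must be positive")
    return time_aggressor_present_s / time_aggressor_absent_s


# Hypothetical mouse: 30 s in the zone with the aggressor, 60 s without.
ratio = social_interaction_ratio(30.0, 60.0)
```

A ratio of 0.5 in this hypothetical example would place the animal on the socially avoidant side of the scale.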

Head-fixed sucrose preference test
Following recovery from head bar surgery, mice were habituated to the experimenter and the head-fixed setup for 15 min a day for a week. After habituation, mice were water-restricted to ~85-90% of their ad lib body weight and were trained for 3 days to lick on the custom designed dual-spout head-fixed reward delivery apparatus. On day 1, mice were introduced to 1 lick spout, where sucrose rewards (10% sucrose, ~3.5 µl each) were intermittently delivered upon licking (i.e., rewards were lick-contingent) with an 8-second inter-trial interval (ITI) and a maximum of 150 rewards per session. Sucrose rewards were delivered using a solenoid-gated gravity feed. Licks were detected using a piezo element (SparkFun, Boulder, CO). Stimulus delivery and sensor reading were controlled using a custom Arduino MEGA board and recorded using CoolTerm software. On days 2 and 3, mice were introduced to 2 lick spouts, one on each side of the mouse, separated by ~50 degrees. Sucrose rewards were delivered in both spouts upon licking with an 8-second ITI. The goal was to teach mice that rewards were delivered from both spouts. Thus, if a mouse showed a preference for the spout on one side, that spout was temporarily removed so the mouse could learn to lick from the other spout. Once the animal showed similar preference for both spouts, lick training was completed and the pre-defeat sucrose preference test (SPT) was initiated on the following day. SPT occurred over the course of 2 consecutive days, where one spout delivered water and the other delivered sucrose. Rewards were delivered upon licking with an 8-second ITI and a maximum of 150 rewards in total per day. The spout designation was randomized across mice on day 1 and counterbalanced on day 2.
Sucrose preference was calculated as the averaged % sucrose rewards obtained across 2 days. Upon completion of day 2 of pre-defeat SPT, mice were taken off water restriction and housed in the social defeat room for 3 days before CSDS began. Post-defeat SPT was performed using the same protocol, with the addition of Neuropixels recording. Sucrose preference post-defeat was used for all behavioral analysis.
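The sucrose preference score can be illustrated with a short sketch. It assumes that the percentage is computed per day from reward counts and then averaged over the 2 days; the function name and the example counts are hypothetical.

```python
def sucrose_preference(daily_counts):
    """Mean % sucrose rewards across days.

    daily_counts: list of (n_sucrose_rewards, n_water_rewards), one per day.
    Assumption: each day is weighted equally, whatever its trial count.
    """
    percents = []
    for n_sucrose, n_water in daily_counts:
        total = n_sucrose + n_water
        if total == 0:
            raise ValueError("a day with no rewards has no defined preference")
        percents.append(100.0 * n_sucrose / total)
    return sum(percents) / len(percents)


# Hypothetical 2-day SPT with the 150-reward daily maximum reached on both days.
preference = sucrose_preference([(90, 60), (120, 30)])
```

Averaging daily percentages, rather than pooling counts, keeps a day with fewer obtained rewards from being down-weighted; whether the original analysis pooled or averaged is not stated beyond "averaged % sucrose rewards".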

Neuropixels recording and data pre-processing
Recording
Mice were head-fixed to the SPT apparatus without lick spouts present. Kwik-Sil was removed from the skull surface. Prior to insertion, Neuropixels 1.0 probes (IMEC, Heverlee, Belgium) were first coated with DiI, DiO, or DiD dyes (ThermoFisher Scientific, Waltham, MA) and allowed to dry. Probes were inserted at ~1 mm/min to the target coordinate using Sensapex manipulators (Oulu, Finland).
Possible probe targets and their coordinates are as follows: amygdala (-1.71 mm AP, -0.28 mm ML, -6.5 mm DV, at 31.3 degrees ML), ventral hippocampus (-3.9 mm AP, -2 mm ML, -4.5 mm DV, at 25.8 degrees ML), frontal cortex (+1.77 mm AP, 0 mm ML, -5.7 mm DV, at 9 degrees ML; +2.60 mm AP, -0.5 mm ML, -4.5 mm DV, at 9 degrees ML), and midline thalamic and hypothalamic regions (-3.80 mm AP, 0 mm ML, -5 mm DV, at 7.6 degrees ML; -1.90 mm AP, -0.15 mm ML, -4.3 mm DV, at 4.4 degrees ML). One or two probes were inserted per session per mouse. Simultaneously recorded probes were coated in the same color of dye but spaced at least several hundred microns apart to allow for unambiguous identification. Different colors of dyes were used across days to help differentiate probe tracks. After a probe reached the targeted DV, it was left in place for 10 min prior to the start of recording and SPT. Neuropixels action potential signals were recorded using the Neuropixels acquisition system and SpikeGLX software, at 30,000 Hz with gain = 500. Behavioral signals were recorded using a separate data acquisition board (National Instruments, Austin, TX), along with a synchronization signal that was also recorded by Neuropixels to help synchronize clocks between different data streams. After each session of SPT, probes were slowly removed from the brain and the skull was covered with Kwik-Sil.
Probes were cleaned using Tergazyme solution (1%, Alconox) overnight and rinsed using deionized water before reuse or storage.

Histology and probe track registration
At the end of experiments, mice were transcardially perfused with 1x PBS followed by 4% paraformaldehyde solution. Brains were fixed overnight at 4 °C, and then transferred to 30% sucrose solution for 48 hours. Brains were sectioned coronally using a microtome (Leica SM2000) at 50 µm thickness and mounted on glass slides with Fluoromount G with DAPI (Southern Biotech, Birmingham, AL). Images were obtained using a confocal microscope (Nikon Ti2-E Crest LFOV Spinning Disk/ C2 Confocal) with a 20X objective. Probe tracks were traced using the AllenCCF toolbox (https://github.com/cortex-lab/allenCCF).

Spike-sorting
Neuropixels action potential signals were pre-processed and spike-sorted offline using Kilosort 2 74, and after sorting, the clusters were manually validated using Phy 75. Only well-isolated clusters (putative single units that are classified as "Good" using Phy) were analyzed. All other clusters, including multiunit activity and noise, were not analyzed.

Data analysis
Animals were allowed to freely choose reward types after the 8 s ITI had passed between trials, by licking at the spout of their choice. Reward deliveries were lick-contingent. Trial types were defined as a ±4 s time window around the time of reward delivery. For all analyses, only mice with at least 5 neurons in the region of interest were used.
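The trial definition can be sketched as follows, assuming sorted spike times in seconds for one unit. The helper names, and the choice to assign a spike at exactly 0 s to the Post-reward period, are assumptions made for illustration.

```python
from bisect import bisect_left, bisect_right

HALF_WINDOW_S = 4.0  # trials span 4 s before to 4 s after reward delivery
MIN_NEURONS = 5      # minimum units in a region for a mouse to be analyzed


def align_to_reward(spike_times, reward_time, half_window=HALF_WINDOW_S):
    """Spike times relative to reward delivery, restricted to the trial window.

    spike_times must be sorted in ascending order (seconds).
    """
    lo = bisect_left(spike_times, reward_time - half_window)
    hi = bisect_right(spike_times, reward_time + half_window)
    return [t - reward_time for t in spike_times[lo:hi]]


def split_pre_post(aligned_spikes):
    """Split aligned spikes into Pre-reward (t < 0) and Post-reward (t >= 0)."""
    pre = [t for t in aligned_spikes if t < 0]
    post = [t for t in aligned_spikes if t >= 0]
    return pre, post


def has_enough_neurons(n_units, minimum=MIN_NEURONS):
    """Inclusion rule: at least `minimum` units in the region of interest."""
    return n_units >= minimum


# Hypothetical unit with a reward delivered at t = 10 s.
aligned = align_to_reward([1.0, 5.5, 9.0, 10.0, 13.5, 15.0], reward_time=10.0)
pre, post = split_pre_post(aligned)
```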

Behavioral data analysis
Behavioral classification of mice
The relationship between sucrose preference and social interaction ratio was assessed using Pearson's correlation. To classify CSDS mice into subtypes, we applied unsupervised K-means clustering using both behavioral metrics, sucrose preference and social interaction ratio. The optimal number of clusters was determined by evaluating cluster numbers from 2 to 10 and maximizing the Silhouette score.
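The cluster-number selection can be sketched in plain Python. This is an illustrative re-implementation (Lloyd's K-means with random initialization plus a mean silhouette coefficient), not the original analysis code. The behavioral values are hypothetical, and any feature scaling applied before clustering is not described in the text and is therefore omitted.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's K-means on a list of equal-length tuples; returns a label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers on k sampled data points
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster empties
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


def silhouette_score(points, labels):
    """Mean silhouette coefficient; points in singleton clusters score 0."""
    n = len(points)
    scores = []
    for i in range(n):
        mean_dist = {}
        for lab in set(labels):
            others = [j for j in range(n) if labels[j] == lab and j != i]
            if others:
                total = sum(math.dist(points[i], points[j]) for j in others)
                mean_dist[lab] = total / len(others)
        a = mean_dist.pop(labels[i], None)  # mean distance within own cluster
        if a is None or not mean_dist:
            scores.append(0.0)
            continue
        b = min(mean_dist.values())  # mean distance to the nearest other cluster
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / n


def choose_k_by_silhouette(points, k_values=range(2, 11), seed=0):
    """Evaluate k = 2..10 and keep the clustering with the highest silhouette."""
    best_k, best_score, best_labels = None, -math.inf, None
    for k in k_values:
        if k >= len(points):
            break
        labels = kmeans(points, k, seed=seed)
        score = silhouette_score(points, labels)
        if score > best_score:  # ties are resolved in favor of the smaller k
            best_k, best_score, best_labels = k, score, labels
    return best_k, best_labels


# Hypothetical (sucrose preference as a fraction, social interaction ratio) pairs.
mice = [
    (0.45, 0.30), (0.50, 0.40), (0.40, 0.35), (0.48, 0.25), (0.42, 0.45), (0.52, 0.30),
    (0.85, 1.40), (0.90, 1.50), (0.80, 1.30), (0.88, 1.60), (0.82, 1.45), (0.92, 1.35),
]
best_k, labels = choose_k_by_silhouette(mice)
```

A single random initialization is used here for brevity; library implementations typically run several restarts and keep the best solution.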

Lick analysis
Lick rasters were generated by binning licks using a 0.02 s bin size. Lick rates were calculated using a 0.1 s bin size and averaged across trials per mouse for each reward type. Lick rate discrimination index (DI) was calculated as the difference between lick rates for sucrose vs. water trials divided by the sum of the two.
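A minimal sketch of the binned lick rate and the discrimination index follows. The function names are illustrative, and returning 0 when both lick rates are zero is an assumption, since that case is not specified in the text.

```python
def binned_lick_rate(lick_times, t_start, t_end, bin_size=0.1):
    """Lick rate (licks/s) in consecutive bins covering [t_start, t_end)."""
    n_bins = round((t_end - t_start) / bin_size)
    counts = [0] * n_bins
    for t in lick_times:
        idx = int((t - t_start) / bin_size)
        if t >= t_start and idx < n_bins:
            counts[idx] += 1
    return [c / bin_size for c in counts]


def discrimination_index(sucrose_rate, water_rate):
    """DI = (sucrose - water) / (sucrose + water), bounded between -1 and 1."""
    total = sucrose_rate + water_rate
    if total == 0:
        return 0.0  # assumption: no licking at all is treated as no discrimination
    return (sucrose_rate - water_rate) / total


# Hypothetical trial-averaged lick rates (licks/s) for one mouse.
di = discrimination_index(6.0, 2.0)
rates = binned_lick_rate([0.05, 0.12, 0.17, 0.35], t_start=0.0, t_end=0.4)
```

A DI near 1 indicates licking almost exclusively on sucrose trials, while a DI near 0 indicates similar lick rates for both rewards.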
To take into account reward history and assess how it affects current behavior, we further divided sucrose and water trials into sucrose/sucrose (SS), water/sucrose (WS), water/water (WW), and sucrose/water (SW) trials (previous/current reward). The first trial of each session was discarded as it had no prior trial. In order to assess the probability of each trial type independent of the animal's overall sucrose preference, we normalized the number of trials to the total number of previous trials of a specific type. For example, we defined the overall transition probability from a water trial to a sucrose trial as P(WS) = n(WS) / (n(WW) + n(WS)), and from a sucrose trial to a sucrose trial as P(SS) = n(SS) / (n(SW) + n(SS)), where n(XY) is the number of XY trials and P(XY) is the transition probability from reward X to reward Y. With this normalization, P(WW) + P(WS) = 1 and P(SS) + P(SW) = 1. If P(SW) is not significantly different from P(WW), this would suggest that the current water reward choice is independent of the previous reward, because the probability of switching from sucrose to water is the same as that of staying on water; otherwise, the current reward choice is dependent on the previous reward (i.e., reward choices could be modeled as a first-order Markovian process).
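The normalization of transition probabilities can be illustrated with a short sketch. It assumes that each session is available as an ordered sequence of choices coded 'S' (sucrose) or 'W' (water); the function name and the example session are hypothetical.

```python
from collections import Counter

def transition_probabilities(choices):
    """choices: ordered list of 'S' (sucrose) or 'W' (water), one per trial.

    Returns P(XY) = n(XY) / (n(XS) + n(XW)), where X is the previous
    reward and Y the current one. The first trial has no predecessor
    and therefore contributes no pair.
    """
    pairs = Counter(zip(choices[:-1], choices[1:]))
    probs = {}
    for prev in "SW":
        n_prev = pairs[(prev, "S")] + pairs[(prev, "W")]
        for cur in "SW":
            # None marks an undefined probability (reward `prev` never preceded a trial).
            probs[prev + cur] = pairs[(prev, cur)] / n_prev if n_prev else None
    return probs

session = list("SSWSWWSS")  # hypothetical choice sequence
p = transition_probabilities(session)
```

In this example session the pairs are SS, SW, WS, SW, WW, WS, SS, so P(SS) = P(SW) = 1/2, P(WS) = 2/3 and P(WW) = 1/3, and each pair of probabilities conditioned on the same previous reward sums to 1.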
We also computed the proportion of each of the 4 trial types normalized to the total number of trials per session, to assess how much each trial type contributes to the overall session. In this case, the % of trials of each of the 4 trial types was computed per session and averaged across sessions for each mouse. As the number of trials may be influenced by each animal's innate preference for different rewards, we computed the chance probability of the occurrence of each trial type by calculating the joint probability of the previous and current trial. For example, chance P(SW) = P(S) x P(W). The chance level was then subtracted from the number of trials in each mouse ("Number of trials chance removed"). SS and WW trials were combined when analyzing stay trials, and SW and WS trials were combined when analyzing switch trials. The preference between stay and switch trials in each mouse was calculated as % stay trials - % switch trials. To quantify the number of consecutive trials, we first obtained the average number of consecutive trials per trial type (sucrose or water) per session and then averaged across sessions for each mouse.
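A rough sketch of the chance correction and of the stay vs. switch preference is given below. It works on fractions of trials rather than raw counts, and it estimates P(S) and P(W) from all trials of the session; both choices, as well as the function names, are assumptions made for illustration.

```python
from collections import Counter

def chance_corrected_fractions(choices):
    """Fraction of each previous/current pair minus its chance level.

    Chance for pair XY is P(X) * P(Y), with P(S) and P(W) taken as the
    overall fraction of sucrose and water trials (an assumption about
    how the marginals were estimated).
    """
    n = len(choices)
    p = {r: choices.count(r) / n for r in "SW"}
    pairs = Counter(zip(choices[:-1], choices[1:]))
    n_pairs = n - 1  # the first trial has no previous trial
    return {x + y: pairs[(x, y)] / n_pairs - p[x] * p[y]
            for x in "SW" for y in "SW"}

def stay_minus_switch(choices):
    """% stay trials (SS + WW) minus % switch trials (SW + WS)."""
    pairs = Counter(zip(choices[:-1], choices[1:]))
    n_pairs = len(choices) - 1
    stay = pairs[("S", "S")] + pairs[("W", "W")]
    switch = pairs[("S", "W")] + pairs[("W", "S")]
    return 100.0 * (stay - switch) / n_pairs

session = list("SSWW")  # hypothetical choice sequence
corrected = chance_corrected_fractions(session)
preference = stay_minus_switch(session)
```

Positive corrected values indicate trial types that occur more often than expected from the animal's overall reward preference alone.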

Decoding group identity using behavioral features
To examine whether group identity could be decoded using behavioral data, we defined a Mahalanobis-like binary decoder. Specifically, for each mouse, we considered 4 behavioral features: lick rate DI during Pre-reward, Post-reward, sucrose preference, and SI score. Considering two groups at a time, we defined and constructed a Mahalanobis binary decoder to assign a single testing mouse to one of the two groups in the behavioral feature space. The input to the binary classifier consisted of an N x F training matrix and a 1 x F testing matrix, where N represented the total number of training mice between the two classes, and F = 4 represented the total number of features. In each cross-validation (CV), we first balanced the number of mice in each group by randomly subsampling the minimum number of mice between the groups. Next, we randomly selected one mouse as the testing sample and used the remaining mice as the training set, for a total of 1000 CVs. We defined a Mahalanobis-like distance in the feature space as the Euclidean distance between the testing mouse and the centroid of the training groups, divided by the variance along the distance direction. The testing sample was assigned to the group identity with the minimum Mahalanobis-like distance. The performance of the decoder was evaluated by calculating the fraction of correct classifications out of the total 1000 CVs, and the entire procedure was repeated for all possible pairs of the three groups (i.e., control, susceptible, and resilient mice).
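The distance and assignment rule described above can be sketched as follows. This is one possible reading of the description, not the original implementation: the variance is taken as the sample variance of the training points of a group projected onto the line joining the testing mouse and that group's centroid, and all names and toy data are hypothetical.

```python
import math

def centroid(rows):
    """Mean feature vector of a list of feature vectors."""
    n = len(rows)
    return [sum(r[k] for r in rows) / n for k in range(len(rows[0]))]

def mahalanobis_like(test, group):
    """Euclidean distance from `test` to the group centroid, divided by
    the group's variance along the test-to-centroid direction."""
    c = centroid(group)
    diff = [t - m for t, m in zip(test, c)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist == 0.0:
        return 0.0
    unit = [d / dist for d in diff]
    # Projections of the centered training points have zero mean,
    # so their sample variance is sum(p^2) / (n - 1).
    proj = [sum((x - m) * u for x, m, u in zip(row, c, unit)) for row in group]
    var = sum(q * q for q in proj) / (len(proj) - 1)
    return dist / var if var > 0 else math.inf

def classify(test, groups):
    """groups: dict mapping a group label to its training feature vectors."""
    return min(groups, key=lambda label: mahalanobis_like(test, groups[label]))

# Toy example with 2 features (the analysis in the text uses F = 4).
training = {
    "control": [[0, 0], [1, 0], [0, 1], [1, 1]],
    "susceptible": [[10, 10], [11, 10], [10, 11], [11, 11]],
}
label = classify([0.5, 0.2], training)
```

In a leave-one-out scheme such as the one described, `classify` would be called once per cross-validation with the held-out mouse removed from `training`.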

Firing rate
For the task period, spike trains were aligned at the time of reward delivery (time 0) and neurons within the same region were pooled across animals of the same group to construct pseudo-populations. Only neurons with at least 10 trials per trial type (sucrose and water) were included. For peri-stimulus time histograms, spikes were binned at 10ms resolution, z-scored to Pre-reward (-1 to 0s), and smoothed with a 50ms moving average filter. For analysis of raw firing rates, spikes were binned at 500ms resolution.

Reward-modulated neurons
Analysis was performed using pseudo-population and only neurons with at least 10 trials per trial type (sucrose and water) were included. Reward-modulated neurons were identified based on Wilcoxon rank-sum test, comparing the distribution of firing rates during 1s Post-reward (0 to 1s) vs. 1s Pre-reward period (-4 to -3s) across trials. A cell is deemed reward-modulated if Post-reward epoch is significantly different from Pre-reward after false discovery rate (FDR) correction across all recorded neurons, with a significance threshold of P < 0.05. A Chi-squared test was used to perform statistical comparisons between fractions of reward-modulated neurons in CSDS vs. control mice for sucrose and water trials.
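The text does not state which FDR procedure was used. As an illustration only, the sketch below applies the Benjamini-Hochberg step-up procedure (an assumption) to a list of per-neuron P values, such as those returned by a rank-sum test; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(pvals, alpha=0.05):
    """Return one boolean per P value: True if significant after
    Benjamini-Hochberg FDR control at level `alpha`."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * alpha:
            cutoff = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            significant[idx] = True
    return significant

# Hypothetical per-neuron P values (Post-reward vs. Pre-reward).
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
modulated = benjamini_hochberg(pvals)
fraction_modulated = sum(modulated) / len(modulated)
```

Here only the two smallest P values survive correction, although five of them are below 0.05 before correction.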

Intention-modulated neurons
Analysis was performed using pseudo-population and only neurons with at least 10 trials per trial type (switch and stay) were included. Intention-modulated neurons were identified using a similar method as reward-modulated neurons. Mice with fewer than 5 neurons in regions of interest were excluded. In this case, a cell is deemed intention-modulated if the distribution of firing rates during the 4s pre-reward period (-4 to 0s) in switch trials is significantly different from stay trials, as identified using Wilcoxon rank-sum test followed by FDR correction across all neurons in that group (P < 0.05).
As the fraction of neurons was small and did not meet the criteria for using Chi-squared test, Fisher's exact tests were used to perform statistical comparisons between % of intention-modulated neurons across groups.

auROC
Analysis was performed using pseudo-population and only neurons with at least 10 trials per trial type (sucrose and water) were included. Mice with fewer than 5 neurons in regions of interest were excluded. Reward-choice-selective cells were identified 76,77, and the magnitude of the selectivity was quantified, using the area under the receiver operating characteristics (auROC) method, which compares single-neuron firing rates between trial types, across levels of response thresholds for each time bin. Spikes were binned at 500ms resolution. Shuffled distributions were computed for each time bin by randomly shuffling trial type 10 times per neuron. A neuron is deemed reward-choice-selective if its auROC is > 2 x STD of shuffled distribution for that neuron. The fraction of selective neurons in a region was calculated as # of selective neurons / total # of neurons. Differences in the fraction of selective neurons across groups were assessed using Fisher's exact tests.
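A compact sketch of the auROC computation for one neuron and one time bin is shown below. The area under the ROC curve equals the probability that a firing rate drawn from one trial type exceeds a rate drawn from the other, which is what `auroc` computes. The selectivity criterion is interpreted here as the observed auROC deviating from 0.5 by more than 2 x STD of the shuffled values; that interpretation, the function names, and the example rates are assumptions.

```python
import random
import statistics

def auroc(pos, neg):
    """Area under the ROC curve, computed as the probability that a value
    from `pos` exceeds a value from `neg` (ties count as 0.5)."""
    wins = 0.0
    for a in pos:
        for b in neg:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def is_selective(sucrose, water, n_shuffles=10, seed=0):
    """Compare the observed auROC with a label-shuffled distribution."""
    rng = random.Random(seed)
    observed = auroc(sucrose, water)
    pooled = list(sucrose) + list(water)
    shuffled = []
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign trial types at random
        shuffled.append(auroc(pooled[:len(sucrose)], pooled[len(sucrose):]))
    return abs(observed - 0.5) > 2 * statistics.stdev(shuffled)

# Hypothetical firing rates (spikes/s) in one 500ms bin.
sucrose_rates = [3, 4, 5]
water_rates = [0, 1, 2]
area = auroc(sucrose_rates, water_rates)
```

An auROC of 0.5 indicates no discrimination between trial types, while values near 0 or 1 indicate strong preference for one of the two rewards.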

Analysis of embedding dimensionality
Principal component analysis (PCA) was used to evaluate the embedding dimensionality of population activity of simultaneously recorded neurons over time. The method aims to identify how much variance of the population representation in the firing rate space is accounted for by each principal component. We chose this method because the pre-task period lacks behavioral labels. PCA has the advantage of allowing us to compare neural data between animals because the method is invariant to rotations and global stretching, transformations normally needed to align a neural representation of one subject into another. We examined the activity of each neuron in 1s bins during the 6 min time window (min 2-8) within the 10 min pre-task recording period, resulting in 360 bins. The ensemble activity across these bins can be represented as a geometrical object in the firing space, with each axis representing the firing rate of a neuron and each point representing the ensemble's activity in a time bin. We calculated the embedding dimensionality of this geometrical object for each mouse. We only included mice with at least 5 simultaneously recorded neurons in the region of interest during the pre-task recording. We randomly selected 5 neurons for each mouse and calculated the z-scored firing rate matrix N x T, where N is the number of neurons, and T is the number of time bins. We applied PCA to this matrix and determined the cumulative curve of the variance explained by each principal component (PC). We repeated this procedure 1000 times and averaged the results across the subsamples for each mouse. Our goal was to compare cumulative variance curves across groups and determine if a group had a higher cumulative value at M PCs (M ≤ 5), indicating a lower dimensionality of the geometrical object.
We subsequently used the cumulative variance values for the first 3 PCs as features to decode the group identity.
We also assessed the Participation Ratio (PR), which is a normalized measure of dimensionality based on the full distribution of PCA eigenvalues (i.e., how much variance is explained by each principal component). It is defined as PR = (Σi λi)^2 / Σi λi^2, where λi are the eigenvalues of the covariance matrix of the neural activity, and the sums run over i = 1, ..., N with N = 5. If only one eigenvalue explains all the variance (λi ≠ 0 for i = 1 and λi = 0 for all i ≥ 2), then PR = 1. On the other hand, if all eigenvalues are equal, the dimensionality is maximum, PR = N 78,79.
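For illustration, the participation ratio in its standard form, the squared sum of the eigenvalues divided by the sum of their squares, can be computed as in the sketch below. The second function uses the identity PR = tr(C)^2 / tr(C^2) for a symmetric covariance matrix C, which avoids an explicit eigendecomposition; the function names are hypothetical.

```python
def participation_ratio(eigenvalues):
    """PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    total = sum(eigenvalues)
    return total ** 2 / sum(v ** 2 for v in eigenvalues)

def participation_ratio_from_cov(cov):
    """Same quantity from a symmetric covariance matrix (list of lists),
    using tr(C) = sum of eigenvalues and tr(C^2) = sum of their squares."""
    n = len(cov)
    trace = sum(cov[i][i] for i in range(n))
    trace_sq = sum(cov[i][j] * cov[j][i] for i in range(n) for j in range(n))
    return trace ** 2 / trace_sq

# Limiting cases for N = 5 neurons.
one_dominant = participation_ratio([1.0, 0.0, 0.0, 0.0, 0.0])  # 1.0
all_equal = participation_ratio([2.0] * 5)                      # 5.0
```

The two limiting cases reproduce the bounds stated in the text: PR = 1 when a single component carries all the variance and PR = N when the variance is spread evenly.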
During the task period, the same analysis was repeated during the 1s of Pre-reward and Post-reward periods, using z-scored firing rate with 0.2s bins (5 bins for each period).

Hidden Markov Model
We used hidden Markov models (HMMs) to identify patterns of population activity in the time series, where each pattern corresponds to a specific neural state that is not directly measurable 52,54,80 .
We fitted an HMM separately for each mouse for the pre-task and task period. For the DREADD dataset, HMMs were fitted for saline and CNO periods of each mouse separately. To perform model fitting, we employed the software framework developed by the Linderman Lab (https://github.com/lindermanlab/ssm).
To prepare the data for the HMM analysis, we binned the 6 min pre-task recordings of each session into 1s bins, resulting in 360 bins. We computed the spike count of each neuron in each bin. The input data for the HMM consisted of an N x T matrix, where N represents the total number of simultaneously recorded neurons in the session, and T represents the total number of time bins.
For the analysis during task in the Pre-reward and Post-reward periods, we computed the spike count in 0.2s time bins. We fitted separate HMMs for the Pre-reward and Post-reward periods for sucrose and water trials. To accomplish this, we concatenated the M trials within a single session and arranged the input data in an N x T x M matrix, where T = 5. We chose a bin size of 0.2s because this bin size balanced the inference of maximum possible transition states and total spike count used to fit HMMs.
For decoding of switch vs. stay using HMM states, we focused on the 4s pre-reward period. Spike counts were binned using 1s bin size, and concatenated across the 4s window of all trial types. This resulted in an N x T x M input matrix, where T = 4, and M represents the total number of recorded trials in the session. Consistent with previous analyses, we retained only sessions with at least 5 simultaneously recorded neurons.
Given the recorded (observed) spike count over time, we modeled the neuronal activity as a Poisson process, where the mean value depends on the current neural state. We represented the probability of observing the spike count vector nt of N neurons at time bin t, given the hidden neural state St = j, as a multivariate Poisson process: P(nt | St = j) ~ Poisson(Λ; nt). Here, Λ = {λ1, λ2, ... λN}, and λi represents the estimated mean activity for the i-th neuron in state j. The vector Λ corresponds to the j-th column of the N x K "emission matrix" E, which provides the firing rates or activation probabilities of observing a specific neuronal pattern when the population activity is in a particular state.
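To make the emission model concrete, the sketch below evaluates the log-probability of one spike count vector under each state, assuming that neurons are conditionally independent Poisson variables given the state (an assumption about what "multivariate Poisson" means here). It only scores emissions; it does not include transition probabilities or perform HMM inference, and the names and example rates are hypothetical.

```python
import math

def poisson_log_likelihood(counts, rates):
    """log P(counts | state) for independent Poisson neurons.

    counts: spike counts of N neurons in one time bin.
    rates:  mean counts (lambda_i > 0) of the same neurons in one state.
    """
    return sum(n * math.log(lam) - lam - math.lgamma(n + 1)
               for n, lam in zip(counts, rates))

def best_emission_state(counts, emission):
    """emission: list of K rate vectors, one per state (columns of E)."""
    scores = [poisson_log_likelihood(counts, rates) for rates in emission]
    return max(range(len(scores)), key=scores.__getitem__)

# Two hypothetical states for a pair of neurons.
emission = [[0.5, 0.5], [5.0, 0.1]]
state = best_emission_state([5, 0], emission)
```

In the full model these emission terms are combined with the transition matrix and initial state probabilities by the forward-backward and Viterbi algorithms.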
We assumed the dynamics of the neural states to evolve according to a first-order Markovian process, where the probability of transitioning from one state to another depends only on the current state. This process is summarized by the K x K "transition probability" matrix T. Additionally, we incorporated an initialization vector A, which provides the probability of starting in each state. The HMM was fully described by the set of parameters {E, T, A}, which were inferred by fitting the model to the recorded neuronal spike counts 81. We used the Baum-Welch expectation-maximization algorithm to update the model parameters and maximize the likelihood of the observed data. For each time series, we fitted 5 models with a maximum of 100 iterations for each value of the total number of states ranging from 2 up to 50, using randomized initial conditions. The model with the smallest Akaike Information Criterion (AIC) score was retained as the best model for further analyses 52. Subsequently, we used the Viterbi algorithm to estimate the most likely sequence of states over time.
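Model selection by AIC can be sketched as follows, using AIC = 2k - 2 ln L. The count of free parameters k shown here (K x N emission rates, K x (K - 1) transition probabilities, K - 1 initial probabilities) is a common convention for a Poisson HMM and is an assumption; the original analysis may have counted parameters differently. The log-likelihood values are made up for illustration.

```python
def n_free_parameters(n_states, n_neurons):
    """Free parameters of a Poisson HMM with K states and N neurons."""
    emission = n_states * n_neurons          # one rate per neuron per state
    transition = n_states * (n_states - 1)   # each row of T sums to 1
    initial = n_states - 1                   # A sums to 1
    return emission + transition + initial

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def select_n_states(log_likelihoods, n_neurons):
    """log_likelihoods: dict mapping number of states to the best
    log-likelihood obtained across random restarts."""
    scores = {k: aic(ll, n_free_parameters(k, n_neurons))
              for k, ll in log_likelihoods.items()}
    return min(scores, key=scores.get), scores

# Hypothetical fits for a session with 5 neurons.
best_k, scores = select_n_states({2: -500.0, 3: -450.0, 4: -449.0}, n_neurons=5)
```

In this toy example the gain in likelihood from 3 to 4 states is too small to offset the additional parameters, so 3 states are selected.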

Agglomerative clustering analysis
To better characterize the spatial structure of the hidden states, we examined the pairwise correlation between the inferred activity of the states. For state 1 with an activity vector X = (x1, x2, ..., xN), where xi represents the activity of neuron i, and state 2 with an activity vector Y = (y1, y2, ..., yN), we computed the Pearson correlation coefficient ρ(X,Y) to assess the distance between the states in the neuronal activity space. We calculated the correlation coefficients for all pairs of states and stored them in an N x N correlation matrix K. Subsequently, we performed agglomerative clustering on the correlation matrix.
Specifically, we defined a new distance matrix D as 1 - K, where 1 is an N x N matrix of ones. This matrix served as the input to the agglomerative clustering algorithm, which iteratively combines states to define new clusters according to the pairwise distance. The algorithm initialized each state as a separate cluster with minimum distance (maximum correlation) and iteratively merged two clusters v and u with the smallest distance into a new cluster. The new distance d assigned to the agglomerated clusters was defined as d(u, v) = max(dist(u[p], v[q])), where p and q represent all the points in the merged clusters u and v, also known as Farthest Point Algorithm. Agglomerative clustering has the advantage of producing a hierarchical structure of clusters, which we represented as a dendrogram. This hierarchical representation allowed us to examine the relationships and similarities between states, specifically how neural states may be nested differently within large clusters in different groups. Importantly, agglomerative clustering does not require any assumption regarding the total number of clusters. It iteratively merges the closest states and clusters until all states are merged into one final cluster. We performed the clustering analysis separately for each mouse, visualizing the results with a dendrogram that summarizes the merging of clusters at different levels of distance, ranging from 0 (original states) to 1 (a single cluster).
After examining the clusters, we counted the total number of clusters at different levels of distance, or thresholds, where the higher the levels of distance, the lower the number of clusters, until reaching only one cluster at the highest distance. We assessed the curves of the number of total clusters and the proportion of total clusters retained relative to the total number of states as a function of thresholds. Comparing these curves between two groups, a higher number of clusters at the same threshold value indicates a greater degree of dissimilarity among the inferred states. We retained the proportion of total clusters along these curves from a threshold of 0.1 up to 0.5, resulting in a total of five features that were subsequently used in the decoding of group identity. We applied the clustering analysis to the pre-task activity using the previously inferred states described in the "Hidden Markov Model" section, as well as to the Pre-reward and Post-reward task periods for water and sucrose trials separately.
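The clustering and cluster-counting steps can be sketched in plain Python as below. The code implements complete linkage (the distance between two clusters is the largest pairwise distance between their members) on a precomputed distance matrix D = 1 - K, and counts the clusters that remain when the dendrogram is cut at a given threshold. It is a naive illustrative implementation with hypothetical names and toy data, not the code used in the study.

```python
def complete_linkage_heights(D):
    """Run complete-linkage agglomerative clustering on a symmetric
    distance matrix D and return the distance of each merge, in order."""
    clusters = [[i] for i in range(len(D))]
    heights = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: farthest pair of points across clusters.
                d = max(D[p][q] for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        heights.append(d)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return heights

def n_clusters_at(heights, n_states, threshold):
    """Each merge at or below the threshold removes one cluster."""
    return n_states - sum(1 for h in heights if h <= threshold)

# Toy distance matrix (1 - correlation) for three hypothetical states.
D = [[0.0, 0.1, 0.9],
     [0.1, 0.0, 0.8],
     [0.9, 0.8, 0.0]]
heights = complete_linkage_heights(D)
proportions = [n_clusters_at(heights, len(D), t) / len(D) for t in (0.25, 0.5, 0.95)]
```

Because merge distances never decrease under complete linkage, counting the merges below a threshold is equivalent to cutting the dendrogram at that height.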

Correlation of population activity across time
To examine how variable population activity was across time during the pre-task period, we performed Pearson's correlation on population vectors of neuron firing rates across all time bins (1s bins). The correlation values were then averaged to assess differences between groups.

Decoding group identity using neural features
This analysis aimed to decode the group identity (i.e., control, susceptible, or resilient, or saline vs. CNO for DREADD data) on a single-mouse basis by analyzing the pre-task activity, where no behavioral labels were available. As described in the "Analysis of embedding dimensionality" section, the pre-task activity can be represented as a geometrical object in the firing space, with each axis representing the firing rate of a neuron and each point in the space representing the activity of the neuronal ensemble in each time bin. We sought features that characterized the representational object and were invariant to rotations and scaling transformations, or a subset of these transformations, ensuring shape invariance of the object. We included only mice with at least 5 neurons simultaneously recorded during the pre-task period. For each mouse, we computed the cumulative variance explained across the principal components (PCs) (see "Analysis of embedding dimensionality" section for more details). We considered the cumulative values of the first three PCs as features for decoding. Following the inference of hidden states and the clustering analysis, we calculated the proportion of clusters retained at different thresholds and extracted the values at five distinct thresholds (see "Agglomerative clustering analysis" section). Additionally, we computed the mean and standard deviation of the spike count as the last two features. All of the neural features were computed using 1s bins to optimize the final decoding performance. Overall, we assessed a total of 10 neural features for each mouse.
We used the same Mahalanobis binary decoder procedure as previously described in the "Decoding group identity using behavioral features" section. Specifically in this case, the input to the binary classifier consisted of an N x F training matrix and a 1 x F testing matrix, where N represented the total number of training mice between the two classes, and F = 10 represented the total number of features. Prior to running the classification algorithm, we preprocessed the input matrices by applying a Min-Max Scaler to the mean and standard deviation of the spike count, ensuring that all features were scaled between 0 and 1 (because the PC cumulative variance and fraction of HMM clusters are defined between 0 and 1 by construction). The decoder was trained and tested for 1000 iterations, where in each of them a new random testing subject was selected and removed from the training set.
The same decoder procedure was also applied during the Pre-reward and Post-reward periods of the task. For the decoding using vCA1 activity, the training set was defined as 20% of the total number of mice due to the initial larger sample size.

Population decoding
Similar to a previously described method 46, a linear support vector machine (SVM) classifier was trained to classify patterns of activity into two discrete categories. Results are reported as the generalized performance of the decoder using cross-validation with 80/20 training/testing split. Patterns of activity are defined as the mean firing rate during 0.5 s non-overlapping time bins. Pseudo-population recordings were generated by combining all neurons within the same region and the same group. As it is well known that neural activity in previous trials could strongly influence activity in current trials 82, for all pseudo-population decoding analyses, we balanced the number of trials of each trial type by taking into account both the previous and current trial types. In other words, we have equal numbers of water/water, sucrose/sucrose, water/sucrose, and sucrose/water trials (previous/current trials, respectively). Only neurons with at least 8 trials per each of the 4 trial types were included.
To decode current reward, we combined equal numbers of water/water and sucrose/water trials for water trials, and similarly, equal numbers of sucrose/sucrose and water/sucrose trials for sucrose trials.
To decode previous reward, we combined equal numbers of water/water and water/sucrose trials for water trials, and similarly, equal numbers of sucrose/water and sucrose/sucrose trials for sucrose trials.
To decode intention (stay vs. switch), we combined equal numbers of sucrose/sucrose and water/water trials for stay trials, and similarly, equal numbers of sucrose/water and water/sucrose trials for switch trials.
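The trial-balancing scheme for the three decoding targets can be sketched as follows. Trials are labeled by their previous/current pair (e.g., 'WS' for water followed by sucrose), and an equal number of trials is drawn from each of the four pair types before they are grouped into the two classes. The function name, the dictionary of targets, and the example trial counts are hypothetical; the class definitions follow the three sentences above.

```python
import random

def balanced_indices(trial_pairs, class_a, class_b, rng):
    """Draw equal numbers of trials from every pair type and return the
    trial indices assigned to each of the two classes."""
    by_type = {}
    for i, pair in enumerate(trial_pairs):
        by_type.setdefault(pair, []).append(i)
    # The rarest pair type sets the number of trials drawn per type.
    n = min(len(by_type.get(t, [])) for t in class_a + class_b)

    def pick(types):
        return [i for t in types for i in rng.sample(by_type.get(t, []), n)]

    return pick(class_a), pick(class_b)

# Pair labels are previous/current reward, as in the text.
TARGETS = {
    "current reward": (["SS", "WS"], ["WW", "SW"]),   # sucrose vs. water now
    "previous reward": (["SW", "SS"], ["WW", "WS"]),  # sucrose vs. water before
    "intention": (["SS", "WW"], ["SW", "WS"]),        # stay vs. switch
}

trials = ["SS"] * 12 + ["WW"] * 5 + ["SW"] * 7 + ["WS"] * 6  # hypothetical session
rng = random.Random(0)
stay, switch = balanced_indices(trials, *TARGETS["intention"], rng)
```

Balancing on the pair types rather than on the two classes alone ensures that, for example, a decoder of current reward cannot exploit differences in how often each reward follows itself.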
As each group may have different number of cells and trials, we used subsampling procedures to randomly subsample cells (60 neurons for both BLA and vCA1), and within those cells, randomly subsample trials equal to the group with the smallest number of trials. The resulting dataset was used to train SVM and obtain cross-validated decoding accuracies. For each set of subsampled cells, decoding accuracies across random subsampling of trials (repeated 10 times) were averaged to obtain a single sample of decoding accuracy. We repeated the whole procedure 10 times to obtain statistical comparisons across groups and against shuffled distribution.
For within-time-bin decoding, SVMs were trained using data from one time bin and tested using held-out data from the same time bin. For cross-time-bin decoding, SVMs were trained using data from one time bin and tested using data from the other time bins.
For statistical comparisons, decoding accuracy during Pre-reward (-4 to -3s) and Post-reward (0 to 1s) periods was averaged. If the mean decoding accuracy in a group is significantly higher than 2 x STD of its respective mean shuffled distribution, we then performed additional between-group comparisons (2-way comparison: Mann-Whitney test; 3-way comparison: Kruskal-Wallis test followed by Dunn's multiple comparisons test).

Decoding switch vs. stay using HMM states
In addition to using recorded firing rates during the 4s pre-reward window to decode switch vs. stay, we also trained separate decoders using the smoothed activity of the hidden states inferred by the HMMs. This approach uniquely allowed us to identify population hidden states within this time window, and specifically those states that may be intention-selective, which can then be artificially manipulated to assess their necessity in decoding. It is important to note that the training of the HMM was performed on concatenated trials, which includes the four 1s bins pre-reward across all trial types. We then rearranged the sequence of hidden states in each trial type a posteriori.
Once the parameters of the HMMs were inferred, the models could smooth the observed data by computing the mean observed activity under the posterior distribution of hidden states 53. For instance, given the observed activity vector X during a time bin of a trial pre-reward, the HMM inferred a 0.2 probability of being in state S = 1 and a 0.8 probability of staying in state S = 2. More precisely, P(S = 1 | X) = 0.2, and P(S = 2 | X) = 0.8. The smoothed observations used to train and test the linear decoder were calculated as Y = 0.2μ1 + 0.8μ2, where μj represents the inferred mean for the observations in state j.
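The smoothing step in the example above is a posterior-weighted average of the state means. A minimal sketch for one time bin, with hypothetical state means for two neurons, is:

```python
def smooth_observation(posterior, state_means):
    """Posterior-weighted mean activity for one time bin.

    posterior:   P(S = j | X) for each of the K states (sums to 1).
    state_means: K vectors mu_j, each with one inferred mean per neuron.
    """
    n_neurons = len(state_means[0])
    return [sum(p * mu[i] for p, mu in zip(posterior, state_means))
            for i in range(n_neurons)]

# Two states, two neurons: Y = 0.2 * mu_1 + 0.8 * mu_2.
mu = [[1.0, 2.0], [3.0, 4.0]]
y = smooth_observation([0.2, 0.8], mu)
```

For these values the smoothed activity is approximately (2.6, 3.6), which lies between the two state means and closer to the more probable state.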
To ensure robustness, we randomly sampled 60 neurons from each mouse for 10 neuronal subsamples. We generated 1000 pseudo trials for each of the four trial types, resulting in a total of 4000 pseudo trials for the training and testing sets, separately. The input data to train and test the decoder consisted of the smoothed activity assigned to each time bin. We trained and tested a support vector machine (SVM) classifier with a linear kernel, similar to the approach used in the population decoding using original firing rates, to decode switch vs. stay. In each cross-validation iteration, we randomly selected 100 pseudo trials as the training set and 20 pseudo trials as the testing set, for a total of 100 cross-validations. The final decoder accuracy was computed as the average across neuronal subsamples and cross-validations.
To assess the significance of the decoding signal, we compared it to a chance level, defined as 2 x standard deviations (STD) around the theoretical mean of the distribution of accuracies obtained after 100 shuffles of the labels.

Defining intention-selective states
We conducted a detailed analysis of the distribution of hidden states across trial types to identify intention-selective states. For each mouse, we computed the fraction of occurrence of each hidden state within the four 1s pre-reward bins across all trials. This distribution was then normalized to the total number of trials multiplied by the number of bins. We assessed this normalized distribution separately for each trial type.
Consistent with the decoding results, we observed that certain states appeared exclusively in either the stay or switch trials, with no occurrences in the other trial types. To quantify the amount of information each state held for the intention value (i.e., stay or switch), we computed the Shannon entropy 83. Specifically, for a given state, we normalized its occurrence frequency in each trial type to the total number of trials. The entropy of each state for the intention value was calculated using the following formula: Hstate = -Pswitch x log(Pswitch) - Pstay x log(Pstay), where Pswitch is the occurrence frequency of the state in switch trials (WS, SW) and Pstay = 1 - Pswitch. An entropy value of 0 indicates that the state provides highly informative signals for the intention to switch or stay. Therefore, we defined an intention-selective state as one with an entropy value of 0 for the intention value.
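The entropy criterion can be sketched as below. The logarithm base is not specified in the text; base 2 is used here as an assumption, which does not affect the zero-entropy criterion. Terms with zero probability are taken to contribute zero, following the usual convention that p log p tends to 0 as p tends to 0. The function names are hypothetical.

```python
import math

def state_entropy(p_switch):
    """Binary Shannon entropy (in bits) of a state's occurrence across
    switch and stay trials, with P_stay = 1 - P_switch."""
    h = 0.0
    for p in (p_switch, 1.0 - p_switch):
        if p > 0:  # convention: 0 * log(0) = 0
            h -= p * math.log2(p)
    return h

def is_intention_selective(p_switch, tol=1e-12):
    """A state is intention-selective if its entropy is (numerically) zero."""
    return state_entropy(p_switch) <= tol

h_exclusive = state_entropy(1.0)  # state seen only in switch trials
h_mixed = state_entropy(0.5)      # state seen equally in both trial types
```

A state that occurs only in switch trials or only in stay trials has zero entropy, whereas a state that occurs equally often in both has the maximum entropy of 1 bit.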
To decode the intention of switch/stay using hidden states, we first examined the distribution of the fraction of intention-selective states at different clustering thresholds for each mouse, and selected a threshold that yielded the highest number of intention-selective states. We then used the inferred firing rates from these identified intention-selective states to train a linear decoder for classifying the intention of mice to switch or stay.
To compare the fraction of intention-selective states across groups, we calculated the fraction of the intention-selective states out of the total number of hidden states using the first four clustering thresholds (ranging from 0.1 up to 0.4, stepped by 0.1), and compared the resulting distribution.
To examine the necessity and sufficiency of intention-selective states, we first excluded trials that contained intention-selective states in at least 3 time bins pre-reward. In the opposite approach, we enhanced the presence of intention-selective states in the decoding procedure by considering only those trials that included intention-selective states in at least 3 time bins before the reward delivery.

Multi-dimensional scaling
To visualize the geometric structure of the data, we used Multi-Dimensional Scaling (MDS) transformation to obtain a low-dimensional representation of the data. For pre-task data, we started with the N x F matrix used for the Mahalanobis decoder, where N represents the total number of subjects across all three groups, and F denotes the number of features employed for decoding the group identity.
Prior to the dimensionality reduction analysis, we normalized each group's data by its variance to reduce noise and enhance the clarity of the final visualization. Next, we performed a diagonalization of the dissimilarity matrix N x N, which contained the Euclidean distances between each pair of subjects in the feature space. We used the same procedure for the task period. In these cases, the input matrix was a T x N matrix, where T represents the total number of pseudo trials, and N denotes the number of neurons.

Fig. 1. Distinct behavioral signatures of resilient and susceptible mice following CSDS. (A) Schematic of SPT and Neuropixels recording protocol following CSDS. (B) Schematic of SPT protocol. Example lick rasters were from two example mice with different sucrose preferences. Top, from a stress-resilient

Fig. 2. Enhanced representations of reward choice in stress resilient mice. (A) Schematic of SPT. (B) Trial-averaged firing rates in sucrose and water trials from example BLA cells, with respective auROC during Pre-reward (grey, -4 to -3s) and Post-reward (black, 0 to 1s). Scale bars: 1 spike/1s unless otherwise stated. (C) In BLA, resilient group showed the highest fraction of selective neurons during both Pre- (n = 69 total neurons, Fisher's exact tests, P < 0.0001) and Post-reward (Fisher's exact tests, P < 0.001), in comparison to control (n = 132 total neurons) and susceptible group (n = 68 total neurons). In vCA1, both susceptible (n = 283 total neurons) and resilient (n = 528 total neurons) groups showed higher fraction of selective neurons than controls (n = 143 total neurons) during Pre- (Fisher's exact tests, P < 0.001) and Post-reward (Fisher's exact tests, P < 0.0001). (D) Schematic of population decoding of current and future reward choices. Linear support vector machine (SVM) classifier was trained to distinguish between water and sucrose trials. (E) In BLA, resilient mice showed higher decoding accuracy than chance during Pre-reward, and the highest decoding accuracy among all groups during Post-reward (colored lines indicate mean of subsampling (n = 10 of n = 60 neurons, with n = 100 cross-validations), Kruskal-Wallis, P < 0.0001). (F) In vCA1, resilient mice showed higher decoding accuracy than susceptible mice during Pre-reward (colored lines indicate mean of subsampling (n = 10 of n = 60 neurons, with n = 100 cross-validations), Mann-Whitney, P = 0.045) and Post-reward (Kruskal-Wallis, P = 0.0011). Data are mean ± STD. # Significantly different from chance; * P < 0.05; ** P < 0.01.

Fig. 4. Distinct neural signatures of CSDS mice in the absence of task. (A) Schematic of analysis in pre-task. HMM was used to identify hidden states (S) and states similarity was assessed using agglomerative clustering. (B) Examples of states correlation heatmap from BLA of a control (n = 19 hidden states) and a susceptible mouse (n = 17 hidden states) and respective agglomerative clustering (dendrograms on right). (C) Susceptible mice had more distant hidden states in BLA (Mann-Whitney, control (n = 5 mice) vs. susceptible (n = 5 mice) P < 0.05 for all thresholds except 0; resilient (n = 5 mice) vs. susceptible P < 0.05 for all thresholds except 0.4-0.6). (D) Proportion of clusters (1-r threshold = 0.5) was correlated with animals' behavior (n = 15 mice, Spearman's correlation). (E) Mahalanobis decoder

. 1 .
Decoding of group identity using behavioral features. (A) Schematic of the Mahalanobis decoder trained on behavioral features to decode group identity. (B) As further verification that behavioral features differed between the groups classified using K-means clustering, group identity can be successfully decoded using a Mahalanobis decoder trained on behavioral features including lick rate discrimination index (DI) during Pre- and Post-reward, sucrose preference, and social interaction ratio.

Extended Data Fig. 2. Intention selectivity in BLA as a unique susceptibility signature. (A and B) Decoding accuracy of reward choice in the current trial in the BLA when the previous trial was (A) water (WS vs. WW) or (B) sucrose (SW vs. SS). In the resilient group, upcoming reward choice can be decoded during the Pre-reward time window when the previous trial was water, but not when the previous trial was sucrose. In the susceptible group, upcoming reward choice during Pre-reward can be decoded regardless of the previous trial choice. Colored lines indicate mean of subsampling (n = 10 of n = 60 neurons, with n = 100 cross-validations). (C) Susceptible mice (n = 12) showed fewer consecutive sucrose trials (ANOVA, effect of group: F(2,57) = 7.60, P = 0.0012) and a greater number of consecutive water trials in comparison to control (n = 15) and resilient mice (n = 33, ANOVA, effect of group: F(2,57) = 25.09, P < 0.0001). (D) Sucrose and water trials were further divided into sucrose/sucrose (SS), water/sucrose (WS), water/water (WW), and sucrose/water (SW) trials after taking into account the previous trial. Comparison of the proportion of trials in each of the 4 trial types revealed that controls showed a greater proportion of switch trials (WS, SW). Resilient mice (n = 33) showed the greatest proportion of SS trials, while susceptible mice (n = 12) showed the greatest proportion of WW trials (RM-ANOVA, trial type × group interaction: F(6,171) = 39.99, P < 0.0001). (E and F) Lick rates for each of the 4 trial types during the (E) Pre-reward (RM-ANOVA, trial type × group interaction: F(6,171) = 2.38, P = 0.031) and (F) Post-reward period (RM-ANOVA, trial type × group interaction: F(6,171) = 9.80, P < 0.0001). (G) In vCA1, decoding accuracy of switch vs. stay intention using raw firing rates in susceptible and resilient mice was above chance. Colored lines indicate mean of subsampling (n = 10 of n = 60 neurons, with n = 100 cross-validations). (H) Schematic of HMM to obtain population hidden states. (I) Similarly, in vCA1, decoding accuracy of switch vs. stay intention using inferred firing rates from HMM in the 4 s preceding reward delivery in susceptible and resilient mice was above chance (n = 100 cross-validations; chance: n = 100 shuffles). (J) MDS visualization of inferred firing rates showed that population representations of switch and stay trials were more linearly separable in vCA1 neurons of susceptible and resilient mice than in controls (MDS example of n = 1 subsampling, n = 1000 pseudo trials/condition). (K) Average distribution of the fraction of intention-states across mice at different entropy values (state correlation thresholds of 1 - r = [0.1, 0.4]: control n = 5 mice, susceptible n = 5 mice, resilient n = 3 mice). For each mouse, the state entropy was computed at a fixed threshold on the clustering dendrogram (see Methods). Data are mean ± STD. Data are mean ± SEM unless otherwise stated. # Significantly different from chance; * P < 0.05, ** P < 0.01.

Extended Data Fig. 3.
Distinct neural signatures of CSDS mice in the absence of a task. (A) Number of neurons used in BLA and vCA1 in each mouse for analysis (BLA n = 5 mice per group, vCA1 control n = 12, susceptible n = 14, resilient n = 31 mice). (B) Schematic of dimensionality reduction using principal component analysis (PCA). The embedding dimensionality was quantified using the participation ratio. (C) There were no statistically significant differences in cumulative variance explained by principal components (PCs) in BLA and vCA1 between groups (n = 1000 subsampling, n = 5 neurons). Data are mean ± SEM. (D) Mean of the cumulative variance of the first 3 principal components in BLA and vCA1. (E) Participation ratio of BLA and vCA1. (F) Akaike information criterion (AIC) from one example mouse in each group. The HMM with the lowest AIC was selected as the best model (n = 5 models per number of states). (G) Example raster and HMM states from one representative mouse. Colored lines indicate different hidden states and their posterior probability. (H) Two examples of HMM state correlation matrices for one control (n = 20 hidden states) and one susceptible (n = 20 hidden states) mouse, with respective dendrograms of agglomerative clustering in vCA1. (I) There was no difference in the proportion of distant hidden states in vCA1 between groups. (J) The number of clusters across thresholds did not differ between groups in BLA and vCA1. Data are mean ± SEM.
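The cluster counts across correlation thresholds in panels (H) to (J) can be illustrated with a minimal sketch. It assumes average-linkage agglomerative clustering on the distance 1 - r, which the legend does not specify, and cuts the dendrogram at each threshold. The function name `count_clusters` and the 4-state correlation matrix are hypothetical and chosen only for illustration.

```python
def count_clusters(corr, threshold):
    """Count clusters after cutting an average-linkage dendrogram.

    corr      : square list of lists of correlations between hidden states
    threshold : cut height on the distance 1 - r
    Average linkage is an assumption; the legend only says 'agglomerative'.
    """
    clusters = [[i] for i in range(len(corr))]  # one cluster per state

    def avg_dist(a, b):
        # Mean pairwise distance (1 - r) between members of two clusters.
        total = sum(1.0 - corr[i][j] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = avg_dist(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        if best[0] > threshold:
            break  # every remaining merge lies above the cut
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return len(clusters)


# Toy example: states 0 and 1 are near duplicates, states 2 and 3 are similar.
corr = [
    [1.00, 0.95, 0.10, 0.10],
    [0.95, 1.00, 0.10, 0.10],
    [0.10, 0.10, 1.00, 0.75],
    [0.10, 0.10, 0.75, 1.00],
]
thresholds = [0.1, 0.2, 0.3, 0.4]
counts = [count_clusters(corr, t) for t in thresholds]  # -> [3, 3, 2, 2]
proportions = [c / len(corr) for c in counts]  # clusters per hidden state
```

Under these assumptions, a population whose hidden states are mostly distinct keeps a high proportion of clusters even at permissive thresholds, which is the kind of quantity summarized across thresholds in the legend.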
(K) The number of clusters of individual mice. (L) Mean of the proportion of clusters in the first 5 thresholds showed that the BLA of susceptible mice had a greater proportion of unique hidden states in comparison to controls (Kruskal-Wallis, P = 0.0018). No group difference was found in vCA1. (M) Two example heatmaps of population activity correlation in BLA over time, showing that population activity patterns were much more correlated in the control (top) than the susceptible (bottom) mouse. (N) Average correlation of population activity across time in the BLA showed a trend towards less correlated activity in the susceptible mice. (O) In BLA, susceptible mice showed reduced firing rate mean (Kruskal-Wallis, P = 0.020) and STD (Kruskal-Wallis, P = 0.0077) in comparison to controls. (P) Firing rate (FR), PCA, and HMM features each alone could successfully decode control vs. susceptible mice in BLA. (Q) Different time bin sizes were tested and the one that allowed the highest decoding accuracy between groups was chosen as the optimal bin size (n = 100 cross-validations; chance: n = 100 shuffles). (R) Group identity could not be decoded using a Mahalanobis decoder trained on neural features in vCA1. The importance of each neural feature in decoding was examined by systematic removal of each of the features (subsequent columns). Data are mean ± STD, unless otherwise stated. Chance distributions are ± 2 × STD around theoretical chance level. # Significantly different from chance; * P < 0.05; ** P < 0.01.

Extended Data Fig. 4. Dysfunctional single-cell and population-level correlates for reward choice in mice susceptible to chronic stress. (A) Cumulative variance of PCs in BLA and vCA1 during Pre-reward and Post-reward periods revealed no difference between groups (n = 1000 subsampling, n = 5 neurons). (B) Participation ratio of BLA and vCA1 during Pre-reward and Post-reward periods showed no difference between groups. Data are mean ± STD.
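The participation ratio reported above is commonly defined as PR = (Σλ)² / Σλ², where λ are the eigenvalues of the covariance matrix of population activity. The sketch below assumes that standard definition (the legend does not give the formula) and a hypothetical input layout of time bins by neurons. Because the covariance matrix is symmetric, the sum of eigenvalues equals its trace and the sum of squared eigenvalues equals the sum of its squared entries, so no eigensolver is needed.

```python
def participation_ratio(activity):
    """Participation ratio of a (time bins x neurons) activity matrix.

    PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues) of the
    covariance matrix C, computed as trace(C)^2 / sum of squared entries of C.
    """
    n_bins = len(activity)
    n_neurons = len(activity[0])
    # Center each neuron's activity across time bins.
    means = [sum(row[j] for row in activity) / n_bins for j in range(n_neurons)]
    centered = [[row[j] - means[j] for j in range(n_neurons)] for row in activity]
    # Sample covariance between every pair of neurons.
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n_bins - 1) for j in range(n_neurons)]
        for i in range(n_neurons)
    ]
    trace = sum(cov[i][i] for i in range(n_neurons))
    sum_sq = sum(c * c for row in cov for c in row)
    return trace * trace / sum_sq


# Two neurons that co-vary perfectly occupy one dimension (PR close to 1).
correlated = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
# Two uncorrelated neurons with equal variance occupy two dimensions (PR close to 2).
independent = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]

pr_low = participation_ratio(correlated)
pr_high = participation_ratio(independent)
```

The value ranges from 1, when a single component carries all the variance, to the number of neurons, when variance is spread evenly, which is why it serves as a measure of embedding dimensionality.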
(C) The proportion and number of HMM clusters across different thresholds showed no statistical difference between groups, but BLA neurons in susceptible mice showed a trend towards a higher proportion of unique clusters. (D) Trial-averaged firing rates of pseudo-populations of BLA and vCA1 neurons across groups. Numbers of neurons are labeled in corresponding group colors. (E) Group identity could be decoded better than chance during specific time windows and trial types (n = 100 cross-validations; chance: n = 100 shuffles). Data are mean ± SEM unless otherwise stated. Chance distributions are ± 2 × STD around theoretical chance level. # Significantly different from chance.

Extended Data Fig. 5. Rescue of dysfunctional vCA1-BLA activity and anhedonia by circuit-specific manipulations. (A) Number of neurons used in BLA and vCA1 in each mouse during saline and CNO period for analysis (BLA n = 11 mice per group, vCA1 n = 12 mice per group). (B) Mean vCA1 firing rate during the pre-task period was increased following CNO (Mann-Whitney, P = 0.008). Data are mean ± STD. (C) Cumulative variance of PCs did not differ between saline and CNO periods in BLA or vCA1. (D) Mean of the cumulative variance of the first 3 PCs in BLA and vCA1. Data are median ± STD. (E) Participation ratio of BLA and vCA1. (F) Proportion of HMM clusters across different thresholds in BLA and vCA1. (G) The number of HMM clusters across different thresholds in BLA and vCA1. (H) The number of HMM clusters of individual mice across different thresholds in BLA and vCA1. Data are mean ± STD. (I) Mean of the proportion of clusters in the first 5 thresholds. Data are mean ± STD. (J) Despite no statistical difference in each of the FR, PCA, and HMM features, the decoder trained using all features could successfully decode saline vs. CNO periods in vCA1 better than chance (± 2 × STD, n = 100 cross-validations; chance: n = 100 shuffles). (K) Firing rates of pseudo-populations of BLA (n = 76) and vCA1 (n = 274) neurons during task showed that vCA1 neurons had elevated firing rates after CNO during both Pre-reward (RM-ANOVA, effect of CNO: F(1,1092) = 7.60, P = 0.0060) and Post-reward periods (RM-ANOVA, effect of CNO: F(1,1092) = 9.57, P = 0.0020). (L) Removal of trials containing intention-selective states (-Intention-selective states) during the saline period reduced decoding accuracy of switch vs. stay trials to chance, whereas keeping only trials containing intention-selective states (+Intention-selective states) allowed successful decoding of stay vs. switch trials. Removal of trials with random states had little effect on decoding accuracy. Chance distributions are ± 2 × STD around theoretical chance level. (M) MDS visualization showed that keeping only intention-selective states allowed the representations of switch trials to be linearly distinguished from stay trials, whereas removal of intention-selective states prevented the representations of the two trial types from being linearly separated. (N) CNO increased lick rate discrimination index during the Post-reward period in comparison to saline (n = 7 mice, RM-ANOVA, treatment × time interaction: F(1,12) = 10.80, P = 0.0065). (O) CNO increased the proportion of SS trials (n = 7 mice, RM-ANOVA with Bonferroni's multiple comparisons test, trial type × treatment interaction: F(3,36) = 6.23, P = 0.0016). (P) CNO altered the proportion of stay (water-water and sucrose-sucrose) trials (n = 7 mice, RM-ANOVA with Holm-Sidak's multiple comparisons test, effect of group: F(1,12) = 5.61, P = 0.036). (Q) CNO altered the proportion of switch (water-sucrose and sucrose-water) trials (n = 7 mice, RM-ANOVA, effect of group: F(1,12) = 5.61, P = 0.036). (R) Mice microinfused with the control virus (AAV-DIO-mCherry) and given CNO showed no change in sucrose preference (n = 18 mice). Data are mean ± SEM unless otherwise stated. # Significantly different from chance; * P < 0.05, ** P < 0.01.
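Several panels above (Extended Data Fig. 1B, 3P-R, and 5J) rely on a Mahalanobis decoder compared against a shuffled-label chance distribution. The sketch below shows one way such a decoder could work: each sample is assigned to the class whose mean is closest in Mahalanobis distance. The pooled within-class covariance, the leave-one-out scheme, the function names, and the two-feature toy data are all assumptions made for illustration and are not taken from the paper's Methods.

```python
import random


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z


def fit(X, y):
    """Return class means and the pooled within-class covariance (assumed)."""
    d = len(X[0])
    means = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        means[label] = [sum(r[j] for r in rows) / len(rows) for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for x, lab in zip(X, y):
        diff = [x[j] - means[lab][j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j]
    dof = len(X) - len(means)
    return means, [[c / dof for c in row] for row in cov]


def predict(x, means, cov):
    """Assign x to the class with the smallest squared Mahalanobis distance."""
    best_label, best_d2 = None, float("inf")
    for label, mu in means.items():
        diff = [xi - mi for xi, mi in zip(x, mu)]
        d2 = sum(di * zi for di, zi in zip(diff, solve(cov, diff)))
        if d2 < best_d2:
            best_label, best_d2 = label, d2
    return best_label


def loo_accuracy(X, y):
    """Leave-one-out cross-validated decoding accuracy."""
    hits = 0
    for k in range(len(X)):
        means, cov = fit(X[:k] + X[k + 1:], y[:k] + y[k + 1:])
        hits += predict(X[k], means, cov) == y[k]
    return hits / len(X)


def shuffled_chance(X, y, n_shuffles=100, seed=0):
    """Accuracies after shuffling labels, as an empirical chance distribution."""
    rng = random.Random(seed)
    accs = []
    for _ in range(n_shuffles):
        y_shuf = y[:]
        rng.shuffle(y_shuf)
        accs.append(loo_accuracy(X, y_shuf))
    return accs


# Hypothetical two-feature data for 8 mice (values are made up).
X = [[1.5, 2.0], [0.5, 2.0], [1.0, 2.5], [1.0, 1.5],
     [4.5, 2.0], [3.5, 2.0], [4.0, 2.5], [4.0, 1.5]]
y = ["control"] * 4 + ["susceptible"] * 4
accuracy = loo_accuracy(X, y)  # 1.0 for this well-separated toy set
chance = shuffled_chance(X, y, n_shuffles=20)
```

Dropping one feature column from `X` and recomputing `accuracy` mirrors the feature-removal comparison described for Extended Data Fig. 3R, again only as an illustration of the idea.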