Novel mechanism of HIV elite control by enriching gut dipeptides as HIV-1 antagonist but Prevotella agonist


 HIV-1 Elite controllers (EC) are a rare group among HIV-1-infected individuals who can naturally control viral replication for a prolonged period. Due to their heterogeneous nature, no universal mechanism could be attributed to the EC status; instead, several host and viral factors are discussed for playing a role. In this study, we investigated the fecal metabolome and microbiome in a Swedish cohort of EC, treatment-naïve viremic progressors (VP) and HIV-negative individuals (HC). We observed an enrichment of dipeptides in EC compared to the other two study groups. In vitro analyses identified anti-HIV-1 properties for two dipeptides that could bind to the HIV-1 gp120. Furthermore, these dipeptides supported bacterial growth of the genus Prevotella in vitro that was enriched in EC, which influences host metabolism. Thus, increased levels of both dipeptides and Prevotella could provide beneficial effects for EC. These findings open new possibilities to develop therapeutics against HIV-1.


Introduction
The elite controller (EC) phenotype represents a rare yet complex subgroup of individuals living with human immunodeficiency virus type 1 (HIV-1) who can spontaneously control viral replication for a prolonged period. Various hypotheses have been put forward for HIV-1 pathogenesis and potential disease control mechanisms. These include the presence of protective class I human leukocyte antigen (HLA) types such as HLA-B*57 and HLA-B*27 1; specific mutations or polymorphisms in AIDS-restrictive genes (ARGs) such as CCR5-Δ32 (rs333), CCR5 59029G (rs1799987), and SDF1-3′A (rs1801157); polymorphisms in the HLA and CCR5-CCR2 gene locus 2; defective HIV-1 variants 3; or natural resistance to HIV-1 infection 4. However, none of these mechanisms has turned out to be universal in EC. Several studies, including ours 5, have indicated the heterogeneous nature of EC and that elite control status is attributable to several interconnected host and viral factors.
Recently, the gut microbiota has emerged as a central player in immunity that can mechanistically link factors related to HIV-1 infection, immune activation, and inflammation 6,7. Microbiome studies in HIV-1-infected and uninfected adults have indicated significant differences in microbiota composition, richness, and diversity associated with HIV-1 infection 8. Studies have also identified additional factors that could affect gut microbes, such as sexual preferences 9, treatment status, and regimens 10,11. Our earlier study indicated that the gut microbiome could contribute to the HIV-1 elite control status 12: EC had a unique bacterial signature and a richer gut microbiota, and predicted metagenomic functions differed in several metabolic pathways between EC, treatment-naïve, and HIV-negative individuals. However, to date no study has investigated the fecal metabolic signature in EC.
High-throughput untargeted metabolomics is a powerful tool to clarify interactions between the gut microbiota and the host, as the metabolites produced by the gut microbiota, together with nutritional intake and environmental factors, play an essential role in modulating host physiology and pathology 13. Although metabolomics has opened a new perspective on the gut ecosystem, few studies have examined the interplay of the metabolome-microbiome axis in HIV-1 elite control status. In a recent study, the loss of virologic control in EC was characterized by immune-metabolic deregulation 14. Our recent study indicated a unique plasma metabolic signature in EC that contributes to the elite control status 15.
In the current study, we investigated the metabolomic and microbiome signatures in fecal samples of EC, HIV-1-infected individuals before therapy initiation, and HIV-negative controls. We aimed to analyze the correlation between fecal metabolic perturbations and elite control status and to explore specific metabolites that could potentially be used as biomarkers, as well as their role in EC status. Our study indicated that the class of dipeptides was enriched in EC. This signature was linked with anti-HIV effects together with enrichment of specific bacterial communities in EC.

Distinct metabolite profile in EC & enrichment of dipeptides
In fecal samples, a total of 825 biochemical compounds were identified. Metabolites were grouped by super pathways according to their biochemical class (lipids, peptides, carbohydrates, nucleic acids) or other biochemical compounds/substances (amino acids, cofactors and vitamins, energy, xenobiotics) (Fig. 1a). A large proportion of the detected metabolites belonged to the class of lipids (317 metabolites), followed by amino acids (193 metabolites). Principal component analysis (PCA) identified two outliers (EC06 and EC14), which could be due to technical errors, and they were excluded from further analyses (Supplementary Fig. 1). Partial least-squares discriminant analysis (PLS-DA) showed that all EC clustered closely together, indicative of a uniform metabolic profile in EC (Fig. 1b). Hierarchical clustering analysis (HCA) indicated the same (Supplementary Fig. 2). One-way ANOVA performed on the remaining samples (EC n=12; HC and VP as in the Materials and Methods section) revealed that 485 biochemicals had a group effect. The largest differences appeared between EC and VP (533 altered biochemicals) and the smallest between VP and HC (127 altered biochemicals). Among the biochemicals that differed between feces of HC and VP, 86.6% (110/127) were increased in VP, and 91.6% (305/333) of the biochemicals that were altered between HC and EC were decreased in EC. Similarly, 93.6% (499/533) of the biochemicals changed in VP feces relative to EC feces were decreased in EC (Supplementary Fig. 3). Significant differences between EC and the other two groups were observed in peptides (Fig. 1c and 1d). Interestingly, and in contrast to the general trend, almost all the detected dipeptides were increased in EC as compared to VP (Fig. 1c) and HC (Fig. 1d), whereas VP and HC had similar dipeptide levels (Supplementary Fig. 4). However, γ-glutamyl amino acids, which could be precursors of the dipeptides, showed the opposite trend (Fig. 1e).
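The directional percentages quoted above follow directly from the reported counts. As a quick arithmetic check, the short Python sketch below recomputes them; the dictionary labels are illustrative names chosen here, not identifiers from the study's own code.

```python
# Counts quoted in the text:
# (biochemicals changed in the stated direction, all altered biochemicals)
comparisons = {
    "VP vs HC, increased in VP": (110, 127),
    "EC vs HC, decreased in EC": (305, 333),
    "EC vs VP, decreased in EC": (499, 533),
}


def percent(part, total):
    """Share of `part` in `total`, as a percentage rounded to one decimal."""
    return round(100.0 * part / total, 1)


shares = {label: percent(p, t) for label, (p, t) in comparisons.items()}
for label, value in shares.items():
    print(f"{label}: {value}%")
```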
Dipeptides as an anti-HIV compound
As we observed a significant increase in dipeptides in EC, and earlier studies have indicated that dipeptides can act as anti-HIV compounds 16, we investigated whether these dipeptides could be involved in the HIV-1 elite control status by interacting with viral proteins. In silico analysis predicted that all 19 dipeptides detected in the metabolomics analyses bind to the HIV-1 surface protein gp120 with a mean binding energy of -8.67 kcal/mol (range: -12 to -5) (Fig. 2a). Of these 19, eight could be synthesized with a purity of >95%. Of these eight dipeptides, four showed binding to gp120 by microscale thermophoresis (MST) in vitro: tryptophylglycine (WG; Kd = 2.1 × 10⁻⁷ M), valylglutamine (VQ; Kd = 6.03 × 10⁻⁷ M), lysylleucine (KL; Kd = 7.5 × 10⁻⁷ M), and tyrosylglycine (YG; Kd = 8.43 × 10⁻⁸ M) (Fig. 2b-e). To find out at which sites these four dipeptides possibly interact with gp120, protein-peptide docking was investigated by in silico binding analysis. WG and VQ were predicted to bind in the same hydrophobic pocket (Fig. 2f and 2g, Supplementary Fig. 5a and 5b, Table 1), with VQ binding somewhat deeper into the pocket than WG (Supplementary Fig. 5c). A list of interaction characteristics is given in Table 1. The predicted binding site of both dipeptides is furthermore close to the CD4 binding site in gp120 (Supplementary Fig. 5d). In vitro virological assays in TZM-bl cells using the amide forms of the identified dipeptides revealed anti-HIV-1 activity of WG-am and VQ-am. WG was more potent than VQ in reducing infectivity (Table 2), with EC50 = 7.8 ± 2.2 µM (WG) and EC50 = 65.1 µM (VQ) for HIV-1 JRFL (CCR5-tropic virus), while antiviral potency against HIV-1 NL4.3 (CXCR4-tropic virus) was observed only for WG (EC50 = 28.62 ± 2.8 µM) and not for VQ (Fig. 2h and Table 2).
Thus, WG is more potent than VQ to antagonize viral entry into target cell, but both dipeptides have antiviral properties against some HIV-1 strains. The other two dipeptides did not show any anti-HIV-1 activity.
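To put the measured dissociation constants on the same scale as the predicted binding energies, a Kd can be converted to a standard binding free energy with ΔG° = RT ln(Kd). The sketch below is only an illustration of that textbook relation: the temperature of 298.15 K and the 1 M reference state are assumptions made here, since the measurement temperature is not stated in the text, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
T_KELVIN = 298.15     # assumed temperature (25 degrees C); not stated in the text

# Dissociation constants reported for the MST measurements, in mol/L
KD = {"WG": 2.1e-7, "VQ": 6.03e-7, "KL": 7.5e-7, "YG": 8.43e-8}


def binding_free_energy(kd_molar, temperature=T_KELVIN):
    """Standard binding free energy in kcal/mol, assuming a 1 M reference state."""
    return R_KCAL * temperature * math.log(kd_molar)


delta_g = {name: binding_free_energy(kd) for name, kd in KD.items()}
# Print from the tightest binder (most negative value) to the weakest
for name, dg in sorted(delta_g.items(), key=lambda item: item[1]):
    print(f"{name}: {dg:.2f} kcal/mol")
```

Under these assumptions, the four experimental values fall inside the -12 to -5 kcal/mol range predicted in silico, although docking scores and experimental free energies are not strictly comparable quantities.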
Altered microbiome-related biochemicals
HIV-1 infection is known to alter the composition of the gut microbiome 7,12. Consistent with this notion, many biochemicals partially or completely derived from the gut microbiome displayed a trending or significant decrease in EC and were elevated in VP relative to HC (Fig. 3a), for example indolepropionate (EC vs HC, p=0.005 and EC vs VP, p<0.001), indoleacetate (EC vs HC, p=0.012 and EC vs VP, p=0.001), phenylacetate (EC vs HC, p=0.045 and EC vs VP, p<0.001), indole-3-carboxylate (EC vs HC, p=0.009 and EC vs VP, p<0.001), and dimethylglycine (EC vs HC, p=0.015 and EC vs VP, p<0.001) (Fig. 3b and 3c). Many of the detected short- and medium-chain fatty acids (SCFA, MCFA), which are also produced by the gut microbiota, showed the same pattern of being decreased in EC and increased in VP as compared to HC (Supplementary Fig. 6). Notably, several secondary bile acids were elevated in VP relative to HC, including ursodeoxycholate (p=0.02), 3β-hydroxy-5-cholenoic acid (p=0.014), lithocholate (p=0.024), and glycoursodeoxycholate (p=0.002), among others (Fig. 3a and Supplementary Fig. 6). Considering that 95% of bile acids are actively reabsorbed in the ileum and the remaining 5% are excreted in the feces, these data may further reflect impaired intestinal reabsorption of secondary bile acids, leading to their accumulation in the feces of VP (Fig. 3d). Interestingly, and in contrast to the majority of microbiome-related chemicals, the tryptophan metabolite skatol (3-methylindole) and its parent compound indole were highly elevated (p<0.001 for both metabolites and in all comparisons) in EC feces relative to both HC and VP (Fig. 3b and Supplementary Fig. 6), which could suggest differential abundance of a specific bacterial class or species in EC.

Differential bacterial abundance in EC
Results of the fecal sample analyses could reflect group differences in the gut environment, which is influenced by the gut microbiota. We therefore investigated the abundance of different bacterial genera in feces of selected individuals (n=11 in each study group). The relative abundance of the nine bacterial genera (Bacteroides, Bifidobacterium, Catenibacterium, Collinsella, Eubacterium, Faecalibacterium, Oscillibacter, Prevotella, and Ruminococcus) that were detected in the samples is shown in Figure 4a and 4b. Interindividual differences within and between study groups were discernible. Most striking was the presence of bacteria of the genus Prevotella in all EC samples, whereas this genus was found in 7 out of 11 HC samples and 9 out of 11 VP samples. We further noticed that a high abundance of Prevotella was accompanied by a reduction in Bacteroides and vice versa, regardless of the study group. Thus, individuals with a very high abundance of Prevotella had very little Bacteroides (Fig. 4a). Only the relative abundance of Prevotella, and of no other genus, differed significantly in the intergroup comparison (p=0.008, Fig. 4b). Moreover, Prevotella together with Bacteroides contributed most to separating EC from HC, with a high abundance of Prevotella and a low abundance of Bacteroides in EC (Fig. 4b and 4c).
Next, we asked whether there was a link between dipeptides and microbiota and investigated the effect of the dipeptides WG and VQ on gut microbes that were cultured from patient samples. To this end, 17 bacterial strains were incubated with WG or VQ as described in the methods. Enhanced growth compared to control was observed for all four Prevotella isolates tested when incubated with a high concentration of WG, i.e. 10mM (Fig. 4d). WG at 5mM also supported the growth of P. bivia, P. disiens, and P. denticola (Supplementary Fig. 7). No changes were seen when Prevotella strains were incubated with the dipeptides at a concentration of 1mM (Supplementary Fig. 7). Only Prevotella disiens, but not the other Prevotella strains, grew significantly more when incubated with 5mM and 10mM VQ (Fig. 4e, Supplementary Fig. 7). The growth of one of the two Bacteroides fragilis isolates (isolate II) was enhanced when incubated with either WG or VQ (10mM) compared to the control (Fig. 4d and 4e). No other bacterial strain tested was affected by WG (Fig. 4d). VQ, however, had a positive effect on the growth of Enterococcus faecalis ATCC 29212 and inhibited the growth of Escherichia coli ATCC 25922, but did not affect other strains (Fig. 4e). Altogether, the bacteriological assays showed that growth of Prevotella strains is supported by the dipeptide WG. The majority of the other bacterial strains tested did not benefit from either WG or VQ, except for B. fragilis isolate II (benefit from WG and VQ) and E. faecalis (benefit from VQ).

Discussion
In this study, we characterized the fecal metabolome and microbiome of two groups of treatment-naïve HIV-1-infected patients, EC and VP, compared them to an HIV-1-negative control (HC) group, and observed a strong metabolic differentiation between EC and both VP and HC. Levels of many metabolites were decreased in EC compared to HC and VP, which could indicate enhanced nutrient absorption in the EC group, resulting in fewer nutrients in their feces, whereas VP might exhibit impaired nutrient absorption. Significant enrichment of several dipeptides was observed in all EC, and in vitro analysis identified that some of the dipeptides (WG-am and VQ-am) were antagonistic to HIV-1 while agonistic to one (VQ-am) or four (WG-am) bacterial species of the genus Prevotella, which was enriched in EC compared to the other study groups. In addition, several microbiome-derived biochemicals, including aromatic amino acid metabolites, SCFA, and MCFA, followed a similar trend, which may reflect malabsorption in VP and alterations in the growth or composition of the gut microflora.
Our data generally indicate that EC exhibit enhanced nutrient absorption relative to HC and VP (fewer nutrients in feces), while VP exhibit impaired nutrient absorption (more nutrients in feces). Numerous food components underwent a significant or trending decrease in EC relative to HC, while most of these biochemicals were either unchanged or increased in VP relative to HC. These findings are consistent with literature reporting malabsorption in HIV-infected subjects, which may be attributed to small bowel bacterial overgrowth, villous atrophy, and a reduced gastrointestinal mucosal barrier 17-20. Amino acids (essential and non-essential) as a class were significantly decreased in feces of EC and increased in feces of VP relative to HC, further supporting enhanced intestinal absorption/uptake in EC and impaired absorption in VP. Moreover, our recent study on plasma metabolites in the same cohort identified that many amino acids were significantly elevated in EC plasma samples relative to VP, which further supports the assumption of impaired amino acid absorption/uptake in VP 15. Differences in dietary intake can influence these changes, but since the present cohort lacked extensive dietary data, the study groups were instead matched by several parameters including BMI, sexual practice, and general food habits. Therefore, we hypothesize that enhanced intestinal absorption/uptake is a signature and protective effect in EC.
Furthermore, HIV-1 infection is known to alter the composition of the gut microbiome 7,21 and the immune status 22. HIV-1 status, sexual risk category, and gender also impact gut microbial diversity 23. Our earlier study indicated that the EC group was very similar to HIV-1-negative subjects regarding compositional and inferred functionality analyses and different from individuals with progressive HIV-1 infection 12. In the current study we further identified that Prevotella together with Bacteroides contributed the most to separating EC from HC, due to the high abundance of Prevotella and low abundance of Bacteroides in EC. Compared to VP, EC also had augmented populations of Prevotella and fewer Bacteroides, as shown in the intergroup comparison. Bacteroides and Prevotella are rich bacterial taxa sharing a common phylum (Bacteroidetes), where Prevotella usually dominates in the oral cavity and Bacteroides in the gut 24,25. However, when they co-occur in the gut, one of the genera outnumbers the other 26,27. A similar observation was evident in our current study, where an increased abundance of Prevotella was accompanied by a reduced abundance of Bacteroides, and vice versa, in all groups. A lower abundance of Ruminococcus and higher numbers of Oscillibacter were also observed in EC relative to VP and HC. Oscillibacter and Ruminococcus are fiber fermenters and key commensals that degrade complex polysaccharides to SCFA for nutrient utilization by intestinal epithelial cells 28.
Another remarkable feature in EC was the highly significant enrichment of dipeptides in the fecal metabolome compared to progressive infection and HIV-1-negative controls, though they were unchanged in VP relative to HC. These data could reflect a combination of increased protein intake, increased secretion of pancreatic proteases and subsequent protein digestion, and/or impaired peptide absorption in EC. However, given the overall enhanced intestinal absorption/uptake of metabolites and the presence of dipeptide transporters in the human gut, which most likely facilitate efficient transport of dipeptides into the circulation, impaired peptide absorption in EC seems less plausible 29,30. Furthermore, dipeptides were also observed in the plasma of EC, arguing against impaired peptide absorption from the gut 15. As all EC in our cohort had elevated levels of dipeptides, this metabolic signature could also serve as a potential biomarker of EC status. Earlier studies indicated that dipeptides could act as anti-HIV compounds 16. Given the presence of dipeptides in the plasma, we hypothesized that dipeptides have an anti-HIV-1 effect and that enrichment of the dipeptides in both the intestine and the systemic circulation contributes to elite control status.
We synthesized several dipeptides and observed that four dipeptides [valylglutamine (VQ), lysylleucine (KL), tryptophylglycine (WG), and tyrosylglycine (YG)] bind to the viral protein gp120 in biochemical assays, while the dipeptides WG and VQ further inhibited HIV-1 infection in cell culture assays. Though the exact mechanism needs to be elucidated, these dipeptides may act as entry inhibitors, particularly WG, which was predicted to bind close to the CD4 binding site of gp120.
Finally, we also investigated whether the microbiome changes in EC correlate with the enriched dipeptides. We observed enhanced growth of Prevotella spp. after WG (P. buccae, P. bivia, P. denticola, P. disiens) and VQ (P. disiens) treatment. This observation might relate to a study in which a Prevotella-dominated gut had a possible implication for host metabolism 31. We saw that the growth of Prevotella was supported by the dipeptide WG, and this WG-induced domination of Prevotella in EC might lead to a change in the metabolome of the respective study group. Interestingly, and in contrast to the majority of microbiome-related chemicals, the tryptophan metabolite skatol (3-methylindole) and its parent compound indole were highly elevated (p<0.001 in all comparisons) in the feces of the EC group relative to both HC and VP. In general, tryptophan is metabolized by humans to either serotonin or kynurenine. Several gut microbes also catabolize tryptophan, degrading it into indoles and indole derivatives. However, under physiological conditions, the interplay between endogenous and bacterial tryptophan metabolism is balanced 32,33. Fusobacteria is one of the most enriched phyla for indole production 34. Yet, in a study by Sasaki-Imamura et al., 6 out of 22 species of Prevotella were tested for their ability to produce indole. These six species included Prevotella intermedia, Prevotella aurantiaca, Prevotella falsenii, Prevotella micans, Prevotella nigrescens, and Prevotella pallens 35. P. intermedia, a key periodontopathogenic microbe, is also known to produce indole from L-tryptophan 36. Altogether, several studies have differentiated indole-producing Prevotella from non-indole-producing Prevotella 35. Therefore, we hypothesize that the enrichment of Prevotella in EC is directly correlated with the elevated levels of the tryptophan metabolites skatol and indole in this study group.
Indole has been recognized to play a role in bacterial biofilm formation, drug resistance, spore formation, and virulence 33,37. It has, however, also been reported to have beneficial effects in the host, such as promoting the intestinal epithelial barrier function by enhancing tight junction proteins. Furthermore, indole has anti-oxidative as well as anti-inflammatory properties in the gut. Skatol is one of the indole derivatives known for being a ligand for the aryl hydrocarbon receptor (AhR). Receptor binding leads to its activation and subsequent alteration of innate and adaptive immune responses. AhR thereby acts as a transcription factor regulating antimicrobial defense and intestinal immune homeostasis as well as inducing anti-inflammatory responses 32,33. Interestingly, in our recent study on plasma metabolomics, we also observed an enhancement in antioxidant defense pathways, together with low inflammation levels in EC as compared to VP 15. Hence, indole and skatol produced by the gut microbiota might support and contribute to EC status by providing an unfavorable intestinal environment for HIV-1 replication in EC. Moreover, both indole and skatol modulate microbial gut communities including bacteria, fungi, and viruses 32,33; thus, they might contribute to intestinal and systemic homeostasis in EC.

Lately, indole-based drugs have also been developed and tested as potent inhibitors of HIV-1 replication, for example as entry and fusion inhibitors, protease inhibitors, allosteric HIV-1 integrase inhibitors (ALLINIs), and non-nucleoside reverse transcriptase inhibitors (NNRTIs). Thus, indole derivatives are considered a promising class of HIV-1 inhibitors 38-41. Further investigation is needed to examine whether the increased indole and skatol seen in the feces of the EC group might have direct anti-HIV-1 properties, as observed for the dipeptides.
This study has limitations that merit comment. First, given that EC constitute a rare group of individuals, the number of samples is limited, and it is not possible to include more EC considering the current test and HIV-treatment guidelines 42. However, the Swedish EC cohort is an extensive EC cohort with follow-up data for more than two decades. Second, although general food habits were matched between EC, VP, and HC, we do not have extensive dietary intake data for the study groups. However, this should not bias our findings of enriched dipeptide levels and enhanced intestinal absorption/uptake in EC, given that there is no difference between VP and HC. Further, our in vitro data showed a link between the enriched dipeptides and Prevotella levels, with the latter being another signature of EC.
In conclusion, this study addressed two key aspects of HIV pathogenesis. Firstly, the metabolome analysis of EC highlighted the dipeptide richness in the fecal samples of these patients. Among the dipeptides, WG showed potential anti-HIV-1 properties by inhibiting viral entry. Secondly, the microbiome analysis suggested a link between dipeptide enrichment and higher Prevotella abundance in EC, which influences host metabolism. Thus, we hypothesize that enrichment in Prevotella leads to increased levels of the tryptophan metabolites indole and skatol, which further provides beneficial effects in EC. Further work elucidating the mechanism of the anti-HIV properties of dipeptides (WG and VQ), as well as the role of microbiota-related metabolites, e.g. those derived from bacterial tryptophan metabolism such as indole and skatol, can contribute to a better understanding of their contribution to EC status, which opens a gateway to developing therapeutics for HIV.

Study cohorts
Cross-sectional fecal samples were collected between 2010 and 2016 from three groups of individuals: i) an unbiased cohort of untreated HIV-1-infected EC (n=14), as well as age-, gender-, and body mass index (BMI)-matched ii) treatment-naïve HIV-1-infected patients with viremia (VP, n=16), and iii) HIV-1-negative controls (HC, n=12). The definitions of EC 5 43 . Dried fecal material (50mg) was used for sample preparation. Several recovery standards were added prior to the first step in the extraction process for quality control (QC) purposes. To precipitate proteins and recover the metabolites and small molecules bound to protein or trapped in the precipitated protein matrix, the samples were mixed with methanol under vigorous shaking for two minutes followed by centrifugation. The resulting extract was divided into four fractions: two for analysis by two separate reverse phase (RP)/UPLC-MS/MS methods with positive ion mode electrospray ionization (ESI), one for analysis by RP/UPLC-MS/MS with negative ion mode ESI, and one for analysis by hydrophilic interaction liquid chromatography (HILIC)/UPLC-MS/MS with negative ion mode ESI. Samples were placed briefly on a TurboVap® (Zymark) to remove the organic solvent. All mass spectrometry methods utilized a Waters ACQUITY UPLC and a Thermo Scientific Q-Exactive high resolution/accurate mass spectrometer interfaced with a heated electrospray ionization (HESI-II) source and an Orbitrap mass analyzer operated at 35,000 mass resolution. The MS analysis alternated between MS and data-dependent MSn scans using dynamic exclusion. The scan range covered 70-1000 m/z.

Data extraction, compound identification, and quantification
Raw data extraction, peak identification, QC processing, and metabolite identification were performed using an in-house proprietary library based on standards, which contains the retention time/index (RI), mass-to-charge ratio (m/z), and chromatographic data (including MS/MS spectral data) for the molecules present in the library. Peaks were quantified using the area under the curve.

Bioinformatics and statistical analysis
Welch's two-sample t-test was used to compare two groups, and one-way ANOVA was performed across the three groups. Significant metabolites (q-value < 0.1 and p-value < 0.05, two-way ANOVA) and associated sub-pathways and super-pathways were represented as a network using Cytoscape 3.6.1 (http://www.cytoscape.org/). Partial least-squares discriminant analysis (PLS-DA) was performed using the R package ropls (v1.18.8) 44. Boxplots were made using log2-transformed data and the R package ggplot2. Data were log(1+x) transformed and represented as a heatmap using the R package ggplot2.
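As a minimal illustration of the two-group comparison and the significance filter, the Python sketch below computes Welch's t statistic with Welch-Satterthwaite degrees of freedom and applies a false-discovery-rate correction. The text does not state how q-values were computed, so the Benjamini-Hochberg procedure used here is an assumption, and the p-values are hypothetical. Converting t to a p-value requires the t distribution, which is outside the Python standard library and is therefore omitted.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four metabolites (illustration only)
p = [0.001, 0.04, 0.03, 0.20]
q = bh_adjust(p)
# Thresholds taken from the text: p-value < 0.05 and q-value < 0.1
significant = [pi < 0.05 and qi < 0.1 for pi, qi in zip(p, q)]
```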

Synthesis of dipeptides
The amide forms of the dipeptides were synthesized either in the core facility of University of Missouri or purchased from Pepscan (Pepscan Presto, Lelystad, the Netherlands) and had a purity of >95%. A total of eight dipeptides were synthesized that were stable.
Microscale thermophoresis of dipeptide binding with gp120 and protease
The 6-his-tagged HIV-1 protein gp120 was obtained from Abcam (Abcam, US). Microscale thermophoresis (MST) to identify binding properties was performed as described previously 45.

Methodology for docking of dipeptides
The cryo-electron microscopy structure of full-length gp120 in complex with unmodified human CD4 and unmodified human CCR5, resolved at 3.9 Å (PDB entry 6MET) 46, was used to dock the different peptides. Missing side chains in this structure were added with 'Prime' (Schrödinger Inc. NY). The complex was subjected to molecular dynamics simulation using 'Desmond' integrated with 'Maestro' (Schrödinger Inc. NY). A low-energy conformation obtained after a 20 ns molecular dynamics (MD) simulation was used for identification of putative binding sites. A site near the CD4 binding site was selected for docking. The structures of the dipeptides were generated with ChemDraw (ChemOffice, Fisher Scientific, USA), followed by minimization using MacroModel (Schrödinger Inc. NY). The structures of dipeptides suitable for docking were generated with LigPrep (Schrödinger Inc. NY). Flexible docking through Induced-Fit-Docking of the Schrödinger Suite using the SP-Peptide protocol was employed for docking of the peptides. The docked poses with the highest Glide score were selected for further analyses.

16S microbiome data analysis
We reanalyzed our earlier data from the same groups of individuals 12. The groups were categorized as described under study cohorts but, due to limited sample availability, with a reduced number of subjects: EC n=11, HC n=11, and VP n=11. 16S rRNA sequencing was performed on the Illumina MiSeq 2500 platform after DNA extraction from fecal samples of EC, HC, and VP. The raw reads were then preprocessed to remove contaminants and low-quality bases and analyzed using the online platform One-Codex (https://www.onecodex.com/). The relative abundance of each operational taxonomic unit (OTU) at genus level was calculated for all samples using an in-house PERL script and visualized as bar plots created using the R package ggplot2 (version 3.2.1) 47. PERMANOVA analysis was executed using the R package vegan (version 2.4.3) 48. Statistical analysis between two groups was performed using the Mann-Whitney U test (two-sided).
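The genus-level relative abundance described above is each genus's read count divided by the total classified reads of the sample. The sketch below shows that calculation in Python purely for illustration; the study used an in-house PERL script that is not reproduced here, and the function name and read counts are hypothetical.

```python
from collections import Counter


def relative_abundance(genus_counts):
    """Convert per-genus read counts into fractions that sum to 1."""
    total = sum(genus_counts.values())
    if total == 0:
        raise ValueError("sample has no classified reads")
    return {genus: n / total for genus, n in genus_counts.items()}


# Hypothetical read counts for one sample (illustration only)
sample = Counter({"Prevotella": 600, "Bacteroides": 150, "Faecalibacterium": 250})
abundance = relative_abundance(sample)
for genus, fraction in sorted(abundance.items(), key=lambda item: -item[1]):
    print(f"{genus}: {fraction:.1%}")
```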

Bacterial strains and activity against dipeptides
Clinical isolates of Prevotella species (n=4; P. buccae, P. bivia, P. disiens, P. denticola), Bacteroides fragilis (n=2), Clostridium difficile (n=2), Enterococcus faecalis (n=1), Klebsiella pneumoniae (n=1), and Escherichia coli (n=1) were isolated from patient samples received at the Department of Clinical Microbiology, Karolinska University Hospital, and cultured. These isolates were then used to investigate the effects of the dipeptides (WG and VQ) on their viability. Reference strains of E. faecalis (ATCC 29212), E. coli (ATCC 25922), K. pneumoniae (ATCC 25955), S. agalactiae (ATCC 13813), S. pyogenes (ATCC 19615), and S. pneumoniae (ATCC 49619) were also used for protocol standardization and as experimental controls. The aerobic and anaerobic bacterial strains were grown on blood agar plates for 16-18h and 48h, respectively, and were then used for the antimicrobial assay against dipeptides. Bacterial suspensions were prepared in 0.01M PBS (pH 7.4) to 0.5 ± 0.1 McFarland using Densichek (Biomerieux) and diluted 1:100. A volume of 50 µl of this bacterial suspension was incubated at 37 °C for one hour with or without dipeptides (WG and VQ; 1mM, 5mM, and 10mM each) in PBS. Subsequently, the samples were serially diluted, plated on blood agar plates, and incubated for 16-18h (aerobes) or 48h (anaerobes) to obtain the number of surviving bacteria. The surviving colonies were calculated per mL from treated and untreated cultures. The experiments were repeated three times in biological triplicates. Data were analysed using GraphPad Prism 8 (https://www.graphpad.com/scientific-software/prism/), applying a t-test (two-sided) as well as ANOVA.
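Colony counts from serially diluted plates are conventionally converted to colony-forming units (CFU) per mL by multiplying the count by the dilution factor and dividing by the plated volume. The Python sketch below illustrates that standard calculation; the dilution factor, plated volume, and colony counts are hypothetical values chosen for illustration and are not taken from the study.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Colony-forming units per mL of the undiluted suspension."""
    if plated_volume_ml <= 0 or dilution_factor <= 0:
        raise ValueError("dilution factor and plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


# Hypothetical plate counts: 10^4-fold dilution, 0.1 mL plated
control = cfu_per_ml(colonies=42, dilution_factor=1e4, plated_volume_ml=0.1)
treated = cfu_per_ml(colonies=63, dilution_factor=1e4, plated_volume_ml=0.1)

# A ratio above 1 indicates more surviving bacteria with dipeptide than without
fold_change = treated / control
print(f"control: {control:.2e} CFU/mL, treated: {treated:.2e} CFU/mL")
print(f"fold change: {fold_change:.2f}")
```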

Data visualization
The figures were prepared with R packages and Adobe Illustrator v24 (https://www.adobe.com).

Ethical clearance
The study was approved by regional ethics committees of Stockholm (Dnr 2013/1944-31/4) and amendment (2019-05585). Written informed consent was obtained from all the participants. All the samples were anonymized and de-identi ed before analysis.

Data availability
All the mass spectrometry data for metabolite profiling associated with this study are present in this paper, including the Supplementary Information (Supplementary Table 1). The data for the 16S microbiome analyses can be found in this paper or in Vesterbacka et al., 2017 12.

Code availability
The code used in this study is available on GitHub (https://github.com/neogilab/METABO-EC-F).

Table 1 Specific interactions between dipeptide and gp120 residues. The table summarizes the specific interactions between atoms of residues belonging to the dipeptides and residues belonging to gp120, and their type of interaction.

Altered microbiome-related biochemicals in EC. a. Heatmap representing levels of metabolites that are partially or completely derived from the gut microbiome. Color depicts increasing log2 levels from blue via green to yellow. Data shown include samples of HC (n=12), EC (n=12), and VP (n=16). b. A schematic presentation of the tryptophan metabolism leading to indole and indole derivatives. Whereas tryptophan is catabolized into indoles by microbiota in the gut, indole metabolites are converted into indoxyl and indoxyl sulfate in the liver. Boxes in the neighbourhood of a metabolite indicate that the respective metabolite was detected and quantified in the samples (HC n=12, EC n=12, and VP n=16). Box 1 shows the comparison between EC and HC, box 2 between VP and HC, and box 3 between EC and VP. A grey box means a non-significant difference, red represents a fold change greater than one (with p-value<0.05), and green a fold change smaller than one (with p-value<0.05). c. Boxplots of selected metabolites that are derived from the gut microbiota. Log2-transformed values are used to create the boxplots. Different colors differentiate between the groups: EC (orange), HC (green), and VP (red). Median values and interquartile ranges are indicated by bars. P-values are determined by Mann-Whitney U test (two-sided), with * indicating p-value<0.05, ** p-value<0.01, and *** p-value<0.001. d. A schematic presentation of the bile acid metabolism. Bile acids or salts are derived from cholesterol metabolism. The salts of cholic acid and chenodeoxycholic acid, the major primary bile acids synthesized in the human liver, are conjugated in the liver with taurine or glycine for secretion into bile.