Longitudinal cervicovaginal microbiome and virome alterations during ART and discordant shedding in women living with HIV

Despite successful suppression of plasma HIV replication by antiretroviral therapy (ART), some women living with HIV (WLHIV) can still experience genital HIV shedding (discordant shedding). Female genital tract (FGT) microbiome and virome dynamics during long-term ART in WLHIV are poorly understood but might contribute to discordant HIV shedding, as the microbiome and virome are known to influence FGT health. To understand FGT microbial communities over time during ART usage and discordant shedding, we characterized the microbiome and virome in 125 cervicovaginal specimens collected over two years in 31 WLHIV in Lima, Peru. Intrapersonal bacterial microbiome variation was higher in HIV shedders compared to non-shedders. Cervicovaginal virome composition changed over time, particularly in non-shedders. Specifically, anellovirus relative abundance was inversely associated with ART duration and CD4 counts. Our results suggest that discordant HIV shedding is associated with FGT microbiome instability, and immune recovery during ART influences FGT virome composition.


Introduction
With the development of effective antiretroviral therapies (ART), life expectancy for people living with HIV has increased signi cantly and HIV infection can now be treated as a chronic disease 1,2 .Of the estimated 20 million women aged 15 and older living with HIV worldwide in 2022, an estimated 82% were taking ART 3 .While ART is effective in suppressing HIV replication, some women living with HIV (WLHIV) with undetectable plasma viral loads can still experience genital HIV shedding, described as discordant shedding [4][5][6][7][8][9][10] .The microbiome has been implicated in female genital tract (FGT) health and disease 11 .
However, the relationship between discordant shedding and the FGT microbiome and virome during longterm ART in WLHIV remains poorly understood.
The FGT microbiome is particularly important in HIV infection.Speci c microbiome community states and bacterial species have been associated with increased HIV prevalence 22 and acquisition 14 .In WLHIV, cervicovaginal microbiota have been associated with genital HIV shedding 5,22,23 .While pre-exposure prophylaxis (PrEP) has not been shown to signi cantly in uence the vaginal microbiome in women without HIV [24][25][26] , multiple lines of evidence suggest that vaginal bacteria in uence metabolism and concentrations of antiretroviral drugs in the FGT 24,[27][28][29][30][31][32] .Whether long-term ART in uences the vaginal microbiome in WLHIV is poorly understood.One study found no microbiome changes after six months of ART 33 , another reported higher Lactobacillus relative abundance in women with viral suppression after a year of ART 34 , and a third reported an association between bacterial vaginosis and ART with protease inhibitors compared to regimens without protease inhibitors 35 .Further study is needed to clarify the relationship between ART and the FGT microbiome over years or decades.The relationship between the microbiome and discordant HIV shedding is also unclear.Discordant HIV shedding is incompletely understood, but has been associated with genital in ammation 36 , coinfections 36,37 , trauma 38 and ulcers 10 , and has also been attributed to expression of integrated provirus from proliferating cells 4 .The importance of the microbiome in FGT health led us to consider the microbiome as a potential factor promoting discordant HIV shedding.
Compared to the bacterial microbiome, less is known about the role of the virome in FGT health.Viruses found in the FGT include papillomaviruses, anelloviruses, herpesviruses and bacteriophages [39][40][41][42][43][44][45][46][47] .Vaginal virome community structure has been linked with the bacterial microbiome [40][41][42] .Vaginal virome alterations have been associated with in ammation 41 , cervical lesions 43,44 , and preterm birth 45 .Human papillomaviruses are the major cause of cervical cancer, the fourth most-diagnosed cancer in women worldwide in 2020 48 .Anelloviruses, a recently discovered, prevalent viral family 49 , have been associated with FGT in ammation in women without HIV 41 and CD4 T cell levels in WLHIV 44 .Little is known about the relationship between the FGT virome and long-term ART or discordant HIV shedding.
To investigate the relationship between the cervicovaginal microbiome and virome and discordant shedding over time in WLHIV taking ART, we analyzed the microbiome and virome in 125 cervicovaginal specimens collected over a two-year period from 31 WLHIV in Lima, Peru, nine of whom experienced discordant HIV shedding.Intrapersonal bacterial microbiome stability differed signi cantly between HIV shedders and non-shedders.Signi cant virome changes were associated with ART duration, speci cally decreases in anellovirus abundance, especially among non-shedders.Our results suggest that cervicovaginal microbiome instability is associated with discordant HIV shedding, while immune recovery during ART may in uence the cervicovaginal virome.
To investigate the roles of both the microbiome and virome in discordant HIV shedding, we analyzed 125 cervicovaginal specimens collected longitudinally from 31 WLHIV in Lima, Peru (Table 1, Fig. 1A, Supplementary Table 1).Specimens were collected quarterly from participants over a period of two years as part of a larger study characterizing discordant shedding in people living with HIV 4 .In this study, we analyzed an average of 4 specimens per participant.Most specimens (n = 120) were cervicovaginal lavages (CVL).When CVL samples were not available, vaginal swabs (n = 2), cytobrushes (n = 2), or TearFlo paper strips (n = 1) were used to sample the cervicovaginal environment.Nine women experienced discordant HIV shedding during the study period (shedders) and 22 did not (non-shedders).Non-shedders were older than shedders at study enrollment (non-shedder median age = 34 years, shedder median age = 28 years; Mann-Whitney test, p = 0.047) (Extended Data Fig. 1A).Shedders and nonshedders had similar CD4 counts at enrollment and similar CD4 recoveries when values from enrollment and after 2 years of ART were compared (Mann-Whitney test; p = 0.94 at enrollment, p = 0.94 at nal visit, p = 0.94 for change from enrollment to nal visit) (Extended Data Fig. 1B).Using metagenomic nextgeneration sequencing (NGS) after physically enriching for bacteria, we obtained a median of 15.6x10 6 (IQR 13.7x10 6 -17.8x10 6 ) raw read pairs and 1.95x10 6 (IQR 8.7x10 5 -4.7x10 6 ) quality-ltered, deduplicated individual reads per sample.Most samples clustered into groups dominated by Lactobacillus crispatus (n = 29), Lactobacillus iners (n = 21), Gardnerella vaginalis (n = 48), or a mixed community (n = 21), with a few samples dominated by Lactobacillus jensenii (n = 2) or Escherichia coli (n = 4) (Fig. 1B).There were no signi cant differences in bacterial alpha diversity, beta diversity, or cluster assignment between different sample types (CVL, vaginal swabs, cytobrushes or TearFlo strips) (Extended Data Fig. 1C-1F, Supplementary Table 2).Intrapersonal microbiome variation (beta diversity) was lower than interpersonal variation, suggesting intraindividual stability over time (Fig. 2A; Bray-Curtis dissimilarity; Mann-Whitney test, p = 1.9x10 − 19 for non-shedders, p = 0.0019 for shedders).Shedders had higher intrapersonal microbiome beta diversity than non-shedders (Fig. 2A; Bray-Curtis dissimilarity, Mann-Whitney test, p = 0.0019).We tested the hypothesis that perturbation was most pronounced during discordant shedding by comparing diversity between discordant and non-discordant timepoints within individuals (Fig. 2B).Beta diversity was signi cantly higher between samples when one timepoint was discordant and the other was nondiscordant, compared to pairs of non-discordant timepoints (Bray-Curtis dissimilarity; Mann-Whitney test, p = 0.014), suggesting that higher intrapersonal variation in shedders re ects dysbiosis during discordant shedding rather than prolonged uctuations.Principal coordinates analysis (PCoA) using Bray-Curtis dissimilarity showed no association between microbiome composition and ART duration (PERMANOVA, p = 0.33) or shedder/non-shedder status (PERMANOVA, p = 0.34) (Fig. 2C), although discriminant analysis with MaAsLin2 50 showed higher Mycoplasma hominis relative abundance in shedders compared to nonshedders (Fig. 2D, q = 0.048).PCoA of shedder samples showed no difference between discordant and non-discordant timepoints (Fig. 2E; PERMANOVA, p = 0.85).Microbiome community clusters were not associated with shedder/non-shedder status, ART duration, or discordant shedding timepoints (Fig. 1B, Supplementary Table 2).These results suggest that no speci c microbiome composition is associated with shedder/non-shedder status, ART duration, or discordant shedding events, implying that higher variation during discordant shedding re ects individual-speci c, not shared, dysbioses.To determine whether this could be explained by common functional characteristics encoded by bacterial genomes that might not be re ected in taxonomic pro les, we performed functional pro ling with HUMAnN3 51 (Extended Data Fig. 2A).We observed similar results for bacterial metabolic pathway and gene family beta diversity as for taxonomic beta diversity (Extended Data Fig. 2B-2I).172 metabolic pathways were associated with microbiome community clusters but not with shedder/non-shedder status, discordant shedding events, or ART duration (MaAsLin2, Supplementary Table 3).Results were similar for gene families (Supplementary Table 4).This suggests that functional pro les mirror taxonomic pro les, with similar beta diversity dynamics during discordant HIV shedding.
Non-shedders' viromes are less stable than shedders' viromes, with anellovirus and papillomavirus relative abundance changing over time.
Given the temporal dynamics of the bacterial microbiome, we examined the virome for similar patterns.
We used differential abundance analysis (MaAsLin2 50 ) to identify viruses responsible for changes in virome beta diversity over time.Anelloviridae relative abundance was inversely associated with ART duration (q = 8.8x10 − 5 ), while Papillomaviridae relative abundance increased over time (q = 0.037) (Fig. 4F).ART duration trended toward interaction with shedder/non-shedder status for several viral families, but these trends were not signi cant (Supplementary Table 5).When non-shedders were analyzed separately, Anelloviridae relative abundance decreased over time after ART initiation and Papillomaviridae relative abundance increased (Extended Data Fig. 5C).No taxa changed signi cantly over time in shedders.These results suggest that changes in virome composition after ART initiation are driven by Anelloviridae and Papillomaviridae, especially among non-shedders.
Anellovirus communities change over time, with decreased alphatorquevirus abundance and increased betatorquevirus proportions.
Because the Anelloviridae family contributed signi cantly to the temporal virome signature (Fig. 3), we performed phylogenetic analysis of the anellovirus contig ORF1 gene sequences to better de ne anellovirus community dynamics.We obtained predicted ORF1 sequences from 751/799 anellovirus contigs.21 contigs had more than one predicted ORF1 sequence, for a total of 774 sequences.At the genus level, most anelloviruses were classi ed as alphatorqueviruses (n = 339), betatorqueviruses (n = 339), or gammatorqueviruses (n = 85), with a few hetorqueviruses (n = 4) and sequences falling outside known anellovirus genera (n = 7) (Fig. 4A).Compared to all viral genera, relative abundance of alphatorqueviruses and incomplete anellovirus genomes (i.e., without ORF1) decreased over time after ART initiation (MaAsLin2, q = 0.0019 and q = 0.0019) (Extended Data Fig. 6A).There were no signi cant changes in the relative abundance of other anellovirus genera.This suggests that decreased Anelloviridae relative abundance re ects decreased Alphatorquevirus relative abundance.
When shedders and non-shedders were analyzed together, Betatorquevirus relative abundance increased and relative abundance of partial anellovirus genomes decreased relative to other anellovirus genera (Extended Data Fig. 6E; MaAsLin2, q = 0.036 and q = 0.036).When non-shedders were considered separately, Betatorquevirus relative abundance increased (MaAsLin2, q = 0.028) (Extended Fig. 6F) and Alphatorquevirus relative abundance tended to decrease but did not reach signi cance (MaAsLin2, q = 0.060).No anellovirus genera changed signi cantly over time in shedders.Together with the wholevirome analyses, where Alphatorquevirus relative abundance decreased over time in the virome as a whole, these results suggest that betatorqueviruses consequently make up a larger proportion of the remaining anellovirus community, and these changes are more pronounced among non-shedders.
We corroborated the NGS data with quantitative PCR (qPCR) speci c to the untranslated region of the alphatorquevirus genome (Extended Data Fig. 7Α).Alphatorquevirus genome copy numbers were correlated with Alphatorquevirus NGS relative abundance (linear regression, p = 0.041) and trended toward association with Anelloviridae relative abundance (linear regression, p = 0.085), but not Betatorquevirus or Gammatorquevirus relative abundance (linear regression, p = 0.96 and p = 0.84, respectively) (Extended Data Fig. 7Β).Alphatorquevirus copy numbers tended toward inverse association with ART duration, but this was not signi cant (LME, p = 0.090) (Extended Data Fig. 7C), and there was no signi cant difference between shedders and non-shedders (LME, p = 0.51).
Papillomavirus populations are stable over time in shedders and non-shedders.

DISCUSSION
We identi ed intraindividual and longitudinal patterns in the cervicovaginal microbiome and virome of WLHIV with and without discordant HIV shedding.Speci cally, the bacterial microbiome was less stable over time in shedders.Virome composition changed signi cantly in association with ART duration, with more pronounced changes among non-shedders.
Higher intraindividual bacterial beta diversity in shedders suggests more frequent changes in microbiome composition compared to non-shedders.This was driven by shedding timepoints, suggesting that microbiome alterations may be associated with shedding events.By contrast, lower intraindividual diversity in non-shedders suggests stability over time.No speci c microbiome composition was associated with shedders or shedding events, suggesting that stability over time is more important than speci c taxa in the context of discordant shedding.This is interesting because previous studies have often focused on speci c microbiome community states or bacterial taxa as markers of FGT health 11 .Previous analysis of this cohort linked discordant shedding with proliferation of infected cells, rather than HIV replication 4 .Other studies have identi ed associations between cervicovaginal microbiota and T cell populations in the FGT 14 .Thus, the microbiome may contribute to discordant shedding by in uencing the number of infected cells in the FGT, that may proliferate in response to disruption of the FGT.Alternatively, the microbiome may be in uenced by other factors (trauma through intercourse, in ammation, etc.) that also contribute to discordant shedding.
We identi ed signi cant longitudinal changes in the cervicovaginal virome.Speci cally, Anelloviridae and Alphatorquevirus relative abundance decreased and overall anellovirus community diversity changed during two years of ART.Anellovirus relative abundance and alphatorquevirus absolute abundance measured by qPCR were inversely associated with plasma CD4 counts.Anelloviruses are prevalent in healthy adult humans and are not known to be pathogenic 49 .Interestingly, numerous studies have associated anellovirus loads with immune suppression, including in people living with HIV 44,[52][53][54][55][56][57][58] and in organ transplant recipients 49,59 .Thus, the anellovirus dynamics that we observed may re ect participants' immune recovery during two years of ART.The fact that the associations were more pronounced among non-shedders may be a consequence of sampling size, or may suggest underlying immune dysregulation in shedders.
While Papillomaviridae relative abundance increased over time, papillomavirus diversity did not change signi cantly.Because relative abundance data is compositional, papillomavirus relative abundance may have been in ated when anellovirus relative abundance decreased, with no underlying change in papillomavirus viral load.Although Phi29 polymerase may preferentially amplify small circular singlestranded DNA 60 , our alphatorquevirus qPCR absolute copy numbers were consistent with NGS relative abundance, suggesting that our results represent an actual biological signature.Besides plant virus reads identi ed in a few samples, we did not nd any RNA viruses by sequence-independent ampli cation (SIA).This could mean that no RNA viruses were present in the specimens, or the RNA could have degraded during storage.
Our cohort consisted of ART-naïve women from a clinical setting who were initiating governmentsponsored ART in Lima, Peru.Because all the study participants received ART, we cannot directly compare our results with data from WLHIV not receiving ART.While this comparison would be biologically informative, such a study would be incompatible with current treatment standards.To our knowledge, only a few studies have analyzed the cervicovaginal microbiome or virome in women from Peru [61][62][63] .Therefore, our data contributes valuable information about the FGT microbiome and virome in women from this region.Prevalence of vaginal microbiome community states varies between cohorts, with some studies reporting differences between women from different geographic regions or different racial and ethnic backgrounds [11][12][13][14][15]19,[64][65][66] . Therefor, studying diverse groups of WLHIV worldwide would indicate whether our results are generalizable to other populations.
This longitudinal analysis of the cervicovaginal virome and microbiome in WLHIV identi es intrapersonal and temporal patterns associated with discordant HIV shedding and ART duration.These ndings support further investigation of the role of bacterial communities in discordant HIV shedding in women with suppressed plasma viral loads, as well as the role of anelloviruses during immune suppression and recovery.

METHODS
Specimens: All study participants provided informed consent.This study was approved by the Institutional Review Boards of the Hospital Dos de Mayo (Lima, Peru), Seattle Children's Hospital (Seattle, Washington, USA), and Arizona State University (Tempe, Arizona, USA).Cervicovaginal specimens were collected as part of a 18-24-month study evaluating low-level viremias (aka "blips") and discordant genital shedding in ART naïve men and women from Lima, Peru 4,67 .Specimens were collected every three months over a two-year period for each participant, and plasma CD4 counts were measured every six months.Cervicovaginal lavages were collected by washing the uterine cervix with sterile 1X phosphate buffered saline (PBS) and collecting the 1X PBS from the vaginal fornix as described 68 .Plasma and CVL HIV RNA were quanti ed by qRT-PCR as described 69 .Discordant shedding was assessed in participants with ART suppression, de ned as median plasma HIV RNA below the limit-of-quanti cation (1.48 log 10 copies(c)/mL), without virologic failure (HIV RNA > 3.0 log 10 c/mL), who had HIV RNA detected in the CVL (1.48 log 10 copies(c)/mL) for the women.Nine women experienced discordant HIV shedding (shedders) and 22 women did not experience discordant shedding (non-shedders).To select timepoints for microbiome and virome analysis, non-shedders were matched with shedders based on average CD4 counts over the duration of ART (2-4 non-shedders for each shedder).We analyzed the microbiome and virome at enrollment (month 3 for one shedder who did not have an enrollment specimen available); for shedders, discordant shedding timepoints and immediate pre-or post-shedding timepoints, as available; matched visits for non-shedders; and at 24 months (month 21 for two shedders who did not have month 24 specimens available) (Fig. 1A).When CVL specimens were not available, we analyzed other available specimen types: vaginal swabs (n = 2), cytobrushes (n = 2), or TearFlo paper strips (n = 1).

Sample processing
Specimens were thawed on ice and mixed by pipetting.Up to 800 µl of CVL were centrifuged for 10 minutes at 7,000xg at 4° C. The supernatants were used immediately for virus-like particle enrichment and virome sequencing, and the pellets were frozen at 80° C until used for bacterial microbiome sequencing.PBS negative controls (800 µl) were processed alongside the samples through microbiome and virome sequencing to assess contamination.In the case of vaginal swabs, cytobrushes, or TearFlo strips, 1 mL of 1X PBS was added to the specimen and vortexed for 30 seconds, and 800 µl was then processed in the same way as the CVLs.

Bacterial microbiome sequencing
Frozen Virus-like particle enrichment and virome sequencing: CVL supernatants and PBS controls were passed through a 0.2-µm pore lter to exclude bacterial and eukaryotic cells.Filtrates were treated with benzonase and Baseline-ZERO DNase to digest non-encapsidated nucleic acids.Total nucleic acids (TNA) were then extracted from 500 µl of the treated ltrates on the bioMérieux eMAG, with PBS added as needed if ltrates were below the required volume.Viral DNA was ampli ed by multiple displacement ampli cation (MDA) using Phi29 DNA polymerase (GenomiPhi V2 (Cytiva; Marlborough, MA)), with the kit's lambda phage DNA for a positive control.The ampli ed DNA was variably diluted (up to 1:10 depending on band brightness after gel electrophoresis) and used as input for the Illumina DNA Prep library kit, with 6 cycles for the indexing PCR.Libraries were individually quanti ed with a Qubit uorometer, pooled and sequenced on an Illumina NextSeq 2000 sequencer (paired-end, 2x150 bp).Separately, we performed sequence-independent ampli cation (SIA) of viral RNA and DNA by rst-strand cDNA synthesis with SuperScript IV reverse transcriptase (ThermoFisher; Waltham, MA) and primers 5'-GTTTCCCAGTCACGATCNNNNNNNNN-3' and 5'-GTTTCCCAGTCACGATC-3', second strand DNA synthesis with DNA Polymerase I Klenow fragment (New England Biolabs; Ipswich, MA), and ampli cation with AccuPrime Taq high delity polymerase (ThermoFisher; Waltham, MA) and primer 5'-GTTTCCCAGTCACGATC-3'.We used cell-culture derived rhinovirus B14 RNA as a positive control.The SIA products were used undiluted as input for the DNA Prep library kit (Illumina; San Diego, CA) and sequenced in the same way as the MDA products.
Virus-like particle enumeration by uorescence microscopy: Unextracted, nuclease-treated ltrates remaining from the virus-like particle (VLP) enrichment were frozen at 80° C until processed for VLP enumeration by uorescence microscopy, using an adaption of previously described methods 70,71 .
De-identi ed, remnant stool specimens (ASU IRB record STUDY00011967) processed similarly to the CVL samples (resuspension in SM buffer, 0.45-µm and 0.2-µm ltration, and nuclease treatment) and myxoma virus with a known titer were used as positive controls.Filters were dried, stained with 25X SYBR Gold (ThermoFisher; Waltham, MA), and mounted on slides in 15 µl of glycerol/PBS buffer with 0.1% ascorbic acid (working stock: 1 mL glycerol, 980 µl PBS and 20 µl 10% ascorbic acid).Slides were visualized at 60X with oil immersion on an AX R confocal microscope (Nikon; Tokyo, Japan) with a 488-nm wavelength laser.Five elds were randomly selected per sample, and NIS-Elements software (Nikon; Tokyo, Japan) was used to count particles between 50 nm and 500 nm in size.VLP counts for each sample were averaged across the number of elds viewed and used to calculate VLP concentrations in the CVL ltrates and controls.

Bacterial microbiome analysis
Bacterial sequencing reads were quality-ltered, sequencing adapters, PhiX and human sequences removed, reads deduplicated and paired reads merged using BBTools (v.38.38) 72 .Taxonomy was assigned to the clean reads with KrakenUniq (v.0.7.3) 73 using a custom database (RefSeq archaeal, bacterial, viral, plasmid, and fungal genomes, plus the GRCh38 human genome; downloaded in January 2022).To reduce false positives, we masked taxa that were assigned fewer than 1,800 unique kmers per million reads.For downstream analyses, we parsed the read counts at the species level, i.e., we included reads that could be classi ed at the species level or lower and excluded reads that could only be classi ed at the genus level or higher.We excluded probable false positive taxa (Cyanobacteria/Melainabacteria and Rickettsiales), as well as eukaryotes, archaea, viruses, and plasmids.Contaminant bacterial taxa were identi ed using decontam (R package v. 1.18.0) 74and removed at a threshold of 0.1 (Acidovorax sp.KKS102, Streptococcus equi, and Cutibacterium acnes).To normalize for different sequencing depths between samples, species read counts were expressed as number of reads per 100,000 clean reads in a sample, or as relative abundance where appropriate.
Bacterial alpha diversity was measured by species richness, calculated as the number of taxa present in each sample, and Shannon index, calculated using vegan (R package, v. 2.6-4) 75 .Bacterial beta diversity was measured by Bray-Curtis dissimilarity (weighted) and Sorensen dissimilarity (unweighted), calculated with vegan using species relative abundance and species presence-absence, respectively.PCoA analyses were conducted using phyloseq (R package, v. 1.42.0) 76.Microbiome community groups were de ned by kmeans clustering using the R stats package (v.4.2.2) 77 on species relative abundance with k = 6.
MaAsLin2 50 was used on species relative abundance to identify species differentially associated either with ART duration or shedder/non-shedder status.
For functional pro ling, we assigned reads to bacterial gene families and metabolic pathways using HUMAnN3 78 .Contaminant identi cation and beta diversity analyses were performed as described above, using normalized gene family and metabolic pathway read counts or relative abundance for input as appropriate.We used MaAsLin2 to identify gene families and pathways associated with sample metadata and microbiome community clusters.

Identi cation of viral contigs
We used Cenote-Taker2 (v.2.1.5) 82with the standard database and VirSorter2 (v.2.2.4) 83 to identify candidate viral contigs.We queried the candidate viral contigs against the NCBI NT database (downloaded June 2020) with megablast (BLAST + v. 2.13.0) 84 and removed contigs with ≥95% nucleotide identity and query coverage to human sequences.We used CheckV (v.1.0.1) 85to check for false positive, non-viral sequences.We considered contigs to be viral if they were classi ed as complete, high quality or medium quality by CheckV; or if they were low-quality with at least three viral genes and no more than one host gene; or if they were classi ed as proviruses.For contigs classi ed as proviruses, we kept only the viral regions and discarded anking host sequences.We discarded contigs agged by CheckV as having high kmer frequency, or which had been classi ed as bacteria by Cenote-Taker2 during candidate contig identi cation.To assign taxonomy, we queried the viral contigs against a custom viral protein database (NCBI RefSeq and neighbor viral sequences, downloaded January 2023) using blastx (evalue 1x10 − 3 ) and used the taxonomy of the best hit in the database.We discarded contigs classi ed as Mimiviridae as likely false positives.We used Bowtie 2 (v.2.5.1) 86to map the quality-ltered sample reads against the viral contigs.Contaminant contigs were identi ed using decontam (threshold = 0.1) and removed (six contigs).For downstream analyses, we kept samples that had at least 1,000 viral reads (116 of 125 samples (93%)).

Ecological analyses
Virome alpha (Shannon index and richness) and beta diversity metrics (Bray-Curtis and Sorensen dissimilarity) were calculated and PCoA was conducted as described for the bacterial microbiome, using viral contig relative abundance for weighted analyses and presence-absence for unweighted analyses.
Kmeans clustering was conducted on virus family relative abundance in R as described for the bacterial microbiome, with k = for anellovirus contigs in genus-level virome analyses (e.g., MaAsLin2 analyses).For anellovirus UniFrac distance, a separate phylogeny was built with the contig ORF1 sequences and a single Zetatorquevirus reference sequence, using the method described above.This phylogeny was re-rooted in R using ape (v. 5.6-2) 91 , the reference sequence was pruned using phyloseq (v.1.42.0) 76, and weighted and unweighted UniFrac distance were calculated with phyloseq using the pruned phylogeny and anellovirus contig relative abundance.PCoA analyses were performed using weighted and unweighted UniFrac distance based on the pruned phylogeny and anellovirus contig relative abundance or presence-absence.

Papillomavirus phylogenetic analysis
We used Geneious Prime to predict ORFs from the papillomavirus contigs called by blastx during taxonomy assignment.We mapped contig ORFs ≥1,000 nucleotides in length against reference Alphapapillomavirus, Betapapillomavirus, Gammapapillomavirus, Mupapillomavirus and Nupapillomavirus genomes and annotated the ORFs based on mapping to the reference genes.After manual curation, concatenated contig E1-E2-L2-L1 nucleotide sequences and reference Alphapapillomavirus, Betapapillomavirus, Gammapapillomavirus, Mupapillomavirus and Nupapillomavirus sequences retrieved from RefSeq (accessions in Supplementary Table 7) were aligned with MAFFT with auto algorithm selection, trimmed with trimAl with the -gappyout option, and a maximum likelihood phylogeny constructed with IQ-TREE with the model GTR + G and 1,000 bootstrap replicates.The phylogeny was visualized in FigTree and rooted using Mupapillomavirus/ Nupapillomavirus as the outgroup.Contigs were assigned to genera based on their clade, and phylogenybased genus assignments were used for papillomavirus contigs in genus-level virome analyses.Papillomavirus beta diversity analyses were performed in the same way as for anelloviruses, with a separate phylogeny built from the contig E1-E2-L2-L1 sequences and a single Mupapillomavirus reference.
Figures  2).Cluster 3 consists of two samples from the same non-shedder individual.
bacterial pellets (CVL samples and 1X PBS controls) were thawed on ice.DNA was extracted with the DNeasy PowerSoil Pro kit (Qiagen; Germantown, MD), following the manufacturer's instructions modi ed to include 775 µl of Solution CD1, 20 minutes bead beating in a TissueLyser II, 3 minutes of centrifugation at 16,000 g to dry the spin column membrane, and elution in 60 µl of Solution C6.DNA concentrations were measured on a NanoDrop spectrophotometer, and 5-25 µl of DNA (approximately 100-500 ng DNA) per sample were used as input for the Illumina DNA Prep library kit, following the manufacturer's protocol with 6 cycles for the indexing PCR.Two µl of ZymoBIOMICS Microbial Community DNA Standard was used to build a positive control library.Libraries were pooled, quanti ed with a Qubit uorometer and sequenced (paired-end, 2x150 bp) on a NextSeq 1000/2000 sequencer (Illumina; San Diego, CA).Depending on the number of reads obtained per library, some libraries were resequenced and the results concatenated to obtain a minimum of 10 million read pairs per sample.

Table 1
3.MaAsLin2 was used at different taxonomic levels (family, genus, and contig) to identify viral taxa associated with ART duration or shedder/non-shedder status.Anellovirus phylogenetic analysis: We used Geneious Prime (v.2023.0.4,http://www.geneious.com)to predict open reading frames (ORFs) from the anellovirus contigs called by blastx during taxonomy assignment.ORFs ≥1,000 nucleotides in length were extracted, manually curated, translated to amino Zetatorquevirus as the outgroup, and contigs were assigned to genera based on clade location.Phylogeny-based genus assignments, rather than blastx-based genus assignments, were used