Patients’ characteristics
From March 5, 2020, through August 31, 2021, nasopharyngeal swabs taken from a total of 45,573 individuals aged ≤12 years were screened for SARS-CoV-2 infection at the main pediatric hospital in Rome (Supplementary Fig. 1). A diagnosis of COVID-19 was made for 2,399 of them, with a positivity rate of 0.6% between March and July 2020, 5.2% between August and December 2020, 5.9% between January and May 2021, and 3.2% between June and August 2021. Whole genome sequencing was performed on 731 samples collected from 731 individuals with varying disease symptoms involving the upper or lower respiratory tract or the gastrointestinal tract. The selection criteria for these samples, and the comparison of their demographic and clinical characteristics with those of the overall SARS-CoV-2 infected population aged ≤12 years, are illustrated in Supplementary Fig. 1 and 3, Supplementary Table 1 and Supplementary Results.
One hundred and nineteen samples were excluded due to failed amplification (n=25) or poor genomic coverage (<60%, n=94). The final study population thus consisted of 612 patients, whose demographic and clinical characteristics are reported in Table 1.
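The sample flow can be cross-checked with simple arithmetic. The short sketch below uses only the counts given in the text; the variable names are illustrative and are not taken from the study's analysis code.

```python
# Sample-flow arithmetic using the counts reported in the text.
sequenced = 731            # samples submitted to whole genome sequencing
failed_amplification = 25  # excluded: amplification failed
poor_coverage = 94         # excluded: genomic coverage <60%

# Total exclusions and the resulting study population.
excluded = failed_amplification + poor_coverage
final_population = sequenced - excluded

print(f"excluded: {excluded}")
print(f"final study population: {final_population}")
```

The two exclusion categories together account for the difference between the 731 sequenced samples and the 612 patients retained in the final study population.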
Most patients lived in the Lazio region (n=587, 95.9%) and were Caucasian (n=527, 86.1%). Three hundred and forty-five (56.4%) were male, and the median age was 2 (interquartile range [IQR]: 1-6) years. Two hundred and fifteen (35.1%) patients were under one year of age. At the time of testing, mild infections were the most prevalent (436 cases, 82.3%), followed by moderate/severe infections (51, 9.7%). Only 7.1% of patients, identified as contacts of a household case, were asymptomatic. One hundred and seven patients (19.9%) required hospitalization. Four patients, 2 of them aged <1 year, developed severe disease [16] and required oxygen ventilation in the critical care unit. No deaths were reported. COVID-19 presentation did not differ substantially across age groups.
Symptoms of upper respiratory tract infection were the most common (n=445, 85.1%), followed by gastrointestinal symptoms (n=71, 13.5%) and lower respiratory tract symptoms (i.e. bronchitis or pneumonia, n=51, 9.7%). Symptoms did not differ substantially across age groups, with the exception of gastrointestinal symptoms, which were less frequently reported in ≥5-year-old children than in <1-year-old and 1-5-year-old children (11 [7.7%] vs 36 [18.0%] and 24 [13.3%], P=0.006).
Median (IQR) SARS-CoV-2 nasopharyngeal load was 7.7 (6.1-8.5) log copies/mL, and 62.4% of samples had a viral load >7.0 log copies/mL. Viral load was slightly (but significantly) higher in patients aged ≤1 year than in patients aged 1-5 and ≥5 years (8.3 [6.3-8.6] vs. 7.7 [6.2-8.4] vs. 7.1 [6.0-8.3] SARS-CoV-2 log copies/mL, P<0.0001).
With regard to the timing of diagnosis, 53.4% of paediatric SARS-CoV-2 infections (n=327) were diagnosed during no-restriction periods (white zone), and 6.2% (n=38) during lockdown (red zone). The remaining diagnoses were made during light restriction periods (yellow and orange zones).
Distribution of SARS-CoV-2 lineages affecting the pediatric population
The distribution of the sequences sampled in children up to the end of August 2021, in relation to clinical characteristics and to the global SARS-CoV-2 context, is shown in a maximum-likelihood (ML) tree in Fig. 1 and in a time-scaled phylogeny in Fig. 2A, with lineages assigned according to the PANGOLIN application [22]. The demographic and clinical characteristics of patients by infecting SARS-CoV-2 lineage are reported in Table 1.
The largest share of SARS-CoV-2 infections (n=253, 41.3%) belonged to lineage B.1.177 (EU1), which affected children with a median age of 2 (IQR: 1-6) years between October 2020 and January 2021. B.1.177 sequences mainly belonged to the 20E (EU1) clade (n=250, 98.8%) and were characterized by S:C22227T(A222V) and N:C28932T(A220V) [31].
B.1.617.2 and its AY sublineages (delta clade) were found in 139 (22.7%) patients with a median age of 2 (IQR: 1-7) years between July and August 2021, followed by B.1.1.7 (alpha clade), found in 127 (20.8%) patients with a median age of 3 (IQR: 1-6) years between March and April 2021. As of the end of August 2021, AY.43 (n=51, 36.7% of sampled sequences) and AY.39 (n=32, 23.0% of sampled sequences) were the most common delta clade sublineages detected in the pediatric population.
The P.1 and P.1.1 lineages (gamma clade) were found in only 35 individuals, with a median age of 1 (IQR: 1-6) year, diagnosed between March and May 2021. Of note, 11 (31.4%) patients were of foreign origin, mainly from South-Eastern Europe. The two sublineages differed by a single nucleotide polymorphism, RdRp:C13720T(P85S), found exclusively in the P.1 lineage.
The B/B.1/B.1.1 lineages that characterized the initial months of the SARS-CoV-2 pandemic were present in only 30 individuals, with a median age of 5 (IQR: 1-8) years, diagnosed between March and October 2020. This low number of B-related infections may be explained by the low number of children diagnosed as SARS-CoV-2 positive during the early phase of the pandemic (Supplementary Fig. 3).
Other lineages were detected in 28 patients and included, among others, B.1.160 (n=14), B.1.525 (n=2), and the variant of concern B.1.351 (n=1). Of note, B.1.160 (20A.EU2) was found only in patients of European origin, 5 of them from South-Eastern Europe. This lineage became common in Europe after summer 2020 and was characterized by S:G22992A(S477N), known to strengthen the binding of the SARS-CoV-2 spike to the human ACE2 receptor [32].
The composition of the sequences did not differ substantially from that of SARS-CoV-2 sequences retrieved from the general population of the same geographical origin and from GISAID (Fig. 1). Consistent with the turnover of lineages over time, the pairwise genetic distance of the 612 sequences indicated that SARS-CoV-2 sequences evolved progressively over time (rho=0.682, Supplementary Fig. 2).
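As an illustration of how a rho value relating genetic distance to sampling time can be obtained, the sketch below computes a rank correlation on a small toy data set. It assumes that rho denotes Spearman's rank correlation, which the text does not state; the function names and the toy values are hypothetical and are not study data.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical toy values (NOT study data): days since the first sample and
# genetic distance from a reference sequence.
days_since_first_sample = [0, 30, 60, 90, 120, 150]
genetic_distance = [0.0001, 0.0003, 0.0002, 0.0006, 0.0005, 0.0009]

rho = spearman_rho(days_since_first_sample, genetic_distance)
print(f"rho = {rho:.3f}")
```

A rank-based coefficient only requires that distance tends to increase with time, not that it does so linearly, which is one reason it is a common choice for this kind of temporal signal check.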
Polymorphisms shared by all lineages (prevalence >90.0%) were NSP3:synC3037T, RdRp:C14408T(P314L) and S:A23403G(D614G). As expected, N:G28881T(R203K) and N:G28883C(G204R), known to increase the transmissibility potential of SARS-CoV-2 [33], were detected in the B/B.1/B.1.1 (prevalence 73.0% and 76.7%, respectively), B.1.1.7 (prevalence 95.3% for both), P.1/P.1.1 (prevalence 100.0% for both), B.1.525 and B.1.351 lineages. Also as expected, and as previously reported in the adult population [34,35], the gamma and delta clades were characterized by the highest viral loads at diagnosis (SARS-CoV-2 RNA log copies/mL: 8.0 [6.1-8.6] and 8.4 [2.3-9.8], respectively) (Table 1 and Fig. 2A).
No significant association was found between lineages and COVID-19 clinical presentation, although a low number of moderate/severe manifestations was observed with the B.1.1.7 lineage (Table 1 and Fig. 2A).
Evidence of local transmission clusters
The time-scaled phylogeny made it possible to identify clear transmission chains and clusters, which clarified the dynamics of viral lineages in the paediatric population (Fig. 2B). The characteristics of the local clusters, with further detail for clusters composed of ≥10 sequences, are reported in Supplementary Table 2.
Overall, 129 sequences (21.1% of total pediatric sequences) were found in clusters, six of which were composed of ≥10 sequences.
Lineage B.1.1 was characterized by limited local transmission, probably due to the low number of SARS-CoV-2 diagnoses in children during the first months of the pandemic. The only local cluster, cluster B9, had a posterior probability of 1.00 and a tMRCA dated June 8, 2020 (May 24-June 13). It was composed of 6 sequences from children almost exclusively aged less than 2 years (except for one 4-year-old child). Five out of six children had a SARS-CoV-2 load in nasopharyngeal swabs >7.0 log copies/mL. All children lived in Rome, but two of them were of Nigerian origin. Of note, a sequence sampled in South Africa on March 31, 2020, lay at the root of this cluster, supporting its probable foreign origin. Two children experienced SARS-CoV-2 related pneumonia. All cluster B9 sequences were characterized by NSP3:synC7639T.
Lineage B.1.177 was characterized by 5 local clusters, involving 44 sequences (17.4% of B.1.177 sequences). Among these 5 clusters, EU18 (posterior probability=1.00) was composed of 24 sequences from paediatric patients, all Italian and residing in Rome, except for two of South American and South-Eastern European origin, respectively. Only four sequences belonged to adult individuals (age range: 22-61 years), suggesting sustained transmission among children aged less than 5 years (representing 80.6% of total paediatric clustering sequences, Supplementary Table 2) that probably started in mid/late August 2020. Cluster EU18 sequences were characterized by NSP3:A6183G(K1155R), NSP6:synT11836C, RdRp:G15438T(M657I), N:synG229254A/T and N:C28706T(H145Y).
Two transmission chains, probably starting between January and February 2021, were detected within the P.1.1 lineage (clusters γ5 and γ30, Fig. 2B and Supplementary Table 2). While cluster γ30 contained only 4 paediatric sequences intermixed with adult and global SARS-CoV-2 sequences, cluster γ5 involved a tracing network of 18 individuals, 12 of them (66.7%) below 12 years of age. All patients were diagnosed in Rome, but four of them were of foreign origin. The earliest and most closely related strain was a sequence from a 4-month-old child of South-Eastern European origin collected in late April 2021 in Rome, confirming a multi-seeded transmission. In line with this, the most recent common ancestor of this cluster dates back to February 12, 2021 (Feb 8 – Feb 24). All sequences were characterized by RdRp:C13720T(P85S), NSP2:C2445T(T547I) and NSP4:synC9565T.
Twenty-two sequences from pediatric patients formed two main chains of lineage B.1.1.7 (posterior probability=1.00, Fig. 2B, Supplementary Table 2). Cluster α4 comprised 11 sequences, 10 of them from paediatric patients with a median age of 4 (IQR: <1-5) years, who were infected in the Lazio region (mainly in Rome) and diagnosed after March 14, 2021. Half of the patients were of Southern European origin, suggesting a potential epidemiological link with this part of Europe. The most recent common ancestor of this cluster dates back to February 17, 2021 (January 31-March 7). All patients experienced a mild infection and exclusively reported upper respiratory tract symptoms (9/9 with information available). All sequences were characterized by NSP8:C12525T(T145I), NSP14:synT18069C, ORF3a:synC25603T, ORF3a:C26110T(P240S), ORF8:C28087T(A65V), and N:synG29179T.
The other chain (cluster α3) was composed of a total of 21 sequences (12 [57.1%] from pediatric individuals), all of them characterized by ORF8:T28245G(L118V), ORF8:ins282458CTG(ins119L) and NSP2:A1643T(N280Y). Sequences from paediatric patients were intermixed with sequences from adult patients (age range: 32-77 years) diagnosed in Rome in the same period (Fig. 2B). Paediatric patients, with a median age of 1 year (IQR: <1-2), were mostly Caucasian (11, 91.7%), infected in the Lazio region (mainly in Rome) and diagnosed between March and April 2021 (Supplementary Table 2). Mild symptoms (mainly upper respiratory) were the most frequently reported (9/10 with information available).
The delta clade was characterized by 4 local clusters, involving 39 sequences (28.1% of delta clade sequences). The high prevalence of delta sequences in local clusters suggests sustained circulation of this clade in the paediatric population since its emergence. Among these 4 clusters, δ26 (posterior probability=1.00) was composed of 37 sequences, 14 of them (37.8%) paediatric, with a SARS-CoV-2 load in nasopharyngeal swabs >7.0 log copies/mL in 12/14 patients (Supplementary Table 2). All patients except one were Caucasian. Sequences were characterized by NSP1:synC745T, NSP3:synC6730T, RdRp:synC13944T, RdRp:G15906T(Q813H), ORF3a:G25471(D27Y), and M:synA26786G. As of December 3, 2021, all these sequences were classified as AY.125, supporting the hypothesis of a delta subclade that originated in late May, as defined by the tMRCA (May 12, 2021 [April 27-May 19]), and expanded in the paediatric and adult Italian population in central Italy in mid-July 2021.
Cluster δ27 (posterior probability=1.00), which probably originated in late May/early June, was composed of 28 sequences, 13 of which (46.7%) belonged to paediatric individuals, with a SARS-CoV-2 load in nasopharyngeal swabs >8.0 log copies/mL in 12/13 patients (Supplementary Table 2). The remaining individuals were adolescents (n=2, aged 15 and 18 years) and adults (n=11, age range: 20-85 years). Sequences were characterized by RdRp:synC16111T, NSP13:G17671A(V479I) and S:G23593C(Q677H), a substitution reported to have arisen recurrently and independently in many SARS-CoV-2 lineages circulating worldwide by the end of 2020, and known to enhance viral infectivity and resistance to neutralizing antibodies [36,37].
Univariate and multivariate logistic regression models were fitted to identify potential factors associated with local clusters (Table 2). In our paediatric population, patients aged ≥5 years were less commonly found in clusters (adjusted odds ratio, AOR [95% CI]: 0.49 [0.29–0.84]; P=0.009), whereas presence in clusters was positively associated with infection by a delta or gamma clade virus (AOR [95% CI]: 1.89 [1.18-3.03] and 4.05 [1.88-8.75]; P=0.008 and P<0.0001, respectively). No associations were found with COVID-19 presentation. With regard to the timing of diagnosis, no significant association with lockdown periods was detected.
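For readers less familiar with how an adjusted odds ratio, its confidence interval and its P value relate to the underlying logistic regression coefficient, the sketch below back-calculates the coefficient and its standard error from a reported AOR and 95% CI. It assumes a Wald-type interval, symmetric on the log-odds scale, and a normal approximation for the P value; the text does not state which interval or software was used, and the function names are illustrative.

```python
from math import erfc, exp, log, sqrt

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def wald_from_odds_ratio(odds_ratio, ci_low, ci_high):
    """Recover the log-odds coefficient, its standard error, the Wald z
    statistic and the two-sided P value from a reported OR and 95% CI.
    Assumes the interval is symmetric on the log scale (Wald interval)."""
    beta = log(odds_ratio)
    se = (log(ci_high) - log(ci_low)) / (2 * Z_95)
    z = beta / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided standard normal tail area
    return beta, se, z, p_value

def odds_ratio_with_ci(beta, se):
    """Map a coefficient and its standard error back to an OR and 95% CI."""
    return exp(beta), exp(beta - Z_95 * se), exp(beta + Z_95 * se)

# Reported values for patients aged >=5 years: AOR 0.49 [0.29-0.84].
beta, se, z, p_value = wald_from_odds_ratio(0.49, 0.29, 0.84)
aor, low, high = odds_ratio_with_ci(beta, se)

print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} P={p_value:.3f}")
print(f"AOR={aor:.2f} [{low:.2f}-{high:.2f}]")
```

Under these assumptions, the back-calculated two-sided P value for the ≥5 years age group rounds to 0.009, in line with the reported value; small differences in the reconstructed interval bounds reflect rounding of the published figures.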
Correlation with hospitalization
Univariate and multivariate logistic regression models were fitted to assess whether lineages or clusters were associated with hospitalization (Table 3). Gender, age, nationality, residency, SARS-CoV-2 viral load, symptoms at diagnosis, and comorbidities were considered as potential confounding factors. In our paediatric population, lineages and clusters were not associated with hospitalization, with the exception of the B.1.1.7 lineage, which was significantly and negatively associated with hospitalization (AOR: 0.31 [0.13-0.71], P=0.006). This lineage was also characterized by the lowest (although not significantly so) number of moderate or severe COVID-19 manifestations (n=4, 4.0%) compared with other SARS-CoV-2 lineages (Table 1).
As expected, patients aged <1 year, those with moderate/severe COVID-19, and those with comorbidities were more frequently hospitalized (P<0.0001, Table 3).