Asymptomatic protozoa community infections in humans from southwestern China

Background: Giardia lamblia (G. lamblia), Entamoeba histolytica (E. histolytica), Blastocystis hominis (B. hominis) and Cryptosporidium spp. have been frequently reported as etiological agents of gastroenteritis but also as common gut inhabitants in healthy individuals. Co-infections with protozoan pathogens have been described previously, but little is known about whether these assemblages are purely random or structured in communities.
Methods: From 1 July 2016 to 31 March 2017, fresh stool samples were collected from randomly selected individuals in sentinel hospitals in Tengchong City, Yunnan Province, China. Molecular methods were used to detect G. lamblia, E. histolytica, Cryptosporidium spp. and B. hominis, and sequencing was applied to confirm the protozoan genotypes. Data analysis involved the chi-square test, multivariable logistic regression, null models and partial least squares regression (PLSR), performed with R 3.2.
Results: The prevalence of the four enteric protozoa in all subjects was, in order of frequency of detection: B. hominis 9.5% [95% confidence interval (CI) 7.1-12.4%], G. lamblia 2.2% (95% CI 1.1-3.8%) and E. histolytica 2.0% (95% CI 0.9-3.6%), whereas Cryptosporidium spp. was not detected at all. The prevalence of at least one enteric protozoan was 12.4% (95% CI 9.7-15.6%), and the prevalence of two different enteric protozoan species was 1.2% (95% CI 0.4-2.6%). The most common co-infection was E. histolytica with B. hominis (1.0%; 95% CI 0.3-2.2%). Regarding genetic profiles, 10 of 11 G. lamblia strains were classified as assemblage A, sub-assemblage I, and one as assemblage B, sub-assemblage IV. B. hominis was detected in 48 samples: 25 were classified as genotype III, 13 as genotype I, 8 as genotype VII and the remaining 2 as genotype IV. Our null modelling confirmed that protozoan co-occurrence was random, and our PLSR analysis confirmed the lack of association between these enteric pathogens and clinical symptomatology.
Conclusions: The occurrence of these enteric protozoa was purely random. No specific interactions were detected among the four protozoa studied, and neither their presence, jointly or separately, nor the patient's age was a predictor of the clinical symptoms associated with these pathogens. Further research including a broader range of pathogen species is needed to address remaining knowledge gaps in co-infections and diarrhoeal disease.
LMIC: low and middle-income countries. IQR: interquartile range. OR: odds ratio. PCR: polymerase chain reaction. PLSR: partial least squares regression. SES: standardised effect size. WASH: water, sanitation and hygiene practices.


Background
Parasitic disease has frequently been reported as a significant etiological agent of gastrointestinal disorders, and it contributes substantially to the burden of diarrhoeal diseases, with several enteric protozoan species described as causative agents in humans [1][2][3][4][5].
Cryptosporidium spp., Entamoeba histolytica (E. histolytica), Giardia lamblia (G. lamblia) and Blastocystis hominis (B. hominis) can be considered the four main enteric protozoa commonly associated with gastrointestinal disorders [5,6]. However, the impact of these parasites on public health has not yet been fully characterized, and the information available varies by protozoan species.
For instance, it is estimated that around 20% of diarrhoeal episodes in children in low and middle-income countries (LMIC) and 9% in high-income settings (HIC) are caused by Cryptosporidium spp. [1]. Amoebiasis, the acute disease caused by E. histolytica, affects around 50 million people and causes 100,000 deaths each year [7]. Both E. histolytica and Cryptosporidium spp. were included as causes of diarrhoeal deaths in the Global Burden of Disease Study 2017 [4]. G. lamblia (also referred to as G. duodenalis) infection is estimated to result in 280 million cases annually worldwide [8] and was included in the "Neglected Disease Initiative", together with Cryptosporidium spp., in 2006 [9]. B. hominis has been the most common enteric protozoan isolated from patients with diarrhoeal diseases in HIC [1], but its pathogenicity remains controversial [10,11], partly because it is also the most common enteric protozoan detected in asymptomatic individuals [12]. However, it has been argued that this protozoan could be used as an indicator of potential exposure to other pathogenic enteric protozoa [13].
Interestingly, the presence of these protozoan species in the stools of asymptomatic individuals is not exclusive to B. hominis and is relatively common according to some studies [1,14,15]. Reliable data are lacking, however, because of the absence of monitoring programs, under-reporting [16,17] and the fact that carrier and subclinical stages are often not diagnosed [1,17].
Despite notable advances, the factors that determine whether an enteric protozoan infection leads to gastrointestinal symptoms are still poorly understood [18]. Protozoan genotype has been suggested as one of the most decisive predictors, although not the only one. The growing development of new diagnostic tools, particularly molecular techniques, is expected to help clarify the distinction between pathogenic and non-pathogenic lineages, as well as pathophysiological interactions and other epidemiological features of interest [19].
Many previous studies have reported the presence of more than one pathogen in diarrhoeal disease cases [6,19,20] as well as in healthy individuals [21,22], but only a few have explored the impact of concomitant enteric infections, characterized the enteric communities or examined the existence of specific interactions [15,23]. Such interactions, at least between two specific enteric pathogens, are well documented in the veterinary field; one example is the association of enterotoxigenic Escherichia coli and rotavirus, which leads to severe diarrhoea in piglets [24]. In immunocompetent humans, however, they had not been addressed until recently [15,25].
To understand the existence and impact of intestinal protozoan co-infections in humans, we performed a hospital-based cross-sectional study. The aims of the study were: 1) to investigate the prevalence of the four protozoan species most commonly associated with gastrointestinal disorders and determine their molecular profile, 2) to assess whether enteric protozoa in hospital-based patients co-occur by chance or are structured in a community, and 3) to explore the potential impact of co-infections with the detected species on clinical symptomatology among immunocompetent patients attending the People's Hospital of Tengchong City and the Chinese Medicine Hospital in Tengchong City.

Study design and study area
A cross-sectional hospital-based study was conducted from July 2016 to March 2017 in Tengchong (25º01'15''N, 98º29'50''E, 1596 m above sea level), a county-level city located in Yunnan Province, Southwest China. Tengchong has a tropical monsoon climate; the annual average temperature is 15ºC and the average annual rainfall is 1535 mm, with a year-round mean relative humidity of 77%. The total resident population is 659,000 (Census 2014), of which 60.5% live in rural areas. Two hospitals in this city (People's Hospital of Tengchong City and Chinese Medicine Hospital of Tengchong City) agreed to participate in the study.
The target sample size was 423, based on an expected prevalence of 50%, a 95% confidence level and 5% precision. In total, 507 patients were recruited.
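As an illustration of the sample size reasoning, the sketch below applies the standard normal-approximation formula for estimating a single proportion, n = z²p(1-p)/d². The function name is hypothetical and the choice of formula is an assumption, since the text does not state how the target was derived. With the stated inputs the formula gives a smaller minimum than the reported target of 423, so the target presumably includes an additional allowance that is not specified here.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(expected_prevalence, precision, confidence=0.95):
    """Minimum n to estimate a proportion within +/- precision.

    Uses the normal approximation n = z^2 * p * (1 - p) / d^2.
    This is an assumed formula; the paper does not name the one it used.
    """
    # Two-sided critical value, about 1.96 for a 95% confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * expected_prevalence * (1 - expected_prevalence) / precision ** 2
    return math.ceil(n)


# Inputs stated in the text: 50% expected prevalence, 5% precision, 95% confidence
print(sample_size_for_proportion(0.50, 0.05))
```

An expected prevalence of 50% maximises p(1-p), so it is the conservative choice when the true prevalence is unknown.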

Study participants
Patients from the inpatient departments were recruited voluntarily after the researcher had clearly explained the objectives of the study. Informed consent was obtained from the participants or their parents, and participants' confidentiality was assured. Subjects with inadequate faecal samples or incomplete information, and patients infected with Hepatitis B virus (HBV) and/or Human Immunodeficiency virus (HIV), were excluded.

Specimen and data collection
Stool samples were collected from each subject with a sterile sampling cup during the study period in these two hospitals, with the criterion that each stool sample must exceed 3 g or 3 mL, and stored at -70ºC. All samples were delivered without interrupting the cold chain to the laboratory of the National Institute of Parasitic Diseases, Chinese Centre for Disease Control and Prevention (Shanghai, China) and stored in a -70ºC freezer.
A structured questionnaire was used to collect the following information from each patient after the stool samples were collected: demographic data (e.g. age, gender, level of education, occupation, nationality and residence) and clinical manifestations associated with enteric protozoan infection (abdominal distension, inappetence, itchy skin, perianal pruritus, constipation, nausea, abdominal pain, number of stools per day and type of stools).

Detection of intestinal protozoan species
Genomic DNA was extracted from each stool sample (0.2 g or 0.2 mL) with the QIAamp DNA Stool Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. The genomic DNA was stored at -70ºC until polymerase chain reaction (PCR) amplification (Table 1). DNA was amplified by conventional PCR for B. hominis [26] and by nested PCR for Cryptosporidium spp., G. lamblia and E. histolytica [27][28][29].

Genetic characterization of G. lamblia and B. hominis isolates
All suspected positive products were kept at -70ºC, and bidirectional sequencing by Sangon Biotech Company (Shanghai, China) was conducted to confirm G. lamblia and B. hominis infections. The nucleotide sequences obtained in this study were aligned with G. lamblia and B. hominis reference nucleotide sequences from GenBank and analyzed with BLAST (http://www.ncbi.nlm.nih.gov/BLAST/) and MEGA version 6.0 (http://www.megasoftware.net/) to determine G. lamblia assemblages and B. hominis subtypes using the Neighbour-Joining method. The reference nucleotide sequences of G. duodenalis from GenBank for the triosephosphate isomerase (TPI) locus included assemblages AI (accession numbers L02120 and EF68803), AII (accession numbers U57897 and EF688019), AIII

Statistical modelling
Data were analyzed using R 3.2. The chi-square or Fisher's exact test, odds ratios (OR) and 95% confidence intervals (95% CIs) were used to compare and describe the qualitative variables.
Only participants with complete data records were included in the final analysis. A new variable was created according to the WHO's definition of diarrhoea and the information gathered about type of stool and number of stools per day. A case of acute diarrhoea was defined as a person with more than three episodes of liquid stools per day, lasting less than 2 weeks [30]. Prevalence estimates with 95% confidence intervals (CIs) for single infections and co-infections in the study population were calculated using the epiR library version 0.5-10 [31].
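To make the prevalence estimates concrete, the following sketch computes a proportion with an exact (Clopper-Pearson) binomial confidence interval. The interval method is an assumption, as the text does not say which one was used, and the function names are hypothetical. The example uses the 48 B. hominis detections among the 507 participants reported in this study.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)


def exact_ci(x, n, confidence=0.95):
    """Clopper-Pearson interval for x positives out of n (assumed method)."""
    alpha = 1 - confidence
    eps = 1e-12

    def solve(g):
        # Bisection for the root of a function g that increases with p
        lo, hi = eps, 1 - eps
        for _ in range(100):
            mid = (lo + hi) / 2
            if g(mid) < 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: p at which P(X >= x) equals alpha / 2
    lower = 0.0 if x == 0 else solve(lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2)
    # Upper limit: p at which P(X <= x) equals alpha / 2
    upper = 1.0 if x == n else solve(lambda p: alpha / 2 - binom_cdf(x, n, p))
    return lower, upper


positives, participants = 48, 507  # B. hominis detections reported in this study
low, high = exact_ci(positives, participants)
print(f"prevalence {positives / participants:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

The printed interval can be compared with the values reported in the Results; small differences would be expected if a different interval method was used.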

Co-occurrence of enteric pathogens
A null model analysis was used to explore whether enteric protozoan co-infections were positively, negatively or randomly associated. Data were organized as a 4 × 507 (rows × columns) presence-absence matrix, in which each row represented a protozoan species and each column represented a study participant; "1" indicated that a species was present in a particular host and "0" that it was absent.
The C-score was the index used to characterize co-occurrence patterns, and the algorithm chosen was the fixed row-equiprobable column algorithm [32]. The observed C-score was compared with the expected C-score calculated for 5000 randomly assembled null matrices generated by Monte Carlo simulation. Furthermore, to compare the degree of co-occurrence across data, a standardised effect size (SES) was calculated; this index measures the number of standard deviations by which the observed index (C-score) lies above or below the mean index of the simulated communities. The package EcoSimR 0.1.0 was used to carry out the analysis [33].
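The null model procedure described above can be sketched in a few lines. This is an illustrative re-implementation in Python rather than the EcoSimR code used in the study; the function names and the toy matrix are hypothetical, and the lower-tail p-value is one common convention that the text does not specify.

```python
import random
from itertools import combinations
from statistics import mean, stdev


def c_score(matrix):
    """Mean number of checkerboard units over all species pairs (rows = species)."""
    units = []
    for row_i, row_j in combinations(matrix, 2):
        shared = sum(a * b for a, b in zip(row_i, row_j))  # hosts carrying both species
        units.append((sum(row_i) - shared) * (sum(row_j) - shared))
    return mean(units)


def null_model_ses(matrix, n_sim=5000, seed=1):
    """Compare the observed C-score with fixed-row, equiprobable-column null matrices."""
    rng = random.Random(seed)
    observed = c_score(matrix)
    simulated = []
    for _ in range(n_sim):
        # Keep each species' total fixed and shuffle it across hosts independently
        null = [rng.sample(row, len(row)) for row in matrix]
        simulated.append(c_score(null))
    sim_mean, sim_sd = mean(simulated), stdev(simulated)
    ses = (observed - sim_mean) / sim_sd if sim_sd > 0 else float("nan")
    # Assumed convention: share of null matrices with a C-score at or below the observed one
    p_lower = sum(s <= observed for s in simulated) / n_sim
    return observed, sim_mean, ses, p_lower


# Hypothetical toy data: 4 species (rows) by 12 hosts (columns), not the study data
toy = [
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]
print(null_model_ses(toy))
```

An SES close to zero means the observed C-score is indistinguishable from the simulated communities, which is the pattern interpreted as random co-occurrence.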

Assessing the impact of co-infection with enteric pathogens on diarrhoea severity
The partial least squares (PLS) regression method was used to assess the impact of co-infection with enteric protozoa on the development of clinical symptomatology. This technique was selected because it offers several advantages over other regression methods: it is the least restrictive of the multivariate techniques for exploring complex ecological patterns [34], including the impact of co-infections on the host's health [23], and it is distribution-free and well suited to dealing with multicollinearity [35]. In our analysis, we defined explanatory and response components, or blocks. The explanatory block (the PLS X component) was defined by a presence-absence matrix representing the enteric protozoan community (B. hominis, E. histolytica, Cryptosporidium spp. and G. lamblia). In addition, because the clinical presentation of diarrhoeal diseases varies with age, age in years was included as a covariate in the explanatory block. The response block (the PLS Y component) included the main symptoms described in association with infection by those protozoa (abdominal distension, inappetence, itchy skin, perianal pruritus, constipation, nausea, abdominal pain and acute diarrhoea).
The significance of the PLS models was assessed using the Stone-Geisser Q2 test, a cross-validation redundancy measure created to evaluate the predictive significance of exogenous variables. Values greater than 0.0975 indicate that predictors are statistically significant, whereas values below this threshold indicate no significance. Finally, the percentage of observed MNLS variability explained by the enteric pathogen block was also estimated. The plspm library version 0.4.9 was used to perform the analysis [36].
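The decision rule based on Q2 can be illustrated with a simplified calculation. The sketch below is not the plspm procedure; it assumes that held-out (cross-validated) predictions are already available and uses the overall mean of the observations as the trivial baseline. The function names are hypothetical, and the 0.0975 threshold is the one stated in the text.

```python
def q_squared(observed, predicted):
    """Simplified Stone-Geisser Q2: 1 - SSE / SSO.

    SSE: squared error of held-out predictions.
    SSO: squared error of a trivial prediction (here the overall mean, an assumption).
    """
    baseline = sum(observed) / len(observed)
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    sso = sum((y - baseline) ** 2 for y in observed)
    return 1 - sse / sso


def is_predictive(q2, threshold=0.0975):
    """Apply the threshold given in the text to a Q2 value."""
    return q2 > threshold


# Hypothetical symptom scores and held-out predictions, not the study data
scores = [1, 2, 3, 4]
print(is_predictive(q_squared(scores, [1.2, 1.9, 3.1, 3.8])))  # predictions track the data
print(is_predictive(q_squared(scores, [2.5, 2.5, 2.5, 2.5])))  # no better than the mean
```

A model whose held-out predictions are no better than the mean yields a Q2 of zero or below, which falls under the threshold and is read as no predictive relevance.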

Results
Characteristics of the study population
A total of 507 subjects participated in the study. The male-to-female sex ratio was 1.1 (260/247). Han nationality was predominant, with 481 participants (94.9%), and the median age of study participants was 52 years (interquartile range (IQR) = 38-63).

Genetic characterization of isolates
A total of 11 G. lamblia isolates were successfully characterized (Figure 1): 10 strains were assemblage A (90.9%, 10/11) and one strain was assemblage B (9.1%, 1/11). Further, all 10 assemblage A strains belonged to sub-assemblage AI, and the single assemblage B strain belonged to sub-assemblage BIV. The detection rate of assemblage A in diarrhoea cases was higher than that in non-diarrhoea subjects (

Impact of co-infection on diarrhoeal symptomatology
According to our PLS analyses, there was no evidence that the presence of any of the three protozoan species detected, or their co-occurrence, was statistically associated with enteric protozoan symptomatology (Stone-Geisser Q2 test value < 0.0975). The other predictor, age, was not statistically associated with symptomatology in the study population either.
Furthermore, the presence of any of the three protozoa, jointly or separately, covaried negatively with symptomatology. The analysis revealed that most of the X component variance was due to co-infection with enteric protozoa (43.1%), followed by the presence of B. hominis (27.6%) and E. histolytica (24.0%) (Table 2).

Co-occurrence of enteric protozoa
The null model analysis showed that the observed C-score (265) was slightly lower than the C-score expected by chance (267.45), but the difference was not statistically significant (SES -0.33, p-value 0.414). This indicates a random, non-competitively structured protozoan community, with no evidence of a statistically significant protozoan combination in the study population.

Discussion
Our study confirms the prevalence of single enteric protozoan infections reported in other studies in similar settings [1,17,37] with adequate access to water, sanitation and hygiene practices (WASH). In the present report, B. hominis was the predominant intestinal protozoan species detected, followed by G. lamblia and E. histolytica, whereas Cryptosporidium spp. was not detected at all. These data are consistent with previous results [5].
Regarding the genetic profile of the isolates, 10 of the 11 G. lamblia strains (90.9%) were classified as assemblage A, all of them sub-assemblage AI, and one as assemblage B; no mixed infection with A and B genotypes was found. This result is consistent with those found in other countries [38] and in Anhui Province, China. In the latter, A subtypes accounted for 100% of G. lamblia isolated from asymptomatic children [39].
The presence and proportion of assemblages seem to vary in space and time [40,41], although socio-economic factors have also been suggested as potential drivers of G. lamblia assemblage distribution [22]. Interestingly, assemblage type seems to influence clinical presentation: only assemblages A and B have been described as causative agents of human infection, with assemblage B more frequently resulting in symptomatic infection in endemic settings [21,42]. Within assemblage A, AI has been one of the most commonly reported sub-assemblages in the literature, although AII is considered the most pathogenic sub-assemblage in humans [17,43,44]. Together, these observations could explain why no evidence of an association was found between the presence of G. lamblia and clinical symptomatology in the present study.
Regarding B. hominis, four genotypes were identified among 48 isolates: 25 isolates belonged to genotype III, 13 to genotype I, 8 to genotype VII and 2 to genotype IV. These data agree with other studies, which reported subtype III as the most common [45,46]. As previously highlighted, its clinical significance remains unclear [10].
As we have shown, there appears to be no specific community assemblage among the four enteric protozoan species considered in the study population.
Our work showed that the occurrence of those enteric protozoan species was purely random and asymptomatic in the study population. Furthermore, no specific interactions were detected between them, and there was no evidence that their presence, jointly or separately, was associated with the development of the clinical symptomatology classically associated with those species. Although all of them have been reported as causative agents of gastrointestinal disorders, especially in children in LMIC [20], asymptomatic carriage has also been commonly reported elsewhere. In some cases, such as B. hominis, pathogenicity is highly controversial [10,11], since this species is one of the most common enteric protozoa detected in humans [12,46]. It is therefore not surprising that their presence was not associated with clinical symptomatology in immunocompetent individuals.
Given the results obtained, several factors might help explain the pathogenesis of enteric protozoan infections and the conditions that lead to disease. Some are related to the characteristics of the pathogen, such as differences in pathogenicity due to variable virulence among strains, or the need for at least a second infection with another pathogen to cause clinical symptomatology. Host factors can also play a key role, such as individual susceptibility or the presence of healthy functional barriers that protect the human intestine: the mucus layer, the intestinal epithelial layer and the intestinal microbiota [5,15,18,25]. However, further research is needed to establish the ecology of enteric communities in healthy and unhealthy individuals.
We are aware that our research has some limitations. Firstly, it was a hospital-based cross-sectional study, so the study population may not be representative of the general population in Tengchong City and generalizability to other populations may be limited. Secondly, although the number of protozoan isolates was consistent with previous studies, it was small; this might have influenced the reliability of the estimates in the PLS analyses.
Therefore, further research with a larger sample size and a broader enteropathogen community is needed to explore the community assemblages present, as well as their relationships with pathogenicity and clinical manifestations, in rural and urban regions of China.
With increasing awareness of the role of the microbiome in the development of host immunity [5] and its potential relationship with many communicable and non-communicable diseases, such as Irritable Bowel Syndrome [47], there is a growing need to understand the complex relationships between different pathogens, as well as between the intestinal microbiota and pathogens.

Conclusions
The present study was the first to analyse the community assemblage of the four protozoan species commonly associated with human gastrointestinal disorders in immunocompetent individuals. Our results showed the absence of any structured community among them; their occurrence was purely random. Moreover, there was no evidence of an association between their presence and the development of clinical symptomatology.

[26] Reverse ACTAGGAATTCCTCGTTCATG

Figure 1
Evolutionary relationships among G. lamblia sub-assemblages at the TPI locus inferred by a neighbor-joining analysis. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1,000 iterations) is indicated next to the branches. N355, N230, N484, N491, N321, N467, N352, N228, N322 and N130 were isolated from humans in this study and were all AI; N218 was BIV.

Figure 2
Evolutionary relationships among B. hominis genotypes inferred by a neighbor-joining analysis. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1,000 iterations) is indicated next to the branches.