Self-collected gargle as a patient-friendly sample collection method for COVID-19 diagnosis in population context

Scaling up SARS-CoV-2 testing and tracing continues to be limited by sample collection methods that require trained healthcare workers and cause discomfort to patients. In response, we assessed the performance of, and user preference for, gargle specimens for qRT-PCR-based detection of SARS-CoV-2 in Indonesia. Inpatients who had recently been diagnosed with COVID-19 and outpatients who were about to undergo qRT-PCR testing were asked to provide nasopharyngeal and oropharyngeal (NPOP) swabs and self-collected gargle specimens. We demonstrated that self-collected gargle specimens can serve as an alternative specimen for detecting SARS-CoV-2 and that the viral RNA remained stable for 31 days when stored at room temperature. The method was validated for use with multiple RNA extraction kits and commercially available COVID-19 RT-PCR kits. It achieved a sensitivity of 91.38% when compared with paired NPOP swab specimens (Ct < 35), and 95.16% of patients preferred the self-collected gargle method.


INTRODUCTION
In early December 2019, an outbreak of pneumonia of unknown cause emerged in Wuhan, China [1,2]. It has since been identified as COVID-19, a disease caused by SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2). To curb the spread of SARS-CoV-2, diagnosing infection in prospective patients is the first crucial step for tracing and effective control of infection in the community. While other methods exist, such as IgM/IgG lateral flow tests and rapid antigen tests, WHO recommended diagnosing SARS-CoV-2 in suspected cases with a nucleic acid amplification test, such as real-time PCR (qRT-PCR), on respiratory specimens [3]. These respiratory specimens include nasopharyngeal and/or oropharyngeal swabs. However, the collection of nasopharyngeal and/or oropharyngeal swabs is invasive and requires close contact between healthcare workers and patients. Patients experience a degree of discomfort during swab collection, which makes the procedure less acceptable when serial sampling is needed [11]. Furthermore, personal protective equipment is necessary to protect healthcare workers from the risk of viral transmission, and its use places additional strain on already overstretched healthcare systems.
Other studies have shown that saliva can serve as a non-invasive specimen for diagnosis of SARS-CoV-2 using qRT-PCR [4-8]. However, Becker et al. (2020) indicated that saliva samples show a reduction in sensitivity of about 30% to 50% when tested in a community setting. This highlights the need for other non-invasive and patient-friendly sample collection methods. Therefore, this study was carried out to evaluate the analytical performance and sample stability of self-collected gargle specimens in the population setting (inpatients and outpatients) for initial diagnosis of SARS-CoV-2 infection in Semarang, Indonesia.

Comparison between NPOP swabs and gargle specimens in detecting SARS-CoV-2 in the inpatient cohort
We first sought to determine the sensitivity and specificity of both naso-oropharyngeal swabs (NPOP swabs) and gargle specimens for diagnosing COVID-19. Fifty-three inpatients and 13 healthy volunteers from RSDK and RSND were recruited with written informed consent for collection of NPOP swabs and gargle specimens after confirmation of positive or negative detection of SARS-CoV-2. The mean age (± SD) of participants was 45.4 ± 16.5 years and the majority were female (n/N=41/66; 60.2%). Samples were collected a median (IQR) of 3 days (1.75 days) after symptom onset. We found an overall agreement between NPOP swabs and gargle specimens of 86.36% with sensitivity of 87.23% (

Comparison between NPOP swabs and gargle specimens in detecting SARS-CoV-2 in the outpatient cohort
To further validate the performance of gargle specimens, a total of 244 outpatients from RSDK and RSND were recruited for comparison between NPOP swab and gargle specimens. The mean age (± SD) of participants was 34.3 ± 12.5 years and the majority were female (n/N=144/244; 63.2%). History of symptom onset and close contact with COVID-19 patients was obtained from 219 (89.75%) subjects, and 22.54% (n=55) of the 244 outpatients were reported to be asymptomatic. Samples were collected a median (IQR) of 3 days (2 days) after symptom onset. For the outpatient group, overall agreement between NPOP swabs and gargle specimens was 86.48%, corresponding to substantial agreement (κ = 0.722), with sensitivity of 85.14% (95% CI: 78.52% to 89.97%) and specificity of 88.54% (95% CI: 80.64% to 93.48%). Similar to the results of the inpatient study, among the 159 participants with at least one positive specimen, 79.25% (n/N = 126/159) were positive on both NPOP swab and gargle specimens, 13.84% (n=22) had NPOP swab positive/gargle negative results, and 6.92% (n=11) had gargle positive/NPOP swab negative results. The performance of gargle specimens in outpatients was similar to that in inpatients, indicating that gargle specimens can be used in a population context. Sample-type preference was collected from all participants, and 97.10% (n/N=301/310) preferred gargle specimens as the sample for diagnosing COVID-19 (Supplementary Table S3).
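The outpatient figures above can be reproduced from a 2x2 table. The following is a minimal sketch under stated assumptions, not the authors' analysis code. The counts 126, 22 and 11 come from the text; the double-negative count of 85 is inferred here as 244 minus 159. The Wilson score interval with z = 1.96 is also an assumption, since the interval method is not named in this section.

```python
import math


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (assumed CI method)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def diagnostic_summary(tp, fn, fp, tn):
    """Gargle performance with the NPOP swab result as the reference.

    tp: NPOP+/gargle+, fn: NPOP+/gargle-, fp: NPOP-/gargle+, tn: NPOP-/gargle-
    """
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals (Cohen's kappa).
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "sensitivity_ci": wilson_ci(tp, tp + fn),
        "specificity": tn / (tn + fp),
        "specificity_ci": wilson_ci(tn, tn + fp),
        "agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# tn=85 is inferred (244 outpatients - 159 with at least one positive result).
outpatients = diagnostic_summary(tp=126, fn=22, fp=11, tn=85)
for name, value in outpatients.items():
    print(name, value)
```

With these counts the sketch gives a sensitivity of 85.14%, a specificity of 88.54%, an overall agreement of 86.48% and κ of about 0.722, in line with the values reported for the outpatient cohort.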

High sensitivity of gargle specimens in detecting SARS-CoV-2 across low, moderate, and high NPOP Ct groups
When comparing Ct values between NPOP swabs and gargle specimens, we found significant differences in the two viral target genes between the sample types (Wilcoxon matched-pairs signed rank test, p-value < 0.0001), with a median difference of >8 Ct for each target gene (Fig. 2A, B). On the other hand, the median Ct value of the human internal control gene RPP30 was lower in gargle specimens than in NPOP swabs (Fig. 2C). Additionally, we observed that the majority of discrepant samples occurred when NPOP swab Ct values were > 31, which indicates that disagreement occurs when the viral load is low. This confirms that the viral load in gargle specimens was lower than in NPOP swabs and that some variation exists between sample types [10,11].
Analyzing the Ct differences in groups of low Ct (NPOP Ct <20), moderate Ct (NPOP Ct 20-29), and high Ct (NPOP Ct >30), we found that the median difference widened as NPOP Ct values decreased (Fig. 2D). The median differences for the low Ct group were 10.40 for helicase and 9.34 for RdRP. For the moderate Ct group, the median differences were 8.85 for helicase and 8.07 for RdRP. For the high Ct group, the median differences were 2.39 for helicase and 0.03 for RdRP. However, the effect of the lower viral load was marginal, as the sensitivity of gargle specimens for diagnosing COVID-19 was still 91.38% for NPOP Ct values below 35 (Fig. 2E). Ct values were lower during the earlier period of infection in both NPOP swabs and gargle specimens (Fig. 3A, B), with NPOP swabs having lower Ct values than gargle specimens during this period. The sensitivity of gargle specimens in detecting SARS-CoV-2 dropped when samples were collected more than 5 days after symptom onset (Fig. 3C), although more samples were detected as SARS-CoV-2 positive from gargle specimens (n=8) than from NPOP swabs (n=5). On the other hand, age and total number of symptoms did not correlate with the Ct values observed in both sample types (Supplementary Fig. S1A, B). This confirms the observation by Zou et al. (2020) that higher viral loads are detected soon after symptom onset and that there is no difference in viral load between asymptomatic and symptomatic patients.
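As a rough guide to what a Ct gap of this size means, a Ct difference can be converted into an approximate fold difference in template. This assumes close to 100% amplification efficiency, so that each cycle doubles the product; efficiency is not reported in this section, so the figure below is illustrative only.

```latex
\[
\text{fold difference} \approx 2^{\Delta C_t},
\qquad
\Delta C_t = 8 \;\Rightarrow\; 2^{8} = 256
\]
```

Under that assumption, a median difference of 8 Ct would correspond to roughly a 256-fold lower amount of detectable target in the gargle specimen than in the paired NPOP swab.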
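The grouping described above can be sketched as follows. This is an illustrative sketch, not the authors' code. The boundary handling is an assumption: the text gives the groups as <20, 20-29 and >30, so here 20 ≤ Ct < 30 is treated as moderate and Ct ≥ 30 as high. The paired Ct values are hypothetical and are not study data.

```python
from statistics import median


def ct_group(npop_ct):
    """Assign an NPOP Ct value to the low / moderate / high group.

    Assumed boundaries: low is Ct < 20, moderate is 20 <= Ct < 30,
    high is Ct >= 30 (the text does not say where Ct 29-30 falls).
    """
    if npop_ct < 20:
        return "low"
    if npop_ct < 30:
        return "moderate"
    return "high"


def median_ct_difference_by_group(pairs):
    """Median (gargle Ct - NPOP Ct) per NPOP Ct group for one target gene.

    pairs: iterable of (npop_ct, gargle_ct) tuples from paired specimens.
    """
    differences = {}
    for npop_ct, gargle_ct in pairs:
        differences.setdefault(ct_group(npop_ct), []).append(gargle_ct - npop_ct)
    return {group: median(values) for group, values in differences.items()}


# Hypothetical paired Ct values, for illustration only (not study data).
example_pairs = [
    (17.0, 27.5), (18.5, 28.0),   # low NPOP Ct
    (24.0, 32.5), (27.0, 35.5),   # moderate NPOP Ct
    (32.0, 34.0), (34.5, 35.0),   # high NPOP Ct
]
result = median_ct_difference_by_group(example_pairs)
print(result)
```

A positive difference means the gargle specimen crossed the threshold later than its paired NPOP swab, which is the direction reported in this section.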

Sensitivity and specificity of gargle specimens are highly replicable with a different RNA extraction kit and qRT-PCR kit
To assess the repeatability of the performance of gargle specimens in diagnosing COVID-19, we performed a comparison between NPOP swab and gargle specimens using other commercial extraction and qRT-PCR kits at Universitas Indonesia (Table 3). Similar to the results for the inpatient and outpatient cohorts of RSDK and RSND, we found an overall agreement of 85% with sensitivity of 85% (95% CI: 63.96% to 94.76%) and specificity of 100% (95% CI: 72.25% to 100%). However, no significant median difference was observed between the Ct values of NPOP swab and gargle specimens (Supplementary Fig. S2A, B). We also observed high sensitivity (94.12%) of gargle specimens for NPOP swab Ct < 35 in the validation by Universitas Indonesia (Supplementary Fig. S2C). Thus, our results demonstrate the wide applicability of gargle specimens for adoption with other existing workflows.
Table 3. Validation of gargle specimens with other commercial RNA extraction kit and qRT-PCR kit.

DISCUSSION
Our study demonstrates the use of self-collected gargle specimens as an alternative specimen for the detection of SARS-CoV-2, achieving a sensitivity of 91.38% for Ct < 35. At the same time, gargle specimens are preferred by patients because collection causes less discomfort. We also validated that self-collected gargle specimens can be used with other commercial RNA extraction kits and COVID-19 qPCR kits. The use of self-collected gargle specimens may improve testing and tracing by reducing the number of healthcare workers and the amount of personal protective equipment required for sample collection.
Previously, passive drooled saliva, or neat saliva, has been reported to be a viable alternative specimen for detection of SARS-CoV-2 infection due to its practicality and reportedly high sensitivity compared with NPOP specimens [4-8,13]. However, the majority of these studies focused only on inpatient cohorts. On the other hand, several studies have highlighted the low sensitivity of passive drooled saliva samples in the population context [9,14,15]. Manabe et al. (2020) noted that oral fluid saliva had only 41.6% positive percent agreement in a nonhospitalized, ambulatory cohort of COVID-19 patients. This suggests that viral shedding in the salivary glands and oral cavity might not be consistent. Additionally, xerostomia manifests in 60% of COVID-19 cases, which might reduce the volume of passive drooled saliva and increase its viscosity [16], leading to a longer sample collection process and a less sensitive assay.
Posterior oropharyngeal saliva and gargle lavage have been reported to show promising performance compared with nasopharyngeal and oropharyngeal swabs for detection of COVID-19 [10,11,17-19] and other respiratory viruses [20]. Goldfarb et al. (2020) found that self-collected gargle specimens were more sensitive and more acceptable than saliva samples in an outpatient cohort. In this study, we report similar performance of gargle specimens in detecting SARS-CoV-2. Because Zou et al. (2020) noted that higher viral loads were detected in the nose than in the throat, we incorporated a sniffing and coughing procedure before gargling to increase the viral load in the oral cavity. This resulted in 85.64% positive percent agreement across the full Ct range.
In our study, the gargle specimens exhibited excellent stability at room temperature (22-27 °C) for over 30 days when preserved using BioSaliva Collection Buffer. This highlights the utility of gargle specimens in remote areas where access to clinical laboratories is limited. Patients could perform the gargle method and drop off the specimen at a collection point, from which samples would be transferred to a centralized laboratory without the need for cold chain distribution while keeping the viral RNA intact.
Our study design had several strengths. In addition to having a number of participants across the pediatric and adult age ranges, we enrolled both inpatient and outpatient cohorts to study the performance of gargle specimens in the population context. This is particularly important as COVID-19 cases have surged all around Asia, especially in Indonesia, where testing and tracing are required to ensure proper diagnosis and proper care for those affected. Our evaluation also included validation of self-collected gargle performance across multiple extraction and qRT-PCR kits and platforms, including an Indonesian Ministry of Health-authorized commercial kit and a US CDC-authorized assay. A main weakness of the study is that a number of patients with discrepant samples did not have additional samples collected to confirm the development of COVID-19. We also did not conduct virus culture as a definitive determination in cases of disagreement between the NPOP and gargle specimens. Thus, further study is needed to understand the cause of the disagreement.
Given the high diagnostic performance, reduction in swabs and personal protective equipment, and excellent stability at room temperature, self-collected gargle specimens appear to be a promising sample type for testing of both inpatients and outpatients with COVID-19.

Assessment of BioSaliva's capability in preserving viral RNA in gargle specimens
To assess the preservation of viral RNA at room temperature by BioSaliva, we conducted a preliminary study on the stability of spiked viral RNA in gargle specimens at room temperature  Table S1). Fifty-three of them were previously diagnosed as COVID-19 positive and had been admitted to hospital, while the rest of the patients were outpatients of RSDK and RSND. Participating patients were verbally informed about the study and the procedure involved. Written informed consent, along with symptoms and date of symptom onset, was obtained from all patients prior to sample collection. For the outpatient cohort, patients without a prior COVID-19 diagnosis who were required or volunteered to undergo PCR testing were included in the study. Patients who were critically ill, unconscious, or unable to gargle were excluded from the study.

Specimen Collection
The NPOP swab was collected from each patient by a trained medical professional and kept in a VTM tube. Before the collection of NPOP swabs, individuals were asked to provide self-collected gargle specimens following the protocol given, under the supervision of healthcare workers. Prior to gargle sample collection, patients were required to observe a 45-minute fasting period during which they were not allowed to eat, drink, smoke, brush their teeth, or use mouthwash. Patients who had undergone dental procedures in the 24 hours before sample collection were excluded. Patients were required to sniff 5 to 6 times, throat cough 5 to 6 times, and gargle the provided solution (2.5 mL of Gargle Solution). Patients then spat the gargled solution into the provided collection tube, after which 3 mL of BioSaliva Collection Buffer (PT Biofarma, Bandung, Indonesia) was added to the same tube. Collection tubes were then inverted 10 times to mix the gargled solution and the Collection Buffer. Samples were then processed in a Biosafety Level 2 laboratory at RSDK and RSND, Semarang, Indonesia. At the end of specimen collection, patients were asked to indicate which method they preferred.

Viral RNA Extraction
The pairs of specimens were labelled with different laboratory numbers and randomized. Technicians who performed viral RNA extraction and RT-PCR were unaware of the names and laboratory numbers of the participants. Aliquots of 200 µL of NPOP swab VTM and 200 µL of saliva samples were treated with lysis buffers and processed using the DaAn Gene Manual Extraction Kit (Da An Gene Co., Ltd. of Sun Yat-sen University, Guangzhou, China) following the manufacturer's protocol.

RT-PCR workflow
The detection of SARS-CoV-2 in the specimens was performed by RT-PCR amplification of the SARS-CoV-2 helicase (nsp13) and RdRP (nsp12) gene fragments using the mBioCoV19 Multiplex qRT-PCR Diagnostic Kit (PT Biofarma, Bandung, Indonesia), which was approved for the detection of SARS-CoV-2 by the Indonesian Ministry of Health. mBioCoV19 has a stated lower limit of detection of 5 RNA copies per reaction. Detection of the human RNase P gene was included in the kit as a control. RT-PCR was performed using the LightCycler® 480 Instrument (Roche Life Science, Penzberg, Germany) at RSDK and RSND. A result was considered positive if the cycle threshold (Ct) value of at least one target gene was < 40, and negative when the Ct values of both targets were ≥ 40.
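The interpretation rule above can be written as a short function. This is a minimal sketch, assuming that a target with no amplification is recorded as `None` and treated as not detected. The function and parameter names are hypothetical, and the sketch does not cover validity checks on the RNase P control, which are not described in this section.

```python
CT_CUTOFF = 40.0  # positivity cut-off stated in the text


def call_result(helicase_ct, rdrp_ct, cutoff=CT_CUTOFF):
    """Classify a specimen from the Ct values of the two viral targets.

    Returns "positive" if at least one target has Ct < cutoff, otherwise
    "negative". A Ct of None means no amplification was recorded
    (assumption: treated the same as Ct >= cutoff).
    """
    def detected(ct):
        return ct is not None and ct < cutoff

    if detected(helicase_ct) or detected(rdrp_ct):
        return "positive"
    return "negative"


print(call_result(32.1, None))   # one target below the cut-off
print(call_result(40.0, 41.2))   # both targets at or above the cut-off
```

Keeping the cut-off as a parameter makes it straightforward to re-run a comparison at a stricter threshold, such as the Ct < 35 subset used when reporting sensitivity.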

Statistical analysis
Data were tested for normality, and descriptive statistics were presented as number (%) for categorical variables and mean ± standard deviation (SD) or median (interquartile range; IQR) for continuous variables. The Kruskal-Wallis test was used to compare Ct values across different storage temperatures. Fisher's exact test was used for categorical variables. Sensitivity, specificity, positive predictive value, and negative predictive value, each with a 95% CI, were calculated to assess diagnostic performance. Cohen's κ coefficient [12] was used to estimate the agreement between the gargle RT-PCR and nasopharyngeal and oropharyngeal swab RT-PCR results. Wilcoxon matched-pairs signed rank test was used to