More Accurate Estimates of the Accuracy of RT-PCR and Chest CT Tests for COVID-19

In this article, we propose a novel statistical method for estimating the accuracy of chest computed tomography (CT) and reverse transcription polymerase chain reaction (RT-PCR) tests in the diagnosis of coronavirus disease 2019 (COVID-19) that simultaneously corrects for imperfect gold standard bias and verification bias. Both types of bias often arise when estimating the diagnostic accuracy of COVID-19 tests. Imperfect gold standard bias arises when the accuracy of chest CT is estimated using the RT-PCR test as a gold standard, despite its tendency to produce false negative results. Verification bias occurs in studies where chest CT results are verified by RT-PCR only in a subsample of suspected cases that is not representative of the original population. Consequently, the accuracy estimates of chest CT and RT-PCR tests could be seriously biased and lead to invalid inference. Our proposed method corrects both types of bias and provides unbiased and more accurate estimates of the sensitivity and specificity of the two tests. Our results suggest that chest CT has higher sensitivity and lower specificity than RT-PCR. The accuracy estimates can serve as an important reference for assessing and comparing the performance of these two tests in the diagnosis of COVID-19 and could guide policy recommendations for their implementation.


Introduction
Coronavirus disease 2019 (COVID-19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2. The World Health Organization declared the outbreak a Public Health Emergency of International Concern on January 30, 2020, and characterized it as a pandemic on March 11, 2020. As of November 15, 2020, over 53.7 million cases and 1.3 million deaths have been reported globally (1).
Early detection of COVID-19 is extremely important for isolation, disease containment, and individual patient care. The reverse transcription polymerase chain reaction (RT-PCR) test is currently the clinical reference standard for COVID-19 diagnosis; however, much evidence shows that the RT-PCR test is far from perfect due to its high false negative rate (2). Although the test performs well under ideal laboratory conditions, many external factors such as the timing of the test, the source of the sample (upper or lower respiratory tract), the handling of specimens, and the performance of detection kits could affect its accuracy. In a study by Kucirka et al. (3), RT-PCR was found to have a false negative rate of at least 20%, depending on the timing of the test. In a systematic review of five studies involving 957 patients with suspected or confirmed COVID-19, false negative rates of the initial RT-PCR ranged from 2% to 29% (4). False negative results are consequential, and the magnitude of risk from them will increase as the prevalence of COVID-19 rises. Individuals with false negative results may relax measures such as physical distancing that are designed to reduce virus transmission, further exacerbating the escalation of the pandemic (5).
In addition to the risk of false negative results in RT-PCR tests, limited healthcare resources, including a shortage of RT-PCR testing kits, could also restrict the identification of COVID-19 infected patients. This was especially seen in the early stages of the pandemic in severely affected areas where the disease was spreading rapidly. Some reports have suggested that chest computed tomography (CT) scans, which can reveal abnormalities that may indicate infection with COVID-19, could be a useful alternative (6). Additionally, chest CT scans can produce faster results than RT-PCR. Hence, the Guidelines for Diagnosis and Treatment of Pneumonitis Caused by 2019-nCoV (trial sixth version) published by the government of China recommended the use of chest CT as a practical and rapid method for screening suspected cases of COVID-19 (7). Outside China, considering the availability and false negative problems of RT-PCR, the World Health Organization suggests chest imaging for the diagnostic workup of symptomatic patients with suspected COVID-19 when: 1) RT-PCR testing is not available; 2) RT-PCR testing is available, but results are delayed; and 3) initial RT-PCR testing is negative, but clinical suspicion of COVID-19 is high (8). Therefore, as an important complement to the RT-PCR test, chest CT currently plays a crucial role in the early prevention and control of COVID-19 (9).
Under these circumstances, it is vital to evaluate the diagnostic performance of chest CT and RT-PCR tests. Common measures of test accuracy are sensitivity and specificity, which correspond to the probability of a positive test result when the individual has the disease (true positive rate), and the probability of a negative test result when the individual does not have the disease (true negative rate), respectively. Recently, several studies have estimated the sensitivity and specificity of chest CT and/or RT-PCR tests (10-12). Among them, many studies treated initial or repeated RT-PCR tests as a gold standard for evaluating the accuracy of chest CT (10,11). However, using an imperfect reference test as if it were a gold standard could underestimate or overestimate the test accuracy; this is called imperfect gold standard bias (13). Moreover, to assess the accuracy of the RT-PCR test, some studies focused on the performance of the initial test (11,12). For example, Long et al. (12) considered initial or repeated RT-PCR tests as the gold standard to estimate the sensitivity of the initial RT-PCR. In this case, the assumption of independence between the reference standard and the test being evaluated was violated; thus, the estimate was not reliable. Another concern about the accuracy of sensitivity and specificity estimates is verification bias. In the study mentioned above (12), only the patients with fever and positive chest CT results were subjected to RT-PCR tests because of the limited supply of nucleic acid detection kits. Their analysis was based on data from 87 patients who underwent both chest CT and RT-PCR tests, but it excluded 106 other patients who were not tested using RT-PCR. Verification bias would arise in this case because chest CT results were verified by RT-PCR only in a selected subsample and not in the whole population. Since this subset is not representative of the original population, the resulting estimates would be distorted.
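Both biases described above can be illustrated with closed-form expressions. The following is a minimal sketch, not the estimation method of this article: the function names and all numerical inputs are hypothetical, the first function assumes the two tests are conditionally independent given disease status, and the second assumes (for simplicity) a perfect reference test and verification that depends only on the chest CT result.

```python
def apparent_accuracy(se_ct, sp_ct, se_ref, sp_ref, prev):
    """Apparent sensitivity/specificity of chest CT when an imperfect
    reference test is treated as the gold standard.

    Assumes the two tests are conditionally independent given disease status.
    """
    # P(CT+, ref+) and P(ref+), summing over diseased and non-diseased
    p_both_pos = se_ct * se_ref * prev + (1 - sp_ct) * (1 - sp_ref) * (1 - prev)
    p_ref_pos = se_ref * prev + (1 - sp_ref) * (1 - prev)
    # P(CT-, ref-) and P(ref-), summing over diseased and non-diseased
    p_both_neg = (1 - se_ct) * (1 - se_ref) * prev + sp_ct * sp_ref * (1 - prev)
    p_ref_neg = (1 - se_ref) * prev + sp_ref * (1 - prev)
    return p_both_pos / p_ref_pos, p_both_neg / p_ref_neg


def verified_only_accuracy(se_ct, sp_ct, lam_neg, lam_pos):
    """Sensitivity/specificity of chest CT computed only among verified
    patients, assuming a perfect reference test (a simplification) and
    verification probabilities lam_neg, lam_pos that depend only on
    whether the CT result is negative or positive.
    """
    naive_se = lam_pos * se_ct / (lam_pos * se_ct + lam_neg * (1 - se_ct))
    naive_sp = lam_neg * sp_ct / (lam_neg * sp_ct + lam_pos * (1 - sp_ct))
    return naive_se, naive_sp


# Hypothetical inputs, chosen only for illustration
se_app, sp_app = apparent_accuracy(se_ct=0.9, sp_ct=0.7, se_ref=0.7, sp_ref=1.0, prev=0.5)
se_ver, sp_ver = verified_only_accuracy(se_ct=0.9, sp_ct=0.7, lam_neg=0.2, lam_pos=0.8)
print(f"imperfect reference: Se={se_app:.3f}, Sp={sp_app:.3f}")
print(f"verified only:       Se={se_ver:.3f}, Sp={sp_ver:.3f}")
```

In this hypothetical setting, false negatives in the reference test pull the apparent specificity of chest CT below its true value of 0.7, and preferential verification of CT-positive patients inflates the sensitivity estimate while deflating the specificity estimate.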
To overcome the aforementioned problems in the accuracy evaluation of COVID-19 tests, in this article, we develop a novel statistical method for simultaneously correcting both imperfect gold standard bias and verification bias in estimating the sensitivity and specificity of diagnostic tests. Our method treats the sensitivity and specificity as unknown parameters and models the verification probabilities within a maximum likelihood procedure. Imperfect gold standard bias and/or verification bias have frequently been ignored in previous studies evaluating the accuracy of chest CT and RT-PCR tests for the diagnosis of COVID-19, and to the best of our knowledge, no appropriate method has been adopted to correct these two types of bias. Our proposed method produces unbiased estimates of the sensitivity and specificity of COVID-19 tests, which could be an important clinically relevant reference.

Data
We consider a dataset collected from a retrospective study conducted by Wang et al. (14). The dataset contains test results of chest CT scan and RT-PCR for COVID-19 from the patients who visited the Fever Clinic at Tianyou Hospital (Wuhan, China) and the Fever Clinic at Second Xiangya Hospital (Changsha, China), from January 22, 2020 to February 14, 2020. The two fever clinics are located in regions with different disease prevalence. Of the total 1,300 patients, 1,097 patients came from a high COVID-19 prevalence region (Wuhan, China) and the remaining 203 patients came from a low COVID-19 prevalence region (Changsha, China). Information on age, gender, and chest CT results is included for all patients, and RT-PCR test results are included for 541 patients. All patients who visited the fever clinics underwent a chest CT scan, and the chest CT results were classified as positive or negative for COVID-19 by radiologists. A subgroup consisting of suspected cases received an RT-PCR test, and some of these patients with an initial negative RT-PCR test result were given repeated tests to rule out the possibility of false negatives. For the patients with multiple RT-PCR tests, the final reported result of RT-PCR was defined as positive if there was at least one positive result from the repeated RT-PCR assays; otherwise, it was defined as negative.

Statistical method for estimating test accuracy
Let $D$ be the unknown true disease status, with values 1 and 0 representing diseased and nondiseased individuals, respectively. We denote the results of chest CT and RT-PCR by binary variables $T_1$ and $T_2$, respectively, such that $T_1 = 1$ ($T_2 = 1$) if the result of chest CT (RT-PCR) is positive and $T_1 = 0$ ($T_2 = 0$) if the result of chest CT (RT-PCR) is negative. Let $V$ be the verification indicator, such that $V = 1$ if chest CT is followed by RT-PCR, and $V = 0$ if only chest CT has been administered.
We model the verification probability as a function of $T_1$, since the probability of an individual receiving the RT-PCR test is dependent on the result of chest CT. In our approach, we assume that $T_1$ and $T_2$ are conditionally independent given the true disease status. The conditional independence assumption is reasonable since chest CT and RT-PCR tests are unrelated. The sensitivity and specificity of the two tests are assumed to be consistent across different populations, while the prevalence of disease can vary across these populations. Under these assumptions, the joint distribution of $(V, T_1, T_2, D)$ would be $P(V, T_1, T_2, D) = P(V \mid T_1)\,P(T_1 \mid D)\,P(T_2 \mid D)\,P(D)$. We treat the aggregated data shown in Table S1 as samples from multinomial distributions, and the probability of each cell can be written as a function of verification probability, sensitivity and specificity of the two tests, and disease prevalence. The estimators of sensitivity and specificity are obtained by maximizing the corresponding log-likelihood function. Given the asymptotic normality of the maximum likelihood estimate, the Wald-type confidence intervals (CI) for the sensitivity and specificity can be constructed. More details can be found in Section S1 of Supplementary Materials.
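The factorization above determines the multinomial cell probabilities once the unobserved disease status is summed out, and the RT-PCR result is also summed out for unverified patients. The following is a minimal sketch for a single population, not the implementation used in this study: the function names, the parameter names (`lam0` and `lam1` for the verification probabilities given a negative or positive chest CT), and the example values are all assumptions made for illustration. How the likelihoods of the two regions are combined is described in Section S1 and is not reproduced here.

```python
import math


def cell_probabilities(se1, sp1, se2, sp2, prev, lam0, lam1):
    """Multinomial cell probabilities for one population.

    Keys are (v, t1, t2); t2 is None when RT-PCR was not performed (v == 0).
    se1/sp1 refer to chest CT, se2/sp2 to RT-PCR, prev to P(D = 1),
    and lam0/lam1 to P(V = 1 | T1 = 0) and P(V = 1 | T1 = 1).
    """
    lam = {0: lam0, 1: lam1}
    p_d = {1: prev, 0: 1.0 - prev}
    # P(T = 1 | D = d): sensitivity if diseased, 1 - specificity otherwise
    p1_pos = {1: se1, 0: 1.0 - sp1}
    p2_pos = {1: se2, 0: 1.0 - sp2}

    def bern(p_pos, t):
        return p_pos if t == 1 else 1.0 - p_pos

    cells = {}
    for t1 in (0, 1):
        # Unverified patients: T2 is missing, so it is summed out
        p_t1 = sum(bern(p1_pos[d], t1) * p_d[d] for d in (0, 1))
        cells[(0, t1, None)] = (1.0 - lam[t1]) * p_t1
        for t2 in (0, 1):
            # Verified patients: sum the joint distribution over D
            p_t1_t2 = sum(
                bern(p1_pos[d], t1) * bern(p2_pos[d], t2) * p_d[d] for d in (0, 1)
            )
            cells[(1, t1, t2)] = lam[t1] * p_t1_t2
    return cells


def log_likelihood(counts, **params):
    """Multinomial log-likelihood (up to an additive constant) for cell counts."""
    cells = cell_probabilities(**params)
    return sum(n * math.log(cells[key]) for key, n in counts.items() if n > 0)


# Hypothetical parameter values, used only to show that the cells sum to one
example = cell_probabilities(se1=0.9, sp1=0.7, se2=0.8, sp2=0.98, prev=0.3, lam0=0.2, lam1=0.6)
print(sum(example.values()))
```

Maximizing `log_likelihood` over the parameters, with a numerical optimizer of the reader's choice, would give estimates of the kind described above.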

Sensitivity analysis
In addition to the results of chest CT, we suspect that the decision to use the RT-PCR test also depended on the epidemiological history and blood test results of the patients. Thus, we consider the use of a covariate vector while modeling the verification probability. Specifically, we model $V$ as a function of the chest CT result $T_1$ and a two-dimensional covariate vector $X = (X_1, X_2)$, in which $X_1$ and $X_2$ are binary variables denoting whether one's epidemiological history suggests potential exposure to the virus, and whether the result of the blood test presents abnormal findings related to the presence of disease, respectively. Then, writing $T_2$ for the RT-PCR result and $D$ for the true disease status, we have $P(V, T_1, T_2, D, X) = P(V \mid T_1, X)\,P(T_1 \mid D)\,P(T_2 \mid D)\,P(D \mid X)\,P(X)$. In our case, the covariate $X$ is unobserved. To examine the impact of the model for verification probability on the sensitivity and specificity estimates of chest CT and RT-PCR tests, we assign different values to the coefficients associated with $X$ within a plausible range and compare the maximum likelihood estimates of sensitivity and specificity of the two tests. See more details in Section S2 of Supplementary Materials.
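The idea of fixing the covariate coefficients at a grid of plausible values can be sketched as follows. This is an illustration under stated assumptions only: the logistic form of the verification model, the coefficient names (`a0`, `a1`, `b1`, `b2`), and every numerical value are hypothetical, since the exact specification is given in Section S2 rather than in the main text.

```python
import math
from itertools import product


def verification_prob(t1, x1, x2, a0, a1, b1, b2):
    """Assumed logistic form (hypothetical):
    logit P(V = 1 | T1, X) = a0 + a1 * t1 + b1 * x1 + b2 * x2.
    """
    eta = a0 + a1 * t1 + b1 * x1 + b2 * x2
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical grid of fixed values for the two covariate coefficients
grid = list(product([0.0, 0.5, 1.0], repeat=2))

for b1, b2 in grid:
    # In a full sensitivity analysis, the likelihood would be re-maximized
    # at each grid point; here we only show how the verification
    # probability changes with the assumed coefficients.
    p = verification_prob(t1=1, x1=1, x2=1, a0=-1.0, a1=1.5, b1=b1, b2=b2)
    print(f"b1={b1:.1f} b2={b2:.1f} P(V=1 | T1=1, X=(1,1))={p:.3f}")
```

Comparing the sensitivity and specificity estimates obtained at each grid point then shows how strongly the conclusions depend on the unobserved covariates.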

Discussion
The strong infectivity of COVID-19 has placed a heavy burden not only on patients and their families, but also on healthcare systems. Early diagnosis, isolation, and treatment have played a fundamental role in the prevention and control of COVID-19. Although the RT-PCR test remains the clinical gold standard for the diagnosis of COVID-19, the occurrence of false negatives, limited healthcare resources, and slow turnaround of results have restricted the early detection of COVID-19. Chest CT has been used as a primary screening tool to supplement RT-PCR, especially in the initial and peak periods of the epidemic (7,10). In this context, accurate and valid estimates of the sensitivity and specificity of chest CT and RT-PCR tests are of great importance in the diagnosis of COVID-19. While some studies have focused on the evaluation of the diagnostic performance of chest CT and/or RT-PCR tests, they typically regarded RT-PCR as a perfect gold standard and/or ignored verification bias. Consequently, their estimates could be seriously biased and would lead to invalid inferences. In this article, we propose a novel statistical method for correcting both imperfect gold standard bias and verification bias simultaneously in estimating the sensitivity and specificity of two diagnostic tests. Our method provides unbiased estimates of accuracy measures for RT-PCR and chest CT, which are essential for assessing and comparing their performance in the diagnosis of COVID-19. To the best of our knowledge, this is the first study to correct for both types of bias simultaneously in evaluating the accuracy of COVID-19 tests.
Our findings suggest that the RT-PCR test has lower sensitivity and higher specificity than chest CT, which is in accordance with results from previous studies (10-12, 15-17). However, it is worth mentioning that their estimates of sensitivity and specificity are noticeably different from ours. Compared to their results, our estimate of the specificity of chest CT is much higher, while our sensitivity estimates for chest CT and RT-PCR tests are lower. The discrepancy between their estimates and ours for the accuracy of chest CT is largely due to their use of the RT-PCR test as a gold standard for estimating the accuracy of chest CT, which could seriously bias the resulting estimates. In the study of Ai et al. (10), taking RT-PCR as the gold standard, the estimated sensitivity and specificity of chest CT were 97% (95% CI: 95% to 98%) and 25% (95% CI: 22% to 30%), respectively. They pointed out that about 81% of the patients with negative RT-PCR results but positive chest CT scans were reclassified as highly suspected cases of COVID-19, and that some patients' RT-PCR results changed from negative to positive; as a result, the accuracy of chest CT could be seriously overestimated or underestimated when RT-PCR is used as the reference standard. Caruso et al. (16) reported that the specificity of chest CT was 56% (95% CI: 45% to 66%), where the imperfect gold standard bias was partially adjusted for by using repeated RT-PCR as the reference standard; their estimate of specificity was much higher than that in Ai et al. (10) and is closer to our result. Additionally, for evaluating the performance of the initial RT-PCR test, some of these studies used initial or repeated RT-PCR tests as the reference standard (11,12); since the results of the initial and repeated tests are not independent, the sensitivity estimate of the RT-PCR test was also not valid. Moreover, some studies (10-12, 16) performed their analysis based on data from the patients who underwent both tests, but excluded those with only chest CT results, which could give rise to verification bias. Without a correction for imperfect gold standard bias and/or verification bias, the accuracy estimates in the aforementioned studies could be seriously distorted. In contrast, our method is able to provide unbiased and reliable estimates of the accuracy measures of diagnostic tests, and the results could serve as a reference for clinicians and policy experts in diagnosing COVID-19.
Although chest CT has a high sensitivity, there are some concerns that false positive results might overwhelm available healthcare resources, especially during an epidemic (18). The American College of Radiology recommended that chest CT should not be used to screen for COVID-19 or as a first-line test to diagnose it because of the reported low specificity of chest CT (18). However, our results show that the specificity of chest CT reaches 0.692 (95% CI: 0.589 to 0.795), which may alleviate the aforementioned concerns. Thus, chest CT could play a more important role in the screening or diagnosis of COVID-19 than previously thought. Additionally, Surkova et al. (19) pointed out that false positive RT-PCR results are another hidden problem that might lead to many potential consequences, including but not limited to financial losses and psychological pressure. Our study provides additional insight into this problem by showing that the false positive rate of RT-PCR could be somewhere between 0.0% and 7.1%.
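A Wald-type interval has the form estimate ± z × SE, so the interval reported above for the specificity of chest CT can be checked for internal consistency. The sketch below is illustrative only: the function names are hypothetical, and the standard error is back-calculated from the reported interval limits rather than taken from the study's tables.

```python
from statistics import NormalDist


def wald_ci(est, se, level=0.95):
    """Wald-type confidence interval: est +/- z * se."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return est - z * se, est + z * se


def implied_se(lower, upper, level=0.95):
    """Standard error implied by a symmetric Wald-type interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return (upper - lower) / (2.0 * z)


# Reported specificity of chest CT: 0.692 (95% CI: 0.589 to 0.795)
se = implied_se(0.589, 0.795)  # back-calculated, not a reported value
lower, upper = wald_ci(0.692, se)
print(round(se, 4), round(lower, 3), round(upper, 3))
```

The reported interval is symmetric about the point estimate, as expected for an interval constructed this way.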
Our method involves maximum likelihood estimation, which requires certain assumptions to be met. First, the chest CT test and the RT-PCR test are assumed to be independent given the true disease status, that is, the two tests are independent within the diseased population and within the non-diseased population. This assumption is reasonable because chest CT and RT-PCR rely on different mechanisms: chest CT is a test of radiological imaging features, while the RT-PCR test detects and measures RNA. Additionally, the RT-PCR test is specific to the COVID-19 virus, but the imaging features of COVID-19 pneumonia are highly nonspecific and might overlap with those of H1N1 influenza, cytomegalovirus pneumonia, or atypical pneumonia (20). Next, we assume that whether a patient is verified by RT-PCR is conditionally independent of the RT-PCR result given the chest CT result, that is, the results of RT-PCR are regarded as missing at random. We also acknowledge that there may be additional covariates, such as the epidemiological history and/or blood test results (7), that would affect whether an individual received the RT-PCR test. A sensitivity analysis was conducted by implementing a modified model for the verification probability, and the modification was shown to have a negligible effect on the sensitivity and specificity estimates of chest CT and RT-PCR tests.

Table 1. Estimates (Est), standard errors (SE), and 95% confidence intervals (CI) for the sensitivity and specificity of chest CT scan and RT-PCR tests, using the proposed maximum likelihood method.

Table 2 reports the sensitivity and specificity estimates under scenarios with different sets of plausible values for the two coefficients associated with the covariate vector $X$. Overall, it clearly shows that different values of these coefficients have negligible effects on the sensitivity and specificity estimates of the two tests. As the coefficients vary, the sensitivity estimates change minimally, and the specificity estimates seem to remain the same. Consequently, the maximum likelihood estimation results shown in Table 2 are almost identical to those presented in Table

Table 2. Estimates (Est), standard errors (SE), and 95% confidence intervals (CI) for the sensitivity and specificity of chest CT scan and RT-PCR tests, using the proposed maximum likelihood method, with different values for the two coefficients associated with the covariate vector $X$.