Is Pool Testing Method of COVID-19 Employed in Germany and India Effective?

At present, several countries, such as Germany and India, have employed a pool testing method for the nucleic acid testing of COVID-19 because of the shortage of detection kits. In this method, several samples are tested together as a bunch. If the test result of the bunch is negative, none of the cases in the bunch has been infected with the novel coronavirus. If, on the other hand, the test result of the bunch is positive, the samples are tested one by one to confirm which cases are infected. Based on probabilistic modeling, we verified that the pool testing method of COVID-19 is effective when nucleic acid detection kits are in short supply. Moreover, the following results are also obtained. (1) If the infection rate is extremely low, then with the same number of detection kits the expected number of cases that can be tested by the pool testing method is far greater than that by the one-by-one testing method. (2) The pool testing method is effective only when the infection rate is less than 0.3078. As the infection rate decreases from 0.3078 to 0.0018, the optimal sample size in one bunch increases from 3 to 25. In general, the higher the infection rate, the smaller the optimal sample size in one bunch. (3) If N samples are tested by the pool testing method with a sample size of G in one bunch, the number of detection kits required lies in the interval (N/G, N). Additionally, the lower the infection rate, the fewer detection kits are needed. Therefore, the pool testing method is suitable not only when detection kits are in short supply, but also for the overall or sampling detection of a large population.


Introduction
At present, the COVID-19 epidemic has broken out in most countries around the world. The epidemic is expected to last for a long time due to its strong infectivity (Yang and Wang, 2020). To minimize the negative impact of the epidemic, each country has adopted its own response methods. The experience of China suggests that rapid detection and isolation of the infected cases are of vital importance in controlling the development of the epidemic (Cyranoski, 2020). Nucleic acid testing (NAT) is a common way to detect infection with the novel coronavirus (Hu et al., 2020). In the early stage of the epidemic in many countries and regions, NAT capacity is likely to be insufficient for one-by-one testing of all the suspected cases. To target NAT more precisely, Ghinai et al. (2020) made a statistical analysis of the NAT results of the close contacts of confirmed cases and identified those who are most likely to be infected.
Additionally, several countries, such as Germany and India, have employed a pool testing method. Specifically, NAT is performed on several samples together as a bunch. If the test result of the bunch is negative (hereinafter referred to as the "uninfected bunch"), then none of the cases in the bunch has been infected with the novel coronavirus. If, on the other hand, the test result of the bunch is positive (hereinafter referred to as the "infected bunch"), then at least one of the cases in the bunch has been infected with the novel coronavirus. In this case, the samples are tested one by one to determine which cases are infected. This research therefore aims to analyze the effectiveness of the pool testing method and to propose an approach that maximizes the detection efficiency with limited nucleic acid testing kits.
The remainder of this paper is organized as follows. Section 2 calculates the expected number of testing times of overall infection cases in a country or region. Section 3 compares the efficiency of the pool testing method and the one-by-one testing method with limited detection kits. Section 4 proposes the optimal sample sizes in one bunch at different infection rates.
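The two-stage procedure described above can be sketched in a few lines of Python. This is only an illustration under an idealized assumption that the test has no false positives or false negatives; the function name `pool_test` and the example data are hypothetical and do not come from the original study.

```python
from typing import List, Tuple


def pool_test(infected: List[bool], bunch_size: int) -> Tuple[int, List[int]]:
    """Apply the two-stage pooling procedure to a list of true infection states.

    Returns (number of tests used, indices of the cases found infected).
    Assumes an idealized test without false positives or false negatives.
    """
    tests_used = 0
    found: List[int] = []
    for start in range(0, len(infected), bunch_size):
        bunch = range(start, min(start + bunch_size, len(infected)))
        tests_used += 1  # one test on the pooled bunch
        if any(infected[i] for i in bunch):
            # Infected bunch: every sample in it is retested individually.
            for i in bunch:
                tests_used += 1
                if infected[i]:
                    found.append(i)
    return tests_used, found


# Hypothetical example: 12 cases, one of them infected, bunches of 4.
states = [False] * 12
states[5] = True
tests, positives = pool_test(states, bunch_size=4)
print(tests, positives)  # 7 [5]
```

In this example, three pooled tests plus four individual retests of the single infected bunch identify the infected case with 7 tests instead of 12.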

Expected number of testing times of overall infection cases
When NAT is carried out by the pool testing method, the expected number of testing times is affected by the total population and the infection rate of the country or region, as well as by the sample size in one bunch. Let T denote the total testing number, N the total population, R the infection rate, G the sample size in one bunch, and U the number of uninfected bunches. Each bunch is tested for nucleic acids in sequence.
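The probability that the first bunch is an uninfected bunch is

$$P_1 = \frac{\binom{N(1-R)}{G}}{\binom{N}{G}}. \tag{1}$$

Let $x_1$ ($x_1 = 0, 1, \ldots, G$) denote the number of uninfected cases in the first bunch. The probability that there are $x_1$ uninfected cases detected in the first bunch is

$$\frac{\binom{N(1-R)}{x_1}\binom{NR}{G-x_1}}{\binom{N}{G}}.$$

Based on the total probability formula, the probability that the second bunch is an uninfected bunch can be obtained as

$$P_2 = \sum_{x_1=0}^{G} \frac{\binom{N(1-R)}{x_1}\binom{NR}{G-x_1}}{\binom{N}{G}} \cdot \frac{\binom{N(1-R)-x_1}{G}}{\binom{N-G}{G}}. \tag{2}$$

In equation (2), $\sum_{x_1=0}^{G}\binom{N(1-R)}{x_1}\binom{NR}{G-x_1}\binom{N(1-R)-x_1}{G}$ means the number of basic events that $x_1$ uninfected cases are randomly selected from $N(1-R)$ uninfected ones, then $G-x_1$ infected cases are randomly selected from $NR$ infected ones (here a bunch of G cases is formed), and finally G uninfected cases are randomly selected from the remaining $N(1-R)-x_1$ uninfected ones. This is equivalent to the number of basic events that G uninfected cases are randomly selected from $N(1-R)$ uninfected ones, and then G cases are randomly selected from the remaining $N-G$ cases.
Therefore, the number of basic events above is $\binom{N(1-R)}{G}\cdot\binom{N-G}{G}$. Consequently, we have

$$\sum_{x_1=0}^{G}\binom{N(1-R)}{x_1}\binom{NR}{G-x_1}\binom{N(1-R)-x_1}{G} = \binom{N(1-R)}{G}\binom{N-G}{G}.$$

Therefore, we obtain

$$P_2 = \frac{\binom{N(1-R)}{G}\binom{N-G}{G}}{\binom{N}{G}\binom{N-G}{G}} = \frac{\binom{N(1-R)}{G}}{\binom{N}{G}}.$$

Let $x_2$ ($x_2 = 0, 1, \ldots, G$) denote the number of uninfected cases in the second bunch. Based on the total probability formula, the probability that the third bunch is an uninfected bunch can be obtained as

$$P_3 = \sum_{x_1=0}^{G}\frac{\binom{N(1-R)}{x_1}\binom{NR}{G-x_1}}{\binom{N}{G}} \sum_{x_2=0}^{G}\frac{\binom{N(1-R)-x_1}{x_2}\binom{NR-G+x_1}{G-x_2}}{\binom{N-G}{G}}\cdot\frac{\binom{N(1-R)-x_1-x_2}{G}}{\binom{N-2G}{G}},$$

where $\sum_{x_2=0}^{G}\binom{N(1-R)-x_1}{x_2}\binom{NR-G+x_1}{G-x_2}\binom{N(1-R)-x_1-x_2}{G}$ means the number of basic events that, after the first bunch is designated, $x_2$ uninfected cases are randomly selected from the remaining $N(1-R)-x_1$ uninfected ones, then $G-x_2$ infected cases are randomly selected from the remaining $NR-G+x_1$ infected ones, and finally G uninfected cases are randomly selected from the remaining $N(1-R)-x_1-x_2$ uninfected ones. Consequently, we have

$$\sum_{x_2=0}^{G}\binom{N(1-R)-x_1}{x_2}\binom{NR-G+x_1}{G-x_2}\binom{N(1-R)-x_1-x_2}{G} = \binom{N(1-R)-x_1}{G}\binom{N-2G}{G}.$$

Therefore, we obtain

$$P_3 = \sum_{x_1=0}^{G}\frac{\binom{N(1-R)}{x_1}\binom{NR}{G-x_1}}{\binom{N}{G}}\cdot\frac{\binom{N(1-R)-x_1}{G}}{\binom{N-G}{G}} = P_2 = \frac{\binom{N(1-R)}{G}}{\binom{N}{G}}.$$

Similarly, the following equation can also be obtained successively:

$$P_4 = P_5 = \cdots = \frac{\binom{N(1-R)}{G}}{\binom{N}{G}}.$$

Therefore, the probability of any bunch being an uninfected bunch is equal to that in equation (1). When $N \gg G$, sampling without replacement is close to sampling with replacement, and this probability can be considered as $(1-R)^G$. Additionally, the bunch number can be considered as $N/G$. Consequently, the expected number of uninfected bunches is

$$E(U) = \frac{N}{G}(1-R)^G.$$

Therefore, the expected number of testing times of overall infection cases is

$$E(T) = \frac{N}{G} + \left(\frac{N}{G} - E(U)\right)G = N\left(\frac{1}{G} + 1 - (1-R)^G\right). \tag{5}$$

In practice, equation (5) can only be applied in the condition of $G \geq 2$. In one-by-one testing (i.e. $G = 1$), the expected number of testing times of overall infection cases is $E(T) = N$.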
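As a numerical check on this section, the following sketch compares the exact probability that a bunch is uninfected (a ratio of binomial coefficients, for sampling without replacement) with the large-population approximation (1 − R)^G, and evaluates the expected number of tests N(1/G + 1 − (1 − R)^G). The function names and the example values of N, R and G are illustrative assumptions, not taken from the original study; N·R is assumed to round to a whole number of infected people.

```python
from math import comb


def p_uninfected_exact(N: int, R: float, G: int) -> float:
    """Probability that G people drawn without replacement from N are all uninfected."""
    uninfected = round(N * (1 - R))  # assumes N*R is (close to) an integer
    return comb(uninfected, G) / comb(N, G)


def p_uninfected_approx(R: float, G: int) -> float:
    """Approximation for N >> G, i.e. sampling with replacement."""
    return (1 - R) ** G


def expected_tests(N: int, R: float, G: int) -> float:
    """Expected number of tests: N for one-by-one testing, N(1/G + 1 - (1-R)^G) for G >= 2."""
    if G == 1:
        return float(N)
    return N * (1 / G + 1 - (1 - R) ** G)


# Hypothetical example values.
N, R, G = 100_000, 0.01, 10
print(p_uninfected_exact(N, R, G), p_uninfected_approx(R, G))
print(expected_tests(N, R, G), expected_tests(N, R, 1))
```

For these values the two probabilities agree to several decimal places, and the expected number of tests is roughly one fifth of the N tests needed for one-by-one testing.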

Testing Efficiency with Limited Detection Kits
Let m denote the upper limit of the number of testing times per unit time in a country or region. When employing the one-by-one testing method, the number of cases that can be tested is m. Then, for a country or region with a total population of N, the probability of each person being tested is $m/N$, and the expected number of infected cases that can be detected is $mR$.
When employing the pool testing method in which the sample size in one bunch is G, the expected number of cases that can be tested is

$$\frac{m}{\frac{1}{G} + 1 - (1-R)^G}.$$

Then, for a country or region with a total population of N, the probability of each person being tested is

$$\frac{m}{N\left(\frac{1}{G} + 1 - (1-R)^G\right)},$$

and the expected number of infected cases that can be detected is

$$\frac{mR}{\frac{1}{G} + 1 - (1-R)^G}.$$

It can be seen that when the infection rate is extremely low, $(1-R)^G \approx 1$, so the expected number of cases that can be tested by the pool testing method is approximately $mG$, that is, about G times the number that can be tested by the one-by-one testing method.
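The capacity comparison in this section can be illustrated with a short sketch that computes how many people a fixed budget of m tests covers under each method. The function names and the example values of m, R and G are hypothetical, chosen only to show the "about G times" behaviour at a low infection rate.

```python
def cases_testable(m: int, R: float, G: int) -> float:
    """Expected number of people covered by m tests with bunches of size G."""
    if G == 1:
        return float(m)  # one-by-one testing: one test per person
    return m / (1 / G + 1 - (1 - R) ** G)


def infected_detected(m: int, R: float, G: int) -> float:
    """Expected number of infected cases found among the people covered."""
    return R * cases_testable(m, R, G)


# Hypothetical example: 1,000 kits, infection rate 0.1%, bunches of 10.
m, R, G = 1000, 0.001, 10
pooled = cases_testable(m, R, G)
single = cases_testable(m, R, 1)
print(pooled, single, pooled / single)  # the ratio is somewhat below G = 10
```

The ratio approaches G only as R tends to zero; at any positive infection rate some bunches test positive and consume additional kits.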

Optimal Sample Size in One Bunch
The optimal sample size in one bunch, denoted by $G^*(R)$, is defined as the sample size that minimizes the expected number of testing times (L) per case in the bunch. According to the previous analysis, L is determined both by the sample size in one bunch (G) and by the infection rate (R). When the detection kits are sufficient, the expected number of testing times per case in the one-by-one testing is 1; the expected number of testing times per case in the pool testing is

$$L(G,R) = \frac{E(T)}{N} = \frac{1}{G} + 1 - (1-R)^G.$$

When $L(G,R)$ is larger than 1, the expected number of testing times per case in the pool testing is more than that in the one-by-one testing. That is,

$$\frac{1}{G} + 1 - (1-R)^G > 1 \iff R > 1 - \left(\frac{1}{G}\right)^{1/G}.$$

Then, we set

$$f(G) = 1 - \left(\frac{1}{G}\right)^{1/G}.$$

Obviously, $f(G)$ increases monotonically in the interval $(2, e)$ and decreases monotonically for $G > e$, so that $f(G) \leq f(e) = 1 - e^{-1/e} \approx 0.3078$. For any $R \in (0,1)$ with $R \geq 0.3078$, we therefore have $L(G,R) > 1$ for every $G \geq 2$. That is, in the condition of $R \geq 0.3078$, the one-by-one testing method is strictly superior to the pool testing method.
On the contrary, in the condition of $R < 0.3078$, the expected number of testing times per case in the pool testing is smaller than that in the one-by-one testing.
Moreover, we obtain

$$L(2,R) - L(3,R) = \frac{1}{6} - R(1-R)^2,$$

which is greater than zero in the condition of $R \in (0,1)$, because $R(1-R)^2 \leq 4/27 < 1/6$. Therefore, the expected number of testing times per case in the condition of $G = 2$ is strictly greater than that in the condition of $G = 3$. In other words, the optimal sample size in one bunch should be no less than 3. In the condition of $G \geq 3$, $0 < R < 0.3078$, the function graph of $L(G,R)$ is depicted in Figure 1. In the condition of $R \in (0, 0.3078)$, as G increases, L increases significantly from its minimum, then decreases slightly, and finally approaches 1. It can also be seen that the larger the value of R, the smaller the optimal sample size in one bunch. Obviously, when G is equal to 1 (the minimum value), the number of testing times per case is 1; when G is equal to N (the maximum value), the expected number of testing times per case is also 1. On this basis, we set an auxiliary function of G and R, whose graph is depicted in Figure 2. Based on the analysis above, we figure out the values of $R_1$ and $R_2$ by traversing R in the interval (0, 0.3078) with a step size of 0.0001 in the condition of $G^*(R) = 3, 4, 5, \ldots$. Consequently, we obtain the optimal sample size in one bunch $G^*(R)$ corresponding to different infection rate intervals $(R_1, R_2]$, which is demonstrated in Table 1.

Conclusions
This research verifies the effectiveness of the pool testing method of COVID-19 employed in Germany and India. Additionally, the expected number of testing times of overall infection cases in a country or region by the pool testing method is proposed.
The research compares the efficiency of the pool testing method and the one-by-one testing method in the condition of limited nucleic acid testing kits. The results show that, when the infection rate is extremely low, the number of cases that can be tested by the pool testing method is several times that of the one-by-one testing method.
However, the pool testing method is effective only when the infection rate is less than 0.3078. For the pool testing, the higher the infection rate, the smaller the optimal sample size in one bunch. This research also proposes the optimal sample sizes in one bunch corresponding to different infection rates. In practice, the infection rate in a country or region can be estimated by random sampling and testing, and then the sample size in one bunch can be determined.