DOI: https://doi.org/10.21203/rs.3.rs-80259/v1
Two seroprevalence studies were undertaken in Delhi, the city-state capital of India, in July-August 2020, exactly one month apart, to test for SARS-CoV-2 antibodies. Virus-tested (mostly RT-PCR) caseloads corresponding to these surveys, as of 13 days earlier to allow for antibody generation, were compared. The survey conducted June 26-July 10 (sample size 21387) showed 23.48% seroprevalence (extrapolated to 4.48 mn of Delhi's population of 19.1 mn), which was 79 times higher than the corresponding virus-tested positives totaling 56746. The survey conducted August 1-7 (15311 samples) showed 29.1% antibody-positive (5.56 mn of the population), which was 44x the virus-tested positive total of 125096. Noting that serological surveys the world over have shown antibody-positives to exceed virus-test positives by multiples of 7x to 80x, this study examines why the multiple should decline so drastically in one month, from 79x to 44x.
Statistical adjustments were performed for Sampling Error and Sensitivity/Specificity of the diagnostic kits. Indigenously developed COVID KAVACH ELISA tests for IgG antibodies to the SARS-CoV-2 virus were used for the surveys. Significantly, statistical adjustments were also done to account for the Testing Volumes and (Spot) Positivity rates at the two different times. [Spot Positivity is defined in the study and is the closest estimate of current or fresh positivity.] After all statistical revisions, the antibody-positive to virus-test positive multiples stood at 53x and 37x for the two surveys. Calculating across the two sets of data, and with other sensitivity analyses, the study indicates that there is a significant proportion of pre-existing cross-reactive antibodies (possibly to the HCoV viruses) that are seropositive in SARS-CoV-2 antibody tests, to the extent of 16%-19% of the population. The study also infers that there is an Amplification Factor of 15 in the Delhi serostudies: i.e., each virus-test positive represents 14 more who are possibly asymptomatic and untested.
The study forecasts a seroprevalence of 31%-34% for the 3rd serial serosurvey scheduled in September, whose results are expected on 22nd September. Limitations of the study are discussed, notably the absence of any research paper on the survey techniques, antibody testing controversies, and the statistical adjustment for Testing Volumes. The study discusses how Chain-of-Transmission protocols and Decreasing Susceptible Population work in unison to slow down a pandemic, and analyses the disease progression graph of Delhi in that context. The implications of 16%-19% pre-existing antibodies on disease progression in Delhi are discussed.
The National Centre for Disease Control (NCDC) India conducted a serological survey for IgG antibodies to SARS-CoV-2 virus in the state of Delhi, India – a city-state and the capital of the country – between June 26 and July 10, 2020(1), median date July 3. A second serological survey was conducted by the Government of Delhi on the same population, from August 1-7 2020(2), median date August 3. Table 1 below provides details of these surveys and results, considering Delhi’s population as 19.1 million(3). The median time to develop IgG antibodies is 13 days from symptom onset(4), so these serosurveys correspond to those infected by the virus latest by June 20th and July 21st, respectively.
| Survey No. | Survey Dates | Median Date | Sample Size | Positivity Detected (antibody +ve) | Implied Infected (population) | Corresponding “Infected” Date | Total virus-test +ve by that date | Multiple of Antibody +ve to Virus +ve |
|---|---|---|---|---|---|---|---|---|
| 1 | June 26-July 10 | July 3 | 21,387 | 23.48% | 4.48 mn | By June 20 | 56,746 | 79x |
| 2 | Aug 1-7 | Aug 3 | 15,311 | 29.1% | 5.56 mn | By July 21 | 125,096 | 44x |

Table 1: Details of Two Serological Surveys Conducted at Delhi
Table 1 shows that while there were 56746 infected cases (2971 per mn), all confirmed by virus testing (RT-PCR, with negligible RAT testing), there were 4.48 mn (234,800 per mn) seropositive cases, i.e., positive for the presence of SARS-CoV-2 antibodies. By conventional wisdom, 4.48 mn residents had contracted Covid-19 and had developed antibodies thereafter. Those antibody-positive cases were 79 times higher than virus-tested cases. These were “invisible” cases: not only were they sufficiently asymptomatic to not require medical attention, but they had also not been tested as primary contacts of infected cases. Similarly, for the second survey, there were 6550 virus-positive cases per million, while there were 291,000 per million seropositive, a multiple of 44x.
The Research Question
Although it is surprising that seropositive cases should be much higher than those recorded infected by virus testing, this result is consistent with other studies across the world (see following section). However, it is intriguing that the gap between serosurvey positives and virus-test positives should fall so much – 79 times down to 44 times – in the course of 1 month.
The objective of this study is not to establish that antibody-positive cases are higher in Delhi than recorded-infected cases or to examine why the multiples are as high as 79x, though the study will provide some answers to these questions. The objective of this study is to scrutinize why the multiple of antibody-positive cases to virus-tested positive cases fell from 79x in June-July to 44x exactly one month later.
This phenomenon is counterintuitive. If there are “invisible” asymptomatic untested cases who have recovered and developed antibodies, as high as 79 times the recorded numbers at any point of time, this multiple should remain similar at second and subsequent surveys. If the natural phenomenon is that invisible cases infect others because they are without quarantine or isolation, they should continue doing so at all times.
One possibility is that testing volumes were low until June 20th and that, consequently, a number of symptomatic cases went untested in that period but soon recovered fully, with active antibodies that showed up in the first survey. A look at the data leads to the intuitive conclusion that virus-tested positives would not have risen to comparable levels even if testing volumes had been higher before the first serosurvey. We establish this intuitive conclusion through the statistical revisions we perform, using testing volumes and spot positivity, to “equate” the two surveys on this parameter. [We define the term Spot Positivity to mean the Current or Fresh Positivity Rate of virus tests; Spot Positivity on Day D is defined as the (Total of Fresh Cases on Days D, D+1, D+2) divided by (Total of Fresh Tests conducted on Days D-1, D and D+1).]
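The Spot Positivity definition can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `spot_positivity` and the daily series are hypothetical, and the lists are assumed to be indexed by consecutive days.

```python
def spot_positivity(fresh_cases, fresh_tests, d):
    """Spot Positivity on day index d.

    Numerator: fresh cases on days d, d+1, d+2.
    Denominator: fresh tests on days d-1, d, d+1.
    """
    if d < 1 or d + 2 >= len(fresh_cases) or d + 1 >= len(fresh_tests):
        raise ValueError("need one day before d and two days after d")
    cases = sum(fresh_cases[d:d + 3])
    tests = sum(fresh_tests[d - 1:d + 2])
    return cases / tests


# Hypothetical daily series (illustrative only, not Delhi data).
cases = [100, 120, 150, 130, 110]
tests = [1000, 1100, 1200, 1300, 1250]

# Day 1: cases 120 + 150 + 130 = 400, tests 1000 + 1100 + 1200 = 3300.
rate = spot_positivity(cases, tests, 1)
print(f"{rate:.2%}")
```

Note that the case window trails the test window by one day, exactly as in the definition.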
Background: Antibody-tested Positives are Always Higher than Virus-tested Positives
There is no debate that the total number of those who will go on later to develop antibodies will always be higher than the recorded positive cases. Many infected cases, mostly asymptomatic, are untested. Testing capacity and protocols are not designed to hunt for asymptomatic cases. Practical testing logistics limits testing to symptomatic cases and their primary contacts and excludes asymptomatic cases and secondary contacts. For these and other reasons, there will always be a number of “invisible” Covid-19 cases in the community.
This has been established time and again. All serological surveys conducted so far have indicated an antibody-positive number in excess of virus-tested positive cases, with the range varying between 6x and 80x. Among a few such, a study in Gangelt Germany in March 2020(5) revealed a 7-fold higher seroprevalence than confirmed infected cases. A widely cited study in The Lancet(6) conducted in Geneva during April-May reached the definitive conclusion that antibody-positive cases were 11.6x higher than virus-tested positive cases. A study conducted at 10 diverse sites in the USA between Mar-May 2020(7) showed an average gap of 38x between seroprevalent cases versus recorded-infected cases (counted 7 days prior to antibody testing) – the multiple varied widely across the 10 sites. A study in Spain involving 61075 samples conducted in April-May(8) showed seroprevalence between 3.7% and 6.2% and an antibody-positive figure that is at least 19x the virus-positive cases (after extrapolating the math in the paper). Several other studies(8) report seroprevalence data without comparing with corresponding recorded-infected cases – if computed, these would also reveal significantly higher multiples of antibody-positive cases. It can be inferred that an unstated informal consensus is that seroprevalent cases 10x-15x higher than recorded infected cases are not unusual.
Interestingly, two of the studies cited above have reported a drop between two serial serostudies – in Geneva and some sites of USA – but these have been considered non-typical aberrations in only one subsequent round of testing. These have been seen in light of the possibility that antibodies may decrease over time. This is an unresolved question, with other research upholding both sides of the argument, and we will not factor-in the possibility of antibody decrease in our study.
Background: A Basic View of Viral Dynamics and Antibody Generation
With a large number of asymptomatic cases, the classical picture of exposed → incubation → onset → mild/moderate/severe disease → resolution is now inadequate. A study of viral dynamics is unwarranted in this study, but the relevant context is presented in Fig 1 below. This is a simplified schematic that shows the time relationships between the disease (detection and later), infectivity and antibody generation. The schematic is simplified by the use of median values when each element is actually a probability distribution. Exceptions arising from some recent research (e.g., no antibody generation) are avoided. Viral shedding is detected by RT-PCR testing; however, this oversensitive test will also detect viruses that are not alive (cannot be cultured) and hence do not contribute to active disease in the patient. The patient remains infective as long as the virus is live, and there is generally a phase-out of infectivity simultaneously with a phase-in of seroconversion (antibody generation). Fig 1 below is self-explanatory.
Statistical Adjustments Prior to Analysis
Raw data regarding total infected cases and seroprevalence for the two serosurveys are given below:
Situation, originally; (Per mil population, corresponding seroprevalence data approx. 15 days later)
Case 1: As of 20th June, the total Covid-infected (virus test) was 2971, and the antibody-test positive was 234,800 (79x).
Case 2: As of 21st July, the total Covid-infected (virus-test) was 6550, and the antibody-test positive was 291,000 (44x).
Sampling Error. Extrapolating readings from a sample to the population may result in errors. The sample sizes for the two surveys were 21387 and 15311, respectively, with results 23.48% and 29.1% positive. The Adjusted Wald approach(12) accounts for the population extrapolation by computing a confidence interval at a 95% confidence level. The confidence intervals were [22.96% - 24.01%] for the first survey and [28.51% - 29.95%] for the second survey. We take the lower bound in both cases to avoid inflating the sampling bias. The resultant revision is below.
Situation, after removing the sampling error (per mil pop, corresponding seroprevalence data approximately 15 days later)
Case 1: As of 20th June, the total Covid-infected (virus test) was 2971, and antibody-test positive was 229,600 (77x).
Case 2: As of 21st July, the total Covid-infected (virus-test) was 6550, and antibody-test positive was 285,100 (44x).
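For readers who want to see the mechanics of an Adjusted Wald interval, here is a minimal sketch. It assumes the Agresti-Coull form of the adjustment. The positive counts are back-calculated from the reported percentages and the Table 1 sample sizes, which is an assumption, since the raw counts are not given here. The bounds it prints therefore need not coincide exactly with the intervals quoted above.

```python
import math


def adjusted_wald(successes, n, z=1.96):
    """Adjusted Wald (Agresti-Coull) interval for a binomial proportion."""
    n_adj = n + z ** 2                        # add z^2 pseudo-observations
    p_adj = (successes + z ** 2 / 2) / n_adj  # half of them counted as successes
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half_width, p_adj + half_width


# (sample size, reported seropositive fraction); counts are back-calculated.
surveys = [(21387, 0.2348), (15311, 0.291)]
for n, fraction in surveys:
    lower, upper = adjusted_wald(round(n * fraction), n)
    print(f"n={n}: [{lower:.2%}, {upper:.2%}]")
```

With samples this large the adjustment barely moves the centre of the interval; the width is driven almost entirely by the sample size.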
Error of Sensitivity/Specificity of Diagnostic Kits. Both surveys used the COVID KAVACH ELISA test, indigenously developed by the National Institute of Virology, Pune. This test quantifies IgG antibodies against the spike glycoprotein of the SARS-CoV-2 virus. The developers report(13) a sensitivity of 92.37% and a specificity of 97.9%. [These figures were unnecessarily mystified by a hasty Press Release by the Indian Council of Medical Research (ICMR), reporting far higher Specificity/Sensitivity scores(14), subsequently amended.] We adjust for false positives and false negatives by the standard formula(15):
Actual Prevalence Rate = (Seroprevalence x Sensitivity) + (1 - Specificity) x (1 - Seroprevalence)
Applied to the lower-bound seroprevalence of each survey, the actual prevalence rate computes as 22.83% for Survey 1 and 27.84% for Survey 2.
After removing Sampling and Diagnostic Kit Errors: (Per mn pop, corresponding seroprevalence data ~15 days later)
Case 1: As of 20th June, the total Covid-infected (virus test) was 2971, and antibody-test positive was 228,300 (77x).
Case 2: As of 21st July, the total Covid-infected (virus-test) was 6550, and the antibody-test positive was 278,400 (43x).
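The diagnostic-kit adjustment can be checked with a short calculation. The sketch below applies the formula exactly as stated above to the lower-bound seroprevalence of each survey (22.96% and 28.51%); the function name `kit_adjusted_rate` is illustrative.

```python
SENSITIVITY = 0.9237   # reported by the kit developers
SPECIFICITY = 0.979


def kit_adjusted_rate(seroprevalence):
    """Formula as stated in the text: Se * p + (1 - Sp) * (1 - p)."""
    return seroprevalence * SENSITIVITY + (1 - SPECIFICITY) * (1 - seroprevalence)


for lower_bound in (0.2296, 0.2851):
    adjusted = kit_adjusted_rate(lower_bound)
    print(f"{lower_bound:.2%} -> {adjusted:.2%}")
```

This reproduces the 22.83% and 27.84% figures, i.e., 228,300 and 278,400 per million.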
Adjusting for Testing Volumes. On 20th June, when there were 2971 cumulative infected cases per million, a cumulative total of 351,909 tests had been conducted. Over subsequent days, fresh tests added to the cumulative total until 13th July (by when there were 789853 tests), when the Government of Delhi notified that “reconciliation with ICMR figures” had led to a reduction of 97008 tests cumulatively(16). We have prorated this reduction across all previous days from 12th July. Data details are provided in Table 2 below for some key dates, including 20th June and 21st July, the two dates for virus-tested positive cases that correspond to the two serosurveys. We also provide in the table the Cumulative Covid Positivity Rate (Total Infections divided by Total Tests), as well as the Spot Positivity Rate on these dates. Spot Positivity Rate, as defined earlier, provides the best estimate of Fresh (Current) Positivity on a given date.
| As on Date (cumulative) | Total Infections till date | Total Tests till date | Adjusted Tests till date | Total Infections per mn | Total Tests per mn | Cumulative Covid Positivity Rate | Fresh Infections, days D, D+1, D+2 | Fresh Tests, days D-1, D, D+1 | Spot Positivity on Day D |
|---|---|---|---|---|---|---|---|---|---|
| 20-Jun | 56746 | 351909 | 308688 | 2971 | 16162 | 18.38% | 9539 | 42729 | 22.32% |
| 21-Jul | 125096 | 851311 | 851311 | 6550 | 44571 | 14.70% | 3617 | 52432 | 6.90% |
| 20-Aug | 157354 | 1375193 | 1375193 | 8238 | 72000 | 11.44% | 3877 | 55554 | 6.98% |
| 6-Sep | 191449 | 1780512 | 1780512 | 10024 | 93221 | 10.75% | 8942 | 97895 | 9.13% |

Table 2: Cumulative and Spot Covid Positivity Rates on Selected Dates
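The 20-Jun row of Table 2 can be reproduced with the sketch below. It assumes the 97008-test reduction is prorated in proportion to the cumulative test count as of 13th July; that reading is an assumption, chosen because it matches the adjusted figure in the table. The function name `adjusted_tests` is illustrative.

```python
POPULATION_MN = 19.1
TESTS_AT_RECONCILIATION = 789_853   # cumulative tests by 13th July
REDUCTION = 97_008                  # tests removed on reconciliation


def adjusted_tests(cumulative_tests):
    """Scale a pre-reconciliation cumulative test count down proportionally."""
    return cumulative_tests * (1 - REDUCTION / TESTS_AT_RECONCILIATION)


tests_20_jun = adjusted_tests(351_909)
print(round(tests_20_jun))                   # adjusted tests till 20-Jun
print(round(tests_20_jun / POPULATION_MN))   # tests per million
print(f"{56_746 / tests_20_jun:.2%}")        # cumulative positivity rate
```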
We wish to adjust the Infected Cases to account for Testing Volumes and Positivity, i.e., we want to forecast how many additional Infected Cases would be detected with an increased volume of tests. We deal with two different forces at play. In a short time frame of a day or two, additional tests up to a point would detect Covid-positive patients at the same rate as the Spot Positivity Rate; tests beyond that point would begin to detect more negative cases, reducing Spot Positivity. Over a longer time frame, fresh infections would emerge at a rate increasing or decreasing depending upon the disease trajectory in the community. In both the short run and the long run, it is difficult to forecast the outcome in terms of additional infected cases detected.
In perhaps the only study of its kind, Favero(17) identifies a statistical basis to adjust case counts with respect to testing volume by adjusting for current positivity rate. By Favero’s rule, the Total Outbreak number, or Adjusted Infected Cases, depends on the Spot Positivity at the time and is expressed as:
Adjusted Infected Cases = Actual Infected Cases x [1 + (Positivity Rate x 100 x Constant)], where Constant lies between 0.01 and 0.02
For example, assuming the constant to be 0.02, for 2971 cases per million at a Spot Covid Positivity Ratio of 22.32%, the Adjusted Confirmed Cases works out to 4297 per million on June 20. Obviously, the Adjusted Infected Cases or Total Outbreak Number will match the testing strategy adopted. If the strategy is to test only 18-year-olds, the total outbreak number will only be with respect to 18-year-olds. Alternatively, if the strategy is to test high-incidence areas, the total outbreak number will reflect only those high-incidence areas. The Total Outbreak Number or Adjusted Infected Cases is not a miracle formula for the total confirmed cases in the world!
We assume a constant = 0.02 for our exercise. Given positivity 22.32% on 20th June and 6.9% on 21st July, actual infected cases will rise to 4297 and 7454 per million, instead of the initial scores of 2971 and 6550 per million, respectively.
Removing Sampling and Diagnostic Kit Errors & Adjusting for Testing Volumes:
(Per million population, corresponding seroprevalence data approx. 15 days later)
Case 1: As of June 20, the number of adjusted infected cases (virus test) was 4297, and the number of antibody-test-positive cases was 228,300 (53x).
Case 2: As of July 21, the number of adjusted infected cases (virus test) was 7454, and the number of antibody test-positive cases was 278,400 (37x).
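The testing-volume adjustment and the resulting multiples can be reproduced as follows. This is a minimal sketch of the formula as stated above with Constant = 0.02; the function and variable names are illustrative.

```python
def adjusted_infected(actual_per_mn, spot_positivity, constant=0.02):
    """Adjusted cases = actual x [1 + (positivity x 100 x constant)]."""
    return actual_per_mn * (1 + spot_positivity * 100 * constant)


# date: (virus-test positives per mn, spot positivity, antibody positives per mn)
surveys = {
    "20-Jun": (2971, 0.2232, 228_300),
    "21-Jul": (6550, 0.0690, 278_400),
}

for date, (cases, positivity, antibody_pos) in surveys.items():
    adjusted = adjusted_infected(cases, positivity)
    print(date, round(adjusted), f"{antibody_pos / adjusted:.0f}x")
```

The adjustment is larger for the first survey because Spot Positivity was more than three times higher on 20th June than on 21st July.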
Analysis of Differences in Seroprevalence Multiples Over Two Studies
On 20th June, there were 53 times more antibody-positive cases than recorded virus positives, and by 21st July, this multiple had gone down to 37x. Between the two dates, virus-tested cases increased by 3157 per million, but antibody-positive cases increased by only 16 times that number (50100 per mn), not by 53 times as would be expected. This seems to mimic the linear equation y = mx + c, where the antibody positives (y) equal a linear increase mx (x being the virus-positives, and m being 16 above), plus a constant c.
This phenomenon is explained if there is a proportion of the population that has pre-existing SARS-CoV-2 antibodies without having gone through the disease. If say, 150,000 per million have pre-existing antibodies (15%), then those developing antibodies after undergoing disease will roughly be a constant multiple of virus-positive cases. Our research question – why is there a drop in the multiples between two studies – would be answered by the existence of a population with pre-existing antibodies, and the multiple would then not change between studies.
We develop this model analytically and then solve for the values:
We fit the data from the two surveys after all statistical adjustments:
[Virus-tested-positive] x F = [Seroprevalence -X]
4297 x F = 228300 -X … from Survey 1, where figures are per million population, and F is unit-less
and, 7454 x F = 278400 -X … from Survey 2
Solving the set of simultaneous equations leads to X = 160150, and F = 15.86
A total of 160150 per million, or 16.02% of the population, have pre-existing SARS-CoV-2 antibodies without having undergone the disease. Every virus-tested positive case represents 15.9 Covid-infected people, implying that 14.9 people were uncounted as possibly asymptomatic cases.
The discussion so far presumes the presence of pre-existing SARS-CoV-2 antibodies. However, these could equally be other antibodies cross-reactive with the SARS-CoV-2 antigen. This has been frequently reported in recent literature. Both Van der Heide(18) and Ma et al(19) in research published in June 2020 and August 2020 report the cross-reactivity of endemic human coronavirus (HCoV) antibodies against SARS-CoV-2, in one case as high as 10% among individuals not exposed to SARS-CoV-2. Pre-existing cross-reactive antibodies mean that antibodies generated after some other infection are effective against SARS-CoV-2. This is more likely than individuals who acquired precise SARS-CoV-2 antibodies without going through the disease.
To understand the range of variations in X (% of population with pre-existing antibodies) and F (Amplification Factor for any virus-tested positive case), we repeat the calculations with a different set of data. We use the figures prior to adjustment for testing volumes, which also helps us understand the impact of Test Volume adjustment. Fitting the data:
2971 x F = 228300 -X … from Survey 1, where figures are per million population, and F is unit-less
and, 6550 x F = 278400 -X … from Survey 2
implies, X = 186706 (18.67%), and F = 14.0
The results vary within a small range, with or without adjustments for testing volume. The pre-existing antibody coverage varies between 16% and 18.7%, while the amplification factor varies between 15.9 and 14.
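Both fits above are two-equation, two-unknown linear systems, so they can be solved directly. The sketch below is illustrative (the function name `fit_two_surveys` is not from the study). It carries F unrounded, so its X values differ by a few tens per million from the figures quoted above, which use F rounded to two decimals.

```python
def fit_two_surveys(v1, s1, v2, s2):
    """Solve v * F = s - X for two surveys; returns (X, F).

    v: virus-tested positives per mn, s: seroprevalence per mn.
    """
    f = (s2 - s1) / (v2 - v1)   # slope: antibody positives per detected case
    x = s1 - v1 * f             # intercept: pre-existing antibody positives
    return x, f


x_adj, f_adj = fit_two_surveys(4297, 228_300, 7454, 278_400)  # volume-adjusted
x_raw, f_raw = fit_two_surveys(2971, 228_300, 6550, 278_400)  # before adjustment

print(f"adjusted:   X = {x_adj / 1e6:.1%}, F = {f_adj:.1f}")
print(f"unadjusted: X = {x_raw / 1e6:.1%}, F = {f_raw:.1f}")
```

With only two surveys the line is fitted exactly, so the estimates carry no measure of uncertainty; the spread between the two fits is the only indication of their sensitivity.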
This study reached the following conclusions:
These conclusions are subject to the limitations of this study described below and in the context of Delhi State only. Since the climatic, ecological, genetic and economic conditions of a community may impact the elements of this research, further work will be needed to check the applicability of these results to other geographical areas.
Based on the foregoing analysis, we forecast the outcome of the serosurvey scheduled to have taken place at Delhi between 1st and 5th September, for which the latest infections were on 20th August, and whose results are expected around September 22nd, one week later than the date of this report. We expect seroprevalence to be approximately 31%-34% for this survey. Anything vastly different from this would require re-examination of this study or thorough scrutiny of the serosurvey techniques.
Pre-existing seroprevalence and an amplification factor of invisible cases have implications for disease progression. For subsequent analysis, we will simplify to the following: 17% of the population has pre-existing antibodies, and every virus-tested positive stands for 15 virus-infected cases.
Limitations of this study
The study conclusions are significant despite limitations, and adequate care was exercised to make the conclusions quantitatively robust.
Implications: How Do Pre-existing Antibodies in 17% of the Population Impact Disease Progression?
The short answer to this question is that it helps, but we don’t know how much! Clearly, 17% are already immune and non-susceptible. Those who contract the disease and recover acquire antibodies and become non-susceptible. These two categories we have so far called seroprevalent. Additionally, there is another category that is immune without antibodies: those with various forms of adaptive immunity, such as memory T-cells, non-specific advantages of BCG vaccination, etc. We have no idea of the proportion of such cases.
We know that the transmission efficiency (Reproduction Number) and the availability of a ready susceptible population are the two reasons that a pandemic grows. Similarly, two forces work together in unison to slow down a pandemic:
Decreasing Susceptible Population. It is intuitively obvious that a virus that has a finite lifetime in open air finds it difficult to locate a fresh host if more and more people are already non-susceptible: it dies during the search! This idea is rigorously presented in the differential equations that govern the Susceptible-Infected-Recovered (SIR) model of epidemiology. Virus transmission slows down progressively as the number of non-susceptibles increases: it is tougher for a virus to maintain its growth rate at 50% non-susceptible compared to 30%, and even tougher at 70%.
Chain-of-Transmission Protocols. Whether susceptible or immune, if individuals and communities work to block the virus from getting to them, in theory, viruses will find no hosts, and disease will end. Conceptually, if the non-susceptible population is one that cannot become infected, those who put up barriers to transmission make themselves unavailable for infection. Individual protocols to break the chain of transmission (distancing, masks, etc.) are well known. Less understood is the role of non-Covid and Covid “bubbles”. Monitored community pockets – e.g., gated residential complexes, old-age homes, monasteries – act as self-contained barriers to the chain of transmission. Some population subgroups (e.g., those with serious illness, superseniors and children < 5 yrs) form logical bubbles. Equally, it is possible to “bottle up” localized areas of the Covid outbreak by physical distancing and barriers – the so-called containment zone. These now become Covid bubbles, insulated from contaminating the community outside. All these categories together reduce the number available for the virus to infect. Since chain-of-transmission protocols are not perfectly maintained by all people at all times, particularly over long durations, the number available to infect keeps going up and down.
Working together in unison. Both these measures work in unison. The virus has to make its way and find hosts to infect among those available for infection. If a large enough share of the population, say 20%, effectively shields itself from the chain of transmission, then the virus has only 80% left to consider, of whom 40% may be non-susceptible. Viruses die in the effort to find hosts. Note that if 40% of people are immune and 50% of the population is well sheltered, the situation is as difficult for the virus as if 80% of people are immune.
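For reference, the SIR model mentioned above is usually written as the system below. This is the standard textbook form, not a derivation from this study: the symbols $\beta$ (transmission rate), $\gamma$ (recovery rate) and $N$ (population size) are the conventional ones. Treating a pre-existing immune fraction $p$ as a reduced initial susceptible pool, $S(0) \approx (1-p)N$, is an illustrative assumption.

```latex
\begin{align}
\frac{dS}{dt} &= -\beta \frac{S I}{N}, \\
\frac{dI}{dt} &= \beta \frac{S I}{N} - \gamma I, \\
\frac{dR}{dt} &= \gamma I, \\
R_{\text{eff}}(t) &= \frac{\beta}{\gamma} \cdot \frac{S(t)}{N}.
\end{align}
```

Since new infections are proportional to $S/N$, every increase in the non-susceptible share lowers the effective reproduction number $R_{\text{eff}}$, which is the formal version of the statement that growth is harder to sustain at 50% non-susceptible than at 30%.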
Implications: Disease Progression in Delhi
Fig 2 presents the Disease Progression graph in Delhi, as a 7-day Moving Average. We define that an outbreak has “reached a peak” in an area when daily fresh cases fall to 2/3rd of their highest value, and no daily number thereafter goes above 2/3rd of the peak value. The graph shows that Delhi’s highest daily caseload (7-day MA) was 3446, on 26th June. This became the peak when the daily caseload fell to 2237 on 6th July, i.e., below 2/3rd of the maximum value. However, after a long lull, daily caseloads rose again and touched 2404 on 5th September, going higher than 2/3rd of the maximum value and nullifying the peak found earlier. Delhi is an example in Covid-19 progression where a peak was achieved and then lost. Delhi has not yet peaked.
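The peak rule defined above can be written as a small check. This is a sketch under stated assumptions: the function name `has_peaked` is illustrative, and the series contains only the three 7-day-MA values quoted in the text, not the full daily series.

```python
def has_peaked(daily_ma, fraction=2 / 3):
    """True if the series fell to `fraction` of its maximum after the maximum
    and never rose above that cutoff again."""
    top = max(daily_ma)
    cutoff = fraction * top
    start = daily_ma.index(top) + 1
    for j in range(start, len(daily_ma)):
        if daily_ma[j] <= cutoff:
            # Peak is confirmed only if nothing later exceeds the cutoff.
            return all(v <= cutoff for v in daily_ma[j + 1:])
    return False


until_july = [3446, 2237]             # 26th June, 6th July
until_sept = [3446, 2237, 2404]       # ... plus 5th September

print(has_peaked(until_july))   # peak appears to have been reached
print(has_peaked(until_sept))   # peak nullified by the later rise
```

The cutoff here is 2/3 of 3446, about 2297, which 2237 falls below and 2404 exceeds.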
Table 3 provides serological and other data on select dates.
Delhi population = 19.1 mn. The first eight columns give caseload, tests and positivity; the last gives seroprevalence.

| As on Date | | Total Infections till date | Adjusted Tests till date | Total Infections per mn | Total Tests per mn | Cumulative Covid Positivity Rate | Spot Positivity on Day D | Positivity-adjusted Total Infections per mn | Expected Seroprevalence (+) |
|---|---|---|---|---|---|---|---|---|---|
| 20-Jun | 1st Survey | 56746 | 308688 | 2971 | 16162 | 18.38% | 22.32% | 4297 | 23.48%* |
| 26-Jun | Max Daily | 77240 | 402763 | 4044 | 21087 | 19.18% | 18.39% | 5331 | 25% |
| 21-Jul | 2nd Survey | 125096 | 851311 | 6550 | 44571 | 14.70% | 6.90% | 7454 | 29.10%* |
| 20-Aug | 3rd Survey | 157354 | 1375193 | 8238 | 72000 | 11.44% | 6.98% | 9388 | 31%-34% |
| 6-Sep | Latest date | 191449 | 1780512 | 10024 | 93221 | 10.75% | 9.13% | 11855 | 35%-39% |

(+) Visible 13 days after given date
* Announced

Table 3: Seroprevalence Values Expected 13 days after Key Dates
Note: The total immune % would include those with adaptive immunity, in addition to seroprevalent %. Assume any figure for this complete unknown.
The following observations are presented in the figure and table above:
It is obvious that the 17% population with pre-existing antibodies significantly reduces the duration of the total pandemic in Delhi, although the disease will continue for some months with a downward-sloping graph. However, without this number with pre-existing antibodies, the time to reach a decisive maximum (before plateauing or slowing) would have been extended by another 2 months.
STATEMENT OF FUNDING AND ABSENCE OF COMPETING INTERESTS
The author affirms that this research was entirely self-funded, and there is no conflict of interest or competing interest with any other entity or individual.