Serial SARS-CoV-2 Seroprevalence Studies in Delhi July-August 2020: Indications of Pre-existing Cross-reactive Antibodies and Implications for Disease Progression

Two seroprevalence studies were undertaken in Delhi, the city-state capital of India, in July-August 2020, exactly one month apart, to test for SARS-CoV-2 antibodies. Virus-tested (mostly RT-PCR) caseloads corresponding to these surveys, as of 13 days earlier to allow for antibody generation, were compared. The survey conducted June 26-July 10 (sample size 21387) showed 23.48% seroprevalence (extrapolated to 4.48 mn of Delhi's population of 19.1 mn), which was 79 times higher than the corresponding virus-tested positives, totaling 56746. The survey conducted August 1-7 (15311 samples) showed 29.1% antibody-positive (5.56 mn population), which was 44x the virus-tested positive total of 125096. Noting that all serological surveys worldwide have shown antibody-positives to be higher than virus-test positives by multiples of 7x to 80x, this study seeks to examine why the multiple should decline so drastically in one month, from 79x to 44x. Statistical adjustments were performed for Sampling Error and Sensitivity/Specificity of the diagnostic kits. Indigenously developed COVID KAVACH ELISA tests for IgG antibodies to the SARS-CoV-2 virus were used for the surveys. Significantly, statistical adjustments were also made to account for the Testing Volumes and (Spot) Positivity rates at the two different times. [Spot Positivity is defined in the study and is the closest estimate of current or fresh positivity.] After all statistical revisions, the antibody-positive to virus-test positive multiples stood at 53x and 37x for the two surveys. Calculating across the two sets of data, and with other sensitivity analysis, the study indicates that a significant proportion of the population, to the extent of 16%-19%, has pre-existing cross-reactive antibodies (possibly to the HCoV viruses) that are seropositive in SARS-CoV-2 antibody tests.
The study also infers that there is an Amplification Factor of 15 in the Delhi serostudies: i.e., each virus-test positive represents 14 more who are possibly asymptomatic and untested. The study forecasts a seroprevalence of 31%-34% for the 3rd serial serosurvey scheduled in September, whose results are expected on 22nd September. Limitations of the study are discussed, notably the absence of any research paper on the survey techniques, antibody testing controversies, and the statistical adjustment for Testing Volumes. The study discusses how Chain-of-Transmission protocols and Decreasing Susceptible Population work in unison to slow down a pandemic, and analyses the disease progression graph of Delhi in that context. The implications of 16%-19% pre-existing antibodies on disease progression in Delhi are discussed.
[We define the term Spot Positivity to mean the Current or Fresh Positivity Rate of virus tests; Spot Positivity on Day D is defined as (Total of Fresh Cases on Days D, D+1, D+2) divided by (Total of Fresh Tests conducted on Days D-1, D and D+1).]


Main Text
The National Centre for Disease Control (NCDC) India conducted a serological survey for IgG antibodies to the SARS-CoV-2 virus in the state of Delhi, India (a city-state and the capital of the country) between June 26 and July 10, 2020 (1), median date July 3. A second serological survey was conducted by the Government of Delhi on the same population from August 1-7, 2020 (2), median date August 3. Table 1 below provides details of these surveys and results, considering Delhi's population as 19.1 million (3). The median time to develop IgG antibodies is 13 days from symptom onset (4), so these serosurveys correspond to those infected by the virus latest by June 20th and July 21st.
Table 1 shows that while there were 56746 infected cases (2971 per mn), all tested for virus positivity by RT-PCR with negligible RAT testing, there were 4.48 mn (234,800 per mn) seropositive cases, i.e., positive for the presence of SARS-CoV-2 antibodies. By conventional wisdom, 4.48 mn residents had contracted Covid-19 and had developed antibodies thereafter. Those antibody-positive cases were 79 times higher than virus-tested cases. These were "invisible" cases: not only were they sufficiently asymptomatic not to require medical attention, but they had also not been tested as primary contacts of infected cases. Similarly, for the second survey, there were 6550 virus-positive cases per million, while there were 291,000 per million seropositive, a multiple of 44x.
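The per-million figures and multiples quoted above follow from simple arithmetic on the survey results. The short Python sketch below recomputes them from the percentages, case totals and population stated in the text; the function and variable names are illustrative only and are not part of the original study.

```python
POPULATION = 19_100_000  # Delhi population assumed in the text (19.1 million)

def per_million(count, population=POPULATION):
    """Convert an absolute count into a rate per million population."""
    return count / population * 1_000_000

def seropositive_multiple(seroprevalence_pct, virus_positive_count, population=POPULATION):
    """Ratio of extrapolated antibody-positive cases to virus-tested positives."""
    seropositive = seroprevalence_pct / 100 * population
    return seropositive / virus_positive_count

survey1 = seropositive_multiple(23.48, 56_746)   # first survey, cases as of June 20
survey2 = seropositive_multiple(29.1, 125_096)   # second survey, cases as of July 21
print(round(per_million(56_746)), round(survey1))
print(round(per_million(125_096)), round(survey2))
```

Rounded to whole numbers, this gives 2971 and 6550 cases per million and multiples of 79 and 44, matching the figures in the text.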

The Research Question
Although it is surprising that seropositive cases should be much higher than those recorded infected by virus testing, this result is consistent with other studies across the world (see following section). However, it is intriguing that the gap between serosurvey positives and virus-test positives should fall so much, from 79 times down to 44 times, in the course of 1 month.
The objective of this study is not to establish that antibody-positive cases are higher in Delhi than recorded-infected cases or to examine why the multiples are as high as 79x, though the study will provide some answers to these questions. The objective of this study is to scrutinize why the multiple of antibody-positive cases over virus-tested positive cases fell from 79x in June-July to 44x exactly one month later.
This phenomenon is counterintuitive. If there are "invisible" asymptomatic untested cases who have recovered and developed antibodies, as high as 79 times the recorded numbers at any point of time, this multiple should remain similar at second and subsequent surveys. If the natural phenomenon is that invisible cases infect others because they are without quarantine or isolation, they should continue doing so at all times.
One possibility is that testing volumes were low until June 20th and, consequently, a number of symptomatic cases went untested in that period but soon fully recovered, with active antibodies that showed up in the first survey. A look at the data leads to the intuitive conclusion that virus-tested positives would not have gone up to comparable levels even if testing volumes had been higher before the first serosurvey. We establish this intuitive conclusion through the statistical revisions we perform, using testing volumes and spot positivity, to "equate" the two surveys on this parameter.
There is no debate that the total number of those who go on to develop antibodies will always be higher than the recorded positive cases. Many infected cases, mostly asymptomatic, are untested. Testing capacity and protocols are not designed to hunt for asymptomatic cases. Practical testing logistics limit testing to symptomatic cases and their primary contacts and exclude asymptomatic cases and secondary contacts. For these and other reasons, there will always be a number of "invisible" Covid-19 cases in the community.
This has been established time and again. All serological surveys conducted so far have indicated an antibody-positive number in excess of virus-tested positive cases, with the range varying between 6x and 80x. Among a few such, a study in Gangelt, Germany, in March 2020 (5) revealed a 7-fold higher seroprevalence than confirmed infected cases. A widely cited study in The Lancet (6) conducted in Geneva during April-May reached the definitive conclusion that antibody-positive cases were 11.6x higher than virus-tested positive cases. A study conducted at 10 diverse sites in the USA between Mar-May 2020 (7) showed an average gap of 38x between seroprevalent cases and recorded-infected cases (counted 7 days prior to antibody testing); the multiple varied widely across the 10 sites. A study in Spain involving 61075 samples conducted in April-May (8) showed seroprevalence between 3.7% and 6.2% and an antibody-positive figure that is at least 19x the virus-positive cases (after extrapolating the math in the paper). Several other studies (8) report seroprevalence data without comparing with corresponding recorded-infected cases; if computed, these would also reveal significantly higher multiples of antibody-positive cases. It can be inferred that an unstated informal consensus is that seroprevalent cases 10x-15x higher than recorded infected cases are not unusual.
Interestingly, two of the studies cited above have reported a drop between two serial serostudies (in Geneva and at some sites in the USA), but these have been considered non-typical aberrations in only one subsequent round of testing. These have been seen in light of the possibility that antibodies may decrease over time. This is an unresolved question, with other research upholding both sides of the argument, and we will not factor in the possibility of antibody decrease in our study.

Background: A Basic View of Viral Dynamics and Antibody Generation
With a large number of asymptomatic cases, the classical picture of exposed → incubation → onset → mild/moderate/severe disease → resolution is now inadequate. A detailed treatment of viral dynamics is not warranted in this study, but the relevant context is presented in Fig 1 below. This is a simplified schematic that shows the time relationships between the disease (detection and later), infectivity and antibody generation. The schematic is simplified by the use of median values when each element is actually a probability distribution. Exceptions arising from some recent research (e.g., no antibody generation) are omitted. Viral shedding is detected by RT-PCR testing; however, this oversensitive test will also detect viruses that are not alive (cannot be cultured) and hence do not contribute to active disease in the patient. The patient remains infective as long as the virus is live, and there is generally a phase-out of infectivity simultaneously with a phase-in of seroconversion (antibody generation). Fig 1 below is self-explanatory.

Statistical Adjustments Prior to Analysis
Raw data regarding total infected cases and seroprevalence for the two serosurveys are given below:
Situation, originally (per million population, corresponding seroprevalence data approx. 15 days later)
Case 1: As of 20th June, the total Covid-infected (virus test) was 2971, and the antibody-test positive was 234,800 (79x).
Case 2: As of 21st July, the total Covid-infected (virus test) was 6550, and the antibody-test positive was 291,000 (44x).
Sampling Error. Extrapolating readings from a sample to the population may result in errors. The sample sizes for the two surveys were 25387 and 15311, respectively, with results 23.48% and 29.1% positive. The Adjusted Wald approach (12) adjusts for the population extrapolation by computing a confidence interval with a 95% confidence level of this assertion. The confidence intervals were [22.96% - 24.01%] for the first survey and [28.51% - 29.95%] for the second survey. We take the lower bound in both cases to avoid inflating the sampling bias. The resultant revision is below.
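As an illustration of the sampling-error step, the sketch below computes a 95% interval for a proportion. It assumes that the "Adjusted Wald" approach refers to the common Agresti-Coull form (adding z²/2 successes and z² trials before applying the Wald formula); the source does not spell out the exact variant, so this is an assumption, and the function name is hypothetical.

```python
import math

def adjusted_wald_interval(p_hat, n, z=1.959964):
    """Adjusted Wald (Agresti-Coull) interval for a proportion.

    p_hat: observed proportion, n: sample size, z: normal quantile (95% by default).
    Assumption: this is the variant meant by "Adjusted Wald" in the text.
    """
    successes = p_hat * n
    n_adj = n + z ** 2                        # adjusted number of trials
    p_adj = (successes + z ** 2 / 2) / n_adj  # adjusted proportion
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half_width, p_adj + half_width

# First survey: 23.48% positive in a sample of 25387
low, high = adjusted_wald_interval(0.2348, 25387)
print(f"{low:.2%} to {high:.2%}")
```

For the first survey, this lands close to the interval quoted above; the lower bound is the value carried forward in the revision.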
Situation, after removing the sampling error (per million population, corresponding seroprevalence data approximately 15 days later)
Case 1: As of 20th June, the total Covid-infected (virus test) was 2971, and the antibody-test positive was 229,600 (77x).
... when the Government of Delhi notified that "reconciliation with ICMR figures" had led to a reduction of 97008 tests cumulatively (16). We have prorated this reduction across all previous days from 12th July. Data details are provided in Table 2 below for some key dates, including 20th June and 21st July, the two dates for virus-tested positive cases that correspond to the two serosurveys. We also provide in the ...
We wish to adjust the Infected Cases to account for Testing Volumes and Positivity, i.e., we want to forecast by how much the Infected Cases would grow with an increased volume of tests. We deal with two different forces at play. In a short time frame of a day or two, additional tests would, up to a point, detect Covid-positive patients at the same rate as the Spot Positivity Rate; tests beyond that point would begin to detect more negative cases, reducing Spot Positivity. Over a longer time frame, fresh infections would emerge at a rate increasing or decreasing depending upon the disease trajectory in the community. In both the short run and the long run, it is difficult to forecast the outcome in terms of additional infected cases detected.
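Spot Positivity, as defined in this study, is the total of fresh cases on Days D, D+1 and D+2 divided by the total of fresh tests conducted on Days D-1, D and D+1. A minimal Python sketch of that definition follows; the function name and the sample series are hypothetical and serve only to show the indexing.

```python
def spot_positivity(fresh_cases, fresh_tests, day):
    """Spot Positivity on index `day` of two aligned daily series.

    Cases are summed over days D, D+1, D+2 and tests over days D-1, D, D+1.
    """
    if day < 1 or day + 2 >= len(fresh_cases) or day + 1 >= len(fresh_tests):
        raise ValueError("day needs one earlier and two later days of data")
    cases = sum(fresh_cases[day:day + 3])      # D, D+1, D+2
    tests = sum(fresh_tests[day - 1:day + 2])  # D-1, D, D+1
    return cases / tests

# Hypothetical daily series, for illustration only (not Delhi data).
cases = [100, 120, 150, 160, 170]
tests = [1000, 1000, 1100, 1200, 1300]
print(spot_positivity(cases, tests, day=1))
```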
In perhaps the only study of its kind, Favero (17) identifies a statistical basis to adjust case counts with respect to testing volume by adjusting for the current positivity rate. The Total Outbreak Number will match the testing strategy adopted. If the strategy is to test only 18-year-olds, the total outbreak number will only be with respect to 18-year-olds. Alternatively, if the strategy is to test high-incidence areas, the total outbreak number will reflect only those high-incidence areas. The Total Outbreak Number or Adjusted Infected Cases is not a miracle formula for the total confirmed cases in the world!
We assume a constant = 0.02 for our exercise. Given a positivity of 22.32% on 20th June and 6.9% on 21st July, the adjusted infected cases rise to 4297 and 7454 per million, instead of the initial figures of 2971 and 6550 per million, respectively.
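The adjustment formula itself does not survive in this text. As a back-calculation only, the stated figures are reproduced by scaling recorded cases by (1 + constant × positivity in percent). The sketch below uses that inferred linear form; it is an assumption derived from the numbers above and should not be read as the cited method, and the function name is hypothetical.

```python
def adjust_for_testing(cases_per_million, positivity_pct, k=0.02):
    """Scale recorded cases by (1 + k * positivity in percent).

    Assumption: this linear form is inferred from the figures in the text,
    not taken from the cited method.
    """
    return cases_per_million * (1 + k * positivity_pct)

print(round(adjust_for_testing(2971, 22.32)))  # as of 20th June
print(round(adjust_for_testing(6550, 6.9)))    # as of 21st July
```

With the constant at 0.02, the two calls return 4297 and 7454 per million after rounding, the adjusted figures used in the rest of the analysis.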
Removing Sampling and Diagnostic Kit Errors & Adjusting for Testing Volumes: (Per million population, corresponding seroprevalence data approx. 15 days later) Case 1: As of June 20, the number of adjusted infected cases (virus test) was 4297, and the number of antibody-test-positive cases was 228,300 (53x).
Case 2: As of July 21, the number of adjusted infected cases (virus test) was 7454, and the number of antibody test-positive cases was 278,400 (37x).

Analysis of Differences in Seroprevalence Multiples Over Two Studies
On 20th June, there were 53 times more antibody-positive cases than recorded virus positives, and by 21st July, this multiple had gone down to 37x. Between the two dates, virus-tested cases increased by 3157 per million, but antibody-positive cases increased by only 16 times that number (50100 per mn), not by 53 times as would be expected. This mimics the linear equation y = mx + c, where the antibody positives (y) equal a linear term mx (x being the virus positives, with m = 16 as above) plus a constant c.
This phenomenon is explained if there is a proportion of the population that has pre-existing SARS-CoV-2 antibodies without having gone through the disease. If, say, 150,000 per million have pre-existing antibodies (15%), then those developing antibodies after undergoing the disease will roughly be a constant multiple of virus-positive cases. Our research question (why is there a drop in the multiples between two studies?) would be answered by the existence of a population with pre-existing antibodies, and the multiple would then not change between studies.
We develop this model analytically and then solve for the values:
1. Let X be the number per million within the population with pre-existing SARS-CoV-2 antibodies, or equivalent, such that they test positive in Covid-19 antibody tests without having undergone the disease. Equivalently, X divided by 10,000 is the percentage of the population that is non-susceptible, and (100 - this %) is the percentage of the population that is susceptible.
2. Seroprevalence (expressed as a number found antibody-positive per million), less X, is the actual number of seroprevalent individuals who developed antibodies after contracting the disease. This follows from (1) ...
The discussion so far presumes the presence of pre-existing SARS-CoV-2 antibodies. However, these could equally be other antibodies cross-reactive with the SARS-CoV-2 antigen. This has been frequently reported in recent literature. Both Van der Heide (18) and Ma et al (19), in research published in June 2020 and August 2020, report the cross-reactivity of endemic human coronavirus (HCoV) antibodies against SARS-CoV-2, in one case as high as 10% among individuals not exposed to SARS-CoV-2. Pre-existing cross-reactive antibodies mean that antibodies generated after some other infection are effective against SARS-CoV-2. This is more likely than individuals having acquired precise SARS-CoV-2 antibodies without going through the disease.
To understand the range of variations in X (% of population with pre-existing antibodies) and F (Amplification Factor for any virus-tested positive case), we repeat the calculations with a different set of data. We use the figures prior to adjustment for testing volumes, which also helps us understand the impact of the Test Volume adjustment. Fitting the data, where figures are per million population and F is unit-less:
2971 × F = 228300 - X (from Survey 1)
6550 × F = 278400 - X (from Survey 2)
This implies X = 186706 (18.67%) and F = 14.0.
The results vary within a small range, with or without adjustments for testing volume. The pre-existing antibody coverage varies between 16% and 18.7%, while the amplification factor varies between 15.9 and 14.
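The two-survey fit is a pair of linear equations in two unknowns, so F is the slope between the two surveys and X is the intercept. The sketch below solves it for both data sets quoted in this paper (before and after the testing-volume adjustment); the function name is illustrative and not from the original study.

```python
def solve_preexisting(survey1, survey2):
    """Solve cases * F = seropositive - X for F and X from two surveys.

    Each survey is (virus_positive_per_million, seropositive_per_million).
    """
    (c1, s1), (c2, s2) = survey1, survey2
    f = (s2 - s1) / (c2 - c1)   # amplification factor (slope)
    x = s1 - c1 * f             # pre-existing seropositives per million (intercept)
    return f, x

# Before the testing-volume adjustment
f_raw, x_raw = solve_preexisting((2971, 228300), (6550, 278400))
# After the testing-volume adjustment
f_adj, x_adj = solve_preexisting((4297, 228300), (7454, 278400))
print(f"unadjusted: F = {f_raw:.1f}, X = {x_raw / 10000:.1f}%")
print(f"adjusted:   F = {f_adj:.1f}, X = {x_adj / 10000:.1f}%")
```

Solving without intermediate rounding gives values within a few units per million of those quoted, which is why the results are best read as a range (roughly 16% to 19% for X and 14 to 16 for F) rather than as point estimates.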

Conclusions
This study reached the following conclusions:
1. Approximately 16%-19% of the population possesses pre-existing antibodies, possibly cross-reactive antibodies of HCoV or other viruses, which provide immunity against the SARS-CoV-2 virus. These individuals test seropositive for SARS-CoV-2 IgG antibodies without undergoing the disease.
2. Every virus-tested positive case represents a larger number of virus-infected cases, uncounted because they are asymptomatic or otherwise untested. Every virus-tested positive case represents 14-16 cases. This finding formalises what has been observed in all serostudies.
These conclusions are subject to the limitations of this study described below and in the context of Delhi State only. Since the climatic, ecological, genetic and economic conditions of a community may impact the elements of this research, further work will be needed to check the applicability of these results to other geographical areas.
Based on the foregoing analysis, we forecast the outcome of the serosurvey scheduled to have taken place in Delhi between 1st and 5th September, for which the latest infections were on 20th August, and whose results are expected around September 22nd, one week after the date of this report. We expect seroprevalence to be approximately 31%-34% for this survey. Anything vastly different from this would require re-examination of this study or thorough scrutiny of the serosurvey techniques.
Pre-existing seroprevalence and an amplification factor of invisible cases have implications for disease progression. For subsequent analysis, we will simplify to the following: 17% of the population has pre-existing antibodies, and every virus-tested positive stands for 15 virus-infected cases.
Limitations of this study
6. Covid-19 data is of low quality worldwide, with reporting delays, poor data discipline, and more.
The study conclusions are significant despite limitations, and adequate care was exercised to make the conclusions quantitatively robust.
Implications: How Do Pre-existing Antibodies in 17% of the Population Impact Disease Progression?
The short answer to this question is that it helps, but we don't know how much! Clearly, 17% are already immune and non-susceptible. Those who contract the disease and recover acquire antibodies and become non-susceptible. These two categories we have so far called seroprevalent. Additionally, there is another category of immunity without antibodies: those with various forms of adaptive immunity, such as memory T-cells, non-specific advantages of BCG vaccination, etc. We have no idea of the proportion of such cases.
We know that the transmission efficiency (Reproduction Number) and the availability of a ready susceptible population are the two reasons that a pandemic grows. Similarly, two forces work in unison to slow down a pandemic: a decreasing susceptible population and breaking the chain of transmission. On the first, it is intuitively obvious that a virus that has a finite lifetime in open air finds it difficult to locate a fresh host if more and more people are already non-susceptible; it dies during the search! This idea is rigorously presented in the differential equations that govern the Susceptible-Infected-Recovered (SIR) model of epidemiology. Virus transmission slows down progressively as the number of non-susceptibles increases: it is tougher for a virus to maintain its growth rate at 50% non-susceptible compared to 30%, and even tougher at 70%.
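To make the SIR argument concrete, the following sketch integrates the standard SIR equations (dS/dt = -βSI, dI/dt = βSI - γI) with a simple forward-Euler step and compares the peak infected fraction with and without an initially non-susceptible share of the population. The values of beta, gamma and the initial infected fraction are hypothetical, chosen only for illustration; they are not fitted to Delhi data and do not come from this study.

```python
def sir_peak_infected(immune_fraction, beta=0.4, gamma=0.2, i0=1e-4, days=730, dt=0.1):
    """Integrate the SIR equations with forward Euler; return the peak infected fraction.

    beta, gamma and i0 are hypothetical values chosen only for illustration.
    """
    s = 1.0 - immune_fraction - i0   # susceptible fraction at the start
    i = i0                           # infected fraction at the start
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt   # from dS/dt = -beta * S * I
        recoveries = gamma * i * dt          # from dR/dt = gamma * I
        s -= new_infections
        i += new_infections - recoveries
        peak = max(peak, i)
    return peak

print(sir_peak_infected(0.0), sir_peak_infected(0.17))
```

Under these assumed parameters, the run with 17% of the population initially non-susceptible reaches a lower peak than the run with none, which is the qualitative effect described above.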

Breaking the Chain of Transmission
Whether susceptible or immune, if individuals and communities work to block the virus from getting to them, in theory, viruses will find no hosts, and disease will end. Conceptually, if the non-susceptible population is one that cannot become infected, those who put up barriers to transmission make themselves unavailable for infection. Individual protocols to break the chain of transmission (distancing, masks, etc.) are well known. Less understood is the role of non-Covid and Covid "bubbles". Monitored community pockets (e.g., gated residential complexes, old-age homes, monasteries) act as self-contained barriers to the chain of transmission. Some population subgroups (e.g., those with serious illness, super-seniors and children < 5 yrs) form logical bubbles. Equally, it is possible to "bottle up" localized areas of the Covid outbreak by physical distancing and barriers, the so-called containment zone. These now become Covid bubbles, insulated from contaminating the community outside. All these categories together reduce the number available for the virus to infect. Since chain-of-transmission protocols are not perfectly maintained by all people at all times, particularly over long durations, the number available to infect keeps going up and down.
Working together in unison. Both these measures work in unison. The virus has to make its way and find hosts to infect among those available for infection. If a large enough population, say 20%, effectively shields itself from the chain of transmission, then the virus has only 80% left for consideration, out of whom 40% may be non-susceptible. Viruses die in the effort to find hosts. Note that if 40% of people are immune and 50% of the population is well sheltered, the situation is as difficult for the virus as if 80% of people are immune.

Implications: Disease Progression in Delhi
Fig 2 presents the Disease Progression graph in Delhi, 7-day Moving Average. We define that an outbreak has "reached a peak" in an area when daily fresh cases fall to 2/3rd of the highest value, and no daily number thereafter goes above 2/3rd of the peak value. The graph shows that Delhi's highest daily caseload was on 26th June, when the 7-day MA was 3446. This became the peak when the daily caseload fell to 2237 on 6th July, i.e., below 2/3rd of the maximum value. However, after a long lull, daily caseloads rose again and touched 2404 on 5th September, going higher than 2/3rd of the maximum value and nullifying the peak found earlier. Delhi is an example in Covid-19 progression where a peak was achieved and then lost. Delhi has not yet peaked.
3. Unlock 3.0, the reduction of containment zones and lockdown fatigue after 6 intense weeks began on approximately 1st ... The consequences are visible from 18th Aug, when the disease graph shot up wildly. Migration of patients from areas around Delhi would have added to the problem.
4. The seroprevalence of ~38% corresponding to 6th Sept (when undertaken on 20th Sept), combined with other immune categories, is reaching levels where virus transmission rates will soon slow down automatically. By current trends, the graph should see a plateauing and downturn starting in about two weeks, and this will be a permanent downward trajectory because of epidemiological considerations.
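The two-thirds peak rule used in this section can be expressed as a small check on a series of daily (or moving-average) case counts. The sketch below is one possible reading of the rule; the function name is hypothetical, and the two sample series are abbreviated to the three values quoted in the text (3446, 2237, 2404) rather than the full Delhi series.

```python
def has_peaked(daily_cases, fraction=2 / 3):
    """Apply the two-thirds rule to a series of daily (or moving-average) cases."""
    if not daily_cases:
        return False
    highest = max(daily_cases)
    threshold = fraction * highest
    after_high = daily_cases[daily_cases.index(highest) + 1:]
    # Find the first day on which cases have fallen to the threshold or below.
    for offset, value in enumerate(after_high):
        if value <= threshold:
            # The peak holds only if no later value climbs back above the threshold.
            return all(v <= threshold for v in after_high[offset:])
    return False

print(has_peaked([3446, 2237]))        # status as of 6th July
print(has_peaked([3446, 2237, 2404]))  # status after 5th September
```

The first call reports a peak and the second does not, mirroring the "peak achieved and then lost" pattern described for Delhi.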
It is obvious that the 17% of the population with pre-existing antibodies significantly reduces the duration of the total pandemic in Delhi, although the disease will continue for some months with a downward-sloping graph. However, without this number with pre-existing antibodies, the time to reach a decisive maximum (before plateauing or slowing) would have been extended by another 2 months.

STATEMENT OF FUNDING AND ABSENCE OF COMPETING INTERESTS
The author affirms that this research was entirely self-funded, and there is no conflict of interest or competing interest with any other entity or individual.