Cervical screening is a life-saving intervention, but as the results of this analysis show, it must be applied judiciously to achieve maximum benefit whilst minimising the impact of over-treatment in false positive cases. In this work, we examined the advantages and limitations of different screening methods. The improved confidence a patient gains from a negative primary HPV result, in comparison with cytology testing alone (performed at an equal frequency), is due to the HPV test’s improved sensitivity(27, 28). The minimal benefit accrued from co-testing in terms of sensitivity for treatable cervical cancers, as outlined in this work, means that the decision of which screening strategy to pursue is not only a scientific question but also one of appropriate allocation of limited public health resources(29). Primary HPV testing at a 3-year interval has been demonstrated to be at least equivalent to 5-yearly co-testing(30). Co-testing has other drawbacks too, as outlined in the results section and later in this discussion.
Primary testing(21, 31) produces an abundance of false positives, and this can certainly be reduced with triage testing, as depicted in Fig. 3. Both triage modalities (HPV with LBC reflex or LBC with HPV reflex) have only 10% of the false positive rate of LBC, while detecting 90% of the cases LBC would detect. As these approaches yield essentially the same result, a screening programme could implement whichever its resources allowed without ill effect. More importantly, triage outcomes are the same regardless of the primary test. As mentioned in the results section, this has economic implications. While the total number of tests required differs by only a small amount, if there are substantial cost differences between the two modalities, then the optimum could be selected to minimise costs without causing harm. If, for example, HPV tests were much costlier than LBC, then taking an LBC primary approach to triage would be cost-saving. Alternatively, if cytology were a limiting resource, then an HPV primary approach might be more suitable. A move towards HPV testing modalities will likely be strengthened given the WHO’s commitment to cervical cancer elimination.
The benefit of triage testing is the reduced number of excess colposcopies performed, but this comes at a slightly reduced detection rate. One potential approach to increase detection is to perform surveillance and expedited retesting of triage-negative women, and our results show that this approach should eventually detect most prevalent CIN2/3 cases. Another potential approach to maximise detection rate is to perform co-testing, also shown in Fig. 2. This results in an improved detection ratio relative to LBC primary (29% improvement) but at the cost of almost doubling the false positive ratio (95% increase). This is accordingly highly likely to prove excessively expensive, and ultimately detrimental to public health, as the increased rate of detection is associated with an amplified false positive rate. While this modality reduces missed cases of CIN2/3, this analysis suggests it would not be viable, resulting in needless harm, as other authors have warned(32). This raises important ethical questions regarding the safety of any screening programme. When an asymptomatic population is invited into a screening programme, there remains an ethical obligation that they exit the programme with a reduced cancer risk and minimal harm. Increasing referrals to colposcopy is likely to lead to over-treatment of dysplastic lesions, with associated impact on fertility and obstetric outcomes, including a 2-fold increased risk of preterm birth(33). Over-diagnosis has long been recognised as a serious issue(17, 34) with screening programmes, although it is one that remains difficult to quantify(35). The results of this work should be useful in elucidating potential harms and benefits.
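The trade-off between triage and co-testing described above can be illustrated with a minimal sketch. It assumes that the two tests are conditionally independent given disease status, which is a simplification and not necessarily the assumption made in the model of this work, since HPV and cytology results are correlated in practice. The `Assay` type, the function names, and the numerical values are hypothetical and chosen purely for illustration; they are not the figures reported in the results.

```python
from typing import NamedTuple


class Assay(NamedTuple):
    """Hypothetical per-test characteristics (illustrative values only)."""
    sensitivity: float          # P(test positive | CIN2/3 present)
    false_positive_rate: float  # P(test positive | CIN2/3 absent)


def triage(primary: Assay, reflex: Assay) -> Assay:
    """Serial testing: refer to colposcopy only if BOTH tests are positive.

    Assumes the two results are conditionally independent given disease status.
    """
    return Assay(
        primary.sensitivity * reflex.sensitivity,
        primary.false_positive_rate * reflex.false_positive_rate,
    )


def cotest(a: Assay, b: Assay) -> Assay:
    """Parallel testing: refer if EITHER test is positive (same assumption)."""
    return Assay(
        1.0 - (1.0 - a.sensitivity) * (1.0 - b.sensitivity),
        1.0 - (1.0 - a.false_positive_rate) * (1.0 - b.false_positive_rate),
    )


if __name__ == "__main__":
    # Hypothetical inputs, not taken from the paper.
    lbc = Assay(sensitivity=0.75, false_positive_rate=0.05)
    hpv = Assay(sensitivity=0.95, false_positive_rate=0.10)

    strategies = {
        "LBC primary": lbc,
        "HPV primary": hpv,
        "HPV with LBC reflex": triage(hpv, lbc),
        "LBC with HPV reflex": triage(lbc, hpv),
        "Co-testing": cotest(hpv, lbc),
    }
    for label, s in strategies.items():
        print(f"{label:<22} sensitivity={s.sensitivity:.3f} "
              f"FPR={s.false_positive_rate:.4f}")
```

Under this simplification the triage result does not depend on which test is performed first, because both products are symmetric, while co-testing raises sensitivity and the false positive rate together.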
While the model presented here is useful for quantifying detection statistics, it is important to consider the limitations of this analysis. For the lifetime false positive and false negative probabilities, we did not model natural history, and there is an implicit assumption that test results are independent of previous test results. The model is not adapted to predict the accuracy of screening at different ages, or to assess the risk of progression or regression of CIN between screening opportunities. However, the results are likely a good approximation for the worst-case scenarios, i.e. the risk for a woman with persistent CIN2/3 and the risks of screening for a woman who remains negative throughout her screening journey. The independence assumption requires consideration, as it is plausible that some CIN2/3 lesions may simply never be detected with cytology or HPV testing due to characteristics of the lesion, such as low volume or low viral load. This influences the cumulative probability of a false negative/positive.
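To make the independence assumption concrete, the following sketch computes cumulative probabilities over a screening lifetime for the two worst-case scenarios mentioned above. It is an illustration under stated assumptions rather than the model used in this work: the screening ages (25 to 65), the sensitivity and specificity values, and all function names are hypothetical.

```python
def n_screens(start_age: int, end_age: int, interval_years: int) -> int:
    """Number of screening rounds, with one at start_age and one every interval after."""
    return (end_age - start_age) // interval_years + 1


def cumulative_false_positive(specificity: float, rounds: int) -> float:
    """P(at least one false positive) for a woman who is truly negative at every round.

    Assumes each round is independent of the previous ones.
    """
    return 1.0 - specificity ** rounds


def cumulative_missed(sensitivity: float, rounds: int) -> float:
    """P(a persistent CIN2/3 lesion is missed at every round), same assumption."""
    return (1.0 - sensitivity) ** rounds


if __name__ == "__main__":
    SENSITIVITY, SPECIFICITY = 0.90, 0.95  # hypothetical values, not from the paper
    for interval in (1, 3, 5):
        rounds = n_screens(25, 65, interval)  # assumed screening ages
        fp = cumulative_false_positive(SPECIFICITY, rounds)
        # Probability that a persistent lesion is still undetected after two rounds.
        missed_twice = cumulative_missed(SENSITIVITY, 2)
        print(f"interval={interval}y rounds={rounds} "
              f"P(>=1 false positive)={fp:.3f} "
              f"P(missed at 2 consecutive rounds)={missed_twice:.3f}")
```

If some lesions are systematically undetectable, successive results are positively correlated and the true probability of repeated misses would be higher than the value this independence-based calculation returns.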
A crucial point to acknowledge is that all screening modalities have inherent limitations: those which maximise detection are most likely to lead to false positives, and those which reduce the incidence of false detection also reduce CIN2/3 detection. However, there are serious caveats to this that must be considered. Considering HPV triage with LBC screening in Table 4, it is immediately apparent that expedited re-testing of HPV positive results outside of the regular screening cycle (6–18 months) goes a long way towards ameliorating the reduced detection ratio of triage tests, whilst minimising false positives and excess colposcopy referrals. This analysis also suggests that LBC-only retesting of triage results tends to detect less disease than HPV retesting, or both HPV and LBC retesting.
In designing a screening programme, one must be cognisant of the potential harms as much as the benefits. The advent of HPV testing has huge implications for cervical screening(11, 36), which are quantified further in this paper. The question of testing intervals was beyond the scope of this work, but was briefly alluded to in the analysis of false positive and false negative cumulative probability illustrated for 1-year, 3-year, and 5-year intervals in Figs. 2 and 3. A recent French study(37) found that reducing the testing interval does more harm than good, leading to over-screening with needless risk, excess costs, and over-treatment. Other authors(28) have suggested that re-screening after a negative primary HPV screen should occur no sooner than every 3 years, with Dillner et al.(29) reporting that intervals of even 6 years were safe and effective. The 2018 US guidelines for HPV screening recommend a minimum interval of 5 years(26) between routine screening tests. As this analysis illustrates, we would expect in most cases that extending this interval has only a minuscule impact on the missed CIN2/3 rate, while substantially reducing false positives.
As HPV testing becomes cheaper and more common, it is vital to consider how it is best implemented in screening. Evidence from recent multicentre studies(5, 38) indicates that HPV-based screening provides greater protection against invasive cervical carcinomas than cytology. In those studies, the recorded cumulative incidence of cervical cancer was lower 5.5 years after a negative HPV test than 3.5 years after a negative cytology result. This indicates empirically that 5-year intervals for HPV screening are safer than 3-year intervals for cytology. Results in this work support the hypothesis that HPV screening every 5 years could reduce the number of unnecessary colposcopy and biopsy procedures compared to frequent cytology, cutting both costs and needless invasive interventions. There is also ample evidence that negative HPV tests provide greater reassurance of low abnormality risk than negative cytology results(28, 39), with authors suggesting that primary HPV screening can be considered as an alternative to current US cytology-based cervical cancer screening methods. Certainly, the results of this analysis support the contention that HPV testing can strongly increase the predictive power of screening tests, and when correctly deployed can also reduce the potential harms of over-screening.
It is vital to look towards the future of cervical screening. The staggering international success of the HPV vaccine is already apparent(40), and countries with high uptake of the HPV vaccine are already seeing a precipitous drop in rates of precancer and abnormal cervical cells. The falling prevalence of CIN2/3 has deep implications for how we interpret future tests. As Fig. 3 demonstrates, the chief impact of falling HPV infection rates is that, across all modalities, positive results are less likely to be informative. Conversely, our confidence in negative results increases as vaccination rates increase. It is clear that HPV testing is superior as infection rates fall, resulting in far fewer false positives relative to LBC testing. This is likely to be important in planning the future evolution of screening programmes. The model outlined in this work has application here too, and can be employed to predict the confidence one should afford a particular screening result under varying levels of population prevalence.
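The dependence of a result's informativeness on prevalence follows directly from Bayes' theorem, and can be sketched as below. This is a generic illustration, not the model developed in this work; the function names and the sensitivity, specificity, and prevalence values are hypothetical.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value: P(disease present | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Negative predictive value: P(disease absent | negative test)."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    SENSITIVITY, SPECIFICITY = 0.90, 0.90  # hypothetical test characteristics
    for prevalence in (0.05, 0.02, 0.01, 0.005):  # hypothetical prevalence values
        print(f"prevalence={prevalence:.3f} "
              f"PPV={ppv(SENSITIVITY, SPECIFICITY, prevalence):.3f} "
              f"NPV={npv(SENSITIVITY, SPECIFICITY, prevalence):.5f}")
```

For fixed test characteristics, the positive predictive value falls and the negative predictive value rises as prevalence decreases, which is the pattern described above for vaccinated populations.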
Any alteration to screening programmes must be accompanied by education of women and provision of accurate information on its impact. This need is illustrated by the psychosocial impact of adding HPV testing to LBC, or replacing LBC with it, for primary screening or for triage, which can cause additional stress and anxiety for those participating in the screening programme(41–43). Unfortunately, a discrepancy exists between society’s expectation of screening programmes and their actual sensitivities, demonstrating the importance of public education(44). It is worth noting too that physicians and healthcare professionals are frequently under-informed about the benefits and limitations of screening programmes(45, 46), and confusion can easily arise. While screening is an extraordinary measure that saves lives, it is important to understand its fundamental limitations and nuances, so that maximum benefit can be derived from any national programme and misunderstandings minimised.
Cervical screening comes with some inherent uncertainty, irrespective of the modality employed. While this work should help elucidate some optimum strategies for screening, the reality is that screening, while life-saving, cannot be expected to be perfect. It is worth being clear that perfect detection is a mathematical impossibility, and this is demonstrated in the appendix. There is an inherent trade-off in strategies to increase detection, as they inevitably lead to a disproportionate rise in false positives, with needless over-treatment. This is particularly relevant in the context of legal requirements in some jurisdictions, such as Ireland, where, following legal action over missed cancer diagnosis, the High Court ruled that screeners must have ‘absolute confidence’ in negative results(10), despite multiple investigations showing that the labs in question were operating to a high standard and that no negligence was committed.
Such a stipulation is impossible to satisfy, and as this analysis shows, even striving to get arbitrarily close to this standard is likely to result in more harm than good. This is neither conducive to public health, nor sustainable. It also has the potential to muddy public expectation and understanding of screening, and what it can realistically achieve. Screening is a vital undertaking if we are to reduce cervical cancer mortality, and its strengths and limitations must be seen in context so that benefit can be maximised. The results of this analysis should prove useful in optimising approaches and demonstrating the complexities of different implementations so that informed decisions can be made.