This study focused on characterizing clinical Mtb isolates in real-world settings and hence has limitations compared to observational studies, including inadequate sample size and imprecise or missing information (e.g., HIV infection status). First, we only used spoligotypes to assign clusters, which limits and biases the clustering resolution and might over-estimate clustering rates. The second limitation relates to the small sample size and misclassification of EPTB disease sites, which has notoriously confounded comparison of EPTB incidence between studies [38]. Previous studies have identified disease site-specific risk factors, including those involving certain Mtb genotypes, drug resistance and meningeal TB, which we could not reproduce in our study, since only five (7%) meningeal TB isolates and one (1%) pericardial TB isolate were enrolled [39-43]. Nonetheless, we used validated RVCT methods to allow comparisons between studies. Third, the Mtb isolates were not serialized, and information on the drug therapy received, TB drug doses and the timing of isolate collection relative to TB therapy was not available. This made it difficult to distinguish primary (transmitted) resistance from acquired resistance in our study [44]. Incomplete medical history on the laboratory request forms also made it difficult to determine which isolates were from patients immunosuppressed by HIV or by concomitant immunosuppressive agents used for rheumatological diseases. Nonetheless, with ML modelling, which is better suited to missing data or highly complex data structures, we were able to demonstrate that routinely collected laboratory and clinical data can be used to screen patients and identify risk groups in which acquired drug resistance is most likely to occur. Gradient boosting allows identification of weak predictors, nonlinear relationships and thresholds in the data space [32], which is like the proverbial “finding a needle in a haystack”. 
In this case, one uses giant magnets to find that needle. Sensitivities and specificities >84% are reasonable and acceptable, given that the information required for initial screening (i.e., identifying the disease site as lymphadenitis, cutaneous TB or meningitis) can be ascertained by history and clinical examination. Moreover, ROC values of ~70% with cross-validation reassure us that results such as these are likely to be reproducible in similar populations. Unlike most EPTB studies, which were performed at large specialized hospitals [45, 46], our study has minimal referral bias; a further strength of this study is therefore that the isolates were from primary DOTS/TB facilities and not from patients treated at tertiary specialized facilities.
There are three notable findings from our study with important implications for public health policy and for TB control efforts that target reduction of both disease transmission and drug resistance. The findings are certainly applicable in the Tshwane metropolitan area and have potential relevance to similar urban populations in South Africa and in other sub-Saharan African metropolises. The first key finding is that key EPTB disease sites (mainly lymphatic, cutaneous and meningeal TB) and spoligotypes are associated, in a hierarchical and nonlinear manner, with both disease transmission and drug resistance. The association between Beijing strains and both TB disease transmission and drug resistance has been widely studied in South Africa and across the world, with mixed results [9-11, 15, 47-49]. Our study demonstrates that those relationships are conditional, complex and characterized by several nonlinear interactions (Figures 3 and 4). For example, two-way interactions between EPTB disease site and another variable explained >20% of the variance in clustering and almost 10% of that in drug resistance. This means that unless those nonlinear relationships are fully examined, inferences about the purported factors driving either transmission or drug resistance will be highly biased or wrong. In fact, for both clustering and drug resistance, the impact of Mtb genotypes is of second order, suggesting that some mycobacterial genotypes have an increased propensity to act on some EPTB disease sites and a lower propensity to act on others. The differential impact of EPTB disease site on any drug resistance (shown in Figure 4C) and on MDR-TB/rifampin monoresistance (shown in Figure 4D) is revealing and consistent with standard PK/PD principles underlying the emergence of drug resistance [25, 50-52]. PK variability between individuals means that some patients will have faster drug clearance than others when given the same drug dose. 
Therefore, inadequate drug exposure at the site of infection, which occurs because of PK variability, suboptimal drug doses or poor drug penetration into protected sites such as the meningeal or pericardial spaces, leads to selection of drug-resistant or drug-tolerant isolates. The selected mutants eventually acquire putative mutations over time. In other words, acquired drug resistance (ADR) arises de novo during therapy, primarily because of inadequate dosing or unoptimized therapy regimens. The WHO recommends the same standardized therapy regimens and doses for EPTB as for PTB, with the caveat, based on expert opinion, that longer therapy durations be given for meningeal and bone/joint disease [53]. Indeed, these same guidelines are used in Tshwane. As shown in Figures 4C-D, the standard WHO-recommended EPTB treatment regimen is associated with drug resistance at certain EPTB sites, such as the lymphatic, cutaneous and meningeal sites. In this study, the attributable risks for any resistance and for MDR-TB and/or rifampin monoresistance were substantial: 0.25 and 0.64, respectively. The corollary of this finding is that the majority of the resistance observed in our study was more likely acquired during therapy than ‘pre-existing’ or primary. The NNS for targeted screening among EPTB patients based on disease site was 4 for any resistance and 2 for MDR-TB and/or rifampin monoresistance, which is even more efficient and effective than the widely recommended population screening for active TB in congregate settings or among select risk groups, such as patients with diabetes mellitus or HIV [54]. For example, the NNS among HIV-infected patients to find one active TB case is 25 (range 11-144) in regions with low TB incidence and 10 (range 5-22) in regions with high TB incidence, while that among prisoners is 520 (range 69-427) and 43 (range 21-123), respectively.
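The NNS values quoted above are consistent with taking the reciprocal of the attributable risk and rounding up to the next whole patient. The short Python sketch below illustrates that arithmetic only. The function name `number_needed_to_screen` is hypothetical, and the reciprocal-and-round-up convention is an assumption about how the NNS was derived, not a description of the analysis code used in this study.

```python
import math


def number_needed_to_screen(attributable_risk: float) -> int:
    """Return the NNS as the reciprocal of the attributable risk, rounded up.

    Assumption: NNS = ceil(1 / attributable risk). Other conventions exist,
    so this is an illustration rather than the study's own method.
    """
    if not 0 < attributable_risk <= 1:
        raise ValueError("attributable risk must lie in the interval (0, 1]")
    return math.ceil(1 / attributable_risk)


# Attributable risks reported in the text.
nns_any_resistance = number_needed_to_screen(0.25)  # any drug resistance
nns_mdr_or_rif = number_needed_to_screen(0.64)      # MDR-TB and/or rifampin monoresistance

print(nns_any_resistance, nns_mdr_or_rif)
```

Under this convention, an attributable risk of 0.64 gives 1/0.64 = 1.5625, which rounds up to 2 patients screened per resistant case identified.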
Secondly, even though the proportions of EPTB disease sites were similar to previous observations, the overall incidence of proven EPTB, 4.43 per 100,000 population, was lower than expected. There were 8,034 microbiologically confirmed PTB cases in Tshwane in 2015, an estimated incidence rate of 254 (95% CI: 249-260) cases per 100,000 population [1]. Confirmed PTB status was based on positive GeneXpert MTB/RIF assays, cultures, line probe assays and microscopy smears, which probably overestimated confirmed PTB cases by counting nontuberculous mycobacteria cases identified on microscopy smears. Hoogendoorn et al. reviewed charts of patients treated and notified for clinical EPTB in the predominantly rural Limpopo province over 10 months of 2013 [3]. Of the 336 patients diagnosed, only 57% had good evidence for the stated diagnoses. Nonetheless, the overall estimated incidence of clinical EPTB in that study was 27.92 (95% CI: 24.80-31.23) and that of clinical meningeal TB was 2.56 (1.70-3.70) per 100,000 population per year. Meningeal TB comprised 9.82% (95% CI: 6.86-13.52) of EPTB in Limpopo and 9.04% (95% CI: 6.94-11.54) in Soweto, per year [3, 46]. Our estimate of EPTB incidence in Tshwane is six-fold lower than that reported from Limpopo; however, proven meningeal TB comprised 7.14% (2.36-15.89) of cases in Tshwane, suggesting that the meningeal TB proportions were similar between these disparate South African studies. In the US, EPTB as a proportion of total TB cases has been steadily increasing as TB elimination efforts have accelerated and the WHO TB elimination targets are being realized. From 7.6% in 1962, at the peak of the epidemic, when TB incidence was 28.6 per 100,000 population, EPTB increased to 15.7% in 1993 with the HIV resurgence and was 30.9% in 2017, when the reported TB incidence was 2.8 per 100,000 population [55]. 
Contrary to the explanations given by Hoogendoorn and others, we hypothesize that with the widespread use of laboratory methods to prove EPTB, the incidence will increase, consistent with observations in the US, where the majority of reported EPTB cases are proven TB. We argue that several cases currently reported as clinical EPTB by Hoogendoorn and others in South Africa and elsewhere in low-resource settings could be due to other bacterial infections or to systemic inflammation from HIV infection.
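The interval quoted above for the proportion of meningeal TB in Tshwane, 7.14% (2.36-15.89), is consistent with an exact binomial (Clopper-Pearson) 95% confidence interval for 5 events among 70 isolates. The Python sketch below, which uses only the standard library, illustrates that calculation. The denominator of 70 is inferred from the reported 7.14% and is not stated explicitly in the text; the choice of the Clopper-Pearson method is likewise an assumption, and the function names are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def bisect_decreasing(f, lo: float = 0.0, hi: float = 1.0, iters: int = 100) -> float:
    """Find the root of a decreasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    # Lower limit: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else bisect_decreasing(
        lambda p: alpha / 2 - (1 - binom_cdf(x - 1, n, p))
    )
    # Upper limit: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else bisect_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2
    )
    return lower, upper


# Assumed counts: 5 meningeal TB isolates out of an inferred total of 70.
events, total = 5, 70
point = events / total
lower, upper = clopper_pearson(events, total)
print(f"{100 * point:.2f}% ({100 * lower:.2f}-{100 * upper:.2f})")
```

The width of this interval reflects the small number of meningeal TB isolates, which is why the Tshwane proportion cannot be distinguished from those reported for Limpopo and Soweto.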
EPTB is generally paucibacillary, which means that there are usually too few Mtb bacilli in tissues for cultures to be obtained; histology samples are not easy to obtain and are therefore not routinely collected. Positivity rates of culture and histological examination of clinical samples, which are the gold standards for confirming EPTB, are notoriously low (about 15% in high TB burden areas) and inconsistent when compared against clinically suspected TB cases. Investments in improved diagnostics to confirm EPTB, or in ML algorithms trained on large clinical datasets to predict EPTB, will not only save lives by reducing unnecessary TB treatments but will also be cost-effective because of the reduction in TB transmission and ADR. Both interventions will accelerate progress towards the WHO TB elimination targets.
Finally, with regard to the heterogeneity of the Mtb spoligotypes causing EPTB, the general predominance of the Beijing clades (lineage 2) and the Euro-American lineage 4 within the Tshwane metropolis is in concordance with the work of others [19]. This is not surprising, since lineages 2 and 4 are thought to be the most successful strains among all the Mtb complex organisms in causing all forms of TB disease, including PTB [13, 18, 20]. Previous reports have associated Mtb lineages of the Beijing clade with major outbreaks in different parts of the world, and these lineages were shown to disseminate more rapidly and cause more severe disease than other strains [21-23]. Moreover, several other epidemiological studies suggest that certain Mtb genotypes, such as the W-Beijing genotypes, are more transmissible than others [20-22]. Our study found a statistically significant difference in the frequency of Beijing strains within chains of EPTB transmission when compared with the Euro-American and East African Indian strains, which might support the variable virulence hypothesis [23, 24].