We evaluated the accuracy of Ultima in comparison with MTB Plus and Ultra on sputum from people with symptoms of TB in a setting with a high burden of HIV and TB. Our key findings are: 1) the proportion of unsuccessful results is substantial (~10% with Ultima) and results in missed TB diagnoses, although retesting halves the number of participants who do not receive a result; 2) Ultima sensitivity was similar to that of MTB Plus and Ultra; 3) Ultima specificity was low (~85%), resulting in approximately 1 in 5 positive results being false-positive (not associated with previous TB); 4) lot variation in MTB Plus and Ultima performance was observed; and 5) MTB-RIF Dx must be done immediately on eluted DNA, has approximately double the unsuccessful rate (19%) on Ultima-positive compared with MTB Plus-positive DNA (even when done fresh) and, owing to a low probability of success, should not be done on samples with an MTB Plus or Ultima semi-quantitation classification of “very low”. Together, these data have implications for Truenat adoption.
There are, to our knowledge, no published data on Ultima’s accuracy on sputum. In addition to evaluating Ultima, our study expands the evidence base for MTB Plus, for which data from PLHIV and people with a history of TB were scarce (8, 11). Although we pre-selected specimens from participants with successful Ultra and culture results, a noteworthy proportion of unsuccessful results prior to repeat testing was observed for MTB Plus (17%) and Ultima (10%), consistently over the testing period and without any temporal association. This parallels other studies (8, 11, 25), which reported invalid MTB Plus results ranging from 9% to 18%. Importantly, we show that more TB cases are missed because the test is unsuccessful than because it is false-negative, highlighting the importance of quantifying unsuccessful results in test evaluations. This point is emphasised in the recently updated WHO TPP, which defines an acceptable unsuccessful result rate as 3-5% (26).
Molbio recommends retesting using the same eluate when the initial TB result is unsuccessful (12, 19). Our data support this: upon retesting, 60% of initially unsuccessful MTB Plus eluates became successful (54% for Ultima). Retesting would increase the number of people diagnosed, as some might not otherwise return to give another sputum specimen. One possible reason why retesting the same eluate succeeded is that, at initial testing, the DNA eluate was not sufficiently mixed with the lyophilised pellet containing the PCR reagents. Molbio recommends allowing this mixture to stand for 30-60 seconds to achieve a clear solution before proceeding (12, 19). However, as the only factor that differed upon retesting was time, the manufacturer should consider extending the standing duration. Before adopting retesting, laboratories would need to factor in cost and workload.
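The effect of a single retest on the overall unsuccessful rate can be sketched with simple arithmetic. The snippet below is illustrative only: it uses the rounded percentages quoted in the text, assumes every initially unsuccessful eluate is retested exactly once, and the function name `residual_unsuccessful` is hypothetical.

```python
def residual_unsuccessful(initial_rate: float, retest_success: float) -> float:
    """Proportion of tests still without a result after one retest.

    initial_rate: proportion unsuccessful at first testing.
    retest_success: proportion of those that succeed when retested.
    """
    if not (0.0 <= initial_rate <= 1.0 and 0.0 <= retest_success <= 1.0):
        raise ValueError("rates must be proportions between 0 and 1")
    return initial_rate * (1.0 - retest_success)


# Rounded figures from the text: (initial unsuccessful rate, retest success).
assays = {"MTB Plus": (0.17, 0.60), "Ultima": (0.10, 0.54)}
residuals = {name: residual_unsuccessful(*rates) for name, rates in assays.items()}

for name, value in residuals.items():
    print(f"{name}: {value:.1%} still unsuccessful after one retest")
```

Under these assumptions the residual proportions are about 6.8% for MTB Plus and 4.6% for Ultima, which can be read against the 3-5% range in the WHO TPP (26).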
MTB Plus and Ultima had 84% and 90% sensitivity, respectively, compared with 92% for Ultra. Our MTB Plus sensitivity estimate is similar to those reported from other high HIV-prevalence settings. Among all participants, sensitivity was similar between MTB Plus and Ultima, but Ultima had higher sensitivity than MTB Plus in PLHIV. Ultra sensitivity was higher than that of MTB Plus among all participants, consistent with previous findings from Peru (8), but there was no difference among PLHIV. These data address the shortage of MTB Plus and Ultima data in PLHIV.
Ultima had lower specificity than Ultra, which has similar amplification targets (IS6110, IS1081). This was despite Ultima and MTB Plus (which did not show low specificity in the same people, despite also having a step where the tube is open) being done in parallel, at the same time, and in the same quality-assured laboratory. Importantly, this finding persisted when Ultima was evaluated against an eMRS that included Ultra. Furthermore, unlike what we previously described for Xpert and Ultra (4, 27, 28), diminished specificity was not associated with previous TB. This specificity finding, which translates into low PPV for Ultima even in our high burden setting (more than 3/10 positives false-positive per MRS, 2/10 per eMRS), necessitates further investigation, especially if Ultima is to be applied in settings where the pre-test probability of disease is lower. In the only other comparison of Ultra and Ultima (on tongue swabs), Ultima specificity was also lower than that of Ultra.
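The dependence of PPV on pre-test probability follows from Bayes' rule and can be sketched as below. This is an illustration, not a reanalysis: the sensitivity (90%) and specificity (85%) are the rounded Ultima figures quoted in the text, whereas the prevalence values are hypothetical and are not estimates from this study.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive results expected; PPV is undefined")
    return true_pos / (true_pos + false_pos)


SENSITIVITY = 0.90  # rounded Ultima sensitivity from the text
SPECIFICITY = 0.85  # rounded Ultima specificity from the text

# Hypothetical pre-test probabilities, chosen only to show the trend.
for prevalence in (0.30, 0.10, 0.02):
    value = ppv(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {value:.0%}")
```

With specificity held fixed, PPV falls steeply as prevalence decreases, which is why the specificity finding matters most for lower-burden settings.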
We noted clinically important performance variation for MTB Plus and Ultima associated with lot number, both in terms of unsuccessful results and false positivity. Similar challenges have been reported for the SILVAMP TB-LAM test (FujiLAM; Fujifilm, Tokyo, Japan), which led to the test’s postponement (29, 30). Critically, stratification of performance data by lot is neither in TB study guidance (31) nor part of the STARD criteria (22). Our data suggest that such stratification should be incorporated, including in evidence review processes for policy making. Lastly, the variation in Ultima lot performance may be due to the product not yet being commercially available. Tightening of manufacturer quality control processes may be needed.
Our study’s primary purpose was not to assess MTB-RIF Dx’s sensitivity for rifampicin susceptibility, which requires further evaluation in people with presumed drug-resistant TB. However, we showed that, when MTB-RIF Dx is applied to Ultima-positive rather than MTB Plus-positive eluates, unsuccessful results are more likely (almost all Ultima-positive “very lows” were MTB-RIF Dx unsuccessful). This is likely because such people were positive exclusively based on amplification of the multicopy gene target (IS1081), which MTB Plus (and MTB-RIF Dx) does not include. Lastly, it remains possible that, as for TB detection, MTB-RIF Dx unsuccessful results may partly resolve upon retesting. Although we did not evaluate this, such a strategy would need to factor in the elevated risk of unsuccessful results associated with non-same-day testing.
This study addresses a critical research gap by evaluating new and existing Truenat tests for TB detection in a cohort with many PLHIV (the largest to date). Limitations include the use of biobanked samples for Truenat testing. Truenat tests with unsuccessful results were repeated using the same DNA eluate. Although Molbio recommends repeating the test on a fresh sample, we show that repeating from the same DNA eluate is useful (and our approach is likely more feasible where specimen re-collection is not possible). Although testing was performed in a well-resourced research setting with machines calibrated according to the manufacturers’ recommendations (with the Cepheid and Molbio tests done years apart), we experienced high rates of unsuccessful results for MTB Plus and Ultima, even though sputa from people with an unsuccessful Ultra result were excluded. Further monitoring of, and research into, the extent of these unsuccessful results is required, including in different settings.
In summary, Truenat MTB Plus and Ultima are alternative TB sputum tests that met WHO’s minimum sensitivity threshold for sputum-based tests for culture-positive TB. Ultima has improved sensitivity compared with MTB Plus in PLHIV. However, Ultima’s suboptimal specificity, lot variation, and the relatively high proportion of unsuccessful results (also for same-day MTB-RIF Dx testing) require careful further investigation.