Generalised Fiebig-like staging
The fundamental feature of the Fiebig staging system (2) is that it identifies a naturally-occurring
sequence of discordant diagnostic tests, which together indicate early clinical disease
progression. The approximate duration of infection can be deduced by analysing the
combination of specific assay results and assigning the appropriate “stage”.
As we demonstrate below, it is possible to interpret any combination of diagnostic test results into an estimated duration of infection, if
these tests have been independently benchmarked for diagnostic sensitivity (i.e. a
median or mean duration of time from infection to detectability on that assay has
been estimated). Unlike Fiebig staging, this more nuanced method allows the incorporation
both of results from any available test and of results from tests run
on specimens taken on different days.
In contrast to the usual statistical definition of ‘sensitivity’ as the proportion
of ‘true positive’ specimens that produce a positive result, we summarise the population-level
sensitivity of any particular diagnostic test into one or two ‘diagnostic delay’ parameters
(d and σ in Figure 1). Interpreted at the population level, a particular test’s sensitivity
curve expresses the probability that a specimen obtained at some time
t after infection will produce a positive result. The key features of a test’s sensitivity
curve (represented by the purple curve in Figure 1) are that:
there is effectively no chance of detecting an infection immediately after exposure;
after some time, the test will almost certainly detect an infection;
there is a characteristic time range over which this function transitions from close
to zero to close to one. This can be summarised as something very much like a mean
or median and a standard deviation.
Figure 1
By far the most important parameter is an estimate of ‘median diagnostic delay.’ In Figure 1, this is the parameter
d. If there were perfect test result conversion for all subjects (i.e. no assay ‘noise’),
and no inter-subject variability, this would reduce the smoothly varying purple curve
to a step function.
Various host and pathogen attributes, such as concurrent infections, age, pregnancy
status, the particular viral genotype, post-infection factors, etc., affect the performance
of a test for a particular individual. This determines a subject’s specific sensitivity
curve, such as one of the green curves in Figure 1, which capture the probability
that specimens from a particular subject will produce a positive diagnostic result. Because assay results are themselves not
perfectly reproducible even on the same individual, even these green curves do not
transition step-like from zero to one, but rather have some more finite window of
time over which they transition from close to zero to close to one.
To estimate individual infection times, then, one needs to obtain estimates of the
median diagnostic delays (i.e. the purple curve in Figure 1) for all tests occurring
in a data set, and then interpret each individual assay result as excluding segments
of time during which infection was not possible, ultimately resulting in a final inferred
interval of time during which infection likely occurred.
These calculations require that each individual has at least one negative test result
and at least one positive test result. In the primitive case where there is precisely
one of each, namely a negative result on a test with an expected diagnostic delay
of $d_1$ at $t_1$ and a positive result on a test with an expected diagnostic delay of
$d_2$ at $t_2$, then the interval is simply from $(t_1 - d_1)$ to $(t_2 - d_2)$. When there
are multiple negative results on tests at $t_i^{(-)}$, each with a diagnostic delay
$d_i^{(-)}$, and/or multiple positive results on tests at $t_j^{(+)}$, each with a
diagnostic delay $d_j^{(+)}$, then each individual negative or positive test result provides a candidate earliest
plausible and latest plausible date of infection. The most informative tests, then,
are the ones that most narrow the ‘infection window’ (i.e. result in the latest start
and earliest end of the window). In this case, the point of first ‘detectability’
refers to the time when the probability of infection being detected by an assay first
exceeds 0.5.
These remaining plausible ‘infection windows’ are usually summarised as intervals,
the midpoint of which is naturally considered a ‘point estimate’ of the date of infection.
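The date arithmetic described above can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's own code: the function name `infection_window`, the tuple layout and the diagnostic delays in the example are all hypothetical, and delays are rounded to whole days for simplicity.

```python
from datetime import date, timedelta

def infection_window(results):
    """Return (earliest plausible, latest plausible, midpoint) infection dates.

    `results` is an iterable of (test_date, diagnostic_delay_days, is_positive).
    Each negative result yields a candidate start of the window and each
    positive result a candidate end, both at test_date - diagnostic_delay.
    """
    starts = [d - timedelta(days=round(delay)) for d, delay, pos in results if not pos]
    ends = [d - timedelta(days=round(delay)) for d, delay, pos in results if pos]
    if not starts or not ends:
        raise ValueError("need at least one negative and one positive result")
    earliest = max(starts)  # latest candidate start narrows the window most
    latest = min(ends)      # earliest candidate end narrows the window most
    # Adding a timedelta to a date ignores the fractional-day part.
    midpoint = earliest + (latest - earliest) / 2
    return earliest, latest, midpoint

# Hypothetical history: two negative tests on one date, two positive tests later.
history = [
    (date(2017, 1, 10), 5, False),   # sensitive test, negative
    (date(2017, 1, 10), 20, False),  # less sensitive test, negative
    (date(2017, 3, 1), 10, True),    # sensitive test, positive
    (date(2017, 3, 1), 30, True),    # less sensitive test, positive
]
earliest, latest, midpoint = infection_window(history)
print(earliest, latest, midpoint)
```

In this example the window is bounded by the negative result on the more sensitive test and the positive result on the less sensitive test. The sketch does not check for inconsistent histories in which the computed start falls after the computed end.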
Figure 2 illustrates the way this method works, on a particular (hypothetical) individual.
Given two negative test results on one date and two positive test results on a later
date, a plausible infection window can be estimated using the diagnostic delays of
the assays in question ($d_1$, $d_2$, $d_3$ and $d_4$ in the figure). Note that it is the
most sensitive negative test and the least sensitive positive test that prove most
informative, by excluding the greatest periods of time during which infection could
not have occurred.
Figure 2
These infection intervals can be understood as plateaus on a very broadly plateaued
(rather than ‘peaked’) likelihood function, as shown in Figure 3. Given a uniform
prior, this can be interpreted as a Bayesian posterior, with a, b in Figure 3 showing
the 95% credibility interval (i.e. the interval encompassing
95% of the posterior probability density). Such a posterior, derived from an individual’s
diagnostic testing history, could also serve as a prior for further analysis if there
is an available quantitative biomarker for which there is a robustly calibrated maturation/growth
curve model. We do not deal with this in the present work, but it is explored elsewhere
(1), and is an important potential application of this framework and tool.
Figure 3
In Appendix A we derive a formal likelihood function – i.e. a formula capturing the
probability of seeing a data element or set (in this case, the set of negative and
positive test results), given hypothetical values of the parameter(s) of interest
– here, the time of infection. This interpretation of individual test results relies
on the assumption that test results are independent. Of course, the very factors that
influence the individual (green) sensitivity curves in Figure 1 suggest that strong
correlations between results of different tests on the same person are likely. Given
this, we further demonstrate in Appendix A when and how test correlation might influence
the analysis. While this method does not require a pre-set list of infection stages
dependent upon defined assay combinations (as with Fiebig staging), it does require
estimation of the diagnostic delay for each assay, either by sourcing direct estimates
of the diagnostic delay, or by sourcing such data for a biochemically equivalent assay.
Our online HIV infection dating tool, described below, is preloaded with diagnostic
delay estimates for over 60 HIV assays, and users can both add new tests and provide
alternative diagnostic delay estimates for those tests which are already included.
Implementation
The public online Infection Dating Tool is available at https://tools.incidence-estimation.org/idt/. The source code for the tool is available publicly under the GNU General Public
License Version 3 open source licence at doi:10.5281/zenodo.1488117. The user-facing web interface is described in a supplemental Appendix, Grebe et al 2019 Appendix.pdf.
In practice, the timing of infectious exposure is seldom known, even in intensive
studies, and studies of diagnostic test performance therefore provide relative times of test conversion (10-12). Diagnostic delay estimates are therefore anchored
to a standard reference event – the first time that a highly-sensitive viral load
assay with a detection threshold of 1 RNA copy/mL of plasma would detect an infection.
We call this the Date of Detectable Infection (DDI). The tool produces a point estimate of this date for each study subject, called
the Estimated Date of Detectable Infection (EDDI). Details and evaluation of the performance of the diagnostic delay estimates
underlying this tool compared with other methods for estimation of infection dates
are available elsewhere (1).
The key features of our online tool for HIV infection date estimation are that:
Users access the tool through a free website where they can register and maintain
a profile which saves their work, making future calculations more efficient.
Individual test dates and positive/negative results, i.e. individual-level ‘testing
histories’, can be uploaded in a single comma-delimited text file for a group of study
subjects.
Estimates of the relative ‘diagnostic delay’ between the assays used and the reference
viral load assay must be provided, with the option of using a curated database of
test properties which provides cited estimates for over 60 HIV assays.
If a viral load assay’s detection threshold is known, this can be converted into a
diagnostic delay estimate via the exponential growth curve model (1, 2). We assume
that after the viral load reaches 1 RNA copy/mL, viral load increases exponentially
during the initial ramp-up phase. The growth rate has been estimated at 0.35 log10 RNA copies/mL per day (i.e., a doubling time of slightly less than one day) (2).
The growth rate parameter defaults to this value, but users can supply an alternative
estimate.
Using the date arithmetic described above, when there is at least one negative test
result and at least one positive test result for a subject, the uploaded diagnostic
history results in:
a point estimate for the date of first detectability of infection (the EDDI);
an earliest plausible and latest plausible date of detectable infection (EP-DDI and
LP-DDI); and
the number of days between the EP-DDI and LP-DDI (i.e., the size of the ‘DDI interval’),
which gives the user a sense of the precision of the estimate.
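The viral load conversion mentioned in the list above amounts to a single line of arithmetic. The following sketch assumes, as the text states, exponential growth from 1 RNA copy/mL at 0.35 log10 copies/mL per day; the function name and its parameters are illustrative and are not taken from the tool's source code.

```python
import math

def viral_load_delay(threshold_copies_per_ml, growth_rate_log10_per_day=0.35):
    """Days from the reference event (1 copy/mL) until the viral load
    reaches a given assay's detection threshold, under exponential growth."""
    if threshold_copies_per_ml < 1:
        raise ValueError("threshold must be at least 1 copy/mL")
    if growth_rate_log10_per_day <= 0:
        raise ValueError("growth rate must be positive")
    return math.log10(threshold_copies_per_ml) / growth_rate_log10_per_day

# A hypothetical assay with a detection threshold of 50 copies/mL.
delay_50 = viral_load_delay(50)
# Doubling time implied by the default growth rate.
doubling_time = math.log10(2) / 0.35
print(round(delay_50, 2), round(doubling_time, 2))
```

An assay with a threshold of exactly 1 copy/mL has a delay of zero by construction, since it coincides with the reference event.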
Access / User profiles
Anyone can register as a user of the tool. The tool saves users’ data files as well
as their choices about which diagnostic delay estimates to use for each assay, both
of which are only accessible to the user who uploaded them. No personally-identifying
information is used or stored within the tool; hence, unless the subject identifiers
being used to link diagnostic results can themselves be linked to people (which should
be ruled out by pre-processing before upload) there is no sensitive information being
stored on the system.
Uploading diagnostic testing histories
A single data file would be expected to contain a ‘batch’ of multiple subjects’ diagnostic
testing histories. Conceptually, this is a table like the fictitious example in Table
1, which records that:
one subject (Subject A) was seen on 10 January 2017, at which point he had a detectable
viral load on an unspecified qualitative viral load assay, but a negative Bio-Rad Geenius™ HIV-1/2 Supplemental Assay (Geenius) result
another subject (Subject B) was screened negative using a point-of-care (PoC) rapid
test (RT) on 13 September 2016, and then, on 4 February 2017, was confirmed positive
by Geenius, having also tested positive that day on the PoC RT
Table 1
In order to facilitate automated processing, the tool demands a list of column names
as the first row in any input file. While extraneous columns are allowed without producing
an error, there must be columns named Subject, Date, Test and Result (not case sensitive). Data in the subject column is expected to be an arbitrary string
that uniquely identifies each subject. Dates must be in the standard ISO format (YYYY-MM-DD).
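As an illustration of these requirements, the sketch below parses a file of this shape using Python's standard library. It is not the tool's own parser. The test codes, the subject labels and the assumption that results are spelled "positive" or "negative" are hypothetical; the rows mirror the fictitious histories described for Table 1.

```python
import csv
import io
from datetime import date

REQUIRED = ("subject", "date", "test", "result")

def read_testing_histories(text):
    """Parse comma-delimited testing histories into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames is None:
        raise ValueError("input is empty")
    # Column names are matched without regard to case; extra columns are ignored.
    lookup = {name.strip().lower(): name for name in reader.fieldnames}
    missing = [col for col in REQUIRED if col not in lookup]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    rows = []
    for raw in reader:
        result = raw[lookup["result"]].strip().lower()
        if result not in ("positive", "negative"):  # assumed spelling of results
            raise ValueError("unrecognised result: " + result)
        rows.append({
            "subject": raw[lookup["subject"]].strip(),
            "date": date.fromisoformat(raw[lookup["date"]].strip()),  # YYYY-MM-DD
            "test": raw[lookup["test"]].strip(),
            "positive": result == "positive",
        })
    return rows

SAMPLE = """Subject,Date,Test,Result
A,2017-01-10,Qualitative VL,Positive
A,2017-01-10,Geenius,Negative
B,2016-09-13,PoC RT,Negative
B,2017-02-04,Geenius,Positive
B,2017-02-04,PoC RT,Positive
"""
rows = read_testing_histories(SAMPLE)
print(len(rows), rows[0])
```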
It is fundamental to the simplicity of the algorithm that assay results be either
‘positive’ or ‘negative’. There are a small number of tests, notably Western blot
and the Geenius, which sometimes produce ‘indeterminate’ results (partially, but not
fully, developed band pattern). Note that there is some lack of standardisation on
interpretation of the Western blot, with practice differing in the United States and
Europe, for example. While we provide default values for common Western blot assays,
users may enter appropriate estimates for the specific products and interpretations
in use in their specific context.
We now briefly reconsider Table 1 by adding the minor twist that the Geenius on Subject
B is reported as indeterminate. In this case, the data must be recorded as results
on either one or both of two separate tests:
a ‘Test-Indeterminate’ version of the test – which notes whether a subject will be
classified either as negative, or ‘at least’ as indeterminate; and
a ‘Test-Full’ version of the test, which determines whether a subject is fully positive
or not.
There is then no longer any use for an un-suffixed version of the original test. The
data from Table 1 is repeated in Table 2 with differences highlighted. The only changes
are the use of the Test-Indeterminate version for Subject A’s negative Geenius result
and an indeterminate Geenius result for Subject B. Note that even while Subject A’s
test results have not changed, their testing history now looks different, as completely
negative results are reported as being negative even for the condition of being indeterminate.
Subject B’s indeterminate result on 4 February requires two rows to record, one to
report that the test result is not fully negative (positive on ‘Geenius Indeterminate’),
and one to report that the result is not fully positive (negative on ‘Geenius Full’).
Once diagnostic delays are provided for these two sub-tests, the calculation of infection
dates can proceed without any further data manipulation on the part of the user.
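The recoding just described could be automated before upload along the following lines. This is a sketch under stated assumptions: the function name and the suffix spelling are hypothetical, and recording a fully positive result on the ‘Full’ version only is our illustrative choice (the text allows one or both sub-tests to be recorded).

```python
def expand_result(test, outcome):
    """Recode a three-valued outcome as rows on the two binary sub-tests.

    Returns a list of (test_code, result) pairs.
    """
    indeterminate_code = test + " Indeterminate"
    full_code = test + " Full"
    if outcome == "negative":
        # Fully negative: negative even for the condition of being indeterminate.
        return [(indeterminate_code, "negative")]
    if outcome == "indeterminate":
        # Not fully negative, and not fully positive: two rows are needed.
        return [(indeterminate_code, "positive"), (full_code, "negative")]
    if outcome == "positive":
        # Assumption: record a fully positive result on the 'Full' sub-test.
        return [(full_code, "positive")]
    raise ValueError("unrecognised outcome: " + outcome)

subject_a = expand_result("Geenius", "negative")
subject_b = expand_result("Geenius", "indeterminate")
print(subject_a)
print(subject_b)
```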
Table 2
Provision of test diagnostic delay estimates
As described above, tests are summarised by their diagnostic delays. The database
supports multiple diagnostic delay estimates for any test, acknowledging that these
estimates may be provisional and/or disputed. The basic details identifying a test
(i.e. name, test type) are recorded in a ‘tests’ table, and the diagnostic delay estimates
are entered as records in a ‘test-properties’ table, which then naturally allows multiple
estimates by allowing multiple rows which ‘link’ to a single entry in the tests table.
A test property entry captures the critical parameter of the ‘average’ (usually median)
diagnostic delay obtained from experimental data and, when available, a measure of
the variability of the diagnostic delay (denoted σ).
The system’s user interface always ensures that for each user profile, there is exactly
one test property estimate per test, chosen by the user, in use for infection dating calculations
at any point in time. Users need to ‘map’ the codes occurring in their data files
(i.e. the strings in the ‘Test’ column of uploaded data files) to the tests and diagnostic
delay estimates in the database, with the option of adding entirely new tests to the
database, which will only be visible to the user who uploaded them. The tool developers
welcome additional test estimates submitted for inclusion in the system-default tests/estimates.
Execution of infection dating estimation
The command button ‘process’ becomes available when an uploaded testing history has
no unmapped test codes. Pressing the button leads to values, per subject, for EP-DDI,
LP-DDI, EDDI, and DDI interval, which can be previewed on-screen and downloaded as
a comma-delimited file.
By default, the system employs simply the ‘average’ diagnostic delay parameter, in
effect placing the EP-DDI and LP-DDI bounds on the DDI interval where the underlying
sensitivity curve evaluates to a probability of detection of 0.5. When the size of
the inter-test interval (δ) is greater than about 20 times the diagnostic delay standard
deviation (σ), this encompasses more than 95% of the posterior probability.
As an additional option, when values for both d and σ are available, users may specify
a significance level (α), at which point the system will calculate the bounds of a
corresponding credibility interval. The bounds of the central 95% (in the case of
α = 0.05) of the posterior are then labelled the EP-DDI and LP-DDI.
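To make the credibility interval calculation concrete, the sketch below computes a central interval numerically on a grid of candidate infection times. It rests on assumptions that are ours rather than the tool's: each sensitivity curve is modelled as a normal cumulative distribution function with mean d and standard deviation σ, test results are treated as independent, the prior is uniform, and time is measured in days from an arbitrary origin. The likelihood actually used by the tool is the one derived in Appendix A, which may differ.

```python
import math

def detection_probability(elapsed_days, delay, sigma):
    """Assumed sensitivity curve: normal CDF with mean `delay` and s.d. `sigma`."""
    return 0.5 * (1.0 + math.erf((elapsed_days - delay) / (sigma * math.sqrt(2.0))))

def credibility_interval(tests, alpha=0.05, step=0.01):
    """Central (1 - alpha) interval for the infection time.

    `tests` is a list of (test_day, delay, sigma, is_positive) with sigma > 0.
    """
    adjusted = [day - delay for day, delay, _, _ in tests]
    pad = 6.0 * max(sigma for _, _, sigma, _ in tests)
    low, high = min(adjusted) - pad, max(adjusted) + pad
    n = int(round((high - low) / step)) + 1
    grid = [low + i * step for i in range(n)]
    weights = []
    for tau in grid:
        likelihood = 1.0
        for day, delay, sigma, is_positive in tests:
            p = detection_probability(day - tau, delay, sigma)
            likelihood *= p if is_positive else 1.0 - p
        weights.append(likelihood)
    total = sum(weights)
    cumulative, lower, upper = 0.0, None, None
    for tau, weight in zip(grid, weights):
        cumulative += weight / total
        if lower is None and cumulative >= alpha / 2:
            lower = tau
        if cumulative >= 1 - alpha / 2:
            upper = tau
            break
    return lower, upper

# Hypothetical history: negative on day 0 (d = 10, sigma = 2),
# positive on day 100 (d = 20, sigma = 2).
lower, upper = credibility_interval([(0, 10, 2, False), (100, 20, 2, True)])
print(round(lower, 1), round(upper, 1))
```

With a wide inter-test interval relative to σ, as in this example, the resulting bounds lie close to the simple bounds obtained from the median delays alone (days -10 and 80).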
Database Schema
This tool makes use of a relational database, which records information in a set of
linked tables, including:
subjects: This table captures each unique study subject, and after infection date estimation
has been performed, the subject’s EDDI, EP-DDI, LP-DDI and DDI interval size.
diagnostic_test_history: This table records each test performed, by linking to the subjects table and recording
a date, a ‘test code’, and a result. During the estimation procedure, a field containing
an ‘adjusted date’ is populated, which records the candidate EP-DDI (in the case of
a negative result) or LP-DDI (in the case of a positive result) after the relevant
diagnostic delay has been applied to the actual test date.
diagnostic_tests: This is a lookup table listing all known tests applicable to the current purposes
(both system-provided and user-provided).
test_property_estimates: This table records diagnostic delay estimates (system and user-provided). It allows
multiple estimates per test, with system default estimates flagged.
test_property_mapping: This table records user-specific mapping of test codes by linking each test code in
the diagnostic_test_history table to a test in the diagnostic_tests table, as well
as the specific test property estimate ‘in use’ by that user for the test in question.
A number of subsidiary tables also exist to manage users of the system and allow linking
of personal data files, maps, tests, and test property estimates to specific users.
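The relationships between these tables can be sketched as follows, using SQLite through Python's standard library. Only the table names come from the description above. Every column name, type and constraint is a hypothetical illustration, the user-management tables are omitted, and the tool's actual schema and database engine may differ.

```python
import sqlite3

# Hypothetical columns; only the five table names are taken from the text.
SCHEMA = """
CREATE TABLE subjects (
    id INTEGER PRIMARY KEY,
    subject_label TEXT NOT NULL,
    ep_ddi TEXT,
    lp_ddi TEXT,
    eddi TEXT,
    ddi_interval_days INTEGER
);
CREATE TABLE diagnostic_tests (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    test_type TEXT
);
CREATE TABLE test_property_estimates (
    id INTEGER PRIMARY KEY,
    test_id INTEGER NOT NULL REFERENCES diagnostic_tests(id),
    diagnostic_delay_days REAL NOT NULL,
    diagnostic_delay_sigma REAL,
    is_default INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE diagnostic_test_history (
    id INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL REFERENCES subjects(id),
    test_date TEXT NOT NULL,
    test_code TEXT NOT NULL,
    test_result TEXT NOT NULL,
    adjusted_date TEXT
);
CREATE TABLE test_property_mapping (
    id INTEGER PRIMARY KEY,
    test_code TEXT NOT NULL,
    test_id INTEGER NOT NULL REFERENCES diagnostic_tests(id),
    test_property_id INTEGER NOT NULL REFERENCES test_property_estimates(id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
tables = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```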