RECall generated a consensus sequence for 97.8% (132/135) of the pol experiments, whereas Exatype succeeded in 93.3% (126/135) of the tests (Table 4). Of these, 126 (93.3%) met the default Exatype and RECall acceptability criteria after automated processing. Inadequate double primer coverage over the entire sequence length was the primary reason for the Exatype failures, since RECall has the flexibility of allowing single primer coverage. For the standard analysis on a standard laptop (ASUS, i3 660 3.33-GHz CPU, 3 GB RAM, Windows XP), RECall base calling, assembly, the contamination check using MEGA X, and alignment took less than 4 hours, including human review of sequence edits. We then used Stanford HIVDB to generate patient results in one hour.
In contrast, we completed the entire analysis, including QA contamination report generation and patient result generation, in Exatype on the same laptop within one hour. The longer time in the standard software pipeline is attributed to the sequence review and edits required before exporting the contig into separate software: MEGA X for QA analysis and Stanford HIVDB for patient result generation. In Exatype, all of these steps are performed simultaneously.
| Editing method | Results | No results | Total |
|---|---|---|---|
| Exatype | 126 (93%) | 9 (7%) | 135 |
| Standard analysis procedure | 132 (98%) | 3 (2%) | 135 |
Table 4: Performance in generating consensus pol sequences for HIV-1 samples by the different editing approaches.
Nucleic acid sequence concordance between Exatype and standard analysis procedure: Within analyzed bases, there was 99.8% overall agreement in base calling between Exatype and the gold standard. There was 99.6% complete sequence concordance within 311,227 nucleotide positions, as indicated in Fig. 1. Of the 311 discordant nucleotides, 308 (99%) were "partially discordant" (mixtures called by one method but not the other), while 3 (1%) were wholly discordant. Of the partially discordant bases, 76.5% (238 of 311) consisted of nucleotide pairs resulting from transitions (R = A/G, Y = C/T) rather than transversions (K = G/T, M = A/C, S = C/G, W = A/T).
The distribution of discordant positions among transitions, transversions, and a combination of both was relatively similar (n = 11, 6, and 5, respectively), as indicated in Fig. 1. Nucleotide mixtures were detected at 1.2% of all bases. Overall, the standard method called marginally more mixtures than Exatype (1193 standard-method-called mixtures [1.08%] vs. 1181 Exatype-called mixtures [1.05%]; P = 0.6).
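The categories used in this comparison can be illustrated with a short sketch. It is not the procedure used in the study: the function names are hypothetical, the sequences are assumed to be aligned and gap-free, and "partially discordant" is approximated here as two calls that share at least one base.

```python
from collections import Counter

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT",
    "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
TRANSITION_CODES = {"R", "Y"}
TRANSVERSION_CODES = {"K", "M", "S", "W"}


def classify_position(call_a, call_b):
    """Compare two base calls at one aligned position."""
    bases_a = set(IUPAC[call_a.upper()])
    bases_b = set(IUPAC[call_b.upper()])
    if bases_a == bases_b:
        return "concordant"
    if bases_a & bases_b:
        # Assumption: any shared base counts as partial discordance.
        return "partially discordant"
    return "wholly discordant"


def mixture_type(code):
    """Label an IUPAC code as pure, transition, transversion, or combination."""
    code = code.upper()
    if code in TRANSITION_CODES:
        return "transition"
    if code in TRANSVERSION_CODES:
        return "transversion"
    if len(IUPAC[code]) > 2:
        # Three- and four-base codes contain both kinds of base pair.
        return "combination"
    return "pure"


def tally(seq_a, seq_b):
    """Count concordance categories over two aligned, gap-free sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return Counter(classify_position(a, b) for a, b in zip(seq_a, seq_b))
```

For example, `classify_position("A", "R")` falls in the partially discordant category because both calls include A, whereas `classify_position("A", "G")` is wholly discordant.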
Amino acid sequence concordance between gold standard and Exatype interpretations: The 311 discordant nucleotide positions resulted in 284 discordant codons. Of these, 114 (40.1%) produced nonsynonymous substitutions between the standard and Exatype methods at the level of sequence-to-amino-acid translation. A total of 278 (97.8%) were partial amino acid discordances (sharing at least one amino acid between the two interpretations), while only 6 (2.2%) were complete amino acid differences.
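The distinction between partial and complete amino acid discordance can be sketched by expanding each ambiguous codon into the set of amino acids it may encode and comparing the two sets. This is an illustrative sketch only: the function names are hypothetical, the standard genetic code is assumed, and codons are assumed to contain no gaps.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "CG", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def amino_acids(codon):
    """Return every amino acid a possibly ambiguous codon can encode."""
    options = [IUPAC[base] for base in codon.upper()]
    return {CODON_TABLE["".join(c)] for c in product(*options)}


def classify_codon(codon_a, codon_b):
    """Compare the amino acid interpretations of two codons."""
    aa_a, aa_b = amino_acids(codon_a), amino_acids(codon_b)
    if aa_a == aa_b:
        return "same amino acids"
    if aa_a & aa_b:
        return "partial amino acid discordance"
    return "complete amino acid discordance"
```

Under these assumptions, `classify_codon("AAA", "ARA")` is a partial discordance (K versus K/R), while `classify_codon("GAA", "GAT")` is a complete difference (E versus D).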
In general, the gold standard and Exatype sequence review identified 97 "key" antiretroviral drug resistance mutations [26], either as complete amino acid substitutions or as part of mixtures. The two methods agreed in 123 cases. Exatype identified one resistance mutation (E35D) that the gold standard did not, while the gold standard identified two (K55R and R57K) that Exatype did not. This variation in resistance mutation identification affected two patient results, though none of the three mutations has clinical significance.
| Region | # | # with NT differences | # with difference mix (a) | # with different NT (b) | # with gap manual (c) | # with error manual (d) | # with error Exatype (e) | # with AA differences | # with difference in resistance interpretation |
|---|---|---|---|---|---|---|---|---|---|
| PR | 126 | 17 | 0 | 0 | 3 | 9 | 24 | 24 | 1 (ANRS) |
| RT | 126 | 17 | 4 | 1 | 1 | 8 | 31 | 54 | 1 (ANRS); 2 (REGA) |
Table 5: Differences in Gold standard and Exatype editing of HIV-1 pol sequences from clinical samples and impact on drug resistance interpretation. We considered sequences that passed both Exatype and Gold standard editing. #, number of samples; NT, nucleotide; AA, amino acid; genotypic drug resistance interpretation systems: ANRS version 27, HIVDB version 8.9-1, and REGA version 8.0.2.
- (a) Number of samples with mixtures scored differently by the two approaches.
- (b) Number of samples with pure nucleotides scored differently by the two approaches.
- (c) Number of samples with parts of sequences that were not analyzable as judged by the editor.
- (d) Number of samples containing differences between RECall and Exatype editing due to errors made during manual editing.
- (e) Number of samples containing differences between Exatype and RECall editing due to errors made during automatic editing in Exatype.
Among the remnant samples from HIV-1 RNA measurement, 93.3% (126/135) of the HIV-1 pol samples had a consensus NT sequence generated by both Exatype and RECall (Table 5). In total, 86.5% (109/126) of the PR and 74.6% (94/126) of the RT sequences were fully concordant at the NT level, and similarly at the AA level. The differences in concordance between the regions were attributed to the difference in coverage length and were less pronounced when normalized.
For each discordant NT call, a second laboratory specialist manually reviewed the chromatograms to verify whether the difference resulted from an erroneous call in the automatic or the manual editing process. Incorrect calls were observed for both editing approaches, i.e., in 24 vs. 9 samples for PR and 31 vs. 1 for RT (Table 5). Only 1 RT nucleotide was different between the manually and automatically edited sequences. In both instances, differences resulted from mistakes made during manual editing. The operator trimmed the 5′ ends of PR in 3 samples and of RT in 1 sample, but these parts were still completely analyzed by RECall and not by Exatype. Additionally, some of the erroneous calls in Exatype arose because this tool does not allow sequence editing.
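| Editing approach | # | NT differences vs. reference, PR: Total | NT differences vs. reference, PR: Missed mix | NT differences vs. reference, PR: False mix | NT differences vs. reference, PR: Different NT/mix | NT differences vs. reference, RT: Total | NT differences vs. reference, RT: Missed mix | NT differences vs. reference, RT: False mix | NT differences vs. reference, RT: Different NT/mix | AA differences, PR | AA differences, RT | Resistance interpretation differences, PR: ANRS | Resistance interpretation differences, PR: HIVDB | Resistance interpretation differences, PR: REGA | Resistance interpretation differences, RT: ANRS | Resistance interpretation differences, RT: HIVDB | Resistance interpretation differences, RT: REGA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Exatype | 22/10 | 5/4 | 2/2 | 3/3 | 1/- | 2/- | 1/- | 1/- | -/- | 3/4 | 2/- | -/- | -/- | -/- | -/- | -/- | -/- |
| RECall | 22/12 | 5/3 | 1/1 | 3/2 | 1/- | -/2 | -/- | -/2 | -/- | 2/3 | -/- | -/- | -/- | -/- | -/- | -/- | -/- |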
Table 6: Differences in RECall and Exatype editing of HIV-1 pol sequences from all EQA samples and impact on drug resistance interpretation. These analyses were confined to drug resistance positions (PR: 10, 20, 24, 30, 32, 33, 36, 46, 47, 48, 50, 53, 54, 63, 71, 73, 77, 82, 84, 88, 90; RT: 41, 62, 65, 67, 69ins, 69, 70, 74, 75, 77, 100, 103, 106, 108, 115, 116, 151, 181, 184, 188, 190, 210, 215, 219, 225). #, number of samples; PR, protease; RT, reverse transcriptase; NT, nucleotide; AA, amino acid; genotypic drug resistance interpretation systems: ANRS version 27, HIVDB version 8.9-1, and REGA version 8.0.2. The number of sequences that passed Exatype and RECall editing is given before the slash; the number of sequences that did not pass either of the two approaches is given after the slash.
- The number of samples with mixtures present in the reference sequence, but not scored by the editing approach (pure wild-type or mutant NT).
- The number of samples with mixtures scored by the editing approach that were not present according to the reference sequence (pure wild-type or mutant NT).
- The number of samples with mixtures and pure nucleotides scored differently by the editing approach and the reference sequence.
EQA results analysis: Using RECall, 85% (34/40; 22 + 12) of the EQA dry panels from WHO had a consensus sequence, while for Exatype it was 80% (32/40) (Table 6). The dry panels are FASTA files that WHO shares with all WHO-accredited laboratories to assess staff competency in sequence editing. For each dry panel, a reference sequence sent by WHO was considered the accurate result; it was calculated from the consensus results of all participants within the WHO ResNet Lab (∼52 participants). We further reviewed each discordant NT call to determine whether the difference resulted from a missed mixture, a false mixture, or a different NT or mixture (Table 6). RECall and Exatype were comparable in detecting mixtures, with almost similar scores for mixtures that were not present in the reference sequence (Table 6).
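The three discordance categories used in this review can be expressed as a simple rule on the reference call and the edited call. The sketch below is an illustration under stated assumptions, not the laboratory's procedure: the function names are hypothetical, any IUPAC ambiguity code is treated as a mixture, and the sequences are assumed to be aligned to the reference.

```python
from collections import Counter

# Any IUPAC ambiguity code is treated as a mixture (assumption).
MIXTURE_CODES = set("RYKMSWBDHVN")


def classify_against_reference(ref, call):
    """Classify one edited base call against the reference base."""
    ref, call = ref.upper(), call.upper()
    if ref == call:
        return "concordant"
    ref_is_mix = ref in MIXTURE_CODES
    call_is_mix = call in MIXTURE_CODES
    if ref_is_mix and not call_is_mix:
        return "missed mix"   # mixture in reference, pure call by editor
    if call_is_mix and not ref_is_mix:
        return "false mix"    # pure in reference, mixture called by editor
    return "different NT/mix"  # two different pure bases or two different mixtures


def tally_against_reference(reference, edited):
    """Count categories over an edited sequence aligned to its reference."""
    if len(reference) != len(edited):
        raise ValueError("sequences must be aligned to the same length")
    return Counter(
        classify_against_reference(r, e) for r, e in zip(reference, edited)
    )
```

For instance, a reference R scored as A counts as a missed mix, while a reference A scored as R counts as a false mix.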
| | RECall PR | RECall RT | Exatype PR | Exatype RT |
|---|---|---|---|---|
| # sequences without NT differences | 28/34 (82%) | 32/34 (94%) | 24/32 (75%) | 30/32 (94%) |
| # sequences without AA differences | 30/34 (88%) | 34/34 (100%) | 26/32 (81%) | 30/32 (94%) |
| # NT differences/total # NT | 9/2112 (0.43%) | 1/2562 (0.04%) | 18/2023 (0.88%) | 4/2400 (0.17%) |
| # AA differences/total # AA | 5/724 (0.72%) | 0/912 (0%) | 10/675 (1.48%) | 2/800 (0.25%) |
| # Me ∩ Mr | 18 | 18 | 12 | 8 |
| # Mr | 21 | 19 | 16 | 11 |
| P(Me\|Mr) | 0.83 | 1 | 0.7 | 0.85 |
| # Me ∩ Pr | 7 | 2 | 8 | 1 |
| # Pr | 2081 | 2679 | 1999 | 2381 |
| P(Me\|Pr) | 0.002 | 0.0008 | 0.004 | 0.0004 |
Table 7: Comparison of RECall and Exatype editing of the WHO dry sample EQA panel with the reference sequence at the NT and AA level. To meet the CLSI guidelines of 40% reference panels being EQA standards, we included dry panels from the WHO ResNet group. #, number of; AA, amino acids; NT, nucleotides; Me, mixtures present in the results of the editing approach; Mr, mixtures present in the reference sequences; Me ∩ Mr, mixtures present in the reference sequences that were scored as a mixture by the editing approach; Pr, pure nucleotides present in the reference sequences; P(Me|Mr), the probability that a mixture was scored if present in the reference sequence; Me ∩ Pr, pure nucleotides in the reference sequences that were scored as a mixture by the editing approach; P(Me|Pr), the probability that a mixture was scored if no mixture was present in the reference sequence.
At the NT level, the percentage of sequences without differences compared to the reference sequence was slightly lower for Exatype editing (75% and 94% for PR and RT, respectively) than for RECall editing (82% and 94%) (Table 7). Using RECall, 0.43% of the PR and 0.04% of the RT nucleotides were discordant with the reference sequence, compared with the markedly higher 0.88% of the PR and 0.17% of the RT nucleotides using Exatype. The same tendency was observed at the AA level (Table 7). We then assessed, for each editing approach, the probability P(Me|Mr) that a mixture was scored if the mixture was present in the reference sequence and the probability P(Me|Pr) that a mixture was scored although the reference sequence had a pure nucleotide.
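Both probabilities are conditional frequencies, that is, ratios of counts. A minimal sketch follows, assuming they are estimated directly as # Me ∩ Mr / # Mr and # Me ∩ Pr / # Pr; the function name and the counts in the usage lines are hypothetical and are not taken from the study data.

```python
def mixture_probabilities(me_and_mr, mr, me_and_pr, pr):
    """Estimate P(Me|Mr) and P(Me|Pr) from position counts.

    me_and_mr: reference mixtures also scored as mixtures by the editor
    mr:        all mixtures in the reference sequences
    me_and_pr: pure reference nucleotides scored as mixtures by the editor
    pr:        all pure nucleotides in the reference sequences
    """
    if mr <= 0 or pr <= 0:
        raise ValueError("reference counts must be positive")
    if not (0 <= me_and_mr <= mr and 0 <= me_and_pr <= pr):
        raise ValueError("joint counts cannot exceed reference counts")
    p_me_given_mr = me_and_mr / mr  # share of true mixtures that were detected
    p_me_given_pr = me_and_pr / pr  # share of pure bases wrongly called mixed
    return p_me_given_mr, p_me_given_pr


# Hypothetical counts: 9 of 10 reference mixtures detected,
# 4 false mixtures among 2000 pure reference positions.
detected, false_rate = mixture_probabilities(9, 10, 4, 2000)
print(f"P(Me|Mr) = {detected:.2f}, P(Me|Pr) = {false_rate:.4f}")
```

A high P(Me|Mr) together with a low P(Me|Pr) indicates an editing approach that detects true mixtures without introducing many false ones.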
In the remnant HIV-1 RNA samples, the majority of samples for which at least one of the editing approaches was able to generate a consensus NT sequence were interpreted as susceptible to most PI, NRTI, and NNRTI. Also, much more extensive drug resistance profiles were observed in the WHO dry panel than in the clinical dataset (Table 8).
| Data set | According to | ANRS PI | ANRS RTI | HIVDB PI | HIVDB RTI | REGA PI | REGA RTI |
|---|---|---|---|---|---|---|---|
| Clinical | Exatype FASTA file | 82/126 (65%) | 51/126 (41%) | 33/126 (26%) | 42/126 (33%) | 34/126 (27%) | 41/126 (33%) |
| Clinical | RECall FASTA file | 81/126 (64%) | 52/126 (41%) | 33/126 (26%) | 42/126 (33%) | 34/126 (27%) | 43/126 (34%) |
| WHO dry panel | Reference | 18/40 (45%) | 20/40 (50%) | 18/40 (45%) | 20/40 (50%) | 17/40 (43%) | 20/40 (50%) |
Table 8: Number of samples displaying (intermediate) resistance to different drug classes, according to ANRS, HIVDB, REGA, and Geno2Pheno. For the HIV-1 RNA remnant dataset, we included only the sequences that passed both RECall and Exatype editing. In contrast, we included resistance information of all reference sequences for the WHO dry panel dataset. FPR, false-positive rate; RTI, reverse transcriptase inhibitor; PI, protease inhibitor; genotypic drug resistance interpretation systems: ANRS version 27, HIVDB version 8.9-1 for both the clinical and the EQA datasets, and REGA version 8.0.2; G2P, Geno2Pheno.