Ethics statement
This study was conducted according to the principles expressed in the Declaration of Helsinki. Institutional review board approval for this study was obtained from the Ethics Committee at Pirogov Russian National Research Medical University (protocol #170, dated 18.12.2017). All patients provided written informed consent for the collection of samples, subsequent analysis and publication of .
Sample processing and extraction of cfDNA
Pregnant women referred for invasive diagnostics at the Center for Family Planning and Reproduction in Moscow were recruited from December 2017 to December 2018. Maternal peripheral blood samples were collected into Streck BCT tubes immediately before obstetric procedures (such as chorionic villus sampling, CVS) during the first trimester. The samples were centrifuged at 250 g for 30 minutes; the plasma was then transferred to new 2.0 ml microtubes and centrifuged at 9000 g for 15 minutes. cfDNA was extracted using the QIAamp Blood kit (Qiagen) with two modifications: an increased volume of lysate was loaded onto the column (starting from 2-4 ml of blood plasma), and the elution volume was reduced to 40 uL. The cfDNA concentration was measured with the Qubit HS kit on a Qubit 2 instrument (ThermoFisher). 2 ng of cfDNA was used in each of the chimeric DNA library preparation and fetal fraction estimation stages of the assay.
For brevity, we refer to the two types of libraries used in our method as “smash” and “amplifet” libraries.
Chimeric DNA library preparation (“smash”)
We refer to this type of library as “smash” in honor of the previously developed method [3], which prompted us to develop our methodology.
Fragmentation
2 ng of cfDNA was fragmented using the dsFragmentase kit (NEB): 1.25 uL of Fragmentase reaction Buffer v2; 2.5 uL of dsDNA Fragmentase; 3 uL of 50% PEG-8000 solution (Sigma Aldrich), and MQ water to a total volume of 27 uL. The solution was incubated at 36℃ for 40 minutes.
Double size-selection to obtain 40-50 bp fragments
2x Ampure XP beads (Beckman) were added to 27 uL of fragmented products and incubated for 5 minutes at RT on benchtop. Then the microtube was placed on a magnetic rack until all the beads were concentrated on one side of the microtube. Supernatant with DNA less than 100 bp in length was transferred to a new microtube. Lower size-selection was done using QIAquick Nucleotide Removal Kit (Qiagen) to cut off DNA shorter than 40 bp. DNA was collected to make up a volume of 20 uL.
End-repair
End-repair reaction was done at RT for 180 minutes using a Quick blunting kit (NEB) by adding 2.5 uL of buffer, 2.5 uL dNTP mix and 1 uL of enzyme mix.
Self-ligation
Formation of chimeric DNA molecules was done by adding 4.5 uL of T4 ligation buffer, 0.5 uL of T4 DNA ligase (E320 kit, Sybenzyme), 9 uL of 50% PEG-8000 solution, and 1 uL of 5’-deadenylase (NEB) to the product of the end-repair reaction. Self-ligation was conducted at RT overnight.
A-tailing
4.5 uL of Taq buffer, 2 uL of Taq-pol (PK015L, Evrogen), and 2 uL of dATP (R0181, ThermoFisher) were added to the self-ligation product to make up to a final volume of 47 uL. The solution was incubated for 30 minutes at 65℃ and 2 minutes at 72℃.
Adapter ligation
Oligonucleotides dir_1 (ACACTCTTTCCCTACACGACGCTCTTCCGATCT) and rev_P (P*GATCGGAAGAGCACACGTCTGAACTCCAGTC) were synthesized by Evrogen (Moscow). dir_1 and rev_P were diluted to 5 mM and combined in equal volumes, then hybridized in a thermocycler by heating to 95℃ for 5 minutes followed by slow cooling to RT. To 47 uL of A-tailing product we added: 3 uL of adapter mix; 5.75 uL of ligation buffer; 3 uL of T4 DNA ligase (E320 kit, Sybenzyme); 6.3 uL of 50% PEG-8000 solution; 0.5 uL of 5’-deadenylase. The reaction was incubated in a thermocycler for 100 cycles of the following program: 4℃ for 10 seconds, 16℃ for 30 seconds.
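As a sanity check on the adapter design, the following Python sketch tests whether the two oligonucleotides listed above can anneal into a partially double-stranded (Y-shaped) adapter with a single unpaired 3' T on dir_1, which is what ligation to A-tailed fragments requires. The function names are illustrative and are not part of the original protocol; the 5'-phosphate of rev_P is omitted from the sequence string.

```python
DIR_1 = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"
REV_P = "GATCGGAAGAGCACACGTCTGAACTCCAGTC"  # 5'-phosphate (P*) not represented

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def stem_length(top: str, bottom: str, overhang: int = 1) -> int:
    """Count consecutive base pairs between the 3' end of `top` and the
    5' end of `bottom`, leaving `overhang` unpaired bases at the 3' end
    of `top`."""
    target = reverse_complement(top)[overhang:]
    paired = 0
    for base_bottom, base_target in zip(bottom, target):
        if base_bottom != base_target:
            break
        paired += 1
    return paired


stem = stem_length(DIR_1, REV_P)
print(f"paired stem: {stem} bp, 3' overhang base on dir_1: {DIR_1[-1]}")
```

For these two sequences the check reports a 12 bp paired stem and a T as the overhanging 3' base of dir_1; the remaining bases of both oligonucleotides stay single-stranded and form the arms of the adapter.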
Size-selection
One volume of MQ water was added to the adapter ligation product, followed by 0.4 volumes of 2x-concentrated Ampure XP beads. The standard cleanup procedure was performed, except that DNA was not eluted from the beads: the PCR mix from the next step was added directly to the beads with immobilized DNA.
Indexing PCR
In a new microtube, the following reagents were combined: 1 uL of i5 and 1 uL of i7 indexes from the E7600S NEB kit; 5 uL of HiFi buffer; 0.5 uL of HiFi pol; 0.75 uL of dNTPs (all KAPA 7958897001). The mixture was added to the beads with immobilized DNA and placed in a thermocycler with the following program: 95℃ for 2 minutes; 15 cycles of 98℃ for 20 seconds, 65℃ for 30 seconds, 72℃ for 2 minutes; 72℃ for 5 minutes.
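For planning purposes, a cycling program such as the one above can be written down as data and its total hold time computed. The sketch below is a minimal illustration in Python; the data layout and the function name are assumptions made for this example, and ramp times between temperatures are ignored.

```python
from typing import List, Tuple

Step = Tuple[float, int]        # (temperature in degrees C, hold time in seconds)
Block = Tuple[int, List[Step]]  # (number of repeats, steps within one repeat)

# Indexing PCR program for smash libraries, as described in the text.
INDEXING_PCR: List[Block] = [
    (1, [(95, 120)]),                        # initial denaturation
    (15, [(98, 20), (65, 30), (72, 120)]),   # 15 amplification cycles
    (1, [(72, 300)]),                        # final extension
]


def total_hold_seconds(program: List[Block]) -> int:
    """Sum the hold times of all steps, multiplied by their repeat counts."""
    return sum(
        repeats * sum(seconds for _, seconds in steps)
        for repeats, steps in program
    )


print(total_hold_seconds(INDEXING_PCR) / 60, "minutes of hold time")
```

The same representation applies to the other cycling programs in this section, for example the 100-cycle adapter ligation program.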
Cleanup
PCR products were cleaned with 0.5 volumes of Ampure XP beads. Elution was done in 20 uL of Low TE.
QC
QC was done using a High Sensitivity Kit for Bioanalyzer 2100 (Agilent). Results were considered optimal if the library peak was in the 500-800 bp range and the concentration was more than 4 nM in the 200-800 bp range (picture 1A).
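The acceptance criteria above can be expressed as a simple check. The Python sketch below is illustrative only: the function names are hypothetical, and the mass-to-molarity conversion uses the common approximation of 660 g/mol per base pair of double-stranded DNA, which is relevant only when the instrument output is available as a mass concentration rather than as molarity.

```python
def ng_per_ul_to_nM(conc_ng_per_ul: float, mean_length_bp: float) -> float:
    """Convert a dsDNA mass concentration to molarity (nM),
    assuming an average of 660 g/mol per base pair."""
    return conc_ng_per_ul * 1e6 / (660.0 * mean_length_bp)


def smash_library_passes_qc(peak_bp: float, molarity_nM: float) -> bool:
    """Apply the criteria from the text: library peak within 500-800 bp
    and a concentration above 4 nM (measured in the 200-800 bp region)."""
    return 500 <= peak_bp <= 800 and molarity_nM > 4.0


# Example: a library at 2 ng/uL with a mean fragment length of 650 bp.
molarity = ng_per_ul_to_nM(2.0, 650)
print(round(molarity, 2), smash_library_passes_qc(650, molarity))
```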
Fetal fraction estimation library preparation (“amplifet”)
We refer to libraries for assessing the proportion of fetal DNA as “amplifet”.
Multiplex PCR
2 ng of cfDNA was added to the first PCR with 20 uL of AmpliSeq primer mix (ThermoFisher); 8 uL of Phusion buffer; 0.4 uL of Phusion U pol (F-555L kit, ThermoFisher); 0.8 uL of dNTPs (pb006L Evrogen) and MQ water up to 40 uL. The mixture was amplified using the following program: 98℃ for 30 seconds; 27 cycles of 98℃ for 10 seconds, 60℃ for 4 minutes, 72℃ for 20 seconds; 72℃ for 5 minutes.
QC
Length of PCR products was assessed using agarose gel-electrophoresis.
Cleanup
The first PCR product was cleaned with 3 volumes of Ampure XP beads and eluted in 20 uL.
Adapter ligation
To ligate Illumina DIY adapters (the same as in the “Adapter ligation” step described above), end-repair and A-tailing reactions were carried out in the same microtube but at a lower temperature. The following reagents were mixed in a new microtube: 10 uL of cleaned amplicons from the first PCR; 5 uL of Ligase buffer (B302 Sybenzyme); 5 uL of adapter mix; 0.5 uL of T4 DNA ligase (E330 Sybenzyme); 0.5 uL of 5’-deadenylase (M0331 NEB); 1 uL of 10 mM ATP (R0441 ThermoFisher); 1 uL of Klenow exo- (M0212L NEB); 1 uL of dATP (R0141 ThermoFisher); 2 uL of T4 PNK (EK0032 ThermoFisher); 12 uL of 50% PEG-8000 solution; 12 uL of MQ water. The mix was incubated at 37℃ for 40 minutes, followed by 100 cycles of 10℃ for 10 seconds and 30℃ for 30 seconds.
Cleanup
The ligation product was cleaned with 1.5 volumes of Ampure XP beads and eluted in 30 uL.
Indexing PCR
The following reagents were mixed in a new microtube: 1 uL of i5 and 1 uL of i7 indexes from the E7600S NEB kit; 5 uL of Phusion buffer; 0.25 uL of Phusion U pol; 0.5 uL of dNTPs. The mixture was placed in a thermocycler and run through the following program: 98℃ for 30 seconds; 14 cycles of 98℃ for 10 seconds, 65℃ for 30 seconds, 72℃ for 20 seconds; 72℃ for 5 minutes.
Cleanup
Cleanup was done using a GeneRead Size Selection Kit (180514 Qiagen). DNA was eluted to 20 uL of Low TE.
QC
QC was done using a High Sensitivity Kit for Bioanalyzer 2100 (Agilent). Results were considered optimal if the library peak was in the 270-280 bp range and the concentration was more than 4 nM in the 270-280 bp range (picture 1B).
Sequencing
NGS was performed using an Illumina HiSeq 2500 instrument with Rapid Run v2 kits designed for 500 cycles (PE250 dual-indexing).
Bioinformatics
Mapping
Raw data in BCL format were converted to FASTQ using bcl2fastq v. 2.20. Reads were mapped to the hg38 genome in two iterations. In the first iteration, BWA [5] version 0.7.17 was used with standard settings. At this stage, the fragments of each smash read were distributed along the genome according to where they mapped, whereas reads from amplifet libraries were mapped as a whole. In the second iteration, the subreads identified by the first mapping were extracted and mapped again as separate reads. Supplementary alignments and alignments to the minus strand were filtered out (after the first iteration, all exported subreads had already been converted to the plus strand of the reference genome).
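The subread extraction between the two mapping iterations can be illustrated with a short Python sketch. It is not the pipeline used in this study; it only shows one way to recover the aligned portion of a read from the SEQ and CIGAR fields of a SAM record, and how the second-pass flag filter can be expressed. It relies on two properties of the SAM format: hard-clipped bases are absent from SEQ, and SEQ of a reverse-strand alignment is stored reverse-complemented, so the extracted segment already reads along the plus strand of the reference. The function names are hypothetical.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

FLAG_REVERSE = 0x10         # alignment is on the minus strand
FLAG_SUPPLEMENTARY = 0x800  # supplementary (chimeric) alignment


def aligned_subread(seq: str, cigar: str) -> str:
    """Return the part of SEQ covered by the alignment, i.e. SEQ with
    soft-clipped ends removed. Hard clips are ignored because the
    corresponding bases are not present in SEQ."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar) if op != "H"]
    left = ops[0][0] if ops and ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return seq[left:len(seq) - right]


def keep_second_pass(flag: int) -> bool:
    """Second-iteration filter: drop supplementary alignments and
    alignments to the minus strand."""
    return not (flag & FLAG_SUPPLEMENTARY) and not (flag & FLAG_REVERSE)


# A primary alignment with soft clips and a supplementary one with a hard clip.
print(aligned_subread("AAAAA" + "C" * 20 + "GGG", "5S20M3S"))
print(aligned_subread("ACGTACGTACGTACG", "10H15M"))
```

Each extracted segment would then be written out as a separate read for the second mapping iteration.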
Filtering
For each smash library, the following steps were performed:
All BAM files that contain smash data for this sample were downloaded and, if there were several, were combined using samtools [6] version 1.9.
The following reads were filtered out:
imperfectly mapped onto the genome (MAPQ <60)
those that fall into regions of known repeats (RepeatMasker in Genome Browser track)
those that fall into amplicon regions
For each amplifet library, only reads that fall into the amplicon regions were retained (off-target reads were discarded).
The FLASH tool was used with the following settings to calculate insert length in smash libraries:
--min-overlap 20 --max-overlap 250 --allow-outies --max-mismatch-density 0.20
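The read-level filters can be summarized in code. The Python sketch below is a simplified illustration under stated assumptions: reads and annotation regions are represented as half-open intervals, the repeat and amplicon annotations are passed in as plain dictionaries, and the amplifet rule is read as retaining only on-target reads. All names are hypothetical; at real data scale an indexed interval structure or a dedicated tool would replace the linear scan.

```python
from typing import Dict, List, NamedTuple, Tuple


class Read(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    mapq: int


Intervals = Dict[str, List[Tuple[int, int]]]


def overlaps(read: Read, intervals: Intervals) -> bool:
    """True if the read overlaps any interval on its chromosome."""
    return any(
        start < read.end and read.start < end
        for start, end in intervals.get(read.chrom, [])
    )


def keep_smash_read(read: Read, repeats: Intervals, amplicons: Intervals,
                    min_mapq: int = 60) -> bool:
    """Keep reads with MAPQ of at least 60 that fall outside
    known repeats and outside amplicon regions."""
    return (
        read.mapq >= min_mapq
        and not overlaps(read, repeats)
        and not overlaps(read, amplicons)
    )


def keep_amplifet_read(read: Read, amplicons: Intervals) -> bool:
    """Keep only on-target reads, i.e. reads overlapping an amplicon."""
    return overlaps(read, amplicons)


# Toy annotation and reads, for illustration only.
repeats = {"chr1": [(1000, 2000)]}
amplicons = {"chr1": [(5000, 5200)]}
reads = [
    Read("chr1", 100, 150, 60),    # kept for smash
    Read("chr1", 1500, 1550, 60),  # in a repeat
    Read("chr1", 5100, 5150, 60),  # in an amplicon
    Read("chr1", 300, 350, 30),    # low MAPQ
]
print([keep_smash_read(r, repeats, amplicons) for r in reads])
```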
Statistics
We ran FetalQuant [7] with the default parameters to estimate the fractional fetal DNA concentration from the SNP data (amplifet targeted sequencing). The training samples were classified as case/control based on chromosomal z-scores. We used the R package NIPTeR [8] to perform variation reduction (peak, GC and chi-squared corrections) and match QC, and to calculate the z-score.
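For readers unfamiliar with the chromosomal z-score, the following Python sketch shows the basic calculation: the fraction of fragments assigned to the chromosome of interest is compared with the mean and standard deviation of the same fraction in control samples. This is a plain illustration and not the NIPTeR implementation, which additionally applies the corrections listed above before scoring; the function names are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def chromosome_fraction(counts: Dict[str, int], chrom: str) -> float:
    """Fraction of counted fragments that map to `chrom`.
    `counts` is assumed to hold per-chromosome fragment counts."""
    return counts[chrom] / sum(counts.values())


def z_score(sample_fraction: float, control_fractions: Sequence[float]) -> float:
    """Standard z-score of the sample against the control distribution."""
    return (sample_fraction - mean(control_fractions)) / stdev(control_fractions)


# Toy numbers, for illustration only.
controls = [0.0130, 0.0131, 0.0129, 0.0130, 0.0130]
sample = chromosome_fraction({"chr21": 134, "other": 9866}, "chr21")
print(round(z_score(sample, controls), 2))
```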
Filtering
We filtered out all samples in which the estimated fractional fetal DNA concentration was less than 4% or the total number of fragments was less than 2 500 000. We used a threshold of 3 for the z-score in the classification task. To make the estimates of the accuracy characteristics more stable, we implemented a kind of cross-validation procedure. The test dataset was composed of all samples with trisomies (for a certain chromosome) and an equal number of the control samples chosen randomly. The remaining control samples formed the training dataset. For each run we computed sensitivity, specificity and AUC values for the classification task and reported the averaged values over 200 runs.
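The validation procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used in the study: samples are represented only by their chromosome fractions, a sample is called positive when its z-score is strictly greater than the threshold, AUC is computed as the probability that a trisomy sample scores higher than a control sample, and the function names and the random seed are hypothetical.

```python
import random
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def passes_sample_filters(fetal_fraction: float, n_fragments: int) -> bool:
    """Sample-level filters from the text: fetal fraction of at least 4%
    and at least 2 500 000 fragments."""
    return fetal_fraction >= 0.04 and n_fragments >= 2_500_000


def one_run(trisomy: Sequence[float], controls: Sequence[float],
            rng: random.Random, threshold: float = 3.0) -> Tuple[float, float, float]:
    """One random split: the test set holds all trisomy samples plus an
    equal number of random controls; the remaining controls form the
    training set used to compute z-scores."""
    shuffled = list(controls)
    rng.shuffle(shuffled)
    test_controls = shuffled[:len(trisomy)]
    training = shuffled[len(trisomy):]
    mu, sd = mean(training), stdev(training)
    z_pos = [(x - mu) / sd for x in trisomy]
    z_neg = [(x - mu) / sd for x in test_controls]
    sensitivity = sum(z > threshold for z in z_pos) / len(z_pos)
    specificity = sum(z <= threshold for z in z_neg) / len(z_neg)
    pairs = [(p, n) for p in z_pos for n in z_neg]
    auc = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs) / len(pairs)
    return sensitivity, specificity, auc


def repeated_validation(trisomy: Sequence[float], controls: Sequence[float],
                        runs: int = 200, seed: int = 0) -> Tuple[float, ...]:
    """Average sensitivity, specificity and AUC over `runs` random splits."""
    rng = random.Random(seed)
    results: List[Tuple[float, float, float]] = [
        one_run(trisomy, controls, rng) for _ in range(runs)
    ]
    return tuple(mean(column) for column in zip(*results))


# Toy chromosome fractions, for illustration only.
toy_trisomy = [0.0140, 0.0141, 0.0139]
toy_controls = [0.0130 + 0.00001 * ((i % 5) - 2) for i in range(20)]
print(repeated_validation(toy_trisomy, toy_controls))
```

The training set must contain at least two control samples for the standard deviation to be defined.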
Determination of the sex of the fetus
To determine the sex, a formula for estimating the fraction of fetal DNA was used based on the analysis of the fraction of fragments of the Y chromosome [9-11].
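The exact formula is given in the cited references and is not reproduced in this section. As an assumption made for illustration, the Python sketch below uses one form that is common in the literature: the Y-chromosome fragment fraction of the sample is rescaled between the background level observed in pregnancies with a female fetus and the level observed in adult male DNA. The parameter names and the male-call cutoff are hypothetical and would have to be calibrated on reference data.

```python
def y_based_fetal_fraction(y_sample: float, y_female_baseline: float,
                           y_male_reference: float) -> float:
    """Estimate fetal fraction from the fraction of fragments mapped to chrY.

    y_sample          -- chrY fragment fraction in the tested plasma sample
    y_female_baseline -- background chrY fraction in female-fetus pregnancies
    y_male_reference  -- chrY fraction in pure male DNA
    (all three inputs are assumptions of this sketch, not values from the text)
    """
    return (y_sample - y_female_baseline) / (y_male_reference - y_female_baseline)


def call_fetal_sex(y_fetal_fraction: float, min_male_fraction: float) -> str:
    """Call 'male' when the Y-based estimate exceeds a hypothetical,
    user-calibrated cutoff; otherwise call 'female'."""
    return "male" if y_fetal_fraction > min_male_fraction else "female"


# Toy numbers, for illustration only.
estimate = y_based_fetal_fraction(0.00038, 0.0002, 0.002)
print(round(estimate, 3), call_fetal_sex(estimate, min_male_fraction=0.02))
```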