PURPOSE To investigate whether artificial intelligence can identify fetal intracranial structures at 11-14 weeks of pregnancy, and to provide an automated method for classifying standard and non-standard sagittal views in obstetric ultrasound examination.
METHOD AND MATERIALS A dataset of 1842 2D sagittal-view ultrasound images from 1842 women was collected to train and test a newly designed deep learning (DL) scheme, the Fetus Framework, to identify nine fetal intracranial structures: thalami, midbrain, palate, 4th ventricle, cisterna magna, nuchal translucency (NT), nasal tip, nasal skin, and nasal bone. Results from the Fetus Framework were then used for standard/non-standard (S-NS) plane classification, a key step for NT measurement and Down syndrome assessment. S-NS classification was also tested on 156 images from a second medical center. Sensitivity, specificity, and area under the curve (AUC) were evaluated to compare the Fetus Framework, three classic DL models, and human experts with 1, 3, and 5 years of ultrasound training. Furthermore, a dataset of 316 standard images confirmed by the Fetus Framework and another dataset of 316 standard images selected by physicians were each used to train a random forest for a fetal malformation classification task. Based on the hypothesis that a random forest performs better when trained on images closer to the standard plane, the mean AUCs from 5-fold cross-validation were compared.
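As an illustration of the evaluation metrics named above, the following is a minimal sketch of how sensitivity, specificity, and AUC can be computed for a binary S-NS classifier. It is not the authors' implementation. The label convention (1 = standard plane, 0 = non-standard), the 0.5 decision threshold, the function names, and the example labels and scores are all assumptions made for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed convention: 1 = standard plane, 0 = non-standard plane.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and classifier scores, for illustration only.
example_labels = [1, 1, 1, 0, 0, 0, 1, 0]
example_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
example_preds = [1 if s >= 0.5 else 0 for s in example_scores]

sens, spec = sensitivity_specificity(example_labels, example_preds)
auc = roc_auc(example_labels, example_scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric are reported together.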
RESULTS The nine intracranial structures identified by the Fetus Framework (FF) in validation were all consistent with those identified by senior radiologists. For S-NS sagittal view identification, FF achieved an AUC of 0.996 (95% CI: 0.987, 1.000) on the internal test set, on par with the classic DL models. On the external test set, FF reached an AUC of 0.974 (95% CI: 0.952, 0.995), compared with 0.883 (95% CI: 0.828, 0.939) for ResNet-50, 0.890 (95% CI: 0.834, 0.946) for Xception, and 0.894 (95% CI: 0.839, 0.949) for DenseNet-121. On the internal test set, the sensitivity and specificity of FF were (0.905, 1), while those of the first-, third-, and fifth-year clinicians were (0.798, 0.986), (0.690, 0.958), and (0.619, 0.986), respectively. On the external test set, the sensitivity and specificity of FF were (0.989, 0.797), and those of the first-, third-, and fifth-year clinicians were (0.663, 0.781), (0.609, 0.844), and (0.533, 0.875), respectively. In further validation on the fetal malformation classification task, the mean AUC of the random forest was 0.768 (0.724, 0.812) on the physician-selected dataset and 0.806 (0.741, 0.871) on the FF-confirmed dataset, suggesting that the Fetus Framework identifies standard images more accurately.
CONCLUSION We propose a new deep learning-based Fetus Framework for identifying key fetal intracranial structures. The framework was tested on data from two different medical centers. The results show consistency with, and improvement over, classic models and human experts in standard and non-standard sagittal view classification at 11 to 13+6 weeks of pregnancy.
CLINICAL RELEVANCE/APPLICATION With further refinement in a larger population, the proposed model could improve the efficiency and accuracy of early-pregnancy ultrasound examination.