2.1 Study design and data collection
The study was approved by the Ethics Committee of the Affiliated Hospital of Guangdong Medical University, and the requirement for patient informed consent was waived for this retrospective analysis.
We extracted from the electronic medical record database the data of patients who were diagnosed with AMI between January 2017 and December 2019 and who fulfilled the Fourth Universal Definition of Myocardial Infarction (2018)9. Arrhythmic events were identified by reviewing electrocardiograms or Holter monitor recordings. VTA were defined as sustained ventricular tachycardia, ventricular fibrillation resulting in defibrillator shocks, or non-sustained ventricular tachycardia. The risk factors and their subclassifications are listed in Table-1:
Table-1: Risk factors considered in the study
| Risk factors | Features and subclassification |
|---|---|
| General information | |
| Age (years) | <60, 60-75, >75 |
| Gender | Female/Male |
| Diabetes history | Yes/No |
| Hypertension history | Yes/No |
| Information about AMI | |
| Type of AMI | STEMI/NSTEMI |
| hsTnT | Not more than 5 times threshold; more than 5 times threshold; more than 10 times threshold |
| PCI timing | Hospitalized in 24 h; hospitalized more than 24 h; hospitalized without PCI |
| NT-proBNP | Normal; not more than 5 times threshold; more than 5 times threshold |
| EF | >50%, 40%-50%, <40% |
| Hypokalemia | Yes/No |
| Relevant organs information | |
| Infection (pneumonia or catheter-related infection) | Yes/No |
| eGFR (kidney) | >60 ml/min, 30-60 ml/min, <30 ml/min |
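The three-level subclassification used for age, EF, and eGFR in Table-1 can be illustrated with a short sketch. It is written in Python for illustration only (the study used R), the function names `categorize` and `encode_patient` are hypothetical, and the handling of values that fall exactly on a cut-off (assigned to the middle class here) is an assumption, since the table does not state it.

```python
def categorize(value, low, high):
    """Return 0 if value < low, 1 if low <= value <= high, 2 if value > high.

    Assumption: boundary values belong to the middle class.
    """
    if value < low:
        return 0
    if value <= high:
        return 1
    return 2


def encode_patient(age_years, ef_percent, egfr_ml_min):
    """Map three continuous measurements to the Table-1 subclasses."""
    return {
        "age": ("<60", "60-75", ">75")[categorize(age_years, 60, 75)],
        "ef": ("<40%", "40%-50%", ">50%")[categorize(ef_percent, 40, 50)],
        "egfr": ("<30", "30-60", ">60")[categorize(egfr_ml_min, 30, 60)],
    }


# Hypothetical patient: 58 years old, EF 45%, eGFR 25 ml/min
example = encode_patient(58, 45, 25)
```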
2.2 Statistical analysis and model establishment
Statistical analysis and model construction were performed with R software (version 3.6.1). The R packages "psych", "ggplot2", and "pheatmap" were used to perform and visualize the principal component analysis (PCA). The packages "neuralnet", "NeuralNetTools", "dplyr", and "pROC" were used to develop and validate the artificial neural network (ANN) model. The analysis proceeded in the following steps:
2.2.1. Features were extracted from the original variables by principal component analysis (PCA), a long-established and widely used multivariate statistical method. The principal components retain most or all of the information in the original data while being mutually uncorrelated, which avoids multicollinearity and facilitates model development10.
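As a rough illustration of this step, the sketch below standardizes a small data matrix, builds its correlation matrix, and extracts the leading principal component by power iteration. It is a minimal Python stand-in, not the "psych" workflow used in the study; the toy data and all function names are hypothetical, and only the first component is computed.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit sample variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def first_component(corr, iterations=500):
    """Power iteration: return (eigenvalue, unit loading vector) of the
    leading principal component of a correlation matrix."""
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p))
                     for a in range(p))
    return eigenvalue, v


# Hypothetical toy data: columns 1 and 2 are strongly correlated
data = [[1, 2.0, 5], [2, 4.1, 3], [3, 5.9, 6], [4, 8.2, 2], [5, 9.8, 4]]
z = standardize(data)
corr = correlation_matrix(z)
eigenvalue, loadings = first_component(corr)
explained = eigenvalue / len(corr)  # share of total variance in component 1
scores = [sum(l * x for l, x in zip(loadings, row)) for row in z]
```

The component scores, rather than the original collinear variables, would then serve as model inputs.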
2.2.2. The cohort was randomly divided into a training set and a testing set at a ratio of 70%:30%. A standard feed-forward, back-propagation neural network, the simplest form of ANN, consisting of an input layer, a hidden layer, and an output layer, was applied in the study because of its relative simplicity and stability11. The model operates as follows: the new principal components are passed from the input layer to the hidden layer, which consists of several neurons that receive the information. Each neuron connection has its own weight, and each neuron has a bias parameter. The weight represents the importance of the corresponding input relative to the other inputs; the bias is an offset that adjusts the weighted sum of the inputs. The information is transformed nonlinearly by the sigmoid activation function and passed to the output layer, which calculates whether the case is complicated by VTA. The optimal number of neurons in the hidden layer was determined by trial and error, since no accepted theory currently exists for predetermining it; we searched over candidate numbers of hidden neurons in a loop to select the optimal number for the model. The mathematical operations in the model can be generalized as follows12:
Note: y = output result, i = number of input variables, N = number of neurons, w = weights, x = input variables, b = bias parameter
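With the symbols in the note, a network with one hidden layer of N neurons is commonly written as y = f(b_o + sum over j = 1..N of w_j * f(b_j + sum over i of w_ij * x_i)), where f is the sigmoid function. The sketch below assumes this standard form, including a sigmoid at the output, and shows the random 70%:30% split and a single forward pass. It is a Python illustration rather than the "neuralnet" code used in the study; the names `split_70_30`, `forward`, `b_o`, and the seed are hypothetical, and the weights would in practice be fitted by back-propagation.

```python
import math
import random


def split_70_30(records, seed=1):
    """Randomly divide records into a training set (70%) and a testing set (30%)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def sigmoid(z):
    """Sigmoid activation, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass through input -> hidden -> output.

    hidden_weights[j][i] is the weight from input i to hidden neuron j.
    Assumption: the output neuron also uses the sigmoid, so y can be read
    as a predicted probability of VTA.
    """
    hidden = [
        sigmoid(b_j + sum(w * x_i for w, x_i in zip(w_j, x)))
        for w_j, b_j in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(output_bias + sum(w * h for w, h in zip(output_weights, hidden)))


# Hypothetical example: 3 principal-component inputs, N = 2 hidden neurons
train, test = split_70_30(range(10))
y = forward(
    x=[0.5, -1.2, 0.3],
    hidden_weights=[[0.4, -0.6, 0.1], [-0.3, 0.8, 0.5]],
    hidden_biases=[0.1, -0.2],
    output_weights=[1.5, -1.0],
    output_bias=0.05,
)
```

The search for the number of hidden neurons amounts to repeating the fit for each candidate N and keeping the value with the best validation performance.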
2.2.3. The optimized model was evaluated on the training dataset and the testing dataset separately, using the following measures: area under the receiver operating characteristic curve (AUC)13 and the confusion matrix with accuracy, sensitivity, specificity, positive predictive value, and negative predictive value.
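The assessment measures named above can be computed as in the following sketch. It is a plain Python illustration, not the "pROC" routine used in the study; the AUC is obtained from the rank-based (Mann-Whitney) formulation, the 0.5 classification cut-off and the toy labels and scores are assumptions, and the function names are hypothetical.

```python
def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV, and NPV from binary labels
    (1 = VTA, 0 = no VTA). Assumes every denominator is non-zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed cut-off of 0.5
metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```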