Study population
The Shinken Database includes all patients newly visiting the Cardiovascular Institute, Tokyo, Japan, excluding foreign travelers and patients with active cancer. This single hospital-based database was established in June 2004. Details of this database have been described elsewhere.13
In the present study, we used the computerized ECG database, which has been available since February 2010. Therefore, of the 32570 patients in the Shinken Database, the 19170 patients registered between February 2010 and March 2018 were extracted. After excluding patients with structural heart disease (n = 4915), patients aged under 20 or over 90 years (n = 168), and patients whose index ECG showed an indeterminate axis (R axis >180˚) (n = 76), pacing beats (n = 102), or atrial or ventricular tachyarrhythmia (n = 1763), the remaining 12837 patients formed the target population of the present study.
Definition of structural heart disease
Structural heart diseases were defined as follows: valvular heart disease, moderate or severe stenosis or regurgitation on echocardiography; coronary artery disease, diagnosed on angiography or scintigraphy; and hypertrophic and dilated cardiomyopathy, diagnosed on echocardiography or magnetic resonance imaging (MRI). Heart failure was diagnosed when patients had symptoms of New York Heart Association (NYHA) class ≥ 2.
Patient follow-up
Health status and the incidence of cardiovascular events and mortality were maintained in the database through linkage to the hospital medical records and through study documents on prognosis sent once per year to patients who had stopped visiting the hospital or had been referred to other hospitals. In the present study, we included follow-up data until March 2019 and excluded follow-up data beyond 3 years after the initial visit, to avoid an imbalance in follow-up period among patients caused by the different registration years (between 2010 and 2018).
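The follow-up rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to build the database: the function and constant names are hypothetical, the cutoff day is assumed to be 31 March 2019 (the text gives only the month), a year is assumed to be 365 days, and `last_contact` is taken to be the date of death when the patient died.

```python
from datetime import date

# Assumptions for illustration only (not stated in the text):
MAX_FOLLOW_UP_DAYS = 3 * 365      # 3-year cap, using 365-day years
DATA_CUTOFF = date(2019, 3, 31)   # "March 2019" taken as the last day of the month


def censor_follow_up(first_visit, last_contact, died):
    """Return (follow-up days, death counted?) after applying both limits.

    first_visit  -- date of the initial visit
    last_contact -- date of death if died, otherwise date last known alive
    died         -- True if the patient died at last_contact
    """
    # Follow-up cannot extend past the data cutoff.
    end = min(last_contact, DATA_CUTOFF)
    days = (end - first_visit).days
    # A death after the cutoff is not observed within the study window.
    event = died and last_contact <= DATA_CUTOFF
    # Anything beyond 3 years is truncated and treated as censored.
    if days > MAX_FOLLOW_UP_DAYS:
        return MAX_FOLLOW_UP_DAYS, False
    return days, event
```

For example, a patient registered in February 2010 who died five years later contributes three years of follow-up and no event, which keeps the observation window comparable with that of a patient registered in 2018.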
Parameters obtained from ECG
The 12-lead ECG was recorded for 10 s in the supine position, using a GE ECG machine (GE CardioSoft V6.71 and MAC 5500 HD; GE Healthcare, Chicago, IL) at a sampling rate of 500 Hz, and stored using the MUSE data management system.
In the database of computerized raw ECG data from the GE system, 639 parameters were measured automatically. Of these, 201 parameters (9 not lead-specific and 192 [16 × 12 leads] lead-specific) were temporarily stored data, including relative coordinate points (e.g., the start point of the P-wave) and calculated values nearly identical to the original parameters (e.g., among the QTc parameters, QTc Calculation [QTc Bazett] was used and QTc Framingham and QTc Fridericia were excluded). After excluding these, the remaining 438 parameters (6 not lead-specific and 432 [36 × 12 leads] lead-specific) were used in the analysis (Table 1).
Evaluation and statistical analysis
Statistical analyses were carried out using SPSS version 26.0 (IBM, Chicago, IL). In all analyses, P < 0.05 was taken to indicate statistical significance. Categorical and continuous data are presented as number (%) and mean ± SD, respectively.
(1) Modeling of biological age (BA) using ECG parameters: BA was modeled from ECG parameters in two steps according to the PCA algorithm.4 [Step 1] Pre-BA was calculated as $\text{pre-BA} = \sum_{i=1}^{m} p_i \sum_{j=1}^{n} \beta_{ij} \, \frac{x_j - \bar{x}_j}{\mathrm{sd}(x_j)}$. In the equation, $m$ indicates the number of principal components, and $i$ indicates their individual orders; $n$ indicates the number of ECG parameters, and $j$ indicates their individual orders; $\beta$ indicates the coefficient in the PCA; $x$ indicates each ECG parameter; and $\bar{x}$ and $\mathrm{sd}(x)$ indicate the average value and the standard deviation of each ECG parameter. The weight $p_i$ was calculated by the following formula: $p_i = R^2_i / \sum_{k=1}^{m} R^2_k$, where $R^2_i$ is the $R^2$ of a univariable linear regression model with the $i$-th principal component for chronological age (CA). [Step 2] BA was calculated by the following formula: $\text{BA} = \text{pre-BA} \times \mathrm{sd}(\text{CA}) + \overline{\text{CA}} + (\text{CA} - \overline{\text{CA}}) \times (1 - B)$. In the equation, $\mathrm{sd}(\text{CA})$ and $\overline{\text{CA}}$ indicate the standard deviation and the average value of CA, and $B$ indicates the standardized coefficient in the univariable linear regression analysis in which pre-BA and CA were the dependent and the independent variable, respectively.
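The two steps above can be sketched in plain Python as follows. This is a minimal illustration, not the SPSS procedure used in the study, and it rests on several assumptions: the PCA coefficients (`loadings`) are taken as precomputed inputs, standard deviations are sample SDs, and all function and variable names are hypothetical. It relies on two standard identities for univariable linear regression: R² equals the squared Pearson correlation, and the standardized coefficient B equals the Pearson correlation.

```python
from statistics import mean, stdev


def standardize(values):
    """z-score a list of values using the sample standard deviation."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def pearson_r(a, b):
    """Pearson correlation, computed from z-scores."""
    za, zb = standardize(a), standardize(b)
    return sum(p * q for p, q in zip(za, zb)) / (len(a) - 1)


def biological_age(params, loadings, ca):
    """Compute BA per patient.

    params   -- n lists, one per ECG parameter, each holding one value per patient
    loadings -- m lists of n PCA coefficients (beta_ij), assumed precomputed
    ca       -- chronological age per patient
    """
    n_patients = len(ca)
    z = [standardize(col) for col in params]

    # Score of each principal component for each patient.
    scores = [
        [sum(beta * z[j][k] for j, beta in enumerate(row)) for k in range(n_patients)]
        for row in loadings
    ]

    # Step 1: weight each component by its share of R^2 against CA.
    r2 = [pearson_r(s, ca) ** 2 for s in scores]
    weights = [v / sum(r2) for v in r2]
    pre_ba = [sum(w * s[k] for w, s in zip(weights, scores)) for k in range(n_patients)]

    # Step 2: rescale to the age scale and apply the (1 - B) correction.
    b = pearson_r(pre_ba, ca)  # standardized coefficient of pre-BA on CA
    ca_mean, ca_sd = mean(ca), stdev(ca)
    return [pb * ca_sd + ca_mean + (c - ca_mean) * (1 - b) for pb, c in zip(pre_ba, ca)]
```

As a sanity check on the formula, a single ECG parameter that is perfectly linear in CA gives B = 1, so the correction term vanishes and BA equals CA for every patient.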
(2) AgeDiff (gap between BA and CA) and its effect on mortality: AgeDiff was calculated for each patient as the gap between BA and CA (AgeDiff = BA − CA). Patients were categorized by 3-year ranges of AgeDiff (≤0, >0 to ≤3, >3 to ≤6, >6 to ≤9, and >9 years), and the cumulative incidence of total mortality by AgeDiff category was plotted by the Kaplan-Meier method. Additionally, patients were dichotomized into AgeDiff >0 and AgeDiff ≤0, and the hazard ratio of AgeDiff >0 for mortality, with AgeDiff ≤0 as the reference, was evaluated by Cox regression analysis with univariable and multivariable models.
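The categorization and the cumulative incidence estimate described above can be illustrated with a short Python sketch. It is not the SPSS analysis used in the study: the function names and category labels are hypothetical, the estimator is a bare product-limit (Kaplan-Meier) calculation without confidence intervals, and the Cox regression step is not covered.

```python
def agediff_category(age_diff):
    """Map AgeDiff (BA - CA, in years) to the five 3-year categories."""
    if age_diff <= 0:
        return "<=0"
    if age_diff <= 3:
        return ">0 to <=3"
    if age_diff <= 6:
        return ">3 to <=6"
    if age_diff <= 9:
        return ">6 to <=9"
    return ">9"


def km_cumulative_incidence(times, events):
    """Kaplan-Meier cumulative incidence, 1 - S(t), at each death time.

    times  -- follow-up time per patient
    events -- True if the patient died at that time, False if censored
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, 1 - survival))
        at_risk -= leaving
    return curve
```

In use, patients would first be grouped with `agediff_category`, and `km_cumulative_incidence` would then be applied to the follow-up times and death indicators within each group to obtain one curve per category.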