Using Machine Learning to Predict MACEs Risk in Patients with Premature Myocardial Infarction

Jing-xian Wang; Miao-miao Liang; Peng-ju Lu; Zhuang Cui; Yan Liang; Yu-hang Wang; An-ran Jing; Jing Wang; Meng-long Zhang; Yin Liu; Chang-ping Li; Jing Gao

doi:10.31083/RCM31298

Reviews in Cardiovascular Medicine, 2025, Vol. 26, Issue 5: 31298. DOI: 10.31083/RCM31298

Original Research


Abstract

Background:

The study aimed to develop an interpretable machine learning (ML) model to assess and stratify the risk of long-term major adverse cardiovascular events (MACEs) in patients with premature myocardial infarction (PMI) and to analyze the key variables affecting prognosis.

Methods:

This prospective study consecutively included patients (male ≤50 years, female ≤55 years) diagnosed with acute myocardial infarction (AMI) at Tianjin Chest Hospital between January 2017 and December 2022. The study endpoint was the occurrence of MACEs during the follow-up period, defined as a composite of cardiac death, nonfatal stroke, readmission for heart failure, nonfatal recurrent myocardial infarction, and unplanned coronary revascularization. Four models were built: COX proportional hazards (COX) regression, random survival forest (RSF), extreme gradient boosting (XGBoost), and DeepSurv. Models were evaluated using the concordance index (C-index), Brier score, and decision curve analysis to select the best model for prediction and risk stratification.

Results:

A total of 1202 patients with PMI were included, with a median follow-up of 26 months, and MACEs occurred in 200 (16.6%) patients. The RSF model demonstrated the best predictive performance (C-index, 0.815; Brier, 0.125) and could effectively discriminate between high- and low-risk patients. The Kaplan-Meier curve demonstrated that patients categorized as low risk showed a better prognosis (p < 0.0001).

Conclusions:

The prognostic model constructed based on RSF can accurately assess and stratify the risk of long-term MACEs in PMI patients. This can help clinicians make more targeted decisions and treatments, thus delaying and reducing the occurrence of poor prognoses.

Keywords

acute myocardial infarction / premature myocardial infarction / machine learning / major adverse cardiovascular events / prediction model


1. Introduction

In recent years, acute myocardial infarction (AMI), the leading cause of premature death worldwide [1], has increasingly affected younger people, with about 4%–10% of AMI patients reported to be aged ≤40 or ≤45 years [2, 3]. The increase in metabolic risk factors among young people, such as obesity, diabetes, high uric acid, and hypertension, has increased the incidence of premature myocardial infarction (PMI) and major adverse cardiovascular events (MACEs) [4], which seriously impair patients' ability to work and quality of life and place a burden on families and society. Accurate prediction of the risk of long-term MACEs after PMI, followed by early intervention to improve patient prognosis as far as possible, is therefore of utmost importance in clinical management [5, 6, 7, 8].

Machine learning (ML) algorithms provide powerful tools for researchers to learn rules from data and make data-driven outcome predictions by capturing high-dimensional, linear, or non-linear relationships between clinical variables [9]. ML has been used in many medical fields, such as diagnosis, outcome prediction, treatment, and medical image interpretation, and has been reported to outperform established traditional risk stratification tools [10, 11, 12, 13, 14]. For example, a study that applied ML models to the American College of Cardiology Chest Pain-MI registry to predict death after AMI reported an area under the curve (AUC) close to 0.9 for each ML model, with extreme gradient boosting (XGBoost) providing better risk resolution for high-risk individuals [15]. Another ML-based study of adverse event prediction in acute coronary syndrome (apolipoprotein A1/B, ApoA1/B) showed that different machine learning models performed well in predicting all-cause death, myocardial infarction, and major bleeding in acute coronary syndrome (ACS) patients at 1 year after discharge, and that, compared with traditional risk prediction tools, ML algorithms had advantages in predicting MACEs [16].

Compared with elderly myocardial infarction (MI) patients, young MI patients may have a different risk factor spectrum, and PMI patients often have other unique metabolic risk factors [17]. Few studies have examined the risk factors for adverse events after AMI in young adults, and studies using machine learning algorithms to build MACEs prediction models in young MI patients are also limited. Developing machine learning predictive models for these patients to guide early clinical intervention is therefore worthwhile. In summary, the purpose of this study was to use machine learning algorithms to assess and stratify the risk of long-term MACEs in PMI patients, and to analyze key clinical variables affecting the occurrence of MACEs.

2. Materials and Methods

2.1 Study Cohort

The flow of the study is shown in Fig. 1. This is a single-center, prospective, observational cohort study. Consecutive patients admitted to Tianjin Chest Hospital for AMI between January 2017 and December 2022, meeting the PMI age threshold, were included in the PMI cohort.

Inclusion criteria:

(1) Age >18 years, with female age ≤55 years and male age ≤50 years;

(2) Meeting the diagnostic criteria for AMI. The diagnosis of AMI in this study was based on the Fourth Universal Definition of Myocardial Infarction [18], that is, elevation of serum myocardial markers (primarily troponin) above the 99th percentile upper reference limit, accompanied by at least one of the following:

① Typical symptoms of myocardial ischemia (persistent chest pain >30 minutes, not relieved by taking 1–2 nitroglycerin tablets, accompanied by sweating, nausea, vomiting, pallor, and other symptoms);

② New ischemic electrocardiogram (ECG) changes (including increased T wave width, new ST segment and T wave (ST-T) changes, or left bundle branch block) or pathological Q-wave formation on ECG;

③ Imaging evidence of new regional wall motion abnormalities;

④ Coronary angiography confirmed thrombus in the coronary artery.

Coronary angiography (CAG) was performed by two or more cardiologists qualified in coronary diagnosis and treatment at our center.

Exclusion criteria:

(1) Patients with severe liver and/or renal failure;

(2) Patients with congenital heart disease and/or valvular heart disease;

(3) Patients with severe inflammatory diseases and/or malignant tumors;

(4) Patients with missing transthoracic echocardiography and/or other data;

(5) Patients without signed informed consent.

The study followed the Declaration of Helsinki, was approved by the Ethics Committee of Tianjin Chest Hospital (No. 2017KY-007-01), and written informed consent was obtained from all participants.

2.2 Data Collection

An electronic medical record database of PMI patients was established at our center. Data were entered into the EpiData data entry system using double entry by two people, and to ensure data quality, all event diagnoses were further verified through review of the medical records by two cardiologists. The lead researcher, statistician, and other team members jointly reviewed the data to ensure accuracy, completeness, and reliability.

Data collected included general characteristics [sex, age, body mass index (BMI), personal history (smoking and drinking history), previous medical history (diabetes, hypertension, hyperlipidemia, chronic kidney disease (CKD), and stroke history), family history of coronary artery disease (CAD), and type of AMI]; admission vital signs (heart rate, blood pressure, shock index); laboratory tests [blood routine, liver and kidney function, coagulation function, fasting blood glucose, lipids, brain natriuretic peptide (BNP), peak creatine kinase MB (CK-MB), and peak cardiac troponin T (TNT)]; CAG findings (diseased vessels, number of diseased vessels, coronary thrombosis, percutaneous coronary intervention (PCI), complete occlusion, and Syntax score); and transthoracic echocardiography (TTE) parameters (left atrial diameter, left ventricular diameter, left ventricular ejection fraction). Peak CK-MB and TNT levels were recorded, and the remaining laboratory parameters were measured after an overnight fast (≥8 hours) on the day of admission. The Syntax score [19] was used to assess the severity of CAD and to assist with risk stratification and the choice of revascularization strategy. It was calculated using online software version 2.28 (https://syntaxscore.org/). Killip class ≥II was used as the cutoff value because it indicates clear evidence of heart failure (e.g., pulmonary rales or elevated jugular venous pressure); compared with Killip I patients, these patients have significantly worse prognoses and receive greater attention in clinical management and interventions. In addition, we documented patients' medication during hospitalization, including antiplatelet drugs, statins, diuretics, angiotensin-converting enzyme inhibitors (ACEIs), angiotensin receptor blockers (ARBs), and beta blockers.

2.3 Study Endpoint

The endpoint of the study was the occurrence of MACEs during follow-up, including cardiac death, nonfatal stroke, readmission for heart failure, nonfatal recurrent myocardial infarction, and unplanned coronary revascularization. All patients were followed up after discharge by a trained specialist on an outpatient basis or by telephone to record the occurrence of MACEs in PMI patients during the follow-up period. Cardiac death was mainly caused by sudden cardiac death, acute congestive heart failure, acute myocardial infarction, severe arrhythmia, and other structural/functional heart disease. Stroke is defined based on imaging findings or typical symptoms. According to the guidelines of the European Society of Cardiology, the diagnosis of heart failure refers to the ventricular filling and/or ejection function impairment caused by various cardiac structural or functional diseases, and the cardiac output cannot meet the metabolic needs of the body tissues, resulting in clinical manifestations such as dyspnea, limited physical activity, and fluid retention. AMI was diagnosed comprehensively based on the results of chest pain, myocardial enzyme pattern changes, and electrocardiogram [18]. Unplanned coronary revascularization is defined as revascularization driven by ischemic symptoms or any pathological event, including unplanned PCI and coronary artery bypass grafting (CABG).

2.4 Model Construction and Evaluation

2.4.1 Data Preprocessing

The study initially included 75 clinical variables. Variables with a missing rate of more than 10% were removed, variables with a missing rate of less than 10% were filled in by multiple imputation, and 70 clinical variables were finally included. Multiple imputation was performed using the R 4.4.1 software (R Core Team, Auckland, New Zealand) (mice package). The number of imputations was set to 5 (m = 5), with a maximum of 10 iterations (maxit = 10). The predictive mean matching (PMM) method was used to impute missing values. To ensure reproducibility, a random seed (seed = 123) was set during the imputation process. Because the value ranges of the variables differ widely and some of the algorithms used require standardized inputs, the data were normalized using Z-scores.
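The two deterministic preprocessing steps described above (filtering on missing rate and Z-score normalization) can be sketched in a few lines. This is a minimal illustration in Python rather than the R workflow used in the study; the function names, the toy values, and the choice to keep a variable at exactly 10% missing are assumptions, and the PMM imputation step performed with mice is deliberately not reproduced.

```python
from statistics import mean, stdev

def missing_rate(column):
    """Fraction of entries in a column that are missing (None)."""
    return sum(v is None for v in column) / len(column)

def drop_sparse_columns(table, max_missing=0.10):
    """Keep columns whose missing rate does not exceed max_missing.

    Assumption: a column at exactly 10% missing is kept; the text only
    specifies 'more than 10%' (removed) and 'less than 10%' (imputed).
    """
    return {name: col for name, col in table.items()
            if missing_rate(col) <= max_missing}

def z_score(column):
    """Standardize observed values to mean 0 and sample SD 1; None stays None."""
    observed = [v for v in column if v is not None]
    mu, sd = mean(observed), stdev(observed)
    return [None if v is None else (v - mu) / sd for v in column]

# Hypothetical toy table, not study data.
toy = {
    "bmi": [22.1, 24.5, 27.3, 30.2, 25.0, 23.4, 28.8, 26.1, 21.9, 29.5],
    "heart_rate": [70, 82, None, 95, 66, 74, 88, 79, 91, 77],        # 10% missing
    "marker_x": [1.2, None, 0.8, None, 1.5, 1.1, 0.9, 1.3, 1.0, 1.4],  # 20% missing
}
kept = drop_sparse_columns(toy)
standardized = {name: z_score(col) for name, col in kept.items()}
print(sorted(standardized))
```

In a full pipeline the remaining missing values would be imputed before standardization; the sketch simply carries them through unchanged.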

2.4.2 Variables Screening

Univariate COX proportional hazards (COX) regression analysis was used for preliminary screening of all clinical variables in the training set, and the variance inflation factor (VIF) was used to test for multicollinearity among the screened clinical variables. In this study, variables with VIF >5 were deleted. A VIF of 5 is a commonly accepted indicator of moderate multicollinearity, and this threshold was selected to strike a balance between retaining enough variables and reducing the impact of multicollinearity. To avoid overfitting, we used the least absolute shrinkage and selection operator (LASSO) to filter the variables. LASSO regression applies L1 regularization to the coefficients, shrinking the coefficients of unimportant or redundant variables to zero and thereby reducing model complexity and the risk of overfitting. For LASSO regression, the COX proportional hazards model (family = ‘cox’) was used to identify important predictors. The optimal regularization parameter was selected using the cv.glmnet function from the glmnet package in R, with 10-fold cross-validation. The maximum number of iterations (maxit) was set to 1000, and a fixed random seed (seed = 1234) was applied to ensure reproducibility. Random survival forest (RSF) was also used to screen clinical variables, selecting the top 15 variables in order of importance. After screening with LASSO and RSF, the intersection of the two variable sets was taken as the set of predictors for modeling. The relationship between the selected variables and the outcome was analyzed using restricted cubic splines (RCS).
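To make the VIF criterion concrete: for predictor j, VIF_j = 1 / (1 − R_j²), where R_j² is obtained by regressing predictor j on the other predictors. The sketch below is a simplified illustration in Python restricted to the two-predictor case, in which R_j² reduces to the squared Pearson correlation. The function names and the toy values are hypothetical and do not come from the study data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x, y):
    """VIF of either predictor when the model contains only x and y.

    With a single other predictor, R^2 equals the squared correlation,
    so VIF = 1 / (1 - r^2).
    """
    r_squared = pearson_r(x, y) ** 2
    return 1.0 / (1.0 - r_squared)

# Hypothetical, strongly correlated lipid-like measurements (mmol/L).
total_chol = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5]
ldl_chol = [2.1, 2.4, 3.0, 3.2, 3.9, 4.1]
vif = vif_two_predictors(total_chol, ldl_chol)
print(f"VIF = {vif:.1f}; exceeds threshold of 5: {vif > 5}")
```

With more than two predictors, R_j² must come from a multiple regression, which is what standard VIF routines compute.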

2.4.3 Model Development

In this study, four ML models were developed to predict the risk of long-term MACEs in PMI patients: COX regression, RSF, extreme gradient boosting (XGBoost), and DeepSurv. RSF and XGBoost are decision tree-based ensemble models for classification and regression problems, both of which can efficiently handle high-dimensional datasets with millions of rows and columns. DeepSurv uses deep learning techniques to process survival data, capturing complex non-linear relationships and interaction effects. This makes it more effective than traditional survival analysis methods when dealing with high-dimensional data and complex risk patterns.

The 1202 patients were divided into a training set and a testing set at a ratio of 3:1 by random sampling stratified on whether the endpoint occurred. The hyperparameters of the ML models were optimized using grid search with 5-fold cross-validation. We used the “surv.coxph”, “surv.rfsrc”, “surv.xgboost.cox”, and “surv.deepsurv” learners from the “mlr3extralearners” package to construct the COX proportional hazards, RSF, XGBoost, and DeepSurv models, respectively. The final selected hyperparameters are provided in Supplementary Table 1 for reference.
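A stratified 3:1 split of this kind can be sketched as follows. This is an illustrative Python version, not the R code used in the study; the function name, the seed, and the rounding rule are assumptions. The event counts (200 events among 1202 patients) are taken from the text, while the resulting set sizes are a property of this sketch and are not reported in the paper.

```python
import random

def stratified_split(events, train_fraction=0.75, seed=0):
    """Split indices into train/test, sampling each event stratum separately.

    events: sequence of 0/1 flags (1 = MACE occurred).
    Returns (train_indices, test_indices), both sorted.
    """
    rng = random.Random(seed)  # seed value is an arbitrary assumption
    train, test = [], []
    for label in sorted(set(events)):
        idx = [i for i, e in enumerate(events) if e == label]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)

# 200 patients with MACEs and 1002 without, as reported in the study.
events = [1] * 200 + [0] * 1002
train_idx, test_idx = stratified_split(events)
train_rate = sum(events[i] for i in train_idx) / len(train_idx)
test_rate = sum(events[i] for i in test_idx) / len(test_idx)
print(f"event rate: train {train_rate:.3f}, test {test_rate:.3f}")
```

Sampling within each stratum keeps the event rate nearly identical in both sets, which matters when only about one patient in six experiences the endpoint.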

2.4.4 Model Performance Evaluation

The concordance index (C-index) and time-dependent AUC were used to evaluate the discrimination of the model, that is, its ability to distinguish patients who develop MACEs from those who do not. Discrimination is an important indicator for evaluating prediction models, especially when screening high-risk populations. Model calibration was evaluated using the Brier score, which quantifies the agreement between predicted probabilities and observed outcomes. If the model’s predicted probability is close to the frequency of actual events, the Brier score will be low, indicating that the model is well calibrated. The clinical net benefit of the models was evaluated using decision curve analysis (DCA). Finally, the best-performing of the four models was selected for the prediction and risk stratification of PMI patients. Using the risk score corresponding to the maximum Youden index of the optimal model as the cut-off value, PMI patients were divided into a high-risk group and a low-risk group, and the Log-rank test was then used to evaluate whether the Kaplan-Meier curves differed between the two groups. To visualize the results of the RSF model, a risk calculator for long-term MACEs in PMI patients was developed using the “shiny” package. The SHapley Additive exPlanations (SHAP) values of individual samples were calculated using the “survex” package. SHAP explains the prediction for an individual instance by quantifying the contribution of each feature to that prediction.
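The two headline metrics can be illustrated with a short Python sketch. It assumes Harrell's definition of the C-index and, for the Brier score, a simplified version at a fixed time horizon in which every patient's status is known. Survival packages typically weight the Brier score to account for censoring, which is not done here. The function names and toy values are hypothetical.

```python
def harrell_c_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i had an observed event and
    a shorter follow-up time than subject j. Tied risks count as 0.5.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome.

    Simplification: ignores censoring, so it applies only when the status
    of every subject at the chosen horizon is known.
    """
    pairs = list(zip(predicted_probs, outcomes))
    return sum((p - y) ** 2 for p, y in pairs) / len(pairs)

# Hypothetical follow-up times (months), event flags, and predicted risks.
times = [5, 10, 15, 20]
events = [1, 0, 1, 0]
risks = [0.9, 0.4, 0.6, 0.1]
print(harrell_c_index(times, events, risks))
print(brier_score([0.1, 0.9], [0, 1]))
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, while a lower Brier score indicates predicted probabilities closer to what was observed.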

2.4.5 Statistical Analysis

All analyses and calculations were performed using R 4.4.1 and SPSS 26.0 (IBM Corp., Armonk, NY, USA). Normally distributed continuous data were expressed as mean ± standard deviation (SD) and compared between two groups using the independent-samples Student's t-test; continuous data with a skewed distribution were expressed as median and quartiles [M (Q1, Q3)] and compared between two groups using the Mann-Whitney U test. Categorical data were expressed as frequency and percentage (n, %) and compared between two groups using the Chi-square test or Fisher's exact test (when the theoretical frequency was <1 or the number of cases was <40). All p-values were two-sided, and values below 0.05 were considered statistically significant.

3. Results

3.1 Baseline Characteristics

A total of 1202 patients were enrolled, of whom 1094 (91.0%) were males and 108 (9.0%) were females, and the median age of all patients was 42 (37, 44) years. The median follow-up period was 26 months, ending in June 2023. During the follow-up, a total of 200 patients (16.6%) developed MACEs, including 19 cases of all-cause deaths (9.5%), 8 cases of non-fatal strokes (4.0%), 35 cases of readmissions due to heart failure (17.5%), 75 cases of non-fatal recurrent myocardial infarction (37.5%) and 63 cases of unplanned coronary revascularization (31.5%). Table 1 shows the baseline characteristics and results of 34 clinical variables after univariate COX regression screening. All baseline characteristics of the patients are shown in Supplementary Table 2.
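The percentages in this paragraph use two different denominators: sex distribution and overall MACEs incidence are relative to all 1202 patients, whereas the component percentages are relative to the 200 patients with MACEs. A short Python check of the reported figures makes this explicit. The dictionary keys are shorthand labels introduced here; the counts are those stated in the text.

```python
# Counts as reported in the text; the labels are shorthand for this check.
total_patients = 1202
components = {
    "death": 19,
    "nonfatal_stroke": 8,
    "hf_readmission": 35,
    "recurrent_mi": 75,
    "unplanned_revascularization": 63,
}

total_events = sum(components.values())
incidence_pct = round(100 * total_events / total_patients, 1)

# Component shares are expressed relative to the patients with MACEs.
share_pct = {name: round(100 * count / total_events, 1)
             for name, count in components.items()}

male_pct = round(100 * 1094 / total_patients, 1)
female_pct = round(100 * 108 / total_patients, 1)

print(total_events, incidence_pct, share_pct, male_pct, female_pct)
```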

3.2 Variables Screening

Multicollinearity analysis of the 34 significant variables in the baseline table showed that the VIFs of white blood cell count (WBC), absolute neutrophil count (ANC), total cholesterol (TC), and low-density lipoprotein cholesterol (LDL-C) were >5. After removing these variables, the remaining 30 variables were further screened.

The LASSO coefficient path diagram shows how the coefficient of each variable changes under different regularization strengths (Fig. 2A), and the cross-validation diagram (Fig. 2B) shows the performance of the model at different Log Lambda values. Two lambda values were reported for LASSO regression: lambda.min = 0.007193123 and lambda.1se = 0.06112381. We chose lambda.min because it offers the best predictive performance, even though it retains more variables and results in a slightly more complex model. A total of 19 variables were selected by LASSO, namely diabetes, Killip ≥II, cardiogenic shock, intra-aortic balloon pump (IABP), left anterior descending coronary artery (LAD), PCI therapy, three diseased vessels, diuretics, BMI, heart rate, glycated hemoglobin (HbA1c), C-reactive protein (CRP), uric acid (UA), ApoB, free fatty acid (FFA), fibrinogen (FIB), CK-MB, BNP, and left ventricular ejection fraction (LVEF). RSF selected the top 15 variables by importance, which were FFA, cardiogenic shock, creatinine (Cr), HbA1c, urea, diuretics, LVEF, BMI, BNP, UA, ventilator, Syntax score, ApoB, Killip ≥II, and D-dimer (Fig. 2C).

Finally, the top 15 variables ranked by RSF feature importance and the 19 variables selected by LASSO were intersected to obtain 10 variables for modeling (Fig. 2D). The 10 variables were BMI, ApoB, FFA, UA, HbA1c, BNP, LVEF, cardiogenic shock, diuretics, and Killip ≥II.
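The intersection step can be reproduced directly from the two variable lists reported above. The Python sketch below uses the variable names as given in the text, with "cardiogenic shock" as a single label for the shock variable; the identifiers are shorthand introduced here.

```python
# Variables selected by LASSO (19) and by RSF importance (top 15), as listed in the text.
lasso_selected = {
    "diabetes", "Killip>=II", "cardiogenic shock", "IABP", "LAD",
    "PCI therapy", "three diseased vessels", "diuretics", "BMI",
    "heart rate", "HbA1c", "CRP", "UA", "ApoB", "FFA", "FIB",
    "CK-MB", "BNP", "LVEF",
}
rsf_top15 = {
    "FFA", "cardiogenic shock", "Cr", "HbA1c", "urea", "diuretics",
    "LVEF", "BMI", "BNP", "UA", "ventilator", "Syntax", "ApoB",
    "Killip>=II", "D-dimer",
}

# Predictors retained for modeling: selected by both methods.
final_predictors = lasso_selected & rsf_top15
print(len(final_predictors), sorted(final_predictors))
```

Taking the intersection is a conservative choice: a variable is kept only if both a penalized linear method and a tree-based importance ranking support it.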

3.3 RCS Explores the Relationship between Independent Variables and MACEs

The RCS plot shows how an independent variable affects the hazard ratio (HR), and thus the occurrence of MACEs, across different value intervals. In this study, RCS analysis was carried out on the continuous variables among the 10 selected variables (Fig. 3). The results showed that BMI and UA each had a roughly J-shaped relationship with MACEs: the estimated MACEs risk was lowest at a BMI of 23.669 kg/m² and increased with BMI above this value; likewise, the estimated MACEs risk was lowest at a UA of 314.087 µmol/L and increased with UA above this value. The relationships between the remaining variables (ApoB, FFA, HbA1c, BNP, and LVEF) and MACEs were roughly linear.

3.4 Model Development and Performance Evaluation

The performance of the four models was comprehensively evaluated using several metrics, including discrimination (AUC and C-index), calibration (Brier score), and clinical utility (DCA). First, the discrimination of the four models was evaluated by AUC and C-index (Fig. 4A). The RSF model consistently outperformed the others across all time points (AUC at 12 months: 0.891; 24 months: 0.858; 36 months: 0.827). These high AUC values highlight the RSF model’s excellent ability to identify high-risk individuals. Additionally, its C-index of 0.815 further confirms strong predictive reliability, with values above 0.8 considered very good for risk stratification. Second, the RSF model achieved an average Brier score of 0.125, which was superior to the other models (Table 2). A lower Brier score indicates better overall performance, as it reflects both the accuracy of the predicted probabilities and their alignment with actual outcomes. Last, DCA demonstrated that the RSF model provided the highest net benefit across a range of threshold probabilities at 12 months, 24 months, and 36 months, outperforming the XGBoost, COX regression, and DeepSurv models (Fig. 4B). Particularly in the lower threshold probability range, where identifying high-risk individuals is essential for early intervention, the RSF model exhibited significant advantages. This underscores its clinical utility and potential to guide personalized treatment strategies. The baseline characteristics of the training and testing sets used for the RSF model are shown in Supplementary Table 3. Statistical analysis revealed no significant differences in variable distributions between the two datasets, indicating that the training and testing sets were balanced. In conclusion, the RSF model was chosen as the primary tool for risk prediction in this study due to its superior discrimination, calibration, and clinical utility.
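The net benefit plotted in a decision curve is computed, at threshold probability p_t, as TP/n − (FP/n) × p_t / (1 − p_t). A minimal Python sketch of this calculation is given below, together with the "treat all" reference strategy. It assumes a binary outcome known for every patient at the chosen time point, so censoring is ignored. The function names and the toy predictions are hypothetical.

```python
def net_benefit(predicted_probs, outcomes, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(predicted_probs)
    pairs = list(zip(predicted_probs, outcomes))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)

def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical predicted risks and observed outcomes.
probs = [0.8, 0.6, 0.3, 0.1]
outcomes = [1, 0, 1, 0]
for pt in (0.2, 0.5):
    print(pt, net_benefit(probs, outcomes, pt), net_benefit_treat_all(outcomes, pt))
```

A model is clinically useful at a given threshold when its net benefit exceeds both the "treat all" value and zero, the latter being the net benefit of treating no one.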

3.5 Risk Stratification Based on the RSF Model

The RSF model was used to predict and stratify the risk of MACEs in PMI patients. Taking the risk score corresponding to the maximal Youden’s index (24.90) as the optimal cut-off value, patients were divided into a high-risk group and a low-risk group. As shown in the Kaplan-Meier curves (Fig. 5), the incidence of MACEs was higher in high-risk patients in both the training set and the testing set (Log-rank test, p < 0.0001). Special attention therefore needs to be paid to the management and intervention of patients in the high-risk group in clinical practice.
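Youden's index is J = sensitivity + specificity − 1, and the cut-off is the score at which J is largest. The Python sketch below shows the search over candidate cut-offs under the assumption that each patient has a binary event status and that "high risk" means a score at or above the cut-off. The function name and toy scores are hypothetical and do not reproduce the 24.90 cut-off reported in the study.

```python
def youden_cutoff(scores, outcomes):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A patient is classified as high risk when score >= cutoff.
    If several cut-offs tie, the smallest one is returned.
    """
    pairs = list(zip(scores, outcomes))
    positives = sum(outcomes)
    negatives = len(outcomes) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in pairs if s >= cutoff and y == 1)
        true_neg = sum(1 for s, y in pairs if s < cutoff and y == 0)
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j

# Hypothetical risk scores and event flags.
scores = [10, 15, 20, 25, 30, 35]
outcomes = [0, 0, 0, 1, 0, 1]
cutoff, j = youden_cutoff(scores, outcomes)
groups = ["high" if s >= cutoff else "low" for s in scores]
print(cutoff, j, groups)
```

The resulting two groups are what a Kaplan-Meier plot and Log-rank test would then compare.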

3.6 Importance Ranking of Variables and Forest Map

Fig. 6A shows the 10 clinical variables in the RSF model, ranked in order of importance, namely FFA, cardiogenic shock, HbA1c, ApoB, diuretics, LVEF, BNP, BMI, Killip ≥II, and UA. The bar chart on the left shows the relative importance of each variable. The forest plot on the right shows the association between each variable and the risk of MACEs. In Fig. 6B, the temporal contributions of individual variables to the survival prediction for a single patient are depicted using SurvSHAP(t) values. Among all variables, FFA exhibits the highest influence on the survival prediction, with a consistent upward trend over time, reaching its peak contribution at approximately 60 months. In contrast, BNP and BMI show moderate but stable contributions throughout the timeline. Variables such as Cardiac_Shock, Diuretics, and HbA1c demonstrate smaller contributions with relatively flat or minimal temporal variations. The results highlight the dominant role of FFA in the survival prediction of the RSF model for this individual.
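The idea behind SHAP values can be illustrated with an exact Shapley computation on a toy model. The Python sketch below averages each feature's marginal contribution over all feature orderings, replacing features not yet "switched on" with baseline values. It is a scalar, time-independent simplification and not the SurvSHAP(t) procedure of the survex package. The toy risk function, its coefficients, and the feature values are hypothetical.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Exact Shapley values via marginal contributions over all orderings.

    predict:  function mapping a dict of feature values to a number.
    instance: feature values of the patient being explained.
    baseline: reference feature values (for example, cohort means).
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]   # switch this feature on
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_risk(x):
    """Hypothetical risk score with one interaction term; not the RSF model."""
    return 2.0 * x["ffa"] + 1.0 * x["hba1c"] + 0.5 * x["ffa"] * x["hba1c"] - 0.1 * x["lvef"]

patient = {"ffa": 1.0, "hba1c": 2.0, "lvef": 5.0}
reference = {"ffa": 0.0, "hba1c": 0.0, "lvef": 0.0}
contributions = shapley_values(toy_risk, patient, reference)
print(contributions)
```

The contributions sum to the difference between the patient's prediction and the baseline prediction, and the interaction term is shared equally between the two features involved. Exact enumeration is feasible only for a handful of features, which is why practical tools rely on sampling or model-specific approximations.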

3.7 Model Visualization

To facilitate the use of the prognostic model in clinical management, we developed a risk calculator based on the Shiny package. The left side of the page (Fig. 6C) allows the user to enter each clinical characteristic, and the right side of the page displays the predicted probability of long-term MACEs and the risk stratum for the PMI patient based on the information entered.

4. Discussion

This study developed and validated an interpretable ML risk prediction model for predicting the risk of long-term MACEs in PMI patients and analyzed clinical variables that influence the development of MACEs. A comprehensive evaluation of discrimination, calibration, and clinical utility showed that the RSF model performed best. Using the risk score (24.90) calculated by the RSF model as the cut-off value, patients were divided into a high-risk group and a low-risk group, and the Kaplan-Meier survival curves differed significantly between the two groups (p < 0.0001). The ten clinical variables, ranked by feature importance, were FFA, cardiogenic shock, HbA1c, ApoB, diuretics, LVEF, BNP, BMI, Killip ≥II, and UA. By calculating the risk of MACEs with a risk calculator and explaining individual risk sources and possible intervention directions through SHAP values, we hope that personalized and transparent clinical management can be achieved.

The results of this study highlight the superior performance of the RSF model in predicting the risk of MACEs in AMI patients, as evidenced by its discrimination, reliable calibration, and robust clinical utility. Compared with traditional models such as the COX proportional hazards model, RSF overcomes key limitations by capturing complex nonlinear relationships and high-order interactions without relying on the proportional hazards assumption. This adaptability is particularly valuable in real-world clinical scenarios, where these assumptions are often violated [20, 21]. Although XGBoost is a powerful machine learning algorithm, its application to survival data often requires additional modifications, such as implementing COX loss functions, which may introduce constraints, and it also demands extensive parameter tuning [22]. Similarly, DeepSurv, as a deep learning method, requires large-scale datasets to perform optimally and is prone to overfitting with limited data [23]. In contrast, RSF natively supports survival analysis, offering seamless integration, robust performance even with moderate sample sizes, and higher predictive accuracy without the need for extensive modifications or tuning, making it more practical for real-world clinical applications.

Insulin resistance (IR), elevated HbA1c, and metabolic abnormalities such as abnormal BMI, dyslipidemia, and hyperuricemia are critical contributors to myocardial injury and increased MACEs risk, particularly in young PMI patients. IR leads to myocardial damage through impaired diastole, altered glucose utilization, and microvascular dysfunction, while metabolic abnormalities like elevated BMI and dysregulated lipid metabolism trigger inflammation and thrombosis through the release and accumulation of fat metabolites [24, 25]. The J-shaped relationship between BMI and MACEs observed in this study using RCS is consistent with the findings of some other studies [26, 27]; for example, the study by Flegal et al. [28] showed that overweight individuals have a lower risk than normal-weight individuals. This study’s findings highlight the strong association of FFA with adverse cardiovascular events and suggest that FFA should be assessed alongside traditional risk factors to identify high-risk individuals requiring closer monitoring and intervention. Dyslipidemia, particularly elevated ApoB-containing lipoproteins such as LDL-C, lipoprotein(a), and triglyceride-rich lipoproteins, significantly contributes to MACEs risk. While LDL-C remains a primary target for lipid management, this study’s RCS analysis aligns with prior research showing a linear relationship between higher ApoB levels and increased MACEs risk, even in patients on high-intensity statin therapy [29, 30, 31]. These findings suggest that for younger PMI patients, early and aggressive management of ApoB levels may be crucial in reducing cardiovascular risk. Elevated uric acid levels also emerged as an important predictor of MACEs. Through mechanisms such as oxidative stress, endothelial dysfunction, and inflammation, uric acid exacerbates insulin resistance and promotes atherosclerosis [32, 33, 34, 35]. Managing uric acid levels may disrupt this pathological cycle, offering an additional avenue for intervention in younger patients.

Cardiac function plays a pivotal role in determining MACEs risk. Variables in the RSF model, such as elevated BNP, reduced LVEF, Killip ≥II, cardiogenic shock, and in-hospital diuretic use, reflect poor cardiac function during hospitalization. This study’s RCS analysis showed a linear relationship between decreasing LVEF and increasing MACEs risk, consistent with previous research linking reduced ejection fraction with poorer outcomes in PCI-treated patients [36, 37]. These findings emphasize the importance of targeted cardiac rehabilitation and monitoring strategies for PMI patients with compromised cardiac function.

A comprehensive strategy is essential for younger PMI patients to reduce MACEs risk, combining advanced predictive tools and tailored management interventions. Aggressive control of ApoB, FFAs, and uric acid levels is crucial to address inflammation, thrombosis, and oxidative stress, while individualized BMI management mitigates the J-shaped risk relationship observed with MACEs. Targeted cardiac rehabilitation and monitoring of BNP and LVEF further enhance outcomes in patients with compromised cardiac function. The RSF model demonstrated its strength by integrating these multifactorial risks into a comprehensive predictive framework. With the addition of SHAP values, the model provides individual-level explanations, helping clinicians identify key contributing factors for each patient’s risk. Combined with a personalized risk calculator, these tools enable dynamic and patient-specific intervention strategies targeting modifiable risk factors such as IR, dyslipidemia, hyperuricemia, and cardiac dysfunction. This approach supports more effective prevention and treatment, ultimately improving long-term outcomes and reducing MACEs incidence in young PMI patients.

This study has several limitations. Most importantly, it lacks external validation with independent cohorts, which is essential for confirming the generalizability and robustness of the algorithm. In future research, we will incorporate patients from diverse regions and hospitals to perform external validation, ensuring broader applicability across different populations. Additionally, this study primarily focuses on clinical characteristics, missing key factors such as lifestyle, dietary habits, and multi-omics markers. Expanding these variables in future studies could provide a more comprehensive understanding of risk factors and enhance the predictive accuracy of the model.

5. Conclusions

The RSF-based risk stratification tool demonstrated excellent performance, proving its capability to accurately predict MACEs risk in PMI patients. The model identified critical predictors such as FFA, cardiogenic shock, HbA1c, ApoB, diuretic use, LVEF, BNP, BMI, Killip ≥ II, and UA, highlighting the multifactorial complexity of MACEs risk. Enhanced by SHAP values and a risk calculator, the RSF model provides a personalized framework to identify high-risk patients, pinpoint key risk factors, and guide targeted interventions. This approach enables early management of modifiable risks, improving outcomes and reducing MACEs in PMI patients.

References

[1]	Tweet MS. Sex differences among young individuals with myocardial infarction. European Heart Journal. 2020; 41: 4138–4140. https://doi.org/10.1093/eurheartj/ehaa682.

[2]	Arora S, Stouffer GA, Kucharska-Newton AM, Qamar A, Vaduganathan M, Pandey A, et al. Twenty Year Trends and Sex Differences in Young Adults Hospitalized With Acute Myocardial Infarction. Circulation. 2019; 139: 1047–1056. https://doi.org/10.1161/CIRCULATIONAHA.118.037137.

[3]	Jortveit J, Pripp AH, Langørgen J, Halvorsen S. Incidence, risk factors and outcome of young patients with myocardial infarction. Heart (British Cardiac Society). 2020; 106: 1420–1426. https://doi.org/10.1136/heartjnl-2019-316067.

[4]	Correction to: Cardiovascular-Kidney-Metabolic Health: A Presidential Advisory From the American Heart Association. Circulation. 2024; 149: e1023. https://doi.org/10.1161/CIR.0000000000001241.

[5]	Helgason H, Eiriksdottir T, Ulfarsson MO, Choudhary A, Lund SH, Ivarsdottir EV, et al. Evaluation of Large-Scale Proteomics for Prediction of Cardiovascular Events. JAMA. 2023; 330: 725–735. https://doi.org/10.1001/jama.2023.13258.

[6]	Kuno T, Kiyohara Y, Maehara A, Ueyama HA, Kampaktsis PN, Takagi H, et al. Comparison of Intravascular Imaging, Functional, or Angiographically Guided Coronary Intervention. Journal of the American College of Cardiology. 2023; 82: 2167–2176. https://doi.org/10.1016/j.jacc.2023.09.823.

[7]	Georgiopoulos G, Kraler S, Mueller-Hennessen M, Delialis D, Mavraganis G, Sopova K, et al. Modification of the GRACE Risk Score for Risk Prediction in Patients With Acute Coronary Syndromes. JAMA Cardiology. 2023; 8: 946–956. https://doi.org/10.1001/jamacardio.2023.2741.

[8]	Szarek M, Reijnders E, Jukema JW, Bhatt DL, Bittner VA, Diaz R, et al. Relating Lipoprotein(a) Concentrations to Cardiovascular Event Risk After Acute Coronary Syndrome: A Comparison of 3 Tests. Circulation. 2024; 149: 192–203. https://doi.org/10.1161/CIRCULATIONAHA.123.066398.

[9]	Schwalbe N, Wahl B. Artificial intelligence and the future of global health. Lancet (London, England). 2020; 395: 1579–1586. https://doi.org/10.1016/S0140-6736(20)30226-9.

[10]	Haug CJ, Drazen JM. Artificial Intelligence and Machine Learning in Clinical Medicine, 2023. The New England Journal of Medicine. 2023; 388: 1201–1208. https://doi.org/10.1056/NEJMra2302038.

[11]	Naik K, Goyal RK, Foschini L, Chak CW, Thielscher C, Zhu H, et al. Current Status and Future Directions: The Application of Artificial Intelligence/Machine Learning for Precision Medicine. Clinical Pharmacology and Therapeutics. 2024; 115: 673–686. https://doi.org/10.1002/cpt.3152.

[12]	Theodosiou AA, Read RC. Artificial intelligence, machine learning and deep learning: Potential resources for the infection clinician. The Journal of Infection. 2023; 87: 287–294. https://doi.org/10.1016/j.jinf.2023.07.006.

[13]	Krishnan R, Rajpurkar P, Topol EJ. Self-supervised learning in medicine and healthcare. Nature Biomedical Engineering. 2022; 6: 1346–1352. https://doi.org/10.1038/s41551-022-00914-1.

[14]	Emakhu J, Monplaisir L, Aguwa C, Arslanturk S, Masoud S, Nassereddine H, et al. Acute coronary syndrome prediction in emergency care: A machine learning approach. Computer Methods and Programs in Biomedicine. 2022; 225: 107080. https://doi.org/10.1016/j.cmpb.2022.107080.

[15]	Khera R, Haimovich J, Hurley NC, McNamara R, Spertus JA, Desai N, et al. Use of Machine Learning Models to Predict Death After Acute Myocardial Infarction. JAMA Cardiology. 2021; 6: 633–641. https://doi.org/10.1001/jamacardio.2021.0122.

[16]	D’Ascenzo F, De Filippo O, Gallone G, Mittone G, Deriu MA, Iannaccone M, et al. Machine learning-based prediction of adverse events following an acute coronary syndrome (PRAISE): a modelling study of pooled datasets. Lancet (London, England). 2021; 397: 199–207. https://doi.org/10.1016/S0140-6736(20)32519-8.

[17]	Andersson C, Vasan RS. Epidemiology of cardiovascular disease in young individuals. Nature Reviews. Cardiology. 2018; 15: 230–240. https://doi.org/10.1038/nrcardio.2017.154.

[18]	Thygesen K, Alpert JS, Jaffe AS, Chaitman BR, Bax JJ, Morrow DA, et al. Fourth Universal Definition of Myocardial Infarction (2018). Journal of the American College of Cardiology. 2018; 72: 2231–2264. https://doi.org/10.1016/j.jacc.2018.08.1038.

[19]	Thuijs DJFM, Kappetein AP, Serruys PW, Mohr FW, Morice MC, Mack MJ, et al. Percutaneous coronary intervention versus coronary artery bypass grafting in patients with three-vessel or left main coronary artery disease: 10-year follow-up of the multicentre randomised controlled SYNTAX trial. Lancet (London, England). 2019; 394: 1325–1334. https://doi.org/10.1016/S0140-6736(19)31997-X. [published erratum in Lancet. 2020; 395: 870. https://doi.org/10.1016/S0140-6736(20)30249-X]

[20]	Tian D, Yan HJ, Huang H, Zuo YJ, Liu MZ, Zhao J, et al. Machine Learning-Based Prognostic Model for Patients After Lung Transplantation. JAMA Network Open. 2023; 6: e2312022. https://doi.org/10.1001/jamanetworkopen.2023.12022.

[21]	Lin J, Yin M, Liu L, Gao J, Yu C, Liu X, et al. The Development of a Prediction Model Based on Random Survival Forest for the Postoperative Prognosis of Pancreatic Cancer: A SEER-Based Study. Cancers. 2022; 14: 4667. https://doi.org/10.3390/cancers14194667.

[22]	Shin H. XGBoost Regression of the Most Significant Photoplethysmogram Features for Assessing Vascular Aging. IEEE Journal of Biomedical and Health Informatics. 2022; 26: 3354–3361. https://doi.org/10.1109/JBHI.2022.3151091.

[23]	Mousavi SM, Beroza GC. Deep-learning seismology. Science (New York, N.Y.). 2022; 377: eabm4470. https://doi.org/10.1126/science.abm4470.

[24]	Lee SH, Park SY, Choi CS. Insulin Resistance: From Mechanisms to Therapeutic Strategies. Diabetes & Metabolism Journal. 2022; 46: 15–37. https://doi.org/10.4093/dmj.2021.0280.

[25]	Yuan D, Xu N, Song Y, Zhang Z, Xu J, Liu Z, et al. Association Between Free Fatty Acids and Cardiometabolic Risk in Coronary Artery Disease: Results From the PROMISE Study. The Journal of Clinical Endocrinology and Metabolism. 2023; 109: 125–134. https://doi.org/10.1210/clinem/dgad416.

[26]	Aune D, Sen A, Prasad M, Norat T, Janszky I, Tonstad S, et al. BMI and all cause mortality: systematic review and non-linear dose-response meta-analysis of 230 cohort studies with 3.74 million deaths among 30.3 million participants. BMJ (Clinical Research Ed.). 2016; 353: i2156. https://doi.org/10.1136/bmj.i2156.

[27]	Global BMI Mortality Collaboration, Di Angelantonio E, Bhupathiraju S, Wormser D, Gao P, Kaptoge S, et al. Body-mass index and all-cause mortality: individual-participant-data meta-analysis of 239 prospective studies in four continents. Lancet (London, England). 2016; 388: 776–786. https://doi.org/10.1016/S0140-6736(16)30175-1.

[28]	Flegal KM, Kit BK, Orpana H, Graubard BI. Association of all-cause mortality with overweight and obesity using standard body mass index categories: a systematic review and meta-analysis. JAMA. 2013; 309: 71–82. https://doi.org/10.1001/jama.2012.113905.

[29]	Nicholls SJ, Nissen SE, Fleming C, Urva S, Suico J, Berg PH, et al. Muvalaplin, an Oral Small Molecule Inhibitor of Lipoprotein(a) Formation: A Randomized Clinical Trial. JAMA. 2023; 330: 1042–1053. https://doi.org/10.1001/jama.2023.16503.

[30]	Hummelgaard S, Vilstrup JP, Gustafsen C, Glerup S, Weyer K. Targeting PCSK9 to tackle cardiovascular disease. Pharmacology & Therapeutics. 2023; 249: 108480. https://doi.org/10.1016/j.pharmthera.2023.108480.

[31]	Hagström E, Steg PG, Szarek M, Bhatt DL, Bittner VA, Danchin N, et al. Apolipoprotein B, Residual Cardiovascular Risk After Acute Coronary Syndrome, and Effects of Alirocumab. Circulation. 2022; 146: 657–672. https://doi.org/10.1161/CIRCULATIONAHA.121.057807.

[32]	Mazzali M, Hughes J, Kim YG, Jefferson JA, Kang DH, Gordon KL, et al. Elevated uric acid increases blood pressure in the rat by a novel crystal-independent mechanism. Hypertension (Dallas, Tex.: 1979). 2001; 38: 1101–1106. https://doi.org/10.1161/hy1101.092839.

[33]	Kuwabara M, Fukuuchi T, Aoki Y, Mizuta E, Ouchi M, Kurajoh M, et al. Exploring the Multifaceted Nexus of Uric Acid and Health: A Review of Recent Studies on Diverse Diseases. Biomolecules. 2023; 13: 1519. https://doi.org/10.3390/biom13101519.

[34]	He M, Wang J, Liang Q, Li M, Guo H, Wang Y, et al. Time-restricted eating with or without low-carbohydrate diet reduces visceral fat and improves metabolic syndrome: A randomized trial. Cell Reports. Medicine. 2022; 3: 100777. https://doi.org/10.1016/j.xcrm.2022.100777.

[35]	Beverly JK, Budoff MJ. Atherosclerosis: Pathophysiology of insulin resistance, hyperglycemia, hyperlipidemia, and inflammation. Journal of Diabetes. 2020; 12: 102–104. https://doi.org/10.1111/1753-0407.12970.

[36]	Zhang C, Jiang L, Xu L, Tian J, Liu J, Zhao X, et al. Implications of N-terminal pro-B-type natriuretic peptide in patients with three-vessel disease. European Heart Journal. 2019; 40: 3397–3405. https://doi.org/10.1093/eurheartj/ehz394.

[37]	Sun LY, Gaudino M, Chen RJ, Bader Eddeen A, Ruel M. Long-term Outcomes in Patients With Severely Reduced Left Ventricular Ejection Fraction Undergoing Percutaneous Coronary Intervention vs Coronary Artery Bypass Grafting. JAMA Cardiology. 2020; 5: 631–641. https://doi.org/10.1001/jamacardio.2020.0239.

Funding

Tianjin Health Commission Science and Technology Project (TJWJ2021QN058)

Key Discipline Project of Tianjin Health Science and Technology Project in 2022 (TJWJ2022XK032)

Key Science and Technology Support Project of Tianjin Key Research and Development Plan in 2020 (20YFZCSY00820)