Stroke risk prediction using machine learning: a prospective cohort study of 0.5 million Chinese adults.

Chun M.; Clarke R.; Cairns BJ.; Clifton D.; Bennett D.; Chen Y.; Guo Y.; Pei P.; Lv J.; Yu C.; Yang L.; Li L.; Chen Z.; Zhu T.; China Kadoorie Biobank Collaborative Group.

OBJECTIVE: To compare Cox models, machine learning (ML), and ensemble models combining both approaches, for prediction of stroke risk in a prospective study of Chinese adults.

MATERIALS AND METHODS: We evaluated models for stroke risk at varying intervals of follow-up (<9 years, 0-3 years, 3-6 years, 6-9 years) in 503 842 adults without prior history of stroke recruited from 10 areas in China in 2004-2008. Inputs included sociodemographic factors, diet, medical history, physical activity, and physical measurements. We compared discrimination and calibration of Cox regression, logistic regression, support vector machines, random survival forests, gradient boosted trees (GBT), and multilayer perceptrons, benchmarking performance against the 2017 Framingham Stroke Risk Profile. We then developed an ensemble approach to identify individuals at high risk of stroke (>10% predicted 9-yr stroke risk) by selectively applying either a GBT or Cox model based on individual-level characteristics.

RESULTS: For 9-yr stroke risk prediction, GBT provided the best discrimination (AUROC: 0.833 in men, 0.836 in women) and calibration, with consistent results in each interval of follow-up. The ensemble approach yielded incrementally higher accuracy (men: 76%, women: 80%), specificity (men: 76%, women: 81%), and positive predictive value (men: 26%, women: 24%) compared to any of the single-model approaches.

DISCUSSION AND CONCLUSION: Among several approaches, an ensemble model combining both GBT and Cox models achieved the best performance for identifying individuals at high risk of stroke in a contemporary study of Chinese adults. The results highlight the potential value of expanding the use of ML in clinical practice.
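The accuracy, specificity, and positive predictive value quoted in the abstract all derive from classifying individuals as high risk when their predicted 9-yr stroke risk exceeds 10%. The following is a minimal sketch of how such metrics could be computed from predicted risks and observed outcomes. It is not the authors' code: the function name `classification_metrics` and the toy data are assumptions made purely for illustration, and the numbers do not come from the study.

```python
from typing import Dict, Sequence

# Threshold taken from the abstract: high risk means >10% predicted 9-yr risk.
HIGH_RISK_THRESHOLD = 0.10


def classification_metrics(
    predicted_risk: Sequence[float],
    had_stroke: Sequence[bool],
    threshold: float = HIGH_RISK_THRESHOLD,
) -> Dict[str, float]:
    """Hypothetical helper: accuracy, specificity, and PPV at a risk threshold."""
    tp = fp = tn = fn = 0
    for risk, event in zip(predicted_risk, had_stroke):
        flagged = risk > threshold  # classified as high risk
        if flagged and event:
            tp += 1
        elif flagged and not event:
            fp += 1
        elif not flagged and event:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        # Share of all individuals classified correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
        # Share of stroke-free individuals correctly left unflagged.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Share of flagged individuals who went on to have a stroke.
        "ppv": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


# Made-up toy data, for illustration only (not study data).
risks = [0.02, 0.15, 0.30, 0.08, 0.12, 0.05, 0.40, 0.01]
events = [False, False, True, False, True, False, False, False]
metrics = classification_metrics(risks, events)
print(metrics)
```

Because stroke is a relatively uncommon outcome over 9 years, a classifier can combine high accuracy and specificity with a modest positive predictive value, which is the pattern the abstract reports.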

Original publication

DOI

10.1093/jamia/ocab068

Type

Journal article

Journal

J Am Med Inform Assoc

Publication Date

30/07/2021

Volume

Pages

1719 - 1727

Keywords

China, cardiovascular diseases, machine learning, risk assessment, stroke, Adult, Female, Humans, Male, Predictive Value of Tests, Prospective Studies
