Physical Examination Data Based Cataract Risk Analysis
Jianqiao Hao, Yongbo Xiao, Shudi Du
Journal of Systems Science and Systems Engineering, 2021, Vol. 30, Issue 2: 198–214.
Cataract is a very common eye disease and the leading cause of blindness. Given its burden on society, this study tests risk factors for cataract and builds robust machine learning models that use these factors to predict cataract risk. The data were collected by a physical examination center in Shanghai, China, and cover more than 120,000 examinees and about 500 physical examination metrics. First, association rules were used to select 39 abnormalities that are more likely to be associated with cataract, and the significance of these abnormalities was tested with univariate and multivariate analysis. The results indicate that age, diabetes, refractive error, retinal arteriosclerosis, thyroid nodules, and incomplete mammary gland degeneration significantly increase the likelihood of cataract. Several machine learning models were then compared on how well they predict cataract risk from these six factors. Logistic regression and decision-tree-based ensemble methods outperformed the others, with test-set AUC reaching 0.84.
Cataract / risk factors / physical examination data / machine learning
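To make the association-rule screening step concrete, the sketch below computes support, confidence, and lift for a single rule of the form "abnormality → cataract". It is only an illustration under stated assumptions: the abstract does not say which rule metrics or thresholds the authors used, the function name `rule_metrics` is hypothetical, and the ten toy records are invented for the example rather than taken from the Shanghai data set.

```python
from typing import Dict, List


def rule_metrics(records: List[Dict[str, bool]],
                 antecedent: str,
                 consequent: str) -> Dict[str, float]:
    """Support, confidence, and lift for the rule antecedent -> consequent.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    n = len(records)
    n_a = sum(1 for r in records if r[antecedent])    # examinees with the abnormality
    n_c = sum(1 for r in records if r[consequent])    # examinees with cataract
    n_ac = sum(1 for r in records if r[antecedent] and r[consequent])  # both

    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    # lift = P(A and C) / (P(A) * P(C)); values above 1 mean the abnormality
    # co-occurs with cataract more often than independence would predict.
    lift = (n_ac * n) / (n_a * n_c) if n_a and n_c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Invented toy data: 10 examinees, NOT drawn from the study.
records = (
    [{"diabetes": True, "cataract": True}] * 3
    + [{"diabetes": True, "cataract": False}] * 1
    + [{"diabetes": False, "cataract": True}] * 1
    + [{"diabetes": False, "cataract": False}] * 5
)

metrics = rule_metrics(records, "diabetes", "cataract")
print(metrics)
```

In a screening pass of this kind, one would typically compute these metrics for every candidate abnormality and keep those whose lift exceeds some chosen cutoff. The cutoff itself is a design choice that the abstract does not report.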