Understanding traditional Chinese medicine via statistical learning of expert-specific Electronic Medical Records

Yang Yang, Qi Li, Zhaoyang Liu, Fang Ye, Ke Deng

Quant. Biol. 2019, Vol. 7, Issue (3): 210-232. DOI: 10.1007/s40484-019-0173-x
RESEARCH ARTICLE

Abstract

Background: Traditional Chinese medicine (TCM) has recently attracted considerable attention from various disciplines. However, TCM remains poorly understood because of its unique philosophy and theoretical thinking, and the lack of high-quality data makes a thorough understanding of TCM especially challenging. In this study, we introduce the Zhou Archive, a large-scale database of expert-specific Electronic Medical Records (EMRs) containing information about 73,000+ visits to a single TCM doctor over more than 35 years. Covering the full spectrum of the diagnosis-treatment model behind TCM practice, the archive provides an opportunity to understand TCM from a data-driven perspective.

Methods: Through a series of text processing steps, we transformed the semi-structured EMRs in the archive into a well-structured feature table. Based on this feature table, we carried out a series of statistical analyses to learn principles of TCM clinical practice from the archive, including correlation analysis, enrichment analysis, embedding analysis and association pattern discovery.

Results: The proposed data processing procedure produces a structured feature table of 14,000+ features, with a feature codebook, a term dictionary and a term-feature map as byproducts. Statistical analysis of the feature table reveals underlying principles of the diagnosis-treatment model of TCM, helping us better understand TCM practice from a data-driven perspective.

Conclusion: Expert-specific EMRs provide opportunities to understand TCM from a data-driven perspective. Taking advantage of recent progress in natural language processing (NLP) for Chinese, we can process a large number of TCM EMRs efficiently and gain insights via statistical analysis.
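
As an illustration only, the two stages named in the Methods paragraph (building a structured feature table, then running an enrichment analysis on it) can be sketched on toy data. Everything in this sketch is an assumption made for illustration: the record layout, the term names, the helper names `build_feature_table` and `enrichment_p_value`, and the choice of an upper-tail hypergeometric test as the enrichment statistic. None of it is taken from the Zhou Archive or from the authors' actual procedure.

```python
from math import comb

# Hypothetical toy visits (NOT data from the Zhou Archive): each record
# lists the symptom terms and herb terms extracted from one visit.
visits = [
    {"symptoms": ["cough", "fever"], "herbs": ["herb_a", "herb_b"]},
    {"symptoms": ["cough"], "herbs": ["herb_a"]},
    {"symptoms": ["insomnia"], "herbs": ["herb_c"]},
    {"symptoms": ["cough", "insomnia"], "herbs": ["herb_a", "herb_c"]},
    {"symptoms": ["fever"], "herbs": ["herb_b"]},
    {"symptoms": ["insomnia"], "herbs": ["herb_c"]},
]


def build_feature_table(records):
    """Return (sorted feature names, list of 0/1 rows), one row per visit."""
    features = sorted(
        {f"{kind}:{term}" for r in records for kind in r for term in r[kind]}
    )
    rows = []
    for r in records:
        present = {f"{kind}:{term}" for kind in r for term in r[kind]}
        rows.append([1 if f in present else 0 for f in features])
    return features, rows


def enrichment_p_value(rows, features, a, b):
    """Upper-tail hypergeometric P(X >= k) for co-occurrence of features a and b.

    This is one common way to score enrichment; it is an assumed stand-in,
    not necessarily the statistic used in the paper.
    """
    i, j = features.index(a), features.index(b)
    n = len(rows)                                # total visits
    n_a = sum(row[i] for row in rows)            # visits with feature a
    n_b = sum(row[j] for row in rows)            # visits with feature b
    k = sum(row[i] and row[j] for row in rows)   # visits with both
    upper = min(n_a, n_b)
    tail = sum(
        comb(n_a, x) * comb(n - n_a, n_b - x) for x in range(k, upper + 1)
    )
    return tail / comb(n, n_b)


features, rows = build_feature_table(visits)
p = enrichment_p_value(rows, features, "symptoms:cough", "herbs:herb_a")
print(f"p-value for cough / herb_a co-occurrence: {p:.3f}")
```

A small p-value here would suggest that the herb is prescribed with the symptom more often than chance co-occurrence would predict. With thousands of features, as in the archive, such tests would also need a multiple-testing correction.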

Keywords

TCM / EMRs / data-driven perspective / Chinese text mining / statistical analysis

Cite this article

Yang Yang, Qi Li, Zhaoyang Liu, Fang Ye, Ke Deng. Understanding traditional Chinese medicine via statistical learning of expert-specific Electronic Medical Records. Quant. Biol., 2019, 7(3): 210-232. DOI: 10.1007/s40484-019-0173-x

RIGHTS & PERMISSIONS

Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature
