Understanding traditional Chinese medicine via statistical learning of expert-specific Electronic Medical Records

Yang Yang, Qi Li, Zhaoyang Liu, Fang Ye, Ke Deng

Quant. Biol. ›› 2019, Vol. 7 ›› Issue (3) : 210-232. DOI: 10.1007/s40484-019-0173-x
RESEARCH ARTICLE

Abstract

Background: Traditional Chinese medicine (TCM) has recently attracted considerable attention from various disciplines. However, TCM remains poorly understood because of its unique philosophy and theoretical framework, and the lack of high-quality data makes a thorough understanding difficult. In this study, we introduce the Zhou Archive, a large-scale database of expert-specific electronic medical records (EMRs) containing information on 73,000+ visits to one TCM doctor over more than 35 years. Covering the full spectrum of the diagnosis-treatment model behind TCM practice, the archive provides an opportunity to understand TCM from a data-driven perspective.

Methods: Through a series of text-processing steps, we transformed the semi-structured EMRs in the archive into a well-structured feature table. Based on this table, we carried out a series of statistical analyses to learn principles of TCM clinical practice from the archive, including correlation analysis, enrichment analysis, embedding analysis and association pattern discovery.
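As a rough illustration of the kind of transformation described here, the following minimal sketch turns free-text records into a binary feature table. It is not the authors' pipeline: the `TERM_TO_FEATURE` map, the feature names and the two toy records are all hypothetical, and English terms with simple word-boundary matching stand in for the Chinese text processing that the real archive would require.

```python
import re

# Hypothetical term -> feature map (an assumption for illustration only).
# Several surface terms may map to one feature.
TERM_TO_FEATURE = {
    "headache": "symptom:headache",
    "head pain": "symptom:headache",  # synonym mapped to the same feature
    "insomnia": "symptom:insomnia",
    "ginseng": "herb:ginseng",
    "licorice": "herb:licorice",
}


def extract_features(text):
    """Return the set of features whose terms occur in one record."""
    lowered = text.lower()
    found = set()
    for term, feature in TERM_TO_FEATURE.items():
        # Whole-term match; only suitable for the toy English records here.
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            found.add(feature)
    return found


def build_feature_table(records):
    """Build a visits-by-features 0/1 table from a list of record texts."""
    columns = sorted(set(TERM_TO_FEATURE.values()))
    rows = []
    for text in records:
        present = extract_features(text)
        rows.append([1 if column in present else 0 for column in columns])
    return columns, rows


# Toy records, invented for this sketch.
records = [
    "Complaint: head pain, insomnia. Prescription: ginseng, licorice.",
    "Complaint: headache. Prescription: licorice.",
]
columns, table = build_feature_table(records)
for row in table:
    print(dict(zip(columns, row)))
```

Each row of `table` corresponds to one visit and each column to one feature, which is the shape that downstream analyses such as correlation or enrichment tests typically expect.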

Results: The proposed data processing procedure yields a structured feature table of 14,000+ features, with a feature codebook, a term dictionary and a term-feature map as byproducts. Statistical analysis of the feature table reveals underlying principles of the diagnosis-treatment model of TCM, helping us better understand TCM practice from a data-driven perspective.
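To make the idea of analysing such a feature table concrete, here is a small sketch of one common form of enrichment analysis: a one-sided hypergeometric test of whether a treatment feature appears more often than expected among visits that carry a given diagnostic feature. Whether the authors use this particular test is not stated in the abstract, so treat it as an assumption; the feature names and the eight toy visits are invented.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric(population N, K successes, n draws)."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def enrichment(table, columns, condition, target):
    """Test whether `target` is over-represented among rows having `condition`."""
    ci = columns.index(condition)
    ti = columns.index(target)
    N = len(table)                                      # all visits
    K = sum(row[ti] for row in table)                   # visits with the target
    n = sum(row[ci] for row in table)                   # visits with the condition
    k = sum(1 for row in table if row[ci] and row[ti])  # visits with both
    return k, hypergeom_upper_tail(k, N, K, n)


# Toy 0/1 feature table (hypothetical): 8 visits, 2 features.
toy_columns = ["symptom:insomnia", "herb:A"]
toy_table = [
    [1, 1], [1, 1], [1, 1], [1, 0],
    [0, 1], [0, 0], [0, 0], [0, 0],
]
overlap, p_value = enrichment(toy_table, toy_columns, "symptom:insomnia", "herb:A")
print(overlap, round(p_value, 4))
```

With thousands of features, a test like this would be run over many feature pairs, so some form of multiple-testing correction would be needed before interpreting small p-values.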

Conclusion: Expert-specific EMRs provide opportunities to understand TCM from a data-driven perspective. Taking advantage of recent progress in NLP for Chinese, we can process a large number of TCM EMRs efficiently and gain insights via statistical analysis.

Keywords

TCM / EMRs / data-driven perspective / Chinese text mining / statistical analysis

Cite this article

Yang Yang, Qi Li, Zhaoyang Liu, Fang Ye, Ke Deng. Understanding traditional Chinese medicine via statistical learning of expert-specific Electronic Medical Records. Quant. Biol., 2019, 7(3): 210‒232 https://doi.org/10.1007/s40484-019-0173-x

SUPPLEMENTARY MATERIALS

The supplementary materials can be found online with this article at https://doi.org/10.1007/s40484-019-0173-x.

ACKNOWLEDGEMENT

We thank Zhou Zhongying’s Studio at Nanjing University of Chinese Medicine for their great efforts in collecting, managing and sharing this valuable archive. We also thank Miss Bing Liang, Mr. Qiuyu Liang and Miss Che Wang for their efforts on data preparation and preprocessing. This work was partially supported by the National Natural Science Foundation of China (Nos. 11771242 & 11401338), the Tsinghua University Initiative Scientific Research Program and Supporting Grant to the Zhou Zhongying’s Studio 201159 by the State Administration of TCM of China.

COMPLIANCE WITH ETHICS GUIDELINES

The authors Yang Yang, Qi Li, Zhaoyang Liu, Fang Ye and Ke Deng declare that they have no conflicts of interest. All procedures were in accordance with the ethical standards of the institution or practice at which the studies were conducted, and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

RIGHTS & PERMISSIONS

2019 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature