1 Introduction
In the last few decades, global patterns of human disease have changed drastically along with economic development and social progress. Chronic complex diseases, such as coronary heart disease, stroke, diabetes, tumors, chronic obstructive pulmonary disease, and senile degenerative diseases, now consume a large share of healthcare resources and place a heavy burden on the global economy and social development. The purpose and models of medicine have been transformed along with these changes in disease patterns. The World Health Organization Report 1998 pointed out that the purpose of medicine is to improve the human capability of maintaining health, and that in the 21st century the main task of medical development would shift from treating diseases to maintaining a healthy life [1]. Medical models will be transformed from treatment only to the 4Ps model: prevention, prediction, personalization and participation [2]. The strategy of the healthcare system will also be adjusted, switching its focus from disease prevention and treatment to reducing the occurrence and progression of chronic diseases. The idea behind this strategy is comparable with the propositions of “treating before being diseased,” “treatment tailored for individuals” and “ZHENG differentiation-treatment” in the theory of traditional Chinese medicine (TCM).
In light of the principles of evidence-based medicine (EBM), medical decision making should be supported by high-quality evidence. The discovery of and intervention against pathogens, the prediction of the occurrence, development and prognosis of diseases, and the practice of personalized diagnosis and treatment all need supporting evidence from scientific research. Compared with studies of clinical treatment, research on prevention, prediction and personalization is more difficult. The strength of TCM is its clinical effectiveness. However, high-quality evidence for the clinical value of TCM is still lacking because there is no proper methodology for evaluating its effects. Innovation in TCM evaluation methodology is therefore in high demand.
2 The challenges of methodology in the evaluation of TCM clinical effectiveness
2.1 ZHENG differentiation-treatment
ZHENG differentiation-treatment, which means treatment guided by the differentiation of symptom patterns, is the primary principle of TCM and is unique to it. This matches the idea of personalized medicine. Doctors are required to fine-tune their prescriptions as the symptom patterns change. Symptom patterns can be multi-dimensional and dynamic in time and space, which makes it difficult to establish objective diagnostic standards for differentiating them. Currently, the judgment of patterns is based mainly on clinical experience, which is subjective and ambiguous, and there are no recognized standards to follow. Known as the gold standard for studies of clinical efficacy, the randomized controlled trial (RCT) needs a clear diagnosis, a controllable intervention and definite evaluation endpoints. RCTs conducted to evaluate TCM treatment are questioned mainly because of the complexity of TCM symptom patterns, the variety of prescriptions, and the lack of a definite relation between symptom patterns and the diseases involved.
2.2 Compliance
A key requirement of a clinical trial is that patients receive the treatment according to the research protocol and that the researchers gathering data abide by the protocol. Clinical research therefore has to follow the rules laid down in the protocol, which is inconvenient for both doctors and patients in clinical practice and leads to low compliance in clinical trials. The stricter the protocol, the more obvious these problems become, and they weaken the authenticity and generalizability of the research outcomes. In addition, the constraints on clinical activities make subject recruitment difficult and raise medical ethical issues.
2.3 Complex interventions
Chronic diseases need long-term and integrated therapies [3]. For example, medication for angina pectoris consists of different combinations of calcium antagonists, diuretics, β-adrenergic blocking agents, angiotensin-converting enzyme inhibitors, angiotensin antagonists, statins, and antiplatelet agents. To observe the efficacy of TCM treatment, a placebo-controlled RCT would be the optimal choice. However, to avoid ethical issues, most TCM clinical trials have been conducted with a complex design: TCM plus western medications compared with western medications alone. Under such complex interventions, the probability of detecting significant efficacy of the tested intervention is low, like trying to find a needle in a haystack or a tiny gold granule in a desert.
For example, it took over 30 years to confirm the preventive effect of aspirin on myocardial infarction. In 1956, Craven reported the preventive effect of aspirin on coronary and cerebral thrombosis [4]. However, no clear conclusion was reached because subsequent research results were discordant. In 1980, an editorial in the Lancet pointed to small sample sizes as the reason [5]. It was not until 1988 that an international multi-center RCT with 17 187 patients with acute myocardial infarction (ISIS-2) was published, which proved the preventive effect of aspirin for myocardial infarction [6].
The clinical evaluation of aspirin’s preventive effect confirmed the importance of sample size in studies. For interventions with mild or moderate efficacy, large, properly designed clinical trials are necessary to test their efficacy [7]. Especially in the post-marketing evaluation of TCM, where there are various confounding factors, a sample of more than ten thousand subjects is needed to detect the mild or moderate efficacy of a drug. However, multi-center, large-sample clinical trials are difficult to organize and implement; they need substantial human resources, a long time and tens of millions of dollars. Furthermore, from the pharmacoeconomic perspective, the value of large clinical trials is questioned as well [8]. Even if a large-sample clinical trial generates a statistically significant result, this does not imply an ideal cost-effectiveness, cost-benefit, or cost-utility ratio.
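The dependence of sample size on effect size can be illustrated with a rough calculation. The following Python sketch uses the standard normal-approximation formula for comparing two proportions; the event rates, the significance level and the power are hypothetical values chosen for illustration and are not taken from the trials cited above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_arm(p_control: float, p_treated: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate subjects per arm for a two-sided test of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    p_bar = (p_control + p_treated) / 2            # pooled event rate
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control)
                        + p_treated * (1 - p_treated))
    ) ** 2
    return ceil(numerator / (p_control - p_treated) ** 2)


# Hypothetical event rates (assumptions for illustration only).
large_effect = sample_size_per_arm(0.10, 0.05)  # event rate halved
mild_effect = sample_size_per_arm(0.10, 0.09)   # event rate reduced by one tenth
print("per arm, large effect:", large_effect)
print("per arm, mild effect:", mild_effect)
```

Under these assumed rates, halving the event rate needs a few hundred subjects per arm, while the mild effect needs more than ten thousand per arm, which is the order of magnitude discussed above.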
2.4 Continuous evaluation
Clinical evaluation runs through the entire life cycle of a drug, from new drug development to post-marketing evaluation, until the drug is withdrawn from the market. Chinese patent medicines, mostly developed from classic or empirical prescriptions, are widely used in clinical practice. However, for historical reasons, a large proportion of them lack the evidence of safety and efficacy required by contemporary science. Post-marketing evaluation of Chinese patent medicines is therefore imperative. At present, the usual methods of post-marketing evaluation are intensive hospital surveillance for safety monitoring, and observational studies, cohort studies and/or RCTs for efficacy assessment. These methods are all cross-sectional in nature and demand a large investment, yet they still cannot accomplish continuous monitoring and evaluation. Post-marketing evaluation therefore needs study methods that are more scientific and standardized, and that are also continuous and economical.
3 The transformation of research methods triggered by big data
With the popularization of computer and Internet technology, data are accumulating at an unprecedented rate. The amount of data produced in recent decades surpasses that of the preceding thousands of years. The journal Nature published a special issue on “Big Data” in 2008 [9], which introduced the challenges of massive data in Internet technology, network economics, supercomputing, environmental science, biomedicine, etc. In March 2012, the Office of Science and Technology Policy (OSTP) of the Executive Office of the President (EOP) of the USA announced the Big Data Research and Development Initiative, which has had a significant impact on research in the field of big data [10]. With the advent of the big data era, our mindsets and technological methodology are being transformed.
As far as scientific research is concerned, sharing experimental data and trans-regional collaboration have become a trend in the age of the Internet [11]. In 2007, Jim Gray, the A.M. Turing Award winner, proposed that data-intensive science should be distinguished from computational science, and delineated the fourth paradigm of data-intensive scientific research. Generally speaking, there have been four paradigms in the course of the development of science: the first was experimental science; the second was theoretical science, characterized mainly by the use of models; the third was computational science, characterized by complex simulations; and the fourth is data-intensive science based on big data [12].
The fourth paradigm will change not only scientific research methods but also people’s thinking. Unlike traditional research based on logical reasoning, big data research focuses mainly on correlations among complex data rather than on causal inference, and it pays close attention to the application of research results. When the research methods of big data are applied to medicine, especially to clinical evaluation, many of the difficult issues currently encountered may be resolved effectively.
4 The framework of TCM clinical research based on big data
4.1 Resources of big data from clinical research
Big data technology has already generated significant benefits in electronic commerce. With the development and application of the Internet and information technology, the volume of data in biomedical research has increased dramatically, which has brought medical research into the age of big data.
4.1.1 The data of health examination
People usually learn their health status from an annual health examination, which is not a continuous method of health surveillance and cannot detect bodily abnormalities or risk factors for serious diseases in time. With health management systems and wearable medical devices, health data of large populations can be collected across regions over long periods, such as heart rate, pulse, breathing rate, thermal loss, blood pressure, blood sugar, blood oxygen, hormone levels, body mass index (BMI) and body fat, as well as TCM-related data such as tongue presentation, pulse manifestation and body constitution. With big data technology, an individual’s health can be managed throughout life; the systematic and continuous collection of health data provides a valuable basis for decision making in health management. Systems such as Google Health and Microsoft HealthVault currently provide cloud storage services; they can not only store health data but also solve problems such as the incompatibility of different instruments and systems.
4.1.2 The data of diagnosis and treatment for disease
With the extensive use of medical information management systems, medical activities are recorded during daily practice. Data on diagnosis, examinations (such as electrocardiograms, ultrasonic examinations, computerized tomography, and biochemical tests) and treatments can be stored, which makes research based on clinical data possible. Today, hospital information systems are available in hospitals at or above the county level in China, and integrated information systems based on electronic records, covering registration, charges, prescriptions and treatment, have also been widely applied. Taking a 1000-bed hospital as an example, about 1.5 million records are collected by the electronic record system every day, and nearly 8 GB of data are generated daily by the image archiving and communication system [13]. Clinical practice data, integrated with the Government Reimbursement System of Medical Insurance, can provide reliable resources for pharmacoeconomic evaluation [14].
4.1.3 The data of genome
The Human Genome Project (HGP) started in 1990, and the sequencing of all 23 human chromosomes was accomplished in 2006. The human DNA sequence can be downloaded from the website of the National Center for Biotechnology Information (NCBI) of the United States. To map the genetic polymorphisms of the human genome for medical research, the 1000 Genomes Project was launched in 2008 by 75 organizations and institutions, including the National Human Genome Research Institute (NHGRI), Shenzhen Huada Gene Research Institute (BGI-Shenzhen), the Sanger Institute of the United Kingdom (Sanger), etc. [15]. The volume of gene data in this project has now reached 200 TB, which is equivalent to the capacity of 30 000 regular digital versatile discs. The data are available on Amazon Web Services (AWS), and the data sets containing the whole genome sequences of 1700 people are freely accessible [16]. In addition, decoding one person’s genome now needs only 1000 dollars and one day, which makes personal genome sequencing a reality. As genome annotation progresses, there will be a profound impact on disease prediction, prevention, diagnosis and treatment, and even on the life sciences as a whole.
4.2 TCM clinical research methods based on big data
The rapidly accumulating data mentioned above make up the main body of big data in medicine. Mining and using these data will make individualized medicine possible. On the basis of an individual’s health data, clinical records and genomic data, physicians or healthcare professionals can systematically analyze each person’s condition and then design an optimal intervention regimen for disease prevention and treatment, which can also be dynamically modified and tailored so as to enhance the effect and reduce the adverse events of medications [17].
From single case observations to case series reports, from case-control studies to cohort studies, and from RCTs to centralized monitoring research, the methods of clinical evaluation have developed constantly. With the advent of the big data era, the methodology of clinical research will be transformed [18]. The application of big data technology will bring new ideas and suitable methods for clinical evaluation conducted under the circumstances of complex diseases and complicated interventions.
4.2.1 The characteristics of clinical research in the big data era
Traditional clinical research is conducted under restricted conditions, analyzes and evaluates fragmentary data, and focuses on causal inference. In contrast, clinical research supported by big data emphasizes the identification of correlations among data from routine medical practice, and then assesses the effectiveness and value of interventions. The key difference between the two research models lies in the subject investigated.
In the big data era, clinical evaluation changes from focusing on causality to seeking correlations, from micro to macro, and from experiments to practice. Clinical research based on big data has several advantages and can overcome defects of the traditional methods, which rely on sampled data. Since fewer constraints are imposed on the activities of clinical practice, the research outcomes will reflect real-world situations, and compliance problems and ethical issues will decrease. The process of big data research is also simpler, which saves human, material and financial resources.
Furthermore, information from big data can facilitate protocol design for experimental studies such as RCTs. With genetic, health status and clinical practice data, the recruitment and screening of subjects become more efficient, and suitable clinical research institutions are easier to locate. As a consequence, the clinical trial process will be accelerated. Big data analysis can also clarify the correlation between intermediate indicators and important endpoints, which may provide evidence for the selection of intermediate indicators [19].
4.2.2 The application scope of big data research in TCM
The subject investigated in big data research is continuous data collected from a large population. With large-sample data, this approach is suitable for evaluating interventions with mild or moderate efficacy.
Clinical evaluation of TCM for “treating before being diseased”
Preventive intervention, or treating before being diseased, is a priority of TCM for health preservation and promotion. Practical remedies include dietary therapy, medicated diet, herbal emplastrum, acupressure, acupoint application, tai chi and Qigong. However, there is still a lack of reliable evidence to support their effectiveness, and a lack of standard operating procedures for daily application. Conventional methods of clinical evaluation are costly and inefficient, and cannot deal with the interference of complex factors. In addition, preventive studies usually need several decades of follow-up, which challenges the conventional evaluation methods. Big data methods will make it easier to demonstrate the value of preventive interventions, which would highlight the advantages of TCM preventive interventions.
Clinical evaluation for “ZHENG differentiation-treatment” of TCM
According to the principle of ZHENG differentiation-treatment, medications are modified dynamically during treatment as the symptom patterns change. The relations between clinical outcomes and medications are complicated. In the process of diagnosis and treatment, not only the disease but also the symptom patterns and the symptoms may vary. An RCT using conventional methods cannot evaluate the whole treatment process. With big data technology, it is possible not only to evaluate the ZHENG differentiation-treatment process as a whole clinical pathway, but also to assess each segment of medication modification. In this way, the systemic and local effects of ZHENG differentiation-treatment may be explained. This may generate scientific evidence to promote the standardization of individually tailored diagnosis and treatment in TCM.
Clinical evaluation for endpoint outcomes
TCM and western medicine hold different viewpoints on diseases because their theories are distinct. Thus, their outcome measures for clinical trials differ. TCM outcomes are not accepted by western practitioners, while western medical indicators are not suitable for TCM. Important endpoint events such as death, myocardial infarction and stroke are accepted by both TCM and western medicine, and have been widely used in TCM clinical trials [20]. However, clinical trials with endpoint events take a long time, are costly and are difficult to conduct. With big data methods, it is easier and more economical to evaluate TCM treatments using endpoints over long-term follow-up. TCM treatments usually have mild or moderate effects and are suitable for long-term use. Evaluating the effect of TCM with endpoint events is therefore feasible, and it helps to demonstrate the strengths of TCM in clinical practice.
Post-marketing evaluation of Chinese patent medicines
Post-marketing evaluation aims to evaluate the effectiveness, safety and economy of drugs used in a wide range of people under complicated clinical situations. The RCT, as the gold standard for clinical research on new drugs, cannot meet the needs of post-marketing evaluation. Clinical research based on big data offers an answer, since it is suitable for evaluating the actual effectiveness of marketed drugs and analyzing their complicated relations in the real world.
For historical reasons, the scientific foundations of marketed Chinese patent medicines are weak. The evidence on clinical indications, course of medication, drug interactions and adverse reactions is insufficient, and pharmacoeconomic studies on them are even rarer. With big data research methods, Chinese patent medicines can be comprehensively monitored in terms of indications, specific populations, safety and cost-effectiveness, and their treatment regimens can then be optimized to improve the rational use of these medicines.
5 Thoughts on clinical big data research in TCM
In clinical research, big data technology is a new method that brings novel ideas and approaches to clinical trials. Applying big data methods to the clinical evaluation of TCM requires much hard work, including establishing an organizational model, building platforms, innovating technology and constructing professional teams. Only when a solid foundation has been laid can big data research in TCM clinical research achieve ideal results.
5.1 Top-level design
For a new research area, the top-level design should be completed at the beginning, as it guarantees that the work proceeds smoothly and sustainably. The design and organization should be well formulated, covering the platform, the team, the resources and the sharing of benefits. Without top-level design, the work will be conducted aimlessly and in disorder, or even in chaos, which will reduce the efficiency and the outputs of a research project.
Successful experiences in this field at home and abroad are worth learning from. The Cochrane Collaboration, for example, has 14 Cochrane centers and over 50 working groups all over the world. Authoritative experts in the same fields make up the steering committee and have developed a uniform work plan and guidelines, which has enabled the organization to make important achievements. As another example, in the field of clinical trial reporting, international scholars and experts set up a working group and formulated guidelines for the reporting of randomized controlled trials, named the CONSORT Statement [21]. The CONSORT Statement was then accepted by the International Committee of Medical Journal Editors (ICMJE), which announced that trials cannot be published in ICMJE journals if they do not comply with the CONSORT checklist. This has greatly boosted the transparency of clinical trials.
As the examples above show, these successes are mainly ascribed to sound overall design and the integration of resources, which generated a global cooperation system that avoided conflicts of interest and ensured synchronous development. In China, however, research programs are seldom conducted through collaboration across departments and institutions that gathers resources and researchers from across the country. Current research models do not favor the development of leading work teams, and may lead to repetitive studies, conflicts of interest and waste of resources. Therefore, TCM clinical research with big data technology needs a sound top-level design that breaks the boundaries between institutions and disciplines. Such a design will integrate the best resources at home and abroad and form a collaborative organization that can carry out large-scale collaboration, accomplish big projects and generate major achievements.
5.2 Research platform
Standardization of data is the foundation for conducting clinical big data research in TCM. After more than 10 years of construction, hospital information systems, laboratory information systems, image archiving and communication systems and electronic medical record systems have been established extensively. Because the software was developed separately by different companies, heterogeneous data run on different software and hardware platforms, and the data sources are independent of each other. Such “isolated information islands” hinder data sharing and mining [22]. Therefore, a standardized, integrated information platform is needed to aggregate data of different origins, formats and characteristics, in order to solve the problems of data distribution and heterogeneity. Professor Baoyan Liu has made a great contribution to promoting clinical big data research in TCM. He has developed a unified clinical and research information platform based on structured TCM electronic medical records. This platform has played an exemplary role in clinical data collection, management and clinical research [23].
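As an illustration of what aggregating heterogeneous sources involves, the following Python sketch maps records from two differently structured systems onto one unified record type. It is not a description of the platform cited above: the record fields, the two source layouts and the sample values are all hypothetical and serve only to show the idea of a common schema.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class UnifiedRecord:
    """Hypothetical common schema shared by all sources."""
    patient_id: str
    visit_date: date
    zheng_pattern: str
    prescription: str


def from_system_a(row: dict) -> UnifiedRecord:
    """Map a row from hypothetical system A (numeric ID, compact dates)."""
    return UnifiedRecord(
        patient_id=str(row["pid"]).strip(),
        visit_date=datetime.strptime(row["visit"], "%Y%m%d").date(),
        zheng_pattern=row["pattern"].strip().lower(),
        prescription=row["rx"].strip(),
    )


def from_system_b(row: dict) -> UnifiedRecord:
    """Map a row from hypothetical system B (string ID, ISO dates)."""
    return UnifiedRecord(
        patient_id=str(row["PatientID"]).strip(),
        visit_date=date.fromisoformat(row["VisitDate"]),
        zheng_pattern=row["Zheng"].strip().lower(),
        prescription=row["Formula"].strip(),
    )


# Invented sample rows describing the same patient in two systems.
rows_a = [{"pid": 1001, "visit": "20130105",
           "pattern": "Qi deficiency ", "rx": "Formula A"}]
rows_b = [{"PatientID": "1001", "VisitDate": "2013-02-10",
           "Zheng": "qi deficiency", "Formula": "Formula B"}]

unified = [from_system_a(r) for r in rows_a] + [from_system_b(r) for r in rows_b]
unified.sort(key=lambda rec: (rec.patient_id, rec.visit_date))
for rec in unified:
    print(rec)
```

Once both sources share one schema, the visits of the same patient can be ordered in time and analyzed together, regardless of which system recorded them.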
5.3 Technology innovation
Once the research objectives are defined, the process of big data research can be divided into four stages: data capture, data aggregation, data analysis and data interpretation. A large volume of data does not mean an increase in value; on the contrary, it means an increase in data noise [24]. Therefore, data preprocessing, such as data cleansing, should be done before data analysis. Preprocessing massive data is a great challenge for both hardware and algorithms, and it needs technological and methodological innovation. Data analysis is the key step in generating valuable information from big data. Hal Varian, the chief economist of Google, has said that data are widely available, while the ability to extract knowledge from them is lacking [25]. Traditional data analysis methods such as data mining, machine learning and statistical analysis should be adjusted for the big data era. New methods should be suitable for complex data sets and able to analyze potential correlations among massive data efficiently [24].
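The cleansing and analysis steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a method from the cited references: the records, the plausibility range and the function names are hypothetical, and the Pearson coefficient stands in for whatever correlation measure a real study would choose.

```python
from math import sqrt


def clean(records, low, high):
    """Drop records with missing values or an outcome outside [low, high]."""
    return [(x, y) for x, y in records
            if x is not None and y is not None and low <= y <= high]


def pearson_r(pairs):
    """Pearson correlation coefficient of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


# Invented records: (weeks of treatment, systolic blood pressure in mmHg).
# Two records have missing values and one has an implausible reading.
raw = [(0, 150), (2, 146), (4, None), (6, 141),
       (8, 900), (10, 135), (None, 130)]

cleaned = clean(raw, low=60, high=260)  # assumed plausibility range
r = pearson_r(cleaned)
print(len(cleaned), "records kept; r =", round(r, 3))
```

The coefficient describes only the association in the cleaned records; in line with the text, it says nothing by itself about causality.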
5.4 Professional teams
Clinical big data studies of TCM need researchers from different professional backgrounds, such as computer science, database technology, information technology, network technology, artificial intelligence, cloud computing, mathematics, statistics, medicine, evidence-based medicine, clinical epidemiology, TCM and management science. It is imperative to build professional research teams to carry out this research. More attention should be paid to fostering and training a sizeable body of interdisciplinary researchers, so as to reserve talent for the future and secure sustainable development of the field.
6 Perspectives
The big data era has revolutionized thinking and research paradigms. It provides novel research ideas and methods for clinical research, especially for the evaluation of TCM effects. Challenges usually come with opportunities. The concept of big data is advanced, but to make good use of this technology, much hard work is still needed to strengthen the foundations described above. Although it will take a long time to achieve significant outcomes, effect evaluation of TCM based on clinical big data will provide scientific evidence for the advantages of TCM, which may boost the health service capability of TCM.
Higher Education Press and Springer-Verlag Berlin Heidelberg