Introduction
Tuberculosis (TB) remains a significant public health concern in China, which accounts for one million new TB cases and 130 000 deaths every year [1, 2]. Around 8.3% of culture-positive cases are multidrug-resistant TB (MDR-TB) [3]. During the last two decades, China achieved significant success in tracking the TB epidemic, reducing active incident TB cases by more than 50% from 2000 to 2010 [1]. Despite this achievement, only a slight decrease in the overall prevalence of disease was observed from 2000 to 2010 (466/100 000 to 459/100 000 population) [4]. The high prevalence of MDR-TB, the high disease burden in rural areas, and large-scale migration from rural to urban settings remain major hurdles for TB control in the country. A recent mathematical modeling study has shown that the outlook for TB control in China will not be optimistic in the near future under present strategies [5].
The hallmark of the disease is that most people remain asymptomatic after initial infection, i.e., in the latent phase. Only 5%–10% of infected individuals develop active disease, within a few months to several decades. Given this variable incubation period from infection to active disease, incident TB can arise from either recent transmission or reactivation of a remote infection. No separate treatment regimens exist for these two forms of active TB, but they call for different control strategies with different public health significance. If the number of cases from recent transmission is high, case-finding strategies must focus on active detection of sources of infection, their treatment and completion of treatment to render the source case non-infectious, and timely screening of close contacts for active TB. If, instead, cases come from endogenous reactivation, then screening of individuals or populations at high risk of developing active TB and preventive measures for latent TB must be given full consideration. Therefore, clearly distinguishing the proportions of incident TB due to recent transmission and to endogenous reactivation is important in TB control.
Given the complex nature of the disease, traditional epidemiological methods are not sufficient to decode the exact transmission in populations, which makes the current TB control strategy less effective. Conventionally, most cases in a setting are thought to be due to reactivation. However, molecular epidemiological studies have revealed that recent transmission can account for 35%–40% or even up to 70% of notified TB cases [6–9]. Based on these observations, targeted control strategies have been implemented and have significantly reduced the incidence of disease in some areas. In San Francisco, United States, for example, targeted interventions addressed newly identified high-risk groups for transmission, including human immunodeficiency virus (HIV)-positive individuals, drug addicts, and homeless people. The overall incidence of TB and the incidence attributed to recent transmission declined significantly, from 51.2/100 000 and 10.4/100 000 in 1992 to 29.8/100 000 and 3.8/100 000 in 1997, respectively [10].
The control and prevention strategies of one region cannot necessarily be implemented effectively in another region, and the key drivers of TB transmission may also vary. Therefore, molecular epidemiological studies that illustrate the transmission patterns of TB in China are particularly important for controlling the disease. In the present article, we review the recent achievements in molecular epidemiology of TB and estimate the role of recent transmission of M. tuberculosis in the disease burden in China.
Genotyping methods of Mycobacterium tuberculosis in China
Genotyping of M. tuberculosis strains has enhanced the understanding of TB epidemiology. Strains resulting from recent transmission should belong to the same genotype (clustered), whereas those from reactivation will have a unique signature in the population. The most common genotyping methods used are IS6110-RFLP, followed by mycobacterial interspersed repetitive unit-variable number tandem repeat (MIRU-VNTR) typing and spoligotyping [11].
IS6110-RFLP was considered the standard method for genotyping of M. tuberculosis in the 1990s. It works on the principle of copy number variation of insertion sequence IS6110 in different clinical isolates. Despite its high discriminatory power, technical difficulties in data production and in comparing or sharing results among laboratories limit its large-scale use.
Spoligotyping is considered a simple, rapid, and cost-effective technique that has been used to define the predominant and growing number of important clinical M. tuberculosis clones or strains. It works on the presence or absence of specific spacers between direct repeats (DRs) and requires a small amount of DNA. The results are converted into an octal code and are easy to share among laboratories. However, spoligotyping has limited ability to differentiate the dominant Beijing strains, which reduces its usefulness for exploring the transmission dynamics of M. tuberculosis in China.
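The octal conversion mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the conventional 43-spacer format in which the first 42 spacers are read as 14 binary triplets and the last spacer contributes a final digit; the function name and the input encoding ('1' for present, '0' for absent) are our own choices for illustration.

```python
def spoligotype_to_octal(spacers: str) -> str:
    """Convert a 43-character presence/absence string ('1'/'0') to a 15-digit octal code."""
    if len(spacers) != 43 or set(spacers) - {"0", "1"}:
        raise ValueError("expected 43 characters, each '0' or '1'")
    # Spacers 1-42: 14 triplets, each read as a 3-bit binary number (0-7).
    digits = [str(int(spacers[i:i + 3], 2)) for i in range(0, 42, 3)]
    # Spacer 43 stands alone and contributes a final 0 or 1.
    digits.append(spacers[42])
    return "".join(digits)


# Typical Beijing-family pattern: spacers 1-34 absent, spacers 35-43 present.
beijing = "0" * 34 + "1" * 9
print(spoligotype_to_octal(beijing))  # 000000000003771
```

Because most Beijing strains collapse to this single code, the example also illustrates why spoligotyping alone resolves little among them.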
MIRU-VNTR typing is currently a widely used method that relies on the variable sizes of repeat regions in the genome. With a combination of 15 or 24 loci, its discriminatory power has been reported to be comparable to that of IS6110-RFLP [12]. MIRU-VNTR typing offers high throughput, is convenient, and allows easy sharing of digital data for comparison among laboratories. The standard 24-VNTR loci spectrum has been recommended as the routine genotyping method in the United States and many countries in Europe. However, many of the VNTR loci in the standard 15-/24-loci spectrum have limited discriminatory power among the Beijing strains [13]. The resolving power of the same VNTR combination may also vary among mycobacterial lineages in different regions or settings. Recently, Allix-Beguec et al. proposed adding four hypervariable loci (VNTR-1982, 3820, 3232, and 4120) to the 24-VNTR set to increase its resolution in regions dominated by Beijing strains [14]. However, this 24+4 loci combination still includes many loci with low discriminatory power, which could limit its application in high-disease-burden countries such as China [13].
Currently, no widely accepted standard VNTR set is available in China. Most studies used the 15- or 24-VNTR loci spectrum. A few studies evaluated potential combinations (ranging from 3 to 24 VNTR loci) relevant to the locally circulating strains (supplemental data) [13, 15–19]. One study proposed a new 15-VNTR pattern, but the sample size was very small (54 strains), which limited its application in other parts of the country [16]. Our group collected 1362 isolates from six research sites across China and proposed an optimized 9+3 VNTR set (9 loci plus 3 hypervariable loci, VNTR-3232, 3820, and 4120) to study the transmission of M. tuberculosis in China [13]. Chen et al. evaluated the discriminatory power of the standard 15-VNTR loci spectrum on 3966 samples collected from the national drug survey (31 provinces) in 2007 [3] and proposed sets of 8 to 10 VNTR loci for different provinces [17]. These sets shared 7 VNTR loci with the 9+3 VNTR set (supplemental data). However, this study did not evaluate the hypervariable loci, which are needed to discriminate the most commonly circulating Beijing strains. In addition, the cluster-randomized sampling method of the national drug-resistance survey might not be suitable for assessing the ability of a VNTR combination to investigate the transmission of M. tuberculosis. A further evaluation study systematically compared the 9+3 VNTR set with the standard 15/24 loci plus 4 hypervariable loci and with other combinations used in China. Using a population-based collection of 891 clinical isolates from five provinces across the country, this study revealed that the 9+3 VNTR set has discriminatory power comparable to that of the 24-VNTR set plus 4 hypervariable loci and requires less labor [15]. Therefore, we suggested that the 9 VNTR loci might serve as the initial genotyping tool for comparative analysis of circulating isolates among different regions in China and that at least these three hypervariable loci should be added to study recent transmission or outbreaks of M. tuberculosis.
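One way to make "discriminatory power" concrete is the Hunter-Gaston discriminatory index (HGDI), a measure commonly used to compare typing schemes. The text above does not state which index the cited studies used, so treating HGDI as the yardstick is an assumption here. The sketch below compares a locus set with and without a hypervariable locus; the locus names ("A", "B", "HV") and the repeat counts are hypothetical toy data, not values from any study.

```python
from collections import Counter


def hgdi(genotypes):
    """Hunter-Gaston index: probability that two isolates drawn at random differ in genotype."""
    counts = Counter(genotypes)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two isolates")
    return 1 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def profiles(isolates, loci):
    """Reduce each isolate (dict of locus -> repeat count) to a tuple over the chosen loci."""
    return [tuple(isolate[locus] for locus in loci) for isolate in isolates]


# Hypothetical isolates; "HV" stands in for a hypervariable locus.
isolates = [
    {"A": 2, "B": 3, "HV": 5},
    {"A": 2, "B": 3, "HV": 7},
    {"A": 2, "B": 3, "HV": 7},
    {"A": 4, "B": 3, "HV": 5},
]

print(hgdi(profiles(isolates, ["A", "B"])))        # 0.5
print(hgdi(profiles(isolates, ["A", "B", "HV"])))  # about 0.83
```

In this toy set, adding the hypervariable locus splits a group of three indistinguishable isolates, which is the kind of gain the 9+3 scheme aims for among Beijing strains.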
Implementation of whole-genome sequencing analysis
The development of whole-genome sequencing (WGS) and its cost-effectiveness have greatly improved the understanding of TB transmission. WGS identifies sequence variations at the whole-genome (or nearly whole-genome) level and helps identify recent transmission more accurately [20–23]. A genetic distance below a threshold of 5–12 SNPs can be used to infer recent transmission events [24]. However, these studies were from low-TB-burden countries. At present, evidence comparing WGS with traditional genotyping methods in different populations is lacking. We compared both methods on two large (>10 cases) clusters from Shanghai and Heilongjiang Province and observed that WGS further differentiates VNTR-defined clusters [22]. A similar finding was observed in a retrospective study of MDR-TB strains in Shanghai [23]. Most importantly, by precisely identifying transmission events, WGS can help allocate public health resources to more effective investigation of transmission (e.g., the identification of populations at high risk of recent transmission and of infectious sources). In addition, WGS has proved valuable in unraveling the long-term evolution of M. tuberculosis and in clinical applications such as drug-susceptibility testing [25, 26].
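The SNP-threshold idea can be illustrated with a small sketch. It assumes that isolates are represented by equal-length strings of aligned variable sites and that isolates are grouped by single linkage whenever a pairwise distance is at or below the threshold; both the grouping rule and the inclusive comparison are illustrative assumptions, and the isolate names and sequences are hypothetical.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def cluster_by_snp(seqs: dict, threshold: int) -> list:
    """Single-linkage grouping: link any pair whose SNP distance is <= threshold."""
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent pointers to the group representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical isolates: short strings standing in for concatenated variable sites.
seqs = {
    "P1": "AAAAAAAAAA",
    "P2": "AAAAAAAAAT",
    "P3": "AAAAAAAATT",
    "P4": "TTTTTTAAAA",
}
print(cluster_by_snp(seqs, threshold=2))  # [['P1', 'P2', 'P3'], ['P4']]
```

Isolates that share a VNTR genotype but fall into different groups under such a threshold correspond to the further differentiation of VNTR-defined clusters described above.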
Assessing the burden of recent transmission of M. tuberculosis in China
The proportion of genotypically clustered cases in a population serves as an estimate of the recent transmission rate of M. tuberculosis. This proportion can be ascertained by analyzing all sputum culture-positive patients in a region over a certain period of time (at least three years). Studies carried out to date in China used either convenience sampling from hospitals or samples obtained from epidemiological surveys. Such studies were not population based, and the cluster rates obtained could not reflect the real situation of recent transmission [27]. To the best of our knowledge, only a few population-based studies have been conducted. Our group carried out a four-year prospective molecular epidemiological study at five field sites and observed that, on average, 31% of TB patients were genotypically clustered, indicating recent transmission of M. tuberculosis [28, 29]. One study revealed a 23% cluster rate over a one-year period in two provinces of Eastern China [30], whereas another study from the same group in four rural areas using combined genotyping methods observed a cluster rate of 14% over a three-year period [31]. Similarly, Wang et al. observed a 15% cluster rate from six provinces of Eastern China over six months [32]. A study in 30 townships of Jiangsu Province over a span of three months observed a 27% transmission rate [33]. Two studies that did not state their sampling strategies reported clustering rates of 25% and 55% in Hebei Province and Gansu Province, respectively [34, 35] (Table 1). Unfortunately, these observations cannot be compared directly because of differences in research design, study duration, and genotyping methods.
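The cluster rate discussed above can be computed directly from a list of genotypes. The sketch below reports two common variants: the proportion of all cases that fall in a cluster (the "n" method, which matches the percentages quoted above) and the "n-1" method, which subtracts one presumed source case per cluster. The "n-1" variant is included only for comparison and is not taken from the cited studies; the genotype labels are hypothetical.

```python
from collections import Counter


def clustering_rates(genotypes):
    """Return (n-method rate, n-1-method rate) for a list of genotype labels."""
    counts = Counter(genotypes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no isolates supplied")
    # A cluster is any genotype shared by two or more cases.
    clustered_cases = sum(c for c in counts.values() if c >= 2)
    n_clusters = sum(1 for c in counts.values() if c >= 2)
    n_method = clustered_cases / total
    n_minus_one = (clustered_cases - n_clusters) / total
    return n_method, n_minus_one


# Hypothetical sample of 10 culture-positive cases.
genotypes = ["g1", "g1", "g1", "g2", "g2", "g3", "g4", "g5", "g6", "g7"]
print(clustering_rates(genotypes))  # (0.5, 0.3)
```

Either rate is only as complete as the sampling behind it: cases missed by the study leave their cluster partners looking unique, which is why short or non-population-based studies tend to underestimate clustering.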
Molecular epidemiological studies require detailed epidemiological information and high-resolution genotyping methods to elucidate recent transmission events. The VNTR genotyping combinations used at present in China do not provide satisfactory resolution. Moreover, two patients without any transmission link may harbor the same “genotype.” Such cases are more common among genetically conserved strains, such as Beijing family strains. The misclassification of real clusters or transmission can further hamper the efficiency of epidemiological investigation. Numerous studies have revealed that only a low proportion (<30%) of genotypically clustered cases have identifiable epidemiological links. A study in a rural area of China did not reveal any contacts among VNTR-defined clustered cases, suggesting that occasional exposure can also contribute to transmission [32]. Recently, one of our studies suggested that spatial proximity (i.e., clustered cases most commonly living in the same community or on connected streets) can also be the link for recent transmission of MDR-TB in Shanghai [23]. However, we cannot exclude the possibility of unidentified cases within the transmission chain. The development of high-resolution genotyping methods could enhance the accuracy of cluster definition, hence increasing the possibility of identifying epidemiological links.
The identification of high-risk population groups and places is one of the major purposes of molecular epidemiological studies of TB transmission. However, identifying risk factors for recent transmission of TB in China is difficult [22, 23, 28, 32, 33]. Likely reasons include the limited discriminatory power of the genotyping methods used and the many barriers to identifying transmission links in epidemiological investigations. In addition, because China is a high-disease-burden country, infectious sources are common in the general population and difficult to track. Despite some reports of suspected outbreaks in schools [36–38], public places were considered the high-risk places for TB transmission [22, 23, 28, 32]. In other words, implementing targeted interventions without a defined high-risk group or hotspot is difficult. The risk factors for TB transmission in China thus differ from those reported in other countries, where recent transmission is found mostly among specific population groups, such as HIV-positive individuals, homeless people, and drug abusers. Thus, we suggest that for TB control in China, active case finding, especially the identification of high-risk groups or places, needs to be strengthened.
Indirect evidence of ongoing transmission: exogenous reinfection
Reinfection with M. tuberculosis among patients with recurrence or a treatment history also suggests the importance of ongoing/recent transmission [39]. Recurrent TB is usually attributed to endogenous reactivation. However, comparing the genotypes of strains before and after recurrence can determine whether the patient has a true endogenous relapse or an exogenous reinfection. Data on the proportion of recurrent TB due to reinfection in China are limited. Two of our studies, each using VNTR genotyping, revealed that 42% (59/141) and 68% (25/37) of recurrent TB cases were due to exogenous reinfection [40, 41]. A two-year follow-up study of 249 TB patients from Anhui Province revealed that two of five recurrent patients had exogenous reinfection [42].
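The paired-genotype comparison described above can be sketched as a simple rule. The cutoff for calling two VNTR profiles "different" varies between studies and is not specified here, so the `max_locus_diff` parameter (default: any differing locus counts as a new strain) is an assumption, as are the locus names and repeat counts in the example.

```python
def classify_recurrence(first: dict, second: dict, max_locus_diff: int = 0) -> str:
    """Label a recurrent episode by comparing VNTR profiles (locus -> repeat count)."""
    if first.keys() != second.keys():
        raise ValueError("both episodes must be typed at the same loci")
    # Number of loci whose repeat count changed between the two episodes.
    differing = sum(first[locus] != second[locus] for locus in first)
    return "relapse" if differing <= max_locus_diff else "reinfection"


# Hypothetical profiles for two patients, each typed at the same three loci.
episode_1 = {"locus_1": 3, "locus_2": 5, "locus_3": 2}
same_strain = {"locus_1": 3, "locus_2": 5, "locus_3": 2}
new_strain = {"locus_1": 4, "locus_2": 5, "locus_3": 7}

print(classify_recurrence(episode_1, same_strain))  # relapse
print(classify_recurrence(episode_1, new_strain))   # reinfection
```

With a low-resolution locus set, a patient reinfected by a closely related strain would be misread as a relapse, so the proportions of reinfection reported with VNTR typing alone are best read as lower bounds.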
Similarly, the development of drug resistance among patients with a treatment history was believed to be caused by reactivation of the original strains from previous episodes. However, we observed that 84% (32/36) of patients with new resistance profiles had in fact been reinfected with a drug-resistant strain of a new VNTR genotype [43]. Recently, we expanded the sample size and, using both VNTR genotyping and WGS analysis, found the proportion of exogenous reinfection to be 59% (48/81) among retreated drug-resistant patients [44]. These findings suggest a continued force of M. tuberculosis transmission in the community that keeps generating new infections.
Transmission of MDR-TB in China
The prevalence of MDR-TB poses a serious challenge in China as well as in other high-TB-burden countries. In 2007, the National Drug Surveillance in China estimated that 37.7% of TB cases were resistant to at least one drug (DR) and 8.3% were MDR-TB, of which fewer than 10% could be diagnosed and treated [2]. In general, poor compliance with chemotherapy or inappropriate treatment during anti-TB therapy can lead to a high risk of acquired drug resistance. The national survey reported that a history of multiple treatments significantly increases the likelihood of developing drug resistance. One study also reported treatment history as a risk factor for developing drug resistance, suggesting that inappropriate treatment of TB patients in China may be more common than expected, particularly in general hospitals [2, 45]. Meanwhile, a recent study has shown that the genetic heterogeneity of the M. tuberculosis population within a host is much higher than previously expected, as large numbers of low-frequency mutations were found [46]. These low-frequency mutations, although neglected in most analyses, can give rise to drug-resistant clones when drug pressure weakens (e.g., under inappropriate treatment) [46, 47].
The extensive scale-up of the DOTS strategy has reduced acquired resistance due to inappropriate treatment, but the prevalence of MDR-TB in China still has not been curtailed. This paradoxical situation may be due to the initial spread of MDR strains in the population. Disease surveillance systems usually report the proportion of DR/MDR-TB among new or retreated patients. However, the high proportion of drug resistance among previously treated patients can mislead priorities for controlling MDR-TB. In fact, examining the proportions of new and retreated patients among MDR-TB cases revealed that most cases were new patients rather than retreated patients, suggesting direct transmission of MDR strains [48]. In 2007–2008, the national drug-resistant tuberculosis survey in China reported that more than 70% of DR patients had no treatment history [3]. Similarly, another study from urban and rural settings reported that 71% and 59% of patients with DR-TB and MDR-TB, respectively, had no treatment history [28]. Furthermore, a molecular epidemiological study using WGS and epidemiological investigation provided evidence of recent transmission of MDR M. tuberculosis strains [23], and another study revealed that MDR strains were more likely to be genotypically clustered [28]. Extensive transmission of isoniazid-resistant M. tuberculosis strains is reported to be associated with increased MDR-TB in rural areas of the country [49]. A mathematical modeling study has suggested that if the necessary measures and control strategies are not adopted, 90% of MDR-TB in China is likely to result from the transmission of drug-resistant strains [50].
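The apparent paradox above, in which resistance is far more frequent among retreated patients yet most MDR-TB cases are new patients, follows from simple proportions. The symbols below are introduced here for illustration only and do not come from the cited surveys: let the fraction of notified patients who are new be one quantity, and let the MDR proportions among new and retreated patients be two others.

```latex
% Assumed symbols (not from the source): \pi is the fraction of notified TB
% patients who are new; r_n and r_r are the MDR proportions among new and
% retreated patients, respectively.
\[
  S_{\text{new}} = \frac{\pi\, r_n}{\pi\, r_n + (1-\pi)\, r_r},
  \qquad
  S_{\text{new}} > \tfrac{1}{2}
  \iff
  \frac{\pi}{1-\pi} > \frac{r_r}{r_n}.
\]
```

In words, new patients make up the majority of MDR-TB cases whenever they outnumber retreated patients by a larger factor than the factor by which resistance is more frequent among retreated patients.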
Current deficiencies and future challenges
Molecular epidemiological investigations in TB have been carried out for more than a decade in China. However, only a few population-based studies systematically evaluated the contribution of recent transmission of TB to the disease burden. Current evidence suggests that recent transmission of M. tuberculosis plays an important role in the generation of new TB cases and high MDR-TB burden. These findings may help devise targeted and effective control strategies for TB epidemics in China. However, due to the limitations and challenges in study design, case finding, and genotyping methods, the impact of recent transmission on the disease epidemic may be underestimated.
(1) Lack of prospective molecular epidemiological studies and incomplete sampling schemes. Ideally, a molecular epidemiological investigation needs to incorporate all clinical isolates in a population over a certain period of time. Various factors, such as the length of the study, the case notification rate, and the proportion of culture-positive cases, may affect sample collection and lead to underestimation of the recent transmission rate [27]. In most regions in China, because of limited resources, the current case-finding strategy is passive and relies mainly on symptomatic individuals voluntarily seeking medical care and treatment. An active case-finding study from a rural area in Sichuan Province revealed that the passive strategy could capture only 25% of total TB cases compared with active case finding [51]. In our study of both rural and urban settings, Shanghai was the only region with intensive case finding, combining active case finding with a routine case-recommendation strategy for many years, which resulted in a reliable case notification rate. Thus, the recent transmission rate of 32% in Shanghai is much more convincing than the rates observed in the rural areas in that study, which had a high TB burden but low case notification rates [28]. In high-TB-burden countries such as China, high cost is one of the main barriers to implementing active case-finding interventions. However, a mathematical modeling study suggests that even at a cost of up to $4000 per case found, active case finding in China would still be highly cost-effective over a long time frame [52]. Therefore, the exploration of novel active case-finding strategies in China needs to continue.
(2) Another challenging situation is the low proportion of bacteriologically confirmed TB cases in China. A study from our group in five provinces revealed that only 30% of active TB cases were culture positive. The reasons behind this finding are not clear; they could include the low sensitivity of the diagnostic method and/or the low quality of diagnostic procedures. Such a low proportion of culture-positive TB cases can reduce the effectiveness of molecular epidemiological studies in delineating the real TB burden caused by recent transmission. Thus, strategies to build capacity and improve the quality of sputum sampling procedures, microscopy testing, and bacterial culture are urgently needed to increase the proportion of bacteriologically confirmed TB cases in China. A long time frame (i.e., more than three years) is also important for assessing the extent of recent transmission, since a short study period would miss recently infected individuals.
(3) Non-availability of a standard genotyping method throughout the country. Given the absence of uniform genotyping methods, genotypic data from different studies and laboratories are not comparable. Based on the evidence reviewed above, we recommend the national scale-up of the “9+3” loci pattern or an optimized pattern based on this scheme. Moreover, the establishment of a data center could effectively promote the nationwide sharing of genotypic data. This capacity building would facilitate comparison among studies and further strengthen attempts to investigate the transmission of M. tuberculosis nationwide, especially among the enormous internal migrant populations moving from rural areas with high TB prevalence to cities with low TB prevalence. Meanwhile, with its reduced cost, WGS can become a routine method and a good alternative for distinguishing clinical isolates and establishing recent transmission events.
Summary
Endogenous reactivation can be regarded as the reservoir of TB, with recent transmission of M. tuberculosis as the source that continually adds to the reservoir. If recent transmission cannot be controlled in time, it may erode current achievements and even increase the disease burden. Therefore, large-scale prospective molecular epidemiological studies of TB based on intensive case finding need to be established. Capacity building in the diagnosis of culture-positive TB cases is another important aspect for the precise assessment of recent TB transmission. In addition, an epidemiological meta-database linked to genomic data needs to be developed, as it will allow molecular epidemiological findings to be translated into corresponding prevention and control measures. We suggest that targeted interventions aimed at controlling recent transmission of M. tuberculosis, based on molecular epidemiological studies, will be an effective means of rapidly decreasing incident TB cases and can serve as an example for TB control in China and other high-disease-burden countries.
Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature