Introduction
Microorganisms from diverse environments naturally produce a wide variety of chemicals that have found use as commodities, fuels, food additives, pharmaceuticals, animal feed supplements and polymers. Production of chemicals from renewable sources has become a major focus of the biotechnology industry as petroleum sources become scarcer and riskier to acquire (Stephanopoulos, 2007). As the market for chemicals has grown rapidly amid increasing competition, researchers have been working to raise the efficiency of chemical production as high as possible.
Traditionally, industrial microorganisms have been developed via multiple rounds of random mutagenesis and selection. However, this approach often causes unwanted alterations in the genome, the consequences of which cannot be easily identified. Thus, it is often difficult to further improve the strains developed by random mutagenesis. For this reason, approaches for strain improvement have shifted to rational metabolic engineering (targeted metabolic engineering), which purposefully modifies genes and pathways to enhance production of a desired metabolite (Bailey, 1991; Park and Lee, 2008). Because it is targeted and site-specific, this strategy has improved strain performance in several cases. However, it also has limitations. Such attempts have been restricted to the manipulation of only a handful of genes encoding enzymes and regulatory proteins, selected using available information and research experience. Furthermore, the scope of engineering is often local rather than genome-wide, and the entire metabolic network is often neglected. These constraints limit the extent of strain improvement.
Recent advances in genomics and other omics technologies, combined with computational analysis, are now opening a new avenue toward strain improvement (Lee et al., 2005, 2007; Park et al., 2007). Systems biology analysis, powered by large-scale omics analyses and computational tools, allows rapid evaluation of the global physiology of a cell. The results of this analysis can be used to predict target genes to be amplified or deleted (Park et al., 2008). Approaches for strain improvement are therefore shifting to systems metabolic engineering, which upgrades metabolic engineering with systems biology tools and offers a promising route to strain improvement. In this article, we briefly review recent advances in the application of systems biology for strain improvement.
Omic analyses for strain improvement
Advances in omics present both challenges and opportunities for metabolic engineering. By comparing omic profiles between different strains, or between samples obtained at different time points and/or under different culture conditions, possible target genes for modification can be identified.
As DNA sequencing has become faster and cheaper, the genome sequences of many microorganisms have been completed and many more are in progress. Genomic information has become a basis for performing functional studies. Comparative analysis of genomes is a relatively simple yet powerful way of identifying the genes that need to be overexpressed and/or deleted to achieve a desired metabolic phenotype. Genomes of various strains can be compared, as can those of wild-type and mutant or engineered strains. Ohnishi et al. (2002) developed a genome-based approach to create a minimally mutated strain for efficient production. They compared the genome sequence of a lysine-overproducing Corynebacterium glutamicum strain with that of the wild-type strain to identify mutations in three genes that might be beneficial for the efficient production of L-lysine. These mutations were introduced into the wild-type C. glutamicum, and the productivity of L-lysine increased from 1.90 g·L⁻¹·h⁻¹ to 2.96 g·L⁻¹·h⁻¹. They further improved the strain by introducing the gnd mutation (6-phosphogluconate dehydrogenase) (Ohnishi et al., 2005) and the mqo mutation (malate:quinone oxidoreductase) (Ikeda et al., 2006), selected on the basis of genome comparison. For cases in which strains with different production levels are not available, Lütke-Eversloh and Stephanopoulos (2008) developed a genome-based combinatorial overexpression approach to identify enzymatic bottlenecks in metabolic pathways. By combinatorial overexpression of aromatic amino acid biosynthesis genes in an L-tyrosine-producing E. coli strain, ydiB, aroK and tyrB (coding for shikimate dehydrogenase, shikimate kinase and aromatic amino acid transaminase, respectively) were identified as major overexpression targets for improved L-tyrosine production. These genes were introduced into the L-tyrosine-producing E. coli strain, and the L-tyrosine titer increased by 26%-45%. A similar strategy was also applied to search for key genes for lycopene production in E. coli (Jin and Stephanopoulos, 2007). By screening genomic libraries of E. coli in a sequential-iterative manner, several novel overexpression targets were identified. After these overexpression targets were combined with knockout targets, the best engineered E. coli strain (T5p-dxs, T5p-idi, rrnBp-yjiD-ycgW, ΔgdhΔaceEΔfdhF, pACLYC) was constructed, which accumulated 16 mg of lycopene per gram of cells. We also applied this approach to engineer E. coli for improved lycopene production and found that overexpression of dxs, idi, yjiD, ycgW, and rpoS improved lycopene production (Fig. 1).
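The genome comparison described in this paragraph can be illustrated with a minimal Python sketch. It assumes that the wild-type and producer gene sequences have already been aligned to equal length, and it only reports point substitutions. The gene names and sequences are invented placeholders, not data from the cited studies.

```python
def point_mutations(reference, variant):
    """List (position, ref_base, var_base) for each mismatch; positions are 1-based."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be pre-aligned to the same length")
    return [
        (pos, ref, var)
        for pos, (ref, var) in enumerate(zip(reference, variant), start=1)
        if ref != var
    ]


def candidate_genes(wild_type, producer):
    """Return {gene: mutations} for genes whose sequence differs between two strains."""
    changed = {}
    for gene, ref_seq in wild_type.items():
        if gene not in producer:
            continue  # gene missing from the producer annotation; skipped in this sketch
        mutations = point_mutations(ref_seq, producer[gene])
        if mutations:
            changed[gene] = mutations
    return changed


# Hypothetical toy sequences (placeholders only).
wild_type = {"geneA": "ATGGCTAAA", "geneB": "ATGTTTCCC"}
producer = {"geneA": "ATGACTAAA", "geneB": "ATGTTTCCC"}
print(candidate_genes(wild_type, producer))
```

In practice the list of differing genes is only a starting point: each candidate mutation still has to be introduced into the wild-type background and tested, as in the lysine example.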
Transcriptome profiling using DNA microarrays allows the mRNA transcript levels of thousands of genes to be examined simultaneously. By comparing transcriptome profiles between different strains, or between samples obtained at different time points and/or under different culture conditions, it is possible to understand cell physiology and regulatory mechanisms at the whole-cell transcript level and to identify potential target genes for manipulation. The new information generated in this way can be used to engineer local metabolic pathways for strain improvement. Upregulated genes involved in product synthesis can be further amplified, whereas those that hinder product formation can be removed (Park et al., 2008). In addition, downregulated genes that are necessary for the overproduction of a desired product can be amplified to achieve overproduction (Park et al., 2007). Transcriptome profiles of recombinant E. coli producing human insulin-like growth factor I fusion protein (IGF-If) before and after induction during high-cell-density fed-batch culture were analyzed (Choi et al., 2003). About 200 genes were significantly down-regulated. Of these, the prsA and glpF genes, which are involved in an early key step in the biosynthetic pathway of nucleotides and amino acids and in the first step of glycerol utilization, respectively, were selected for overexpression. Overexpression of these two genes significantly increased IGF-If production (from 1.8 to 4.3 g/L) and productivity (from 0.36 to 0.82 g·L⁻¹·h⁻¹). Comparative transcriptome profiling was performed during batch fermentation of an E. coli strain engineered for L-valine production and of the control strain (Park et al., 2007). Among the down-regulated genes, overexpression of lrp, ygaZH (encoding the global regulator Lrp and an L-valine exporter, respectively), and lrp-ygaZH together enhanced L-valine production by 21.6%, 47.1%, and 113%, respectively. Transcriptome data of recombinant E. coli AK1 under two sets of conditions, with and without xylitol production, were compared, and 56 genes were found to be down-regulated during xylitol production (Hibi et al., 2007). Of the gene disruptants, a yhbC-deficient strain showed improved xylitol production. This result indicates that combining transcriptome analysis with phenotype tests of single-gene knockout mutants is a useful method for strain improvement. The combination of comparative transcriptome analysis, nucleotide sequence analysis, and knockout of key genes identified the cellular mechanisms and possible bottlenecks for phenol production by Pseudomonas putida S12TPL3 (Wierckx et al., 2008).
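The target-selection step common to these studies, ranking genes by how strongly their transcript level drops between two conditions, can be sketched as follows. This is a simplified illustration only: the gene names, intensities, pseudocount and the two-fold cutoff are all assumptions made for the example, and real microarray analyses also involve normalization and statistical testing that are omitted here.

```python
import math


def log2_fold_changes(before, after, pseudocount=1.0):
    """log2(after / before) per gene; the pseudocount avoids division by zero."""
    return {
        gene: math.log2((after[gene] + pseudocount) / (before[gene] + pseudocount))
        for gene in before
        if gene in after
    }


def down_regulated(before, after, threshold=-1.0):
    """Genes whose log2 fold change is at or below the threshold, strongest first."""
    changes = log2_fold_changes(before, after)
    hits = [gene for gene, value in changes.items() if value <= threshold]
    return sorted(hits, key=lambda gene: changes[gene])


# Hypothetical signal intensities before and after induction (placeholders only).
before = {"geneA": 799.0, "geneB": 399.0, "geneC": 100.0}
after = {"geneA": 99.0, "geneB": 99.0, "geneC": 201.0}
print(down_regulated(before, after))
```

Genes returned by such a filter are candidates for overexpression only if they are also plausibly required for product formation, which is the judgment applied to prsA, glpF, lrp and ygaZH in the studies above.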
Proteomics plays an important role not only in biological research but also in various biotechnological applications, because most cellular metabolic activities are directly or indirectly mediated by proteins. In comparative proteome analysis under genetically or environmentally different conditions, protein spots that show altered intensities can be identified for further analysis and manipulation. The proteome of E. coli XL1-Blue metabolically engineered for poly(3-hydroxybutyrate) (PHB) production was compared with that of the control E. coli strain; the comparison helped to elucidate the mechanism of PHB production and revealed the importance of Eda (2-keto-3-deoxy-6-phosphogluconate aldolase) in PHB production by engineered E. coli (Han et al., 2001). The proteome of the pyruvate kinase knockout mutant (pyk-F) E. coli JW1966 was compared with that of the parent E. coli BW25113, and the main enzymes of the aromatic amino acid biosynthetic pathway, such as 3-deoxy-D-arabino-heptulosonate 7-phosphate (DAHP) synthase (aroF, G, H), shikimate dehydrogenase (aroE), shikimate kinase (aroK, L), chorismate synthase (aroC), prephenate dehydratase (pheA), aminotransferase, anthranilate synthetase, and tryptophan synthetase, were found to be significantly up-regulated in the mutant compared with the wild-type strain (Kedar et al., 2007). Overexpression of these enzymes plays an important role in increasing the production of phenylalanine, tyrosine, and tryptophan in the E. coli mutant.
With advances in high-throughput quantitative analysis of metabolites, comparative analysis of metabolome profiles under genetic and/or environmental perturbations makes it possible to analyze the physiological states of cells (Bartek et al., 2008). The number of metabolites in a cell is far smaller than the number of genes; however, many metabolites are still unknown or difficult to detect. The fluxome, which is the collection of metabolic fluxes inside a cell, can be determined by isotopomer-based flux analysis or by a genome-scale in silico metabolic model of the microorganism. If metabolome data can be analyzed together with fluxome data, metabolome profiling will become a widely used tool in systems metabolic engineering.
Recently, combined omics analyses have become a powerful tool for understanding metabolic mechanisms and developing strains. The metabolic fluxome and transcriptome of the pantothenate production strain C. glutamicum ATCC 13032 ΔilvA P-ilvEM3 (pJC1ilvBNCD) (pMM55) were compared with those of the control strain to characterize the physiology of the pantothenate producer, which suggested new strategies for improving the strain (Hüser et al., 2005). Yoshida et al. (2008) compared the metabolome and transcriptome of the bottom-fermenting yeast Saccharomyces pastorianus (which produces high levels of SO₂ and H₂S under anaerobic conditions) with those of baker’s yeast S. cerevisiae (which does not produce SO₂ and H₂S) and found that O-acetylhomoserine (OAH) is the rate-limiting factor for the production of SO₂ and H₂S. Appropriate genetic modifications were then introduced into a prototype strain to increase metabolic fluxes from aspartate to OAH and from sulfate to SO₂, resulting in high SO₂ and low H₂S production.
Computational modeling and simulation
Mathematical modeling and simulation of microorganisms have become effective tools for systems biology. In silico simulation of genome-scale metabolic models can also suggest potential targets for further strain improvement. Recently, flux balance analysis (FBA) has become increasingly popular because it allows metabolic fluxes to be determined and metabolic characteristics to be described with acceptable accuracy and without requiring detailed kinetic information. First, a stoichiometric model or metabolic network is reconstructed based on genomic information and the literature. It is then simulated by an optimization technique, typically linear programming, using an appropriate objective function (e.g. maximization of cell growth rate, minimization of metabolic adjustment, regulatory on/off minimization, etc.) and constraints that restrict the solution space to within the cell’s capacity. In particular, the effects of gene knockouts on the metabolic flux distribution can be easily examined by setting the fluxes of the knocked-out reactions to zero. In general, the method is used to identify targets for disruption. Recently, several successful examples of metabolic engineering based on the results of in silico simulation have appeared. Burgard et al. (2003) applied FBA using two objective functions (maximization of cell growth rate and minimization of metabolic adjustment) to identify knockout genes for succinic acid, lactic acid and 1,3-propanediol production by E. coli. The results were in good agreement with experiments on the mutant strains. Alper et al. (2005) applied FBA subject to minimization of metabolic adjustment to identify three knockout genes for lycopene production by engineered E. coli and then tested the predictions experimentally by constructing the corresponding single, double, and triple gene knockout mutants. Of these mutants, the triple knockout mutant showed a nearly 40% increase in lycopene production over the parental strain. We also carried out such studies and found that the gdhA aceE double knockout mutant had a higher lycopene yield (3.8 mg·g⁻¹ DCW) than the gdhA aceE fdhF triple knockout mutant (3.6 mg·g⁻¹ DCW). The level of lycopene production of the double knockout mutant was 3.8 times that of the parental strain (Table 1). Fong et al. (2005) developed a novel method combining FBA and adaptive evolution for strain improvement. They first constructed mutants overproducing lactic acid based on the results of FBA and then conducted adaptive evolution experiments. It was confirmed that the mutants did evolve toward maximization of growth rate and lactic acid secretion rate. The theoretically optimal methionine yields of C. glutamicum and E. coli were predicted by FBA (Krömer et al., 2006). FBA showed that the two organisms had completely different optimal flux distributions and that the theoretical optimal methionine yield of E. coli was higher than that of C. glutamicum. Valuable strategies for strain and process improvement were proposed on the basis of FBA. Lee et al. (2008) reconstructed the genome-scale metabolic network of Clostridium acetobutylicum ATCC 824 from its annotated genomic sequence and successfully applied the in silico model to predict metabolic fluxes during the acidogenic and solventogenic phases using classical FBA and nonlinear programming, respectively. The results show that decreasing the hydrogenase flux is one strategy for enhancing the butanol-producing capability of C. acetobutylicum.
Recently, the combination of in silico simulation and omics analysis has become a powerful tool for metabolic engineering. Lee et al. first constructed E. coli overexpressing the ppc, aceBA and rhtC genes on the basis of transcriptome analysis, which increased threonine production. The acs gene (encoding acetyl-CoA synthetase) was then introduced to reduce acetic acid production, based on the results of in silico simulation. The final engineered E. coli strain produced threonine with a high yield of 0.393 g per gram of glucose and a titer of 82.4 g/L in fed-batch culture. They also applied the strategy combining transcriptome profiling with in silico simulation to engineer E. coli for L-valine production (Park et al., 2007). The engineered E. coli strain (Val ΔaceF Δmdh ΔpfkA, overexpressing the ilvB, ilvCED, ygaZH and lrp genes) produced 7.55 g/L L-valine from 20 g/L glucose in batch culture, corresponding to a high yield of 0.378 g of L-valine per gram of glucose.
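The FBA procedure described in this section (a stoichiometric model, a growth-maximizing objective, and knockouts imposed by fixing the corresponding fluxes at zero) can be sketched in Python with SciPy's `linprog`. The five-reaction network below is a hypothetical toy invented for illustration, not a model from any of the cited studies: substrate A is converted to a biomass precursor B either efficiently (r1) or by a less efficient route (r2) that also forms a by-product P.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (columns): uptake, r1, r2, growth, export_P
#   uptake:   -> A          r1: A -> B          r2: A -> 0.5 B + P
#   growth:   B ->          export_P: P ->
REACTIONS = ["uptake", "r1", "r2", "growth", "export_P"]
S = np.array([
    [1.0, -1.0, -1.0, 0.0, 0.0],   # mass balance for A
    [0.0, 1.0, 0.5, -1.0, 0.0],    # mass balance for B
    [0.0, 0.0, 1.0, 0.0, -1.0],    # mass balance for P
])
UPPER_BOUNDS = {"uptake": 10.0}    # substrate uptake capacity (arbitrary units)


def fba(knockouts=()):
    """Maximize the growth flux subject to S v = 0 and flux bounds."""
    bounds = []
    for name in REACTIONS:
        if name in knockouts:
            bounds.append((0.0, 0.0))  # a knocked-out reaction carries no flux
        else:
            bounds.append((0.0, UPPER_BOUNDS.get(name, 1000.0)))
    objective = np.zeros(len(REACTIONS))
    objective[REACTIONS.index("growth")] = -1.0  # linprog minimizes, so negate
    result = linprog(objective, A_eq=S, b_eq=np.zeros(S.shape[0]),
                     bounds=bounds, method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return dict(zip(REACTIONS, result.x))


wild_type = fba()
knockout = fba(knockouts=("r1",))
print("wild type:", wild_type)
print("r1 knockout:", knockout)
```

In this toy network the wild type grows at 10 flux units and secretes no P, whereas the r1 knockout can grow only through r2, so it grows at 5 units and secretes 10 units of P. This is the general logic behind knockout prediction: removing a reaction forces the growth-optimal flux distribution through a route that carries product formation with it. Objectives such as minimization of metabolic adjustment require quadratic rather than linear programming and are not covered by this sketch.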
Directed evolution of metabolic pathways
Directed evolution of metabolic pathways is a powerful tool for metabolic engineering in the post-genome era. Directed evolution incorporates Darwinian principles of mutation and selection into experimental strategies for improving biocatalysts or cellular properties. In the laboratory, directed evolution comprises two discrete components: first, genetic diversity is created by producing a library of genetic variants; second, the library is evaluated by genetic selection and high-throughput screening to identify variants with the required functions. Directed evolution of metabolic pathways builds on the directed evolution of enzymes: evolving key enzymes in a metabolic pathway can raise the production level of the cell. The approach has been successfully used in the directed evolution of polyhydroxyalkanoate biosynthesis (Kichise et al., 2002), doramectin biosynthesis (Stutzman-Engwall et al., 2005), and lycopene biosynthesis (Wang et al., 2000). For example, directed evolution of geranylgeranyl diphosphate synthase from Archaeoglobus fulgidus enhanced lycopene production by 100% in metabolically engineered E. coli (Wang et al., 2000).
Genome shuffling is another approach to the directed evolution of metabolic pathways. It has been successfully applied to improve production of the polyketide antibiotic tylosin in Streptomyces fradiae (Zhang et al., 2002), to improve acid tolerance in Lactobacillus (Patnaik et al., 2002), and to improve the degradation of pentachlorophenol by Sphingobium chlorophenolicum ATCC 39723 (Dai et al., 2004). In genome shuffling, improved mutants are first obtained by mutagenesis or chemostat-mediated adaptation. The pooled population is then shuffled by homologous recombination through protoplast fusion. Improved progeny are selected and subjected to the next round of shuffling. In general, the efficiency of genome shuffling depends on the diversity of the mutant population used. Hida et al. (2007) applied genome shuffling to Streptomyces sp. U121 to achieve rapid improvement in (2S,3R)-hydroxycitric acid (HCA) production. The best mutant showed increased cell growth in flask culture as well as increased HCA production.
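The cycle of pooling, recombination and selection in genome shuffling can be caricatured in a few lines of Python. This is a conceptual sketch only: genomes are reduced to lists of 0/1 alleles, uniform crossover stands in for recombination after protoplast fusion, and the `fitness` function (a count of beneficial alleles) is a hypothetical stand-in for a real screen such as product titer.

```python
import random


def fitness(genome):
    """Toy screen: number of beneficial alleles (1s) carried by a strain."""
    return sum(genome)


def recombine(parent_a, parent_b, rng):
    """Uniform crossover: each locus is inherited from one of the two parents."""
    return [rng.choice(pair) for pair in zip(parent_a, parent_b)]


def genome_shuffling(population, rounds, keep, rng):
    """Select the best strains, recombine them, and pool parents with progeny."""
    n_progeny = len(population) - keep
    for _ in range(rounds):
        parents = sorted(population, key=fitness, reverse=True)[:keep]
        progeny = []
        for _ in range(n_progeny):
            parent_a, parent_b = rng.sample(parents, 2)
            progeny.append(recombine(parent_a, parent_b, rng))
        # Parents are retained, so the best fitness in the pool never decreases.
        population = parents + progeny
    return max(population, key=fitness)


# Hypothetical starting pool: eight mutants, each with one distinct beneficial mutation.
rng = random.Random(0)
start = [[1 if locus == strain else 0 for locus in range(8)] for strain in range(8)]
best = genome_shuffling(start, rounds=5, keep=4, rng=rng)
print(fitness(best), best)
```

Because only the selected parents contribute alleles to later rounds, beneficial mutations carried by discarded strains are lost, which mirrors the point that the efficiency of genome shuffling depends on the diversity of the starting mutant population.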
Future perspectives
The biosynthetic pathways of chemicals are complex and tightly regulated. Thus, microorganisms do not normally produce high levels of these chemicals. With advances in omics technology, biological research has moved into the era of systems biology, which presents both challenges and opportunities for metabolic engineering. Metabolic engineering based on systems biology allows the development of superior strains, and this approach is expected to prevail in the future. However, the application of systems biology to strain improvement has only just begun. Although significant advances have been made in bioinformatics and systems biology to interpret complex omics data, new methods need to be developed for their integrated analysis. In addition to the in silico genome-scale metabolic models already reported in the literature, novel genome-scale models that integrate regulatory circuits with metabolism should be developed for metabolic engineering. Moreover, genome-scale kinetic models that can predict dynamic, time-dependent metabolic fluxes need to be developed so that changes in cellular metabolic and regulatory circuit characteristics over time can be observed. In addition, methods that take fermentation and downstream processes into account at the early stage of strain development would be beneficial.
Fermentation and downstream processes are also important factors that affect the product cost. Systems biology can play an important role in these processes as well, although so far there are only a few successful examples of its application in bioprocess optimization (Gupta and Lee, 2007).
Systems biology will have an increasing impact on industrial biotechnology in the future. All steps of biotechnological development, from up-stream and mid-stream to down-stream processes, will benefit significantly from the application of systems biological approaches.
Higher Education Press and Springer-Verlag Berlin Heidelberg