Introduction
As a complex interactive nonlinear system, much attention has been gained by the study related to the metabolic networks in recent years. Efforts have also been made for the analysis of the structure of whole metabolic networks (
Feist and Palsson, 2008). The use of flux balance analysis in order to predict the flux distribution in an entire metabolic network under certain physiological conditions is to analyze and compare the metabolic networks related to different organisms. There is already the availability of genome scale reconstructed models of the different organisms like
Helicobacter pylori, Mycobacterium tuberculosis etc. These metabolic networks are obtained from different metabolic models databases (
Hamilton and Reed, 2012). Flux Balance Analysis (FBA) is a widely used constraint based approach for studying biochemical metabolic networks, mainly the genomic scale metabolic network that have been reconstructed in the past decade. It uses the concept of linear programming. Linear programming obtains the maximum potential of the objective function that we are looking at, and therefore, when using flux balance analysis, a single solution is obtained for the optimization problem (
Raman and Chandra, 2009). FBA calculates the flow of metabolites taking place in the metabolic network, hence making it possible to predict the growth speed of an organism or the rate of production of a biochemically important metabolite (Edwards and Palsson, 2002). Essential genes comprise of genes whose specific deletion is lethal under a specific environmental condition. Identification of essential genes in the metabolic network of microorganisms helps to identify potential drug targets and in understanding of minimal requirements for a synthetic cell. However, experimental assessment of essentiality of the coding genes of metabolic network is resource intensive and not viable for all bacterial organisms, in particular if they are infective (
Plaimas et al., 2010;Chowdhury et al., 2015). A constraint-based analysis of reconstructed metabolic networks has proved to be quite useful in various applications such as metabolic engineering prediction of outcomes of gene deletions, drug-target identification and in the elucidation of cellular regulatory networks (
Raman et al., 2005). By analogy, Synthetic Lethals (SLs) refer to the pair of non-essential genes whose simultaneous deletion is proved to be lethal. Synthetic gene lethality can arise for different types of reasons. For example, two protein products can be interchanged with respect to an essential function that act in similar pathway or function in two distinct pathways with redundant or complementary essential functions. The study of synthetic lethality plays an important role in explaining functional links between genes and the prediction of the gene functions (
Suthers et al., 2009;
Chowdhury et al., 2015; Raman et al., 2015). Occurrence of synthetic lethality can be seen between genes and small molecules, and it can be used to explain the mechanism of action of drugs.
The true potential of synthetic lethality has been widely studied in yeast (
Nijman, 2011). Biological systems are prone to mutation, to the variation in environmental conditions and to random variations in the abundance of component molecules of the biological systems. Many physiological and developmental systems are robust to such disturbances and hence immune to any such fluctuations (
Masel and Siegal, 2010). Biological systems that have been experimentally proved of being robust to the significant changes in their environmental conditions need mathematical models for robustness analysis of the metabolic network. These mathematical models should themselves be robust. The necessary condition for the model robustness is that the dynamics of the model should be insensitive to small variations in the parameters of the model. The model dynamics may be very much sensitive to simultaneous parameter variations (Kim et al., 2006). Phenotype phase plane analysis can study the optimal utilization of the metabolic network as a function of the constraints. At present metabolic flux maps are typically calculated for the single growth condition, hence gives a limited view of the metabolic phenotype – genotype relation. But phenotypic phase plane analysis maps all the growth conditions characterized by two environmental variables presented in the single plane (
Edwards et al., 2001). Fast-SL is an algorithm which computes combinations of reactions, which when deleted, leads to the elimination of the growth of the organism due to the modification in the metabolic pathway. It achieves this by a combination of limiting the search space and exhaustively iterating through the remaining combinations (
Pratapa et al., 2015). The functional
in silico model of
Mycobacterium tuberculosis, iNJ6611, contains 661 genes, 1025 reactions and 826 metabolites. Genome scale models can be used as hypothesis generating tools and also for analysis and discovery which will expectantly support the rational drug development process (
Jamshidi and Palsson, 2007). The functional
in silico model of bacterium
Staphylococcus aureus,
iSB619, consists of 619 genes by which 640 metabolic reactions are catalyzed. The reaction list is the most complete till date for this pathogen (
Becker and Palsson, 2005;
Heinemann et al., 2005). The reconstructed network,
iIT341 represents a detailed review of the current literature about
Helicobacter pylori as it integrates biochemical and genomic data in a comprehensive framework. In total, it contains 341 metabolic genes, 476 intracellular reactions, 78 exchange reactions, and 485 metabolites (
Schilling et al., 2002; Palsson et al., 2005). Salmonella enterica subspecies I serovar Typhimurium is a human pathogen which causes different diseases and its increasing antibiotic resistance poses many public health problems. Different diseases caused by Salmonella are mainly Typhoid fever, Paratyphoid fever and Food poisoning. The functional
in silico model of
Salmonella typhimurium,
STM_v1_0, consists of 1271 genes, 1802 metabolites and 2545 reactions (
McClelland et al., 2001;
Thiele et al., 2011). Methods of Constraint Based Reconstruction and Analysis (COBRA) have been successfully applied in the field of microbial metabolic engineering and are being extended to model transcriptional and signaling networks and in the field of public health. The COBRA approach focuses on using physicochemical and biological constraints to analyze the set of feasible phenotypic states of a reconstructed biological network under a given condition (
Schellenberger et al., 2011). SBMLToolbox, a toolbox that facilitates importing and exporting models presented in the Systems Biology Markup Language (SBML) format in the MATLAB environment. It also provides functionality that helps an experienced user of either SBML or MATLAB to combine the computing power of MATLAB with the exchangeability and portability of an SBML model. SBMLToolbox supports all levels and versions of SBML (
Keating et al., 2006). In this work, we describe the similarities and dissimilarities of metabolic networks of
Salmonella typhimurium (STM_v1_0),
Helicobacter pylori (iIT341),
Mycobacterium tuberculosis (iNJ661) and
Staphylococcus aureus (iSB619) by pairwise comparison with the help of flux balance analysis and gene deletion method.
Materials and methods
The comparative study of metabolic network for finding the similarity and dissimilarities is conducted for the following four pathogenic bacteria.
1) Mycobacterium tuberculosis (iNJ661),
2) Staphylococcus aureus (iSB619),
3) Salmonella typhimurium (STM_v1_0),
4) Helicobacter pylori (iIT341).
The genome scale metabolic networks of the four pathogenic organisms were collected from different sources like BiGG database, BioModel database and TheModelSeed database. The MATLAB compliant files of the four given organisms were downloaded from the BiGG Database.
Flux balance analysis
The constraints in FBA have the form α≤vi≤β where vector vi represents all the fluxes in the given metabolic network, α and β are the lower and upper limits. Thermodynamic constraints according to the reversibility and irreversibility of a reaction can be applied by setting the lower limit (α) for the equivalent flux to 0 if the reaction is irreversible and −1000 if the reaction is reversible. The upper limit is set to 1000 for reversible and irreversible reactions. The COBRA toolbox was used to run the FBA. For constraint based modeling, COBRA toolbox is used which has a collection of MATLAB scripts that are run from within the MATLAB environment. The MATLAB compliant metabolic network files of each of the four organisms were loaded in the MATLAB. A specific metabolic flux distribution within the feasible set was found by using linear programming (LP). The Gurobi solver package was used to perform linear programming in COBRA Toolbox. The fluxes of all reaction were obtained and the growth rate of the organism was calculated by setting the biomass reaction as the objective function.
Simulating gene knockouts
Gene-protein-reaction associations are embodied in rxnGeneMat matrix in the MATLAB files, which is a matrix with rows equal to number of reactions in the model and columns equal to number of genes in the model. The function –singleGeneDeletion and doubleGeneDeletion performs the single and double gene deletion respectively.
Performing synthetic lethality analysis
It was performed using the Fast – SL algorithm that is implemented using the MATLAB.
Code to perform the synthetic lethal reaction analysis- fastSL (model, cutoff, order, eliList, atpm)
Code to perform the synthetic lethal gene analysis- fastSLgenes (model, cutoff, order, flag)
where model is metabolic network file to be analyzed, cutoff is cutoff percentage value for lethality and have default value of 0.01, order is order of synthetic lethals and have default value of 2 and maximum value 3,eliList is the list of the reactions to be excluded from the analysis, flag is set to 1 for rigorous search and have default value 0, atpm is ATP Maintenance Reaction Id in model.rxns if other than ATP Maintenance Reaction.
Robustness analysis
Robustness analysis uses the FBA to analyze the metabolic network properties. In this step, the flux was varied through one reaction and the optimal objective value was calculated as a function of this value. This tells about the sensitivity of the objective value to a particular reaction. We have checked the effect of varying glucose and oxygen uptake on the growth rate. COBRAToolbox has the functionrobustnessAnalysis to perform this method.
Phenotypic phase plane analysis
While performing the Robustness analysis, we varied one parameter and made another parameter to be constant. However, in phenotypic phase plane analysis, we varied two parameters glucose and oxygen uptake rate simultaneously and the results were plotted as phenotypic phase plane. It revealed the interaction between two reactions. COBRA Toolbox has the built in functionphenotypePhasePlane to perform this step.
Metabolite connectivity
Metabolite connectivity is the number of reactions in which the metabolites occur in. In this step the Stoichiometric matrix S of the metabolic model was converted into the binary matrix Sbin. It replaces all nonzero elements in the matrix S with ‘1’ in the binary matrix. Then all ‘1s’ in each row of the Sbin was added up to determine metabolite connectivity.
Reaction essentiality and metabolite connectivity
Correlation between essentiality of reaction and metabolite connectivity can be estimated in the metabolic network. In this step, all the reactions which were associated with the metabolites were deleted and checked whether it could still produce the biomass or not. Reactions associated with the metabolites could be found out by scanning through the corresponding rows in the Stoichiometric S matrix. After then all the associated genes were knocked out and FBA was used to predict the possibility of the growth. Lethality fraction was calculated based on the above result. The semi log graph was plotted with this lethality fraction vs. metabolite connectivity.
Results
Calculation of growth rate using FBA
The growth rates of the four organisms were calculated using flux balance analysis. The maximization of biomass was set as the objective function to calculate the growth rate. The growth rate is calculated in the unit mmol/(gDW·h) (millimole/(gramDryWeight·hour)). The growth rates of four organisms are shown in Table 1.
The table showing growth rates, are the growth rates calculated for the wild type strain of each of the model organism. The objective function for each of the organism was different. Apart from this, growth rates under the alternative substrates like glucose, succinate, pyruvate etc. were also calculated. The calculation of growth rates was done under both aerobic and anaerobic conditions. The growth rates calculated under the alternative substrates were different compared to the wild type ones. For most of the substrate, growth rate under the anaerobic condition was found to be 0 for each of the organism studied. Alternative substrates under which growth rates were calculated are the important substrates that play an important role in the major metabolic pathways like glycosis, TCA cycle, oxidative phosphorylation, amino acid metabolism etc. The growth rates calculated under alternative substrates are shown in the given below tables for each of the organism separately. The maximum substrate uptake rate was set to −20 mmol/(gDW·h) for every substrate. The calculated growth rates under the alternative substrates in the metabolic networks of each of the four different organisms have been shown in the tables from Tables 2 to 5.
Retrieving essential reactions and essential genes from gene knockouts
If a single gene is associated with multiple reactions, the deletion of that gene results in the removal of all reactions associated to the genes. On the other hand, a reaction that can be catalyzed by multiple gene products which do not interact with one other will not be removed in a single gene deletion. The essential reactions were extracted by reaction knockout method. The essential genes were extracted by gene knockout method. By the aid of the COBRA toolbox, we can retrieve the essential reactions in the metabolic network. After gene knockout, a mutant strain of the organism is generated and it also provides the grRatio (gene-reaction ratio) of the mutant strain and wild type strain of the organisms. The grRatio is the ratio between the grRatio of wild type strain and the grRatio of the mutated strain. Those genes and reactions are the essential genes and essential reactions of the organisms, if the grRatio is less than 0.05. Key metabolites affected by gene deletion were 3PG(3-phosphoglycerate), AcCoA (acetyl CoA), ADP, AKG(α-ketoglutaric acid), ATP, CoA(coenzyme-A), E4P(erythrose-4-phosphate), F6P(fructose-6-phosphate), G3P(glycerol-3-phosphate), G6P(glucose-6-phosphate), Gln-L(L-glutamine), Glu-L(L-glutamic acid), H2O, H(hydrogen), NAD, NADH, NADP, NADPH, OAA(oxaloacetic acid), PEP(phosphoenol pyruvate), Pi(inorganic phosphate), pyr(pyruvate), R5P(ribose-5-phosphate). Number of essential reactions and essential genes present in metabolic network of each organism has been shown in Table 6. The essential reactions thus obtained for each of the organism can be compared pairwise with essential reactions of each of the organism. The pairwise comparison was performed with the help of Compare plugin in the Notepad++. Common essential reactions thus obtained by the pairwise comparison are shown in the below given Table 7.
In the recent years, genome-scale metabolic networks have been reconstructed for many organisms (
Thiele et al., 2005,
2011). These networks have been studied using tools such as Flux Balance Analysis (FBA) (
Kauffman et al., 2003), for the identification of targets for metabolic engineering (
Alper et al., 2005) and to understand the robustness of organisms through systematic experimental evaluation of gene knockouts (
Deutscher et al., 2006). Another important aspect of analyzing these reconstructed metabolic networks is the identification of combinations of genes, which when simultaneously deleted, abolish growth
in silico (
Deutscher et al., 2006;
Suthers et al., 2009).
Essential genes and essential reactions obtained by the gene knockout have been listed in the Table 8 and Table 9 respectively. Since this is a quantitative study, the list contains only selected genes which were found to have scientific evidence in the previous studies. They have been analyzed in the past for prediction of novel genetic interactions and analyzing the extent of robustness of biological networks (
Raghunathan et al., 2009).
Gene knockout performed on each of the organism can be visualized using the MATLAB. Double gene deletion was performed for all the genes present in each of the organism. The lethality of double gene deletion can be visualized where the scale of lethality varies from 0 to 1. Scale of 1 refers to highly lethal deletion for the particular two genes and scale of 0 refers the lethal effect to be normal.
Synthetic lethality analysis
It was performed using the algorithm Fast–SL algorithm. It is MATLAB implemented algorithm using COBRAToolbox. The synthetic gene lethality and the synthetic reaction lethality were performed separately. For synthetic gene lethality, both single and double gene lethality was performed. Similarly for synthetic reaction lethality, both single and double reaction lethality was performed. In single gene lethality there was deletion of single gene and similarly in double gene lethality, pair of genes was deleted. Similar procedure was followed for reaction lethality also. The result for gene lethality has been shown in the Table 10. The result for Reaction Lethality has been shown in the Table 11. The table shows the number of single lethal reactions and double lethal reactions. Single lethal reactions were identified by performing exhaustive single reaction deletions. Double lethal reactions were identified by performing pairwise deletions.
Robustness analysis
Robustness analysis was performed to study the effect of the glucose and oxygen uptake on the growth rates. This analysis was performed for metabolic network of each of the organism. The sensitivity of the growth of the metabolic network was checked against the varying glucose and oxygen uptake rate. To determine the effect of varying glucose uptake on growth, oxygen uptake rate was fixed at 17 mmol/(gDW·h) which is a realistic uptake of oxygen. To determine the effect of varying oxygen uptake on growth, glucose uptake rate was fixed at 10 mmol/(gDW·h) which is a realistic uptake of glucose. Results for the robustness analysis of the metabolic networks of each of the four organisms have been shown in the figures from Fig. 1 to Fig.8. Figures 1 and 2 represent the Robustness study of metabolic network ofMycobacterium tuberculosis. In Fig. 1, with oxygen uptake rate kept constant, growth rate remained 0/h until a glucose uptake of around 2.5 mmol/(gDW·h). Growth reached to the maximum level of 0.3/h at the glucose uptake of around 5 mmol/(gDW·h). After the maximum growth rate of 0.3/h growth rate became constant. In Fig. 2, with glucose uptake rate kept constant, growth rate steadily increased with increase in oxygen uptake. Figures 3 and 4 represent the Robustness study of metabolic network ofStaphylococcus aureus. In Fig.3, with oxygen uptake rate kept constant, growth rate steadily increased with increase in glucose uptake rate. In Fig.4, with glucose uptake rate kept constant, growth rate increased steadily for most of the part with increase in oxygen uptake rate. At oxygen uptake of around 2 mmol/(gDW·h), growth rate remained constant at 0.6/h. Figure5 and Fig.6 represent the Robustness study of metabolic network ofHelicobacter pylori. In Fig.5, with oxygen uptake rate kept constant, growth rate steadily increased with increase in glucose uptake rate. In Fig.6, with glucose uptake rate kept constant, growth rate increased steadily for most of the part with increase in oxygen uptake rate. Figure7 and Fig.8 represent Robustness study of metabolic network ofSalmonella typhimurium. In Fig.7, with oxygen uptake rate kept constant, growth rate steadily increased with increase in glucose uptake rate up to the growth rate of around 0.62/h. Then after growth rate becomes constant. In Fig.8, with glucose uptake rate kept constant, growth rate remains 0/h for the oxygen uptake rate of around 12 mmol/(gDW·h). Then after growth rate starts increasing steadily up to the growth rate of 0.62/h and becomes constant.
Phenotypic phase plane analysis
Phenotypic phase plane analysis was performed by varying two substrates simultaneously. In this study phenotypic phase plane analysis is done by varying the glucose and oxygen uptake rate simultaneously. It assesses the effect of the simultaneous variation of the uptake rate of the glucose and oxygen on the growth rate. The effect was plotted on the 2-D graph as well as on the 3-D graph. The plot is divided into different phases with different color coding. The 3-D graph was plotted usingsurfl MATLAB function. The results for the phenotypic phase plane analysis can be visualized in the form of 3-D plot which have been shown in figures from Fig.9 to Fig.12. Figure 9 represents phenotypic phase plane study of metabolic network ofMycobacterium tuberculosis. Phase 1, on the far left of the plots, is characterized by 0 growth. In phase 2, growth is limited by oxygen. The line between phase 2 and phase 3 is where glucose and oxygen are perfectly balanced and growth yield is highest. In phases 3, 4, and 5, oxygen and glucose are both limiting growth. Figure10 represents the phenotypic phase plane study of metabolic network ofStaphylococcus aureus. Phase 1, on the far left of the plots, is characterized by negative growth. In phase 2, growth is limited by oxygen uptake rate. The line between phase 2 and phase 3 is where glucose and oxygen are perfectly balanced and growth yield is highest. In phases 3, 4, and 5, growth rate is limited by glucose uptake rate. Figure 11 represents phenotypic phase plane study of metabolic network ofHelicobacter pylori. Phase 1, on the far left of the plots, is characterized by negative growth. In phase 2, growth is limited by oxygen uptake rate. The line between phase 2 and phase 3 is where glucose and oxygen are perfectly balanced and growth yield is highest. In phases 3, 4, and 5, growth rate is limited by glucose uptake rate. Figure12 represents phenotypic phase plane study of metabolic network ofSalmonella typhimurium. Phase 1, on the far left of the plots, is characterized by 0 growth. In phase 2, growth is limited by oxygen. The line between phase 2 and phase 3 is where glucose and oxygen are perfectly balanced and growth yield is highest. In phases 3, 4, and 5, oxygen and glucose are both limiting growth.
Metabolite connectivity
For each of the four networks, the number of reactions in which each possible metabolite is occurs was determined. It is the measure of individual connectivity of each of the metabolite in the metabolic network. This metabolite connectivity was plotted on the log-log scale for each of the four metabolic networks. The approximate linear appearance of the curve relates to the power law distribution of the metabolic networks. The plots of four different metabolic networks show that there are very few metabolites, which are highly connected. Most of the metabolites occur only in a few reactions. The highly connected metabolites are the ‘global factors’ similar to the hub protein in the protein–protein interaction network. The least connected metabolites are the ‘local factors’ which occur in linear pathway. The power law distribution shows that the networks are scale-free. Metabolite connectivity in the metabolic network of each of the four different organisms has been shown in the figures from Fig.13 to Fig. 16.
Reaction participation, i.e., the number of metabolites per reaction, for each of the reactions in the network was also calculated. The most common type of reaction in the metabolic network of each of the organism was the bi-linear reaction involving two substrates and two products. Average of the reaction participation in each of the network is shown in the Table13.
Reaction essentiality and metabolite connectivity
To correlate the reaction essentiality and the metabolite connectivity in each of the metabolic networks, metabolite associated reactions were found out and finally they are deleted. After then FBA was used to predict the growth if possible. This step was performed for the metabolic networks of each of the four organisms. The results were plotted on the semi-log scale with metabolite connectivity and lethality of the network. The results for the metabolic network of four different organisms show that some less connected metabolites have a higher lethality fraction than highly connected metabolites. For theH. pylori core model, the average lethality fraction lies between 0.3 and 0.6 for the majority of the metabolites, regardless of their connectivity as shown in the Fig.19. For theM. tuberculosis core model, the average lethality fraction lies between 0.3 and 0.6 for the majority of the metabolites, regardless of their connectivity as shown in the Fig.17. For theS. aureus core model, the average lethality fraction lies between 0.2 and 0.6 for the majority of the metabolites, regardless of their connectivity which has been shown in the Fig.18. For theS. typhimurium core model, the average lethality fraction lies between 0.2 and 0.5 for the majority of the metabolites, regardless of their connectivity and is shown in the Fig.20.
Discussion
All the four metabolic network files are the reconstructed models of these organisms. Maximization of growth is very important compartmental objective to find out good FBA estimation using the kind of objective function explored. While performing the FBA analysis, growth was possible both aerobically and anaerobically for each of the four organisms. In this study, the variations in flux values were generally associated with important metabolic pathways like glycolysis, oxidative phosphorylation, and amino acid biosynthesis etc. Single gene deletion and single reaction deletion functions estimate the amount of essential genes and reactions by deleting the certain genes and reactions and analyzing the impact of the deletion. These essential genes and reactions are the major players in the regulation of important pathways of the organisms. These genes and reactions are also involved in the signal transduction pathways. These genes and reactions serve as the candidate for the potential drug target against diseases caused by these organisms. The identification of synthetic lethals in organisms can be used to understand complex genetic interactions between genes and identify drug targets for combinatorial therapy. In this comparative study, it has been found thatSalmonella typhimurium’s metabolic network contains more number of essential genes and essential reactions compared to that of other three. Similar comparison applies for the synthetic lethal genes and reactions also. Since metabolic networks of Salmonella typhimurium has more number of essential genes and reactions, visualization of gene deletion confirms the same. Most of the gene deletions were found to be in the lethal range having the value close to ‘1’ on the lethality scale. Gene deletion inMycobacterium tuberculosis’s and Staphylococcus aureus’s metabolic networks were found to be almost similar on the lethality scale. This implies that it may present more number of drug targets and hence to target the disease more effectively caused by this organism. Robustness analysis estimates up to what extent a given network is stable under the given physiological conditions. We have performed the robustness analysis by analyzing the effect of variation of the glucose and oxygen uptake rate on the growth rate of the organism. Growth rates of these two organisms were stable for most of the part of varying the glucose and oxygen uptake rate. Metabolic Metabolite connectivity enables us to understand the basic structure of these metabolic networks of four different organisms. Among the metabolic networks of all the four organisms studied, there are very few metabolites that are highly connected. Most of the metabolites are scattered. Metabolites which are highly connected serve as ‘global factor’ similar to the hub protein in the protein – protein interaction network. The correlation study between metabolite connectivity and reaction essentiality gives the major insight into the lethality fraction of the metabolic networks. The lethality fraction is computed as the ratio between metabolite connectivity and reaction essentiality. The average lethality fraction of all the four organisms studied, ranged from 0.2 to 0.6.
Higher Education Press and Springer-Verlag Berlin Heidelberg