Introduction
The presence of an abnormal number of chromosomes in cells, or aneuploidy, is a characteristic of cancer. Abnormalities in the spindle checkpoint of the cell cycle can lead to aneuploidy (
Jallepalli and Lengauer, 2001). The spindle checkpoint governs mitosis by delaying chromosome segregation until all sister chromatids are attached to the bipolar mitotic spindle (
Fava et al., 2011;
Chao et al., 2012). This checkpoint mechanism depends on several proteins. Mad2 (mitotic arrest deficient 2), one of the spindle checkpoint proteins (
Hoyt et al., 1991;
Li and Murray, 1991), accumulates specifically on kinetochores that are not stably connected with spindle microtubules (
Chen et al., 1996). Mad1, another spindle checkpoint protein, helps Mad2 bind to kinetochores. Once Mad2 is attached to the kinetochore, it adopts an active form by binding Cdc20, an activator of the anaphase-promoting complex or cyclosome (APC/C), and stops chromosome segregation (
Kim et al., 1998;
Brady and Hardwick, 2000;
Peters, 2006;
Peters, 2008). Thus Mad2 ensures that chromosome segregation is only initiated when all chromosomes have been attached to both poles of the mitotic spindle (
Bharadwaj and Yu, 2004;
Peters, 2008). Formation of the Mad1-Mad2 complex is the only way for Mad2 to attach to kinetochores.
Mad1 carries out three main functions in the spindle checkpoint. It forms a complex with Mad2 through its C-terminal domain and uses its N-terminal domain to localize to the kinetochore, the spindle-attachment structure on chromosomes (
Chen et al., 1998;
Hwang et al., 1998). Mad1 plays an essential role in the formation of the Mad2-Cdc20 complex, as it helps Mad2 undergo the conformational change needed to bind Cdc20 (
Yu, 2002). Mad1 also forms a complex with Bub1-Bub3 (budding uninhibited by benzimidazole) in mitosis (
Hardwick et al., 1996;
Brady and Hardwick, 2000).
In various types of cancer, down-regulated expression of Mad1 (
Han et al., 2000), Mad2 (
Wang et al., 2002), BUB1 (
Gemma et al., 2000;
Mimori et al., 2001), BUB2 and BUB3 (
Hoyt et al., 1991;
Lopes and Sunkel, 2003) and BUBR1 (
Ohshima et al., 2000;
Sato et al., 2000;
Reis et al., 2001) has been reported. Altered expression of these proteins plays an important role in the progression of cancer. In addition, another spindle checkpoint component, the 3F3/2 kinetochore epitope, has a significant role in detecting stable kinetochore-microtubule attachment, which in turn shows its importance in the spindle checkpoint signaling pathway (
Michael et al., 1995). Further studies of these spindle checkpoint components can shed light on aneuploidy and thus on cancer (
Cahill et al., 1998).
Hence, identifying cancer-associated missense mutations has been a challenging task in cancer research. In this study we investigated the detrimental missense mutations of the Mad1 protein. Mutations in this protein can lead to malfunctioning of the spindle checkpoint (
Chung and Chen, 2002) and to different types of cancer (
Nomoto et al., 1999,
Tsukasaki et al., 2001). These missense mutations fall into the category of non-synonymous SNPs (nsSNPs), which change an amino acid residue. Missense mutations of the Mad1 protein cause numerous types of cancer (
Tsukasaki et al., 2001), and there are 13 reported missense mutations in Mad1. A computational protocol was used to identify and analyze them, and model structures were proposed for the mutants. The binding partner Mad2 was docked with both native and mutant Mad1 to determine the binding effect and the flexibility of the binding pockets, which explained the decreased binding efficiency caused by these missense mutations.
Materials and methods
Data sets
The protein sequence and variants (single amino acid polymorphisms/missense mutations/point mutations) of Mad1 were obtained from the Swiss-Prot database available at http://www.expasy.ch/sprot/. The subsection of each Swiss-Prot entry provides information on polymorphic variants. Some of the polymorphic variants may be disease-associated because they cause defects in a given protein. Most of these polymorphic variants were nsSNPs (non-synonymous SNPs) in the gene sequence and SAPs (single amino acid polymorphisms) in the protein sequence (
Ramensky et al., 2002,
Yip et al., 2004). The 3D Cartesian coordinates of Mad1 were obtained from the Protein Data Bank with PDB ID 1GO4 (
Berman et al., 2002) for
in silico mutation modeling and docking studies based on detrimental point mutants.
Predicting stability changes caused by SAPs using a support vector machine (I-Mutant2.0)
We used the program I-Mutant2.0 (http://gpcr.biocomp.unibo.it/cgi/predictors/I-Mutant2.0/I-Mutant2.0.cgi) for this study. I-Mutant2.0 is a support vector machine (SVM) based tool for the automatic prediction of protein stability changes caused by single point mutations. I-Mutant2.0 predictions can be performed starting either from the protein structure or, more importantly, from the protein sequence (
Capriotti et al., 2005). This program was trained and tested on a data set derived from ProTherm (
Bava et al., 2004), which is the most comprehensive available database of thermodynamic experimental data on free energy changes of protein stability caused by mutations under different conditions. The output files showed the predicted free energy change value or sign (∆∆G), which was calculated as the unfolding Gibbs free energy value of the mutated protein minus the unfolding Gibbs free energy value of the native protein (kJ/mol). Positive ∆∆G values meant that the mutated protein had higher stability and negative values indicated lower stability.
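The sign convention described above can be illustrated with a minimal Python sketch. The function names and the numerical inputs are hypothetical and serve only to show the arithmetic; they are not I-Mutant2.0 output or part of its interface.

```python
def ddg(dg_unfold_mutant: float, dg_unfold_native: float) -> float:
    """Return ddG = dG_unfold(mutant) - dG_unfold(native)."""
    return dg_unfold_mutant - dg_unfold_native


def stability_label(ddg_value: float) -> str:
    """Interpret the sign of ddG using the convention stated in the text."""
    if ddg_value > 0:
        return "more stable"
    if ddg_value < 0:
        return "less stable"
    return "unchanged"


# Hypothetical unfolding free energies, chosen only for illustration.
example = ddg(dg_unfold_mutant=18.5, dg_unfold_native=20.0)
print(example, stability_label(example))
```

Under this convention a mutant that unfolds more easily than the native protein yields a negative value and is labeled less stable.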
Analysis of functional consequences of point mutations by a sequence homology-based method (SIFT)
The program SIFT, available at http://blocks.fhcrc.org/sift/SIFT.html (
Ng and Henikoff, 2003), is used specifically to detect deleterious single amino acid polymorphisms. SIFT is a sequence homology-based tool, which presumes that important amino acids will be conserved in a protein family; therefore, changes at well-conserved positions tend to be predicted as deleterious (
Ng and Henikoff, 2001). Queries were submitted in the form of protein sequences. SIFT takes a query sequence and uses multiple alignment information to predict tolerated and deleterious substitutions for every position of the query sequence. SIFT is a multistep procedure that, for a given protein sequence, (i) searches for similar sequences, (ii) chooses closely related sequences that may share similar function, (iii) obtains the multiple alignment of these chosen sequences, and (iv) calculates normalized probabilities for all possible substitutions at each position from the alignment. Substitutions at each position with normalized probabilities less than a chosen cutoff were predicted to be deleterious and those greater than or equal to the cutoff were predicted to be tolerated (
Ng and Henikoff, 2003). The cutoff value for the tolerance index in the SIFT program was 0.05. The higher the tolerance index, the less functional impact a particular amino acid substitution would be likely to have.
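The cutoff rule can be sketched in a few lines of Python. This is only an illustration of the thresholding step, not of SIFT itself: the function name is hypothetical, and the tolerance indices are the six values reported for Mad1 in the Results of this study.

```python
SIFT_CUTOFF = 0.05  # tolerance-index cutoff stated in the text

# Tolerance indices reported in the Results for six Mad1 variants.
tolerance_index = {
    "S29L": 0.00,
    "R59C": 0.01,
    "T500M": 0.02,
    "E511K": 0.00,
    "R556C": 0.02,
    "R556H": 0.03,
}


def sift_call(score: float, cutoff: float = SIFT_CUTOFF) -> str:
    """Below the cutoff: deleterious; at or above the cutoff: tolerated."""
    return "deleterious" if score < cutoff else "tolerated"


calls = {variant: sift_call(score) for variant, score in tolerance_index.items()}
for variant, call in calls.items():
    print(variant, tolerance_index[variant], call)
```

All six listed values lie below the cutoff, so each is called deleterious under this rule.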
Simulation for functional change in a point mutant by structure homology-based method (PolyPhen)
Analyzing the damage caused by point mutations at the structural level is considered very important to understand the functional activity of the protein. The server PolyPhen (
Ramensky et al., 2002), available at http://coot.embl.de/PolyPhen/, was used for this purpose. Input options for the PolyPhen server were the protein sequence, SWALL database ID or accession number, together with the sequence position and the two amino acid variants. The query was submitted in the form of a protein sequence with a mutational position and two amino acid variants. Sequence-based characterization of the substitution site, profile analysis of homologous sequences, and mapping of the substitution site to known protein 3D structures were the parameters taken into account by the PolyPhen server to calculate the score. It calculated position-specific independent counts (PSIC) scores for each of the two variants and then computed the PSIC score difference between them. The higher the PSIC score difference, the higher the functional impact a particular amino acid substitution would be likely to have.
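As a rough illustration of the last step, the PSIC score difference can be treated as the absolute difference between the profile scores of the two variants. The sketch below assumes that definition; the function names and example scores are hypothetical, and the 0.001 threshold is the one applied in the Results of this study rather than a PolyPhen default.

```python
def psic_difference(psic_native: float, psic_variant: float) -> float:
    """Absolute difference between the PSIC scores of the two amino acid variants."""
    return abs(psic_native - psic_variant)


def is_damaging(difference: float, threshold: float = 0.001) -> bool:
    """Flag a substitution as damaging at or above the chosen threshold."""
    return difference >= threshold


# Hypothetical PSIC scores for a native residue and its substitution.
diff = psic_difference(psic_native=1.25, psic_variant=0.50)
print(diff, is_damaging(diff))
```

The real server combines this score with structural and annotation features, which are not modeled here.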
Modeling point mutation on protein structures to compute the RMSD
Structure analysis was performed to evaluate the structural deviation between native proteins and mutant proteins by means of root mean square deviation (RMSD). The web resource Protein Data Bank (
Berman et al., 2002) was used to identify the 3D structure of Mad1 (PDB ID: 1GO4) and to confirm the mutation position and the mutated residue in PDB ID 1GO4. To calculate the RMSD and total energy for native and mutant Mad1 with Mad2, we used SWISSPDB viewer to perform the mutations, and the NOMAD-Ref server to perform energy minimization of the 3D structures (
Lindahl et al., 2006). This server used Gromacs as the default force field for energy minimization, based on the steepest descent, conjugate gradient and limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) methods (
Delarue and Dumas, 2004). The conjugate gradient method was used here to minimize the energy of the 3D structure of Mad1. Divergence of the mutant structure from the native structure could be caused by substitutions, deletions and insertions (
Han et al., 2006) and the deviation between the two structures could alter the functional activity (
Varfolomeev et al., 2002) with respect to binding efficiency of the binding partner, which was evaluated by their RMSD values.
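For reference, the RMSD between two structures reduces to a short calculation once the atoms are paired. The Python sketch below is a generic illustration, not the routine used by SWISSPDB viewer: it assumes the two coordinate lists are already superimposed and list equivalent atoms in the same order, and the coordinates shown are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z) tuples.

    Assumes the structures are already superimposed and the atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates (in angstroms) for three paired atoms.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
mutant = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
print(rmsd(native, mutant))
```

An RMSD of zero means the paired atoms coincide; larger values indicate greater deviation between the two structures.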
Identification of binding residues in Mad1, Mad2 interaction
To understand the functional activity of the Mad1 protein with its binding partner Mad2, we selected PDB ID 1GO4, in which chains E, F, G and H belong to Mad1 and chains A, B, C and D belong to Mad2. We therefore performed the mutations again in chain G of 1GO4 using SWISSPDB viewer, and energy minimization was done with NOMAD-Ref. Finally, the program PatchDock was used to dock native and mutant Mad1 with Mad2 and to compute the atomic contact energy. The underlying principle of this server was based on molecular shape representation, surface patch matching plus filtering and scoring (
Duhovny et al., 2002). It found docking transformations that yield good molecular shape complementarity. Such transformations, when applied, induce both wide interface areas and small amounts of steric clashes. A wide interface ensured that several matched local features of the docked molecules that have complementary characteristics were included. The PatchDock algorithm divided the Connolly dot surface representation (
Connolly, 1983;
Zhang et al., 1997) of the molecules into concave, convex and flat patches. Then, complementary patches were matched to generate candidate transformations. Each candidate transformation was further evaluated by a scoring function that considered both geometric fit and atomic desolvation energy (
Zhang et al., 1997;
Schneidman-Duhovny et al., 2005).
To identify the binding residues between Mad1 and Mad2, we submitted the structure with PDB ID 1GO4 to the Protein Interactions Calculator (PIC) program (
Tina et al., 2007). PIC is a server that, given the coordinate set of the 3D structure of a protein or an assembly, computes various interactions such as disulphide bonds, interactions between hydrophobic residues, ionic interactions, hydrogen bonds, aromatic-aromatic interactions, aromatic-sulfur interactions and cation-π interactions within a protein or between proteins in a complex (
Tina et al., 2007).
Exploring the flexibility of binding pocket by normal mode analysis
A quantitative measure of the atomic motions in proteins can be obtained from the mean square fluctuations of the atoms relative to their average positions. Protein flexibility is important for protein function (
Carlson and McCammon, 2000). In addition, the flexibility of certain amino acids in a protein is useful for various types of interactions. Moreover, the flexibility of amino acids in the binding pocket is considered a significant parameter for understanding the binding efficiency. In fact, loss of flexibility impairs the binding effect (
Hinkle and Tobacman, 2003) and vice versa (
Rajasekaran et al., 2008). Hence the flexibility of amino acids in the binding site was computed from the mean-square displacement <R²> of the lowest-frequency normal mode using the ElNémo server (
Suhre and Sanejouand, 2004).
Results
Data set from Swiss-prot
The Mad1 protein and a total of 13 variants, namely S29L, R59C, N160S, T299A, R360Q, T500M, E511K, E516K, R556C, R556H, R558H, E569K, and R572H were taken from Swiss-prot (
Boeckmann et al., 2003;
Yip et al., 2004;
Yip et al., 2008).
Identification of functional missense mutants of Mad1 by I-Mutant2.0
Of the 13 variants, 10 were predicted to be less stable by the I-Mutant2.0 server, as shown in Table 1. Of these 10 variants, 2 (R556H and R558H) showed ∆∆G values of -1.29 and -1.20, respectively. All the other variants, viz., S29L, T299A, R360Q, E511K, E516K, R556C, E569K and R572H, showed a ∆∆G value of < -1.0, as illustrated in Table 1. Of the 10 less stable variants, 2 (S29L and T299A) changed from a polar to a non-polar amino acid; 3 (E511K, E516K and E569K) changed from acidic to basic; and 2 (R360Q and R556C) changed from basic to polar. The remaining 3 variants, R556H, R558H and R572H, retained their basic properties.
Predicting the deleterious missense mutants of Mad1 by SIFT program
The degree of conservation of a particular position in a protein was determined using the sequence homology-based tool SIFT. The protein sequences of the 13 variants were submitted to SIFT to determine their tolerance indices. As the tolerance index increases, the functional influence of the amino acid substitution decreases, and vice versa. Of the 13 variants, 6 (S29L, R59C, T500M, E511K, R556C and R556H) were found to be deleterious, with a tolerance index of ≤0.05 (Table 1). Of these 6 possibly deleterious variants, S29L and E511K showed a highly deleterious tolerance index of 0.00, R59C showed a tolerance index of 0.01, T500M and R556C showed 0.02, and R556H showed 0.03. Four of the deleterious variants predicted by SIFT (S29L, E511K, R556C and R556H) were also predicted to be less stable by the I-Mutant2.0 server.
Predicting the damaged missense mutants of Mad1 by PolyPhen
Structural-level alterations were determined with the PolyPhen program. The protein sequence, together with the mutational position and amino acid variants for each of the 13 single point mutants, was submitted to the PolyPhen server. A PSIC score difference of 0.001 or above was considered damaging. Of the 13 variants, 12 were predicted to be damaging by PolyPhen, with PSIC score differences between 0.001 and 1.00. The damaging mutants predicted by PolyPhen and the less stable mutants predicted by I-Mutant2.0 coincided for 9 variants (S29L, R360Q, E511K, E516K, R556C, R556H, R558H, E569K and R572H), whereas the deleterious mutations predicted by SIFT coincided with the damaging mutants predicted by PolyPhen for 6 variants (S29L, R59C, T500M, E511K, R556C and R556H).
After analyzing the results from the 3 programs, we found that 4 mutations were commonly predicted to be less stable, deleterious and damaging by the I-Mutant2.0, SIFT and PolyPhen servers, respectively. Notably, the mutations S29L, E511K, R556C and R556H were experimentally identified as potential candidates for cancer (
Nomoto et al., 1999,
Tsukasaki et al., 2001). Hence the potentially detrimental mutations that could be mapped onto the 3D structure, namely E511K, R556C and R556H, were taken forward for further investigation.
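The consensus step amounts to intersecting the three prediction sets, which can be sketched in Python as follows. The I-Mutant2.0 and SIFT sets are taken directly from the text. The PolyPhen set is inferred rather than listed: the text reports 12 of 13 variants as damaging, and T299A is the only less stable variant absent from the stated PolyPhen overlap, so it is assumed here to be the single non-damaging variant.

```python
all_variants = {
    "S29L", "R59C", "N160S", "T299A", "R360Q", "T500M", "E511K",
    "E516K", "R556C", "R556H", "R558H", "E569K", "R572H",
}

# Less stable according to I-Mutant2.0 (listed in the text).
less_stable = {
    "S29L", "T299A", "R360Q", "E511K", "E516K",
    "R556C", "R556H", "R558H", "E569K", "R572H",
}

# Deleterious according to SIFT (listed in the text).
deleterious = {"S29L", "R59C", "T500M", "E511K", "R556C", "R556H"}

# Damaging according to PolyPhen: inferred as every variant except T299A (assumption).
damaging = all_variants - {"T299A"}

consensus = sorted(less_stable & deleterious & damaging)
print(consensus)

# Variants the text reports as mapped onto the 3D structure 1GO4.
mapped_on_structure = {"E511K", "R556C", "R556H"}
selected = sorted(set(consensus) & mapped_on_structure)
print(selected)
```

The intersection reproduces the four consensus mutations, and restricting it to the variants mapped onto the structure leaves the three carried into the structural analysis.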
Modeling the mutant structures and computing RMSD values
The available structure of Mad1 had PDB ID 1GO4. The mutational positions and amino acid variants were mapped onto the 1GO4 native structure. Mutations at each specified position were performed in silico, independently, with SWISSPDB viewer to obtain modeled structures. The NOMAD-Ref server performed the energy minimizations for both the native structure and the 3 mutant modeled structures. To determine the deviation between the native structure and the mutants, the native structure was superimposed with each of the 3 mutant modeled structures and the RMSD was calculated. The higher the RMSD value, the greater the deviation between the native and mutant structures, which in turn changes the binding efficiency with Mad2 because of the deviation in 3D space of the binding residues of Mad1. Table 2 shows that the two mutants R556H and R556C had higher RMSD values (>1.90 Å), whereas the mutant E511K exhibited an RMSD of <1.00 Å. The superimposed structures of the native protein with the 3 mutants are shown in Fig. 1. The total energy of the native structure was found to be -8352.652 kJ/mol, whereas the total energies of the mutants E511K, R556H and R556C were found to be -7958.655, -7630.460 and -7609.384 kJ/mol, respectively. This analysis showed that the three mutants had slightly higher total energy than native Mad1, which in turn may alter the stability of the mutants relative to the native protein.
Rationale of binding efficiency for native and mutant structures of Mad1 with Mad2
To determine the binding efficiency of Mad1 with Mad2, the PDB ID 1GO4 structure was selected, and the PIC program was used to calculate contacts between the binding residues of Mad1 and Mad2. In this analysis we found that 19 amino acids, i.e., Glu(527), Ala(530), Leu(531), Gln(532), Tyr(535), Arg(539), Lys(541), Val(542), Leu(543), His(544), Met(545), Ser(546), Asn(548), Pro(549), Thr(550), Ala(553), Arg(556), Leu(557) and His(561), act as binding residues in Mad1 with Mad2 (Table 3).
Docking between Mad2 and the native and mutant modeled structures of Mad1 was performed using PatchDock to determine the atomic contact energy (ACE). The ACE between Mad2 and native Mad1 was found to be -398.19 kcal/mol, whereas the ACE values between Mad2 and the mutants E511K, R556C and R556H were found to be 103.92, -82.16 and 83.63 kcal/mol, respectively. This analysis showed that the three mutants had lower binding affinity for Mad2 than native Mad1 (Table 2). The docked complexes of Mad2 with native and mutant Mad1 are depicted in Fig. 2. The number of intermolecular interactions was also lower for the mutants than for native Mad1. As the mutants were commonly found to be less stable, deleterious and damaging by the I-Mutant2.0, SIFT and PolyPhen servers, respectively, they were also confirmed as detrimental by structural analysis (total energy calculation, RMSD and binding efficiency). Moreover, they were also confirmed as detrimental by experimental and clinical observations performed elsewhere (
Nomoto et al., 1999,
Tsukasaki et al., 2001). Hence we further investigated the three detrimental point mutations by normal mode analysis to understand the flexibility of the active site region for the native and mutant structures.
The majority of amino acids in active site showed loss of flexibility
To understand the variation in substrate binding efficiency of the 3 detrimental missense mutations, the program ElNémo was used to compare the flexibility of the amino acids involved in binding Mad2 in both the native protein and the mutants. Table 3 shows the flexibility of the amino acids in the substrate binding pocket (active site) of both the native and mutant Mad1 proteins by means of the normalized mean square displacement <R²>. These data were further sorted into three categories of flexibility, as shown in Table 4. The first category was where the <R²> of the amino acids in the substrate binding pocket of the mutant was the same as that of the native protein (termed identical flexibility). The second category was where the <R²> of the amino acids in the substrate binding pocket of the mutant was higher than that of the native protein (termed increased flexibility). The last category was where the <R²> of the amino acids in the substrate binding pocket of a mutant was lower than that of the native protein (termed decreased flexibility). The majority of amino acids in the binding region of these 3 mutants fell into the 'decreased flexibility' category, which signified a loss of binding efficiency, as can be seen in Table 4.
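The three-way sorting described above can be sketched in Python. The residue labels come from the binding residues listed earlier, but the <R²> values below are hypothetical placeholders rather than the Table 3 data, and the equality tolerance is an assumption introduced for comparing floating-point numbers.

```python
from collections import Counter


def flexibility_category(r2_native: float, r2_mutant: float, tol: float = 1e-4) -> str:
    """Compare the normalized mean square displacement of one residue in mutant vs native."""
    if abs(r2_mutant - r2_native) <= tol:
        return "identical"
    return "increased" if r2_mutant > r2_native else "decreased"


# Hypothetical (native, mutant) <R^2> values for a few binding residues.
r2_values = {
    "Glu527": (0.0102, 0.0085),
    "Tyr535": (0.0067, 0.0067),
    "Arg556": (0.0041, 0.0029),
    "His561": (0.0038, 0.0052),
}

categories = {
    residue: flexibility_category(native, mutant)
    for residue, (native, mutant) in r2_values.items()
}
summary = Counter(categories.values())
print(categories)
print(summary)
```

Counting the categories per mutant in this way gives the kind of summary reported in Table 4.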
Discussion
Of the 13 variants retrieved from Swiss-Prot, 10 were found to be less stable by I-Mutant2.0, 6 were found to be deleterious by SIFT and 12 were considered damaging by PolyPhen. Three variants were selected as potentially detrimental point mutations because they were commonly found to be less stable, deleterious and damaging by the I-Mutant2.0, SIFT and PolyPhen servers, respectively. The structures of these 3 variants were modeled, and the RMSD between the mutant and native structures ranged from 0.52 Å to 1.99 Å. Docking analysis between Mad2 and the native and mutant modeled structures of Mad1 gave ACE values of -398.19 kcal/mol for the native protein and 103.92, 83.63 and -82.16 kcal/mol for the mutants E511K, R556H and R556C, respectively. We concluded that the lower binding affinity and the RMSD scores of these three mutants (E511K, R556H and R556C) identify them as deleterious mutations. The normalized mean square displacement <R²> from normal mode analysis allowed us to conclude that the majority of the amino acids in the mutants that bind to Mad2 (i.e., are in the active site) had decreased flexibility, which could be the cause of their decreased substrate binding affinity. Thus the results indicated that our approach (i) offers a suitable computational protocol for missense mutation (point mutation/single amino acid polymorphism) analysis before wet laboratory experimentation and (ii) provides a path for further clinical and experimental studies to characterize Mad1 mutants in depth.
Higher Education Press and Springer-Verlag Berlin Heidelberg