Introduction
Proton-pump inhibitors (PPIs) are benzimidazoles or their analogs that directly target the H+/K+-ATPase proton pump responsible for gastric acid secretion. PPIs are widely used in the treatment of peptic ulcer disease, Helicobacter pylori infection, gastroesophageal reflux, nonsteroidal anti-inflammatory drug-induced gastrointestinal lesions, and Zollinger–Ellison syndrome [1]. Since the introduction of the first PPI, omeprazole, in 1988 [2], several other PPIs, such as lansoprazole, pantoprazole, esomeprazole, rabeprazole, tenatoprazole, and leminoprazole, have been successively developed. PPIs are regarded as relatively safe drugs for gastric acid-related diseases because they potently inhibit gastric acid secretion and cause few adverse reactions. However, reports on the adverse effects of long-term PPI use have increased recently, which has prompted a new round of research on the underlying mechanisms. Recent studies have postulated two possible mechanisms, based primarily on clinical trials or on experimental biology focused on a particular target or pathway. The first suggests that long-term PPI use raises gastric pH and the risk of H. pylori infection, which aggravates adverse effects such as inflammation [3]. The second suggests that PPIs are activated in the lysosomal environment (pH 5) and affect the activities of lysosomal enzymes, which weakens the immune system and induces adverse effects such as cancer or infectious diseases [4]. These two studies focused on only a few diseases, such as cancer and infections. Many reported adverse effects, such as abdominal pain, diarrhea, constipation, emotional disturbance, distractibility, anemia, and angioedema, remain unexplained by these mechanisms [1, 5–7]. To analyze the side effects of PPIs, we investigated the gene expression profiles of rat liver tissue under PPI perturbation and established the relationships among PPIs, their adverse effects, and the involved genes [8]. We determined that PPIs accumulate in the acidic organelles of cells, become activated, and inhibit V-ATPase and hydrolytic enzymes in those organelles, which blocks antigen presentation and the synthesis and excretion of cytokines, complements, and blood coagulation factors. On this basis, we present an in silico design of novel PPI molecules with reduced adverse effects.
Materials and methods
Extraction of R groups
R groups (fragments) were extracted from the small molecules in various databases with the BRICS (breaking of retrosynthetically interesting chemical substructures) method, as implemented in the rdkit.Chem.BRICS module of RDKit [9], which fragments each molecule by breaking its synthetically accessible bonds. The generated fragments were merged, redundant fragments were removed, and the frequency of each R group was calculated.
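The merging and frequency step can be sketched as follows. This is a minimal illustration and not the authors' script: it assumes that fragmentation has already been done (for example with rdkit.Chem.BRICS), that duplicates within one molecule are collapsed, and that the frequency of an R group is the number of molecules containing it. The function name count_r_groups and the fragment strings are hypothetical.

```python
from collections import Counter


def count_r_groups(fragments_per_molecule):
    """Merge per-molecule fragment lists and count, for each R group,
    the number of molecules in which it occurs (assumed definition)."""
    counts = Counter()
    for fragments in fragments_per_molecule:
        # set() collapses repeated fragments inside a single molecule,
        # so each molecule contributes at most one count per R group.
        counts.update(set(fragments))
    return counts


# Hypothetical fragment strings written in the style of BRICS output;
# in practice they would come from a fragmentation tool such as RDKit.
example_fragments = [
    ["[16*]c1ccccc1", "[3*]OC", "[3*]OC"],
    ["[3*]OC", "[4*]CC"],
    ["[16*]c1ccccc1"],
]

r_group_counts = count_r_groups(example_fragments)

# Most frequent R groups first, e.g. to pick the top-ranked ones.
ranked_r_groups = r_group_counts.most_common()
```

Ranking by frequency makes it straightforward to select, for instance, the most common R groups of each database for enumeration.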
Molecule enumeration
Enumeration was conducted to derive all possible combinations of the benzimidazole molecular scaffolds and the abovementioned R groups. Each attachment point of a molecular scaffold was bonded to one R group, and every such assignment yielded a new candidate molecule.
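A minimal sketch of such an enumeration is given below, assuming that every attachment point may carry any R group independently of the others. The scaffold names, the numbers of attachment points, and the R-group labels are hypothetical placeholders, not the scaffolds or fragments used in this study.

```python
from itertools import product


def enumerate_candidates(scaffolds, r_groups):
    """Yield (scaffold_name, r_group_assignment) for every way of placing
    one R group on each attachment point of each scaffold."""
    for name, n_sites in scaffolds.items():
        # product(..., repeat=n_sites) generates all ordered assignments,
        # i.e. len(r_groups) ** n_sites combinations per scaffold.
        for assignment in product(r_groups, repeat=n_sites):
            yield name, assignment


# Hypothetical input: scaffold name -> number of attachment points.
scaffolds = {"scaffold_1": 2, "scaffold_2": 3}
r_groups = ["H", "CH3", "OCH3"]

candidates = list(enumerate_candidates(scaffolds, r_groups))
```

Because the number of candidates grows as the number of R groups raised to the number of attachment points, limiting enumeration to the most frequent R groups keeps the virtual data set tractable.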
Calculation of pKa value
The pKa values were calculated with pKaPlugin in the ChemAxon Marvin toolkit [10]. Candidates were retained if the pKa of the nitrogen atom on the pyridine ring (pKa1) satisfied 1 < pKa1 < 4.
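The screening criterion 1 < pKa1 < 4 can be expressed as a simple filter. The sketch below assumes that pKa1 has already been predicted for each molecule by an external tool; the molecule identifiers and pKa values are invented for illustration only.

```python
def passes_pka_window(pka1, lower=1.0, upper=4.0):
    """Return True if pKa1 lies strictly inside the open interval (lower, upper)."""
    return lower < pka1 < upper


def screen_by_pka(predicted_pka1, lower=1.0, upper=4.0):
    """Keep only the molecules whose predicted pKa1 falls inside the window."""
    return {
        mol_id: pka1
        for mol_id, pka1 in predicted_pka1.items()
        if passes_pka_window(pka1, lower, upper)
    }


# Hypothetical predictions: molecule identifier -> predicted pKa1.
predicted = {"mol_a": 0.8, "mol_b": 2.5, "mol_c": 4.0, "mol_d": 3.9}
retained = screen_by_pka(predicted)
```

Both bounds are treated as strict inequalities, so a molecule with pKa1 exactly equal to 1 or 4 is rejected.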
Drug-likeness screening
Drug-likeness screening was based on the “Rule of Five” as follows: (1) no more than 5 hydrogen bond donors; (2) no more than 10 hydrogen bond acceptors; (3) a molecular mass of no more than 500 Da; (4) a lipid-water partition coefficient (log P) of no more than 5 [11]. RDKit was used for the calculation. The candidate molecules that complied with these rules were retained.
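A minimal sketch of a Rule of Five filter is shown below, using the thresholds as they are commonly stated (at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors, molecular mass at most 500 Da, log P at most 5). It operates on precomputed descriptors; with RDKit these could be obtained from functions such as Descriptors.MolWt, Crippen.MolLogP, Lipinski.NumHDonors, and Lipinski.NumHAcceptors. The Descriptors class, the strict "no violation allowed" policy, and the numeric values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    h_bond_donors: int
    h_bond_acceptors: int
    mol_weight: float  # in Da
    log_p: float       # lipid-water partition coefficient


def passes_rule_of_five(d):
    """Return True only if all four criteria are met (no violation tolerated)."""
    return (
        d.h_bond_donors <= 5
        and d.h_bond_acceptors <= 10
        and d.mol_weight <= 500.0
        and d.log_p <= 5.0
    )


# Invented descriptor values, for illustration only.
compliant = Descriptors(h_bond_donors=1, h_bond_acceptors=6, mol_weight=345.4, log_p=2.2)
too_lipophilic = Descriptors(h_bond_donors=1, h_bond_acceptors=6, mol_weight=345.4, log_p=5.5)
```

Some implementations tolerate a single violation; the strict variant is used here because the text retains only molecules that comply with all rules.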
All the scripts of core methods in this study are available at Supplementary Codes of Methods.
Results and discussion
We first generated a virtual data set of benzimidazole molecules based on knowledge of PPI pharmacology and toxicology. These molecules were screened by the pKa value of the nitrogen atom on the pyridine ring. Subsequently, a drug-likeness evaluation was conducted, and the molecules that passed it were considered candidates for new PPIs.
Generation of virtual data set of benzimidazole molecules
The design of the benzimidazole molecular scaffolds was informed by PPI pharmacology. PPIs are prodrugs that require activation in an acidic environment. Activated PPIs bind to the alpha region of the H+/K+-ATPases on the secretory canaliculi and form disulfide bonds with cysteine residues, which inhibits these acid secretion pumps (Fig. 1) [12]. After activation in an acidic environment, the stable Ring I and Ring II form successively, and both are critical for drug efficacy. Because Ring II always contains one more atom than Ring I, these rings are preferably 5-, 6-, or 7-membered heteroatomic rings for stability.
In addition, with regard to PPI toxicology, the pH of the lumen at the gastric wall is approximately 1, the pH in acidic organelles such as lysosomes is approximately 5, and pKa1 (Fig. 2; the pKa of the nitrogen atom on the pyridine ring) of current PPIs is close to 4. Several factors, however, cause PPIs to accumulate in weakly acidic environments before they reach their site of action, which can lead to a series of adverse effects: (1) PPIs must pass through the blood circulation before reaching gastric parietal cells and their target proton pump; (2) the activation of the nitrogen atom on the pyridine ring is a typical reversible reaction; (3) PPIs form irreversible covalent bonds with their target proton pump, and the same irreversible binding allows them to reach an “irreversible” chemical equilibrium in other weakly acidic environments before they reach their target cells.
On this basis, we set a clear design target for new PPIs: modify the structure of current PPIs to decrease pKa1, that is, to make the nitrogen atom on the pyridine ring less basic so that it is protonated appreciably only in strongly acidic environments. New PPI molecules should have the following characteristics:
(1) Number of atoms in Ring II = number of atoms in Ring I + 1
(2) Ring I and Ring II are preferably 5-, 6-, or 7-membered heteroatomic rings for stability
(3) 1 < pKa1 < 4
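The rationale for the pKa1 window can be illustrated with the Henderson–Hasselbalch relation, under the simplifying assumption that protonation of the pyridine nitrogen behaves as a single ideal acid-base equilibrium. The pH values of 1 and 5 and the pKa1 of about 4 for current PPIs are taken from the text; the value pKa1 = 2 is an arbitrary example inside the target window, and the function name is hypothetical.

```python
def protonated_fraction(ph, pka):
    """Fraction of the basic site that is protonated at a given pH
    (Henderson-Hasselbalch, single ideal equilibrium assumed)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


GASTRIC_PH = 1.0    # approximate pH at the gastric wall (from the text)
LYSOSOME_PH = 5.0   # approximate pH in acidic organelles (from the text)

for pka1 in (4.0, 2.0):  # current PPIs vs. an example value in the target window
    in_stomach = protonated_fraction(GASTRIC_PH, pka1)
    in_lysosome = protonated_fraction(LYSOSOME_PH, pka1)
    print(f"pKa1={pka1}: stomach {in_stomach:.4f}, lysosome {in_lysosome:.4f}")
```

Under this assumption, lowering pKa1 from 4 to 2 reduces the protonated fraction at pH 5 by roughly two orders of magnitude, while the molecule remains predominantly protonated at pH 1. This is consistent with requiring pKa1 to stay above the gastric pH but well below the pH of acidic organelles.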
Under the same conditions (pH, temperature, etc.), the pKa value of a given atom is affected by the substituents on or near that atom. Because no substituent can be added to the nitrogen atom on the pyridine ring of a PPI, the modification should focus on the neighboring sites. Considering the structures of several existing benzimidazoles, such as tenatoprazole, we designed six molecular scaffolds, as illustrated in Fig. 3.
Extraction of R groups
To generate candidate molecules as comprehensively as possible, we collected and organized a large amount of compound structure data from multiple databases for the extraction of R groups: DrugBank [13], TCM Database@Taiwan [14], TTD [15], GDB [16, 17], HMDB [18], KEGG COMPOUND [19], BitterDB [20], CancerResource [21], T3DB [22, 23], and FooDB [24]. The molecular structures from these 10 databases were obtained through direct downloading or manual collection and were split into R groups. After redundancy removal, we calculated the frequency of the remaining R groups. Next, the attachment point of each R group was marked and bonded to the six molecular scaffolds. Representative results are illustrated in Fig. 4, and the complete results can be found at the corresponding authors’ website. Please contact us if you are interested.
Molecule enumeration
Full enumeration was performed on all possible combinations of the six molecular scaffolds with the sets of R groups derived from the 10 databases. The resulting ten virtual data sets of benzimidazole molecules can be found at the corresponding authors’ website. Please contact us if you are interested.
pKa prediction and molecule screening
On the basis of PPI pharmacology and toxicology, the pKa value of the nitrogen atom on the pyridine ring is the key factor that determines whether PPIs are activated in acidic organelles and induce adverse effects. Replacing the R groups on the PPI molecular scaffolds can therefore lower the pKa value of this nitrogen atom and thereby decrease the adverse effects caused by long-term PPI use. Accordingly, pKa prediction was applied to all virtual molecules.
Currently available pKa prediction tools include ACD/pKa DB [10], Epik [25], Jaguar [26], Pallas pKalc [27], Pipeline Pilot [28], and Marvin [10]. These tools are empirical methods based on linear free-energy relationships [25], quantitative structure-property relationships (QSPR) [29, 30], or database lookup. Marvin is a QSPR-based tool that can calculate the pKa value of a specific atom in a molecule and is easy to extend and customize through its API. In this study, Marvin was adopted for predicting the pKa1 values and for screening all virtual benzimidazole molecules.
After prediction, all molecules were screened by 1<pKa1<4. The remaining molecules were collected for subsequent study.
Fig. 5 illustrates the pKa1 distributions of the virtual molecules. Detailed information on the virtual molecules derived from the first three R groups of each database can be found at the corresponding authors’ website. Please contact us if you are interested. The results vary considerably between data sources, which indicates the need to collect molecules from as comprehensive and complete a set of databases as possible.
Drug-likeness screening
Since Lipinski et al. [11] established the first and best-known drug-likeness rule, the “Rule of Five,” in 1997, a series of similar “rules of thumb” has been proposed. Typical examples include rapid elimination of swill (REOS) by Walters et al. [31] in 1998 and the quantitative estimate of drug-likeness (QED) index by Hopkins et al. [32] in 2012. To improve prediction accuracy, many machine learning approaches have also been employed in recent years to fit the complex relationships between molecular properties and drug-likeness [33].
We adopted Lipinski’s “Rule of Five” for drug-likeness screening for the following reasons: (1) the rule is generally recognized and widely used; (2) its reliance on relatively few molecular properties can be compensated for by a later theoretical ADMET evaluation, which incorporates many properties; (3) the results can be reproduced by other researchers because the rule is implemented in many commercial and open-source software packages. The new candidate PPI molecule data set, which is intended to decrease the adverse effects induced by long-term therapy, is listed in Supplementary Table S1. Scripts for the “Rule of Three” (lead-like molecules) and REOS are also provided in the Supplementary Codes of Methods as extensions to the “Rule of Five” screening.
The pKa value distributions of virtual molecules with different scaffolds (Fig. 6A) show that the scaffold affects the pKa value to some extent: molecules generated from scaffolds 3 and 6 have relatively low pKa values, whereas those from scaffolds 1, 4, and 5 have relatively high values. Scaffolds 2 and 3 generated many molecules that meet both the 1 < pKa1 < 4 requirement and the “Rule of Five.” Fig. 6B shows that the molecules generated from scaffolds 1 to 6 have similar molecular structures, which is mainly reflected along the PC2 axis. In addition, all molecules fall into two major groups along the PC1 axis, which is caused by differences in their R groups.
Conclusions
We designed new PPIs based on their pharmacological and toxicological mechanisms. We first constructed six molecular scaffolds that complied with the three abovementioned criteria. Next, R groups were extracted from compound molecules in 10 different databases, such as DrugBank, TCM Database@Taiwan, and GDB. On this basis, we employed the “virtual structure generation” technique to establish a virtual molecule data set. Subsequently, the pKa values of specific atoms in the generated molecules were calculated to select the molecules with the required pKa values. Finally, drug-likeness screening was conducted, and the remaining molecules are new PPI candidates that may reduce the adverse effects. Although these candidates still require experimental validation, we showed that reduced adverse effects can be achieved in theory by lowering the pKa value of the nitrogen atom on the pyridine ring. The study also provides insights and tools for the in silico design of targeted molecules suitable for practical applications. Considering that the annual global sales of PPIs have exceeded $20 billion [34], this work may have considerable economic relevance. All the scripts and core functions of the methods in this study are available at Supplementary Codes of Methods.
Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature