DeepDrug: A general graph-based deep learning framework for drug-drug interactions and drug-target interactions prediction

Qijin Yin, Rui Fan, Xusheng Cao, Qiao Liu, Rui Jiang, Wanwen Zeng

Quant. Biol., 2023, Vol. 11, Issue 3: 260-274.

DOI: 10.15302/J-QB-022-0320
RESEARCH ARTICLE


Abstract

Background: Computational approaches for accurate prediction of drug interactions, such as drug-drug interactions (DDIs) and drug-target interactions (DTIs), are in high demand among biochemical researchers. Although many methods have been proposed to predict DDIs and DTIs separately, their success remains limited by a lack of systematic evaluation of the intrinsic properties embedded in the corresponding chemical structures.

Methods: In this paper, we develop DeepDrug, a deep learning framework that addresses the above limitation by using residual graph convolutional networks (Res-GCNs) and convolutional neural networks (CNNs) to learn comprehensive structure- and sequence-based representations of drugs and proteins.

Results: DeepDrug outperforms state-of-the-art methods in a series of systematic experiments, including binary-class DDI classification, multi-class/multi-label DDI classification, binary-class DTI classification and DTI regression tasks. Furthermore, we visualize the structural features learned by the DeepDrug Res-GCN module, which display patterns consistent with chemical properties and drug categories, providing additional evidence for the strong predictive power of DeepDrug. Finally, we apply DeepDrug to perform drug repositioning on the whole DrugBank database to discover potential drug candidates against SARS-CoV-2, where 7 out of the 10 top-ranked drugs have been reported as repurposing candidates for treating coronavirus disease 2019 (COVID-19).

Conclusions: To sum up, we believe that DeepDrug is an efficient tool for accurate prediction of DDIs and DTIs and provides promising insight into the underlying mechanisms of these biochemical relations.

Keywords

drug-drug interaction / drug-target interaction / graph neural network / deep learning

Cite this article

Qijin Yin, Rui Fan, Xusheng Cao, Qiao Liu, Rui Jiang, Wanwen Zeng. DeepDrug: A general graph-based deep learning framework for drug-drug interactions and drug-target interactions prediction. Quant. Biol., 2023, 11(3): 260-274. DOI: 10.15302/J-QB-022-0320

1 INTRODUCTION

The exploration of biomedical interactions between chemical compounds (drugs, molecules) and protein targets is of great significance for drug discovery [1]. It is believed that drugs interact with biological systems by binding to protein targets and affecting their downstream activity. Prediction of drug-target interactions (DTIs) is thus important for identifying therapeutic targets or the characteristics of drug targets. Abundant knowledge of DTIs also provides valuable insight into higher-level information such as therapeutic mechanisms in drug repurposing [2]. For instance, Sildenafil was initially developed to treat pulmonary hypertension, but identification of its side effects allowed it to be repositioned for treating erectile dysfunction [3]. In addition, since most human diseases are complex biological processes that are resistant to the activity of a single drug [4,5], polypharmacy has become a promising strategy among pharmacists. Prediction and validation of drug-drug interactions (DDIs) can sometimes reveal potential synergies in drug combinations that improve the therapeutic efficacy of individual drugs [6]. More importantly, negative DDIs are major causes of adverse drug reactions (ADRs) [7], especially among the elderly, who are more likely to take multiple medications [8]. Severe ADRs from critical DDIs may lead to the withdrawal of drugs from the market, as happened with mibefradil and cerivastatin in the US [9,10]. Hence, accurate prediction of interactions between drugs can not only ensure drug safety but also shed light on drug repositioning or drug repurposing, which can potentially lower overall drug development costs and enhance drug development efficiency.

Over the past decade, the emergence of various biochemical databases, such as DrugBank [11], TwoSides [12], RCSB Protein Data Bank [13] and PubChem [14], has provided a rich resource for health professionals studying DTIs and DDIs. However, prediction of novel or unseen biochemical interactions remains a challenging task. In vitro experimental techniques are reliable but expensive and time-consuming. In silico computational approaches have received far more attention due to their cost-effectiveness and increasing accuracy in various drug-related prediction tasks [15–19]. State-of-the-art computational methods for interaction prediction rely on machine learning algorithms that incorporate large-scale biochemical data. Most of these efforts are based on the principle that similar drugs tend to share similar target proteins and vice versa [20]. Hence, the most popular frameworks formulate the prediction of DTIs and DDIs as classification tasks and use different forms of similarity functions as inputs [21]. Another common type of approach constructs heterogeneous networks in the chemogenomics space to predict potential interactions using random walks [22]. The rise of machine learning methods, especially deep learning methods, has promoted drug-related research tremendously in the last two decades, including the tasks of predicting DTIs and DDIs [23,24]. For example, DeepDDI [16] first generated a feature vector called a structural similarity profile (SSP) for each drug, then calculated a combined SSP for a pair of drugs by applying dimension reduction (i.e., PCA) to the concatenation of the two drugs' SSPs. The combined SSPs were used to train the DeepDDI model to perform DDI prediction. Similar to DeepDDI, NDD [15] first calculated high-level drug features from multiple drug similarities based on drug substructure, target, side effect, pathway, etc. It then used a multi-layer perceptron for interaction prediction based on the curated features. DeepPurpose [25] is a deep learning framework for DTI and DDI prediction tasks that integrates different types of neural network structures using only sequence-based inputs. DeepDTA [26] used two convolutional neural networks to learn from compound SMILES and protein sequences to predict interactions. GraphDTA [27] used graph neural networks and a convolutional neural network to learn high-dimensional features of drugs and targets separately and made interaction predictions via fully connected layers. DDIMDL [28] built a multimodal deep learning framework with multiple drug features to predict DDIs. SkipGNN [29] is a graph neural network approach for predicting molecular interactions by aggregating information from direct and second-order interactions.

In spite of these advances, there is still room for improvement in several aspects. First, the accurate prediction of unseen drug interactions depends heavily on the feature extraction technique or similarity kernel used. Since different forms of feature extraction or similarity kernels introduce varying amounts of human-engineered bias, they often display different levels of predictive performance depending on the relevant settings, and no single kernel outperforms the others universally [30]. Similarity-based methods are also difficult to apply to large-scale datasets due to the significant computational complexity of computing similarity matrices [31]. Network-based methods built upon topological properties of the multipartite graph suffer from the same problem, depending on the complexity of the graph [32]. Deep learning-based methods have used either sequence-based or structural information alone; none of them combines both kinds of information for a given drug and protein to model the biological entities comprehensively. Moreover, none of the existing methods solves DDI and DTI tasks within a unified framework.

In recent years, deep learning frameworks based on various graph neural networks, such as graph convolutional networks (GCNs) [33], graph attention networks (GATs) [34], gated graph neural networks (GGNNs) [35] and residual gated graph convolutional networks [36], have demonstrated ground-breaking performance in social science, natural science, knowledge graphs and many other research areas [37–39]. In particular, GCNs have been applied to various biochemical problems such as molecular property prediction [40], molecular generation and protein function prediction [41]. As pharmacological similarities mainly originate from, and are computed on, not only sequence but also structural properties, graph representations of biochemical entities have shown the capability of capturing the structural features of Euclidean ones without requiring feature engineering [42,43].

Based on these observations, we propose DeepDrug, a graph-based deep learning framework, to learn drug interactions such as pairwise DDIs or DTIs. A key insight of our framework is that biochemical interactions are primarily determined by both the sequence and structure of the participating entities, and that both drugs and proteins can be naturally represented as graphs. Therefore, it is crucial for the predictive model to incorporate both sequence-based and structural information and to employ a graph-based architecture. The proposed model mainly differs from previous methods in the following three aspects: (i) unlike previous methods that use only sequence or only structure information, DeepDrug takes both traditional sequence representations and structure-based graph representations as inputs to learn a more comprehensive representation of drugs or proteins; (ii) we introduce a novel Res-GCN module to better capture the intrinsic structural information among the atoms of a compound and the residues of a protein; (iii) to the best of our knowledge, DeepDrug is the first work to solve both DDI and DTI tasks within a unified framework. A series of systematic experiments shows that DeepDrug outperforms other state-of-the-art models and demonstrates high robustness under different experimental settings. We conclude that DeepDrug, as an effective tool for predicting DDIs and DTIs, could shed light on the understanding of biochemical interactions.

2 RESULTS

2.1 Overview of DeepDrug

We developed a deep learning framework, DeepDrug, to predict drug interactions (e.g., DDIs and DTIs) by combining a sequence profile and a structural profile. For each input (drug or protein), we used sequence data as well as the partially available structural data as separate input branches to the DeepDrug model (Fig.1). The input sequence of a drug or protein was converted into a one-hot encoded representation and fed to convolution layers. The drug chemical structure was encoded as a graph, where each node represents an atom and each edge denotes a chemical bond. Similarly, the protein structure was encoded as a graph, with nodes denoting amino acids and edges denoting the interactions between them. The graph representation was then fed to several residual graph convolution layers (Res-GCNs). The hidden features extracted from the sequence branches and structural branches were subsequently merged by concatenation. Finally, a fully connected layer with a Sigmoid, Softmax or no activation function was used to produce the output for binary classification, multi-class/multi-label classification and regression, respectively. Detailed hyperparameters are illustrated in Supplementary Fig. S1.
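The task-dependent output layer described above can be illustrated with a minimal sketch. It is not the DeepDrug implementation: the function name apply_output_head and the task labels are hypothetical, and only the binary, multi-class and regression cases are covered, since the text does not detail how multi-label outputs are normalized.

```python
import math


def sigmoid(x):
    """Logistic function, used for the binary classification head."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def apply_output_head(logits, task):
    """Map raw outputs of the final fully connected layer to predictions.

    The task names are illustrative labels, not identifiers from DeepDrug.
    """
    if task == "binary":
        return [sigmoid(v) for v in logits]   # interaction probability
    if task == "multiclass":
        return softmax(logits)                # distribution over interaction types
    if task == "regression":
        return list(logits)                   # no activation: raw affinity value
    raise ValueError(f"unknown task: {task}")
```

Keeping the shared branches identical and swapping only this last step is what lets a single architecture serve classification and regression settings.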

2.2 DeepDrug enables superior drug-drug interactions prediction

DDI prediction falls into two categories: (i) binary classification, where each pair of drugs in the database is annotated as a positive or negative example, with negative samples selected either by random pairing or by a stringent blind test; and (ii) multi-class/multi-label classification, where the labels are obtained from annotations based on the different types of interactions defined in DrugBank and TwoSides (see Methods). We first evaluated the performance of DeepDrug for DDI prediction in a binary classification setting. We benchmarked DeepDrug against eight baseline methods, including random forest classification (RF) and logistic regression (LR), DeepDDI [16], DeepPurpose [25], NDD [15], AttentionDDI [44], DDIMDL [28] and SkipGNN [29]. Five datasets were used for evaluation, including DDInter [45], DrugBank, TwoSides and two datasets from the NDD paper [15]. Our analysis showed that deep learning methods outperform similarity-based methods and traditional machine learning methods, such as RF and LR, across different datasets by a large margin. DeepDrug consistently outperformed all competing methods, achieving the highest F1 score of 0.916–0.955, the highest area under the precision-recall curve (auPRC) of 0.964–0.987 and the highest area under the receiver operating characteristic curve (auROC) of 0.971–0.988 in the balanced setting (Supplementary Table S1). Compared with the second-best baseline method, DeepPurpose, DeepDrug achieved on average a 2.1% higher F1 score, a 1.3% higher auPRC score and a 1.1% higher auROC score.

However, due to the rarity of DDIs [46], the number of known DDIs in a typical drug database is usually very low. Hence, to be more realistic and practical, we also evaluated the robustness of DeepDrug on imbalanced datasets by altering the ratio between positive and negative samples to 1:2, 1:4, 1:8 and 1:16 based on the number of drugs in the different datasets (Fig.2). Note that the results of NDD and AttentionDDI were taken directly from the original papers. In our case, although the auPRC scores of all compared methods dropped, DeepDrug still outperformed the other methods across all datasets and positive-to-negative ratios by achieving the highest F1 and auPRC scores (Fig.2, Supplementary Fig. S2). Specifically, DeepDrug is more robust and achieves significantly higher performance than the best baselines, DeepPurpose and DeepDDI, when the dataset is extremely unbalanced (Fig.2). For example, the advantage of DeepDrug over DeepPurpose in terms of F1 score increased from 1.7% to 9.6% when the positive-to-negative sample ratio changed from 1:1 to 1:16 on the DDInter dataset. Likewise, the auPRC advantage of DeepDrug over DeepPurpose increased from 1.40% to 8.50% over the same change in ratio on the DDInter dataset. To sum up, the F1 and auPRC scores of DeepDrug relative to other prediction methods demonstrate its superior ability in predicting DDIs, especially on unbalanced datasets.
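One way to build such imbalanced evaluation sets is to draw a fixed multiple of random negative pairs for the known positives. The sketch below is only an illustration under stated assumptions: drug pairs absent from the positive set are treated as negatives (random pairing), and the function name sample_negatives and the toy identifiers are hypothetical.

```python
import itertools
import random


def sample_negatives(drugs, positive_pairs, ratio, seed=0):
    """Draw ratio * (number of positives) random drug pairs outside the positive set.

    Assumption: any unordered pair not annotated as interacting counts as a negative.
    """
    positives = {frozenset(pair) for pair in positive_pairs}
    candidates = [
        pair
        for pair in itertools.combinations(sorted(drugs), 2)
        if frozenset(pair) not in positives
    ]
    n_wanted = ratio * len(positives)
    if n_wanted > len(candidates):
        raise ValueError("not enough unlabeled pairs for the requested ratio")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    return rng.sample(candidates, n_wanted)


# Toy example: a 1:4 positive-to-negative ratio over ten hypothetical drugs.
toy_drugs = [f"D{i}" for i in range(10)]
toy_positives = [("D0", "D1"), ("D2", "D3")]
toy_negatives = sample_negatives(toy_drugs, toy_positives, ratio=4)
```

Because the number of candidate pairs grows quadratically with the number of drugs, the largest feasible ratio depends on the dataset size, which is consistent with choosing ratios based on the number of drugs.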

Next, we compared DeepDrug with other methods in multi-class/multi-label classification settings, where only DeepDDI and DeepPurpose are applicable. We conducted the classification experiments using the DrugBank and TwoSides databases, based on 86 and 1317 interaction types, respectively. All of the DDI methods were evaluated using standard metrics including macro F1 score and auPRC score. In multi-class classification, DeepDrug achieved the best performance, obtaining a 4.3%–5.8% higher F1 score and a 4.9%–6.7% higher auPRC than the best baseline method (Supplementary Table S2). This improvement indicates the advantage of using both structural and sequence-based representations of drugs in DDI prediction. The same trend was observed in the multi-label classification results. Although the 1317 interaction types in this dataset lowered the F1 scores of all methods, DeepDrug achieved a much higher F1 score (0.292) and auPRC score (0.265) than the second-best method, DeepPurpose (F1 score 0.227 and auPRC score 0.191; Supplementary Table S2, Fig. S3).

To evaluate the performance of DeepDrug under a more stringent setting, we used a blind test for binary classification, in which five-fold cross-validation was used to ensure that one drug or both drugs of each test pair were absent from the training set. DeepDrug again outperformed the best baseline, DeepPurpose, achieving on average a 4.45% higher F1 score and a 1.8% higher auPRC with double-blind testing across four datasets (Supplementary Table S3). To sum up, DeepDrug was shown to be superior and robust in both binary and multi-class/multi-label classification of DDIs. Unlike DeepPurpose, which uses only the SMILES sequence information, DeepDrug exploits both structural information from a novel graph representation and sequence information from the SMILES string, which potentially allows it to learn the underlying structural properties and thereby gain better performance.
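A blind split of this kind can be sketched by assigning drugs, rather than pairs, to folds. This is a minimal illustration, not the authors' procedure: the function name blind_folds is hypothetical, and the exact fold construction used in the paper is not specified here.

```python
import random


def blind_folds(pairs, n_folds=5, seed=0):
    """Yield (train, one_unseen, both_unseen) pair lists for each fold.

    Drugs are partitioned into folds. For fold k, pairs with no held-out drug
    form the training set, pairs with exactly one held-out drug form the
    single-blind test set, and pairs with two held-out drugs form the
    double-blind test set.
    """
    drugs = sorted({d for pair in pairs for d in pair})
    rng = random.Random(seed)
    rng.shuffle(drugs)
    fold_of = {d: i % n_folds for i, d in enumerate(drugs)}
    for k in range(n_folds):
        buckets = ([], [], [])  # indexed by number of held-out drugs in the pair
        for a, b in pairs:
            n_unseen = int(fold_of[a] == k) + int(fold_of[b] == k)
            buckets[n_unseen].append((a, b))
        yield buckets
```

Splitting at the drug level guarantees that no drug in the double-blind test pairs ever appears in training, which a plain pair-level split cannot ensure.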

2.3 DeepDrug accurately identifies drug-target interactions

Although proteins generally have more intricate structures than chemical drugs due to the three-dimensional arrangement of their sequence residues, they can still be effectively represented by 3D graphs. We first classified the DTI dataset with binary labels and benchmarked DeepDrug against six baseline methods, including RF and LR, DeepPurpose [47], CPI [48], MolTrans [49] and TransformerCPI [50]. Three benchmark datasets were used, including BindingDB, DAVIS and KIBA. The benchmark results showed the same trend as in the DDI tasks: deep learning methods dominated the DTI prediction tasks. DeepDrug again obtained the best performance among all deep learning methods, achieving an average auPRC of 0.811 across the three datasets, compared to 0.788 for the second-best baseline, DeepPurpose (Fig.3, Supplementary Table S4). Notably, DeepDrug and DeepPurpose were the only two deep learning methods that were applicable to the largest dataset, BindingDB, while the transformer-based method TransformerCPI failed due to its low computational efficiency. The superior performance of DeepDrug in DTI prediction tasks indicates that the graph-based representation of drugs can be regarded as a general framework for boosting prediction performance in various drug-related tasks.

Next, we compared DeepDrug to four baseline methods, including GraphDTA [27], DeepDTA [26], DeepPurpose [47] and RF, in DTI regression settings, where we directly predict the continuous binding affinity measured by the Kd value (see Methods). We conducted the regression experiments on the same three datasets (BindingDB, DAVIS and KIBA [51]) based on the Kd value (kinase dissociation constant). All of the compared methods were evaluated using standard metrics including concordance score, Pearson r and R² score. Again, DeepDrug achieved the best performance on all three evaluation measures compared to the baseline methods (Fig.3, Supplementary Table S5). Specifically, DeepDrug achieved the highest concordance score of 0.836 on BindingDB, which is 1.2% and 2.3% higher than DeepPurpose and the graph neural network-based method GraphDTA, respectively. The superiority of DeepDrug was consistently observed on the DAVIS and KIBA datasets. Unlike GraphDTA, which updates only node features in the graph convolutional layers, DeepDrug considers both node features and edge features and updates them iteratively, leading to a more comprehensive representation of a drug and to additional predictive power in the DTI tasks. These results indicate the benefit of combining a comprehensive structural representation with a sequence representation for both drugs and proteins in DTI prediction tasks.
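For readers unfamiliar with the concordance score, the sketch below computes the standard concordance index, which we assume is the metric meant here: the fraction of pairs with different true affinities whose predicted order matches the true order, with tied predictions counted as half. The function name is illustrative.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs ranked in the correct order.

    Pairs with equal true values are skipped; tied predictions count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # order is undefined for equal true affinities
            comparable += 1
            if y_pred[i] == y_pred[j]:
                concordant += 0.5
            elif (y_pred[i] > y_pred[j]) == (y_true[i] > y_true[j]):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, whereas a constant predictor scores 0.5, so a score of 0.836 reflects ranking quality rather than absolute error. This quadratic-time loop is meant for clarity and would be slow on large datasets.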

To further explore the ability of DeepDrug in drug repositioning, similar to the DDI blind test, we stringently separated the drugs and proteins into training and test sets using five-fold cross-validation, thus curating a blind test set in which the drugs and/or proteins were unseen in the training set. This task is much more challenging because both the drugs and the proteins can be unseen during training. DeepDrug achieved an average concordance score of 0.677 and Pearson r of 0.468 on the DAVIS dataset, outperforming DeepPurpose (concordance score of 0.605, Pearson r of 0.392) by a noticeable margin (Supplementary Table S6). Therefore, by exploiting useful structural information from graph representations of drugs and proteins, DeepDrug was shown to be consistently superior to baseline methods in both classification and regression of DTIs. We conclude that DeepDrug provides a powerful representation of both drugs and proteins by considering comprehensive structural information as well as sequence information. The superior performance of DeepDrug across various settings in DDI and DTI prediction tasks indicates its strong generalization ability in a wide range of drug-related applications.

2.4 Model ablation analysis

To further support the results shown in the above sections, we conducted a comprehensive model ablation analysis to measure the contribution of the different modules used in the DeepDrug architecture (Methods). First, we analyzed the performance of DeepDrug with respect to the following model ablation settings: presence of the Res-GCN module and presence of the CNN module. The Res-GCN module and the CNN module are used in DeepDrug to leverage structural and sequence information, respectively. We used the binary classification task of DDIs at multiple positive-to-negative sample ratios and DTI regression for the ablation studies. Using the Res-GCN module alone decreased the F1 score by 0.5%–2.6%, while using only the CNN module decreased the F1 score by 0.2%–1.6% (Supplementary Table S7). Similar decreases were observed in R² and concordance score in the DTI regression tasks. Removing structural information from DeepDrug reduced R² by 1.1% and 1.8% and Pearson r by 0.6% and 1.4% on the KIBA and DAVIS datasets, respectively. DeepDrug with fused structural and sequence features performed best, indicating the benefit of integrating structural information with sequence information in the DTI tasks. Next, we removed the edge features, which are ignored by existing works but used in the Res-GCN modules, and the F1 score decreased by about 2.4% to 3.2% (Supplementary Table S8). To summarize, the Res-GCN module and the CNN module complement each other to improve predictive performance, indicating the usefulness of the DeepDrug architecture.

We also analyzed the robustness of DeepDrug with respect to the following hyperparameter settings: choice of feature aggregation, number of hidden units in each GCN layer and total number of GCN layers. DeepDrug with the SoftMax aggregation function performed better than with other aggregation functions such as Mean and Sum (Supplementary Table S9). As the number of hidden units in the Res-GCN layer increased (e.g., to 32 and higher), both evaluation metrics started to saturate. The model was likewise insensitive to the number of Res-GCN and CNN layers (Supplementary Table S9). To sum up, DeepDrug was insensitive to most parameter choices, illustrating the robustness of the framework.
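The three aggregation choices can be contrasted with a small sketch. The definition of SoftMax aggregation is not given in this section, so the version below is an assumption: a feature-wise softmax over the neighbor messages is used to weight their sum, as in some deep GCN variants. The function name aggregate is hypothetical.

```python
import math


def aggregate(messages, how="softmax"):
    """Combine neighbor message vectors into one vector, feature by feature.

    Assumption: "softmax" weights each neighbor's value in a feature dimension
    by the softmax of the values in that dimension, then sums.
    """
    dims = range(len(messages[0]))
    if how == "sum":
        return [sum(m[d] for m in messages) for d in dims]
    if how == "mean":
        return [sum(m[d] for m in messages) / len(messages) for d in dims]
    if how == "softmax":
        out = []
        for d in dims:
            column = [m[d] for m in messages]
            peak = max(column)  # subtract the max for numerical stability
            weights = [math.exp(v - peak) for v in column]
            total = sum(weights)
            out.append(sum(w / total * v for w, v in zip(weights, column)))
        return out
    raise ValueError(f"unknown aggregation: {how}")
```

Under this definition the softmax result lies between the mean and the maximum of each feature, so it emphasizes strong neighbor signals without growing with node degree the way a plain sum does.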

2.5 DeepDrug embeddings reflect drug types and drug functions

To demonstrate that DeepDrug effectively captures the variability of structural information in the embeddings learned by the Res-GCN module, we visualized the structural embeddings of drugs from the benchmark DrugBank dataset using t-distributed stochastic neighbor embedding (tSNE). We found that the DeepDrug embeddings exhibited clear patterns that corresponded to the underlying drug types and drug functions (Fig.4, Supplementary Fig. S4). We hypothesized that drugs lying closer together in the embedding space (e.g., within the same cluster) share some form of higher similarity or closer relationship. To verify this, we quantified the effectiveness of the embeddings under various evaluation settings and found that DeepDrug embeddings consistently outperformed those of DeepPurpose, achieving a higher average Drug Category Enrichment Score (0.690 vs 0.621, see DCES in Supplementary Note 1) and a higher silhouette score (0.568 vs 0.543, see Fig.4 and Supplementary Table S10). Extensive evaluations under various folds showed that the DeepDrug embeddings consistently achieved the best performance (Fig.4). Furthermore, to evaluate the performance of DeepDrug on unseen drugs, we collected 4886 drugs from the DrugBank website that were not used in the benchmark studies, and the mean DCES was again better than that of DeepPurpose (0.575 vs 0.514, Supplementary Table S11).

Next, we isolated the 28 drugs in cluster 4 (enriched for opioids) and compared their chemical structures and functions with those of randomly sampled drugs in the dataset that were far away from the cluster. A subset of the sampled drugs is presented in Fig.4. Strikingly, drugs in the cluster shared very similar structural compositions. In terms of function, the drugs in the cluster identified by the DeepDrug embeddings were also highly similar to one another (Fig.4): all 28 drugs in cluster 4 are intended for pain relief (Supplementary Table S12). Taken together, these results demonstrate that the DeepDrug structural embeddings effectively capture structural information that may determine the function of the input entities and thus reflect the underlying drug function. We consider this structural embedding capability to be the main driving force behind the superior performance of DeepDrug.

2.6 DeepDrug provides potential therapeutic opportunities against SARS-CoV-2

SARS-CoV-2 is a newly emerged enveloped positive-strand RNA virus, which has probably the largest genome (approximately 30 kb) among all RNA viruses. The nucleocapsid (N) protein, which is mainly responsible for recognizing viral RNA and wrapping it into helically symmetric structures, has been reported to boost the efficiency of transcription and replication of viral RNA, implying its vital and multifunctional roles in the life cycle of the coronavirus [52].

We then investigated whether DeepDrug was able to correctly identify the interactions of SARS-CoV-2 proteins. We constructed two drug-target positive datasets (one expert-confirmed and one literature-based) for SARS-CoV-2 from a recent study [53] (Supplementary Note 2). Our benchmark BindingDB dataset contained 68 SARS-CoV-2 interacting drugs and 124 proteins that were similar to these SARS-CoV-2 proteins. To construct the dataset stringently, we removed from the training set the SARS-CoV-2 interacting drugs and analogous drugs that shared similar SMILES sequences (defined as drug sequence similarity > 60%, see Methods, Supplementary Fig. S5A, B), and removed proteins similar to SARS-CoV-2 proteins (protein sequence similarity > 30%). After removing these records, we re-trained the DeepDrug model and combined the SARS-CoV-2 interacting drugs to construct an independent test set. The DeepDrug prediction scores for interacting and non-interacting pairs are shown in Fig.5, and DeepDrug assigned higher prediction scores to the interacting pairs. The results showed that DeepDrug was able to distinguish expert-confirmed positive pairs from negative pairs under both the mean and the maximum strategies (p-values of 5.06 × 10⁻⁹ and 8.06 × 10⁻⁷, respectively, one-sided paired t-test). Predictions based on similar template structures from the RCSB database, rather than on simulated structures, also showed similarly significant discrimination between expert-confirmed positive pairs and negative pairs (Supplementary Fig. S6). In addition, DeepDrug trained on the original BindingDB dataset showed similar performance (Supplementary Fig. S7). We observed some outliers with very high predicted affinity among the negative pairs, which could be valid drug candidates. Among the top-ranked drug-protein pairs, 2 of the top 3 drugs and 7 of the top 10 drugs have already been reported in the literature (Supplementary Table S13). Among these molecules, prinomastat (the 2nd top-ranked molecule), a matrix metalloprotease inhibitor, was reported to have selective activity against SARS-CoV-2 but not against SARS-CoV [54]. Besides, pioneering research [55] has shown that the TNF, IL1B, IL6, IL8, NFKB1, NFKB2 and RELB genes are significantly upregulated, leading to strong activation of the TNF and NFκB signaling pathways in SARS-CoV-2 patients. These pathologic features are similar to those of chronic obstructive pulmonary disease (COPD). Tiotropium (the 4th top-ranked molecule), which has been observed to alleviate airway inflammation and improve pulmonary function, is a well-known therapeutic drug for COPD patients. Therefore, tiotropium is a potentially effective drug for the treatment of SARS-CoV-2 infection. In addition, treatment of 11 SARS-CoV-2 patients demonstrated that danoprevir (the 7th top-ranked molecule) effectively inhibited viral replication and improved patient health status [56]. Hence, the repurposing of danoprevir, a powerful hepatitis C virus (HCV) protease inhibitor, for SARS-CoV-2 is a promising therapeutic option. These results further demonstrate the strong predictive power of DeepDrug, which may provide therapeutic opportunities against newly discovered targets such as the proteins of SARS-CoV-2.
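The similarity-based filtering step used to build the stringent training set can be sketched as follows. The similarity measure used in the paper is defined in its Methods and is not reproduced here; as a loudly labeled stand-in, this sketch uses the ratio from Python's difflib.SequenceMatcher, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in string similarity in [0, 1]; NOT the measure used in the paper."""
    return SequenceMatcher(None, a, b).ratio()


def filter_training_set(train_seqs, held_out_seqs, threshold):
    """Keep only training sequences whose similarity to every held-out
    sequence is at most the threshold (e.g., 0.6 for drugs, 0.3 for proteins).
    """
    return [
        seq
        for seq in train_seqs
        if all(similarity(seq, held) <= threshold for held in held_out_seqs)
    ]


# Toy SMILES strings: the exact match and a close analog are dropped.
kept = filter_training_set(["CCO", "CCN", "c1ccccc1"], ["CCO"], threshold=0.6)
```

Removing near-duplicates of the test entities before re-training is what makes the subsequent SARS-CoV-2 evaluation an independent test rather than a memorization check.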

3 DISCUSSION

In this study, we proposed DeepDrug as a novel end-to-end deep learning framework for DDI and DTI prediction. DeepDrug takes both topological structure information and sequence information of either a drug-drug pair or a drug-protein pair as inputs and uses Res-GCNs and CNNs to learn the graph representation and high-level sequence embeddings, respectively. The multi-source features are fused so that they complement each other and achieve high prediction accuracy. To the best of our knowledge, DeepDrug is the first work to apply both graph convolutions and sequence convolutions to molecular representation. In addition, we demonstrated that the combination of an intrinsic graph-based representation and high-level sequence embeddings is appealing for a comprehensive assessment in predicting DDIs and DTIs. Unlike the AdvProp method [57], which uses ensemble learning strategies to combine the outputs of different sub-models that take structural and sequence information separately as inputs, DeepDrug integrates structural and sequence information within a single model architecture. Our extensive experiments highlighted the predictive power of DeepDrug and its potential translational value in drug repositioning.

We also suggest three possible future directions for improving the DeepDrug model. First, rich multi-omics data, including genomic, transcriptomic, epigenomic and proteomic data, which have proven to be informative [58–63], could help DeepDrug further improve its predictive power. We will try to incorporate these abundant data into the DeepDrug model. Second, the current interaction predictions (e.g., DTIs) do not consider causal interactions, in which a drug is involved in a biological or biochemical process that directly or indirectly affects a protein. Identifying such direct and indirect interactions with causal inference methods [64] could help us better understand the related biological or chemical pathways and mechanisms. Third, based on complex gene-protein-drug-disease heterogeneous networks constructed from multiple genomics databases [65–67], combining sequence and structural features of proteins/targets with association features in complex graphs through heterogeneous graph convolution networks is a promising direction.

To sum up, we introduced DeepDrug, which can serve as a framework for systematically exploring DDI and DTI prediction tasks with a unified model architecture. With DeepDrug, researchers can perform drug repositioning for specific target proteins, simultaneously learning the interaction mechanism and annotating the interaction potential of every candidate drug. Using large-scale public data, one could train an accurate and interpretable model to predict interactions associated with human diseases (e.g., SARS-CoV-2). We hope our approach will help unveil drug interaction mechanisms and facilitate further biochemical research.

4 METHODS

4.1 Drug and protein feature representation

We used DeepChem [68] to convert drug SMILES strings into graph representations consisting of feature matrices (i.e., node and edge feature matrices) and adjacency matrices. We used the PAIRPred [69] software to convert protein PDB data into similar graph representations, again consisting of feature matrices and adjacency matrices. Specifically, each drug graph has 11-dimensional edge features and 93-dimensional node features, of which 91 were calculated using DeepChem and the remaining two are the in-degree and out-degree of each node. The cutoff for drug sequence length was set to 200. For the protein graph features, we first collected the PDB files of all proteins from the RCSB database. For each protein, we selected the longest crystal structure, i.e., the longest chain in the PDB file, as the 3D structure of the protein. Each protein graph has 80-dimensional node features, including amino acid features, and 2-dimensional edge features (the distance and the angle between amino acids [41]). Among the node features, 78 were calculated by PAIRPred and the remaining two are the in-degree and out-degree of each amino acid. Note that we did not take into account the conformational plasticity of proteins in response to different drugs because sufficient data are not available. The cutoff for protein sequence length was set to 1000. For the DTI datasets, we removed proteins without a 3D structure together with the corresponding DTI pairs.
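As a minimal illustration of the two degree features described above, the sketch below appends the in-degree and out-degree of each node to an existing node feature matrix. The function name `add_degree_features`, the plain-list data layout and the toy 3-atom graph are assumptions made for illustration; they are not taken from DeepChem, PAIRPred or the DeepDrug code.

```python
def add_degree_features(node_features, edges):
    """Append in-degree and out-degree to every node feature vector.

    node_features: list of per-node feature lists (hypothetical layout).
    edges: list of directed (source, target) node-index pairs.
    """
    num_nodes = len(node_features)
    in_degree = [0] * num_nodes
    out_degree = [0] * num_nodes
    for source, target in edges:
        out_degree[source] += 1
        in_degree[target] += 1
    return [
        list(features) + [in_degree[i], out_degree[i]]
        for i, features in enumerate(node_features)
    ]


# Toy 3-atom chain 0-1-2; each bond is stored as two directed edges.
toy_features = [[0.0] * 91 for _ in range(3)]  # 91 placeholder features per atom
toy_edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
augmented = add_degree_features(toy_features, toy_edges)
print(len(augmented[0]))   # feature dimension after adding the two degrees
print(augmented[1][-2:])   # in-degree and out-degree of the middle atom
```

With 91 placeholder features per atom, the augmented vectors have the 93 dimensions quoted for drug nodes.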

4.2 Residual graph convolutional network

The residual graph convolutional network (Res-GCN) module learns node embeddings and edge embeddings simultaneously through graph convolutions, whereas other GCN-based methods in this field consider only node embeddings. The Res-GCN module converts the original node features (93 for drugs and 80 for proteins) into 128-dimensional features, and likewise converts the original edge features (11 for drugs and 2 for proteins) into 128-dimensional features. Inspired by the success of deep residual networks [70], we used convolutional residual blocks in the Res-GCN module: 22 residual blocks in the drug branch (DDI and DTI tasks) and 6 residual blocks in the protein branch (DTI tasks).

We designed a strategy for iteratively updating the edge features and node features in each convolutional residual block, as follows. Taking the $l$-th graph convolutional residual block as an example, we denote the input node features and edge features as $H^{(l)} \in \mathbb{R}^{N_h \times D_h^{(l)}}$ and $E^{(l)} \in \mathbb{R}^{N_e \times D_e^{(l)}}$, respectively, where $N_h$ and $N_e$ denote the number of nodes and edges, and $D_h^{(l)}$ and $D_e^{(l)}$ denote the node and edge feature dimensions. For initialization, we set $D_h^{(0)}=93$ for drugs or $D_h^{(0)}=80$ for proteins, and $D_e^{(0)}=11$ for drugs or $D_e^{(0)}=2$ for proteins. Both the node features $H^{(l)}$ and the edge features $E^{(l)}$ were first passed through a layer normalization layer [71], a ReLU nonlinearity and a dropout layer (ratio = 0.1), which is represented as

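$$
\begin{cases}
\hat{H}^{(l)} = \mathrm{Dropout}(\mathrm{ReLU}(\mathrm{LayerNorm}(H^{(l)}))) \\
\hat{E}^{(l)} = \mathrm{Dropout}(\mathrm{ReLU}(\mathrm{LayerNorm}(E^{(l)})))
\end{cases}
$$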
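The pre-processing step above can be sketched in plain Python as follows. This is an illustrative sketch, not the DeepDrug implementation: the layer normalization here has no learnable scale or shift, the dropout uses the common "inverted" rescaling, and all function names and toy values are assumptions.

```python
import math
import random


def layer_norm(row, eps=1e-5):
    """Normalize one feature vector to zero mean and unit variance."""
    mean = sum(row) / len(row)
    variance = sum((x - mean) ** 2 for x in row) / len(row)
    return [(x - mean) / math.sqrt(variance + eps) for x in row]


def relu(row):
    """Clamp negative entries to zero."""
    return [max(0.0, x) for x in row]


def dropout(row, ratio=0.1, training=True, rng=random):
    """Inverted dropout: zero entries with probability `ratio`, rescale the rest."""
    if not training or ratio == 0.0:
        return list(row)
    keep = 1.0 - ratio
    return [x / keep if rng.random() < keep else 0.0 for x in row]


def pre_activation(matrix, ratio=0.1, training=True):
    """Apply LayerNorm -> ReLU -> Dropout to every row of a feature matrix."""
    return [dropout(relu(layer_norm(row)), ratio, training) for row in matrix]


# Toy node feature matrix with two nodes and three features each.
H = [[1.0, 2.0, 3.0], [4.0, 4.0, 10.0]]
H_hat = pre_activation(H, training=False)  # dropout is disabled at inference
print(H_hat)
```

The same function would be applied to the edge feature matrix, since both matrices go through identical processing.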

We next describe how to obtain the node features $H^{(l+1)}$ and edge features $E^{(l+1)}$ from the processed feature matrices $\hat{H}^{(l)}$ and $\hat{E}^{(l)}$ through an iterative strategy. We use $\hat{h}_i^{(l)}$ to denote the features of the $i$-th node (i.e., the $i$-th row of $\hat{H}^{(l)}$) and $\hat{e}_{i,j}^{(l)}$ to denote the features of the edge between the $i$-th and $j$-th nodes. Taking the $i$-th node as an example, we first calculated the residual features of the edge between the $i$-th and $j$-th nodes from the current edge features $\hat{e}_{i,j}^{(l)}$ and the node features $\hat{h}_i^{(l)}$ and $\hat{h}_j^{(l)}$, which is represented as

$$
\tilde{e}_{i,j}^{(l)} = \mathrm{FCN}^{(l)}\left(\hat{e}_{i,j}^{(l)}, \hat{h}_i^{(l)}, \hat{h}_j^{(l)}\right)
$$

where $\mathrm{FCN}^{(l)}$ is a two-layer perceptron with 256 and 128 nodes and ReLU activation that takes the concatenation of the node features and the corresponding edge features as input. Next, we calculated the residual features of the $i$-th node from the current node features $\hat{h}_i^{(l)}$ and the features of all edges connected to node $i$, which is formulated as

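$$
\tilde{h}_i^{(l)} = W^{(l)} \hat{h}_i^{(l)} + \sum_{j \in N(i)} \frac{\exp\left(\beta \hat{e}_{i,j}^{(l)}\right)}{\sum_{k \in N(i)} \exp\left(\beta \hat{e}_{i,k}^{(l)}\right)} \hat{e}_{i,j}^{(l)}
$$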

where $W^{(l)} \in \mathbb{R}^{D_h^{(l)} \times D_h^{(l+1)}}$ is the learnable parameter of the $l$-th graph convolutional residual block and the second term is a SoftMax aggregation function [70] that aggregates the information of the edges between node $i$ and all its neighboring nodes $N(i)$. Note that the SoftMax aggregation function is parametrized by $\beta$, which is also learned during training. After obtaining the residual features of all nodes and edges, which form the residual node feature matrix $\tilde{H}^{(l)}$ and the residual edge feature matrix $\tilde{E}^{(l)}$, the node features $H^{(l+1)}$ and edge features $E^{(l+1)}$ of the $(l+1)$-th graph convolutional residual block were updated through the following propagation rule:

$$
\begin{cases}
H^{(l+1)} = \tilde{H}^{(l)} + H^{(l)} \\
E^{(l+1)} = \tilde{E}^{(l)} + E^{(l)}
\end{cases}
$$
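To make the node update concrete, the following sketch computes the SoftMax aggregation for a single node and then applies the residual shortcut. It is a toy illustration under stated assumptions: the softmax is applied independently to each feature dimension, the weight matrix is stored with shape (D_in, D_out) and multiplies a row vector, and every name and number here is hypothetical rather than taken from the DeepDrug code.

```python
import math


def softmax_aggregate(edge_vectors, beta):
    """Softmax-weighted sum of a node's incident edge features, per dimension."""
    aggregated = []
    for d in range(len(edge_vectors[0])):
        column = [vec[d] for vec in edge_vectors]
        shift = max(beta * v for v in column)  # subtracted for numerical stability
        weights = [math.exp(beta * v - shift) for v in column]
        total = sum(weights)
        aggregated.append(sum(w / total * v for w, v in zip(weights, column)))
    return aggregated


def linear(vector, weight):
    """Row-vector times matrix product; `weight` has shape (D_in, D_out)."""
    return [
        sum(vector[i] * weight[i][j] for i in range(len(vector)))
        for j in range(len(weight[0]))
    ]


def update_node(h, h_hat, weight, incident_edges_hat, beta):
    """Return h + (W h_hat + softmax aggregation), i.e. the residual shortcut."""
    transformed = linear(h_hat, weight)
    aggregated = softmax_aggregate(incident_edges_hat, beta)
    residual = [a + b for a, b in zip(transformed, aggregated)]
    return [x + r for x, r in zip(h, residual)]


identity = [[1.0, 0.0], [0.0, 1.0]]
edges_hat = [[1.0, 2.0], [3.0, 6.0]]  # features of two edges incident to node i
mean_like = softmax_aggregate(edges_hat, beta=0.0)   # beta = 0 reduces to the mean
max_like = softmax_aggregate(edges_hat, beta=50.0)   # large beta approaches the max
h_next = update_node([1.0, 1.0], [1.0, 0.0], identity, edges_hat, beta=0.0)
print(mean_like, max_like, h_next)
```

The two extreme settings of beta show why a learnable value is useful: the aggregation can move between mean-like and max-like behavior during training.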

To make the shortcut addition of the features $H^{(l)}$ and $E^{(l)}$ dimensionally compatible when $l=0$, we additionally used a linear layer with 128 nodes followed by layer normalization to transform the input dimension:

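$$
\begin{cases}
H^{(0)} \leftarrow \mathrm{LayerNorm}(\mathrm{Linear}(H^{(0)})) \\
E^{(0)} \leftarrow \mathrm{LayerNorm}(\mathrm{Linear}(E^{(0)}))
\end{cases}
$$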

Note that the two Res-GCN or CNN modules shared weights in the DDI tasks and were independent in the DTI tasks. In the DDI tasks, the interaction between two drugs is not reciprocal; thus, although DeepDrug’s feature extraction modules are shared between the two drugs, their input positions cannot be switched (i.e., the A-B drug pair input is not equivalent to the B-A drug pair input for DeepDrug).

4.3 Data preparation

We collected five DDI benchmark datasets for evaluation. The DrugBank benchmark dataset consists of 1706 drugs with 191,808 drug pairs among 86 types of drug interactions based on drug function. The TwoSides dataset consists of 645 drugs with 63,473 drug pairs among 1317 kinds of interactions based on side effects, such as “abscess”, “adenoma” and “agnosia”. Unlike the mutually exclusive interaction types of the DrugBank dataset, the side effects in the TwoSides dataset are not exclusive, which makes side effect prediction a multi-label classification task. We further filtered out classes with fewer than 500 samples to construct TwoSides (963), a version of the TwoSides dataset with 963 types of interaction. We also collected two datasets from NDD [15]. The first, termed NDD_DS1, is composed of 548 drugs with 300,304 drug pairs, of which 97,168 are positive and the rest are negative. The second, termed NDD_DS2, consists of 707 drugs with 499,849 drug pairs, of which 34,412 are positive. The DDInter [45] dataset consists of 1493 drugs with 117,608 drug pairs.

To generate a series of binary datasets from DrugBank with different positive-to-negative ratios, we treated all pairs in the DrugBank dataset as positive samples. For negative samples, we randomly selected drug pairs from the dataset and discarded those that overlapped with positive samples as well as duplicated pairs. In this way, we constructed binary classification datasets with positive-to-negative ratios of 1:1, 1:2, 1:4, 1:8 and 1:16.
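The negative sampling procedure can be sketched as simple rejection sampling. The sketch below is illustrative only: it treats drug pairs as unordered when checking for overlap and duplicates, assumes every positive pair is made of two distinct drugs from the drug list, and uses hypothetical names (`sample_negative_pairs`, `negatives_per_positive`) and toy data.

```python
import random


def sample_negative_pairs(drugs, positive_pairs, negatives_per_positive, seed=0):
    """Draw random drug pairs that are neither positive nor duplicated."""
    positives = {frozenset(pair) for pair in positive_pairs}
    target = negatives_per_positive * len(positives)
    max_available = len(drugs) * (len(drugs) - 1) // 2 - len(positives)
    if target > max_available:
        raise ValueError("not enough unlabeled pairs for the requested ratio")
    rng = random.Random(seed)
    negatives = set()
    while len(negatives) < target:
        pair = frozenset(rng.sample(drugs, 2))  # two distinct drugs
        if pair not in positives:
            negatives.add(pair)  # the set silently drops duplicated pairs
    return sorted(tuple(sorted(pair)) for pair in negatives)


# Toy example with a 1:2 positive-to-negative ratio.
drugs = ["A", "B", "C", "D", "E", "F"]
positives = [("A", "B"), ("C", "D"), ("E", "F")]
negatives = sample_negative_pairs(drugs, positives, negatives_per_positive=2)
print(negatives)
```

Calling the function with `negatives_per_positive` set to 1, 2, 4, 8 and 16 would give the series of ratios listed above, provided enough unlabeled pairs exist.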

We collected three DTA benchmark datasets for evaluation: DAVIS, KIBA and BindingDB [72]. After discarding proteins without a 3D structure in the RCSB database, the DAVIS dataset consists of 68 drugs and 316 proteins, forming 21,488 drug-protein pairs. The KIBA dataset consists of 2111 drugs and 185 proteins, forming 390,535 drug-protein pairs. The BindingDB dataset consists of 417,893 drugs and 2076 proteins, forming 751,808 drug-protein pairs. Following a pioneering study [26], we applied thresholds of 100, 12.1 and 400 to the raw affinity scores in the DAVIS, KIBA and BindingDB datasets, respectively, to construct the corresponding binary datasets.

For the DAVIS and BindingDB datasets, binding affinity is measured by the $K_d$ value (kinase dissociation constant), whose range is very large. $K_d$ was therefore log-transformed into $pK_d$ using the following formula [26]:

$$
pK_d = -\log_{10}\left(\frac{K_d}{10^9}\right)
$$
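A short sketch of this transformation is given below, using the form from DeepDTA [26], pKd = -log10(Kd / 10^9). It assumes that Kd is expressed in nanomolar, as the factor 10^9 suggests; the function name `kd_to_pkd` and the example values are illustrative.

```python
import math


def kd_to_pkd(kd_nanomolar):
    """Convert a dissociation constant in nM (assumed unit) to pKd."""
    if kd_nanomolar <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_nanomolar / 1e9)


# Stronger binders (smaller Kd) receive larger pKd values.
for kd in (1.0, 100.0, 10000.0):
    print(kd, round(kd_to_pkd(kd), 3))
```

The transformation compresses several orders of magnitude of Kd into a range of a few units, which is easier to fit with a mean squared error loss.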

4.4 Baseline methods

To evaluate the performance of DeepDrug, we benchmarked it on multiple datasets with a 5-fold cross-validation strategy for both DDI and DTI tasks. For the DDI classification tasks, we compared DeepDrug with multiple baseline methods, including DeepPurpose [47], DeepDDI [16], NDD [15], AttentionDDI [44], DDIMDL [28], SkipGNN [29], logistic regression (LR) and random forest (RF). We modified DeepDDI slightly to make it suitable for binary classification. NDD and AttentionDDI rely on multiple similarity matrices and could not be run on other datasets because the source code is not released, so we took their results on NDD_DS1 and NDD_DS2 directly from the original papers. DeepPurpose is a deep learning framework for DTI prediction; we used its default setting (CNN embeddings for drugs and targets) for benchmarking and modified it slightly to make it suitable for DDI prediction. For the drug-target interaction task, we compared DeepDrug with RF, LR, MolTrans [49], CPI [48], TransformerCPI [50] and DeepPurpose. Note that we did not evaluate MolTrans, LR and TransformerCPI on the BindingDB dataset due to the time limit (48 hours). For the drug-target affinity regression tasks, we compared DeepDrug with DeepDTA [26], GraphDTA [27] and DeepPurpose.

4.5 Model training and evaluation

The final prediction layer was a linear layer followed by a task-dependent activation function. Specifically, the Sigmoid activation function was used for the binary classification and multi-label classification tasks. The multi-label information was collected from the TwoSides and DrugBank databases, which contain 1317 and 86 categories of interaction types, respectively. The Softmax activation function was used for the multi-class classification task, and no activation function was used for the regression task. Cross-entropy (CE) loss was used in the classification settings and mean squared error (MSE) loss in the regression settings. We used the Adam optimizer with an initial learning rate of 0.01 and a weight decay of $10^{-4}$. The dropout ratio was set to 0.1. DeepDrug was implemented with the PyTorch framework [73]. We used Ray [74] for hyper-parameter searching (Supplementary Note 3).
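The task-dependent choice of output activation can be summarized in a few lines. The sketch below is written in plain Python for clarity rather than PyTorch, and the task labels ("binary", "multi-label", "multi-class", "regression") and function names are assumptions made for this illustration.

```python
import math


def sigmoid(logits):
    """Element-wise logistic function, written to avoid overflow."""
    out = []
    for z in logits:
        if z >= 0:
            out.append(1.0 / (1.0 + math.exp(-z)))
        else:
            ez = math.exp(z)
            out.append(ez / (1.0 + ez))
    return out


def softmax(logits):
    """Normalized exponentials; the max is subtracted for stability."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def output_activation(task, logits):
    """Map the output of the final linear layer to predictions for one task type."""
    if task in ("binary", "multi-label"):
        return sigmoid(logits)   # independent probability per label
    if task == "multi-class":
        return softmax(logits)   # probabilities over exclusive classes
    if task == "regression":
        return list(logits)      # no activation
    raise ValueError(f"unknown task: {task}")


print(output_activation("multi-label", [0.0, 2.0, -2.0]))
print(output_activation("multi-class", [0.0, 2.0, -2.0]))
print(output_activation("regression", [7.3]))
```

Sigmoid suits the TwoSides setting because several side effects can be present at once, whereas Softmax suits the DrugBank setting, in which each pair has exactly one interaction type.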

The F1 score, auROC and auPRC were used to measure performance in the classification tasks. Because the datasets are imbalanced, the macro F1 score and auPRC are the more suitable metrics. For multi-label and multi-class classification, we treated the problem as multiple binary classification tasks, calculated auROC and auPRC for each task individually, and averaged them to obtain the final auROC and auPRC scores. For the regression tasks, we used several metrics to evaluate affinity prediction, including $R^2$, Pearson correlation and the concordance index.
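Of the regression metrics, the concordance index is the least standard, so a small reference sketch may help. The text does not specify which implementation was used; the code below follows the common pairwise definition (ties in the prediction count as one half) and runs in quadratic time, which is adequate only for small examples.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order."""
    concordant = 0.0
    comparable = 0
    for i in range(len(y_true)):
        for j in range(len(y_true)):
            if y_true[i] > y_true[j]:  # pair with distinct true affinities
                comparable += 1
                if y_pred[i] > y_pred[j]:
                    concordant += 1.0
                elif y_pred[i] == y_pred[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


affinities = [5.0, 6.2, 7.1, 8.4]
print(concordance_index(affinities, [0.1, 0.2, 0.3, 0.4]))  # same ordering
print(concordance_index(affinities, [0.4, 0.3, 0.2, 0.1]))  # reversed ordering
```

Because the index depends only on rank order, it complements $R^2$ and Pearson correlation, which are sensitive to the scale of the predictions.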

References

[1]

Bleakley, K. (2009). Supervised prediction of drug-target interactions using bipartite local models. Bioinformatics, 25: 2397–2403

[2]

Zitnik, M., Nguyen, F., Wang, B., Leskovec, J., Goldenberg, A., Hoffman, M. (2019). Machine learning for integrating data in biology and medicine: principles, practice, and opportunities. Inf. Fusion, 50: 71–91

[3]

Boolell, M., Allen, M. J., Ballard, S. A., Gepi-Attee, S., Muirhead, G. J., Naylor, A. M., Osterloh, I. H. (1996). Sildenafil: an orally active type 5 cyclic GMP-specific phosphodiesterase inhibitor for the treatment of penile erectile dysfunction. Int. J. Impot. Res., 8: 47–52

[4]

Jia, J., Zhu, F., Ma, X., Cao, Z., Cao, Z. W., Li, Y., Li, Y. X., Chen, Y. (2009). Mechanisms of drug combinations: interaction and network perspectives. Nat. Rev. Drug Discov., 8: 111–128

[5]

Han, K., Jeng, E. E., Hess, G. T., Morgens, D. W., Li, A., Bassik, M. (2017). Synergistic drug combinations for cancer identified in a CRISPR screen for pairwise genetic interactions. Nat. Biotechnol., 35: 463–474

[6]

Sun, Y., Sheng, Z., Ma, C., Tang, K., Zhu, R., Wu, Z., Shen, R., Feng, J., Wu, D., Huang, D., et al. (2015). Combining genomic and network characteristics for extended capability in predicting synergistic drugs for cancer. Nat. Commun., 6: 8481

[7]

Lazarou, J., Pomeranz, B. H., Corey, P. (1998). Incidence of adverse drug reactions in hospitalized patients: a meta-analysis of prospective studies. JAMA, 279: 1200–1205

[8]

Gallagher, P. F., Barry, P. J., Ryan, C., Hartigan, I. (2008). Inappropriate prescribing in an acutely ill population of elderly patients as determined by Beers’ Criteria. Age Ageing, 37: 96–101

[9]

Meinertz, T. (2001). Mibefradil—a drug which may enhance the propensity for the development of abnormal QT prolongation. Eur. Heart J. Suppl., 3: K89–K92

[10]

Staffa, J. A., Chang, J. (2002). Cerivastatin and reports of fatal rhabdomyolysis. N. Engl. J. Med., 346: 539–540

[11]

Wishart, D. S., Knox, C., Guo, A. C., Shrivastava, S., Hassanali, M., Stothard, P., Chang, Z. (2006). DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res., 34: D668–D672

[12]

Tatonetti, N. P., Ye, P. P., Daneshjou, R., Altman, R. (2012). Data-driven prediction of drug effects and interactions. Sci. Transl. Med., 4: 125ra31

[13]

Burley, S. K., Berman, H. M., Bhikadiya, C., Bi, C., Chen, L., Di Costanzo, L., Christie, C., Dalenberg, K., Duarte, J. M., Dutta, S., et al. (2019). RCSB Protein Data Bank: biological macromolecular structures enabling research and education in fundamental biology, biomedicine, biotechnology and energy. Nucleic Acids Res., 47: D464–D474

[14]

Kim, S., Chen, J., Cheng, T., Gindulyte, A., He, J., He, S., Li, Q., Shoemaker, B. A., Thiessen, P. A., Yu, B., et al. (2019). PubChem 2019 update: improved access to chemical data. Nucleic Acids Res., 47: D1102–D1109

[15]

Rohani, N. (2019). Drug-drug interaction predicting by neural network using integrated similarity. Sci. Rep., 9: 13645

[16]

Ryu, J. Y., Kim, H. U., Lee, S. (2018). Deep learning improves prediction of drug-drug and drug-food interactions. Proc. Natl. Acad. Sci. USA, 115: E4304–E4311

[17]

Liu, Q., Hu, Z., Jiang, R. (2020). DeepCDR: a hybrid graph convolutional network for predicting cancer drug response. Bioinformatics, 36: i911–i918

[18]

Ma, T., Liu, Q., Li, H., Zhou, M., Jiang, R. (2022). DualGCN: a dual graph convolutional network model to predict cancer drug response. BMC Bioinformatics, 23: 129

[19]

Yan, X., Zhang, S., Yiu, S. (2021). Interpretable prediction of drug-cell line response by triple matrix factorization. Quant. Biol., 9: 426–439

[20]

Wang, C. (2020). Survey of similarity-based prediction of drug-protein interactions. Curr. Med. Chem., 27: 5856–5886

[21]

Yamanishi, Y., Araki, M., Gutteridge, A., Honda, W. (2008). Prediction of drug-target interaction networks from the integration of chemical and genomic spaces. Bioinformatics, 24: i232–i240

[22]

Zitnik, M., Agrawal, M. (2018). Modeling polypharmacy side effects with graph convolutional networks. Bioinformatics, 34: i457–i466

[23]

Zhang, T., Leng, J. (2020). Deep learning for drug-drug interaction extraction from the literature: a review. Brief. Bioinform., 21: 1609–1627

[24]

Bagherian, M., Sabeti, E., Wang, K., Sartor, M. A., Nikolovska-Coleska, Z. (2021). Machine learning approaches and databases for prediction of drug-target interaction: a survey paper. Brief. Bioinform., 22: 247–269

[25]

Huang, K., Fu, T., Glass, L. M., Zitnik, M., Xiao, C. (2021). DeepPurpose: a deep learning library for drug-target interaction prediction. Bioinformatics, 36: 5545–5547

[26]

Öztürk, H., Özgür, A., Ozkirimli, E. (2018). DeepDTA: deep drug-target binding affinity prediction. Bioinformatics, 34: i821–i829

[27]

Nguyen, T., Le, H., Quinn, T. P., Nguyen, T., Le, T. D. (2020). GraphDTA: predicting drug–target binding affinity with graph neural networks. bioRxiv, 684662

[28]

Deng, Y., Xu, X., Qiu, Y., Xia, J., Zhang, W. (2020). A multimodal deep learning framework for predicting drug-drug interaction events. Bioinformatics, 36: 4316–4322

[29]

Huang, K., Xiao, C., Glass, L. M., Zitnik, M. (2020). SkipGNN: predicting molecular interactions with skip-graph networks. Sci. Rep., 10: 21092

[30]

Bajusz, D., Rácz, A., Héberger, K. (2015). Why is Tanimoto index an appropriate choice for fingerprint-based similarity calculations? J. Cheminform., 7: 20

[31]

Mousavian, Z. (2014). Drug-target interaction prediction via chemogenomic space: learning-based methods. Expert Opin. Drug Metab. Toxicol., 10: 1273–1287

[32]

Luo, Y., Zhao, X., Zhou, J., Yang, J., Zhang, Y., Kuang, W., Peng, J., Chen, L. (2017). A network integration approach for drug-target interaction prediction and computational drug repositioning from heterogeneous information. Nat. Commun., 8: 573

[33]

Kipf, T. N. (2016). Semi-supervised classification with graph convolutional networks. arXiv, 1609.02907

[34]

Veličković, P., Cucurull, G., Casanova, A., Romero, A., Lio, P. (2017). Graph attention networks. arXiv, 1710.10903

[35]

Li, Y., Tarlow, D., Brockschmidt, M. (2015). Gated graph sequence neural networks. arXiv, 1511.05493

[36]

Bresson, X. (2017). Residual gated graph convnets. arXiv, 1711.07553

[37]

Xu, C., Liu, Q., Huang, M. (2020). Reinforced molecular optimization with neighborhood-controlled grammars. Adv. Neural Inf. Process. Syst., 33: 8366–8377

[38]

Ding, K., Zhou, M., Wang, Z., Liu, Q., Arnold, C. W., Zhang, S., Metaxas, D. (2022). Graph convolutional networks for multi-modality medical imaging: methods, architectures, and clinical applications. arXiv, 2202.08916

[39]

Yin, Q., Liu, Q., Fu, Z., Zeng, W., Zhang, B., Zhang, X., Jiang, R. (2022). scGraph: a graph neural network-based approach to automatically identify cell types. Bioinformatics, 38: 2996–3003

[40]

Duvenaud, D. K., Maclaurin, D., Iparraguirre, J., Bombarell, R., Hirzel, T., Aspuru-Guzik, A., Adams, R. (2015). Convolutional networks on graphs for learning molecular fingerprints. In: Proceedings of the 28th International Conference on Neural Information Processing Systems, pp. 2224–2232

[41]

Fout, A., Byrd, J., Shariat, B. (2017). Protein interface prediction using graph convolutional networks. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6533–6542

[42]

Feng, Q., Dueva, E., Ester, M. (2018). PADME: a deep learning-based framework for drug-target interaction prediction. arXiv, 1807.09741

[43]

Zamora-Resendiz, R. (2019). Structural learning of proteins using graph convolutional neural networks. bioRxiv, 610444

[44]

Schwarz, K., Allam, A., Perez Gonzalez, N. A. (2021). AttentionDDI: Siamese attention-based deep learning method for drug-drug interaction predictions. BMC Bioinformatics, 22: 412

[45]

Xiong, G., Yang, Z., Yi, J., Wang, N., Wang, L., Zhu, H., Wu, C., Lu, A., Chen, X., Liu, S., et al. (2022). DDInter: an online drug-drug interaction database towards improving clinical decision-making and patient safety. Nucleic Acids Res., 50: D1200–D1207

[46]

Bansal, M., Yang, J., Karan, C., Menden, M. P., Costello, J. C., Tang, H., Xiao, G., Li, Y., Allen, J., Zhong, R., et al. (2014). A community computational challenge to predict the activity of pairs of compounds. Nat. Biotechnol., 32: 1213–1222

[47]

Huang, K., Fu, T., Glass, L. M., Zitnik, M., Xiao, C. (2021). DeepPurpose: a deep learning library for drug-target interaction prediction. Bioinformatics, 36: 5545–5547

[48]

Tsubaki, M., Tomii, K. (2019). Compound-protein interaction prediction with end-to-end learning of neural networks for graphs and sequences. Bioinformatics, 35: 309–318

[49]

Huang, K., Xiao, C., Glass, L. M. (2021). MolTrans: molecular interaction transformer for drug-target interaction prediction. Bioinformatics, 37: 830–836

[50]

Chen, L., Tan, X., Wang, D., Zhong, F., Liu, X., Yang, T., Luo, X., Chen, K., Jiang, H. (2020). TransformerCPI: improving compound-protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments. Bioinformatics, 36: 4406–4414

[51]

Herrero-Zazo, M., Segura-Bedmar, I., Martínez, P. (2013). The DDI corpus: an annotated corpus with pharmacological substances and drug-drug interactions. J. Biomed. Inform., 46: 914–920

[52]

Surjit, M., Lal, S. (2008). The SARS-CoV nucleocapsid protein: a protein with multifarious activities. Infect. Genet. Evol., 8: 397–405

[53]

Gordon, D. E., Jang, G. M., Bouhaddou, M., Xu, J., Obernier, K., White, K. M., Meara, M. J., Rezelj, V. V., Guo, J. Z., Swaney, D. L., et al. (2020). A SARS-CoV-2 protein interaction map reveals targets for drug repurposing. Nature, 583: 459–468

[54]

Stukalov, A., Girault, V., Grass, V., Karayel, O., Bergant, V., Urban, C., Haas, D. A., Huang, Y., Oubraham, L., Wang, A., et al. (2021). Multilevel proteomics reveals host perturbations by SARS-CoV-2 and SARS-CoV. Nature, 594: 246–252

[55]

Kang, K., Kim, H. H. (2020). Tiotropium is predicted to be a promising drug for COVID-19 through transcriptome-based comprehensive molecular pathway analysis. Viruses, 12: 776

[56]

Chen, H., Zhang, Z., Wang, L., Huang, Z., Gong, F., Li, X., Chen, Y., Wu, J. (2020). First clinical study using HCV protease inhibitor danoprevir to treat COVID-19 patients. Medicine (Baltimore), 99: e23357

[57]

Wang, Z., Liu, M., Luo, Y., Xu, Z., Xie, Y., Wang, L., Cai, L., Qi, Q., Yuan, Z., Yang, T., et al. (2022). Advanced graph and sequence neural networks for molecular property prediction and drug discovery. Bioinformatics, 38: 2579–2586

[58]

Chen, X., Chen, S., Song, S., Gao, Z., Hou, L., Zhang, X., Lv, H. (2022). Cell type annotation of single-cell chromatin accessibility data via supervised bayesian embedding. Nat. Mach. Intell., 4: 116–126

[59]

Liu, Q., Chen, S., Jiang, R., Wong, W. (2021). Simultaneous deep generative modeling and clustering of single cell genomic data. Nat. Mach. Intell., 3: 536–544

[60]

Duren, Z., Chang, F., Naqing, F., Xin, J., Liu, Q., Wong, W. (2022). Regulatory analysis of single cell multiome gene expression and chromatin accessibility data with scREG. Genome Biol., 23: 114

[61]

Yin, Q., Wu, M., Liu, Q., Lv, H. (2019). DeepHistone: a deep learning approach to predicting histone modifications. BMC Genomics, 20: 193

[62]

Lance, C., Luecken, M. D., Burkhardt, D. B., Cannoodt, R., Rautenstrauch, P., Laddach, A., Ubingazhibov, A., Cao, Z., Deng, K., Khan, S., et al. (2022). Multimodal single cell data integration challenge: results and lessons learned. In: Proceedings of the NeurIPS 2021 Competitions and Demonstrations Track, pp. 162–176

[63]

Liu, Q., Hua, K., Zhang, X., Wong, W. H. (2022). Deepcage: incorporating transcription factors in genome-wide prediction of chromatin accessibility. Genomics Proteomics Bioinformatics, 20: 496–507

[64]

Liu, Q., Chen, Z., Wong, W. (2022). CausalEGM: a general causal inference framework by encoding generative modeling. arXiv, 2212.05925

[65]

Zeng, W., Liu, Q., Yin, Q., Jiang, R., Wong, W. (2023). HiChIPdb: a comprehensive database of HiChIP regulatory interactions. Nucleic Acids Res., 51: D159–D166

[66]

Chen, S., Liu, Q., Cui, X., Feng, Z., Li, C., Wang, X., Zhang, X., Wang, Y. (2021). OpenAnnotate: a web server to annotate the chromatin accessibility of genomic regions. Nucleic Acids Res., 49: W483–W490

[67]

Davis, A. P., Wiegers, T. C., Johnson, R. J., Sciaky, D., Wiegers, J., Mattingly, C. (2023). Comparative Toxicogenomics Database (CTD): update 2023. Nucleic Acids Res., 51: D1257–D1262

[68]

Ramsundar, B., Eastman, P., Walters, P. (2019). Deep Learning for the Life Sciences: Applying Deep Learning to Genomics, Microscopy, Drug Discovery, and More. Sebastopol, CA: O’Reilly Media

[69]

Minhas, F., Geiss, B. J. (2014). PAIRpred: partner-specific prediction of interacting residues from sequence and structure. Proteins, 82: 1142–1155

[70]

Li, G., Xiong, C., Thabet, A. (2020). DeeperGCN: all you need to train deeper GCNs. arXiv, 2006.07739

[71]

Ba, J. L., Kiros, J. R., Hinton, G. (2016). Layer normalization. arXiv, 1607.06450

[72]

Gilson, M. K., Liu, T., Baitaluk, M., Nicola, G., Hwang, L. (2016). BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res., 44: D1045–D1053

[73]

Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N. (2019). PyTorch: an imperative style, high-performance deep learning library. In: Proceedings of the 33rd International Conference on Neural Information Processing Systems, pp. 8026–8037

[74]

Moritz, P., Nishihara, R., Wang, S., Tumanov, A., Liaw, R., Liang, E., Elibol, M., Yang, Z., Paul, W., Jordan, M. (2018). Ray: a distributed framework for emerging AI applications. In: 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pp. 561–577

RIGHTS & PERMISSIONS

The Author(s). Published by Higher Education Press.

Supplementary files

QB-22320-OF-JR_suppl_1
