School of Computer Science, Shaanxi Normal University, Xi’an 710119, China
xjlei@snnu.edu.cn
Received: 2023-10-31
Accepted: 2024-05-26
Published: 2025-05-15
Issue Date
Revised Date
2024-05-29
Abstract
Discovering new drugs is a complicated, time-consuming, costly, risky and failure-prone process. About 80% of the drugs approved so far act on protein targets, and 99% of them target only specific proteins. This means that a large number of protein targets are still considered “useless”. By exploring miRNAs as potential therapeutic targets, we can expand the range of target selection and improve the efficiency of drug development. It is therefore of great significance to search for potential miRNA-drug interactions (MDIs) with reasonable computational methods. In this paper, we propose MDIDCN, a dual-channel network model based on a Temporal Convolutional Network (TCN) and Bi-directional Long Short-Term Memory (BiLSTM), to predict MDIs. First, we represented the known interactions between miRNAs and drugs as a bipartite network and applied the graph embedding technique BiNE to learn the topological features of both. Second, we used TCN to learn from the MACCS fingerprints of drugs and BiLSTM to learn from the k-mers of miRNAs, and concatenated the topological and structural features of each as its fusion features. Finally, the fusion features of miRNAs and drugs underwent max-pooling and were fed into a Softmax layer to obtain prediction scores, from which potential miRNA-drug interaction pairs were inferred. The prediction performance of the model was evaluated on three datasets using 5-fold cross-validation, and the average AUCs were 0.9567, 0.9365, and 0.8975, respectively. In addition, case studies on the drug Gemcitabine and the miRNA hsa-miR-155-5p showed that the model had high accuracy and reliability. In conclusion, MDIDCN can accurately and efficiently predict MDIs, which has important implications for drug development.
Discovering new drugs is a complicated, time-consuming, costly, risky and failure-prone process. For a long time, the pharmaceutical industry has focused on proteins as therapeutic targets and has devoted significant time and resources to exploring their drug response. About 80% of the drugs approved so far act on protein targets, and 99% of them target only specific proteins. This means that a large number of protein targets are still considered ‘useless’ [1]. To solve this problem, some researchers have turned their attention to other biomolecules, such as microRNAs (miRNAs). By exploring miRNAs as potential therapeutic targets, we can expand the range of target selection and improve the efficiency of drug development. Shifting the focus of research from traditional proteins to non-protein targets such as miRNAs is therefore expected to bring new opportunities and breakthroughs to the pharmaceutical industry. miRNAs are endogenous non-coding RNAs about 20 nucleotides long that are found in humans, plants, animals and viruses; they play an important role in gene regulation and disease development and are involved in many basic functions of organism growth and development [2]. miRNAs regulate multiple cellular pathways [3] and perform their functions by binding to complementary sequences of mRNA molecules. In recent years, many experiments have shown that miRNAs play an important role as biomarkers and therapeutic targets for chronic diseases [4,5]. The unique secondary structure and conserved sequence of miRNAs also make them suitable as drug targets [6].
Because experimental methods for finding relationships between drugs and miRNAs are expensive, time-consuming and labor-intensive, and have a high failure rate, scientifically sound computational methods for identifying relationships between new drugs and miRNAs are urgently needed [7].
In bioinformatics, several predictive methods have been proposed to infer potential interactions between miRNAs and drugs. Wang et al. [8] designed a drug-miRNA association (DMA) prediction model called RFDMA, which combined the similarity between miRNAs and drugs and used a random forest algorithm to predict DMAs. Zhao et al. [9] proposed a model called SNMFDMA, which did not use the similarity matrices of drugs and miRNAs directly but generated new similarity matrices from them by symmetric non-negative matrix factorization. The Kronecker product of the new similarity matrices was then used as the similarity between drugs and miRNAs, and the regularized least squares method was used to predict the interactions between them. This approach differs from traditional predictive models and can more accurately predict potential links between drugs and miRNAs. Qu et al. [10] proposed a method that used functional similarities of miRNAs and multiple similarity measures of small molecules (SMs) to predict interactions between SMs and miRNAs. Their method used a heterogeneous network to calculate the correlation of SMs and miRNAs in the resultant network. This combined use of different types of biosignatures allows for more accurate predictions of potential interactions between SMs and miRNAs. Wang et al. [11] proposed a novel computational method called DCMF, which combined the network information of SMs and miRNAs to predict potential SM-miRNA associations. Zhang et al. [12] proposed SGNNMD, a signed graph neural network for predicting deregulation types of miRNA-disease associations, which leveraged the structural information learned from subgraphs around miRNA-disease pairs as well as the biological information of miRNAs and diseases. Wang et al. [13] employed a stacked autoencoder model called SAEMDA to predict potential miRNA-disease associations. Zheng et al. [14] proposed NASMDR, a framework for predicting miRNA-drug resistance through efficient neural architecture search and graph isomorphism networks. They also proposed a new sequence characterization method, k-mer sparse non-negative matrix factorization (KSNMF). Ma et al. [15] used the SFGAE model to predict miRNA-disease associations, which employed a graph encoder with an attention mechanism to concatenate aggregated feature embeddings and self-feature embeddings, and a bilinear decoder to predict associations. Wang et al. [16] proposed TSPN to predict potential SM–miRNA associations by minimizing the non-convex truncated Schatten p-norm.
Neural networks have been widely used in bioinformatics, for example to predict RBP binding sites on circRNAs [17-19], drug-drug interactions [20], molecular toxicity [21], and metabolite-disease associations [22]. Therefore, in this paper, TCN, a neural network variant, was applied to extract the fingerprint features of drugs, and BiLSTM was used to extract the k-mer features of miRNAs. This enabled us to extract key features of drugs and miRNAs and, in turn, to predict MDIs.
In this work, we present a new computational method called MDIDCN that uses drug SMILES sequences, miRNA nucleotide sequences, and miRNA-drug interaction networks to predict miRNA-drug interactions. First, a bipartite network was established to represent the interactions between drugs and miRNAs, and vectors were used to represent the chemical substructures of the drugs and the sequences of the miRNAs. The structural features of drugs were represented by MACCS fingerprints, and the structural features of miRNAs were represented by k-mers [23,24]. Second, since graph embedding methods are commonly used to reveal complex features of each entity [25,26], we constructed a bipartite graph from the known MDIs, with miRNAs and drugs as its nodes, and applied the graph embedding technique BiNE [27] to learn the topological features of the drug and miRNA nodes. Then, we used TCN [28], a temporal modeling method based on convolutional neural networks (CNNs) proposed by Bai et al., to learn latent embedding vectors of the drug fingerprints, and BiLSTM [29] to learn latent embedding vectors of the miRNA structural features. The learned structural and topological features were concatenated as the fusion features of drugs and miRNAs. Finally, we fed the fusion features into the Softmax layer to obtain miRNA-drug interaction scores, thereby inferring potential MDI pairs. Experiments showed that MDIDCN is stable and reliable.
Overall, our contributions are summarized as follows:
(1) We propose a dual-channel miRNA-drug interactions prediction model, MDIDCN.
(2) We use BiLSTM to learn the k-mer features of miRNAs, which allows the model to better capture the interactions and dependencies between different positions in the sequence. We also use TCN to learn the fingerprint features of drugs, which enables the model to better characterize the structure of drug molecules.
(3) We concatenate sequence features and topological features to form the fusion features of drugs and of miRNAs, respectively, further improving the model's ability to represent and predict MDIs.
2 Materials and methods
2.1 Dataset
Many datasets on MDIs are available. Guan et al. [30] used five datasets; following their work, we selected three of them, namely the database of non-coding RNAs involved in drug resistance (ncDR) [31], the RNA interaction dataset (RNAInter) [32], and the database of validated small molecules’ effects on miRNA expression (SM2miR3) [33]. Guan et al. collected only the miRNA-drug interaction pairs of Homo sapiens from these three databases, preprocessed them by removing redundant and duplicate entries, and took the known MDI pairs as positive samples. An equal number of unverified MDI pairs were randomly selected as negative samples. The details of the three datasets are shown in Tab.1. The sparsity ratio is defined as the ratio of known interactions to all possible miRNA-drug pairs.
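The sparsity ratio and the balanced negative sampling described above can be illustrated with a small sketch. It is not the preprocessing code of Guan et al.; the function names, the toy identifiers, and the fixed random seed are assumptions made for illustration only.

```python
import random

def sparsity_ratio(n_known, n_drugs, n_mirnas):
    """Known interactions divided by all possible miRNA-drug pairs."""
    return n_known / (n_drugs * n_mirnas)

def sample_negatives(positives, drugs, mirnas, seed=0):
    """Randomly draw as many unobserved (drug, miRNA) pairs as there are positives."""
    rng = random.Random(seed)
    known = set(positives)
    candidates = [(d, m) for d in drugs for m in mirnas if (d, m) not in known]
    if len(candidates) < len(known):
        raise ValueError("not enough unobserved pairs to balance the positives")
    return rng.sample(candidates, len(known))

# Toy data (hypothetical identifiers, not taken from ncDR, RNAInter or SM2miR3).
drugs = ["d1", "d2", "d3"]
mirnas = ["m1", "m2", "m3", "m4"]
positives = [("d1", "m1"), ("d2", "m3"), ("d3", "m4")]

negatives = sample_negatives(positives, drugs, mirnas)
ratio = sparsity_ratio(len(positives), len(drugs), len(mirnas))  # 3 / 12
```

Note that unobserved pairs are only assumed to be negative; some of them may be true interactions that have not yet been verified.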
miRNA sequences and drug SMILES strings were collected from miRBase [34] and PubChem [35], respectively. A SMILES [36] string is a standardized representation that explicitly describes a molecular structure as an ASCII string, using a series of characters and symbols to express the atoms, bonds, and stereochemistry of a molecule.
2.2 Workflow
In order to predict unknown MDIs, we proposed a model named MDIDCN, which mainly consists of three steps: feature extraction, feature fusion and prediction.
As shown in Fig.1, we extract the fusion features of drugs and miRNAs via a dual-channel network. First, TCN and BiLSTM are used to extract the structural features of drugs and miRNAs, respectively, while the BiNE method is used to extract the topological features of both. Second, the structural and topological features of drugs are concatenated as the fusion features of drugs, and the structural and topological features of miRNAs are concatenated as the fusion features of miRNAs. Finally, the fusion features of each are reduced in dimension by a max-pooling layer to retain the most significant feature information, and the result is converted by the Softmax function into a probability distribution that gives the final miRNA-drug interaction scores.
2.3 Representing drugs by MACCS fingerprints
In earlier studies, a variety of features were developed to describe the chemical structure of drug compounds, including geometric structure, topological structure, and quantum chemical properties [37]. Typically, fingerprints based on substructure keys are used as descriptors of chemical structures. Substructure fingerprinting encodes a molecular structure as a fixed-length bit string without involving three-dimensional structural information. Here, we use molecular substructure fingerprints to describe drugs: they encode a chemical structure directly into a binary vector whose bits indicate the presence or absence of specific substructures in the drug. Fig.2 shows an example of MACCS fingerprints for the drug Arsenolite. Specifically, we create a dictionary that contains a list of substructure features represented in SMARTS form. SMARTS is an extension of SMILES for specifying substructure patterns. We then compare each entry in the dictionary with the given molecule. If the molecule contains the substructure, the corresponding fingerprint bit is set to 1; otherwise it is set to 0. Finally, we characterize each drug with a vector of 166 Boolean values.
2.4 Extract structural features of drugs
TCN is specifically designed to process data with a time series structure, and it performs well in a variety of sequence modeling tasks, such as speech recognition, natural language processing, action recognition, and time series prediction. The core idea of TCN is to apply convolution operations to sequential data so as to capture long-range dependencies through local receptive fields and parameter sharing. Unlike RNNs, TCN places no requirement on the sequence length of the input and output, and can handle both variable-length and fixed-length time series data. TCN can also process the entire sequence in parallel and avoids the vanishing or exploding gradients to which RNNs are prone on long sequences. The MACCS fingerprint of a drug can be viewed as a binary sequence in which each bit indicates the presence or absence of a particular structural or functional group. Adjacent bits in the fingerprint may be correlated, and the convolution operation of TCN can effectively capture this local correlation. We therefore use the dilated non-causal convolutions of TCN to extract the fingerprint features of drugs.
As shown in Fig.2, long-range dependencies are captured by using different dilation factors, so that the molecular structural features of the drug are represented more fully. At any time step, the dilated non-causal convolution involves the previous step, the current step, and the next step: the features of the current step are mixed with those of the previous and subsequent steps, taking full advantage of the dependencies in both directions. This enables the model to obtain more comprehensive context information and improves its ability to understand and represent the sequence information of drug molecules:
$$\hat{S}_t^{(l)} = f\left(W^{(1)} S_{t-d}^{(l-1)} + W^{(2)} S_t^{(l-1)} + W^{(3)} S_{t+d}^{(l-1)} + b\right) \tag{1}$$
where $\hat{S}_t^{(l)}$ denotes the result of the dilated non-causal convolution at time $t$. The activations in the $l$th layer are given by $S^{(l)}$. Note that each layer has an equal number of filters $F_w$ with the kernel size of 10. Each layer consists of a set of dilated convolutions with the rate parameter $d$ and non-linearity function $f$. The filters are parameterized by $W = \{W^{(1)}, W^{(2)}, W^{(3)}\}$ with $W^{(i)} \in \mathbb{R}^{F_w \times F_w}$ and the bias vector $b \in \mathbb{R}^{F_w}$. To avoid vanishing or exploding gradients, we further use residual connections to facilitate gradient flow, and $\hat{S}_t^{(l)}$ is converted as follows:
$$S_t^{(l)} = S_t^{(l-1)} + V \hat{S}_t^{(l)} + e \tag{2}$$
where $V$ and $e$ are a set of weights and biases for the residual connection, respectively. Because the receptive field grows with the dilated convolutions, the model exhibits substantially longer memory and is less prone to overfitting, without a drastic increase in the number of parameters in the sequence modeling [28]. The receptive field at each layer is calculated as below:
where l represents a vector containing the dilations and is the length of filter.
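The dilated non-causal convolution with a residual connection described in this section can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: it assumes a single scalar channel, three taps at steps t-d, t and t+d with zero padding, tanh as the non-linearity f, and hypothetical function names and weights. The `receptive_field` helper uses the standard result for a stack with one symmetric dilated convolution per layer, which is also an assumption here.

```python
import math

def dilated_noncausal_conv(x, weights, bias, dilation):
    """Three-tap convolution over steps t-d, t, t+d, zero-padded at the borders."""
    w_prev, w_cur, w_next = weights
    n = len(x)
    out = []
    for t in range(n):
        prev = x[t - dilation] if t - dilation >= 0 else 0.0
        nxt = x[t + dilation] if t + dilation < n else 0.0
        out.append(math.tanh(w_prev * prev + w_cur * x[t] + w_next * nxt + bias))
    return out

def residual_layer(x, weights, bias, dilation, v=1.0, e=0.0):
    """Add the layer input to a linear transform (v, e) of the convolution output."""
    conv = dilated_noncausal_conv(x, weights, bias, dilation)
    return [xi + v * ci + e for xi, ci in zip(x, conv)]

def receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated convolutions, one convolution per layer."""
    return 1 + (kernel_size - 1) * sum(dilations)

# A short stretch of fingerprint bits passed through three layers (toy weights).
bits = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0]
h = bits
for d in (1, 2, 4):
    h = residual_layer(h, weights=(0.5, 1.0, 0.5), bias=0.0, dilation=d)
```

Doubling the dilation from layer to layer lets each output position depend on a widening window of fingerprint bits while the number of weights per layer stays fixed.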
2.5 Representing miRNA structural features by k-mers
k-mer is a feature representation method for sequence fragments that is widely used in bioinformatics. In this paper, 3-mers (substrings of length 3) are used to represent the structural features of miRNAs. Each miRNA sequence is divided into overlapping substrings of length 3, that is, every three consecutive nucleotides are extracted as a 3-mer. The 3-mer principle is shown at the bottom of Fig.1. For example, if the miRNA sequence is “CCAGCUCG…CACU”, the following 3-mer fragments can be obtained: “CCA”, “CAG”, “AGC”, … , “ACU”. Since miRNA sequences are composed of four bases (A, G, C, U), there are $4^3 = 64$ possible 3-mer patterns. These patterns are extracted sequentially with a sliding window, starting from the first nucleotide of the miRNA. For each 3-mer pattern, its normalized frequency in the miRNA sequence is calculated as the number of occurrences of the 3-mer divided by the total length of the miRNA sequence; it represents the relative importance of the 3-mer in the sequence. Finally, a miRNA feature vector of length $4^3 = 64$ is obtained, where each position corresponds to a different 3-mer pattern and holds its normalized frequency.
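The 3-mer encoding can be written compactly as follows. The sketch follows the description above (sliding window of step 1, counts divided by the sequence length); the function name and the ordering of the 64 patterns are assumptions, since the ordering used by the authors is not stated.

```python
from itertools import product

BASES = "ACGU"

def kmer_features(seq, k=3):
    """Normalized k-mer frequency vector of an RNA sequence (length 4**k)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    kmers = ["".join(p) for p in product(BASES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):      # slide the window one nucleotide at a time
        kmer = seq[i:i + k]
        if kmer in counts:                 # skip windows with non-ACGU symbols
            counts[kmer] += 1
    return [counts[kmer] / len(seq) for kmer in kmers]

vec = kmer_features("CCAGCUCG")
pattern_index = ["".join(p) for p in product(BASES, repeat=3)].index("CCA")
```

For the 8-nucleotide toy sequence, the six windows CCA, CAG, AGC, GCU, CUC and UCG each occur once, so each of the corresponding entries is 1/8 and all other entries are 0.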
2.6 Extract miRNA structural features
BiLSTM is a variant of RNN that is widely used in natural language processing and sequence modeling tasks. Compared with a traditional unidirectional LSTM, BiLSTM can leverage both past and future context to better capture long-term dependencies in a sequence. There are complex long-term dependencies among the nucleotides in miRNA sequences, which are essential for accurately predicting the function and structure of miRNAs. BiLSTM processes the k-mer sequence of a miRNA in both the forward and the reverse direction, enabling more accurate capture of local features. As shown in Fig.3, BiLSTM consists of two independent LSTMs, called the forward LSTM and the backward LSTM: the forward LSTM reads the input from left to right, and the backward LSTM reads it from right to left. The output at each time step is the concatenation of the outputs of the two LSTMs. This representation contains contextual information about both the past and the future of each k-mer, so it has richer feature representation capabilities. For each LSTM cell at time step t, the following operations are performed:
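$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i) \tag{4}$$
$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f) \tag{5}$$
$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o) \tag{6}$$
$$\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c) \tag{7}$$
$$c_t = f_t \cdot c_{t-1} + i_t \cdot \tilde{c}_t \tag{8}$$
$$h_t = o_t \cdot \tanh(c_t) \tag{9}$$
where $i_t$, $f_t$, and $o_t$ respectively represent the three gating mechanisms used in the LSTM at each time step: input gate, forget gate, and output gate. These gating mechanisms help determine whether to retain, update, and output memorized information. $W$ and $U$ represent the weights of the input $x_t$ and of the output $h_{t-1}$ of the previous unit, respectively. $b$ is the bias term. $\sigma$ represents the sigmoid function. $\tilde{c}_t$ represents a new value that can be added to the storage unit $c_t$. $h_t$ indicates the output. The · operator represents the element-wise product.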
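To make the gating and the bidirectional concatenation concrete, the following sketch runs a standard LSTM cell with a scalar hidden state in both directions. It is an illustration under simplifying assumptions (one-dimensional input and hidden state, hand-set weights, hypothetical parameter names such as `wi` and `ui`), not the trained network used in MDIDCN.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One step of a standard LSTM cell with scalar input and hidden state."""
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])        # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])        # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])        # output gate
    c_new = math.tanh(p["wc"] * x + p["uc"] * h_prev + p["bc"])  # candidate value
    c = f * c_prev + i * c_new                                   # updated cell state
    h = o * math.tanh(c)                                         # output
    return h, c

def run_lstm(xs, p):
    h, c, outs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        outs.append(h)
    return outs

def bilstm(xs, p_fwd, p_bwd):
    """Pair the forward output with the backward output at every time step."""
    fwd = run_lstm(xs, p_fwd)
    bwd = run_lstm(xs[::-1], p_bwd)[::-1]   # read right to left, then realign
    return list(zip(fwd, bwd))

names = ("wi", "ui", "bi", "wf", "uf", "bf", "wo", "uo", "bo", "wc", "uc", "bc")
params = {name: 0.5 for name in names}
outputs = bilstm([0.125, 0.0, 0.125], params, params)
```

Each element of `outputs` is a pair (forward state, backward state), so every position carries information from both the preceding and the following k-mer frequencies.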
2.7 Extract topological features by BiNE
The challenge of predicting MDIs can be viewed as a link prediction problem on a heterogeneous graph. In this study, a heterogeneous graph $G = (D, M, E)$ is constructed, which contains two types of nodes: drug nodes $D = \{d_1, \dots, d_{|D|}\}$ ($|D|$ is the number of drugs in the dataset) and miRNA nodes $M = \{m_1, \dots, m_{|M|}\}$ ($|M|$ is the number of miRNAs in the dataset). The edge set $E \subseteq D \times M$ indicates the known interactions between drugs and miRNAs. Matrix $W = [w_{ij}]$ represents the weight of the edge between drug $d_i$ and miRNA $m_j$: if there is an interaction between drug $d_i$ and miRNA $m_j$, $w_{ij} = 1$, otherwise $w_{ij} = 0$. In this section, the BiNE graph embedding method proposed by Gao et al. [27] is used to learn a low-dimensional representation vector for each node so as to preserve the topological information of the graph and the sequences of the nodes. BiNE encodes each node in the graph and embeds the nodes into a low-dimensional space, where each node is represented as a dense embedding vector. The problem can be defined as:
● Input: A bipartite network $G = (D, M, E)$ and its weight matrix $W$.
● Output: A map function $f: D \cup M \rightarrow \mathbb{R}^h$, which maps each vertex in $G$ to an $h$-dimensional embedding vector.
In the embedding space, both the implicit relationships between nodes of the same type and the explicit relationships between nodes of different types need to be preserved. Therefore, this method constructs a joint optimization framework composed of three weighted objective functions: one for the explicit relations and two for the implicit relations.
2.7.1 Explicit relational model
In order to represent the explicit relationships, we compute the local proximity between nodes of different types within the bipartite network, inspired by LINE [38]. The joint probability between two connected vertices $d_i$ and $m_j$ is defined as:
$$P(i, j) = \frac{w_{ij}}{\sum_{e_{st} \in E} w_{st}} \tag{10}$$
where $w_{ij}$ is the weight of the edge $e_{ij}$ between the vertices.
Inspired by the word2vec principle, BiNE uses inner products to model the local similarity between two different kinds of nodes in the embedding space [39], and uses the sigmoid function to map the interaction values into the probability space. The joint probability of two types of nodes in the embedding space is defined as follows:
$$\hat{P}(i, j) = \frac{1}{1 + \exp(-\mathbf{d}_i^{T} \mathbf{m}_j)} \tag{11}$$
where $\mathbf{d}_i$ and $\mathbf{m}_j$ are the embedding vectors of drug $d_i$ and miRNA $m_j$, respectively.
KL-divergence measures the difference between two probability distributions, and is used here to measure the difference between the empirical distribution and the reconstructed distribution of the co-occurrence probability between vertices. By minimizing this difference, we can learn more accurate embedding vectors. The first part of the joint optimization framework is defined as follows:
$$\text{minimize } O_1 = KL(P \,\|\, \hat{P}) = \sum_{e_{ij} \in E} P(i, j) \log \frac{P(i, j)}{\hat{P}(i, j)} \propto -\sum_{e_{ij} \in E} w_{ij} \log \hat{P}(i, j) \tag{12}$$
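The explicit-relation objective can be evaluated directly on a small example. The sketch below assumes the BiNE formulation of Gao et al.: the empirical probability of an edge is its weight divided by the total edge weight, the embedded probability is the sigmoid of the inner product of the two embedding vectors, and the loss sums over observed edges only. Function names and the toy embeddings are hypothetical.

```python
import math

def empirical_prob(W):
    """Edge weight divided by the total weight of all edges."""
    total = sum(sum(row) for row in W)
    return [[w / total for w in row] for row in W]

def embedded_prob(d_vec, m_vec):
    """Sigmoid of the inner product of a drug and a miRNA embedding."""
    dot = sum(a * b for a, b in zip(d_vec, m_vec))
    return 1.0 / (1.0 + math.exp(-dot))

def explicit_loss(W, D_emb, M_emb):
    """Sum over observed edges of P * log(P / P_hat)."""
    P = empirical_prob(W)
    loss = 0.0
    for i, row in enumerate(P):
        for j, p in enumerate(row):
            if p > 0:
                loss += p * math.log(p / embedded_prob(D_emb[i], M_emb[j]))
    return loss

W = [[1, 0],
     [0, 1]]                                   # two drugs, two miRNAs, two edges
flat = [[0.0, 0.0], [0.0, 0.0]]                # uninformative embeddings
aligned = [[2.0, 0.0], [0.0, 2.0]]             # embeddings aligned with the edges
loss_flat = explicit_loss(W, flat, flat)
loss_aligned = explicit_loss(W, aligned, aligned)
```

The loss decreases when the embeddings of interacting drugs and miRNAs have larger inner products, which is the behavior the optimization exploits.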
2.7.2 Implicit relational model
In a bipartite network, there are explicit relationships between nodes of different types and implicit relationships between nodes of the same type. For two vertices of the same type, if there is a path between them, then there should be some implicit relationship between them. Research on recommendation systems shows that not only explicit but also implicit relationships help to discover potential information in heterogeneous graphs [40,41]. This implies that nodes of the same type within a bipartite network do not have direct connections, yet they still hold a wealth of valuable information, which emphasizes the importance of modeling implicit relationships between nodes of the same type. It is a common method to transform the network into a vertex sequence corpus by random walks on the network, which has been adopted by some homogeneous network embedding methods [42,43]. However, direct random walks on bipartite networks may fail because random walks on bipartite networks do not have a stationary distribution due to periodicity problems [44]. To reveal the second proximity of the heterogeneous graph, BiNE employs co-HITS [45] to construct two weighted homogeneous networks: a drug-drug network and a miRNA-miRNA network, and performs a random walk on the two homogeneous networks to encode the higher-order implicit relationships of the original network. According to co-HITS, the correlation coefficient between two nodes can be defined as:
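$$w_{ij}^{D} = \sum_{k \in M} w_{ik} w_{jk}, \qquad w_{ij}^{M} = \sum_{k \in D} w_{ki} w_{kj} \tag{13}$$
where $w_{ij}$ is the weight of the edge $e_{ij}$. Therefore, we can use the matrix $W^{D} = [w_{ij}^{D}]$ with dimension $|D| \times |D|$ and the matrix $W^{M} = [w_{ij}^{M}]$ with dimension $|M| \times |M|$ to represent the two induced homogeneous networks, respectively. Let the matrix $W$ be the MDI bipartite matrix with dimension $|D| \times |M|$; the drug-drug network is then denoted by the matrix $W^{D} = W W^{T}$ with dimension $|D| \times |D|$, and the miRNA-miRNA network is denoted by the matrix $W^{M} = W^{T} W$ with dimension $|M| \times |M|$.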
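The two induced homogeneous networks can be computed from the bipartite matrix alone. The sketch assumes, as in BiNE, that the weight between two nodes of the same type is the sum over shared neighbors of the products of their edge weights, which amounts to multiplying the bipartite matrix by its transpose. The function name `project` is hypothetical.

```python
def project(W):
    """Return the drug-drug matrix W W^T and the miRNA-miRNA matrix W^T W."""
    n_d, n_m = len(W), len(W[0])
    WD = [[sum(W[i][k] * W[j][k] for k in range(n_m)) for j in range(n_d)]
          for i in range(n_d)]
    WM = [[sum(W[k][i] * W[k][j] for k in range(n_d)) for j in range(n_m)]
          for i in range(n_m)]
    return WD, WM

# Two drugs (rows) and three miRNAs (columns); both drugs interact with miRNA 2.
W = [[1, 1, 0],
     [0, 1, 1]]
WD, WM = project(W)
```

Here the off-diagonal entry of the drug-drug matrix is 1 because the two drugs share exactly one miRNA, even though no drug-drug edge exists in the bipartite network.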
In order to learn higher-order implicit relationships in bipartite networks, Gao et al. used the truncated random walks method to generate corpora on two homogeneous networks. However, previous corpora generated using DeepWalk may not fully capture the features of real-world networks. To solve this problem, a new biased and self-adaptive random walk generator is introduced to generate a higher fidelity corpus. The generator can maintain the distribution features of vertices in the bipartite network, so as to reflect the features of the real network more accurately. Its core ideas are as follows:
First, the number of random walks starting from each vertex is determined by its importance, which is measured in terms of centrality: the higher the centrality of a vertex, the more random walks start from it. HITS [46] is used to measure the centrality of nodes in a homogeneous network.
Second, a probability of stopping the random walk is assigned to each step. Unlike DeepWalk and other methods that apply fixed lengths to random walks [47], this method generates vertex sequences of variable length, which better resemble the variable-length sentences of natural language. Allowing sequences of variable length makes it possible to better capture the complexity and diversity of real networks.
Through the above steps, we obtain two sequence corpora, and the process of generating sequences follows the “rich get richer” principle, which better reflects how nodes are connected in real networks. By flexibly generating variable-length vertex sequences and accounting for this enrichment phenomenon, the method improves the quality of the generated corpus and reflects the features and structure of the real network more accurately.
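The two ideas behind the walk generator, more walks from more central vertices and a per-step stopping probability, can be sketched as follows. This is an illustrative simplification: the centrality scores are supplied by the caller rather than computed with HITS, and the names `biased_walks`, `max_walks` and `stop_prob` as well as their default values are assumptions.

```python
import random

def biased_walks(adj, centrality, max_walks=4, stop_prob=0.3, seed=0):
    """Generate variable-length walks; central vertices start more walks."""
    rng = random.Random(seed)
    top = max(centrality.values())
    corpus = []
    for node, score in centrality.items():
        n_walks = max(1, round(max_walks * score / top))
        for _ in range(n_walks):
            walk = [node]
            # Continue with probability 1 - stop_prob at every step.
            while adj[walk[-1]] and rng.random() > stop_prob:
                neighbors = list(adj[walk[-1]])
                weights = [adj[walk[-1]][n] for n in neighbors]
                walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
            corpus.append(walk)
    return corpus

# Toy weighted drug-drug network and centrality scores (hypothetical values).
adj = {"d1": {"d2": 2.0, "d3": 1.0}, "d2": {"d1": 2.0}, "d3": {"d1": 1.0}}
centrality = {"d1": 1.0, "d2": 0.5, "d3": 0.25}
corpus = biased_walks(adj, centrality)
```

The resulting corpus can then be fed to a Skip-gram model in the same way as sentences of words.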
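Next, we used the Skip-gram model [48] to learn the node embedding vectors from the sequence corpora of drugs and miRNAs, respectively. If two nodes frequently appear together in the context of the same sequence, they are assigned similar embedding vectors. In order to learn the implicit relations from the truncated random walk samples of the corpora, two objective functions are defined that preserve the high-order similarity by maximizing the conditional probability. The corpus objective function of the drug homogeneous network is defined as follows:
$$\text{maximize } O_2 = \prod_{d_i \in S \,\wedge\, S \in \Omega^{D}} \; \prod_{d_c \in C_S(d_i)} P(d_c \mid d_i) \tag{14}$$
where $\Omega^{D}$ is the drug corpus and $C_S(d_i)$ represents the context of node $d_i$ in a sequence $S$.
Similarly, the corpus objective function of the miRNA homogeneous network can be obtained as follows:
$$\text{maximize } O_3 = \prod_{m_j \in S \,\wedge\, S \in \Omega^{M}} \; \prod_{m_c \in C_S(m_j)} P(m_c \mid m_j) \tag{15}$$
where $\Omega^{M}$ is the miRNA corpus and $C_S(m_j)$ represents the context of node $m_j$ in a sequence $S$.
Following existing neural embedding methods [38,42,43], we use the inner product kernel to parameterize the two conditional probabilities $P(d_c \mid d_i)$ and $P(m_c \mid m_j)$, and use Softmax for the output:
$$P(d_c \mid d_i) = \frac{\exp(\mathbf{d}_i^{T} \boldsymbol{\theta}_c)}{\sum_{k=1}^{|D|} \exp(\mathbf{d}_i^{T} \boldsymbol{\theta}_k)}, \qquad P(m_c \mid m_j) = \frac{\exp(\mathbf{m}_j^{T} \boldsymbol{\vartheta}_c)}{\sum_{k=1}^{|M|} \exp(\mathbf{m}_j^{T} \boldsymbol{\vartheta}_k)}$$
where $|D|$ and $|M|$ represent the number of drugs and miRNAs, respectively, and $\boldsymbol{\theta}$ and $\boldsymbol{\vartheta}$ represent the context vectors of the two types of nodes.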
2.7.3 Joint optimization
To preserve both explicit and implicit relationships to learn low-dimensional embedding vectors, we combine the three component Eqs. (12), (14), and (15) into a joint optimization framework:
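$$\text{maximize } L = \alpha \log O_2 + \beta \log O_3 - \gamma O_1$$
where $O_1$ is the explicit-relation objective of Eq. (12), $O_2$ and $O_3$ are the implicit-relation objectives of Eqs. (14) and (15), the parameters $\alpha$ and $\beta$ are the hyper-parameters of the implicit relations, and $\gamma$ is the hyper-parameter of the explicit relation in the joint optimization framework.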
2.8 Prediction layer
In order to build a predictive model for miRNA-drug interactions, we adopted a dual-channel approach that combines structural and topological features. Specifically, we concatenate the structural features of drugs with their topological features as the fusion features of drugs, and the structural features of miRNAs with their topological features as the fusion features of miRNAs. We then apply max-pooling to each of the two fusion features. Finally, the pooled fusion features of a drug node and a miRNA node are input into the Softmax layer, whose output is taken as the interaction score between the drug and the miRNA.
The Softmax layer maps the features to a probability distribution that represents the degree of interaction between the drug and the miRNA. This score can be used to assess the strength of the possible interaction between the drug and the miRNA.
3 Results
3.1 Evaluation metrics
In order to ensure the fairness and impartiality of the experiment, we used 5-fold cross validation to evaluate the model performance. Specifically, we split the dataset into five parts, with each part in turn serving as the test set and the remaining four parts serving as the training set. Finally, we calculate the average of these five sets of prediction results as an evaluation estimate of the model performance.
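The splitting scheme can be sketched with the standard library alone. The helper below is a minimal illustration; the function name, the shuffling seed and the round-robin assignment of samples to folds are assumptions, since the exact partitioning procedure is not described.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]          # k nearly equal parts
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test

splits = list(k_fold_indices(23))
```

A metric is computed on the test part of each split, and the five values are averaged to give the reported estimate.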
Since predicting MDIs is a binary classification problem, we chose common classification metrics to evaluate the model: the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AUPR), Accuracy (ACC), Precision, Sensitivity, Specificity and F1-score. The relevant formulas are as follows:
$$ACC = \frac{TP + TN}{TP + TN + FP + FN}$$
$$Precision = \frac{TP}{TP + FP}$$
$$Sensitivity = \frac{TP}{TP + FN}$$
$$Specificity = \frac{TN}{TN + FP}$$
$$F1\text{-}score = \frac{2 \times Precision \times Sensitivity}{Precision + Sensitivity}$$
where TP is the number of actually positive samples correctly predicted as positive; TN is the number of actually negative samples correctly predicted as negative; FP is the number of actually negative samples incorrectly predicted as positive; and FN is the number of actually positive samples incorrectly predicted as negative. Together, these metrics give a comprehensive evaluation of the model's performance.
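The threshold-based metrics follow directly from the four confusion-matrix counts, as the short sketch below shows. The function name and the example counts are hypothetical and are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute ACC, Precision, Sensitivity, Specificity and F1 from counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    return {
        "acc": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": tn / (tn + fp),
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Toy counts for 50 positive and 50 negative test pairs.
m = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

AUC and AUPR are not covered by this helper because they are computed from the ranked prediction scores rather than from a single confusion matrix.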
3.2 Performance comparison of different datasets
In order to comprehensively, reliably and systematically evaluate the predictive performance of the model, we conducted experiments on three different datasets, ncDR, RNAInter and SM2miR3, and listed each evaluation index in Tab.2 to illustrate the predictive performance of the model.
In order to visually demonstrate the predictive performance of the model on each dataset, Fig.4−Fig.6 show the ROC and PR curves of the 5-fold cross-validation results on the three datasets respectively. By drawing the ROC curve and PR curve, we can calculate the average AUC and AUPR of the MDIDCN model on different datasets. Specifically, the average AUC on the ncDR dataset is 0.9567 and the average AUPR is 0.9556, the average AUC on the RNAInter dataset is 0.9365 and the average AUPR is 0.9348, and the average AUC on the SM2miR3 dataset is 0.8975 and the average AUPR is 0.8881.
Taken together, these results show that the MDIDCN model performs stably across datasets, with high average AUC and AUPR, demonstrating its potential for predicting MDIs.
3.3 Feature extraction layer dimension analysis
In the feature extraction part, we used BiLSTM to extract structural features of miRNAs and TCN to extract structural features of drugs. However, in the process of feature extraction, increasing the feature dimension will increase the complexity of the model, and the difference in the dimension will also lead to the difference in the prediction ability of the model. To ensure good generalization of the MDIDCN model, we tried different combinations of 32, 64, 128, 256, and 512 hidden layer units for BiLSTM and TCN, respectively. As shown in Fig.7, the darker the color of the grid in the figure, the better the prediction performance of the model when the dimensions of the corresponding two feature extraction methods are used. We found that when BiLSTM_dimension = 64, TCN_dimension = 128, ncDR and SM2miR3 had the best AUC, which are 0.9567 and 0.8975, respectively. When BiLSTM_dimension = 64, TCN_dimension = 64, RNAInter had the best AUC, which is 0.9365.
These results showed that choosing the right dimension for extracting features is crucial. Too high a dimension will increase the complexity of the model and may lead to overfitting problems, while too low a dimension may not adequately capture the feature information, affecting the performance of the model. By optimizing the dimensions of BiLSTM and TCN, the MDIDCN model can better predict miRNA-drug interactions, and has high generalization ability and potential.
3.4 Ablation experiment
When constructing the representation vectors of miRNA and drug, we considered both structural features and topological features. This section discusses the effects of three representation vectors on the predictive performance of MDIDCN: using only structural features, using only topological features, and using both structural and topological features. The structural features provide clues about the chemical structural similarities between miRNAs and drugs, while the topological features take into account the higher-order implicit and explicit transition relationships that exist between miRNAs and drugs.
As shown in Fig.8, we use AUC as the evaluation criterion. On each dataset, the AUC obtained with both structural and topological features is the highest, the AUC with only topological features is second, and the AUC with only structural features is the lowest. Topological features therefore contribute more to the prediction performance of the model, because they capture complex relationships between nodes and thus provide richer similarity information. Although structural features yield lower performance than topological features, their generation is relatively simple and requires only structural information. Structural features therefore have an advantage in representing new samples, which is convenient when dealing with large-scale datasets. By using both structural and topological features, the MDIDCN model considers the chemical structure of miRNAs and drugs together with their interrelationships in the network, which improves the prediction performance.
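The three ablation settings can be expressed as a single switch over which features enter the representation vector. The sketch below is illustrative only; the function name, the `mode` values, and the toy vectors are assumptions rather than part of the published model.

```python
def build_representation(structural, topological, mode="both"):
    """Return the feature vector used in one ablation setting.

    mode = "structural"  -> structural features only
    mode = "topological" -> topological features only
    mode = "both"        -> concatenation of the two
    """
    if mode == "structural":
        return list(structural)
    if mode == "topological":
        return list(topological)
    if mode == "both":
        return list(structural) + list(topological)
    raise ValueError(f"unknown mode: {mode}")


# Toy vectors (hypothetical values and sizes).
structural_vec = [0.1, 0.4]
topological_vec = [0.7, 0.2, 0.9]
fused = build_representation(structural_vec, topological_vec, mode="both")
```

Keeping the switch in one place means the three settings differ only in their input features, so differences in AUC can be attributed to the features rather than to other changes.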
3.5 Effect of dataset sparsity on performance
As can be seen from Tab.1, the sparse rate of the ncDR dataset is 0.0752, that of the RNAInter dataset is 0.0202, and that of the SM2miR3 dataset is 0.0212. In this section, we compare the three miRNA-drug interaction networks and analyze the effect of their sparsity on the predictive performance of the model. The results are shown in Fig.9.
As can be seen from Fig.9, ncDR has the highest sparse rate, and its evaluation indicators are also the highest among the three datasets, followed by RNAInter and SM2miR3. Although the sparse rates of RNAInter and SM2miR3 are close, their evaluation indicators are not. Fig.9 suggests that a sparser dataset yields noticeably lower metrics such as AUC, AUPR, Pre, and Spec, but this does not mean that the model itself performs poorly.
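For reference, a sparse rate of this kind can be computed from an interaction list as follows. This sketch assumes that the sparse rate is the fraction of all possible miRNA-drug pairs that are known interactions; that definition is an assumption made for illustration, and the toy data is hypothetical.

```python
def sparse_rate(interactions, n_mirnas, n_drugs):
    """Fraction of all miRNA-drug pairs that are known interactions.

    `interactions` is an iterable of (miRNA, drug) pairs; duplicates are
    counted once. The definition itself is an assumption of this sketch.
    """
    if n_mirnas <= 0 or n_drugs <= 0:
        raise ValueError("entity counts must be positive")
    known = len(set(interactions))
    return known / (n_mirnas * n_drugs)


# Toy network: 2 miRNAs, 2 drugs, one duplicated record.
toy_pairs = [("m1", "d1"), ("m2", "d1"), ("m1", "d1")]
rate = sparse_rate(toy_pairs, n_mirnas=2, n_drugs=2)
```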
3.6 Comparison with other methods
This section compares MDIDCN with five other methods. LDAMAN [49] predicts potential lncRNA-disease associations from a heterogeneous information network with the SDNE embedding model. Guan et al. proposed two methods for predicting MDIs, BNEMDI [30] and MFIDMA [50]. BNEMDI uses BiNE to extract topological features and then inputs the topological and attribute features of both drug and miRNA into a deep neural network to search for potential miRNA-drug interactions. MFIDMA uses SDNE to extract topological features and then, similarly to BNEMDI, inputs the topological and attribute features of the drug and miRNA into a convolutional neural network (CNN) and a deep neural network (DNN) to integrate the features and predict potential target miRNAs of the drug. Wei et al. [51] proposed GCFMCL, a multi-view contrastive learning model based on graph collaborative filtering. GCFMCL aggregates neighborhood information through graph collaborative filtering and uses topological contrastive learning and feature contrastive learning to reduce the impact of heterogeneous node noise and interaction sparsity. Niu et al. [52] developed GCNNMMA, a prediction model based on graph neural networks (GNNs) and convolutional neural networks (CNNs), and used it to predict associations between small molecule drugs and miRNAs. All comparison methods were evaluated with 5-fold cross-validation on the ncDR, RNAInter, and SM2miR3 datasets, and the results are shown in Tab.3−Tab.5.
As can be seen from Tab.3−Tab.5, our proposed model, MDIDCN, has the best overall performance and can be used to predict potential MDIs. The last three indicators of the GCNNMMA model are invalid on the RNAInter and SM2miR3 datasets, possibly because these two datasets are too sparse.
3.7 Case study
To further evaluate the predictive power of MDIDCN, we selected the drug Gemcitabine (PubChem ID: 60750) and the miRNA hsa-miR-155-5p for case studies based on the ncDR dataset. To investigate Gemcitabine-related miRNA-drug interactions, we removed all Gemcitabine-related interactions from the dataset and trained the model on the remaining miRNA-drug interaction data. We then used the MDIDCN model to screen for miRNAs that may interact with Gemcitabine. Eight of the top 10 predicted miRNAs were verified by PubMed literature, which supports the conclusion that these miRNAs are likely to interact with Gemcitabine. The predicted miRNAs are listed in Tab.6 together with supporting evidence. For example, regulation by hsa-miR-155-5p changed the expression of the WEE1 gene, thus affecting the resistance of bladder cancer cells to gemcitabine [53]. hsa-miR-20a-5p regulates gemcitabine chemosensitivity by targeting RRM2 in pancreatic cancer cells and serves as a predictor for gemcitabine-based chemotherapy [54]. hsa-miR-17-5p/RRM2 regulated gemcitabine resistance in lung cancer A549 cells [55]. hsa-miR-133b interacts with gemcitabine in promoting tumor cell apoptosis, which helps to improve the therapeutic effect [56].
In addition, we used the same method to conduct a case study on the miRNA hsa-miR-155-5p. As shown in Tab.7, seven of the top 10 predicted drugs were validated by PubMed literature. For example, under specific drug conditions, hsa-miR-155-5p can enhance the resistance of esophageal carcinoma cells to Docetaxel [57]. hsa-miR-155-5p has also been linked to Uric Acid, being one of the miRNAs reported as significantly down-regulated in that context [58]. The case study results show that MDIDCN predicts MDIs with a high degree of accuracy and reliability.
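The case-study protocol (hold out one entity, retrain, rank candidates, count literature-supported hits) can be sketched as below. The sketch covers only the data handling around a model; the helper names, scores, and identifiers are hypothetical, and the scores would in practice come from the retrained model. The same steps apply to a held-out miRNA by swapping the roles of the two entities.

```python
def hold_out_drug(pairs, drug):
    """Drop every known (miRNA, drug) pair that involves the held-out drug."""
    return [(m, d) for m, d in pairs if d != drug]


def top_k(scores, k=10):
    """Return the k highest-scoring candidates, best first.

    `scores` maps a candidate name to its predicted interaction score.
    """
    return sorted(scores, key=scores.get, reverse=True)[:k]


def count_validated(ranked, validated):
    """Count ranked candidates that appear in a literature-validated set."""
    return sum(1 for c in ranked if c in validated)


# Toy data (hypothetical names and scores).
known_pairs = [("miR-a", "drugX"), ("miR-b", "drugX"), ("miR-a", "drugY")]
training_pairs = hold_out_drug(known_pairs, "drugX")

predicted_scores = {"miR-a": 0.9, "miR-b": 0.2, "miR-c": 0.7}
ranked = top_k(predicted_scores, k=2)
n_supported = count_validated(ranked, validated={"miR-a"})
```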
4 Conclusion
We present a new computational method called MDIDCN for predicting interactions between drugs and miRNAs. First, MACCS fingerprints are used to represent the structural features of drugs, and k-mers are used to represent the structural features of miRNAs. MDIDCN then uses a graph embedding technique (BiNE) to learn topological feature representations of drugs and miRNAs on the bipartite graph. Next, the latent embedding vectors of drug fingerprints are learned through TCN, and the latent embedding vectors of miRNA nucleotide sequences are learned through BiLSTM. Finally, the structural and topological features of the drug and miRNA learned by the dual channels are combined into their respective fusion features and input into the CNN layer. After the max-pooling layer and the Softmax layer, miRNA-drug interaction scores are obtained.
To evaluate the performance of the MDIDCN model, we performed 5-fold cross-validation on three datasets. The average AUCs of our model on these three datasets are 0.9567, 0.9365, and 0.8975, respectively. In addition, we conducted case studies on the drug Gemcitabine and the miRNA hsa-miR-155-5p and compared our approach with other existing methods. Taken together, our model can accurately and robustly predict MDIs. In the future, we will consider adding the side effect features of drugs and the physicochemical properties of miRNAs to further improve the predictive power of the model.
References
[1]
Schmidt M F. Drug target miRNAs: chances and challenges. Trends in Biotechnology, 2014, 32(11): 578–585
[2]
Meister G, Tuschl T. Mechanisms of gene silencing by double-stranded RNA. Nature, 2004, 431(7006): 343–349
[3]
Xu L F, Wu Z P, Chen Y, Zhu Q S, Hamidi S, Navab R. MicroRNA-21 (miR-21) regulates cellular proliferation, invasion, migration, and apoptosis by targeting PTEN, RECK and Bcl-2 in lung squamous carcinoma, Gejiu city, China. PLoS One, 2014, 9(8): e103698
[4]
Garofalo M, Condorelli G, Croce C M. MicroRNAs in diseases and drug response. Current Opinion in Pharmacology, 2008, 8(5): 661–667
[5]
Chen X, Xie D, Zhao Q, You Z H. MicroRNAs and complex diseases: from experimental results to computational models. Briefings in Bioinformatics, 2019, 20(2): 515–539
[6]
Zhang S, Chen L, Jung E J, Calin G A. Targeting MicroRNAs with small molecules: from dream to reality. Clinical Pharmacology & Therapeutics, 2010, 87(6): 754–758
[7]
Chen X, Guan N N, Sun Y Z, Li J Q, Qu J. MicroRNA-small molecule association identification: from experimental results to computational models. Briefings in Bioinformatics, 2020, 21(1): 47–61
[8]
Wang C C, Chen X, Qu J, Sun Y Z, Li J Q. RFSMMA: a new computational model to identify and prioritize potential small molecule–miRNA associations. Journal of Chemical Information and Modeling, 2019, 59(4): 1668–1679
[9]
Zhao Y, Chen X, Yin J, Qu J. SNMFSMMA: using symmetric nonnegative matrix factorization and Kronecker regularized least squares to predict potential small molecule-microRNA association. RNA Biology, 2020, 17(2): 281–291
[10]
Qu J, Chen X, Sun Y Z, Zhao Y, Cai S B, Ming Z, You Z H, Li J Q. In silico prediction of small molecule-miRNA associations based on the HeteSim algorithm. Molecular Therapy Nucleic Acids, 2019, 14: 274–286
[11]
Wang S H, Wang C C, Huang L, Miao L Y, Chen X. Dual-network collaborative matrix factorization for predicting small molecule-miRNA associations. Briefings in Bioinformatics, 2022, 23(1): bbab500
[12]
Zhang G Z, Li M L, Deng H, Xu X, Liu X, Zhang W. SGNNMD: signed graph neural network for predicting deregulation types of miRNA-disease associations. Briefings in Bioinformatics, 2022, 23(1): bbab464
[13]
Wang C C, Li T H, Huang L, Chen X. Prediction of potential miRNA–disease associations based on stacked autoencoder. Briefings in Bioinformatics, 2022, 23(2): bbac021
[14]
Zheng K, Zhao H, Zhao Q, Wang B, Gao X, Wang J. NASMDR: a framework for miRNA-drug resistance prediction using efficient neural architecture search and graph isomorphism networks. Briefings in Bioinformatics, 2022, 23(5): bbac338
[15]
Ma M Y, Na S, Zhang X, Chen C, Xu J. SFGAE: a self-feature-based graph autoencoder model for miRNA–disease associations prediction. Briefings in Bioinformatics, 2022, 23(5): bbac340
[16]
Wang S, Liu T, Ren C, Wu W, Zhao Z, Pang S, Zhang Y. Predicting potential small molecule–miRNA associations utilizing truncated schatten p-norm. Briefings in Bioinformatics, 2023, 24(4): bbad234
[17]
Wang Z, Lei X. Matrix factorization with neural network for predicting circRNA-RBP interactions. BMC Bioinformatics, 2020, 21(1): 229
[18]
Wang Z, Lei X. Prediction of RBP binding sites on circRNAs using an LSTM-based deep sequence learning architecture. Briefings in Bioinformatics, 2021, 22(6): bbab342
[19]
Guo Y, Lei X, Liu L, Pan Y. circ2CBA: prediction of circRNA-RBP binding sites combining deep learning and attention mechanism. Frontiers of Computer Science, 2023, 17(5): 175904
[20]
Ma M, Lei X. A dual graph neural network for drug–drug interactions prediction based on molecular structure and interactions. PLoS Computational Biology, 2023, 19(1): e1010812
[21]
Liu J, Lei X, Zhang Y, Pan Y. The prediction of molecular toxicity based on BiGRU and GraphSAGE. Computers in Biology and Medicine, 2023, 153: 106524
[22]
Lei X, Tie J, Pan Y. Inferring metabolite-disease association using graph convolutional networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2022, 19(2): 688–698
[23]
Kurtz S, Narechania A, Stein J C, Ware D. A new method to compute K-mer frequencies and its application to annotate large repetitive plant genomes. BMC Genomics, 2008, 9: 517
[24]
Cereto-Massagué A, Ojeda M J, Valls C, Mulero M, Garcia-Vallvé S, Pujadas G. Molecular fingerprint similarity search in virtual screening. Methods, 2015, 71: 58–63
[25]
Li M M, Huang K, Zitnik M. Graph representation learning in biomedicine. 2021, arXiv preprint arXiv:2104.04883
[26]
Yue Y, He S. DTI-HeNE: a novel method for drug-target interaction prediction based on heterogeneous network embedding. BMC Bioinformatics, 2021, 22(1): 418
[27]
Gao M, Chen L, He X, Zhou A. BiNE: bipartite network embedding. In: Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval. 2018, 715–724
[28]
Bai S, Kolter J Z, Koltun V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. 2018, arXiv preprint arXiv:1803.01271
[29]
Schuster M, Paliwal K K. Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 1997, 45(11): 2673–2681
[30]
Guan Y J, Yu C Q, Li L P, You Z H, Ren Z H, Pan J, Li Y C. BNEMDI: a novel MicroRNA–drug interaction prediction model based on multi-source information with a large-scale biological network. Frontiers in Genetics, 2022, 13: 919264
[31]
Dai E, Yang F, Wang J, Zhou X, Song Q, An W, Wang L, Jiang W. ncDR: a comprehensive resource of non-coding RNAs involved in drug resistance. Bioinformatics, 2017, 33(24): 4010–4011
[32]
Kang J, Tang Q, He J, Li L, Yang N, Yu S, Wang M, Zhang Y, Lin J, Cui T, Hu Y, Tan P, Cheng J, Zheng H, Wang D, Su X, Chen W, Huang Y. RNAInter v4.0: RNA interactome repository with redefined confidence scoring system and improved accessibility. Nucleic Acids Research, 2022, 50(D1): D326–D332
[33]
Liu X, Wang S, Meng F, Wang J, Zhang Y, Dai E, Yu X, Li X, Jiang W. SM2miR: a database of the experimentally validated small molecules’ effects on microRNA expression. Bioinformatics, 2013, 29(3): 409–411
[34]
Kozomara A, Birgaoanu M, Griffiths-Jones S. miRBase: from microRNA sequences to function. Nucleic Acids Research, 2019, 47(D1): D155–D162
[35]
Kim S, Chen J, Cheng T, Gindulyte A, He J, He S, Li Q, Shoemaker B A, Thiessen P A, Yu B, Zaslavsky L, Zhang J, Bolton E E. PubChem in 2021: new data content and improved web interfaces. Nucleic Acids Research, 2021, 49(D1): D1388–D1395
[36]
Weininger D. SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules. Journal of Chemical Information and Computer Sciences, 1988, 28(1): 31–36
[37]
Cao D S, Liu S, Xu Q S, Lu H M, Huang J H, Hu Q N, Liang Y Z. Large-scale prediction of drug-target interactions using protein sequences and drug topological structures. Analytica Chimica Acta, 2012, 752: 1–10
[38]
Tang J, Qu M, Wang M, Zhang M, Yan J, Mei Q. LINE: large-scale information network embedding. In: Proceedings of the 24th International Conference on World Wide Web. 2015, 1067–1077
[39]
Church K W. Word2Vec. Natural Language Engineering, 2017, 23(1): 155–162
[40]
Jiang M, Cui P, Yuan N J, Xie X, Yang S. Little is much: bridging cross-platform behaviors through overlapped crowds. In: Proceedings of the 30th AAAI Conference on Artificial Intelligence. 2016, 13–19
[41]
Yu L, Zhang C, Pei S, Sun G, Zhang X. WalkRanker: a unified pairwise ranking model with multiple relations for item recommendation. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 2018
[42]
Grover A, Leskovec J. node2vec: scalable feature learning for networks. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2016, 855–864
[43]
Perozzi B, Al-Rfou R, Skiena S. DeepWalk: online learning of social representations. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2014, 701–710
[44]
Alzahrani T, Horadam K J, Boztas S. Community detection in bipartite networks using random walks. In: Proceedings of the 5th Workshop on Complex Networks CompleNet 2014. 2014, 157–165
[45]
Deng H, Lyu M R, King I. A generalized Co-HITS algorithm and its application to bipartite graphs. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2009, 239–248
[46]
Kleinberg J M. Authoritative sources in a hyperlinked environment. Journal of the ACM, 1999, 46(5): 604–632
[47]
Dong Y, Chawla N V, Swami A. metapath2vec: scalable representation learning for heterogeneous networks. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2017, 135–144
[48]
Mikolov T, Sutskever I, Chen K, Corrado G, Dean J. Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th International Conference on Neural Information Processing Systems. 2013, 3111–3119
[49]
Zhang P, Zhao B W, Wong L, You Z H, Guo Z H, Yi H C. A novel computational method for predicting LncRNA-disease associations from heterogeneous information network with SDNE embedding model. In: Proceedings of the 16th International Conference on Intelligent Computing Theories and Application. 2020, 505–513
[50]
Guan Y J, Yu C Q, Qiao Y, Li L P, You Z H, Ren Z H, Li Y C, Pan J. MFIDMA: a multiple information integration model for the prediction of drug-miRNA associations. Biology, 2022, 12(1): 41
[51]
Wei J, Zhuo L, Zhou Z, Lian X, Fu X, Yao X. GCFMCL: predicting miRNA-drug sensitivity using graph collaborative filtering and multi-view contrastive learning. Briefings in Bioinformatics, 2023, 24(4): bbad247
[52]
Niu Z, Gao X, Xia Z, Zhao S, Sun H, Wang H, Liu M, Kong X, Ma C, Zhu H, Gao H, Liu Q, Yang F, Song X, Lu J, Zhou X. Prediction of small molecule drug-miRNA associations based on GNNs and CNNs. Frontiers in Genetics, 2023, 14: 1201934
[53]
Yang Y, Zhang G, Li J, Gong R, Wang Y, Qin Y, Ping Q, Hu L. Long noncoding RNA NORAD acts as a ceRNA mediates gemcitabine resistance in bladder cancer by sponging miR-155-5p to regulate WEE1 expression. Pathology-Research and Practice, 2021, 228: 153676
[54]
Lu H, Lu S, Yang D, Zhang L, Ye J, Li M, Hu W. MiR-20a-5p regulates gemcitabine chemosensitivity by targeting RRM2 in pancreatic cancer cells and serves as a predictor for gemcitabine-based chemotherapy. Bioscience Reports, 2019, 39(5): BSR20181374
[55]
Ma X, Fu T, Ke Z Y, Du S L, Wang X C, Zhou N, Zhong M Y, Liu Y J, Liang A L. MiR-17-5p/RRM2 regulated gemcitabine resistance in lung cancer A549 cells. Cell Cycle, 2023, 22(11): 1367–1379
[56]
Crawford M, Batte K, Yu L, Wu X, Nuovo G J, Marsh C B, Otterson G A, Nana-Sinkam S P. MicroRNA 133B targets pro-survival molecules MCL-1 and BCL2L2 in lung cancer. Biochemical and Biophysical Research Communications, 2009, 388(3): 483–489
[57]
Luo W, Zhang H, Liang X, Xia R, Deng H, Yi Q, Lv L, Qian L. DNA methylation-regulated miR-155-5p depresses sensitivity of esophageal carcinoma cells to radiation and multiple chemotherapeutic drugs via suppression of MAP3K10. Oncology Reports, 2020, 43(5): 1692–1704
[58]
Bai B, Liu Y, Abudukerimu A, Tian T, Liang M, Li R, Sun Y. Key genes associated with pyroptosis in gout and construction of a miRNA-mRNA regulatory network. Cells, 2022, 11(20): 3269
RIGHTS & PERMISSIONS
Higher Education Press