Quantitative Biology

Online First
The manuscripts published below will continue to be available from this page until they are assigned to an issue.
On the use of kernel machines for Mendelian randomization
Weiming Zhang, Debashis Ghosh
Quant. Biol.    DOI: 10.1007/s40484-017-0124-3

Background: Properly adjusting for unmeasured confounders is critical in health studies for valid testing and estimation of an exposure’s causal effect on outcomes. The instrumental variable (IV) method has long been used in econometrics to estimate causal effects while accommodating unmeasured confounders. Mendelian randomization (MR), which uses genetic variants as the instrumental variables, applies the IV method to biomedical research and has become popular in recent years. One often-used estimator of causal effects in IV analysis and Mendelian randomization is the two-stage least squares (TSLS) estimator. The validity of TSLS relies on accurate prediction of the exposure from the IVs in its first stage.

Results: In this note, we propose to model the link between exposure and genetic IVs using the least-squares kernel machine (LSKM). Simulation studies are used to evaluate the feasibility of LSKM in the TSLS setting.

Conclusions: Our results show that LSKM based on genotype score or genotype can be used effectively in TSLS. It may provide higher power when the association between exposure and genetic IVs is nonlinear.
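
As an illustration of the two-stage least squares estimator described in this abstract, the following is a minimal sketch with a plain linear first stage. It is not the authors' LSKM method, which would replace the first-stage regression with a kernel machine. The function names, effect sizes, allele frequency, and sample size are all assumptions chosen for the example.

```python
import numpy as np


def tsls(z, x, y):
    """Two-stage least squares for one exposure x, instruments z (n, k), outcome y."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    # Stage 1: regress the exposure on the instruments and keep the fitted values.
    gamma, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ gamma
    # Stage 2: regress the outcome on the predicted exposure.
    X_hat = np.column_stack([np.ones(n), x_hat])
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta[1]  # slope = estimated causal effect


def simulate(n=20000, beta=0.5, seed=0):
    """Toy data (assumed setup): three genotypes as IVs, one unmeasured confounder u."""
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=(n, 3)).astype(float)  # genotypes coded 0/1/2
    u = rng.normal(size=n)  # confounder affecting both exposure and outcome
    x = g @ np.array([0.4, 0.3, 0.5]) + u + rng.normal(size=n)
    y = beta * x + u + rng.normal(size=n)
    return g, x, y


g, x, y = simulate()
beta_tsls = tsls(g, x, y)
beta_ols = np.polyfit(x, y, 1)[0]  # naive regression, biased by the confounder
print(f"TSLS: {beta_tsls:.3f}  OLS: {beta_ols:.3f}  (true effect 0.5)")
```

In this toy setup the naive regression absorbs the confounder's effect, while the TSLS estimate should land close to the simulated effect because the genotypes are independent of the confounder.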

Cap-seq reveals complicated miRNA transcriptional mechanisms in C. elegans and mouse
Jiao Chen, Dongxiao Zhu, Yanni Sun
Quant. Biol.    DOI: 10.1007/s40484-017-0123-4

Background: MicroRNAs (miRNAs) regulate target gene expression at the post-transcriptional level. Intense research has been conducted on miRNA identification and target finding. However, much less is known about the transcriptional regulation of miRNA genes themselves. Recently, a special group of pre-miRNAs that are produced directly by transcription without Drosha processing was validated in mouse, indicating the complexity of miRNA biogenesis.

Methods: In this work, we detect clusters of aligned Cap-seq reads to find the transcription start sites (TSSs) for intergenic miRNAs and study their transcriptional regulation in Caenorhabditis elegans and mouse.

Results: In both species, we have identified a class of special pre-miRNAs whose 5′ ends are capped and which are most probably generated directly by transcription. Furthermore, we distinguished another class of special pre-miRNAs that are 5′-capped but are also part of longer primary miRNAs, suggesting they may have more than one transcription mechanism. We detected multiple cap-read peaks within miRNA clusters in C. elegans. We surmised that the miRNAs in a cluster may either be transcribed independently or be re-capped during the microprocessor cleavage process. We also observed that H3K4me3 and Pol II are enriched at the identified miRNA TSSs.

Conclusions: The Cap-seq datasets enabled us to annotate the primary TSSs for miRNA genes with high resolution. A special class of 5′-capped pre-miRNAs has been identified in both C. elegans and mouse. The capping pattern of miRNAs in a cluster indicates that clustered miRNA transcripts probably undergo a re-capping procedure during the microprocessor cleavage process.
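
The idea of detecting clusters of aligned Cap-seq reads to locate TSSs can be sketched as follows. The abstract does not give the clustering rule, so this is only one plausible simplification: 5′ read positions on a single chromosome and strand are grouped whenever neighbouring reads lie within a gap threshold, and the most frequent position in each cluster is reported as the candidate TSS. The function name and the `max_gap` and `min_reads` values are hypothetical.

```python
from collections import Counter


def call_tss_clusters(positions, max_gap=20, min_reads=5):
    """Group 5' read positions into clusters and report a candidate TSS for each.

    positions: 5' end coordinates of aligned reads on one chromosome and strand.
    max_gap:   largest distance between neighbouring reads in a cluster (assumed).
    min_reads: smallest number of reads for a cluster to be reported (assumed).
    """
    clusters, current = [], []
    for pos in sorted(positions):
        # Start a new cluster when the gap to the previous read is too large.
        if current and pos - current[-1] > max_gap:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)

    calls = []
    for cluster in clusters:
        if len(cluster) < min_reads:
            continue
        counts = Counter(cluster)
        # Candidate TSS = position with the most reads; ties go to the leftmost.
        peak = max(counts, key=lambda p: (counts[p], -p))
        calls.append({"start": cluster[0], "end": cluster[-1],
                      "reads": len(cluster), "tss": peak})
    return calls


reads = [100, 101, 101, 101, 102, 103, 500, 900, 900, 901, 902, 902, 902, 905]
tss_calls = call_tss_clusters(reads)
for call in tss_calls:
    print(call)
```

A real pipeline would also track strand and chromosome and would likely compare cap-read density against a background, none of which is modelled here.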

Systems and synthetic biology approaches in understanding biological oscillators
Zhengda Li, Qiong Yang
Quant. Biol.    DOI: 10.1007/s40484-017-0120-7

Background: Self-sustained oscillations are a ubiquitous and vital phenomenon in living systems. From primitive single-celled bacteria to the most sophisticated organisms, periodicities have been observed in a broad spectrum of biological processes such as neuron firing, heartbeats, cell cycles, and circadian rhythms. Defects in these oscillators can cause diseases ranging from insomnia to cancer. Elucidating their fundamental mechanisms is of great significance for understanding disease, yet challenging because of the complexity and diversity of these oscillators.

Results: Approaches in quantitative systems biology and synthetic biology have been most effective by simplifying the systems to contain only the most essential regulators. Here, we review major progress that has been made in understanding biological oscillators using these approaches. The quantitative systems biology approach allows for identification of the essential components of an oscillator in an endogenous system. The synthetic biology approach makes use of this knowledge to design the simplest, de novo oscillators in both live cells and cell-free systems. These synthetic oscillators are tractable to further detailed analysis and manipulation.

Conclusion: With the recent development of biological and computational tools, both approaches have made significant achievements.

Variable importance-weighted Random Forests
Yiyi Liu, Hongyu Zhao
Quant. Biol.    DOI: 10.1007/s40484-017-0121-6

Background: Random Forests is a popular classification and regression method that has proven powerful for various prediction problems in biological studies. However, its performance often deteriorates when the number of features increases. To address this limitation, feature elimination Random Forests was proposed, which uses only the features with the largest variable importance scores. Yet the performance of this method is not satisfactory, possibly due to its rigid feature selection and increased correlations between the trees of the forest.

Methods: We propose variable importance-weighted Random Forests, which, instead of sampling features with equal probability at each node to build up trees, samples features according to their variable importance scores and then selects the best split from the randomly selected features.

Results: We evaluate the performance of our method through comprehensive simulation and real data analyses, for both regression and classification. Compared to the standard Random Forests and the feature elimination Random Forests methods, our proposed method has improved performance in most cases.

Conclusions: By incorporating the variable importance scores into the random feature selection step, our method can better utilize more informative features without completely ignoring less informative ones, and hence has improved prediction accuracy in the presence of weak signals and high noise. We have implemented an R package “viRandomForests” based on the original R package “randomForest”; it can be freely downloaded from http://zhaocenter.org/software.
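
The node-level step described in the Methods (sample candidate features in proportion to their importance, then pick the best split among them) can be sketched as below. This is an illustration only, not the viRandomForests implementation: the function names are hypothetical, the importance scores are assumed to be supplied by the caller, negative scores are clipped to zero as an assumption, and the split criterion shown is the usual sum-of-squared-errors rule for regression.

```python
import numpy as np


def sample_features(importance, mtry, rng):
    """Draw up to mtry distinct features with probability proportional to importance."""
    w = np.clip(np.asarray(importance, dtype=float), 0.0, None)  # assumption: clip negatives
    if w.sum() == 0:
        w = np.ones_like(w)  # fall back to uniform sampling
    p = w / w.sum()
    mtry = min(mtry, int(np.count_nonzero(p)))  # cannot draw more than the non-zero entries
    return rng.choice(len(p), size=mtry, replace=False, p=p)


def best_split(X, y, features):
    """Return (feature, threshold, sse) minimising the squared error over candidates."""
    best = (None, None, np.inf)
    for j in features:
        values = np.unique(X[:, j])
        # Candidate thresholds are midpoints between consecutive distinct values.
        for t in (values[:-1] + values[1:]) / 2:
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
            if sse < best[2]:
                best = (int(j), float(t), float(sse))
    return best


rng = np.random.default_rng(0)
X = np.array([[5.0, 0.0], [1.0, 1.0], [4.0, 2.0], [2.0, 3.0]])
y = np.array([0.0, 0.0, 10.0, 10.0])
candidates = sample_features([1.0, 9.0], mtry=2, rng=rng)
print(best_split(X, y, candidates))
```

Features with low scores keep a small but non-zero chance of being considered at each node, which is the contrast with feature elimination that the abstract draws.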

Transcriptome assembly strategies for precision medicine
Lu Wang, Lipi Acharya, Changxin Bai, Dongxiao Zhu
Quant. Biol.    DOI: 10.1007/s40484-017-0109-2

Background: The precision medicine approach holds great promise for tailored diagnosis, treatment and prevention. Individuals can differ vastly in their genomic information and genetic mechanisms, and hence have unique transcriptomic signatures. The development of precision medicine has demanded moving beyond DNA sequencing (DNA-Seq) to much more pointed RNA sequencing (RNA-Seq) [Cell, 2017, 168: 584–599].

Results: Here we conduct a brief survey of recent methodological developments in transcriptome assembly using RNA-Seq.

Conclusions: Since transcriptomes in human disease are highly complex, dynamic and diverse, transcriptome assembly is playing an increasingly important role in precision medicine research to dissect the molecular mechanisms of human diseases.

Metabolic pathway databases and model repositories
Abraham A. Labena, Yi-Zhou Gao, Chuan Dong, Hong-li Hua, Feng-Biao Guo
Quant. Biol.    DOI: 10.1007/s40484-017-0108-3

Background: The number of biological knowledge bases and databases storing metabolic pathway information and models has been growing rapidly. These resources are diverse in the type of information and data they hold, their analytical tools, and their objectives. Here we present a review of the most popular metabolic pathway databases and model repositories, focusing on their scope, their content (including reactions, enzymes, compounds, and genes), and their applicability. The review aims to help researchers choose a suitable database or model repository according to the information and data required, by providing an insightful look at each pathway resource.

Results: Four pathway databases and three model repositories were selected on the basis of popularity and diversity. Our review showed that the pathway resources vary in many aspects, such as their scope, content, access to data, and tools. In addition, inconsistencies were observed in the nomenclature and representation of database entities. The three model repositories reviewed do not offer a brief description of the models’ characteristics, such as simulation conditions.

Conclusions: The inconsistencies among the databases in representing their contents may hamper the maximal use of the knowledge accumulated in these databases in particular and in the area of systems biology at large. Therefore, it is strongly recommended that database creators and metabolic network model developers follow international standards for the nomenclature of reactions and metabolites. In addition, computationally generated models obtained from model repositories should be used with manual curation, as they lack some important components that are necessary for the models to be fully functional.
