Background: Superpixel segmentation is a powerful preprocessing tool for reducing the complexity of image processing. Traditionally, size uniformity has been one of the defining features of superpixels. However, in medical images, in which subject scale varies greatly and background areas are often flat, uniformly sized superpixels rarely conform to the varying content. To obtain the fewest superpixels while retaining important details, superpixel size should be chosen carefully.
Methods: We propose a scale-adaptive superpixel algorithm that relaxes the size-uniformity criterion for medical images, especially pathological images. A new path-based distance measure and a superpixel region-growing scheme allow our algorithm to generate superpixels at different scales according to the complexity of the image content, that is, smaller superpixels in color-rich areas and larger superpixels in flat areas.
Results: The proposed algorithm generates superpixels that adhere to boundaries, are insensitive to noise, and range from extremely large to extremely small within a single image. It produces far fewer superpixels than size-uniform superpixel algorithms while retaining more image detail.
Conclusion: With the proposed algorithm, the choice of superpixel size is automatic, which frees the user from having to set a suitable superpixel size for a given application. Results on the nuclear dataset show that the proposed algorithm is superior to the respective state-of-the-art algorithms in both quantitative and qualitative comparisons.
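To make the idea of a path-based distance concrete, the following is a minimal sketch, not the authors' measure or region-growing scheme. It assumes a grayscale image, 4-connectivity, and a path cost equal to the summed absolute intensity differences along the path (a standard geodesic distance). The function name `grow_region` and the threshold `max_cost` are hypothetical. Under these assumptions, a region grown in a flat area becomes large, while one grown in a textured area stays small.

```python
import heapq
import numpy as np

def grow_region(image, seed, max_cost):
    """Grow a region from `seed` using Dijkstra's algorithm on the pixel grid.

    Assumption (not from the source): the cost of a path is the sum of
    absolute intensity differences between consecutive pixels on it.
    A pixel joins the region if its cheapest path from the seed costs
    at most `max_cost`.
    """
    h, w = image.shape
    dist = np.full((h, w), np.inf)
    dist[seed] = 0.0
    heap = [(0.0, seed)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[r, c]:
            continue  # stale heap entry, a cheaper path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                nd = d + abs(float(image[nr, nc]) - float(image[r, c]))
                if nd <= max_cost and nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist <= max_cost  # boolean mask of the grown region

# Toy image: flat left half, high-contrast checkerboard right half.
image = np.zeros((6, 6))
rows, cols = np.indices((6, 3))
image[:, 3:] = 50 + 50 * ((rows + cols) % 2)

flat_region = grow_region(image, (0, 0), max_cost=5.0)
textured_region = grow_region(image, (0, 4), max_cost=5.0)
print(int(flat_region.sum()), int(textured_region.sum()))
```

With the same cost budget, the seed in the flat half covers the whole flat area, whereas the seed in the checkerboard cannot cross any neighboring edge and remains a single pixel.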
Background: Inference of population structure is crucial for studies of human evolutionary history and for genome-wide association studies. While several genomic regions have been reported to distort population structure analysis of European populations, no systematic analysis has been performed on non-European continental groups or with the latest human genome assembly.
Methods: Using the 1000 Genomes Project high coverage whole-genome sequencing data from four major continental groups (Europe, East Asia, South Asia, and Africa), we developed a statistical framework and systematically detected genomic regions with unusual contributions to the inference of population structure for each of the continental groups.
Results: We identified and characterized 27 unusual genomic regions mapped to GRCh38, including 13 regions around centromeres, 2 with chromosomal inversions, 8 under natural selection, and 4 with unknown causes. Excluding these regions would result in a more interpretable population structure inferred by principal components analysis and ADMIXTURE analysis.
Conclusions: Unusual genomic patterns in certain regions can distort the inference of population structure. Our compiled list of these unusual regions will be useful for many population-genetic studies, including those of non-European populations. Availability: The code to reproduce our results is available on GitHub (/dwuab/UnRegFinder).
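The step of excluding flagged regions before principal components analysis can be sketched as follows. This is only an illustration of the masking step, not the statistical framework used to detect the regions, and it is not taken from the UnRegFinder code. The function name, the single-chromosome simplification, and the example coordinates are all hypothetical.

```python
import numpy as np

def pca_excluding_regions(genotypes, positions, excluded, n_components=2):
    """PCA on a genotype matrix after dropping variants in excluded regions.

    genotypes: (n_samples, n_variants) array of 0/1/2 allele counts
    positions: (n_variants,) base-pair positions on one chromosome
    excluded:  list of (start, end) intervals, inclusive (hypothetical input)
    """
    keep = np.ones(positions.shape, dtype=bool)
    for start, end in excluded:
        keep &= ~((positions >= start) & (positions <= end))
    g = genotypes[:, keep].astype(float)
    g -= g.mean(axis=0)                # center each variant
    sd = g.std(axis=0)
    g = g[:, sd > 0] / sd[sd > 0]      # drop monomorphic variants, then scale
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :n_components] * s[:n_components], keep

# Simulated data for illustration only.
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(20, 50))
positions = np.arange(50) * 1000
pcs, keep = pca_excluding_regions(genotypes, positions, [(10_000, 19_000)])
print(pcs.shape, int(keep.sum()))
```

Comparing the sample coordinates computed with and without the mask is one simple way to see how much a given region contributes to the leading components.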
Background: Prior to the COVID-19 pandemic, modern machine learning-based models had not been harnessed to their full capacity for predicting disease trends. To our knowledge at the time of development, this work is the first to use a conditional RNN model to predict disease trends, complementing classical epidemiological approaches.
Methods: We developed a conditional long short-term memory network with quantile output (condLSTM-Q) for making quantile predictions of COVID-19 death tolls.
Results: We verified that condLSTM-Q accurately predicted fine-scale, county-level daily deaths with a two-week window. The model’s performance was robust and comparable to, if not slightly better than, well-known, publicly available models. This provides unique opportunities for investigating trends within states and interactions between counties along state borders. In addition, by analyzing the importance of the categorical data, one could learn which features are risk factors that affect the death trend, giving officials handles with which to mitigate the risks.
Conclusion: The condLSTM-Q model performed robustly and provided fine-scale, county-level predictions of daily deaths with a two-week window. Given the scalability and generalizability of neural network models, this model could readily incorporate additional data sources and could be extended in a straightforward way to generate other valuable predictions, such as new cases or hospitalizations.
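Quantile predictions are commonly trained with the pinball (quantile) loss. The abstract does not state which loss condLSTM-Q uses, so the following is a generic sketch under that assumption. The function name and the example quantile levels are hypothetical.

```python
import numpy as np

def pinball_loss(y_true, y_pred, quantiles):
    """Mean pinball loss over samples and quantile levels.

    y_true:    (n,) observed values
    y_pred:    (n, q) predicted quantiles, one column per level
    quantiles: (q,) quantile levels in (0, 1)
    """
    y_true = np.asarray(y_true, dtype=float)[:, None]
    y_pred = np.asarray(y_pred, dtype=float)
    q = np.asarray(quantiles, dtype=float)[None, :]
    err = y_true - y_pred
    # Under-prediction (err > 0) is weighted by q, over-prediction by 1 - q.
    return float(np.mean(np.maximum(q * err, (q - 1.0) * err)))

levels = [0.1, 0.5, 0.9]
loss = pinball_loss([10.0], [[8.0, 10.0, 12.0]], levels)

# At the 0.9 level, predicting too low costs more than predicting too high.
too_low = pinball_loss([10.0], [[8.0]], [0.9])
too_high = pinball_loss([10.0], [[12.0]], [0.9])
print(loss, too_low, too_high)
```

The asymmetry is what pushes each output toward its own quantile: a high-level output is penalized mainly for falling below the observation, a low-level output mainly for exceeding it.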
Background: Due to the limited availability and high cost of the reverse transcription-polymerase chain reaction (RT-PCR) test, many studies have proposed machine learning techniques for detecting COVID-19 from medical imaging. The purpose of this study is to systematically review, assess and synthesize research articles that have used different machine learning techniques to detect and diagnose COVID-19 from chest X-ray and CT scan images.
Methods: A structured literature search was conducted in the relevant bibliographic databases to ensure that the survey solely centered on reproducible and high-quality research. We selected papers based on our inclusion criteria.
Results: In this survey, we reviewed 98 articles that fulfilled our inclusion criteria. We surveyed the complete pipeline of chest imaging analysis techniques related to COVID-19, including data collection, pre-processing, feature extraction, classification, and visualization. We considered both CT scans and X-rays, as both are widely used, to describe the latest developments in medical imaging for detecting COVID-19.
Conclusions: This survey provides researchers with valuable insights into different machine learning techniques and their performance in the detection and diagnosis of COVID-19 from chest imaging. The challenges and limitations of detecting COVID-19 with machine learning techniques, and future research directions, are also discussed.
Background: Genome-wide association studies (GWAS) have identified thousands of non-coding genomic variants statistically associated with many human traits and diseases, including cancer. However, the functional interpretation of these non-coding variants remains a significant challenge in the post-GWAS era. Alternative polyadenylation (APA) plays an essential role in post-transcriptional regulation for most human genes. By employing different poly(A) sites, genes can either shorten or extend the 3′-UTRs that contain cis-regulatory elements such as miRNA or RNA-binding protein binding sites. Therefore, APA can affect mRNA stability and translation, as well as the cellular localization of proteins. Population-scale studies have revealed many inherited genetic variants that potentially impact APA and thereby influence disease susceptibility and phenotypic diversity, but systematic computational investigations to delineate these connections are in their earliest stages.
Results: Here, we discuss the evolving definitions of the genetic basis of APA and the modern genomics tools to identify, characterize, and validate the genetic influences of APA events in human populations. We also explore the emerging and surprisingly complex molecular mechanisms that regulate APA and summarize the genetic control of APA that is associated with complex human diseases and traits.
Conclusion: APA is an intermediate molecular phenotype that can translate common human non-coding variants into individual phenotypic variability and disease susceptibility.
Background: The concept of phase separation has been used to describe and interpret physicochemical phenomena in biological systems for decades. Many intracellular macromolecules undergo phase separation, which, owing to its unique dynamic properties and biological effects, plays important roles in gene regulation, cellular signaling, metabolic reactions, and other processes. Given the evident importance of phase separation, pioneering researchers have explored the possibility of introducing synthetically engineered phase separation to achieve useful cell functions.
Results: In this article, we illustrate the application value of phase separation in synthetic biology. We describe the main states of phase separation in detail, summarize some ways to implement synthetic condensates and several methods to regulate phase separation, and provide a substantial number of illustrative examples to illuminate the applications and perspectives of phase separation in synthetic biology.
Conclusions: Multivalent interactions implement phase separation in synthetic biology. Small molecules, light control and spontaneous interactions induce and regulate phase separation. Synthetic condensates are widely used in signal amplification, designer orthogonal non-membrane-bound organelles, metabolic pathways, gene regulation, signal transduction and controllable platforms. Studies on quantitative analysis, more standardized modules and precise spatiotemporal control of synthetic phase separation may promote the further development of this field.
Background: The concept of the biomolecular condensate was put forward recently to emphasize the ability of certain cellular compartments to concentrate molecules; these compartments comprise proteins and nucleic acids with specific biological functions, from ribosome biogenesis to RNA splicing. Because of their unique role in biological processes, it is crucial to investigate their composition, which is a primary determinant of condensate properties.
Results: Since a wide range of macromolecules comprise biomolecular condensates, researchers need to investigate them using high-throughput methodologies, because low-throughput experiments are not efficient enough. These high-throughput methods usually purify libraries of interacting proteins from condensates before analyzing them by mass spectrometry. For specific condensates, it is possible to extract the organelles whole for further analysis; however, most condensates lack a distinguishable marker or are too sensitive to shear force to be extracted whole. Affinity tagging gives a comprehensive view of the proteins interacting with a target molecule, yet only proteins with strong bonds may be pulled down. Proximity labeling serves as a complementary method that labels more dynamic proteins with weaker interactions, increasing sensitivity while decreasing specificity. Image-based fluorescent screening takes another path, automatically scanning images to reveal the condensation state of biomolecules within membraneless organelles, a capability that the mass spectrometry-based methods above do not offer.
Conclusion: This review presents a rough glimpse into high-throughput methodologies for biomolecular condensate investigation to encourage usage of bioinformatic tools by researchers in relevant fields.
Background: The availability of vaccines provides a promising solution to contain the COVID-19 pandemic. However, it remains unclear whether large-scale vaccination can succeed in containing the COVID-19 pandemic, and how soon. We developed an epidemiological model named SUVQC (Susceptible-Unquarantined-Vaccined-Quarantined-Confirmed) to quantitatively analyze and predict the epidemic dynamics of COVID-19 under vaccination.
Methods: In addition to the impact of non-pharmaceutical interventions (NPIs), our model explicitly parameterizes key factors related to vaccination, including the duration of immunity, vaccine efficacy, and daily vaccination rate. The model was applied to the daily reported numbers of confirmed cases in Israel and the USA to explore and predict trends under vaccination based on their current epidemic statuses and intervention measures. We further provide a formula for designing a practical vaccination strategy, which simultaneously considers the effects of the basic reproductive number of COVID-19, the intensity of NPIs, the duration of immunological memory after vaccination, vaccine efficacy and the daily vaccination rate.
Results: In Israel, 53.83% of the population is fully vaccinated, and under the current NPI intensity and vaccination scheme, the pandemic is predicted to end between May 14, 2021, and May 16, 2021, assuming immunity persists for 180 days to 365 days. If NPIs are not implemented after March 24, 2021, the pandemic will end later, between July 4, 2021, and August 26, 2021. For the USA, if we assume the current vaccination rate (0.268% per day) and intensity of NPIs, the pandemic will end between January 20, 2022, and October 19, 2024, assuming immunity persists for 180 days to 365 days. However, assuming immunity persists for 180 days and no NPIs are implemented, the pandemic will not end and instead reach an equilibrium state, with a proportion of the population remaining actively infected.
Conclusions: Overall, the daily vaccination rate should be decided according to vaccine efficacy and immunity duration to achieve herd immunity. In some situations, vaccination alone cannot stop the pandemic, and NPIs are necessary to supplement vaccination and accelerate the end of the pandemic. Considering that vaccine efficacy and duration of immunity may be reduced for new mutant strains, it is necessary to remain cautiously optimistic about the prospect of ending the pandemic under vaccination.
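The interplay between vaccination rate, immunity duration, vaccine efficacy and NPIs can be illustrated with a deliberately simple toy calculation. This is not the SUVQC model or the formula from the paper. It assumes a two-compartment balance dV/dt = v(1 - V) - V/D for the vaccinated fraction V, with per-capita daily vaccination rate v and mean immunity duration D, and the classical herd-immunity threshold 1 - 1/R. The vaccination rate and immunity durations are taken from the abstract; the efficacy (0.9), basic reproductive number (3.0) and NPI reduction (0.7) are hypothetical values chosen only for illustration.

```python
def steady_state_protection(daily_rate, immunity_days, efficacy):
    """Long-run protected fraction in the toy model dV/dt = v(1 - V) - V/D.

    The steady state is V* = v / (v + 1/D); the protected fraction is
    efficacy * V*. This is a simplification, not the SUVQC model.
    """
    waning = 1.0 / immunity_days
    return efficacy * daily_rate / (daily_rate + waning)

def herd_immunity_threshold(r0, npi_reduction=0.0):
    """Classical threshold 1 - 1/R_eff, with R_eff = R0 * (1 - npi_reduction)."""
    r_eff = r0 * (1.0 - npi_reduction)
    if r_eff <= 1.0:
        return 0.0  # transmission already declines without any immunity
    return 1.0 - 1.0 / r_eff

rate = 0.00268  # 0.268% per day, the USA rate quoted in the abstract
short = steady_state_protection(rate, immunity_days=180, efficacy=0.9)
long_ = steady_state_protection(rate, immunity_days=365, efficacy=0.9)
no_npi = herd_immunity_threshold(3.0)
with_npi = herd_immunity_threshold(3.0, npi_reduction=0.7)
print(round(short, 3), round(long_, 3), round(no_npi, 3), with_npi)
```

In this toy setting, shorter immunity lowers the long-run protected fraction, and NPIs lower the threshold that protection must reach, which mirrors the qualitative message of the conclusions without reproducing any of the reported dates.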
Background: A novel coronavirus (SARS-CoV-2) was identified in January 2020 as the causal pathogen of COVID-19, a pandemic that started near the end of 2019. The angiotensin-converting enzyme 2 (ACE2) protein, used by SARS-CoV as a receptor, was found to facilitate infection by SARS-CoV-2, which is initiated by the binding of the spike protein to human ACE2.
Methods: Using homology modeling and molecular dynamics (MD) simulation methods, we report here the detailed structure and dynamics of the ACE2 in complex with the receptor binding domain (RBD) of the SARS-CoV-2 spike protein.
Results: The predicted model is highly consistent with the experimentally determined structures, validating the homology modeling results. Besides the binding interface reported in the crystal structures, novel binding poses are revealed from all-atom MD simulations. The simulation data are used to identify critical residues at the complex interface and provide more details about the interactions between the SARS-CoV-2 RBD and human ACE2.
Conclusion: Simulations reveal that the RBD binds to both the open and closed states of ACE2. Two human ACE2 mutants and rat ACE2 were modeled to study the effects of mutations on RBD binding to ACE2. The simulations show that the N-terminal helix and K353 are very important for tight binding of the complex, and that the mutations alter the binding modes of the CoV2-RBD to ACE2.