Cofitness network connectivity determines a fuzzy essential zone in open bacterial pangenome

  • Pan Zhang 1,2,3 ,
  • Biliang Zhang 2,4 ,
  • Yuan-Yuan Ji 1,2 ,
  • Jian Jiao 1,2 ,
  • Ziding Zhang 4 ,
  • Chang-Fu Tian 1,2
  • 1. State Key Laboratory of Plant Environmental Resilience, and College of Biological Sciences, China Agricultural University, Beijing, China.
  • 2. MOA Key Laboratory of Soil Microbiology, and Rhizobium Research Center, China Agricultural University, Beijing, China.
  • 3. Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China.
  • 4. State Key Laboratory of Livestock and Poultry Biotechnology Breeding, and College of Biological Sciences, China Agricultural University, Beijing, China.
zidingzhang@cau.edu.cn
cftian@cau.edu.cn

Received date: 07 Nov 2023

Accepted date: 24 Apr 2024

Published date: 20 Feb 2024

Copyright

2024 The Author(s). mLife published by John Wiley & Sons Australia, Ltd on behalf of Institute of Microbiology, Chinese Academy of Sciences.

Abstract

Most in silico evolutionary studies assume that core genes are essential for cellular function, whereas accessory genes are dispensable, particularly in nutrient-rich environments. However, this assumption is seldom tested genetically in a pangenome context. In this study, we conducted a pangenomic Tn-seq analysis of fitness genes in a nutrient-rich medium for Sinorhizobium strains with a canonical open pangenome. To evaluate the robustness of fitness category assignment, Tn-seq data from three independent mutant libraries per strain were analyzed by three methods. The Hidden Markov Model (HMM)-based method was the most robust to variation between mutant libraries and was insensitive to data size, outperforming the Bayesian and Monte Carlo simulation-based methods. Consequently, the HMM method was used to assign fitness categories. Fitness genes, categorized as essential (ES), growth-advantage (GA), and growth-disadvantage (GD) genes, are enriched in core genes, whereas nonessential (NE) genes are over-represented in accessory genes. Accessory ES/GA genes showed a lower fitness effect than core ES/GA genes. Connectivity degrees in the cofitness network decrease in the order of ES, GD, and GA/NE. In addition to accessory genes, 1599 out of 3284 core genes display differential essentiality across the tested strains. Within the pangenome core, both shared quasi-essential (ES and GA) genes and strain-dependent fitness genes are enriched in similar functional categories. Our analysis demonstrates a considerable fuzzy essential zone determined by cofitness connectivity degrees in the Sinorhizobium pangenome and highlights the power of the cofitness network in understanding the genetic basis of ever-increasing prokaryotic pangenome data.

Cite this article

Pan Zhang, Biliang Zhang, Yuan-Yuan Ji, Jian Jiao, Ziding Zhang, Chang-Fu Tian. Cofitness network connectivity determines a fuzzy essential zone in open bacterial pangenome[J]. mLife, 2024, 3(2): 277-290. DOI: 10.1002/mlf2.12132
