Quantifying taxonomy-function associations across hierarchical scales of bacterial nitrogen cycling
Mingli Jiang, Yanni Huang, Zhiming Wu, Qian Zhu, Qian Li, Kaihua Pan, Mingliang Zhang, Liang Shi, Jiguo Qiu, Pengfa Li, Xin Yan, Yiyong Zhu, Qing Hong
Soil Ecology Letters, 2026, Vol. 8, Issue (5): 260460
Microbes drive global nitrogen cycling, yet the extent to which taxonomic identity is associated with functional potential across bacterial diversity remains poorly quantified. Using 73472 representative bacterial genomes, we develop a quantitative framework integrating Information Gain analysis, functional classification, and molecular evolutionary analysis across six nitrogen cycling pathways and five taxonomic ranks. Association strength increases monotonically from phylum to genus level across all six pathways, with genus-level associations ranging from 39.5% (ANRA) to 67.5% (DNRA) among pathways with substantial prevalence (NIT excluded due to its extremely low global prevalence of 0.3%, which mathematically amplifies normalized IG values). Hierarchical clustering identifies four class-level functional archetypes—Functionally Inactive, Functionally Moderate, N-Retention Dominant, and Nitrification Specialist—among 77 bacterial classes, largely stable across genome source environments. At genus level, 1281 genera resolve into five ecological strategies spanning from single-direction specialists to genera maintaining both nitrogen retention and loss capacities. Molecular evolutionary analysis of 13 genes reveals that sequence conservation operates partially independently of pathway-level functional associations, generating four systematic decoupling patterns: congruent conservation for NF (nifH), under-conservation for DNRA (nrfA), over-conservation for DNN (napA), and gene-specific heterogeneity within DNF (nirS versus nirK). This framework establishes quantitative baselines that enable probabilistic inference of nitrogen cycling capabilities from taxonomic composition, with applications in amplicon-based community analysis, targeted cultivation, and biogeochemical modeling.
nitrogen cycling / taxonomy-function associations / functional archetypes / comparative genomics / sequence conservation
● Establishes a quantitative framework for predicting nitrogen cycling potential from bacterial taxonomic identity.
● Reveals that taxonomy-function association strength is pathway-specific, enabling targeted functional inference with explicit reliability estimates.
● Identifies class-level functional archetypes and genus-level ecological strategies as intrinsic genomic properties stable across sampling environments.
● Reveals that pathway-level taxonomy-function associations and gene-level sequence conservation operate as partially independent dimensions, manifesting as four systematic decoupling patterns.
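
The abstract reports taxonomy-function association strength as a normalized Information Gain (IG) percentage, but does not state the exact normalization. The sketch below assumes one common choice: IG between a taxon label and the presence or absence of a pathway, divided by the entropy of the presence/absence variable. The function names and the toy genus labels are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def normalized_information_gain(taxa, has_pathway):
    """IG(pathway; taxon) / H(pathway), a value in [0, 1].

    ASSUMPTION: normalizing by H(pathway) is an illustrative choice,
    not necessarily the definition used in the paper.
    """
    if len(taxa) != len(has_pathway):
        raise ValueError("taxa and has_pathway must have the same length")
    h_y = entropy(has_pathway)
    if h_y == 0.0:
        # Pathway present in all genomes or in none: nothing to explain.
        return 0.0
    # Group pathway calls by taxon to get the conditional entropy H(Y | T).
    groups = defaultdict(list)
    for taxon, present in zip(taxa, has_pathway):
        groups[taxon].append(present)
    n = len(taxa)
    h_cond = sum(len(g) / n * entropy(g) for g in groups.values())
    return (h_y - h_cond) / h_y


# Toy data (hypothetical): eight genomes, one phylum, two genera.
pathway = [1, 1, 1, 1, 0, 0, 0, 0]
phylum = ["P"] * 8
genus = ["A"] * 4 + ["B"] * 4

print("phylum:", normalized_information_gain(phylum, pathway))
print("genus: ", normalized_information_gain(genus, pathway))
```

In this toy case the genus label fully determines pathway presence while the phylum label carries no information, mirroring the increase from phylum to genus level described in the abstract. Because the denominator H(pathway) shrinks as a pathway becomes rare, a normalization of this kind can inflate values for very low-prevalence pathways, which is consistent with the caveat given for NIT.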
Higher Education Press