The China National GeneBank Sequence Archive (CNSA) 2024 update
Weiwen Wang, Cong Tan, Ling Li, Xia Li, Lei Zhang, Xiaoqiang Li, Jieyu Wang, Ziyi He, Tao Yang, Kailong Ma, Qingjiang Hu, Wenzhen Yang, Zhiyong Li, Mingwen Zhang, Wensi Du, Fan Yang, Zhicheng Xu, Xizheng Ma, Jiawei Tong, Jia Cai, Cong Hua, Fengzhen Chen, Lijin You, Liang Li, Wenjun Zeng, Bo Wang, Xun Xu, Xiaofeng Wei
Horticulture Research, 2025, Vol. 12, Issue 5: 36
The China National GeneBank Sequence Archive (CNSA) is an open, freely accessible, curated data repository built for archiving, sharing, and reusing multiomics data. The remarkable advancement of sequencing technologies has triggered a paradigm shift in life science research, but it also poses tremendous challenges for the research community in data management and reusability. The dramatic advance of technologies such as spatial transcriptome sequencing has brought an unprecedented explosion in sequence data and new requirements for data archiving. CNSA was established in 2017 as one of the fundamental infrastructures offering multiomics data archiving to the worldwide research community. Here, we present the recent enhancements of CNSA, encompassing the dramatic increase in varied types of data, the latest features and services implemented in CNSA, and continued efforts to support global cooperation in biodiversity preservation and utilization. CNSA provides public archiving and open-sharing services for sequencing data and relevant metadata, including genome, transcriptome, metabolome, and proteome data from the single-cell (including spatially resolved) level to the individual and population levels, as well as further analyzed results. As of 2024, CNSA has archived >16.3 petabytes of data and provided data curation, preservation, and open-sharing services for >1581 publications from >560 institutions. It plays a pivotal role in supporting global scientific projects such as the 10,000 Plant Genomes Project. So far, CNSA has been recommended by various academic publishers such as Cell, Elsevier, and Oxford University Press. CNSA is accessible at https://db.cngb.org/cnsa/.