1 Introduction
Achieving precise and efficient genome-editing is an important goal in life science research that has promoted the constant development and optimization of genome-editing tools. Since its advent in 2012, CRISPR/Cas9 gene-editing technology has rapidly spread worldwide and become the most widely used genome-editing tool in laboratories due to its powerful editing efficiency and simple operation [
1–
3].
The effectiveness of the CRISPR/Cas9 system has been demonstrated in a number of organisms [
4–
7]. Many traits of plants and animals and many known human genetic diseases are caused by point mutations [
8–
10], which require precise base-editing to create an organism with favorable traits, as well as to authentically mimic and treat point mutation-derived human genetic diseases. However, introducing precise point mutations is a challenge for the CRISPR/Cas9 system because of its poor editing efficiency, off-target effects, and requirement for donor DNA templates. The opportunity to address these issues did not arise until the advent of base editors.
Base editors (BEs), newly developed genome-editing tools derived from the CRISPR/Cas9 system, can directly and efficiently convert one target base pair into another in DNA and RNA without inducing a DNA double-strand break (DSB) or requiring donor DNA templates in living cells [
11–
14]. Hence, they have been rapidly adopted to install or correct point mutations in bacteria, plants, animals, and human embryos, exhibiting widespread applications in various fields such as basic research in life science, agriculture, and biomedicine [
10,
15–
18]. According to the ClinVar database, approximately two-thirds of the known human pathogenic genetic variants are point mutations [
9]. Therefore, efficient installation or correction of pathogenic point mutations is of great significance for the study and treatment of human genetic diseases.
In 2016, the first base editor, the cytosine base editor (CBE), a derivative of the CRISPR/Cas system, was shown to convert C·G base pairs to T·A base pairs efficiently and precisely without inducing DSBs or requiring DNA templates [
11,
12]. CBEs were a milestone, as they rapidly brought genome-editing technology into the era of single-base resolution. In the following year, another base editor, the adenine base editor (ABE), which efficiently converts an A·T base pair to a G·C base pair, came into use [13], further promoting the development of base-editing tools. However, the two original classes of base editors cannot meet all therapeutic needs, as they can reverse only approximately 61% of pathogenic point mutations [
9,
10].
In recent years, researchers have made great efforts in the development and optimization of BEs and have developed more than 100 optimized base editors with improved editing efficiency, precision, specificity, targeting scope, and capacity to be delivered in vivo [18], enriching the arsenal of base editors and expanding their application potential in biomedicine. In terms of functions, in addition to CBEs [
11,
12] and ABEs [
13], C-to-G base editors (CGBEs) [
19,
20] and adenine transversion base editors (AYBEs) [
21] enable base substitutions; dual-base editors that enable concurrent adenine and cytosine editing, such as STEME [
22], ACBE [
23], and AGBE [
24], have also been developed. According to editing substrates, in addition to the nuclear DNA base editors mentioned above, RNA base editors (such as REPAIR [
14] and RESCUE [
25]) and mitochondrial DNA base editors (such as DdCBE [
26] and TALED [
27]) have also been developed. In theory, the BEs reported to date can correct almost all types of pathogenic single-nucleotide variants [
10], but their editing accuracy is relatively limited. To overcome this limitation, David R. Liu and colleagues developed prime editors (PEs) that can mediate all 12 possible base-to-base conversions without requiring DSBs, which substantially expands the scope of genome-editing at single-base resolution [
28]. Here, Tab.1 provides a rough comparison of the three typical CRISPR-based genome-editing agents, which may be helpful for the optimal choice of editing agent in different applications.
With high efficiency and precision of gene editing, base editors have been employed in the biomedicine field for disease modeling [
15,
16], treatment of human genetic diseases [
29], directed protein evolution [
30], identification of drug targets [
31–
35], and unravelling cell lineage and fate determination [
36,
37].
The objective of this review is to present an overview of recent advances in base editors, including their development and medically relevant applications, and to discuss existing challenges and future perspectives for therapeutic research.
2 Classification and development of base editors
Currently, there are more than 100 base-editing tools that can create precise base conversions and random mutations, available for meeting requirements in biomedical applications [
10,
16,
38–
40] (Fig.1). These base editors can be classified into three categories according to their distinct target substrates: nuclear DNA base editors, mitochondrial DNA base editors, and RNA base editors (Fig.2 and Tab.2).
2.1 Nuclear DNA base editors
The nuclear genome represents the most important genetic material in cells, and base editors for nuclear DNA have attracted increasing attention. CBEs (C-to-T conversion) and ABEs (A-to-G conversion) are the first two base editors for base transitions in the nuclear genome [
11–
13]. New base editors enabling base transversions (such as C-to-R and A-to-Y conversions) [
19–
21], as well as dual-deaminase base editors [
22–
24,
41–
43] have also been developed.
2.1.1 Cytosine base editors (CBEs)
The CBE was first developed by David R. Liu’s group in 2016 [
11]. In the CBE system, a fusion of cytosine deaminase and Cas9 is guided to the target by a sequence-specific sgRNA, where an “R-loop” complex is formed by the sgRNA and the targeted DNA strands. Cytosine (C) within the exposed local ssDNA in the R-loop is converted to uracil (U) by the cytosine deaminase, and U is then read as thymine (T) during DNA replication, resulting in C-to-T conversion (or G-to-A conversion on the opposite strand). Initially, a cytosine deaminase derived from rat (rAPOBEC1) was fused to the amino terminus of dCas9 to yield BE1, which converted C to T at targets with poor efficiency [11]. To increase the conversion efficiency, BE3 was developed by fusing a uracil DNA glycosylase inhibitor (UGI) to the C-terminus of BE1 and replacing dCas9 with a Cas9(D10A) nickase (nCas9). UGI inhibits the base-excision repair pathway, while the nick induced by nCas9 mimics newly synthesized or damaged DNA. The resulting BE3 could achieve C-to-T conversion with efficiency up to 74.9% in living cells, far exceeding that of Cas9 nuclease-mediated homology-directed repair (HDR) [
11]. Unlike conventional nuclease tools (such as ZFNs, TALENs, and CRISPR/Cas), base editors enable precise and efficient base substitutions without DSBs or donor DNA templates, showing significant advantages in effectiveness, precision, and safety. The advent of BE3 represented a new milestone in the development of genome-editing technology. In the same year, Nishida
et al. [
12] reported another type of CBE, named Target-AID, created by fusing a cytosine deaminase from sea lamprey (PmCDA1) to the C-terminus of nCas9(D10A). Whereas rAPOBEC1 in BE3 edits GC motifs poorly, PmCDA1 in Target-AID achieves equivalent efficiency for GC and other motifs, indicating that different cytosine deaminases can confer different editing activities on CBEs.
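The deamination logic described in this subsection can be illustrated as a simple string operation. The following is a minimal sketch, not a model of real editing outcomes: it assumes an idealized editor that converts every C inside the activity window, a protospacer written 5'→3' with position 1 at the PAM-distal end, and the 4–8 window reported for BE3. The function names and the example sequence are hypothetical.

```python
# Idealized CBE outcome on a protospacer (illustrative sketch only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def cbe_edit(protospacer: str, window=(4, 8)) -> str:
    """Convert every C inside the activity window to T (C -> U -> T).

    Positions are 1-based and counted from the PAM-distal end.
    Real editors convert only a fraction of the Cs; this assumes 100%.
    """
    start, end = window
    edited = []
    for pos, base in enumerate(protospacer.upper(), start=1):
        if base == "C" and start <= pos <= end:
            edited.append("T")
        else:
            edited.append(base)
    return "".join(edited)


# Hypothetical 20-nt protospacer with Cs at positions 3, 4, and 5.
protospacer = "GACCCAGTTCAGGCATCGAA"
edited = cbe_edit(protospacer)

# On the opposite strand the same edits read as G-to-A changes.
opposite_changes = [
    (ref, alt)
    for ref, alt in zip(reverse_complement(protospacer), reverse_complement(edited))
    if ref != alt
]
```

In this example only the Cs at positions 4 and 5 fall inside the window, so the C at position 3 is left untouched, and `opposite_changes` contains two G-to-A substitutions.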
2.1.2 Adenine base editors (ABEs)
C-to-T mutations can mimic approximately half of the human pathogenic point mutations [
9]; thus, more new base editors are needed to tackle genetic diseases with other mutations. Theoretically, deamination of adenine (A) yields inosine (I), which is recognized as guanine (G) during DNA synthesis. However, there is no known adenine deaminase that acts on DNA. David R. Liu and colleagues thus evolved an adenine deaminase variant accepting DNA as a substrate from wild-type
Escherichia coli TadA (ecTadA) [
13]. The evolved TadA*7.10 variant harboring 14 amino acid mutations was used to develop an ABE system, named ABE7.10, which achieved approximately 50% A-to-G conversion efficiency and high product purity in human cells [
13]. ABE7.10 is formed by the fusion of heterodimeric ecTadA-TadA*7.10 to the N-terminus of nCas9(D10A), in which the heterodimer is expected to improve the editing efficiency of ABE, given that TadA natively acts as a homodimer on tRNA [
13]. Interestingly, a subsequent study demonstrated that wild-type ecTadA is dispensable for ABE editing and that a miniature version of ABE without ecTadA exhibits comparable editing efficiencies in human cells, which would benefit therapeutic applications, given that smaller vectors are easier to deliver to target cells [
44].
2.1.3 Glycosylase base editors or C-to-G base editors (GBEs/CGBEs)
Unlike base transitions (pyrimidine-to-pyrimidine or purine-to-purine) [10], transversions between pyrimidines and purines are difficult to achieve because of the significant structural differences between the two base classes. To circumvent this obstacle, researchers have attempted to harness the endogenous base excision repair pathway of cells. In theory, the U produced by deamination of a C can be excised by uracil DNA N-glycosylase (UNG) to form an abasic site, which initiates DNA repair and can be replaced by any of the four bases. Prompted by this idea, two groups independently reported a new type of base editor, GBE and CGBE, by replacing UGI in CBEs with UNG. In GBE/CGBE systems, a C is converted preferentially to a G in mammalian cells and to an A in bacteria, with efficiencies up to 63% and 87%, respectively [
19,
20]. In the study reported by Kurt
et al., the UNG component was not essential for C-to-G conversion in CGBEs, given that miniCGBE without UNG only showed a modestly decreased editing efficiency. This result was important for further optimization of CGBEs with reduced sizes and lower indel (insertion and deletion) frequencies [
20]. Based on systematic examination and optimization for different components and architectures, other versions of CGBE have been developed by several other groups [
45–
47]. In a recent study, the adenine deaminase TadA*8e variant with an N46L mutation was re-engineered for cytosine deamination, and the resulting Td-CGBE was capable of highly efficient and precise C-to-G conversions [
48]. GBEs/CGBEs currently engineered offer diverse editing performances at different target loci, enabling efficient and high-purity C-to-G conversion in mammalian cells.
2.1.4 Adenine transversion base editors (AYBEs)
Inspired by the design of GBEs/CGBEs, an AYBE for A-to-T and A-to-C transversions was recently developed based on the ABE [21]. In the AYBE system, an engineered human N-methylpurine DNA glycosylase protein (MPG; also known as alkyladenine DNA glycosylase (AAG)) is fused to the C-terminus of ABE8e for excision of ABE-induced hypoxanthine (Hx) in damaged DNA, and the resulting abasic site can be repaired to any of the four bases. AYBE can perform efficient A-to-C and A-to-T conversions at the same target, leading to diverse outcomes, which compromises the specificity and predictability of base editing [
21]. A similar study was performed in rice by fusing TadA*8e and the nCas9 variant SpGn (with NG-PAM) with
Escherichia coli endonuclease V (EndoV) and human AAG, respectively, but no A-to-Y editing was detected in these two editors [
49]. Although both AYBEs and GBEs/CGBEs can efficiently execute expected base conversions such as C-to-G, A-to-C, and A-to-T and have performed proof-of-concept for correcting disease-related mutations, substantial efforts are needed to improve editing precision for therapeutic applications [
19–
21,
45,
46]. However, AYBEs and GBEs/CGBEs are potentially suited for saturation mutagenesis, which requires enriched mutation patterns to reveal the relationship between traits and mutation patterns.
2.1.5 Dual-deaminase base editor
Dual-deaminase base editors, which can convert two types of nucleotides, have also developed rapidly. By combining two deaminase domains, a dual-deaminase base editor enables concurrent adenine and cytosine editing. We and other groups have developed several dual-deaminase base editors, which can be categorized into two classes. Members of the first class are derived from the combination of CBE and ABE, such as STEME, A&C-BE, SPACE, Target-ACE, ACBE, and sgBE, and can simultaneously introduce C-to-T and A-to-G mutations on the same allele of the target [
22,
23,
41–
43,
50]. Available experimental results indicate that these dual-deaminase base editors have superior capabilities in simultaneous dual-base-editing compared to the co-expression of different single-base editors. Additional efforts have been made to expand the applicational scope of dual-deaminase base editors, for example, relieving PAM restrictions with the Cas9 variant [
51], achieving simultaneous C-to-T and A-to-G conversions on the same allele with a dual-guide-RNA strategy [
52], and developing smaller dual base editors containing a new version of CBEs derived from evolved TadA variants with a single deaminase [
53,
54]. The second class is derived from the combination of base editors that can generate multiple types of base conversions, such as GBE/CGBE and AYBE. For example, AGBE, a dual base editor derived from fusion of ABE with CGBE, can simultaneously introduce four types of base conversions (C-to-G, C-to-T, C-to-A, and A-to-G) with the same sgRNA [
24]. These types of dual-deaminase base editors have an obvious advantage over single-base editors in saturation mutagenesis at target sites [
22,
24,
41–
43], which is useful for gene functional screening, single-cell lineage tracing, and directed evolution of proteins of interest.
2.2 Mitochondrial DNA base editors
In addition to the nuclear genome, eukaryotic cells contain another relatively independent genome, mitochondrial DNA (mtDNA), a circular DNA that exists in multiple copies in mitochondria [
55]. Manipulation of mtDNA has long been hampered by technical limitations, though tools for nuclear DNA modification have undergone a burst of development in recent years, especially since the CRISPR/Cas systems were introduced [
18,
55]. Because the mitochondrial membrane acts as a barrier, guide RNA cannot be efficiently delivered into mitochondria, which hinders the use of CRISPR/Cas-derived base editors for mtDNA editing [
56,
57]. There are other factors, of course, that may constrain mtDNA base-editing, such as the multi-copy status of mtDNA and the unique maternal inheritance pattern and genetic bottleneck effect. Base editing of mtDNA was not available until the development of CRISPR-free DddA-derived cytosine base editors (DdCBEs) in 2020, which enabled efficient C-to-T base conversions of mtDNA in cultured human cells [
26]. Unlike the widely used CRISPR/Cas-based base editors, DdCBEs consist of transcription activator-like effector (TALE) arrays, a split cytosine deaminase (DddAtox), and UGI. DddAtox from Burkholderia cenocepacia catalyzes deamination of C within double-stranded DNA (dsDNA) and is toxic to cells. Accordingly, DddAtox was split into two inactive halves (DddAtox-N and DddAtox-C), which were separately fused to the C-termini of a pair of mitochondrially targeted TALE arrays. The deamination activity of the two DddAtox halves is restored on target mtDNA when the programmable TALE arrays bind adjacent sites. DdCBEs are the first developed mtDNA base editors and have been applied in mice, zebrafish, rats, and human embryos [
58–
63]. Furthermore, significant progress has been made for DdCBEs in terms of efficiency, sequence preference, size, types of base conversion, and specificity [
27,
62,
64–
72]. More recently, Cho et al. presented TALE-linked deaminases (TALEDs), which are composed of TALE, a catalytically impaired DddAtox, and an engineered adenosine deaminase TadA*8e, enabling efficient A-to-G conversion in human mitochondria [
27], representing another vital advancement in mtDNA base editing.
2.3 RNA base editors (RBEs)
RNA base editors introduce base changes at the RNA level; such modifications are reversible and are thought to reduce risk in potential clinical applications. RNA base editing was first attempted long ago, when the adenosine deaminase acting on RNA (ADAR) from
Xenopus was used to introduce A-to-I editing in the target RNA, as assisted by complementary RNA oligonucleotides [
73]. To alleviate unintended off-targets from exogenous ADAR, strategies that leverage endogenous ADAR were subsequently developed, resulting in RESTORE (recruiting endogenous ADAR to specific transcripts for oligonucleotide-mediated RNA editing) and LEAPER (leveraging endogenous ADAR for programmable editing of RNA) [
74–
76]. With the discovery of the CRISPR/Cas system, the RNA-targeted protein Cas13b was used to engineer RNA base editors, with REPAIR (RNA editing for programmable A-to-I replacement) representing the first tool [
14,
77]. REPAIR was constructed with a catalytically inactive Cas13b and a catalytic domain from ADAR2, which enables A-to-I editing at endogenous transcripts with high efficiencies and specificity [
14]. To further perform C-to-U RNA base editing, ADAR2 in REPAIR was re-engineered to be endowed with cytosine deaminase activity, and the resulting RESCUE (RNA editing for specific C-to-U exchange) is capable of C-to-U editing as well as A-to-I editing [
25,
77]. Huang and colleagues reported the first cytidine-specific C-to-U RNA editor developed by fusing APOBEC3A to dCas13 [
78]. Han and colleagues reported an artificial guide-RNA-free system, REWIRE (RNA editing with individual RNA-binding enzymes) with further optimization of editing and delivery efficiency [
79]. The successful development of these RNA base editors for A-to-I and C-to-U editing will undoubtedly facilitate the development of additional types of RNA base editors in the future, such as U-to-C editing [
80].
2.4 Prime editors for all types of base substitutions
The diverse toolboxes of base editors have provided opportunities for several types of base substitutions, though a more flexible and versatile system for all types of base substitutions was lacking until the emergence of prime editors (PEs) [
81]. The second-generation prime editing (PE2) system consists of an nCas9(H840A) nickase, an engineered reverse transcriptase (RTase), and an extended guide RNA (named pegRNA), and achieves gene editing based on a “search-and-replace” strategy [28]. After the site of interest is targeted, nCas9(H840A) nicks the DNA, and the RTase incorporates the desired genetic information encoded in the pegRNA. The outcomes of prime editing rely on the design of the pegRNA, which serves as the template for the RTase. In theory, all types of small edits, including precise single- or multiple-base substitutions, insertions, and deletions, can be realized via PEs. When an additional sgRNA is used to nick the non-edited strand in the PE2 system, the resulting PE3/PE3b shows significant improvement in efficiency [
28]. Since their first appearance, PEs have been employed in various animals [
82–
84]. However, the compromised efficiency hinders the widespread application of PEs; thus, many optimization solutions have been proposed, such as intervention in cellular repair pathways, optimization of PE architecture, modification of pegRNA, modulation of chromatin accessibility, and adoption of dual-pegRNA [
85–
92]. Overall, the PE system is more complex in architecture and less predictable in efficiency than CBEs and ABEs, which complicates its application.
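To make the phrase "all 12 possible base-to-base conversions" concrete, the sketch below enumerates the 12 substitutions, classifies each as a transition or transversion, and marks which are reachable by the nuclear DNA base editor classes described in Section 2.1. It is a simplification under stated assumptions: only the preferential conversions named in this review are listed (C-to-T for CBE, A-to-G for ABE, C-to-G for CGBE in mammalian cells, A-to-C and A-to-T for AYBE), and a conversion on one strand is counted together with its complement on the opposite strand. The dictionary and function names are hypothetical.

```python
from itertools import permutations

BASES = "ACGT"
PURINES = {"A", "G"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Preferential conversions named in this review, on the deaminated strand.
EDITOR_CONVERSIONS = {
    "CBE": [("C", "T")],
    "ABE": [("A", "G")],
    "CGBE": [("C", "G")],
    "AYBE": [("A", "C"), ("A", "T")],
}


def is_transition(ref: str, alt: str) -> bool:
    """Purine<->purine or pyrimidine<->pyrimidine changes are transitions."""
    return (ref in PURINES) == (alt in PURINES)


def covered_substitutions() -> dict:
    """Map each reachable substitution to the first editor class providing it.

    Editing the opposite strand yields the complementary substitution,
    so each conversion is registered together with its complement.
    """
    covered = {}
    for editor, conversions in EDITOR_CONVERSIONS.items():
        for ref, alt in conversions:
            for pair in ((ref, alt), (COMPLEMENT[ref], COMPLEMENT[alt])):
                covered.setdefault(pair, editor)
    return covered


all_substitutions = list(permutations(BASES, 2))  # 12 ordered pairs
transitions = [s for s in all_substitutions if is_transition(*s)]
covered = covered_substitutions()
uncovered = sorted(set(all_substitutions) - set(covered))
```

Under these assumptions, 10 of the 12 substitutions are covered, and the remaining pair (C-to-A and its complement G-to-T) is the kind of gap that prime editors are designed to close.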
3 Optimization of base editors
Base editors have revolutionized genome-engineering technology to precise single-base resolution with considerable efficiency. To fully exploit their potential, especially for therapeutic applications, further optimization is needed to improve their effectiveness and safety [
16]. In this section, we focus mainly on optimization of the two most common base editors, CBEs and ABEs.
3.1 Enhancing on-target efficiencies
The activity of the deaminases in base editors acts as the main determinant of on-target editing. The original cytosine deaminase (rAPOBEC1 in BE3) and adenine deaminase (TadA*7.10 in ABE7.10) are not the most active [
11,
13]. Other natural or engineered cytosine deaminases (such as human APOBEC3A, human APOBEC3B, evoAPOBEC1, evoFERNY, and evoCDA1) [
93–
98] and adenine deaminases (such as TadA*8) [
99–
101] have been adopted to elevate the editing efficiencies of CBEs and ABEs, respectively. Codon optimization and nuclear localization sequence modification are also two universal strategies for improving efficiency [
102,
103], given that the expression and localization of the fused proteins intrinsically limit their function. Additional components have been introduced into base editors to increase on-target efficiencies, such as employing dCas9 to bind the proximal location of the target [
97], fusing a single-stranded DNA-binding protein domain from Rad51 [
104], or co-expressing a dominant negative fragment of p53 [
105]. Overall, combining multiple strategies will contribute to a synergistic effect to improve on-target editing.
3.2 Broadening the targeting scope
CRISPR/Cas-based base editors are intrinsically limited by the requirement for certain PAM sequences, for instance, NGG PAM for SpCas9. To broaden the targeting scope, other natural Cas orthologs (such as LbCas12a recognizing TTTV PAM and Nme2Cas9 recognizing N4CC PAM) have been adopted to construct various base editors [
96,
97,
99,
106,
107], yet their efficiencies are compromised compared to SpCas9-derived editors. Alternatively, SpCas9 can be engineered through directed evolution or structure-guided engineering, and SpCas9 variants (such as xCas9, SpCas9-NG, SpCas9-NRRH, SpCas9-NRTH, SpCas9-NRCH, SpG, and SpRY) can recognize non-canonical PAMs in addition to NGG, where SpRY represents a near-PAMless variant [
108–
111]. Similarly, Cas12a variants (such as enAsCas12a, impLbCas12a, enLbCas12a, RR-LbCas12a, and RVR-LbCas12a) have also been engineered to develop base editors for non-canonical PAM recognition [
101,
112,
113]. Given that base editing typically occurs within a certain activity window at the target, such as positions 4–8 for BE3 and 4–7 for ABE7.10 [
11,
13], these PAM-relaxed Cas variant-derived base editors enable tuning of spacer design to position bases of interest within the activity window for efficient and precise substitutions [
111]. In addition, particular architectures of base editors or more robust deaminases can contribute to wider activity windows, expanding the target scope [
94,
95,
99,
114,
115]. Of particular note is that wider windows, in turn, mean compromised precision.
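The interplay between PAM requirement and activity window can be sketched as a small search problem: given a target base, find spacers whose PAM lies at the right distance so that the base falls inside the window. The code below is an illustrative sketch only. It assumes a 20-nt spacer immediately 5' of the PAM, scans a single strand, counts positions from the PAM-distal end, and expresses the PAM as a regular expression (`[ACGT]GG` for NGG; `[ACGT]G` as a stand-in for an NG-type relaxed PAM). The function name and the example sequence are hypothetical.

```python
import re


def find_guides(seq, target_index, pam="[ACGT]GG", window=(4, 8), spacer_len=20):
    """List (spacer, pam, position) for sites that place seq[target_index]
    inside the activity window.

    target_index is 0-based in seq; position is 1-based within the spacer,
    counted from the PAM-distal end. Only the given strand is scanned.
    """
    seq = seq.upper()
    hits = []
    # A lookahead is used so that overlapping PAM matches are all reported.
    for match in re.finditer(rf"(?=({pam}))", seq):
        spacer_start = match.start() - spacer_len
        if spacer_start < 0:
            continue  # not enough sequence upstream for a full spacer
        position = target_index - spacer_start + 1
        if window[0] <= position <= window[1]:
            hits.append((seq[spacer_start:match.start()], match.group(1), position))
    return hits


# Hypothetical sequence; the target C sits at index 6.
seq = "TTATGACTCAGTTAGCATTCGATGGAT"
ngg_hits = find_guides(seq, target_index=6)
relaxed_hits = find_guides(seq, target_index=6, pam="[ACGT]G")
```

With the NGG requirement there is a single usable spacer that puts the target at position 5, whereas the relaxed PAM yields a second spacer that puts it at position 4, which illustrates how PAM-relaxed variants give more freedom in positioning the target base.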
3.3 Decreasing off-target editing
Off-target editing is harmful in most cases and poses safety risks in clinical applications. An optimal base editor should minimize undesired off-target editing while maintaining high on-target efficiency. Off-target editing by nuclear DNA base editors and RNA base editors can occur in the nuclear genome and transcriptome in a Cas-dependent or Cas-independent manner [
116–
121], whereas mtDNA off-target editing can occur in the nuclear genome as well as the mitochondrial genome [
69,
70]. For Cas-dependent off-target effects, engineered high-fidelity Cas9 variants seem to be an ideal option [
122]. A novel strategy involving dual guiders (including sgRNA and TALE) is also capable of eliminating the Cas-dependent off-targeting of ABEs and CBEs [
123,
124]. However, more effort is needed to reduce the off-target editing induced by deaminases, and various deaminase variants have been reported to reduce genome- or transcriptome-wide off-target effects [
44,
99,
118–
121,
125,
126]. An effective transformer BE (tBE) system with a cleavable deoxycytidine deaminase inhibitor domain was developed by Wang
et al. [
127] and eliminates genome-wide and transcriptome-wide off-target effects, albeit with a more complex system. Given that ABEs show lower Cas-independent off-target activity than CBEs and that TadA enzymes have the potential to induce cytosine deamination, TadA*8 and other TadA orthologs have been re-engineered to perform C-to-T editing, resulting in TadA-derived CBEs with lower off-target effects, smaller sizes, and considerable efficiencies [
48,
53,
54,
128–
130]. Additional strategies have also been proposed, for example, delivering base editors as ribonucleoproteins (RNPs) or mRNAs can mitigate off-target effects on both DNA and RNA due to their shorter duration [
131], and inhibiting APOBEC3A with anti-deaminases (Ades) is an alternative approach to decrease Cas-dependent and Cas-independent off-target effects [
132].
3.4 Minimizing by-products and bystander effects
Precise base editing of targets is affected by undesired by-products and bystander effects. CBEs and ABEs can perform C-to-non-T and A-to-non-G conversions, respectively [
30,
35,
128,
133–
135]. The mixed outcomes of BE3 involving UNG and occurring in a target site-dependent manner reduce the product purity, which may be improved by introducing additional UGI and Gam proteins, referred to as BE4/BE4-Gam [
134]. Previous studies have indicated that ABEs could induce cytosine deamination [
128,
135–
137], and this undesired activity might be significantly reduced by introducing a D108Q mutation in TadA*7.10 [
128]. Bystander editing arises from the presence of multiple editable bases within the activity window, which seems to have a greater impact on precision at the target. Common strategies to decrease bystander editing include narrowing the activity window and conferring certain sequence preferences with engineered deaminases. For example, engineered YE1-BE3 and YEE-BE3 variants substantially narrow the activity window from ~5 to ~2 nucleotides [
138,
139]; an ABE9 variant refines the editing window to 1–2 nucleotides [
140], and engineering the linker in the base editors is capable of changing the window width [
141]. The hAPOBEC3A variant with an N57G mutation shows preferential deamination of C in TCR and TCCR motifs [
97,
101,
142], whereas TadA*7.10 and TadA*8e exhibit a preference in TA and CA motifs and inefficiency with the AA motif [
101]. Notably, to improve editing precision, on-target efficiencies of most CBE and ABE variants are often sacrificed to some extent, though some appear to retain high efficiency.
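Bystander editing as described above is essentially a counting problem: any editable base other than the intended target that sits inside the activity window is a potential bystander. The sketch below assumes a 20-nt protospacer with positions counted from the PAM-distal end, the roughly 5-nt window reported for BE3 (positions 4–8), and a hypothetical 2-nt window at positions 5–6 to stand in for a narrowed-window variant. The function names and the example sequence are illustrative assumptions.

```python
def editable_positions(protospacer, base, window):
    """Return 1-based positions of `base` that fall inside the window."""
    start, end = window
    return [
        pos
        for pos, b in enumerate(protospacer.upper(), start=1)
        if b == base and start <= pos <= end
    ]


def bystanders(protospacer, target_pos, base="C", window=(4, 8)):
    """Editable bases inside the window other than the intended target."""
    return [p for p in editable_positions(protospacer, base, window) if p != target_pos]


# Hypothetical protospacer with Cs at positions 4, 5, 7, 14, and 18.
protospacer = "TGACCTCAGTTAGCATTCGA"

wide = bystanders(protospacer, target_pos=5, window=(4, 8))
narrow = bystanders(protospacer, target_pos=5, window=(5, 6))
```

Here the wide window exposes two bystander Cs (positions 4 and 7) next to the target at position 5, while the narrowed window exposes none, at the cost of requiring the target to sit in a much smaller range of positions.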
3.5 Reducing size
The size of base editors directly influences ex vivo and in vivo delivery; thus, a reduction in size is beneficial for delivery efficiency [16]. Adeno-associated virus (AAV) is one of the most promising vehicles for in vivo delivery, but its low packaging capacity limits its application in base editor delivery [
143]. Therefore, reducing the size of base editor architecture is necessary for therapeutic applications. Smaller Cas orthologs, such as SaCas9, CjCas9, Nme2Cas9, SauriCas9, and Un1Cas12f1, have been employed to develop compact base editors [
130,
144–
146], which are able to be packaged into a single AAV vehicle for delivery. In practice, these miniature Cas-derived base editors usually exhibit lower editing efficiency, which needs to be improved. It is also possible to reduce the size of another important component of base editors. Grünewald
et al. suggested that wild-type ecTadA is dispensable for ABE7.10 and described a smaller miniABE with reduced off-target RNA-editing activity and comparable on-target DNA-editing activity [
44].
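The size constraint can be illustrated with back-of-the-envelope arithmetic. The sketch below is not taken from the cited studies: the packaging limit (about 4.7 kb) and the protein lengths (about 1368 amino acids for SpCas9, 1053 for SaCas9, and 167 for a TadA monomer) are approximate, commonly quoted values used here as assumptions, linkers and nuclear localization signals are ignored, and the 800-bp allowance for promoter and polyadenylation signal is hypothetical.

```python
# Approximate, commonly quoted AAV packaging limit (assumption, not from this review).
AAV_CAPACITY_BP = 4700


def cassette_size_bp(protein_lengths_aa, regulatory_bp=0):
    """Estimate cassette size: 3 bp per residue, one stop codon, plus
    any promoter/polyA allowance supplied by the caller."""
    coding_bp = sum(protein_lengths_aa) * 3 + 3
    return coding_bp + regulatory_bp


def fits_single_aav(protein_lengths_aa, regulatory_bp=0, capacity_bp=AAV_CAPACITY_BP):
    """True if the estimated cassette fits within the packaging limit."""
    return cassette_size_bp(protein_lengths_aa, regulatory_bp) <= capacity_bp


# Approximate lengths in amino acids (assumptions for illustration).
SPCAS9, SACAS9, TADA = 1368, 1053, 167

# SpCas9-based ABE with a TadA dimer, coding sequence only.
sp_abe_bp = cassette_size_bp([SPCAS9, TADA, TADA])

# SaCas9-based ABE with a single TadA and a hypothetical 800-bp regulatory allowance.
sa_mini_abe_bp = cassette_size_bp([SACAS9, TADA], regulatory_bp=800)
```

Under these assumptions, the SpCas9-based coding sequence alone already exceeds the limit, whereas the smaller ortholog with a single deaminase leaves room for regulatory elements, which is the rationale for both size-reduction strategies described in this subsection.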
4 Applications of base editors in biomedicine
To date, various base editors have been employed to modify genomes among different species, including bacteria, plants, animals, and discarded human embryos [
10,
15–
18], due to their high on-target editing efficiency and relatively few undesired off-target effects. In this section, we focus on summarizing the applications of base editors in the medical field, which shows the tremendous therapeutic potential of base editing (Fig.3).
4.1 Regulation of gene expression
4.1.1 Disruption of gene expression
Disruption of gene expression through gene knockout (KO) is an effective approach to studying gene function by the CRISPR/Cas system [
147–
150]. Leveraging base editors provides an efficient alternative to disrupt gene expression, which can be realized by creating premature termination codons (PTCs) in exons or altering start codon sequences. Two studies published in 2017 confirmed that BE3 enables efficient disruption of genes by precisely converting four codons (CAA, CAG, CGA, and TGG) to PTCs within protein-coding regions [
151,
152]. An ABE-mediated gene disruption approach, named i-Silence, converts start codons from ATG to GTG or ACG; it has been used to analyze ~17 804 human genes and to mimic 147 kinds of diseases caused by start codon mutations [
1]. In addition, gene-knockout strategies based on base editing can be used for disease modeling and therapeutics [
152–
155] as well as genetic screening [
156]. For example, CBE has been employed to generate a diabetic canine model by introducing a PTC in the
GCK gene [
157].
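The set of codons that a CBE can turn into premature termination codons follows directly from the genetic code, and can be derived by brute force. The sketch below assumes an idealized CBE that makes C-to-T changes on the coding strand or, when it acts on the opposite strand, G-to-A changes as read on the coding strand; it allows one or more edits within a codon but does not mix the two strands in a single event. The function names and the example coding sequence are hypothetical.

```python
from itertools import combinations, product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def edited_variants(codon, ref, alt):
    """All codons obtained by changing one or more `ref` bases to `alt`."""
    indices = [i for i, base in enumerate(codon) if base == ref]
    variants = set()
    for r in range(1, len(indices) + 1):
        for combo in combinations(indices, r):
            chars = list(codon)
            for i in combo:
                chars[i] = alt
            variants.add("".join(chars))
    return variants


def ptc_codons():
    """Map each sense codon to the stop codons reachable by CBE edits."""
    table = {}
    for codon in ("".join(p) for p in product("ACGT", repeat=3)):
        if codon in STOP_CODONS:
            continue
        reachable = (
            edited_variants(codon, "C", "T")    # deamination on the coding strand
            | edited_variants(codon, "G", "A")  # deamination on the opposite strand
        ) & STOP_CODONS
        if reachable:
            table[codon] = sorted(reachable)
    return table


def find_ptc_sites(cds):
    """Return (codon_number, codon) for in-frame codons convertible to a stop."""
    cds = cds.upper()
    table = ptc_codons()
    return [
        (i // 3 + 1, cds[i:i + 3])
        for i in range(0, len(cds) - 2, 3)
        if cds[i:i + 3] in table
    ]


# Hypothetical coding sequence: ATG CAG TGG AAA CGA TAA
sites = find_ptc_sites("ATGCAGTGGAAACGATAA")
```

The brute-force table recovers exactly the four codons named above (CAA, CAG, CGA, and TGG), and the scan of the example sequence reports codons 2, 3, and 5 as candidate sites. Whether a given site is actually editable additionally depends on PAM availability and window placement.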
4.1.2 Transcriptional regulation
Alternative splicing plays an important role in post-transcriptional regulation in eukaryotic cells, and its abnormal regulation is highly relevant to many human diseases [
158,
159]. Base editing-based technologies enable programmable modulation of splicing events. Several groups have used CBEs and ABEs to disrupt or restore conserved splice-site motifs and to activate cryptic splice sites, thereby controlling isoform-specific gene expression [
24,
160–
162]. Therefore, splice-site mutation by base editing has a wide range of applications, including disease modeling and treatment [
163,
164] and elucidation of gene functions [
165,
166].
4.2 Generation of human genetic disease models
The paramount utility of base editors is their ability to install pathogenic point mutations efficiently and cleanly in DNA, as point mutations represent the vast majority of known human genetic variants [
9]. To date, base editors have made great contributions to human genetic disease modeling in diverse systems, accelerating the research ranging from basic study and drug discovery to targeted therapeutic intervention [
167] (Tab.3).
Base-editing systems have been applied for human disease modeling in many animal species. Non-mammalian animals such as zebrafish [
168,
169] and
Xenopus laevis [
170] have been used to mimic human oculocutaneous albinism (OCA) and congenital myasthenic syndrome (CMS). As the most frequently used model organism for biomedical research, base editor-based rodent models have been extensively reported for numerous human diseases, such as Duchenne muscular dystrophy (DMD) [
133], PCG deficient-mediated infertility [
171], androgen insensitivity syndrome (AIS) [
136], ocular albinism (OA), OCA [
163], familial Alzheimer’s disease (fAD) [
172], and glycogen storage disease type II (GSDII) [
173]. Similar studies have been performed on other large animal species, such as rabbits [
174], dogs [
157], pigs [
175,
176] and monkeys [
177], which helps bridge the gap between rodents and humans in research.
Specific nucleotide changes in mtDNA are associated with a range of human maternally heritable genetic diseases and age-related diseases [
55,
178,
179]. The recently emerged mtDNA base editors have been used to conduct precise mtDNA manipulation in animals such as zebrafish [
59], mice [
58,
65,
180], rats [
60] and even in discarded human embryos [
61,
62]. Mouse models have been produced with m.G12918A and m.C12336T in the
MT-ND5 gene [
58,
65] and m.G2820A mutation in the
MT-ND1 gene [
180], which encode subunits of NADH dehydrogenase, an enzyme that catalyzes NADH oxidation and is essential for the electron transport chain. These models demonstrate the potential of mtDNA base editors to generate animal models of mitochondrial disease for pre-clinical trials of gene therapy for mitochondrial disorders.
4.3 Correction of disease-associated mutations
4.3.1 Correction of disease-associated mutations in nuclear DNA
Considering that the majority of known human hereditary disease-associated mutations occur in nuclear DNA, base editors hold great promise for directly correcting these mutations or providing therapeutic restoration in somatic tissues. To date, this concept has been verified using a variety of animal models with disease-associated point mutations through
in vivo adeno-associated virus (AAV)-mediated delivery of base editor agents. Two genetic metabolic diseases, tyrosinaemia [
181] and hypercholesterolaemia [
182–
184], have been cured or prevented by editing the disease-associated genes in hepatocytes, with editing efficiencies in liver cells of up to 38% achieved using an optimized delivery strategy [
185]. Yeh
et al. used base editors to restore auditory function by editing the transmembrane channel-like 1 (TMC1) [
186] and β-catenin genes [
187] in the inner ear. Suh
et al. restored visual function in adult mice with an inherited retinal disease, Leber congenital amaurosis (LCA), via adenine base editing [
188]. Koblan
et al. rescued Hutchinson–Gilford progeria syndrome (HGPS) in mice [
189]. Ryu
et al. [
163] and Xu
et al. [
190] corrected a nonsense mutation in the
Dmd gene of muscle cells in mouse models of DMD, a neuromuscular disease caused by a deficiency of dystrophin, and successfully rescued progressive muscle degeneration. In addition, many studies have shown that base editing is an efficient treatment for hereditary haematological and cardiovascular diseases, such as sickle cell disease (SCD) [
41,
191–
194], dilated cardiomyopathy [
195], cardiac diseases caused by ischemia/reperfusion injury [
196], β-thalassemia [
41,
192–
194], and hypertrophic cardiomyopathy (HCM) [
197,
198].
Unlike precise genome-editing approaches that rely on the cell cycle-dependent HDR pathway [
199,
200], base editing mediates target base pair conversions through the cell cycle-independent base excision repair mechanism [
11], and thus holds great promise for installing or correcting hereditary disease-associated mutations in non-dividing cells. For example, Lim
et al. reduced the expression of mutant SOD1 in amyotrophic lateral sclerosis by introducing a nonsense substitution [
154], and Levy
et al. corrected the mutation causing Niemann–Pick disease type C in the mouse brain [
185].
Additionally, base editing is a useful approach for prophylactic treatment of genetic diseases. Substantial evidence indicates that naturally occurring nonsense variants in the human proprotein convertase subtilisin/kexin type 9 (
PCSK9) gene result in significant reductions in blood cholesterol levels and an 88% reduction in the risk of coronary heart disease. Hence, disruption of the
PCSK9 gene
in vivo seems to be a therapeutic alternative for familial hypercholesterolaemia [
201–
203]. As a proof of principle,
in vivo base editing was used to introduce nonsense variants into the murine
Pcsk9 gene, with the goal of prolonged reduction of blood cholesterol levels [
182,
204,
205]. Similar efforts were made in non-human primate models [
183,
184], demonstrating that delivery of ABEs to the liver by lipid nanoparticles (LNPs) led to efficient knockout of the
PCSK9 gene in healthy cynomolgus macaques. These findings opened a new door for prevention and treatment of atherosclerotic cardiovascular disease in the future.
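The idea of introducing nonsense variants with a cytosine base editor can be sketched in a few lines of Python. The sketch below scans a coding sequence for codons that a single C-to-T change (or a G-to-A change resulting from C-to-T editing of the opposite strand) converts into a stop codon under the standard genetic code. The function name and the example sequence are hypothetical, and the sketch makes no claim about how the cited studies selected their targets; in particular, it ignores PAM requirements and the editing window.

```python
# Sense-strand codons that one C-to-T edit turns into a stop codon.
CBE_STOP_SENSE = {"CAA": "TAA", "CAG": "TAG", "CGA": "TGA"}

# TGG (Trp) becomes a stop codon through G-to-A changes on the sense strand,
# which correspond to C-to-T edits on the antisense strand.
CBE_STOP_ANTISENSE = {"TGG": ("TAG", "TGA", "TAA")}


def find_cbe_stop_sites(cds):
    """Return (codon number, codon, edited strand) for candidate codons.

    `cds` is an in-frame coding sequence; codon numbering starts at 1.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    sites = []
    for start in range(0, len(cds), 3):
        codon = cds[start:start + 3]
        number = start // 3 + 1
        if codon in CBE_STOP_SENSE:
            sites.append((number, codon, "sense"))
        elif codon in CBE_STOP_ANTISENSE:
            sites.append((number, codon, "antisense"))
    return sites


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real PCSK9 fragment.
    print(find_cbe_stop_sites("ATGCAGTGGAAACGA"))
```

Only four of the 61 sense codons qualify, so the number of candidate sites in a gene is limited, and each candidate still has to be checked against the editor's targeting constraints.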
4.3.2 Correction of disease-associated mutations in mRNA
In addition to the commonly used DNA base editors, RNA base editors have recently been used in disease-related therapeutics. One of the crucial advantages of RNA base-editing is its reversibility, whereas permanent nucleotide alteration of DNA potentially causes genome instability and presents safety risks. Therefore, temporary RNA base editing is a safer alternative for gene therapy. Researchers in Feng Zhang’s lab used two RNA editors, REPAIR [
14] and RESCUE [
25] to correct Fanconi anemia-relevant A-to-I mutations and ear trauma-relevant C-to-U mutations in HEK293FT cells, respectively, initially confirming the potential of CRISPR/Cas13-derived RNA base editors for therapeutic application. Later, RNA base editors derived from smaller Cas13 proteins, such as compact Cas13bt [
77] and truncated Cas13X [
206,
207], or from compact Cas6e [
208], were designed to fit the packaging size of the AAV delivery system, and were used to treat autosomal dominant hearing loss and DMD in mouse models. The other two types of RNA base editors, endogenous ADAR-dependent [
75,
209–
211] and CRISPR-free RNA base editors [
212], which achieve high RNA-editing efficiency without evoking immune responses, are more promising RNA-editing tools for therapeutics. For example, LEAPER, which employs short engineered ADAR-recruiting RNAs to recruit native ADAR1 or ADAR2 enzymes to convert specific adenosines to inosines, can restore α-L-iduronidase catalytic activity in Hurler syndrome patient-derived primary fibroblasts without evoking innate immune responses [
75].
4.3.3 Correction of disease-associated mutations in mitochondrial DNA
Mitochondrial DNA mutations manifest mainly as nucleotide changes, resulting in maternally inherited diseases that affect multiple organs and systems [
213–
215]. However, no effective therapy is available so far. The recent advent of DddA-derived mtDNA base editors provides a promising option for correcting homoplasmic and heteroplasmic pathogenic point mutations in mtDNA [
26,
27,
66]. Their effectiveness has been verified in cells and animals [
26,
55]. To date, because too few suitable animal models for
mtDNA diseases are available, correction of disease-associated mutations in mitochondrial DNA has not yet been reported. However, a trial of
in vivo mitochondrial base editing using healthy mice by Pinheiro
et al. showed that DdCBE-mediated mtDNA editing is possible in post-mitotic tissue upon AAV delivery [
63].
4.4 Clinical trials
Base editing has shown promising results in permanently or temporarily reversing pathogenic mutations in a variety of animal models. The aforementioned studies on the correction or disruption of disease-causing mutations in rodents provide evidence that base editors are a potential treatment for human genetic diseases. Because large animal models can bridge the gap between basic research in rodents and clinical trials in humans, pre-clinical studies in such models are critical for paving the way for
in vivo administration of base editors to patients in the clinic [
216–
218]. Therefore, data collected from pre-clinical trials in large animal models, especially in non-human primates [
10,
16,
189,
205,
210], promote clinical trials for treating human genetic diseases with base editors.
Currently, there are two strategies,
ex vivo and
in vivo, for human disease treatments based on base editing. For the
ex vivo strategy, the cells collected from the patients are modified
ex vivo and then reintroduced into the patients; for the
in vivo approach, editing agents are delivered to patients to modify the genome directly [
219–
222]. In 2022, 13-year-old Alyssa received the world's first base-edited chimeric antigen receptor (CAR) T cells to treat T-cell acute lymphoblastic leukemia (T-ALL). These T cells had been edited with a base editor
ex vivo and reinjected into Alyssa to attack leukemic T cells. Six months after the CD7 CAR-T cell therapy and a second bone marrow transplant, she was leukemia-free. This milestone research paves the way for therapeutic applications of base editors. Notably, the FDA approved clinical trial applications for two base-editing drugs in 2022: BEAM-201, a CAR-T cell therapy for treatment of T-ALL, and VERVE-101, a liver-targeted
PCSK9-silencing base editor developed for prevention and treatment of heterozygous familial hypercholesterolaemia [
205]. The first patient was administered VERVE-101 in New Zealand in July 2022. In addition to these two drugs, clinical trials are set to begin for other drugs that are designed to silence, mend, modulate, and upregulate genes by changing a single nucleotide at a specific locus in patients (Tab.4).
Theoretically, programmable genome manipulation in human embryos offers the possibility of a permanent cure of genetic diseases. The technical feasibility has been verified by using discarded human tripronuclear (3PN) embryos [
223–
226]. The first study using a CBE to correct a mutation causing β-thalassemia in human embryos was reported in 2017, demonstrating the feasibility of curing genetic disease in human embryos with base editor systems [
227]. Subsequently, the Marfan syndrome pathogenic
FBN1 T7498C mutation was corrected by BE3 in embryos [
228]. The editing efficiencies were relatively low at the one-cell stage in human embryos, limiting the utility of base editing in therapeutic applications. Yang and colleagues compared editing frequencies in human 3PN embryos injected with BE3 mRNA and sgRNA at different developmental stages and suggested that cleavage-stage human embryos provide an efficient base-editing window for robust gene correction [
229]. Three years later, the same group efficiently induced disease-preventive mutations in human 8-cell embryos [
230].
In clinical trials, safe and efficient
in vivo delivery of base editing agents to organs and tissues of interest is the crucial factor affecting the effectiveness of therapeutic base editing [
16,
220,
221]. AAV delivery is a viral-based method that offers many advantages in delivering macromolecular therapeutics encoded as DNA, such as low immunogenicity and toxicity [
219], expression without genomic integration, and especially the tissue-targeting specificity offered by numerous AAV serotypes [
231]. Despite these advantages, an obvious limitation of using AAVs to deliver base editors is their limited packaging capacity of approximately 4.7 kb [
164]. To overcome this limitation, split-intein base editors with dual-AAV vehicles [
232] and small base editors [
130,
144–
146] have been developed. Nevertheless, there are still issues that need to be addressed for safer therapeutic applications, including the immune response triggered by non-human proteins [
233–
235] and off-target editing caused by long-term expression of base editors [
18,
235]. LNP delivery is a non-viral method used for mRNA delivery [
236–
238]. Unlike DNA, mRNA allows for rapid expression of proteins in the cytoplasm and has a high level of safety because it does not integrate into the genome and degrades rapidly, which minimizes off-target editing [
221]. However, appropriate chemical modifications are required to maintain the stability of mRNA [
239], and the modified mRNA is encapsulated in LNPs for
in vivo delivery to target cells. Synthetic LNPs have several advantages over AAV vehicles: larger cargo capacity, lower immunogenicity, biodegradability, non-toxicity, and large-scale producibility. With these advantages, LNP-mediated
in vivo delivery is currently the most commonly used strategy in the clinical pipeline (Tab.4). Virus-like particles (VLPs) are composed of non-infectious viral proteins [
240–
242] and have the potential to address the pitfalls associated with AAV and LNP vehicles. Because VLPs lack viral genetic material, they may be safer than viral delivery methods, which may insert their genetic material into the genome of target cells. In addition, current VLPs used for mRNA, protein, or RNP delivery offer the shortest exposure to gene-editing agents and therefore the lowest potential for off-target editing compared with other delivery methods [
243], which makes VLPs one of the most attractive delivery methods for the future. Recently, a bacterial contractile injection system (CIS) for protein
in vivo delivery has been reported, raising the possibility that CIS can be harnessed for therapeutic protein delivery [
244,
245]. Nevertheless, both VLPs and CISs require further refinement before they can formally enter clinical trials. More details about delivery strategies for editing agents have been covered extensively in previous reviews [
16,
220,
221].
4.5 High-throughput genetic screening
Owing to their ability to introduce random base substitutions at target loci, base editors have been employed for high-throughput functional genetic screening. First, screening platforms with UGI-less CBEs have been used for random mutagenesis, given that they can produce C-to-A/G/T mutations. With these platforms, drug-resistance mutations in target genes and the functions of nucleotide variants can be identified through appropriate selection methods [
30–
35,
166,
246,
247]. Dual deaminase-mediated base editors further increase the diversity of editing outcomes and serve as a more flexible strategy for site-directed saturation mutagenesis [
22–
24,
42,
43,
52]. These results suggest that base editors offer a valuable tool for large-scale functional characterization of nucleotide variants and drug discovery.
4.6 Single-cell genetic lineage tracing
Genetic lineage tracing at single-cell resolution with barcodes provides new opportunities to unravel cell lineage and fate determination [
248–
250]. Unlike NHEJ-derived barcodes, base-editing barcodes can record more mutational information because base editing exhausts target sites more slowly, which benefits long-term lineage labeling experiments [
36,
251,
252]. Most importantly, base editors have a unique ability to record cell division events because base substitution is dependent on DNA replication [
11–
13]. To date, two groups have reported attempts at using base editors for genetic lineage tracing
in vivo. Hwang
et al. showed a proof-of-concept for base editor-based single-cell lineage tracing with a Cas9-deaminase barcoding system targeting endogenous L1 elements in mammalian cells [
36]. In another study, Liu
et al. developed SMALT, a substitution mutation-aided lineage tracing system with the sequence-specific DNA-binding protein iSceI [
37]. They used SMALT to introduce substitution mutations in the readout sequence in
D. melanogaster and obtained detailed information about cell division modes, which helped to reconstruct accurate, high-quality cell phylogenies. Both studies demonstrated that base editing could be a highly efficient cell barcoding method for mapping the cell phylogeny of complex organisms by single-cell genetic lineage tracing.
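To make the barcode idea concrete, the following toy Python sketch compares base-edited barcodes by Hamming distance, on the simplifying assumption that cells sharing more substitutions diverged more recently. The barcode strings, cell names, and function names are hypothetical, and this is not the analysis pipeline of either cited study; real lineage reconstruction uses proper phylogenetic methods and accounts for recurrent edits and sequencing errors.

```python
from itertools import combinations


def hamming(a, b):
    """Count positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(barcodes):
    """Map each pair of cell names (sorted order) to its Hamming distance."""
    names = sorted(barcodes)
    return {
        (m, n): hamming(barcodes[m], barcodes[n])
        for m, n in combinations(names, 2)
    }


def closest_pair(barcodes):
    """Return the pair of cells whose barcodes differ at the fewest positions."""
    distances = pairwise_distances(barcodes)
    return min(distances, key=distances.get)


if __name__ == "__main__":
    # Hypothetical barcodes derived from an unedited "CCCCCCCC" site by C-to-T edits.
    cells = {
        "cell_a": "TCCCTCCC",
        "cell_b": "TCCCTCCT",
        "cell_c": "CCTCCCCC",
    }
    print(pairwise_distances(cells))
    print(closest_pair(cells))
```

In this toy case, cell_a and cell_b share two edits and differ at a single position, so they would be grouped together before cell_c.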
4.7 DNA writers and molecular recorders
The idea of using DNA as a storage medium was first proposed in 1959 because of its dense and durable information storage capacity [
253,
254]. Recent advances in base editing reveal enormous potential for DNA writing in living cells, as base editors enable durable base changes in cells in an efficient and programmable manner. Tang and Liu presented the CAMERA system with CBEs and demonstrated its ability to faithfully record multiple stimuli and event orders in
E. coli and mammalian cells [
255]. Later, Farzadfard
et al. developed the DOMINO platform for DNA writing, which enabled long-term recording and monitoring of
in vivo molecular events [
256]. In summary, these systems can translate stimuli of interest into durable DNA changes in living cells, which is valuable for therapeutic applications, for instance, signaling the occurrence of cancer and other diseases.
5 Conclusions and future perspectives
Since its emergence in 2016, base editing technology has developed rapidly. With extensive optimization efforts, current base editors are now able to catalyze transition mutations (a purine to another purine or a pyrimidine to another pyrimidine) [
11–
14,
25] as well as transversion mutations (a purine to a pyrimidine or a pyrimidine to a purine) [
19–
21] in DNA and RNA sequences and have been greatly improved in terms of editing efficiency, precision, targeting scope, and size. The improvements in base editors allow us to repurpose them for high-throughput genetic screening, single-cell lineage tracing, and memory devices for recording cellular events [
255,
256].
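The distinction between transitions and transversions drawn above can be stated as a short Python function. This is a minimal sketch with an illustrative function name; it treats uracil as a pyrimidine so that RNA edits such as C-to-U are covered, and it does not model inosine, which is read as guanosine after A-to-I editing.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T", "U"}


def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, C, G, T, U")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # Same chemical class on both sides means a transition.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for ref, alt in [("C", "T"), ("A", "G"), ("C", "G"), ("A", "C")]:
        print(f"{ref}-to-{alt}: {classify_substitution(ref, alt)}")
```

With this definition, the C-to-T and A-to-G conversions of CBEs and ABEs are transitions, whereas conversions such as C-to-G or A-to-C are transversions.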
Most importantly, the constant development of base editors propels them towards increasingly ambitious and sophisticated applications in gene therapy for point mutation-derived human diseases. For nuclear DNA base editors, improvement of on-target editing efficiency and specificity has led to the first clinical trials of drug candidates for treating cardiovascular and haematological diseases [
257] in 2022. However, there are still several issues, such as off-target effects and genome instability caused by CRISPR/Cas proteins [
258–
260] and deaminases [
116–
119], as well as difficulty in efficiently and specifically delivering base editor agents to target cells
in vivo [
221], which need to be addressed before base editors enter clinical trials on a large scale. As a cautionary example, investigational new drug (IND) applications of VERVE-101 and BEAM-201 were suspended by the FDA at the end of 2022 due to safety concerns. RNA base editors are safer alternatives because they cause temporary base changes without affecting genome stability, and leveraging endogenous ADAR [
75,
209–
211] can alleviate unintended off-target editing by exogenous deaminases as well as immunogenic responses. Nonetheless, they are still limited by the few types of base conversion available and the lack of effective delivery methods to specific tissues. For the recently developed mtDNA base editors, many more challenges must be addressed before they can be used for translational studies from bench to bedside, such as low editing efficiency, high off-target effects, complex architecture, and limited types of base conversion. In fact, certain disease-causing genes involve diverse point mutations, which lead to differences in clinical symptoms. In this case, the best strategy is to design specific editing agents for the target mutation, which is critical for precise treatment.
In addition to the technical challenges inherent in base editors, ethical issues should be another consideration when using base editors to edit germ cells. The CRISPR-baby scandal that occurred in 2018 [
261] raised public concern about scientific ethics and research safety, and negatively influenced the long-term development of base editing in biomedicine.