Animal models are invaluable tools for understanding biology and developing therapies for human diseases. The capability of making precise alterations at chosen genomic loci in cells, or the whole animal, has been established and improved upon at an unprecedented speed, thanks to the development of genome editing technology. This mini-review briefly introduces the key developments in genome editing technology and its applications to animal model creation. Given that this is a brief review, it is not possible to include all important contributions here, and we apologize to those whose work has been unintentionally omitted.
The capability of precisely editing any position in the genome has long been a dream of biologists. Recombinant DNA technology has enabled scientists to engineer DNA molecules in the test tube with great precision since the 1970s. Transgenes can be expressed by introducing engineered DNA molecules into different cells. These DNA fragments become integrated with low efficiency into the genome of cells, leading to the addition of constructed genetic information, although at random locations. The first demonstration that exogenous DNA could be introduced into early mammalian embryos was made by Jaenisch and Mintz in 1974 [1], followed by the demonstration of germline transmission of the exogenous DNA [2]. Subsequent efforts of Gordon et al. showed that linear DNA fragments injected into the pronucleus of mouse embryos can lead to the generation of transgenic animals [3]. This technology is still being actively used for expressing exogenous genes in the mouse and other species. While transgenic animals are useful for expressing a particular transgene and for generating insertional mutations [4], integration of the DNA into the genome is random and thus this approach cannot be used to disrupt specific genes.
The seminal work of Smithies et al. [5] and Thomas et al. [6] showed that an exogenous DNA fragment could be precisely integrated into the desired genomic locus via homologous recombination (HR), a cell-intrinsic DNA repair mechanism. Although the efficiency of this gene-targeting technology [7] was initially low, the rare clones containing the desired targeting event could be enriched and selected by a variety of elegantly designed strategies, leading to the derivation of clonal cell lines containing the desired genetic modification. The combination of gene-targeting and mouse embryonic stem cell (mESC) technologies [8] made it possible to generate a mouse composed entirely of cells containing the predetermined genetic modification [9,10]. While this approach transformed modern biology and resulted in the award of a Nobel Prize, its application was limited to mice and, more recently, rats [11], mainly because embryonic stem cells with germline contribution capability have not been established in most mammalian species. Also, gene targeting in most human cell types is inefficient, thereby limiting its application for genetic correction-based therapy. How to increase the efficiency of HR therefore became a key question for improving the capability of modifying genomes of all life forms.
First in yeast and then in mammalian cells, pioneering work on DNA double-strand break (DSB) repair demonstrated that a site-specific DSB in the genome could stimulate the rate of local HR by several orders of magnitude [12,13]. It was also shown that, once a DSB was introduced, it could be repaired not only by the HR pathway but also by the non-homologous end joining (NHEJ) pathway, often leading to the introduction of small insertions and/or deletions (indels) at target sites. These results motivated the development of methods that generate DNA DSBs at a specific locus in the genome. The earlier efforts were aimed at engineering a class of rare-cutting endonucleases, the meganucleases, that recognize long stretches of DNA sequence [14]. However, it has been quite difficult to engineer naturally occurring meganucleases to bind and cleave chosen DNA target sequences, and this has greatly limited their wider application.
Zinc finger proteins (ZFPs) are one of the most abundant types of proteins in eukaryotes and the largest transcription factor family in humans [15–17]. With the coordination of zinc ions, each zinc finger domain recognizes a 3–4 bp DNA sequence, and a combination of several fingers allows for the recognition of a longer sequence. A zinc finger nuclease (ZFN) is generated by fusing a zinc finger DNA-binding domain with the FokI nuclease domain. A pair of ZFNs binds to two adjacent sites, leading to the homodimerization of FokI and generating a site-specific DNA DSB in the genome. ZFNs have been used to generate various indel mutations as well as to stimulate homology-directed repair (HDR) in different species and in human cells [18]. Upon their injection into the nuclei of Xenopus oocytes, a pair of ZFNs binding in opposite orientations induced DNA cleavage efficiently, with the linker between FokI and ZF domains constraining the spacing of the two binding sites [19]. Further, ZFNs expressed from heat-shock promoter-driven transgenes introduced into the Drosophila germline efficiently knocked out endogenous genes [20] or increased the frequency of targeted integration of transgenes into the chromosome [21]. Several studies in 2008 showed that direct injection of ZFNs into embryos of Drosophila and zebrafish could lead to the efficient generation of gene knockout (KO) animals [22–24]. In 2009, Geurts et al. showed that microinjection of ZFNs into rat zygotes led to the generation of a gene KO rat [25], demonstrating a powerful strategy to generate gene-edited animals without limitation due to HR efficiency or availability of germline-contributing embryonic stem cells.
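The paired binding geometry of ZFNs can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not a design tool: the function name `zfn_pair_sites`, the choice of three fingers per ZFN, the simplification of 3 bp per finger, and the 6 bp spacer are all assumptions made here for illustration and are not specified in this review. The sketch splits a top-strand target into a left half-site, a spacer, and a right half-site, with the right half-site reported as read 5′ to 3′ on the opposite strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def zfn_pair_sites(target: str, fingers: int = 3, spacer: int = 6,
                   bp_per_finger: int = 3) -> dict:
    """Split a top-strand target (5'->3') into the two ZFN half-sites.

    Assumptions (illustrative only): each finger reads `bp_per_finger` bp,
    both ZFNs carry the same number of fingers, and the half-sites are
    separated by a fixed `spacer` over which the FokI domains dimerize.
    """
    site_len = fingers * bp_per_finger
    expected = 2 * site_len + spacer
    target = target.upper()
    if len(target) != expected:
        raise ValueError(f"expected {expected} bp, got {len(target)}")
    return {
        "left_site": target[:site_len],
        "spacer": target[site_len:site_len + spacer],
        # The second ZFN binds in the opposite orientation, so its site is
        # reported as read 5'->3' on the bottom strand.
        "right_site": reverse_complement(target[site_len + spacer:]),
    }


if __name__ == "__main__":
    print(zfn_pair_sites("GCAGCAGCA" + "TTTTTT" + "ACGACGACG"))
```

With these assumed defaults, each ZFN reads 9 bp, so the pair together specifies 18 bp of sequence flanking the cleavage region.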
Although a variety of strategies have been established to generate ZFPs with designed specificity, they often require significant expertise and intensive screening. This has limited the wider adoption of ZFN technology within the scientific community. In 2009, two groups independently deciphered the mechanism by which transcription activator-like effectors (TALEs) recognize target DNA sequences and found a remarkably simple modular recognition code [26,27]. Depending on the repeat-variable di-residue (RVD), each module of the highly repetitive TALE DNA binding domain can specifically recognize a single DNA base pair. Based on the RVD-nucleotide association, a TALE DNA binding protein could be generated to recognize a sequence of choice by simply linking different modules together. The TALE nuclease (TALEN) system was readily established by replacing the ZF domain of a ZFN with the TALE DNA binding domain. The principle of TALEN is very similar to that of ZFN, the only difference being that the DNA binding specificity is determined by the array of TALE motifs. Soon afterwards, TALENs were shown to work efficiently in human cells [28,29]. Injection of TALENs into zygotes generated animals with specific gene KOs in various species, including non-human primates [30–34]. Co-injection of a donor template into zygotes led to efficient HDR, generating animals with precise nucleotide changes and targeted gene integrations [35–37]. Editing mESCs using TALENs generated the first mouse containing targeted mutations on the Y chromosome [38]. TALENs were also used in livestock species with well-established cloning technology to generate specific mutations in somatic cell lines, which were then cloned to generate genetically modified animals [39,40].
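The modular one-repeat-per-base logic of TALE design can be sketched in a few lines of Python. The RVD-to-base table below uses the commonly cited associations (NI for A, HD for C, NN for G, NG for T); these specific pairings are not listed in this review and are included here as an assumption for illustration, and the function names are hypothetical. Real designs also have to account for context effects and for the fact that some RVDs, such as NN, are not strictly specific.

```python
# Commonly cited RVD-to-base associations (an assumption for illustration;
# NN is also reported to tolerate A, which this sketch ignores).
BASE_TO_RVD = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}
RVD_TO_BASE = {rvd: base for base, rvd in BASE_TO_RVD.items()}


def design_tale_rvds(target: str) -> list:
    """Return one RVD per base of the target, in 5'->3' order."""
    target = target.upper()
    unknown = set(target) - set(BASE_TO_RVD)
    if unknown:
        raise ValueError(f"unsupported bases: {sorted(unknown)}")
    return [BASE_TO_RVD[base] for base in target]


def read_tale_rvds(rvds: list) -> str:
    """Decode an RVD array into the DNA sequence it is expected to bind."""
    return "".join(RVD_TO_BASE[rvd] for rvd in rvds)


if __name__ == "__main__":
    rvds = design_tale_rvds("TCGATACG")
    print(" ".join(rvds))
    print(read_tale_rvds(rvds))
```

Because each repeat maps to exactly one base, designing a binding domain reduces to a table lookup, which is the simplicity that set TALEs apart from zinc finger arrays.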
Although TALENs were much easier to generate than ZFNs, CRISPR (clustered regularly interspaced short palindromic repeat DNA sequences)-Cas9 readily outperformed all previous tools for generating designed DSBs and quickly became the method of choice for genome editing. CRISPR-Cas systems are adaptive immune defense mechanisms protecting against invading nucleic acids, such as plasmids and phages, in a large proportion of bacterial and archaeal species [41]. Through decades of work, researchers dissected the mechanism of the CRISPR-Cas system [42], and the biochemical characterization of the Streptococcus pyogenes and Streptococcus thermophilus CRISPR systems demonstrated that Cas9 is an RNA-guided DNA endonuclease that targets specific DNA sequences complementary to the 20-nucleotide sequence residing at the 5′ end of the guide RNA [43,44]. Upon appropriate codon optimization, CRISPR-Cas9 was able to efficiently target specific genomic loci in mammalian cells [45,46]. Simply by designing a 20-nucleotide sequence within the single guide RNA (sgRNA), one can direct the Cas9 nuclease-sgRNA complex to generate DNA DSBs at a specific genomic locus. The ease of use and robust performance of the CRISPR-Cas9 system resulted in its rapid adoption by the scientific community.
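How simple target selection becomes with a 20-nucleotide guide can be shown with a short Python sketch that scans both strands of a sequence for candidate sites. It assumes the NGG protospacer-adjacent motif (PAM) required by Streptococcus pyogenes Cas9 immediately 3′ of the target, a requirement not discussed in this review; the function names are hypothetical, and the sketch does no off-target or efficiency scoring.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_protospacers(seq: str, length: int = 20) -> list:
    """List candidate sites as (strand, offset, protospacer, pam) tuples.

    A site is `length` nt followed directly by an NGG PAM (assumed here for
    S. pyogenes Cas9). `offset` is the 0-based start on the scanned strand;
    for "-" hits it refers to the reverse-complemented sequence.
    """
    # Lookahead keeps the match zero-width so overlapping sites are reported.
    pattern = re.compile(rf"(?=([ACGT]{{{length}}})([ACGT]GG))")
    hits = []
    for strand, template in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for match in pattern.finditer(template):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


def to_guide_rna(protospacer: str) -> str:
    """Write the 20-nt guide portion of the sgRNA as RNA (T -> U)."""
    return protospacer.upper().replace("T", "U")


if __name__ == "__main__":
    demo = "TTGACCTGAAGTTCATCTGCACCACCGGCAAG"
    for strand, offset, protospacer, pam in find_protospacers(demo):
        print(strand, offset, to_guide_rna(protospacer), pam)
```

The guide portion of the sgRNA carries the same sequence as the protospacer (with U in place of T), so choosing a target amounts to picking one of the listed sites.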
The efficiency of the CRISPR-Cas9 system makes it particularly attractive for performing multiplex gene editing. When multiple sgRNAs are co-expressed, Cas9 can generate DSBs at multiple chosen loci within the same cell [46–48]. Following the principle previously established with ZFNs and TALENs, injection of CRISPR-Cas9 into the zygote leads to highly efficient generation of genetically modified animals. Mice carrying multiple gene KOs or precise nucleotide changes can be derived within a month [47,49]. A similar approach produced genetically edited animals from an ever-increasing number of species [50–54]. To make the process even simpler, several groups developed protocols to deliver CRISPR-Cas9 components into rat and mouse zygotes by electroporation, which has been successfully adopted by the field [55–59]. In addition to modification of the zygote genome, germline modification has also been achieved in mice by genome editing of spermatogonial stem cells. After the edited cells developed into spermatids and were injected into oocytes, animals with specific genetic modifications were derived [60]. By combining gene editing and haploid stem cell technologies, an artificial sperm strategy was established as an efficient method to generate gene-edited mice [61]. Whether these strategies can be transferred to other species remains to be tested.
One of the advantages of the CRISPR-Cas9 system is the great flexibility with which its function can be repurposed. Cas9 has two nuclease domains, each capable of cleaving one of the target DNA strands. When either one of these domains is mutated, the Cas9-sgRNA complex becomes a sequence- and strand-specific nickase; and when both nuclease domains are mutated, Cas9 becomes a programmable DNA binding protein (dCas9) without any endonuclease activity. Guided by sgRNA, dCas9 fused with different effector domains can bind to regulatory elements and regulate transcription as well as epigenetic modifications [62,63]. In particular for genome editing, dCas9 or Cas9 nickase fused with a single-stranded DNA deaminase was developed into programmable DNA base editors, which can make specific nucleotide changes without introducing DNA DSBs or relying on HDR [64]. To construct cytosine base editors, a single-stranded DNA cytosine deaminase was used to mediate C·G to T·A conversions [65,66]; to generate adenine base editors capable of converting A·T to G·C, a DNA adenine deaminase was artificially evolved from the bacterial tRNA-specific adenosine deaminase TadA [67]. Following the demonstration of efficient base editing in cultured cells, multiple groups introduced base editors into the embryos of various species and demonstrated efficient generation of base-edited animal models [68–73]. These studies showed that, in general, base editing generated animals carrying precise single-nucleotide edits more efficiently than previous HDR-based genome editing strategies.
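The two conversion types described above can be illustrated with a small Python sketch that predicts the edited protospacer sequence. It is a deliberate simplification under stated assumptions: the editing window of positions 4 to 8 (counted from the PAM-distal 5′ end of the protospacer) is an illustrative value and not taken from this review, every target base inside the window is treated as converted, and the function names are hypothetical. Real editors act with position-dependent efficiency and can produce bystander or partial edits.

```python
def base_edit(protospacer: str, source: str, product: str,
              window: tuple = (4, 8)) -> str:
    """Convert every `source` base inside the editing window to `product`.

    Positions are 1-based from the 5' (PAM-distal) end of the protospacer.
    The default window is an illustrative assumption, not a measured value.
    """
    first, last = window
    edited = []
    for position, base in enumerate(protospacer.upper(), start=1):
        if base == source and first <= position <= last:
            edited.append(product)
        else:
            edited.append(base)
    return "".join(edited)


def cytosine_base_edit(protospacer: str, window: tuple = (4, 8)) -> str:
    """Model a cytosine base editor: C to T (C.G to T.A at the base pair)."""
    return base_edit(protospacer, "C", "T", window)


def adenine_base_edit(protospacer: str, window: tuple = (4, 8)) -> str:
    """Model an adenine base editor: A to G (A.T to G.C at the base pair)."""
    return base_edit(protospacer, "A", "G", window)


if __name__ == "__main__":
    site = "AAACCCAAACCCAAACCCAA"
    print(cytosine_base_edit(site))
    print(adenine_base_edit(site))
```

Bases outside the window are left unchanged, which reflects why the position of the target nucleotide within the protospacer matters when choosing an sgRNA for base editing.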
Generation of genetically edited animal models is becoming much easier with rapidly improving genome editing technology. New strategies based on recently developed prime editing [74] and newly discovered site-specific transposon systems [75,76] promise to make the generation of even more sophisticated edited animal models more efficient in the future.
The Author(s) 2020. Published by Higher Education Press. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0)