Transgenic technologies in cassava for nutritional improvement and viral disease resistance: a key strategy for food security in Africa

As a major staple food source in Africa and other tropical developing countries, cassava (Manihot esculenta) provides basic sustenance for many subsistence farmers. However, cassava roots mainly accumulate starch, with limited contribution of other nutrients such as proteins and vitamins. In addition, two viral diseases, cassava mosaic disease (CMD) and cassava brown streak disease (CBSD), cause great losses in cassava production in sub-Saharan Africa and the Indian sub-continent. Genetic engineering provides promising approaches to improve nutritional value and increase resistance to viral diseases in cassava. This report presents several successful case studies on engineering protein content by overexpression of nutritious storage proteins and improving cassava resistance to viral diseases by RNA interference. Perspectives on the sustainable acquisition of new knowledge and development of biotechnology to solve these bottlenecks are discussed.


Introduction
Cassava (Manihot esculenta) is considered a special gift to African people because it is easy to grow, tolerant of poor soil and drought conditions, and flexible in harvest time; this makes cassava critical to food security in Africa [1]. Cassava is mainly consumed as a dietary energy supply by subsistence farmers, who have limited access to other food sources. However, cassava storage roots mainly contain starch, with limited amounts of other nutrients such as proteins, vitamins and micronutrients [2,3]. Malnutrition caused by protein deficiency, termed protein energy malnutrition (PEM), is more subtle, resulting in kwashiorkor, a severe protein malnutrition, in children [4]. PEM could be easily alleviated by a healthy and diversified protein diet or even through consumption of protein-rich cassava. Therefore, enhancing protein content and quality in cassava would provide significant health benefits to cassava consumers suffering from protein malnutrition, and greatly contribute to the fight against PEM [5,6].
However, there are several other important constraints to cassava production in Africa. CMD and CBSD have been reported in many regions of Africa [7,8]. Due to the increasing incidence and prevalence of these viruses over the last decade, the CMD and CBSD pandemics are considered the most important cassava diseases in Africa and the Indian sub-continent [9]. A recent report of Sri Lankan CMD in Cambodia raised regional alarm about CMD pandemics in South Asian countries [10]. Losses due to CMD and CBSD in Africa were estimated at about 25% of total cassava production, representing 34 Mt annually [11]. To reduce the impact of CMD and CBSD on production, the development of tolerant and resistant cassava cultivars remains a promising approach, and can be achieved by hybridization or transgenesis [12,13]. With the current understanding of the pathogens responsible for CMD and CBSD and their interactions with the host, as well as the propagation and dynamics of CMD pandemics, it is possible to implement engineered virus resistance in cassava using different approaches.
Genetic engineering shows great potential in germplasm innovation by improving specific traits without changes in other important characteristics, especially with the development of genome editing technology. In the last decade, cassava genetic improvement using transgenic approaches has yielded significant progress worldwide [13,14]. Farmer-preferred cultivars have been explored to develop cassava with increased resistance to viruses and abiotic stress, enhanced nutritional value, improved starch yield and quality and prolonged shelf life. The '-omics' tools have led to intensive study of cassava, focused especially on starchy storage root development, starch accumulation, health-promoting components (e.g., β-carotene), and stress response and regulation [13,14]. In this review, we update the recent progress related to transgenic modification of cassava in protein improvement and virus resistance. As a vital component of an integrated breeding system, genetic engineering, together with functional genomics, proteomics, marker-assisted selection and traditional hybridization, has greatly enhanced the efficiency of cassava genetic improvement. Hence, the role of cassava in food security, commercialization and bioenergy development can be addressed by strengthening fundamental research and applied technology.
Cassava storage roots have a low protein content (about 1% FW), whose nutritional value is further reduced by the particularly low concentrations of the essential amino acids (EAA) lysine and leucine as well as the sulfur-containing EAA (SAA) methionine and cysteine [2,3,15,16]. According to the US Department of Agriculture, a 60 kg individual would need to consume at least 4.5 kg of cassava storage roots per day to meet the recommended daily requirements for all essential amino acids (Fig. 1) [17]. Based on our measurement of protein composition of storage roots in cassava TMS60444, very low concentrations of hydroxyproline, cysteine, methionine, tryptophan, tyrosine, and histidine are found in total protein samples. The major amino acids are glutamic acid and aspartic acid, which have at least twice the concentrations of the remaining amino acids (Fig. 2). It has been demonstrated that cyanogens provide reduced nitrogen substrates for amino acid synthesis in cassava roots and, therefore, manipulation of linamarin metabolism in cassava might elevate root amino acid pools and promote total protein biosynthesis [6]. Overexpression of hydroxynitrile lyase, the enzyme that catalyzes the conversion of acetone cyanohydrin to cyanide, in cassava roots was reported to increase overall protein content [18]. In fact, efforts to develop high-protein cassava by conventional breeding have also been made by investigating the crude protein content in cassava roots of different varieties or hybrids [19,20]. However, since the selection for high protein depended on the method of N quantification, these approaches might not reflect the real protein concentration and only limited success has been achieved.
Fig. 1 The essential amino acid (EAA) requirement in humans (a) and the amount of consumed food products needed to meet the minimum daily requirement of all EAAs (b). Data adapted from reference [17].
Given that cassava storage roots have low protein and essential amino acid concentrations in comparison with other root crops, identification of endogenous proteins with a typical storage function is a challenging objective. Unlike seed crops, such as wheat and maize, in which various storage proteins have been identified [21], few native cassava storage proteins have been described. Although thousands of proteins have been identified by different proteomics approaches, only a few of them have known functions, e.g., in post-harvest physiological deterioration [22] and carotenoid sequestration [23]. Apparently, active expression of proteins (e.g., membrane proteins) is essentially related to storage root development [24–27]. Different protein profiles were also reported in distinct cultivars [28]. Proteomic data indicate that a highly abundant protein in cassava roots (belonging to the heat shock protein family) may have functions in normal development and stress response, but not for storage purposes [23]. Unlike potato and sweet potato, which accumulate patatin and sporamin as storage proteins, respectively, no cassava storage protein has been identified [29]. Furthermore, cassava storage roots have not been reported to have protein bodies.
As an alternative, expression of valuable storage proteins from other plants appears to be the most practical strategy to accomplish nutritional biofortification of cassava storage roots. Transfer and expression of heterologous plant storage proteins has previously been widely applied for improving protein quality in crop plants [5,30]. When choosing a storage protein candidate for improvement of cassava storage root protein, several important parameters need to be considered, including potential allergenicity [31], nutritional value, stability and storability in the target tissue. For example, several seed storage proteins with a particularly high SAA content from acha (Digitaria exilis) have been used for nutritional enhancement of various crops [32,33]. The sunflower seed albumin is rich in SAA, and has been used in narrow-leafed lupin, rice endosperm, and chickpea seeds for nutritional improvement [34–36]. In addition, a number of storage proteins from starch-rich organs have been identified and might be useful, such as dioscorin, patatin and sporamin [29]. Sporamin contributes more than 80% of the total protein content in sweet potato roots. No allergenicity is known for this protein, thus making it a suitable candidate for cassava storage roots. Another protein candidate is dioscorin, which is mainly found in the tuberous storage organ of yam, contributing more than 80% of its total protein content. The two subclasses (A and B) of dioscorin differ mainly in the presence or absence of a single disulfide bond. The size of dioscorin A and its function as storage protein in yam tubers were confirmed recently [37]. Interestingly, dioscorin has antioxidative properties with radical scavenging activity, equivalent to that of glutathione at similar protein concentrations [38]. Lysine is the only limiting amino acid in dioscorin, with an amino acid score of 0.9.
Given that cassava, sweet potato and yam storage organs show structural similarities, expression of these valuable proteins in cassava storage roots might be feasible to increase protein concentration, thus improving the value of cassava storage roots for alleviation of PEM in target populations of developing tropical countries.
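The amino acid score quoted above for dioscorin can be made concrete with a small calculation. In the usual definition, each essential amino acid in the test protein is divided by its value in a reference pattern (both in mg per g protein); the lowest ratio is the score and identifies the limiting amino acid. The sketch below assumes this definition without truncating ratios above 1; the function name and all numbers are hypothetical placeholders chosen for illustration, not measured dioscorin data or an official reference pattern.

```python
def amino_acid_score(protein_mg_per_g, reference_mg_per_g):
    """Return (limiting amino acid, score) for a test protein.

    Both arguments map amino acid names to mg per g protein.
    The score is the smallest test/reference ratio.
    """
    ratios = {
        aa: protein_mg_per_g[aa] / reference_mg_per_g[aa]
        for aa in reference_mg_per_g
    }
    limiting = min(ratios, key=ratios.get)
    return limiting, ratios[limiting]


# Hypothetical values for illustration only (mg per g protein).
reference = {"lysine": 50.0, "leucine": 60.0, "methionine+cysteine": 25.0}
candidate = {"lysine": 45.0, "leucine": 72.0, "methionine+cysteine": 30.0}

limiting, score = amino_acid_score(candidate, reference)
print(limiting, round(score, 2))
```

With these placeholder numbers lysine is the only amino acid below the reference pattern, which mirrors the situation described for dioscorin.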

Improvement of protein concentration by overexpression of nutritious storage proteins
Although the above storage proteins have been identified in plants, most of them have inadequate amino acid compositions compared to reference proteins from egg, bovine milk or beef. However, such proteins could be adapted to human nutritional requirements by mutagenesis, or novel proteins could be designed with an optimized ratio of EAA. Expression of such nutrition-improved proteins in crops could provide a safe and effective means to increase their nutritional value [5].
We previously expressed an artificial storage protein (ASP1) in transgenic cassava [39]. ASP1 was designed de novo based on the structure of maize zein protein and optimized for human EAA requirements (Fig. 3). ASP1 was expressed in tobacco leaves at high concentrations, markedly increasing EAA concentrations [41]. In transgenic cassava, ASP1 expression was controlled by the constitutive CaMV 35S promoter, with high expression levels detected in leaves and storage roots [40]. The expression levels were strongly affected by the number of transgene copies in transgenic plants. When a single transgene insertion occurred, all lines showed high levels of ASP1, as assessed by Northern and Western blot analyses. Multiple copies in transgenic cassava could result in transgene silencing [40].
In a field assessment of ASP1 transgenic cassava on Hainan Island, China, six ASP1 transgenic lines showed phenotypically normal growth. After harvest, four lines produced excellent storage roots, similar to wild-type plants; the remaining two lines produced fewer roots. Protein analysis of the storage roots showed that two lines with normal storage roots had a 200% increase in protein concentration (about 3% protein), demonstrating that it is feasible to improve protein concentration by overexpression of foreign storage protein genes in cassava. Amino acid analysis of storage roots revealed altered amino acid profiles, possibly because ASP1 expression affected amino acid metabolism in the roots. So far, ASP1 expression is the only transgenic approach reported to successfully increase the nutritional value of cassava. To increase protein accumulation and deposition in cassava storage roots, we developed an approach that targets the protein to certain subcellular organelles using N- or C-terminal signal sequences, instead of cytosolic accumulation [42]. Storage organelles, such as the endoplasmic reticulum (ER), plastid, and vacuole, have been tested for storage protein accumulation in several plants, and might provide a sink for storage proteins. This approach has yet to be tested in cassava, and the most suitable subcellular localization for an exogenous storage protein in cassava storage roots still needs to be determined empirically. We recently confirmed the functionality of several signal sequences that target proteins to plastids or vacuoles, or trigger ER retention, in cassava BY2-like protoplasts (unpublished data), and are presently assessing whether localization to these compartments may increase protein levels in cassava storage roots.
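The relationship between the reported 200% increase and the resulting protein concentration of about 3% is simple arithmetic: a 200% increase means the final value is three times the baseline. The sketch below assumes a wild-type baseline of about 1%, as stated earlier for cassava storage roots, and assumes both figures are expressed on the same basis; the function names are illustrative.

```python
def concentration_after_increase(baseline_percent, increase_percent):
    """Concentration reached after a relative increase over the baseline."""
    return baseline_percent * (1 + increase_percent / 100)


def percent_increase(baseline_percent, new_percent):
    """Relative increase, in percent, from baseline to new concentration."""
    return (new_percent - baseline_percent) / baseline_percent * 100


# Assumed baseline of about 1% protein in wild-type storage roots.
print(concentration_after_increase(1.0, 200.0))
print(percent_increase(1.0, 3.0))
```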
Storage proteins can be distributed in various subcellular compartments. Tuber storage proteins, such as dioscorin, patatin, and sporamin, usually accumulate in the vacuole, while dioscorin can also occur as protein aggregates in the cytoplasm [29]. Sporamin is synthesized in the ER and transported to vacuoles via a signal sequence. However, sporamin expressed in tobacco shows O-glycosylation of serine residues, which differs from the native sporamin of sweet potato [43]. Unlike those of root and tuber crops, storage proteins (prolamins) from cereals, such as maize, rice, and sorghum, are able to form new compartments attached to the ER called protein bodies. The unique characteristic of prolamins is the cysteine-rich region, which forms disulfide bridges and subsequently creates a large insoluble protein granule. The advantage of protein bodies is that they require only the ER protein folding machinery, reducing the cellular energy spent on protein trafficking [44]. Zein is a maize prolamin that has been extensively studied; the 27 kDa γ-zein is the main subunit essential to initiate the formation of protein bodies [45]. The expression of α-zein in tobacco seed endosperm led to weakly accumulated and unstable protein bodies, but co-expression with γ-zein prevented protein bodies from degradation [46]. These findings suggest that protein bodies from monocots can also be formed in dicots. It could be useful to investigate whether protein body formation can enhance storage protein content in cassava roots.
Fig. 3 ASP1 protein. (a) Amino acid sequence and conformation; (b) essential amino acid (EAA) composition. Adapted from reference [40].
As mentioned above, sporamin is a suitable candidate cassava storage protein, although protein compartmentalization and modification also need to be considered. Given that cassava is starch-rich and cassava protein trafficking to vacuoles is poorly understood, we targeted sporamin to plastids and the latter were found to bind starch granules (unpublished data). Starch binding domains (SBD) are non-catalytic protein domains found in starch and glycogen metabolism enzymes; they help proteins attach to the insoluble surface of starch granules. SBD have been used for protein modification, affinity purification and starch bioengineering. It was reported that expression of tandem repeat SBD in amylose-free potato results in higher protein accumulation and altered starch granule size [47]. Therefore, we hypothesized that SBD fusion could mediate sporamin binding to cassava starch, enhancing protein deposition. We generated transgenic cassava plants, and high sporamin expression was detected in leaves of plants grown in vitro. Fusion of sporamin to green fluorescent protein confirmed its localization in plastids of cassava guard cells (unpublished data). However, the performance of transgenic cassava plants in the field remains to be assessed.

RNA-mediated strategies for increased resistance to cassava mosaic viruses
The RNA-based silencing mechanism against viruses is an efficient and robust method for engineering virus resistance traits [48]. Technologies such as antisense, dsRNA, and miRNA have been developed to inhibit both RNA and DNA virus replication and accumulation in model plants as well as economically important crops [49]. We developed cassava mosaic virus (CMV) resistant transgenic cassava using both antisense and dsRNA technologies [12], providing alternatives to CMV resistant cassava plants in the field (Fig. 4). Currently, two sources of CMD resistance have been recognized: CMD-1 (a polygenic source originating from an interspecific hybrid clone 58308 with Manihot glaziovii Mull. Arg) [50] and CMD-2 (a dominant gene identified in the Nigerian landrace TMEB3) [51]. This reflects the limited genetic base for dominant resistance in cassava and, therefore, it is critical to diversify the resistance sources using transgenesis to ensure durability, especially in farmer-preferred cultivars [52].

In vitro replication assay for cassava geminiviruses
To study resistance to CMV in cassava, it is important to develop a reproducible protocol for transient replication of cloned African cassava mosaic virus (ACMV). Using particle bombardment-mediated DNA delivery of cloned ACMV that originates from West Kenya ACMV isolate 844 (ACMV-KE), we established a transient viral replication assay in cassava leaf disks [53]. To distinguish between the input DNA (methylated by DNA adenine methyltransferase) and de novo synthesized viral DNA (non-methylated), methylation-sensitive restriction enzymes were used. The conditions for efficient ACMV replication have been optimized, including media, pre-culture and post-culture treatments and the best time for replication analysis. This technology proved very useful for screening various cassava cultivars for CMD resistance as well as for analyzing virus replication in transgenic cassava engineered for geminivirus resistance.
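The logic of telling input DNA from newly replicated DNA can be sketched in a few lines. The text does not name the enzymes used, so the pair below is an assumption chosen for illustration: DpnI, which cleaves GATC only when the adenine is Dam-methylated, and MboI, which cleaves GATC only when it is not. The sketch also simplifies to fully methylated or fully unmethylated molecules, and the toy sequence is hypothetical, not an ACMV sequence.

```python
SITE = "GATC"  # recognition site of Dam methylase, DpnI and MboI


def find_sites(seq):
    """Return 0-based start positions of every GATC site in the sequence."""
    seq = seq.upper()
    return [i for i in range(len(seq) - len(SITE) + 1)
            if seq[i:i + len(SITE)] == SITE]


def is_cut(enzyme, dam_methylated, seq):
    """Predict whether the enzyme cleaves the molecule at least once.

    Assumed behaviour (illustrative enzyme pair, not taken from the text):
    DpnI cuts only Dam-methylated GATC; MboI cuts only unmethylated GATC.
    """
    if not find_sites(seq):
        return False
    if enzyme == "DpnI":
        return dam_methylated
    if enzyme == "MboI":
        return not dam_methylated
    raise ValueError(f"unknown enzyme: {enzyme}")


toy = "ATGGATCCTTAGATCGGA"  # hypothetical sequence with two GATC sites
print(find_sites(toy))
print("input DNA cut by DpnI:", is_cut("DpnI", True, toy))
print("replicated DNA cut by DpnI:", is_cut("DpnI", False, toy))
print("replicated DNA cut by MboI:", is_cut("MboI", False, toy))
```

Under these assumptions, viral DNA that survives DpnI digestion but is cleaved by MboI would be interpreted as synthesized de novo in the plant cells.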

Expression of ACMV antisense genes for virus resistance in cassava
ACMV encodes three key viral proteins, including replication-associated protein (Rep), replication enhancer protein (REn) and viral transcriptional activator protein (TrAP), which are critical for efficient viral replication, gene activation and silencing/suppression [54] . Inhibiting the expression of these genes by antisense interference should allow cassava plants to inhibit viral replication. We have successfully produced transgenic cassava plants resistant to CMV by expressing ACMV Rep, TrAP or REn antisense RNAs under the constitutive CaMV 35S promoter [55] . To ensure efficient expression of antisense RNA, we used an improved expression strategy by inserting the full coding sequences of Rep, REn and TrAP, separately, in antisense orientation into the 3′ untranslated region of the hygromycin phosphotransferase gene. This linkage to the selectable marker increases the probability of asRNA being produced in regenerated transformants, thus increasing the likelihood of success while reducing the workload associated with handling, regeneration, and analysis of transformants lacking the gene of interest.
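Inserting a coding sequence in antisense orientation means that the strand transcribed from the cassette is complementary to the viral mRNA. Read in the direction of transcription, the inserted sequence is therefore the reverse complement of the coding sequence. The sketch below shows this operation; the function names and the nine-nucleotide toy sequence are illustrative and not taken from ACMV.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_rna(coding_dna):
    """RNA produced when the coding sequence is cloned in antisense orientation."""
    return reverse_complement(coding_dna).upper().replace("T", "U")


toy_cds = "ATGGCTAGC"  # hypothetical coding fragment
print(reverse_complement(toy_cds))
print(antisense_rna(toy_cds))
```

The resulting RNA can base-pair along its full length with the sense transcript of the same region, which is the basis of the antisense interference described above.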
Among the dozens of transgenic cassava plant lines, several resistant ones were produced for each construct. ACMV infection analysis confirmed that a high level of resistance was achieved in some cases under medium to high viral pressure (Fig. 4). These findings suggest that introduction of geminivirus resistance by antisense RNA is not restricted to targeting Rep but is also successful with other viral proteins. In antisense transgenic plants, it is possible that significant amounts of siRNA are produced only upon infection, which provides sense RNA to form dsRNA with antisense molecules. Indeed, Rep-homologous short RNA could not be detected prior to infection in Rep-antisense transgenic cassava plants [55]. This indicated that antisense molecules are efficient in conferring geminivirus resistance per se. ACMV resistance does not derive from a sequence-specific silencing mechanism initiated against the overproduction of antisense Rep RNA, as attested by the high levels of antisense RNA expression in resistant transgenic cassava lines.
Bejarano and Lichtenstein suggested that the antisense approach might face limitations when engineering resistance to a mixture of cassava mosaic geminivirus species [56]. For example, ACMV and East African cassava mosaic virus (EACMV) Rep sequences share high homology only in short stretches. These regions of scarce homology might not be sufficient to maintain silencing of both ACMV and EACMV Rep sequences. Nevertheless, it remains unclear where antisense RNA molecules enter the silencing pathways. Improving antisense technology for multiple targets sharing low sequence homology would require further study of the mechanism behind the antisense approach.
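Whether two viral sequences share stretches long enough to be targeted together can be checked by scanning an alignment for runs of identical positions. The sketch below assumes the two sequences are already aligned without gaps and uses a default minimum run length of 21 nucleotides, a typical siRNA length chosen here as an assumption rather than a value from the text; the toy sequences are hypothetical.

```python
def identical_stretches(seq_a, seq_b, min_len=21):
    """Return (start, length) for each run of identical aligned positions.

    Only runs of at least min_len positions are reported. The sequences
    must have equal length (pre-aligned, no gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    runs = []
    start = None
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a == b:
            if start is None:
                start = i  # a new run of identity begins here
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i - start))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and len(seq_a) - start >= min_len:
        runs.append((start, len(seq_a) - start))
    return runs


# Toy example with a short threshold so the result is easy to follow.
a = "AAAACCCCGGGG"
b = "AAAATCCCGGGG"
print(identical_stretches(a, b, min_len=4))
```

With real Rep sequences, an empty result at the chosen threshold would correspond to the situation described above, where shared homology is confined to stretches too short to sustain silencing of both viruses.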

Expression of dsRNA homologous to the ACMV bidirectional promoter
Transcriptional gene silencing correlated with increased viral promoter methylation provides another approach to engineering virus resistance in plants. Methylation of geminivirus-derived transgene promoters could be triggered upon tomato leaf curl virus infection [57]. Pooggin et al. demonstrated recovery of Vigna mungo plants infected with the geminivirus Vigna mungo yellow mosaic virus by bombarding them with RNA interference (RNAi) constructs expressing dsRNA homologous to the DNA A viral promoter sequence [58]. We also demonstrated that replication of ACMV DNA A can be impaired in leaf disks from transgenic cassava plants expressing dsRNA homologous to the ACMV DNA A promoter [59]. Furthermore, transgenic cassava lines expressing ACMV promoter-homologous hairpin dsRNA showed an enhanced recovery phenotype upon ACMV infection. Notably, we also detected a similar pattern of ACMV promoter-derived short RNAs in wild-type cassava plants upon ACMV infection [60]. Generation and function of these promoter-derived RNAs remain to be determined, but their structures would permit integration in multiple silencing complexes and pathways. Since the conserved shared region located inside the promoter region is highly homologous between the A and B genomic components (>90%), the short RNAs could potentially direct the modification of both ACMV promoters. As the promoter region of cassava geminiviruses is not highly conserved, such an approach might not be a successful wide-spectrum virus resistance strategy.

Expression of dsRNA homologous to viral coding sequences
Expression of dsRNA homologous to viral coding sequences was shown to be an effective technology for engineering RNA and DNA virus resistance in plants. The siRNA generated from hairpin dsRNA homologous to viral coding sequences would enter at least two pathways: (1) they could direct modification of DNA and histone complexes at homologous DNA sequences in the nucleus; (2) the siRNA can also potentially impede viral infection through virus induced gene silencing [61] . Indeed, siRNA accumulation in CMV-infected cassava confirmed the existence of an RNAi pathway as a natural defense mechanism against geminiviruses in cassava [62] .
We recently produced transgenic cassava expressing hairpin dsRNA homologous to the conserved regions of ACMV Rep and AV1 coding sequences [63] . Several transgenic plant lines expressing high levels of short RNA homologous to Rep or AV1 were resistant to ACMV infection under high viral pressure. Importantly, resistance was strongly correlated with high siRNA expression in transgenic cassava.
Since Rep and the coat protein have highly conserved regions among different cassava geminiviruses, we believe that targeting such sequences by siRNA via hairpin dsRNA expression should confer wide-spectrum resistance on cassava. In Africa, CMD occurrence is usually a consequence of mixed cassava geminivirus infection [64]. The RNAi approach against conserved coding sequences provides a robust tool against various cassava geminivirus species. In our infection experiments, complete immunity to ACMV inoculation was confirmed under high virus load. Currently, few field experiments for virus resistance have been reported from collaboration between laboratories and African partners. The demonstration that additional small RNA [65,66] and DNA molecules [67] in cassava virus-infected plants suppress the expression of key siRNA, together with the possibility of mutating DNA with the CRISPR/Cas9 system, would provide novel approaches for targeting viral diseases.
As induction of CMD resistance in cassava by RNAi was successful, engineering resistance to other cassava viruses, such as cassava brown streak virus (CBSV), is required. Both CMD and CBSD are considered the major constraints in cassava cultivation. Therefore, stacking multiple resistances to both viruses should be plausible and worthwhile.

Conclusions and outlooks
Genetic transformation of cassava has provided a promising approach to producing more nutritious crops that are resistant to viral diseases [13]. Through the Bill and Melinda Gates Foundation supported project "BioCassava Plus," scientists from several laboratories have worked collaboratively to deal with the main constraints related to the production and utilization of cassava as a staple food in Africa, attempting to enhance bioavailable levels of zinc, iron, protein, and vitamins A, B6 and E, reduce the amounts of toxic cyanogenic glycosides, improve post-harvest durability, and increase resistance to viral diseases [6]. Nevertheless, technology transfer to local farmers through improved cultivars is considered the most important component of the project, and much effort is being undertaken through international collaborations, such as the Root, Tuber and Banana Consortium (http://www.rtb.cgiar.org).
So far, improvement in cassava protein by crossbreeding methods, using protein-rich genotypes as donors, has not been feasible due to the loss of the desired traits after several backcrosses. Therefore, understanding storage protein metabolism, i.e., its biosynthesis, accumulation, deposition and degradation, in cassava storage roots might provide a basis for improving protein content in cassava [5]. Using exogenous proteins as expression candidates in cassava storage roots is also important for advancing knowledge of protein metabolism in cassava storage root cells.
Although CMV resistant transgenic cassava is ready for field studies in Africa, with field experiments in the pipeline, local institutions, scientists and government representatives from Africa still need to establish biosafety guidelines and consider the implementation of current regulations for handling and using transgenic cassava [68,69]. The impact to date from cassava genetic improvement has been exclusively from traditional crossbreeding approaches. As in other major crops, this is likely to change in the near future as regulatory agencies develop rules and policies on transgenic cassava, and the traits available begin to meet farmer and consumer demands. Therefore, it is essential to understand and prioritize the specific economic traits and biological features of cassava in order to effectively improve the crop using biotechnology tools.