TCGA whole-transcriptome sequencing data reveals significantly dysregulated genes and signaling pathways in hepatocellular carcinoma

Daniel Wai-Hung Ho , Alan Ka-Lun Kai , Irene Oi-Lin Ng

Front. Med. ›› 2015, Vol. 9 ›› Issue (3) : 322 -330.

DOI: 10.1007/s11684-015-0408-9
RESEARCH ARTICLE

Abstract

This study systematically evaluates the TCGA whole-transcriptome sequencing data of hepatocellular carcinoma (HCC) by comparing the global gene expression profiles of tumors and their corresponding non-tumorous liver tissue. Based on the differential gene expression analysis, we identified a number of novel dysregulated genes, in addition to those previously reported. Top-listing upregulated (CENPF and FOXM1) and downregulated (CLEC4G, CRHBP, and CLEC1B) genes were successfully validated using qPCR on our cohort of 65 pairs of human HCCs. To obtain a mechanistic overview, we subjected the significantly upregulated and downregulated genes to gene set enrichment analysis, which showed that different cellular pathways were involved. This study provides useful information on the transcriptomic landscape and molecular mechanism of hepatocarcinogenesis for the development of new biomarkers and further in-depth characterization.

Keywords

TCGA / whole-transcriptome sequencing / HCC / liver cancer

Cite this article

Daniel Wai-Hung Ho, Alan Ka-Lun Kai, Irene Oi-Lin Ng. TCGA whole-transcriptome sequencing data reveals significantly dysregulated genes and signaling pathways in hepatocellular carcinoma. Front. Med., 2015, 9(3): 322-330 DOI:10.1007/s11684-015-0408-9

Introduction

Hepatocellular carcinoma (HCC) is a common type of cancer and one of the leading causes of cancer-related mortality worldwide [1, 2]. HCC is an aggressive malignancy, and patients with HCC have a poor prognosis. Unfortunately, only a few effective treatment options are available. Despite much effort in studying the molecular mechanisms of hepatocarcinogenesis, current understanding of this lethal disease is still limited.

In the past, delineating the underlying genome-wide regulatory and interaction networks of HCC relied primarily on microarray-based technology [3-7]. Recent advances in next-generation sequencing have made whole-transcriptome sequencing (WTS) feasible. This platform allows a more comprehensive and accurate examination of global gene expression profiles. To date, only a few studies have used WTS to delineate the transcriptomic landscape of HCC [8, 9] or liver cancer stem cells [10], and all of them are limited by small sample sizes, which prevents a comprehensive and representative overview of the HCC transcriptome. The Cancer Genome Atlas (TCGA; http://cancergenome.nih.gov/) represents a global collaboration in cancer research. It holds large collections of tissue samples that have been examined in multiple aspects (e.g., genomic, transcriptomic, and epigenetic). More importantly, the data are open access and freely available to all researchers for use in their own studies. Therefore, we used the relatively large TCGA HCC WTS data set for discovery in the current study.

In our study, we extracted WTS data from the open-access TCGA repository for all 50 HCC cases in which both tumorous (T) and corresponding non-tumorous (NT) liver tissues were available and had been analyzed by TCGA. We compared the global gene expression profiles of T and NT liver tissue and identified differentially expressed (DE) genes. Top-listing genes were validated by quantitative PCR (qPCR) in an independent sample cohort (n = 65). DE genes were then subjected to gene set enrichment analysis, and we identified gene sets and signaling pathways that were significantly enriched with upregulated and downregulated genes. These genes are attractive molecular targets that are worthy of further investigation, and they may be useful as HCC biomarkers.

Materials and methods

TCGA WTS data of HCC

From the TCGA data portal (http://cancergenome.nih.gov/), we extracted, through bulk download mode, all available WTS data of HCC cases that had both T and corresponding NT samples (a total of 50 cases) [liver HCC (cancer type), RNASeqV2 (data type), level 3 (archive type), and 1.12.0 (data version)]. The data were generated on the Illumina HiSeq 2000 platform and annotated to the reference transcript set of the UCSC hg19 gene standard track. Gene expression data were available as upper-quartile-normalized RSEM count estimates. The extracted data were used without further transformation, except that values were rounded to integers.
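The case selection and rounding step described above can be sketched as follows. This is a minimal illustration, not the authors' actual script: it assumes samples are identified by standard TCGA barcodes, in which the two-digit sample-type code "01" marks a primary solid tumor and "11" a solid tissue normal. The barcodes, values, and function names below are hypothetical.

```python
from collections import defaultdict

# Sample-type codes from the TCGA barcode convention:
# "01" = primary solid tumor (T), "11" = solid tissue normal (NT).
TUMOR, NORMAL = "01", "11"


def parse_barcode(barcode):
    """Split a TCGA barcode into (patient ID, two-digit sample-type code)."""
    parts = barcode.split("-")
    patient = "-".join(parts[:3])   # e.g. "TCGA-AA-0001"
    sample_type = parts[3][:2]      # e.g. "01" taken from "01A"
    return patient, sample_type


def paired_cases(barcodes):
    """Return {patient: (T barcode, NT barcode)} for patients that have both."""
    by_patient = defaultdict(dict)
    for bc in barcodes:
        patient, sample_type = parse_barcode(bc)
        if sample_type in (TUMOR, NORMAL):
            by_patient[patient][sample_type] = bc
    return {
        p: (s[TUMOR], s[NORMAL])
        for p, s in by_patient.items()
        if TUMOR in s and NORMAL in s
    }


def to_integer_counts(rsem_estimates):
    """Round RSEM count estimates to integers for count-based tools.

    Note: Python's round() rounds exact halves to the nearest even integer.
    """
    return [int(round(v)) for v in rsem_estimates]


# Hypothetical barcodes and values, for illustration only.
barcodes = [
    "TCGA-AA-0001-01A-11R-0000-07",
    "TCGA-AA-0001-11A-11R-0000-07",
    "TCGA-AA-0002-01A-11R-0000-07",  # tumor only, so this case is dropped
]
pairs = paired_cases(barcodes)
counts = to_integer_counts([10.4, 0.0, 3.6])
```

Only the first hypothetical patient is retained, because the second has no matching NT sample.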

Validation sample cohort of paired HCCs

A cohort of 65 surgically resected HCCs and their corresponding NT livers was randomly selected for validation. The specimens were collected from patients who underwent surgical resection for HCC at Queen Mary Hospital, Hong Kong. All of them were obtained immediately after surgical resection, snap-frozen in liquid nitrogen, and kept at -80 °C. Each case had both frozen tissue blocks and formalin-fixed paraffin-embedded tissue; frozen sections were cut from tumor blocks and stained for histological examination to ensure a homogeneous cell population. The use of the tissue was approved by the Institutional Review Board of the University of Hong Kong/Hospital Authority Hong Kong West Cluster. The demographic data of the patients are summarized in Supplementary Table 1.

Differential gene expression detection

Differential gene expression (DGE) analysis was performed using edgeR [11], which uses negative binomial models to capture the dispersion of WTS read count data, empirical Bayes estimation of gene-specific variation, and generalized linear models applicable to general experimental designs. As suggested by the edgeR authors, genes with very low read counts are usually not of interest in DGE analysis; hence, the average count-per-million (CPM) was used to determine whether a gene was reasonably expressed. For each gene, edgeR reported the log2(fold change), log2(CPM), statistical significance, and the corresponding false discovery rate (FDR). DE genes were selected based on these parameters, with the direction of the T/NT expression fold change (FC) denoting upregulation or downregulation.
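To make the CPM and fold-change quantities concrete, the sketch below computes them directly from a small count table. It is a deliberate simplification and not edgeR itself: edgeR fits negative binomial models and applies its own normalization and prior counts, whereas this example uses raw library sizes and an assumed pseudocount of 1. Gene names, counts, and function names are hypothetical.

```python
import math


def library_sizes(counts):
    """Total read count per sample (column sums of the gene-by-sample table)."""
    n_samples = len(next(iter(counts.values())))
    return [sum(row[j] for row in counts.values()) for j in range(n_samples)]


def cpm_table(counts):
    """Counts-per-million: count / library size * 1e6, per gene and sample."""
    sizes = library_sizes(counts)
    return {
        gene: [c / s * 1e6 for c, s in zip(row, sizes)]
        for gene, row in counts.items()
    }


def mean(values):
    return sum(values) / len(values)


def summarize(counts, is_tumor, pseudo=1.0):
    """Per gene: (log2 of average CPM, log2 T/NT fold change of mean CPM).

    The pseudocount (an assumption of this sketch) avoids log2(0).
    """
    result = {}
    for gene, row in cpm_table(counts).items():
        t = [v for v, flag in zip(row, is_tumor) if flag]
        nt = [v for v, flag in zip(row, is_tumor) if not flag]
        log2_cpm = math.log2(mean(row) + pseudo)
        log2_fc = math.log2((mean(t) + pseudo) / (mean(nt) + pseudo))
        result[gene] = (log2_cpm, log2_fc)
    return result


# Toy table: two T samples followed by two NT samples.
counts = {
    "geneA": [400, 400, 100, 100],  # about 4-fold higher in T
    "geneB": [100, 100, 400, 400],  # about 4-fold lower in T
    "geneC": [500, 500, 500, 500],  # unchanged
}
is_tumor = [True, True, False, False]
stats = summarize(counts, is_tumor)
```

A positive log2 fold change indicates higher expression in T than in NT, and a negative value indicates lower expression in T.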

Gene set enrichment analysis on DE genes

To obtain a mechanistic overview of DGE in HCC, the significantly upregulated and downregulated genes were tested for gene set or pathway enrichment using the uGPA package [12]. Enrichment analyses of the upregulated and downregulated genes were performed separately, as recommended previously [13]. Curated gene sets were obtained from MSigDB v4.0 (www.broadinstitute.org/gsea/msigdb) and classified into functional gene sets according to the domains of gene ontology (GO) [i.e., biological process (825 gene sets), cellular component (233 gene sets), and molecular function (396 gene sets)] or pathway gene sets according to canonical pathways as documented by KEGG (186 gene sets). uGPA takes DGE events as input and assesses them for enrichment within gene sets or signaling pathways by a cumulative hypergeometric test. An FDR < 0.05 was considered significant.
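The cumulative hypergeometric test mentioned above can be illustrated with a short sketch. This is not the uGPA implementation: the function names and all numbers are hypothetical, and the use of the Benjamini-Hochberg procedure for FDR control is an assumption, since the text does not state which FDR method was applied.

```python
from math import comb


def hypergeom_upper_tail(k, N, K, n):
    """P(X >= k) for X ~ Hypergeometric.

    N: genes in the background, K: background genes in the gene set,
    n: DE genes drawn, k: DE genes that fall in the gene set.
    """
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical example: 20,000 background genes, 220 upregulated genes,
# and three gene sets given as (set size, overlap with upregulated genes).
N, n = 20000, 220
gene_sets = {"set_1": (300, 25), "set_2": (150, 2), "set_3": (500, 6)}

pvals = [hypergeom_upper_tail(k, N, K, n) for K, k in gene_sets.values()]
fdr = benjamini_hochberg(pvals)
significant = [name for name, q in zip(gene_sets, fdr) if q < 0.05]
```

Running the upregulated and downregulated gene lists through such a test separately mirrors the separate-analysis strategy described in the text.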

Validation on top-listing gene candidates by qPCR in human HCCs

To confirm the WTS findings on DGE, the top-listing upregulated (CENPF and FOXM1) and downregulated (CLEC4G, CRHBP, and CLEC1B) genes were quantified in the validation sample cohort (n = 65) with TaqMan real-time qPCR assays (Hs01118845_m1, Hs01073586_m1, Hs00962163_g1, Hs00181810_m1, and Hs00212925_m1), following the manufacturer's instructions. Total RNA was extracted with Trizol (Invitrogen), and cDNA was synthesized with a reverse transcription kit (Life Technologies).

Results

Comparison of global gene expression profiles of HCC T and NT tissue

By comparing the WTS read counts of genes between T and NT tissue and applying the selection criteria of |log2(FC)| ≥ 2, log2(CPM) ≥ 1, and FDR < 0.05, we identified 734 genes with DGE, of which 220 were upregulated and 514 were downregulated (Fig. 1). In terms of statistical significance, CENPF (centromere protein F, 350/400 kDa) (log2(FC) = 3.64, FDR = 5.32E-78) and CLEC4G (C-type lectin domain family 4, member G) (log2(FC) = -8.96, FDR = 1.19E-80) were the most significantly upregulated and downregulated genes, respectively (Supplementary Tables 2 and 3).
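The selection criteria can be expressed as a small filter. The sketch below assumes the fold-change threshold applies to the absolute value of log2(FC), since downregulated genes such as CLEC4G have negative values. The log2(FC) and FDR values for CENPF and CLEC4G are those reported above; their log2(CPM) values and the entire GENE_X entry are hypothetical placeholders.

```python
def classify(log2_fc, log2_cpm, fdr, fc_cut=2.0, cpm_cut=1.0, fdr_cut=0.05):
    """Label a gene 'up', 'down', or 'not DE' under the stated thresholds."""
    if log2_cpm < cpm_cut or fdr >= fdr_cut or abs(log2_fc) < fc_cut:
        return "not DE"
    return "up" if log2_fc > 0 else "down"


# (log2 fold change, log2 CPM, FDR); log2 CPM values are placeholders.
genes = {
    "CENPF": (3.64, 5.0, 5.32e-78),
    "CLEC4G": (-8.96, 5.0, 1.19e-80),
    "GENE_X": (0.8, 6.0, 1e-10),  # hypothetical: significant but small change
}

labels = {name: classify(*values) for name, values in genes.items()}
n_up = sum(1 for label in labels.values() if label == "up")
n_down = sum(1 for label in labels.values() if label == "down")
```

Applying all three thresholds together keeps genes that are both statistically significant and show a large change in expression.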

Successful validation of top-listing candidates by qPCR

Top-listing upregulated (CENPF and FOXM1) and downregulated (CLEC4G, CRHBP, and CLEC1B) genes were subjected to qPCR assays on our validation sample cohort of 65 HCC pairs. All five genes were successfully validated (P < 0.0001, Mann-Whitney U test), and the direction of dysregulation matched that observed in the TCGA WTS data (Fig. 2).
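For readers unfamiliar with the Mann-Whitney U test, the following sketch computes the U statistic and a two-sided P value from the normal approximation. It is a simplified illustration, without tie or continuity correction, and it assumes the test compares expression values of T samples against NT samples; the text does not describe the exact procedure or software used. The expression values are hypothetical.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for x, two-sided P value) using the normal approximation.

    U counts the pairs (a in x, b in y) with a > b; ties count as 0.5.
    No tie or continuity correction is applied in this sketch.
    """
    n, m = len(x), len(y)
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    mean_u = n * m / 2
    sd_u = math.sqrt(n * m * (n + m + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Hypothetical relative expression values for one upregulated gene.
tumor = [5.1, 6.3, 4.8, 7.2, 5.9, 6.6, 5.4, 6.1, 7.0, 5.7]
non_tumor = [1.2, 0.9, 1.5, 1.1, 0.8, 1.3, 1.0, 1.4, 0.7, 1.6]

u_stat, p_value = mann_whitney_u(tumor, non_tumor)
```

Because every hypothetical tumor value exceeds every non-tumor value, U reaches its maximum of 10 × 10 = 100.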

Significantly enriched pathways for upregulated and downregulated genes

By subjecting the significantly upregulated genes to enrichment analysis on gene sets based on GO (i.e., biological process, cellular component, and molecular function) and KEGG canonical pathways, we observed that upregulated genes were significantly enriched in various domains (Table 1). For GO biological process, the genes were mainly enriched in cell cycle processes. For GO cellular component, non-membrane-bound organelles and the cytoskeleton were involved. For GO molecular function, motor activity and various binding activities were implicated. Among the canonical signaling pathways documented in KEGG, the cell cycle and p53 signaling pathways were significantly enriched.

The downregulated genes were also subjected to gene set enrichment analysis (Table 2). For GO biological process, the genes were mainly related to signal transduction, response to stimulus, and various metabolic processes. For GO cellular component, they were implicated in the membrane and extracellular matrix (ECM). For GO molecular function, they were involved in diverse activities, including oxygen binding, receptor activity, and oxidoreductase activity. They were also enriched in canonical signaling pathways related to the metabolism of various substrates.

Discussion

In the current study, we used the paired T-NT TCGA WTS data from 50 HCC cases to provide a useful transcriptomic landscape of HCC. We systematically compared the gene expression profiles of HCC T samples with those of their corresponding NT samples and identified 734 DE genes. A number of DE genes reported in previous studies [8, 9], such as ALG1L, SERPINA11, TMEM82, GPC3, SPINK1, and ESM1, were also detected in the current study. In addition, many other novel genes were found to be significantly upregulated (Supplementary Table 2) or downregulated (Supplementary Table 3). CENPF (centromere protein F) and FOXM1 (forkhead box M1) were among the top-listing significantly upregulated genes. CENPF is required for kinetochore function and chromosome segregation in mitosis. FOXM1 is a transcription factor that regulates the expression of cell cycle genes required for DNA replication and mitosis. It may also have roles in controlling cell proliferation and in DNA-break repair as part of the DNA damage checkpoint response. Intriguingly, an integrative computational approach that compared the human and mouse interactomes predicted CENPF and FOXM1 to be master regulators of prostate cancer malignancy [14]. Moreover, they were shown to act synergistically in driving aggressive prostate cancer. Knockdown of CENPF and FOXM1 synergistically reduced the proliferation of prostate cancer cells and tumor growth in cell-line-derived xenografts. It was further shown that knockdown of CENPF expression reduced the binding of FOXM1 to its targets. The two proteins were also demonstrated to co-localize in the nucleus, and their co-expression was a robust prognostic indicator of poor survival and metastasis. Thus, their concurrent upregulation in HCC suggests that a similar synergistic co-operation may operate in hepatocarcinogenesis.

Among the most significantly downregulated genes, we noted multiple members of the C-type lectin family (CLEC4G, CLEC1B, and CLEC4M) and CRHBP [corticotropin-releasing factor (CRF) binding protein]. C-type lectins are calcium-dependent glycan-binding proteins that function as adhesion and signaling receptors in various immune functions, including inflammation and immunity to tumor and virally infected cells [15]. According to the Human Protein Atlas [16], CLEC4G, CLEC1B, and CLEC4M are predominantly expressed in the liver; however, CLEC4G and CLEC4M are expressed at very low levels or are undetectable in liver cancer tissue (data are not available for CLEC1B in liver cancer tissue). This finding suggests that disrupted expression of these C-type lectin proteins may have a role in the pathogenesis of HCC. CRHBP is a member of the CRF system. Activation of CRF receptors, particularly CRFR2, was shown to inhibit tumor progression, modulate proliferation and apoptosis, and interfere with angiogenesis through reduction of VEGF expression in vivo in various cancers [17-20]. A recent study also indicated that reduced expression of CRHBP was associated with more aggressive behavior of human kidney cancer, suggesting that depletion of CRHBP may be involved in renal carcinogenesis [21].

Gene set enrichment analysis further provides a mechanistic overview of HCC. First, genes involved in various cell cycle processes were frequently upregulated, particularly multiple cyclins, cyclin-dependent kinases, and cyclin-dependent kinase inhibitors (CCNA2, CCNB1, CCNB2, CCNE1, CDK1, CDKN2A, CDKN2C, and CDKN3) (Table 1 and Supplementary Table 2). Given that the cell cycle is controlled at various checkpoints by regulating cyclins, cyclin-dependent kinases, and other cell cycle proteins [22, 23], upregulation of these genes may disrupt cell cycle control and result in abnormal cell proliferation. Second, the expression of many genes for various metabolic processes was preferentially downregulated in HCC, including metabolism of retinol, fatty acids, amino acids, and carbohydrates, steroid hormone biosynthesis, and glycolysis and gluconeogenesis. In particular, multiple components of the cytochrome P450 family, which play critical roles in biosynthesis and metabolism [24], were significantly downregulated in HCC (Table 2 and Supplementary Table 3). They are also involved in the removal of toxic substances from the body [25, 26]. In addition, numerous cytokines (CCLs and CXCLs) were downregulated in HCC (Table 2 and Supplementary Table 3). Cytokines and their receptors are important for triggering immune responses through the action of various immune cells [27]. These immune responses are critical in defense against infection [28] and cancer [29]. Overall, these findings suggest that the metabolic and immune systems are altered in HCC compared with non-tumorous liver tissue.

In the initial global analysis of the TCGA WTS data of HCC and the subsequent validation in an independent sample cohort, we discovered several promising gene candidates and pathways that are significantly dysregulated in HCC. These findings shed light on some novel targets that may potentially drive hepatocarcinogenesis. However, further functional characterization and in vivo validation using animal models are needed to substantiate our findings.

In conclusion, this study explored the molecular mechanism of hepatocarcinogenesis through assessment of TCGA WTS data of HCC and validation of some of the top-listing DE genes in an independent cohort. It provides useful information on the transcriptomic landscape as well as a mechanistic overview of HCC. Our findings offer novel insights and useful support in biomarker development and suggest new potential targets in HCC characterization.

References

[1] El-Serag HB. Hepatocellular carcinoma. N Engl J Med 2011; 365(12): 1118–1127

[2] Villanueva A, Llovet JM. Liver cancer in 2013: mutational landscape of HCC—the end of the beginning. Nat Rev Clin Oncol 2014; 11(2): 73–74

[3] Jia HL, Ye QH, Qin LX, Budhu A, Forgues M, Chen Y, Liu YK, Sun HC, Wang L, Lu HZ, Shen F, Tang ZY, Wang XW. Gene expression profiling reveals potential biomarkers of human hepatocellular carcinoma. Clin Cancer Res 2007; 13(4): 1133–1139

[4] Lee JS, Thorgeirsson SS. Comparative and integrative functional genomics of HCC. Oncogene 2006; 25(27): 3801–3809

[5] Marshall A, Lukk M, Kutter C, Davies S, Alexander G, Odom DT. Global gene expression profiling reveals SPINK1 as a potential hepatocellular carcinoma marker. PLoS ONE 2013; 8(3): e59459

[6] Patil MA, Chua MS, Pan KH, Lin R, Lih CJ, Cheung ST, Ho C, Li R, Fan ST, Cohen SN, Chen X, So S. An integrated data analysis approach to characterize genes highly expressed in hepatocellular carcinoma. Oncogene 2005; 24(23): 3737–3747

[7] Skawran B, Steinemann D, Weigmann A, Flemming P, Becker T, Flik J, Kreipe H, Schlegelberger B, Wilkens L. Gene expression profiling in hepatocellular carcinoma: upregulation of genes in amplified chromosome regions. Mod Pathol 2008; 21(5): 505–516

[8] Huang Q, Lin B, Liu H, Ma X, Mo F, Yu W, Li L, Li H, Tian T, Wu D, Shen F, Xing J, Chen ZN. RNA-Seq analyses generate comprehensive transcriptomic landscape and reveal complex transcript patterns in hepatocellular carcinoma. PLoS ONE 2011; 6(10): e26168

[9] Lin KT, Shann YJ, Chau GY, Hsu CN, Huang CY. Identification of latent biomarkers in hepatocellular carcinoma by ultra-deep whole-transcriptome sequencing. Oncogene 2014; 33(39): 4786–4794

[10] Ho DW, Yang ZF, Yi K, Lam CT, Ng MN, Yu WC, Lau J, Wan T, Wang X, Yan Z, Liu H, Zhang Y, Fan ST. Gene expression profiling of liver cancer stem cells by RNA-sequencing. PLoS ONE 2012; 7(5): e37159

[11] Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 2010; 26(1): 139–140

[12] Ho DW, Ng IO. uGPA: unified Gene Pathway Analyzer package for high-throughput genome-wide screening data provides mechanistic overview on human diseases. Clin Chim Acta 2015; 441: 105–108

[13] Hong G, Zhang W, Li H, Shen X, Guo Z. Separate enrichment analysis of pathways for up- and downregulated genes. J R Soc Interface 2014; 11(92): 20130950

[14] Aytes A, Mitrofanova A, Lefebvre C, Alvarez MJ, Castillo-Martin M, Zheng T, Eastham JA, Gopalan A, Pienta KJ, Shen MM, Califano A, Abate-Shen C. Cross-species regulatory network analysis identifies a synergistic interaction between FOXM1 and CENPF that drives prostate cancer malignancy. Cancer Cell 2014; 25(5): 638–651

[15] Cummings RD, McEver RP. C-type lectins. In: Varki A, Cummings RD, Esko JD, Freeze HH, Stanley P, Bertozzi CR, Hart GW, Etzler ME. Essentials of Glycobiology. Cold Spring Harbor (NY): Cold Spring Harbor Laboratory Press, 2009

[16] Uhlen M, Oksvold P, Fagerberg L, Lundberg E, Jonasson K, Forsberg M, Zwahlen M, Kampf C, Wester K, Hober S, Wernerus H, Björling L, Ponten F. Towards a knowledge-based Human Protein Atlas. Nat Biotechnol 2010; 28(12): 1248–1250

[17] Bale TL, Giordano FJ, Hickey RP, Huang Y, Nath AK, Peterson KL, Vale WW, Lee KF. Corticotropin-releasing factor receptor 2 is a tonic suppressor of vascularization. Proc Natl Acad Sci USA 2002; 99(11): 7734–7739

[18] Graziani G, Tentori L, Portarena I, Barbarino M, Tringali G, Pozzoli G, Navarra P. CRH inhibits cell growth of human endometrial adenocarcinoma cells via CRH-receptor 1-mediated activation of cAMP-PKA pathway. Endocrinology 2002; 143(3): 807–813

[19] Hao Z, Huang Y, Cleman J, Jovin IS, Vale WW, Bale TL, Giordano FJ. Urocortin2 inhibits tumor growth via effects on vascularization and cell proliferation. Proc Natl Acad Sci USA 2008; 105(10): 3939–3944

[20] Wang J, Xu Y, Xu Y, Zhu H, Zhang R, Zhang G, Li S. Urocortin’s inhibition of tumor growth and angiogenesis in hepatocellular carcinoma via corticotrophin-releasing factor receptor 2. Cancer Invest 2008; 26(4): 359–368

[21] Tezval H, Atschekzei F, Peters I, Waalkes S, Hennenlotter J, Stenzl A, Becker JU, Merseburger AS, Kuczyk MA, Serth J. Reduced mRNA expression level of corticotropin-releasing hormone-binding protein is associated with aggressive human kidney cancer. BMC Cancer 2013; 13(1): 199

[22] Graña X, Reddy EP. Cell cycle control in mammalian cells: role of cyclins, cyclin dependent kinases (CDKs), growth suppressor genes and cyclin-dependent kinase inhibitors (CKIs). Oncogene 1995; 11(2): 211–219

[23] Lew DJ, Kornbluth S. Regulatory roles of cyclin dependent kinase phosphorylation in cell cycle control. Curr Opin Cell Biol 1996; 8(6): 795–804

[24] Nebert DW, Russell DW. Clinical importance of the cytochromes P450. Lancet 2002; 360(9340): 1155–1162

[25] Denison MS, Whitlock JP Jr. Xenobiotic-inducible transcription of cytochrome P450 genes. J Biol Chem 1995; 270(31): 18175–18178

[26] Guengerich FP. Common and uncommon cytochrome P450 reactions related to metabolism and chemical toxicity. Chem Res Toxicol 2001; 14(6): 611–650

[27] Burkholder B, Huang RY, Burgess R, Luo S, Jones VS, Zhang W, Lv ZQ, Gao CY, Wang BL, Zhang YM, Huang RP. Tumor-induced perturbations of cytokines and immune cell networks. Biochim Biophys Acta 2014; 1845(2): 182–201

[28] Lacy P, Stow JL. Cytokine release from innate immune cells: association with diverse membrane trafficking pathways. Blood 2011; 118(1): 9–18

[29] Lippitz BE. Cytokine patterns in patients with cancer: a systematic review. Lancet Oncol 2013; 14(6): e218–e228

RIGHTS & PERMISSIONS

Higher Education Press and Springer-Verlag Berlin Heidelberg
