Understanding spatial organizations of chromosomes via statistical analysis of Hi-C data
Ming Hu, Ke Deng, Zhaohui Qin, Jun S. Liu
Understanding how chromosomes fold provides insight into transcriptional regulation and, hence, the functional state of the cell. Using next-generation sequencing technology, the recently developed Hi-C approach enables a global view of spatial chromatin organization in the nucleus, which substantially expands our knowledge of genome organization and function. However, because multiple layers of bias, noise and uncertainty are embedded in the Hi-C experimental protocol, analyzing and interpreting Hi-C data poses great challenges and requires the development of novel statistical methods. This article provides an overview of recent Hi-C studies and their impact on biomedical research, describes major challenges in the statistical analysis of Hi-C data, and discusses some perspectives for future research.