Computational tools for Hi-C data analysis

Zhijun Han, Gang Wei

PDF(466 KB)
PDF(466 KB)
Quant. Biol. ›› 2017, Vol. 5 ›› Issue (3) : 215-225. DOI: 10.1007/s40484-017-0113-6
REVIEW
REVIEW

Computational tools for Hi-C data analysis

Author information +
History +

Abstract

Background: In eukaryotic genome, chromatin is not randomly distributed in cell nuclei, but instead is organized into higher-order structures. Emerging evidence indicates that these higher-order chromatin structures play important roles in regulating genome functions such as transcription and DNA replication. With the advancement in 3C (chromosome conformation capture) based technologies, Hi-C has been widely used to investigate genome-wide long-range chromatin interactions during cellular differentiation and oncogenesis. Since the first publication of Hi-C assay in 2009, lots of bioinformatic tools have been implemented for processing Hi-C data from mapping raw reads to normalizing contact matrix and high interpretation, either providing a whole workflow pipeline or focusing on a particular process.

Results: This article reviews the general Hi-C data processing workflow and the currently popular Hi-C data processing tools. We highlight on how these tools are used for a full interpretation of Hi-C results.

Conclusions: Hi-C assay is a powerful tool to investigate the higher-order chromatin structure. Continued development of novel methods for Hi-C data analysis will be necessary for better understanding the regulatory function of genome organization.

Graphical abstract

Keywords

3D genome structure / Hi-C data processing tool / chromatin interactions

Cite this article

Download citation ▾
Zhijun Han, Gang Wei. Computational tools for Hi-C data analysis. Quant. Biol., 2017, 5(3): 215‒225 https://doi.org/10.1007/s40484-017-0113-6

References

[1]
Gorkin, D. U., Leung, D. and Ren, B. (2014) The 3D genome in transcriptional regulation and pluripotency. Cell Stem Cell, 14, 762–775
CrossRef Pubmed Google scholar
[2]
Phillips-Cremins, J. E., Sauria, M. E., Sanyal, A., Gerasimova, T. I., Lajoie, B. R., Bell, J. S., Ong, C. T., Hookway, T. A., Guo, C., Sun, Y., (2013) Architectural protein subclasses shape 3D organization of genomes during lineage commitment. Cell, 153, 1281–1295
CrossRef Pubmed Google scholar
[3]
Dekker, J., Rippe, K., Dekker, M. and Kleckner, N. (2002) Capturing chromosome conformation. Science, 295, 1306–1311
CrossRef Pubmed Google scholar
[4]
Simonis, M., Klous, P., Splinter, E., Moshkin, Y., Willemsen, R., de Wit, E., van Steensel, B. and de Laat, W. (2006) Nuclear organization of active and inactive chromatin domains uncovered by chromosome conformation capture-on-chip (4C). Nat. Genet., 38, 1348–1354
CrossRef Pubmed Google scholar
[5]
Dostie, J., Richmond, T. A., Arnaout, R. A., Selzer, R. R., Lee, W. L., Honan, T. A., Rubio, E. D., Krumm, A., Lamb, J., Nusbaum, C., (2006) Chromosome conformation capture carbon copy (5C): a massively parallel solution for mapping interactions between genomic elements. Genome Res., 16, 1299–1309
CrossRef Pubmed Google scholar
[6]
Lieberman-Aiden, E., van Berkum, N. L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B. R., Sabo, P. J., Dorschner, M. O., (2009) Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science, 326, 289–293
CrossRef Pubmed Google scholar
[7]
Fullwood, M. J., Liu, M. H., Pan, Y. F., Liu, J., Xu, H., Mohamed, Y. B., Orlov, Y. L., Velkov, S., Ho, A., Mei, P. H., (2009) An oestrogen-receptor-alpha-bound human chromatin interactome. Nature, 462, 58–64
CrossRef Pubmed Google scholar
[8]
Jäger, R., Migliorini, G., Henrion, M., Kandaswamy, R., Speedy, H. E., Heindl, A., Whiffin, N., Carnicer, M. J., Broome, L., Dryden, N., (2015) Capture Hi-C identifies the chromatin interactome of colorectal cancer risk loci. Nat. Commun., 6, 6178
CrossRef Pubmed Google scholar
[9]
Dixon, J. R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J. S. and Ren, B. (2012) Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature, 485, 376–380
CrossRef Pubmed Google scholar
[10]
Schmitt, A. D., Hu, M., Jung, I., Xu, Z., Qiu, Y., Tan, C. L., Li, Y., Lin, S., Lin, Y., Barr, C. L., (2016) A Compendium of chromatin contact maps reveals spatially active regions in the human genome. Cell Rep., 17, 2042–2059
CrossRef Pubmed Google scholar
[11]
Castellano, G., Le Dily, F., Hermoso Pulido, A., Beato, M. and Roma, G. (2015) Hi-Cpipe: a pipeline for high-throughput chromosome capture. bioRxiv, doi: https://doi.org/10.1101/020636
[12]
HiC-Box. available from https://github.com/koszullab/HiC-Box
[13]
Schmid, M. W., Grob, S. and Grossniklaus, U. (2015) HiCdat: a fast and easy-to-use Hi-C data analysis tool. BMC Bioinformatics, 16, 277
CrossRef Pubmed Google scholar
[14]
Hwang, Y. C., Lin, C. F., Valladares, O., Malamon, J., Kuksa, P. P., Zheng, Q., Gregory, B. D. and Wang, L. S. (2015) HIPPIE: a high-throughput identification pipeline for promoter interacting enhancer elements. Bioinformatics, 31, 1290–1292
CrossRef Pubmed Google scholar
[15]
Durand, N. C., Shamim, M. S., Machol, I., Rao, S. S., Huntley, M. H., Lander, E. S. and Aiden, E. L. (2016) Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst., 3, 95–98
CrossRef Pubmed Google scholar
[16]
Imakaev, M., Fudenberg, G., McCord, R. P., Naumova, N., Goloborodko, A., Lajoie, B. R., Dekker, J. and Mirny, L. A. (2012) Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods, 9, 999–1003
CrossRef Pubmed Google scholar
[17]
Wingett, S., Ewels, P., Furlan-Magaril, M., Nagano, T., Schoenfelder, S., Fraser, P. and Andrews, S. (2015) HiCUP: pipeline for mapping and processing Hi-C data. F1000Res, 4, 1310
Pubmed
[18]
Servant, N., Varoquaux, N., Lajoie, B. R., Viara, E., Chen, C. J., Vert, J. P., Heard, E., Dekker, J. and Barillot, E. (2015) HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol., 16, 259
CrossRef Pubmed Google scholar
[19]
Serra, F., Baù, D., Filion, G. and Marti-Renom, M. A. (2016) Structural features of the fly chromatin colors revealed by automatic three-dimensional modeling. bioRxiv, doi: https://doi.org/10.1101/036764
[20]
Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., Homer, N., Marth, G., Abecasis, G., Durbin, R., and the 1000 Genome Project Data Processing Subgroup. (2009) The sequence alignment/map format and SAMtools. Bioinformatics, 25, 2078–2079
CrossRef Pubmed Google scholar
[21]
Ma, W., Ay, F., Lee, C., Gulsoy, G., Deng, X., Cook, S., Hesson, J., Cavanaugh, C., Ware, C. B., Krumm, A., (2015) Fine-scale chromatin interaction maps reveal the cis-regulatory landscape of human lincRNA genes. Nat. Methods, 12, 71–78
CrossRef Pubmed Google scholar
[22]
Hu, M., Deng, K., Selvaraj, S., Qin, Z., Ren, B. and Liu, J. S. (2012) HiCNorm: removing biases in Hi-C data via Poisson regression. Bioinformatics, 28, 3131–3133
CrossRef Pubmed Google scholar
[23]
Knight, P. A. and Ruiz, D. (2013) A fast algorithm for matrix balancing. IMA J. Numer. Anal., 33, 1029–1047
CrossRef Google scholar
[24]
Yaffe, E. and Tanay, A. (2011) Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture. Nat. Genet., 43, 1059–1065
CrossRef Pubmed Google scholar
[25]
Rao, S. S., Huntley, M. H., Durand, N. C., Stamenova, E. K., Bochkov, I. D., Robinson, J. T., Sanborn, A. L., Machol, I., Omer, A. D., Lander, E. S., (2014) A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell, 159, 1665–1680
CrossRef Pubmed Google scholar
[26]
Sexton, T., Yaffe, E., Kenigsberg, E., Bantignies, F., Leblanc, B., Hoichman, M., Parrinello, H., Tanay, A. and Cavalli, G. (2012) Three-dimensional folding and functional organization principles of the Drosophila genome. Cell, 148, 458–472
CrossRef Pubmed Google scholar
[27]
Filippova, D., Patro, R., Duggal, G. and Kingsford, C. (2014) Identification of alternative topological domains in chromatin. Algorithms Mol. Biol., 9, 14
CrossRef Pubmed Google scholar
[28]
Lévy-Leduc, C., Delattre, M., Mary-Huard, T. and Robin, S. (2014) Two-dimensional segmentation for analyzing Hi-C data. Bioinformatics, 30, i386–i392
CrossRef Pubmed Google scholar
[29]
Wang, Y., Li, Y., Gao, J. and Zhang, M. Q. (2015) A novel method to identify topological domains using Hi-C data. Quant. Biol., 3, 81–89
CrossRef Google scholar
[30]
Zhou, X., Lowdon, R. F., Li, D., Lawson, H. A., Madden, P. A., Costello, J. F. and Wang, T. (2013) Exploring long-range genome interactions using the WashU Epigenome Browser. Nat. Methods, 10, 375–376
CrossRef Pubmed Google scholar
[31]
The 3D Genome Browser.
[32]
Karolchik, D., Barber, G. P., Casper, J., Clawson, H., Cline, M. S., Diekhans, M., Dreszer, T. R., Fujita, P. A., Guruvadoo, L., Haeussler, M., (2014) The UCSC Genome Browser database: 2014 update. Nucleic Acids Res., 42, D764–D770
CrossRef Pubmed Google scholar
[33]
Asbury, T. M., Mitman, M., Tang, J. and Zheng, W. J. (2010) Genome3D: a viewer-model framework for integrating and visualizing multi-scale epigenomic information within a three-dimensional genome. BMC Bioinformatics, 11, 444
CrossRef Pubmed Google scholar
[34]
Lewis, T. E., Sillitoe, I., Andreeva, A., Blundell, T. L., Buchan, D. W., Chothia, C., Cozzetto, D., Dana, J. M., Filippis, I., Gough, J., (2015) Genome3D: exploiting structure to help users understand their sequences. Nucleic Acids Res., 43, D382–D386
CrossRef Pubmed Google scholar
[35]
Lewis, T. E., Sillitoe, I., Andreeva, A., Blundell, T. L., Buchan, D. W., Chothia, C., Cuff, A., Dana, J. M., Filippis, I., Gough, J., (2013) Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains. Nucleic Acids Res., 41, D499–D507
CrossRef Pubmed Google scholar
[36]
TADkit. available from http://sgt.cnag.cat/3dg/tadkit
[37]
Ay, F. and Noble, W. S. (2015) Analysis methods for studying the 3D architecture of the genome. Genome Biol., 16, 183
CrossRef Pubmed Google scholar
[38]
Schmitt, A. D., Hu, M. and Ren, B. (2016) Genome-wide mapping and analysis of chromosome architecture. Nat. Rev. Mol. Cell Biol., 17, 743–755
CrossRef Pubmed Google scholar
[39]
Ashish, N., Dewan, P., Ambite, J. L. and Toga, A. W. (2015) GEM: the GAAIN entity mapper. Data Integr. Life Sci., 9162, 13–27
CrossRef Pubmed Google scholar
[40]
Marco-Sola, S., Sammeth, M., Guigó, R. and Ribeca, P. (2012) The GEM mapper: fast, accurate and versatile alignment by filtration. Nat. Methods, 9, 1185–1188
CrossRef Pubmed Google scholar
[41]
Durand, N. C., Robinson, J. T., Shamim, M. S., Machol, I., Mesirov, J. P., Lander, E. S. and Aiden, E. L. (2016) Juicebox provides a visualization system for Hi-C contact maps with unlimited zoom. Cell Syst., 3, 99–101
CrossRef Pubmed Google scholar
[42]
Li, W., Gong, K., Li, Q., Alber, F. and Zhou, X. J. (2015) Hi-Corrector: a fast, scalable and memory-efficient package for normalizing large-scale Hi-C data. Bioinformatics, 31, 960–962
CrossRef Pubmed Google scholar
[43]
Sauria, M. E., Phillips-Cremins, J. E., Corces, V. G. and Taylor, J. (2015) HiFive: a tool suite for easy and efficient HiC and 5C data analysis. Genome Biol., 16, 237
CrossRef Pubmed Google scholar
[44]
Lun, A. T. and Smyth, G. K. (2015) diffHic: a Bioconductor package to detect differential genomic interactions in Hi-C data. BMC Bioinformatics, 16, 258
CrossRef Pubmed Google scholar

ACKNOWLEDGEMENTS

This work is supported by the National Basic Research Program of China (Nos. 2016YFA0100703 and 2015CB964800) and the National Natural Science Foundation of China (No. 31271354).

COMPLIANCE WITH ETHICS GUIDELINES

The authors Zhijun Han and Gang Wei declare they have no conflict of interest.
This article is a review article and does not contain any studies with human or animal subjects performed by any of the authors.

RIGHTS & PERMISSIONS

2017 Higher Education Press and Springer-Verlag Berlin Heidelberg
AI Summary AI Mindmap
PDF(466 KB)

Accesses

Citations

Detail

Sections
Recommended

/