A study of biases of DNA copy number estimation based on PICR model
Quan WANG, Jianghan QU, Xiaoxing CHENG, Yongjian KANG, Lin WAN, Minping QIAN, Minghua DENG
Affymetrix single-nucleotide polymorphism (SNP) arrays are widely used for SNP genotype calling and copy number variation (CNV) studies, both of which depend heavily on accurate DNA copy number estimation. However, copy number estimation methods face several difficulties: probe-dependent binding affinity, cross-hybridization of probes, and whole genome amplification (WGA) of DNA sequences. The probe intensity composite representation (PICR) model, a previously established approach, copes with most of these complexities and achieves high accuracy in SNP genotyping. Nevertheless, the copy numbers estimated by the PICR model still show array- and site-dependent biases in CNV studies. In this paper, we propose a procedure to adjust for these biases and then make CNV inferences using both the original PICR estimates and our corrected estimates. The comparison indicates that our correction of copy numbers is necessary for CNV studies.
single-nucleotide polymorphism (SNP) array / copy number variation (CNV) / cross-hybridization
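The abstract does not spell out the correction procedure, so the following is only a rough illustration of what array- and site-dependent biases look like and how they could be removed. It assumes, purely for illustration, that both biases are additive on the log2 scale and applies a generic median polish. This is not the authors' method; the function name `remove_array_and_site_bias` and the toy numbers are hypothetical.

```python
import numpy as np


def remove_array_and_site_bias(log_cn, n_iter=10):
    """Median polish on a matrix of log2 copy number estimates.

    log_cn: 2-D array, rows = arrays (samples), columns = SNP sites.
    Returns residuals after removing additive row (array) and column
    (site) effects. Assumption: biases are additive on the log2 scale.
    """
    resid = np.asarray(log_cn, dtype=float).copy()
    for _ in range(n_iter):
        # Remove the array-dependent effect (one offset per row).
        resid -= np.median(resid, axis=1, keepdims=True)
        # Remove the site-dependent effect (one offset per column).
        resid -= np.median(resid, axis=0, keepdims=True)
    return resid


# Toy data: 5 arrays x 5 sites, normal copy number 2 (log2 = 1),
# plus hypothetical array and site offsets.
array_bias = np.array([0.3, -0.1, 0.0, 0.2, -0.3])
site_bias = np.array([-0.2, -0.1, 0.0, 0.1, 0.2])
log_cn = 1.0 + array_bias[:, None] + site_bias[None, :]

# Simulate a single-copy gain (2 -> 3 copies) at array 0, site 4.
log_cn[0, 4] += np.log2(1.5)

residuals = remove_array_and_site_bias(log_cn)
print(np.round(residuals, 3))
```

Medians are used instead of means so that a sparse true copy number change is not absorbed into the estimated array or site offset. After polishing, the simulated gain stands out as the only clearly nonzero residual.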