RESEARCH ARTICLE

A study of biases of DNA copy number estimation based on PICR model

  • Quan WANG 1 ,
  • Jianghan QU 2 ,
  • Xiaoxing CHENG 3 ,
  • Yongjian KANG 2 ,
  • Lin WAN 1,3,4 ,
  • Minping QIAN 1,3 ,
  • Minghua DENG 1,3,5
  • 1. Center for Theoretical Biology, Peking University, Beijing 100871, China
  • 2. Yuanpei College, Peking University, Beijing 100871, China
  • 3. School of Mathematical Sciences, Peking University, Beijing 100871, China
  • 4. Molecular and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
  • 5. Center for Statistical Sciences, Peking University, Beijing 100871, China

Received date: 01 Sep 2010

Accepted date: 16 Mar 2011

Published date: 01 Dec 2011

Copyright

2014 Higher Education Press and Springer-Verlag Berlin Heidelberg

Abstract

Affymetrix single-nucleotide polymorphism (SNP) arrays have been widely used for SNP genotype calling and copy number variation (CNV) studies, both of which depend heavily on accurate DNA copy number estimation. However, copy number estimation methods face several difficulties: probe-dependent binding affinity, cross-hybridization of probes, and whole genome amplification (WGA) of DNA sequences. The probe intensity composite representation (PICR) model, a previously established approach, copes with most of these complexities and achieves high accuracy in SNP genotyping. Nevertheless, the copy numbers estimated by the PICR model still show array- and site-dependent biases in CNV studies. In this paper, we propose a procedure to adjust for these biases and then make CNV inferences based on both the PICR model and our method. The comparison indicates that our correction of copy numbers is necessary for CNV studies.

Cite this article

Quan WANG, Jianghan QU, Xiaoxing CHENG, Yongjian KANG, Lin WAN, Minping QIAN, Minghua DENG. A study of biases of DNA copy number estimation based on PICR model[J]. Frontiers of Mathematics in China, 2011, 6(6): 1203-1216. DOI: 10.1007/s11464-011-0125-x
