Statistical Methods for the Analysis of Copy Number Variation

by Tan Hoang Nguyen

Institution: University of Otago
Year: 0
Keywords: copy number variation; CCL3L1; DEFB4; read depth; split read; 1000 Genomes; FCGR3B
Record ID: 1314096
Full text PDF: http://hdl.handle.net/10523/5434


Copy number variation (CNV) is a type of genomic structural variation that has been associated with disease risk in humans and with trait variation in agricultural species. It has also been implicated in adaptive natural selection. Advances in next-generation sequencing (NGS) technologies facilitate the determination of CNV at specific loci. In this study, computational approaches based on NGS data were developed and applied to specific genomic loci. First, a read-depth based method was developed specifically for the complex FCGR genetic locus. The pipeline was used to measure copy number at the FCGR3A/3B genes and to identify SNPs associated with CNV (tag SNPs) at this locus. Next, this method was modified and applied to two highly copy-number variable regions, CCL3L1 and DEFB103A. The modified pipeline determined putative CNV boundaries in these two regions and reported copy number (CN) genotypes for both genes. This methodology was also used to identify novel polymorphic regions on chromosome 17 of the human genome. Evidence of selective pressure at two loci, CCL3L1 and FCGR3B, was then investigated using tag SNP and CN information from the modified pipeline. Finally, an integrated framework combining read-depth and split-read based approaches was developed to pinpoint the breakpoints of CNV events occurring across samples.