Thursday, February 25, 2010

AGBT 2010 - Carlos Bustamante - Stanford University School of Medicine

Complete Genome Sequencing and Analysis of a Diploid African-American and Mexican-American Genome: Implications for Personal Ancestry Reconstruction and Multi-Ethnic Medical Genomics

Motivation and Objectives:
* GWAS has been successful
* Many traits, however, are not being explained by GWAS
* Understanding rare and common genetic variants will require multi-ethnic sequencing

* Resequence two admixed genomes to high coverage
* Compare to population and demographic models
* Understand diversity [?... missed this point]

* Establish resource for studying human population genetics, recent demography and admixture
* Using Affymetrix 500k
* 500 samples

You can cluster by ancestry using principal component analysis.
* Admixing: S. Asian and Mexican.
* Using PC 3 and PC 4, you get a huge amount of diversity from Native Americans that isn't sampled by current chips.

Think about approaches that are "dated ribbon?"
* proportion of African ancestry P = b / (a+b)
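
As a small illustration of the proportion above: the notes do not define a and b, so this sketch assumes b is the total genome length assigned to African ancestry and a is the total assigned to the other ancestry. The function name and the numbers are made up for the example.

```python
def ancestry_proportion(a: float, b: float) -> float:
    """Return P = b / (a + b).

    Assumption (not from the talk): a and b are total tract lengths,
    in the same units, for the two ancestries being compared.
    """
    total = a + b
    if total <= 0:
        raise ValueError("a + b must be positive")
    return b / total


# Hypothetical totals in megabases: 40 Mb non-African, 160 Mb African.
p_african = ancestry_proportion(a=40.0, b=160.0)
print(p_african)
```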

[I'm really missing stuff in this talk - it's very quick, and I know nothing about admixing.... will read up on that later.]

* PCA along windows across genome
* Use HMM for Admixture Estimation in African Americans
* This identifies "ancestry switchpoints", which seem to be crossover events where the ancestry changes from one to the other within the same chromosome.
* Multiple events in one chromosome are possible.
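
To make the HMM idea concrete, here is a minimal sketch, not the method from the talk. It assumes each genomic window already has a noisy best-guess ancestry label (for example from windowed PCA), uses two hidden states, and picks the most likely ancestry path with the Viterbi algorithm. The state names, `p_stay` and `p_correct` are illustrative assumptions.

```python
import math

STATES = ("AFR", "EUR")  # hypothetical two-ancestry model


def viterbi(obs, p_stay=0.95, p_correct=0.8):
    """Most likely ancestry path for a list of noisy per-window labels.

    p_stay:    assumed probability that adjacent windows share ancestry.
    p_correct: assumed probability that a window's label is right.
    """
    if not obs:
        return []

    def log_trans(prev, cur):
        return math.log(p_stay if prev == cur else 1.0 - p_stay)

    def log_emit(state, label):
        return math.log(p_correct if state == label else 1.0 - p_correct)

    # Initialise with a uniform prior over the two ancestries.
    score = {s: math.log(0.5) + log_emit(s, obs[0]) for s in STATES}
    back = []
    for label in obs[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_trans(p, cur))
            new_score[cur] = (score[best_prev] + log_trans(best_prev, cur)
                              + log_emit(cur, label))
            pointers[cur] = best_prev
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def switchpoints(path):
    """Indices of windows where the inferred ancestry changes."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


# Made-up labels: one isolated "EUR" call inside an "AFR" stretch, then a real switch.
labels = ["AFR"] * 5 + ["EUR"] + ["AFR"] * 4 + ["EUR"] * 6
path = viterbi(labels)
print(switchpoints(path))
```

With these settings the isolated call is smoothed away, because two switches cost more than one mislabeled window, while the long run at the end is kept as a genuine switchpoint.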

Individual ancestry results:
* You get a lot of variation in ancestry content across single chromosomes; you can then quantify this amount.
* Latin Americans, however, are all over the place - they are really mosaic.

Great variety in amount of ancestry and location of breakpoints.

Take home message:
Personal ancestry reconstruction, including detection of admixture tracts, is feasible on a genome-wide scale

How to improve the ability to deconvolute this using sequencing:
* Use reference human genome samples sequenced with SOLiD.
* includes 100 genomes data

New Tools:
* "STRUCTURE 2.0-like" algorithm -- J. Degenhardt

Use Reconstruction - shows each chromosome in different colors representing which of the ancestries is likely at each window.
* Can see small regions. Are they important? Are they real? Do they matter?

Haplotype-based Admixture deconvolution
* can reveal fine-scale admixture.
* Seems the signals (small regions) are real.
* Lots of small regions (segments) of diverse ancestry signatures in the genome.
* Do they happen at hotspots?

Looked at the Mexican genome
* Many more switch points than the previous example (African-American)
* [Sums up history.... ]

Distribution of Ancestry switches is used to compare
* can look at history of mixture - correlates with how long the populations have been mixing
* Scales as (1 + k) / (1 + theta).
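
One simple way to summarize ancestry switches is to turn a per-window ancestry path into tract lengths and count the switches. The sketch below does only that; it does not implement the scaling expression above, and the labels are made up. The general expectation (my gloss, not a claim from the talk) is that more generations of recombination since admixture give more switches and shorter tracts.

```python
from itertools import groupby


def tract_lengths(labels):
    """Run-length encode a per-window ancestry path into (ancestry, n_windows)."""
    return [(ancestry, sum(1 for _ in run)) for ancestry, run in groupby(labels)]


# Hypothetical ancestry path along one chromosome, one label per window.
labels = ["AFR"] * 6 + ["EUR"] * 2 + ["AFR"] * 3 + ["EUR"] * 1

tracts = tract_lengths(labels)
n_switches = len(tracts) - 1                       # one switch between adjacent tracts
mean_len = sum(n for _, n in tracts) / len(tracts)  # mean tract length in windows

print(tracts, n_switches, mean_len)
```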

Can use this information to find "time to most recent common ancestor" (TMRCA)

Extrapolate this to show lengths of time for whole chromosomes and genomes.
* TMRCA varies dramatically along the genome.
* Also fits nicely with SNP work that people have done (dbSNP mentioned as well.)

Functional implications
* Discovered ~10,000 nonsynonymous (NS) SNPs in each genome (varies by individual)
** Some might be deleterious...
* Functional annotation of nsSNPs using PolyPhen
* Show that admixed populations share more SNPs, I think.
* SNPs that are probably damaging are highest in CEPH and MEX.

[Moving very fast over a bunch of slides showing the same message - no notes here.]

Bottleneck in European founding population - Europeans show more deleterious SNPs.

Demographic models explain difference in dN/dS

* 3M SNPs for each genome
* 10k nsSNPs
* Thinking about demographic history is important
* They've really only been working on one small bit of the diversity of the human genome. More will be necessary moving forward for medical applications.


