
Thursday, February 25, 2010

AGBT 2010 - Daniel MacArthur - Wellcome Trust Sanger Institute

Loss-of-Function Mutations in Healthy Human Genomes: Implications for Clinical Genome Sequencing

[Missed the first couple of minutes.]
Analysis of 1000 genomes data.

Loss of Function sub-group
Aim: create a catalogue of variants predicted to result in severe disruption of gene function
What is a LOF variant: [annotation based on GENCODE v3b]
1. stop codon SNPs
2. splice disruption SNPs
3. frame shift indels
4. disruptive structural variants (e.g. loss of exons, loss of start codons...)
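
A minimal sketch of how the four classes above could be used to pull LOF candidates out of an annotated variant list. The consequence labels (`stop_gained`, `splice_disrupting`, and so on) and the variant IDs are hypothetical placeholders of mine, not the vocabulary used in the talk or in GENCODE.

```python
# Hypothetical consequence labels mapped to the four LOF classes from the talk.
# Real annotation pipelines use their own vocabularies.
LOF_CATEGORIES = {
    "stop_gained": "stop codon SNP",
    "splice_disrupting": "splice disruption SNP",
    "frameshift_indel": "frame shift indel",
    "disruptive_sv": "disruptive structural variant",
}


def classify_lof(consequence):
    """Return the LOF class for a consequence label, or None if it is not LOF."""
    return LOF_CATEGORIES.get(consequence)


# Made-up variants, for illustration only.
variants = [
    ("var1", "stop_gained"),
    ("var2", "synonymous"),
    ("var3", "frameshift_indel"),
]

# Keep only the variants that fall into one of the four LOF classes.
lof_calls = [
    (variant_id, classify_lof(consequence))
    for variant_id, consequence in variants
    if classify_lof(consequence) is not None
]
print(lof_calls)
```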

LOF variants:
* enriched for:
** severe recessive mutations
** other variants with functional effects
** neutral variants in redundant genes/pseudogenes
** sequencing and annotation artefacts

Many of these will be neutral.

3 pilots.
* total of 1,6556 unique genes affected.
* that is to say that a substantial portion of the genome has LOF variants
* acknowledging that there are errors, that's still a lot. (=

Disrupted genes per individual. Visible difference between European vs. Yoruba. (Africans have higher variability)

Structural variants seem relatively constant, splicing seems constant, stops seem to vary most. (CEU, CHB, JPT, YRI) [I'm eyeballing]

Expect to see some carriers for recessive disease mutations
* Several likely carrier mutations identified. [didn't catch them]

Derived allele frequency spectra.
* stop and splice are heavily shifted to the low end (0.05+)
LOF sites are enriched for artefacts
* Conserved regions have fewer polymorphisms, but an equal amount of error.
* Non-conserved regions have more polymorphisms, and equal error:
** thus the artefact rate (errors as a share of all calls) tends to be higher in conserved regions.
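
The arithmetic behind that point, as a small sketch. The counts are invented for illustration and are not figures from the talk; the only assumption carried over is that the number of erroneous calls is the same in both kinds of region.

```python
def artefact_fraction(true_variants, false_calls):
    """Fraction of all called variants that are artefacts."""
    return false_calls / (true_variants + false_calls)


# Illustrative numbers only, not from the talk.
false_calls = 10         # same number of erroneous calls in both region types
conserved_true = 40      # fewer real polymorphisms in conserved regions
nonconserved_true = 190  # more real polymorphisms in non-conserved regions

conserved_rate = artefact_fraction(conserved_true, false_calls)
nonconserved_rate = artefact_fraction(nonconserved_true, false_calls)

# With equal error, the region with fewer real variants has the higher
# share of artefacts among its calls.
print(conserved_rate, nonconserved_rate)
```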

LOF clustering points to mapping and annotation artefacts
* 91% of LOF-carrying genes contain only one LOF variant.
* there are some genes that are enriched for multiple independent LOF variants.
** many of them are CNV, seg dup, close paralogues.... which means that they're artefacts too.
* other annotation artefacts exist too... LOFs are making them stand out.
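
A rough sketch of the kind of per-gene tally this implies: count LOF variants per gene, report the share of genes with exactly one, and flag genes with several for a closer look. The gene names and calls are made up, and this is my own illustration rather than the method used in the talk.

```python
from collections import Counter

# Hypothetical (gene, variant) pairs, for illustration only.
lof_calls = [
    ("GENE_A", "v1"),
    ("GENE_B", "v2"),
    ("GENE_B", "v3"),
    ("GENE_C", "v4"),
    ("GENE_B", "v5"),
]

# Number of LOF variants seen in each gene.
per_gene = Counter(gene for gene, _ in lof_calls)

# Share of LOF-carrying genes that contain exactly one LOF variant.
single = sum(1 for count in per_gene.values() if count == 1)
fraction_single = single / len(per_gene)

# Genes with multiple LOF variants are candidates for mapping or
# annotation artefacts (CNVs, segmental duplications, close paralogues).
multi_lof_genes = sorted(gene for gene, count in per_gene.items() if count > 1)

print(fraction_single, multi_lof_genes)
```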

Beyond cataloging:
* large scale sequencing studies tend to produce many potential LOF candidates
* discriminate between disease causing and benign variations.
* is there a functional profile distinguishing recessive and LOF-tolerant genes?

Compare LOF-tolerant genes (& non-OR) to 725 recessive disease genes from OMIM. (Early results)
* use it to do classification
* linear discriminant analysis
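
For readers who haven't met linear discriminant analysis, here is a minimal two-class Fisher LDA sketch in plain Python. The two per-gene features and all of the numbers are hypothetical (I didn't catch which features were actually used), and equal class priors are assumed, so treat this as an illustration of the technique rather than the analysis presented.

```python
def mean(rows):
    """Column-wise mean of a list of 2-feature rows."""
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(2)]


def scatter(rows, mu):
    """2x2 scatter matrix of rows around their class mean mu."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - mu[0], r[1] - mu[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def fit_lda(class0, class1):
    """Fisher LDA: w = Sw^-1 (mu1 - mu0), threshold at the midpoint of the means."""
    mu0, mu1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, mu0), scatter(class1, mu1)
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [mu1[0] - mu0[0], mu1[1] - mu0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(mu0[0] + mu1[0]) / 2, (mu0[1] + mu1[1]) / 2]
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def predict(w, threshold, x):
    """Return 1 (recessive-like) or 0 (LOF-tolerant-like)."""
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0


# Made-up feature values per gene; class 0 = LOF-tolerant, class 1 = recessive disease.
lof_tolerant = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.2, 0.4)]
recessive = [(0.8, 0.7), (0.9, 0.9), (0.7, 0.8), (0.6, 0.9)]

w, threshold = fit_lda(lof_tolerant, recessive)
print(predict(w, threshold, (0.85, 0.8)), predict(w, threshold, (0.15, 0.2)))
```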

[Kind of feels like a fast drive-by blogging... my notes really didn't do justice to Daniel's explanations. I just managed to get down some of the points.]


