
Friday, February 26, 2010

AGBT 2010 - Christopher Mason - Weill Cornell Medical College

Developmental Changes in Human Neocortical Transcriptome Revealed by RNA-Seq

How do we go from sequence to organism?

Example of a disease where they were able to find a change in an exon... but that's not the norm. The brain transcriptome is especially bad.

Complexity of transcriptome is vast.

NGS transformed the amount of data we're getting

Compared microarrays vs RNA-seq
* RNA-seq gives you much more information on DE.
* Metric for RNA-seq expression: RPKM (reads per kb per million reads)
* Controls: synthetic spike-ins with poly-A tails [next slide: control worked]
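
As a rough illustration of the reads-per-kb-per-million metric mentioned above, here is a minimal sketch. The function name, parameter names, and the example numbers are my own assumptions for illustration, not anything shown in the talk.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    All three arguments are hypothetical inputs chosen for illustration:
    read_count          reads mapped to the gene
    gene_length_bp      gene (or transcript) length in base pairs
    total_mapped_reads  total mapped reads in the sample
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total read count must be positive")
    # Dividing by (length / 1e3) and (total / 1e6) is the same as
    # multiplying the read count by 1e9 first.
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Made-up example: 500 reads on a 2 kb gene in a 10-million-read sample.
print(rpkm(500, 2000, 10_000_000))
```

The normalization is what lets a long gene and a short gene, or a deep run and a shallow run, be compared on the same scale.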

Looking at brain
* validate existing gene boundaries.
* longer isoforms
* find other genes
* 70-90% of genes expressed in the brain with strong neuro-developmental correlation
* Ensembl genes categories expressed: many types of RNAs found
* ~18% of splice forms are unique to each individual - splicing levels similar across development
* at high expression, 80-90% of genes have alt isoforms

[Lists of genes that were DE in fetal/adult brain - "things that make sense"]

What is different is Transcription Factors - especially Zinc Finger TFs.
* Shift towards fetal expression

Zinc Finger
* most rapidly expanding class of genes

Look at UTRs
* fetal brain exhibits myriad extensions of gene models and variable UTRs.
* TARs found. (Transcriptionally active regions) - confirmed with PCR

No visible end of gene discovery.
* the deeper you go, the more new things you see.

ROC plot
* sensitivity (TP / (TP + FN)) and specificity (TN / (TN + FP))
* looks incredible - nearly straight to 1.
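
For reference, the two quantities behind a ROC plot can be sketched as below. This is only the standard definition written out, with hypothetical function names and made-up counts; it is not the speaker's code or data.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


# Made-up counts for illustration only.
tp, fn, tn, fp = 90, 10, 95, 5
# A ROC plot puts sensitivity on the y-axis against
# (1 - specificity) on the x-axis, one point per threshold.
print(sensitivity(tp, fn), 1 - specificity(tn, fp))
```

A curve that goes "nearly straight to 1" means sensitivity gets close to 1 while the false positive rate is still close to 0.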

Source of "wiggles" in RNA-seq.
* it's everything, really
* annotation is one of the biggest sources of the problem.

Human genome is not just 33Mb.... it's only 1/2 to 1/5th of the exome capture.
* 165 Mb have been validated on multiple SeQC platforms!

There aren't just 20,000 genes - it's closer to 45,000!

Begat: every bp of the genome is a locus for testing, each remaining sequence is a variable.

Don't forget, we also have to filter out viruses/bacteria/other
* Code for Begat is available. (Email given - forgot to copy it down.)


3 Comments:

Blogger klafka said...

gah this, comparative RNA-seq/epigenetics across different neuron cell types !

February 26, 2010 11:28:00 PM PST  
Anonymous Anonymous said...

what does 'gah this' mean? no good?
elaborate!

February 27, 2010 11:39:00 AM PST  
Blogger Anthony Fejes said...

I can't find "gah this" in the article - I was worried it was my typo. I have no idea what the first comment means - but I enjoyed it!

GAH!

February 27, 2010 1:29:00 PM PST  
