Tom Hudson – Ontario Institute for Cancer Research “Genome Variation and Cancer”
1200 cases and 1200 controls
looking for predictors of disease
1536 SNPs from candidate genes, in 10K coding non-synonymous SNPs, Affy 100K and 500K arrays.
Eventually found a hit in a gene desert (long intergenic non-coding RNA... learned the name this morning). Close to MYC, but hasn't been correlated to anything.
In the last year, 10 validated loci in 10,000 individuals, with very small odds ratios (1.10 to 1.25). One of them is in a gene: SMAD7. 5 loci are also near genes that are involved in things... but are not actually in the gene.
Since there are 10 alleles, you'd think it would be a distribution; however, most people carry 9 (27%)! There is also a linear relationship between the number of alleles and the risk of developing cancer. However, these still don't seem to be the causative alleles.
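To get a feel for how such small per-allele odds ratios can add up, here is a minimal sketch. It assumes a multiplicative (log-additive) model, which the talk did not spell out, and the per-allele odds ratio of 1.15 is just an illustrative value inside the 1.10 to 1.25 range quoted above. The function names are my own.

```python
import math


def combined_odds_ratio(per_allele_or, n_risk_alleles):
    """Odds ratio relative to a carrier of zero risk alleles.

    Assumption: every risk allele multiplies the odds by the same
    factor (log-additive model). Real loci each have their own OR.
    """
    return per_allele_or ** n_risk_alleles


def log_odds_by_allele_count(per_allele_or, max_alleles):
    """Log odds ratio for 0..max_alleles risk alleles.

    Under the assumed model this is a straight line in the allele count.
    """
    return [n * math.log(per_allele_or) for n in range(max_alleles + 1)]


if __name__ == "__main__":
    # Illustrative per-allele OR only; not a figure from the talk.
    for n in (0, 5, 9, 10):
        print(n, round(combined_odds_ratio(1.15, n), 2))
```

Under this assumption the relationship is linear on the log-odds scale, which may or may not be the scale the speaker had in mind.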
Enrichment of Target Regions. Using a specific chip with 3.14Mb of colon cancer-specific regions. Those regions didn't take all of the space, so they added other colon cancer gene sequences as well.
Protocol: 6ng, fragmentation (300-500bp)... [I'm too slow]
Exon capture arrays are being used, and preliminary results: 40 DNAs: 65Gb.
Use MAQ to do alignments. Coverage 75% at 10X, 95.6% at 1X.
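For reference, "coverage at NX" figures like these are usually the fraction of target bases covered by at least N reads. A minimal sketch of that calculation, assuming you already have a per-base depth list for the target regions (the toy depths below are made up, and this is not MAQ's own output format):

```python
def breadth_of_coverage(depths, min_depth):
    """Fraction of target bases with read depth >= min_depth.

    `depths` is a list of per-base read depths over the target regions.
    Returns 0.0 for an empty list.
    """
    if not depths:
        return 0.0
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


if __name__ == "__main__":
    # Made-up depths for ten target bases, for illustration only.
    toy_depths = [0, 3, 12, 45, 70, 8, 10, 150, 1, 22]
    print("at 1X:", breadth_of_coverage(toy_depths, 1))
    print("at 10X:", breadth_of_coverage(toy_depths, 10))
```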
“More than 99% of gDNA has % GC that allows effective capture”
Analyzable Target Regions: 39175, 232 coding exons
Average coverage: 70.3
40 individuals yield 8,706 SNPs
New SNPs: 2,397
Total number in coding exons: 77
Sequencing data compared to Affy data, very high concordance.
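No concordance number made it into my notes, but the comparison itself is simple to sketch. The following assumes genotypes are stored as two-letter strings keyed by SNP ID, and that only SNPs called on both platforms are compared; the rs-style IDs and the function names are hypothetical, not anything shown in the talk.

```python
def normalize_genotype(genotype):
    """Treat 'AG' and 'GA' as the same unordered genotype."""
    return "".join(sorted(genotype.upper()))


def genotype_concordance(seq_calls, array_calls):
    """Fraction of shared SNPs where both platforms give the same genotype.

    Both arguments map SNP ID -> genotype string (e.g. 'AG').
    Returns None when the two call sets share no SNPs.
    """
    shared = set(seq_calls) & set(array_calls)
    if not shared:
        return None
    matches = sum(
        1
        for snp in shared
        if normalize_genotype(seq_calls[snp]) == normalize_genotype(array_calls[snp])
    )
    return matches / len(shared)


if __name__ == "__main__":
    # Hypothetical calls for illustration only.
    sequencing = {"rs_example1": "AG", "rs_example2": "CC", "rs_example3": "TT"}
    array = {"rs_example1": "GA", "rs_example2": "CT", "rs_example4": "AA"}
    print(genotype_concordance(sequencing, array))
```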
Rare alleles may be driving risk in several sporadic cases. Stop codons were found in 6 individuals with sporadic CRC.
Follow up genotyping is required to validate new SNPs and correlate with phenotype.
Second topic: International Cancer Genome Consortium.
“Every cancer patient is different, every tumour is different.” Lessons learned: Huge amount of heterogeneity within and across tumour types. High rate of abnormality, and sample quality matters!
50 tumour types x 500 tumours = 50,000 genomes (presumably counting a matched normal genome for each tumour).
Major issues: Specimens, consent, quality measures, goals, datasets, technologies, data releases.
[Mostly discussion of the mechanics of the project management, who's involved and where it's happening, as well as tumours, which I'm sure can be found on the OICR's web page.] OICR is committed to 500 tumours, using Illumina and SOLiD. They are also creating cell lines and the like, so there will be a good resource available.
Pancreatic data sets should be available on the OICR web page by June 2009.
Question: why Illumina and SOLiD? Answer: they didn't know which would mature faster. By doing both, they have more confidence in SNPs. They never know which will win in the end, either.
My Comments: Not a lot of science content in the second half, but quite neat to know they've had success with their CRC work. It seems like a huge amount of work for a very small amount of information, but still quite neat.
Labels: AGBT 2009