AGBT 2010 - Illumina Workshop
Introducing the HiSeq 2000(tm)
- redefining the trajectory of sequencing
- Jared from Marketing
Overview of machine.
- real data of Genome and transcriptome
- more than 2 billion base pairs per run
- more than 25Gb per day
- uses line scanning (scan in rows, like a photocopier, instead of a whole picture at once, like a camera)
- now uses "dual surface engineering": image both the top and bottom surface, which means you have twice as much area to form clusters
- Machine holds two individual flow cells
- flow cells are held in by a vacuum
- simple insertion - just toggle a switch through three positions - an LED lights up when you've turned it on.
- preconfigured reagents - bottles all stacked together: just push in the rack
- touch screen user interface
- "wizard" like set up for runs
- realtime metrics available on interface - even an iPod app (available for iPad too)
- multimedia help will walk you through things you may not understand.
- major focus on ease of use
- it has the "simplest workflow" of any of the sequencing machines available
- tile size reduced [that's what I wrote but I seem to recall him saying that the number of tiles is smaller, but the tiles themselves are larger?]
- 1 run can now do a 30x coverage for a cancer and a normal (one in each flow cell.)
- 2 methylomes can be done in a week
- you could do 20 RNA-Seq experiments in 4 days.
- error rates and feel of data are similar if not identical to the GAIIx.
- from a small sampling of experiments shown it looks like error rate is very slightly higher
- Demonstrated 300Gb/run, more than 25Gb per day at release
- PET 2x100 supported.
- Software is the same as for the GAII [Although somewhere in the presentation, I heard that they are working on a new version of the pipeline (v 1.6?)... no details on it, though.]
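A rough sanity check on the throughput figures above, as a minimal Python sketch. The 3.1 Gb haploid human genome size is an assumption of mine, not a number from the talk, and the helper names are hypothetical. It also treats every sequenced base as usable, which real runs will not achieve.

```python
# Back-of-the-envelope check of the quoted throughput figures.
# ASSUMPTION: haploid human genome size of ~3.1 Gb (not stated in the talk).
HUMAN_GENOME_GB = 3.1


def raw_coverage(yield_gb, genome_gb=HUMAN_GENOME_GB):
    """Average raw fold-coverage if every sequenced base landed on the genome."""
    return yield_gb / genome_gb


def run_days(yield_gb, gb_per_day):
    """Days needed to produce yield_gb at a steady gb_per_day."""
    return yield_gb / gb_per_day


run_yield_gb = 300.0                 # demonstrated yield per run (from the talk)
per_flow_cell_gb = run_yield_gb / 2  # the machine holds two flow cells

print(f"raw coverage per flow cell: {raw_coverage(per_flow_cell_gb):.0f}x")
print(f"days at 25 Gb/day: {run_days(run_yield_gb, 25.0):.0f}")
```

Under these assumptions each flow cell yields about 48x raw coverage, which leaves some headroom over the 30x quoted for a cancer/normal pair. The run length only follows if the 25Gb per day rate and the 300Gb demonstration refer to the same configuration, which the talk did not make clear.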
Elliott Margulies, NHGRI/NIH Sequencing
- talking about projects today for the undiagnosed disease program
- basically same as in his earlier talk [notes are already posted.]
- use cross_match to do realignment of reads that don't map the first time
- use MPG scores
[In a technology talk, I didn't want to take notes on the experiment itself... these points are mainly on the HiSeq data.]
Data set: concordance with SNP Chips was in the range of 98% for each flow cell, 99% when both are combined (72x coverage)
- Speed: Increased throughput
- more focus on biology rather than on tweaking pipelines and bioinformatic processing. (e.g., biological analysis takes the front seat.)
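The talk did not say how concordance with the SNP chips was computed. One common definition is the fraction of sites genotyped on both platforms where the two calls agree; the sketch below assumes that definition. The function name and the toy genotypes are hypothetical, made up purely for illustration.

```python
def genotype_concordance(seq_calls, chip_calls):
    """Fraction of sites genotyped on both platforms where the calls agree.

    Both arguments map a site identifier to a genotype string such as "AG".
    Allele order is ignored, so "AG" and "GA" count as the same genotype.
    """
    shared = seq_calls.keys() & chip_calls.keys()
    if not shared:
        raise ValueError("no sites in common")
    agree = sum(
        sorted(seq_calls[site]) == sorted(chip_calls[site]) for site in shared
    )
    return agree / len(shared)


# Toy data, invented for illustration only (not from the talk).
chip = {"rs1": "AG", "rs2": "CC", "rs3": "TT", "rs4": "AC"}
seq = {"rs1": "GA", "rs2": "CC", "rs3": "CT", "rs5": "GG"}

print(f"concordance: {genotype_concordance(seq, chip):.3f}")
```

Only sites present in both call sets are scored, so a site that sequencing missed entirely does not lower the number. That is worth keeping in mind when comparing the 98% per-flow-cell figure to the 99% combined figure, since deeper coverage may change which sites get called at all.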
Working on a project for Body Map 2.0 : Total human transcriptome
- 16 tissues, each PET 2x50bp, 1x75bp
- $8,900 for 1x50bp
- multiplexing will reduce cost further.
- if you only need 7M reads, you could multiplex 192 samples (on both cells, I assume), and the cost would be $46 (including sequencing, not sample prep).
[which just makes the whole cost equation that much more vague in my mind... Wouldn't it be nice to know how much it costs to do the whole process?]
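The $46 figure is consistent with simply dividing the $8,900 quoted above across 192 samples, although the talk did not spell that out. A minimal sketch of that arithmetic, with a hypothetical helper name and sample prep left out, as in the talk:

```python
def cost_per_sample(run_cost_usd, n_samples):
    """Sequencing-only cost per sample when a run is split evenly."""
    return run_cost_usd / n_samples


run_cost = 8900.0             # quoted price for 1x50bp (from the talk)
n_samples = 192               # multiplexed samples across both flow cells
reads_per_sample = 7_000_000  # "7M reads" per sample

print(f"${cost_per_sample(run_cost, n_samples):.2f} per sample")
print(f"{n_samples * reads_per_sample / 1e9:.3f} billion reads needed in total")
```

This assumes an even split of reads between barcodes and ignores any reads lost to demultiplexing, so the real per-sample cost would be somewhat higher.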
[Many examples of how RNA-seq looks on HiSeq 2000 (tm)]
- output has 5 billion reads, 300Gb of data.
Present a graph
- amount of sequence per run.
- looks like a "hockey stick graph"
[Shouldn't it be sequence per machine per day? It'd still look good - and wouldn't totally shortchange the work done on the human genome project. This is really a bad graph.... at least put it on a log scale.]
In the past 5 years:
- 10^4 scale up in throughput
- 10^7 scale up in parallelizations
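To put those scale factors in more familiar terms, here is a small sketch that converts a total fold-increase over 5 years into a per-year factor and a doubling time. It assumes smooth exponential growth over the whole period, which is my simplification and not something claimed in the talk; the function names are hypothetical.

```python
import math


def annual_factor(total_factor, years):
    """Per-year growth factor, assuming steady exponential growth."""
    return total_factor ** (1.0 / years)


def doubling_time_months(total_factor, years):
    """Months needed to double, assuming steady exponential growth."""
    return 12.0 * years * math.log(2) / math.log(total_factor)


for label, factor in [("throughput", 1e4), ("parallelization", 1e7)]:
    print(
        f"{label}: {annual_factor(factor, 5):.1f}x per year, "
        f"doubling every {doubling_time_months(factor, 5):.1f} months"
    )
```

Under that assumption, a 10^4 increase in 5 years works out to roughly 6.3x per year, or a doubling about every 4.5 months. Growth like this is a straight line on a log scale, which is one more argument for plotting it that way.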
Buzzwords about the future of the technology:
- "Democratizing sequencing"
- "putting it to work"