David Dooling, Washington University School of Medicine, “Next-Generation Informatics”
The rate of change is far outstripping Moore's Law.
So: Framing the problem - Viewpoints:
LIMS: [Picture of Richard Stallman... nice!] How do we process and track information?
Analysis: [Picture of Freud... also nice... same beard?] How do we process and extract information?
Project Leads: In, and Out... what's the answer?
Pipelines: Always changing! Buffers, software, tools, etc., etc.!
Analysis: Changing pipeline: the proliferation of data has led to a proliferation of tools.
So how do we do things on a massive scale while still dealing with the constant change?
“We've always been pushing the envelope...” using the past as a guide to how to deal with the change.
As developers, put it in terms of flow charts, databases, pathways, etc., to get a handle on the problem.
How we deal with it: Regular entities to event entities to processing directives
The problem comes when the processing directives change, and they change substantially and frequently. To deal with this, entities were classified, and the work was abstracted into big units, each of which can be modular. Because the units are modular, one can be substituted for another.
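To make the "modular units can be substituted" idea concrete, here is a minimal sketch in Python. It is not the speaker's code: the step names, the `run_pipeline` helper, and the toy read data are all hypothetical, chosen only to show one unit being swapped without touching the rest of the pipeline.

```python
from typing import Callable, Dict, List

# Hypothetical step type: every unit takes a list of reads and returns a list of reads.
Step = Callable[[List[str]], List[str]]


def trim_reads(reads: List[str]) -> List[str]:
    """Stand-in trimming unit: drop the last base of each read."""
    return [r[:-1] for r in reads]


def filter_short(reads: List[str]) -> List[str]:
    """Stand-in filter unit: keep reads of at least 4 bases."""
    return [r for r in reads if len(r) >= 4]


def filter_short_strict(reads: List[str]) -> List[str]:
    """Drop-in replacement for filter_short with a stricter cutoff."""
    return [r for r in reads if len(r) >= 6]


def run_pipeline(steps: Dict[str, Step], order: List[str], reads: List[str]) -> List[str]:
    """Run the named units in order; the pipeline only knows names, not implementations."""
    for name in order:
        reads = steps[name](reads)
    return reads


steps: Dict[str, Step] = {"trim": trim_reads, "filter": filter_short}
order = ["trim", "filter"]
reads = ["ACGTA", "ACG", "ACGTACGT"]

default_result = run_pipeline(steps, order, reads)

# The "processing directive" changes: substitute one unit, leave everything else alone.
steps["filter"] = filter_short_strict
strict_result = run_pipeline(steps, order, reads)
```

Because every unit shares the same interface, a change of directive becomes a one-line substitution rather than a rewrite of the pipeline.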
1. Created an object-relational mapping (ORM) layer.
3. Dynamic command-line interface.
4. Integrated documentation system.
The ORM was created from scratch because none of the existing ones could cope with the workload being demanded of them. Everything works in XML, so you can verify the flow, and it makes parallelization easier.
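As a rough illustration of why describing a workflow in XML helps with both verification and parallelization, here is a small Python sketch. The XML layout (`workflow`, `step`, `needs`) and the function names are my own assumptions, not the format described in the talk. The point is only that a declarative description can be checked for dangling references and cycles before anything runs, and that steps with no unmet dependencies can be grouped to run side by side.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow description; the element and attribute names are assumptions.
WORKFLOW_XML = """
<workflow name="example">
  <step id="align"/>
  <step id="call_variants"><needs ref="align"/></step>
  <step id="coverage"><needs ref="align"/></step>
  <step id="report"><needs ref="call_variants"/><needs ref="coverage"/></step>
</workflow>
"""


def load_workflow(xml_text):
    """Parse the XML into a dict mapping each step id to the ids it depends on."""
    root = ET.fromstring(xml_text)
    return {
        step.get("id"): [n.get("ref") for n in step.findall("needs")]
        for step in root.findall("step")
    }


def parallel_stages(deps):
    """Verify the flow, then group steps into stages that could run in parallel."""
    # Verification 1: every dependency must refer to a declared step.
    for step, needs in deps.items():
        for ref in needs:
            if ref not in deps:
                raise ValueError(f"step {step!r} needs unknown step {ref!r}")

    done = set()
    stages = []
    while len(done) < len(deps):
        # A step is ready once all of its dependencies have completed.
        ready = sorted(
            s for s, needs in deps.items()
            if s not in done and all(n in done for n in needs)
        )
        # Verification 2: if nothing is ready but work remains, there is a cycle.
        if not ready:
            raise ValueError("workflow contains a dependency cycle")
        stages.append(ready)
        done.update(ready)
    return stages


deps = load_workflow(WORKFLOW_XML)
stages = parallel_stages(deps)
```

Each inner list in `stages` holds steps that do not depend on one another, so a scheduler could dispatch them at the same time.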
All of these things together become the “Genome Model”, a thin wrapper around all their tools that gives you a massively parallel system with excellent data management and reporting.
Yikes... has an easy Perl API. [Everyone likes Perl? Count me out.]
Working model for employees: pairing. Analysts are paired with programmers so that better software is written.
Still much more to do.
Sequencing is demolishing Moore's Law.
The cult of traces: the desire to have raw information at our fingertips. (Venn diagrams don't scale well, but things like Circos do!)
Labels: AGBT 2009