Thanks for visiting my blog - I have now moved to a new location at Nature Networks. Url: - Please come visit my blog there.

Tuesday, February 17, 2009

FindPeaks 3.3

I have to admit, I'm feeling a little shy about writing on my blog, since the readership jumped in the wake of AGBT. It's one thing to write when you've got 30 people reading your blog... and yet another thing when there are 300 people reading it. I suppose if I can't keep up the high standards I've been trying to maintain, people will stop reading it and then I won't have anything to feel shy about... Either way, I'll keep doing the blog because I'm enjoying the opportunity to write in a less-than-formal format. I do have a few other projects on the go as well, which include a few more essays on personal health and next-gen sequencing... I think I'll aim for one "well thought through essay" a week, possibly on Fridays. We'll see if I can manage to squeeze that in as a regular feature from now on.

In addition to blogging, the other thing I'm enjoying these days is the programming I'm doing in Java for FindPeaks 3.3 (which is the unstable version of FindPeaks 4.0). It's taking a lot longer to get going than I thought it would, but the efforts are starting to pay off. At this point, a full ChIP-seq experiment (4 lanes of Illumina data + 3 lanes of control data) can be fully processed in about 4-5 minutes. That's a huge difference from the 40 minutes it would have taken with previous versions, and that would have covered the sample only, without the control.

Of course, the ChIP-seq field hasn't stood still, so a lot of this is "catch-up" to the other applications in the field, but I think I've finally gotten it right with the stats. With some luck, this will be much more than just a catch-up release, though. It will probably be a few more days before I produce a 4.0 alpha, but it shouldn't be long, now. Just a couple more bugs to squash. (-;

At any rate, in addition to the above subjects, there are certainly some interesting things going on in the lab, so I'll need to put more time into those projects as well. As a colleague of mine said to me recently, you know you're doing good work when you feel like you're always in panic mode. I guess this is infinitely better than being underworked! In case anyone is looking for me, I'm the guy with his nose pressed to the monitor, fingers flying on the keyboard and the hunched shoulders. (That might not narrow it down that much, I suppose, but it's a start...)



Anonymous said...

Hi, is 4+3 lanes really used for one experiment only? What number of reads does it correspond to on the current runs?

February 18, 2009 6:08:00 AM PST  
Anthony Fejes said...

Yes, it was all one ChIP-seq experiment. Depending on what it is you're trying to place on the genome, you probably won't get saturation in a single lane, especially if you're looking at runs from about a year ago, which these were. For the record, that seems to be about 12 million reads each for the sample and the control.

February 18, 2009 7:11:00 AM PST  
