Thanks for visiting my blog - I have now moved to a new location at Nature Networks. Please come visit my blog there.

Thursday, May 17, 2007

Grad School Nightmare

Yep... it finally happened. A few months in, and way later than it should have.

I found a nice little email from the data tracking system, keeping me up to date: none of the data I'd been using for the past 5 months is actually real. All those pretty graphs, charts and even the poster I made were based on a lot of hot air.

I get the feeling that this isn't as uncommon as it appears, but in any case, better now than later. Unfortunately, it should have been much sooner: We'd started working on the verification in January, and because a bunch of people just figured it had a low priority (and since it was just a grad student's project, I imagine it got bumped further and further down the priority stack), it took nearly 4 months to get done.

Four months of work on my part, down the tube.

At any rate, at least I have something to focus on: I was rewriting the code that generates these predictions/results, and now I'll have a perfectly good training set for it, a.k.a. no data. I'll know exactly how tight to set the thresholds. (So tight, air couldn't escape under pressure.)

I spent a lot of my afternoon looking at the data, trying to understand it, but there's really not a lot that I can draw conclusions from. We didn't have experience with using 454-generated sequencing data. Now I know: it sucks. As a training exercise, this was great (albeit way too slow in yielding the conclusion). I just have to be mature enough to write off a few months of work as a "learning experience". So much for the world's fastest PhD!

Wednesday, May 16, 2007

No more support

For once, I think I'll toss out a quick rant that I haven't really thought through, so don't mind if it's a little rough.

I've spent some time thinking about the various projects I've been working on, and that I'd like to be working on in my grad school future. Surprisingly, I'm really happy with them, and I'm eager to delve into all of them, with one major exception: I think I'm sitting on a potential train wreck.

From a project planning perspective, the one major issue that I foresee is that I'm inheriting a legacy of a few tens of kloc (thousand lines of code) done in Java. I'm not really proficient in Java, but it's not hard, compared to some of the other languages I've used - that's not the issue. The big problem is that almost all of it takes advantage of something called the Ensembl API, which is a quick way for programmers to access all sorts of fantastic functions and data related to various genomic information. It's a fantastic resource, but Ensembl (who made the API) has decided to stop supporting the Java version in favour of the Perl version.

Even now, I'm stuck using the annotations from version 41 of the Ensembl Human Genome, whereas v.43 is the most current. How much difference will this make? Probably not much, at the moment. However, in the long term, I think that could become a major issue.

Now, I have worked in Perl before, years ago, so that's not a problem. But what do I do about the 10kloc? Recreating it will take the better part of a year, at least. For now, the solution is to postpone the decision, but I think that'll only work for another month or two. Eventually something is going to give, and I'm just going to have to suck it up and redo all of the code we've got in house. Yuck.


Sunday, May 13, 2007

I think it’s time for another blog entry – not just because I’m bored out of my mind, but also because I think I have a lot to say.

Somehow, Cold Spring Harbor was good for that. It had me thinking about a lot of things, some of which are more useful than others. In the useless category was the fact that I was constantly mistaken for a post-doc. (I probably look a little old to be a first-year grad student, now, and most of the people there are post-docs and P.I.s, so it’s not that surprising.)

Anyhow, among the more useful is a shift in focus: after talking with researchers, I think I can step up my work, and treat my PhD more like a post-doc. And why not? I don’t think I’m going to be learning a whole lot in terms of techniques – sequence processing isn’t complicated, although the programming for it can be rather involved. Like any good post-doc, I’m learning the field.

Which means that I need to start doing more literature work. I’ve been slacking off in that respect. (Not that that’s a new thing, I guess.) It also means I can take more responsibility for my research. It’s not like anyone else is going to do it for me.

So what if I’m only getting a PhD? Let’s make this a 3-year post-doc, and see if I can live up to the challenge.

Thursday, May 10, 2007

progress.... and grilled grad students.

This amused me - I was wandering through this afternoon's poster session, and one of the poster presenters I had spoken to yesterday asked me "Are you doing your job, grilling poor graduate students?" It was a joke, but I think my questions might have been a little too insightful, yesterday. On the bright side, I think the poor grad student has decided to follow up on some of my questions and put his research somewhere the community can access it. I toned it down, today, but whatever - today's posters weren't as entertaining, in general.

On the other hand, to illustrate my point from yesterday, another poor grad student, named Levine, was presenting some of that ChIP-chip data that would have been pretty cutting edge last week. I suspect everyone who walked by asked the same question: will you be doing ChIP-sequencing, next? His research was really impressive, but the stuff presented by someone else from the BCGSC (Martin Hirst), yesterday, made the whole thing seem like a mud hut compared to a skyscraper.

So, if you're not looking far enough ahead, you may as well find somewhere to focus your research that won't be steamrolled by the next invention to come along.

From Cold Spring Harbor

I’ve really been slacking off lately on my blog postings, though for good reason: I’ve been a bit busy. However, I think it’s also been a lack of focus on what I wanted this blog to be about, but that’s coming into focus for me.

This excursion back into grad school has given me an opportunity to rationally think out a lot of things I hadn’t before. And, in particular, the last two weeks have been a watershed of revelations about the sciences.

In any case, I’ll return to blogging my grad school adventures in a linear fashion, and probably go over some of my Cold Spring Harbor stories in a bit. Though first, I wanted to share what I learned yesterday, while watching the blistering pace of the talks and posters:

1. Everyone in your field is trying to achieve exactly the same thing you are: Don’t kid yourself that you’re the first to think of something, or the first to walk down a path – odds are there are at least 2 other groups doing exactly the same thing.
2. The only thing that separates you from the other people is how you get it done.
3. Only the people who do it the BEST way will be remembered. Everyone else will just become a footnote.

So many people here are using the same technologies, have access to the same information, and are trying varieties of the same concepts, that the only things that stand out are the groups that have come up with a way to do those things REALLY well. Their talks are neat, their posters have big audiences, and I’m going to be reading their papers.

For the other people, who even 2 weeks ago I might have been impressed with, their work is somehow not as exciting. ChIP-chip techniques vs. ChIP-Solexa techniques (sorry for being geeky for a second) are suddenly looking like last year’s birthday gifts – enjoyed, but mainly forgotten in the niftiness of this year’s new presents. And ChIP-chip hasn’t even been around for long enough to have been last year’s hot toy.

Anyhow, just one final word. For those people who brought the posters on chicken colouring, you might want to consider that a poster needs more than an abstract, a conclusion (both in 45pt font), and two pictures of chickens… though really, they are nice pictures of chickens! Really!