Comment Re:Using a Supercomputer Right Now... (Score 1) 23
You're posting anonymously, but if you can figure out how to contact me I can set you up.
Genome annotation (finding all the interesting features in the sequence) is really computationally intensive, due in large part to the number of separate (often sub-optimally written) algorithms that have to be chained together and interpreted. My team at the iPlant Collaborative worked with the authors of a popular open-source annotation tool called "MAKER" to get it running at scale on the 302 TFLOP Lonestar 4 supercomputer, which in turn was used by the pine team to do in a few hours what used to be 6 months of painstaking bioinformatics. In another month or so, this algorithm will be available via a REST API, allowing, literally, "Annotation As A Service".
It's a nice story, and they provide a MATLAB environment to play around with their model, but ultimately I don't believe this work is reproducible given the materials provided. All we're really given is a sandbox to play in where we can adjust model parameters, and so the work should never have been published.
What would convince me? For starters, the ability to take an arbitrary set of values for these SNPs, punch them in, and see the result change. If I put in SNPs from one of the CEU HapMap samples, I would expect to see a vaguely Caucasian face. If the individual is female, I would expect feminized features. Adding to this, I think we need to see more of the source used in the data wrangling. There's quite a bit of "and then this happened" in the methods.
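To make the "punch them in and see the result change" test concrete, here is a rough Python sketch of the kind of check I have in mind. Everything in it is an assumption for illustration: `responds_to_input`, `toy_model`, and the SNP labels `rs_a`, `rs_b`, `rs_c` are made-up names, and `toy_model` is only a stand-in, not the published model or its interface.

```python
from typing import Callable, Dict, Sequence

# SNP label -> count of the alternate allele (0, 1, or 2)
Genotype = Dict[str, int]


def responds_to_input(
    predict: Callable[[Genotype], Sequence[float]],
    baseline: Genotype,
    tolerance: float = 1e-9,
) -> Dict[str, bool]:
    """For each SNP, report whether changing only that genotype moves the prediction."""
    reference = predict(baseline)
    report = {}
    for snp, dosage in baseline.items():
        altered = dict(baseline)
        altered[snp] = (dosage + 1) % 3  # switch to a different valid dosage
        shifted = predict(altered)
        report[snp] = any(
            abs(a - b) > tolerance for a, b in zip(reference, shifted)
        )
    return report


# Hypothetical stand-in model, present only so the sketch runs end to end.
# The SNP labels and weights are invented, not taken from the paper.
def toy_model(genotype: Genotype) -> Sequence[float]:
    weights = {"rs_a": 0.5, "rs_b": -0.25, "rs_c": 0.0}
    return [sum(weights[snp] * dosage for snp, dosage in genotype.items())]


if __name__ == "__main__":
    sample = {"rs_a": 0, "rs_b": 1, "rs_c": 2}
    print(responds_to_input(toy_model, sample))
```

A SNP that comes back `False` has no effect on the output for that sample, which is exactly the sort of thing a reader should be able to find out from the released materials.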
We're hiring. PM me.
Perhaps if they built a large wooden badger...
We all knew this would be the outcome, right?
No shit dude, I came here to post the same thing.
Parkinson's Law: Work expands to fill the time allotted it.