I'm excited that it's going to be a Cray, as they have the best memory-to-processor architecture.
Scientific problems can be subdivided, but there will always be heavy communication between the processors. In particular, global communication (a global sum, for instance) is a killer. The more processors you have, the slower this operation runs, so on a big enough machine it can actually dominate the cost. You get to a point where adding more processors does not make you any faster!
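To put rough numbers on that, here is a toy cost model. It is only a sketch under my own assumptions: the work divides perfectly across processors, and the global sum is done as a tree reduction costing one latency per level, so log2(P) levels. The function names and the 1 ms latency are made up for illustration and are not measurements of any real machine.

```python
import math

def step_time(p, work=1.0, latency=1e-3):
    """Modeled time (seconds) for one step on p processors.

    work:    serial compute time, assumed to divide perfectly by p
    latency: assumed cost of one level of a tree-based global sum
    """
    compute = work / p
    # A tree reduction over p processors needs ceil(log2(p)) levels.
    global_sum = latency * math.ceil(math.log2(p)) if p > 1 else 0.0
    return compute + global_sum

def best_processor_count(max_p, **kw):
    """Power-of-two processor count (up to max_p) with the lowest modeled time."""
    counts = [2 ** k for k in range(int(math.log2(max_p)) + 1)]
    return min(counts, key=lambda p: step_time(p, **kw))

if __name__ == "__main__":
    for p in (64, 256, 512, 1024, 4096):
        print(p, step_time(p))
    print("best:", best_processor_count(2 ** 20))
```

In this model the compute term keeps shrinking while the global sum keeps growing, so past some processor count the total goes back up. Lower the latency and the turning point moves out, which is why cheap global operations matter so much.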
The Cray has a beautiful architecture, where one processor can put data straight _into_register_ on another processor. No cache or network delays. This is freakin' awesome. In Cray's presentation about the X1 they show an ocean simulation code that keeps scaling way beyond the IBM, HP, &c. machines, precisely because of the efficiency of its global operations.
My disappointment is that they are only aiming for 50 Tflop. The Earth Simulator hit 37 two years ago! This is no progress.
Victor.