"It’s worth noting that most working professionals do a lot more than submit research reports to their boss, which is all that GDPval-v0 tests for."
"OpenAI says that it believes Claude scored so high because of its tendency to make pleasing graphics, rather than sheer performance."
So by wide range of jobs, they mean jobs consisting of only submitting research reports with pleasing graphics. And based on recent measurements, about 1/4 incorrect or entirely hallucinated.
The key piece from this article is:
“With the incredible demand Azure is seeing and the growth of our platform, we’ve decided to pause our planned migration of LinkedIn to allocate resources to external Azure customers,” Hiremagalur wrote in his memo.
At least a couple of ways to interpret this comment. Azure either doesn't have the capacity required to support LinkedIn or the cost of running it "on-prem" is so much better than in Azure vs paying customers it doesn't make sense...or a combination of the two.
The Falcon heavy was also often promised and then delivered way behind schedule.
I guess the difference is no one paid for a Falcon Heavy lift in 2011.
To thine own self be true. (If not that, at least make some money.)