Re: I don't believe it
Doesn't seem to have helped you one lick.
that's what business does
Even a surface knowledge of the history of console (read: fixed embedded systems) hardware pricing shows that this statement is straight-up false.
It's also just false in a whole host of other ways. Does believing in such laughably simple absolutes help your head not hurt?
Sigh. That's not at all what they said. If there were any nefarious "agenda" at play, "they" (whoever the fuck "they" are in the minds of climate change conspiracists) wouldn't even be publishing data like this, since smooth-brained folk like you will take the shallowest of positions on the observation to refute the underlying science.
It is literally a summarization model. Its context window is the text to summarize. It was literally trained on being given text as a context, and summaries as outputs.
AI Overview literally tells you what sources it's drawing on. I don't see why people are capable of checking the sources when they're regular Google search results but not when they're AI Overview source links, but apparently that is hard for some people.
This. Most people inevitably respond in these threads talking about "the model's training". AI Overview isn't something like ChatGPT. It's a minuscule summarization model. It's not tasked to "know" anything - it's only tasked to sum up what the top search results say. In the case of the "glue on pizza" thing, one of the top search results was an old Reddit thread where a troll advised that. AI Overview literally tells you what links it's drawing on.
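To make that concrete, the flow is basically "grab the top result snippets, shove their text into a small summarizer". A rough sketch (not Google's actual pipeline - the model name and the snippets below are just stand-ins):

```python
# Minimal retrieve-then-summarize sketch. Not Google's actual code; the
# summarization model and the "search results" are placeholders.
from transformers import pipeline

# A small seq2seq summarizer. The key point: it only sees the text it's
# handed; it isn't consulting some store of facts baked in at training time.
summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

# Pretend these are the snippets from the top search results for a query.
top_snippets = [
    "Snippet from result #1 ...",
    "Snippet from result #2 ...",
    "Snippet from an old Reddit thread, possibly a troll post ...",
]

# The context window is literally just the concatenated snippets.
context = "\n\n".join(top_snippets)
overview = summarizer(context, max_length=60, min_length=20, do_sample=False)
print(overview[0]["summary_text"])
```

If a troll post is among those snippets, the "summary" faithfully repeats the troll post. Garbage in, garbage out.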
Don't get me wrong, there are still many reasons why AI Overview is a terrible idea.
1) It does nothing to assess for trolling. AI models absolutely can do that; they just have not.
2) It does nothing to assess for misinfo. AI models absolutely can do that; they just have not.
3) It does nothing to assess for scams. AI models absolutely can do that; they just have not.
And the reason they have not is that they need to run AI Overview hundreds of thousands of times per second, so they want the most barebones, lightweight model imaginable. You could run their model on a cell phone, it's so small.
Bad information on the internet is the main source of errors, like 95% of them. But there are two other types of mistakes as well:
4) The model isn't reading web pages in the same way that humans see them, and this can lead to misinterpreted information. For example, perhaps when rendered, there's a headline "Rape charges filed against local man", and below it a photo of a press conference with a caption "District Attorney John Smith", and then below that an article about the charges without mentioning the man's name. The model might get fed: "Rape charges filed against local man District Attorney John Smith", and report John Smith as a sex offender.
5) The model might well just screw up in its summarization. It is, after all, as minuscule as possible.
I personally find deploying a model with these weaknesses to be a fundamentally stupid idea. You *have* to assess sources, you *can't* have a nontrivial error rate in summarizations, etc. Otherwise you're just creating annoyance and net harm. But it's also important for people to understand what the errors actually are. None of these errors have anything to do with "what's in the model's training data". The model's training data is just random pieces of text followed by summaries of said text.
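To be concrete about what "the model's training data" actually looks like, it's just pairs of (text, summary) - invented toy examples below, not the real training set:

```python
# Toy illustration of summarization training pairs (invented examples,
# not the actual training data for any deployed model).
training_pairs = [
    {
        "text": "Long article body about a city council vote on bike lanes ...",
        "summary": "City council approves new bike lanes downtown.",
    },
    {
        "text": "Long forum thread about fixing a leaking kitchen faucet ...",
        "summary": "Replace the worn cartridge to stop the faucet leak.",
    },
]
# Nothing here teaches the model facts about the world; it only learns the
# mapping "given this text, produce a shorter text that restates it".
```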
Totally orthogonal concern. Try again.
Neutrinos are called ghost particles because of how little they interact with other matter. They're constantly streaming through your body without interacting with you. A solar neutrino passing through the entire Earth has less than 1 in a million odds of interacting with the Earth.
Not so with this one - the energy here equates to a couple hundredths of a joule. Now, the "Oh My God!" particle had a much higher energy, about three orders of magnitude higher. That's knock-pictures-over sort of energy (and a lot more than that). The problem is that you can't deposit it all at once. A ton of energy does get transferred during the first collision, but it ejects whatever it hit out of whatever it was in as a shower of relativistic particles that - like the original particle - tend to travel a long distance between interactions. Whatever particle was hit isn't pulling the whole target along with it; it's just buggering off as a ghostly energy spray. There will be some limited chains of secondary interactions transferring more kinetic energy, but not "knock pictures over" levels of energy transferred.
Also, here on the surface you're very unlikely to get the original collision; collisions with the atmosphere can spread the resultant spray of particles out across multiple square kilometers before any of them reaches the surface.
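For a sense of scale, the unit conversion behind those numbers is just electronvolts times 1.602e-19. Rough figures below - the ~0.03 J is the "couple hundredths of a joule" from above, and the ~3.2e20 eV is the published estimate for the Oh-My-God particle:

```python
# Back-of-envelope energy comparison. The ~0.03 J figure is the "couple
# hundredths of a joule" quoted above; 3.2e20 eV is the published estimate
# for the Oh-My-God particle (1991).
EV_TO_JOULES = 1.602e-19

neutrino_joules = 0.03                  # this event, roughly
omg_joules = 3.2e20 * EV_TO_JOULES      # ~51 J, a well-thrown baseball

print(f"this event:         ~{neutrino_joules} J")
print(f"Oh-My-God particle: ~{omg_joules:.0f} J")
print(f"ratio:              ~{omg_joules / neutrino_joules:.0f}x")  # ~1700x, about three orders of magnitude
```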
Fucking snowflakes. I mean you, not them.
The average ICE car burns its entire mass worth of fuel every year. Up in smoke into our breathing air, gone, no recycling.
The average car on the road lasts about two decades, and is then recycled, with the vast majority of its metals recovered.
The manufacturing phase is not the phase you have to worry about when it comes to transportation.
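A back-of-envelope check on that first claim, using round assumed numbers (typical annual mileage, ordinary fuel economy):

```python
# Rough sanity check of "an ICE car burns roughly its own mass in fuel per
# year". All inputs are assumed round numbers, not measured data.
miles_per_year = 12_000              # typical annual mileage
mpg = 25                             # ordinary ICE fuel economy
kg_per_gallon_gasoline = 2.8         # ~6.2 lb per US gallon

gallons = miles_per_year / mpg                    # ~480 gallons
fuel_mass_kg = gallons * kg_per_gallon_gasoline   # ~1,340 kg

curb_weight_kg = 1_500               # a typical midsize car
print(f"fuel burned per year: ~{fuel_mass_kg:.0f} kg")
print(f"typical curb weight:  ~{curb_weight_kg} kg")
```

Same ballpark, so "burns its own mass every year" holds up as an order-of-magnitude claim.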
Any sources for this?
Anonymous (2021). "How My Uncle’s Friend’s Mechanic Proved EVs Are Worse." International Journal of Hunches, 5(3), 1-11.
Backyard, B. (2018). "EVs Are Worse Because I Said So: A Robust Analysis." Garage Journal of Automotive Opinions, 3(2), 1-2.
Dunning, K. & Kruger, E. (2019). "Why Everything I Don’t Like Is Actually Bad for the Environment." Confirmation Bias Review, 99(1), 0-0.
Johnson, L. & McFakename, R. (2022). "Carbon Footprint Myths and Why They Sound Convincing After Three Beers." Annals of Bro Science, 7(2), 1337-42.
Lee, H. (2025). "Numbers I Felt Were True." Global Journal of Speculative Engineering, 22(1), 34-38.
Outdated, T. (2015, never revised). "EVs Are Bad Because of That One Study From 2010 I Misinterpreted." Obsolete Science Digest, 30(4), 1-5.
Tinfoil, H. (2020). "Electric Cars Are a Government Plot (And Other Things I Yell at Clouds)." Conspiracy Theories Auto, 5(5), 1-99.
Trustmebro, A. (2019). "The 8-Year Rule: Why It’s Definitely Not Made Up." Vibes-Based Research, 2(3), 69-420.
Wrong, W. (2018). "The Art of Being Loudly Incorrect About Technology." Dunning-Kruger Journal, 1(1), 1-?.
He clearly got tired of trying to build the Antichrist
... to be dissected as research subjects?
Did a dog conduct this study?
I know this is an alien concept to most people here, but it would be nice if people would actually, you know, read the papers first? I know nobody does this, but could people at least try?
First off, this isn't peer reviewed. So it's not "actual, careful research", it's "not yet analyzed to determine whether it's decent research".
Secondly, despite what they call it, they're not dealing with LLMs at all. They're dealing with a Transformer, but it has nothing to do with "language", unless you think language is repeated mathematical transforms on random letters.
It also has nothing to do with "large". Their model that most of the paper is based on is minuscule, with 4 layers, 32 hidden dimensions, and 4 attention heads. A typical large frontier LLM has maybe 128 layers, >10k hidden dimensions, and upwards of 100 or so attention heads.
So right off the bat, this has nothing to do with "large language models". It is a test on a toy version of the underlying tech.
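To put those sizes in perspective, here's a rough parameter-count comparison using the standard ~12·d²·L estimate for transformer blocks (ignoring embeddings); the "frontier" figures are the ballpark numbers above, not any specific model's published specs:

```python
# Rough transformer parameter counts, ignoring embeddings and biases.
# Uses the standard ~12 * d_model^2 per layer estimate (4*d^2 for attention,
# 8*d^2 for a 4x-expansion MLP). The frontier figures are the ballpark
# numbers from above, not any particular model's specs.
def approx_params(n_layers: int, d_model: int) -> float:
    return 12 * d_model**2 * n_layers

toy = approx_params(n_layers=4, d_model=32)             # ~50 thousand
frontier = approx_params(n_layers=128, d_model=10_000)  # ~1.5e11, i.e. >100B

print(f"paper's model:  ~{toy:,.0f} params")
print(f"frontier-scale: ~{frontier:,.0f} params")
print(f"ratio:          ~{frontier / toy:,.0f}x")        # millions of times larger
```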
Let us continue: "During the inference time, we set the temperature to 1e-5." This is a bizarrely low temperature for an LLM. Might as well set it to zero. I wonder if they have a justification for this? I don't see it in the paper. Temperatures this low tend to show no creativity and get stuck in loops, at least with "normal" LLMs.
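For reference, temperature just divides the logits before the softmax, so 1e-5 collapses the distribution onto the single highest-scoring token - effectively greedy decoding:

```python
import math

def softmax_with_temperature(logits, temperature):
    # Temperature divides the logits; as T -> 0 this collapses to argmax.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.8, 0.5]
print(softmax_with_temperature(logits, 1.0))   # ~[0.49, 0.40, 0.11]
print(softmax_with_temperature(logits, 1e-5))  # ~[1.0, 0.0, 0.0] -- pure argmax
```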
They train it with 456,976 samples, which is... not a lot. Memorization is learned quickly in LLMs, while generalization is learned very slowly (see, e.g., the papers on "grokking").
Now here's what they're actually doing. They have two types of symbol transformations: rotation (for example, ROT("APPLE", 1) = "BQQMF") and cyclic shifts (for example, CYC("APPLE", 1) = "EAPPL").
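In plain code, those two transformations (as I read the paper's description - this isn't their actual implementation) are just:

```python
# The two symbol transformations as described - my reading of the paper's
# definitions, not its actual code.
def rot(s: str, n: int) -> str:
    """Rotate each letter n places through the alphabet (a Caesar shift)."""
    return "".join(chr((ord(c) - ord("A") + n) % 26 + ord("A")) for c in s)

def cyc(s: str, n: int) -> str:
    """Cyclically shift the string right by n positions."""
    n %= len(s)
    return s[-n:] + s[:-n] if n else s

print(rot("APPLE", 1))  # BQQMF
print(cyc("APPLE", 1))  # EAPPL
```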
For the in-domain tests, they'll, for example, train on ROT and test with ROT. It scores 100% on these. It scores near zero on the others:
Composition (CMP): They train on a mix of two-step tasks: ROT followed by ROT; ROT followed by CYC; and CYC followed by ROT. They then test with CYC followed by CYC. They believe the model should have figured out on its own what CYC does and therefore be able to apply CYC twice.
Partial Out-of-Distribution (POOD): They train on simply ROT followed by ROT. They then task it to perform ROT followed by CYC. To repeat: it was never trained to do CYC.
Out-of-Distribution (OOD): They train simply on ROT followed by ROT. They then task it to do CYC followed by CYC. Once again, it was never trained to do CYC.
The latter two seem like grossly unfair tests. Basically, they want this tiny toy model with a "brain" smaller than a dust mite's to zero-shot an example it's had no training on just by seeing one example in its prompt. That's just not going to happen, and it's stupid to think it's going to happen.
Re: their CMP example: the easiest way for the (minuscule) model to learn it isn't to try to deduce what ROT and CYC mean individually; it's to learn what ROT-ROT does, what ROT-CYC does, and what CYC-ROT does. It doesn't have the "brainpower", nor was it trained, to "mull over" these problems (nor does it have any preexisting knowledge about what a "rotation" or a "cycle" is); it's just learning: problem type 1 takes two parameters and I apply an offset based on the sum of those two parameters; problem type 2... etc.
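That shortcut exists because the composed tasks collapse into single lookups - a quick check, reusing the rot/cyc sketch from above (repeated here so it runs on its own):

```python
# Why the composite tasks are learnable as single mappings. (Compact recap
# of the rot/cyc sketch above so this runs on its own.)
def rot(s, n): return "".join(chr((ord(c) - 65 + n) % 26 + 65) for c in s)
def cyc(s, n): n %= len(s); return s[-n:] + s[:-n] if n else s

# Two rotations are just one rotation by the summed offset...
assert rot(rot("APPLE", 1), 2) == rot("APPLE", 3)
# ...and two cyclic shifts are one shift by the summed offset...
assert cyc(cyc("APPLE", 1), 2) == cyc("APPLE", 3)
# ...and the two operations even commute.
assert rot(cyc("APPLE", 1), 2) == cyc(rot("APPLE", 2), 1)
# So a model trained on ROT-ROT, ROT-CYC and CYC-ROT can memorize each
# composite as one offset-keyed mapping without ever representing CYC alone.
```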
The paper draws far too strong conclusions from its premise. They make no attempt to insert probes into their model to see what it's actually doing (à la Anthropic's interpretability work). And it's a Karen's Rule violation (making strong assertions about model performance vs. humans without actually running any human controls).
The ability to zero-shot is not some innate behavior; it is a learned behavior. Actual LLMs can readily zero-shot these problems. And by contrast, a human baby who has never been exposed to concepts like cyclic or rotational transformation of symbols could not. One of the classic hard problems is how to communicate with an alien intellect - if we got a message from aliens, how could we understand it? If we wanted to send one to them, how could we get them to understand it? Zero-shotting communicative intent requires a common frame of reference to build off of.
"Catch a wave and you're sitting on top of the world." - The Beach Boys