This. Most people inevitably respond in these threads talking about "the model's training". AI Overview isn't anything like ChatGPT. It's a minuscule summarization model. It's not tasked with "knowing" anything - it's only tasked with summing up what the top search results say. In the case of the "glue on pizza" thing, one of the top search results was an old Reddit thread where a troll advised exactly that. AI Overview literally tells you which links it's drawing on.
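To make that architecture concrete, here's a minimal sketch of a "summarize the top results" pipeline. Every name, URL, and snippet in it is a hypothetical stand-in; this is not Google's actual code, just the shape of the design being described:

```python
# Hypothetical sketch of a "summarize the top results" pipeline.
# Nothing here is Google's actual implementation.
def ai_overview(query, top_results, summarize):
    """top_results: list of (url, snippet) pairs from the ordinary
    search ranking. summarize: a small seq2seq model, stubbed here
    as any callable that maps a prompt to text."""
    snippets = [snippet for _, snippet in top_results]
    # The model only ever sees these snippets. If a troll post ranks
    # highly, its text goes straight into the input, and a faithful
    # summary will repeat it.
    summary = summarize(f"Summarize for '{query}':\n" + "\n".join(snippets))
    # The cited links are just whatever got summarized.
    return summary, [url for url, _ in top_results]

# Toy usage, with the "model" stubbed out as an echo:
overview, sources = ai_overview(
    "how to keep cheese on pizza",
    [("reddit.com/r/Pizza/comments/...", "add 1/8 cup of non-toxic glue to the sauce")],
    summarize=lambda prompt: prompt,
)
```

Note that nothing in that loop consults world knowledge: the quality of the output is bounded entirely by the quality of the retrieved snippets.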
Don't get me wrong, there are still many reasons why AI Overview is a terrible idea.
1) It does nothing to assess for trolling. AI models absolutely can do that; Google just hasn't made this one do it.
2) It does nothing to assess for misinfo. AI models absolutely can do that; Google just hasn't made this one do it.
3) It does nothing to assess for scams. AI models absolutely can do that; Google just hasn't made this one do it.
And the reason they haven't is that they need to run AI Overview hundreds of thousands of times per second, so they want the most barebones, lightweight model imaginable. It's so small you could run it on a cell phone.
Bad information on the internet is the main source of errors, like 95% of them. But there are two other types of mistakes as well:
4) The model isn't reading web pages the way humans see them rendered, and this can lead to misinterpretation. For example, perhaps when rendered there's a headline "Rape charges filed against local man", below it a photo of a press conference with the caption "District Attorney John Smith", and below that an article about the charges that never mentions the accused man's name. The model might get fed the flattened text "Rape charges filed against local man District Attorney John Smith" and report John Smith as a sex offender (see the sketch after this list).
5) The model might well just screw up in the summarization itself. It is, after all, as minuscule as possible.
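To illustrate 4), here's a minimal sketch of how naive HTML-to-text extraction flattens page structure. The markup is made up, and this is not Google's actual extractor, just the general failure mode:

```python
# Minimal demo of structure-blind text extraction (hypothetical page).
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects all visible text, ignoring layout and element roles."""
    def __init__(self):
        super().__init__()
        self.chunks = []
    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)

page = """
<h1>Rape charges filed against local man</h1>
<figure>
  <img src="presser.jpg">
  <figcaption>District Attorney John Smith</figcaption>
</figure>
<p>The charges were filed Tuesday...</p>
"""

extractor = TextExtractor()
extractor.feed(page)
print(" ".join(extractor.chunks))
# -> "Rape charges filed against local man District Attorney John Smith
#     The charges were filed Tuesday..."
# The caption's tie to the photo is gone. A summarizer reading this flat
# string can easily attach the name to the headline.
```

A human looking at the rendered page would never make that association; a model fed the flat string has no way to avoid it.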
I personally find deploying a model with these weaknesses to be a fundamentally stupid idea. You *have* to assess sources, you *can't* have a nontrivial error rate in summarizations, etc. Otherwise you're just creating annoyance and net harm. But it's also important for people to understand what the errors actually are. None of these errors have anything to do with "what's in the model's training data". The model's training data is just random pieces of text followed by summaries of said text.
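For a concrete picture, a summarization training set is just pairs like these (examples invented for illustration):

```python
# Hypothetical (source text, reference summary) pairs - the only thing a
# pure summarization model is trained on. None of this teaches it facts
# about the world; it only teaches the text-to-summary mapping.
training_pairs = [
    ("The city council voted 7-2 on Tuesday to approve protected bike "
     "lanes on Main Street, with construction slated to begin in June.",
     "City council approves Main Street bike lanes in a 7-2 vote."),
    ("Researchers at the university reported that their 412-patient trial "
     "showed no significant difference between the two treatments.",
     "University trial of 412 patients finds no difference between treatments."),
]
```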