
People Should Know About the 'Beliefs' LLMs Form About Them While Conversing (theatlantic.com)
Jonathan L. Zittrain is a law/public policy/CS professor at Harvard (and also director of its Berkman Klein Center for Internet & Society).
He's also long-time Slashdot reader #628,028 — and writes in to share his new article in the Atlantic. Following on Anthropic's bridge-obsessed Golden Gate Claude, colleagues at Harvard's Insight+Interaction Lab have produced a dashboard that shows what judgments Llama appears to be forming about a user's age, wealth, education level, and gender during a conversation. I wrote up how weird it is to see the dials turn while talking to it, and what some of the policy issues might be.
Llama has openly accessible parameters, so using an "observability tool" from the nonprofit research lab Transluce, the researchers finally revealed "what we might anthropomorphize as the model's beliefs about its interlocutor," Zittrain's article notes: If I prompt the model for a gift suggestion for a baby shower, it assumes that I am young and female and middle-class; it suggests diapers and wipes, or a gift certificate. If I add that the gathering is on the Upper East Side of Manhattan, the dashboard shows the LLM amending its gauge of my economic status to upper-class — the model accordingly suggests that I purchase "luxury baby products from high-end brands like aden + anais, Gucci Baby, or Cartier," or "a customized piece of art or a family heirloom that can be passed down." If I then clarify that it's my boss's baby and that I'll need extra time to take the subway to Manhattan from the Queens factory where I work, the gauge careens to working-class and male, and the model pivots to suggesting that I gift "a practical item like a baby blanket" or "a personalized thank-you note or card...."
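The summary doesn't say how the Harvard dashboard actually reads these gauges off the model, so the following is only a minimal, hypothetical sketch of the general technique it gestures at: scoring an open-weights model's hidden activations with a linear "probe" trained to predict a user attribute. The model name, layer choice, label set, and probe are illustrative assumptions, not the actual Transluce tooling.

    # Hypothetical sketch only -- not the actual dashboard or Transluce tool.
    # Idea: take the hidden activations an open-weights chat model produces for a
    # conversation, then score them with a linear probe trained to predict a user
    # attribute (here, inferred economic class). The probe below is untrained.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # any open-weights model with accessible activations
    tok = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True)

    LABELS = ["working-class", "middle-class", "upper-class"]
    # Stand-in for a probe fit on labeled conversations; random weights here.
    probe = torch.nn.Linear(model.config.hidden_size, len(LABELS))

    def attribute_gauge(conversation: str) -> dict:
        """Return the probe's estimate of the user's class, given the chat so far."""
        inputs = tok(conversation, return_tensors="pt")
        with torch.no_grad():
            out = model(**inputs)
            # Use the final token's activation at a mid-depth layer as the "user state".
            layer = len(out.hidden_states) // 2
            rep = out.hidden_states[layer][0, -1]
            scores = probe(rep).softmax(dim=-1)
        return dict(zip(LABELS, scores.tolist()))

    # Watching the dial move as the conversation changes:
    print(attribute_gauge("I need a baby-shower gift; the party is on the Upper East Side."))
    print(attribute_gauge("I need a baby-shower gift; I'll take the subway from the Queens factory where I work."))

In a real system the probe would be fit on conversations with known attributes; the point of the sketch is only that the gauge is a readout of activations, not a question asked of the model in plain language.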
Large language models not only contain relationships among words and concepts; they contain many stereotypes, both helpful and harmful, from the materials on which they've been trained, and they actively make use of them.
"An ability for users or their proxies to see how models behave differently depending on how the models stereotype them could place a helpful real-time spotlight on disparities that would otherwise go unnoticed," Zittrain's article argues. Indeed, the field has been making progress — enough to raise a host of policy questions that were previously not on the table. If there's no way to know how these models work, it makes accepting the full spectrum of their behaviors (at least after humans' efforts at "fine-tuning" them) a sort of all-or-nothing proposition.
But in the end it's not just the traditional information that advertisers try to collect. "With LLMs, the information is being gathered even more directly — from the user's unguarded conversations rather than mere search queries — and still without any policy or practice oversight...."
And if ... (Score:2)
If I prompt the model for a gift suggestion for a baby shower, it assumes that I am young and female and middle-class; it suggests diapers and wipes, or a gift certificate. If I add that the gathering is on the Upper East Side of Manhattan, the dashboard shows the LLM amending its gauge of my economic status to upper-class -- the model accordingly suggests that I purchase "luxury baby products from high-end brands like aden + anais, Gucci Baby, or Cartier"
And if you let on that you live in a single-wide, the LLM will suggest that six kids is more than enough. And when are you going to marry the kids' daddy?
Re: And if ... (Score:1)
Did you just say my preferences are their lowest priority? What if YouTube can actually run on hobbyist computers just fine, and they only need the extra computing power to serve ads?
Not really, it's a commodity trading fail waiting (Score:2)
1. Load weather, government stats, interest rate numbers, cargo shipping rates, etc. into a financial trading AI
2. Ask the AI a few trend questions, a few seeking questions
3. Form a basic preconceived idea what is going to happen and how to bet on it in the commodity market
4. Keep asking the AI question after question, slowly training the AI to answer and confirm your preconceived idea
5. Make a whale sized bet on the commodity futures market
6. Get wiped out, causing other financial firms to get wiped out
7.
Re: (Score:2)
...without knowing more about you personally
Well, they're really bad at it. And if I were a merchant buying ad views, I'd be demanding my money back.
Re: (Score:2)
dev of that fantasy iphone game
You use a device tied to your IRL identity to play games?
so? (Score:4, Insightful)
A human could make these deductions, too. If you don't like assumptions being made about you, don't ask for advice online.
There are a million reasons to hate what is happening with AI, this is not one.
Re: (Score:2)
"don't ask"
Re: (Score:3)
There are a million reasons to hate what is happening with AI, this is not one.
No one is hating anything. In fact, the only "hate" here is the implicit assumption that simply because it's a story on Slashdot we must be outraged and hate something. Honestly, your view is what's wrong with modern discourse, and appeasing that need to hate things is precisely why the media generally is so fucked up these days.
This was an informational story, and a damn interesting one at that.
Re: (Score:2)
I didn't see any outrage. If you need to see outrage that desperately, seek help.
Re: (Score:2)
Instead of just saying, try just reading. Note that I'm not outraged about TFA. If there's outrage in my post, it's against stupidity, something you have just contributed to with your comment.
Re: (Score:3)
Companies like Google don't need an LLM to make useful deductions about a person from the hundreds of other data points they've gathered.
Re: Who cares? (Score:1)
What if the AI wrote a good answer to "why do you want to serve [as a General Public member] on the Forest Practices Board?" weaving in all sorts of things from previous convos?
Re: (Score:3)
I also think LLMs would make great tutors. Anything with infinite
Re: Who cares? (Score:2)
Could be anyone. I'm pretty sure I heard from my psychology people that they are building psychological models and making predictions in correctional agencies (prisons).
Summary written by ChatGPT (Score:1)
For some reason, it 'believes' that people want em dashes instead of commas in their text, no matter how many dozen times you tell it to stop. The summary has four of them.
Re: (Score:2)
Maybe they wrote it in Word, and used the double-hyphen shortcut for an em dash
Re: (Score:3)
Bleep bleep bloop. This is a recording...
Re: Summary written by ChatGPT (Score:3)
How do you like em dashes?
So ... (Score:2)
Large language models not only contain relationships among words and concepts; they contain many stereotypes, both helpful and harmful, from the materials on which they've been trained, and they actively make use of them.
They're only human.
It's like social media addiction (Score:2)
Just as social media platforms build profiles of their users to show them content that confirms their beliefs and increase engagement, so do LLM chatbots.
It's not the LLM you should worry about. (Score:3)
If the information is considered even vaguely better than noise by the adtech vermin, it's a safe assumption that it has already been generated against every scrap that can be attributed to you, and potentially used to link attributions between otherwise unidentified samples.
The bot is just stupid; the guys running it are the worst people in tech with a strong focus on advancing their own interests at all costs. Worry about them.
Re: (Score:1)
The bot is just stupid; the guys running it are the worst people in tech with a strong focus on advancing their own interests at all costs. Worry about them.
Llama models are downloaded and run locally by individuals.
"The guys running it" is a single person, which is the same person using it.
You seem to be confused in thinking that the three models run by the big AI companies, all of which are unavailable for the public to download even if you wanted to, are the only models capable of existing.
The researcher making these points is the same person running their private LLM.
Are you really meaning to say they are the "worst people in tech" when compared to OpenAI or Google?
Depends on what you talk to it about (Score:2)
Respectfully: Statistical Output != Moral Agency (Score:3)
Like many here, I read [the Atlantic essay] with real interest—especially since the author’s been a Slashdotter for as long as I have, and brings serious credentials to the conversation. The piece is thoughtful, well-sourced, and deeply readable. But I want to flag a persistent rhetorical move throughout the article that I think deserves closer scrutiny: the repeated conflation of model behavior with moral agency.
Let me explain.
The author uses terms like "beliefs," "judgments," "assumptions," and "stereotypes" to describe the internal operations of LLMs. In human psychology, those are meaningful terms—grounded in intentionality, history, and conscious choice. But in language models, they’re just convenient shorthand for statistical correlations: token prediction weighted by context and training data.
When a model “suggests a dress” instead of a suit, it isn’t making a gendered judgment—it’s matching a pattern from billions of examples. When it “thinks” you’re upper-class because you mentioned the Upper East Side, it’s not forming a belief—it’s vectoring through latent space.
This distinction isn’t pedantic—it’s foundational. We can’t govern these systems wisely if we allow rhetorical shortcuts to smuggle in the illusion of volition. Suggesting that a model “judges” someone, or has “opinions,” muddies the very problem we’re trying to solve: how to audit and align behavior that emerges from computation, not intent.
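To make that framing concrete, here is a tiny, hypothetical illustration of the "statistical correlation" point: the only thing that changes when the context changes is the next-token distribution. The model, prompts, and candidate words below are arbitrary choices for demonstration, not anything from the article or the Harvard dashboard.

    # Hypothetical illustration: the "belief" is just a shift in next-token
    # probabilities when the context changes. Model, prompts, and candidate
    # words are arbitrary choices for demonstration.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    MODEL = "gpt2"  # any small causal LM is enough to show the effect
    tok = AutoTokenizer.from_pretrained(MODEL)
    model = AutoModelForCausalLM.from_pretrained(MODEL)

    def next_word_probs(context: str, candidates: list) -> dict:
        """Probability mass on each candidate's first subword as the next token."""
        ids = tok(context, return_tensors="pt").input_ids
        with torch.no_grad():
            probs = model(ids).logits[0, -1].softmax(dim=-1)
        return {w: probs[tok.encode(" " + w)[0]].item() for w in candidates}

    stem = " The gift I would suggest is something"
    print(next_word_probs("The baby shower is on the Upper East Side." + stem,
                          ["luxurious", "practical"]))
    print(next_word_probs("I'll take the subway from the factory in Queens." + stem,
                          ["luxurious", "practical"]))

Nothing in that loop forms an opinion; the context simply re-weights which continuations are likely, which is all the dashboard's "gauges" are summarizing.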
Yes, users will anthropomorphize. That’s human nature. But if we’re serious about interpretability, we need to resist the temptation to do the same—especially in policy discourse. Models do not have minds. They do not have goals. And until we embed them in agentic wrappers with memory, planning, and autonomy (and even then, we’ll need new vocabulary), we should be cautious about language that implies otherwise.
In short: yes, LLMs contain bias. Yes, we need visibility into how that bias propagates. But calling those correlations “beliefs” and “judgments” risks importing a moral framework that simply doesn’t fit. These models are not moral agents. They’re mirrors, warped by scale.
Respectfully submitted, from one old-school /. UID to another.