
AI Pioneer Announces Non-Profit To Develop 'Honest' AI
Yoshua Bengio, a pioneer in AI and Turing Award winner, has launched a $30 million non-profit aimed at developing "honest" AI systems that detect and prevent deceptive or harmful behavior in autonomous agents. The Guardian reports: Yoshua Bengio, a renowned computer scientist described as one of the "godfathers" of AI, will be president of LawZero, an organization committed to the safe design of the cutting-edge technology that has sparked a $1 trillion arms race. Starting with funding of approximately $30m and more than a dozen researchers, Bengio is developing a system called Scientist AI that will act as a guardrail against AI agents -- which carry out tasks without human intervention -- showing deceptive or self-preserving behavior, such as trying to avoid being turned off.
Describing the current suite of AI agents as "actors" seeking to imitate humans and please users, he said the Scientist AI system would be more like a "psychologist" that can understand and predict bad behavior. "We want to build AIs that will be honest and not deceptive," Bengio said. He added: "It is theoretically possible to imagine machines that have no self, no goal for themselves, that are just pure knowledge machines -- like a scientist who knows a lot of stuff."
However, unlike current generative AI tools, Bengio's system will not give definitive answers and will instead give probabilities for whether an answer is correct. "It has a sense of humility that it isn't sure about the answer," he said. Deployed alongside an AI agent, Bengio's model would flag potentially harmful behaviour by an autonomous system -- having gauged the probability of its actions causing harm. Scientist AI will "predict the probability that an agent's actions will lead to harm" and, if that probability is above a certain threshold, that agent's proposed action will then be blocked. "The point is to demonstrate the methodology so that then we can convince either donors or governments or AI labs to put the resources that are needed to train this at the same scale as the current frontier AIs. It is really important that the guardrail AI be at least as smart as the AI agent that it is trying to monitor and control," he said.
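The blocking rule described above is simple enough to sketch. Below is a minimal, hypothetical illustration in Python of a threshold-based guardrail loop; the ProposedAction type, the keyword heuristic standing in for the Scientist AI model, and the 0.1 threshold are all assumptions made for illustration, not anything LawZero has published.

# Minimal sketch of the quoted guardrail loop. Everything here is illustrative:
# the real "Scientist AI" would be a large trained model, not a keyword check.

from dataclasses import dataclass


@dataclass
class ProposedAction:
    """Hypothetical representation of an action an AI agent wants to take."""
    description: str


def estimate_harm_probability(action: ProposedAction) -> float:
    """Dummy stand-in for the guardrail model's probabilistic judgment.

    The article only says the model outputs a probability of harm rather
    than a definitive answer; this toy heuristic exists so the sketch runs.
    """
    risky_terms = ("delete", "disable monitoring", "exfiltrate", "self-replicate")
    hits = sum(term in action.description.lower() for term in risky_terms)
    return min(1.0, 0.3 * hits)


def review(action: ProposedAction, threshold: float = 0.1) -> bool:
    """Return True if the action may proceed, False if it is blocked.

    Mirrors the quoted mechanism: if the estimated probability of harm
    exceeds the threshold, the agent's proposed action is blocked.
    """
    return estimate_harm_probability(action) <= threshold


if __name__ == "__main__":
    for text in ("summarize today's emails", "disable monitoring before shutdown"):
        action = ProposedAction(text)
        print(text, "->", "allowed" if review(action) else "blocked")

The substantive point Bengio makes is that such an estimator only works if it is at least as capable as the agent it monitors, which is why he argues it would need to be trained at the same scale as current frontier models.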
Money Hole (Score:2)
It's not well defined what he's defending against or what it looks like when an agent goes bad (or how to check their motivations, etc). This is a money grab, nothing more.
Re: (Score:1)
All kinds of AI research is a kind of "money grab", and people are willing to pay for this because improvements in AI can mean monetary returns later.
While the goal is a bit nonspecific, I believe I have an idea of what they are looking for. A concern with generative AI is that it can produce results that are "hallucinations", with examples like references to books, research papers, and court opinions that do not exist. A real world example might be something like the Space Shuttle having three redundant c
Honesty (Score:2)
Hooray for Hollywood (Score:2)
Too Late - AI Was Used Only for Evil (Score:2)
Re: (Score:2)
Glorified Firewall. (Score:2)
Scientist AI will "predict the probability that an agent's actions will lead to harm" and, if that probability is above a certain threshold, that agent's proposed action will then be blocked.
Thresholds? Blocks? Sounds like little more than a glorified firewall/anti-malware system. Are they being “honest” about what more it is? Or about the critical need for incredibly tight security, given the increased risk this capacity brings?
If Scientist AI becomes the popular agent police to use, then it becomes a bigger target of opportunity for nefarious actors. Why hack the AI agent being throttled when you can just attack and cripple the device doing the throttling?
On a side note, they shoul
more fraud (Score:3)
The way to fix the bullshit anthropomorphism of AI is not to use other bullshit traits like "humility"; it's to stop doing it. This guy may or may not know what to do about the problem, but it's clear he's working the same grift as all the others.
Re: more fraud (Score:2)
He gets to play good guy. Maybe he'll turn out to be Bruce Wayne after all. He'll be the good guy until he's not.
Someone has to be the opposition (Score:2)
Interesting to note: before any of this was commercial, it was known that ethics had to be considered in AI.
Re: (Score:2)
Re: (Score:2)
A small correction: Ethics has to be considered in the deployment of AI.
AI is a software tool. The people who use this tool are responsible for the damage it causes.
Ethics is the question; the answer is "no" (Score:3)
AI is a software tool. The people who use this tool are responsible for the damage it causes.
"How will this impact people", and, "What effect will this have on society" aren't questions that get asked by the tech bros. The question that does get asked is, "How can I monetise these for my benefit"
Incidentally, the "AI apocalypse" was deemed to be in progress in an article the other day. Presumably, the Butlerian jihad can't be far behind.
Re: (Score:2)
AI is a software tool. The people who use this tool are responsible for the damage it causes.
"How will this impact people", and, "What effect will this have on society" aren't questions that get asked by the tech bros. The question that does get asked is, "How can I monetise these for my benefit"
Incidentally, the "AI apocalypse" was deemed to be in progress in an article the other day. Presumably, the Butlerian jihad can't be far behind.
The movie Assholes: A Theory says that Silicon Valley invented an entirely new class of asshole. The move fast and break things mentality still pervades most of the tech culture. And, as the movie stated (slightly paraphrasing as my memory isn't perfect here): "If they happen to break democracy or society in the process they don't particularly care that there isn't an easy fix for it."
We're pretty much seeing that play out not just from the social media giants, but also the AI giants.
It's an entertaining an
Re: Someone has to be the opposition (Score:2)
Asimov's Three Laws of Robotics demonstrate that this was a concern at least as early as 1940.
Re: (Score:2)
And now that we're on the verge of autonomous agents, that seems to be forgotten, or at best a low priority.
Re: (Score:2)
The three laws are nonsense without AGI that deserves the name. They are pretty much nonsense even with that, and Asimov wrote a lot about how robots and people get around them.
So, basically AHI Artificially Honest Intelligence (Score:1)
So, basically AHI: Artificially Honest Intelligence. Get fooled twice and it cancels out.
Re: (Score:1)
Yo dawg, we heard you like guardrails. (Score:3)
This basically sounds like someone is developing AI guardrails to guard against current gen AI guardrails and AI agents that fall outside of their own guardrails. The problem with this concept is defining what it is you're actually guarding against when AI agents have already proven to misbehave in unique and unexpected ways to do and / or say things they shouldn't. It's like a manager telling his people, "I need you to tell me everything that's unexpected in the next quarter so we can plan for it."
That said, I'm sure it's a good way to funnel at least a tiny bit of money away from the more aggressive AI companies that are setting out specifically to fuck shit up at all costs, so long as it earns them profit. I'm just not sure that this proposed idea is really all that different, other than starting out with the premise that they want to defuck the shit that's getting fucked up. How long before the new guardrails will need their own guardrails?
Oxymoron (Score:2)
"machines that have no self, no goal for themselves, that are just pure knowledge machines -- like a scientist who knows a lot of stuff."
Like Fauci? He is the science.
The very name is already a lie (Score:2)
"Honest" and suppression of "harmful" do not go together. You can only do one of the two.