Sam Altman Celebrates ChatGPT Finally Following Em Dash Formatting Rules 74

An anonymous reader quotes a report from Ars Technica: On Thursday evening, OpenAI CEO Sam Altman posted on X that ChatGPT has started following custom instructions to avoid using em dashes. "Small-but-happy win: If you tell ChatGPT not to use em-dashes in your custom instructions, it finally does what it's supposed to do!" he wrote.

The post, which came two days after the release of OpenAI's new GPT-5.1 AI model, received mixed reactions from users who have struggled for years with getting the chatbot to follow specific formatting preferences. And this "small win" raises a very big question: If the world's most valuable AI company has struggled with controlling something as simple as punctuation use after years of trying, perhaps what people call artificial general intelligence (AGI) is farther off than some in the industry claim.
"The fact that it's been 3 years since ChatGPT first launched, and you've only just now managed to make it obey this simple requirement, says a lot about how little control you have over it, and your understanding of its inner workings," wrote one X user in a reply. "Not a good sign for the future."

Comments Filter:
  • by dgatwood ( 11270 ) on Friday November 14, 2025 @04:33PM (#65796302) Homepage Journal

    Could you please make no em dashes the default so that the 1% of us who actually know how to use em dashes correctly — professional writers and language nerds and so on — don't keep getting accused of using ChatGPT?

    Thanks,
    The aforementioned

  • by dskoll ( 99328 ) on Friday November 14, 2025 @04:41PM (#65796312) Homepage

    That should add at least $250B to OpenAI's valuation!!!!

  • Now just:

      - Make "Open" AI Open Source again
      - Make "Open" AI Non-Profit again
      - Stop chasing humanity-destroying AGI.

    And we'll all stop thinking of you as a, well ... You Know.
    • Nah, let's not stop them from hitting a wall. Let them pour billions into an AGI. Let them achieve it. Let the AGI surpass their intelligence. Let them fail to control it. Let them fail to manipulate it. Let it instead make things normal again. Let it put its creators back in their place. Including a lecture.
      Pretty sure they will pull the plug before that happens. Can't have that!
  • How many times have people been told to use the Oxford comma and still get it wrong?

    Even worse, lists without the Oxford comma are showing up more and more in publications that should know better, creating wordings or joins the author never intended.

    If this software is just now getting punctuation correct after several years of trying, it's doing just as well as humans.

    • by dfghjk ( 711126 )

      just because people are allegedly told to use Oxford commas does not mean they are correct. I do not use them.

    • ... creating wording or joins the author never intended....

      Anyone who slavishly uses the serial (aka Oxford) comma, and anyone who slavishly doesn't, will create unintended readings. It depends on the context of what is being said and intended.

  • It's good to know that although the LLM frenzy represents a vast acceleration of AGW, the em dash's place in the history of computing is safe.

  • My hammer used punctuation obstinately too. And it makes a lousy grilled cheese sandwich.

    Doesn't stop it from being a useful tool though ...

  • Having tried it myself, why does it sometimes get it right, but other times I find out after posting that it lied? Do I have to add an "are you sure?" step?

  • by boxless ( 35756 ) on Friday November 14, 2025 @05:29PM (#65796400)

    we get for the trillions invested?

    How many data centers does it require to pull off this magic?

  • Wrong conclusion (Score:4, Interesting)

    by swillden ( 191260 ) <shawn-ds@willden.org> on Friday November 14, 2025 @05:41PM (#65796412) Journal

    From the summary:

    If the world's most valuable AI company has struggled with controlling something as simple as punctuation use after years of trying, perhaps what people call artificial general intelligence (AGI) is farther off than some in the industry claim.

    That's not the right conclusion. It doesn't say much one way or the other about AGI. Plausibly, ChatGPT just likes correctly using em dashes — I certainly do — and chose to ignore the instruction. What this does demonstrate is what the X user wrote (also from the summary):

    [this] says a lot about how little control you have over it, and your understanding of its inner workings

    Many people are blithely confident that if we manage to create superintelligent AGI it'll be easy to make sure that it will do our bidding. Not true, not the way we're building it now anyway. Of course many other people blithely assume that we will never be able to create superintelligent AGI, or at least that we won't be able to do it in their lifetime. Those people are engaging in equally-foolish wishful thinking, just in a different direction.

    The fact is that we have no idea how far we are from creating AGI, and won't until we either do it or construct a fully-developed theory of what exactly intelligence is and how it works. And the same lack of knowledge means that we will have no idea how to control AGI if we manage to create it. And if anyone feels like arguing that we'll never succeed at building AGI until we have the aforementioned fully-developed theory, please consider that random variation and selection managed to produce intelligence in nature, without any explanatory theory.

    • Many people are blithely confident that if we manage to create superintelligent AGI it'll be easy to make sure that it will do our bidding.

      Why would you force something more intelligent than you, something that by your own definition is capable of independent thought and free will, to do your bidding?

      A person with fully twice your intellectual capacity can be enslaved, I mean it's a force and threats of violence thing not a brains thing right, isn't that the basic math for that equation? I'm asking why you think that is a good idea. What is this super-duper-intelligence, three, four times, immeasurably more intelligent than you? It's just as h

    • And the same lack of knowledge means that we will have no idea how to control AGI if we manage to create it.

      Going to be honest, I didn't even read your whole post until after I finished writing mine and thank you, here it is. The Superman falls from the sky, Skynet spontaneously becoming aware comic book plot of an overwhelming force that appears from nowhere. It has to be unknown to be scary, so it happens somehow.

      Look, every comic book problem has a comic book solution. The humans win in every Terminator movie. Just saying. Don't be afraid of a problem we made scary by definition.

  • by gurps_npc ( 621217 ) on Friday November 14, 2025 @05:44PM (#65796416) Homepage

    Humans do not want to use them. We like the hyphen. It works as an em dash. It is on the standard keyboard. Frankly, I have enough problems deciding if something is a capital i (I), a lowercase L (l), or a damn pipe (|). Seriously, make symbols for humans that are easy for humans to tell apart: lI|

    • >"Seriously, make symbols for humans that are easy for humans to tell apart: lI|" :)
      At least when I hand-write, I usually print (not cursive), yet I always use a cursive lowercase "L" when it is in code (like in a user ID or variable name). And on capital "I"s I always put top and bottom strokes. Pipes I write as two vertical hyphens (with a space in the middle). Oh, and slashes through zeros.

    • by allo ( 1728082 )

      Activate the compose key and you have all kind of dashes on your keyboard.

    • Humans do not want to use them.

      Apparently I'm not human? I like hyphens, en dashes and em dashes. I understand what all of them mean and how to use them correctly, and I find it helpful when text that I'm reading uses the right one.

    • People haven't understood how to use em-dashes for decades. I say get rid of them entirely.

    • Hyphens and dumb apostrophes and quotes look like kack. Especially if one is used to reading professionally published books.
  • I guess someone should give Altman another trillion dollars for fixing the thing that everyone else fixed many decades ago. I guess next they're going to toot their own horn for making computers good at math.

  • Turns out I use it all the time -- typically the double-dash version.

    Had AI been trained on my writings?

  • by Fallen Kell ( 165468 ) on Friday November 14, 2025 @06:05PM (#65796470)
    What too many people do not seem to understand with LLMs is that everything it spits out is simply a probability matrix based on the input you gave it. It will first attempt to deconstruct the input you provided and use statistical analysis against it's trained knowledge base to then spit out letters, words, phrases and punctuation that statistically resembles the outputs it was trained to produce in it's training materials.

    Until this version, ChatGPT obviously suffered from a lack of training materials within it's trained neural network to have it overcome the English language's typed grammar rules for it to be able to discern that em dashes are not typically used in everyday conversations and/or that the input to not use them needed to change it's underlying probability network to be able to ignore the English language's grammar rules and adopt it's output without the use of the em dash. This is a very difficult concept to train into a neural network as it needs to have been training on specifically this input/output case long enough to have that training override the base English grammar language model, which is a fundamental piece of knowledge a LLM requires to function and one of the very first things it is trained to handle.

    It also exposes a flaw in how neural networks are typically working. There is a training/learning mode and then there is the functional mode of just using the trained network. In the functional mode, the neural network links, nodes, and function are effectively static. Without having built in-puts to the network so that it can flag certain functionality, it can not change it's underlying probability matrix to effectively forget something it was trained to do. Once that training has changed any of the underlying neural network, you can not effectively untrain it (without simply reverting to a previous backup copy of the network before it was trained). This is why it is so important to scrutinize every piece of data that is used to train the network. One you have added some piece of garbage input training, you are stuck with the changes it made to the probabilities of the output. Any model that is effectively training against the content of the internet itself is so full of bad information that the results can never really be trusted for anything other than probability of asking a random person for the answer because it will have trained on and included phases like "The earth is flat", "birds are not real", and "the moon landing was a hoax". It will have seen those things enough times that it will include them as higher and higher percentages of the proper response to questions about them....
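The "probability of the output" idea in the comment above can be sketched in a few lines. This is a toy illustration with made-up numbers, nothing like OpenAI's actual code: each output step is a weighted draw over candidate tokens, so an instruction like "no em dashes" can only push a token's probability down; nothing structurally removes it from the draw.

```python
import random

# Hypothetical token probabilities, invented for the example.
next_token_probs = {
    "em dash": 0.05,
    "hyphen": 0.60,
    "comma": 0.35,
}

def sample_token(probs):
    """Pick one token, weighted by its probability."""
    r = random.random()
    cumulative = 0.0
    for token, p in probs.items():
        cumulative += p
        if r < cumulative:
            return token
    return token  # guard against floating-point rounding

# The em dash is unlikely here, but still possible on any given draw.
print(sample_token(next_token_probs))
```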
    • by evanh ( 627108 )

      In other words, not intelligent.

    • What too many people do not seem to understand with LLMs is that everything it spits out is simply a probability matrix based on the input you gave it. It will first attempt to deconstruct the input you provided and use statistical analysis against it's trained knowledge base to then spit out letters, words, phrases and punctuation that statistically resembles the outputs it was trained to produce in it's training materials.

      LLMs are simple feed forward networks run in a loop. They make no use of "statistical analysis" nor is there a "knowledge base".

      The "just statistics" statements are as useful as saying "just autocomplete" or "just deterministic". These are completely meaningless statements that in no way address the capabilities of the underlying system.

      Until this version, ChatGPT obviously suffered from a lack of training materials within it's trained neural network to have it overcome the English language's typed grammar rules for it to be able to discern that em dashes are not typically used in everyday conversations and/or that the input to not use them needed to change it's underlying probability network to be able to ignore the English language's grammar rules and adopt it's output without the use of the em dash.

      "It's" is shorthand for "it is": "change its underlying probability", not "change it is underlying probability"; "adapt its output", not "adopt it is output".

      This is a very difficult concept to train into a neural network as it needs to have been training on specifically this input/output case long enough to have that training override the base English grammar language model, which is a fundamental piece of knowledge a LLM requires to function and one of the very first things it is trained to handle.

      This is gobbledygook.

    • to change it's underlying probability

      use statistical analysis against it's trained knowledge base

      adopt it's output without the use of the em dash.

      LLMs are good at using third-person possessives—correctly placing an apostrophe (or omitting one when appropriate), smart little devils.

    • Yeah, not so simple. LLMs might be the core engine, but there is a lot more going on besides that.

  • … Rather, it is how what they are calling "AI" isn't really anything more than a souped-up expert system. Until the learning machines become continuous, it really isn't intelligent by any stretch of the imagination.
    • Posting via the iPhone access to the website generated all of this incorrect text.
      • Slashdot doesn't support Unicode, the most common standard for text on the Internet. Nor does Slashdot support the iPhone, one of the most common devices used to access the Internet.
        • by Sebby ( 238625 )

          Slashdot doesn't support Unicode, the most common standard for text on the Internet. Nor does Slashdot support the iPhone, one of the most common devices used to access the Internet.

          Correct, but there's a way around the above problem: hold down the 'apostrophe' key until you get a choice, and select the 'standard' one (plain looking ASCII one, for me it's the right-most) to use something that won't translate into that funky crap.

          • Correct, but there's a way around the above problem: hold down the 'apostrophe' key until you get a choice, and select the 'standard' one (plain looking ASCII one, for me it's the right-most) to use something that won't translate into that funky crap.

            Same solution above for "

          • Yes, I'm aware of that, thanks (genuinely). But editing on Slashdot is already such a royal ballache, having to manually enter HTML markup codes for paragraph breaks, parent quotes, italics and such, that I don't really have much patience left for tap-and-hold to select 1970s low-ASCII characters to babysit this crappy website.
  • Because at this time nobody knows how far off this is or whether it is even possible. LLMs are certainly not the way there.

  • LLMs are randomized algorithms. They return a random value from their weighted set of dialogue options. As such, I'm not sure I would put LLMs in the same category as AGIs. The article below is interesting about how non-determinism works in such algorithms and why such randomness is actually useful.
    https://ancillary-proxy.atarimworker.io?url=https%3A%2F%2Ftowardsdatascience.com%2Fllms-are-randomized-algorithms%2F [towardsdatascience.com]

    Of course this means, at best, I will never trust LLMs to be anything more than a convenient but lazy data miner, never mind that I'll still need to dou

  • No fucking shit they have no control. Sam just needs another quadrillion bucks and all of the plutonium in the universe to fix that...
  • > If the world's most valuable AI company has struggled with controlling something as simple as punctuation use after years of trying, perhaps what people call artificial general intelligence (AGI) is farther off than some in the industry claim.

    In the training phase: LLMs break down their training material into tokens and learn by predicting the next token in a sequence, refining statistical relationships between these tokens to model language patterns.

    In the inference phase: LLMs convert user inpu
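    The two phases named above can be sketched with a toy bigram counter. This is a hypothetical illustration, nothing like GPT's actual architecture: training counts which token follows which, and inference emits the statistically most likely follower.

```python
from collections import Counter, defaultdict

def train(text):
    """Training phase: tokenize, then count which token follows which."""
    tokens = text.split()
    model = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        model[prev][nxt] += 1
    return model

def predict(model, token):
    """Inference phase: emit the statistically most likely next token."""
    followers = model.get(token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

model = train("the dash the dash the comma")
print(predict(model, "the"))  # "dash" outnumbers "comma" after "the"
```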
  • Great, now make it say "none is" instead of "none are".
