It didn't give inaccurate summaries of fictional lawsuits; it fabricated everything. Here's an example: it created a citation for "Shaboon v. Egypt Air" complete with a case number and selected quotations. There's no such lawsuit, either in reality or in a TV show or movie. If there were, that's all anyone would be talking about: that it can't tell the difference between TV and reality. But that's not what happened. It "hallucinated," as the ML folks call it.
You've got an inaccurate view of what this software is. ChatGPT is a Transformer. BASICALLY, it's a really big neural network with a few thousand inputs. Each input is a "token" (an integer representing a word or part of a word), including a null token. The output is a probability distribution for the next token. Because the input is null-padded, you can pick a likely next word, replace the next null with it, and repeat. Since only part of the input changes each step, this can be chained efficiently, and it keeps generating until it produces a special "End of Text" token or until all the nulls have been replaced with tokens.
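Here's a minimal sketch of that generation loop, purely to show the mechanics. The model() function and the tiny vocabulary are stand-ins I made up; the real network computes the distribution from billions of learned weights:

```python
import random

# Toy vocabulary; a real tokenizer has tens of thousands of entries mapping
# word pieces to integers.
VOCAB = ["<null>", "<end_of_text>", "the", "court", "ruled", "that", "airline", "was", "liable"]
NULL, END_OF_TEXT = 0, 1

def model(window):
    """Stand-in for the Transformer: given the (null-padded) input window,
    return a probability distribution over the next token. These numbers are
    made up (and ignore the window entirely, which a real model obviously
    would not); the real network computes them from its learned weights."""
    weights = [0.0, 0.1] + [1.0] * (len(VOCAB) - 2)   # never predict null, sometimes end
    total = sum(weights)
    return [w / total for w in weights]

def generate(prompt_tokens, context_size=16):
    window = prompt_tokens + [NULL] * (context_size - len(prompt_tokens))
    pos = len(prompt_tokens)
    while pos < context_size:
        dist = model(window)
        next_tok = random.choices(range(len(VOCAB)), weights=dist)[0]
        if next_tok == END_OF_TEXT:       # the special "End of Text" token
            break
        window[pos] = next_tok            # replace the next null with the pick
        pos += 1                          # ...and repeat
    return window[:pos]

print(" ".join(VOCAB[t] for t in generate([VOCAB.index("the"), VOCAB.index("court")])))
```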
That's the basics. Under the hood are a lot of moving parts, but an important component is a subnetwork that's repeated several times, called an "Attention Head". These subnetworks are responsible for deciding which tokens are "important" (this is called "self-attention", since the model is directing its own "attention" to certain words). This mechanism is how it can get meaningful training with so many inputs: you might give it 1200 words, but it picks out the important ones and predicts based largely on them. It's also how it can make long-distance references to its own generated text; proper nouns tend to keep attention on themselves. Earlier techniques couldn't do that: the further away a word was, the less it mattered to predicting the next one.
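If it helps, here's a toy, single-head version of that scoring step, with random numbers standing in for learned weights and none of the other machinery (no masking, no multiple heads, no feed-forward layers), just to show what "deciding which tokens are important" looks like numerically:

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """One attention head: every token scores every other token, and each
    output is a weighted mix of the tokens it decided were 'important'."""
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # pairwise importance scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: each row sums to 1
    return weights @ V, weights

# 5 tokens with embedding size 8; random values stand in for learned ones.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))

out, attn = self_attention(x, Wq, Wk, Wv)
print(attn.round(2))   # row i: how much token i attends to tokens 0..4, near or far
```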
So, it doesn't know about cases at all. It just knows, e.g. if you ask about SCO v IBM, that those tokens are ALL important, and then it (hopefully) has been trained on enough descriptions of that case that the probability distribution shakes out to a coherent summary. Now if you ask for relevant case law and it hasn't seen any, it HOPEFULLY will say so. But it's been trained on far more cases that exist than on "don't know" refusals, so it can "hallucinate" (note that it now HAS been trained on a lot more refusals, which is annoying because it's now very prone to say things don't exist when they do).

It knows the general form is "X v Y", so, absent any training indicating that a SPECIFIC value of X and Y would be relevant, you just get a baseline distribution, and it invents "Shaboon v. Egypt Air" because it knows X should be a last name and, since it was asked about injuries during air travel, that the defendant should be an airline (presumably it picked Egypt Air because generation is left-to-right and it had already generated an Arabic surname).

Now here is where self-attention gets really dangerous. Just as it would recognize SCO v. IBM as important in a user query, it will recognize Shaboon v. Egypt Air as important. The case doesn't exist, so pretraining can't do much with it per se, but the model is still going to focus on those tokens. And if asked for excerpts, it will generate SOMETHING related to a passenger being injured during air travel. Or it will say it doesn't know. These days it almost always says it doesn't know or that no such case exists, in large part because, after the bad press, ClosedAI has been very busy fine-tuning it on "I don't know" responses.
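To go back to the "baseline distribution" part and make it concrete, here's a toy sketch. Every probability below is invented by me for illustration; nothing is pulled from the actual model:

```python
import random

# Completely made-up numbers, just to illustrate the point: the model has a
# strong prior for the *shape* "Surname v. Airline" even when there's no real
# case to draw on.
surnames = {"Shaboon": 0.4, "Ahmadi": 0.3, "Smith": 0.3}
airlines = {"Egypt Air": 0.4, "United Airlines": 0.3, "Delta": 0.3}

def sample(dist):
    names, weights = zip(*dist.items())
    return random.choices(names, weights=weights)[0]

# The two draws are independent here; in the real model the airline
# distribution would be conditioned on the surname it had already generated,
# which is how an Arabic surname nudges it toward Egypt Air.
print(f"{sample(surnames)} v. {sample(airlines)}")
```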
Here's an example of it dealing with fictional cases. I asked it what the case was called in the Boston Legal episode "Guantanamo by the Bay". It said there is no such episode and I'm likely thinking of fan fiction. I told it it's real, it's S3E22. It said of course, yes, it's the twenty-second episode of the third season, and it's about Alan Shore arguing that Denny Crane is not fit to stand trial due to dementia, but no case names are mentioned. I told it that's wrong (but I didn't elaborate, just "That's wrong"). It apologized again and said the episode is about a client who was tortured in another country suing the US Government over extraordinary rendition (finally correct), and that the fictional case was "Ahmadi v. United States." (Wrong: it was "Kallah v. something" if it was anything, but it's been a minute since I saw the episode...it might not have been officially named.) If I reset the context and ask about "Ahmadi v. United States" or "Kallah v. United States", it says there are no such lawsuits (correct as of its cutoff date, at least as far as Google can tell).
In other words, it doesn't really know much about fictional lawsuits from movies and TV shows because the case name is almost never mentioned in a synopsis and it isn't trained on the full episode transcripts.
Anyway, the main point here is that ML has taken a step or two since you learned about n-grams in 1998, or whenever you last looked at the field.
Bonus: I asked ChatGPT to identify inaccuracies in my post and it mentioned the following (paraphrased by me, not pasted verbatim):
- My summary of the technology is simplified but not incorrect (yay).
- ChatGPT is based on GPT-3.5, which has 175 billion parameters, not "thousands" (I was referring to the number of input cells, NOT the number of parameters of the model. I think I hurt its feelings, because it pointed this out TWICE in a row.)
- I referred to ChatGPT "knowing" things, but as an LLM it does not have consciousness and is unable to "know" anything.
- I referred to ChatGPT "hallucinating" things, but as an LLM it does not have consciousness or senses, so it is unable to make observations at all, let alone make inaccurate ones (I agree that "hallucination" is a silly description, but that's the ML jargon, like it or no)
- I referred to ChatGPT answering questions about Boston Legal incorrectly, but I should have made it clearer that ChatGPT responses involve randomization, so not all users would have the same experience I did.
- It pointed out that I'm ASSUMING fine-tuning on refusals has taken place, which is unknown, and added that I'm being cynical by assuming it was done to avoid further bad press rather than to improve the experience for clients.
- It took umbrage at my calling OpenAI "ClosedAI", which is not the correct name, and suggested I take a more balanced tone, as I appear to be overly critical.
Anyways, in summary: ChatGPT (and GPTs in general) can do a lot more than people think. And also a lot less.