There are a few pieces to AI: there is the code that ingests the data and tokenizes it, there is the code that interacts with the user, and then there is the actual data fed into the tokenizer. The first and second are more like what we traditionally call software, and are available in open source versions. The third is the problem piece. If you managed to textualize everything you know and feed it into an LLM, the LLM would only know what you know, and that would not be very useful unless you just wanted it to remember your second cousin's birthday and remind you about it. The minute you start feeding it text or images you didn't create, you venture into the ethical and legal morass that is currently churning all over the world around the big LLMs.
That huge pool of tokens is what makes an LLM useful; it really is the LLM. The code that can be ethically shared just creates or interacts with that pool. Yes, you can own books, and by historical precedent you have every right to physically manipulate a book you own in any way you like. What you do not have is the right to copy that book, and that is the heart of the controversy. Many authors, artists, and creators claim that the act of ingesting a book into an LLM creates a copy of that book, while the people making LLMs (and the corporations who see the potential for $BILLIONS$) say that they are just deriving metadata, or that the ingestion does not constitute a copy because the text is not stored but tokenized, and that LLMs will not regurgitate verbatim the data on which they were trained.
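To make the "tokenized, not stored" argument concrete, here is a minimal sketch of what tokenization does. The vocabulary and text are invented for illustration; real LLM tokenizers learn subword vocabularies (BPE and similar) from the training corpus rather than mapping whole words, but the principle is the same: text in, integer IDs out.

```python
# Minimal sketch of tokenization: text becomes a sequence of integer IDs.
# The vocabulary here is invented for illustration; real tokenizers learn
# subword pieces from the corpus rather than mapping whole words.
VOCAB = {"<unk>": 0, "the": 1, "cat": 2, "sat": 3, "on": 4, "mat": 5}
INVERSE = {i: w for w, i in VOCAB.items()}

def tokenize(text: str) -> list[int]:
    """Map each whitespace-separated word to its vocabulary ID."""
    return [VOCAB.get(word, VOCAB["<unk>"]) for word in text.lower().split()]

def detokenize(ids: list[int]) -> str:
    """Reverse the mapping, given the same vocabulary."""
    return " ".join(INVERSE[i] for i in ids)

ids = tokenize("The cat sat on the mat")
print(ids)              # [1, 2, 3, 4, 1, 5]
print(detokenize(ids))  # "the cat sat on the mat"
```

The round trip is the point: tokenization by itself loses almost nothing, so "it's only tokens" is a weaker defense than it first sounds. The genuinely hard question is whether the trained weights amount to a copy.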
Of course, creative prompts seem to show that they will indeed regurgitate verbatim.
The current state of this controversy makes it very difficult to guarantee that the training set of a useful LLM was actually all public domain or otherwise legally ingestible, and therefore releasing an LLM under an open source license might get you sued.
Of course, this legal back and forth is how we discover the need for new law, and it will eventually lead various governments and legislative bodies to make laws that define the borders of what can and cannot be fed to an LLM without licensing. These laws will vary by location and by the perceived values of the bodies making them, which will probably create "LLM friendly" jurisdictions where the AI companies will go to lower ingestion costs. That, in turn, will lead to another wave of lawsuits, this time by authors et al. attempting to prevent access to the LLMs from regions with stricter laws, much as we have seen in the audio/video realm.
Basically, AI in the AGI sense is not something that an individual of normal means can do, in much the same way that an individual of normal means cannot build a mass production factory; the resources required are just too big.
AI in the classic sense, as opposed to the prompt-driven generative sense, is something an individual can play with. It is fundamentally pattern recognition, and it is already applied invisibly in many parts of life.
For me, a really fun example is the arc-fault circuit breaker required in much of the US in new electrical installations. It actually "listens" to noise on the electrical line and compares it to a library of known signatures to determine whether it is hearing an accidental arc or just the normal operation of a device that arcs by design, like a brushed motor or a relay.
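For the curious, here is a toy sketch of that kind of signature matching. Everything in it is made up for illustration: real breakers use proprietary sampling, features, and libraries, but the core operation is comparing a measured noise fingerprint against known signatures.

```python
import numpy as np

# Hypothetical signature library: each entry is a small frequency-domain
# "fingerprint" of line noise from a known source. Real breakers use
# proprietary features and far richer libraries; this is just the shape
# of the idea.
SIGNATURES = {
    "brushed_motor": np.array([0.1, 0.7, 0.5, 0.2, 0.1]),
    "relay":         np.array([0.6, 0.2, 0.1, 0.1, 0.0]),
    "fault_arc":     np.array([0.3, 0.3, 0.4, 0.6, 0.8]),
}

def classify(noise_samples: np.ndarray) -> str:
    """Match a burst of sampled line noise against the signature library."""
    spectrum = np.abs(np.fft.rfft(noise_samples))[:5]  # crude feature vector
    spectrum = spectrum / max(spectrum.max(), 1e-9)    # normalize to [0, 1]
    # Nearest neighbor by Euclidean distance: classic pattern recognition.
    return min(SIGNATURES, key=lambda name: np.linalg.norm(SIGNATURES[name] - spectrum))

burst = np.random.default_rng(0).normal(size=64)  # stand-in for sampled noise
print(classify(burst))
```

Nearest-neighbor matching against a fixed library is about the simplest possible classifier, which is exactly the point: even that counts as pattern recognition, and richer libraries plus smarter matching are what improved these devices.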
The first generation of these devices produced so many false positives that they rapidly gained a reputation for uselessness; however, as signature libraries improved and pattern matching algorithms evolved, they got better and better. This is AI. It has nothing to do with general intelligence or conversation; it is a very specific realm of pattern matching, and it does the job better and faster than any person could.
Because it is an industrial control device it is not recognized as AI; it is just another little black box that controls something. It doesn't even look as impressive as a PID process controller, which, though it can appear to be smarter, is really not AI at all; it is just a basic calculator with real-world inputs.
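For contrast, here is roughly all a textbook PID controller does (the gains and the heater loop below are placeholder values, not taken from any real device): three arithmetic terms computed from the error signal, with no library and no learning.

```python
class PID:
    """Textbook PID controller: output = Kp*e + Ki*integral(e) + Kd*d(e)/dt.
    No signature library, no learning; just arithmetic on the error."""

    def __init__(self, kp: float, ki: float, kd: float):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint: float, measurement: float, dt: float) -> float:
        """One control update: return the actuator output for this tick."""
        error = setpoint - measurement
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Placeholder gains for a hypothetical heater loop.
heater = PID(kp=2.0, ki=0.5, kd=0.1)
print(heater.step(setpoint=70.0, measurement=65.0, dt=1.0))
```

It looks smart because it reacts to the world, but every output is the same fixed formula; the arc-fault breaker, by comparison, is actually making a classification decision.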