Comment Re:The dirty secret of LLMs is the training data (Score 1) 38
There are projects, such as LLM-jp ("A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs"), looking to solve this problem by creating LLMs from open data. Such projects are helped by a meaningful Open Source Definition and hurt by a weak one.
A quarter century ago nobody wanted to make their source code public either; remember Ballmer's "Linux is a cancer"?
If you want to help ensure a future where Open Source AI models are plentiful rather than drowned out by commercial black boxes, then sign the Open Source Declaration.