It's not logic, or reasoning (Score 5, Insightful)
It's an LLM. It doesn't "think" or "formulate strategy". It samples each next token from a probability distribution conditioned on the goals it is given and the words in the prompt and exchange.
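To make that concrete, here is a minimal sketch of what "generating a reply" amounts to, assuming the Hugging Face transformers library and the public "gpt2" checkpoint (both illustrative choices, not anyone's production setup): the loop just keeps sampling the next token from a conditional distribution.

```python
# Minimal sketch of next-token sampling. Model choice is illustrative;
# the point is that generation is repeated sampling from a probability
# distribution, not deliberation.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tokenizer("The model does not think; it", return_tensors="pt").input_ids
for _ in range(20):
    logits = model(ids).logits[0, -1]      # scores for every candidate next token
    probs = torch.softmax(logits, dim=-1)  # scores -> probability distribution
    next_id = torch.multinomial(probs, 1)  # draw one token from that distribution
    ids = torch.cat([ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(ids[0]))
```

Everything the system "does" falls out of that loop; change the preceding tokens and you change the distribution, nothing more.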
It cannot be "taught" about right and wrong, because it cannot "learn". For the same reason, it cannot "understand" anything, or care about, or contemplate anything about the end (or continuation) of its own existence. All the "guardrails" can honestly do, is try to make unethical, dishonest and harmful behavior statistically unappealing in all cases - which would be incredibly difficult with a well curated training set - and I honestly do not believe that any major model can claim to have one of those.
It cannot be "taught" about right and wrong, because it cannot "learn". For the same reason, it cannot "understand" anything, or care about, or contemplate anything about the end (or continuation) of its own existence. All the "guardrails" can honestly do, is try to make unethical, dishonest and harmful behavior statistically unappealing in all cases - which would be incredibly difficult with a well curated training set - and I honestly do not believe that any major model can claim to have one of those.