I'm afraid you do not understand what a large language model is.
Given that very obvious fact, were I you, I'd discard every opinion you have on the matter until you've rectified that gap.
Something like a book is not "stored" in an LLM.
It is torn into a billion sentence fragments, and the model's weights are adjusted toward accurately predicting how each fragment continues, based on a roughly 1000-dimensional embedding of that fragment's tokens.
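To make that concrete, here's a minimal sketch of a next-token-prediction training step in PyTorch. The vocabulary size, embedding width, and data are made up for illustration, and a real model conditions on the whole preceding context via attention rather than on one token's embedding as this toy does:

```python
import torch
import torch.nn as nn

# Toy sizes: real models use ~50k+ token vocabularies and embeddings
# of roughly 1000+ dimensions; these numbers are illustrative only.
vocab_size, embed_dim = 100, 32

model = nn.Sequential(
    nn.Embedding(vocab_size, embed_dim),  # token -> dense vector
    nn.Linear(embed_dim, vocab_size),     # vector -> scores for the next token
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# A "sentence fragment": each token's target is simply the token after it.
tokens = torch.randint(0, vocab_size, (1, 16))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

logits = model(inputs)  # (batch, seq, vocab) scores for the next token
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()   # gradients nudge the weights toward better predictions
optimizer.step()  # this, repeated over billions of fragments, is training
```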
The goal is, in fact, to "memorize" as little as possible during training. If you're memorizing, you're not generalizing; you're wasting those 1000-dimensional vectors. After all, as you said, such data is trivial to store and recall accurately if that's your goal.
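For a rough sense of scale (the figures below are ballpark assumptions, not measurements): storing a book verbatim costs almost nothing next to a model's weights, which is exactly why spending weights on verbatim copies would be a waste of capacity:

```python
# Ballpark figures only: ~77k words for a typical novel, ~6 bytes per
# word of plain text, and a 7B-parameter model stored in 16-bit floats.
book_words = 77_000
book_bytes = book_words * 6              # ~0.46 MB of plain text
model_bytes = 7_000_000_000 * 2          # ~14 GB of weights

print(f"book:  {book_bytes / 1e6:.2f} MB")
print(f"model: {model_bytes / 1e9:.1f} GB")
print(f"ratio: ~{model_bytes // book_bytes:,}x")
# Lossless storage and recall of text is trivial; that's what a file is.
# Training is pointed at something else entirely.
```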
The actual goal of the training is that, in learning to predict how those sentences finish, the model learns the semantic associations between words. I.e., it learns to "understand" them: what they mean when they're used the way they're being used.
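As an illustration of what those "semantic associations" look like in practice, here's a sketch comparing word vectors by cosine similarity. The vectors are invented for the example (real ones are ~1000-dimensional and learned, not hand-written), but in a trained model, words used in similar contexts really do end up with nearby embeddings:

```python
import numpy as np

# Invented 4-dimensional "embeddings", chosen so the related pair
# lands closer together; purely illustrative.
embeddings = {
    "wizard":  np.array([0.9, 0.1, 0.3, 0.0]),
    "witch":   np.array([0.8, 0.2, 0.4, 0.1]),
    "toaster": np.array([0.0, 0.9, 0.0, 0.8]),
}

def cosine(a, b):
    # 1.0 = same direction (similar usage), ~0.0 = unrelated
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(embeddings["wizard"], embeddings["witch"]))    # high: similar contexts
print(cosine(embeddings["wizard"], embeddings["toaster"]))  # low: different contexts
```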
Could you recall 42% of the first book of Harry Potter?