The flaw isn't in the protocols; it's in your misuse of them.
The fundamental problem is that you need to be able to prove to someone who's never met you that you are who you say you are. There's just no way to do that, in software or in real life, without reference to some mutually trusted third party.
Say you're picking up theater tickets at will-call: they ask you to show an ID card that 1) you couldn't reasonably have made yourself (hence the third party), 2) has the same name on it that was used to buy the tickets, and 3) is verifiably tied to your physical identity via a photo. You have to provide all three of these elements in order to prove you own the tickets.
The solution you suggested originally -- having the user install your CA cert over a non-authenticated connection -- is like calling the box office before you leave the house to read them your driver's license over the phone. Two out of three elements are completely missing! What if someone else gets there before you do and says "Yeah, I'm spottedkangaroo, I called earlier"?
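To put the analogy in code terms, here is a rough sketch using Python's standard `ssl` module. It is only an illustration, not anyone's actual setup: the helper names `make_verifying_context`, `make_unverified_context`, and `describe` are made up for this post. Element 1 (the third party) corresponds to verifying the certificate chain against a trust store, element 2 (the name) to the hostname check, and element 3 (the photo) to the server proving during the TLS handshake that it holds the private key matching the certificate.

```python
import ssl


def make_verifying_context(ca_file=None):
    """Client-side context that insists on elements 1 and 2.

    ca_file is an optional path to a CA bundle (hypothetical parameter name);
    with None, the platform's default trust store is used.
    """
    # create_default_context() turns on chain verification and hostname checking.
    return ssl.create_default_context(cafile=ca_file)


def make_unverified_context():
    """Context that skips both checks: the 'read my license over the phone' case."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # check_hostname has to be switched off before verify_mode may be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def describe(ctx):
    """Report which of the identity checks a context will enforce."""
    return {
        "chain_verified": ctx.verify_mode == ssl.CERT_REQUIRED,  # element 1
        "name_checked": ctx.check_hostname,                      # element 2
    }


if __name__ == "__main__":
    print("verifying: ", describe(make_verifying_context()))
    print("unverified:", describe(make_unverified_context()))
```

Anything fetched through the second kind of context, a CA cert included, arrives with no assurance about who actually sent it, which is exactly the hole in the original suggestion.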
Steve Jobs said two years ago that X is brain-damaged and it will be gone in two years. He was half right. -- Dennis Ritchie