I do not understand why entanglement is such a mystery. Entanglement is not a magic connection between two particles. Neither is it (pace bhobba) merely a correlation. It is plain old superposition. It doesn't arise unless there is superposition - which is, of course, always the case in one basis or another. But given superposition, entanglement is a direct result. With superposition there is no spooky action at a distance. No non-locality.
So what is the problem?
The problem is that people don't like superposition and want to get rid of it.
It is just about psychologically acceptable to allow that a small system is in a weird quantum state until we look at it, but, as soon as we have observed it, surely the weirdness disappears? They say. So we have collapse of the wavefunction - or some similar ad hoc hypothesis - introduced to make our interpretation feel more natural. But when does the wavefunction collapse? When is an observation complete? In the case of an EPR experiment, Alice and Bob make measurements on their own photons. But Charles, who compares their results, has not yet made a measurement. Thus we must regard Alice and Bob, be they real people or mere detectors and recording devices, as being in superpositions *as far as Charles is concerned*. (Technically in his measurement basis.) And that doesn't feel right.
You mean Alice sees both outcomes but only one Alice state is actualized and then only when Charles observes her? I don't think so! They say.
And yet, if Alice and Bob do enter an unambiguous state (of having seen a particular outcome - which is 100% common-sense!) then spooky action at a distance is inevitable unless you really bend over backwards to concoct weird superdeterministic theories. Suppose Alice's measurement is a tiny bit ahead of Bob's - nowhere near enough to alter their spacelike separation but enough to be able to say that, at a certain time, Alice and Bob have set their detectors; Alice has made her measurement but Bob has not. During this time, Bob's detector must alter its detection sensitivity to reflect the angle between the two detectors. It's as simple as that. Bob's detector presumably knows its own angle so the information about Alice's setting must get to it superluminally. It is actually possible to get quite good correlations if Bob's detector is allowed to know everything about how Alice's photon interacts with her detector (edit - without the information about her setting). This covers all sorts of hidden variables, pre-arrangements and so on and certainly covers the red/green sock case. However, Bell's Theorem (the CHSH part) proves that, regardless of quantum mechanics, there are limits to what Bob's detector can do unless Bob has the additional information of Alice's setting. QM predicts, and experiment confirms, that the correlations are the strongest that would be possible if Bob did have that information. Remember, Bob's sensitivity must adjust itself according to this information even though it cannot have reached him yet. That's spooky.
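For anyone who wants to see the numbers behind the CHSH limit, here is a minimal sketch in Python. It rests on assumptions of mine, not anything fixed above: polarization-entangled photons whose correlation is E(a, b) = cos 2(a - b), the usual textbook detector angles of 0, 45, 22.5 and 67.5 degrees, and function names invented purely for illustration. The brute-force loop covers every deterministic strategy in which each detector's outcome depends only on its own setting; shared hidden variables are just mixtures of these, so they cannot beat the best one.

```python
import itertools
import math


def quantum_correlation(alice_angle_deg, bob_angle_deg):
    """Assumed QM correlation for polarization-entangled photons: cos 2(a - b)."""
    delta = math.radians(alice_angle_deg - bob_angle_deg)
    return math.cos(2.0 * delta)


def chsh(e_ab, e_ab2, e_a2b, e_a2b2):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return e_ab - e_ab2 + e_a2b + e_a2b2


def quantum_chsh(a=0.0, a2=45.0, b=22.5, b2=67.5):
    """S for the assumed quantum correlation at the given detector angles."""
    return chsh(
        quantum_correlation(a, b),
        quantum_correlation(a, b2),
        quantum_correlation(a2, b),
        quantum_correlation(a2, b2),
    )


def best_local_chsh():
    """Largest |S| over all deterministic local strategies.

    Alice's outcome (+1 or -1) is fixed in advance for each of her two
    settings, and likewise for Bob. Neither outcome depends on the other
    side's setting, which is exactly the 'no information about Alice's
    setting' condition.
    """
    best = 0
    for a_out, a2_out, b_out, b2_out in itertools.product((-1, 1), repeat=4):
        s = chsh(a_out * b_out, a_out * b2_out, a2_out * b_out, a2_out * b2_out)
        best = max(best, abs(s))
    return best


if __name__ == "__main__":
    print("best local |S|:", best_local_chsh())
    print("quantum S:     ", quantum_chsh())
```

The local strategies top out at |S| = 2, while the assumed quantum correlation gives 2√2, roughly 2.83, at these angles. That gap is what Bob's detector cannot close without knowing Alice's setting.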
So the choice is yours:
1 Shelter behind "it's only a correlation" and "you can't send signals with it". Edit - Shut up and calculate!
2 Accept spooky action at a distance (edit - as well as superposition).
3 Accept that Alice and Bob, like Schrodinger's Cat, remain in superposition at least until Charles observes them (edit - and no spooky stuff needed).
4 Come up with an alternative theory involving time-travelling fairies or brains in a vat (edit - not spooky at all, oh no!).