There's a very simple answer to the question in the OP: Quantum theory is how entanglement is defined and explained. There's nothing else to add, at least not according to contemporary knowledge of physics. Physics, of course, is not complete, as long as there's not one consistent theory describing (note that physics never explains nature but describes her as precisely as possible) all observable reproducible objective phenomena in nature.
So far, we have two fundamental theories describing all known phenomena. One is the mathematical description of space and time, the General Theory of Relativity, which has its centenary this year; the other is quantum theory, which describes the part of matter understood so far in terms of elementary particles. Fortunately, this latter realm is describable using the Special Theory of Relativity as the space-time model, i.e., neglecting the very weak gravitational interaction. Formulating a quantum (field) theory with a general relativistic space-time model is quite difficult and not fully understood in all its details. Whether a consistent mathematical description of quantum gravitation exists is a totally open question. One reason is that it is very hard to find observable quantum effects of the gravitational interaction.
However, entanglement is a ubiquitous phenomenon, described to very high accuracy and in accordance with all high-precision experiments by standard relativistic quantum field theory, and it is very important to note that there is no action at a distance whatsoever involved in it. This is so even by construction! In the relativistic quantum theories that have been successful so far, one assumes from the very beginning that all interactions are local, and as a consequence there cannot be an action at a distance. An action in one region cannot causally affect a phenomenon in a distant region instantaneously; any influence needs at least the time light would take to travel from the first place to the second. That's the "cosmic speed limit" inherent in the theory of relativity and thus, by construction, also in local relativistic quantum field theories. This type of model is very successful in describing the matter surrounding us in terms of quarks, leptons, the Higgs boson and so-called gauge fields: the Standard Model of Elementary Particle Physics.
Although there are no actions at a distance, you can create composite systems whose "parts" are correlated over very long distances. This is one aspect of entanglement. It is quite difficult to maintain this entanglement long enough to enable such long-distance correlations, because it can be prepared only for microscopic objects, and such objects are very easily affected by interactions with anything in the "environment". That's why most experiments demonstrating these fascinating long-distance correlations are done with photons, which can be created in entangled pairs (biphotons) and can nowadays be kept entangled quite successfully.
A biphoton is created by shooting a laser into a birefringent crystal, through a process called "parametric down-conversion". Put very simply, a photon from the laser field is split into two lower-energy photons in such a way that their polarizations are entangled. These photons then run "back to back" (again, speaking in a very simplified way), and if you wait long enough, they are far apart. Getting there takes a finite time, though, since they travel through the vacuum at a finite speed, the speed of light. All the while, the entanglement of their polarizations is preserved.
The mind-boggling consequence of such a biphoton state is the following: According to quantum theory, when the polarization of each of these photons is measured at far-distant places (usually named A and B, after the observers Alice and Bob doing the experiment), you'll find absolutely random polarizations. You cannot predict which polarization (horizontal or vertical, as determined with a correspondingly oriented polarization filter) you'll observe; quantum theory only tells you that Alice and Bob each find an h-polarized photon with 50% probability and a v-polarized photon with 50% probability. However, the photon polarizations are entangled, as long as the photons are not disturbed in any way on their way from the source to Alice's and Bob's detectors, and this preparation of entangled polarizations implies that the outcomes of Alice's and Bob's measurements are 100% correlated! If Alice measures an h-polarized photon, Bob with certainty finds a v-polarized photon. It doesn't matter who measures her or his photon first, which shows that it is not Alice's measurement that affects Bob's result or vice versa (at least if you believe in causality according to the relativistic space-time description), but that it is the preparation at the very beginning that causes this 100% correlation of totally random single-photon polarizations.
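To make the statement "random individually, but 100% correlated" concrete, here is a minimal sketch in Python. It assumes the biphoton is in the polarization state (|HV> - |VH>)/sqrt(2), which is one state that gives the perfect h/v anticorrelation described above; the post itself does not specify the state, and the helper names `kron` and `joint_probability` are made up for illustration.

```python
import math

# Single-photon polarization basis (assumed convention): H = (1, 0), V = (0, 1).
H = (1.0, 0.0)
V = (0.0, 1.0)

def kron(a, b):
    """Tensor product of two single-photon polarization vectors."""
    return [x * y for x in a for y in b]

# Assumed biphoton state (|HV> - |VH>)/sqrt(2); first slot is Alice's photon,
# second slot is Bob's photon.
norm = 1.0 / math.sqrt(2.0)
psi = [norm * (hv - vh) for hv, vh in zip(kron(H, V), kron(V, H))]

def joint_probability(a, b, state=psi):
    """Born-rule probability that Alice finds polarization a and Bob finds b."""
    amplitude = sum(x * y for x, y in zip(kron(a, b), state))
    return amplitude ** 2  # amplitudes are real in this example

joint = {
    "HH": joint_probability(H, H),
    "HV": joint_probability(H, V),
    "VH": joint_probability(V, H),
    "VV": joint_probability(V, V),
}

# Single-photon (marginal) probabilities: sum over the other observer's outcomes.
p_alice_h = joint["HH"] + joint["HV"]
p_bob_h = joint["HH"] + joint["VH"]

print(joint)
print("P(Alice finds H) =", p_alice_h, " P(Bob finds H) =", p_bob_h)
```

Under this assumed state, the joint probabilities for equal outcomes (HH, VV) vanish, while each observer alone sees h or v with probability 1/2. Nothing in the calculation refers to the order of the two measurements.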
Now, there was an objection against this "minimal statistical interpretation" of quantum theory, brought forward mostly because many scientists have had (and some still have today) objections to a probabilistic world view. What if nature is deterministic, and the single-photon polarizations only appear random because we don't know some "hidden parameters"? At first glance, this could well be true, and for some time physicists thought that you cannot easily distinguish such a deterministic local hidden-variable model from the intrinsically probabilistic quantum behavior. Then John Bell showed in the 1960s that this is not true! At least if you assume that the deterministic theory is also local (i.e., that there are no actions at a distance), you can derive an inequality for the probabilities of measuring certain polarizations that is violated by the quantum-theoretical prediction (for this, Alice and Bob must set their polarization filters at certain relative angles that are neither parallel nor perpendicular). This violation of Bell's inequality has been demonstrated with overwhelming accuracy. In fact, it's among the most accurate tests of a theory done in physics so far!
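As an illustration, the following Python sketch evaluates one commonly used form of Bell's inequality, the CHSH inequality |S| <= 2. These are all assumptions for the sake of the example and are not taken from the post: the CHSH form itself, the biphoton state (|HV> - |VH>)/sqrt(2), the filter angles 0°, 45° (Alice) and 22.5°, 67.5° (Bob), and the function names. Each measurement outcome is counted as +1 if the photon passes the filter and -1 if it has the perpendicular polarization.

```python
import itertools
import math

def analyzer(theta):
    """Polarization directions that pass (+1) and are blocked (-1) at filter angle theta."""
    passed = (math.cos(theta), math.sin(theta))
    blocked = (-math.sin(theta), math.cos(theta))
    return passed, blocked

def amplitude(a, b):
    """Amplitude <a, b|psi> for the assumed state (|HV> - |VH>)/sqrt(2)."""
    return (a[0] * b[1] - a[1] * b[0]) / math.sqrt(2.0)

def correlation(alpha, beta):
    """Quantum expectation value of the product of Alice's and Bob's +1/-1 outcomes."""
    total = 0.0
    for sign_a, a in zip((+1, -1), analyzer(alpha)):
        for sign_b, b in zip((+1, -1), analyzer(beta)):
            total += sign_a * sign_b * amplitude(a, b) ** 2
    return total

def chsh(e_ab, e_ab2, e_a2b, e_a2b2):
    """CHSH combination S = E(a,b) - E(a,b') + E(a',b) + E(a',b')."""
    return e_ab - e_ab2 + e_a2b + e_a2b2

# Assumed filter settings, neither parallel nor perpendicular to each other.
a, a2 = math.radians(0.0), math.radians(45.0)
b, b2 = math.radians(22.5), math.radians(67.5)

s_quantum = chsh(correlation(a, b), correlation(a, b2),
                 correlation(a2, b), correlation(a2, b2))

# Local deterministic model: every pair carries predetermined outcomes
# A, A' (Alice) and B, B' (Bob). Enumerate all 16 assignments.
s_local_max = max(
    abs(chsh(A * B, A * B2, A2 * B, A2 * B2))
    for A, A2, B, B2 in itertools.product((+1, -1), repeat=4)
)

print("quantum |S| =", abs(s_quantum))
print("local deterministic max |S| =", s_local_max)
```

For these settings the quantum prediction is |S| = 2*sqrt(2), roughly 2.83, whereas no predetermined assignment of outcomes exceeds 2. Averaging over any distribution of hidden parameters cannot raise that bound, which is the content of the inequality in this form.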
You still cannot exclude the possibility that one day somebody finds a non-local deterministic theory describing all observed phenomena as well as quantum theory does, but so far, to my knowledge, nobody has found one. At the same time, quantum theory works very well, so there's no pressing reason to look for such an alternative theory.