Scaling comes about in the parton model of a hadron, where the hadron is simply seen as an almost unbound "bag" of "partons" (quarks and gluons). The Bjorken x is then nothing other than the longitudinal momentum fraction of the "hit" parton relative to the "whole" hadron (at least in the relativistic approximation where all particles are treated as massless).
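To make the "momentum fraction" statement concrete, here is a short sketch of the usual argument. The symbols are my own labels, not anything defined above: P is the hadron four-momentum, q the four-momentum transferred by the probe, and Q^2 = -q^2. It assumes the struck parton is massless before and after the hit and that the hadron mass can be neglected.

```latex
% Assumed notation: P = hadron four-momentum, q = momentum transfer, Q^2 = -q^2.
% The struck parton carries momentum xP and stays massless after absorbing q:
\begin{align}
  (xP + q)^2 &= 0 \\
  x^2 P^2 + 2x\,P\cdot q + q^2 &= 0 \\
  \Rightarrow\quad x &\simeq \frac{-q^2}{2\,P\cdot q} = \frac{Q^2}{2\,P\cdot q}
  \qquad (\text{neglecting } x^2 P^2 = x^2 M^2 \text{ against } Q^2)
\end{align}
```

So the kinematic variable measured from the scattered probe coincides with the parton's momentum fraction only in that massless limit.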
As such, the interaction cross section of a particle with a hadron should factorize into a "form factor" (the probability to find a parton carrying fraction x of the momentum) and an "elementary cross section", which is nothing other than the interaction cross section of the incoming particle with the "free parton".
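Written out, the factorization looks roughly like this. The notation is assumed for illustration: f_i(x) is the probability density for finding a parton of type i with momentum fraction x (what is usually called a parton distribution function), e_i is its electric charge in units of e, and the hatted cross section is the one on the free parton.

```latex
% Assumed notation: f_i(x) = parton density, \hat\sigma_i = probe-parton cross section.
\begin{align}
  \sigma(\ell\, h \to \ell' X)
    &= \sum_i \int_0^1 dx\; f_i(x)\;
       \hat\sigma_i(\ell\, q_i \to \ell' q_i)\big|_{p_i = xP} \\
  % For electromagnetic deep inelastic scattering this gives the structure function
  F_2(x) &= \sum_i e_i^2\, x\, f_i(x)
\end{align}
```

In this naive picture F_2 depends on x only and not on Q^2, which is exactly the scaling statement.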
For small x values, what happens is that there are higher-order QCD diagrams in which the "original parton" presents itself as another one. It's a bit like the case of an electron: there is a cloud of virtual e+/e- pairs around it (vacuum polarization), and at a small enough scale the interaction can be not with the original electron but, say, with a positron from this "cloud". In the same way, the incoming particle can end up interacting with, say, an anti-up quark that an initial up quark has produced through a higher-order QCD diagram.
There are "evolution equations", the Altarelli-Parisi equations, which use higher-order QCD diagrams to turn the quark density at one energy scale into the quark density at another. There have been corrections to this, having to do with "diffractive effects", which can be seen as scattering on "bound states" within the hadron.
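For reference, the leading-order form of the Altarelli-Parisi equation for a quark density, as I would write it down, is below. The notation is assumed: q_i(x,Q^2) and g(x,Q^2) are the quark and gluon densities, and P_qq, P_qg are the leading-order splitting functions, given here from the standard textbook expressions.

```latex
% Assumed notation: q_i, g = quark and gluon densities; P_{qq}, P_{qg} = LO splitting functions.
\begin{align}
  \frac{\partial q_i(x,Q^2)}{\partial \ln Q^2}
    &= \frac{\alpha_s(Q^2)}{2\pi} \int_x^1 \frac{dy}{y}
       \left[ P_{qq}\!\left(\frac{x}{y}\right) q_i(y,Q^2)
            + P_{qg}\!\left(\frac{x}{y}\right) g(y,Q^2) \right] \\
  P_{qq}(z) &= \frac{4}{3}\left[ \frac{1+z^2}{(1-z)_+} + \frac{3}{2}\,\delta(1-z) \right],
  \qquad
  P_{qg}(z) = \frac{1}{2}\left[ z^2 + (1-z)^2 \right]
\end{align}
```

The densities now depend logarithmically on Q^2, so scaling holds only approximately. The P_qg term is the one that feeds quark-antiquark pairs from gluons into the density at small x.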
I have to say that I don't really know what gives scaling violations at high x. All this is from memory from 10 years ago...