A. Neumaier said:
I will outline the steps Bryce DeWitt takes in his book "Dynamical Theory of Groups and Fields" (starting on page 16), since it is an old book and I don't think many members will have it at hand. Note that this is not my argument, but I believe his argument should be presented.
He argues that the coupling between a system and an apparatus that measures a single observable ##D## is described by the total action functional ##S+S_A+gxD##, where ##S## is the action of the system, ##S_A## is the action of the apparatus, and in the coupling term ##gxD##, ##g## is the (adjustable) coupling constant, ##x## is some convenient apparatus variable, and ##D## is the observable.
In this case the observable is the spin, which we will refer to as the "system". The "apparatus" consists of the atom (ignoring its spin), the magnetic field, and a coordinate framework.
The atom is massive compared to the spin, so the dynamical motion of the system S can be considered constant. The apparatus functional will take the form: ##S_A = \int \frac{1}{2} m(\dot{x}_2^2 +\dot{x}_3^2)\,dt##
Here ##(x_2, x_3)## are the apparatus coordinates in the plane, with the ##x_3## axis reserved for the direction of the magnetic field, and ##m## is the mass of the atom. He assumes that the atom moves in this plane, so he ignores ##x_1##.
He then argues that the coupling term that correlates spin and atomic motion has the form (with ##\hbar = 1##):
##\int \mu D H dt ##
If left undisturbed, the atom (essentially the apparatus) will follow the trajectory ##x_2 = vt, x_3=0##, which is a stationary trajectory for the action ##S_A##. He then argues, once again, that if the atom is massive, it won't deviate much from this trajectory.
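As a quick check (my own sketch, not a step quoted from the book), varying ##S_A## with respect to ##x_2## and ##x_3## gives the free-particle equations of motion:
##m\ddot{x}_2 = 0, \quad m\ddot{x}_3 = 0##
Any straight line traversed at constant velocity solves these, and ##x_2 = vt, x_3 = 0## is the one with speed ##v## along the ##x_2## axis.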
He then makes some assumptions about the strength of the magnetic field, and approximates it to:
##H = \theta(x_2) \theta(L-x_2) x_3(\frac{\partial H}{\partial x_3})|_{x_3=0}##, where ##L## is the length of the pole pieces of the magnet, so that the experiment must last at least over the interval ##0 < t < \frac{L}{v}##,
and ##\theta## is the step function defined by ##\theta(e) = \frac{1}{2}(1+\frac{e}{|e|})##, which takes the values 1 for ##e>0##, ##\frac{1}{2}## for ##e = 0##, and 0 for ##e<0##.
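For anyone who wants to play with this numerically, here is a minimal sketch of the step function and the window ##\theta(x_2)\theta(L-x_2)##. The function names and the Riemann-sum check are my own illustration, not anything from the book; the check assumes the undisturbed trajectory ##x_2 = vt## and confirms that the atom spends a time ##L/v## between the pole pieces.

```python
import numpy as np

def theta(e):
    """Step function: 1 for e > 0, 1/2 at e == 0, 0 for e < 0."""
    # The second argument of np.heaviside is the value returned at e == 0.
    return np.heaviside(e, 0.5)

def field_window(x2, L):
    """theta(x2) * theta(L - x2): nonzero only between the pole pieces."""
    return theta(x2) * theta(L - x2)

def time_in_field(v, L, t_max, n=200_001):
    """Riemann-sum estimate of the time spent in the field.

    Assumes the undisturbed trajectory x2 = v * t (hypothetical helper,
    for illustration only) sampled on the interval [-t_max, t_max].
    """
    t = np.linspace(-t_max, t_max, n)
    dt = t[1] - t[0]
    return float(np.sum(field_window(v * t, L)) * dt)

if __name__ == "__main__":
    # With v = 2 and L = 1 this should be close to L / v = 0.5.
    print(time_in_field(v=2.0, L=1.0, t_max=1.0))
```

Note that the closed form ##\frac{1}{2}(1+\frac{e}{|e|})## is undefined at ##e = 0##, so the value ##\frac{1}{2}## there is a convention; `np.heaviside` takes that value as an explicit argument.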
With these approximations in mind, he argues that the coupling term will reduce to the form gxD where:
##x = \int \theta(x_2) \theta(L-x_2) x_3 dt##, ##g = \mu (\frac{\partial H}{\partial x_3})|_{x_3=0}##
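To spell out the reduction (my own sketch of the substitution, assuming ##\mu##, ##D##, and the field gradient at ##x_3=0## can be treated as constant during the interaction and pulled out of the time integral), inserting the approximate ##H## into the coupling term gives:
##\int \mu D H dt = \mu (\frac{\partial H}{\partial x_3})|_{x_3=0} \, D \int \theta(x_2) \theta(L-x_2) x_3 dt = gxD##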
Later on in the book, he discusses elementary vs. complete measurements and adds more rigor to this argument for the Stern-Gerlach (SG) experiment, but I would rather just refer to the book at that point, as it takes several pages to build it up properly.