The derivation of the Feynman rules for the counterterms goes through exactly like the derivation of the Feynman rules for regular interactions: the idea is that the counterterms are just new interactions. Srednicki has a pretty decent derivation of the Feynman rules for a ##\phi^3## interaction in a scalar field theory; you could try your hand at running the same derivation for the counterterms.
The "derivation" of the counterterm rules just involves writing the bare constants in terms of the physical ones plus counterterms:
[tex]Z = 1 + \delta_Z[/tex]
[tex]Zm_0^2 = m^2 + \delta_m[/tex]
[tex]Z^2 \lambda_0 = \lambda + \delta_\lambda[/tex]
This is just a rewriting of the action, but it isolates the physical constants before you do any calculations. If you substitute the above relations into the action, the action now has a different free/interacting split, with the counterterms sitting in the interacting part. See chapter 10 of Peskin and Schroeder for more details.
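As a sketch of what that substitution looks like for ##\phi^4## theory (assuming the usual field rescaling ##\phi_0 = Z^{1/2}\phi##, which is not written out above but is the convention in Peskin and Schroeder), the bare Lagrangian
[tex]\mathcal{L} = \frac{1}{2}(\partial_\mu\phi_0)^2 - \frac{1}{2}m_0^2\phi_0^2 - \frac{\lambda_0}{4!}\phi_0^4[/tex]
becomes
[tex]\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)^2 - \frac{1}{2}m^2\phi^2 - \frac{\lambda}{4!}\phi^4 + \frac{1}{2}\delta_Z(\partial_\mu\phi)^2 - \frac{1}{2}\delta_m\phi^2 - \frac{\delta_\lambda}{4!}\phi^4[/tex]
The first three terms have the same form as the original theory but with the physical ##m## and ##\lambda##; the last three are treated as new interaction vertices. Reading them off the same way as for ordinary interactions should give a two-point counterterm vertex ##i(p^2\delta_Z - \delta_m)## and a four-point counterterm vertex ##-i\delta_\lambda##.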
IMHO it's easiest to derive and see in the BPH formalism.
What you do is write the Lagrangian in two parts, L1 and L2, with L = L1 + L2. L1 is simply the Lagrangian written with the renormalised values of its variables. For QED these are the EM and electron fields, the fine structure constant, and the electron mass. For the phi^4 theory they are the corresponding values of that theory, which I can't recall offhand but will give a link to. These are the values you actually measure.
But L1 is not the actual Lagrangian, which is written in terms of the bare parameters, and those are divergent. Some say the bare parameters are not really measurable; I'm not so sure about that. Rather, they are cutoff dependent, and you need to specify a cutoff to determine their values from what you do measure, the renormalised values. So L2 = L - L1, where L is the bare Lagrangian. The parameters of L2 are not specified in advance but are calculated so that what you calculate from L is finite. You calculate exactly as usual using L1, which blows up, and you adjust the constants in L2 to cancel the infinities so that what you get is finite.
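For QED the split would look something like this. This is only a sketch in Peskin and Schroeder's conventions; the names ##\delta_1, \delta_2, \delta_3, \delta_m## are theirs and are not defined anywhere in this thread, and ##e## is the renormalised charge with ##\alpha = e^2/4\pi##:
[tex]L_1 = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \bar\psi\,(i\gamma^\mu\partial_\mu - m)\,\psi - e\,\bar\psi\gamma^\mu\psi A_\mu[/tex]
[tex]L_2 = -\frac{1}{4}\delta_3 F_{\mu\nu}F^{\mu\nu} + \bar\psi\,(i\delta_2\gamma^\mu\partial_\mu - \delta_m)\,\psi - e\,\delta_1\,\bar\psi\gamma^\mu\psi A_\mu[/tex]
Every term in L2 has the same form as a term in L1, so the counterterms only shift the coefficients of structures already present, and each ##\delta## is fixed order by order so that the sum comes out finite.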
Basically the idea is that if you express a physical theory in terms of what you measure (and those quantities are not divergent like the bare parameters), then what you calculate from it is also finite, because the infinities of the theory are cancelled during the calculation. Knowing that, you simply adjust the undetermined terms (the counter-terms) to do just that. That's why you write the Lagrangian in terms of what you measure and adjust the counter-terms so that what you get is finite.