I'm sorry that I can't follow the very interesting discussion of my article against teaching "old quantum theory", in particular the pseudo-explanation of the photoelectric effect as evidence for photons. I'm quite busy at the moment.
Just a remark: Of course, it's subjective which "wrong" models one should teach and which one shouldn't. That's the (sometimes hard) decision to make for anyone who teaches science at any level of sophistication. I personally think one should not teach "old quantum theory", not because it's "wrong" but because it leads to wrong qualitative ideas about the behavior of matter at the microscopic level. E.g., the Bohr-Sommerfeld model contradicts well-known facts about the hydrogen atom, known even to chemists in the days when Bohr created it (e.g., it's pretty clear that the hydrogen atom as a whole is not analogous to a little disk but rather to a little sphere, if you want to have a classical geometrical picture at all). The reason why I wouldn't teach old quantum theory (nor first-quantized relativistic quantum mechanics) is that it leads to a dilemma: first the students have to learn these historical wrong theories and then, when it comes to "modern quantum theory", they have to be explicitly taught to unlearn them again. So it's a waste of time, which you need to grasp the mind-boggling discoveries of modern quantum theory. It's not so much the math of QT that's hard to get but the intuition, which you have to acquire by solving a lot of real-world problems. Planck famously said that new "truths" in science become established not by convincing their opponents but because the opponents eventually die out. In this sense it's good to help to kill "old models" by not teaching them anymore.
Another thing are "wrong" models which are still important and which are valid within a certain range of applicability. One could say that physics is all about finding the fundamental rules of nature at some level of understanding and discovery and then finding their limits of applicability ;-)). E.g., one has to understand classical (non-relativistic as well as relativistic) physics (point and continuum mechanics, E+M with optics, thermodynamics, gravity), because without it there's no chance to understand quantum theory. We believe quantum theory is comprehensive (except for the lack of a full understanding of gravity), but this also only means that we don't know its limits of applicability yet, or whether there are any such limits at all (imho it's likely that there are, but that's a personal belief).
As for the question why there's (sometimes) a "delay" in the propagation of electromagnetic waves through a medium: classical dispersion theory for the various types of media is a fascinating topic and for sure should be taught in the advanced E+M courses. You get, e.g., the phenomenology of wave propagation in dielectric insulating media right by making the very simple assumption that a (weak) electromagnetic field displaces the electrons in the medium a bit from their equilibrium positions, which leads to a back reaction that can be described effectively by a harmonic-oscillator force and a friction force. You get a good intuitive picture, which is not entirely wrong even when seen from the quantum-theoretical point of view. The classical theory is best explained in Sommerfeld's textbook on theoretical physics, vol. IV. There's also a pretty good chapter in the Feynman Lectures, but I'd have to look up the details of the intuitive explanation given in that book. Of course, a full understanding needs the application of quantum theory, and you can get pretty far by working out the very simple first-order perturbation theory for transitions between bound states. You can also get quantitative predictions for the resonance frequencies and the oscillator strengths of the classical model. A full relativistic QED treatment is possible (and necessary), e.g., for relativistic plasmas (such as the quark-gluon plasma created in ultrarelativistic heavy-ion collisions), where you have to evaluate the photon self-energy to find the "index of refraction".
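To make the classical oscillator picture a bit more concrete, here is a minimal sketch of the standard single-resonance (Lorentz) model. The notation is my own choice and not taken from any of the books mentioned: ##m## and ##-e## are the electron's mass and charge, ##\omega_0## is the resonance frequency, ##\gamma## the friction constant, ##N## the number density of oscillating electrons, and the field is taken as ##E(t)=E_0 \exp(-\mathrm{i} \omega t)## in SI units. I also assume a dilute medium, so that local-field corrections can be neglected. The equation of motion and its stationary solution are

$$m (\ddot{x} + \gamma \dot{x} + \omega_0^2 x) = -e E_0 \exp(-\mathrm{i} \omega t) \; \Rightarrow \; x(t) = -\frac{e}{m} \frac{E_0 \exp(-\mathrm{i} \omega t)}{\omega_0^2-\omega^2-\mathrm{i} \gamma \omega}.$$

With the polarization ##P=-N e x=\epsilon_0 (\epsilon_r-1) E## this gives a complex, frequency-dependent index of refraction,

$$n^2(\omega) = \epsilon_r(\omega) = 1 + \frac{N e^2}{\epsilon_0 m} \frac{1}{\omega_0^2-\omega^2-\mathrm{i} \gamma \omega}.$$

With this sign convention ##\mathrm{Im} \, n>0## describes absorption, and for ##\omega \rightarrow \infty## you get ##n \rightarrow 1##, i.e., the electrons cannot follow arbitrarily fast oscillations of the field.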
In any case you learn that you have to refine your idea of "the wave gets delayed". The question is what you mean by this, in other words, what you consider as the signal-propagation speed. That's not easy. There is first of all the phase velocity, which is usually smaller than the vacuum speed of light by a factor of ##1/n##, where ##n## is the index of refraction. Nevertheless ##\mathrm{Re} \, n## (##n## is usually a complex number) does not need to be ##>1##, and then the phase velocity gets larger than ##c##. Another measure is the group velocity, which (when applicable at all!) describes the speed of the center of a wave packet moving through the medium. Usually it's also smaller than ##c##, but for frequencies of the em. wave close to a resonance frequency of the material (anomalous dispersion) that's not true anymore, and it loses its meaning, because the underlying approximation (saddle-point approximation of the Fourier integral from the frequency to the time domain) is not applicable anymore. The only speed which has to obey the speed limit is the "front velocity", which describes the speed of the wave front. In the usual models it turns out to be the vacuum speed of light, as was famously found by Sommerfeld in answer to a question by W. Wien concerning the compatibility of the known fact that the phase and group velocities in the region of anomalous dispersion can get larger than ##c## with the then very new Special Theory of Relativity (1907). This was further worked out in great detail by Sommerfeld and Brillouin in two famous papers in "Annalen der Physik", which are among my favorite papers on classical theoretical physics.
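To see how the different velocities behave, here is a small numerical sketch. It assumes a single-resonance Lorentz-oscillator medium with ##n^2(\omega)=1+\omega_p^2/(\omega_0^2-\omega^2-\mathrm{i} \gamma \omega)##. The parameter values (units with ##c=1##) and all function names are made up by me for illustration, and the "group index" ##n_g=\mathrm{Re} \, n+\omega \, \mathrm{d} \mathrm{Re} \, n/\mathrm{d} \omega## is just the formal expression, with the group velocity formally given by ##c/n_g##.

```python
import cmath

# Illustrative (assumed) parameters, in units where c = 1.
C = 1.0        # vacuum speed of light
OMEGA_0 = 1.0  # resonance frequency of the oscillators
OMEGA_P = 0.5  # "plasma frequency", sets the oscillator strength
GAMMA = 0.1    # friction (damping) constant


def refractive_index(omega: float) -> complex:
    """Complex index of refraction of a single-resonance Lorentz medium."""
    eps = 1.0 + OMEGA_P**2 / (OMEGA_0**2 - omega**2 - 1j * GAMMA * omega)
    return cmath.sqrt(eps)  # principal root, so Re n >= 0


def phase_velocity(omega: float) -> float:
    """Phase velocity c / Re n."""
    return C / refractive_index(omega).real


def group_index(omega: float, h: float = 1e-6) -> float:
    """Formal group index Re n + omega * d(Re n)/d(omega).

    The derivative is approximated by a central difference with step h.
    """
    n_plus = refractive_index(omega + h).real
    n_minus = refractive_index(omega - h).real
    slope = (n_plus - n_minus) / (2.0 * h)
    return refractive_index(omega).real + omega * slope


if __name__ == "__main__":
    # Below, at, and above the resonance, and far above it.
    for omega in (0.5, 1.0, 1.2, 1.0e4):
        n = refractive_index(omega)
        print(f"omega={omega:g}  n={n:.6f}  "
              f"v_phase={phase_velocity(omega):.6f}  "
              f"n_group={group_index(omega):.6f}")
```

With these assumed numbers the phase velocity is smaller than ##c## below the resonance and larger than ##c## above it, while close to the resonance the formal group index drops below 1 (it even gets negative), so that ##c/n_g## cannot be interpreted as a signal speed anymore. For very large frequencies ##n \rightarrow 1##, which is the property of the model behind the result that the front moves with the vacuum speed of light.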