In relativity, it is common that a given set of observations can be given different explanations (often, but not always, related to what coordinates you use for modeling). You can't say one is right or wrong, but there may be different pedagogic value to different descriptions (about which people may still disagree). A trivial example is "why do muons created in the upper atmosphere reach the ground in large numbers?" Is it due to time dilation or length contraction? It depends on the coordinates used (earth based or muon based).
Consider the phenomenon of 'gravitational redshift': light emitted at one frequency at the bottom of a tall building is received at the top at a lower frequency.
There are at least 3 explanatory frameworks that can be made consistent. You get into real trouble only if you try to mix them. I should also say that I agree with most of the posts above as to which is pedagogically most useful (as I described in my post #5). However, the OP's confusion in part came from mixing two alternate explanatory frameworks, and thinking they should be additive, so I hope to clarify and separate these frameworks.
To help motivate each, I will formalize a thought experiment as follows:
- From the ground floor of a tall building, we emit a pulse of light at a known frequency and also a nearly (rest)mass-less body moving at a speed vanishingly close to c (imagine the classical analog of a ball of neutrinos; call this an n-blob). The n-blob is emitted with a known (substantial) KE (kinetic energy).
1) Gravitational Time Dilation Explanatory Framework
A clock at the top of the building runs faster than a clock at the bottom, according to a stationary observer at either location. The same is true for any physical process related to time. As a result, the light pulse emitted at one frequency at the bottom (per a bottom observer) is judged by a top observer to have been emitted at a lower frequency, and is received at this lower frequency at the top. The same goes for the energy of the pulse. Note that all of the energy of light is kinetic energy, because it has no rest mass. The speed of the light pulse remains c whether measured at the top or the bottom.
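To put a rough number on this, here is a minimal sketch of the standard weak-field estimate, in which the fractional frequency shift between bottom and top is approximately g*h/c^2. The uniform value of g, the 100 m building height, and the function name are my own illustrative assumptions, not anything fixed by the thought experiment.

```python
# Weak-field, uniform-gravity estimate of the fractional redshift.
# Assumptions (not from the post): g = 9.81 m/s^2 and a 100 m building.
C = 299_792_458.0   # speed of light, m/s
G_ACCEL = 9.81      # assumed uniform gravitational acceleration, m/s^2

def fractional_redshift(height_m, g=G_ACCEL):
    """Approximate (f_bottom - f_top) / f_bottom as g*h/c^2."""
    return g * height_m / C**2

shift = fractional_redshift(100.0)
print(f"fractional shift over 100 m: {shift:.3e}")
```

The result is of order 1e-14, which is why the top and bottom observers agree on everything to very high precision and yet a sufficiently sharp frequency standard can still see the difference.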
Looking at the n-blob, you need to understand the intimate connection between time and energy, alluded to by Pervect earlier. [You can crudely motivate this by imagining a ball bouncing between two close walls. A higher observer must see this 'clock' running slow compared to what an observer adjacent to it sees, thus the ball must be moving slower, and have less KE. Then, in order to have consistent local behavior at different altitudes, all energy (and mass) must scale the same way as time rate.] Thus, per the higher observer, the n-blob is emitted with less energy than as measured by the lower observer, and this is the energy later detected by the higher observer. The speed of the n-blob will be only infinitesimally different at the top compared to the bottom, and infinitesimally different from c. For such a blob, even a 100-fold change in energy would be associated with an undetectable difference in speed.
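The claim about speed can be checked with a short sketch. Since E = gamma*m*c^2, a 100-fold change in energy is a 100-fold change in the Lorentz factor gamma. The two gamma values below are hypothetical, chosen only to represent "vanishingly close to c"; the formula is rearranged to avoid floating-point cancellation.

```python
import math

def speed_deficit(gamma):
    """Return 1 - v/c for a body with Lorentz factor gamma.

    Uses 1 - beta = (1 - beta^2) / (1 + beta) = (1/gamma^2) / (1 + beta)
    so the tiny difference is not lost to rounding.
    """
    beta = math.sqrt(1.0 - 1.0 / gamma**2)
    return (1.0 / gamma**2) / (1.0 + beta)

high = speed_deficit(1.0e8)  # hypothetical gamma as measured at the bottom
low = speed_deficit(1.0e6)   # same blob with 100 times less energy

print(f"1 - v/c at high energy: {high:.2e}")
print(f"1 - v/c at low energy:  {low:.2e}")
```

Both deficits are far below one part in 10^12, so the blob's speed is c for all practical purposes before and after the energy change.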
Note that potential energy has not been mentioned in this framework.
Note that for a slow moving body, at an altitude where the time related scaling of total energy brings the total energy down to the rest energy, the body cannot go any higher (without extra impulse of some kind). Since light has no rest energy (it is all kinetic energy), there is no lower bound on energy scaling, thus light always escapes. The n-blob behaves essentially indistinguishably from light.
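As a rough worked version of this turning point, assume a weak, uniform field so that the energy measured by a stationary observer at height h scales to first order as below. The symbols E_0 (energy measured at the bottom), gamma_0 and v_0 (Lorentz factor and speed at emission), and h_max are my own labels, not from the discussion above.

```latex
% First-order (weak, uniform field) sketch; valid only for g h / c^2 << 1.
E(h) \approx E_0 \left( 1 - \frac{g h}{c^2} \right), \qquad E_0 = \gamma_0 m c^2 .

% The body can rise no higher once E(h) has dropped to the rest energy m c^2:
\gamma_0 m c^2 \left( 1 - \frac{g h_{\max}}{c^2} \right) = m c^2
\quad\Longrightarrow\quad
h_{\max} = \frac{c^2}{g} \left( 1 - \frac{1}{\gamma_0} \right)
\approx \frac{v_0^2}{2 g} \quad (v_0 \ll c).
```

For a slow body this recovers the familiar Newtonian maximum height. For light, m = 0 and there is no rest energy to act as a floor, so no such h_max exists.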
2) Potential Energy Explanatory Framework
The notion of gravitational potential energy is introduced. It is consistent with (1) in that the differences in measurement by stationary observers are referenced to a standard at infinity. KE (as measured by stationary observers) is considered exchangeable with potential energy. For gravity and relativity, we must add the notion that total energy plays the role that m plays in a pure Newtonian potential (that is, that potential difference acts on total energy(/c^2) rather than just (rest) mass, though the energy gained or lost is KE). Seeing that this is a different packaging of the same effect as the time dilation explanation clarifies that you would never try to combine these effects - they are describing the same thing in different ways.
That the n-blob or a normal body may be considered to exchange KE for potential energy as it climbs a gravity well is certainly non-controversial. The n-blob introduces the fact that you must consider potential difference as acting on total energy /c^2, not rest mass, or you would find (incorrectly) that the n-blob loses essentially no KE climbing a gravitational well. For a normal body, KE/c^2 is so small compared to m, that this difference is not noticed.
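A minimal numeric sketch of this point, again assuming a weak uniform field: it compares the KE lost when the potential difference g*h acts on total energy/c^2 with the (incorrect) result when it acts on rest mass alone. The rest mass, gamma, and height are hypothetical values picked only for illustration.

```python
C = 299_792_458.0   # speed of light, m/s
G_ACCEL = 9.81      # assumed uniform gravitational acceleration, m/s^2

def ke_loss_total_energy(total_energy_j, height_m, g=G_ACCEL):
    """KE lost climbing height_m when g*h acts on (total energy)/c^2."""
    return (total_energy_j / C**2) * g * height_m

def ke_loss_rest_mass(rest_mass_kg, height_m, g=G_ACCEL):
    """Naive Newtonian KE loss, with g*h acting on rest mass only."""
    return rest_mass_kg * g * height_m

rest_mass = 1.0e-30   # kg, hypothetical n-blob rest mass
gamma = 1.0e8         # hypothetical Lorentz factor at emission
height = 100.0        # m, hypothetical building height
total_energy = gamma * rest_mass * C**2

correct = ke_loss_total_energy(total_energy, height)
naive = ke_loss_rest_mass(rest_mass, height)

print(f"loss using total energy: {correct:.3e} J")
print(f"loss using rest mass:    {naive:.3e} J")
print(f"fractional loss:         {correct / total_energy:.3e}")
```

The two results differ by the factor gamma, and the fractional loss computed from total energy is g*h/c^2, the same fraction that the time dilation framework assigns to the light pulse.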
Some say it is 'wrong' to apply this explanation to a light pulse. Yet, the math of the n-blob in GR is, in the limit, indistinguishable from a light pulse. It must, therefore, be no more wrong to say the light pulse exchanges KE for PE than it is for the n-blob. In the case of light, there are many pedagogic advantages to focusing on time dilation and the classical wave picture of light, but comparing a light pulse to an n-blob establishes that it can't be 'wrong' to use a potential energy framework. You don't even need to bring photons into the picture.
A normal body may reach a height where its KE is zero, and then it must fall. Light just gets asymptotically closer to an 'energy at infinity', as does an n-blob (for all practical purposes).
3) And now for something completely different: coordinates based on a free fall observer.
Caveat: this really works only over a small region (like our tall building), where tidal gravity is not significant. For the sake of argument, we consider the tall building suspended on struts to more easily imagine a free fall frame whose origin is coincident with the building bottom at emission time, and is momentarily at rest at that time.
A free fall frame agrees with the time dilation framework that neither light, nor an n-blob, nor a normal body, change KE after emission. However, it finds that the unchanged energy is the one measured by the building bottom observer, and offers a completely different explanation for why the building top observer measures lower energy in all cases. It is simply that the building top observer is moving in this frame by the time it detects each emission.
For light, the motion of the building top at time of reception compared to the bottom at time of emission, leads to Doppler red shift.
For material bodies, it is simply that their KE is less because they are measured by an observer moving in the same direction as the bodies. For the n-blob, you must use relativistic formulas, and find that the result is (again) indistinguishable from the light pulse Doppler computation for change of frequency (energy).
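Here is a small sketch of that comparison in the free fall frame, treating it as an ordinary inertial frame. The receiver (building top) recedes at speed beta_top*c at reception; for the building, beta_top would be roughly g*h/c^2, but I use an exaggerated, hypothetical value so the numbers are easy to read. The function names and the gamma value are also my own assumptions.

```python
import math

def doppler_ratio(beta_top):
    """f_received / f_emitted for light and a receiver receding at beta_top*c."""
    return math.sqrt((1.0 - beta_top) / (1.0 + beta_top))

def blob_energy_ratio(beta_top, gamma_blob):
    """E_received / E_emitted for a massive body, via the Lorentz transform
    E' = gamma_top * (E - v*p), with p*c = beta_blob * E."""
    beta_blob = math.sqrt(1.0 - 1.0 / gamma_blob**2)
    gamma_top = 1.0 / math.sqrt(1.0 - beta_top**2)
    return gamma_top * (1.0 - beta_top * beta_blob)

beta_top = 1.0e-3  # exaggerated, hypothetical receiver speed (units of c)
light = doppler_ratio(beta_top)
blob = blob_energy_ratio(beta_top, gamma_blob=1.0e6)

print(f"light frequency ratio: {light:.12f}")
print(f"n-blob energy ratio:   {blob:.12f}")
```

As gamma_blob grows, beta_blob approaches 1 and the n-blob ratio reduces to gamma_top*(1 - beta_top), which is algebraically the same as the Doppler factor for light.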
For a 'normal' body, it is seen that the building's speed may soon catch up to the body's, at which point the body starts getting closer to the floor.
-----
Don't fret about which is correct; rejoice that the same set of observations can be understood from multiple points of view.