strangerep said:
The way to go is to read the development step-by-step. Each single step is trivial. If you let me know which one is the first step that you don't follow, then I can add explanation or improve the wording, or add an example, or whatever it takes. But for that I need coordinates on which step you find opaque.
Also, I might re-emphasize that the idea being developed is really simple, much simpler than what you get elsewhere. At the heart of it is this: A field history is some function $$\Phi(-) : \Sigma \longrightarrow E$$ and a family of field histories that vary with some parameter ##u \in U## is a function $$\Phi_{(-)}(-) \colon U \times \Sigma \longrightarrow E$$ Given this, it is completely obvious what the
variational derivative in the direction of variations parameterized by ##U## should be: it's nothing but the de Rham differential ##d_U## along ##U##. Finally, to get the full variational derivative, the one that knows about all possible variations, we simply look at this for all possible ##U## and for all families of field histories ##\Phi_{(-)}## parameterized by these, and collect all the ordinary de Rham differential ##d_U## on all these ##U##.
That simple idea is the realization of the variational derivative as a differential form in the sense of "diffeological spaces".
Next we want to do some actual computations with this. For this it is exceedingly helpful to make use of the fact that the differential forms which we want to vary are actually "local" in that at each spacetime point they depend only on the values of the fields and their derivatives
at that point. Suppose first it depends in fact only on the field values, not on that of their derivatives. This means that we simply have a differential form on ##E##, which we then pull back along field histories to ##U \times \Sigma##.
(Ah, are you familiar with the concept of pullback of differential forms? If not, we need to pause here and say a word about that.)
Now the de Rham differential generally commutes with pullback of differential forms. In view of the above, this has the key consequence that in order to compute the variation of a differential form on the space of field histories, we may just compute the ordinary de Rham differential on ##E## and then pull this back. If in addition we integrate the result over ##\Sigma##, then any "horizontal" or "total spacetime" derivatives drop out, and this is how we decompose the de Rham differential on ##E## into a horizontal piece and the remaining ##\delta##.
Finally, to have the local differential form also depend on the values of the derivatives of the fields, we do the same with ##E## replaced by its jet bundle, which is nothing but the collection of all these values.
To amplify, if we consider some finite jet order (which we can do in all examples of interest here) then this gives a way to speak of variational derivatives and how they are the de Rham differential on the space of field histories entirely using elementary finite-dimensional differential geometry. There is just the spacetime ##\Sigma##, the manifold ##E## where fields take their values, and the Cartesian space ##U## with which we parameterize field histories, and smooth functions of the form ##\Phi_{(-)} : U \times \Sigma \to E##. It doesn't get simpler than that , really.