There's a distribution Tf for each function f

In summary: every function f gives rise to a distribution T_f via \langle T_f,\phi\rangle=\int f\phi\,dx, and the notation f(x) + g(x) reflects the vector space structure of functions. The delta distribution, by contrast, is defined directly by \langle\delta,\phi\rangle=\phi(0); the expression \int \delta (x)f(x)dx is best read as suggestive notation for this pairing rather than as a genuine integral.
  • #1
Fredrik
I'm reading the Wikipedia article on distributions. They say that there's a distribution [itex]T_f[/itex] for each function f, defined by

[tex]\langle T_f,\phi\rangle=\int f\phi dx[/tex]

and that there's a distribution [itex]T_\mu[/itex] for each Radon measure [itex]\mu[/itex], defined by

[tex]\langle T_\mu,\phi\rangle=\int \phi\, d\mu[/tex]

They define the delta distribution by

[tex]\langle\delta,\phi\rangle=\phi(0)[/tex]

I feel that there's one thing missing in all of this. I don't see an explanation of the expression

[tex]\int \delta(x)f(x)dx[/tex]

What does this integral mean? Is there a definition of the integral of a distribution that applies here, or should this just be interpreted as a "code" representing the expression [itex]\langle\delta,\phi\rangle[/itex]? (This would mean that the expression has nothing to do with integrals at all, but is written as if it were an integral just to make it look like the expression involving f above).

Also, can someone tell me why you can define a topology on the test function space by defining limits of test functions? (If you answer by referencing a definition or a theorem in a book, make sure it's either "Principles of mathematical analysis" by Walter Rudin or "Foundations of modern analysis" by Avner Friedman, because those are the only ones I've got :smile:)
 
  • #2
crazyjimbo
I'm sure you're familiar with the definitions but I'll repeat them for clarity.

If we're working on the real line then a test function is an infinitely differentiable function with compact support. The space of all such test functions is denoted [tex]D(\mathbb{R})[/tex].

A distribution is then a linear and continuous map [tex]T : D(\mathbb{R}) \rightarrow \mathbb{R}[/tex], where continuity means that if [tex]\phi_n \rightarrow \phi[/tex] and [tex]\phi_n^{(k)} \rightarrow \phi^{(k)}[/tex] uniformly for every k, then [tex]T(\phi_n) \rightarrow T(\phi)[/tex]. For whatever reason, [tex]T(\phi)[/tex] is normally denoted by [tex]\langle T, \phi \rangle[/tex].

For any locally integrable function [tex]f : \mathbb{R} \rightarrow \mathbb{R}[/tex] we can then define a distribution [tex]T_f[/tex] given by [tex]\langle T_f, \phi \rangle = \int f \phi dx[/tex]. The delta distribution however does not come from such a function, and instead [tex]\langle \delta, \phi \rangle[/tex] is defined as [tex]\phi(0)[/tex]. So the expression [tex]\int \delta (x) \phi (x) dx[/tex] should never come up.
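
(A quick numerical aside, not part of the original post: a minimal Python sketch, assuming SciPy is available, of a standard bump test function and the pairing [itex]\langle T_f, \phi \rangle[/itex] computed by quadrature. The names bump and pair are mine.)

[code]
# Minimal sketch of <T_f, phi> = \int f(x) phi(x) dx  (assumes SciPy).
import math
from scipy.integrate import quad

def bump(x):
    """A standard test function: smooth, compactly supported on (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def pair(f, phi, a=-1.0, b=1.0):
    """<T_f, phi>, integrated over the support of phi."""
    val, _ = quad(lambda x: f(x) * phi(x), a, b)
    return val

print(pair(math.cos, bump))  # T_cos applied to the bump -- just a number
[/code]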
 
  • #3
Welcome to PF!

Hi crazyjimbo! Welcome to PF! :smile:

We need members who can help write pages in the new PF Library.

Would you like to start an entry on "distribution"? :wink:
 
  • #4
Avodyne
crazyjimbo is of course mathematically correct, but let me offer a physicist's perspective, with the caveat that these are somewhat hazy, nonrigorous notions that the mathematicians have souped up into the rigorous theory of distributions.

The "delta function" can be thought of as a function that is sharply peaked at x=0, with total area 1 underneath it. For example,
[tex]\pi^{-1/2}\varepsilon^{-1}\exp(-x^2/\varepsilon^2)[/tex]
with [itex]\varepsilon[/itex] "small". If you integrate this function times f(x), you will get something that is very close to f(0) (provided that f(x) is smooth on the scale set by [itex]\varepsilon[/itex]). The integral notation is then extremely useful, and comes up all the time in situations such as complete sets of states in QM, etc.
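
(Not part of the original post: a small numerical check, assuming SciPy is available, that this sharply peaked Gaussian really does act like f(0) when integrated against a smooth f.)

[code]
# The Gaussian above, integrated against f(x) = cos(x): as eps shrinks,
# the integral approaches f(0) = 1.  (Assumes SciPy.)
import math
from scipy.integrate import quad

def delta_eps(x, eps):
    return math.exp(-x * x / eps**2) / (math.sqrt(math.pi) * eps)

f = math.cos
for eps in (1.0, 0.1, 0.01):
    val, _ = quad(lambda x: delta_eps(x, eps) * f(x), -10, 10, points=[0.0])
    print(eps, val)   # 0.778..., 0.9975..., 0.99997...
[/code]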
 
Last edited:
  • #5
HallsofIvy
Notice that the article said that for each function f, there is a corresponding distribution. It did NOT say that for each distribution there is a corresponding function! In that sense, the set of distributions (generalized functions) includes the set of functions as a proper subset.
 
Last edited by a moderator:
  • #6
strangerep
Fredrik said:
I'm reading the Wikipedia article on distributions.
[...] They define the delta distribution by

[tex]\langle\delta,\phi\rangle=\phi(0)[/tex]

I feel that there's one thing missing in all of this.
I don't see an explanation of the expression

[tex]\int \delta(x)f(x)dx[/tex]

What does this integral mean?

I'll add some more math-oriented stuff to the other answers...

The expression [itex]\langle\delta,\phi\rangle[/itex] is a "dual pairing", a concept
which becomes ever more deeply important as you get into advanced QFT.

A function f can be thought of as a vector in an infinite-dimensional
space, indexed by (say) a real variable "x" instead of writing (say)
[itex]f_i[/itex] which you might see for (components of) elements of
finite-dimensional vector spaces.

A physically important concept in vector spaces is the notion of "dual
space". E.g., suppose [itex]v[/itex] is an element of a vector space V with
components [itex]v_i[/itex] with respect to some basis. Then the set of all
possible ways of linearly mapping elements of V to scalars is called
the "dual space of V", denoted [itex]V^*[/itex]. For the familiar
boring finite-dimensional vector spaces, the distinction between
[itex]V[/itex] and [itex]V^*[/itex] corresponds to the distinction
between lower and upper indices on vectors, and the two
spaces are in fact isomorphic. When one writes (eg) [itex]w^k v_k[/itex] (scalar product),
this is actually a special case of a "dual pairing" between elements
[itex]w\in V^* [/itex] and [itex]v\in V[/itex]. Think of it as w acting on v to
produce a scalar. (Remember that [itex]w[/itex] is really a mapping.)

Now consider that a space of functions of a real variable is really an
infinite-dimensional vector space. You can add two different functions
f and g together to get another function h = f+g, defined by

[tex]
h(x) ~:=~ (f + g)(x) ~:=~ f(x) + g(x)
[/tex]

which is precisely analogous to the finite-dimensional case where
we might add two vectors u,v component-wise to get another
vector w = u+v, defined component-wise via:

[tex]
w_i ~:=~ (u + v)_i ~:=~ u_i + v_i
[/tex]

Hereafter, I'll write [itex]f(x)[/itex] as [itex]f_x[/itex] to
emphasize this analogy and I'll denote by "F" the inf-dim vector space
of which f is one member.

Now, just as we can define a dual space over a finite-dim vector space,
the same notion makes sense over infinite-dimensional vector spaces,
so [itex]F^*[/itex] is the space of all the linear mappings from
F to scalars.

But what does the dual-pairing look like in the inf-dim case? For
finite-dim vector spaces we just sum the indices, right? So the analogy
for (many) inf-dim spaces is to integrate over the (now real) index.
However, such integration might not be well-defined for every function
space we might imagine, so mathematicians prefer to omit the details of
the dual pairing if not necessary, and just write things like (eg)
[itex]\langle\beta,f\rangle[/itex], where
[itex]\beta\in F^*, f\in F[/itex]. Explicitly, this is
[itex]\beta^x f_x[/itex], which in this case really means:

[tex]
\int \beta(x) f(x) dx
[/tex]

Many elements of [itex]F^*[/itex] typically arise from elements of F.
That's what the [itex]T_f[/itex] distributions are in the earlier
posts, and it's essentially how Avodyne motivated the delta
distribution -- by considering it as a limit of ordinary functions.
However, in the inf-dim case we often find that [itex]F \subset F^*[/itex],
(unlike the finite-dim case where the two spaces
are isomorphic). Thus, there are elements of the dual space which do
not arise from elements of F, in general. (That's what HallsofIvy pointed out.)
Nevertheless, one can still speak of the dual pairing [itex]\langle\beta,f\rangle[/itex]
(where now [itex]\beta[/itex] does not arise from any element of F).

I prefer to consider the delta distribution [itex]\delta(x-y)[/itex]
as [itex]\delta(x,y)[/itex] (keeping the indices distinct), or
as [itex]\delta_x^{~y}[/itex]. Now the role of the delta distribution
as an identity mapping becomes obvious:

[tex]
\delta_x^{~y} f_y ~:=~ \int \delta(x,y) f(y) dy ~:=~ f(x) ~=:~ f_x
[/tex]

So we can think of [itex]\delta[/itex] as an element of [itex]F\otimes F^*[/itex], or as an element of [itex]Lin(F,F)[/itex].
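
(A finite-dimensional illustration, my addition, assuming NumPy: in the discrete analogy, [itex]\delta_i^{~j}[/itex] is just the identity matrix, and "integrating over the index" is matrix-vector multiplication.)

[code]
# Discrete analogue of delta as the identity mapping (assumes NumPy).
import numpy as np

n = 5
delta = np.eye(n)                 # delta_i^j
f = np.random.rand(n)             # components f_j of a "function"
print(np.allclose(delta @ f, f))  # sum_j delta_i^j f_j = f_i  ->  True
[/code]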

There's a distinction of course between [itex]\delta_x^{~y}[/itex]
and the finite-dim [itex]\delta_i^{~j}[/itex] in that
[itex]\delta_{x+a}^{~~~y+a} = \delta_x^{~y}[/itex], for all x,y but
this reflects translation invariance of the integral measure
used in the dual pairing in this case.

The moral of this story is that it's more general and powerful
to think of distributions as linear operators acting on
inf-dim vector spaces (function spaces).


Fredrik said:
Also, can someone tell me why you can define a topology on the test
function space by defining limits of test functions?
Actually, that question is not well-defined. You need a topology first
before you can define the notion of "limit". Maybe try to re-ask the
question after thinking about the above.
 
  • #7


crazyjimbo said:
A distribution is then a linear and continuous map [tex]T : D(\mathbb{R}) \rightarrow \mathbb{R}[/tex], where continuity means that if [tex]\phi_n \rightarrow \phi[/tex] and [tex]\phi_n^{(k)} \rightarrow \phi^{(k)}[/tex] uniformly for every k, then [tex]T(\phi_n) \rightarrow T(\phi)[/tex]. For whatever reason, [tex]T(\phi)[/tex] is normally denoted by [tex]\langle T, \phi \rangle[/tex].

For any locally integrable function [tex]f : \mathbb{R} \rightarrow \mathbb{R}[/tex] we can then define a distribution [tex]T_f[/tex] given by [tex]\langle T_f, \phi \rangle = \int f \phi dx[/tex]. The delta distribution however does not come from such a function, and instead [tex]\langle \delta, \phi \rangle[/tex] is defined as [tex]\phi(0)[/tex]. So the expression [tex]\int \delta (x) \phi (x) dx[/tex] should never come up.
Thank you. I suspected that, but I'm still a bit surprised because I've been seeing that expression "everywhere" in my physics books for so many years. Every time there was a comment about it, it always said roughly what Avodyne just said, and that the delta appearing under the integral sign isn't really a function but a distribution. (None of my books defines "distribution".)

What do you mean by [itex]\phi_n^{(k)}[/itex]? That part wasn't included on the Wikipedia page. Edit: Ah, I get it. It's the partial derivatives. I don't use that notation myself. I'd normally write [itex]\phi_n,_k[/itex].
 
Last edited:
  • #8


Thanks strangerep. This time I knew just about everything you said already, but it's still interesting to see how someone else thinks about these things. The comment about how the dual space can have members that don't correspond to members of the original vector space helped me see one thing more clearly even though I knew this fact already.

strangerep said:
Actually, that question is not well-defined. You need a topology first
before you can define the notion of "limit". Maybe try to re-ask the
question after thinking about the above.
I thought about it before I asked, and I asked because I have only seen this done your way. (Define a topology first, and then use the topology to define limits). But the Wikipedia page (link) says that "It can be given a topology by defining the limit of a sequence of elements of D(U)". The word "it" refers to the set D(U) of test functions, with the obvious vector space structure.

Now that I've thought about this some more, I see two possible answers:

1. In addition to defining a vector space structure on D(U), we also define an inner product by [itex]\langle\phi,\psi\rangle=\int\phi\psi d\mu[/itex] and use the associated metric to define limits in the way that's standard for metric spaces. To do this we need a measure on U, but if U is the real numbers we can use the Lebesgue measure.

2. Maybe it also makes sense to define the limits first and then take the topology to be the coarsest topology in which the limits we have defined as convergent are convergent according to the standard definition.

But I see now that 2 doesn't work in this case. Their definition of a convergent sequence of test functions uses the notion of uniform convergence of a sequence of (partial derivatives of) test functions. That seems circular to me.
 
  • #9


Fredrik said:
I thought about it before I asked, and I asked because I have only seen this done your way. (Define a topology first, and then use the topology to define limits). But the Wikipedia page (link) says that "It can be given a topology by defining the limit of a sequence of elements of D(U)". The word "it" refers to the set D(U) of test functions, with the obvious vector space structure.
Oh, I see now... It uses the notion of "uniform convergence" of the test functions and
their derivatives to define the limits, but the notion of uniform convergence relies on
the standard topology of the range of the functions (e.g., R or C). The desired topology
is then the final (finest) topology such that the functions and derivatives are continuous
under that topology, so it's a kind of weak topology (I think).
 
  • #10
Fredrik
OK, I feel that I understand the definition well enough now. I'm still uncertain about some details regarding the topology, but I can live with that.

It's funny that expressions that "should never come up" (I'm quoting #2) come up all the time in physics books. I have a follow-up question about that. How do mathematicians express and prove this identity?

[tex]\delta(x^2-a^2)=\frac{1}{2|x|}\Big(\delta(x-|a|)+\delta(x+|a|)\Big)[/tex]

There's nothing special about this particular identity. It's just an example. The "physicist's derivation" is straightforward:

[tex]\int_0^\infty\delta(x^2-a^2)f(x)dx=\begin{bmatrix}y=x^2, x=\sqrt y\\dx=\frac{1}{2\sqrt y}dy\end{bmatrix}=\int_0^\infty\delta(y-a^2)f(\sqrt y)\frac{1}{2\sqrt y}dy=\frac{f(\sqrt{a^2})}{2\sqrt{a^2}}=\frac{f(|a|)}{2|a|}[/tex]

[tex]\int_{-\infty}^0\delta(x^2-a^2)f(x)dx=\begin{bmatrix}y=-x\\dy=-dx\end{bmatrix}=-\int_{\infty}^0\delta(y^2-a^2)f(-y)dy=\int_0^\infty\delta(y^2-a^2)f(-y)dy=\frac{f(-|a|)}{2|a|}[/tex]

[tex]\int_{-\infty}^\infty\delta(x^2-a^2)f(x)dx=\frac{f(|a|)+f(-|a|)}{2|a|}=\int_{-\infty}^\infty\frac{1}{2|x|}\Big(\delta(x-|a|)+\delta(x+|a|)\Big)f(x)dx[/tex]
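
(A numerical sanity check of this identity, not in the original post, assuming SciPy: replace [itex]\delta[/itex] by the sharply peaked Gaussian from post #4 and compare both sides for a smooth f.)

[code]
# Check delta(x^2 - a^2) = (delta(x-|a|) + delta(x+|a|)) / (2|x|)
# numerically, with delta replaced by a narrow Gaussian.  (Assumes SciPy.)
import math
from scipy.integrate import quad

def delta_eps(y, eps=1e-2):
    return math.exp(-y * y / eps**2) / (math.sqrt(math.pi) * eps)

a, f = 2.0, math.cos
lhs, _ = quad(lambda x: delta_eps(x * x - a * a) * f(x), -10, 10,
              points=[-a, a], limit=200)
rhs = (f(a) + f(-a)) / (2 * a)
print(lhs, rhs)  # agree to several decimal places
[/code]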
 
Last edited:
  • #11
mathwonk
why is this in the linear algebra section?
 
  • #12
Fredrik
mathwonk said:
why is this in the linear algebra section?
I'm not sure why I put it here. I suppose "calculus & analysis" would have been the right place to put it. I asked (by using the "report" button) the moderators to move it there when no one had answered the first day, but nothing happened.
 
  • #13


Fredrik said:
I feel that there's one thing missing in all of this. I don't see an explanation of the expression

[tex]\int \delta(x)f(x)dx[/tex]

What does this integral mean? Is there a definition of the integral of a distribution that applies here, or should this just be interpreted as a "code" representing the expression [itex]\langle\delta,\phi\rangle[/itex]? (This would mean that the expression has nothing to do with integrals at all, but is written as if it were an integral just to make it look like the expression involving f above).
This is right. (At least in the formulation I'm familiar with)
 
  • #14
Hurkyl
Fredrik said:
I have a follow-up question about that. How do mathematicians express and prove this identity?

[tex]\delta(x^2-a^2)=\frac{1}{2|x|}\Big(\delta(x-|a|)+\delta(x+|a|)\Big)[/tex]
Once you have definitions, the way to proceed will probably be obvious. :smile: The big problem is that if you haven't defined what an expression like [itex]\delta(x^2-a^2)[/itex] might mean, you certainly can't prove any theorems about it! The 'physicist's derivation' you gave is a motivation: we would like to be able to choose a definition so that that calculation works out. And if you don't need a general theory of derivatives and composition, then you could just take identities like that as a definition, and then do a check to make sure they have the properties you want. (And take care not to use any properties you haven't checked)



Now, if you wanted a general theory of composition, it would probably go something like this. (disclaimer: having never seen it before, I'm working this out from scratch, so it may or may not bear resemblance to how people actually do things)

The final calculation is at the end, after a separator.


First, note that duality gives us, for any linear function [itex]f:V \to W[/itex], a dual function [itex]f^*:W^* \to V^*[/itex]. In inner-product-like notation, this is defined by

[tex]\langle f^*(\omega), v \rangle := \langle \omega, f(v) \rangle[/tex]

This has a similarly simple expression in functional notation, but it will cause some notational confusion with the theory of composition I want to derive, so I won't state it.

Suppose that V and W are spaces of functions on X and Y. If we have a good map [itex]f:X \to Y[/itex], we get another kind of dual mapping [itex]f^* : W \to V[/itex] defined by composition: [itex]f^*(w)(x) = w(f(x))[/itex]. And, of course, we get the dual dual mapping [itex]f^{**} : V^* \to W^*[/itex], which I'm going to rename as [itex]f_*[/itex].

[itex]f^*[/itex] here is sometimes called a 'pullback', and [itex]f_*[/itex] a 'pushforward'.


Suppose we have an inner product on a vector space V. For any v in V, the inner product lets us define the 'transpose' of v to be an element of [itex]V^*[/itex] as follows, written in inner-product-like notation on the left, and actual inner-product notation on the right:
[tex]\langle v^T, w \rangle := \langle v, w \rangle[/tex]. Henceforth, I will not explicitly write the transpose operation.

Note I've done nothing new or specific to this situation -- the above is just basic operations in the arithmetic of functions.


Now, we can already do some calculations! Let's fix a function [itex]f : \mathbb{R} \to \mathbb{R}[/itex], use the standard inner product, and let [itex]\phi[/itex] be a test function. Then, we have [itex]f^*(\phi)(x) = \phi(f(x))[/itex]. What about the pushforward map?

Well, let's suppose that f is invertible and increasing, with inverse g. For a test function [itex]\psi[/itex], we can make the following calculation:

[tex]\langle f_*(\psi), \phi \rangle = \langle \psi, f^*(\phi) \rangle
= \int_{-\infty}^{+\infty} \psi(x) \phi(f(x)) \, dx
= \int_{-\infty}^{+\infty} \psi(g(y)) \phi(y) g'(y) \, dy
= \langle g' g^*(\psi), \phi \rangle[/tex]

thus giving us [itex]f_*(\psi) = g' g^*(\psi)[/itex]. (Again, recall that I'm suppressing the transpose operation)
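
(A numerical aside, not from the original post, assuming SciPy: checking [itex]f_*(\psi) = g' g^*(\psi)[/itex] for the simple increasing, invertible choice f(x) = 2x + 1. The Gaussians stand in for test functions even though they aren't compactly supported.)

[code]
# Verify <f_*(psi), phi> = <g' g^*(psi), phi> for f(x) = 2x + 1.  (SciPy.)
import math
from scipy.integrate import quad

f  = lambda x: 2 * x + 1           # invertible and increasing
g  = lambda y: (y - 1) / 2         # g = f^{-1}
dg = 0.5                           # g'(y), constant here

psi = lambda x: math.exp(-x * x)         # stand-ins for test functions
phi = lambda x: math.exp(-(x - 1) ** 2)

lhs, _ = quad(lambda y: dg * psi(g(y)) * phi(y), -30, 30)  # <g' g^*(psi), phi>
rhs, _ = quad(lambda x: psi(x) * phi(f(x)), -30, 30)       # <psi, f^*(phi)>
print(lhs, rhs)  # equal up to quadrature error
[/code]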

We can also vary the calculation slightly to arrive at another interesting result:

[tex] \langle \psi \circ f, \phi \rangle = \langle f^* (\psi), \phi \rangle
= \langle f' f^* (\psi), \phi / f' \rangle
= \langle g_* \psi, \phi / f' \rangle
= \langle g_* (\psi) / f', \phi \rangle[/tex]

and so I'm inspired to make the following definitions

Definition: If f is a good function, and [itex]\omega[/itex] is a distribution, then define [itex]f \omega[/itex] by [itex]\langle f \omega, \phi \rangle = \langle \omega, f \phi \rangle[/itex]

Definition: If f is a good, increasing function with inverse g, then for any distribution [itex]\omega[/itex], define [itex]\omega \circ f = g_*(\omega) / f'[/itex]

The above still works for multivariable test functions and distributions; the appropriate condition on f is that it's invertible with positive Jacobian. At this point, I'm going to assume we have also defined partial integration of multivariable distributions. (i.e. evaluate a 2-variable distribution at a 1-variable test function to produce a 1-variable distribution. This is extremely similar to tensor contraction) I will also assume we've worked out the properties of composition as defined above.

So now, my magic trick to define arbitrary composition distributions is to convert to the invertible case by adding a variable, by virtue of the fact that the following is invertible:

u = x
v = y + f(x)

with Jacobian 1.

Now consider: if [itex]\omega[/itex] is a distribution, then it can also be viewed as a two-variable distribution by adding another dummy variable. Heuristically speaking, applying the above transformation would give

[tex]\iint \omega(v) \phi(u) \psi(v - f(u)) \, du \, dv
=
\iint \omega(y + f(x)) \phi(x) \psi(y) \, dx \, dy[/tex]

Note that this is well-defined, because we simply composed a two-parameter distribution with an invertible function! Now, if partial integration with respect to x gives us an honest-to-goodness test function, then we have

[tex]\iint \omega(y + f(x)) \phi(x) \psi(y) \, dx \, dy
= \int g(y) \psi(y) \, dy[/tex]

And so we can make the following definition

Definition: Let [itex]\omega[/itex] be a distribution, [itex]f[/itex] a good function, [itex]\phi[/itex] a test function. Suppose there is a good function g such that, for every test function [itex]\psi[/itex], we have the identity [itex]\langle \omega(y), \phi(x) \psi(y - f(x)) \rangle = \langle g, \psi \rangle[/itex] (where the first inner product is over both variables). Then we define [itex]\langle \omega \circ f, \phi \rangle = g(0)[/itex].

--------------------------------------------------------------------------------------

Now, let's compute
[tex]\iint \delta(v) \phi(u) \psi(v - u^2 + a^2) \, du \, dv
= \int \phi(u) \psi(a^2 - u^2) \, du
= \int_{-\infty}^{a^2} \frac{\phi(\sqrt{a^2 - x}) + \phi(-\sqrt{a^2 - x})}{2 \sqrt{a^2 - x}} \psi(x) \, dx[/tex]

And so we have (assuming the integrand is 'good'):
[tex]\langle \delta(x^2 - a^2), \phi(x) \rangle
= \frac{\phi(|a|) + \phi(-|a|)}{2|a|}[/tex]
when a > 0, and finally
[tex]\delta(x^2 - a^2) = \frac{1}{2|a|}\left( \delta(x - a) + \delta(x + a) \right)[/tex]


Note that [itex]\delta(x^2 - a^2)[/itex] is undefined for a = 0. More interestingly, you can let a be a variable rather than a constant (or maybe a 'variable constant'), and now this expression is distributional in a.
 
Last edited:
  • #15


Thanks Hurkyl. I'm about to go to bed, but I'll read your post tomorrow.
 
  • #16


That was very instructive. I found the notation a bit confusing at times, but I was able to understand it. I was also able to use your ideas to find a fairly simple way to define [itex]\delta(f(x))[/itex]. We defined [itex]\delta[/itex] by [itex]\langle\delta,\phi\rangle=\phi(0)[/itex], inspired by the expression [itex]\int \delta(x)\phi(x)dx[/itex]. We would like to find a way to define a distribution [itex]\delta_f[/itex] so that [itex]\langle\delta_f,\phi\rangle[/itex] corresponds to the expression [itex]\int \delta(f(x))\phi(x)dx[/itex]. We start by making the change of variables [itex]y=f(x)[/itex] in that "integral":

[tex]\int \delta(f(x))\phi(x)dx=\int\delta(y)\phi(f^{-1}(y))f^{-1}'(y)dy=\phi(f^{-1}(0))f^{-1}'(0)[/tex]

So (at least when f is increasing with a differentiable inverse) we can define [itex]\delta_f[/itex] by

[tex]\langle\delta_f,\phi\rangle=\phi(f^{-1}(0))f^{-1}'(0)=\langle\delta,(\phi\circ f^{-1})f^{-1}'\rangle[/tex]

Unfortunately, things get weird when f has lots of zeroes. Here's one generalization of the definition above: If there's no real number r such that every neighborhood of r contains infinitely many members of [itex]f^{-1}(0)[/itex], we can define [itex]\delta_f[/itex] by

[tex]\langle\delta_f,\phi\rangle=\sum_k sign(f'(x_k))\phi(f_k^{-1}(0))f_k^{-1}'(0)[/tex]

where the [itex]x_k[/itex] are the members of [itex]f^{-1}(0)[/itex] and each [itex]f_k[/itex] is the restriction of [itex]f[/itex] to a small interval containing [itex]x_k[/itex], chosen small enough that [itex]f_k[/itex] is invertible.
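
(A small Python sketch of this definition, my addition: since [itex]f_k^{-1}(0)=x_k[/itex] and [itex]f_k^{-1}'(0)=1/f'(x_k)[/itex], the sum reduces to the standard formula [itex]\sum_k \phi(x_k)/|f'(x_k)|[/itex], assuming the roots of f are known and simple.)

[code]
# <delta_f, phi> = sum over the simple roots x_k of f of phi(x_k)/|f'(x_k)|.
import math

def delta_f(phi, roots, fprime):
    return sum(phi(x) / abs(fprime(x)) for x in roots)

# Example: f(x) = x^2 - a^2 has roots +/- a and f'(x) = 2x,
# reproducing the identity from post #10.
a = 2.0
print(delta_f(math.cos, [a, -a], lambda x: 2 * x),
      (math.cos(a) + math.cos(-a)) / (2 * a))   # the two values match
[/code]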
 
Last edited:
  • #17
Hurkyl
If all you care about is delta, you get more wiggle room -- e.g. I think you made essential use of the fact that it has exactly one singularity (which is thus isolated).

I was trying to work out a similar development, but using operations on measures to capture the tricks one wants to do with the integral, and try in that way to avoid any irregularities involved because we're really working with distributions rather than functions. I think I now finally see how it's going to work out!


First, the setup.

Let f : X --> Y be a measurable function. If we have a measure [itex]\mu[/itex] on X, then we have a pushforward measure [itex]f_* \mu[/itex] on Y defined by

[tex](f_* \mu)(E) = \mu(f^{-1} (E))[/tex]

for any measurable subset E of Y. (here, f^-1 is the inverse image operation on sets -- we don't need an actual inverse for f) From Royden's real analysis, in the section on mappings of measurable spaces, we have the theorem

[tex]\int_Y g \, df_* \mu = \int_X g \circ f \, d\mu[/tex]

so this is how change-of-variable is expressed measure theoretically.
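
(Not in the original post -- a numerical illustration, assuming NumPy and SciPy. If [itex]\mu[/itex] is the standard Gaussian probability measure and f(x) = x², the pushforward [itex]f_*\mu[/itex] is the chi-square distribution with one degree of freedom, so the theorem can be checked by Monte Carlo on one side and quadrature on the other.)

[code]
# Check  int_Y g d(f_* mu) = int_X (g o f) d mu  for mu = N(0,1), f(x) = x^2.
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)     # samples from mu

g = np.cos
lhs = g(x ** 2).mean()             # Monte Carlo estimate of int (g o f) d mu

# f_* mu is chi-square(1), with density exp(-y/2)/sqrt(2 pi y) on y > 0.
p = lambda y: np.exp(-y / 2) / np.sqrt(2 * np.pi * y)
rhs, _ = quad(lambda y: g(y) * p(y), 0, 50)   # int_Y g d(f_* mu)
print(lhs, rhs)  # both ~0.569, up to Monte Carlo error
[/code]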


Now, suppose that [itex]\mu, \nu[/itex] are σ-finite measures on a space X with the property that [itex]\mu(E) = 0[/itex] implies [itex]\nu(E) = 0[/itex]; then there exists a measurable function called the Radon-Nikodym derivative with the property that

[tex]\nu(E) = \int_E \left[ \frac{d\nu}{d\mu} \right] d\mu[/tex]

this derivative is essentially unique; any two Radon-Nikodym derivatives will disagree on a set of [itex]\mu[/itex]-measure zero. This derivative satisfies many of the 'normal' derivative properties, and also for any nonnegative measurable function f,

[tex]\int f \, d\nu = \int f \left[ \frac{d\nu}{d\mu} \right] d\mu[/tex]
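
A concrete instance (my example, not Royden's): take [itex]\mu[/itex] to be Lebesgue measure and [itex]\nu(E) = \int_E 2x\,\chi_{[0,1]}(x)\,dx[/itex]; then [itex]d\nu/d\mu = 2x\,\chi_{[0,1]}[/itex] and the formula above reads

[tex]\int f \, d\nu = \int_0^1 f(x)\, 2x \, dx[/tex]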




We can combine the two to fully express change-of-variable for ordinary functions in the 'usual' way. If [itex]dx[/itex] denotes the usual Lebesgue measure, and [itex]df^{-1}(x)[/itex] its pullback, as defined above, then

[tex]\int g(f(x)) \, dx = \int g(x) \, df^{-1}(x) = \int g(x) \left[ \frac{df^{-1}(x)}{dx} \right] \, dx[/tex]

By using the pushforward measure, we have avoided the need to decompose the integral into several parts so that we can use invertible change-of-variable. Not only that, but it covers a lot more badly behaved f's!

As an example, if [itex]f(x) = x^2 - a^2[/itex], then

[tex]\left[ \frac{df^{-1}(x)}{dx} \right] = \begin{cases}
\frac{1}{\sqrt{x + a^2}} & x > -a^2 \\
0 & x \leq -a^2
\end{cases}[/tex]

Note that this accounts for most points having two preimages -- the actual value of the derivative is exactly what you would get by decomposing into two integrals, applying invertible change-of-variable, and recombining. In fact, that's how I computed it!



This gets more complicated if the left hand side isn't purely a composition. Let [itex]d\mu = \phi dx[/itex]. Then, we would have

[tex]\int g(f(x)) \phi(x) \, dx = \int g \circ f \, d\mu
= \int g df_*\mu = \int g(x) \left[ \frac{df_*\mu}{dx} \right](x) \, dx[/tex]

This is awkward because we have absorbed the terms [itex]\phi \circ f^{-1}[/itex] into the Radon-Nikodym derivative. But I suppose they would have been awkward anyways, so it hasn't really become worse.

So how would we compute this derivative? The same way we always would! For example, if [itex]h(x) = x^2 - a^2[/itex], I compute

[tex]\int f(x^2 - a^2) g(x) \, dx = \int_{-a^2}^{+\infty}
f(x) \frac{g(\sqrt{x + a^2}) + g(-\sqrt{x + a^2})}{2 \sqrt{x + a^2}} \, dx[/tex]

and so we have

[tex]\left[ \frac{dh_*\mu}{dx} \right] = \begin{cases}
\frac{g(\sqrt{x + a^2}) + g(-\sqrt{x + a^2})}{2 \sqrt{x + a^2}} & x > -a^2 \\
0 & x \leq -a^2
\end{cases}[/tex]
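
(Again a numerical aside, assuming SciPy: a quadrature check of this computed density, with Gaussian stand-ins for f and g.)

[code]
# Check  int f(x^2 - a^2) g(x) dx  =  int f(x) [dh_* mu / dx](x) dx.
import math
from scipy.integrate import quad

a = 1.5
f = lambda y: math.exp(-y * y)          # stand-in for a test-like function
g = lambda x: math.exp(-(x - 1) ** 2)   # the density of mu, d mu = g dx

lhs, _ = quad(lambda x: f(x * x - a * a) * g(x), -10, 10)
density = lambda y: (g(math.sqrt(y + a * a)) + g(-math.sqrt(y + a * a))) \
                    / (2 * math.sqrt(y + a * a))
rhs, _ = quad(lambda y: f(y) * density(y), -a * a, 50)
print(lhs, rhs)  # equal up to quadrature error
[/code]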



All of the above sets us up very nicely for distributions. It's clear that if f is a map and [itex]\varphi[/itex] is a test function, we want to define

[tex]d\mu = \varphi \, dx[/tex]

[tex]\langle \omega \circ f , \varphi \rangle =
\int \omega(x) \left[ \frac{df_*\mu}{dx} \right](x) \, dx[/tex]

But how do we actually compute it? Simple! We do the calculation exactly as we would if [itex]\omega[/itex] were an honest-to-goodness function, with the only caveat that we're not allowed to actually evaluate an integral until we've put it back into the correct form of an integral over all of space. The reason? The method for computing the derivative itself would be done by putting an arbitrary nonnegative function where [itex]\omega[/itex] is, and computing the integral would be done by replacing that arbitrary function with [itex]\omega[/itex].


So the only remaining problem is what happens if [itex][ df_*\mu / dx][/itex] isn't a test function. (which it isn't in the running example, since it's discontinuous at -a^2) I suppose you just need to define something with limits to allow you to compute the inner product of a distribution with something that isn't a test function.
 
Last edited:
  • #18
strangerep
mathwonk said:
why is this in the linear algebra section?
I notice it's now been moved to Calculus & Analysis, but this makes me wonder
where discussions on Functional Analysis should go? It's algebra on
infinite-dimensional linear spaces, hence has a foot in both camps.
A question for the moderators, perhaps?
 
  • #19
Fredrik
Hurkyl, that looks pretty cool. I'd like to ask a question before I make an effort to try to really understand this.

Hurkyl said:
[tex]\int_Y g \, df^* \mu = \int_X g \circ f \, d\mu[/tex]
Did you accidentally swap the measures here? The integral on the left is over Y, but the pullback measure is a measure on X. If the theorem is

[tex]\int_Y g \, d\mu = \int_X g \circ f \, df^*\mu[/tex]

instead, it's kind of intuitive: if you pull back both the measure and the function, you get the same result.

Hurkyl said:
If [itex]dx[/itex] denotes the usual Lebesgue measure, and [itex]df^{-1}(x)[/itex] its pullback, as defined above, then

[tex]\int g(f(x)) \, dx = \int g(x) \, df^{-1}(x) = \int g(x) \left[ \frac{df^{-1}(x)}{dx} \right] \, dx[/tex]
Would you like to change this too? (It seems to have the same problem).

strangerep said:
I notice it's now been moved to Calculus & Analysis, but this makes me wonder
where discussions on Functional Analysis should go? It's algebra on
infinite-dimensional linear spaces, hence has a foot in both camps.
A question for the moderators, perhaps?
I've been wondering the same thing. I had to think about it when I was going to start this thread, and I felt that none of the forums was right.
 
  • #20
Hurkyl
Grar, I thought I looked over that carefully. For a mapping [itex]\varphi[/itex] (not a test function), Royden describes the operations on measures in terms of what he labels [itex]\Phi[/itex], defined by [itex]\Phi(E) = \varphi^{-1}(E)[/itex], and defines the measures [itex]d\Phi^*\mu[/itex]. I had wanted to omit [itex]\Phi[/itex], but I simply got the initial definition of that measure wrong. I've corrected it in the above.
 
Last edited:

