# Is integration theory supposed to be this hard?

1. Jan 24, 2012

### Fredrik

Staff Emeritus
$\newcommand{\dmu}{\ \mathrm{d}\mu}$
I'm using Friedman to try to learn some integration theory. I got the impression that all books start by defining integrals of simple functions in the same way, and then there are several different ways to generalize the definition to more interesting functions. This book's approach is based on the idea that if there's a sequence $\langle f_n\rangle$ of simple functions such that $f_n\to f$ in some sense, then it might make sense to define
$$\int f\dmu=\lim_n\int f_n\dmu.$$ This idea is so simple that I got quite fond of it at first, but things soon get complicated. I'm wondering if other approaches are easier, or if I'm just wrong about how hard this one is.

The idea described above only makes sense if the precise meaning of "$f_n\to f$" is such that the limit on the right always exists and is independent of the sequence used. The book chooses that meaning by defining f to be integrable if there's a sequence $\langle f_n\rangle_{n=1}^\infty$ of simple functions that's Cauchy in the mean and converges to f almost everywhere. It's easy to show that "Cauchy in the mean" implies that the limit exists. The hard part is to prove that it's independent of the sequence. These are some of the statements we have to prove to accomplish that, if we follow the book's path (I won't be very careful with the details in these statements):

1. If $\langle f_n\rangle$ is Cauchy in the mean, $\lim_n\int_E f_n\dmu$ exists for all E.
2. The function $\lambda$ defined by $\lambda(E)=\lim_n\int_E f_n\dmu$ for all E is countably additive.
3. Now suppose that $\langle f_n\rangle$ and $\langle g_n\rangle$ are two sequences that are Cauchy in the mean and both converge almost everywhere to f. Then $\lim_n\int_E f_n\dmu=\lim_n\int_E g_n\dmu$ for every σ-finite E. (This one uses the result that these sequences also converge to f in measure. I'm putting that on a separate list.)
4. The previous result implies that $\lim_n\int f_n\dmu=\lim_n\int g_n\dmu$.
This is not too bad, in my opinion. What I find frustrating is that the comment in parentheses spawns a whole new list.
1. "Cauchy in the mean" implies "Cauchy in measure".
2. A sequence of measurable functions that's Cauchy in measure has a subsequence that converges almost uniformly.
3. If $f_n\to f$ almost uniformly, then $f_n\to f$ in measure.
4. Corollary of the previous two: A sequence of measurable functions that's Cauchy in measure converges in measure. (When we get to this point, we know that if f is integrable and $\langle f_n\rangle$ is a Cauchy sequence in the mean that converges to f almost everywhere, there's a function g such that $f_n\to g$ in measure, but it doesn't seem obvious that f=g a.e. The next theorem should take care of that).
5. If $f_n\to f$ almost uniformly, then $f_n\to f$ almost everywhere.
So we apparently need lots of theorems just to make sense of the definition of the integral. The above is far from all of it. We also need to show that integrals of simple functions satisfy the basic stuff we expect them to, such as the triangle inequality for integrals and the linearity of the integration operation.
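To see the definitional idea in action before any of those theorems, here is a tiny numerical sketch (my own illustration, not from Friedman): take f(x) = x on [0,1] and the dyadic step functions below it. The mean differences $\int|f_n-f_m|\dmu$ shrink, so the integrals form a Cauchy sequence of reals, converging to 1/2.

```python
def dyadic_step_integral(n):
    """Integral over [0,1] of the step function f_n that takes the value
    k/2^n on [k/2^n, (k+1)/2^n): a simple function below f(x) = x."""
    h = 1.0 / 2 ** n
    return sum((k * h) * h for k in range(2 ** n))

def mean_difference(n, m):
    """Compute the mean difference, the integral of |f_n - f_m| over [0,1]
    (for n <= m), evaluated exactly on the finer dyadic grid, where both
    step functions are constant."""
    h = 1.0 / 2 ** m
    total = 0.0
    for k in range(2 ** m):
        x = k * h                        # left endpoint of the fine interval
        fn = int(x * 2 ** n) / 2 ** n    # value of f_n on this interval
        fm = int(x * 2 ** m) / 2 ** m    # value of f_m on this interval
        total += abs(fn - fm) * h
    return total

# The integrals equal (2^n - 1)/2^(n+1), increasing toward 1/2,
# and the mean differences shrink as n, m grow.
```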

So is this as easy as it gets, or can the theory be made simpler?

2. Jan 24, 2012

### micromass

There are many different approaches to integration. The approach you mention can certainly be made simpler. That is: we can easily find a very short definition of the integral, without extra work.

The trick is to define the integral first for nonnegative functions. In that case, you can define the integral simply as a supremum, not as a limit.
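In symbols, the standard supremum definition (as in e.g. Folland or Rudin) for a measurable $f\geq 0$ reads:

```latex
\int f\,\mathrm{d}\mu
  \;=\; \sup\left\{ \int \varphi\,\mathrm{d}\mu \;:\;
  \varphi \text{ simple and measurable},\ 0\le\varphi\le f \right\}.
```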

Of course, this does not really absolve us of technical difficulties. We still need to prove theorems like the monotone convergence theorem. That theorem is immediate with the definition in the OP.

I find the book "Real Analysis" by Folland to have a lucid and easy exposition of the integral. I learned the theory myself from "Probability and measure" by Billingsley. But it might not suit your needs.

Of course, if you're really lazy, you can just work with the Daniell integral...

Last edited: Jan 24, 2012
3. Jan 24, 2012

### morphism

On the macro scale integration theory is straightforward. You've basically captured the main idea: approximate your function by stuff that's easy to integrate, and then the integral of your function should be approximated by the integrals of the approximating functions. This idea is even present in Riemann integration (Riemann sums, step functions, etc.)

On the micro level, however, the theory is mired in technicality (just like any other mathematical theory, really). That said, the approach you've outlined seems overly complicated. Instead of worrying about notions like "Cauchy in the mean" and whatnot, you could start your theory by explaining what it means for a simple function (= finite linear combination of characteristic functions) to be integrable. This is very easy, and all the theorems have trivial proofs. Then you prove the helpful result that every nonnegative measurable function is the pointwise limit of a monotone sequence of nonnegative simple functions. This result is fundamental and is also very easy to prove. Once it's in hand, you can essentially define the integral of a nonnegative measurable function f to be the limit of the integrals of such an approximating sequence of simple functions. Essentially. To avoid having to explain why this is well-defined, etc., you can instead define the integral of f to be the sup of the integrals of g, where g runs over the set of nonnegative simple functions that are ≤ f pointwise. This makes a lot of the basic theorems easy to prove, because they follow from their simple-function analogues.
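The monotone approximation theorem mentioned above even has a completely explicit witness, the standard dyadic construction; here is a sketch in code (the function name is mine):

```python
import math

def dyadic_approx(f, n):
    """The n-th function in the standard monotone approximation of a
    nonnegative f: phi_n(x) = min(n, floor(2^n f(x)) / 2^n).
    Each phi_n is simple (finitely many values), phi_n <= phi_{n+1} <= f,
    and phi_n -> f pointwise; the truncation at n handles unbounded f."""
    def phi(x):
        return min(n, math.floor(2 ** n * f(x)) / 2 ** n)
    return phi

f = lambda x: x * x
phi3, phi4 = dyadic_approx(f, 3), dyadic_approx(f, 4)
# The approximations sit below f and increase with n at every point.
assert all(phi3(x) <= phi4(x) <= f(x) for x in [0.0, 0.3, 1.0, 2.5])
```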

This is a really standard approach to the theory. E.g., it's the approach in big Rudin, Royden, and in Bartle's integration theory book. The latter is fairly slim and very easy to read.

4. Jan 24, 2012

### Fredrik

Staff Emeritus
@micromass:

OK, thanks. I think the other integration book I own (Capinski & Kopp) does the positive functions first approach. I think I'll have to study the integration chapter in that book to see if I like that approach better.

I'm interested in the things you say can be made easier in the approach I described. In particular, I'm wondering if there's a short proof for the stuff I mentioned in parentheses in item 3 in the first list, a proof that doesn't require me to go through the entire second list. I don't mind using the concepts of "in measure", "in the mean" and so on, but I don't want to prove a whole bunch of theorems about them if it can be avoided.

Daniell integral...never heard of it. I see there's a Wikipedia article with that title. I will check it out when I've finished the pizza I just bought.

@morphism: I haven't read your post yet. I will after the pizza.

5. Jan 24, 2012

### Tarantinism

+1

Hi. I was going to recommend exactly this (Swede? Lunds Universitet? I had a very good Erasmus experience there with Alexandru Aleman, who used this book). You are using Rudin's basic real-analysis book, but there is another very good book, which I suppose you know, Real and Complex Analysis (there is a third, Functional Analysis, but we are not interested in it at the moment). Rudin uses this approach, the same as other good books like Capinski & Kopp's:

http://cache0.bookdepository.co.uk/assets/images/book/medium/9781/8523/9781852337810.jpg

I do not know both approaches in depth, but, as micromass says, maybe you avoid some difficulties later with the key monotone and dominated convergence theorems (and Fatou's lemma). Maybe. I would have to know both methods better to say which one gets longer.

6. Jan 24, 2012

### micromass

I've edited my post above to say that I like "Real Analysis" by Folland. I feel you would like this approach. I don't know Capinski & Kopp, I should check that out.

Convergence in measure is really handy though, so nothing is wrong with knowing about that. Cauchy in the mean is something I personally never heard about...

7. Jan 24, 2012

### Tarantinism

But if you have only recently begun integration theory, I guess it is better to leave that until the moment when you want to integrate in general topological vector spaces.

8. Jan 24, 2012

### micromass

Hmmm, Capinski-Kopp seems like a nice book. A bit sad that they only develop integration theory on $\mathbb{R}$ and not on arbitrary measure spaces...

9. Jan 24, 2012

### Fredrik

Staff Emeritus
Those things are part of Friedman's approach too. I didn't mention what it means for a simple function to be integrable because it seems all books handle simple functions the same way, and differ only in how they generalize integration to measurable functions. The other theorem you mentioned (2.2.5, p. 34) is indeed a nice theorem with a nice and simple proof.

I've had a quick look at it, and my first impression is that it looks very good. It has a few other sections that I'm interested in as well.

Yes and no. Stockholm.

10. Jan 25, 2012

### lavinia

I thought that the integral of a positive measurable function f was defined as the supremum of the integrals of simple functions (finite linear combinations of characteristic functions of sets of finite measure) that are less than or equal to f. This does not require a notion of limit.

11. Jan 25, 2012

### Fredrik

Staff Emeritus
$\newcommand{\dmu}{\ \mathrm{d}\mu}$
That seems to be what everyone is saying in the posts above. Based on the responses, I have to conclude that the approach described in #1 is a bit unusual. It still has a certain appeal, since the basic idea is so simple. We want to define
$$\int f\dmu=\lim_n\int f_n\dmu,$$ where the $f_n$ are integrable simple functions such that $f_n\to f$ a.e., but this only makes sense if the limit exists and is independent of the sequence. The limit on the right is just a limit of a sequence of real numbers, so it exists if and only if the sequence is Cauchy. So we also require that the sequence $\langle f_n\rangle$ has the property that for all ε>0, there's an N such that n,m≥N implies
$$\int|f_n-f_m|\dmu<\varepsilon,$$ because this implies that for all n,m≥N
$$\bigg|\int f_n\dmu-\int f_m\dmu\bigg |\leq\int|f_n-f_m|\dmu<\varepsilon.$$ The only problem is that it's hard to prove that the limit is independent of the sequence used. (See the two lists in post #1).

Last edited: Jan 25, 2012
12. Jan 25, 2012

### Fredrik

Staff Emeritus
$\newcommand{\dmu}{\ \mathrm{d}\mu}$
I think I figured out what's different about Friedman's approach, and why it's harder. The simple approach, as explained in e.g. Folland, defines the integral of a non-negative measurable function f as the supremum of all $\int g\dmu$ where g is a non-negative integrable simple function such that g≤f. This implies that there's an increasing sequence $\langle f_n\rangle$ of non-negative integrable simple functions such that
$$\int f\dmu=\lim_n\int f_n\dmu.$$ Here we can see the difference between the simple approach and Friedman's: the simple approach removes the question of whether the limit depends on the sequence, by only considering sequences $\langle f_n\rangle$ that obviously won't give us different values of $\lim_n\int f_n\dmu$. Friedman's approach is essentially the same as the simple approach. It just does one thing in addition to the simple stuff: it proves that if we had used some other sequence (e.g. a sequence of simple functions that are ≥ f instead of ≤ f), the result is the same.
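A small numerical illustration of that last point (my own sketch, not Friedman's): approximating f(x) = x on [0,1] by dyadic step functions from below and from above gives integrals that squeeze the same limit, 1/2.

```python
def step_integrals(n):
    """Integrals over [0,1] of the dyadic step functions below and above
    f(x) = x: the lower one takes the value k/2^n on [k/2^n, (k+1)/2^n),
    the upper one takes (k+1)/2^n there."""
    h = 1.0 / 2 ** n
    lower = sum(k * h * h for k in range(2 ** n))
    upper = sum((k + 1) * h * h for k in range(2 ** n))
    return lower, upper

for n in (2, 5, 10):
    lo, up = step_integrals(n)
    # lo and up straddle 1/2, and the gap 2^-n closes as n grows
    assert lo <= 0.5 <= up and up - lo == 1.0 / 2 ** n
```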

I think it makes sense to do this, because if we don't, it makes perfect sense to ask: Would the result have been different if we had chosen some other sequence, and if yes, why didn't we?

Last edited: Jan 25, 2012
13. Jan 25, 2012

### micromass

Yes. However, note that the goal is eventually to prove the monotone convergence theorem. Once this is proven, we see that Friedman's definition is essentially the same as the simple approach.

14. Jan 25, 2012

### lavinia

Lebesgue dominated convergence seems to give you your theory.

15. Jan 25, 2012

### Tarantinism

16. Jan 26, 2012

### mathwonk

I have never understood integration theory. I will say some ignorant but well meaning things here in case having others correct them sheds light on the topic.

Thinking about it now in reference to your question, I believe one reason is that the objects being integrated, the "interesting" ones, are not functions at all, they are more like probability distributions.

I.e. they have no fixed values at individual points, there is only the probability their value is a certain thing. E.g. one equates two functions in integration theory if they differ on a set of measure zero. That means you can change any value at all and not change the "function".

For instance the function which is identically zero is the same as the function which is zero on the irrationals and equals p at the reduced fraction p/q. This means the function has value zero at any given point with probability one, but not with certainty. This is confusing to me.

Of course in this case there is one continuous function in the equivalence class and that is the most natural representative, but the interesting functions are those that are not equivalent to any continuous function, and then there seems to be no natural representative to choose.

The most intuitive way of understanding the integral to me is analogous to Riemann integration, but with axes reversed. I.e. take a positive function on the interval [a,b], and subdivide the y axis into integer intervals. Then define the lower sum to be equal to the sum over all n, of n times the measure of the set where f has values in [n,n+1].

The upper sum is the sum over all n of n times the measure of the set where f has value in the interval [n-1,n].

If these are both finite, then refine the subdivision of the y axis into tenths and continue.

As suggested above, then one can define the integral as the LUB of the lower sums, provided it equals the GLB of the upper sums.

This puts the heat of course on how to define the measure of a set, but it does define the integral of an actual function.
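This y-axis-slicing recipe can be sketched numerically, with the caveat that the measure of each level set is estimated here by plain sampling (fine for reasonably tame f, and the helper name is mine):

```python
import math

def lebesgue_lower_sum(f, a, b, height=1.0, samples=100000):
    """Numerical sketch of the 'slice the y-axis' lower sum: sum over n of
    n*height times the (approximate) measure of the set
    {x in [a,b] : f(x) in [n*height, (n+1)*height)}.
    The measure of each level set is estimated by counting sample points
    on a uniform grid, an assumption that only works for tame f."""
    dx = (b - a) / samples
    total = 0.0
    for i in range(samples):
        x = a + (i + 0.5) * dx
        n = math.floor(f(x) / height)  # which horizontal slab f(x) lies in
        total += (n * height) * dx     # contribute the slab's lower value
    return total

# For f(x) = x on [0,1], refining the slab height drives the lower sum
# up toward the integral 1/2.
approx = lebesgue_lower_sum(lambda x: x, 0.0, 1.0, height=0.01)
```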

So in some sense, in integration theory, one does define the integral of an actual function.

But when you try to prove completeness results you still have to go back to equivalence classes of functions. I.e. if you want to define a function as a limit of other functions you have a problem defining the values of the limit function, or else you have to give up uniqueness of limits.

I.e. if you have a function f and a sequence fn, then fn-->f means the integrals of |f-fn| go to zero. But if true, then this is still true for any function equivalent to f.

And if you only have the sequence fn and want to define the limit function you cannot do so uniquely.

I.e. the natural way to define Cauchyness of a sequence of functions fn is to say the integrals of the differences |fn-fm| --> 0, but this does not imply the values of the functions converge everywhere. The hard part referred to above is that at least some subsequence converges pointwise a.e. So a Cauchy sequence does define an equivalence class of functions for a limit.

Actually this lets you break the problem into two parts:
1) define the integral norm on the space of all continuous functions, making it a metric space, and define the completion of this metric space as the set of all equivalence classes of Cauchy sequences.
The integral of a class of such Cauchy sequences is well defined, as the limit of the integrals.
This is the easy part, i.e. the formal part.

2) Try to find a function that represents each class of Cauchy sequences. This is the hard part. You have to prove that a Cauchy sequence of functions does converge pointwise a.e.

So you get a limit function which is only defined on the complement of a set of measure zero. Of course I guess you could say the value is zero on the set of measure zero where the functions do not converge pointwise.

Well to me it is a confusing subject, but several people here seem to have studied it. A famous analyst once told me however that the main theorems in the subject are Fubini and dominated convergence, from the point of view of using the theory that is.

In my own experience, I don't know about dominated convergence but I can certainly vouch for Fubini being useful. Lang's Analysis II (maybe now Real Analysis) has a good strong statement of Fubini. (And the functions in Lang have values in any Banach space.)

Last edited: Jan 26, 2012
17. Jan 27, 2012

### Fredrik

Staff Emeritus
One of the books mentioned above, Capinski & Kopp, starts out saying that this is how Lebesgue did it in 1902, and then they say:
A century of experience with the Lebesgue integral has led to many equivalent definitions, some of them technically (if not always conceptually) simpler. We shall follow a version which, while very similar to Lebesgue's original construction, allows us to make full use of the measure theory developed already.​
Then they state the supremum definition, that everyone in this thread seems to be familiar with. So they're suggesting that the upper/lower sum thing is kind of hard.

The original approach does however have the advantage that it provides motivation for the definition of the term "measurable function": If you chop up the y axis into intervals $\{I_k\}$, and want to assign a value that's approximately equal to
$$\sum_k y_k\,\mu(f^{-1}(I_k)),$$ where $y_k\in I_k$, it's natural (and maybe unavoidable) to require that all the $f^{-1}(I_k)$ are measurable sets.

18. Jan 27, 2012

### Fredrik

Staff Emeritus
Perhaps someone can answer a very simple question: Are all measurable functions integrable? (I'm sure I'll get to the answer soon enough anyway, but if it only takes a minute to reply...)

19. Jan 27, 2012

### lavinia

Not all of them, no. A measurable function may have an infinite integral (or, if it takes both signs, no well-defined integral at all); in either case it is not called integrable.

20. Jan 27, 2012

### lavinia

- In classical physics one wants to measure physical quantities in regions of space. The approach is to divide the region into small pieces (e.g. rectangles) and multiply the volume of each piece by the density of the physical quantity in that piece. One then adds these up over lots of small pieces to estimate the total.

For instance, the flux of a field across a surface is measured as the sum of the normal component of the field density times the areas of small rectangular pieces of the surface.

This procedure is Riemann integration. To me it naturally arises from classical physics. It is a mathematical model of empirical estimation of physical quantities.

On the other hand, if one wants to measure the probability of something happening, one starts with the outcome (the thing whose probability one wants to measure) and asks what is the measure of the set of situations where this outcome will occur. One does not divide space up into equal regions; rather, one finds those regions of space where the outcome will occur. This is Lebesgue integration.

- Generally, classical physical quantities are assumed to be continuous, in fact differentiable, away from isolated singularities. In fact, they are often harmonic. For these types of mathematical functions the requirement that upper and lower sums converge to the same number is not a restriction.

In probability theory though, continuity - no less differentiability - is of no interest. the only thing that matters is measuring the volume of a region where an outcome will occur.

- I find it interesting that when lower sums and upper sums converge to the same number, the physical quantity must be nearly continuous. Certainly the estimate of something like flux across a surface should not depend upon how one approximates: upper sums, lower sums, and in-between sums should all give good estimates if the region is small enough. If this were not true, it would be hard to imagine doing classical physics.

Last edited: Jan 27, 2012