for people trying to graduate to a higher level view of calculus, here is a bit of an old lecture on background for beginning students in differential topology:
Math 4220/6220, lecture 0,
Review and summary of background information
Introduction: The most fundamental concepts used in this course are those of continuity and differentiability (hence linearity), and integration.
Continuity
Continuity is at bottom the idea of approximation, since a continuous function is one for which f(x) approximates f(a) well whenever x approximates a well enough. The precise version of this is couched in terms of “neighborhoods” of a point. In that language we say f is continuous at a, if whenever a neighborhood V of f(a) is specified, there exists a corresponding neighborhood U of a, such that every point x lying in U has f(x) lying in V.
Then the intuitive statement “if x is close enough to a, then f(x) is as close as desired to f(a)”, becomes the statement: “for every neighborhood V of f(a), there exists a neighborhood U of a, such that if x is in U, then f(x) is in V”.
Neighborhoods in turn are often defined in terms of distances, for example an “r neighborhood” of a, consists of all points x having distance less than r from a. In the language of distances, continuity of f at a becomes: “if a distance r > 0 is given, there is a corresponding distance s > 0, such that if dist(x,a) < s, (and f is defined at x) then dist(f(x),f(a)) < r”.
More generally we say f(x) has limit L as x approaches a, if for every neighborhood V of L, there is a neighborhood U of a such that for every point x of U except possibly a itself, we have f(x) in V. Notice that the value f(a) plays no role in the definition of the limit of f at a. Then f is continuous at a iff f(x) has limit equal to f(a) as x approaches a.
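To make the distance version concrete, for f(x) = 2x one can take s = r/2 at every point a, since |x - a| < r/2 forces |f(x) - f(a)| = 2|x - a| < r. The following small Python sketch samples points to confirm this (the function f and the witness s = r/2 are illustrative choices):

```python
# Distance version of continuity for f(x) = 2x: given r > 0, the choice
# s = r/2 guarantees that |x - a| < s implies |f(x) - f(a)| < r.
# (f and the witness s = r/2 are illustrative choices.)

def f(x):
    return 2 * x

def witness_s(r):
    # a distance s that works for this particular f, at every point a
    return r / 2

def check_continuity_at(a, r, num_samples=1000):
    """Sample points x with |x - a| < s and confirm |f(x) - f(a)| < r."""
    s = witness_s(r)
    for k in range(num_samples):
        x = a - s + 2 * s * k / num_samples  # points spread through [a - s, a + s)
        if abs(x - a) < s and not abs(f(x) - f(a)) < r:
            return False
    return True
```

The same pattern, with a different witness s depending on a, verifies continuity of less uniform functions.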
Differentiability
Differentiability is the approximation of non linear functions by linear ones. Thus making use of differentiability requires one to know how to calculate the linear function which approximates a given differentiable one, to know the properties of the approximating linear function, and to know how to translate these into analogous properties of the original non linear function. Hence a prerequisite for understanding differentiability is understanding linear functions and the linear spaces on which they are defined.
Linearity
Linear spaces capture the idea of flatness, and allow the concept of dimension. A line with a specified point of origin is a good model of a one dimensional linear space. A Euclidean plane with an origin is a good model of a two dimensional linear space. Every point in a linear space is thought of as equivalent to the arrow drawn to it from the specified origin. This makes it possible to add points in a linear space by adding their position vectors via the parallelogram law, and to "scale" points by real numbers or "scalars", by stretching the arrows by this scale factor, (reversing the direction if the scalar is negative).
We often call the points of a linear space "vectors" and the space itself a "vector space". A linear function, or linear map, is a function from one linear space to another which commutes with these operations, i.e. f is linear if f(v+w) = f(v)+f(w) and f(cv) = cf(v), for all scalars c, and all vectors v,w.
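The two defining conditions are easy to test numerically. Here is a small Python sketch using numpy, comparing a map given by a matrix with a map that fails linearity (the matrix A, the shifted map g, and the random test vectors are illustrative choices):

```python
import numpy as np

# Numerical check of the two linearity conditions f(v+w) = f(v)+f(w) and
# f(cv) = cf(v), for a matrix map versus a map that fails them.
# (The matrix A and the random test vectors are illustrative choices.)

rng = np.random.default_rng(0)
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

def f(v):
    return A @ v          # linear: matrix multiplication

def g(v):
    return A @ v + 1.0    # not linear: the constant shift breaks both conditions

def is_linear(h, trials=20):
    for _ in range(trials):
        v, w = rng.standard_normal(2), rng.standard_normal(2)
        c = rng.standard_normal()
        if not np.allclose(h(v + w), h(v) + h(w)):
            return False
        if not np.allclose(h(c * v), c * h(v)):
            return False
    return True
```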
The standard model of a finite dimensional linear space is R^n. A fundamental example of an infinite dimensional linear space is the space of all infinitely differentiable functions on R.
Linear Dimension
This is an algebraic version of the geometric idea of dimension. A line is one dimensional. This means given any point except the origin, the resulting non zero vector can be scaled to give any other vector on the line. Thus a linear space is one dimensional if it contains a non zero vector v such that given any other vector x, there is a real number c such that x = cv. We say then v spans the line.
A plane has the two dimensional property that if we pick two distinct points both different from the origin, and not collinear with the origin, then every point of the plane is the vector sum of multiples of the two corresponding vectors. Thus a linear space S is two dimensional if it contains two non zero vectors v,w, such that w is not a multiple of v, but every vector in S has form av+bw for some real numbers a,b. We say the set {v,w} spans the plane S.
In general a set of vectors {vi} spans a space S if every vector in S has form a1v1 + ... + anvn for some finite subset {v1,...,vn} of {vi} and scalars a1,...,an. The space is finite dimensional if the set {vi} can be taken to be finite. A space has dimension r if it can be spanned by a set of r vectors but not by any set of fewer than r vectors. If S is inside T, and both are finite dimensional linear spaces of the same dimension, then S = T.
Linear maps
Unlike continuous maps, linear maps cannot raise dimension, and bijective linear maps preserve dimension. More precisely, if f:S-->T is a surjective linear map, then dim(T) <= dim(S), whereas if f:S-->T is an injective linear map, then dim(T) >= dim(S). Still more precisely, if ker(f) = f-1(0), and im(f) = {f(v): v is in S}, then ker(f) and im(f) are both linear spaces [contained in S,T respectively], and dim(ker(f)) + dim(im(f)) = dimS. This is the most fundamental and important property of dimension. This is often stated as follows. The rank of a linear map f:S-->T is the dimension of im(f) and the nullity is the dimension of ker(f). Then for f:S-->T, we have rank(f) + nullity(f) = dim(S).
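The count rank(f) + nullity(f) = dim(S) can be checked numerically for a concrete map. Here is a small Python sketch using numpy (the matrix M, representing a map R^4-->R^3, is an arbitrary illustrative choice):

```python
import numpy as np

# Rank-nullity for a linear map f: R^4 --> R^3 given by a matrix M.
# numpy's matrix_rank computes dim(im(f)); the nullity is dim(S) - rank.
# (The matrix M is an arbitrary illustrative example.)

M = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.0],   # a multiple of row 1, so the rank drops
              [0.0, 1.0, 0.0, 1.0]])

dim_S = M.shape[1]                    # dimension of the domain R^4
rank = np.linalg.matrix_rank(M)       # dim(im(f))
nullity = dim_S - rank                # dim(ker(f)), by rank + nullity = dim(S)
```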
It follows that f is injective if and only if ker(f) = {0}, and, when T is finite dimensional, surjective if and only if dim(im(f)) = dim(T). A linear map f:S-->T with a linear inverse is called an isomorphism. A linear map is an isomorphism if and only if it is bijective. If dimS = dimT is finite, a linear map f:S-->T is bijective if and only if f is injective, if and only if f is surjective. A simple and important example of a linear map is the projection R^nxR^m-->R^n taking (v,w) to v. This map is trivially surjective with kernel {0}xR^m.
The theory of dimension gives a strong criterion for proving the existence of solutions of linear equations f(x) = w in finite dimensional spaces. Assume dimS = dimT finite, f:S-->T linear, and f(x) = 0 only if x = 0. Then for every w in T, the equation f(x) = w has a unique solution.
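This criterion can be seen concretely in R^2. Here is a small Python sketch using numpy (the matrix A, representing f, and the right hand side w are arbitrary illustrative choices):

```python
import numpy as np

# The existence criterion in action: if a square matrix A has trivial kernel
# (full rank), every equation A x = w has exactly one solution.
# (A and w are arbitrary illustrative choices.)

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
w = np.array([5.0, 10.0])

# trivial kernel: the only solution of A x = 0 is x = 0, i.e. A has full rank
assert np.linalg.matrix_rank(A) == 2

x = np.linalg.solve(A, w)   # the unique solution of A x = w
```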
More generally, if S,T are finite dimensional, f:S-->T linear, and dim(ker(f)) = dim(S) - dim(T) = r, then every equation f(x) = w has an r dimensional set of solutions. We describe the set of solutions more precisely below.
Differentiation D(f) = f' is a linear map from the space of infinitely differentiable functions on R to itself. The mean value theorem implies the kernel of D is the one dimensional space of constant functions, and the fundamental theorem of calculus implies D is surjective.
More generally, for every constant c the differential operator (D-c) is surjective with kernel the one dimensional space of multiples of e^(ct), hence a composition of n such operators has n dimensional kernel. One can deduce that a linear combination c0 + c1D + ... + cnD^n, with constant coefficients cj and cn not 0, of compositions of D with maximum order n, has n dimensional kernel.
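One can check numerically that e^(ct) lies in the kernel of (D-c), approximating the derivative by a difference quotient. A small Python sketch (the constant c, the step size h, and the sample points are illustrative choices):

```python
import math

# Numerical check that f(t) = e^(ct) lies in the kernel of (D - c): its
# difference quotient at sample points is close to c * f(t).
# (The constant c, step size h, and sample points are illustrative choices.)

c = 0.7
h = 1e-6

def f(t):
    return math.exp(c * t)

def Df(t):
    # symmetric difference quotient approximating f'(t)
    return (f(t + h) - f(t - h)) / (2 * h)

# (D - c)f should be (approximately) zero at every sample point
residuals = [Df(t) - c * f(t) for t in (-1.0, 0.0, 0.5, 2.0)]
```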
Geometry of linear maps.
If f:S-->T is a linear surjection of finite dimensional spaces, then ker(f) = f-1(0) is a linear space of dimension r = dim(S)-dim(T), and for every w in T, the set f-1(w) is similar to a linear space of dimension r, except it has no specified origin. I.e. if v is any solution of f(v) = w, then the translation taking x--> x+v is a bijection from f-1(0) to f-1(w). Hence the choice of v as "origin" in f-1(w) allows us to define a unique structure of linear space making f-1(w) isomorphic to f-1(0). Thus f-1(w) is a translate of an r dimensional linear space.
In this way, f "fibers" or "partitions" the space S into the disjoint union of the "affine linear sets" f-1(w). There is one fiber f-1(w) for each w in T, each such fiber being a translate of the linear space ker(f) = f-1(0). If
f:S-->T is surjective and linear, and dimT = dimS - 1, then the fibers of f are all one dimensional, so f fibers S into a family of parallel lines, one line over each point of T. If f:S-->T is surjective (and linear), but dimT = dimS - r with r > 0, then f fibers S into a family of parallel affine linear sets f-1(w) each of dimension r.
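In the simplest case of a projection this fibering can be seen directly: translating the kernel by one solution sweeps out the fiber. A small Python sketch (the target value w and the sample kernel points are illustrative choices):

```python
import numpy as np

# The fiber picture for the projection f: R^2 x R^1 --> R^1, (x,y,z) |--> z.
# Its kernel is the plane z = 0, and the fiber over w is that plane translated
# by any one solution v of f(v) = w. (w and the sample points are illustrative.)

def f(p):
    return p[2]                       # project (x, y, z) to z

w = 5.0
v = np.array([0.0, 0.0, w])           # one particular solution of f(v) = w

# a few sample points of the kernel plane z = 0
kernel_points = [np.array([x, y, 0.0]) for x in (-1.0, 2.0) for y in (0.0, 3.0)]

# translating each kernel point by v lands it in the fiber f^(-1)(w)
fiber_points = [k + v for k in kernel_points]
```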
The matrix of a linear map R^n-->R^m
If S, T are linear spaces of dimension n and m and {v1,...,vn}, {w1,...,wm} are minimal sets of vectors spanning S,T respectively, then for every v in S, and every w in T, the scalar coefficients ai, bj in the expressions v = a1v1 + ... + anvn and w = b1w1 + ... + bmwm are unique. Then given these minimal spanning sets, a linear map f:S-->T determines and is determined by the "m by n matrix" [cij] of scalars where: f(vj) = c1jw1 + ... + cmjwm, for all j = 1,...,n. If S = T = R^n, we may take vi = wi = (0,...,0,1,0,...,0) = ei = the "ith unit vector", where the 1 occurs in the ith place.
If S is a linear space of dimension n and {v1,...,vn} is a minimal spanning set, we call {v1,...,vn} a basis for S. Then there is a unique isomorphism S-->R^n that takes vi to ei, where the set of unit vectors {e1,...,en} is called the "standard" basis of R^n. Conversely under any isomorphism S-->R^n, the vectors in S corresponding to the set {e1,...,en} in R^n form a basis for S. Thus a basis for an n dimensional linear space S is equivalent to an isomorphism of S with R^n. Since every linear space has a basis, after choosing one, a finite dimensional vector space can be regarded as essentially equal to some R^n.
In the context of the previous sentence, every linear map can be regarded as a map f:R^n-->R^m. The matrix of such a map, with respect to the standard bases, is the m by n matrix whose jth column is the coordinate vector f(ej) in R^m.
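This column-by-column recipe is easy to carry out. Here is a small Python sketch using numpy (the particular map f: R^3-->R^2 is an arbitrary illustrative choice):

```python
import numpy as np

# Building the matrix of a linear map f: R^3 --> R^2 column by column:
# the jth column is f(e_j), the image of the jth standard unit vector.
# (The particular map f is an arbitrary illustrative choice.)

def f(v):
    # a linear map R^3 --> R^2, written without any explicit matrix
    return np.array([v[0] + 2 * v[1], 3 * v[2] - v[0]])

n = 3
columns = [f(np.eye(n)[j]) for j in range(n)]   # f(e_0), f(e_1), f(e_2)
matrix = np.column_stack(columns)               # the 2 by 3 matrix of f
```

Applying the matrix to any vector now reproduces f, since both sides are linear and agree on the basis.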
If f:S-->T is any linear surjection of finite dimensional spaces, a careful choice of bases for S,T can greatly simplify the matrix of the corresponding map R^n-->R^m. In fact there are bases for S,T such that under the corresponding isomorphisms, f is equivalent to a projection
R^(n-m)xR^m-->R^m. I.e., up to linear isomorphism, every linear surjection is equivalent to the simplest example, a projection.
This illustrates the geometry of a linear surjection as in the previous subsection. I.e. a projection f:R^nxR^m-->R^m fibers the domain space R^nxR^m into the family of disjoint parallel affine spaces f-1(v) = R^nx{v}, with the affine space R^nx{v} lying over the vector v. Since every linear surjection is equivalent to a projection, every linear surjection fibers its domain into a family of disjoint affine spaces linearly isomorphic to this family. We will see that the implicit function theorem gives an analogous statement for differentiable functions.
The determinant of a linear map R^n-->R^n.
For each linear map f:R^n-->R^n there is an important associated number det(f) = det(cij) = the sum of signed products ∑p sgn(p) c1p(1)c2p(2)···cnp(n), where p ranges over all permutations of the integers (1,2,...,n). det(f) is the oriented volume of the parallelepiped (i.e. block) spanned by the images f(e1),...,f(en) of the ordered unit vectors. Then f is invertible iff det(f) is not 0. The intuition is that this block has non zero n dimensional volume iff the vectors f(e1),...,f(en) span R^n, iff f is surjective, iff f is invertible.
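Both readings of det(f), as an oriented volume and as an invertibility test, can be checked in a small Python sketch using numpy (the matrices A and B are illustrative examples):

```python
import numpy as np

# det(f) as oriented volume and as invertibility test: for a 2 by 2 matrix,
# |det| is the area of the parallelogram spanned by the columns f(e1), f(e2),
# and det != 0 exactly when the map is invertible.
# (The matrices A and B are illustrative examples.)

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])    # columns span a parallelogram of area 6

B = np.array([[1.0, 2.0],
              [2.0, 4.0]])    # columns are collinear: zero area, not invertible

det_A = np.linalg.det(A)
det_B = np.linalg.det(B)
```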
