Should I Become a Mathematician?

  • Thread starter: mathwonk
  • Tags: Mathematician
Summary
Becoming a mathematician requires a deep passion for the subject and a commitment to problem-solving. Key areas of focus include algebra, topology, analysis, and geometry, with recommended readings from notable mathematicians to enhance understanding. Engaging with challenging problems and understanding proofs are essential for developing mathematical skills. A degree in pure mathematics is advised over a math/economics major for those pursuing applied mathematics, as the rigor of pure math prepares one for real-world applications. The journey involves continuous learning and adapting, with an emphasis on practical problem-solving skills.
  • #91
mathwonk said:
I am interested in starting this discussion in imitation of ZapperZ's fine forum on becoming a physicist, although i have no such clean-cut advice to offer on becoming a mathematician. All I can say is I am one.

My path here was that I love the topic, and never found another as compelling or fascinating. There are basically 3 branches of math, or maybe 4: algebra, topology, and analysis, or maybe also geometry and complex analysis.

There are several excellent books available in these areas: Courant, Apostol, Spivak, Kitchen, Rudin, and Dieudonne' for calculus/analysis; Shifrin, Hoffman/Kunze, Artin, Dummit/Foote, Jacobson, Zariski/Samuel for algebra/commutative algebra/linear algebra; and perhaps Kelley, Munkres, Wallace, Vick, Milnor, Bott/Tu, Guillemin/Pollack, Spanier on topology; Lang, Ahlfors, Hille, Cartan, Conway for complex analysis; and Joe Harris, Shafarevich, and Hirzebruch, for [algebraic] geometry and complex manifolds.

Also anything by V.I. Arnol'd.

But just reading these books will not make you a mathematician, [and I have not read them all].

The key thing to me is to want to understand and to do mathematics. When you have this goal, you should try to begin to solve as many problems as possible in all your books and courses, but also to find and make up new problems yourself. Then try to understand how proofs are made, what ideas are used over and over, and try to see how these ideas can be used further to solve new problems that you find yourself.

Math is about problems, problem finding and problem solving. Theory making is motivated by the desire to solve problems, and the two go hand in hand.

The best training is to read the greatest mathematicians you can read. Gauss is not hard to read, so far as I have gotten, and Euclid too is enlightening. Serre is very clear, Milnor too, and Bott is enjoyable. learn to struggle along in French and German, maybe Russian, if those are foreign to you, as not all papers are translated, but if English is your language you are lucky since many things are in English (Gauss), but oddly not Galois and only recently Riemann.

If these and other top mathematicians are unreadable now, then go about reading standard books until you have learned enough to go back and try again to see what the originators were saying. At that point their insights will clarify what you have learned and simplify it to an amazing degree.


Your reactions? more later. By the way, to my knowledge, the only mathematicians posting regularly on this site are Matt Grime and me. Please correct me on this point, since nothing this general is ever true.:wink:

Remark: Arnol'd, who is a MUCH better mathematician than me, says math is "a branch of physics, that branch where experiments are cheap." At this late date in my career I am trying to learn from him, and have begun pursuing this hint. I have greatly enjoyed teaching differential equations this year in particular, and have found that the silly structure theorems I learned in linear algebra have as their real use an application to solving linear systems of ode's.

I intend to revise my linear algebra notes now to point this out.

how much do you earn in a year as a math professor?
 
  • #92
kant said:
how much do you earn in a year as a math professor?

I'm not sure about other countries, but in Canada, most collective agreements are available online, so you can look this up. For example, http://www.uwfacass.uwaterloo.ca/floorsandthresholds20062008.pdf is the pay structure for the University of Waterloo.
 
  • #93
George Jones said:
I'm not sure about other countries, but in Canada, most collective agreements are available online, so you can look this up. For example, http://www.uwfacass.uwaterloo.ca/floorsandthresholds20062008.pdf is the pay structure for the University of Waterloo.

Hmm... ok, the money is reasonable... but what about the chicks? girls don't like nerdy guys... or do they? hmm...
 
  • #94
well, i guess what i am saying is this: are girls usually impressed by your profession? This is a serious question. Well, i get pretty good grades, but i am always very conscious that others might view me as weak.
 
  • #95
Hmm.. ok ok. i got it.
 
  • #96
Well, this is a family forum, but i will admit that to impress girls, in my experience, it is not sufficient to be able to solve their quadratic equations. It helps to know some jokes too, and to compliment their shoes. Secret: basically, to get dates it is sufficient to react to those girls who are trying to tell you you should ask them out.

[i deleted my earlier attempts at humor on this topic because my wife said they were "a little nerdy".]
 
  • #97
mathwonk said:
even in your case it may be that certain subtleties, such as my concept of local boundedness, are different from the proofs in your course, although of course the statements of the big results are the same.

Yes you're right, we didn't mention the local boundedness. What is this concept actually good for?

mathwonk said:
here is a little slightly less standard exercise for you along those lines: to prove that the derivative of a differentiable function always has the intermediate value property, whether or not it is continuous. I.e. assume f is differentiable on [a,b] and that g is its derivative. of course f is continuous, but g may not be. even if g is not continuous, however, i claim that if g(a) = f'(a) > 0 and g(b) = f'(b) < 0, then there is some x with a < x < b and g(x) = f'(x) = 0. try that.

Well, I would proceed like this:

f is continuous on [a,b], so (by Weierstrass's second theorem; I don't know what you call it in the US :smile: ) f attains its maximum and minimum on [a,b].
But g(a) > 0, hence f is increasing at a (i.e. there exists a d > a such that f(x) > f(a) for every x in (a,d)). Hence f(a) is not the maximum of f on [a,b]. Since g(b) < 0, the same holds for f(b).
Let f(m) be the maximum. Then m must lie in the open interval (a,b).
Hence f(m) is also a local maximum => g(m) = 0.
Q.E.D.
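
A cheap numerical illustration of this argument (a Python sketch, not part of the proof; sin(x) on [0,3] is just a convenient example satisfying the hypotheses): the maximum lands in the interior, and the derivative vanishes there.

```python
import numpy as np

# f(x) = sin(x) on [0,3]: f'(0) = 1 > 0 and f'(3) = cos(3) < 0,
# so the proof predicts an interior maximum where f' = 0.
a, b = 0.0, 3.0
x = np.linspace(a, b, 100001)
f = np.sin(x)

m = x[np.argmax(f)]              # location of the maximum on [a,b]
print("max at x =", m)           # ~ pi/2, strictly inside (a,b)
print("f'(m) =", np.cos(m))      # ~ 0, as claimed
```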
 
  • #98
it is good for proving the boundedness result for possibly discontinuous functions. This shows that the boundedness of a function on a closed bounded interval does not actually need continuity, but is true under the weaker condition of local boundedness. it could help you prove a discontinuous function is also bounded, if you could show it is everywhere locally bounded.

i just like it because it occurred to me while thinking through the proof from scratch. it convinces me that i thought up the proof myself and hence am beginning to understand it, instead of just remembering a proof i read.

i like your proof that f'(x) = 0 has a solution. it is very clear and complete, without being wordy at all. [i believe the needed weierstrass 2nd thm is proved in my notes above as well and follows quickly from the boundedness of reciprocals].

now can you refine it to give the full IVT for derivatives? I.e. assume f'(a) = c and f'(b) = d, and c<e<d. prove f'(x) = e has a solution too.
 
  • #99
ircdan, #82 and #83, look clean as a whistle. Also #84, but try that one again just using rolle's thm: if a differentiable f takes the same value twice on an interval, then f' has a zero in between.

i.e. if a differentiable f is not monotone on [a,b] can you prove it takes the same value twice?

as for the intermediate value thm, try it without sequences, just using the property you already proved: that a function which is positive or negative at a point is so on an interval.

then let x be the smallest number in [a,b] which is not smaller than any point where f < 0, i.e. x = sup{t in [a,b] : f(t) < 0}. if f(a) < 0 and f(b) > 0, prove f cannot be negative at x.
 
  • #100
Q. who wants to be a mathematician?
hmmm... I guess you have to be intelligent enough and very interested in maths, you have to study hard, and you should study at a cool university. AND you can't say that all people with a PhD in math are mathematicians.

A. I would, if I could. I can't so I don't want to be 1.
 
  • #101
r4nd0m, here is a possible test of the usefulness of the condition of local boundedness: is it true or not that if f has a derivative everywhere on [a,b], then g = f' is bounded on [a,b]? [if it is true, then local boundedness might help prove it.]

unfortunately it appears to be false. i.e. f(x) = x^2 sin(1/x^2) for x not 0, and f(0) = 0, seems to be differentiable everywhere with derivative locally unbounded at x = 0. so I have not yet thought of an interesting case where local boundedness holds and continuity fails. but the concept still focuses attention on why the theorem is true. i.e. if a function f is unbounded on [a,b], then there is a point x in [a,b] with f unbounded on every interval containing x. that is the real content of the theorem. in particular continuous functions do not fail this condition at any point. so it lets you stop thinking about the whole interval and think about the nbhd of one point.

e.g. in finding the counterexample above it helped me to know that if a counterexample existed, it would have to also be a local counterexample. i.e. to know that if a derivative existed which was unbounded on [a,b], there must also be a point x in [a,b] at which the derivative is locally unbounded, which is a priori a stronger condition.:smile:
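
here is a cheap numerical check of that counterexample (a numpy sketch): for x not 0 we have f'(x) = 2x sin(1/x^2) - (2/x) cos(1/x^2), and along the points x_n = 1/sqrt(2 pi n) the sine term vanishes, so f'(x_n) = -2 sqrt(2 pi n) blows up even though f'(0) = 0 exists.

```python
import numpy as np

# f(x) = x^2 sin(1/x^2) for x != 0, f(0) = 0, has
# f'(x) = 2x sin(1/x^2) - (2/x) cos(1/x^2) for x != 0.
def fprime(x):
    return 2*x*np.sin(x**-2) - (2/x)*np.cos(x**-2)

# at x_n = 1/sqrt(2*pi*n): sin(1/x^2) = 0 and cos(1/x^2) = 1,
# so f'(x_n) = -2*sqrt(2*pi*n) --> -infinity as n grows.
for n in [1, 10, 100, 10000]:
    xn = 1/np.sqrt(2*np.pi*n)
    print(f"x = {xn:.6f}   f'(x) = {fprime(xn):10.2f}")
```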
 
  • #102
doing mathematical research can be as simple as this: finding a theorem whose proof actually proves more than the theorem asserts, and then generalizing it to a new interesting case.

for example the famous kodaira vanishing theorem says that on a complex manifold, if L is a line bundle with L-K positive in a certain sense, i.e. with positive curvature, or positive definite chern class, then the cohomology of L is zero above degree 0. the proof by kodaira, modified by bochner, is long and hard, but analyzing it closely shows that it works in each degree separately, by showing the curvature in that degree is positive, i.e. that certain sums of eigenvalues are positive.

now when kodaira's hypothesis holds, then all degree one eigenvalues are positive, and then those in higher degrees, which are sums of the ones in degree one, must also be positive. but in fact if only one eigenvalue is negative, and all others are not only positive but large compared to that one, then any sum of two or more eigenvalues will be positive, i.e. cohomology will be zero in dimension 2 and more.

since on a complex torus, which is flat, eigenvalues can be scaled in size without affecting the fact that they represent the given line bundle, this gives a proof of Mumford's famous "intermediate cohomology vanishing theorem" on complex tori.

this theorem has in fact been published with this proof by a number of well known mathematicians.

a more significant and wide reaching generalization has been obtained by Kawamata and Viehweg, using branched covers of complex manifolds, to generalize to a sort of fractionally positive condition, which has many more applications than the original theorem. all the proofs reduce to the kodaira theorem, for which kolla'r has also given a nice understandable "topological" proof.

My colleague and I have also given a generalization of riemann's famous "singularity theorem" on jacobian theta divisors, whose beautiful proof by kempf turned out to use only some conditions which are usually true also for theta divisors of prym varieties, so we published this.

this progress, and later work on less general cases, gave impetus to the subject, which has culminated recently in a complete solution by a young mathematician, Sebastian Casalaina-Martin, of the prym singularities theorem, over 100 years after prym theta divisors were introduced.

this in turn has led to progress in understanding abelian varieties of low dimension. e.g. it is shown now by Casalaina-Martin that if a 5 dimensional abelian variety has a theta divisor with a triple point or worse, then in fact that abelian variety is either a hyperelliptic jacobian, or an intermediate jacobian of a cubic threefold.

thus understanding proofs helps one learn more than just knowing the traditional statements, and it is fun. this is why i try to teach students to think through proofs and make their own. it is hard getting many people to get past just memorizing the statements and problem-solving techniques, and even proofs, without analyzing them.

In the cases above many people had known and used those theorems for decades without noticing they could be strengthened.
 
  • #103
lisa, i am not sure about some of your restrictions on candidacy for being a mathematician, but i think you do have to want to.

some of the best mathematicians at my school went to colleges like University of Massachusetts, Grinnell, University of North Carolina (are they cool? i don't know), and the smartest guy in my grad class at Utah went to Univ of Iowa. i guess by my definition hurkyl is a mathematician even if he hasn't joined the union, since he likes it and does it.

but i enjoy singing in the shower too even if i am not a singer. why miss out on the fun?
 
  • #104
ircdan, your proof in #90 is right on. can you generalize it to prove that the only solutions of (D-a)(D-b)y = 0 are ce^(at) + de^(bt) when a and b are different?

then try to show that all solutions of (D-a)(D-a)y = 0 are ce^(at) + dte^(at).

my notation means (D-a)(D-b)y = [D^2 - (a+b)D + ab]y = y'' - (a+b)y' + aby.

TD, I am glad to hear what thorough instruction is provided in Belgium. You say you skipped proving the inverse function theorem. can you prove it for one variable functions?
 
  • #105
mathwonk said:
TD I am glad to hear what thorough instruction is provided in Belgium. You say you skipped proving the inverse function theorem. can you prove it for one variable functions?
We skipped that indeed; according to my notes it would have required "more advanced techniques" than we had developed at that point in the course. We then used it to prove the implicit function theorem for f : R²->R, which was a rather technical proof (more than we were used to, at least).
I'm supposing the proof of the inverse function theorem would be at least equally technical/complicated, so I doubt that I would be able to prove it just like that :blushing:
 
  • #106
i think you can prove it with only the intermediate value theorem. i.e. let f be a continuous function on [a,b] with derivative nonzero on (a,b), and prove first that the image of f is an interval [c,d], and that f is strictly monotone from [a,b] to [c,d].


this will be easy for you. then use IVT to prove that f^(-1) is continuous from [c,d] to [a,b].


you can do this easily too, with some thought, and it may convince you that the things concealed from you are as easy as those shown you. the great thing is to begin to see that the subject is not a mystery, contained within book covers, but is open to all who use their "3rd eye", i.e. their reasoning powers.
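
in fact the monotonicity-plus-IVT argument is an algorithm: here is a minimal bisection sketch (Python, assuming f is continuous and strictly increasing on [a,b]) that evaluates f^(-1)(y), which is the constructive content of the one variable theorem.

```python
# compute f^(-1)(y) for a continuous strictly increasing f on [a,b],
# assuming f(a) <= y <= f(b); bisection is the IVT made constructive.
def inverse(f, y, a, b, tol=1e-12):
    while b - a > tol:
        m = (a + b) / 2
        if f(m) < y:
            a = m          # preimage lies in [m, b]
        else:
            b = m          # preimage lies in [a, m]
    return (a + b) / 2

f = lambda x: x**3 + x     # strictly monotone: f'(x) = 3x^2 + 1 > 0
x = inverse(f, 10.0, 0.0, 3.0)
print(x, f(x))             # x ~ 2.0, f(x) ~ 10.0
```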
 
  • #107
hint : draw a picture of the graph.
 
  • #108
to prove the only solutions of (D-a)(D-b)y = 0 are of form ce^(ax) + de^(bx), show that if L is linear and Lf = Lg = h, then L(f-g) = 0. then note that (D-a)(e^(bx)) = (b-a)e^(bx), hence (D-a)(e^(bx)/[b-a]) = e^(bx).

Thus (D-a)(D-b)(e^(bx)/[b-a]) = 0, so all solutions of (D-a)(D-b)y = 0 are of form y = ce^(ax) + de^(bx).


is this right? i have lots of problems with arithmetic.:biggrin:
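
one cheap way to check the arithmetic (a sympy sketch, verifying by direct substitution that ce^(ax) + de^(bx) does satisfy (D-a)(D-b)y = 0):

```python
import sympy as sp

x, a, b, c, d = sp.symbols('x a b c d')
y = c*sp.exp(a*x) + d*sp.exp(b*x)

# (D-a)(D-b)y = y'' - (a+b)y' + ab*y; this should simplify to 0
expr = y.diff(x, 2) - (a + b)*y.diff(x) + a*b*y
print(sp.simplify(expr))   # 0
```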
 
  • #109
mathwonk said:
now can you refine it to give the full IVT for derivatives? I.e. assume f'(a) = c and f'(b) = d, and c<e<d. prove f'(x) = e has a solution too.

Well, I would do it like this:
Let the function g be defined as g(x) = f(x) - ex (here e is just the given constant). f is differentiable, and obviously ex is also differentiable, hence g is differentiable (the difference of two differentiable functions is differentiable).
g'(x) = f'(x) - e
g'(a) = f'(a) - e = c - e < 0
g'(b) = f'(b) - e = d - e > 0

So by the previous result (with signs reversed) there must be a value m in (a,b) for which g'(m) = f'(m) - e = 0 => f'(m) = e.
Q.E.D.

What I have realized when learning calculus, and also in doing this exercise, is that many of the proofs use only a few very similar ideas.
At first (when I started with calculus) I didn't really understand why we proved some theorem in a certain way - I understood it only formally, as some series of expressions. But as we proceeded I saw that there is something behind those formulas, some basic ideas which repeat pretty often, and I started to understand it more intuitively (if something like that can be said about math :smile: ).

TD said:
We then used it so prove the implicit function theorem for f : R²->R, which was a rather technical proof (more than we were used to at least).

In my textbook the proof of this theorem takes two pages. Anyway I think that it is a very elegant proof, when you see what's happening there - I drew a picture and I was really surprised how easy it was.
 
  • #110
mathwonk said:
Jbusc, topology is such a basic foundational subject that it does not depend on much else, whereas differential geometry is at the other end of the spectrum. still, there are introductions to differential geometry that only use calculus of several variables (and topology and linear algebra). Try Shifrin's notes on his webpage: http://www.math.uga.edu/~shifrin/

I had forgotten that I had posted here, so sorry for bringing it up again.

Thanks for that resource, his notes are exactly what I was looking for. I am reading Hartle's General Relativity, and while it is an excellent book, the math is quite watered down, so I am trying to find some readings on differential geometry.

As well, there are some graduate students in electrical engineering here whose research problems lead them to ask questions about things such as maps between spaces with different metrics and topologies, and they could use some resources too, since those topics are not addressed in their education.

I have one other question, from looking around it seems that you (and many others) are quite fond of Apostol as a Calculus textbook. Now without being too egotistical I would rank my knowledge of multivariable calc at "very proficient", but I would like to improve to "extremely proficient". I am running low on my textbook budget however - should I really start on volume I or is beginning on volume II adequate?
 
  • #111
nice proof r4nd0m! clean as a whistle. I must say when I read your arguments they look more succinct and clear than my own.

i am really happy to hear what you say about proofs beginning to look more similar, based on a few ideas. like, in this case, subtraction! this trivial-sounding device is very basic, as it reduces the consideration of arbitrary numbers to consideration of the case of zero! and zero is often an easier case to reason about.

jbusc,

you can begin wherever you want, but you may find that volume 2 of apostol assumes more familiarity with rigorous arguments and proofs from volume 1 than you are used to. but since you are reading on your own, you can always slow down and go back for a refresher.

i think i found some much cheaper copies of apostol than the usual ones, and listed some of them above. try abebooks.com or one of the other used book sites.
 
  • #112
another good choice for multivariable calc, for people already knowing some of it, plus some linear algebra and some limit theory, is spivak's "calculus on manifolds". this excellent short book is a great bridge from undergraduate to graduate level preparation in calculus.
 
  • #113
remark on implicit, inverse functions: i recall the 2 variable inverse function theorem is used to prove the implicit function theorem from R^2-->R.

As opposed to the one variable inverse function theorem, the 2 variable version is topologically interesting and requires (or usually uses) some new ideas.

for instance, one must prove that a smooth function f:R^2-->R^2 taking (0,0) to (0,0), and with derivative matrix at (0,0) equal to the 2 by 2 identity matrix, maps a small nbhd of (0,0) onto a small nbhd of (0,0).

try picturing this, and see if you can think of an intuitive argument that it should be true.

it is also useful to restudy the argument deriving the implicit function theorem from this result as r4nd0m has done.
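
here is a cheap numerical version of that intuition (a Python sketch; the map f below is just a hypothetical example with f(0,0) = (0,0) and derivative equal to the identity there): since f(p) is approximately p near the origin, the iteration p <- p - (f(p) - q) is a contraction for small q and converges to a preimage of q, which is exactly the "small nbhd onto small nbhd" claim.

```python
import numpy as np

# a sample smooth map with f(0,0) = (0,0) and Jacobian = identity at (0,0)
def f(p):
    x, y = p
    return np.array([x + y**2, y + x**2])

q = np.array([0.10, -0.05])   # a small target point near (0,0)
p = np.zeros(2)
for _ in range(50):
    p = p - (f(p) - q)        # contraction, since f(p) - p is higher order

print("p =", p)
print("f(p) =", f(p), "~ q =", q)   # q is hit by a point near (0,0)
```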
 
  • #114
as to solving linear ode's with constant coeffs, the hard part is the case already solved above for n = 1, i.e. that (D-a)y = 0 iff y = ce^(at).

the higher cases follow from that one by induction.

i.e. prove that (D-a)(D-b)y = 0 if and only if z = (D-b)y satisfies (D-a)z = 0. thus z must equal ce^(at), and y must solve (D-b)y = z. so it suffices to
i) find one solution y1 of (D-b)y = e^(at), and then
ii) show that if y2 is another solution, then (y1-y2) solves (D-b)y = 0. and this tells you what y1-y2 is. since you already know y1, this tells you all possible solutions y2.

try this. it shows how to use "linear algebra" to ratchet up the calculus solution for the case n=1, in order to solve all higher cases.
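
for a cheap check of steps i) and ii) (a sympy sketch, assuming a and b are different): y1 = e^(at)/(a-b) is one solution of (D-b)y = e^(at), and the resulting general solution ce^(at) + de^(bt) of (D-a)(D-b)y = 0 checks out by substitution.

```python
import sympy as sp

t, a, b, c, d = sp.symbols('t a b c d')
D = lambda expr, r: expr.diff(t) - r*expr    # the operator (D - r)

# step i): one particular solution of (D-b)y = e^(at), valid for a != b
y1 = sp.exp(a*t)/(a - b)
print(sp.simplify(D(y1, b) - sp.exp(a*t)))   # 0

# step ii): two solutions of (D-b)y = z differ by a solution of (D-b)y = 0,
# i.e. by d*e^(bt); the resulting general solution solves the full equation
y = c*sp.exp(a*t) + d*sp.exp(b*t)
print(sp.simplify(D(D(y, b), a)))            # 0
```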
 
  • #115
for people trying to graduate to a higher level view of calculus, here is a little of an old lecture on background for beginning students in differential topology:

Math 4220/6220, lecture 0,
Review and summary of background information

Introduction: The most fundamental concepts used in this course are those of continuity and differentiability (hence linearity), and integration.

Continuity
Continuity is at bottom the idea of approximation, since a continuous function is one for which f(x) approximates f(a) well whenever x approximates a well enough. The precise version of this is couched in terms of “neighborhoods” of a point. In that language we say f is continuous at a, if whenever a neighborhood V of f(a) is specified, there exists a corresponding neighborhood U of a, such that every point x lying in U has f(x) lying in V.
Then the intuitive statement “if x is close enough to a, then f(x) is as close as desired to f(a)”, becomes the statement: “for every neighborhood V of f(a), there exists a neighborhood U of a, such that if x is in U, then f(x) is in V”.
Neighborhoods in turn are often defined in terms of distances, for example an “r neighborhood” of a, consists of all points x having distance less than r from a. In the language of distances, continuity of f at a becomes: “if a distance r > 0 is given, there is a corresponding distance s > 0, such that if dist(x,a) < s, (and f is defined at x) then dist(f(x),f(a)) < r”.
More generally we say f(x) has limit L as x approaches a, if for every nbhd V of L, there is a nbhd U of a such that for every point of U except possibly a, we have f(x) in V. Notice that the value f(a) plays no role in the definition of the limit of f at a. Then f is continuous at a iff f(x) has limit equal to f(a) as x approaches a.

Differentiability
Differentiability is the approximation of non linear functions by linear ones. Thus making use of differentiability requires one to know how to calculate the linear function which approximates a given differentiable one, to know the properties of the approximating linear function, and how to translate these into analogous properties of the original non linear function. Hence a prerequisite for understanding differentiability is understanding linear functions and the linear spaces on which they are defined.

Linearity
Linear spaces capture the idea of flatness, and allow the concept of dimension. A line with a specified point of origin is a good model of a one dimensional linear space. A Euclidean plane with an origin is a good model of a two dimensional linear space. Every point in a linear space is thought of as equivalent to the arrow drawn to it from the specified origin. This makes it possible to add points in a linear space by adding their position vectors via the parallelogram law, and to "scale" points by real numbers or "scalars", by stretching the arrows by this scale factor, (reversing the direction if the scalar is negative).
We often call the points of a linear space "vectors" and the space itself a "vector space". A linear function, or linear map, is a function from one linear space to another which commutes with these operations, i.e. f is linear if f(v+w) = f(v)+f(w) and f(cv) = cf(v), for all scalars c, and all vectors v,w.
The standard model of a finite dimensional linear space is R^n. A fundamental example of an infinite dimensional linear space is the space of all infinitely differentiable functions on R.

Linear Dimension
This is an algebraic version of the geometric idea of dimension. A line is one dimensional. This means given any point except the origin, the resulting non zero vector can be scaled to give any other vector on the line. Thus a linear space is one dimensional if it contains a non zero vector v such that given any other vector x, there is a real number c such that x = cv. We say then v spans the line.
A plane has the two dimensional property that if we pick two distinct points both different from the origin, and not collinear with the origin, then every point of the plane is the vector sum of multiples of the two corresponding vectors. Thus a linear space S is two dimensional if it contains two non zero vectors v,w, such that w is not a multiple of v, but every vector in S has form av+bw for some real numbers a,b. We say the set {v,w} spans the plane S.
In general a set of vectors {vi} spans a space S if every vector in S has the form ∑ aivi, where the sum is finite. The space is finite dimensional if the set {vi} can be taken to be finite. A space has dimension r if it can be spanned by a set of r vectors but not by any set of fewer than r vectors. If S is inside T, and both are finite dimensional linear spaces of the same dimension, then S = T.

Linear maps
Unlike continuous maps, linear maps cannot raise dimension, and bijective linear maps preserve dimension. More precisely, if f:S-->T is a surjective linear map, then dim(T) <= dim(S), whereas if f:S-->T is an injective linear map, then dim(T) >= dim(S). Still more precisely, if ker(f) = f^(-1)(0), and im(f) = {f(v): v is in S}, then ker(f) and im(f) are both linear spaces [contained in S, T respectively], and dim(ker(f)) + dim(im(f)) = dim(S). This is the most fundamental and important property of dimension. It is often stated as follows. The rank of a linear map f:S-->T is the dimension of im(f) and the nullity is the dimension of ker(f). Then for f:S-->T, we have rank(f) + nullity(f) = dim(S).
It follows that f is injective if and only if ker(f) = {0}, and surjective if dim(T) = dim(im(f)) is finite. A linear map f:S-->T with a linear inverse is called an isomorphism. A linear map is an isomorphism if and only if it is bijective. If dim(S) = dim(T) is finite, a linear map f:S-->T is bijective if and only if f is injective, if and only if f is surjective. A simple and important example of a linear map is the projection R^nxR^m-->R^n, taking (v,w) to v. This map is trivially surjective, with kernel {0}xR^m.
The theory of dimension gives a strong criterion for proving the existence of solutions of linear equations f(x) = w in finite dimensional spaces. Assume dimS = dimT finite, f:S-->T linear, and f(x) = 0 only if x = 0. Then for every w in T, the equation f(x) = w has a unique solution.
More generally, if S,T are finite dimensional, f:S-->T linear, and dim(ker(f)) = dim(S) - dim(T) = r, then every equation f(x) = w has an r dimensional set of solutions. We describe the set of solutions more precisely below.
Differentiation D(f) = f' is a linear map from the space of infinitely differentiable functions on R to itself. The mean value theorem implies the kernel of D is the one dimensional space of constant functions, and the fundamental theorem of calculus implies D is surjective.
More generally, for every constant c the differential operator (D-c) is surjective with kernel the one dimensional space of multiples of e^(ct), hence a composition of n such operators has n dimensional kernel. One can deduce that a linear combination ∑ cjD^j, 0 <= j <= n, cn not 0, with constant coefficients cj, of compositions of D of maximum order n, has n dimensional kernel.
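
The rank plus nullity identity above is easy to test on any concrete matrix (a sympy sketch with a hypothetical 3 by 4 example):

```python
import sympy as sp

# a linear map f: R^4 --> R^3; the third row is the sum of the first two
M = sp.Matrix([[1, 2, 0, 1],
               [0, 1, 1, 1],
               [1, 3, 1, 2]])

rank = M.rank()                   # dim(im(f))
nullity = len(M.nullspace())      # dim(ker(f))
print(rank, nullity)              # 2 2
print(rank + nullity == M.cols)   # True: rank + nullity = dim(S) = 4
```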

Geometry of linear maps.
If f:S-->T is a linear surjection of finite dimensional spaces, then ker(f) = f^(-1)(0) is a linear space of dimension r = dim(S)-dim(T), and for every w in T, the set f^(-1)(w) is similar to a linear space of dimension r, except it has no specified origin. I.e. if v is any solution of f(v) = w, then the translation taking x --> x+v is a bijection from f^(-1)(0) to f^(-1)(w). Hence the choice of v as "origin" in f^(-1)(w) allows us to define a unique structure of linear space making f^(-1)(w) isomorphic to f^(-1)(0). Thus f^(-1)(w) is a translate of an r dimensional linear space.
In this way, f "fibers" or "partitions" the space S into the disjoint union of the "affine linear sets" f^(-1)(w). There is one fiber f^(-1)(w) for each w in T, each such fiber being a translate of the linear space ker(f) = f^(-1)(0). If f:S-->T is surjective and linear, and dimT = dimS - 1, then the fibers of f are all one dimensional, so f fibers S into a family of parallel lines, one line over each point of T. If f:S-->T is surjective (and linear), but dimT = dimS - r with r > 0, then f fibers S into a family of parallel affine linear sets f^(-1)(w), each of dimension r.

The matrix of a linear map R^n-->R^m
If S, T are linear spaces of dimension n and m, and {v1,...,vn}, {w1,...,wm} are sets of vectors spanning S, T respectively, then for every v in S and every w in T, the scalar coefficients ai, bj in the expressions v = ∑ aivi and w = ∑ bjwj are unique. Then, given these minimal spanning sets, a linear map f:S-->T determines and is determined by the "m by n matrix" [cij] of scalars where f(vj) = ∑i cijwi, for all j = 1,...,n. If S = T = R^n, we may take vi = wi = (0,...,0,1,0,...,0) = ei = the "ith unit vector", where the 1 occurs in the ith place.
If S is a linear space of dimension n and {v1,...,vn} is a minimal spanning set, we call {v1,...,vn} a basis for S. Then there is a unique isomorphism S-->R^n that takes vi to ei, where the set of unit vectors {e1,...,en} is called the "standard" basis of R^n. Conversely, under any isomorphism S-->R^n, the vectors in S corresponding to the set {e1,...,en} in R^n form a basis for S. Thus a basis for an n dimensional linear space S is equivalent to an isomorphism of S with R^n. Since every linear space has a basis, after choosing one, a finite dimensional vector space can be regarded as essentially equal to some R^n.
In the context of the previous sentence, every linear map can be regarded as a map f:R^n-->R^m. The matrix of such a map, with respect to the standard bases, is the m by n matrix whose jth column is the coordinate vector f(ej) in R^m.
If f:S-->T is any linear surjection of finite dimensional spaces, a careful choice of bases for S,T can greatly simplify the matrix of the corresponding map R^n-->R^m. In fact there are bases for S,T such that under the corresponding isomorphisms, f is equivalent to a projection
R^(n-m)xR^m-->R^m. I.e., up to linear isomorphism, every linear surjection is equivalent to the simplest example, a projection.
This illustrates the geometry of a linear surjection as in the previous subsection. I.e. a projection f:R^nxR^m-->R^m fibers the domain space R^nxR^m into the family of disjoint parallel affine spaces f^(-1)(v) = R^nx{v}, with the affine space R^nx{v} lying over the vector v. Since every linear surjection is equivalent to a projection, every linear surjection fibers its domain into a family of disjoint affine spaces linearly isomorphic to this family. We will see that the implicit function theorem gives an analogous statement for differentiable functions.

The determinant of a linear map R^n-->R^n.
For each linear map f:R^n-->R^n there is an important associated number det(f) = det(cij) = the sum of products ∑p sgn(p) ∏i ci,p(i), where p ranges over all permutations of the integers (1,2,...,n). det(f) is the oriented volume of the parallelepiped (i.e. block) spanned by the image of the ordered set of unit vectors f(e1),...,f(en). Then f is invertible iff det(f) is not 0. The intuition is that this block has non zero n dimensional volume iff the vectors f(e1),...,f(en) span R^n, iff f is surjective, iff f is invertible.:-p
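
The permutation sum above can be coded verbatim (a Python sketch; sgn(p) is computed by counting inversions) and compared against a library determinant:

```python
from itertools import permutations
import numpy as np

def sgn(p):
    # sign of a permutation: (-1)^(number of inversions)
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(c):
    # det(c) = sum over permutations p of sgn(p) * prod_i c[i][p(i)]
    n = len(c)
    return sum(sgn(p) * np.prod([c[i][p[i]] for i in range(n)])
               for p in permutations(range(n)))

c = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
print(det(c), np.linalg.det(c))   # both ~ 18.0
```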
 
  • #116
summary of derivatives in several variables

Here is the other half of lecture zero for a course that intends to use calculus of several variables. i.e. this is what you need to know:

Derivatives: Approximating non linear functions by linear ones.
Ordinary Euclidean space R^n is a linear space in which an absolute value is defined, say by the Euclidean "norm", |v| = (x1^2+...+xn^2)^(1/2), where v = (x1,...,xn), hence also a distance is defined by dist(v,w) = |v-w|. The set of points x such that |x-a| < r, is called the open ball of radius r centered at a. An "open set" is any union of open balls, and an open neighborhood of the point a is an open set containing a. If f:R^n-->R^m is any map, then f(x) has limit L as x approaches a, iff the real valued function |f(x)-L| has limit 0 as x approaches a.
In a linear space with such an absolute value or norm we can define differentiability as follows. A function h is "tangent to zero" at a, if h(a) = 0 and the quotient |h(x)|/|x-a| has limit zero as x approaches a. I.e. if "rise" over "run" approaches zero in all directions. In particular then h(x) approaches zero as x approaches a. Two functions f,g are tangent at a, if the difference f-g is tangent to zero at a.
A function f defined on a nbhd of a, is differentiable at a if there is a linear function L such that L(v) is tangent to f(v+a)-f(a) at 0. Then L = f'(a) is unique and is called the derivative of f at a. I.e. f has derivative L = f'(a) at a, iff the quotient |(f(x)-f(a)-L(x-a))|/|x-a| has limit zero as x approaches a. If f is itself linear, then f'(a)(v) = f(v), for all a. I.e. then a-->f'(a) is a constant (linear map valued) function, with value f everywhere.

Chain Rule
The most important property of derivatives is the chain rule for the derivative of a composite function. If f is differentiable at a and g is differentiable at f(a), then gof is differentiable at a and (gof)'(a) = g'(f(a))of'(a). I.e. the derivative of the composition, is the composition (as linear functions) of the derivatives. Since the derivative of the identity map is the identity map, this says roughly "the derivative is a functor", i.e. it preserves compositions and identity maps.
As a corollary, if a differentiable function has a differentiable inverse, the derivative of the inverse function is the inverse linear function of the derivative. I.e. if f^(-1) exists and is differentiable, then (f^(-1))'(f(a)) = (f'(a))^(-1). In particular, since a linear function can be invertible only if the domain and range have the same dimension, the same holds for a differentiable function. E.g. a differentiable function f:R^2-->R cannot have a differentiable inverse. (Continuous invertible functions also preserve dimension, but this is harder to prove in general. It is easy in low dimensions however. Can you prove there is no continuous invertible function f:R^2-->R?)

Calculating derivatives
The usual definition of the derivative of a one variable function from R to R agrees with that above, in the sense that if f'(a) is the usual derivative, i.e. the number lim_(h-->0) (f(a+h)-f(a))/h, then f(a+h)-f(a) is tangent at zero to the linear function f'(a)h of the variable h. I.e. the usual derivative is the number occurring in the 1 by 1 matrix of the derivative thought of as a linear function. There is an analogous way to compute the matrix of the derivative in general.
A function f:R^n-->R^m is made up of m component functions g1,...,gm, and if in the ith component function gi we hold all but the jth variable constant, and define the real valued function h(t) of one variable by h(t) = gi(a1,...,aj+t,...,an), we call h'(0) = ∂gi/∂xj(a) the jth partial derivative of gi at a. If f is differentiable at a, then all partials of f exist at a, and the matrix of the derivative L = f'(a) of f at a is the "Jacobian" matrix of partials [∂gi/∂xj(a)].
It is useful to have a criterion for existence of a derivative that does not appeal to the definition. It is this: if all the partials of f exist not only at a but in a nbhd of a, and these partials are all continuous at a, then f is differentiable at a, and the derivative is given by the matrix of partials. We can then check the invertibility of f'(a), by computing the determinant of this Jacobian matrix.
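
That criterion is also how one computes in practice (a numpy sketch; the map f is a hypothetical example): approximate the Jacobian matrix of partials by central differences, and test invertibility of f'(a) via its determinant.

```python
import numpy as np

# a sample map f: R^2 --> R^2 with continuous partials
def f(p):
    x, y = p
    return np.array([x + y**2, np.sin(x*y) + y])

def jacobian(f, a, h=1e-6):
    # jth column ~ partial derivative of f in the jth coordinate at a
    J = np.zeros((len(f(a)), len(a)))
    for j in range(len(a)):
        e = np.zeros(len(a)); e[j] = h
        J[:, j] = (f(a + e) - f(a - e)) / (2*h)
    return J

a = np.array([1.0, 0.5])
J = jacobian(f, a)
print(J)                    # ~ [[1, 1], [0.5*cos(0.5), cos(0.5) + 1]]
print(np.linalg.det(J))     # nonzero, so f'(a) is invertible at this a
```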

Inverse function and implicit function theorems
The "inverse function theorem", is a criterion for f to have a local differentiable inverse as follows: If f is differentiable on a neighborhood of a, and the derivative f'(x) is a continuous function of x in that nbhd, (i.e. the entries in the matrix of f'(x) are continuous functions of x), and if f'(a) is invertible, then f is differentiably invertible when restricted to some nbhd U of a. I.e. f maps some open nbhd U of a bijectively onto an open nbhd V = f(U) of f(a), with f-1 defined and differentiable on V, and f-1(V) = U.
More generally, the implicit function theorem characterizes differentiable functions locally equivalent to projection maps, as follows. If f is differentiable on a neighborhood of a in R^n with values in R^m, if the derivative f'(x) is a continuous function of x, and if f'(a) is surjective, then on some nbhd U of a, f is differentiably isomorphic to a projection.
I.e. if f:R^n-->R^m is continuously differentiable near a with surjective derivative at a, then there are open sets U in R^n, W in R^(n-m), V in R^m, with U a nbhd of a, V a nbhd of f(a), and a differentiable isomorphism h:U-->WxV, such that the composition f o h^(-1):WxV-->V is the projection map (x,y)-->y. Then the parallel flat sets Wx{y} which fiber the rectangle WxV are carried by h^(-1) into "parallel" curved sets which fiber the nbhd U of a. The fiber passing through a, suitably restricted, is the graph of a differentiable function, hence the name of the theorem.
I.e. one can take a smaller nbhd of a within U, of form XxY with X contained in W, and the map XxY-->WxV to be of form (x,y)-->(x,f(x,y)). Then the flat set Xx{f(a)} pulls back by h^(-1) to some subset Z of XxY in which every point is determined by its "X-coordinate". I.e. given x in X, there is a unique point of form (x, f(a)), hence a unique point h^(-1)(x,f(a)) in the set Z = h^(-1)(Xx{f(a)}). Since on Z the Y coordinate of every point is determined by the X coordinate, and every x coordinate in X occurs, Z is the graph of a function X-->Y. This function is differentiable since it is a composite of differentiable functions: i.e. (projection) o (h^(-1)) o (id, f(a)). We are more interested in the simpler geometric interpretation, that the map fibers the domain into smooth parallel surfaces, than in the "implicit function" interpretation that each of these surfaces is the graph of a function.
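
The standard example makes this concrete (a sympy sketch): for f(x,y) = x^2 + y^2 - 1 near a = (0,1), the derivative [2x, 2y] = [0, 2] is surjective onto R, and the fiber f = 0 is locally the graph y = sqrt(1 - x^2).

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2 - 1

# f'(a) at a = (0,1): the 1x2 matrix of partials [2x, 2y] = [0, 2]
grad = sp.Matrix([[f.diff(x), f.diff(y)]]).subs({x: 0, y: 1})
print(grad)                                  # Matrix([[0, 2]]): surjective

# hence near (0,1) the fiber f = 0 is the graph of a function y(x)
branch = [s for s in sp.solve(f, y) if s.subs(x, 0) == 1][0]
print(branch)                                # sqrt(1 - x**2)
```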

Compactness
In proving various results, we will often need the important ideas of connectedness and compactness from point set topology. In Euclidean space recall that an open set is a union of open balls. Compactness is a replacement for finiteness as follows: a set Z is called compact if whenever Z is "covered by" a collection of open sets (i.e. Z is contained in the union of those open sets), then a finite number of those same open sets already cover Z. A set is called "closed" if it is the complement of an open set.
A subset of R^n is compact if and only if it is closed and contained in some finite open ball, i.e. if and only if it is closed and "bounded". It follows that the product of two compact sets of Euclidean space is compact.
If f is a continuous function, and Z a compact subset of its domain, then f(Z) is also compact. Hence a real valued continuous function defined on a compact set Z assumes a global maximum there, namely the least upper bound of its values on Z. Likewise it assumes a global minimum on Z.
If Z is a compact subset of R^n then any open cover {Ui} of Z has a "Lebesgue number". I.e. given any collection of open sets {Ui} covering Z, there is a positive number r > 0, such that every open ball of radius r centered at any point of Z is wholly contained in some open set Ui of the given cover. This number is the minimum of the continuous function assigning to each point p of Z the least upper bound of its distances from the outside of all the sets Ui, i.e. the least upper bound of all r > 0 such that the open ball of radius r about p is contained in some set Ui. This function is positive valued since the sets Ui cover Z, hence it has a positive minimum.
A sequence contained in a compact set Z has a subsequence converging to a point of Z. In R^n this property implies in turn that Z is closed and bounded hence compact.

Connectedness
This is one of the most intuitive concepts in topology. Ask anyone, mathematician or not, which set is connected, the interval [0,1] or the two point set {0,1}, and they will always get it correct. Fortunately it is also one of the most important and powerful concepts. A set Z is connected if whenever Z is contained in the union of two open sets A,B, then either some point of Z is in both A and B, or Z is entirely contained in one of the sets A or B. I.e. you cannot separate a connected Z into two non empty disjoint open parts (A intersect Z) and (B intersect Z). Either (A intersect Z) and (B intersect Z) have a common point, or one of them is empty.
The empty set is connected. Any one point set is connected. The only connected subsets of R are the intervals, either finite or infinite, open or closed, half open or half closed. The image of a connected set under any continuous map is again connected. Thus an integer valued continuous function on an interval is constant. If f is a continuous real valued function defined on an interval of R, the set of values of f is also an interval. In calculus this is called the intermediate value theorem. (Tip: For proving things about connectedness, the most efficient form of the definition is that a set S is connected if and only if every continuous map from S to the 2 point set {0,1} is constant.)
If f:S^1-->R^2 is a continuous injection from the circle to the plane, then R^2 - f(S^1) is a disjoint union of exactly two non empty connected open sets, the inside and the outside of the closed loop f(S^1). This, the "Jordan curve theorem", is famously hard to prove, but we will prove it easily when f is continuously differentiable.:smile:
 
  • #117
i have just summarized all of the basics of topology, linear algebra, and calculus of several variables. did i touch any bases? help anyone?

do you recognize the content of your first 2 or 3 years of math in these 10 pages?:bugeye:
 
  • #118
mathwonk said:
do you recognize the content of your first 2 or 3 years of math in these 10 pages?:bugeye:
In general, most of the stuff, though some of it I hadn't seen before.

I was surprised because I had seen most of the material in your post on differential topology (I haven't studied topology yet). I was also familiar with connectedness. I had not seen the inverse function and implicit function theorems, but I'll be seeing them next semester. Also, the local boundedness material was new.

Do you have any notes on algebra you can just copy/paste? I'm taking my first course on algebra next semester from an extremely difficult(yet amazing) professor, so I plan to start reading my book in advance. Any extra notes or pointers would be appreciated!
 
  • #119
I was just wondering, does anybody have an online notebook? In other words, I am thinking about creating a LaTeX journal that shows my work for all the problems that I do (right now working out of Apostol). What do you guys think about this?
 
  • #120
courtrigrad said:
I was just wondering, does anybody have an online notebook? In other words, I am thinking about creating a LaTeX journal that shows my work for all the problems that I do (right now working out of Apostol). What do you guys think about this?

Yes, I have one. Currently I have four categories: Advanced Calculus, Linear Algebra, Complex Analysis, and Number Theory. It's actually really fun doing this; in a sense it's an "end result" to all your work. I mean, sure, there is a certain self-satisfaction you get from proving something, but there is nothing concrete. It's also an amazing way to get organized. Currently I have 5-6 big binders full of math problems, all disorganized, so what I do is read a section, do as many of the problems as I can, and then compare them to my previous solutions, if any. A lot of times I find that my new solution ends up being much cleaner than my old one. Also, I don't use LaTeX, I just use a scanner; it's much quicker and I can focus on solving problems rather than on making them look pretty. I think it's a great idea, go for it.:smile:
 
