Should I Become a Mathematician?

  • Thread starter: mathwonk
  • Tags: Mathematician
AI Thread Summary
Becoming a mathematician requires a deep passion for the subject and a commitment to problem-solving. Key areas of focus include algebra, topology, analysis, and geometry, with recommended readings from notable mathematicians to enhance understanding. Engaging with challenging problems and understanding proofs are essential for developing mathematical skills. A degree in pure mathematics is advised over a math/economics major for those pursuing applied mathematics, as the rigor of pure math prepares one for real-world applications. The journey involves continuous learning and adapting, with an emphasis on practical problem-solving skills.
  • #301
you are doing fine. insecurity is an occupational hazard, as there are so many brilliant people in it. love of the subject is the key. i think from what you say you are cut out to do this.


why give up what we love just because someone else is better at it? be happy for them, and hang in there.:smile:
 
  • #302
I love math. =]

and don't be discouraged by what others do. I'm dense compared to my classmates too, but I'm the brightest dense person (or maybe the densest bright person). Be proud of your accomplishments even if you're not the best. Having a love of math is good enough if you really enjoy what you do, regardless of how well you do it.
 
  • #303
day 4.1 algebra, groups and group actions

8000 Group Actions, Simplicity of Icos, Sylow, Jordan-Hölder.
We continue to study finite groups. To study non abelian ones, we try as with abelian groups to decompose them into products composed of smaller subgroups. This is not always possible, and even to attempt it we need to prove the existence of smaller subgroups. A finite abelian group G has a subgroup of order n for every n that divides #G. This is not true for non abelian groups, but it is true for prime power factors p^s dividing #G. To find these subgroups we could look for non trivial homomorphisms, but the kernel of a homomorphism is a normal subgroup, and subgroups of non abelian groups may not be normal. Worse, some non abelian groups have no proper normal subgroups, i.e. they are "simple". A homomorphism from a simple group G is thus either injective or constant.
We cannot have a product decomposition of such a group, since a product KxH admits two projections, to K and to H, whose kernels are the normal subgroups {e}xH and Kx{e}, which intersect in the identity. It cannot be a "semi direct product" KxcH, since that requires K to be normal and to intersect H only in the identity, nor can it even be an extension of a group H by a group K, i.e. there is no exact sequence {e}-->K-->G-->H-->{e}, since that requires K to be normal.
Thus we need another tool to study general groups, "group actions", a refinement of the technique of homomorphisms.
Definition: A group G acts on a set S if there is a map GxS-->S taking (g,x) to gx, such that g(hx) = (gh)x for all g,h in G, and all x in S, and ex = x for all x in S.
This is equivalent to a homomorphism G-->Bij(S) of G into the group of bijections of S, taking g to the bijection (x-->gx) and allows us to study G by looking at how it moves points of S around.

The key concepts are orbit, stabilizer, and
the counting principle #(G) = #(orbit).#(stabilizer).

More precisely:
Defn: If G acts on S, the orbit O(y) of a point y in S is the image of the map
G x {y}-->S, i.e. O(y) = {gy: all g in G}.

Defn: If y is in S, Stab(y) = {g in G: gy = y}.

Cosets and conjugacy classes come in as follows:
Lemma: If y is in S, and gy = z, then Stab(z) = g.Stab(y).g^-1 is a conjugate of Stab(y); the set {elements of G taking y to z} is the coset g.Stab(y); and its conjugate h(g.Stab(y))h^-1 = {elements of G taking hy to hz}.
proof: exercise.

Counting principle: For any y in S, #(G) = #O(y).#Stab(y).
proof: Since every element of G takes y to some element of the orbit of y, G is the disjoint union, over all z in O(y), of the sets {all g in G: gy = z}. Since each of these is a coset of Stab(y), and since multiplication by g is a bijection Stab(y)-->gStab(y), each of these cosets has the same cardinality as Stab(y). QED.
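The counting principle is easy to confirm by brute force. Here is a small Python sketch (my own example, not from the notes), letting G = S(3) act on {0,1,2}:

```python
from itertools import permutations

G = list(permutations(range(3)))   # S(3): all 6 bijections of {0,1,2}

def act(g, x):
    return g[x]                    # the permutation g sends x to g[x]

for y in range(3):
    orbit = {act(g, y) for g in G}
    stab = [g for g in G if act(g, y) == y]
    # counting principle: #(G) = #O(y).#Stab(y)
    assert len(G) == len(orbit) * len(stab)   # 6 = 3 . 2
```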

Lemma: Every subgroup H of G is a stabilizer for some action.
proof: Let G act on left cosets of H by left translation. I.e. x takes yH to (xy)H. Then H is the stabilizer of the coset eH = H. QED.

Thus stabilizers for actions can be used to study all subgroups.
Corollary(Lagrange): For every subgroup H of G, #(H) divides #(G).
proof: The counting principle says #(G) = #(H).#(cosets of H in G). QED.

Note: Being in the same orbit is an equivalence relation on S, so an action partitions S into disjoint orbits, each orbit having cardinality dividing #(G).

Def: A fixed point is a point y of S such that Stab(y) = G, i.e. O(y) = {y}.

Corollary: If S is finite, and #(G) = p^r, where p is prime, then #(S) is congruent, modulo p, to the number of fixed points.
proof: S is the disjoint union of orbits, and each orbit has cardinality divisible by p, except the singleton orbits. QED.
Example of a simple group: G = Icos = rotation group of a regular icosahedron. G acts on the points of the icosahedron, in particular on the vertices, which form one orbit of 12 points. Since each vertex is fixed by exactly 5 rotations, #(G) = (5)(12) = 60. This agrees with the orbit of 20 faces, each fixed by 3 rotations, and the orbit of the 30 edges, each fixed by two rotations.
The 20 elements of order 3 fixing the 10 pairs of opposite faces, the 24 elements of order 5 fixing the 6 pairs of opposite vertices, and the 15 elements of order 2 fixing the 15 pairs of opposite edges, give all 59 non trivial elements of G.
Since the stabilizers of all vertices are conjugate, a normal subgroup containing one element of order 5 contains all, and similarly for the other orders. Hence a normal subgroup K of G has order = 1 + some or none of the integers 15, 20, 24. But the only divisors of 60 those sums form are 1 and 60. Hence G has no proper normal subgroups, so is simple.
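Since a simple group of order 60 is isomorphic to A(5) (proved below), the element count 1 + 15 + 20 + 24 = 60 can be double-checked by machine inside A(5); a Python sketch (my example):

```python
from itertools import permutations

def order(p):
    # order of the permutation p of {0,...,4}
    idp, q, n = tuple(range(5)), p, 1
    while q != idp:
        q = tuple(p[i] for i in q)   # compose p with q
        n += 1
    return n

def sign(p):
    inversions = sum(1 for i in range(5) for j in range(i + 1, 5) if p[i] > p[j])
    return (-1) ** inversions

A5 = [p for p in permutations(range(5)) if sign(p) == 1]
counts = {}
for p in A5:
    counts[order(p)] = counts.get(order(p), 0) + 1
assert len(A5) == 60
assert counts == {1: 1, 2: 15, 3: 20, 5: 24}   # 1 + 15 + 20 + 24 = 60
```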

Next we use actions to produce stabilizer subgroups of prime power orders.
Theorem(Sylow): Let #(G) = mp^r where p does not divide m.
1) There exist subgroups of G of order p^r.
2) All subgroups of order p^r are conjugate to one another,
3) The number of subgroups of order p^r divides m, and is congruent to 1 modulo p.
proof: Suppose G acts on a set S such that p does not divide #(S). S is a disjoint union of orbits, so there is an orbit O(x) whose order is not divisible by p. By the counting principle p^r divides #(Stab(x)). So if we can find such an action where #(Stab(x)) <= p^r, we would be done.
Since G is an arbitrary group, the only thing G acts on is G itself, by translation and conjugation. But G has order divisible by p. We might consider subgroups of G, but we do not know how many there are! So we consider subsets of G, with G acting by translation. If a subgroup H stabilizes a non empty set T, then for any y in T, translation is an injection H-->T taking g in H to gy in T. So H is no larger than T. Thus if we let G act on subsets of size p^r, then the stabilizers will have cardinality <= p^r as desired.
So we hope the number of such subsets is not divisible by p. Of course the set S of subsets of G of size p^r has order equal to the binomial coefficient C(mp^r, p^r) = [mp^r(mp^r - 1)...(mp^r - p^r + 1)] / [p^r(p^r - 1)...(2)(1)]. In this fraction every factor on top of the form (mp^r - k) is divisible by p^s, s <= r, if and only if k is, if and only if the factor (p^r - k) on the bottom is. Thus every factor of p occurring in the top is canceled by a factor from the bottom. Hence this binomial coefficient is not divisible by p, and thus the stabilizer of any subset in an orbit whose order is not divisible by p gives a subgroup of G of order p^r. QED
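A quick numeric sanity check (my own, with a few sample values of p, r, m) that the binomial coefficient C(mp^r, p^r) is prime to p when p does not divide m:

```python
from math import comb

# sample triples (p, r, m) with p prime and p not dividing m
for p, r, m in [(2, 3, 3), (3, 1, 4), (5, 1, 12), (7, 2, 3)]:
    assert m % p != 0
    assert comb(m * p**r, p**r) % p != 0   # C(mp^r, p^r) is prime to p
```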

Lemma: If H,K are subgroups of G and H lies in N(K), then the set of products HK is a subgroup of G, and HK/K ≅ H/(H∩K).
proof: exercise.
To count the number of subgroups P1,...,Pn of order p^r (called Sylow p-subgroups, or p^r-subgroups), let P1 act by conjugation on all of them. We claim P1 fixes only P1. To prove it, if P1 fixes Pj, then P1 lies in the "normalizer" N(Pj) = {g in G such that g^-1Pjg = Pj}. Then P1Pj is a subgroup of G, and (P1Pj)/Pj ≅ P1/(P1∩Pj). Since the latter quotient group has order dividing #(P1) = p^r, it follows that #(P1Pj) is a power of p. Since P1Pj contains P1, whose order is already the largest possible power of p for a subgroup of G, hence P1 = Pj. Thus the action of P1 on the set S of Sylow p-subgroups has exactly one fixed point. By the counting principle above for p-groups, #(S) is congruent to 1, mod p.
Now let G act on S by conjugation. The G- orbit of Pj contains the P1 orbit of Pj. Thus the G orbits are unions of P1 orbits, and all the P1 orbits except {P1}, have order divisible by p. So the G orbit containing P1 has order congruent to 1 mod p, while the others are divisible by p. But the normalizer of any Pj in G contains Pj. The order of the G orbit of Pj equals the index of that normalizer, hence divides m, so cannot be divisible by p. Thus there is only one G orbit, i.e. all Pj are conjugate. Since the order of each orbit divides m, and there is only one orbit, #(S) divides m. QED.
Cor: A group G of order 24 cannot be simple.
proof: 24 = 2^3·3, so the number k of subgroups of order 8 divides 3, hence k = 1 or 3. If k = 1, the unique subgroup of order 8 is normal; if k = 3, we get a transitive action of G by conjugation on the 3 subgroups, hence a homomorphism G-->S(3), which must have a non trivial kernel, since #S(3) = 6 < 24 = #(G).

Cor: Every simple group G of order 60 is isomorphic to A(5).
proof: First we want to find a non constant homomorphism G-->S(5).
Since 60 = 2^2·3·5, by Sylow there are 1, 3, 5, or 15 sylow 4-subgroups; 1, 4, or 10 sylow 3-subgroups; and 1 or 6 sylow 5-subgroups. G is simple, hence has trivial center, so cannot have a conjugacy class of one non identity element, and a transitive action on a set of 1 < n < 5 elements would give a non constant homomorphism to a group S(n) of order less than 60, whose kernel would be a proper non trivial normal subgroup. So there are either 5 or 15 subgroups of order 4; 10 of order 3; and 6 of order 5. This gives 20 elements of order 3, and 24 elements of order 5. So we focus on the groups of order 4.
If there are 5 of them, since G acts transitively on them by conjugation, we have our non constant map G-->S(5). If there are 15, they cannot all intersect trivially, since there are only 15 elements left for the union of all the 4-subgroups. Hence some pair of distinct 4-subgroups contains a common element x, necessarily of order 2.
Then the normalizer N(x) of x is a subgroup which contains two distinct sylow subgroups of order 4. Thus #(N(x)) = 4n for some n > 1, and #(N(x)) divides 60. Hence #(N(x)) = 12, 20, or 60, so the index of N(x), i.e. the order of the class of elements conjugate to x, is at most 5. As above, that index cannot be 1 or 3 without contradicting simplicity, so the class has order 5, G acts transitively on it by conjugation, and again we have our non constant map π:G-->S(5).
The map π is injective since the kernel is a normal subgroup smaller than G. Moreover if S(5)-->{±1} is the "sign map" (discussed below), the composition G-->S(5)-->{±1} must have non trivial kernel in G. Since the only non trivial normal subgroup of G is G itself, the image of the map G-->S(5) lies in A(5) = kernel(sign map). Hence G ≅ A(5). QED.
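The Sylow arithmetic used at the start of this proof is mechanical; a small Python helper (hypothetical name sylow_counts) listing the counts allowed by part 3 of Sylow's theorem:

```python
def sylow_counts(order, p):
    """Counts allowed by Sylow 3): divisors of m = order/p^r congruent to 1 mod p."""
    m = order
    while m % p == 0:
        m //= p
    return [d for d in range(1, m + 1) if m % d == 0 and d % p == 1]

# the possibilities for a group of order 60, as used in the proof
assert sylow_counts(60, 2) == [1, 3, 5, 15]
assert sylow_counts(60, 3) == [1, 4, 10]
assert sylow_counts(60, 5) == [1, 6]
```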

Challenge: Consider groups of order 168. Try to find a simple one, and prove it is unique. Then prove there are no other simple groups of order < 168, or even < 360, (except abelian ones of prime order).

Exercise: Extend Sylow's theorem, by showing the following:
i) If p is prime and p^s divides #G, then G has a subgroup of order p^s.
[hint: It suffices to look inside a sylow p-subgroup. Prove the center of a p-group is always non trivial by looking at conjugacy classes. I.e. elements of the center define conjugacy classes with only one element. All non trivial conjugacy classes have order divisible by p. So how many singleton classes must exist? Then you can mod out by the center and use induction.]
ii) If G has a subgroup H of order p^s where p is prime, prove H is contained in a sylow p-subgroup. [hint: the proof we gave above for the number of sylow groups showed that when a p-group acts on all sylow p-subgroups by conjugation, it must be contained in any fixed subgroup.]

Decomposing groups as "products" of subgroups.
Direct products:
Now that we have a good supply of subgroups in any group G, we ask when G decomposes as a product of some of these subgroups. We define a direct product of groups exactly as before:
Def. H x K = {all pairs (h,k) with h in H, k in K} and (x,y)(h,k) = (xh,yk).

Non abelian products only have half the mapping properties of abelian ones:
Lemma: The projections HxK-->H and HxK-->K are homomorphisms, and if f:G-->H and g:G-->K are any two homomorphisms, there is a unique homomorphism G-->HxK, whose compositions G-->HxK-->H, and G-->HxK-->K equal f and g. proof: exercise.
 
  • #304
day 4.2 algebra, classifying small groups using semi direct products

This does not help us to decompose G, because if H,K are subgroups of G, we only have inclusion maps H-->G and K-->G. In the non abelian case, these do not define a map H x K-->G. This is why it is harder to decompose G as a product. The image of such a map would be the set of products of elements of H and K, but these products usually do not even define a subgroup of G unless at least one of H or K is normal.

Exercise: If H,K are subgroups of G and H lies in the normalizer of K, then HK is a subgroup of G, and HK/K ≅ H/(H∩K).

To define a map out of a product we need some commutativity. We identify H, K with the subgroups H x {e}, and {e} x K in H x K. Then H and K intersect only in {e} = {(eH,eK)}, and every element of H commutes with every element of K, i.e. (h,e)(e,k) = (h,k) = (e,k)(h,e). Thus both H and K are normal subgroups of H x K. Conversely, if normal subgroups H, K of a group G intersect only in {e}, they commute with each other, since for x in H, y in K, the element x(yx^-1y^-1) = (xyx^-1)y^-1 belongs to both H and K. Hence xyx^-1y^-1 = e, so xy = yx.

This much commutativity is enough to define a map out of a product.
Proposition: If f:H-->G and g:K-->G are group maps then f(H) and g(K) are subgroups of G. If the elements of these image subgroups commute with each other, i.e. if f(x)g(y) = g(y)f(x) for every x in H, y in K, then the map
(f x g):H x K-->G with (f x g)(s,t) = f(s)g(t) is a homomorphism whose restrictions to H,K are f, g respectively.
proof: With this definition, (fxg)(u,v).(fxg)(s,t) = f(u)g(v)f(s)g(t) = f(u)f(s)g(v)g(t) = f(us)g(vt) = (fxg)(us,vt) = (fxg)((u,v).(s,t)). QED.

Cor: If H,K are normal subgroups of G and H∩K = {e}, there is an injective homomorphism HxK-->G sending (h,k) to hk, whose image is HK.
proof: We have just proved the image groups H,K commute, so this is a homomorphism. If hk = e, then h^-1 = k, which belongs to both H and K, hence k = e = h, proving injectivity. The image is obviously HK. QED.

Cor: If H,K are normal in G, HK = G and H∩K = {e}, then G ≅ HxK.

Examples: A group of order 15 has sylow subgroups H,K of orders 3,5, which are unique, since 1 is the only factor of 5 congruent to 1 mod 3, and also the only factor of 3 congruent to 1 mod 5. Thus both H, K are normal, intersect only in {e}, so G ≅ Z/3 x Z/5 ≅ Z/(15). QED.

This example generalizes as follows. If #G = pq, with p,q primes, the sylow subgroups H, K have orders p, q. If p > q, the number of sylow p-subgroups divides q and has the form 1, p+1, 2p+1, ..., hence equals 1, since p > q. So the sylow subgroup of the larger prime is always normal. The number of q-sylow subgroups has the form nq+1 and divides p; since p is prime it equals 1 or p. If it equals p, then nq = p-1, so q divides p-1. Thus we have:

Proposition: If #G = pq where p > q are primes, and q is not a factor of p-1, then G is cyclic, G ≅ Z/(pq).
proof: As above, both sylow subgroups are normal, so G ≅ Z/p x Z/q ≅ Z/(pq). QED.

E.g., all groups of orders 15, 35, 65, 77, 85, 91, 95, 133, 139, ... are cyclic.
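A sketch (my code; the is_prime helper is mine) regenerating such a list of orders pq with p > q primes and q not dividing p-1:

```python
def is_prime(n):
    return n > 1 and all(n % d for d in range(2, int(n**0.5) + 1))

orders = []
for n in range(2, 140):
    f = [p for p in range(2, n) if n % p == 0 and is_prime(p)]
    # n = q*p with q < p distinct primes, and q not dividing p-1
    if len(f) == 2 and f[0] * f[1] == n and (f[1] - 1) % f[0] != 0:
        orders.append(n)
# orders begins [15, 33, 35, 51, 65, ...] and includes 77, 85, 91, 95, 133
```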

What about groups of order p^2?
Proposition: All groups of order p^2 are abelian. Hence there are exactly 2 of them, Z/p^2 and Z/p x Z/p.
proof:

Lemma: A p - group always has a non trivial center.
proof: This uses the orbit formula in the following form: If N(x) = {y: yx=xy} = the normalizer of x, then N(x) is a subgroup, and its index is the order of the conjugacy class of x. Hence #G = sum over one element x from each conjugacy class, of the indices of the N(x). In particular, since an element is in the center Z(G) if and only if its normalizer is G with index 1, we have:

The class equation: #G = #Z(G) + Σ Index(N(x)), summed over one x from each non trivial conjugacy class.

proof of lemma: For a p-group G, these non trivial indices are all positive powers of p, as is #G; hence #Z(G) = #G - Σ(indices) is divisible by p, so the center contains more than just {e}. QED lemma.

proof of proposition:
If x is any element of any group, the normalizer of x always contains both x and the center Z(G). If x is in the center then N(x) = G. If not, then N(x) is strictly larger than Z(G). Since in a p-group #Z(G) is at least p, for every x in a group of order p^2, #N(x) is at least p^2. But that means for every x, N(x) = G. Hence every x is in Z(G). QED Prop.
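To see the class equation in action, here is a Python sketch (my encoding of D4 as vertex permutations) computing the conjugacy class sizes of the non abelian group D4 of order 2^3; its center has order 2, consistent with the lemma:

```python
from itertools import product

def mul(p, q):
    return tuple(p[q[i]] for i in range(4))

def inv(p):
    q = [0] * 4
    for i, v in enumerate(p):
        q[v] = i
    return tuple(q)

r, s = (1, 2, 3, 0), (0, 3, 2, 1)   # rotation by 90 degrees, a reflection
G = {(0, 1, 2, 3), r, s}
changed = True
while changed:                       # close the generators under products
    changed = False
    for a, b in list(product(G, G)):
        if (p := mul(a, b)) not in G:
            G.add(p)
            changed = True

classes = {frozenset(mul(mul(g, x), inv(g)) for g in G) for x in G}
sizes = sorted(len(c) for c in classes)
# class equation 8 = 1 + 1 + 2 + 2 + 2, so #Z(G) = 2
assert len(G) == 8 and sizes == [1, 1, 2, 2, 2]
```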

We now know all groups of order 4, 9, 25, 49, 121,..., and may ask about groups of order pq where p > q and q is a factor of p-1, like #G = 6, or 21, or 2p, for p odd. As above, these are the cases where only one of the two sylow subgroups need be normal. So what happens in that case? I.e. how does the "product" group HK look then? We need another tool.

Semi - direct products
If H,K are subgroups of G and only K is normal, the products kh still form a subgroup KH, but the multiplication is more complicated. If we understand H and K, we need to know how to multiply products of form (xs)(yt) where x,y are in K, s,t are in H. If s,y did commute, then (xs)(yt) would equal xyst, but s and y may not commute; the extent to which they do not commute is given by conjugation. Thus sy may not equal ys, i.e. sys^-1 may not equal y, but it does equal c_s(y) where c_s:K-->K is conjugation by s.
I.e. if we know the automorphism c_s:K-->K, then sys^-1 = c_s(y), so sy = c_s(y)s. Thus xsyt = x(sy)t = x(c_s(y)s)t = (x.c_s(y))(st). Thus if c:H-->Aut(K) is the homomorphism taking each s to c_s = conjugation by s, the product (xs)(yt) is given by (x.c_s(y))(s.t). This tells us how to define a twisted product, called the semi direct product of K and H, with twisting given by a homomorphism c:H-->Aut(K).

Defn: Let H,K be groups and let c:H-->Aut(K) be a homomorphism. Then define multiplication on the cartesian product KxH by setting (x,s).(y,t) = (x.c_s(y), st). Denote the resulting semi direct product by KxcH.

Exercise: With definitions as above, prove:
(i) The semi direct product KxcH is a group.
(ii) The subsets K' = {(k,e) for all k in K}, and H' = {(e,h) for all h in H} are subgroups of G isomorphic to K, H respectively, and K' is normal.
(iii) The action of H on K via c becomes the conjugation action of H' on K', i.e. if k' = (k,e), h' = (e,h), then h'k'h'^-1 = (c(h)(k), e) = (c(h)(k))'.
(iv) H' is normal in KxcH if and only if c is the trivial homomorphism.
(v) If H, K are subgroups of a group G, K is normal, and we define c:H-->Aut(K) to be conjugation of K by H, then setting f(k,h) = kh defines a homomorphism f:KxcH-->G, which is surjective if G = KH, and injective if K∩H = {e}.
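The definition of KxcH is concrete enough to compute with directly; a minimal Python sketch (my encoding) of K = Z/3, H = Z/2, with c sending the generator of H to inversion on K, which produces the non abelian group D3:

```python
from itertools import product

# K = Z/3 (as 0,1,2), H = Z/2 (as 0,1); c(1) is inversion k -> -k on K
def c(h, k):
    return k if h == 0 else (-k) % 3

def mul(a, b):
    (x, s), (y, t) = a, b
    return ((x + c(s, y)) % 3, (s + t) % 2)   # (x,s).(y,t) = (x.c_s(y), st)

G = list(product(range(3), range(2)))
assert len(G) == 6
# non abelian: some pair fails to commute, so this is D3, not Z/6
assert any(mul(a, b) != mul(b, a) for a in G for b in G)
```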

Proposition: If p is an odd prime, there is exactly one non abelian group of order 2p, the dihedral group Dp.
proof: The subgroup K of order p is normal, so we have an isomorphism G ≅ (Z/p) xc (Z/2), where c:Z/2-->Aut(Z/p) is a non trivial homomorphism. Since Aut(Z/p) ≅ (Z/p)* ≅ Z/(p-1), there is only one element of order 2 in Aut(Z/p), hence only one non trivial map c, hence one non abelian group. Since Dp is non abelian of order 2p, this is it. QED.
This classifies all groups of orders 6, 10, 14.

Next we show homomorphisms c:H-->Aut(K) that differ by an automorphism of H define isomorphic semi direct products.
Proposition: Let H, K be groups, c:H-->Aut(K) a homomorphism, g:H-->H an automorphism of H, and define c':H-->Aut(K) by c' = c∘g^-1. Then the map f:KxcH-->Kxc'H defined by f(k,h) = (k,g(h)) is an isomorphism.
Proof: f is a bijective function, with inverse f^-1(k,h) = (k,g^-1(h)), so we check the homomorphism property. If (k,h), (k1,h1) are in KxcH, their product is (k,h).(k1,h1) = (k.c(h)(k1), hh1), whose image is f(k.c(h)(k1), hh1) = (k.c(h)(k1), g(hh1)).
On the other hand the images of (k,h) and (k1,h1) are f(k,h) = (k,g(h)) and f(k1,h1) = (k1, g(h1)), hence the product of the images is (k,g(h)).(k1, g(h1)) = (k.c'(g(h))(k1), g(h)g(h1)). Since c'∘g = c, and g is a homomorphism, indeed f((k,h).(k1,h1)) = (k.c(h)(k1), g(hh1)) = (k.c'(g(h))(k1), g(h)g(h1)) = f(k,h).f(k1,h1).
QED.

Exercise: i) If p-1 = mq, there are exactly q-1 non constant maps
c:(Z/q)-->Z/(p-1), taking [1] to some multiple of [m].
ii) Aut(Z/p) ≅ Z/(p-1).
iii) If p-1 = mq, all non constant maps c:Z/q-->Aut(Z/p) define isomorphic semi direct products (Z/p) xc (Z/q).
iv) If p-1 = mq, there is exactly one non abelian group of order pq.

Classifying groups whose order has more than 2 factors is more work.
Theorem: There are exactly 2 non abelian groups of order 8, up to isomorphism, Hamilton's unit quaternions, and D4 = Isom(square).
Proof: #G = 8 = 2^3, and G is not cyclic, so all elements have order 1, 2, or 4.

Lemma: Two elements x,y of order 2 in a group, commute if and only if their product has order 2.
proof: If xy has order 2, then (xy)(xy) = e, so xy = (xy)^-1 = y^-1x^-1 = yx, since x,y have order 2. The other direction is even easier. QED.

Hence, since a group all of whose elements have order 1 or 2 is abelian by the lemma, a non abelian G has elements of all orders 1, 2, and 4.
case 1) Assume there is only one element of order 2, hence 6 elements of order 4. Then let x be an element of order 4, and y another element of order 4, with y different from both x and x^-1. The subgroup <x> has index 2, hence is normal. Since G = <x>.<y>, and <x> ≅ <y> ≅ Z/4, G must be the image of a surjective map from a non trivial semidirect product Z/4 xc Z/4, defined by a non constant homomorphism
c:Z/4-->Aut(Z/4) ≅ Z/2. There is only one such map, hence only one such non trivial s.d.p. Z/4 xc Z/4.
Now for the map Z/4 xc Z/4-->G. It is multiplication (or exponentiation in our notation), hence maps {0}xZ/4--><y> isomorphically ([0,n]-->y^n), and maps Z/4x{0}--><x> isomorphically ([n,0]-->x^n). Since there is only one element of order 2 in G, the elements x^2 and y^2 are the same, so the element [2,2] of Z/4 xc Z/4 must be the unique non trivial element of the kernel. Hence G ≅ [Z/4 xc Z/4]/{(0,0),(2,2)} is also uniquely determined. So there is only one non abelian group of order 8 with a unique element of order 2. Note that Hamilton's quaternions do satisfy this description, hence this is the quaternion group.
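One can verify by machine that Hamilton's quaternions fit case 1; a Python sketch (my encoding of the units ±1, ±i, ±j, ±k as 2x2 complex matrices):

```python
# quaternion units as 2x2 complex matrices: i -> diag(i,-i), j -> [[0,1],[-1,0]]
def mul(A, B):
    return tuple(tuple(sum(A[r][k] * B[k][c] for k in range(2))
                       for c in range(2)) for r in range(2))

def neg(A):
    return tuple(tuple(-x for x in row) for row in A)

I2 = ((1, 0), (0, 1))
qi = ((1j, 0), (0, -1j))
qj = ((0, 1), (-1, 0))
qk = mul(qi, qj)
Q8 = [I2, neg(I2), qi, neg(qi), qj, neg(qj), qk, neg(qk)]

order2 = [A for A in Q8 if A != I2 and mul(A, A) == I2]
assert len(order2) == 1          # the unique element of order 2 is -1
order4 = [A for A in Q8 if mul(A, A) == neg(I2)]
assert len(order4) == 6          # six elements of order 4: ±i, ±j, ±k
```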

case 2) Assume there is more than one element of order 2. There are still some elements of order 4, so let x have order 4; then x^2 is the unique element of order 2 in the subgroup <x>. Then choose another element of order 2, say y, different from x^2. Then <x> is normal and the subgroup <x>.<y> = G, so G ≅ <x> xc <y> ≅ (Z/4)xc(Z/2), defined by the unique non trivial map c:Z/2-->Aut(Z/4). So there is only one non abelian group of order 8 with more than one element of order 2, which must be D4 = Isom(square).

Theorem: There are 3 non abelian groups of order 12, up to isomorphism.
proof: #G = 12 = 2^2·3, so there are sylow subgroups H,K of orders 3,4. If there are 4 subgroups of order 3, hence 8 elements of order 3, there are only 4 elements left to form one group of order 4, so the sylow 4-subgroup is unique and normal. Hence at least one of the sylow subgroups is normal. If both sylow subgroups H,K are normal, G ≅ HxK, hence G is abelian. So if G is non abelian, only one sylow subgroup is normal.
Since HK = G, we have in all cases an isomorphism KxcH-->G where c:H-->Aut(K) is a non constant homomorphism. (The constant homomorphism defines the trivial direct product, which is abelian.) If the 4-subgroup is normal, we have c:Z/3-->Aut(K), where K is either Z/4 or Z/2 x Z/2. Since the only homomorphism Z/3-->Aut(Z/4) ≅ Z/2 is constant, K must be Z/2 x Z/2. Then Aut(Z/2 x Z/2) ≅ S(3) has 2 elements of order 3, so there are two non constant maps c:(Z/3)-->Aut(K). Since one can show that Aut(Z/3) acts on the set of the resulting semi direct products by isomorphisms, and since Aut(Z/3) ≅ Z/2, the two non constant maps Z/3-->S(3) yield isomorphic groups KxcH.
Thus there is only one non abelian group G ≅ (Z/2 x Z/2) xc (Z/3) of order 12 with normal sylow 4-group. In fact the group Tet = Isom(tetrahedron) has order 12, and 4 distinct subgroups of order 3, so must be this group. The action on the 4 vertices also embeds this group as A(4) in S(4), since that subgroup is generated in S(4) by the 8 elements of order 3.
If K = Z/3 is the normal subgroup, and H is the sylow 4-subgroup, we have a map H-->Aut(K) ≅ Aut(Z/3) = {±Id} ≅ Z/2. If H ≅ Z/4 there is only one non trivial map, taking [1] to -Id. So there is only one non abelian group of order 12 with Z/3 as normal subgroup and having a subgroup isomorphic to Z/4, i.e. one non trivial semi direct product (Z/3) xc (Z/4).
I have not run across this group in geometry.
If K = Z/3 is the normal subgroup, and H ≅ Z/2 x Z/2 is the sylow 4-subgroup, then c:(Z/2 x Z/2)-->Aut(Z/3) = (Z/3)* ≅ Z/2, so there are three non constant maps, each taking two of the vectors (1,0), (0,1), (1,1) to 1, and taking the other vector to 0. But again Aut(Z/2 x Z/2) ≅ S(3) acts transitively on these maps. Hence all three resulting semi direct products are isomorphic, so there is really only one non abelian semi direct product of form (Z/3) xc (Z/2 x Z/2). Since the dihedral group D6 = Isom(hexagon) has order 12, seven elements of order 2, two elements of order 6, and two elements of order 3, it must be this group.
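The element-order counts quoted for D6 can be checked directly; a Python sketch (my encoding of D6 as permutations of the hexagon's vertices):

```python
def mul(p, q):
    return tuple(p[q[i]] for i in range(6))

def order(p):
    e, q, n = tuple(range(6)), p, 1
    while q != e:
        q, n = mul(p, q), n + 1
    return n

r = (1, 2, 3, 4, 5, 0)                     # rotation by 60 degrees
s = tuple((6 - i) % 6 for i in range(6))   # a reflection
G = {tuple(range(6))}
frontier = [r, s]
while frontier:                            # generate the group <r, s>
    g = frontier.pop()
    if g not in G:
        G.add(g)
        frontier += [mul(g, r), mul(g, s)]

counts = {}
for p in G:
    counts[order(p)] = counts.get(order(p), 0) + 1
assert counts == {1: 1, 2: 7, 3: 2, 6: 2}  # as described in the text
```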
 
  • #305
day 4.3 algebra, normal and composition series for groups

Theorem:
G is solvable iff all subgroups and quotient groups of G are solvable iff there is one normal subgroup K such that both K and G/K are solvable.
proof:
I. Assume K is normal in G, and that both K and G/K are solvable. Thus we have normal series K = K0 > K1 > ... > Kn = {e}, and G/K = H0 > H1 > ... > Hm = {[e]}, and all quotients Ki/Ki+1 and Hj/Hj+1 are abelian. Then define a normal series for G by augmenting that for K by the pull back of that for G/K. I.e. let Gj = f^-1(Hj) where f:G-->G/K is the natural projection. Since the inverse image of a normal group is also normal, all Gj are normal. Hence G = G0 > G1 > ... > Gm = K = K0 > ... > Kn = {e} is a normal series for G. The Ki/Ki+1 are still abelian, Gm/K0 = {e} is abelian, and for j < m, we have Gj/Gj+1 ≅ (Gj/K)/(Gj+1/K) ≅ Hj/Hj+1, which is abelian. That proves G solvable.
II. Next assume G solvable with abelian normal series G = G0 > G1 > ... > Gm = {e}, and let H be any subgroup. Define Hi = H∩Gi. Then Hi+1 is not necessarily normal in G, but it is normal in Hi. I.e. conjugating an element y of Hi+1 = H∩Gi+1 by an element x of Hi = H∩Gi is conjugating by an element of Gi, and Gi+1 is normal in Gi. Hence xyx^-1 lies in Gi+1. But x also lies in H, as does y, and H is a subgroup, so xyx^-1 also lies in H. I.e. for all x in Hi and y in Hi+1, xyx^-1 lies in H∩Gi+1 = Hi+1.
Now Hi/Hi+1 = (H∩Gi)/(H∩Gi+1), so if we map H∩Gi into Gi/Gi+1, the kernel is precisely H∩Gi+1. Hence Hi/Hi+1 is isomorphic to a subgroup of Gi/Gi+1, hence is also abelian. Thus H is solvable.
III. Assume G is solvable with abelian normal series G = G0 > G1 > ... > Gm = {e}, K is normal in G, and consider G/K. Define Hi = (KGi)/K. Since the classes of elements of K are trivial, each class in Hi can be represented as [x] for some x in Gi, and similarly each class in Hi+1 can be represented as [y] for some y in Gi+1. Thus [x][y][x]^-1 = [xyx^-1] is in Hi+1, since xyx^-1 is in Gi+1. Hence Hi+1 is normal in Hi.
Now consider Hi/Hi+1 = (KGi/K)/(KGi+1/K) ≅ (KGi)/(KGi+1). Then map Gi-->KGi-->(KGi)/(KGi+1). The composition map f is onto, since again every class [y] in the quotient can be represented by an element y of Gi. Then since the subgroup Gi+1 of Gi goes to the identity under this composition, there is an induced map [f]:(Gi/Gi+1)-->(Hi/Hi+1), which is still surjective. Since Hi/Hi+1 is thus isomorphic to a quotient of the abelian group Gi/Gi+1, modded out by the kernel of [f], the quotient Hi/Hi+1 is also abelian. QED.
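Solvability can be illustrated concretely on S(4): the following Python sketch (my example; derived computes the subgroup generated by all commutators aba^-1b^-1) produces the series S(4) > A(4) > V > {e} with abelian quotients, so S(4) is solvable:

```python
from itertools import permutations

def mul(p, q):
    return tuple(p[q[i]] for i in range(4))

def inv(p):
    q = [0] * 4
    for i, v in enumerate(p):
        q[v] = i
    return tuple(q)

def derived(G):
    """Subgroup generated by all commutators a.b.a^-1.b^-1 of G."""
    H = {mul(mul(a, b), mul(inv(a), inv(b))) for a in G for b in G}
    changed = True
    while changed:                 # close under multiplication
        changed = False
        for a in list(H):
            for b in list(H):
                if (p := mul(a, b)) not in H:
                    H.add(p)
                    changed = True
    return H

G = set(permutations(range(4)))    # S(4), order 24
sizes = []
while len(G) > 1:
    G = derived(G)
    sizes.append(len(G))
assert sizes == [12, 4, 1]         # S(4) > A(4) > V > {e}
```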

Composition series
Notice that if G is a cyclic group Z/(mn) of order mn, then G has a cyclic subgroup generated by [m] of order n, whose quotient is cyclic of order m. Hence a cyclic group of order n = p1^r1...pk^rk has a maximal, non redundant, normal series whose quotients are of prime order, equal to the prime factors of n. Thus every maximal non redundant normal series for G has the same quotients, up to isomorphism, but possibly in a different order. That this also holds for non abelian groups is called the Jordan-Hölder theorem.

Definition: A composition series for a group G is a normal series G = G0 > G1 > ... > Gm = {e}, in which every quotient group Gi/Gi+1 is simple but not trivial (thus a maximal, non redundant, normal series).

Theorem: (Jordan-Hölder) If a finite group G has two composition series, then they have the same length, and after some renumbering of the quotients, the two sequences of simple quotients are the same, up to isomorphism.
proof: By induction on the order of G, prime order being trivial. Let
G > G1 > ...>Gm = {e}, and G > H1 > ...>Hn = {e}, be composition series for G.
case I. G1 = H1. Then we are done by induction, since the groups G1 = H1 have smaller order, so their composition series are the same length, and have isomorphic quotients, in some order.
case II. G1 and H1 are different. Then both G1, H1 are maximal proper normal subgroups of G, so their product G1H1 is normal and larger than either, hence G1H1 = G. We want to construct another composition series for G to reduce to case I. Look at G1∩H1. This is not equal to either G1 or H1, and is normal in both, so call it K2, and construct a composition series K2 > K3 > ... > Ks for it.
Then we have two new composition series for G: G > G1 > K2 > ... > Ks, and G > H1 > K2 > ... > Ks. To check this, we only have to show that both G1/K2 and H1/K2 are simple and non trivial. But G1/K2 = G1/(G1∩H1) ≅ G1H1/H1 = G/H1 is simple. Same for H1/K2.
Now case I tells us that m = s, and the composition series
(A) G > G1 > ...>Gm and (B) G > G1 > K2 > ...> Ks, have isomorphic quotients. Also n = s, and the series (C) G > H1 > K2 > ...> Ks, and (D) G > H1 > ...>Hn have isomorphic quotients. Since G1/K2 ? G/H1 and H1/K2 ? G/G1, we see series (B) and (C) also have isomorphic quotients. Hence the same holds for series (A) and (D), as desired. QED.

Corollary: A group G is solvable if and only if in every composition series for G, all the simple quotients are cyclic of prime order. [Necessarily the orders of the quotients form the sequence of primes in the prime factorization of #G.]

Corollary: Prime factorization of integers is unique, up to order of factors.
proof: A prime factorization of n gives a composition series for Z/n. QED.
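In code, the composition factors of Z/n are just the prime factorization; a trivial Python sketch (my helper name composition_factors):

```python
def composition_factors(n):
    """Simple quotients Z/p along a composition series of the cyclic group Z/n."""
    factors, d = [], 2
    while n > 1:
        while n % d == 0:
            factors.append(d)   # one quotient Z/d at this step
            n //= d
        d += 1
    return factors

assert composition_factors(360) == [2, 2, 2, 3, 3, 5]
assert composition_factors(60) == [2, 2, 3, 5]
```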

Free groups and free products of groups.
We noted that given two maps g:G-->K, h:H-->K of groups, setting (gh)(x,y) = g(x)h(y) may not define a map GxH-->K, since elements of G commute with elements of H in GxH, but their images in K may not commute. Since we have no restriction on the elements of K, in order to construct a group from which the maps g,h always define a map to K, we must allow no commutativity in our "product" at all. Let G = H = Z, the simplest case. Call a the generator of the first copy of Z, and b the generator of the second copy. Since the only relations we allow are those forced by the group laws, we must allow ababab and a^2bab^-3ab, and so on, all to be different elements. So we define a "word" constructed from the letters a,b to be any finite sequence of powers of a and b, e.g. a^r1 b^s1 a^r2 b^s2 ... The exponents can be any integers. The sequence where all powers are zero is the identity element, and words are multiplied by juxtaposition. When we juxtapose two words, the powers of a and b may not alternate, so we combine adjacent powers of the same letter. The trivial word has only zero exponents. A non trivial word is reduced if it has no adjacent powers of the same letter and no zero exponents. We also consider the trivial word to be reduced and write it as e.
Clearly inverses exist, and the trivial word is the identity since e = a^0 = b^0. Associativity is not so easy, but there is a nice proof in Mike Artin's book, which I copy.
Artin calls a word a finite sequence of the elements a, b, a^-1, b^-1, and a reduction of a word is obtained by a cancellation of some adjacent pair of form xx^-1, or by a sequence of such cancellations. A reduced word is one in which no cancellations are possible. The main point is that starting from any word and performing cancellations, there is only one possible reduced result. This is true if the word is already reduced, for example if it has length zero or one, so use induction on the length of the word. If a word is not reduced it must contain a pair of form xx^-1. If we cancel this pair first, the induction hypothesis says there is only one possible reduced result for this word. If we perform some other sequence of cancellations, and eventually cancel this pair, we might as well have canceled it first, and the same result holds. If on the other hand we cancel one of these elements but not both, we must do it as follows: by cancelling the first two of x^-1 x x^-1, or the last two of x x^-1 x. Either way, the result is the same as if we had canceled the original pair, so the argument in that case holds. QED.
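To experiment with Artin's cancellation argument, here is a minimal Python sketch (my illustration, not part of the notes): encode a word in Fr(a,b) as a string over 'a', 'b', 'A', 'B', where 'A' stands for a^-1 and 'B' for b^-1. A single left-to-right pass with a stack performs all cancellations, and by the uniqueness just proved the result is the unique reduced form, no matter in what order cancellations might have been performed. Juxtaposing and then reducing implements the group law.

```python
def reduce_word(w):
    """Cancel all adjacent inverse pairs in one stack pass.
    Letters: 'a', 'b'; inverses: 'A', 'B' (case encodes inversion)."""
    stack = []
    for ch in w:
        if stack and stack[-1] == ch.swapcase():
            stack.pop()       # adjacent pair x x^-1 cancels
        else:
            stack.append(ch)
    return ''.join(stack)     # the unique reduced form

def multiply(u, v):
    """Group law in Fr(a,b): juxtapose, then reduce."""
    return reduce_word(u + v)

# the order of cancellations does not matter: however 'aBbAa' is
# cancelled, the reduced word is 'a'
print(reduce_word("aBbAa"))   # 'a'
print(multiply("ab", "BA"))   # '' , the identity e
```

The stack invariant (the stack contents are always a reduced word) is exactly the uniqueness statement in miniature.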

Definition: The set of reduced words in the letters {a,b}, with the operation of juxtaposition followed by reduction, is the free group on those two letters. The empty word is the identity, written e = a^0 = b^0. We shorten the notation by using higher exponents for adjacent occurrences of the same letter.

Exercise: Associativity follows easily from the (obvious) associativity on the set of unreduced words, by the uniqueness result above for reduction.

Definition: The free product of two copies of Z is defined as the free group Fr(a,b) on two letters. It is easy to see that any two group maps f:Z-->K and g:Z-->K define a unique group map (fxg):Fr(a,b)-->K.

There is one plausible result that is not too hard.
Theorem: If G = Fr(a,b) is the free group on two letters, and G' is the commutator subgroup, the quotient G/G' ≅ Z x Z, the free abelian group on two letters. [hint: prove they have the same mapping property.]

Remark: The same construction, with the same proof, defines a free group on any set of letters, and proves the existence of a free product of any number of copies of the group Z. It follows that every group G is the image of a homomorphism from a free group Fr(S)-->G, but it is hard to make much use of this. I.e. unlike the abelian case, the free groups, even on two letters, are very complicated and hard to understand.

Theorem: 1) Every free group on any finite or countable number of letters, is a subgroup of Fr(a,b).
2) Conversely, every subgroup of Fr(a,b) is isomorphic to a free group on a finite or countable set of letters. [look at π1(figure eight).]
"proof": If X(n) = the "figure eight" with n loops, then X(n), n >= 2, is a covering space of X(2), and π1(X(n)) ≅ Fr(a1,...,an), same for X(infinity). This proves 1), by the homotopy lifting property. For 2), given any subgroup of π1(X(2)), it defines a covering space Y whose π1 is that subgroup. But the figure eight is a graph, and every covering space of a graph is again a graph (1 dimensional complex), hence homotopy equivalent to a wedge of circles, so π1 is again free. qed.
 
  • #306
mathwonk said:
jasonrox: i tried to make it clear that a math dept may be interested in very talented person, degree or not, but a grad school will not want to accept that person, and with good reason. you have seized on one phrase in my long statement and taken it out of context. read it all. i am not advising or encouraging anyone to seek entrance to gradschool without degree.

no it is unlikely you can get in and unwise to try.

ill give you one successful example, barry mazur has no undergrad degree. he's the guy andrew wiles sent his manuscript on fermat's last theorem to, to check it. and presumably he was unsure about it, when it was indeed wrong.

but most of the rest of us are not at all like barry. and besides barry had all the academic requirements and time spent in school, he just refused to attend required ROTC. the school (MIT) afterwards seems to have eliminated the requirement, probably as a result of the subsequent embarrassment at having denied barry mazur a degree.

I read the whole thing. I just wanted to check if that's what you said.

I'm aware that people like Barry are not common... at all.
 
  • #307
day 3 algebra, canonical forms of matrices

day 3 was normal forms of matrices and the matrices will undoubtedly not load, but maybe some of the algebra still has interest.

8000 Fall 2006 Day 3.
Canonical forms of matrices, the power of Cramer's rule

Our decomposition theorem gives us a standard model in each isomorphism class of finitely generated torsion k[X] modules. This will be used next to provide a standard matrix representative for each conjugacy class, or similarity class as it is usually called, in the ring Matn(k), of n by n matrices over any field k.
Recall that a linear map T:V--->V on a k vector space V, provides a unique k algebra map k[t]--->Endk(V), sending t to T, and hence f(t) to f(T), and hence a unique k[t] module structure on V. We will denote Endk(V) simply by End(V) in this chapter for brevity, since we will not be concerned with the larger ring of group endomorphisms.
Conversely, a k[t] module structure on V singles out a unique linear map T, the image of t under the map k[t]--->Endk(V). Thus k[t] module structures on V are in natural bijection with the elements of End(V). We want to ask what equivalence relation is imposed in this way on End(V) by considering isomorphism classes of modules.

Note that if f:(V,T)--->(V,S) is a k[t] module isomorphism, then f is a k isomorphism that takes multiplication by (i.e. application of) T into multiplication by S. Thus f(Tv) = S(fv) for every v in V. Since f is an isomorphism this implies Tv = (f^(-1)Sf)v for every v. Hence S and T are conjugate by the isomorphism f.
Conversely, these equations show that if T = f^(-1)Sf, then T and S define isomorphic k[t] modules via the isomorphism f. Thus isomorphism classes of k[t] module structures on V correspond to conjugacy classes of endomorphisms via the action of Aut(V) on End(V).

Hence when V has finite k dimension, our canonical models of each k[t] - isomorphism class translate into canonical representatives of each conjugacy class in End(V). Recall each finitely generated torsion k[t] module (V,T) has a model V ≅ k[t]/f1 x ... x k[t]/fm, where each fi is a monic polynomial in k[t], and fi divides fi+1.
Under the isomorphism (V,T) ≅ k[t]/f1 x ... x k[t]/fm the linear map T:V--->V, i.e. multiplication by T, becomes multiplication by the variable t on each factor of k[t]/f1 x ... x k[t]/fm. Hence if we choose a natural k basis for this model vector space, the resulting matrix for t will give a natural matrix representing T in some corresponding k basis for V.

A k - basis for k[t]/f1 x ... x k[t]/fm can be obtained as the union of bases for each factor space k[t]/fi, and the simplest basis for k[t]/fi is {1, t, t^2, ..., t^(ri-1)}, where fi has degree ri. If f = a0 + a1t + ... + a_(r-1)t^(r-1) + t^r, the matrix of t in this basis is the r by r matrix

[ 0 0 ... 0  -a0      ]
[ 1 0 ... 0  -a1      ]
[ 0 1 ... 0  -a2      ]
[ .................. ]
[ 0 0 ... 1  -a_(r-1) ]

where the jth column is the coefficient vector of t times the jth basis vector. E.g. t(1) = 0(1) + 1(t) + 0(t^2) + ... + 0(t^(r-1)) gives the first column.

This is called a cyclic basis, since the linear map carries each basis vector to the next one, except for the last one, which is carried to a linear combination of the basis by means of scalars which are precisely minus the coefficients of the polynomial f. This is called a companion matrix Cf for f. [Other versions of it in other books may have the coefficients of f along the bottom, and the 1's above the diagonal.] Note that if v1,...,vn is one cyclic basis for (V,T) then for any c != 0, cv1,...,cvn is another, so cyclic bases are never unique.
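As an illustration (not from the notes), the following Python sketch builds Cf from the coefficient list [a0, ..., a_(r-1)] of a monic f and checks that f(Cf) = 0, reflecting the fact that f is both the minimal and the characteristic polynomial of its companion matrix.

```python
def companion(a):
    """Companion matrix of f = a[0] + a[1]t + ... + a[r-1]t^(r-1) + t^r:
    1's just below the diagonal, minus the coefficients in the last column."""
    r = len(a)
    C = [[0] * r for _ in range(r)]
    for j in range(r - 1):
        C[j + 1][j] = 1          # t * t^j = t^(j+1)
    for i in range(r):
        C[i][r - 1] = -a[i]      # t * t^(r-1) = -a0 - a1 t - ...
    return C

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(a, A):
    """Evaluate the monic polynomial with coefficient list a at the matrix A
    (Horner's rule, leading coefficient 1)."""
    n = len(A)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    result = I
    for c in reversed(a):
        result = mat_mul(result, A)
        result = [[result[i][j] + c * I[i][j] for j in range(n)] for i in range(n)]
    return result

# f = 2 - 3t + t^2 = (t-1)(t-2)
C = companion([2, -3])
print(C)                                   # [[0, -2], [1, 3]]
assert poly_of_matrix([2, -3], C) == [[0, 0], [0, 0]]
```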

If f1,...,fm is the sequence of polynomials defining the module (V,T), the full matrix for T using the cyclic bases for each factor is the block diagonal matrix

[ Cf1              ]
[      Cf2         ]
[           ...    ]
[              Cfm ]

where there are zeroes away from the Cfi.

Summarizing, we have the following.
Theorem: If V is a vector space of finite dimension n over a field k, and T is any linear endomorphism of V, there exist bases for V in which the matrix of T is composed of one or more blocks, each block being a companion matrix for a monic k polynomial fi.
The sum of the degrees of the fi equals n, and we may choose them so each fi divides fi+1. If we do this, then two maps S,T of V are conjugate if and only if they have exactly the same matrix of companion blocks. There is exactly one companion matrix block Cf for each factor k[t]/(f) in the standard decomposition of the k[t] module structure for (V,T). The block Cf has dimension deg(f) by deg(f).

Terminology: We call the unique matrix of this type associated to T, the rational canonical matrix for T.
Two natural questions remain:
1) how do we find the canonical form for a given matrix? and
(more difficult):
2) how do we find a basis that puts a given matrix into canonical form?
A third question is:
3) is there a simpler canonical matrix in cases where the polynomials fi are particularly simple, e.g. when they all factor into linear factors over k?

Before addressing these questions, we derive some useful consequences of the results we already have. For example we can already compute the important invariant ∏fi of the module (V,T), using determinants. Briefly, we claim this product is the "characteristic polynomial" of T, ∏fi = det[tI-T] = chT(t). Since fm is the annihilator of the module (V,T), this implies the Cayley Hamilton theorem: chT(T) = 0.

Before proving this, we recall without proof the basic theory of determinants, including Lagrange's formulas for expanding them along any row or column, and the resulting "Cramer's rule".
Review of determinants.
If A = [aij] is an n by n matrix over a commutative ring, denote by Aij the (n-1) by (n-1) matrix obtained from A by deleting the ith row and jth column. Then Lagrange's formulas say, for each fixed value of i, det(A) = Σj (-1)^(i+j) aij det(Aij) (expansion by the ith row), and for each fixed value of j, det(A) = Σi (-1)^(i+j) aij det(Aij) (expansion by the jth column).
Thus if we define adj(A), the adjoint of A, as the matrix whose i,j entry equals (-1)^(i+j) det(Aji), i.e. as the transpose of the matrix of signed determinants of the Aij, it follows that the matrix products adj(A).A = A.adj(A) both equal the diagonal matrix det(A).I, whose entries along the diagonal are all equal to det(A).
Thus if det(A) is a unit in the ring of coefficients, then A is an invertible matrix with inverse equal to (det(A))^(-1).adj(A). Since for any two n by n matrices A, B we always have det(AB) = det(A)det(B), the converse is also true. I.e. AB = I implies det(A)det(B) = det(I) = 1, so both det(A) and det(B) are units. Thus the equation adj(A).A = A.adj(A) = det(A).I yields a formula for the inverse of an invertible A, and hence Cramer's rule for solving invertible systems AX = Y.
Cramer's formula also implies that a matrix and its transpose have the same determinant. I.e. since the transpose of the adjoint is the adjoint of the transpose, taking the transpose of the equation adj(A).A = A.adj(A) = det(A).I gives det(At).I = At.adj(At) = adj(At).At = (det(A).I)t = det(A).I, the last because the diagonal matrix det(A).I is symmetric.
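The identity adj(A).A = A.adj(A) = det(A).I can be checked mechanically. Here is an illustrative Python sketch (not part of the notes) computing det by Lagrange expansion along the first row, and adj as the transpose of the signed minors, over the integers.

```python
def minor(A, i, j):
    """Delete row i and column j."""
    return [row[:j] + row[j+1:] for k, row in enumerate(A) if k != i]

def det(A):
    """Lagrange expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det(minor(A, 0, j)) for j in range(len(A)))

def adj(A):
    """Transpose of the signed minors: adj(A)[i][j] = (-1)^(i+j) det(Aji)."""
    n = len(A)
    return [[(-1) ** (i + j) * det(minor(A, j, i)) for j in range(n)]
            for i in range(n)]

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 3], [0, 1, 4], [5, 6, 0]]
d = det(A)
dI = [[d * int(i == j) for j in range(3)] for i in range(3)]
assert mat_mul(adj(A), A) == dI and mat_mul(A, adj(A)) == dI
```

When det(A) happens to be a unit (here a nonzero integer is a unit only if it is ±1), dividing adj(A) by det(A) gives the inverse, which is Cramer's rule.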

Define: the characteristic polynomial of a linear map T on a finite dimensional space is chT(t) = det([tI-A]), where A is any matrix for T.

By the previous remarks, a matrix A and its transpose At have the same characteristic polynomial.

Note: If A, B are two matrices for T, then A and B are conjugate, i.e. B = C^(-1)AC for some invertible C. Then since det(B) = det(C^(-1)AC) = det(C^(-1))det(A)det(C) = det(A)det(C^(-1))det(C) = det(A), we see A and B have the same determinant. Similarly, [tI-A] and C^(-1)[tI-A]C = [C^(-1)tIC - C^(-1)AC] = [tI-B] have the same determinant, since t.I commutes with every matrix C. Hence the characteristic polynomial of T is well defined by any matrix for T. It is easy to see the constant term of chA(t) is ± det(A), and the coefficient of t^(n-1) is minus the trace of A, (minus the sum of the diagonal entries).

Exercise: If Cf is a companion matrix for the monic polynomial f, then ch(Cf) = f. [hint: use induction and expand across the first row.]
One can see immediately that the trace of Cf is -a_(n-1), where n = deg(f).

Corollary:(Cayley Hamilton) If T is any linear transformation, then chT(T) = 0. In particular a matrix satisfies its characteristic polynomial.
proof: The annihilator ideal of the cyclic module R/I, where I is any ideal of the ring R, equals I. In particular the annihilator ideal of k[t]/(f) is (f). Hence the annihilator of the module k[t]/f1 x ... x k[t]/fm, where fi divides fi+1, is (fm). I.e. the smallest degree monic polynomial f such that f(t) = 0 on this module is fm. If this module represents (V,T), then the minimal polynomial of T is fm, and we just showed the characteristic polynomial of T is the product ∏fi. So the minimal polynomial of T divides its characteristic polynomial, which implies the corollary. QED.

Note: Since every factor fi divides fm, this proof shows that every irreducible factor of chT(t) is an irreducible factor of the minimal polynomial mT(t), (and vice versa). Moreover, for a cyclic or companion matrix, the minimal and characteristic polynomials are equal. This is the analog of the fact that for a cyclic group Z/nZ, the order n of the group equals the annihilator of the group.
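As a concrete check of Cayley Hamilton (an illustration only; the Faddeev-LeVerrier recursion used here is not discussed in these notes), the sketch below computes the coefficients of chA(t) for an integer matrix in exact rational arithmetic and verifies chA(A) = 0.

```python
from fractions import Fraction

def char_poly(A):
    """Faddeev-LeVerrier: returns the ascending coefficient list
    [c0, ..., c_(n-1), 1] of chA(t) = t^n + c_(n-1) t^(n-1) + ... + c0."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    M = [row[:] for row in I]
    coeffs = [Fraction(1)]                  # leading coefficient of t^n
    for k in range(1, n + 1):
        AM = [[sum(A[i][l] * M[l][j] for l in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(AM[i][i] for i in range(n)) / k   # next coefficient
        coeffs.append(c)
        M = [[AM[i][j] + c * I[i][j] for j in range(n)] for i in range(n)]
    coeffs.reverse()                        # now ascending in powers of t
    return coeffs

def eval_poly(coeffs, A):
    """Horner evaluation of sum coeffs[i] t^i at the matrix A."""
    n = len(A)
    I = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    R = [[Fraction(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        R = [[sum(R[i][l] * A[l][j] for l in range(n)) + c * I[i][j]
              for j in range(n)] for i in range(n)]
    return R

A = [[1, 2], [3, 4]]
p = char_poly(A)
print(p)                                    # chA(t) = t^2 - 5t - 2
assert eval_poly(p, A) == [[0, 0], [0, 0]]  # Cayley Hamilton
```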

Example: A nilpotent matrix A is a square matrix such that A^m = 0 for some m. If A is nilpotent, it follows that A^n = 0, where n is the dimension of the matrix A. Since all coefficients ai of the characteristic polynomial of a nilpotent matrix are 0 except the leading one, each fi is a power of t, and the rational canonical form of a nilpotent matrix consists of companion blocks for t^r, i.e. blocks with 1's just below the diagonal and zeroes everywhere else:

[ 0 0 ... 0 0 ]
[ 1 0 ... 0 0 ]
[ 0 1 ... 0 0 ]
[ ........... ]
[ 0 0 ... 1 0 ]

The reader should verify this matrix is nilpotent.
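A quick machine check of this example (illustrative only): the companion block of t^r is the "shift" matrix N sending each basis vector to the next and the last to zero, and indeed N^r = 0 while N^(r-1) != 0.

```python
def shift_block(r):
    """Companion matrix of t^r: 1's on the subdiagonal, zeroes elsewhere."""
    return [[int(i == j + 1) for j in range(r)] for i in range(r)]

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(A, m):
    n = len(A)
    R = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(m):
        R = mat_mul(R, A)
    return R

N = shift_block(4)
zero = [[0] * 4 for _ in range(4)]
assert mat_pow(N, 4) == zero and mat_pow(N, 3) != zero
```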


Direct proof of Cayley Hamilton:
Cramer's rule implies the Cayley Hamilton theorem directly, without using the decomposition theorem, or the rational canonical form, as follows. Let [tI-A] be the characteristic matrix for A, with coefficients in k[t], and substitute t = A into this matrix, obtaining an n by n matrix with coefficients in the subring k[A], of Matn(k).
This may be viewed as defining a linear map on the product space (k^n) x ... x (k^n), a product of n copies of k^n. Note this is not the same as substituting t = A into tI-A viewed as a polynomial with matrix coefficients, as that would give A.I - A = 0. Our result instead is the n by n matrix M whose i,j entry is the element dij.A - aij.I of k[A] (here dij = 1 if i = j, else 0):

M = [ A-a11.I   -a12.I  ...   -a1n.I ]
    [ -a21.I   A-a22.I  ...   -a2n.I ]
    [  ...                           ]
    [ -an1.I    -an2.I  ...  A-ann.I ]

Now take the transpose Mt, whose i,j entry is dij.A - aji.I, and apply it to the column of vectors (e1, ..., en) in (k^n)^n. By definition of the entries of A, i.e. since A.ej = Σi aij.ei, the jth entry of the resulting column is A.ej - Σi aij.ei = 0. I.e. Mt annihilates the column (e1, ..., en). Now multiply Mt from the left by adj(Mt) = (adj(M))t. By Cramer's rule adj(Mt).Mt = det(Mt).I = ch(At)(A).I = chA(A).I, so this product also annihilates the column (e1, ..., en). Hence chA(A)(ei) = 0 for each i, so chA(A) = 0. QED.

Note: This proves the minimal polynomial divides the characteristic polynomial, but does not show they have the same irreducible factors.
 
  • #308
day 3.2, canonical forms of matrices

The canonical presentation of (k^n, A) by the characteristic matrix of A.

Next we ask how to find the rational canonical form of a given n by n matrix A over a field k. Since it is determined by the cyclic decomposition of the k[t] module (k^n,A), it suffices to diagonalize any presentation matrix for this module. So we look for a matrix M of polynomials in k[t], whose cokernel is isomorphic to (k^n, A) as k[t]- modules. Perhaps not surprisingly, it is given by the only k[t] matrix we know, the characteristic matrix [tI-A].
It is easy to find an explicit sequence of k[t] generators for (k^n,A), since e1,...,en are k generators, hence also k[t] generators, of k^n. The map (k[t])^n--->k^n, sending Ei to ei, where E1 = (1,0,...,0) in (k[t])^n and e1 = (1,0,...,0) in k^n, is thus a surjective k[t] module map, where Σ fi(t)Ei in (k[t])^n goes to Σ fi(A)ei in k^n.

The next theorem is our main result.
Theorem: Given an n by n matrix A over a field k, defining a k[t] module structure on k^n, the k[t] module map (k[t])^n--->k^n, sending Σ fi(t)Ei to Σ fi(A)ei, is surjective. Its kernel is a free k[t] module of rank n generated by the columns of [tI-A], the characteristic matrix of A. I.e. the following sequence of k[t] modules is exact: 0--->(k[t])^n--->(k[t])^n--->k^n--->0, where the left map is multiplication by [tI-A].
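Half of this is easy to check by hand or machine: column j of [tI-A] is t.Ej - Σi aij.Ei, which the evaluation map sends to A.ej - Σi aij.ei = 0. The Python sketch below (my illustration, with a sample 2 by 2 matrix) encodes an element of (k[t])^n as a list of n coefficient lists and verifies that the columns land in the kernel.

```python
def poly_eval(f, A):
    """f(A) for an ascending coefficient list f = [a0, a1, ...] and matrix A."""
    n = len(A)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    R = [[0] * n for _ in range(n)]
    for c in reversed(f):
        R = [[sum(R[i][k] * A[k][j] for k in range(n)) + c * I[i][j]
              for j in range(n)] for i in range(n)]
    return R

def evaluate(vec, A):
    """The map (k[t])^n -> k^n sending sum fi(t) Ei to sum fi(A) ei."""
    n = len(A)
    out = [0] * n
    for i, f in enumerate(vec):
        fA = poly_eval(f, A)
        for row in range(n):
            out[row] += fA[row][i]   # fi(A) applied to ei = column i of fi(A)
    return out

A = [[1, 2], [3, 4]]
# column j of [tI - A]: entry i is the polynomial -aij + (t if i == j else 0)
for j in range(2):
    col = [[-A[i][j], 1] if i == j else [-A[i][j]] for i in range(2)]
    assert evaluate(col, A) == [0, 0]   # the columns of tI-A lie in the kernel
```

The harder half of the theorem, that the columns generate the whole kernel, is what the root factor argument below supplies.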

Remark: This will follow from a version of the wonderful "root factor" theorem.

As corollary of the theorem above we get another proof of
Cayley Hamilton: If the k[t] module (k^n, A) is isomorphic to the product (k[t]/f1) x ... x (k[t]/fm), in standard form, i.e. where fi divides fi+1, then the minimal polynomial of A is fm and the characteristic polynomial is the product ∏fi.
proof: Since [tI-A] is a presentation matrix for this module, there exist invertible matrices P, Q over k[t] such that P[tI-A]Q is diagonal, with lower diagonal entries equal to the fi, and higher diagonal entries = 1.
Hence det(P)chA(t)det(Q) = ∏fi. Since P, Q are invertible over k[t], their determinants are units in k[t], hence non zero constants in k. Since chA(t) is monic, the coefficient of the leading term on the left equals det(P)det(Q). Since the product ∏fi on the right is also monic, det(P)det(Q) = 1, hence chA(t) = ∏fi. QED.

Note the analogy here with the structure of finite abelian groups. If G is an abelian group isomorphic to (Z/n1) x ... x (Z/nr), where ni divides ni+1, then nr is the annihilator of G, (it generates the principal annihilator ideal), and the cardinality of the group G is ∏ni. In both cases it is hard to compute the precise annihilator, but we can compute a multiple of it more easily, i.e. in one case the order of the abelian group, and in the other the characteristic polynomial of the matrix. In both cases the computable element has the same prime factors as the annihilator.
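The abelian side of the analogy can be tested directly. In the illustrative Python sketch below (the moduli n1 = 4, n2 = 8 are arbitrary sample values), G = Z/n1 x Z/n2 with n1 | n2 has order n1.n2, while the exponent, i.e. the generator of the annihilator ideal, is the last invariant factor n2.

```python
from math import gcd
from itertools import product

n1, n2 = 4, 8                      # sample invariant factors, n1 divides n2

def order(a, b):
    """Order of (a, b) in Z/n1 x Z/n2: lcm of the orders in each factor."""
    oa, ob = n1 // gcd(a, n1), n2 // gcd(b, n2)
    return oa * ob // gcd(oa, ob)

orders = [order(a, b) for a, b in product(range(n1), range(n2))]
assert len(orders) == n1 * n2      # #G = product of the invariant factors

exponent = 1                       # lcm of all element orders
for o in orders:
    exponent = exponent * o // gcd(exponent, o)
assert exponent == n2              # annihilator = the last invariant factor
```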

Next we recall the root - factor theorem, and apply it to prove the theorem above, that the characteristic matrix of A gives a presentation for the k[t] module (k^n, A). We also get another proof of Cayley Hamilton.

Polynomials with non commutative coefficients: If R is any ring, not necessarily commutative, define the polynomial ring R[t] as usual, but where powers of t commute with all coefficients in R, although the coefficients may not commute among themselves.
Hence f(t) = Σ ai t^i = Σ t^i ai, but if we set t = c, where c is in R, it makes a difference whether we set t = c in the first or the second of these expressions. We call fr(c) = Σ ai c^i the right value of f at c, and fl(c) = Σ c^i ai the left value of f at c.

Remainder theorem: If f(t) is a polynomial in R[t], then we can write f(t) = (t-c)q(t) + fl(c) = p(t)(t-c) + fr(c), i.e. we can divide f(t) by (t-c) from the left, with remainder the left value of f at c, and similarly from the right. The quotients and remainders are unique if we require the remainders to belong to R.
proof: We do it for left evaluations and left division. This is the binomial theorem, i.e. replace t in f(t) by (t-c)+c and expand. In each term t^i ai we get terms all but the last of which have a factor of (t-c), i.e. t^i ai = [(t-c)+c]^i ai = [(t-c)qi(t) + c^i] ai. Thus f(t) = Σ t^i ai = (t-c)Q(t) + Σ c^i ai, and we see the remainder is indeed the left evaluation of f at c.
This proves existence. For uniqueness, assume f(t) = (t-c)q(t) + r = (t-c)p(t) + s, where r, s belong to R. Then (t-c)[q(t)-p(t)] = s - r. Thus the left hand side also belongs to R. But multiplication by (t-c) raises the degree by one, so the left hand side has degree >= 1 unless [q(t)-p(t)] = 0. Then also r - s = 0. Hence both quotient and remainder are unique. QED.
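With matrix coefficients the two values genuinely differ, and left synthetic division really does produce fl(c) as remainder. The Python sketch below (illustrative, with arbitrary 2 by 2 integer matrices) divides f(t) = a0 + t.a1 + t^2.a2 by (t - c) from the left via the recursion b_(i-1) = ai + c.bi, remainder = a0 + c.b0, read off from matching coefficients in f = (t-c)q + r.

```python
def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def left_value(f, c):
    """fl(c) = sum c^i ai for f(t) = sum t^i ai, f a list of matrices."""
    n = len(c)
    power = [[int(i == j) for j in range(n)] for i in range(n)]   # c^0
    total = [[0] * n for _ in range(n)]
    for a in f:
        total = mat_add(total, mat_mul(power, a))
        power = mat_mul(power, c)
    return total

def right_value(f, c):
    """fr(c) = sum ai c^i."""
    n = len(c)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    total = [[0] * n for _ in range(n)]
    for a in f:
        total = mat_add(total, mat_mul(a, power))
        power = mat_mul(power, c)
    return total

def left_divide(f, c):
    """f = (t-c)q + r: b_(n-1) = a_n, b_(i-1) = a_i + c.b_i, r = a_0 + c.b_0."""
    b = [None] * (len(f) - 1)
    b[-1] = f[-1]
    for i in range(len(f) - 2, 0, -1):
        b[i - 1] = mat_add(f[i], mat_mul(c, b[i]))
    r = mat_add(f[0], mat_mul(c, b[0]))
    return b, r

a0, a1, a2 = [[1, 0], [0, 2]], [[0, 1], [1, 0]], [[1, 1], [0, 1]]
c = [[2, 0], [1, 1]]
q, r = left_divide([a0, a1, a2], c)
assert r == left_value([a0, a1, a2], c)                       # remainder = fl(c)
assert left_value([a0, a1, a2], c) != right_value([a0, a1, a2], c)  # sides differ
```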


Corollary: If f(t) is any polynomial in R[t], f is left divisible by (t-c) if and only if fl(c) = 0. Similarly for right divisibility.
proof: The expression we gave shows that f(t) = (t-c)q(t) + fl(c). Hence if fl(c) = 0, then f is left divisible by (t-c). Conversely, if f is left divisible by (t-c), uniqueness shows the remainder, which is zero, must equal fl(c), so fl(c) = 0. QED.

Next to apply these results about divisibility of polynomials, to products of matrices, we prove that matrices with polynomial entries are equivalently polynomials with matrix coefficients.

Lemma: If k is a field, the non commutative ring Matn(k[t]) of n by n matrices with entries from k[t], is isomorphic to Matn(k)[t], the ring of polynomials with coefficients in the non commutative ring Matn(k).
proof: Just as with commutative rings, a ring map R[t]-->S is obtained from a ring map R--->S plus a choice of element in S to send t to, only this time, since t commutes with R in R[t], we must choose as image of t, an element that commutes with the image of R in S. So we map Matn(k) into Matn(k[t]) by viewing scalar matrices as polynomial matrices, and then send t to the matrix t.I, which is in the center of Matn(k[t]), i.e. it commutes with everything. It is an exercise to check this ring map is injective and surjective. QED.

It follows that if we have two matrices of polynomials and we multiply them as matrices, we get the same result by viewing them as polynomials with matrix entries, and multiplying them as polynomials.

Corollary: Cayley Hamilton. A square matrix A over a commutative ring R, is a root of its characteristic polynomial chA(t).
proof: By Cramer's rule, we have (tI-A).adj(tI-A) = chA(t).I, as products of matrices. Then it holds also as products of polynomials. Setting t = A gives zero on the left, hence also on the right side. I.e. if chA(t) = Σ t^i ci, where the ci belong to R, then chA(t).I = (Σ t^i ci).I = Σ t^i (ci.I). Thus setting t = A gives 0 = Σ A^i (ci.I) = Σ A^i ci = Σ ci A^i = chA(A). QED.

If in the lemma above, we think of the matrix on the left acting individually on each column vector of the matrix on the right, we can also consider matrices of polynomials acting on column vectors of polynomials, as multiplication from the left of polynomials with matrix coefficients, times polynomials with column vector coefficients. I.e. the lemma also holds with the same proof for polynomials with coefficients in any ring R with identity, acting from the left on polynomials with coefficients in any (unitary) left module over R.

So let k^n[t] denote polynomials with coefficients which are column vectors from k^n. This is not a ring, in particular the coefficients do not contain an element 1, so this object does not contain t. But the coefficients do contain the basic vectors ei, and we can multiply these by polynomials over k and add up. In particular this object is a k[t] module, and is isomorphic as such to the free k[t] module (k[t])^n.
I.e. if Ei are the standard free k[t] basis vectors in (k[t])^n, just send Ei to ei, and Σ fiEi to Σ fiei, where the fi are polynomials in k[t]. The expression Σ fiei can be re-expanded as a polynomial in t with vector coefficients by expanding each term as f.ei = (a0 + a1t + ... + t^n)ei = (a0ei + t.a1ei + ... + t^n.ei), and then combining coefficients of like powers of t, from various terms, to get coefficient vectors.

Exercise: Show this gives a k[t] module isomorphism (k[t])^n--->k^n[t].
As we have remarked above, the previous lemma shows multiplication of matrices corresponds to multiplication of polynomials, i.e. the isomorphisms above give isomorphisms of multiplication diagrams, with matrix multiplication Matn(k[t]) x (k[t])^n--->(k[t])^n corresponding to polynomial multiplication Matn(k)[t] x k^n[t] ---> k^n[t].

Now we can prove the main presentation theorem.
Theorem: Given any n by n matrix A over a field k, defining a k[t] module structure on k^n, the k[t] module map (k[t])^n--->k^n, sending Σ fi(t)Ei to Σ fi(A)ei, is surjective, and its kernel is a free k[t] module, freely generated by the columns of [tI-A], the characteristic matrix of A. I.e. this sequence is exact: 0--->(k[t])^n--->(k[t])^n--->k^n--->0, as k[t] - modules, where the left map is multiplication by [tI-A].
proof: We know the last map is surjective.
Recall the right map takes Σ fi(t)Ei to Σ fi(A)ei, which is exactly the result of viewing Σ fi(t)Ei as a polynomial Σ fi(t)ei with coefficient vectors in k^n, and then setting t = A. So if we view these as maps of polynomials k^n[t]--->k^n[t]--->k^n--->0, the right map k^n[t]--->k^n is left evaluation of a polynomial f(t) with vector coefficients at t = A. By the root factor theorem above, this equals zero if and only if the polynomial f(t) is left divisible by (t-A), i.e. if and only if f(t) is in the image of the left hand map k^n[t]--->k^n[t].
Since multiplication by a monic polynomial never sends a non zero polynomial to zero, the left map is injective. Hence the sequence
0--->(k[t])^n--->(k[t])^n--->k^n--->0 is exact, and (tI-A) is indeed a presentation matrix for the module (k^n,A). QED.

The following amazing theorem generalizes the fact that a surjective endomorphism of a finite dimensional vector space is also injective.
Theorem: If R is any commutative ring and X a finitely generated R module, any surjective R module map f:X--->X is an isomorphism.
proof: This follows from the proof of Cayley Hamilton. If x1,...,xn are generators and if we write f(xj) = Σi aij xi, then as in a previous proof, the matrix A = [aij] represents f for the generators {xi}, even if they are not independent. Look at the n by n matrix M with i,j entry dij.f - aij.Id, with entries in the commutative ring R[f], as in the direct proof of Cayley Hamilton. Again the transpose Mt annihilates the column of vectors (x1, ..., xn), and again the determinant of tI-A is a polynomial P(t) over R annihilating the matrix A and hence the map f. As a small refinement: note if the image f(X) of the map f lies in the submodule IX, for some ideal I of R, then we can choose the entries aij to belong to I, and looking at the determinant formula for P shows the coefficient of t^i in P(t) belongs to the power I^(n-i) of the ideal I, where n = degree of P(t).
Now apply the principle just proved, not to f, but to the map Id:X--->X, where X is viewed not as an R module, but as an R[t] module where t acts via t = f. Then the image of Id is all of X, which equals (t)X, the product of X by the ideal (t) in R[t]. Hence we have a polynomial satisfied by Id as follows: Id^n + c1f.Id^(n-1) + ... + c_(n-1)f^(n-1).Id + cnf^n = 0, where each cif^i belongs to the ideal (f) in R[f]. But we can solve this for Id, getting Id = -[c1f.Id^(n-1) + ... + c_(n-1)f^(n-1).Id + cnf^n] = f[-c1.Id^(n-1) - ... - c_(n-1)f^(n-2).Id - cnf^(n-1)]. The polynomial expression on the right is a right inverse for f, and since all its terms are polynomials in f, it commutes with f, hence is also a left inverse. QED.
 
  • #309
in my class, we have so far covered almost all topics on the syllabus in post 177, and are now in the field and galois theory section. we did not prove either the simplicity of A(n) [but we did prove it for A(5), hence the non - solvability of all A(n) n > 4] or the spectral theorem.

i did not write up the jordan form material again yet this fall, nor any galois theory, but of course a detailed treatment is in my webnotes.
 
Last edited:
  • #310
mathwonk, do you know of any places that still used Intro to Calculus and Analysis by Courant/John?
 
  • #311
well i used it in my freshman honors calc course at UGA a couple years back. we were able to get copies for everyone for about 10-15 bucks apiece.

what a bargain. and I learned a lot too using it.
 
  • #312
oh and to people interested in good notes and cheap, i.e. free, the homepage of James Milne, university of michigan, has outstandingly good course notes, in tex, well written, and mathematically very authoritative.

they range from groups to galois theory, to etale cohomology and the Weil conjectures, miles over my head, but very inspiring. it would be cool to have an e - seminar on them, but i don't know how feasible that is.

you need sheaf cohomology as background for instance, but the real obstacle is someone with time and energy to commit to keeping it going, as some guys did with Bachman's book a while back.

that would ideally be me, but i am feeling challenged just keeping up my class preparations in calc and galois theory.

anyway i am quite interested in exploring the deep connections between algebra topology and geometry contained in the links between the fundamental group, its monodromy representations defined by branched coverings of algebraic varieties, and etale cohomology and the absolute Galois group of the rationals.

e.g. Grothendieck's conjecture (nakahara, et al.) that the galois group of a hyperbolic curve determines its arithmetic and geometry.
 
Last edited:
  • #313
Are you planing on reading some more Riemann eventually mathwonk?
 
  • #314
yes i would like to understand his discussion of his singularity theorem, and other results on theta divisors.

also his study of the representation of algebraic curves via monodromy representations, which foreshadows all this geometric galois theory.
 
Last edited:
  • #315
I need some hints on where to apply to grad school. First a few facts:

3.7 GPA mathematics, PSU
3.7 GPA meteorology, Naval Postgraduate School (one-year)

No research experience. Joined the Air Force and have been working as a weather officer for two years. Main interests lie in differential geometry and algebra with some interest in logic/computation. Took GRE as a cocky undergrad without studying... didn't do real hot. Taking it again next April and currently spending most free time practicing. Now I understand "the process". Getting out of military, but not ready to take GRE this month to apply to school for next year... going to have to apply for fall 2008.

With a solid GRE score, what level of school should I be applying to? I don't want to set myself up for disappointment, nor do I want to sell myself short.
 
  • #316
I have a question. Is it acceptable to get a letter of recommendation from faculty that are not (associate, full) professors? Maybe I should not say acceptable but will it hurt me that much when I am applying to grad schools? For example, next semester I need to get my third letter of recommendation, and I have 2 professors I am looking at: this will be the second class I take with whichever one I choose, and I did very well in the first class with them, so if I do well again I would think I could easily ask for a good letter. The problem is that one is a better instructor than the other (the part-time professor is better) and the class the tenured professor is teaching is not very important. Should I go with the part-time instructor in that he will be teaching a more important class (topology) and that he is a better instructor, or should I go with the tenured professors that is not as good at teaching and teaching something not very important (graph theory)? I asked my adviser, but she is a little biased (married to the part-time instructor). Thanks.
 
  • #317
mattmns said:
I have a question. Is it acceptable to get a letter of recommendation from faculty that are not (associate, full) professors? Maybe I should not say acceptable but will it hurt me that much when I am applying to grad schools? For example, next semester I need to get my third letter of recommendation, and I have 2 professors I am looking at: this will be the second class I take with whichever one I choose, and I did very well in the first class with them, so if I do well again I would think I could easily ask for a good letter. The problem is that one is a better instructor than the other (the part-time professor is better) and the class the tenured professor is teaching is not very important. Should I go with the part-time instructor in that he will be teaching a more important class (topology) and that he is a better instructor, or should I go with the tenured professors that is not as good at teaching and teaching something not very important (graph theory)? I asked my adviser, but she is a little biased (married to the part-time instructor). Thanks.

I'd say go with the part-time instructor. They too know what it takes to get through graduate school. Considering it's your third letter, I would think it wouldn't affect you.

It's what I'm planning on doing. My part-time professor is my favourite professor, as well as we had our conversations together. I go to him for advice usually.
 
  • #318
jason, really anything is possible, but most things unusual are exceptional.
 
  • #319
good letters from lesser known people who actually know you well are often more helpful than so-so letters from famous or high ranking people who do not.
 
Last edited:
  • #320
day 5 notes, field and galois theory

8000 Day 5 Field extensions and homomorphisms.
Introduction: Galois theory is concerned with the self mappings, i.e. automorphisms, of a field E that are specified on some subfield, e.g. that equal the identity on some subfield k. We want to see how to construct such automorphisms, to count how many there are, and to compute their exact fixed field. If E has finite vector dimension n over k, we will see there are at most n automorphisms of E that fix k pointwise. The reason is simple. It will turn out that such maps must send roots in E of polynomials with coefficients in k, to other roots in E of these polynomials. It follows that the number of such automorphisms will be determined by the number of distinct roots such polynomials have in E, which is always bounded by the degree of the polynomials. Galois theory is most useful when the number of distinct roots of an irreducible polynomial equals its degree. This is the "separable" case.
Since the vector dimension of a field extension can be computed in terms of the degree of the polynomials satisfied by generators for the extension, it will follow that the number of automorphisms is also bounded by the vector dimension of the extension, and that equality holds in the separable case. The proofs proceed by carrying out the "simple" case first, then using induction to deduce the result for any sequence of simple extensions, i.e. any finite extension.

First we review a few elementary facts about extensions that are technically essential, and probably familiar from math 6000.
Def: If k is a subfield of E, then E is a vector space over k, and we write [E:k] for the k dimension of E, also called the degree of E over k.

Definition: If k is a subfield of a field E, and a belongs to E, we say a is algebraic over k, iff a is a root of some non zero polynomial f in k[X].

Lemma: If k is a subfield of a field E, a belongs to E, and a is algebraic over k, there is a unique monic irreducible polynomial in k[X] satisfied by a, namely the unique monic polynomial in k[X] of lowest degree with a as root. This polynomial is called the minimal polynomial of a over k.
proof: If a is algebraic over k, the evaluation map k[X]-->E, sending g to g(a) has non trivial kernel of form (f), inducing an injection k[X]/(f)-->E. Hence (f) is a maximal and prime ideal generated by a unique monic irreducible polynomial, namely the monic polynomial of lowest degree in the kernel. QED.
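As an aside (my addition, not part of the notes): sympy can compute minimal polynomials over Q, so one can watch the lemma in action on a concrete element.

```python
from sympy import sqrt, minimal_polynomial, symbols

x = symbols('x')

# minimal polynomial of sqrt(2) + 1 over Q: monic, irreducible, lowest degree
p = minimal_polynomial(sqrt(2) + 1, x)
print(p)  # x**2 - 2*x - 1

# evaluating at the element recovers 0, i.e. the element really is a root
print(p.subs(x, sqrt(2) + 1).expand())  # 0
```

Here sympy's minimal_polynomial returns exactly the monic polynomial of lowest degree in the kernel of the evaluation map described in the proof.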

Definitions:
The ring generated by a over k, =k[a], = k-algebra generated by a.
If k is a subfield of E, and a is an element of E, then k[a] denotes the intersection of all subrings of E that contain both a and k. Concretely, k[a] = {f(a): f is any polynomial in k[X]}.

The field generated by a over k = k(a) = the intersection of all subfields of E that contain both a and k. Concretely k(a) = {f(a)/g(a): f,g are in k[X], and g(a) != 0}.

Theorem: If k is a subfield of E, and a is an element of E, then TFAE:
1) a is algebraic over k.
2) k[a] = k(a).
3) k(a) has finite dimension over k.
4) k[a] has finite dimension over k.
5) k[a] is contained in a finite dimensional k - subspace of E.
6) k(a) ≈ k[X]/(f) where f is an irreducible polynomial in k[X].
In particular, if a is algebraic over k, then the k dimension of k(a) equals the degree of the minimal polynomial of a over k.
proof: [sketch]: If a is algebraic over k, then for some n, a^n is a linear combination of lower powers of a. This implies every power a^m with m>n is also a linear combination of powers of a less than n. Hence the dimension of k[a] over k is finite. If we take a monic dependency relation c0+c1a+c2a^2+...+a^n for the smallest possible power a^n, then a satisfies the monic polynomial f(X) = c0+c1X+c2X^2+...+X^n, but no polynomial of lower degree, so f is irreducible, and the map k[X]/(f)-->k[a] is injective and surjective. Since k[X]/(f) is a field, so is k[a] = k(a). If k[a] is contained in some finite dimensional k subspace of E, say of dimension n, then 1,a,a^2,...,a^n are dependent/k, and a dependency relation gives a polynomial satisfied by a, so a is algebraic. The k - dimension of k[X]/(f) is deg(f) = n, and a basis is given by [1], [X],...,[X^(n-1)]. If a is not algebraic over k, then the infinite sequence of powers {1,a,a^2,...,a^n,...} is linearly independent over k. QED.
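For instance (an illustration I'm adding, using sympy): for a = 2^(1/3) over Q, the minimal polynomial has degree 3, which by the theorem equals the Q dimension of Q(a).

```python
from sympy import Rational, minimal_polynomial, degree, symbols

x = symbols('x')
a = 2 ** Rational(1, 3)        # the real cube root 2^(1/3)
q = minimal_polynomial(a, x)
print(q)                       # x**3 - 2
print(degree(q, x))            # 3 = [Q(a):Q], with basis 1, a, a^2
```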

Lemma: If k is a subfield of E, and E a subfield of F, then [F:k] = [F:E][E:k].
proof: If x1,...,xn is a k basis for E, and y1,...,ym an E basis for F, then {xiyj}, 1<=i<=n, 1<=j<=m, is a k basis for F. This is trivial to check by changing the order of summation. E.g. if z lies in F, then there exist constants b1,...,bm in E such that z = b1y1+...+bmym. But each bj lies in E, so there exist constants aij such that each bj = a1jx1+...+anjxn. Hence z = b1y1+...+bjyj+...+bmym = ...+(a1jx1+...+anjxn)yj+... = ...+aij(xiyj)+...

Thus the products {xiyj} span F over k. In particular if both [F:E] and [E:k] are finite, so is [F:k]. You can check similarly that the products {xiyj} are independent over k, and you should, as this is a favorite little easy prelim question. QED.
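A concrete check of the multiplicativity (my addition, via sympy): in the tower Q in Q(sqrt2) in Q(sqrt2, sqrt3), the lemma predicts [Q(sqrt2,sqrt3):Q] = 2*2 = 4. Since Q(sqrt2,sqrt3) = Q(sqrt2+sqrt3) (a standard primitive element fact), the minimal polynomial of sqrt2+sqrt3 should have degree 4.

```python
from sympy import sqrt, minimal_polynomial, degree, symbols

x = symbols('x')
m = minimal_polynomial(sqrt(2) + sqrt(3), x)
print(m)             # x**4 - 10*x**2 + 1
print(degree(m, x))  # 4 = [Q(sqrt2,sqrt3):Q(sqrt2)] * [Q(sqrt2):Q] = 2*2
```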

Cor: The subset of E consisting of elements which are algebraic over k, is a subfield of E.
proof: If a,b in E are algebraic over k, we must show that a+b, ab, a/b, are also algebraic over k. But k(a) has finite dimension over k, and since b is algebraic over k it is also algebraic over k(a), so [k(a,b):k(a)] is at most [k(b):k], hence finite. Thus both [k(a,b):k(a)] and [k(a):k] are finite. Hence also [k(a,b):k] is finite, and so all field combinations of a,b, such as a+b, ab, a/b, etc., belong to finite dimensional subfields of E, hence are algebraic over k. QED.

Note: All finite dimensional extensions of k are algebraic, but not all algebraic extensions are finite dimensional. We have shown that simple algebraic extensions are finite dimensional. Hence finitely generated (as fields) algebraic extensions are finite dimensional (as vector spaces).

Cor: A field extension of k is finite dimensional (as a vector space), if and only if it is both finitely generated (as a field) and algebraic/k.

Extending homomorphisms.
Given a (homomorphic) map, f:k-->k', of fields which are subfields of larger fields E, E', we want to say exactly when it is possible to extend f to a map f':E-->E'. Of course field maps are injective whenever they exist.
Field map extensions are always done one generator at a time, and for that "simple" case we use the fundamental isomorphism F(a) ≈ F[X]/(g) where g is the minimal F polynomial of a, to tell us how to extend the map.

Theorem 1: Assume f:k-->k' is a map of fields which are subfields of E, E'.
1) Then f extends uniquely to a ring map k[X]-->k'[X], by applying f to the coefficients of each polynomial.
Let a be any element of E with minimal polynomial g in k[X], mapping to g' in k'[X].
2) Then f extends to a map k(a)-->E' with f(a) = a', if and only if a' is a root of g'.
3) Hence the number of extensions of f to f':k(a)-->E' is equal to the number of distinct roots in E' of g'. This number is at most equal to the degree of g = dimk(k(a)).
proof: 1) we must check that (g+h)' = g'+h' and (gh)' = g'h'. The coefficient of X^n in g+h is the sum an+bn, where a,b are the coefficients in g,h. But the coefficient of X^n in (g+h)' = f(an+bn) = (an+bn)' = (an)'+(bn)' = the sum of the coefficients of X^n in g' and h'. Hence (g+h)' = g'+h'.
The coefficient of X^n in gh is the sum of the products aibj over all i,j with i+j = n. But f applied to this sum is the sum of the corresponding products ai'bj', which is the corresponding coefficient in g'h'. Hence (gh)' = g'h'.
2) Applying f to g(a) = 0, gives g'(a') = 0, so the condition is necessary. If indeed g'(a')=0 for some a' in E', then we get a map k'[X]-->E' sending X to a', which induces a map k'[X]/(g')-->E' sending [X] to a', hence by composition a map
k(a)-->k[X]/(g)-->k'[X]/(g')-->E', sending a to a'.
3) Each different choice for f(a) gives a different map k(a)-->E', so the number of maps equals the number of roots a' of g' in E'. The number of roots of the polynomial g' in the field E' never exceeds the degree of g', which equals the degree of g, which equals the k dimension of k(a). QED.
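To see part 3) numerically (my illustration, with sympy): take k = Q, a = 2^(1/3), g = X^3 - 2. The number of extensions of the inclusion Q-->E' to Q(a)-->E' is the number of roots of g in E': one if E' = R, three if E' = C.

```python
from sympy import Poly, symbols

x = symbols('x')
g = Poly(x**3 - 2, x)

print(len(g.real_roots()))  # 1 root in R: one embedding of Q(2^(1/3)) into R
print(len(g.all_roots()))   # 3 roots in C: three embeddings into C
```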

Cor: In the setting above, the number of extensions of f:k-->k', to f':k(a)-->E' is at most equal to dimk(k(a)), and equals this number if g' has deg(g') distinct roots in E', equivalently if g' factors into distinct linear factors in E'[X].

Note: Even if g' has d = deg(g') distinct roots in E', and there exist d distinct maps k(a)-->E' extending f, all of these maps may have the same field as image in E'. I.e. if {a1', a2',...,ad'} are the distinct roots of g' in E', and k-->k' is an isomorphism, then all the fields k'(ai') are isomorphic, but it may or may not happen that they are actually equal subfields of E'. We call these fields "conjugate" subfields of E', and when they are all equal we say they are all normal. We will say more about normal extensions later, and show how in this situation they define normal subgroups of the group of automorphisms of E'.

The argument for extending maps to simple extensions k(a) lets us extend maps to any algebraic extension at all, provided the target field has enough roots.
We have to be a little careful about the hypotheses, since if E has more than one generator, it does not suffice to require just one root in E' for each of their minimal polynomials. For instance, suppose g is an irreducible polynomial in k[X] with distinct roots a,b and E = k(a,b), while E' = k(a) != k(a,b).
Then a,b have the same minimal polynomial over k, namely g, so in both cases the minimal polynomial of a and b does have a root in E'. Still there is no extension of the identity map k-->k to a map k(a,b)-->k(a). So we use a little stronger hypothesis on E' in the next result. We do the finite dimensional case first.

Theorem 2: Assume E = k(a1,...,an) is finitely generated and algebraic, i.e. a finite (vector) dimensional field over k, f:k-->k' is a field map and E' a field containing k'.
Assume further for every generator ai of E, that gi' factors completely into linear factors in E'[X], where gi' in k'[X] is the image under f of the minimal polynomial gi of ai. Equivalently assume E' contains a splitting field for the polynomial ∏gi'.
Then there exist extensions of f to f':E-->E'. The number of such extensions is at most equal to the dimension [E:k], and equals this dimension if the polynomial ∏gi' factors into distinct linear factors in E'[X].
proof: (existence of extensions): From the argument in the simple case we get an extension to k(a1)-->E'. Now we want to extend further to k(a1, a2)-->E'. There is only a tiny difference from the simple case. We regard this as a simple extension k1(a2), of k(a1) = k1. But our hypothesis only says the minimal polynomial g2 of a2 over k corresponds to a polynomial g2' in E'[X] that splits completely into linear factors. We need to know about the minimal polynomial h2 of a2 over the field k1.
The point is that h2 is always a factor of g2. I.e. both h2 and g2 have coefficients in k1, and h2 is irreducible there. Since a2 is a root of both polynomials, h2 must divide g2 in k1[X]. Thus the corresponding polynomial h2' in E'[X] is a factor of g2', and since g2' factors in E'[X] completely into linear factors, some of those same factors give a factorization of h2' into linear factors.
Extending successively over each simple extension, we eventually get an extension to E-->E', by induction on the dimension [E:k].
(number of extensions): By the argument in the simple case, i.e. by the corollary above, at each stage the number of extensions from ki-->E' to ki+1-->E' is at most equal to the dimension [ki+1:ki], and equals that dimension if hi' factors in E'[X] into distinct linear factors. But the number of extensions of f from k-->k' to f':E-->E' is the product of the number at each stage, and the dimension of successive field extensions is also multiplicative, i.e. [E:k] = [E:kn-1][kn-1:kn-2]...[k1:k]. Thus the number of extensions is at most equal to [E:k], and equals this dimension when every polynomial hi' factors into distinct linear factors in E'[X]. But since the linear factors of hi' are a subset of the linear factors of gi', if ∏gi' factors into distinct linear factors in E'[X], then so does every hi'. QED.

[If you are getting lost in the theory, this is a good time to go read some examples. DF looks good in chapter 14.2, and I recommend my webnotes 844.2 where I compute very thoroughly the Galois group of X^4-2 over Q.]

It may be getting tiring to repeat this same argument, but we do it anyway, to practice using Zorn's lemma in the infinite dimensional case. The moral is that there are only three steps to the argument: i) the simple case; ii) the observation that the minimal polynomial over a larger extension is a factor of the minimal polynomial for a smaller one, which allows you to repeat the simple case more than once; iii) using Zorn, i.e. "transfinite induction", to find a maximal extension. It is instructive also to note that now that E is not finitely generated over k, there will be an infinite number of hypotheses to check about E', which poses the question of how we are to verify that E' satisfies them.

Def: A field F is algebraically closed if every polynomial in F[X] has a root in F, hence every polynomial in F[X] factors completely into linear factors in F[X].

Theorem 3: Assume E is an algebraic extension of k, f:k-->k' is a field map and E' a field containing k'. Assume further for every element a of E, that g' factors completely into linear factors in E'[X], where g' in k'[X] is the image under f of the minimal polynomial g of a. This is true for example if E' is algebraically closed.
Then there exist extensions of f to f':E-->E'.
proof: Consider the set of all partial extensions of f to F-->E' where F is a field intermediate between k and E. These form a partially ordered set where g > h if g extends h. (Each such partial extension is a map from a subset of E to E', hence its graph is a subset of ExE'. The collection of all such partial extensions is a subset of the set of all subsets of ExE', in particular it is a set, to which we can try to apply Zorn's lemma.)
Given any totally ordered collection or "chain" of partial extensions, they define a partial extension on the union of their domains, and this is an upper bound for the chain. Since the hypothesis of Zorn is thus satisfied, there exist maximal partial extensions. We claim any maximal partial extension is actually defined on all of E.
If g:F-->E' is an extension of f that is not defined on all of E, there is some element a of E that is not in F. Then a is algebraic over k hence also over F, and satisfies an irreducible polynomial over F which is a factor of its irreducible polynomial over k. The corresponding polynomial over E' thus factors completely into linear factors as argued above, and we get an extension of f to F(a)-->E'. This shows any extension of f whose domain is not all of E cannot be maximal, hence any maximal extension is defined on all of E. QED.

Notice theorem 2 is a corollary of theorem 3, but I thought it clearer (certainly for me) to explain these cases separately. Also the use of a finite set of generators in theorem 2, allows the splitting hypothesis to be given only for the generators.
 
  • #321
day 5 part 2, separable extensions

The phenomenon of polynomials with multiple roots also bears examination, since it affects the number of extensions of a homomorphism.
Def/Ex: A polynomial g in k[X] is separable if gcd(g, dg/dx) = 1 in k[X], if and only if g, dg/dx have no common root in any extension of k, if and only if g has no multiple linear factor in any extension of k, if and only if there is an extension of k where g has deg(g) distinct roots, and where g factors into distinct linear factors.
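The gcd criterion is easy to try out (my example, using sympy):

```python
from sympy import symbols, gcd, diff, expand

x = symbols('x')

g = expand((x - 1)**2 * (x - 2))  # multiple root at 1, so not separable
h = x**3 - 2                      # separable over Q

print(gcd(g, diff(g, x)))  # x - 1, a nontrivial common factor
print(gcd(h, diff(h, x)))  # 1
```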

Def: An algebraic element a of a field E containing k, is separable over k if its minimal polynomial in k[X] is separable. Notice that an element separable over k is also separable over any larger field, since its minimal polynomial there is a factor of the one over the smaller field.

Theorem 4: If E is any field containing k, the elements of E that are separable over k form a subfield of the field of algebraic elements.
proof: Let a,b, be elements of E that are separable over k. We claim every element of k(a,b) is separable over k. Let c be any element of k(a,b), and let f,g,h, be the minimal polynomials of a,b,c over k, with f,g separable. Let E' be a splitting field for (fgh) over k. Since f,g both factor into distinct linear factors in E', there are exactly [k(a,b):k] extensions of the inclusion map k-->E' to maps of k(a,b)-->E'.
If c is not separable over k, there are fewer than deg(h) extensions of k-->E' to k(c)-->E', hence fewer than [k(a,b,c):k] extensions of k-->E' to k(a,b,c)-->E'. But since k(a,b,c) = k(a,b) this is a contradiction. So c is separable over k. QED.

With this terminology, Theorem 2 implies the following statements.
Cor 5: Assume E = k(a1,...,an) is a finite separable extension of k, g is a polynomial in k[X] satisfied by the generators ai, and E' is a field containing a splitting field for g. Then there exist exactly [E:k] maps E-->E', which are the identity on k.

Cor 6: If k is any field, f a separable polynomial over k, and E a splitting field for f, there are exactly [E:k] automorphisms of E which equal the identity on k.

Remark: All algebraic extensions are separable in characteristic zero. We will see later that all algebraic extensions of finite fields are also separable.

Def: The "fixed field" of a set S of automorphisms of a field E is the subfield of E of elements left identically fixed by every element of S. Automorphisms of E leaving a subfield k fixed are called k automorphisms.

Cor 7: If k is any field, f a separable polynomial over k, and E a splitting field for f, the fixed field of G = Galk(E) equals k.
proof: If G fixed a larger subfield F of E containing k, there would be more than [E:F] automorphisms of E fixing F, contradicting Theorem 2. QED.

Cor 8: If E is a splitting field for a separable polynomial f over k, and F is an intermediate field, the fixed field of the subgroup GalF(E) of Galk(E), is F.
proof: E is also a splitting field for f over F. QED.

If E is a splitting field of a separable polynomial over k, this shows the map from subgroups of G = Galk(E) to fields intermediate between k and E, is surjective. I.e. every field between k and E is the fixed field of some subgroup of Galk(E).

Theorem: If f is a separable polynomial of degree n over k, with splitting field E, the group Galk(E) is isomorphic to a subgroup of the symmetric group S(n). In particular it has finite order dividing n!
proof: Every k automorphism of E permutes the roots of f, hence G acts on the set of n roots. But the roots generate E, so this action determines the element of G, hence the map G-->S(n) is injective. QED.
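For example, the galois group of X^4 - 2 over Q (computed in the webnotes mentioned earlier) turns out to be dihedral of order 8, sitting inside S(4). With sympy's permutation groups one can at least verify that the subgroup generated by a 4-cycle and a suitable transposition has order 8, which divides 4! = 24 (the particular generators below are my illustration):

```python
from math import factorial
from sympy.combinatorics import Permutation, PermutationGroup

r = Permutation([1, 2, 3, 0])  # the 4-cycle (0 1 2 3)
s = Permutation([0, 3, 2, 1])  # the transposition (1 3)
G = PermutationGroup([r, s])   # dihedral group of order 8 inside S(4)

print(G.order())                 # 8
print(factorial(4) % G.order())  # 0, i.e. the order divides 4!
```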

Cor: If E is the splitting field of a separable polynomial over k, there are only a finite number of intermediate fields between k and E.

It remains to show in this setting that no two subgroups of Galk(E) can fix exactly the same subfield of E, so that the number of subgroups and intermediate fields is the same. I.e. it could conceivably occur that a subgroup H fixes exactly F, but that more elements of G also fix F, so the full subgroup fixing F is larger than H. This never happens, as the following cute little argument, apparently stemming from work of Dedekind and Artin, shows.

Theorem: If G is a finite group of automorphisms of a field E, and k the subfield fixed by G, then [E:k] = #G.
proof: We know #G <= [E:k], so we want to show any subset x1,...,xn of E with n > #G is dependent over k. This is a matter of solving linear equations. If f1,...,fm are the elements of G, the m by n matrix [fj(xi)] over E has a non zero solution vector c = (c1,...,cn) in E^n, since n > m. Thus for all i, (*) Σj cj fi(xj) = 0. Choose c with as few non zero entries as possible. There must be at least two != 0 entries, since we may assume all xj != 0, else dependency is obvious, and the fi are automorphisms of E. By reordering the x's, we may assume c1 != 0, and multiplying through by c1^(-1) we may assume c1 = 1 belongs to k.
We claim actually all cj belong to k, yielding a k - linear relation among the elements (fi(x1),...fi(xj),...,fi(xn)) for every i. Since one of the f's is the identity, this will give a k linear relation among the (xj) as claimed.
If some cr != c1 is not in k, then since k is the fixed field of G, some fs != id does not fix cr. We can renumber so that fr(cr) != cr. Then apply fr to the system of equations, getting Σj fr(cj fi(xj)) = Σj fr(cj) fr fi(xj) = 0 for all i. But as fi runs over the group G, the product fr fi does the same. So we can say also that
(**) Σj fr(cj) fi(xj) = 0 for all i. Now if we subtract the two systems of equations (*) and (**), we get Σj [fr(cj)-cj] fi(xj) = 0 for all i. In this system, since c1 is in k, fr(c1)-c1 = 0, but fr(cr)-cr != 0. Hence we have a non zero solution vector
(..., fr(cj)-cj, ...) in the kernel of the matrix, but with fewer non zero entries than before, a contradiction. QED.

Cor: If E is a finite dimensional extension of a field k, TFAE:
i) #Galk(E) = [E:k].
ii) E is the splitting field of a separable polynomial.
iii) k is the fixed field of some finite group of automorphisms of E.
proof: We proved ii) implies i) and iii), and that iii) implies i) above. So it suffices to prove i) implies ii). By the arguments used above, i) implies the minimal polynomial of every generator of E over k splits into distinct linear factors in E, so E is a splitting field for the product of the minimal polynomials of a finite set of generators for E over k, where each of these polynomials is separable. It remains to show their product may be assumed separable. If any two of these polynomials are equal we may omit one. So we want to rule out that two distinct separable irreducible polynomials over k share a root in E. But any element of E has a unique irreducible minimal polynomial over k, so this is impossible. QED.

Def: Under any of these conditions, we say E is a finite galois extension of k.

Cor: If E is a finite galois extension of k, then different subgroups of G = Galk(E) have different fixed fields. Thus the correspondence between subgroups of G and fields intermediate between k and E is bijective.
proof: If E is galois over k, and H is a subgroup of G, let F be the fixed field of H, and K the subgroup of all elements of G fixing F. By the previous corollary, #H = [E:F]. H is a subgroup of K. But theorem 2 shows that #K <= [E:F] = #H. Thus H = K. QED.
 
  • #322
I'm thinking of taking the real analysis course next term, but the class has a notorious reputation for its difficulty. I think only two students out of about 25 got A's in it last spring, and they're no less than academic superstars (read: putnam fellows).

Therefore, I want to have some idea of what I'm up against. The class uses Rudin's book, which I already bought. However, I have not written a proof since my sophomore year's geometry class. How do you think I should proceed with the book? Reading doesn't seem to cut it; I have a fair understanding of the underlying concepts, but I'd think a little more would be required to solve the problems at the end of each chapter.
 
  • #323
That sounds a lot like the analysis classes at my school. I have to take it in the spring or in the fall and I'm quite nervous to say the least. That said, several of my friends who are certainly very bright, but not superstars passed it and learned a lot. I think sometimes the reputation is worse than the class itself. (At least that's what I'm hoping.)

Anyhow, maybe practice with proofs would be helpful?

I'm interested to see what someone who actually knows what they're talking about has to say.
 
  • #324
real analysis is a thoroughgoing proofs class in a topic whose concepts are hard and precise. Rudin's book moreover treats the material very briefly and succinctly, and is far less than ideal as a learning place.

You need a lot of preparation to ace this class; 1) practice in proofs, 2) preliminary study from an easier book.

you do not have much time now, so clear your schedule as much as possible to leave double free time for this one course. then get some other books, such as simmons introduction to topology and analysis, and study them.

practice writing proofs and expect to get less than an A with your weak background. try for a B.
 
Last edited:
  • #325
heres a cheap copy of simmons:

Introduction to Topology and Modern Analysis
Simmons, George F.
Bookseller: Logos Books
(Davis, CA, U.S.A.) Price: US$ 14.00
Quantity: 1 Shipping within U.S.A.:
US$ 3.50
Book Description: McGraw-Hill, 1963. Hardcover. Book Condition: Good. Dust Jacket Condition: Good. 1st Edition. A small amount of marginalia; a few pages have indents from a paperclip. A good reading copy. Bookseller Inventory # 010260
 
  • #326
spivak's calc book is also good preparation, or apostol.
 
  • #327
Suppose one were in a differential equations class taught with no theory or theorems, just methods for solving differential equations (yes, this is the course math majors have to take too). The teacher also considers integration using anything beyond "u substitutions" too hard. How much trouble would this person be in when they move on to more advanced math courses-- let's say partial differential equations? The text is Boyce, DiPrima.
 
  • #328
Beeza said:
Suppose one were in a differential equations class taught with no theory or theorems, just methods for solving differential equations (yes, this is the course math majors have to take too). The teacher also considers integration using anything beyond "u substitutions" too hard. How much trouble would this person be in when they move on to more advanced math courses-- let's say partial differential equations? The text is Boyce, DiPrima.

When I studied PDEs, the main things from ODEs we used were the solution techniques. We needed to be able to solve ODEs right away. Solving a PDE often comes down to solving an ODE, or several ODEs. I don't think it's a problem, and if you feel it is, read your text. Anyways make sure you remember how to solve the various types of ODEs. I would suggest taking a course on PDEs right after a course on ODEs if possible. Also, being familiar with vector calculus helps. It seems that when vector calculus was used, it was mostly the concepts being applied. Anyways every course on PDEs is different, and this is just my experience. Goodluck.
 
  • #329
if you read the book, can't you get a lot more out of it than the teacher is offering? boyce diprima has proofs in it right?

or read a better book, like arnol'd.
 
  • #330
That's what I was thinking. Boyce & DiPrima proves stuff. So even if your teacher only mentions the result without proof, you can look the proof up yourself.
 
  • #331
here is a site with free notes on a wide variety of topics:

http://us.geocities.com/alex_stef/mylist.html
 
Last edited by a moderator:
  • #332
Awesome. Thanks.
 
  • #333
Galois' theorem on solvability of polynomials by radicals
We will prove next that in characteristic zero, a polynomial whose
galois group is not a solvable group, is not "solvable by radicals", and
give an example of a polynomial that is not solvable by radicals.

Lemma: The galois group of a polynomial is isomorphic to a subgroup of
permutations of its distinct roots. If the polynomial, is irreducible, the
subgroup of permutations is transitive on the roots.

Cor: If an irreducible polynomial over Q has prime degree p, and exactly 2
non real roots, its Galois group is isomorphic to S(p).

Def: A primitive nth root of 1 (or of "unity"), is an element w of a field
such that w^n = 1, but no smaller power of w equals 1.

Lemma: If char(k) = 0, for every n > 0, there is an extension of k which
contains a primitive nth root of 1.

Theorem 1: If char(k) = 0, the Galois group G of X^n -1 over k is
isomorphic to a subgroup of the multiplicative group (Z/n)*, hence G is
abelian.
Rmk: If k = Q, then G ≈ (Z/n)*, as we will show later.
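A small numerical aside (mine, not part of the notes): the group (Z/n)* has order φ(n), Euler's totient, so by the Rmk the galois group of X^n - 1 over Q has order φ(n). For n = 12:

```python
from math import gcd
from sympy import totient

n = 12
units = [a for a in range(1, n) if gcd(a, n) == 1]  # the group (Z/12)*
print(units)       # [1, 5, 7, 11]
print(totient(n))  # 4, matching len(units)
```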

Theorem 2: If k is a field containing a primitive nth root of 1, and c is
an element of k, the galois group G of X^n-c is isomorphic to a subgroup of
the additive group Z/n, hence G is abelian.
Rmk: It is NOT always true that G equals Z/n, even if k is the splitting
field of X^n-1 over Q.

Theorem 3: If ch(k) = 0, and w is a primitive nth root of 1, and if k0 =
k(w), ki = k(w,a1,...,ai), where for all i = 1,...,m, ai^ri = bi is some
element of ki-1, and ri divides n, then the galois group of km =
k(w,a1,...,am) over k is solvable.

Def: A radical extension E of k, is one obtained by successively adjoining
radicals of elements already obtained. I.e. E = k(a1,...,am) where for
each i, some positive integral power of ai lies in the field
k(a1,...,ai-1).

Def: A polynomial f in k[X] is "solvable by radicals" if its splitting
field lies in some radical extension of k.

Theorem 4: If k has characteristic zero, and f in k[X] is solvable by
radicals, then the galois group of f is a solvable group.
Rmk: The converse is true as well, also in characteristic zero.

Cor 5: The polynomial f = X^5-80X + 2 in Q[X], is not solvable by radicals.
proof: f is irreducible over Q, by Eisenstein's criterion at p = 2. The
derivative 5X^4 - 80 has two real roots 2,-2, so the graph has two
critical points, (-2, 130), and (2, -126). Since f is monic of odd
degree, it thus has exactly 3 real roots, and 2 non real roots. The galois
group is therefore isomorphic to S(5), which has a non solvable subgroup
A(5) ≈ Icos. QED.
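One can confirm the two numerical claims in the proof with sympy (my check, not part of the original argument): f is irreducible over Q, and it has exactly 3 real roots.

```python
from sympy import Poly, symbols

x = symbols('x')
f = Poly(x**5 - 80*x + 2, x)

print(f.is_irreducible)     # True, as required to apply the Cor on S(p)
print(len(f.real_roots()))  # 3 real roots, hence exactly 2 non real roots
```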
 
  • #334
does fractal geometry have any practical uses?
 
  • #336
I read once that somebody made a fractal image compression algorithm for nature scenes. I don't think it was dramatically better than other algorithms for most subjects, but apparently it worked.
 
  • #337
In your very first post you mentioned

"basically 3 branches of math, or maybe 4, algebra, topology, and analysis, or also maybe geometry and complex analysis"

But what about branches like Statistics, Probability/
Stochastic Processes, Operations Research?
Do they fit into one of the 4 major fields you suggested? If so how would you put them in?
 
  • #338
The comment probably should have been, "three branches of PURE math".

Those topics you mentioned all fall under applied mathematics, with things like statistics and stochastic process borrowing heavily from analysis. The question is more, can I use some or invent some mathematical technique to solve this problem? So it doesn't really matter what branch that technique comes from.

http://www.math.niu.edu/~rusin/known-math/index/mathmap.html
 
Last edited by a moderator:
  • #339
yes, i admitted later that i am incompetent in those other areas of applied math, and i appreciate any input on those topics anyone is willing to offer. i apologize if i gave the impression my advice is comprehensive, as i am obviously limited by my own knowledge and experience.

i myself have studied only pure math, with courses in algebraic topology, algebraic geometry, functional analysis, riemann surfaces, homological algebra, complex manifolds, real analysis and representations.

then my research was entirely in riemann surfaces, singularities of theta divisors of jacobians and prym varieties, and their moduli.

so i am pretty ignorant of analysis, algebra, topology, finite math, probability, statistics, gosh almost everything.

I do know a little about theta divisors.

But I still feel free to offer advice!

i just meant to start this thread, not to dominate it. my apologies for its shortcomings.
 
Last edited:
  • #340
can one be both a pure and applied mathematician?
 
  • #341
Why don't you include complex analysis in analysis and geometry in topology, so that there are (definitively) 3 branches of (pure) mathematics?
 
  • #342
you could very reasonably do that. complex analysis, at least the homological kind I know most about, does have a rather different flavor from real analysis, but dyed-in-the-wool analysts use a lot of real analysis to do complex analysis, via harmonic functions.
 
  • #343
mathwonk said:
you could very reasonably do that. complex analysis, at least the homological kind I know most about, does have a rather different flavor from real analysis, but dyed-in-the-wool analysts use a lot of real analysis to do complex analysis, via harmonic functions.

That is interesting. So the complex analysis you do is more related to algebra than analysis?

Now I am going to ask, most likely, a very stupid question.

If someone wanted to solve the Riemann hypothesis, which branch of mathematics should they get into?
 
Last edited:
  • #344
In general, the more branches you get in, the better.
 
  • #345
radou said:
In general, the more branches you get in, the better.

Yeah, but you might want to do the relevant ones first.

I'd say Complex Analysis, Number Theory and Abstract Algebra would be the most relevant.

Not entirely sure if there are more important areas.

You might want to read "The Music of the Primes". That might give you an idea of what you're getting into.
 
  • #346
What is everyone's favorite book on the Riemann hypothesis? There's

Prime obsession
The Music of the Primes
The Riemann Hypothesis: The Greatest Unsolved Problem in Mathematics
Riemann's Zeta Function

And probably a myriad of other ones. Which one is the most interesting to read?
 
  • #347
i rather liked Riemann's own paper.
 
Last edited:
  • #348
the riemann hypothesis is clearly an application of complex analysis to number theory. put very simply, riemann's point of view was that a complex function is best understood by studying its zeros and poles.

the zeta function is determined by the distribution of primes among the integers, since its definition is f(s) = the sum over all n ≥ 1 of the terms 1/n^s, which by euler's product formula equals the product over all primes p of the sums 1 + 1/p^s + 1/p^(2s) + ..., and each of these geometric series sums to a factor 1/[1 - p^(-s)].

now this function, determined by the sequence of primes, is by riemann's philosophy best understood through its zeros and poles.

hence riemann's point of view requires an understanding of its non-trivial zeros, which he believed all lie on the critical line Re(s) = 1/2.

this hypothesis then allowed him to estimate the number of primes less than a given value to an accuracy closer than gauss's integral estimate.
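As a numeric aside (not part of the post above), one can compare the prime-counting function pi(x) with gauss's estimate, the logarithmic integral of x. The Python below is a rough sketch: trial division for the prime count and a trapezoid rule of my own choosing for the integral:

```python
# Compare pi(x) with the logarithmic integral of 1/log(t) from 2 to x.
import math

def pi_of(x):
    """Count primes <= x by trial division (fine for small x)."""
    return sum(
        1
        for n in range(2, x + 1)
        if all(n % d for d in range(2, int(n ** 0.5) + 1))
    )

def li(x, steps=100000):
    """Integral of 1/log(t) from 2 to x via the trapezoid rule."""
    h = (x - 2) / steps
    fs = [1 / math.log(2 + k * h) for k in range(steps + 1)]
    return h * (sum(fs) - 0.5 * (fs[0] + fs[-1]))

x = 10000
print(pi_of(x), li(x))  # pi(10^4) = 1229; li(10^4) is about 1245
```

Even at x = 10^4 the integral overshoots by only about 16 out of 1229, and the relative error shrinks as x grows, which is the prime number theorem in action.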

even with its flaws, this brief discussion lets you see what areas of math you might need to know at a minimum: complex analysis, number theory, integral estimates, and (though it did not appear above) mobius inversion.
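The sum-equals-product identity behind the zeta function is easy to check numerically. The Python sketch below (truncation points chosen arbitrarily by me) compares a partial sum of 1/n^s with a finite Euler product at s = 2, where the common limit is zeta(2) = pi^2/6:

```python
# Check that sum over n of 1/n^s agrees with the Euler product
# over primes p of 1/(1 - p^(-s)) for a real s > 1.
def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, n + 1, p):
                sieve[q] = False
    return [p for p in range(2, n + 1) if sieve[p]]

s = 2.0
partial_sum = sum(1.0 / n ** s for n in range(1, 100001))

product = 1.0
for p in primes_up_to(1000):
    product *= 1.0 / (1.0 - p ** (-s))

# both approximate zeta(2) = pi^2/6 ≈ 1.6449
print(partial_sum, product)
```

The two truncations agree to about three decimal places here; taking more terms and more primes drives them both toward pi^2/6.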
 
  • #349
mathwonk, what do you think of the math department of this college:

http://www.rose-hulman.edu/math/home.php
 
Last edited by a moderator:
  • #350
i never heard of it before but it looks like a really good undergraduate college department, with a deep commitment to teaching and nurturing undergraduates. i like what i see on their website.
 