[SOLVED] Sharipov's linear algebra textbook...

Hi fellow Year-2005 vector space cadets,

Here's an interesting linear algebra textbook I stumbled upon this morning - Course of Linear Algebra and Multidimensional Geometry. It uses a fairly rigorous development style, which I like (if done well). But everyone has different tastes, so maybe it'll make you puke. Whatever...

Anyway, I just lightly scanned through the thing but spent some time studying the first chapter (and a few other misc. tidbits) to see how well the material was presented. A few things I like:

1. He proves nearly every theorem in the book. For me, this is the best part, since it makes for an interesting study of some of his strategies for structuring various proofs.

2. It's only 143 pages, so, being concise, it's also interesting to see if he can develop the material smoothly without resorting to using and abusing previously undefined concepts. In the little bit that I studied, he did OK - I did find a few no-nos though. I'm trying to imagine how well I could understand this book if it were my first exposure to linear algebra, without the benefit of any professor's lectures. Hmmmm... maybe it's not a very good intro.

3. There are no "exercises left for the reader" to feel guilty about not doing. I know this is usually considered one of the most valuable parts of a textbook when properly designed, but I have plenty of other L.A. and set theory stuff with more exercises than I could do in a lifetime.

4. He doesn't begin the journey by treading the matrix/determinant path, but rather uses the more mathematically rigorous set-theoretical method. [i.e. non-math fanatics would probably toss it in the Recycle Bin pronto]

Ok - have a happy New Year! I'll bet right now there's a few of you who wish you'd done what I did a couple years ago - namely, quit drinking :tongue2:

Perion
I think i can write a shorter one than that: linear algebra is about linear spaces, i.e. vector spaces, and linear maps between them.

the first topic is therefore linear spaces. a (real) linear space V is a set of "vectors" closed under addition and multiplication by real numbers, which is an abelian group under addition (the usual properties of arithmetic hold, like associativity, commutativity and existence of a zero and negatives) and has the expected properties under multiplication (multiplication by 1 acts as the identity, multiplication distributes over addition, a(bv) = (ab)v if a, b are numbers and v is a vector).

the basic example is R^n, ordered n-tuples of real numbers, with componentwise addition and multiplication, i.e. (v1,...,vn) + (w1,...,wn) = (v1+w1, v2+w2,...,vn+wn) and a(v1,...,vn) = (av1,...,avn).

a subspace of V is a nonempty subset W closed under addition and scalar multiplication. given a subspace W of V we can define a new vector space V/W by identifying two vectors x, y in V provided x-y lies in W. given any two vector spaces V, W we can also define a new space V+W consisting of all ordered pairs (x,y) with x in V and y in W. addition and multiplication are componentwise, as in R^n, which is merely the sum of n copies of the real numbers.

a map f from V to W is called linear if f(x+y) = f(x)+f(y) for all x, y in V, and also f(ax) = af(x) for all x in V and all reals a. an isomorphism is a linear map with a linear inverse. exercise: any bijective linear map is an isomorphism. given a space V and a subspace W, the map V-->V/W sending a vector to its equivalence class is a linear map sending all vectors in W to zero.

a "linear combination" of the vectors v1,...,vm is a vector which is a finite sum of multiples of the given ones, i.e. a vector of the form a1v1+...+amvm. a set of vectors "spans" or "generates" a vector space if every vector in the space is a linear combination of the given vectors,
equivalently, if the given vectors are not contained in any proper subspace. a space is called finite dimensional if it has a finite spanning set, or equivalently if there is a linear surjection from some R^n to that space. we restrict attention to finite dimensional spaces from now on.

note: every vector in R^n is a linear combination of the standard unit vectors (1,0,...,0),...,(0,...,0,1), but none of these is a linear combination of the others. hence we call these the standard "basis", e1,...,en. in any vector space we call a subset a "basis" if every vector is a linear combination of them, but no vector in the subset is a linear combination of the other vectors in that subset.

an isomorphism between two spaces always takes a basis to a basis. in particular an isomorphism from R^n to V takes the standard basis of R^n to some basis of V. conversely, any basis v1,...,vn of V defines a unique isomorphism from R^n to V sending (a1,...,an) to a1v1+...+anvn. thus an (ordered) basis for V may be thought of merely as an isomorphism of V with R^n. in other words, an ordered basis is a way to introduce linear coordinates into V, since each vector gets represented by a sequence of numbers, or coordinates.

example: the set of polynomials of degree at most d has as basis the set of d+1 monomials 1, X,...,X^d. another basis is the set of d+1 polynomials 1, (1+X), (1+X+X^2),...,(1+X+...+X^d).

theorem: every finite dimensional space has a basis, i.e. admits an isomorphism with some R^n. proof: choose any finite spanning set v1,...,vn. throw out any zero vectors. if v2 is a multiple of v1, throw out v2; if not, keep it. if v3 is a linear combination of v1, v2, throw it out; if not, keep it. continue through the spanning set, throwing out any vectors which are linear combinations of previous ones. then the ones left are a basis: they still span, but none is a linear combination of previous ones, hence none is a linear combination of any of the others. (if one of them were a combination of the others, then the survivor of highest index appearing with a nonzero coefficient would be a combination of earlier survivors, which we ruled out.) QED.
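the greedy procedure in this proof (keep a vector only if it is not a linear combination of the ones already kept) can be sketched numerically. this is just an illustration, not part of the proof: it tests "is a linear combination of" with numpy's rank function, so it is only reliable up to floating point tolerance.

```python
import numpy as np

def spanning_set_to_basis(vectors, tol=1e-10):
    """greedily keep each vector that is NOT a linear combination
    of the vectors already kept (the procedure in the proof above)."""
    kept = []
    for v in vectors:
        candidate = kept + [v]
        # v depends on the kept vectors exactly when appending it
        # fails to increase the rank
        if np.linalg.matrix_rank(np.array(candidate), tol=tol) == len(candidate):
            kept.append(v)
    return kept

# a spanning set for the xy-plane inside R^3, with redundancies
span = [np.array([1.0, 0.0, 0.0]),
        np.array([2.0, 0.0, 0.0]),   # multiple of the first: thrown out
        np.array([0.0, 1.0, 0.0]),
        np.array([3.0, 4.0, 0.0])]   # combination of kept ones: thrown out
basis = spanning_set_to_basis(span)
print(len(basis))  # 2
```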
cor: we have proved that every finite generating set contains a basis. rephrasing this proof in terms of linear maps, it follows that any linear surjection from R^n to V which is not injective restricts to an isomorphism from some linear subspace (which may be regarded as R^m where m < n) onto V.

cor: given any basis of V, there is a one-to-one correspondence between linear maps from V to W and set functions from the basis to W, i.e. every function on the basis extends uniquely to a linear map. proof: this is true of R^n, hence of all finite dimensional spaces.

exercise: if V = R^n and W is the subspace spanned by en, then V/W is isomorphic to R^(n-1). exercise: if f:V-->W is a linear map and ker(f) is the subspace of vectors sent by f to zero, then f is constant on equivalence classes in V/ker(f), hence defines a linear map V/ker(f)-->W, which is always injective, and is still surjective if f was.

we define a space to have dimension n if it is isomorphic to R^n. we claim the dimension of a space is "well defined", i.e. that a space cannot have two different dimensions. it suffices to show: theorem: if R^n and R^m are isomorphic, then n = m. proof: clearly there is no linear surjection from R^1 to a higher dimensional space, since the image vectors of a linear map from R^1 to R^m all have proportional entries. if n < m and there were a linear surjection f: R^n --> R^m, and if e1,...,en and u1,...,um are the standard bases of these two spaces, then the composition R^n-->R^m-->R^m/span(um) is a linear surjection which is not injective. hence there is a restriction to some lower dimensional subspace of R^n which is still surjective onto R^(m-1). by induction on n this is a contradiction. QED.

cor: two spaces are isomorphic if and only if they have the same dimension. cor: all bases have the same cardinality. we agree that the space {0} containing only the zero vector has dimension zero, and has the empty set as a basis.

theorem: if W is a subspace of V, then dimW + dim(V/W) = dimV.
proof sketch: choose a basis for W, and extend it to a basis of V. then the added vectors (more precisely, their equivalence classes) are a basis for V/W. QED.

theorem: if f:V-->W is a linear surjection, then dim ker(f) + dimW = dimV. proof: f induces an isomorphism from V/ker(f) to W. QED.

cor: dim(V+W) = dimV + dimW. proof: the projection taking (x,y) to y is a linear surjection from V+W to W with kernel V. QED.

definition: an indexed set of vectors {vi} is independent if the only linear combination of them that equals the zero vector has all coefficients equal to zero. equivalently, if some ai is a nonzero scalar, then a1v1+...+anvn is not the zero vector.

lemma: every independent set is contained in a basis. proof: given independent vectors v1,...,vn, add on any basis to get a generating set v1,...,vn,w1,...,wm. then applying the procedure in the theorem above for reducing a generating set to a basis does the job. QED.

cor: if V has dimension n then an independent set of vectors in V has at most n vectors. if a set of vectors in V has more than n vectors, it is not independent. if dimV > dimW, then a linear map V-->W is not injective, and a linear map W-->V is not surjective. proof: easy exercise.

exercise: if the sequence v1,...,vn is independent then the map R^n-->V taking (a1,...,an) to a1v1+...+anvn is injective. (this is a tautology.)

that pretty much finishes the theory of finite dimensional vector spaces and their dimension. the next chapter is to classify linear maps between them. i would do it by introducing modules over rings, row and column operations on matrices, and diagonalizing the matrix giving a finite presentation of the R[X] module structure on V defined by a linear endomorphism. but not right now.
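the statement dim ker(f) + dim im(f) = dim V can be checked numerically for any matrix. a small numpy illustration (mine, not part of the proofs above; both dimensions are read off the singular values, so this is a consistency check rather than a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))   # a linear map f: R^5 --> R^3 (generic, hence surjective)

# dim im(f): the number of independent columns of A
dim_image = np.linalg.matrix_rank(A)

# dim ker(f): input dimension minus the number of nonzero singular values
singular_values = np.linalg.svd(A, compute_uv=False)
dim_kernel = A.shape[1] - np.count_nonzero(singular_values > 1e-10)

print(dim_kernel, dim_image, dim_kernel + dim_image)  # 2 3 5
```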
chapter 2: dot products, matrices, eigenvalues, and diagonalizable linear maps.

define the dot product of two vectors in R^n as follows: (a1,...,an).(b1,...,bn) = a1b1+...+anbn.

given a linear map f from R^n to R^m, arrange the image vectors f(e1),...,f(en) as columns in a matrix M. then the columns of M have length m and there are n of them, i.e. the rows have length n. now if v = (a1,...,an) is any vector in R^n, note that the jth entry in the vector f(v) is obtained by dotting v with the jth row of M. thus every linear map from R^n to R^m is obtained from a unique m by n matrix.

if f is any linear map from V to W, then by choosing bases for V and W we obtain isomorphisms between them and some R^n and R^m, hence we obtain a resulting map from R^n to R^m, which has a matrix. this is called the matrix of f associated to the given bases for V and W. in particular a map from V to itself has a matrix associated to any given basis of V. if f:V-->W and g:W-->U are linear maps and we choose bases for all three spaces, then the matrix of the composition gof has as entry in its ith row and jth column the dot product of the ith row of the matrix for g with the jth column of the matrix for f.

note that the matrix of a map from V to V has nonzero entries only on the diagonal if and only if each basis vector is mapped to a scalar multiple of itself. this is called a diagonal matrix, and the associated basis is called a basis of eigenvectors of f. the identity map of V has as matrix the diagonal matrix with ones on the diagonal, in any basis. thus any basis of V is an eigenbasis for the identity map. the map given by multiplication by the scalar c has as matrix the diagonal matrix with all c's on the diagonal.

an n by n matrix A is called symmetric if and only if the entry in the ith column and jth row equals the entry in the jth column and ith row, for every i and j. the matrix A* obtained from A by interchanging its rows and columns is called the transpose of A.
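the recipe "columns of M are the images of the basis vectors, and entries of f(v) are dot products with the rows" is easy to see in a small numpy example (my own illustration; the particular vectors are made up):

```python
import numpy as np

# a linear map f: R^2 --> R^3, specified by where it sends the standard basis
f_e1 = np.array([1.0, 0.0, 2.0])
f_e2 = np.array([0.0, 3.0, 1.0])
M = np.column_stack([f_e1, f_e2])   # 3 by 2: columns are f(e1), f(e2)

v = np.array([2.0, 5.0])            # v = 2 e1 + 5 e2
# the ith entry of f(v) is the dot product of the ith row of M with v
fv = np.array([M[i] @ v for i in range(3)])

print(fv)  # same as M @ v, i.e. 2 f(e1) + 5 f(e2)
```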
in general, if v, w are any vectors in R^n, then v.(A*w) = (Av).w. in particular if A is symmetric, then A* = A and hence Av.w = v.Aw.

theorem: if A is symmetric, then R^n has a basis of eigenvectors for A. proof: the function f(x) = Ax.x has a maximum on the unit sphere in R^n, at which point the gradient vector of f is zero on the tangent space to the sphere, i.e. is perpendicular to the tangent space at that point. but the tangent space at x is the subspace of vectors perpendicular to x, and the gradient of f at x is the vector 2Ax. hence Ax is also perpendicular to the tangent space at x, i.e. Ax is parallel to x, i.e. x is an eigenvector for A.

that gives us one eigenvector for A. now restrict A to the tangent space (through the origin) to the sphere at x. i.e. let v be a tangent vector, so that v.x = 0. then Av.x = v.Ax = v.cx for some c, so this is also zero, and hence A preserves this tangent space. now A still has the property Av.w = v.Aw on this subspace, so the restriction of A has an eigenvector, and by induction on the dimension of the space, A has a basis of eigenvectors. QED.

the entries on the diagonal of a diagonal matrix for A are called the eigenvalues of A. in general an eigenvalue for A is a number c such that Av = cv for some nonzero vector v.
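numerically, the theorem just proved is what numpy's eigh computes: for a symmetric matrix it returns real eigenvalues and an orthonormal basis of eigenvectors (orthogonality of eigenvectors for distinct eigenvalues follows from Av.w = v.Aw). a sketch with a made-up symmetric matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
assert np.allclose(A, A.T)           # A is symmetric

eigenvalues, Q = np.linalg.eigh(A)   # columns of Q are the eigenvectors

# Q.T A Q is the diagonal matrix of eigenvalues: the columns of Q are an eigenbasis
assert np.allclose(Q.T @ A @ Q, np.diag(eigenvalues))
# the eigenbasis is orthonormal, as the perpendicularity argument predicts
assert np.allclose(Q.T @ Q, np.eye(3))
```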
i just thought it might be fun to see the whole story explained in a few paragraphs. can't hurt you much.
addendum to chapter 2: diagonalizable maps and minimal polynomials.

given a vector space V and a linear map f:V-->V, define a multiplication of the polynomial ring R[X] on V by saying that X times a vector v is f(v). similarly, X^2 times v is f(f(v)), and so on. then V is called a "module" over the ring R[X].

since the space of n by n matrices is itself a vector space of dimension n^2, the space of linear maps from V to V also has dimension n^2. since sending a polynomial P to multiplication by P defines a linear map from R[X] to the space of linear maps on V, and since R[X] is not finite dimensional, having as basis all monomials 1, X, X^2, X^3,..., the map from R[X] to linear maps on V must have a nonzero kernel. the monic polynomial of least degree in that kernel is called the minimal polynomial of f. it is the monic polynomial P of smallest degree such that P(f)v = 0 for all v in V.

for a diagonal matrix with entries c1,...,cn on the diagonal, note that the polynomial (X-c1)(X-c2)...(X-cn) does give zero when applied to f, i.e. when f is substituted for X. in fact this is still true when we omit repeated occurrences among the scalars ci. so a necessary condition for diagonalizability of a map is that its minimal polynomial must be of the form (X-c1)(X-c2)...(X-ct) where t ≤ n and all ci are distinct. i think this is also sufficient. let's see if we can prove that.

consider for each root ci of the polynomial P the kernel Vi of (f-ci.Id). then we claim that V is isomorphic to the sum of the subspaces Vi. well, there is certainly a map from that sum to V, which we must show is injective and surjective. if not injective, then some sum of the form v1+...+vt = 0, with vi in Vi and not all zero; renumbering, we may assume vt is not zero. then vt is in the span of the vectors v1,...,v(t-1), which are all sent to zero by the polynomial (X-c1)(X-c2)...(X-c(t-1)). then vt is also sent to zero by this polynomial, but this is false: X-c1 sends vt to (ct-c1)vt, which is not zero; then X-c2 sends (ct-c1)vt to (ct-c1)(ct-c2)vt, which is not zero, ...etc.
for surjectivity, let v be any vector in V, and define the polynomials P1,...,Pt, where P1 = (X-c2)(X-c3)...(X-ct), P2 = (X-c1)(X-c3)...(X-ct), P3 = (X-c1)(X-c2)(X-c4)...(X-ct), ..., Pt = (X-c1)(X-c2)...(X-c(t-1)). then these polynomials are relatively prime, hence by the euclidean algorithm we can find polynomials Q1,...,Qt such that P1Q1+...+PtQt = 1. then applying this to v gives v = 1v = P1(Q1(v))+...+Pt(Qt(v)), which lies in the sum of the images of the polynomials Pi. since the image of each Pi is in the kernel of X-ci, we have proved surjectivity.

well, i think i have written a very short text covering most of a first course in linear algebra, minus gaussian elimination and determinants. reactions?
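the necessary condition (and the claimed sufficiency) can be poked at numerically: for a diagonalizable matrix, the product of the factors (f - ci.Id) over the distinct eigenvalues ci is the zero map, while for a jordan block it is not. a small numpy sketch of my own:

```python
import numpy as np

def apply_factors(A, roots):
    """apply the polynomial (X-c1)(X-c2)...(X-ct) to the matrix A."""
    n = A.shape[0]
    P = np.eye(n)
    for c in roots:
        P = P @ (A - c * np.eye(n))
    return P

# diagonalizable: diag(1,1,2). minimal polynomial is (X-1)(X-2), repeats omitted.
D = np.diag([1.0, 1.0, 2.0])
assert np.allclose(apply_factors(D, [1.0, 2.0]), 0)

# not diagonalizable: a jordan block. (X-1) alone fails; (X-1)^2 is needed,
# so the minimal polynomial has a repeated root.
J = np.array([[1.0, 1.0],
              [0.0, 1.0]])
assert not np.allclose(apply_factors(J, [1.0]), 0)
assert np.allclose(apply_factors(J, [1.0, 1.0]), 0)
```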
I'm confused by V/W. I mean, I think it denotes the "quotient set" or set of all "equivalence classes", but then subspace W would be an "equivalence relation" - which I guess I'm OK with - not sure. Let's see. For each [tex] a \in V[/tex] the "equivalence class" (denoted by [a]) is the set of all elements to which a is related by the relationship W. This seems totally weird to me. Anyway, if that's correct (which it probably isn't), then we should have the "equivalence class" defined by: [a] [tex]=\{x:(a,x)\in W\}[/tex] In the above equation, (a,x) is a particular equivalence relationship between each a in V and some other element(s) x of V, determined by W (which doesn't make sense at all to me, but I tried to set this up to follow the set-theoretic definitions of equivalence relation, equivalence class, and quotient set). If this is wrong please show me what's going on. Pressing on in spite of my confusion, if the above is accurate, V/W should be defined by: V/W [tex]= \{[a]:a\in V\}[/tex] Again, this seems weird and probably wrong, so, what does set V/W represent that V is being mapped into? Perion
it looks pretty good, but i would write that the equivalence class [a] consists of all x such that a-x is in W. i.e. the vectors in [a] are all those of form a+x where x is in W. then indeed V/W consists of all [a] where a is in V. this just says the map V-->V/W from the space V to the set of all equivalence classes is surjective, a traditional property of quotient maps. geometrically, if V is the plane and W a line through the origin, then V/W consists of all lines parallel to W. If you pick any line L through the origin and not equal to W, then L meets exactly one line which is parallel to W. the point of intersection allows you to represent that element of V/W as a point of L. so in this way V/W becomes isomorphic to L, in particular V/W has dimension one as expected.
Ooops - I was editing my previous post while you were posting your reply - sorry. Don't think it makes any difference. Perion
note if W is a line through the origin and we choose a point a on any line parallel to W, then all points of the line containing a have the form a+x where x is in W. i.e. the other points on the line parallel to W are the other elements of [a]. it may seem weird to think of a subspace W as an equivalence relation, but it isn't really. i.e. W is just the set of things equivalent to zero. we do not need the other equivalence classes to be given specifically because we have the ability to subtract. so we just say two vectors are equivalent if their difference is equivalent to zero, i.e. lies in W.
here is the basic infinite dimensional example: V = all continuously differentiable functions defined on the real line, and W = all continuous functions. then the map D:V-->W given by taking the derivative is linear, and surjective by the fundamental theorem of calculus. the kernel of D is all constant functions, by the mean value theorem. V has infinite dimension because the monomials 1, X, X^2, X^3,... are independent. for every scalar c, e^(cx) is an eigenvector for D with eigenvalue c. although these eigenvectors do not span V, trying to do the best we can using them to represent other functions is called the theory of fourier series. we use infinite linear combinations instead of only finite ones, and define a dot product of f and g as the integral of fg. it is sometimes advisable to restrict to twice differentiable functions or more, and to a finite interval of definition (e.g. to make this integral a finite number).
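on the finite dimensional shadow of this example everything is computable: the derivative restricted to polynomials of degree at most d is a linear map whose matrix (in the monomial basis) we can write down, and its kernel is the constants. a numpy sketch of my own:

```python
import numpy as np

d = 3
# matrix of D on polynomials of degree <= d, in the basis 1, X, X^2, X^3:
# the kth column is the coefficient vector of D(X^k) = k X^(k-1)
D = np.zeros((d + 1, d + 1))
for k in range(1, d + 1):
    D[k - 1, k] = k

# kernel of D is the constants: rank is d, nullity is 1
assert np.linalg.matrix_rank(D) == d
p = np.array([5.0, 0.0, 0.0, 0.0])   # the constant polynomial 5
assert np.allclose(D @ p, 0)

# D(X^2) = 2X: the coefficient vector of X^2 maps to twice that of X
assert np.allclose(D @ np.array([0.0, 0.0, 1.0, 0.0]),
                   np.array([0.0, 2.0, 0.0, 0.0]))
```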
Surjective meaning: for all x in W, there exists an a in V such that a-x is in W - i.e. sorta like an "onto" map, only a relation? But am I correct in saying that those geometrical properties are just the consequence of your particular choice of definition for equivalence class (i.e. "the equivalence class [a] consists of all x such that a-x is in W")? And why did you choose that as your definition? Perion
"Surjective meaning, for all x in W, there exists an a in V such that a-x is in W - i.e sorta like an "onto" map, only a relation?" no. for all x in V/W there is a v in V such that v goes to x in the map v->[v]
I'm confused. Where am I screwed up here? Originally, mathwonk talked about "given a space V and a subspace W, the map V-->V/W..." where V/W is a quotient set or set of equivalence classes [v]. Let R denote our equiv. relation. Isn't V/W (where W is a subset of V) the set of equivalence classes [v] such that v is in V? And isn't [v] the set of all elements x in W (rather than in V/W) for which (v,x) is in R? Isn't that what V/W and [v] usually mean? I know the overall mapping is V->V/W, but I guess I'm confused between when we're talking about the equivalence relation (and especially the notion of V/W, given that W is a subset of V) and when we're dealing with the mapping V->V/W. I'm very tired and this probably makes no sense at all. Tomorrow I'll probably laugh at my questions. Perion
I said something crazy: "If you pick any line L through the origin and not equal to W, then L meets exactly one line which is parallel to W. the point of intersection allows you to represent that element of V/W as a point of L. so in this way V/W becomes isomorphic to L, in particular V/W has dimension one as expected." I meant to say that our line L meets each line U parallel to W in exactly one point. that one point of L represents the line U.

as to why I chose the equivalence relation that way, it was to make the equivalence relation compatible with the addition of vectors and scalar multiplication. i.e. one would want zero plus zero to equal zero, and also a scalar times zero to be zero. so if anything is set equal to zero, then all multiples of that thing should also be set equal to zero, and the sum of two things set equal to zero should also be set equal to zero. so our definition is the only way to define an equivalence relation on a vector space so that the set of equivalence classes is also a vector space, and the natural map taking a vector a to its equivalence class [a] is a linear map. i.e. the set of things equivalent to zero should be a subspace, such as W, and then the set of things equivalent to a should be of the form a plus anything equivalent to zero, i.e. things equivalent to a should look like a+x, where x is in W.

then one can always make any linear map factor through a linear surjection followed by a linear injection. i.e. if f:V-->T is linear and W is the subspace ker(f) of V, then the map f factors uniquely through the projection V-->V/W and an induced injection V/W-->T, taking [a] to f(a). i.e. if a+x is any element of [a] then f(a+x) = f(a)+f(x) = f(a)+0, because x by definition belongs to ker(f) = W. then the map V/W-->T is injective because we equated all things that map to zero with zero itself. hence the only thing that maps to zero from V/W is [0]. i.e. if f(x) = 0, then [x] = [0] in V/W.
this is a standard construction in all forms of mapping situations. if S and T are merely sets, and f:S-->T any function, define an equivalence relation R on S by saying that x and y are equivalent if f(x) = f(y). then S/R is the set of equivalence classes, and now there is a unique injection S/R-->T taking [x] to f(x). again the natural projection S-->S/R is surjective.

if G and H are groups and f:G-->H is a group homomorphism, then we let K = ker(f) be the set of elements of G mapped by f to the identity of H, called whatever, 0, or 1, or e. then K is a subgroup of G, and we can define a group structure on the set of equivalence classes under the relation that x and y in G are equivalent mod K if x-y is in K, i.e. if f(x) = f(y), i.e. if f(x-y) or f(xy^(-1)) = e or 1 or 0 or whatever you call it. it turns out that the set G/K of equivalence classes is a group such that G-->G/K is a group homomorphism, where we define the operation on [x] and [y] in G/K to be [x+y] or [xy] or whatever the operation in G is called.

actually not every subgroup K of G has the property that one can define a nice group operation on G/K such that sending x to [x] is a group map G-->G/K. in fact this is possible if and only if K is the kernel of some group map, if and only if for every element x of G the set of left products xK equals the set of right products Kx (assuming the operation on G is multiplication), if and only if K is "normal". in particular this always holds if G is abelian, as in the case of a vector space where the operation is vector addition.
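the abelian case is exactly modular arithmetic: take G = the integers, K = the multiples of n, and G/K = the integers mod n. a tiny sketch (my own, with n = 5 as the made-up example):

```python
n = 5

def cls(a):
    """the equivalence class [a] in Z/nZ, named by its canonical representative."""
    return a % n

# x ~ y iff x - y lies in the subgroup nZ
assert cls(7) == cls(12)           # 7 - 12 = -5 is in 5Z
# the induced operation [x] + [y] = [x + y] is well defined:
# adding representatives and then reducing agrees with reducing first
assert cls(cls(7) + cls(9)) == cls(7 + 9)
```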
This is wrong: "Let R denote our equiv. relation. Isn't V/W (where W is a subset of V) the set of equivalence classes [v] such that v is in V? And isn't [v] the set of all elements x in W (rather than in V/W) for which (v,x) is in R?" [v] is the set of elements x of V such that v-x is in W. i.e. (v,x) belongs to the equivalence relation if and only if v-x is a vector in W. W consists precisely of the elements x which are equivalent to zero, i.e. such that (0,x) belongs to R. frankly i do not think of an equivalence relation as a set of ordered pairs. to me that is a set theoretic device with nothing to recommend it intuitively, merely a trick for making all concepts set theoretic in nature.
this treatment might seem amusing, but if you compare it with the text of sharipov, you will see i have covered in about 8 pages much of his 140 page text, including factor spaces, and the spectral theorem for self adjoint operators in the real case.
The first part of the book (through page 10 at least) only deals with sets (briefly) and mappings. The more I've gotten into that book the less I seem motivated to use it. I was looking into equivalence relations and other related stuff due to our discussion, but that particular book was little help, containing only the usual standard definition on page 32. I use a wide variety of resources when I'm chasing after some understanding. Seems to work best for my pea brain.

When I first saw your reference to mapping V-->V/W I was intrigued and confused by this usage of the set-theoretical "quotient set" V/W - especially with the occurrence of the subset W where the relation is normally placed. I've only encountered quotient sets, equivalence relations, equiv. classes, etc., in set theory books rather than in terms of vector spaces and subspaces - though vector spaces are obviously defined in set-theoretic terminology, so I have no problem with that. The notion of mapping a vector space into the set of its own equivalence classes seemed pretty strange - especially where a subset of the vector space is deemed the relation.

I've only ever seen the quotient set used in the format A/~ where A refers to the set to which the equivalence relation pertains and ~ refers to the relation - i.e. a subset of A x A composed of all the ordered pairs (a, x) where a and x (each elements of A) obey whatever particular equivalence rule ~ refers to. That's why when I saw V/W I was struck by curiosity. It looked like, even though W was assumed to be a subspace of V, it also somehow referred to the set of equivalence classes for V. This didn't seem possible. But, if "W is just the set of things equivalent to zero" as you put it, I guess(?) W could be an equivalence relation. But, is W composed of a set of vector elements v_a, v_b, etc., or a set of pairs of equivalent elements (v_a, v_b), (v_c, v_d), etc., like in the set-theoretic definition of an equivalence relation???
It seems to me it matters unless we are just playing a bit loose with what is meant by "equivalence relation". Happy Wednesday! Perion
The important thing is that the set of equivalence classes carries a vector space structure. That is, the addition [x]+[y]=[x+y] and scalar mult. t[x]=[tx] are well defined - they are independent of the choice of representatives of the equivalence class. This may seem odd to you, but this kind of thing is exactly where equivalence classes came from, or at least it appears to be: "the" example of them is modulo arithmetic, if you like. I've seen very few "good" mathematicians who would choose to define equivalence classes as ordered pairs, as subsets of AxA. This treatment, whilst abstract and general, disguises the naturalness of the definition of an equivalence relation and only seems to confuse the issue. W isn't an equivalence relation. We are not playing at all loose with the definitions - you must realize that these are just notational conventions.
Thanks for the reply Matt. I've been camping in set theory stuff too much lately. I guess I never thought of "equivalence" for vector spaces in set theory terminology like "equivalence relation", "equivalence classes", "quotient sets", and all that. But I'm just a (persistent) novice and undoubtedly have quite a few misconceptions and distortions in my understanding. I'm working on it. And yeah - I do understand the congruence relation in modulo arithmetic doesn't have anything to do with ordered pairs or cross products... so, I'll get the ordered pair, set stuff, out of my head. Later, Perion