How hard is this Linear Algebra textbook?

  • #1
TGV320
Hello,

I am currently self-studying linear algebra using the MIT lectures and the textbook Introduction to Linear Algebra by Professor Gilbert Strang. I'm on the 16th lecture, on projection matrices and least squares approximation. The lectures are very informative, but I struggle a lot with the problems at the end of each chapter. It's excruciating.

I found in the library a textbook called Linear Algebra with Applications by Steven J. Leon, and after reading a little of it, that book seemed relatively easier than the first one.

Please tell me whether the difficulty of Steven J. Leon's textbook is reasonable for a newbie at linear algebra like me, or whether Strang's exercises only seem hard because I'm really that bad at math.

Thanks
 
  • #2
Does it matter what level it's used for if you're learning? Use both if they help; find three others if you can. Learn from all books.

It's also meant as an introduction to linear algebra: "This book is for sophomore-level or junior/senior-level first courses in linear algebra and assumes calculus as a prerequisite"... so use it, and solve problems, because the point of learning is to learn, not to be egotistical about where you learn from.
 
  • #3
I think TGV320 wants some perspective, an assessment of his learning stage.
 
  • #4
I strongly disagree; it sounds like they just want to vent about needing to use a 'lesser' book (in their eyes) to learn. The book tells you verbatim, in the overview, what stage it is meant for.

To quote it literally: "This book is suitable for either a sophomore-level course or for a junior/senior-level course. The student should have some familiarity with the basics of differential and integral calculus. This prerequisite can be met by either one semester or two quarters of elementary calculus. If the text is used for a sophomore-level course..." and then it goes on to outline different lecture plans, which they should know if they read said book.

So, if it isn't about books, then it shouldn't be in the book section.
 
  • #5
I just perused a freely available section of Strang and confirmed my previous dislike of his books. The sample on Amazon just listed a bunch of facts to memorize, with no explanation and no motivation, hence nothing of great value in my mind. So, given that you are not finding it easy to learn from, I would toss it. The author Steven Leon has an excellent pedigree in both mathematics and teaching, so I would immediately prefer his book if I were you.

I also second the opinion that you should choose books mainly from the perspective of whether they speak to you or not, assuming they do not contain false information.

It is true that some books provide (potentially) higher than average level instruction, and it can be beneficial to struggle with them; but in my view, first of all, Strang's is not one of these superior books, and second, it is perfectly reasonable to stop trying to read such superior books, if they are found overly difficult, until one has learned enough from easier books to do so.

I also agree that the goal of reading a book is not to assess your ability but to improve it.
 
  • #6
Thanks for the advice, everyone; I shall follow it.
 
  • #7
I do not know Strang's book, but I found his MIT OpenCourseWare lectures to be a very good refresher (I confess I am not a very good student of "pure" maths, if that makes a difference). So mix and match?
 
  • #8
Actually, in China, Strang's lectures and his textbook have garnered a very strong following. On the video platforms Bilibili and NetEase OCW, they have at least 22 million views in total.

In great part that is because the home-grown Chinese textbooks are translations of 1950s Soviet applied linear algebra textbooks, extremely computational (with only matrices and determinants, no eigenvalues whatsoever), which are themselves watered-down 1950s Soviet "advanced algebra" textbooks.

That textbook is still used for graduate-level entrance examination math, so it still rules today.
 
  • #9
When I was a student of a topic, I'd go to the library, scan a bunch of books (there can be a hundred on a common topic), and find the one I liked the most. They all say the same thing; it's just a matter of style.

I took a linear algebra course in 1975 that used Strang. I didn't get it and got a C. In 1995 I took linear algebra in graduate school and was the best student. I thought it was easy. My roommate couldn't do it and flunked out. I can't explain this at all.
 
  • #10
The only truly excellent math course I had in high school was linear algebra (that was not the title, but it was the subject... maybe Algebra II was the title). I am truly indebted to Mr. Garlowe. I was (nearly) clueless at the time, but we all knew he was excellent and acted accordingly. Ripples were created.
 
  • #11
It strikes me that Strang's formulaic presentation, "these are the 4 fundamental subspaces! here are their properties!", may help many students remember the basic facts, but it does little to explain why the facts are true. I will expose my own bias for the conceptual approach, and allow you to dismiss my opinions if you are so inclined, by presenting here my own brief take on the "4 subspaces" (assuming we are working with real numbers):

Matrices give a numerical way to study linear geometry. A vector of dimension n represents an arrow in n-space, and the dot product of two such vectors is zero iff the arrows are perpendicular. An m x n matrix A of numbers represents a linear transformation A: R^n --> R^m, taking column vectors v of length n to column vectors Av of length m, by multiplying v from the left by A. This multiplication takes v to the column of dot products obtained by multiplying v by every row of A. It follows that the vectors v taken to zero, i.e. N(A) = the nullspace of A, are exactly the vectors perpendicular to the space R(A) spanned by the row vectors of A. I.e. the nullspace is the orthogonal complement of the row space: N(A) = R(A)perp.
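For concreteness, here is a minimal NumPy/SciPy check of that orthogonality, using a toy matrix of my own (not an example from either book):

[CODE=python]
import numpy as np
from scipy.linalg import null_space

# Toy example: a 2x3 matrix of rank 1, so N(A) is 2-dimensional.
A = np.array([[1., 2., 3.],
              [2., 4., 6.]])

N = null_space(A)  # columns form an orthonormal basis of N(A)

# A @ N computes the dot product of every row of A with every
# nullspace basis vector; all of them vanish, i.e. N(A) = R(A)perp.
print(np.allclose(A @ N, 0))  # True
[/CODE]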

Thus the whole n-space is split into two perpendicular components, R(A) and N(A), and every n-vector v is a unique sum, v = p+q, of a vector p in N(A) and a vector q in R(A). In particular n = dimR(A) + dimN(A). (This can be proved by the row-reduction process, or Gram-Schmidt “projection”, but I hope it is intuitively plausible that a space splits into any subspace plus the subspace of vectors perpendicular to that first subspace.)
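Here is a sketch of that splitting, continuing the toy matrix above; it uses the standard fact that pinv(A) @ A is the orthogonal projector onto the row space:

[CODE=python]
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.]])
v = np.array([1., 1., 1.])

P = np.linalg.pinv(A) @ A  # orthogonal projector onto R(A)
q = P @ v                  # the component of v in R(A)
p = v - q                  # the component of v in N(A)

# p really lies in the nullspace, and p is perpendicular to q.
print(np.allclose(A @ p, 0), np.isclose(p @ q, 0))  # True True
[/CODE]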

Now from the definition of matrix multiplication, an n-dimensional column vector v is taken to a linear combination Av of the m-dimensional column vectors of A, where the jth column vector of A is multiplied by the coefficient occurring in the jth position of v. It follows that the subspace of R^m consisting of all image vectors Av of the transformation A is exactly the span of the column vectors, the column space C(A).
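That description of Av as a combination of the columns is easy to check directly (same toy matrix as above):

[CODE=python]
import numpy as np

A = np.array([[1., 2., 3.],
              [2., 4., 6.]])
v = np.array([1., 1., 1.])

# Av equals the sum of v[j] times the jth column of A.
combo = sum(v[j] * A[:, j] for j in range(A.shape[1]))
print(np.allclose(A @ v, combo))  # True
[/CODE]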

Now if we express a vector v as a sum of vectors p and q, v = p+q, where p is in N(A) and q is in R(A), the image is Av = A(p+q) = Ap + Aq = Aq. I.e. since Ap = 0, the image vector Av is the same as Aq, where q is in R(A). Thus every image vector, i.e. every vector in the column space, is already the image of some vector in the row space.

Moreover no two different vectors in R(A) have the same image, since the only way Aq = Aq’ is if A(q-q’) = 0, i.e. if q and q’ differ by an element of N(A). But since q and q’ are both in R(A), their difference is too, so q-q’ is both in R(A) and in N(A), which is perpendicular to R(A). Thus q-q’ is perpendicular to itself, hence is zero, so q = q’.

Thus A sets up a one-to-one linear correspondence between R(A) and C(A). In particular R(A) and C(A) have the same dimension. If we define the “rank” of a matrix as the dimension of its column space, then since the rows of A are precisely the columns of its transpose A*, we get rank(A) = dimC(A) = dimR(A) = dimC(A*) = rank(A*): A and its transpose have the same rank. Or, as some people say, the column rank and row rank of A are equal. Thus dimR(A) = dimC(A) = dimR(A*) = dimC(A*). And dimN(A) + dimR(A) = n, while dimN(A*) + dimR(A*) = m.
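Numerically, one can test both claims at once: take an orthonormal basis of R(A) (the right singular vectors with nonzero singular values), check that its image under A is still linearly independent, and compare the ranks of A and its transpose. A hedged sketch, with another toy matrix of mine:

[CODE=python]
import numpy as np

A = np.array([[1., 2., 0.],
              [2., 4., 1.],
              [1., 2., 1.]])  # rank 2

U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))
row_basis = Vt[:r].T  # columns: orthonormal basis of R(A)

# A is one-to-one on R(A): the images of the basis stay independent,
print(np.linalg.matrix_rank(A @ row_basis) == r)  # True
# and row rank equals column rank.
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(A.T))  # True
[/CODE]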

The fact that dimC(A) + dimN(A) = n is called the rank-nullity theorem, and it is the most important fact in all of linear algebra.
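The theorem is a one-liner to verify numerically for any particular matrix, e.g. continuing the toy example above:

[CODE=python]
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2., 0.],
              [2., 4., 1.],
              [1., 2., 1.]])  # 3 columns, rank 2

rank = np.linalg.matrix_rank(A)      # dim C(A)
nullity = null_space(A).shape[1]     # dim N(A)
print(rank + nullity == A.shape[1])  # True: rank + nullity = n
[/CODE]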

(Strang's 4 subspaces are R(A), C(A), and their orthogonal complements R(A)perp = N(A) and C(A)perp = R(A*)perp = N(A*). It seems to take Strang about 200 pages to prove these dimension-theoretic facts, and 400 pages to even mention linear transformations.)
 
  • #12
As I struggle to recall some row reduction results, I am reminded that, to be fair to Strang, many people would actually prefer to be able to just do certain calculations accurately, rather than to "understand" them but not have much skill with them.

So if your goal is to deal confidently with matrices in practice, rather than to slouch back and talk about what they mean, you may well prefer Strang to any explanation I would make or recommend.

But I am just trying to be inclusively fair; I myself still dislike his books, even if I could well learn something from them.

Since I also dislike reading comments that say there exists something better but do not name any, here is a (free) book I like:
https://www.math.brown.edu/streil/papers/LADW/LADW.html
 
  • #13
Well, after reading Strang's preface to the 5th edition of his book, I feel like sort of a jerk for trashing his books. He is so enthusiastic, and so many people have been helped by his clear explanations of matrices, that he has surely done a great service to math education. I also agree with him that schools have taught too much calculus, when most students would benefit far more from learning linear algebra. So more power to him, and to everyone who enjoys his books. Still, choose books that help you.
 
  • #14
In thinking about how to give the explanation above in the simplest way, I have also learned something. My argument above shows that if A is a real matrix, the row and column spaces R(A) and C(A) are isomorphic, and in fact multiplication by A restricts to an isomorphism from R(A) to C(A).

I tried to see this in general by looking at a reduced-form matrix, trying to produce vectors in R(A) that map to the basis vectors of C(A), but could not do so just by taking linear combinations. It turns out this is not true over fields more general than the reals. My argument used the fact that a nonzero real vector is not perpendicular to itself, but over the complexes (1,i).(1,i) = 1 - 1 = 0, so the argument does not work. If we write V* for the dual space of a (finite-dimensional) space V, consisting of linear functions from V to the field of scalars, then V and V* have the same dimension, hence are abstractly isomorphic, but there is no natural isomorphism.
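The complex counterexample is easy to see numerically; note that NumPy's @ gives the unconjugated (bilinear) product, while vdot conjugates its first argument (the Hermitian form):

[CODE=python]
import numpy as np

z = np.array([1, 1j])
print(z @ z)          # 0j: (1,i).(1,i) = 1 - 1 = 0, self-perpendicular
print(np.vdot(z, z))  # (2+0j): the conjugated form is nonzero
[/CODE]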

It turns out in general that if A: k^n --> k^m is a matrix defining a linear map from k^n to k^m, then C(A) is isomorphic, via A, to the quotient space k^n/N(A), while R(A) is naturally isomorphic to the dual of this quotient space, i.e. to N(A)perp. So this explains why R(A) and C(A) always have the same dimension, even though the restriction of A mapping R(A) to C(A) is not always an isomorphism.

In Strang's computational approach, these dimensions are clearly the same because they are both equal to the number of pivots in an echelon form of A. That does not even attempt to explain why they are equal, but it does prove it.
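A quick SymPy illustration of the pivot count (the matrix is just an example of mine):

[CODE=python]
import sympy as sp

A = sp.Matrix([[1, 2, 0, 3],
               [2, 4, 1, 7],
               [1, 2, 1, 4]])

R, pivots = A.rref()  # reduced row echelon form and its pivot columns
# The number of pivots equals dim R(A) and dim C(A) alike.
print(len(pivots), A.rank(), A.T.rank())  # 2 2 2
[/CODE]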

I guess I conclude: if you want to know what is true, Strang can help; if you want to know why, for some things you may have to look elsewhere.
 
  • #15
Being a self-described nonmathematician, I think your analysis is on point and descriptive of my happiness with Prof. Strang's pedagogy. Were I a better mathematician, I would have verbalized similar commentary... thanks @mathwonk.
 
  • #16
As another argument for conceptual explanations, I offer a proof of the fact that the reduced echelon form of a matrix A is unique, no matter how you proceed to compute it. Online searches produce many very lengthy and tedious proofs of this, as well as claims that it is inherently hard. This is absurd. There are also a few short proofs, few of which are conceptual. I suspect Strang omits it entirely.

Uniqueness of reduced row echelon form.
Conceptual argument:
Note that the rows of the reduced echelon matrix of an m x n matrix A are just a nice basis of the row space of A. Note also that these rows project isomorphically onto the subspace of k^n spanned by the pivot variables, landing exactly on the standard basis of that subspace. That proves uniqueness: i.e. project the row space onto the subspace of k^n spanned by the earliest sequence of variables (in dictionary order) for which the projection is an isomorphism. Those are the pivot variables. Then take the inverse image of the standard basis under this isomorphism; those are the rows of the reduced echelon form. qed.
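One can at least test the uniqueness claim mechanically: any sequence of row operations amounts to left-multiplication by an invertible matrix E, and the reduced form comes out the same for A and for E*A. A small sketch, with a toy A and E of my own:

[CODE=python]
import sympy as sp

A = sp.Matrix([[1, 2, 0, 3],
               [2, 4, 1, 7],
               [1, 2, 1, 4]])

E = sp.Matrix([[1, 2, 0],
               [0, 1, 0],
               [3, 0, 1]])  # det(E) = 1, so E is invertible

# E*A has the same row space as A, hence the same reduced form.
print(A.rref()[0] == (E * A).rref()[0])  # True
[/CODE]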

More detailed concrete argument:
Proof: since the rows of the reduced echelon matrix of A are a basis for the row space of A, it follows that every vector in the row space of A is a (unique) linear combination of those reduced row vectors. If you look at the row vectors of a reduced matrix, you will see that the coefficients of any linear combination v of them appear as the entries in the pivot positions of v.
Hence a vector in the row space is entirely determined by its entries in the pivot positions. In particular, the jth row of the reduced matrix of A is the only element of the row space having a 1 in the jth pivot position and 0’s in the other pivot positions.
Thus the reduced row echelon form is determined by knowing which columns are the pivot columns. But the jth column of A is a pivot column if and only if the matrix consisting of the first j columns of A has greater rank than the matrix consisting of the first j-1 columns: this is true of the reduced form, and the reduction procedure does not change the row rank of the matrix formed by the first j columns of A. qed.
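The rank criterion for pivot columns in that last step can be coded directly and checked against SymPy's own pivot list; a sketch, reusing the toy matrix above:

[CODE=python]
import sympy as sp

A = sp.Matrix([[1, 2, 0, 3],
               [2, 4, 1, 7],
               [1, 2, 1, 4]])

def pivot_columns(M):
    # Column j is a pivot column iff the first j+1 columns have
    # greater rank than the first j columns.
    pivots, prev = [], 0
    for j in range(M.cols):
        r = M[:, :j + 1].rank()
        if r > prev:
            pivots.append(j)
        prev = r
    return tuple(pivots)

print(pivot_columns(A) == A.rref()[1])  # True
[/CODE]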

Both of these are much shorter than most proofs I found online.

Here is still another way to express the conceptual argument, assuming you know the pivot variables:
The row space R(A) in k^n projects isomorphically onto the space k^r spanned by the pivot variables of A, hence R(A) is the graph in k^n = k^r x k^(n-r) of a unique linear function f from k^r to k^(n-r). Since the pivot entries of the jth row of the reduced form of A represent ej, the jth standard basis vector of k^r, the non-pivot entries represent f(ej); in particular they are uniquely determined by A.

Explicitly, the r x (n-r) matrix of non-pivot entries of the reduced form of A is the (transpose of the) matrix of f, the linear map with graph equal to R(A); a small SymPy sketch of reading off this block appears at the end of this post.

Here is another conceptual point of view, which I claim shows how anyone might have thought of this uniqueness argument: a (finite) basis of a vector space V is equivalent to an isomorphism of V with some coordinate space k^r. I.e. v1, ..., vr is a basis of V if and only if there is an isomorphism from V to k^r taking vj to ej. Thus, when one hears the word "basis", it is helpful to think: "isomorphism with k^r".

Hence if we are looking for a distinguished basis of R(A), it is equivalent, and more natural, to look for a distinguished isomorphism from R(A) to some coordinate space. Since R(A) is a subspace of k^n, it is natural to look at projections of R(A) onto coordinate subspaces of k^n. Indeed, the earliest r-dimensional coordinate subspace of k^n, in the dictionary order, for which the projection from R(A) is an isomorphism, corresponds to the basis defined by the rows of the reduced form of A.
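And here is the sketch promised above of reading off the matrix of f; its entries are exactly the non-pivot entries of the nonzero rows of the reduced form (toy matrix again mine):

[CODE=python]
import sympy as sp

A = sp.Matrix([[1, 2, 0, 3],
               [2, 4, 1, 7],
               [1, 2, 1, 4]])

R, piv = A.rref()
nonpiv = [j for j in range(A.cols) if j not in piv]

# The r x (n-r) block of non-pivot entries of the nonzero RREF rows
# is (the transpose of) the matrix of f, whose graph is R(A).
F = R.extract(list(range(len(piv))), nonpiv)
print(F)  # Matrix([[2, 3], [1, 1]])
[/CODE]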
 
  • #17
TGV320 said:
Actually, in China, Strang's lectures and his textbook have garnered a very strong following. On the video platforms Bilibili and NetEase OCW, they have at least 22 million views in total.

In great part that is because the home-grown Chinese textbooks are translations of 1950s Soviet applied linear algebra textbooks, extremely computational (with only matrices and determinants, no eigenvalues whatsoever), which are themselves watered-down 1950s Soviet "advanced algebra" textbooks.

That textbook is still used for graduate-level entrance examination math, so it still rules today.
I believe his textbooks are popular because he teaches at MIT and his video lectures have a high view count. I agree with mathwonk and say that his book on LA is terrible.

Never looked at Leon's book, but I doubt anyone can do worse than Strang.

The easiest book I have seen for LA is the one by Paul Shields. Elementary but well written.
Other intro books I have seen for LA are similar to the one by Anton, i.e., David Lay etc., and are nearly indistinguishable from each other. They remind me of the run-of-the-mill calculus books of today. I read Shields, Anton, and Lang's elementary LA book when first learning the material.

Next up in difficulty would be books like Linear Algebra Done Right by Axler or the one by Friedberg/Insel/Spence.

I liked the book by Sterling K. Berberian. It reads similarly to Axler's, and was written before Axler's. It is also less well known, and cheaper.
 
  • #18
I used Anton, so I don't have a dog in this hunt.

There are two linear algebras, with different goals: one is designed around computation and the other is sort of a pre-abstract algebra. You probably want a clear idea of what the goal is before picking the book to help you accomplish it.
 
