It strikes me that Strang's formulaic presentation, "these are the 4 fundamental subspaces! here are their properties!", may help many students remember the basic facts, but it does little to explain why those facts are true. I will expose my own bias for the conceptual approach, and allow you to dismiss my opinions if you are so inclined, by presenting here my own brief take on the "4 subspaces" (assuming we are working with real numbers):
Matrices give a numerical way to study linear geometry. A vector of dimension n represents an arrow in n-space, and the dot product of two such vectors is zero iff the arrows are perpendicular. An m x n matrix A of numbers represents a linear transformation A:R^n-->R^m, taking column vectors v of length n to column vectors Av of length m, by multiplying v from the left by A. This multiplication takes v to the column of dot products obtained by dotting v with every row of A. It follows that the vectors v taken to zero, i.e. N(A) = the nullspace of A, are exactly the vectors perpendicular to the space R(A) spanned by the row vectors of A. I.e. the nullspace is the orthogonal complement of the row space, N(A) = R(A)perp.
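(If a numerical check helps, here is a minimal sketch, assuming NumPy and SciPy are available; the particular matrix A is just an arbitrary example, not anything special.)

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])   # a 2x3 example: A maps R^3 --> R^2

N = null_space(A)              # columns form an orthonormal basis of N(A)

# Each entry of A @ N is the dot product of a row of A with a nullspace
# vector, so "Av = 0" and "v is perpendicular to every row of A" are the
# same statement.
print(np.allclose(A @ N, 0))   # True
```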
Thus the whole n-space is split into two perpendicular components, R(A) and N(A), and every n-vector v is a unique sum, v = p+q, of a vector p in N(A) and a vector q in R(A). In particular n = dimR(A) + dimN(A). (This can be proved by the row-reduction process, or by Gram-Schmidt "projection", but I hope it is intuitively plausible that a space splits as the sum of any subspace and the subspace of vectors perpendicular to it.)
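Here is the splitting computed concretely, again only a sketch with an arbitrary example; scipy's orth and null_space supply orthonormal bases for R(A) and N(A):

```python
import numpy as np
from scipy.linalg import null_space, orth

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
v = np.array([1., 1., 1.])

R = orth(A.T)                  # orthonormal basis of the row space R(A)
q = R @ (R.T @ v)              # orthogonal projection of v onto R(A)
p = v - q                      # the leftover piece lies in N(A)

print(np.allclose(A @ p, 0))   # True: p is in the nullspace
print(np.allclose(p + q, v))   # True: v = p + q

# and the dimensions add up to n = 3:
print(R.shape[1] + null_space(A).shape[1])   # 3 = dimR(A) + dimN(A)
```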
Now from the definition of matrix multiplication, an n-diml column vector v is taken to a linear combination Av of the m-diml column vectors of A, where the jth column vector of A is multiplied by the coefficient occurring in the jth position of v. It follows that the subspace of R^m consisting of all image vectors Av of the transformation A is exactly the span of the column vectors, the column space C(A).
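(To see the "linear combination of columns" reading of Av directly, a quick sketch with the same arbitrary example matrix:)

```python
import numpy as np

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
v = np.array([10., 20., 30.])

# Av is the combination of the columns of A weighted by the entries of v.
lin_combo = v[0]*A[:, 0] + v[1]*A[:, 1] + v[2]*A[:, 2]
print(np.allclose(A @ v, lin_combo))   # True
```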
Now if we express a vector v as a sum v = p+q, where p is in N(A) and q is in R(A), the image is Av = A(p+q) = Ap + Aq = Aq, since Ap = 0. I.e. the image vector Av is the same as Aq, where q is in R(A). Thus every image vector, i.e. every vector in the column space, is already the image of some vector in the row space.
Moreover no two different vectors in R(A) have the same image, since the only way Aq = Aq' is if A(q-q') = 0, i.e. if q and q' differ by an element of N(A). But since q and q' are both in R(A), their difference is too, so q-q' is both in R(A) and in N(A), which is perpendicular to R(A). Thus q-q' is perpendicular to itself, hence is zero, so q = q'.
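Both halves of this argument, that v and its row-space part q have the same image, and that A is one-one on R(A), can be checked numerically; a sketch under the same assumptions as above:

```python
import numpy as np
from scipy.linalg import orth

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
v = np.array([1., 1., 1.])

R = orth(A.T)                    # orthonormal basis of R(A)
q = R @ (R.T @ v)                # row-space component of v

print(np.allclose(A @ v, A @ q))   # True: the nullspace part of v is killed

# A is one-one on R(A): applying A to a basis of R(A) yields independent
# vectors, i.e. A @ R has full column rank.
print(np.linalg.matrix_rank(A @ R) == R.shape[1])   # True
```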
Thus A sets up a one-one linear correspondence between R(A) and C(A). In particular R(A) and C(A) have the same dimension. If we define the "rank" of a matrix as the dimension of its column space, then, since the rows of A are precisely the columns of its transpose A*, we see that A and its transpose have the same rank. Some people say instead that the column rank and row rank of A are equal. Thus dimR(A) = dimC(A) = dimR(A*) = dimC(A*). And dimN(A) + dimR(A) = n, while dimN(A*) + dimR(A*) = m.
The fact that dimC(A) + dimN(A) = n is called the rank-nullity theorem, and is the most important fact in all of linear algebra.
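All of these dimension counts can be verified in a few lines; a final sketch, still assuming NumPy and SciPy and the same arbitrary example matrix:

```python
import numpy as np
from scipy.linalg import null_space

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
m, n = A.shape

rank_A  = np.linalg.matrix_rank(A)
rank_At = np.linalg.matrix_rank(A.T)
print(rank_A == rank_At)                         # True: row rank = column rank

# rank-nullity for A in R^n and for its transpose A* in R^m:
print(rank_A + null_space(A).shape[1] == n)      # True
print(rank_At + null_space(A.T).shape[1] == m)   # True
```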
(Strang's 4 subspaces are R(A), C(A), and their orthogonal complements, R(A)perp = N(A), and C(A)perp = R(A*)perp = N(A*). It seems to take Strang about 200 pages to prove these dimension-theoretic facts, and 400 pages to even mention linear transformations.)