1. Limited time only! Sign up for a free 30min personal tutor trial with Chegg Tutors
    Dismiss Notice
Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Homework Help: Want to understand proof of Dimension/ Rank-Nullity theorem

  1. Jun 23, 2013 #1
    OK, I am working on proofs of the rank-nullity (otherwise in my class known as the dimension theorem).

    Here's a proof that my professor gave in the class. I want to be sure I understand the reasoning. So I will lay out what he had here with a less-precise layman's wording, as I want to be sure I know what I am doing. It makes the proof easier to memorize for me.


    Let V and W be vector spaces.

    T:V→W is linear
    V is finite-dimensional
    function [itex]f \in [/itex] HomK(V,W)
    Let dim(V) = n for some n[itex]\in \mathbb N[/itex] and dim(ker([itex]f[/itex]) = r

    dim(V) = nullity(T) + rank(T) = dim(ker([itex]f[/itex]) + dim(Im([itex]f[/itex]))

    in some notations (like the one in our text) this wold look like dim(V) = nullity(T) + rank(T) = dim(N(T)) + dim(R(T))

    on to the proof:

    [itex]ker(f) \subseteq V[/itex]. And it is a subspace.

    Why a subspace? Because, since the kernel of any function is the set of vectors that goes to zero, adding to those vectors another vector in V will still be in V, a will multiplying them (since they go to zero).

    since we let dim(V)=n all the bases (basis-es?) of V will have n elements.

    therefore [itex] \exists[/itex] a basis {x1, x2, ... , xn} of [itex]ker(f)[/itex] where r≤n.

    (The reason is that any basis will have an equal or lesser number of dimensions than the space it describes. ker(f) is a subspace).

    by the exchange lemma, which says that given any linearly independent subset
    [itex] \exists[/itex] {y1, y2, ... ys} [itex]\in V [/itex] such that {y1, y2, ... yn}[itex] \cap [/itex]{x1, x2,... ,xr} = [itex] \varnothing [/itex] the next step says that {y1, y2, ... yn}[itex] \cup [/itex]{x1, x2,... ,xr} is a basis of V.

    Now, my question is if that is because the intersection of the two sets is the empty set and they are linearly independent?

    After that, we get to saying that {[itex]f(y_1), f(y_2),... f(y_n)[/itex]} is a basis of Im([itex]f[/itex]).

    But I am not sure why that is.

    He then says we can claim the following:
    [itex] λ_1 f(y_1) + λ_2 f(y_2)+... +λ_s f(y_s)= 0 [/itex]

    for some λ1, λ2, ..., λs [itex] \in [/itex] K

    so taking
    [tex]f \Big( \sum_{i=1}^s x_i y_i \Big) = 0 [/tex]
    we can make that into
    [tex] \Big[ \Big( \sum_{i=1}^s \lambda_i f(y_i) \Big) \Big] = \sum_{i=1}^s \lambda_i y_i \in ker(f)[/tex]

    That step I am a bit fuzzy on the reasoning. IIUC, it's just saying that taking the sum of f using the union of x an y sets equals zero (its just f(x,y) ) and the summation of the product of λ and all the f(y) terms is the same as the sum of all the λy terms and they are all in the kernel of f. But I wanted to be sure.

    He then said that the above implies that there exists some set of scalars, α1, α2, ... αs[itex]\in[/itex] K s.t. [itex]\sum_{i=1}^s \lambda_i y_i = \sum_{j=1}^r y_i x_j[/itex] and that further implies

    [tex]\sum_{j=1}^r \alpha_j x_j - \sum_{i=1}^s \alpha_i x_i = 0 [/tex] which implies αj, λi = 0 for all 1≤j≤r and 1≤i≤s.

    and that further implies that the set {f(y1), f(y2), ... ,f(ys)} is linearly independent.

    Then he says: for all z in the Im(f) there exists x[itex]\in[/itex] V s.t. z=f(x) (this seems obvious at one level but I felt like it was just sleight of hand).


    [tex]z = f \Big(\sum_{j=1}^r \alpha_j x_j - \sum_{i=1}^s \lambda_i y_i \Big) = \sum_{j=1}^r \alpha_j f(x_i) + \sum_{i=1}^s x_i f(y_i)= 0 + \sum_{i=1}^s x_i f(y_i)[/tex]

    and then he says dim(V) = r + s = dim(ker(f)) + dim(Im(f))

    its the last few steps I can't seem to justify in my head. Any help would be appreciated (and seeing if I copied this wrong from the board).
    Last edited: Jun 23, 2013
  2. jcsd
  3. Jun 23, 2013 #2


    User Avatar
    Homework Helper

    There are a number of transcription errors in your post.

    That's a bad idea. Precision is vitally important in mathematics.

    V and W.

    You need a linear map [itex]V \to W[/itex]. You've defined two: T and f. Since you don't mention T again I'll stick with f.

    A precise statement of the theorem is:

    Let [itex]V[/itex] and [itex]W[/itex] be vector spaces over a field [itex]K[/itex], and let [itex]f : V \to W[/itex] be linear. Let [itex]V[/itex] be finite dimensional with [itex]n = \dim V[/itex] and [itex]r = \dim \ker(f)[/itex]. Then [itex]\dim \mathrm{Im}(f) = n - r[/itex].

    The proof that [itex]\ker(f)[/itex] is a subspace should be the subject of a separate lemma; otherwise it makes no sense to talk about [itex]\dim \ker(f)[/itex].

    So far so good, but obviously the index should only run from 1 to r: [itex]\{x_1, \dots, x_r\}[/itex] is the basis of [itex]\ker(f)[/itex].

    It's the result of applying the exchange lemma: Given a basis for a subspace of V and a basis for V, you can obtain a basis of V which contains the basis of the subspace. The notation of the statement is probably going to be confusing here, but the key point is that you can apply the exchange lemma to obtain a basis [itex]\{x_1, \dots, x_r, y_1, \dots, y_s\}[/itex] of [itex]V[/itex] consisting of a basis [itex]\{x_1, \dots, x_r\}[/itex] of [itex]\ker (f)[/itex] and [itex]s = n - r[/itex] vectors [itex]\{y_1, \dots, y_s\}[/itex] which are not in [itex]\ker(f)[/itex].

    This is an assertion. It requires proof.

    Why prove that? You're trying ultimately to prove that [itex]\dim \mathrm{Im}(f) = s[/itex]. The way to do that is to prove that the [itex]s[/itex] vectors [itex]f(y_1), \dots, f(y_s)[/itex] are a basis of [itex]\mathrm{Im}(f)[/itex], since then by definition [itex]\dim \mathrm{Im}(f) = s[/itex].

    How do you do that? You need to show that every element of [itex]\mathrm{Im}(f)[/itex] can be expressed as a linear combination of those vectors, and that those vectors are linearly independent.

    Your professor has decided to switch the order of those, so he first shows that those vectors are linearly independent. He does so by contradiction: he assumes that they are not linearly independent, and shows that this means that the basis of V we've chosen is not linearly independent. That's not possible by definition of a basis, so he has the desired contradiction.

    But the crucial point is that you must assume that at least one [itex]\lambda_j[/itex] is not zero.

    He's using the linearity of f to show that
    0 = \sum_{i = 1}^s \lambda_i f(y_i) = f\left(\sum_{i = 1}^s \lambda_i y_i\right) = 0.
    The immediate consequence of that is that [itex]\sum_{i = 1}^s \lambda_i y_i \in \ker(f)[/itex].

    Yes: [itex]\sum_{i = 1}^s \lambda_i y_i \in \ker(f)[/itex], so it can be written as a linear combination of basis vectors of [itex]\ker(f)[/itex]:
    \sum_{i = 1}^s \lambda_i y_i = \sum_{j=1}^r \alpha_j x_j

    That's a bit garbled. You've shown that
    \sum_{i = 1}^s \lambda_i y_i - \sum_{j=1}^r \alpha_j x_j = 0
    Since at least one of the [itex]\lambda_i[/itex] doesn't vanish, it follows that the vectors [itex]\{x_1, \dots, x_r, y_1, \dots, y_s\}[/itex] are not linearly independent. But that's impossible: those vectors are by construction a basis of V, so they are linearly independent. This is the contradiction we're looking for: [itex]f(y_1), \dots, f(y_s)[/itex] must be linearly independent.

    Now he shows that [itex]\{f(y_1), \dots, f(y_s)\}[/itex] spans [itex]\mathrm{Im}(f)[/itex].

    That's the definition of [itex]\mathrm{Im}(f)[/itex]: it's exactly those things you get from applying f to every member of V. So if [itex]z \in \mathrm{Im}(f)[/itex], it can only be because there exists [itex]x \in V[/itex] such that [itex]z = f(x)[/itex].

    You've got a basis for V, so you can express [itex]x \in V[/itex] as a linear combination of basis vectors:
    x = \sum_{j = 1}^r \alpha_j x_j + \sum_{i = 1}^s \lambda_i y_i.
    Then, because f is linear,
    f(x) = \sum_{j = 1}^r \alpha_j f(x_j) + \sum_{i = 1}^s \lambda_i f(y_i)
    = 0 + \sum_{i = 1}^s \lambda_i f(y_i)
    because by construction [itex]f(x_j) = 0[/itex] for all j. That completes the proof.
  4. Jun 23, 2013 #3
    thanks, i noticed some of the typos too later on when it was pointed out. THe problem is that this guy asks for proofs on the exam and wants us to basically do it from memory. I feel like I am watching some guy talk in latin and we're supposed to memorize the bloody liturgy, you know? At least in vector calculus I felt like there was a procedure to follow, and that I could do the problems knowing that, so I didn't have so much memorization.
Share this great discussion with others via Reddit, Google+, Twitter, or Facebook

Have something to add?
Draft saved Draft deleted