
Determinant of Transpose Operator

  1. Sep 2, 2012 #1
    I'm trying to find a way to prove that the determinant of the transpose of an endomorphism is the determinant of the original linear map (i.e. det(A) = det(Aᵀ) in matrix language) using Dieudonne's definition of the determinant expressed in terms of an alternating bilinear form but am having problems with it. Let me set the stage a little:

    Start with a Bilinear form ψ : V × V → ℝ | (u,v) ↦ ψ(u,v) = ψ(∑ᵢ₌₁²λᵢeᵢ,∑ᵢ₌₁²μᵢeᵢ).

    It is alternating if ψ(v,v) = 0 for all v; anti-symmetry, ψ(u,v) = - ψ(v,u), then follows by expanding ψ(u + v,u + v) = 0.

    Consider the endomorphism f : V → V where V is finite-dimensional. Then ψ(f(u),f(v)) = γψ(u,v) for some γ ∈ ℝ (a unique γ, which follows by expanding the alternating form & showing γ is determined by the action of ψ on its basis). If we intuitively interpret the alternating bilinear form ψ(u,v) as the area spanned by u & v, the geometric interpretation of γ is inescapably obvious: it is the factor by which f scales areas. Here γ is known as the determinant of the map f, i.e. ψ(f(u),f(v)) = det(f)ψ(u,v).
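
    As a quick sanity check of this definition, here is a minimal sketch, assuming the standard area form ψ(u,v) = u₁v₂ - u₂v₁ on ℝ² and an arbitrary example matrix for f. The helper names (psi, apply, det_via_form) are hypothetical and only for illustration.

```python
def psi(u, v):
    """Alternating bilinear form on R^2: signed area spanned by u and v (assumed choice)."""
    return u[0] * v[1] - u[1] * v[0]

def apply(m, v):
    """Apply a 2x2 matrix, given as a tuple of rows, to a vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def det_via_form(m):
    """gamma = psi(f(e1), f(e2)), using the normalisation psi(e1, e2) = 1."""
    return psi(apply(m, (1, 0)), apply(m, (0, 1)))

f = ((2, 1), (1, 3))      # arbitrary example endomorphism
gamma = det_via_form(f)   # plays the role of det(f)

u, v = (3, -1), (4, 7)    # arbitrary test vectors
lhs = psi(apply(f, u), apply(f, v))
rhs = gamma * psi(u, v)
print(gamma, lhs, rhs)
```

    The same γ comes out whichever pair u, v is used, which is the uniqueness statement above in miniature.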

    Some quick properties that follow immediately, without any stress, from the definition:

    1: ψ(u,v) = det(I)ψ(u,v) → det(I) = 1

    2: ψ((fg)(u),(fg)(v)) = det(fg)ψ(u,v) → ψ(f(g(u)),f(g(v))) = det(f)ψ(g(u),g(v)) = det(f)det(g)ψ(u,v) → det(fg) = det(f)det(g)

    3: f Bijective → ff⁻¹ = I → 1 = det(I) = det(ff⁻¹) = det(f)det(f⁻¹) → det(f) ≠ 0

    4: ψ(f(u + w),f(v)) = det(f)ψ(u + w,v) = det(f)ψ(u,v) + det(f)ψ(w,v) = ψ(f(u),f(v)) + ψ(f(w),f(v))

    5: ψ(f(u + av),f(v)) = det(f)ψ(u + av,v) = det(f)ψ(u,v)

    6: ψ(f(b),f(v)) = ψ(f(αu + βv),f(v)) = ψ(αf(u) + βf(v),f(v)) = αψ(f(u),f(v)) ---> α = ψ(f(b),f(v))/ψ(f(u),f(v)), β = ψ(f(u),f(b))/ψ(f(u),f(v))

    7: ψ((f + g)(u),(f + g)(v)) = det((f + g))ψ(u,v) = [det(f) + det(g)]ψ(u,v) + ψ(f(u),g(v)) + ψ(g(u),f(v))

    (I can't think of what to do right now that doesn't result in a dead end but at least we have a nice geometric interpretation of why det(A + B) ≠ det(A) + det(B)).
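
    Properties 2 and 7 can be checked numerically. The sketch below assumes the standard area form on ℝ² with ψ(e₁,e₂) = 1 and two arbitrary example matrices; all function names are hypothetical.

```python
def psi(u, v):
    """Standard area form on R^2 (assumed normalisation: psi(e1, e2) = 1)."""
    return u[0] * v[1] - u[1] * v[0]

def apply(m, v):
    """Apply a 2x2 matrix (tuple of rows) to a vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))

def mul(a, b):
    """Matrix product, i.e. the composition a after b."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def det(m):
    """Determinant read off from the form: psi(m e1, m e2)."""
    return psi(apply(m, (1, 0)), apply(m, (0, 1)))

f = ((1, 2), (3, 4))
g = ((0, 1), (1, 1))
e1, e2 = (1, 0), (0, 1)

# The two mixed terms that stop det from being additive (property 7).
cross = psi(apply(f, e1), apply(g, e2)) + psi(apply(g, e1), apply(f, e2))

product_rule_holds = det(mul(f, g)) == det(f) * det(g)
sum_rule_holds = det(add(f, g)) == det(f) + det(g) + cross
print(product_rule_holds, sum_rule_holds, cross)
```

    The cross term is exactly the obstruction to det(A + B) = det(A) + det(B); it vanishes only in special cases.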

    All of the above extends easily to multilinear forms, now here are the questions:

    [1] Using this approach to show det(f) = det(fᵀ) I get stuck. There seem to be a few approaches. One, taken by Dieudonne, is to just say that on expanding the determinant we see that the expansion equals the expansion of the transpose matrix. Another, taken by Bourbaki, is to prove using permutations that the determinants of the matrices are equal, then use the fact that the dual basis representation of the map is the transpose of the matrix representation in the original basis to prove this for the maps... However these approaches seem disconnected from the simplicity of the above picture, so I gave it a shot myself. Basically, if det(f) = det(fᵀ) holds when we do it with matrices, it should be that:

    ψ(f(u),f(v)) = ψ(u,fᵀ(f(v))) = ψ(fᵀ(f(u)),v) = det(f)ψ(u,v)


    ψ(fᵀ(u),fᵀ(v)) = ψ(u,f(fᵀ(v))) = ψ(f(fᵀ(u)),v) = det(fᵀ)ψ(u,v)

    give the same result, though I can find no way of showing this. However, if the above equalities are taken seriously then it amounts to showing ψ(u,fᵀ(f(v))) = ψ(u,f(fᵀ(v))),
    but quite honestly I'm not sure any of this is even valid in the first place. For instance:

    If we define:
    f : V → U | v ↦ f(v)
    Φ : U → ℝ | w ↦ Φ(w)
    (Φf) : V → ℝ | (Φf) : v ↦ (Φf)(v) = Φ(f(v))
    fᵀ: L(U,ℝ) → L(V,ℝ) | Φ ↦ fᵀ(Φ) = Φ(f) w/ fᵀ(Φ)(v) = (Φf)(v) = Φ(f(v))

    then given a concrete example:

    f : V → U | (x,y) ↦ f(x,y) = (y,x + y)
    Φ : U → ℝ | (u,v) ↦ Φ(u,v) = u - 2v
    (Φf) : V → ℝ | (Φf) : (x,y) ↦ (Φf)(x,y) = Φ(f(x,y)) = Φ(y,x + y) = y - 2(x + y)
    fᵀ: L(U,ℝ) → L(V,ℝ) | Φ ↦ fᵀ(Φ) = Φ(f) w/ fᵀ(Φ)(x,y) = (Φf)(x,y) = Φ(f(x,y)) = Φ(y,x + y) = y - 2(x + y)
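
    The concrete example can be written out as a small sketch. It assumes functionals are represented as Python callables and, for the coordinate check, that a functional a·x + b·y is recorded as the pair (a, b). The names transpose, pulled_back and the matrix A are hypothetical.

```python
def f(p):
    """The map f(x, y) = (y, x + y)."""
    x, y = p
    return (y, x + y)

def phi(p):
    """The functional Phi(u, v) = u - 2v."""
    u, v = p
    return u - 2 * v

def transpose(linear_map):
    """f^T sends a functional Phi on U to the functional Phi∘f on V."""
    return lambda functional: (lambda p: functional(linear_map(p)))

pulled_back = transpose(f)(phi)   # the functional (x, y) -> y - 2(x + y)

# Coordinates of each functional: its values on the standard basis.
phi_row = (phi((1, 0)), phi((0, 1)))
pulled_row = (pulled_back((1, 0)), pulled_back((0, 1)))

# Matrix of f in the standard basis, and its matrix transpose applied to phi_row.
A = ((0, 1), (1, 1))
matrix_version = tuple(sum(A[i][j] * phi_row[i] for i in range(2)) for j in range(2))
print(phi_row, pulled_row, matrix_version)
```

    In these coordinates fᵀ acts on the coefficients of Φ by the transposed matrix, which is the usual link between the abstract transpose and Aᵀ.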

    even considering the map ψ(u,fᵀ(f(v))) = ψ(u,f(fᵀ(v))) makes no sense. How in the world can you write fᵀ(f(v))?
    fᵀf : V → L(V,ℝ) isn't a valid map, fᵀ has to be defined on the image of f whereas it's not defined on the vector space U it's defined on the vector space of linear functionals from U to ℝ. It implies that f(x,y) = (y,x + y) is an element of L(U,ℝ). If you want to compose f with fᵀ then fᵀ has to be defined on L(U,ℝ), you're talking about entirely separate concepts in doing this so I don't see why ψ(u,fᵀ(f(v))) = ψ(u,f(fᵀ(v))) makes any sense, it seems fast and loose to me... Even pretending this composition makes sense we get fᵀf : V → L(V,ℝ) | (x,y) ↦ (fᵀf)(x,y), which - now an element of L(V,ℝ) gives (fᵀf)(x,y) : V → ℝ | (z,w) ↦ [(fᵀf)(x,y)](z,w) which again makes no sense to me.

    I can't even make up an example to test whether ψ(u,fᵀ(f(v))) = ψ(u,f(fᵀ(v))) holds inside an alternating bilinear form so hopefully you see my problem.

    Basically I don't know

    a) why fᵀ(f(v)) is a valid definition
    b) whether ψ(u,fᵀ(f(v))) = ψ(u,f(fᵀ(v))) holds inside of an alternating bilinear form
    c) an alternative way to show ψ(f(u),f(v)) = ψ(fᵀ(u),fᵀ(v)) = det(f)ψ(u,v) = det(fᵀ)ψ(u,v) without giving in, losing the flow, & resorting to permutations or explicit calculations.
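
    Question (b) can at least be tested numerically under one reading. The sketch below assumes fᵀ is read as the matrix transpose in the standard basis (an assumption, since fᵀ properly acts on functionals) and ψ is the standard area form on ℝ². It compares that reading with the adjoint relative to ψ, which for this ψ works out to the adjugate matrix. All names are hypothetical.

```python
def psi(u, v):
    """Standard area form on R^2 (assumed choice of psi)."""
    return u[0] * v[1] - u[1] * v[0]

def apply(m, v):
    """Apply a 2x2 matrix (tuple of rows) to a vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])

def mul(a, b):
    """Matrix product a after b."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def transpose(m):
    """Ordinary matrix transpose."""
    return ((m[0][0], m[1][0]), (m[0][1], m[1][1]))

def psi_adjoint(m):
    """Adjoint relative to psi: psi(m u, v) == psi(u, psi_adjoint(m) v)."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))

A = ((1, 2), (0, 1))      # arbitrary non-symmetric example with det(A) = 1
u, v = (1, 0), (0, 1)

target = psi(apply(A, u), apply(A, v))                    # det(A) * psi(u, v)
with_transpose = psi(u, apply(mul(transpose(A), A), v))   # matrix-transpose reading
with_adjoint = psi(u, apply(mul(psi_adjoint(A), A), v))   # psi-adjoint reading
print(target, with_transpose, with_adjoint)
```

    For this example the matrix-transpose reading does not reproduce ψ(f(u),f(v)), while the ψ-adjoint does. That suggests the step ψ(f(u),f(v)) = ψ(u,fᵀ(f(v))) is only legitimate when fᵀ is replaced by the adjoint relative to ψ.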

    [2] If it works out that we can prove things this way & I were to try to adapt this process to multilinear forms, I don't know how one can talk about the transpose of a linear map since in most sources, e.g. here, it's defined in terms of bilinear forms & we're working inside of a multilinear form - would I have to create a valid definition of the transpose inside of a multilinear form? I can't find anything on this & I don't know if one can just write ψ(f(u),f(v),f(w)) = ψ(u,fᵀ(f(v)),fᵀ(f(w))) etc...?

    [3] Dieudonne goes on to define the adjoint f* of a map f, one way using a bilinear form ψ(f(x),y) = ψ(x,f*(y)), & an alternative way I'll describe below.

    If f is an endomorphism of V then fᵀ is an endomorphism of L(V,ℝ), & so using the linear maps:

    s_ψ : V → L(V,ℝ) |x ↦ s_ψ(x) w/ s_ψ(x) : V → ℝ | y ↦ [s_ψ(x)](y) = ψ(x,y)
    d_ψ : V → L(V,ℝ) |y ↦ d_ψ(y) w/ d_ψ(y): V → ℝ | x ↦ [d_ψ(y)](x) = ψ(x,y)

    it "readily follows" that the map f* defined as

    f* = (s_ψ)⁻¹(fᵀ)(s_ψ) : V → V
    f* = (d_ψ)⁻¹(fᵀ)(d_ψ) : V → V

    known as the Adjoint of f Relative to ψ is an endomorphism of V.

    Now, I have no idea in the world how the two definitions are equivalent (this was included in a book aimed at French high school students!!!), but assuming we've proven that det(f) = det(fᵀ) we can show that det(f) = det(f*) since det(f*) = det[(s_ψ)⁻¹(fᵀ)(s_ψ)] = det(fᵀ) = det(f).
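
    One way to see the equivalence is to unwind the conjugation by s_ψ and d_ψ directly. The derivation below is a sketch that assumes ψ is nondegenerate (so s_ψ and d_ψ are invertible) and uses fᵀ(Φ) = Φ∘f together with [s_ψ(x)](y) = ψ(x,y) and [d_ψ(y)](x) = ψ(x,y).

```latex
% Assumption: \psi is nondegenerate, so s_\psi and d_\psi are invertible.
\begin{align*}
f^* = s_\psi^{-1} f^{T} s_\psi
  &\iff s_\psi(f^*(x)) = f^{T}(s_\psi(x)) = s_\psi(x) \circ f \quad \text{for all } x \\
  &\iff \psi(f^*(x), y) = \psi(x, f(y)) \quad \text{for all } x, y, \\[4pt]
f^* = d_\psi^{-1} f^{T} d_\psi
  &\iff d_\psi(f^*(y)) = f^{T}(d_\psi(y)) = d_\psi(y) \circ f \quad \text{for all } y \\
  &\iff \psi(x, f^*(y)) = \psi(f(x), y) \quad \text{for all } x, y.
\end{align*}
```

    The second line of each pair is the bilinear-form definition of the adjoint, with f* on the left slot in one case and on the right slot in the other. When ψ is symmetric or alternating, swapping the arguments on both sides changes both by the same sign, so the two conditions coincide and define the same f*.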

    Now if there is an alternative way to prove det(f) = det(f*) we could invert f* to show det(f) = det(fᵀ), so I offer this as an extra possible approach to this problem, though more so I post it to see if anybody understands this & can justify why both definitions are equivalent, because I certainly can't. However this approach seems to me to also be constrained by the fact that fᵀ doesn't seem to be defined for multilinear forms, so I'm not sure if this is flawed or not.

    Thanks for taking the time with this!
  3. Sep 2, 2012 #2
    The transpose should be equivalent to the adjoint if the underlying metric involved is Euclidean.

    Pardon me for intruding as a physicist trying to deal with higher linear algebra. This is what I know. Let [itex]\underline A(a)[/itex] denote a linear operator acting on a vector [itex]a[/itex] and [itex]\overline A(a)[/itex] denote its adjoint . For some vector [itex]b[/itex], the two are related by

    [tex]\underline A(a) \cdot b = a \cdot \overline A(b)[/tex]

    This can be extended to two equal-grade multivectors without any modification. Hence, it can be extended all the way to the pseudoscalar [itex]i[/itex] and scalar multiples thereof. Since there is only one pseudoscalar in a space, this yields (for scalars [itex]\alpha, \beta[/itex])

    [tex]\underline A(\alpha i) \cdot (\beta i) = \alpha \beta \underline A(i) \cdot i = \alpha \beta i \cdot \overline A(i)[/tex]

    This should persuade us that [itex]\underline A(i) = \overline A(i)[/itex]. We call the action of a linear operator on the pseudoscalar of a space the determinant: [itex]\underline A(i) \equiv i(\det \underline A)[/itex].

    I've spoken only about operators and their adjoints here. The adjoint is simply much more useful than the transpose, and while they're equivalent when the metric is Euclidean, that's obviously not the general case.

    Again, pardon me for intruding. I feel as though we're doing the same thing but talking in two different languages, so I know I may be a bit difficult to parse. One thing I do notice, though: I interpret your [itex]\psi(u,v)[/itex] as a metric. I view this as a linear operator on a flat space as well: [itex]\psi(u, v) \equiv \underline \psi(u) \cdot v[/itex], say. Then you talk about [itex]\underline \psi \underline f(u) \cdot \underline f(v) = \gamma \underline \psi(u) \cdot v[/itex]. I see this a lot in my work, and it describes a conformal transformation. I'm just wondering what that has to do with the determinant, as it seems to say more that [itex]\underline f \overline f(a) = \gamma \underline I(a)[/itex], a multiple of the identity operator, than anything particular about the determinant.

    I must emphasize that perhaps it is simply my primitive understanding of what you're doing.
  4. Sep 2, 2012 #3


    Science Advisor

    Can you decompose your operator into a rotation/non-rotation decomposition?

    Rotation matrices have the property that the determinant is 1, which means that the inverse has a determinant of 1, and you can show that the inverse is equal to the transpose.

    Since any multi-linear map is just a composition of linear maps you should be able to do this decomposition.

    If you can get a normalized rotation matrix and another matrix to multiply it by (like a diagonal for instance) then since R^(-1) = R^T and det(R) = det(R^-1) = det(R^T) = 1 and for the diagonal matrix, the determinant is easy to calculate (since it becomes an "inverted diagonal") and you're done.
  5. Sep 2, 2012 #4
    Oh, I see what you were doing with [itex]\psi[/itex] now. You're basically defining the outer or wedge product, a means of extending linear operators to multilinear algebra, right? The wedge product is a product between vectors [itex]u \wedge v = - v \wedge u[/itex] such that [itex]u \wedge u = v \wedge v = 0[/itex]. When used with linear operators, one often chooses to say:

    [tex]\underline A (a \wedge b) \equiv \underline A(a) \wedge \underline A(b)[/tex]

    One can interpret this as a definition. Otherwise, yes, you're correct, the determinant follows more or less from this notion. In a 2d space, we have

    [tex]\underline A(e_1 \wedge e_2) = (\det \underline A) e_1 \wedge e_2[/tex]

    So if I follow your logic now, you're basically saying that if [itex]\underline f(u \wedge v) = \gamma u \wedge v[/itex]--that is, if [itex]u \wedge v[/itex] is an eigenbivector of the linear operator--then all this stuff follows. That's fine logic for a 2d space (since we've already restricted ourselves to eigenbivectors), so let's follow it.

    Still, putting this in terms of the vectors [itex]u,v[/itex] makes things kind of sticky for extending to an arbitrary dimensioned space. I'll just put things in terms of the unit pseudoscalar of a 2d space [itex]i \equiv e_1 \wedge e_2[/itex]. The following statements are meant to correspond to your numbered statements:

    1) [itex]i = 1i = \underline I(i)[/itex], which means the determinant of the identity is 1.
    2) [itex]\underline f \underline g(i) = \underline f[(\det \underline g)i] = i (\det \underline f)(\det \underline g)[/itex]
    3) [itex]i = \underline f \underline f^{-1}(i) \implies \det \underline f = 1/(\det \underline f^{-1})[/itex]
    For the following only in a 2d space:
    4) [itex]\underline f([u + w] \wedge v) = \underline f(u \wedge v) + \underline f(w \wedge v) = (\det \underline f)(u \wedge v + w \wedge v)[/itex]
    5) [itex]\underline f([u + av] \wedge v) = \underline f(u \wedge v) + \underline f(av \wedge v) = (\det \underline f)(u \wedge v)[/itex] because [itex]v \wedge v = 0[/itex].
    6) [itex]\underline f(b \wedge v) = \underline f([\alpha u + \beta v] \wedge v) = \alpha (\det \underline f) u \wedge v[/itex]
    7) [itex](\underline f + \underline g)(u \wedge v) = (\det \underline f) u \wedge v + (\det \underline g)(u \wedge v) + \underline f(u) \wedge \underline g(v) + \underline g(u) \wedge \underline f(v)[/itex]

    I hope this persuades you that we're talking about the same math (even if they're presented in different languages). I think the symmetry argument I gave earlier should make more sense now with this correspondence between your statements in your language and them in mine.
  6. Sep 2, 2012 #5
    I'm really sorry, I can't give a proper response until probably Tuesday, but yeah I think you're doing the same thing in the language of geometric algebra, which at the very least gives equivalent answers though I'm unsure as to whether there are inbuilt assumptions in going down that route.

    I'll say now that my source for differential forms is Cartan's book, which isn't exactly standard (though it is extremely explicit about what's going on in the domain & what's going on between spaces etc...). Insofar as I can gauge, concepts like the wedge product come after concepts like multilinearity, & further, differential forms & wedge products are only defined as maps from a Banach space into the space of alternating multilinear functionals, i.e. there are many more restrictions on these concepts than on something like a determinant. A determinant as I've defined it here seems to depend only on an alternating multilinear functional (form) without the requirement of completeness that differential forms seem to impose. Further, this seems like the most basic exposition of the determinant I can find short of going into modules à la Bourbaki's Algebra. (edit: sorry, I don't know why I immediately thought of differential forms, as the wedge product can indeed be defined without reference to them; however the concept still depends on alternating multilinear maps having been established, so again the point about restrictions holds :p).
    But you seem to be using the language of geometric algebra & although I know extremely little about it I know it is defined over a vector space with an associated quadratic form, so there could be all kinds of inbuilt assumptions, considering that a quadratic form is itself defined in terms of a symmetric bilinear form. Again I don't know if that affects this discussion, but I don't know how you're able to get this anti-symmetry if the whole thing is defined in terms of a quadratic form. So I don't know, there may be no inbuilt assumptions, what do you think?

    As for extending everything I wrote to a multilinear form, I don't see how it gets sticky:

    ψ(f(v¹), f(v²), ..., f(vⁿ)) = det(f)ψ(v¹, v², ..., vⁿ)

    & you've got n-dimensional volume & all the properties follow with no change, apart from the issues with the adjoint & transpose I mentioned. Maybe you thought I meant something else like mapping u & v, which are in 2-space, to n dimensional space via f? Maybe there's something I'm missing?
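
    A numerical check of the n-dimensional statement for n = 3 can be sketched as follows. It assumes ψ is the alternating 3-linear form normalised so that ψ(e₁,e₂,e₃) = 1, computed here by a brute-force signed sum purely for testing (not as part of any proof). The names sign, psi and apply are hypothetical.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation, via its inversion count."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def psi(*vectors):
    """Alternating n-linear form on R^n with psi(e1, ..., en) = 1 (assumed normalisation)."""
    n = len(vectors)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for k in range(n):
            term *= vectors[k][p[k]]
        total += term
    return total

def apply(m, v):
    """Apply a square matrix (tuple of rows) to a vector."""
    return tuple(sum(row[j] * v[j] for j in range(len(v))) for row in m)

f = ((1, 2, 0), (0, 1, 3), (4, 0, 1))          # arbitrary example endomorphism of R^3
basis = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
det_f = psi(*(apply(f, e) for e in basis))     # psi(f(e1), f(e2), f(e3))

vs = ((1, 1, 0), (2, 0, 1), (0, 3, 1))         # arbitrary test vectors
lhs = psi(*(apply(f, v) for v in vs))
rhs = det_f * psi(*vs)
print(det_f, lhs, rhs)
```

    The scaling factor is again independent of the chosen vectors, so the 2-dimensional picture of det(f) as an area factor carries over to a volume factor.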

    Also your comment about the requirement of the space being Euclidean for the transpose to be equivalent to the adjoint got me thinking about what's going on here. A metric is a symmetric bilinear form yet we're working with an alternating bilinear form. Now the adjoint is defined for arbitrary bilinear forms, both symmetric & alternating, so I'm starting to think that you might actually need to expand the determinant in terms of the coefficients in order to offer up an interpretation of the transpose matrix or the adjoint matrix, because in doing so the coefficients over the complex field will be conjugated when you take them out of one side of the bilinear form whereas they won't be affected in the case of a real field. But I'm not sure, hopefully there is a nicer way to do this. Still though, as it stands I'm not sure how some of these definitions, such as fᵀ(f(v)), are even valid...

    Also, thanks for taking the time with this!

    Chiro: I don't see how to do what you've said for arbitrary linear maps or even whether it can be done, can it? I think you're focusing too much on the multilinear map, which doesn't affect the term det(f). Changing the multilinear map will change the terms ψ(f(u),f(v)) & ψ(u,v) in ψ(f(u),f(v)) = det(f)ψ(u,v) without altering det(f), which is the unique "similitude" determined by the action of ψ on its basis. Thus it won't affect the determinant in the natural case where we interpret ψ(f(e¹),f(e²)) = det(f)ψ(e¹,e²) = det(f)·1 = det(f). Maybe you meant something else or I've missed what you said or maybe this is flawed?

    Anyway, sorry a proper response to follow!
    Last edited: Sep 3, 2012
  7. Sep 3, 2012 #6
    You're fine; you seem to understand exactly how it works. I just didn't want to get too tied up in stuff that works for 2d but not in N-dimensions. You obviously have a good grasp of this stuff, so I'll try to be more precise.

    I think the point of contention is whether the adjoint is really defined for an arbitrary bilinear form. I'm trying to see if an antisymmetric one would yield any obviously undesirable properties or behavior, but I admit that, at the moment, I can't see a big reason why it couldn't be. This does make me wonder why the adjoint is defined the way it usually is in GA. In GA, though, we have the metric and the outer product--basically, two bilinear forms, one symmetric and the other antisymmetric. Together, through the inner and outer products combined, they form the geometric product. Perhaps that allows the symmetric and antisymmetric parts to be married into a single bilinear form...?
  8. Sep 3, 2012 #7
    Ah, I think I see what's going on here. I think when you use an arbitrary bilinear form and decompose it into symmetric and antisymmetric parts, you get two relations between the operator and its adjoint that must both be true.

    In GA language, if you define the adjoint by

    [tex]\underline f(u)v = u \overline f(v)[/tex]

    where this juxtaposition indicates the geometric product (i.e. [itex]uv = u \cdot v + u \wedge v[/itex]), then you get

    [tex]\begin{aligned}
    \underline f(u) \cdot v &= u \cdot \overline f(v) \\
    \underline f[u \wedge \underline f^{-1}(v)] &= \overline f[\overline f^{-1}(u) \wedge v]
    \end{aligned}[/tex]

    These two equations result, and the first is the familiar expression of the adjoint that I know of. The second is unfamiliar to me, and I'm going to check whether it actually ought to hold or not. If it doesn't make sense, then perhaps I've just put forward a bad definition of the adjoint.
  9. Sep 3, 2012 #8
    I checked this out yesterday before posting to find out whether it was actually defined for arbitrary bilinear forms, because I originally came across it defined in terms of just a symmetric bilinear form in Lang's linear algebra book, but since Dieudonne was using it with anti-symmetric bilinear forms I wondered. Sure enough the wiki doesn't restrict it, neither does http://planetmath.org/encyclopedia/Bilinear.html [Broken] (which I think more clearly illustrates the equivalence between both formulations of the adjoint; also I think this link uses better definitions of the adjoint in that it's not defined in terms of linear functionals, which makes me wonder what's wrong with everything I've defined). So it's just a question of how the adjoint affects those terms on the R.H.S. of your formula, whether they split up into something nice or maybe don't split up, in a manner analogous to det(A + B) ≠ det(A) + det(B).
    Last edited by a moderator: May 6, 2017
  10. Sep 8, 2012 #9
    Apologies for the late response. In a different section of the book he offers an alternative proof that det(f) = det(f*).

    Consider the alternating form ψ(x,y) such that ψ(x,y) = <w(x),y> where w* = - w & < , > is an inner product.
    (Note that since < , > is symmetric we see ψ(x,y) = <w(x),y> = <x,w*(y)> = <x,-w(y)> = - <x,w(y)> = - <w(y),x> = - ψ(y,x))
    So considering ψ(f(x),f(y)) = det(f)ψ(x,y) we have:
    ψ(f(x),f(y)) = <w(f(x)),f(y)> = <f*(w(f(x))),y> = det(f)ψ(x,y) = det(f)<w(x),y>
    Thus <f*(w(f(x))),y> = det(f)<w(x),y> = <det(f)w(x),y>.
    Therefore we can write f*wf = det(f)w.
    Taking the determinant of this operator we find:
    det(f*wf) = det(det(f)w) ---> det(f*)det(w)det(f) = det(det(f)·I)det(w) = det(f)²det(I)det(w) ---> det(f*)det(w)det(f) = det(f)²det(w) ---> det(f*) = det(f),
    (Note that det(det(f)·I) became det(f)²det(I) because I is 2x2).
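
    The key identity f*wf = det(f)w can be checked on an example. This sketch assumes < , > is the standard dot product on ℝ², so that f* is the matrix transpose, and picks w as the 90° rotation matrix (which satisfies wᵀ = -w). The matrix f and all helper names are hypothetical.

```python
def mul(a, b):
    """Matrix product a after b for 2x2 matrices (tuples of rows)."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def transpose(m):
    """Matrix transpose; equals the adjoint f* for the standard dot product."""
    return ((m[0][0], m[1][0]), (m[0][1], m[1][1]))

def scale(c, m):
    """Scalar multiple of a 2x2 matrix."""
    return tuple(tuple(c * entry for entry in row) for row in m)

def det2(m):
    """Explicit 2x2 determinant, used only to check the identity."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

w = ((0, -1), (1, 0))     # antisymmetric: transpose(w) == -w
f = ((1, 2), (3, 4))      # arbitrary example endomorphism

lhs = mul(mul(transpose(f), w), f)   # f* w f
rhs = scale(det2(f), w)              # det(f) w
print(lhs, rhs, det2(transpose(f)), det2(f))
```

    Taking determinants of both sides of lhs == rhs and cancelling det(w) and det(f), which requires det(f) ≠ 0, is exactly the step that gives det(f*) = det(f) in the argument above.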

    This is constrained to hold over the real numbers but I think I can see how it can be adapted to hold over C, further it is constrained to work only in 2-d, but at least it's something. Again I just don't see how this approach could possibly apply in dimension greater than 2 without inventing a consistent definition of the adjoint that holds in multilinear forms & I'm hardly ready to do that...

    To quote Chernyshevsky: "What is to be Done?"
  11. Sep 8, 2012 #10


    Science Advisor

    SponsoredWalk: The idea I was trying to get at was to decompose a matrix so that you have at least two matrices: one being a rotation matrix and the other dealing with the scaling and other operations that create the rest of the behaviour encoded in the operator.

    We know that R*R^T = I and R^T = R^-1, so both have determinant 1. You then use the identity det(A)det(B) = det(AB), since A and B are operators of the same rank.

    The idea was to do this decomposition and get a simpler matrix from the non-rotation component and go from there (like for example a diagonal matrix that scales each independent basis vector).

    So you start off by considering how you can find the best orthonormal component, set this part in the rotation component and then look at the other matrix that provides the rest of the transformation information (as you said, a multi-linear map).

    The proofs above are going to be better, but doing decompositions like the above is a good way to do proofs especially if you can harness the right decomposition.

    Having A = RQ and then A^T = Q^TR^T then det(A^T) = det(Q^T)det(R^T) = det(Q^T) so the thing remains to show det(Q^T) = det(Q).
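
    One concrete way to realise such a decomposition is Gram-Schmidt on the columns, sketched below for an invertible 2x2 matrix. This is an assumption about which decomposition is meant: it gives A = RQ with R orthogonal and Q upper triangular, and R can be a reflection (determinant -1) when det(A) < 0. The function names are hypothetical.

```python
import math

def det2(m):
    """Explicit 2x2 determinant."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def transpose(m):
    """Matrix transpose of a 2x2 matrix (tuple of rows)."""
    return ((m[0][0], m[1][0]), (m[0][1], m[1][1]))

def mul(a, b):
    """Matrix product a after b."""
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def orthogonal_triangular(a):
    """Gram-Schmidt on the columns of an invertible 2x2 matrix: a = r q."""
    c1 = (a[0][0], a[1][0])
    c2 = (a[0][1], a[1][1])
    n1 = math.hypot(*c1)
    u1 = (c1[0] / n1, c1[1] / n1)
    proj = u1[0] * c2[0] + u1[1] * c2[1]          # component of c2 along u1
    rest = (c2[0] - proj * u1[0], c2[1] - proj * u1[1])
    n2 = math.hypot(*rest)
    u2 = (rest[0] / n2, rest[1] / n2)
    r = ((u1[0], u2[0]), (u1[1], u2[1]))          # orthonormal columns u1, u2
    q = ((n1, proj), (0.0, n2))                   # upper triangular remainder
    return r, q

a = ((3.0, 1.0), (4.0, 2.0))                      # arbitrary invertible example
r, q = orthogonal_triangular(a)
recombined = mul(r, q)
print(det2(r), det2(q), det2(transpose(q)) * det2(transpose(r)))
```

    For the triangular factor, det(Q^T) = det(Q) is immediate because both are the product of the diagonal entries, which is the remaining step in the argument above.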