Matrix derivative of quadratic form?

  1. Dec 11, 2014 #1


    Gold Member

    1. The problem statement, all variables and given/known data
    Find the derivative of f(X).
    f(X) = transpose(a) * X * b

    X is nxn
    a and b are n x 1
    ai is the i'th element of a
    Xnm is the element in row n and column m
    let transpose(a) = aT
    let transpose(b) = bT

    2. Relevant equations
    I tried using the product rule, which I assume is wrong.
    I know the answer to be a*bT (but I have not the slightest clue how)

    3. The attempt at a solution
    I tried many things, to the point where punching a hole through my screen doesn't really seem like a bad idea anymore.

    My last attempt was to use the product rule along with some matrix properties, here is what I did:
    d(f)/dX = [d(aT*X)/dX]*b + (aT*X)*[d(b)/dX] = [d(aT*X)/dX]*b = (d/dX)[Σai*X1i Σai*X2i ⋅ ⋅ ⋅ Σai*Xni]*b

    I have no idea what to do next. I have a feeling the product rule doesn't apply to matrices.

    Thanks for reading...
  3. Dec 12, 2014 #2

    Stephen Tashi

    Science Advisor

    As an example take [itex] n = 2 [/itex]

    [itex] a = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} [/itex]

    [itex] b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} [/itex]

    [itex] X = \begin{pmatrix} x_{1\ 1} & x_{1\ 2} \\ x_{2\ 1} & x_{2\ 2} \end{pmatrix} [/itex]

    Then [itex] f(X) = a^T X b [/itex] is a single number. ( We could say it is a 1x1 matrix.)

    Then the answer would be [itex] \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \begin{pmatrix} b_1 & b_2 \end{pmatrix} [/itex] but what kind of multiplication does that represent? It can be worked as ordinary matrix multiplication to produce a 2x2 matrix.

    [itex] ab^T = \begin{pmatrix} a_1b_1 & a_1b_2 \\ a_2b_1 & a_2 b_2 \end{pmatrix} [/itex]

    I don't know the details of your class materials, so I must guess about how "the derivative" of f(X) is defined.

    One guess is that the derivative of [itex] f [/itex] with respect to [itex] X [/itex] is:

    [itex] \begin{pmatrix} \frac{\partial f}{\partial x_{1\ 1}} &\frac{\partial f}{\partial x_{1\ 2}} \\ \frac{\partial f}{\partial x_{2\ 1}} & \frac{\partial f}{\partial x_{2\ 2}} \end{pmatrix}[/itex]

    Is that the definition you use?
    Last edited: Dec 12, 2014
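To make the guess above concrete, here is a sketch of the ##n = 2## case written out term by term. It assumes the layout convention proposed in the post, where entry ##(i, j)## of the derivative is the partial of ##f## with respect to ##x_{ij}##; other texts use the transposed layout.

```latex
\[
f(X) = a^T X b
     = a_1 b_1 x_{11} + a_1 b_2 x_{12} + a_2 b_1 x_{21} + a_2 b_2 x_{22}
\]
\[
\begin{pmatrix}
\frac{\partial f}{\partial x_{11}} & \frac{\partial f}{\partial x_{12}} \\
\frac{\partial f}{\partial x_{21}} & \frac{\partial f}{\partial x_{22}}
\end{pmatrix}
=
\begin{pmatrix}
a_1 b_1 & a_1 b_2 \\
a_2 b_1 & a_2 b_2
\end{pmatrix}
= a b^T
\]
```

Each ##x_{ij}## appears in exactly one term, so every partial derivative is just that term's constant coefficient.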
  4. Dec 12, 2014 #3


    Homework Helper

    Looking at the derivative with respect to the first term (1,1), you could use the limit definition to see what happens in the matrix multiplication.
    ## \lim_{h\to 0} \frac{f(X+\begin{pmatrix} h & 0 \\ 0 & 0 \end{pmatrix})-f(X)}{h} = ? ##
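The limit above can also be checked numerically. The following is a minimal sketch, not part of the original thread: it perturbs one entry of ##X## at a time by a small `h` and compares the resulting difference quotients with the outer product ##ab^T##. The helper names `f` and `finite_difference_gradient`, the step size, and the random test data are all assumptions made for illustration.

```python
import numpy as np


def f(a, X, b):
    """Return the scalar a^T X b."""
    return float(a @ X @ b)


def finite_difference_gradient(a, X, b, h=1e-6):
    """Approximate the matrix of partials df/dx_ij with forward differences."""
    rows, cols = X.shape
    grad = np.zeros((rows, cols))
    base = f(a, X, b)
    for i in range(rows):
        for j in range(cols):
            # Perturbation matrix: h in position (i, j), zeros elsewhere.
            E = np.zeros((rows, cols))
            E[i, j] = h
            grad[i, j] = (f(a, X + E, b) - base) / h
    return grad


# Hypothetical test data: any a, b, X of matching sizes would do.
rng = np.random.default_rng(0)
n = 3
a = rng.standard_normal(n)
b = rng.standard_normal(n)
X = rng.standard_normal((n, n))

numeric = finite_difference_gradient(a, X, b)
analytic = np.outer(a, b)  # the claimed answer, a b^T

print(np.allclose(numeric, analytic, atol=1e-6))
```

Because ##f## is linear in each entry of ##X##, the difference quotient does not depend on `h` apart from floating-point rounding, which is why a plain forward difference is enough here.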
  5. Dec 12, 2014 #4


    Homework Helper

    And to take a stab at why the product rule isn't working the way you had it above...
    You are treating b like a constant, where really you have a composition of functions of X: ##g(X) = Xb## and ##h(y) = a^T y##, so ##f(X) = h(g(X))##. You should use the chain rule instead of the product rule.
  6. Dec 12, 2014 #5

    Ray Vickson

    Science Advisor
    Homework Helper

    He should not use any of those things; it is just a straightforward matter, like saying ##(d/dx) (cx) = c## for constant ##c##. In fact,
    [tex] f(X) = \sum_{i=1}^n \sum_{j=1}^n a_i x_{ij} b_j = \sum_{i,j=1}^n c_{ij} x_{ij}, \;\; c_{ij} = a_i b_j [/tex]
    Last edited: Dec 12, 2014
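Carrying the expansion above one step further, a sketch of the element-wise derivative for general ##n## follows. It assumes the derivative with respect to ##X## is defined as the matrix whose ##(k, l)## entry is the partial of ##f## with respect to ##x_{kl}##; the Kronecker delta ##\delta## is notation introduced here, not used elsewhere in the thread.

```latex
\[
\frac{\partial f}{\partial x_{kl}}
  = \sum_{i,j=1}^n c_{ij} \frac{\partial x_{ij}}{\partial x_{kl}}
  = \sum_{i,j=1}^n c_{ij}\, \delta_{ik}\, \delta_{jl}
  = c_{kl}
  = a_k b_l
\]
\[
\left( \frac{\partial f}{\partial X} \right)_{kl} = a_k b_l
\quad\Longrightarrow\quad
\frac{\partial f}{\partial X} = a b^T
\]
```

This is the matrix analogue of ##(d/dx)(cx) = c##: each entry ##x_{kl}## has the constant coefficient ##c_{kl}##, and collecting those coefficients into an ##n \times n## array gives the outer product.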
  7. Dec 12, 2014 #6


    Gold Member

    Yes! However I would like to solve it assuming I don't know what the answer is to be.
    I know you are sort of using the definition of a derivative but I don't get why you have a matrix with h in the top left corner.
    I have a couple questions about what you wrote, if I may.

    You wrote ##(d/dx) (cx) = x## for constant ##c##; should this not be ##(d/dx) (cx) = c## for constant ##c##?
    For your equation of f(X): [tex] f(X) = \sum_{i=1}^n \sum_{j=1}^n a_i x_{ij} b_j = \sum_{i,j=1}^n c_{ij} x_{ij}, \;\; c_{ij} = a_i b_j [/tex]
    shouldn't the subscripts of x be reversed (ji instead of ij)?
    Also how did the x go away : ( ??

    Thank you so much!
  8. Dec 12, 2014 #7

    Ray Vickson

    Science Advisor
    Homework Helper

    Yes, it should have been ##(d/dx) (cx) = c##; I have edited out the error.

    I don't understand the second question: reverse i and j where? What I wrote was ##a^T X b## in expanded form. And, I don't see why you ask why/how the ##x## went away; it didn't---it is still there. Perhaps you wonder where the ##x## went at the end of the displayed equation? Well, when I said ##c_{ij} = a_i b_j##, that was just the definition of ##c_{ij}##. In other words, I wrote the sum with a ##c_{ij}## in it, so I have to define ##c_{ij}## somewhere. Perhaps I should have said " ... where ##c_{ij} = a_i b_j##".
  9. Dec 12, 2014 #8


    Gold Member

    Sorry! My last question was wrong. I read your equation as f(X) = aibj, so it is my fault.
    Ok. I think I understand your equation then.

    But what next? Product rule and chain rule? Or do I simply take the derivative of ##c_{ij}x_{ij}## with respect to ##x_{ij}##? If I do the latter, I just get the sum of the ##c_{ij}## terms.
    EDIT: Actually I am wrong once again! You don't get the sum of ##c_{ij}##. You get a column vector with each row being a derivative of ##c_{ij}x_{ij}## with respect to an ##x_{ij}##, right?

    Thank you for your patience : )
  10. Dec 13, 2014 #9


    Gold Member

    I was finally able to do this. I was trying to solve it without considering the elements of the matrix, which I think is not possible. Here is my solution, for anyone who may be interested in the future. Thanks to everyone for the help.
