Solving Quadratic Form Derivative: A Frustrating Challenge

Discussion Overview

The discussion revolves around the differentiation of a quadratic form with respect to a vector parameter, specifically focusing on the expression for the cost function J and its derivative with respect to θ. Participants are exploring the application of matrix differentiation rules and the implications of transposes in this context.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant expresses confusion about applying derivative rules to the quadratic form J and seeks clarification on the transition from J to its derivative.
  • Another participant suggests using the product rule for differentiation, indicating that since the terms involved commute, the expected result should follow.
  • A later reply clarifies that z and θ are vectors, not scalars, and presents an expanded form of the derivative that includes additional terms not present in the expected result.
  • There is a discussion about symmetry, with one participant asserting that if Xz^T = X^T z holds, then Xz must be symmetric.
  • Further confusion arises regarding the differentiation with respect to the vector θ and the characterization of the expression as quadratic, prompting questions about the context of the problem.

Areas of Agreement / Disagreement

Participants do not reach a consensus, as there are multiple competing views on the application of differentiation rules and the nature of the matrices involved. The discussion remains unresolved with ongoing questions and challenges to the proposed approaches.

Contextual Notes

Participants highlight potential limitations in understanding the differentiation process, particularly regarding the treatment of vectors and the implications of matrix transposes. There is also uncertainty about the characterization of the expression as quadratic.

Cyrus
I'm not sure how you apply the rules of differentiation to a quadratic form. I've been trying to find the solution on Google but no luck:

Basically:

J=\frac{1}{2} (z-X \theta)^T (z-X \theta)

and

\frac{ \partial J}{\partial \theta}= -X^T z +X^TX\theta

I can't for the life of me figure out how they got from the upper equation to the lower equation. The reason is that the transpose is really screwing things up in terms of the derivatives. There is some rule being applied to matrix differentiation of a transpose of a quadratic form that I am ignorant of, which won't let me get to the same expression on the second line...

Every time I try to expand the top line out I end up with a cross-product term with a factor of 2 that doesn't drop out, but is clearly not shown in the second line.
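
For reference, here is a sketch of the expansion, assuming X is an n×p real matrix, z is an n×1 column vector and θ is a p×1 column vector (the thread never states the dimensions), and writing the gradient as a column vector. Each cross term is a 1×1 scalar and so equals its own transpose; the factor of 2 from merging the cross terms is then cancelled by the 1/2 in front of J.

```latex
\begin{aligned}
J &= \tfrac{1}{2}(z - X\theta)^T (z - X\theta) \\
  &= \tfrac{1}{2}\left( z^T z - z^T X\theta - \theta^T X^T z + \theta^T X^T X \theta \right) \\
  &= \tfrac{1}{2}\left( z^T z - 2\,\theta^T X^T z + \theta^T X^T X \theta \right)
     \quad \text{since } z^T X\theta = (\theta^T X^T z)^T \text{ is a scalar} \\
\frac{\partial J}{\partial \theta}
  &= \tfrac{1}{2}\left( -2\,X^T z + 2\,X^T X \theta \right)
   = -X^T z + X^T X \theta
\end{aligned}
```

The last step uses two standard identities: the gradient of θ^T a with respect to θ is a, and the gradient of θ^T A θ is (A + A^T)θ, which reduces to 2Aθ here because A = X^T X is symmetric regardless of whether X itself is.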
 
Hi Cyrus! :smile:

It's the usual product rule: (fg)' = f'g + fg',

which here is (f^T g)' = f'^T g + f^T g'

Since f = g, that comes out as f'^T f + f^T f'

and since z (I assume that means zI) and θ commute with anything, you should get the given result :wink:
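
One way to make the product rule precise for vectors is to work with differentials. The following is a sketch, writing f = z − Xθ (the symbol f is borrowed from the post above) and assuming z and X do not depend on θ. Nothing needs to commute in this version; the only fact used is that a scalar equals its own transpose.

```latex
\begin{aligned}
f &= z - X\theta, \qquad df = -X\,d\theta \\
dJ &= \tfrac{1}{2}\left( df^T f + f^T df \right) = f^T df
      \quad \text{since } df^T f = (f^T df)^T \text{ is a scalar} \\
   &= -(z - X\theta)^T X \, d\theta \\
\frac{\partial J}{\partial \theta} &= -X^T (z - X\theta) = -X^T z + X^T X \theta
\end{aligned}
```

The gradient is read off from dJ = g^T dθ, so the row vector multiplying dθ has to be transposed to give the column vector g.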
 
tiny-tim said:
Hi Cyrus! :smile:

It's the usual product rule: (fg)' = f'g + fg',

which here is (f^T g)' = f'^T g + f^T g'

Since f = g, that comes out as f'^T f + f^T f'

and since z (I assume that means zI) and θ commute with anything, you should get the given result :wink:

This isn't working. To be clear z and theta are vectors, not scalars.

After expanding I am getting

(-X^Tz+X^TX\theta) -(Xz^T + X\theta^TX^T)


If the two things in brackets could equal twice each term, then the one half would knock out the two and make things right.

Basically, Xz^T = X^T z
 
Got the same thing

Xz^T = X^T z


if this is to be true, then Xz must be symmetric

A symmetric matrix is when it's equal to its transpose

A = A^T
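
A quick dimension check may help here. This is a sketch assuming X is n×p, z is n×1 and θ is p×1, which the thread does not state explicitly. Under that assumption Xz^T is not even a defined product unless p = 1, which suggests the second bracket in the expansion above comes from a misplaced transpose rather than from any symmetry property.

```latex
\begin{aligned}
X^T z &: (p \times n)(n \times 1) = p \times 1 \\
z^T X &: (1 \times n)(n \times p) = 1 \times p, \qquad z^T X = (X^T z)^T \\
X z^T &: (n \times p)(1 \times n) \quad \text{undefined unless } p = 1
\end{aligned}
```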
 
Cyrus said:
This isn't working. To be clear z and theta are vectors, not scalars.

After expanding I am getting

(-X^Tz+X^TX\theta) -(Xz^T + X\theta^TX^T)

If the two things in brackets could equal twice each term, then the one half would knock out the two and make things right.

Basically, Xz^T = X^T z

No, the T must always come first: X^T z = (z^T X)^T.

I'm honestly not following this …

if θ is a vector, how can you differentiate with respect to it?

and you originally called it a quadratic … what's quadratic about it unless each bracket is a vector? :confused:

What is the context of this?
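
On differentiating with respect to a vector: the usual reading is that ∂J/∂θ is the gradient, that is, the vector whose j-th entry is the partial derivative of the scalar J with respect to the j-th component of θ. The following is a minimal numerical sketch of that reading, in plain Python with no external libraries. The values of X, z and theta are made up for illustration and do not come from the thread; X is deliberately non-square so that no symmetry of X can be involved.

```python
# Finite-difference check of dJ/dtheta = -X^T z + X^T X theta.
# X, z and theta are hypothetical illustrative values, not from the thread.

def residual(X, z, theta):
    # r = z - X theta, computed one row of X at a time
    return [zi - sum(xij * tj for xij, tj in zip(row, theta))
            for row, zi in zip(X, z)]

def cost(X, z, theta):
    # J = 1/2 (z - X theta)^T (z - X theta)
    r = residual(X, z, theta)
    return 0.5 * sum(ri * ri for ri in r)

def grad_formula(X, z, theta):
    # -X^T z + X^T X theta, evaluated in the equivalent form -X^T (z - X theta)
    r = residual(X, z, theta)
    return [-sum(row[j] * ri for row, ri in zip(X, r))
            for j in range(len(theta))]

def grad_numeric(X, z, theta, h=1e-6):
    # Central differences: one partial derivative per component of theta
    g = []
    for j in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[j] += h
        dn[j] -= h
        g.append((cost(X, z, up) - cost(X, z, dn)) / (2 * h))
    return g

X = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0]]  # 3x2, not square
z = [1.0, -2.0, 0.5]
theta = [0.3, -0.7]

analytic = grad_formula(X, z, theta)
numeric = grad_numeric(X, z, theta)
max_diff = max(abs(a - b) for a, b in zip(analytic, numeric))
print("analytic:", analytic)
print("numeric: ", numeric)
print("max abs difference:", max_diff)
```

If the formula is right, the two gradients should agree to within floating-point rounding, since central differences are exact for a quadratic apart from round-off.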
 
