Gradient of a function containing a matrix?

Homework Help Overview

The discussion revolves around computing the gradient of a function that involves matrices and vectors, specifically focusing on a function defined in terms of a quadratic term and a logarithmic term involving an exponential function. The participants are exploring the implications of matrix calculus in this context.

Discussion Character

  • Mixed

Approaches and Questions Raised

  • Participants express uncertainty about how to compute partial derivatives when matrices are involved. There are attempts to differentiate the function with respect to individual components of the vector.
  • Some participants question whether the gradient can include a matrix in its final form and discuss the potential cancellation of terms in the differentiation process.
  • There are suggestions to rewrite the function in a more explicit form to clarify the differentiation process.

Discussion Status

The discussion is ongoing, with participants providing formatting assistance and exploring different aspects of the problem. There is no explicit consensus on the correct approach, but various lines of reasoning and attempts at differentiation are being shared.

Contextual Notes

Participants are encouraged to follow specific guidelines regarding problem presentation, and there is mention of the need for clarity in mathematical notation. The original problem context includes a matrix and vector setup that may influence the differentiation process.

countzander

Homework Statement


http://i.imgur.com/TlDOllQ.png

Homework Equations


As stated.

The Attempt at a Solution


I'm not sure how to slay this beast. I know the gradient is just a partial derivative and that the solution likely involves multiple partial derivatives, one for each element in the vector x. But how would the partial derivatives be computed, given the matrices?
 
Just type out the problem here; do not use thumbnails. Read the pinned post 'Guidelines for students and helpers', by Vela, to see why. I cannot read your thumbnail on some media, so I will make no attempt to help.
 
I'll help you out with the formatting, at least. This might help you if you want to post future questions.

Let ##A \in\mathbb R^{m\times n}## and ##b \in\mathbb R^m##. Compute the gradient of $$f:\mathbb R^n\rightarrow\mathbb R,\quad f(x)=\frac{1}{2}x^Tx+log\left(e^TE(Ax+b)\right),$$ where ##e=\left(1,1,1,\ldots,1\right)\in\mathbb R^m## and ##E:\mathbb R^m\rightarrow\mathbb R^m## is a componentwise exponential function, i.e., ##(E(x))_i=exp(x_i)## for ##i=1,2,\ldots, m.##
Use ##diag(v)## for an ##m\times m## diagonal matrix with diagonal elements given by ##v\in\mathbb R^m##.
 
Thanks for the formatting help.

I attempted a solution by differentiating with respect to ##x_n##.

$$\frac{\partial f}{\partial x_n} = 2x_n + \frac{A^T e^T exp(Ax+b)}{e^T exp(Ax+b)}$$

But this isn't correct, I don't think. Shouldn't ##A^T## cancel out somewhere? Can the gradient contain a matrix in the final solution?
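One way to test any candidate formula is to compare it against a finite-difference estimate of the gradient. The sketch below is only an illustration: the function names, the step size `h`, and the small test values for `A`, `b`, and `x` are assumptions made up for this example, not part of the problem statement. It uses plain Python so it runs without extra libraries.

```python
import math

def f(x, A, b):
    """f(x) = 0.5 * x^T x + log(sum_i exp((A x + b)_i)); A is a list of m rows of length n."""
    quad = 0.5 * sum(xj * xj for xj in x)
    z = [sum(a_ij * xj for a_ij, xj in zip(row, x)) + bi for row, bi in zip(A, b)]
    zmax = max(z)  # shift by the maximum so exp() cannot overflow
    return quad + zmax + math.log(sum(math.exp(zi - zmax) for zi in z))

def numerical_gradient(x, A, b, h=1e-6):
    """Central-difference estimate of (df/dx_1, ..., df/dx_n)."""
    grad = []
    for k in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[k] += h
        xm[k] -= h
        grad.append((f(xp, A, b) - f(xm, A, b)) / (2 * h))
    return grad

# Hypothetical test data with m = 3 and n = 2
A = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]
b = [0.1, -0.2, 0.3]
x = [0.5, -1.0]

print(numerical_gradient(x, A, b))  # a list of n numbers, one per component of x
```

The result has one entry per component of ##x##, so the gradient is a vector in ##\mathbb R^n##. Any closed-form candidate should reproduce these numbers to within the finite-difference error.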
 

Just write out the original function ##f = f(x_1,x_2, \ldots, x_n)## in detail so you can see what is happening; then, after finding the derivative you can re-write the answer using matrices again, if you want. So
$$f = \frac{1}{2} \sum_{i=1}^n x_i^2 + \log \left(\sum_{i=1}^m \exp \left(b_i + \sum_{j=1}^n a_{ij} x_j \right) \right)$$
You could also use the fact that ##\exp\left(\sum_j g_j\right) = \prod_j \exp(g_j)##, but I don't know if that makes things better or worse---you would need to try it for yourself. Also, you don't want just ##\partial f / \partial x_n##, you want all the ##\partial f / \partial x_k, k = 1, \dots,n##.
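
As a sketch of where the componentwise approach leads (worth checking by hand), write ##z_i = b_i + \sum_{j=1}^n a_{ij} x_j## for ##i = 1, \ldots, m##, so that ##\partial z_i / \partial x_k = a_{ik}##. The shorthand ##z_i## is introduced here only for convenience. The chain rule then gives
$$\frac{\partial f}{\partial x_k} = x_k + \frac{\sum_{i=1}^m a_{ik} \exp(z_i)}{\sum_{i=1}^m \exp(z_i)}, \qquad k = 1, \ldots, n.$$
The numerator is the ##k##-th component of ##A^T E(Ax+b)## and the denominator is the scalar ##e^T E(Ax+b)##, so stacking the ##n## partial derivatives gives
$$\nabla f(x) = x + \frac{A^T E(Ax+b)}{e^T E(Ax+b)}.$$
On this reading ##A^T## does not cancel: it maps the vector ##E(Ax+b) \in \mathbb R^m## to a vector in ##\mathbb R^n##, which is the size the gradient must have. Note also that the quadratic term contributes ##x_k##, not ##2x_k##, because of the factor ##\frac{1}{2}##.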

BTW: in TeX/LaTeX you should use "\exp" instead of "exp" and "\log" instead of "log"; this applies also to the other standard functions (the trig functions, the hyperbolic functions plus things like "max", "min", "mod", etc.) The results really do look better: you get ##\exp## instead of ##exp##, ##\log## instead of ##log##, etc.
 
