Gradient of a function containing a matrix?

countzander · Sep 28, 2014

Homework Statement

http://i.imgur.com/TlDOllQ.png

Homework Equations

As stated.

The Attempt at a Solution

I'm not sure how to slay this beast. I know the gradient is just a partial derivative and that the solution likely involves multiple partial derivatives, one for each element in the vector x. But how would the partial derivatives be computed, given the matrices?

Ray Vickson · Sep 29, 2014

Just type out the problem here; do not use thumbnails. Read the pinned post 'Guidelines for students and helpers', by Vela, to see why. I cannot read your thumbnail on some media, so I will make no attempt to help.

TheFerruccio · Sep 29, 2014

I'll help you out with the formatting, at least. This might help you if you want to post future questions.

Let ##A \in\mathbb R^{m\times n}## and ##b \in\mathbb R^m##. Compute the gradient of $$f:\mathbb R^n\rightarrow\mathbb R,\quad f(x)=\frac{1}{2}x^Tx+log\left(e^TE(Ax+b)\right),$$ where ## e=\left(1,1,1,\ldots,1\right)\in\mathbb R^m ## and ##E:\mathbb R^m\rightarrow\mathbb R^m## is a componentwise exponential function, i.e., ##(E(x))_i=exp(x_i)## for ##i=1,2,\ldots, m.##
Use ##diag(v)## for an ##m\times m## diagonal matrix with diagonal elements given by ##v\in\mathbb R^m##.
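
To make the definition concrete, here is a minimal sketch of how ##f## could be evaluated numerically. It assumes NumPy; the function name `f` and the small test matrices are made up for illustration and are not part of the problem. The only fact used is that ##e^TE(z)## is the sum of the ##\exp(z_i)##.

```python
import numpy as np

def f(x, A, b):
    """Evaluate f(x) = 0.5 * x^T x + log(e^T E(A x + b))."""
    z = A @ x + b  # vector in R^m
    # e^T E(z) is just the sum of exp(z_i) over the m components
    return 0.5 * x @ x + np.log(np.sum(np.exp(z)))

# Hypothetical small example with m = 3, n = 2
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.zeros(3)
x = np.zeros(2)
print(f(x, A, b))  # equals log(3) here, since every exp(z_i) is 1 at x = 0
```

Note that ##f## returns a single number, so its gradient has one entry per component of ##x##.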

countzander · Sep 30, 2014

Thanks for the formatting help.

I attempted a solution by differentiating with respect to ##x_n##.

$$\frac{\partial f}{\partial x_n} = 2x_n + \frac{A^T e^T exp(Ax+b)}{e^T exp(Ax+b)}$$

But this isn't correct, I don't think. Shouldn't ##A^T## cancel out somewhere? Can the gradient contain a matrix in the final solution?
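
One way to test a formula like this is to compare it with a finite-difference approximation. The sketch below assumes NumPy and random test data; `numerical_gradient` and `candidate_gradient` are hypothetical helper names. The candidate is not the attempt above but an assumed alternative to be tested: its first term is ##x## (the derivative of ##\frac{1}{2}x_k^2## is ##x_k##, not ##2x_k##), and ##A^T## multiplies the vector ##E(Ax+b)## directly, so the result is a vector in ##\mathbb R^n##.

```python
import numpy as np

def f(x, A, b):
    return 0.5 * x @ x + np.log(np.sum(np.exp(A @ x + b)))

def numerical_gradient(func, x, h=1e-6):
    """Central-difference approximation of the gradient of func at x."""
    grad = np.zeros_like(x)
    for k in range(x.size):
        step = np.zeros_like(x)
        step[k] = h  # perturb only the k-th coordinate
        grad[k] = (func(x + step) - func(x - step)) / (2.0 * h)
    return grad

def candidate_gradient(x, A, b):
    """Candidate closed form (an assumption to be tested, not a given result)."""
    w = np.exp(A @ x + b)          # E(Ax + b), a vector in R^m
    return x + A.T @ w / np.sum(w)  # a vector in R^n

rng = np.random.default_rng(0)
m, n = 4, 3
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x = rng.standard_normal(n)

g_num = numerical_gradient(lambda v: f(v, A, b), x)
g_cand = candidate_gradient(x, A, b)
print(g_num.shape)                      # one entry per component of x
print(np.max(np.abs(g_num - g_cand)))   # small if the candidate is right
```

The matrix ##A^T## does appear in the candidate, but only multiplied into a vector, so the gradient itself is still a vector of length ##n##.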

Ray Vickson · Oct 1, 2014

Just write out the original function ##f = f(x_1,x_2, \ldots, x_n)## in detail so you can see what is happening; then, after finding the derivative, you can rewrite the answer using matrices again, if you want. So
$$f = \frac{1}{2} \sum_{j=1}^n x_j^2 + \log \left(\sum_{i=1}^m \exp \left(b_i + \sum_{j=1}^n a_{ij} x_j \right) \right)$$
Note that ##\log\left(\sum_i \exp(g_i)\right)## does not simplify the way ##\log\left(\prod_i \exp(g_i)\right) = \sum_i g_i## does, so differentiate the logarithm with the chain rule. Also, you don't want just ##\partial f / \partial x_n##; you want all the ##\partial f / \partial x_k, k = 1, \dots,n##.
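
As a sketch of where the componentwise approach leads (the shorthand ##z_i## and ##S## is introduced here only for convenience, with ##i## running over the ##m## rows of ##A##): write ##z_i = b_i + \sum_{j=1}^n a_{ij} x_j## and ##S = \sum_{i=1}^m \exp(z_i)##. Since ##\partial z_i / \partial x_k = a_{ik}##, the chain rule gives
$$\frac{\partial f}{\partial x_k} = x_k + \frac{1}{S}\sum_{i=1}^m a_{ik}\exp(z_i), \qquad k = 1,\dots,n.$$
The sum ##\sum_i a_{ik}\exp(z_i)## is the ##k##-th component of ##A^T E(Ax+b)##, and ##S = e^T E(Ax+b)##, so stacking the ##n## partial derivatives gives
$$\nabla f(x) = x + \frac{A^T E(Ax+b)}{e^T E(Ax+b)} = x + \frac{A^T \,\mathrm{diag}\left(E(Ax+b)\right) e}{e^T E(Ax+b)},$$
where the second form uses ##E(Ax+b) = \mathrm{diag}\left(E(Ax+b)\right)e##.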

BTW: in TeX/LaTeX you should use "\exp" instead of "exp" and "\log" instead of "log"; this applies also to the other standard functions (the trig functions, the hyperbolic functions plus things like "max", "min", "mod", etc.) The results really do look better: you get ##\exp## instead of ##exp##, ##\log## instead of ##log##, etc.
