Calculating Hessian of f(x)^TQy: What Can We Conclude?

  • Thread starter: perplexabot
  • Tags: Hessian
In summary: The thread asks whether a function f must be linear if the Hessian of f(x)^TQy with respect to x vanishes. The single reply assumes the equation holds for all Q and y, derives a zero-row-sum condition on the Hessian of each component of f, and concludes that f need not be linear. The question comes from a paper, not linked in the thread, that involves the Hessian of the log of a multivariate normal distribution.
  • #1
perplexabot
Hey all. Let me just get right to it! Assume you have a function [itex]f:\mathbb{R}^n\rightarrow\mathbb{R}^m[/itex] and we know nothing else except the following equation:
[itex]\nabla_x\nabla_x^Tf(x)^TQy=0[/itex]
where [itex]\nabla_x[/itex] is the gradient with respect to the vector [itex]x[/itex] (the outer product of two gradient operators is the Hessian operator). Also, let the dimensions of [itex]Q[/itex] and [itex]y[/itex] conform.

Using the information provided above, what can you conclude about [itex]f(x)[/itex] (if anything)? Can you infer that [itex]f(x)[/itex] is linear?

Thank you : )
 
  • #2
andrewkirk
The role of ##Q## and ##y## in this equation is unclear. The most natural interpretation, which I will adopt, pending clarification, is that the equation is implicitly prefixed by ##\forall y## and ##\forall Q##. If so then the equation is equivalent to the simpler equation (writing ##u## for ##Qy##):

$$\forall u:\ H \langle f(x),u\rangle=0$$
where ##H## denotes the Hessian operator.

This in turn can be written:
$$\forall u\,\forall i:\ \sum_j\sum_k u_k\frac{\partial^2}{\partial x_i\partial x_j}f_k(x)=0$$

By letting ##u## be each of the basis vectors in turn, we can get:
$$\forall i\,\forall k:\ \sum_j\frac{\partial^2}{\partial x_i\partial x_j}f_k(x)=0$$

Note that there are ##m## separate Hessian matrices involved here, indexed by ##k## in this formula. The formula tells us that, in each such matrix, all row sums are zero. I think that will make each Hessian singular (the all-ones vector would lie in its null space), but they need not be zero. For instance we could have a Hessian ##\begin{pmatrix}1&-1\\-1&1\end{pmatrix}##.
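As a quick numerical check of that example, here is a minimal sketch in plain Python. The function `g(x) = (x1 - x2)^2 / 2` is a hypothetical choice, not taken from the thread, picked because its Hessian is exactly the matrix above; the finite-difference helper `hessian` is likewise only illustrative.

```python
def hessian(f, x, h=1e-3):
    """Approximate the Hessian of a scalar function f at x by central differences."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                # Move x by si*h along axis i and sj*h along axis j.
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return H


def g(x):
    # Hypothetical example function (not from the thread): g(x) = (x1 - x2)^2 / 2.
    return 0.5 * (x[0] - x[1]) ** 2


H = hessian(g, [0.3, -1.2])
row_sums = [sum(row) for row in H]
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]

print(H)         # close to [[1, -1], [-1, 1]]
print(row_sums)  # both close to 0
print(det)       # close to 0: singular, yet not the zero matrix
```

The evaluation point `[0.3, -1.2]` is arbitrary; for a quadratic such as `g` the Hessian is the same everywhere.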

So there can still be curvature (i.e., ##f## is not necessarily linear), but there would be some sort of constraining relationship within that curvature.
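
For comparison, a sketch of the entrywise reading, which is an assumption about the intended meaning and not something the thread settles: if ##H\langle f(x),u\rangle=0## is taken to mean that every entry of the Hessian matrix vanishes, rather than every row sum, the condition becomes

$$\forall u\,\forall i\,\forall j:\ \sum_k u_k\frac{\partial^2 f_k}{\partial x_i\partial x_j}(x)=0 \quad\Longrightarrow\quad \forall i\,\forall j\,\forall k:\ \frac{\partial^2 f_k}{\partial x_i\partial x_j}(x)=0$$

again by taking ##u## to be each basis vector in turn. If this holds at every ##x\in\mathbb{R}^n## and ##f## is twice differentiable, each ##f_k## has a constant gradient, so ##f(x)=Ax+b## for some constant matrix ##A## and vector ##b##: affine, and linear exactly when ##b=0##. Under the row-sum form above, this conclusion does not follow.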
 
  • #3
perplexabot
andrewkirk said:
The role of ##Q## and ##y## in this equation is unclear. [...] So there can still be curvature (i.e., ##f## is not necessarily linear), but there would be some sort of constraining relationship within that curvature.
Hmmm. I find your post interesting. I do not understand how you arrived at your equations. Maybe my question was badly worded, or maybe I have truncated too much information from the question. Your final answer, f not necessarily being linear, is what I also think. Would it be wise to link or post the paper from which my question stems? It has something to do with taking the Hessian of the log of a multivariate normal distribution.

Thank you for your help : )
 

1. What is the purpose of calculating the Hessian of f(x)^TQy?

The Hessian of the scalar function f(x)^TQy collects its second-order partial derivatives with respect to x, and so describes the function's local curvature. That information is used in optimization and in determining the nature of critical points.

2. How do you calculate the Hessian of f(x)^TQy?

Treat Q and y as constants, take the second partial derivative of f(x)^TQy with respect to each pair of variables x_i and x_j, and place the result in row i, column j of an n × n matrix. This matrix is the Hessian matrix.
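
Written out for this particular function, as a sketch that writes ##u=Qy## as in post #2 and assumes ##Q## and ##y## do not depend on ##x##:

$$\left[\nabla_x\nabla_x^T\big(f(x)^TQy\big)\right]_{ij}=\sum_{k=1}^m u_k\,\frac{\partial^2 f_k}{\partial x_i\partial x_j}(x),\qquad u=Qy\in\mathbb{R}^m$$

so the result is an ##n\times n## matrix: the weighted sum of the ##m## component Hessians, with weights ##u_k##.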

3. What does the Hessian matrix tell us about the function f(x)^TQy?

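The Hessian matrix describes the local second-order behavior of f(x)^TQy. If it is positive semidefinite everywhere, the function is convex; if it is negative semidefinite everywhere, the function is concave. At a critical point, a positive definite Hessian indicates a local minimum, a negative definite one a local maximum, and an indefinite one a saddle point. A singular, semidefinite Hessian leaves the test inconclusive.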
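
The second-order test for a critical point can be sketched for the two-variable case. This is a minimal illustration in plain Python; the helper name `classify_2x2` and its tolerance are assumptions made for the example, and the input is assumed to be a symmetric 2 × 2 Hessian evaluated at a point where the gradient is zero.

```python
def classify_2x2(H, tol=1e-12):
    """Classify a critical point from a symmetric 2x2 Hessian H (list of rows)."""
    a, b, d = H[0][0], H[0][1], H[1][1]
    det = a * d - b * b
    if det > tol:
        # Both eigenvalues share the sign of a.
        return "local minimum" if a > 0 else "local maximum"
    if det < -tol:
        # Eigenvalues of opposite sign.
        return "saddle point"
    # Singular Hessian: the second-order test gives no verdict.
    return "inconclusive"


print(classify_2x2([[2, 0], [0, 3]]))
print(classify_2x2([[1, 0], [0, -1]]))
print(classify_2x2([[1, -1], [-1, 1]]))
```

The last call uses the singular matrix from post #2, for which the determinant is zero and the test cannot decide.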

4. How can the Hessian of f(x)^TQy be used to optimize the function?

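The Hessian matrix is used by second-order methods such as Newton's method, where the step p solves H p = g for the gradient g, so curvature sets both the direction and the length of the step. For plain gradient descent, the largest eigenvalue of the Hessian limits the usable step size. At a candidate solution, a zero gradient together with a positive definite Hessian confirms a local minimum.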
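
To illustrate how curvature can set both step direction and step length, here is a minimal sketch of a single Newton step for a function of two variables. Note that this is Newton's method, not gradient descent. The quadratic `q(x) = x1^2 + x1*x2 + 3*x2^2` and the helper `newton_step` are hypothetical, chosen only for illustration.

```python
def newton_step(grad, hess, x):
    """Return x - H^{-1} g for a function of two variables."""
    g = grad(x)
    (a, b), (c, d) = hess(x)
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("singular Hessian: Newton step undefined")
    # Solve H p = g by Cramer's rule.
    p0 = (g[0] * d - b * g[1]) / det
    p1 = (a * g[1] - c * g[0]) / det
    return [x[0] - p0, x[1] - p1]


def q_grad(x):
    # Gradient of the hypothetical quadratic q(x) = x1^2 + x1*x2 + 3*x2^2.
    return [2 * x[0] + x[1], x[0] + 6 * x[1]]


def q_hess(x):
    # Hessian of q; constant because q is quadratic.
    return [[2.0, 1.0], [1.0, 6.0]]


x_new = newton_step(q_grad, q_hess, [1.0, -2.0])
print(x_new)  # the minimizer of q, the origin, reached in one step
```

Because `q` is quadratic with a positive definite Hessian, one step lands on the minimizer; for a general function the step would be repeated, and a singular Hessian such as the one in post #2 would make the step undefined.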

5. Can the Hessian of f(x)^TQy be used for functions with more than two variables?

Yes, the Hessian matrix can be calculated for functions with any number of variables. For a function of n variables it is an n × n matrix, and it is symmetric when the second partial derivatives are continuous.
