What is the Process and Meaning of Taking the Derivative of a Vector?

  • Context: Graduate 
  • Thread starter: hkBattousai
  • Tags: Derivative, Vectors
SUMMARY

The discussion covers how to differentiate scalar-valued functions of a vector, specifically \(x^TA^TAx - 2x^TA^Tb\) and \(x^Tx\). The textbook quoted in the thread gives the derivatives as \(2A^TAx - 2A^Tb\) and \(2x^T\) respectively; the first is written as a column vector and the second as a row vector (as a gradient, the second would read \(2x\)). The reply explains the derivative as the gradient, i.e. the vector of partial derivatives, of a scalar-valued function, and uses the symmetric matrix \(B = A^TA\), for which \(f(x) = x^TBx\) has \(\frac{df}{dx} = 2Bx\).

PREREQUISITES
  • Understanding of matrix calculus, particularly derivatives of linear functions.
  • Familiarity with vector and matrix notation, including transpose operations.
  • Knowledge of gradients and their significance in optimization.
  • Basic concepts of symmetric matrices and their properties.
NEXT STEPS
  • Study the properties of symmetric matrices and their derivatives in matrix calculus.
  • Learn about the application of gradients in optimization problems.
  • Explore advanced topics in matrix calculus, such as the chain rule for vector functions.
  • Review examples of vector derivatives in machine learning contexts, particularly in loss function optimization.
USEFUL FOR

Mathematicians, data scientists, and engineers who are involved in optimization, machine learning, or any field requiring a solid understanding of vector calculus and matrix operations.

hkBattousai
Here is a snapshot from one of my textbooks:
http://img64.imageshack.us/img64/8114/vector0.png

How do we take the derivative below?
\frac{d}{dx}\left( x^TA^TAx - 2x^TA^Tb \right) = 2A^TAx - 2A^Tb

There is also another vector derivative in the book as follows:
\frac{d}{dx}\left( x^Tx \right) = 2x^T

How do we take these types of derivatives?
What is the meaning of taking the derivative of a vector, or of the transpose of a vector?

_____________
EDIT: I found http://en.wikipedia.org/wiki/Matrix_calculus#Derivative_of_linear_functions, but it doesn't explain the main idea behind vector differentiation either.

\frac{\partial \; \textbf{a}^T\textbf{x}}{\partial \; \textbf{x}} = \frac{\partial \; \textbf{x}^T\textbf{a}}{\partial \; \textbf{x}} = \textbf{a}

\frac{\partial \; \textbf{A}\textbf{x}}{\partial \; \textbf{x}} = \frac{\partial \; \textbf{x}^T\textbf{A}}{\partial \; \textbf{x}^T} = \textbf{A}
 
For a symmetric matrix B (in your case, B = A^T A), the following is a scalar-valued function from R^n to R:

f(x) = x^T B x

The derivative you are looking for is defined as the vector of partial derivatives (aka gradient):

\frac{df}{dx} = \left(\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right)^T
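
A minimal numerical sketch of this definition in plain Python, assuming an illustrative 2×2 symmetric matrix and test point (both made up here, not taken from the thread). It estimates each partial derivative with a central difference, stacks them into a vector, and compares that vector with the closed form 2Bx quoted in this thread for symmetric B. The helper names `quadratic_form` and `numerical_gradient` are hypothetical.

```python
def quadratic_form(x, B):
    """Return f(x) = x^T B x for a square matrix B given as a list of rows."""
    n = len(x)
    return sum(x[i] * B[i][j] * x[j] for i in range(n) for j in range(n))


def numerical_gradient(func, x, h=1e-6):
    """Estimate (df/dx_1, ..., df/dx_n) one component at a time."""
    grad = []
    for k in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[k] += h   # nudge only the k-th coordinate up
        x_minus[k] -= h  # and down
        grad.append((func(x_plus) - func(x_minus)) / (2 * h))
    return grad


# Assumed example values: any symmetric B and any x would do.
B = [[2.0, 1.0],
     [1.0, 3.0]]
x = [1.0, -2.0]

grad = numerical_gradient(lambda v: quadratic_form(v, B), x)

# Closed form for symmetric B: the k-th entry of 2 B x.
closed_form = [2 * sum(B[k][j] * x[j] for j in range(len(x)))
               for k in range(len(x))]

print("finite differences:", grad)
print("2 B x:             ", closed_form)
```

The two printed vectors should agree up to floating-point rounding. This is only a sanity check on one example, not a proof.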

If you express f in terms of the components of B,

f(x) = b_{11} x_1^2 + 2 b_{12} x_1 x_2 + ...

you will find that the partial derivatives just "come out right", i.e.

\frac{df}{dx} = 2 B x
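
To spell out the "come out right" step, here is a sketch of the component calculation. The summation index k and the notation (Bx)_k for the k-th entry of Bx are introduced here for illustration. The last line applies the identity for a^T x quoted in the first post, with a = A^T b, and uses the fact that A^T A is symmetric.

```latex
% Write the quadratic form in components (B symmetric, so b_{ij} = b_{ji}):
f(x) = \sum_{i=1}^{n} \sum_{j=1}^{n} b_{ij} x_i x_j

% Differentiate with respect to x_k; x_k appears once as x_i and once as x_j:
\frac{\partial f}{\partial x_k}
  = \sum_{j=1}^{n} b_{kj} x_j + \sum_{i=1}^{n} b_{ik} x_i
  = (Bx)_k + (B^T x)_k
  = 2 (Bx)_k

% Stack the components with B = A^T A, and add the linear term
% using \partial (a^T x) / \partial x = a with a = A^T b:
\frac{d}{dx}\left( x^T A^T A x - 2 x^T A^T b \right) = 2 A^T A x - 2 A^T b
```

Taking B to be the identity matrix in the same calculation gives 2x for x^T x; the 2x^T quoted from the textbook is that result written as a row vector.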
 
