Is the Chain Rule Applicable to the Euclidean Norm in Calculating Derivatives?

  • Context: Graduate 
  • Thread starter: SchroedingersLion
  • Tags: Euclidean Norm
SUMMARY

The discussion centers on applying the chain rule to the squared Euclidean norm when differentiating a least-squares loss built from 3D vectors and a 3x3 matrix. The original poster asks whether two expressions are correct. The first is the gradient of \(\sum_k \|\mathbf{Wx}_k+\mathbf{b}-\mathbf{y}_k\|^2\) with respect to the bias vector \(\mathbf{b}\), given as \(\sum_k 2(\mathbf{Wx}_k+\mathbf{b}-\mathbf{y}_k)\). The second is the derivative with respect to the matrix element \(w_{i,j}\), written as the scalar product of \(2(\mathbf{Wx}_k+\mathbf{b}-\mathbf{y}_k)\) with a column vector whose only nonzero entry is \(x_{k,j}\) in row \(i\), summed over \(k\). The operator \(\frac{\partial}{\partial \mathbf{b}}\) is clarified as shorthand for the gradient operator \(\nabla_{\mathbf{b}}\).

PREREQUISITES
  • Understanding of vector calculus and derivatives
  • Familiarity with matrix operations and notation
  • Knowledge of the Euclidean norm and its properties
  • Basic concepts of optimization in machine learning contexts
NEXT STEPS
  • Study the Chain Rule in the context of multivariable calculus
  • Learn about gradient descent optimization techniques
  • Explore the properties of the Euclidean norm in higher dimensions
  • Investigate the role of bias terms in machine learning models
USEFUL FOR

Mathematicians, data scientists, machine learning practitioners, and anyone involved in optimization problems requiring derivative calculations in multi-dimensional spaces.

SchroedingersLion
TL;DR: Need some verification or corrections.
Greetings,

suppose we have 3D vectors ##\mathbf{x}_k, \mathbf{y}_k## for ##k=1,\dots,N##, a 3D vector ##\mathbf{b}##, and a ##3\times 3## matrix ##\mathbf{W}## with real elements ##w_{i,j}##.

Are the following two results correct?
$$
\frac{\partial}{\partial \mathbf{b}} \sum_k \|\mathbf{Wx}_k+\mathbf{b}-\mathbf{y}_k\|^2 = \sum_k 2(\mathbf{Wx}_k+\mathbf{b}-\mathbf{y}_k)
$$
$$
\frac{\partial}{\partial w_{i,j}} \sum_k \|\mathbf{Wx}_k+\mathbf{b}-\mathbf{y}_k\|^2 = \sum_k 2 (\mathbf{Wx}_k+\mathbf{b}-\mathbf{y}_k)\cdot
\begin{pmatrix}
0 \\
\vdots \\
0 \\
x_{k,j}\\
0\\
\vdots\\
0
\end{pmatrix}
$$

where the nonzero entry in the column vector is in row ##i## and ##x_{k,j}## is the ##j##-th component of the vector ##\mathbf{x}_k##.
Calculating the scalar product gives
$$
\sum_k 2\left(\sum_{n=1}^{3} w_{i,n}x_{k,n} +b_{i} - y_{k,i}\right)x_{k,j}
$$
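One way to sanity-check both expressions is to compare them against finite differences. The snippet below is only a sketch, not part of the original post: it assumes NumPy, stores the vectors ##\mathbf{x}_k## and ##\mathbf{y}_k## as rows of arrays `X` and `Y`, and uses random test data. The function names (`loss`, `grad_b`, `grad_W`, `numerical_grad`) are made up for illustration.

```python
import numpy as np

def loss(W, b, X, Y):
    """Sum over k of ||W x_k + b - y_k||^2; rows of X and Y are x_k and y_k."""
    R = X @ W.T + b - Y          # residuals W x_k + b - y_k, shape (N, 3)
    return np.sum(R ** 2)

def grad_b(W, b, X, Y):
    """First formula from the post: sum_k 2 (W x_k + b - y_k)."""
    R = X @ W.T + b - Y
    return 2.0 * R.sum(axis=0)

def grad_W(W, b, X, Y):
    """Second formula: entry (i, j) is sum_k 2 (W x_k + b - y_k)_i * x_{k,j}."""
    R = X @ W.T + b - Y
    return 2.0 * R.T @ X

def numerical_grad(f, theta, eps=1e-6):
    """Central finite differences of a scalar f with respect to each entry of theta."""
    g = np.zeros_like(theta)
    for idx in np.ndindex(theta.shape):
        step = np.zeros_like(theta)
        step[idx] = eps
        g[idx] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return g

# Random test data (an assumption for illustration only).
rng = np.random.default_rng(0)
N = 5
X, Y = rng.normal(size=(N, 3)), rng.normal(size=(N, 3))
W, b = rng.normal(size=(3, 3)), rng.normal(size=3)

num_b = numerical_grad(lambda v: loss(W, v, X, Y), b)
num_W = numerical_grad(lambda M: loss(M, b, X, Y), W)

print(np.allclose(grad_b(W, b, X, Y), num_b, atol=1e-5))
print(np.allclose(grad_W(W, b, X, Y), num_W, atol=1e-5))
```

If the two formulas are right, both comparisons should agree to within the tolerance. Agreement on random data is evidence for the formulas, not a proof of them.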
 
Svein
SchroedingersLion said:
$$
\frac{\partial}{\partial \mathbf{b}} \sum_k \|\mathbf{Wx}_k+\mathbf{b}-\mathbf{y}_k\|^2 = \sum_k 2(\mathbf{Wx}_k+\mathbf{b}-\mathbf{y}_k)
$$
I have never seen the operator ##\frac{\partial}{\partial \mathbf{b}}## before. Can you define it?
At first glance, the expression is not meaningful, but I am willing to suspend disbelief ...
 
SchroedingersLion
Svein said:
I have never seen the operator ##\frac{\partial}{\partial \mathbf{b}}## before. Can you define it?
At first glance, the expression is not meaningful, but I am willing to suspend disbelief ...
Oh, that is just a way of writing ##\nabla_{\mathbf{b}}##, sorry.
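A componentwise reading of this notation may help. What follows is a sketch and not part of the original exchange; the residual symbol ##\mathbf{r}_k## and the Kronecker delta ##\delta_{m,i}## are introduced here only for brevity. Take ##\nabla_{\mathbf{b}}## to be the column vector of partial derivatives with respect to ##b_1, b_2, b_3##, and write ##\mathbf{r}_k = \mathbf{Wx}_k+\mathbf{b}-\mathbf{y}_k##, so that ##r_{k,m} = \sum_{n=1}^{3} w_{m,n}x_{k,n} + b_m - y_{k,m}## and ##\|\mathbf{r}_k\|^2 = \sum_{m=1}^{3} r_{k,m}^2##. The chain rule then gives
$$
\frac{\partial}{\partial b_i} \sum_k \|\mathbf{r}_k\|^2 = \sum_k \sum_{m=1}^{3} 2\, r_{k,m}\, \frac{\partial r_{k,m}}{\partial b_i} = \sum_k \sum_{m=1}^{3} 2\, r_{k,m}\, \delta_{m,i} = \sum_k 2\, r_{k,i}
$$
$$
\frac{\partial}{\partial w_{i,j}} \sum_k \|\mathbf{r}_k\|^2 = \sum_k \sum_{m=1}^{3} 2\, r_{k,m}\, \frac{\partial r_{k,m}}{\partial w_{i,j}} = \sum_k \sum_{m=1}^{3} 2\, r_{k,m}\, \delta_{m,i}\, x_{k,j} = \sum_k 2\, r_{k,i}\, x_{k,j}
$$
Under this reading, the first line is the ##i##-th component of ##\sum_k 2(\mathbf{Wx}_k+\mathbf{b}-\mathbf{y}_k)##, and the second matches the scalar-product expression in the opening post.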
 
