How to compute the derivative with respect to a vector in index notation?

In summary, the derivative of the function with respect to the N×1 column vector a is -2E[s] + 2a, where E[s] is an N×1 vector (a statistical average of the vectors s_i) that does not depend on a. In index notation this is \partial f/\partial a_i = -2E[s]_i + 2a_i, and in vector notation \partial f/\partial \mathbf{a} = - 2\mathbf{E}[\mathbf{s}] + 2\mathbf{a}.
  • #1
EngWiPy
Dear all;

I need to find this derivative:

[tex]\frac{\partial}{\partial \mathbf{a}}\left(E-2\,\mathbf{a}^{\text{T}}\,E[\mathbf{s}]+\|\mathbf{a}\|^2\right)[/tex]

where boldface letters indicate column vectors, the superscript T indicates the transpose, and ||.|| is the norm of a vector.

Thanks in advance
 
  • #2
The notation is unusual. I have never seen a definition of a derivative with respect to a vector. The closest thing I can think of is the del operator.
 
  • #3
I think what is meant is
[tex]
\frac{\partial}{\partial\vec a}f(\vec a) = \left(\frac{\partial}{\partial a_1}f(\vec a),\frac{\partial}{\partial a_2}f(\vec a),\frac{\partial}{\partial a_3}f(\vec a)\right)
[/tex]
which is of course very similar to the del operator.

So I suggest you write your function
[tex]
f(\vec a) =E-2\,\vec a^{\text{T}}\,E[\vec s]+\|\vec a\|^2
[/tex]
explicitly in terms of the components [itex]a_1, a_2, a_3[/itex] (I assume it's a three-dimensional vector; it wouldn't make a difference if it were not) and then compute the partial derivatives.
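
As a sketch of that suggestion, assuming three components and assuming that neither [itex]E[/itex] nor [itex]E[\vec s][/itex] depends on [itex]\vec a[/itex] (the question does not say so explicitly), the function written out in components would be
[tex]
f(\vec a) = E - 2\left(a_1 E[\vec s]_1 + a_2 E[\vec s]_2 + a_3 E[\vec s]_3\right) + a_1^2 + a_2^2 + a_3^2
[/tex]
Each partial derivative then only sees the terms containing that one component:
[tex]
\frac{\partial f}{\partial a_k} = -2E[\vec s]_k + 2a_k, \qquad k = 1, 2, 3.
[/tex]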
 
  • #4
The vectors are N*1 column vectors. At the end, I need to find the derivative with respect to [tex]\mathbf{a}[/tex]. How can I do that?

Thanks in advance
 
  • #5
In general, the derivative of a function f from [itex]R^n[/itex] to [itex]R^m[/itex] is the linear function from [itex]R^n[/itex] to [itex]R^m[/itex] that "best" approximates the function f around the given point. "Best" is made precise in the definition of the derivative.

Given a coordinate system (basis) in each space, we can then write the derivative as an m by n matrix. In this problem you are differentiating a scalar, in R, with respect to a vector in [itex]R^3[/itex], so this would be a "1 by 3" matrix, which we would interpret as a 3-vector. Essentially, you treat the components of the vector as the three variables, and this derivative is the same as taking the gradient, [itex]\nabla f[/itex], of a numerical function of three variables.
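
To make the "best linear approximation" idea concrete, here is a minimal Python sketch. It assumes that E is a constant scalar, that the list `m` stands in for the fixed vector E[s], and that the candidate gradient has components -2 m_i + 2 a_i; the numbers and function names are made up for illustration. For this particular quadratic function, the part the linear map misses should come out as |h|^2, which shrinks faster than |h|.

```python
# Sketch: the gradient as the "best" linear approximation of f near a point.
# Assumptions (not from the thread): E is a constant scalar, m stands in for
# the fixed vector E[s], and all numbers below are arbitrary.

def f(a, E, m):
    """f(a) = E - 2 a.m + |a|^2 for lists a and m of equal length."""
    dot_am = sum(ai * mi for ai, mi in zip(a, m))
    norm_sq = sum(ai * ai for ai in a)
    return E - 2.0 * dot_am + norm_sq


def grad_f(a, m):
    """Candidate derivative: components -2 m_i + 2 a_i."""
    return [-2.0 * mi + 2.0 * ai for ai, mi in zip(a, m)]


def linearization_error(a, h, E, m):
    """f(a + h) - f(a) - grad_f(a).h, i.e. what the linear map misses."""
    a_plus_h = [ai + hi for ai, hi in zip(a, h)]
    linear_part = sum(gi * hi for gi, hi in zip(grad_f(a, m), h))
    return f(a_plus_h, E, m) - f(a, E, m) - linear_part


E = 5.0
m = [1.0, -2.0, 0.5]
a = [0.3, 0.7, -1.2]

for scale in (1e-1, 1e-2, 1e-3):
    h = [scale, -scale, 2.0 * scale]
    err = linearization_error(a, h, E, m)
    h_norm_sq = sum(hi * hi for hi in h)
    # The two printed numbers should agree up to rounding.
    print(scale, err, h_norm_sq)
```

The error column tracking |h|^2 is what "best" means here: no other linear map leaves a remainder that vanishes faster than |h|.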
 
  • #6
S_David said:
Dear all;

I need to find this derivative:

[tex]\frac{\partial}{\partial \mathbf{a}}\left(E-2\,\mathbf{a}^{\text{T}}\,E[\mathbf{s}]+\|\mathbf{a}\|^2\right)[/tex]

where boldface letters indicate column vectors, the superscript T indicates the transpose, and ||.|| is the norm of a vector.

The vectors are N*1 column vectors.

In that case I'd assume that you are to find a column vector of partial derivatives, as in:

[tex]\frac{\partial f}{\partial \mathbf{a}} = \left[\frac{\partial f}{\partial a_1}, \frac{\partial f}{\partial a_2}, \frac{\partial f}{\partial a_3}, ... \right]^{\text{T}} = 2 [ \mathbf{a} - E \mathbf{s}] [/tex]

Assuming that "E" is a scalar, "f" is just a scalar (quadratic) function of [itex]a_1, a_2, \ldots, a_n[/itex], right? So you should be able to verify the above easily enough.
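
One way to do that verification numerically is a central finite-difference check, sketched below in plain Python. It assumes the constant E is a scalar and uses the list `m` as a stand-in for the fixed vector that multiplies a in the middle term; the helper names and test values are hypothetical. Central differences are exact for quadratics up to rounding, so the two gradients should agree closely.

```python
# Sketch: compare a finite-difference gradient with the formula 2 (a - m).
# Assumptions (not from the thread): E is a constant scalar, m is the fixed
# vector in the middle term, and the numbers below are arbitrary.

def f(a, E, m):
    """f(a) = E - 2 a.m + |a|^2 for lists a and m of equal length."""
    dot_am = sum(ai * mi for ai, mi in zip(a, m))
    norm_sq = sum(ai * ai for ai in a)
    return E - 2.0 * dot_am + norm_sq


def numerical_gradient(func, a, step=1e-4):
    """Central difference in each coordinate: one partial derivative per entry."""
    grad = []
    for i in range(len(a)):
        forward = list(a)
        backward = list(a)
        forward[i] += step
        backward[i] -= step
        grad.append((func(forward) - func(backward)) / (2.0 * step))
    return grad


def analytic_gradient(a, m):
    """The claimed result: column vector 2 (a - m)."""
    return [2.0 * (ai - mi) for ai, mi in zip(a, m)]


E = 5.0
m = [1.0, -2.0, 0.5, 3.0]
a = [0.3, 0.7, -1.2, 2.5]

numeric = numerical_gradient(lambda v: f(v, E, m), a)
analytic = analytic_gradient(a, m)
max_gap = max(abs(n - g) for n, g in zip(numeric, analytic))
print(max_gap)  # expected to be tiny (rounding error only)
```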
 
  • #7
S_David said:
Dear all;

I need to find this derivative:
You have been asked to clarify what your notation means. Please do so, rather than force people to guess what notational conventions you are using.

Also, I suggest you specify what E and [s] mean. (And if they have any functional dependence on a)


p.s. [itex]\|\mathbf{a}\|^2 = \mathbf{a}^{\text{T}}\mathbf{a}[/itex]
 
  • #8
Hurkyl said:
You have been asked to clarify what your notation means. Please do so, rather than force people to guess what notational conventions you are using.

Also, I suggest you specify what E and [s] mean. (And if they have any functional dependence on a)


p.s. [itex]\|\mathbf{a}\|^2 = \mathbf{a}^{\text{T}}\mathbf{a}[/itex]

I thought it is obvious, sorry. The terms are:
1- [tex]\mathbf{a}[/tex] is an [tex]N\times 1[/tex] column vector.
2- [tex]E[/tex] is a constant that does not depend on [tex]\mathbf{a}[/tex].
3- [tex]E[\mathbf{s}][/tex] is another [tex]N\times 1[/tex] vector, that does not depend on [tex]\mathbf{a}[/tex].
4- [tex]\|\mathbf{a}\|^2[/tex] is as you clarified.

Thank you all, now I get it.

Regards
 
  • #9
S_David said:
I thought it is obvious, sorry.
I, in particular, was confused by the square brackets around s. (i.e. why not write Es?)
 
  • #10
Hurkyl said:
I, in particular, was confused by the square brackets around s. (i.e. why not write Es?)

Yes, you are right, it is confusing, because the terms [tex]E[/tex] and [tex]E[\mathbf{s}][/tex] are different. Let me re-write the original equation:

[tex]\mathcal{E}-2\,\mathbf{a}^{\text{T}}\,E[\mathbf{s}]+\|\mathbf{a}\|^2[/tex]

and the term [tex]E[\mathbf{s}][/tex] is written this way because, in the context in which I am working, it is the statistical average of a number of vectors [tex]\left\{\mathbf{s}_i\right\}_{i=1}^{M}[/tex]

Regards
 
  • #11
Perhaps it would help simplify things if I pointed out that the norm of anything is a number, i.e. a scalar.
 
  • #12
Studiot said:
Perhaps it would help simplify things if I pointed out that the norm of anything is a number, i.e. a scalar.
Yes, so the function to be differentiated is a number, not a vector. But it is being differentiated with respect to a vector. That is the whole point.
 
  • #13
S_David said:
The vectors are N*1 column vectors. At the end, I need to find the derivative with respect to [tex]\mathbf{a}[/tex]. How can I do that?

Thanks in advance

Write it in index notation (repeated indices are assumed to be summed over):

[tex]f(\{a\}) = \mathcal E - 2 a_i E[\mathbf{s}]_i + a_i a_i[/tex]

Then, what you want is

[tex]\frac{\partial f}{\partial a_\ell} = -2 \frac{\partial a_i}{\partial a_\ell} E[\mathbf{s}]_i + \frac{\partial a_i}{\partial a_\ell} a_i + a_i \frac{\partial a_i}{\partial a_\ell} = (-2 E[\mathbf{s}]_i + 2a_i )\delta_{i\ell} = -2E[\mathbf{s}]_\ell +2a_\ell[/tex].
where I used the fact that [itex]\partial a_i/\partial a_\ell = \delta_{i\ell}[/itex], the Kronecker delta.
So, the ith component of the vector [itex]\partial f/\partial \mathbf{a}[/itex] is

[tex]\frac{\partial f}{\partial a_i} = -2E[\mathbf{s}]_i +2a_i[/tex]
which means

[tex]\frac{\partial f}{\partial \mathbf{a}} = - 2\mathbf{E}[\mathbf{s}] + 2\mathbf{a}[/tex].

(Since E is a vector that depends on the vector s, I have made E bold in vector notation).
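
For anyone who wants to see the index-notation step carried out literally, here is a small Python sketch. It assumes E[s] is computed as the plain average of M sample vectors, and the sample values and helper names (`kronecker_delta`, `sample_mean`, `gradient_component`) are made up for illustration. The sum over the repeated index i is written out explicitly, with the Kronecker delta picking out the single surviving term.

```python
# Sketch: df/da_l = sum_i (-2 E[s]_i + 2 a_i) * delta_{il}, summed explicitly.
# Assumptions (not from the thread): E[s] is the plain average of the sample
# vectors, and the sample data below is invented for the example.

def kronecker_delta(i, l):
    """delta_{il}: 1 when the indices match, 0 otherwise."""
    return 1.0 if i == l else 0.0


def sample_mean(samples):
    """Component-wise average of M vectors, used here as E[s]."""
    count = len(samples)
    size = len(samples[0])
    return [sum(s[i] for s in samples) / count for i in range(size)]


def gradient_component(a, mean_s, l):
    """The l-th component of df/da, with the sum over i written out."""
    return sum(
        (-2.0 * mean_s[i] + 2.0 * a[i]) * kronecker_delta(i, l)
        for i in range(len(a))
    )


samples = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0]]
mean_s = sample_mean(samples)
a = [0.5, 1.0, 1.5, 2.0]

gradient = [gradient_component(a, mean_s, l) for l in range(len(a))]
direct = [-2.0 * mean_s[i] + 2.0 * a[i] for i in range(len(a))]
print(gradient == direct)
```

The delta makes the sum redundant in practice, which is why the final vector form can be written down directly as -2E[s] + 2a.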
 

1. What is the definition of vector differentiation?

Vector differentiation is a mathematical operation that involves finding the rate of change of a vector's components with respect to a given variable. It is similar to traditional differentiation, but instead of dealing with single variables, it involves differentiating functions of multiple variables.

2. How do you differentiate a vector function?

To differentiate a vector function, you will first need to find the derivative of each component of the vector. This can be done by applying the traditional rules of differentiation, such as the product rule, quotient rule, and chain rule. Once you have found the derivatives of the components, you can combine them to form the derivative of the vector function.

3. What is the difference between vector differentiation and traditional differentiation?

The main difference between vector differentiation and traditional differentiation is the number of variables involved. In traditional differentiation, there is only one variable, whereas vector differentiation involves multiple variables. Additionally, vector differentiation deals with vectors, which are quantities that have both magnitude and direction, while traditional differentiation deals with scalar quantities.

4. Why is vector differentiation important?

Vector differentiation is essential in fields such as physics, engineering, and economics, where quantities often have both magnitude and direction. It allows us to find the rate of change of vector quantities, which is crucial in understanding how these quantities behave in different scenarios. It also enables us to solve complex problems involving vectors and their derivatives.

5. What are some common applications of vector differentiation?

Vector differentiation has many practical applications, including optimization problems, motion analysis, and finding maximum or minimum values of vector quantities. It is also used in fields such as fluid dynamics, electromagnetism, and computer graphics, where vector quantities are prevalent. Additionally, it is essential in understanding the behavior of complex systems that involve multiple variables and vectors.
