How to compute the derivative with respect to a vector in index notation?

EngWiPy · Jun 29, 2010

Dear all;

I need to find this derivative:

[tex]\frac{\partial}{\partial \mathbf{a}}\left(E-2\,\mathbf{a}^{\text{T}}\,E[\mathbf{s}]+\|\mathbf{a}\|^2\right)[/tex]

where boldface letters indicate column vectors, the superscript T indicates transpose, and ||.|| is the norm of a vector.

Thanks in advance

mathman · Jun 29, 2010

The notation is unusual. I have never seen a definition of a derivative with respect to a vector. The closest thing I can think of is the del operator.

Pere Callahan · Jun 29, 2010

I think what is meant is
[tex] \frac{\partial}{\partial\vec a}f(\vec a) = \left(\frac{\partial}{\partial a_1}f(\vec a),\frac{\partial}{\partial a_2}f(\vec a),\frac{\partial}{\partial a_3}f(\vec a)\right)[/tex]
which is of course very similar to the del operator.

So I suggest you write your function
[tex] f(\vec a) =E-2\,\vec a^{\text{T}}\,E[\vec s]+\|\vec a\|^2[/tex]
explicitly in terms of the components [itex]a_1, a_2, a_3[/itex] (I assume it's a three-dimensional vector; it wouldn't make a difference if it were not) and then compute the partial derivatives.
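
As a sketch of that expansion, assume [itex]E[/itex] is a constant and [itex]E[\vec s][/itex] is a constant vector whose components are labelled [itex]m_1, m_2, m_3[/itex] (a label used only here, not in the original question). Then

[tex] f(\vec a) = E - 2\left(a_1 m_1 + a_2 m_2 + a_3 m_3\right) + a_1^2 + a_2^2 + a_3^2 [/tex]

and, for example,

[tex] \frac{\partial f}{\partial a_1} = -2\,m_1 + 2\,a_1 [/tex]

with the same pattern for [itex]a_2[/itex] and [itex]a_3[/itex].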

EngWiPy · Jun 30, 2010

The vectors are N*1 column vectors. At the end, I need to find the derivative with respect to [tex]\mathbf{a}[/tex]. How can I do that?

Thanks in advance

HallsofIvy · Jun 30, 2010

In general, the derivative of a function f from [itex]R^n[/itex] to [itex]R^m[/itex] is the linear function from [itex]R^n[/itex] to [itex]R^m[/itex] that "best" approximates the function f around the given point. "Best" is made precise in the definition of the derivative.

Given a coordinate system (basis) in each space, we can then write the derivative as an m by n matrix. In this problem you are differentiating a scalar, in R, with respect to a vector in [itex]R^3[/itex], so this would be a "1 by 3" matrix, which we would interpret as a 3-vector. Essentially, you treat the components of the vector as the three variables, and this derivative is the same as taking the gradient, [itex]\nabla f[/itex], of a numerical function of three variables.
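
The "best linear approximation" idea can be illustrated numerically. The following is only a minimal sketch, assuming NumPy; the names `f`, `grad_f`, and `m` (a stand-in for the constant vector E[s]), the default value of `E`, and the random test values are all illustrative choices, not part of the original question. It assumes the gradient is 2(a - m) and checks how far f(a + h) is from the linear estimate f(a) + grad · h.

```python
import numpy as np

def f(a, m, E=0.0):
    """f(a) = E - 2 a^T m + ||a||^2, with m standing in for E[s] (assumed constant)."""
    return E - 2.0 * a @ m + a @ a

def grad_f(a, m):
    """Assumed gradient of f with respect to a: 2 (a - m)."""
    return 2.0 * (a - m)

rng = np.random.default_rng(0)
a = rng.normal(size=3)           # point at which we linearise
m = rng.normal(size=3)           # illustrative stand-in for E[s]
h = 1e-3 * rng.normal(size=3)    # small displacement

# Error left over after the linear approximation f(a) + grad . h
remainder = f(a + h, m) - f(a, m) - grad_f(a, m) @ h

print("remainder:", remainder)
print("||h||^2  :", h @ h)
```

For this particular quadratic the leftover term equals ||h||², so it shrinks faster than ||h|| itself, which is what "best" linear approximation requires.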

uart · Jun 30, 2010

S_David said:

Dear all;

I need to find this derivative:

[tex]\frac{\partial}{\partial \mathbf{a}}\left(E-2\,\mathbf{a}^{\text{T}}\,E[\mathbf{s}]+\|\mathbf{a}\|^2\right)[/tex]

where boldface letters indicate column vectors, the superscript T indicates transpose, and ||.|| is the norm of a vector.

The vectors are N*1 column vectors.

Then in that case I'd assume that you are to find a column vector of partial derivatives, as in:

[tex]\frac{\partial f}{\partial \mathbf{a}} = \left[\frac{\partial f}{\partial a_1}, \frac{\partial f}{\partial a_2}, \frac{\partial f}{\partial a_3}, ... \right]^{\text{T}} = 2 [ \mathbf{a} - E \mathbf{s}][/tex]

Assuming that "E" is a scalar, "f" is just a scalar (quadratic) function of [itex]a_1, a_2, \ldots, a_n[/itex], right? So you should be able to verify the above easily enough.
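
One way to carry out that verification is a component-by-component finite-difference check. This is a hedged sketch, assuming NumPy; `f`, `numerical_gradient`, the step size `eps`, and `m` (standing in for the constant vector that multiplies a^T) are hypothetical names and values chosen for illustration.

```python
import numpy as np

def f(a, m, E=1.0):
    """Scalar quadratic f(a) = E - 2 a^T m + ||a||^2 (E and m assumed constant)."""
    return E - 2.0 * a @ m + a @ a

def numerical_gradient(func, a, eps=1e-6):
    """Central-difference estimate of [df/da_1, ..., df/da_N]."""
    grad = np.zeros_like(a)
    for k in range(a.size):
        step = np.zeros_like(a)
        step[k] = eps                     # perturb only component k
        grad[k] = (func(a + step) - func(a - step)) / (2.0 * eps)
    return grad

rng = np.random.default_rng(1)
N = 5
a = rng.normal(size=N)
m = rng.normal(size=N)

numeric = numerical_gradient(lambda x: f(x, m), a)
analytic = 2.0 * (a - m)                  # the closed form being checked

print("max abs difference:", np.max(np.abs(numeric - analytic)))
```

The two vectors should agree up to floating-point rounding, since central differences have no truncation error for a quadratic function.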

Hurkyl · Jun 30, 2010

S_David said:

Dear all;

I need to find this derivative:

You have been asked to clarify what your notation means. Please do so, rather than force people to guess what notational conventions you are using.

Also, I suggest you specify what E and [s] mean. (And if they have any functional dependence on a)

p.s. ||a||² = a^T a

EngWiPy · Jun 30, 2010

Hurkyl said:

You have been asked to clarify what your notation means. Please do so, rather than force people to guess what notational conventions you are using.

Also, I suggest you specify what E and [s] mean. (And if they have any functional dependence on a)

p.s. ||a||² = a^T a

I thought it is obvious, sorry. The terms are:
1. [tex]\mathbf{a}[/tex] is an [tex]N\times 1[/tex] column vector.
2. [tex]E[/tex] is a constant that does not depend on [tex]\mathbf{a}[/tex].
3. [tex]E[\mathbf{s}][/tex] is another [tex]N\times 1[/tex] vector that does not depend on [tex]\mathbf{a}[/tex].
4. [tex]\|\mathbf{a}\|^2[/tex] is as you clarified.

Thank you all, now I get it.

Regards

Hurkyl · Jun 30, 2010

S_David said:

I thought it is obvious, sorry.

I, in particular, was confused by the square brackets around s. (i.e. why not write Es?)

EngWiPy · Jul 1, 2010

Hurkyl said:

I, in particular, was confused by the square brackets around s. (i.e. why not write Es?)

Yes, you are right, it is confusing, because the terms [tex]E[/tex] and [tex]E[\mathbf{s}][/tex] are different. Let me re-write the original equation:

[tex]\mathcal{E}-2\,\mathbf{a}^{\text{T}}\,E[\mathbf{s}]+\|\mathbf{a}\|^2[/tex]

and the term [tex]E[\mathbf{s}][/tex] is written this way because, in the context I am working in, it is the statistical average of a number of vectors [tex]\left\{\mathbf{s}_i\right\}_{i=1}^{M}[/tex].

Regards

Studiot · Jul 1, 2010

Perhaps it would help simplify things if I pointed out that the norm of anything is a number, i.e. a scalar.

HallsofIvy · Jul 1, 2010

Studiot said:

Perhaps it would help simplify things if I pointed out that the norm of anything is a number, i.e. a scalar.

Yes, so the function to be differentiated is a number, not a vector. But it is being differentiated with respect to a vector. That is the whole point.

Mute · Jul 1, 2010

S_David said:

The vectors are N*1 column vectors. At the end, I need to find the derivative with respect to [tex]\mathbf{a}[/tex]. How can I do that?

Thanks in advance

Write it in index notation (repeated indices are assumed to be summed over):

[tex]f(\{a\}) = \mathcal E - 2 a_i E[\mathbf{s}]_i + a_i a_i[/tex]

Then, what you want is

[tex]\frac{\partial f}{\partial a_\ell} = -2 \frac{\partial a_i}{\partial a_\ell} E[\mathbf{s}]_i + \frac{\partial a_i}{\partial a_\ell} a_i + a_i \frac{\partial a_i}{\partial a_\ell} = (-2 E[\mathbf{s}]_i + 2a_i )\delta_{i\ell} = -2E[\mathbf{s}]_\ell +2a_\ell[/tex].
where I used the fact that [itex]\partial a_i/\partial a_\ell = \delta_{i\ell}[/itex], the Kronecker delta.
So, the ith component of the vector [itex]\partial f/\partial \mathbf{a}[/itex] is

[tex]\frac{\partial f}{\partial a_i} = -2E[\mathbf{s}]_i +2a_i[/tex]
which means

[tex]\frac{\partial f}{\partial \mathbf{a}} = - 2\mathbf{E}[\mathbf{s}] + 2\mathbf{a}[/tex].

(Since E is a vector that depends on the vector s, I have made E bold in vector notation).
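
The index-notation steps above can be mirrored term by term with `numpy.einsum`, which sums over repeated indices in the same way. This is only an illustrative sketch, assuming NumPy; `m` is a stand-in for the constant vector E[s], `delta` is the Kronecker delta written as an identity matrix, and the random values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4
a = rng.normal(size=N)
m = rng.normal(size=N)       # illustrative stand-in for E[s]
delta = np.eye(N)            # Kronecker delta: delta[i, l] plays the role of da_i/da_l

# The three terms of df/da_l, each summed over the repeated index i:
#   -2 (da_i/da_l) m_i  +  (da_i/da_l) a_i  +  a_i (da_i/da_l)
grad = (-2.0 * np.einsum("il,i->l", delta, m)
        + np.einsum("il,i->l", delta, a)
        + np.einsum("i,il->l", a, delta))

print("index-notation result:", grad)
print("vector form -2m + 2a :", -2.0 * m + 2.0 * a)
```

Contracting with the delta simply replaces the summed index i by the free index ℓ, which is why the result collapses to the vector expression.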
