Why Do I Get Three Kronecker Deltas Instead of One in Tensor Summation?

bremvil
Hi everyone,

I recently started a course on continuum mechanics. It began with the mathematical background of transforming tensors with contravariant and/or covariant indices. There is one thing I don't understand, and it should be really straightforward. I hope you can give me a hint.

http://s35.photobucket.com/albums/d174/Brasempje/?action=view&current=tensor.jpg

In equation 5 on the page linked above, I do not see how to get from term 3 to term 4. I end up with three Kronecker deltas instead of just one. Since the index 'm' appears twice, the summation can be carried out, and it should lead to the same result as the 'shortcut'. I can show what I do using 'latex'-like notation, with ^ = superscript, _ = subscript, d = Kronecker delta (or a partial derivative when it appears in a fraction), t = theta

'term 3' in Equation (5) reads:
(dt^i/dx^m) * (dx^m/dt^j)

If I carry out the summation, this expression becomes:
(dt^i/dx^1) * (dx^1/dt^j) + (dt^i/dx^2) * (dx^2/dt^j) +
(dt^i/dx^3) * (dx^3/dt^j)

Each component of vector x is a function of all three components of vector theta. And each component of vector theta is a function of all three components of vector x. By the chain rule the last expression would become

dt^i/dt^j + dt^i/dt^j + dt^i/dt^j

this is:
d^i_j + d^i_j + d^i_j = 3 * d^i_j

So if I carry out the summation, I end up with something different from what I would expect: three Kronecker deltas instead of one! Is there any objection to using a sum in this case?

with kind regards,

Bremvil
 
That equality is just the chain rule. It doesn't have anything to do with tensors or properties of the Kronecker delta.

When g:\mathbb R^n\rightarrow\mathbb R^n and f:\mathbb R^n\rightarrow\mathbb R, I like to write the chain rule like this:

(f\circ g)_{,i}(x)=f_{,j}(g(x)) g^j{}_{,i}(x)

Here ",i" denotes partial derivative with respect to the ith variable, and g^j is the jth component of the function g. If f:\mathbb R^n\rightarrow\mathbb R^n, we have

(f\circ g)^i(x)=(f(g(x)))^i=f^i(g(x))=(f^i\circ g)(x)

so

\delta^i_j=(f\circ f^{-1})^i{}_{,j}(x)=(f^i\circ (f^{-1}))_{,j}(x)

Define g=f^{-1} to unclutter the notation somewhat. Then the above is

=(f^i\circ g)_{,j}(x)=f^i{}_{,k}(g(x))g^k{}_{,j}(x)

and...uhh...I can't explain why this is written in the form

\frac{\partial A^i}{\partial B^k}\frac{\partial B^k}{\partial A^j}

without explaining partial derivatives with respect to a coordinate system. (Edit: Actually I can. See the comments at the end of the post). On a manifold, expressions like f(x+h) don't work, because in general, addition isn't defined for points in the manifold. This is why we have to use a coordinate system to define partial derivatives. A coordinate system is a function x:U\rightarrow\mathbb R^n, where U is an open subset of the manifold that contains the point p at which we want to define the partial derivative. If f is a function from the manifold to the real numbers, we define

\frac{\partial f}{\partial x^i}(p)=(f\circ x^{-1})_{,i}(x(p))

In particular, if y is another coordinate system,

\frac{\partial y^j}{\partial x^i}(p)=(y^j\circ x^{-1})_{,i}(x(p))

The f above is a coordinate change function, i.e. an expression of the form A\circ B^{-1}, where A and B are coordinate systems. So we have

\delta^i_j=f^i{}_{,k}(g(x))g^k{}_{,j}(x)=(A^i\circ B^{-1})_{,k}(g(x))(B^k\circ A^{-1})_{,j}(x)=\frac{\partial A^i}{\partial B^k}(A^{-1}(x))\frac{\partial B^k}{\partial A^j}(A^{-1}(x))

Edit: It turned out to be easier than I thought to explain the notation. No techniques from differential geometry are needed.

\delta^i_j=f^i{}_{,k}(g(x))g^k{}_{,j}(x)=f^i{}_{,k}(g(x))g^k{}_{,j}(f(g(x)))=\frac{\partial f^i}{\partial g^k}(g(x))\frac{\partial g^k}{\partial f^j}(f(g(x)))
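The identity above can be checked numerically. The following is only a sketch under stated assumptions: the map f(u, v) = (u + v^3, v) and its inverse g are hypothetical examples chosen for illustration (they are not from the linked page), and the derivatives are approximated by central differences. The point to notice is where each factor is evaluated: f^i_{,k} at g(x), and g^k_{,j} at x.

```python
def f(u, v):
    """Hypothetical invertible map f: R^2 -> R^2 (illustration only)."""
    return (u + v**3, v)


def g(x, y):
    """Inverse of f, so that f(g(x, y)) == (x, y)."""
    return (x - y**3, y)


def jacobian(func, point, eps=1e-6):
    """Central-difference Jacobian; entry [i][k] approximates func^i_{,k}(point)."""
    n = len(point)
    columns = []
    for k in range(n):
        plus = list(point)
        minus = list(point)
        plus[k] += eps
        minus[k] -= eps
        fp = func(*plus)
        fm = func(*minus)
        columns.append([(fp[i] - fm[i]) / (2 * eps) for i in range(n)])
    # columns[k][i] holds d func^i / d (variable k); transpose to rows i, columns k
    return [[columns[k][i] for k in range(n)] for i in range(n)]


x = (0.4, 1.2)
Jf = jacobian(f, g(*x))  # f^i_{,k} evaluated at g(x)
Jg = jacobian(g, x)      # g^k_{,j} evaluated at x

# Sum over the repeated index k: product[i][j] = f^i_{,k}(g(x)) g^k_{,j}(x)
product = [[sum(Jf[i][k] * Jg[k][j] for k in range(2)) for j in range(2)]
           for i in range(2)]
print(product)  # close to the 2x2 identity, i.e. delta^i_j
```

For i = 0, j = 1 the two terms in the sum are roughly -3y^2 and +3y^2: neither is zero on its own, but together they give delta^0_1 = 0.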
 
Dear Fredrik,

Thanks for your reply. I read through it carefully, but I find it quite difficult; my math background is not as strong as yours. So I'm not there yet. In your explanation you started with:

(f\circ g)_{,i}(x)=f_{,j}(g(x)) g^j{}_{,i}(x)

but this step is basically my entire problem! The index j appears twice, which would mean summation, right? I will try to write down my original problem in LaTeX. Could you tell me where I am making the error?

http://s35.photobucket.com/albums/d174/Brasempje/?action=view&current=tensor.jpg
equation 5, the problem is in step from term 3 to term 4.


x^i = x^i(\theta^1, \theta^2, \theta^3)

\theta^i = \theta^i(x^1, x^2, x^3)

If I decide to apply summation over 'index m' in the equation below I get:

\frac{\partial \theta^i}{\partial x^m}\frac{\partial x^m}{\partial \theta^j} = \frac{\partial \theta^i}{\partial x^1}\frac{\partial x^1}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^2}\frac{\partial x^2}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^3}\frac{\partial x^3}{\partial \theta^j}

by the chain rule I get

\frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} = \delta^i_j + \delta^i_j + \delta^i_j = 3\delta^i_j

So I get 3 times the delta function instead of only 1 delta function. I guess I should not do summation for some reason, but I don't understand why. There is a double 'index m' so a summation should be justified. In every part of the book 'classical and computational solid mechanics' the presence of a double index means summation.
 
bremvil said:
\frac{\partial \theta^i}{\partial x^m}\frac{\partial x^m}{\partial \theta^j} = \frac{\partial \theta^i}{\partial x^1}\frac{\partial x^1}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^2}\frac{\partial x^2}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^3}\frac{\partial x^3}{\partial \theta^j}

by the chain rule I get

\frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} = \delta^i_j + \delta^i_j + \delta^i_j = 3\delta^i_j
The first line is correct, but that's not the chain rule. By the chain rule, what you have on the upper right is equal to

\frac{\partial\theta^i}{\partial\theta^j}

(once, not three times).
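A concrete case may make this easier to see. The sketch below assumes, purely for illustration, that the theta coordinates are cylindrical coordinates (r, phi, z) and the x coordinates are Cartesian; the linked page does not necessarily use these, and the function names are made up for this example. It lists the three terms of the sum over m separately, using the analytic partial derivatives.

```python
import math


def dtheta_dx(r, phi, z):
    """d theta^i / d x^m for theta = (r, phi, z); rows i, columns m."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0],
            [-s / r, c / r, 0.0],
            [0.0, 0.0, 1.0]]


def dx_dtheta(r, phi, z):
    """d x^m / d theta^j for x = (r cos phi, r sin phi, z); rows m, columns j."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -r * s, 0.0],
            [s, r * c, 0.0],
            [0.0, 0.0, 1.0]]


def terms(i, j, r, phi, z):
    """The three individual terms (m = 1, 2, 3) of the sum over m."""
    A = dtheta_dx(r, phi, z)
    B = dx_dtheta(r, phi, z)
    return [A[i][m] * B[m][j] for m in range(3)]


point = (2.0, 0.7, 1.5)  # arbitrary values of (r, phi, z)
for i in range(3):
    for j in range(3):
        t = terms(i, j, *point)
        print(i + 1, j + 1, t, sum(t))
```

For i = j = 1 the three terms are cos^2(phi), sin^2(phi) and 0. None of them is a Kronecker delta by itself; only their sum is 1.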
 
So basically you are saying that the term on the top right:

\frac{\partial \theta^i}{\partial x^m}\frac{\partial x^m}{\partial \theta^j} = \frac{\partial \theta^i}{\partial x^1}\frac{\partial x^1}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^2}\frac{\partial x^2}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^3}\frac{\partial x^3}{\partial \theta^j}

equals a single delta function? The way I interpret it, each term within the expression above is a single delta function.
 
bremvil said:
The way I interpret it, each term within the expression above is a single delta function.
You're not applying the chain rule correctly. Another example:

\frac{d}{dx}f(g(x),h(x))=\frac{\partial f}{\partial g}\frac{dg}{dx}+\frac{\partial f}{\partial h}\frac{dh}{dx}\neq \frac{df}{dx}+\frac{df}{dx}
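This inequality can also be checked with numbers. The sketch below picks hypothetical functions f(u, v) = u v, g(x) = sin x and h(x) = x^2 (any smooth choice would do) and compares a central-difference estimate of the total derivative with the chain-rule sum and with the incorrect "twice df/dx" result.

```python
import math


def g(x):
    return math.sin(x)


def h(x):
    return x * x


def f(u, v):
    return u * v


def total_derivative(x, eps=1e-6):
    """Central-difference estimate of d/dx f(g(x), h(x))."""
    return (f(g(x + eps), h(x + eps)) - f(g(x - eps), h(x - eps))) / (2 * eps)


def chain_rule_sum(x):
    """(df/dg)(dg/dx) + (df/dh)(dh/dx) with analytic partial derivatives."""
    df_dg = h(x)  # partial of u*v with respect to u, evaluated at (g(x), h(x))
    df_dh = g(x)  # partial of u*v with respect to v, evaluated at (g(x), h(x))
    return df_dg * math.cos(x) + df_dh * 2 * x


x0 = 1.3
correct = chain_rule_sum(x0)
wrong = 2 * total_derivative(x0)  # the mistaken "df/dx + df/dx"
print(total_derivative(x0), correct, wrong)
```

The first two printed numbers agree up to finite-difference error, while the third is twice as large. Each term of the chain-rule sum is only one contribution to df/dx, not df/dx itself.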
 
I finally see it! Thanks a lot.
 
