Why Do I Get Three Kronecker Deltas Instead of One in Tensor Summation?

  • Context: Graduate
  • Thread starter: bremvil
  • Tags: Convention, Tensors

Discussion Overview

The discussion revolves around a question related to tensor summation in the context of continuum mechanics, specifically focusing on the application of the chain rule and the resulting appearance of multiple Kronecker deltas in an equation. Participants explore the mathematical intricacies of tensor transformations and the implications of summation over indices.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant, Bremvil, is confused about obtaining three Kronecker deltas instead of one when writing out the sum over the repeated index m in a coordinate-transformation identity, and asks whether performing the summation is valid.
  • Another participant explains that the equality in question follows from the chain rule alone and does not depend on tensor properties or on properties of the Kronecker delta.
  • Bremvil asks for clarification of the chain-rule step in which the index j appears twice, and whether the repeated index implies summation.
  • Further replies point out that the chain rule applies to the whole sum, which equals a single Kronecker delta; the individual terms are not each a delta.
  • The respondent gives a second example, a function f(g(x), h(x)), to contrast the correct chain rule with Bremvil's term-by-term reasoning.

Areas of Agreement / Disagreement

The thread ends in agreement. Bremvil initially maintains that each term of the expanded sum is itself a Kronecker delta, while the respondent holds that only the complete sum over m equals a single delta. After the second example, Bremvil accepts the respondent's explanation.

Contextual Notes

Participants agree that a repeated index indicates summation. The confusion concerns how the chain rule applies to the expanded sum: it converts the whole sum into a single derivative rather than converting each term separately.

bremvil
Hi everyone,

I recently started a course on continuum mechanics. It started with the mathematical background of transforming tensors with contravariant and/or covariant indices. There is one thing I don't understand, and it should be really straightforward. I hope you can give me a hint.

http://s35.photobucket.com/albums/d174/Brasempje/?action=view&current=tensor.jpg

In equation 5 on the page that I linked above, I do not see how I can get from term 3 to term 4. I end up with three Kronecker deltas instead of just one. Since there is a double index 'm', summation can be performed, and it should lead to the same result as when you use a 'shortcut'. I can show what I do using 'latex' notation, with ^ = superscript, _ = subscript, d = Kronecker delta, t = theta

'term 3' in Equation (5) reads:
(dt^i/dx^m) * (dx^m/dt^j)

in case I decide to do summation this equation will turn into:
(dt^i/dx^1) * (dx^1/dt^j) + (dt^i/dx^2) * (dx^2/dt^j) +
(dt^i/dx^3) * (dx^3/dt^j)

Each component of vector x is a function of all three components of vector theta. And each component of vector theta is a function of all three components of vector x. By the chain rule the last expression would become

dt^i/dt^j + dt^i/dt^j + dt^i/dt^j

this is:
d^i_j + d^i_j + d^i_j = 3 * d^i_j

So if I decide to do the summation, I end up with something different from what I
would expect: three Kronecker deltas instead of one! Is there any objection to using a sum in this
case?

with kind regards,

Bremvil
 
That equality is just the chain rule. It doesn't have anything to do with tensors or properties of the Kronecker delta.

When g:\mathbb R^n\rightarrow\mathbb R^n and f:\mathbb R^n\rightarrow\mathbb R, I like to write the chain rule like this:

(f\circ g)_{,i}(x)=f_{,j}(g(x)) g^j{}_{,i}(x)

Here ",i" denotes partial derivative with respect to the ith variable, and g^j is the jth component of the function g. If f:\mathbb R^n\rightarrow\mathbb R^n, we have

(f\circ g)^i(x)=(f(g(x)))^i=f^i(g(x))=(f^i\circ g)(x)

so

\delta^i_j=(f\circ f^{-1})^i{}_{,j}(x)=(f^i\circ (f^{-1}))_{,j}(x)

Define g=f^{-1} to unclutter the notation somewhat. Then the above is

=(f^i\circ g)_{,j}(x)=f^i{}_{,k}(g(x))g^k{}_{,j}(x)

and...uhh...I can't explain why this is written in the form

\frac{\partial A^i}{\partial B^k}\frac{\partial B^k}{\partial A^j}

without explaining partial derivatives with respect to a coordinate system. (Edit: Actually I can. See the comments at the end of the post). On a manifold, expressions like f(x+h) don't work, because in general, addition isn't defined for points in the manifold. This is why we have to use a coordinate system to define partial derivatives. A coordinate system is a function x:U\rightarrow\mathbb R^n, where U is an open subset of the manifold that contains the point p at which we want to define the partial derivative. If f is a function from the manifold to the real numbers, we define

\frac{\partial f}{\partial x^i}(p)=(f\circ x^{-1})_{,i}(x(p))

In particular, if y is another coordinate system,

\frac{\partial y^j}{\partial x^i}(p)=(y^j\circ x^{-1})_{,i}(x(p))

The f above is a coordinate change function, i.e. an expression of the form A\circ B^{-1}, where A and B are coordinate systems. So we have

\delta^i_j=f^i{}_{,k}(g(x))g^k{}_{,j}(x)=(A^i\circ B^{-1})_{,k}(g(x))(B^k\circ A^{-1})_{,j}(x)=\frac{\partial A^i}{\partial B^k}(A^{-1}(x))\frac{\partial B^k}{\partial A^j}(A^{-1}(x))

Edit: It turned out to be easier than I thought to explain the notation. No techniques from differential geometry are needed.

\delta^i_j=f^i{}_{,k}(g(x))g^k{}_{,j}(x)=f^i{}_{,k}(g(x))g^k{}_{,j}(f(g(x)))=\frac{\partial f^i}{\partial g^k}(g(x))\frac{\partial g^k}{\partial f^j}(f(g(x)))
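
The identity above can be checked numerically. The following is a minimal sketch, not something posted in the thread: it assumes cylindrical coordinates (r, phi, z) play the role of theta and Cartesian coordinates (x, y, z) play the role of x, and the function names are hypothetical. It lists the three products, one per value of the summed index, next to their sum.

```python
import math

def dtheta_dx(x, y, z):
    """d(theta^i)/d(x^m); rows i = (r, phi, z), columns m = (x, y, z)."""
    r2 = x * x + y * y
    r = math.sqrt(r2)
    return [
        [x / r, y / r, 0.0],
        [-y / r2, x / r2, 0.0],
        [0.0, 0.0, 1.0],
    ]

def dx_dtheta(r, phi, z):
    """d(x^m)/d(theta^j); rows m = (x, y, z), columns j = (r, phi, z)."""
    c, s = math.cos(phi), math.sin(phi)
    return [
        [c, -r * s, 0.0],
        [s, r * c, 0.0],
        [0.0, 0.0, 1.0],
    ]

def contraction_terms(r, phi, z, i, j):
    """The three products (one per m) whose sum is the contraction over m."""
    x, y = r * math.cos(phi), r * math.sin(phi)
    a = dtheta_dx(x, y, z)
    b = dx_dtheta(r, phi, z)
    return [a[i][m] * b[m][j] for m in range(3)]

# Arbitrary sample point with r > 0 (the coordinates are singular at r = 0).
r, phi, z = 2.0, 0.7, -1.5
for i in range(3):
    for j in range(3):
        terms = contraction_terms(r, phi, z, i, j)
        # Print the individual terms and their sum for each index pair (i, j).
        print(i, j, [round(t, 6) for t in terms], round(sum(terms), 6))
```

For i = j = r the three terms are cos^2(phi), sin^2(phi) and 0. None of them equals 1 on its own in general, but their sum does, which is the single Kronecker delta.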
 
Dear Fredrik,

Thanks for your reply. I read through it carefully but I find it quite difficult, my math background is not as strong as yours. So I'm still not really there yet. In your explanation you started with:

(f\circ g)_{,i}(x)=f_{,j}(g(x)) g^j{}_{,i}(x)

but this step is basically my entire problem! The index j appears twice, which would mean summation, right? I will try to write down my original problem with latex. Could you instead maybe tell me where I am making the error?

http://s35.photobucket.com/albums/d174/Brasempje/?action=view&current=tensor.jpg
Equation 5; the problem is in the step from term 3 to term 4.

x^i = x^i(\theta^1, \theta^2, \theta^3)

\theta^i = \theta^i(x^1, x^2, x^3)

If I decide to apply summation over 'index m' in the equation below I get:

\frac{\partial \theta^i}{\partial x^m}\frac{\partial x^m}{\partial \theta^j} = \frac{\partial \theta^i}{\partial x^1}\frac{\partial x^1}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^2}\frac{\partial x^2}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^3}\frac{\partial x^3}{\partial \theta^j}

by the chain rule I get

\frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} = \delta^i_j + \delta^i_j + \delta^i_j = 3\delta^i_j

So I get three times the Kronecker delta instead of just one. I guess I should not do the summation for some reason, but I don't understand why. There is a double index 'm', so a summation should be justified. Everywhere in the book 'Classical and Computational Solid Mechanics', the presence of a double index means summation.
 
bremvil said:
\frac{\partial \theta^i}{\partial x^m}\frac{\partial x^m}{\partial \theta^j} = \frac{\partial \theta^i}{\partial x^1}\frac{\partial x^1}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^2}\frac{\partial x^2}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^3}\frac{\partial x^3}{\partial \theta^j}

by the chain rule I get

\frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} + \frac{\partial \theta^i}{\partial \theta^j} = \delta^i_j + \delta^i_j + \delta^i_j = 3\delta^i_j

The first line is correct, but the second step is not the chain rule. By the chain rule, the whole sum on the upper right of the first line is equal to

\frac{\partial\theta^i}{\partial\theta^j}

(once, not three times).
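
One way to see why the sum as a whole gives a single delta is to work with total differentials. This is a sketch under the assumption that \theta^i and x^m are smooth, mutually inverse coordinate functions; it is an illustration, not the derivation used in the linked text.

```latex
% Assumption: \theta^i(x) and x^m(\theta) are smooth and inverse to each other.
d\theta^i = \frac{\partial\theta^i}{\partial x^m}\,dx^m ,
\qquad
dx^m = \frac{\partial x^m}{\partial\theta^j}\,d\theta^j
\quad\Longrightarrow\quad
d\theta^i = \left(\frac{\partial\theta^i}{\partial x^m}
                  \frac{\partial x^m}{\partial\theta^j}\right) d\theta^j .
% The d\theta^j are independent, so the bracket (summed over m) must be the identity:
\frac{\partial\theta^i}{\partial x^m}\frac{\partial x^m}{\partial\theta^j} = \delta^i_j .
```

A single term such as (\partial\theta^i/\partial x^1)(\partial x^1/\partial\theta^j) cannot be "cancelled" like a fraction: the first factor is taken with x^2 and x^3 held fixed, while the second is taken with the other components of \theta held fixed.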
 
So basically you are saying that the term on the top right:

\frac{\partial \theta^i}{\partial x^m}\frac{\partial x^m}{\partial \theta^j} = \frac{\partial \theta^i}{\partial x^1}\frac{\partial x^1}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^2}\frac{\partial x^2}{\partial \theta^j} + \frac{\partial \theta^i}{\partial x^3}\frac{\partial x^3}{\partial \theta^j}

equals a single delta function? The way I interpret it, each term within the expression above is a single delta function.
Fredrik said:
The first line is correct, but that's not the chain rule. By the chain rule, what you have on the upper right is equal to

\frac{\partial\theta^i}{\partial\theta^j}

(once, not three times).
 
bremvil said:
The way I interpret it, each term within the expression above is a single delta function.

You're not applying the chain rule correctly. Another example:

\frac{d}{dx}f(g(x),h(x))=\frac{\partial f}{\partial g}\frac{dg}{dx}+\frac{\partial f}{\partial h}\frac{dh}{dx}\neq \frac{df}{dx}+\frac{df}{dx}
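
To make the inequality concrete, here is a worked instance with a hypothetical choice of functions (not from the thread): f(g,h) = gh, g(x) = x^2, h(x) = x^3, so that f(g(x),h(x)) = x^5.

```latex
% Hypothetical choice: f(g,h) = g h, \; g(x) = x^2, \; h(x) = x^3 .
\frac{d}{dx} f(g(x),h(x))
  = \frac{\partial f}{\partial g}\frac{dg}{dx}
  + \frac{\partial f}{\partial h}\frac{dh}{dx}
  = h \cdot 2x + g \cdot 3x^2
  = 2x^4 + 3x^4
  = 5x^4 ,
\qquad
\frac{df}{dx} + \frac{df}{dx} = 10x^4 \neq 5x^4 .
```

The two terms 2x^4 and 3x^4 are different pieces of the derivative of x^5; neither is the full derivative, and only their sum is.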
 
I finally see it! Thanks a lot.
 
