Proper usage of Einstein sum notation


Homework Help Overview

The discussion revolves around the proper usage of Einstein sum notation in the context of derivatives of a kernel function. The original poster attempts to simplify complex summations and derivatives using this notation but expresses uncertainty about its correctness, particularly regarding the use of repeated indices.

Discussion Character

  • Conceptual clarification, Assumption checking

Approaches and Questions Raised

  • Participants question the validity of using repeated indices in the notation and discuss the implications of defining terms like ##P_i##. There is an exploration of alternative representations, such as defining a vector of ones to simplify expressions.

Discussion Status

Some participants have provided guidance on correcting the notation and suggested clearer formulations for the derivatives. Multiple interpretations of the original poster's expressions are being explored, with no explicit consensus reached yet.

Contextual Notes

There is a noted concern about the original poster's use of the delta function and the implications of having the same index appear multiple times in the notation. The discussion reflects on the conventions of summation notation and the need for clarity in mathematical expressions.

Gan_HOPE326

Homework Statement



I'm dealing with some fairly complex derivatives of a kernel function. Long story short, there are a lot of summations involved, so I'm trying to write everything in Einstein notation, both for brevity and, hopefully, to reduce errors (this is also for a paper in which I have to write all of this down, ideally without running past the page margins). Right now I'm testing something relatively simple, but I'm not sure I'm using the notation correctly.

Homework Equations



My test example was a relatively simple derivative. For reference, these are the symbols I am using:

$$ P_{ij} = \exp\left[-(x_i-x_j)^2\right]
\qquad
P_{ij}' = \frac{dP_{ij}}{dx_i} = -\frac{dP_{ij}}{dx_j}
\qquad
P_i = P_{ij}\delta_{jj}
\qquad
P_i' = P_{ij}'\delta_{jj}
$$

I'm already unsure about the use of ##\delta_{jj}## there, but then comes the problem. As a first exercise I'm trying an example of a derivative, with an additional index ##n##:

$$\frac{d(P_iP_i)}{dx_n}$$

The Attempt at a Solution


Here's my solution:

$$\frac{d(P_iP_i)}{dx_n} = 2P_i\frac{dP_i}{dx_n} = 2P_i\left[\frac{dP_i}{dx_i}\delta_{in}-\frac{dP_i}{dx_j}\delta_{jn}\right] = 2P_nP_n' - 2P_iP_{in}'
$$

This actually works (I tested it numerically), but it seems ugly and wrong to me because of the repeated ##n## indices, which seem to imply a summation that isn't really there. Did I do something wrong? Is there some other symbol I'm disregarding, or some rule I don't know? Thanks!
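For readers who want to repeat the numerical test mentioned above, here is a minimal sketch in plain Python. It is not the original poster's code: the function names, the sample points in `x`, and the step size `h` are all illustrative assumptions. It compares the closed-form result ##2P_nP_n' - 2P_iP_{in}'## (with ##n## held fixed and only ##i## summed) against a central finite difference of ##\sum_i P_i^2##.

```python
import math

def kernel(x, i, j):
    """P_ij = exp(-(x_i - x_j)^2)."""
    return math.exp(-(x[i] - x[j]) ** 2)

def kernel_prime(x, i, j):
    """P'_ij = dP_ij/dx_i = -2 (x_i - x_j) P_ij."""
    return -2.0 * (x[i] - x[j]) * kernel(x, i, j)

def row_sum(x, i):
    """P_i = sum_j P_ij."""
    return sum(kernel(x, i, j) for j in range(len(x)))

def row_sum_prime(x, i):
    """P'_i = sum_j P'_ij."""
    return sum(kernel_prime(x, i, j) for j in range(len(x)))

def objective(x):
    """P_i P_i with the implied sum over i written out."""
    return sum(row_sum(x, i) ** 2 for i in range(len(x)))

def analytic_derivative(x, n):
    """2 P_n P'_n - 2 sum_i P_i P'_in; n is a fixed index, not summed."""
    first = 2.0 * row_sum(x, n) * row_sum_prime(x, n)
    second = 2.0 * sum(row_sum(x, i) * kernel_prime(x, i, n)
                       for i in range(len(x)))
    return first - second

def numeric_derivative(x, n, h=1e-6):
    """Central finite difference of the objective with respect to x_n."""
    up = list(x)
    up[n] += h
    down = list(x)
    down[n] -= h
    return (objective(up) - objective(down)) / (2.0 * h)

x = [0.3, -0.7, 1.1, 0.05]  # arbitrary sample points (assumption)
errors = [abs(analytic_derivative(x, n) - numeric_derivative(x, n))
          for n in range(len(x))]
max_error = max(errors)
```

The loops mirror the indices one to one: each `sum(...)` corresponds to a summed index, while `n` is passed in as an argument and never summed, which is exactly the distinction the index notation has trouble expressing here.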
 
One error is that the same index shouldn't show up more than twice in a term, so ##P_i = P_{ij}\delta_{jj}## doesn't make sense because ##j## appears three times. It's not clear to me what you're trying to do there. What is ##P_i## supposed to be equal to in normal summation notation?
 
vela said:
One error is that the same index shouldn't show up more than twice in a term, so ##P_i = P_{ij}\delta_{jj}## doesn't make sense because ##j## appears three times. It's not clear to me what you're trying to do there. What is ##P_i## supposed to be equal to in normal summation notation?

In regular notation,

$$P_i = \sum_j P_{ij} $$

I suppose I could get the same result by multiplying by a single-index array of ones; I just don't know whether there's a conventional symbol for that.
 
That's probably the most straightforward way. You can define a vector of ones, say ##e = (1, 1, \dots, 1)##, then ##P_i = P_{ij}e_j##.
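As a quick illustration of the ones-vector idea, the sketch below uses NumPy's `einsum`, whose subscript strings follow the same rule that a repeated index is summed. NumPy being available, the sample points, and the variable names are assumptions made for this example only.

```python
import numpy as np

x = np.array([0.3, -0.7, 1.1, 0.05])  # arbitrary sample points (assumption)
diff = x[:, None] - x[None, :]
P = np.exp(-diff**2)                  # P_ij = exp(-(x_i - x_j)^2)

e = np.ones(x.size)                   # e_j = 1 for every j

# P_i = P_ij e_j: j appears exactly twice, so it is summed; i stays free.
P_i = np.einsum("ij,j->i", P, e)

# The same quantity written as an explicit sum over j.
P_i_explicit = P.sum(axis=1)

# P_i P_i: i appears twice, so the contraction yields a scalar.
PP = np.einsum("i,i->", P_i, P_i)
```

Each index in the subscript strings appears at most twice per product, which is the constraint the ##\delta_{jj}## version violates.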

You also need to clean up the notation for the derivative. The chain rule gives you (with no implied summation here)
$$\frac{d}{dx_n} P_{ij} = \frac{\partial P_{ij}}{\partial x_i}\frac{dx_i}{dx_n} + \frac{\partial P_{ij}}{\partial x_j}\frac{dx_j}{dx_n}.$$
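
As a sketch of where this leads, assume the ##x_i## are independent variables, so that ##dx_i/dx_n = \delta_{in}##, and take ##P_i = \sum_j P_{ij}##. Writing every sum explicitly, to avoid any ambiguity about which indices are summed:

$$\frac{d}{dx_n}\sum_i P_i^2 = 2\sum_i P_i \sum_j \left[\frac{\partial P_{ij}}{\partial x_i}\delta_{in} + \frac{\partial P_{ij}}{\partial x_j}\delta_{jn}\right] = 2P_n\sum_j \frac{\partial P_{nj}}{\partial x_n} + 2\sum_i P_i\frac{\partial P_{in}}{\partial x_n}.$$

With ##P_{ij}' = \partial P_{ij}/\partial x_i##, and ##\partial P_{ij}/\partial x_j = -P_{ij}'## for this particular kernel, the right-hand side becomes ##2P_n\sum_j P_{nj}' - 2\sum_i P_iP_{in}'##, where ##n## is a fixed index and only ##i## and ##j## are summed.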
 
vela said:
That's probably the most straightforward way. You can define a vector of ones, say ##e = (1, 1, \dots, 1)##, then ##P_i = P_{ij}e_j##.

You also need to clean up the notation for the derivative. The chain rule gives you (with no implied summation here)
$$\frac{d}{dx_n} P_{ij} = \frac{\partial P_{ij}}{\partial x_i}\frac{dx_i}{dx_n} + \frac{\partial P_{ij}}{\partial x_j}\frac{dx_j}{dx_n}.$$

Yes, right, I'll fix that. The minus sign came from me knowing it appears in the end but it's not correct there.

EDIT: apparently I can't edit the first post in the thread? Sorry for that.
 
