Differentiating the complex scalar field

Summary
The discussion revolves around the differentiation of complex scalar fields in the context of Lagrangian mechanics. Participants express confusion over the proper treatment of complex fields, particularly in calculating derivatives with respect to both the field and its complex conjugate. A key point is the realization that treating the scalar field as a complex function leads to incorrect results due to improper application of the chain rule. Clarifications are made regarding the independence of variables and the correct approach to functional derivatives, emphasizing that the relationship between a field and its conjugate must be respected. Ultimately, the conversation highlights the nuances of handling complex variables in theoretical physics, particularly in scalar field theory.
  • #31
Making a coordinate transformation

u = x + y
v = x - y

is not same thing as making a transformation

z = x + iy = (x, y)
z^* = x - iy = (x, -y)

because

(u,v) \in \mathbb{R}^2

and

(z,z^*) \in \mathbb{C}^2 \cong \mathbb{R}^4

Here u and v are independent, while z and z^* are not.
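To make the contrast concrete, here is a small NumPy sketch (my own illustration, not from the thread): the real pair (u,v) ranges over all of R^2 and the change of variables is invertible, while the pair (z,z^*) never leaves the 2-real-dimensional surface w = conj(z) inside C^2.

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=100), rng.normal(size=100)

# Real change of variables: (u, v) ranges over R^2 without constraint,
# and the map is invertible, so u and v can be varied independently.
u, v = x + y, x - y
x_back, y_back = (u + v) / 2, (u - v) / 2
assert np.allclose(x_back, x) and np.allclose(y_back, y)

# Complex "change of variables": the pair (z, z*) is confined to the
# surface {(z, w) in C^2 : w = conj(z)}, so the second entry is always
# determined by the first.
z = x + 1j * y
z_star = x - 1j * y
assert np.allclose(z_star, np.conj(z))
```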
 
  • #32
strangerep said:
If I say "x and y are independent real variables", I mean that it's
possible to vary x without necessarily inducing any change in y as a
consequence (and vice versa). Do you agree with that definition?
I consider a variable to be a symbol that represents a member of a set. (Actually, since every member of every set (in ZFC theory) is a set, I could have said "a symbol that represents a set"). If we say that x is a variable of S, it means that x represents a member of S. In a typical situation where x and y are described as "variables", the set S isn't specified explicitly, but it's clear from the context that x and y are to be assigned values from the same set.

With this definition of "variable", the only way I can make sense of the phrase "x and y are independent" is that no assignment of a value to x will shrink the set of values that can be assigned to y, and vice versa. For example, if x and y are variables of ℝ, the condition x²+y²=8 makes them dependent, because the choice y=2 means that x can only be assigned a value from {-2,2}≠ℝ. On the other hand, if we had said that x and y are variables of {-2,2}, they would be independent even after imposing that condition on them.

I'm not sure if this means that I agree with your definition or not. Does this sound like what you had in mind?

strangerep said:
If so, I now propose a change of variables from x,y to a new pair u,v
(also both real), defined by:

u := x + y
v := x - y

Question #1: are these u,v "variables" or "functions"? (I would say
that it's ok to think of them as both of these.)
If x and y are variables, then so are u and v. Since the maps (x,y)\mapsto u and (x,y)\mapsto v are functions, I agree that it's ok to think of u and v as functions, even though they're not, because they can be used to define functions in an obvious way. I don't even mind too much if we call those functions u and v respectively, but we need to be very careful if we do. If there's any chance that this will cause confusion, we should call them something else.

Edit: See post #36 for more comments about variables that satisfy constraints, and the maps that are implicitly defined by those constraints.

strangerep said:
Question #2: are the u,v variables independent of each other?
By this I mean, is it possible to vary u (over the real line) without
necessarily inducing a change in v and vice versa. I.e., is it possible
to vary u while holding v constant?
Yes. They are independent even with my definition of what that means.

strangerep said:
Now I'll invent an involution operator denoted by a tilde:

\widetilde{u} := v, \qquad \widetilde{v} := u,

and if I now vary u while holding v constant, my involution operation
is not preserved in general. But this does not change the fact that u,v
are independent variables. This doesn't matter -- I'm just using two
different variables to parametrize the plane.
I don't understand what you're doing here. I agree that (x,y)\mapsto (y,x) is an involution on ℝ², but I don't see why that matters.
 
  • #33
Sankaku said:
I have now looked up two of the main print references from the Wikipedia page. Interestingly, as far as I could see, neither one described it as a Wirtinger derivative or even mentioned his name.

Nevertheless, he invented this calculus and justified it rigorously.

Sankaku said:
Saying that "Wirtinger was a mathematician" is the worst sort of "proof by authority." Until you can provide evidence to the contrary, I am going to assume that he just stated it as a definition as well (and then showed the power of using this particular construction).

I was using it as an indicator that mathematicians (who are much more conscious about using rigorous notation than physicists) created this notation and have used it consistently for many, many years. This needs no proof beyond looking up books and papers on complex analysis in several variables.

Of course, as often in mathematics, one has a choice what one wants to call a definition and what a theorem. One can take the non-limit relation as a definition and then prove my limit relation to be a theorem, or vice versa.

In any case, both formulas are valid and make sense rigorously.
 
  • #34
Sankaku said:
Also, books trying to justify the definition using the chain rule with things like this (and the equivalent for y),

x = \frac{z + \bar{z}}{2} \quad\to\quad \frac{\partial x}{\partial \bar{z}} = \frac{1}{2}

are just confusing the issue.

If you don't understand this formula it is only you who is confused.

Given that z is a complex variable, x is a function of z and \bar{z}, and according to the Wirtinger calculus dx/dz equals 1/2, by a trivial calculation going back to the definition: either the one in terms of limits or the one in terms of real and imaginary parts. In both cases it is very easy.
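As a sanity check of these claims, here is a small finite-difference sketch in Python (the helper name `wirtinger` and the step size are my own choices): it approximates ∂/∂z = ½(∂/∂x − i ∂/∂y) and ∂/∂z̄ = ½(∂/∂x + i ∂/∂y) numerically and confirms that for f(z) = Re z both derivatives equal 1/2.

```python
import numpy as np

def wirtinger(f, z, h=1e-6):
    """Central-difference approximations of the Wirtinger derivatives:
    d/dz = (d/dx - i d/dy)/2,  d/dz* = (d/dx + i d/dy)/2."""
    fx = (f(z + h) - f(z - h)) / (2 * h)          # d/dx
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # d/dy
    return (fx - 1j * fy) / 2, (fx + 1j * fy) / 2

z0 = 0.7 - 0.3j

# x = Re z = (z + z*)/2:  dx/dz = dx/dz* = 1/2
dz, dzs = wirtinger(lambda z: z.real, z0)
assert abs(dz - 0.5) < 1e-6 and abs(dzs - 0.5) < 1e-6

# sanity checks: z is holomorphic, so dz/dz = 1 and dz/dz* = 0
dz, dzs = wirtinger(lambda z: z, z0)
assert abs(dz - 1) < 1e-6 and abs(dzs) < 1e-6
```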
 
  • #35
Fredrik said:
With this definition of "variable", the only way I can make sense of the phrase "x and y are independent" is that no assignment of a value to x will shrink the set of values that can be assigned to y, and vice versa.

You should realize that the same word may have different meanings in different contexts,
being generalized by mathematicians if they can give it a more general interpretation that still fits the formal rules.

The word "number" was originally reserved for a natural number. Over time it accommodated fractions, zero, negative numbers, irrational numbers, and complex numbers, because they behaved in the same way: the algorithmic formula manipulation is identical as long as you don't consider specific values.

In a similar way, the meaning of the phrase "x and y are independent" is generalized to apply to "z and z^* are independent", since the algorithmic formula manipulation is identical.
 
  • #36
A. Neumaier said:
You should realize that the same word may have different meanings in different contexts, being generalized by mathematicians if they can give it a more general interpretation that still fits the formal rules.
Unless of course that word happens to be "observable", right? :smile:

Edit: You're obviously going to counter by saying that you're talking about a generalization while I was talking about a restriction. This means that you aren't contradicting yourself, but I still find it funny that you're so willing to embrace a redefinition of the term "independent" that assigns it to a pair of variables that have "I depend on that guy" tattooed on their foreheads, and at the same time find a restriction of the term "observable" so appalling.



I think I have a pretty good idea about how this Wirtinger stuff works now. This is a summary: Suppose that x,y,z,w are variables that represent complex numbers. In this post I will call any piece of additional information about the values of those variables a constraint. The equalities

z=x+iy

w=x-iy

are constraints. This pair of equalities is equivalent to

x=\frac{z+w}{2}

y=\frac{z-w}{2i}

These constraints implicitly define four maps from ℂ² into ℂ:

(x,y)\mapsto z

(x,y)\mapsto w

(z,w)\mapsto x

(z,w)\mapsto y

Now we would like to impose one more constraint, x,y\in\mathbb R. This is of course equivalent to w=z*. When we do, the maps that are implicitly defined by our constraints change:

(x,y)\mapsto z\qquad :\mathbb R^2\rightarrow\mathbb C

(x,y)\mapsto z^*\qquad :\mathbb R^2\rightarrow\mathbb C

(z,z^*)\mapsto x\qquad :\{(z,w)\in\mathbb C^2|w=z^*\}\rightarrow\mathbb R

(z,z^*)\mapsto y\qquad :\{(z,w)\in\mathbb C^2|w=z^*\}\rightarrow\mathbb R

Let's call them u,v,F,G respectively. The partial derivatives of u and v are clearly well-defined, and I don't think it's too horrible to write them as

\frac{\partial z}{\partial x}=1,\ \frac{\partial z}{\partial y}=i,\ \frac{\partial z^*}{\partial x}=1,\ \frac{\partial z^*}{\partial y}=-i

The definition of partial derivative fails miserably for any function H:\{(z,w)\in\mathbb C^2|w=z^*\}\rightarrow S, where S is a subset of ℂ, and this of course includes F and G. The "solution" to this "problem" is apparently to define

\frac{\partial H(z,z^*)}{\partial z}=\frac{1}{2}\left(\frac{\partial}{\partial x}-i\frac{\partial}{\partial y}\right)\Big((x,y)\mapsto H(u(x,y),v(x,y))\Big)

and similarly for the other partial derivative. This definition is motivated by the fact that if the domain of H had been ℂ², so that the usual definition of partial derivative had worked, D_1H(u(x,y),v(x,y)) would have been equal to the right-hand side of the equality above for all x,y\in\mathbb R.

So the weird definition of the partial derivatives of a function that's only defined on pairs of the form (z,z*) as a result of the constraint x,y\in\mathbb R, is equivalent to just waiting until after we have taken the partial derivatives before we impose that constraint.

What I still don't get about all of this is why we would prefer to make a bunch of weird redefinitions of standard notation and terminology in order to make each step of a nonsensical calculation correct, instead of just saying "hey, let's compute the partial derivative first, and then set w=z*".

Edit: What's even harder to understand is why we would want to describe this result as "z and z* are independent".
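The "differentiate first, then impose w = z*" procedure described above can be checked symbolically. A minimal SymPy sketch (the sample function f = zw + z²w is my own choice) verifies that differentiating with w kept independent and then substituting w = z* agrees with applying ½(∂/∂x − i ∂/∂y) to f(x+iy, x−iy):

```python
import sympy as sp

z, w = sp.symbols('z w')          # w plays the role of z*, kept independent
f = z * w + z**2 * w              # a sample "function of z and z*"

# differentiate first, treating w as independent ...
df_dz = sp.diff(f, z)             # w + 2*z*w

# ... then impose the constraint w = z* (with x, y real)
x, y = sp.symbols('x y', real=True)
result = sp.expand(df_dz.subs({z: x + sp.I * y, w: x - sp.I * y}))

# same thing computed as (1/2)(d/dx - i d/dy) applied to f(x+iy, x-iy)
g = f.subs({z: x + sp.I * y, w: x - sp.I * y})
wirtinger = sp.Rational(1, 2) * (sp.diff(g, x) - sp.I * sp.diff(g, y))
assert sp.simplify(sp.expand(wirtinger - result)) == 0
```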
 
  • #37
A. Neumaier said:
If you don't understand this formula it is only you who is confused.

[ citation needed ]

I referenced 3 books, including 2 you pointed me to. Are you arguing with Ahlfors? Cough up some paper reference where they do the derivation as a limit.


 
  • #38
I think A. Neumaier's discussion was clear, and it has helped me to understand this issue much better than I did.

For me, the key point is that partial derivatives with respect to a particular variable are well-defined, even if other variables are functions of them.

For example, suppose I have a function g(x,y), where x and y are Cartesian coordinates on a plane. The meaning of the partial derivative \partial g(x,y)/\partial x is clear.

Now suppose I am interested in the value of g(x,y) along a curve y=f(x). This is given by g(x,f(x)). But, the partial derivative with respect to x is still well-defined, even though y is no longer "independent" of x.

To be clear, we should write the partial derivative with respect to x in this situation as

{\partial g(x,y)\over\partial x}\bigg|_{y=f(x)}.

Complex derivatives are of this nature, it seems to me. We declare z and z* to be "independent" for purposes of taking partial derivatives, even though we are later going to take z* to be a particular function of z (namely, the complex conjugate).
 
  • #39
Sankaku said:
[ citation needed ]
I referenced 3 books, including 2 you pointed me to. Are you arguing with Ahlfors? Cough up some paper reference where they do the derivation as a limit.

I am not arguing with Ahlfors. I stated a formula which is valid in the Wirtinger calculus, no matter whether or not it is in the book by Ahlfors. One doesn't need a book to see that the limit formula is correct. It follows easily from the other definition. And it can serve as an alternative definition since one can derive from it the formula Ahlfors may have used as definition. (I don't have his book.)
 
  • #40
Avodyne said:
For example, suppose I have a function g(x,y), where x and y are cartesian coordinates on a plane. The meaning of the partial derivative \partial g(x,y)/\partial x is clear.

Now suppose I am interested in the value of g(x,y) along a curve y=f(x). This is given by g(x,f(x)). But, the partial derivative with respect to x is still well-defined, even though y is no longer "independent" of x.

To be clear, we should write the partial derivative with respect to x in this situation as

{\partial g(x,y)\over\partial x}\bigg|_{y=f(x)}.

I think this notation and terminology is very misleading. The worst part is the notation at the end, but let's start at the beginning. g(x,y) isn't a function. That expression represents a member of the range of the function g. If we write g:ℝ²→ℝ, there's no need to mention coordinates.

If we are only interested in the values of g at points in its domain of the form (x,f(x)), we can consider the restriction of g to the set of such points, but the partial derivatives of that function are undefined. What we need to do here is to define the curve C by C(x)=(x,f(x)) for all x, and to consider the ordinary derivative of g\circ C:ℝ→ℝ. I wouldn't describe the fact that what we're really interested in is an ordinary derivative of a different function than the one we started with, as "the partial derivative with respect to x is still well-defined".

Now let's talk about the notation at the end. If g:\mathbb R^2\rightarrow\mathbb R is differentiable, then the partial derivative with respect to the first variable is the function D_1g:\mathbb R^2\rightarrow\mathbb R defined by

D_1g(x,y)=\lim_{h\rightarrow 0}\frac{g(x+h,y)-g(x,y)}{h}

for all (x,y)\in\mathbb R^2. \partial g/\partial x is just an alternative notation for D_1g, motivated by the fact that we often use the symbol x as the first variable. The expression

\frac{\partial g(x,y)}{\partial x}

just means "the value of the function D_1g at (x,y)". So

\frac{\partial g(x,f(x))}{\partial x}

can only mean D_1g(x,f(x)), which is equal to

\lim_{h\rightarrow 0}\frac{g(x+h,f(x))-g(x,f(x))}{h},

not

\lim_{h\rightarrow 0}\frac{g(x+h,f(x+h))-g(x,f(x))}{h}=(g\circ C)'(x)

Since \partial/\partial x by definition denotes partial differentiation with respect to the first variable (i.e. exactly the same thing as D_1), the expression you used,

{\partial g(x,y)\over\partial x}\bigg|_{y=f(x)}

should therefore be the same thing as

D_1g(x,y)\big|_{y=f(x)}

and I can only interpret that as "what you get when you replace y with f(x) in the expression D_1g(x,y)", and this is D_1g(x,f(x)), which is equal to the first of the two limits above, not the second.
 
  • #41
Fredrik said:
You're obviously going to counter by saying that you're talking about a generalization while I was talking about a restriction. This means that you aren't contradicting yourself, but I still find it funny that you're so willing to embrace a redefinition of the term "independent"

My interest in these discussions here on PF is to explain the actual usage of concepts in theoretical physics. One cannot change these traditions, but one can understand them and become confident in their correct use.

In this thread, I was simply explaining in which sense the existing, well-established traditions about df(z^*,z)/dz and ''treating z and z^* as independent'' are fully rigorous and make perfect sense, at least to me.

That you don't like this tradition is a different matter about which I can't argue.
 
  • #42
Suppose we want to solve the equations of motion defined by this Lagrangian.

L(\dot{x},\dot{y},x,y) = \frac{1}{2}(\dot{x}^2 + \dot{y}^2) - \frac{C}{2}(x^2 + y^2)

The way 1:

0 \;=\; D_t \frac{\partial L}{\partial\dot{x}} - \frac{\partial L}{\partial x} \;=\; \ddot{x} + Cx

0 \;=\; D_t \frac{\partial L}{\partial\dot{y}} - \frac{\partial L}{\partial y} \;=\; \ddot{y} + Cy

The way 2:

First we denote z = x+iy and z^* = x - iy, and redefine the Lagrangian

L = \frac{1}{2}\dot{z}^* \dot{z} - \frac{C}{2} z^* z

(Of course not writing L(\dot{z},\dot{z}^*,z,z^*) explicitly.) Then we assume that z and z^* are independent and compute

0 = D_t \frac{\partial L}{\partial \dot{z}^*} - \frac{\partial L}{\partial z^*} = \frac{1}{2}\big(\ddot{z} + Cz\big)

My question is: why would you use "way 2"? What do you achieve with it? Is it really worth all the confusion it will inevitably generate? You could also have obtained the same result by way 1.
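Both routes can be verified with SymPy's symbolic Euler-Lagrange machinery (a sketch with my own naming; the function `w` stands in for z* treated as an independent field):

```python
import sympy as sp

t, C = sp.symbols('t C', positive=True)
x, y = sp.Function('x')(t), sp.Function('y')(t)

# Way 1: real coordinates; Euler-Lagrange gives xddot + C x = 0
L1 = (x.diff(t)**2 + y.diff(t)**2) / 2 - C * (x**2 + y**2) / 2
eom_x = sp.diff(L1, x.diff(t)).diff(t) - sp.diff(L1, x)
assert sp.simplify(eom_x - (x.diff(t, 2) + C * x)) == 0

# Way 2: z and z* treated as independent fields (w stands in for z*)
z, zs = sp.Function('z')(t), sp.Function('w')(t)
L2 = zs.diff(t) * z.diff(t) / 2 - C * zs * z / 2
eom_z = sp.diff(L2, zs.diff(t)).diff(t) - sp.diff(L2, zs)
assert sp.simplify(eom_z - (z.diff(t, 2) + C * z) / 2) == 0
```

Varying with respect to z* directly yields (1/2)(zddot + C z) = 0, matching the equation quoted above.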
 
  • #44
jostpuur said:
Suppose we want to solve the equations of motion defined by this Lagrangian.

L(\dot{x},\dot{y},x,y) = \frac{1}{2}(\dot{x}^2 + \dot{y}^2) - \frac{C}{2}(x^2 + y^2)

My question is: why would you use "way 2"? What do you achieve with it? Is it really worth all the confusion it will inevitably generate? You could also have obtained the same result by way 1.

If the Lagrangian is given in your form, there is no reason to perform the transformation.

But suppose you have a problem where your Hamiltonian is given in the form of an anharmonic oscillator
H(z^*,z)= \omega z^*z + g (z^*z)^2 + g' (z^4+(z^*)^4),
say. Then you want to write your dynamics directly in terms of z,
dz/dt = i\, dH/dz^*(z^*,z) = i\big(\omega z + 2g z(z^*z) + 4g' (z^*)^3\big),
rather than first having to convert it to real and imaginary parts and using the real Hamiltonian equations.
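The differentiation of H with respect to z^* can be checked symbolically, treating z^* as an independent symbol (a SymPy sketch with my own variable names):

```python
import sympy as sp

z, zs, w, g, gp = sp.symbols('z zbar omega g gprime')

# anharmonic-oscillator Hamiltonian, with zbar treated as independent
H = w * zs * z + g * (zs * z)**2 + gp * (z**4 + zs**4)

# dH/dzbar, computed as an ordinary partial derivative
dH_dzs = sp.expand(sp.diff(H, zs))
expected = w * z + 2 * g * z * (zs * z) + 4 * gp * zs**3
assert sp.simplify(dH_dzs - expected) == 0
```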

Note that in electrical circuits, say, the variables are naturally given as complex quantities, and the above form is far more natural than the one in terms of real quantities.
 
  • #45
A. Neumaier said:
Sankaku said:
A. Neumaier said:
If H is an analytic function of z^* and z then
dH(z^*,z)/dz^*=lim_{h\to 0} (H(z^*+h,z)-H(z^*,z))/h
makes perfect sense and gives the right result.
I am sorry, I can only say again that z cannot be fixed while you vary \bar{z}. If you can't see the circular logic in your statement, there is nothing I can do.
In case you haven't seen it: I am fixing both z and z^* and vary a _new_ variable h.
There is nothing circular in my argument; it has a standard, perfectly well-defined, rigorous interpretation.

Neumaier, in the beginning you said that H would be an analytic function of z^* and z, which sounds suspicious, because if H is an analytic function of z, then it is not an analytic function of z^*. It could be that this distracted Sankaku. But I see that what you mean makes sense.

Avodyne said:
For me, the key point is that partial derivatives with respect a particular variable are well-defined, even if other variables are functions of them.

I see this now too.

Since this has been a confusing thread, it won't hurt if I reiterate this a little for others to see more explicitly:

If

f:\mathbb{C}^2\to\mathbb{C},\quad (z_1,z_2)\mapsto f(z_1,z_2)

is a function that is complex analytic with respect to both variables separately, then the following partial derivative functions exist

(z_1,z_2) \mapsto (\partial_1 f)(z_1,z_2),\quad (z_1,z_2)\mapsto (\partial_2 f)(z_1,z_2)

and it makes sense to use the following notation:

\frac{\partial f(z,z^*)}{\partial z} := (\partial_1 f)(z,z^*),\quad \frac{\partial f(z,z^*)}{\partial z^*} := (\partial_2 f)(z,z^*)

Sankaku, Fredrik, all clear?

My final comment on this is that it's amazing how physicists succeeded in preventing me from understanding this earlier. ;(
 
  • #46
A. Neumaier said:
Note that in electrical circuits, say, the variables are naturally given as complex quantities, and the above form is far more natural than the one in terms of real quantities.

Everything is real in classical EM unless something is specifically interpreted as complex. Aren't the complex numbers in electrical circuits only used as a computational trick, because people don't want to deal with formulas

\sin(A + B) = \sin(A)\cos(B) + \cos(A)\sin(B)

\cos(A + B) = \cos(A)\cos(B) - \sin(A)\sin(B)

but prefer

e^{i(A + B)} = e^{iA} e^{iB}

instead?
 
  • #47
jostpuur said:
Neumaier, in the beginning you said that H would be an analytic function of z^* and z, which sounds suspicious, because if H is an analytic function of z, then it is not an analytic function of z^*. It could be that this distracted Sankaku. But I see that what you mean makes sense.

Of course. An analytic function of z is as different from an analytic function of z and z^* as a real function of x is different from a real function of x and y.


jostpuur said:
If

f:\mathbb{C}^2\to\mathbb{C},\quad (z_1,z_2)\mapsto f(z_1,z_2)

is a function that is complex analytic with respect to both variables separately, then the following partial derivative functions exist

(z_1,z_2) \mapsto (\partial_1 f)(z_1,z_2),\quad (z_1,z_2)\mapsto (\partial_2 f)(z_1,z_2)

and it makes sense to use the following notation:

\frac{\partial f(z,z^*)}{\partial z} := (\partial_1 f)(z,z^*),\quad \frac{\partial f(z,z^*)}{\partial z^*} := (\partial_2 f)(z,z^*)

Yes. And in this case one says that f(z,z^*) is an analytic function of z and z^*.

Note that given an analytic function of z and z^* in the form of a nonanalytic function of z (e.g., f=Re z), one can find out what f(z_1,z_2) must be: The series expansion in powers of z and z^* is well-defined and unique. Replacing in this expansion z by z_1 and z^* by z_2 gives the expansion of f(z_1,z_2).
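A short SymPy sketch of this polarization procedure (my own example, using f = Re z): substituting x = (z_1+z_2)/2 and y = (z_1-z_2)/(2i) yields f(z_1,z_2) = (z_1+z_2)/2, and restricting back to z_2 = z^* recovers Re z.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
z1, z2 = sp.symbols('z1 z2')

# Re z written in real coordinates: f(x, y) = x
f_xy = x

# polarize: x = (z1 + z2)/2, y = (z1 - z2)/(2i), with z2 in the role of z*
f_z = sp.simplify(f_xy.subs({x: (z1 + z2) / 2, y: (z1 - z2) / (2 * sp.I)}))
assert sp.simplify(f_z - (z1 + z2) / 2) == 0

# restricting back to z2 = conj(z1) recovers Re z
zc = sp.Symbol('z')
restricted = f_z.subs({z1: zc, z2: sp.conjugate(zc)})
assert sp.simplify(sp.expand_complex(restricted - sp.re(zc))) == 0
```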
 
  • #48
Fredrik said:
"z and z* are independent variables" should mean that we use the notation (z,z*) for points in the domain of the function we're dealing with. This is of course just as trivial, but if * denotes complex conjugation, the domain of the function would have to be (a subset of) the subset of ℂ² that consists of pairs of the form (z,z*), and now we have a problem.

Fredrik said:
The definition of partial derivative fails miserably for any function H:\{(z,w)\in\mathbb C^2|w=z^*\}\rightarrow S, where S is a subset of ℂ, and this of course includes F and G.

These domains emerged from your attempts to guess the meaning of vague statements, but IMO you should forget them now, because they turned out not to be relevant for sensible interpretations of those statements.
 
  • #49
jostpuur said:
Aren't the complex numbers in electrical circuits only used as a computational trick, because people don't want to deal with formulas

\sin(A + B) = \sin(A)\cos(B) + \cos(A)\sin(B)

\cos(A + B) = \cos(A)\cos(B) - \sin(A)\sin(B)

but prefer

e^{i(A + B)} = e^{iA} e^{iB}

instead?

I agree with your statement if you drop the ''only'', which isn't justified for a trick that improves things to such an extent that it is used virtually everywhere physicists work with periodic terms. The latter is far more natural than the former, much easier to remember, much easier to use, and in every respect better behaved.

It is the natural expression of periodicity, as can be seen everywhere: from the Fourier transform, from the solution of linear differential equations with constant coefficients in terms of an eigenvalue problem, from the way the Schroedinger equation is treated, and of course from the way linear electrical circuits are analyzed. Once your circuit has more than very few elements, it becomes extremely messy to work with trigonometric functions.
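A small NumPy illustration of the phasor trick (amplitudes and phases are arbitrary values of my choosing): adding two equal-frequency sinusoids reduces to a single complex addition, with no trigonometric addition formulas in sight.

```python
import numpy as np

# a*cos(wt + p1) + b*cos(wt + p2) as one complex (phasor) addition
a, p1 = 2.0, 0.3
b, p2 = 1.5, -1.1
phasor = a * np.exp(1j * p1) + b * np.exp(1j * p2)
amp, phase = np.abs(phasor), np.angle(phasor)

t = np.linspace(0, 10, 1001)
w = 2 * np.pi * 0.7
direct = a * np.cos(w * t + p1) + b * np.cos(w * t + p2)
via_phasor = amp * np.cos(w * t + phase)
assert np.allclose(direct, via_phasor)
```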

All mathematics consists of tricks to make reasoning shorter. We invent the concept of a prime because it is a useful trick not to have to say each time ''number that is not divisible by any number other than one and itself'', and the decimal notation to be able to say 123 in place of ''one times 100 plus two times ten plus three''. We invent the concept of a phase space vector to be able to abbreviate a long list of coordinates by the letter x. Etc.

Mathematics progresses by finding concepts that reduce the labor of precise reasoning to an extent that even very complex matters look comprehensible.
 
  • #50
jostpuur said:
My question is: why would you use "way 2"? What do you achieve with it? Is it really worth all the confusion it will inevitably generate? You could also have obtained the same result by way 1.
My problem is not so much with "way 2", but with the way it's presented in physics books. If they had said something like

In this Lagrangian, the symbol * doesn't denote complex conjugation, and z* is just another variable. We determine the equations of motion for these two functions, and find a) that they're exactly the same, and b) that the complex conjugate of any solution is a solution. This means that if we set z* equal to the complex conjugate of z after we have determined the equation satisfied by both, we obtain a theory of a single complex-valued function instead of a theory of two.​

I would have been OK with it.


jostpuur said:
Neumaier, in the beginning you said that H would be an analytic function of z^* and z, which sounds suspicious, because if H is an analytic function of z, then it is not an analytic function of z^*. It could be that this distracted Sankaku. But I see that what you mean makes sense.
It definitely distracted me. I didn't even look at the limit right away. But I agree that the limit is well-defined for all values of z, assuming that H is defined and analytic in an open set that contains the set of pairs of the form (z*,z). This is just the standard (not Wirtinger) definition of partial derivative.

With that in mind, the definition

\frac{\partial H(z^*,z)}{\partial z}=\frac{1}{2}\left(\frac{\partial}{\partial x}-i\frac{\partial}{\partial y}\right)\Big((x,y)\mapsto H(x-iy,x+iy)\Big)

seems strange and unnecessary. This is the sort of stuff that the Wikipedia article put into my head. Maybe it's useful for something, but I don't think we need it here.

jostpuur said:
These domains emerged from your attempts to guess the meaning for vague statements, but IMO you should forget them now, because they turned out not to be relevant for sensible interpretations of these initially vague statements.
I agree that we have no interest in any functions with that domain. All we need to do is wait until after we have found the partial derivative (which is another function from ℂ² into ℂ) before we set z*=(the complex conjugate of z).

jostpuur said:
My final comment on this is that it's amazing how physicists succeeded in preventing me from understanding this earlier. ;(
I have felt that way many times. I could totally understand the frustration you displayed in #6. I still get angry when I think about how tensors were explained to me in 1994.
 
  • #51
Fredrik said:
My problem is not so much with "way 2", but with the way it's presented in physics books. If they had said something like

In this Lagrangian, the symbol * doesn't denote complex conjugation, and z* is just another variable. We determine the equations of motion for these two functions, and find a) that they're exactly the same, and b) that the complex conjugate of any solution is a solution. This means that if we set z* equal to the complex conjugate of z after we have determined the equation satisfied by both, we obtain a theory of a single complex-valued function instead of a theory of two.​

I would have been OK with it.

They don't say this because that's not what's done.

Suppose z=x+iy is a complex field, and z*=x-iy is its complex conjugate. We can express the Lagrangian as a function of x and y, or as a function of z and z*. I will call the first function R and the second function C. (R is to remind us of "real" and C of "complex".) R is a function from ℝ² into ℝ, and C is a function from ℂ² into ℂ. These functions are related by

R(x,y)=C(x+iy,x-iy).

From this, it follows immediately (using the chain rule) that the derivatives of these functions are related by (using your non-standard notation)

D1R = (D1 + D2)C

D2R = (1/i)(D1 - D2)C.

We can now solve for D1C and D2C, with the result

D1C = (1/2)(D1 + iD2)R

D2C = (1/2)(D1 - iD2)R.

(These are the formulas that you call "strange and unnecessary".)

The equations of motion are

D1R = 0

D2R = 0.

Using the "strange and unnecessary" formulas, we find that

D1C = 0

D2C = 0.

Now it so happens that it is generally easier to compute D1C and D2C than it is to compute D1R and D2R. One reason for this is that C has the property that C(x+iy,x-iy) must be real, and this implies that D2C=(D1C)* when evaluated at (x+iy,x-iy). Thus we only need to compute D1C.

Note that it is never necessary to say that "the symbol * doesn't denote complex conjugation".
 
  • #52
Avodyne said:
Note that it is never necessary to say that "the symbol * doesn't denote complex conjugation".
OK, you're right. I have no objections to anything you did in this post.

Avodyne said:
your non-standard notation
I think all of these notations are standard, but perhaps I picked the least popular one:

D_1f(x,y)=\partial_1 f(x,y)=f_{,1}(x,y)=\frac{\partial f(x,y)}{\partial x}=\frac{\partial}{\partial x}f(x,y)=\frac{\partial}{\partial x_1}\bigg|_{(x,y)}f(x_1,x_2)

Edit: I think you actually made the best post in the entire thread. It explains a lot, without any weird terminology or strange definitions. One of the things you proved is that there's no need to take the formulas I called "strange and unnecessary" as definitions, because the result

\frac{\partial C(z,z^*)}{\partial z}=\frac{1}{2}\left(\frac{\partial}{\partial x}-i\frac{\partial}{\partial y}\right)C(x+iy,x-iy)

follows from the standard definition of a partial derivative, assuming of course that the function C is defined and analytic in an open subset of ℂ² that includes the point (z,z*).
 
  • #53
Fredrik said:
OK, you're right. I have no objections to anything you did in this post.
!
Fredrik said:
I think you actually made the best post in the entire thread.
! !
Fredrik said:
I think all of these notations are standard, but perhaps I picked the least popular one:
Well, I haven't seen it before (except in Mathematica). But if the arguments were called (x_1,x_2), then I would know what \partial_1 meant.

Incidentally, the property that C(x+iy,x-iy) must be real is better expressed as the statement that C must be a symmetric function on ℂ².
 
  • #54
Fredrik said:
OK, you're right. I have no objections to anything you did in this post. I think all of these notations are standard, but perhaps I picked the least popular one:

D_1f(x,y)=\partial_1 f(x,y)=f_{,1}(x,y)=\frac{\partial f(x,y)}{\partial x}=\frac{\partial}{\partial x}f(x,y)=\frac{\partial}{\partial x_1}\bigg|_{(x,y)}f(x_1,x_2)

Edit: I think you actually made the best post in the entire thread. It explains a lot, without any weird terminology or strange definitions. One of the things you proved is that there's no need to take the formulas I called "strange and unnecessary" as definitions, because the result

\frac{\partial C(z,z^*)}{\partial z}=\frac{1}{2}\left(\frac{\partial}{\partial x}-i\frac{\partial}{\partial y}\right)C(x+iy,x-iy)

follows from the standard definition of a partial derivative, assuming of course that the function C is defined and analytic in an open subset of ℂ² that includes the point (z,z*).

The point of the Wirtinger calculus is that this assumption (which I made for didactical reasons) is not needed when you work in terms of his definitions. Because in general you only have a function f(z) that is continuous in z and continuously differentiable in Re z and Im z (i.e., considered as a function on R^2), such as f(z)=|Re z|^3-|Im z|^3. One cannot apply Avodyne's recipe in that case, but the Wirtinger derivative (and the second derivatives) are still well-defined.

It is a bit like the difference between a real analytic function and a real differentiable function. To get the latter, you need more careful definitions than to get the former.
 
  • #55
Avodyne said:
Incidentally, the property that C(x+iy,x-iy) must be real is better expressed as the statement that C must be a symmetric function on ℂ².

But this is not a requirement for the calculus to work.
 
  • #56
Avodyne said:
!

! !
:smile: Do I seem so stubborn and grumpy that I can't give someone else credit for something good? Maybe it's the avatar. :rolleyes:

Avodyne said:
Well, I haven't seen it before (except in Mathematica).
You made me wonder if the book that I picked up the D_i notation from (a little known Swedish book) was using a non-standard notation, but I just checked my copy of the infamous "baby Rudin" (Principles of Mathematical Analysis), and it's the default notation there too.

I actually like the notation f_{,\,i} a lot, because it makes the chain rule look so nice, e.g.

(f\circ g)_{,i}(x)=f_{,j}(g(x))g_{j,i}(x)
 
  • #57
Fredrik said:
I actually like the notation f_{,\,i} a lot, because it makes the chain rule look so nice, e.g.
(f\circ g)_{,i}(x)=f_{,j}(g(x))g_{j,i}(x)

The notation without indices (regarding D as a vector with components D_i) looks even nicer: D(f(g(x))) = Df(g(x))\,Dg(x)
 
  • #58
Arrggh! I somehow managed to screw up some minus signs in my big post above, and now I'm not seeing the "edit" button.

Correct versions are

D1R = (D1 + D2)C

D2R = i(D1 - D2)C.

We can now solve for D1C and D2C, with the result

D1C = (1/2)(D1 - iD2)R

D2C = (1/2)(D1 + iD2)R.

This implies

{\partial\over\partial z}C(z,z^*)={1\over2}\left({\partial\over\partial x}-i{\partial\over\partial y}\right)C(x+iy,x-iy)

which is the same formula given by A. Neumaier in post #17.
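The corrected relation can be checked numerically with central differences (a sketch with my own sample function C(z1, z2) = z1 z2, for which R(x,y) = x² + y²):

```python
import numpy as np

h = 1e-6
x0, y0 = 0.4, -0.9
z0 = x0 + 1j * y0

C = lambda z1, z2: z1 * z2            # sample C; then R(x, y) = x^2 + y^2
R = lambda x, y: C(x + 1j * y, x - 1j * y)

# D1C, evaluated on the surface (z, z*)
d1C = (C(z0 + h, np.conj(z0)) - C(z0 - h, np.conj(z0))) / (2 * h)

# corrected relation: D1C = (1/2)(D1 - i D2) R
dxR = (R(x0 + h, y0) - R(x0 - h, y0)) / (2 * h)
dyR = (R(x0, y0 + h) - R(x0, y0 - h)) / (2 * h)
assert abs(d1C - 0.5 * (dxR - 1j * dyR)) < 1e-6
```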
 
  • #59
A. Neumaier said:
in general you only have a function f(z) that is continuous in z and continuously differentiable in Re z and Im z (i.e., considered as a function on R^2), such as f(z)=|Re z|^3-|Im z|^3. One cannot apply Avodyne's recipe in that case

I don't see the problem ... Is it because of the absolute-value signs?
 
  • #60
Fredrik said:
:smile: Do I seen so stubborn and grumpy that I can't give someone else credit for something good? Maybe it's the avatar. :rolleyes:
I thought my math would be too "lowbrow" for you, though I did make an effort to specify the domain and range of my functions, and to avoid sloppy language, such as referring to the function R as R(x,y).
 
