# Differentiating the complex scalar field

christodouloum
Basic question on scalar field theory that is getting on my nerves. Say that we have the Lagrangian density of the free scalar (not Hermitian, i.e. "complex") field

$$L=-1/2 (\partial_{\mu} \phi \partial^{\mu} \phi^* + m^2 \phi \phi^*)$$

Thus the equations of motion are

$$(\partial_{\mu} \partial^{\mu} - m^2 ) \phi=0$$ i.e. the Klein-Gordon equation, plus the complex conjugate equation. Fine. Now I have been taught to do this calculation by thinking of the scalar field as really a complex function, i.e.
$$\phi=\phi_1 + i \phi_2$$ with $$\phi_1$$ and $$\phi_2$$ real.

This is giving the right results e.g. $$\frac{\partial}{\partial \phi^*} \phi \phi^*=2 \phi$$ but also in the same way I get

$$\frac{\partial}{\partial \phi} {\phi} =2$$

which is quite crazy. So how should one actually do the differentiation in Lagrange's equations? The functional derivative doesn't really help me here; it is the product of $$\phi$$ and its complex conjugate that is giving the problem.


Chopin
The derivatives aren't with respect to $$\phi$$, they're with respect to the components of $$x$$. The notation $$\partial_\mu\phi(x)$$ is short for $$\left(\frac{\partial}{\partial x^0}\phi(x), \frac{\partial}{\partial x^1}\phi(x), \frac{\partial}{\partial x^2}\phi(x), \frac{\partial}{\partial x^3}\phi(x)\right)$$. Each of those partial derivatives should work out correctly with respect to either $$\phi$$ or $$\phi^*$$. Is that what you meant, or did I misunderstand the question?

This is giving the right results e.g. $\frac{\partial}{\partial \phi^*} \phi \phi^*=2 \phi$ but also in the same way I get
$\frac{\partial}{\partial \phi} {\phi} =2$ which is quite crazy.
Both factors of 2 are wrong. See
http://people.ccmr.cornell.edu/~muchomas/P480/Notes/dft/node10.html

christodouloum
I will explain my question a bit better. The Euler-Lagrange equations are

$$\frac{\delta L}{\delta \phi} - \partial_{\mu} \frac{\delta L} {\delta \partial_{\mu} \phi}=0$$

where L is the Lagrangian density from the first post, meaning that the action is the functional

$$S=\int d^Dx\, L$$

so the mass*phi term in the equations of motion comes from

$$\frac{\delta L}{\delta \phi^{*}}=\frac{\delta( m^2 \phi \phi^{*})}{\delta \phi}=2 m \phi$$

Now I do not know why this last result holds, I only know it does. (There are no components in this part of the calculation). The way I know to calculate this is to think of phi as
$$\phi = \phi_{1} + i \phi_{2}$$ and thus in a sloppy way do the chain rule etc, take inverse of derivatives (i.e. assuming $$\frac {\partial \phi_1 }{\partial \phi} ^{-1}= \frac {\partial \phi }{\partial \phi_1}$$) and coming up with (if you write it down it is simple and sloppy)

$$\frac{\partial}{\partial \phi^{*}}=\frac{\partial}{\partial \phi_1}+i\frac{\partial}{\partial \phi_2}$$

thus for $$\phi \phi^* =\phi_1 ^2 +\phi_2 ^2$$ it works fine, also for the part with the partial derivatives. But if we assume that this recipe works we also get in this sense

$$\frac{\partial}{\partial \phi} {\phi} =2$$

which kind of sucks.

christodouloum
While I was writing the last reply, Avodyne replied. I am checking the link now.

jostpuur
Physicists enjoy getting correct results by nonsensical calculations. I would recommend not to waste time trying to understand the physicists' explanations. Compute partial derivatives

$$\frac{\partial}{\partial(\textrm{Re}(\phi))},\quad \frac{\partial}{\partial(\textrm{Im}(\phi))}$$

and everything will remain clear.

Physicists often assume that $z=(x,y)$ and $z^*=(x,-y)$ would be independent. Now, what does it mean that you "assume them to be independent", when they clearly are not independent? It's like genuine notorious Orwellian double think: CONSTRAINED VARIABLES ARE INDEPENDENT VARIABLES.

----

$$(x,y) = (0,0)$$

However, I didn't get full points, because I should have written

$$(x,y) = (0,0),\quad\textrm{and}\quad (x,-y) = (0,0)$$

I tried to explain to the course assistant that the second equation was redundant, but he explained yes it is redundant but it is not redundant (or something like that). The catch was that these were "physical dynamical variables".

Chopin
...

Ok, that's clearer now. I think Avodyne probably has the answer to your immediate question. You might also be interested in looking at http://www.physics.upenn.edu/~chb/phys253a/coleman/06-1009.pdf, specifically the section titled "Internal Symmetries". These are from Sidney Coleman's lectures at Harvard. He goes in the opposite direction from what you're trying to do: he begins by treating $$\phi_1$$ and $$\phi_2$$ as two independent real fields, and uses symmetry arguments to show how they are isomorphic to one complex field. He then uses that to show that even if you start with a complex field, you can treat $$\phi$$ and $$\phi^*$$ as if they were separate fields when minimizing the Lagrangian, and things still come out ok.

Fredrik
$$\frac{\partial}{\partial\phi} f(\phi_1,\phi_2) =\frac{\partial}{\partial\phi} f\Big(\frac{\phi+\phi^*}{2},\frac{\phi-\phi^*}{2i}\Big)=\frac{\partial f}{\partial\phi_1}\frac{1}{2}+\frac{\partial f}{\partial\phi_2}\frac{1}{2i}$$

So

$$\frac{\partial}{\partial\phi}\phi=\frac{1}{2}\Big(\frac{\partial}{\partial\phi_1}-i\frac{\partial}{\partial\phi_2}\Big)(\phi_1+i\phi_2)=\frac{1+0+0+1}{2}=1$$
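For anyone who wants to check this numerically, here is a minimal sketch. It assumes nothing beyond the definitions just used, $$\frac{\partial}{\partial\phi}=\frac{1}{2}\big(\frac{\partial}{\partial\phi_1}-i\frac{\partial}{\partial\phi_2}\big)$$ and its conjugate, and approximates the real partial derivatives by central differences. The helper name `wirtinger` and the sample value of `phi` are made up for illustration.

```python
def wirtinger(f, z, eps=1e-6):
    """Return (df/dz, df/dz*) at z, built from central differences in Re z and Im z."""
    df_d1 = (f(z + eps) - f(z - eps)) / (2 * eps)            # d/d(phi_1)
    df_d2 = (f(z + 1j * eps) - f(z - 1j * eps)) / (2 * eps)  # d/d(phi_2)
    return 0.5 * (df_d1 - 1j * df_d2), 0.5 * (df_d1 + 1j * df_d2)

phi = 0.7 - 1.3j  # arbitrary sample value (assumption)

# f(phi) = phi: expect d/dphi = 1 and d/dphi* = 0, with no stray factor of 2
d_phi, d_phibar = wirtinger(lambda z: z, phi)

# f(phi) = phi phi*: expect d/dphi* = phi and d/dphi = phi*
d_phi_sq, d_phibar_sq = wirtinger(lambda z: z * z.conjugate(), phi)

print(d_phi, d_phibar)
print(d_phibar_sq, phi)
```

If the sketch is right, the first line printed is close to `1` and `0`, and the two numbers on the second line agree, which is the statement that both factors of 2 in the original question disappear.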

christodouloum
So, cheers to Avodyne (and all the rest who answered while I was writing this)! You saved my day; that was exactly what I was doing wrong, and it is quite simple too. So in writing $$\phi=\phi_1+i \phi_2$$ and applying the chain rule I was taking, say,
$$\frac{\partial \phi }{\partial \phi_2}=i= \frac{\partial \phi_2 }{\partial \phi} ^{-1}$$
(last sloppy equality is what i was doing wrong) and coming up with
$$\frac{\partial \phi_2 }{\partial \phi}=-i$$

whereas, noting that $$\phi_2=(\phi - \phi^*)/2i$$, obviously

$$\frac{\partial \phi_2 }{\partial \phi}=-i/2$$ (in case anyone has the same question, I think this explains it).

So cool, I know where I was wrong. How about the factor of 2 in $$\frac{\partial \phi_2 }{\partial \phi}=\frac{1}{2}\left(\frac{\partial \phi }{\partial \phi_2}\right)^{-1}$$? I understand now that this is the case for any complex variable $$\phi=\phi_1 + i \phi_2$$. Is there any intuitive reason for this to happen?

Fredrik had it right too, and I am checking the PDF from jostpuur. Cheers everyone, I can't believe how many people replied while I was writing my reply; now I have two or three more ways to think about this.

Fredrik
How about the factor of 2 in $$\frac{\partial \phi_2 }{\partial \phi}=\frac{1}{2}\left(\frac{\partial \phi }{\partial \phi_2}\right)^{-1}$$? I understand now that this is the case for any complex variable $$\phi=\phi_1 + i \phi_2$$. Is there any intuitive reason for this to happen?
The 2 comes from the fact that there's a factor of 1/2 in

$$\phi_2=\frac{\phi-\phi^*}{2i}$$

but not in

$$\phi=\phi_1+i\phi_2$$.

What your intuition needs is probably just a reminder that you're working with partial derivatives. You expected a factor of 1 instead of 2 because of your experience with ordinary derivatives.
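A small numerical sketch of that factor, under the same assumption that $$\partial/\partial\phi$$ means $$\frac{1}{2}(\partial/\partial\phi_1 - i\,\partial/\partial\phi_2)$$. The function name `d_dphi` and the sample point are hypothetical, chosen only for this illustration.

```python
def d_dphi(f, z, eps=1e-6):
    """d/dphi = (d/dphi_1 - i d/dphi_2)/2, with the real partials from central differences."""
    df_d1 = (f(z + eps) - f(z - eps)) / (2 * eps)
    df_d2 = (f(z + 1j * eps) - f(z - 1j * eps)) / (2 * eps)
    return 0.5 * (df_d1 - 1j * df_d2)

phi = 0.4 + 2.5j  # arbitrary sample value (assumption)

# phi_2 = Im(phi) = (phi - phi*)/(2i), differentiated with phi* held fixed
dphi2_dphi = d_dphi(lambda z: z.imag, phi)

# phi = phi_1 + i phi_2, differentiated with phi_1 held fixed
dphi_dphi2 = 1j

product = dphi2_dphi * dphi_dphi2
print(dphi2_dphi, product)
```

The product should come out close to 1/2 rather than 1. The two partial derivatives hold different things fixed ($$\phi^*$$ in one case, $$\phi_1$$ in the other), so they are not reciprocals of each other the way ordinary derivatives would be.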

Physicists often assume that $z=(x,y)$ and $z^*=(x,-y)$ would be independent. Now, what does it mean that you "assume them to be independent", when they clearly are not independent? [...]

If x and y are independent variables, then so are $z$ and $z^*$.
This is straight calculus, not sloppy assumptions.

Proof:

"x and y are independent variables" means (by definition) that they're not functions
of each other, i.e.,

$$\frac{ \partial x }{\partial y} ~=~ 0 ~=~ \frac{ \partial y }{\partial x} ~~.$$

So if

$$z ~=~ x + iy ~~;~~~~~~ z^* ~=~ x - iy ~,$$

then
$$x ~=~ \frac{z + z^*}{2} ~~;~~~~~~ y ~=~ \frac{z - z^*}{2i} ~.$$

Hence (using the chain rule),

$$\frac{ \partial z }{\partial z^*} ~=~ \frac{ \partial z }{\partial x} \frac{ \partial x }{\partial z^*} ~+~ \frac{ \partial z }{\partial y} \frac{ \partial y }{\partial z^*} ~=~ \frac{1}{2} ~+~ i\,\frac{-1}{2i} ~=~ 0 ~,$$

which means $z$ and $z^*$ are indeed independent variables.
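One way to look at the change of variables used in this proof is as a linear map on pairs of numbers. The sketch below is only an illustration under that reading: it treats $(u,v)$ as a pair of unrelated complex numbers, and the names `M`, `M_inv`, and `to_xy` are made up for the example.

```python
# Linear change of variables: (u, v) = M (x, y), with u = x + i y and v = x - i y
M = [[1, 1j],
     [1, -1j]]

# Nonzero determinant means the change of variables is invertible
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Inverse of a 2x2 matrix [[a, b], [c, d]] is (1/det) [[d, -b], [-c, a]]
M_inv = [[M[1][1] / det, -M[0][1] / det],
         [-M[1][0] / det, M[0][0] / det]]

def to_xy(u, v):
    """Recover (x, y) from (u, v): x = (u + v)/2, y = (u - v)/(2i)."""
    return (M_inv[0][0] * u + M_inv[0][1] * v,
            M_inv[1][0] * u + M_inv[1][1] * v)

z = 1 + 2j
x, y = to_xy(z, z.conjugate())                # on the slice v = conj(u): x and y are real
x_off, y_off = to_xy(z + 0.1, z.conjugate())  # u moved while v is held fixed

print(det, (x, y), (x_off, y_off))
```

The determinant is nonzero, so nothing is lost in passing between $(x,y)$ and $(u,v)$. Moving $u$ while holding $v$ fixed is a perfectly good operation on pairs, but it leaves the slice $v=u^*$, and the corresponding $x$ and $y$ then stop being real.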

Fredrik
If x and y are independent variables, then so are $z$ and $z^*$.
This is straight calculus, not sloppy assumptions.
I'm not convinced by that argument. I don't see anything wrong with your calculations, but I also don't see how they answer jostpuur's concern. Clearly there's something very strange about saying that $\phi^*(x)$ is the complex conjugate of $\phi(x)$ for all x and then saying that $\phi$ and $\phi^*$ are two independent functions to be determined by a condition we impose on the action. Hmm...I suppose that if we can show that the map $(f,g)\mapsto S[f,g]$ is minimized only by pairs (f,g) such that g(x)=f(x)* for all x, then there's no problem, because then we have derived the relationship between $\phi$ and $\phi^*$ rather than assumed it. (I wrote that sentence after all the stuff below, so I haven't had time to think about whether this is the case).

I decided to take a closer look at your calculations, to see if I can find out how, or if, they're relevant. It seems that all I accomplished was to show that if you make sure that you know what functions are involved in your calculation, and at what points they're evaluated, there's no need to do the calculation at all. I had already typed most of this when I understood that it doesn't add all that much to the discussion, so I thought about throwing it away, but since I think it adds something, I figured I might as well post it:

We're interested in the derivative $\partial z/\partial z^*$. The first thing we need to do is to think about what this expression means. To find a derivative, we need to know what function we're taking the derivative of. We also need to know at what point in the derivative's domain the derivative is to be evaluated.

We seem to be talking about the projection function $(\alpha,\beta)\mapsto\alpha$, and points in the domain of the form (z,z*) (i.e. points such that the value of the second variable is the complex conjugate of the value of the first). So I interpret $\partial z/\partial z^*$ to mean this:

$$\frac{\partial z}{\partial z^*}=D_2\big|_{(z,z^*)}\Big((\alpha,\beta)\mapsto \alpha\Big)$$

This is of course trivially =0. So there's nothing to prove, and yet it looks like you have proved something. Now we just have to figure out what you proved.

We make the following definitions:

$$u(x,y)=x+iy$$
$$v(x,y)=x-iy$$

$$h(\alpha,\beta)=\frac{\alpha+\beta}{2}$$
$$k(\alpha,\beta)=\frac{\alpha-\beta}{2i}$$

These can all be thought of as functions from ℂ² into ℂ, but we will of course be especially interested in the restrictions of u and v to ℝ², and h and k evaluated at points of the form (z,z*).

Any complex number $\alpha$ can be expressed as $\alpha=u(x,y)$ for some x,y in ℝ. This equality and the definitions above imply that

$$\alpha=u(h(\alpha,\alpha^*),k(\alpha,\alpha^*))$$

So

$$\frac{\partial z}{\partial z^*}=D_2\big|_{(z,z^*)}\Big((\alpha,\beta)\mapsto \alpha\Big) =D_2\big|_{(z,z^*)}\Big((\alpha,\beta)\mapsto u(h(\alpha,\alpha^*),k(\alpha,\alpha^*))\Big)$$

We're going to use the chain rule now, and the notation becomes less of a mess if we define

$$f(\alpha,\beta)=\big(h(\alpha,\beta),k(\alpha,\beta)\big)$$

so that $u(h(\alpha,\beta),k(\alpha,\beta))=(u\circ f)(\alpha,\beta)$. Then

$$\frac{\partial z}{\partial z^*}=D_2(u\circ f)(z,z^*)=D_1u(f(z,z^*))\, D_2h(z,z^*)+D_2u(f(z,z^*))D_2k(z,z^*)$$

$$=1\frac{1}{2}+i\frac{-1}{2i}=0$$

This seems to be the same calculation you did, except that I kept track of what functions are involved, and at what points they're evaluated. To be able to do that, I had to start with an expression that's obviously equal to 0, so the chain rule doesn't tell us anything new.

Fredrik,

I'm not sure what more I can usefully say, except that I think you're making
it much more complicated than it needs to be. IMHO, it really is as simple as
saying that if we have functions of two independent variables then it's
possible to make a change of those two variables into two other (independent)
variables.

It's surprising how independent variables x and y generate no confusion,
but z and z* as independent variables does. :-)

BTW, it sometimes helps to remember that the Cauchy-Riemann equations
that define what a complex-analytic function is can also be expressed as

$$\frac{\partial f(z)}{\partial z^*} ~=~ 0 ~~.$$

Or maybe not. :-)

Cheers.

jostpuur
Physicists often assume that $z=(x,y)$ and $z^*=(x,-y)$ would be independent. Now, what does it mean that you "assume them to be independent", when they clearly are not independent?
If x and y are independent variables, then so are $z$ and $z^*$.

You are wrong.

You are wrong.

Really? Then so are heaps of calculus textbooks.

Your bald assertion is of no value as it stands.
I gave a proof, but you did not.

Fredrik
It's surprising how independent variables x and y generate no confusion,
but z and z* as independent variables does. :-)

"x and y are independent variables" means (by definition) that they're not functions
of each other, i.e.,

$$\frac{ \partial x }{\partial y} ~=~ 0 ~=~ \frac{ \partial y }{\partial x} ~~.$$
What that phrase means to me is that we're dealing with a function f:X×Y→Z (often with X=Y=ℝ) and use the notation (x,y) for points in its domain. This is why "independent variables x and y" cause no confusion and is entirely trivial.

"z and z* are independent variables" should mean that we use the notation (z,z*) for points in the domain of the function we're dealing with. This is of course just as trivial, but if * denotes complex conjugation, the domain of the function would have to be (a subset of) the subset of ℂ² that consists of pairs of the form (z,z*), and now we have a problem. Suppose that g:D→ℂ is such a function (where D is the set of pairs (z,z*)). What does the expression

$$\frac{\partial g(z,z^*)}{\partial z}$$

mean? This should be the partial derivative of g with respect to the first variable, evaluated at (z,z*), but

$$\lim_{h\rightarrow 0}\frac{g(z+h,z^*)-g(z,z^*)}{h}$$

is undefined! So if z and z* are "independent variables" (in the sense I described) and complex conjugates of each other, the partial derivatives of the function are undefined.

I gave a proof
I really don't think you have proved anything. I mean, what function are you taking a partial derivative of when you write $\partial z/\partial z^*$? "z"? z isn't a function, it's a point in the range of a function. I don't see what function you could have meant other than "the function that takes (z,z*) to z", and in that case, you must have assumed that z and z* are independent in the sense I described above. So my argument above applies here, and that means that if you don't postpone setting z* equal to the complex conjugate of z until after you've taken the partial derivative, the partial derivative is ill-defined. And if you do postpone it, the function you're taking a partial derivative of is just $\mathrm{Proj}_1:\mathbb{C}^2\to\mathbb{C}$ defined by $\mathrm{Proj}_1(z,w)=z$, and your proof just verifies the trivial fact that $D_2\mathrm{Proj}_1(z,z^*)=0$.

In the case of the complex scalar field, we have essentially the same problem that I described above. If S(f,g) (where S is the action) is only defined on pairs (f,g) such that g(x) is the complex conjugate of f(x) for all x, then the derivative that we're setting to 0 at the start of the derivation of the Euler-Lagrange equation is ill-defined. I see no way out of this other than to wait until after we've minimized the action to set one of the fields equal to the complex conjugate of the other. The justification for this has to be that S is minimized by any pair of scalar fields that both satisfy the Klein-Gordon equation, and that if a field satisfies the Klein-Gordon equation, then so does its complex conjugate.
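As a rough sanity check on how the two viewpoints relate, here is a toy sketch. Everything in it is an assumption made for illustration: a four-site periodic lattice, a Euclidean-signature action without the overall factor of 1/2, and the function names `action`, `grad_conj`, and `formal_rule`. It compares the gradient obtained purely from the real and imaginary parts of the field with the one obtained by the formal rule "differentiate with respect to $\phi^*$ with $\phi$ held fixed".

```python
def action(phi, m):
    """Toy action S = sum_n |phi[n+1] - phi[n]|^2 + m^2 |phi[n]|^2 on a periodic lattice."""
    n_sites = len(phi)
    total = 0.0
    for n in range(n_sites):
        step = phi[(n + 1) % n_sites] - phi[n]
        total += abs(step) ** 2 + m ** 2 * abs(phi[n]) ** 2
    return total

def grad_conj(phi, m, eps=1e-6):
    """(dS/dRe + i dS/dIm)/2 at each site, using only real-component central differences."""
    grad = []
    for n in range(len(phi)):
        def shifted(delta):
            trial = list(phi)
            trial[n] += delta
            return action(trial, m)
        d_re = (shifted(eps) - shifted(-eps)) / (2 * eps)
        d_im = (shifted(1j * eps) - shifted(-1j * eps)) / (2 * eps)
        grad.append(0.5 * (d_re + 1j * d_im))
    return grad

def formal_rule(phi, m):
    """Result of treating phi* as independent: -(lattice Laplacian of phi) + m^2 phi."""
    n_sites = len(phi)
    return [-(phi[(n + 1) % n_sites] - 2 * phi[n] + phi[n - 1]) + m ** 2 * phi[n]
            for n in range(n_sites)]

phi = [0.3 + 0.1j, -0.5 + 0.9j, 1.2 - 0.4j, 0.0 + 0.7j]  # arbitrary field values
m = 1.5
numeric = grad_conj(phi, m)
formal = formal_rule(phi, m)
max_diff = max(abs(a - b) for a, b in zip(numeric, formal))
print(max_diff)
```

In this toy setting the two gradients agree to rounding error, so setting either one to zero gives the same lattice analogue of the Klein-Gordon equation. This does not settle the question of what "independent" should mean, but it suggests the formal rule and the real-component calculation lead to the same equations.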

What does the expression

$$\frac{\partial g(z,z^*)}{\partial z}$$

mean? This should be the partial derivative of g with respect to the first variable, evaluated at (z,z*), but

$$\lim_{h\rightarrow 0}\frac{g(z+h,z^*)-g(z,z^*)}{h}$$

is undefined!

It is quite well-defined, and is called the Wirtinger derivative; see, e.g., http://en.wikipedia.org/wiki/Wirtinger_derivatives. The basics are as follows:

Consider a continuous function $g:D\subseteq\mathbb{C}\to\mathbb{C}$, mapping $z=x+iy\in\mathbb{C}$ (with $x,y\in\mathbb{R}$) to $g(z)\in\mathbb{C}$. If g is differentiable with respect to x and y, then for complex $w=u+iv$,
$$\frac{d}{dh} g(z+hw)|_{h=0} = \frac{dg}{dx} u + \frac{dg}{dy}v =\frac{dg}{dx} \frac{w+w^*}{2}+ \frac{dg}{dy}\frac{w-w^*}{2i}.$$
With the definitions
$$\frac{dg}{dz}:=\frac{1}{2}(\frac{dg}{dx}-i\frac{dg}{dy}),$$
$$\frac{dg}{dz^*}:=\frac{1}{2}(\frac{dg}{dx}+i\frac{dg}{dy}),$$
and noting that 1/i=-i, we therefore find
$$\frac{d}{dh} g(z+hw)|_{h=0} =\frac{dg}{dz} w +\frac{dg}{dz^*} w^*.$$
This relation (with h understood to be real) can also be taken as the definition of $dg/dz$ and $dg/dz^*$, since the latter are uniquely determined by it.

Since $dz/dz^*=dz^*/dz=0$, the rules of the Wirtinger calculus are the same as those of bivariate calculus, with $z$ and $z^*$ replacing the real variables x and y.

Note that g is analytic iff $dg/dz^*=0$, and then $dg/dz$ has the standard meaning from complex calculus.
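The defining relation can be checked numerically. The sketch below is an illustration only: the test function `g`, the sample points `z` and `w`, and the helper `wirtinger` are assumptions chosen for the example, and the partial derivatives in x and y are approximated by central differences.

```python
def g(z):
    """Hypothetical non-analytic test function: g = z^2 z* + 3 z*."""
    return z ** 2 * z.conjugate() + 3 * z.conjugate()

def wirtinger(f, z, eps=1e-6):
    """Return (df/dz, df/dz*) from central differences in x = Re z and y = Im z."""
    df_dx = (f(z + eps) - f(z - eps)) / (2 * eps)
    df_dy = (f(z + 1j * eps) - f(z - 1j * eps)) / (2 * eps)
    return 0.5 * (df_dx - 1j * df_dy), 0.5 * (df_dx + 1j * df_dy)

z = 0.3 + 0.8j
w = -1.1 + 0.4j
h = 1e-6  # real step, as in the relation above

# Left-hand side: d/dh g(z + h w) at h = 0
directional = (g(z + h * w) - g(z - h * w)) / (2 * h)

# Right-hand side: (dg/dz) w + (dg/dz*) w*
dg_dz, dg_dzbar = wirtinger(g, z)
rhs = dg_dz * w + dg_dzbar * w.conjugate()

# For an analytic function such as z^3, dg/dz* should vanish
_, analytic_dzbar = wirtinger(lambda t: t ** 3, z)

print(directional, rhs, analytic_dzbar)
```

For this `g`, treating $z$ and $z^*$ as independent gives $dg/dz = 2zz^*$ and $dg/dz^* = z^2+3$, and the numerical values should match those up to the finite-difference error.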

Edit: I corrected some inaccuracies in the derivation, and added the link to Wikipedia and another comment.

Fredrik
The expression I said is undefined contains a function that's being evaluated at a point that's not in its domain, so it's clearly undefined.

$$\frac{\partial}{\partial z} = \frac{1}{2} \left( \frac{\partial}{\partial x} - i \frac{\partial}{\partial y} \right),\qquad \frac{\partial}{\partial z^*}= \frac{1}{2} \left( \frac{\partial}{\partial x} + i \frac{\partial}{\partial y} \right)$$

as definitions. These operators are clearly meant to act on functions defined on subsets of ℝ². I understand that we can use them to assign a meaning to the expression

$$\frac{\partial g(z,z^*)}{\partial z^*}$$

by defining it to mean

$$\frac{1}{2} \left( \frac{\partial}{\partial x} + i \frac{\partial}{\partial y} \right)\Big((x,y)\mapsto g(x+iy,x-iy)\Big)$$

instead of the usual limit, which is ill-defined here, but I don't really see why we would want to. More importantly, I don't see how it sheds any light on the main issue here, which is the question of whether it makes sense to say that a scalar field and its complex conjugate are independent variables in the action.

The expression I said is undefined contains a function that's being evaluated at a point that's not in its domain, so it's clearly undefined.

I wasn't specifically talking of your limit but of making sense of dg/dz and dg/dz^* when g is a function of a complex variable z. One just needs to take a slightly different limit than the one you chose and found undefined.

$$\frac{\partial}{\partial z} = \frac{1}{2} \left( \frac{\partial}{\partial x} - i \frac{\partial}{\partial y} \right),\qquad \frac{\partial}{\partial z^*}= \frac{1}{2} \left( \frac{\partial}{\partial x} + i \frac{\partial}{\partial y} \right)$$

as definitions. These operators are clearly meant to act on functions defined on subsets of ℝ². I understand that we can use them to assign a meaning to the expression

$$\frac{\partial g(z,z^*)}{\partial z^*}$$

by defining it to mean

$$\frac{1}{2} \left( \frac{\partial}{\partial x} + i \frac{\partial}{\partial y} \right)\Big((x,y)\mapsto g(x+iy,x-iy)\Big)$$

instead of the usual limit, which is ill-defined here, but I don't really see why we would want to.

We would want to because it is frequently used, both in the quantum mechanics of oscillating systems written in terms of complex variables, and in the field theory of a complex scalar field with general interaction term $V(\phi,\phi^*)$. Wirtinger's calculus gives the rigorous justification for proceeding in the customary way.

More importantly, I don't see how it sheds any light on the main issue here, which is the question of whether it makes sense to say that a scalar field and its complex conjugate are independent variables in the action.

It is the common way to express the fact that in the Wirtinger calculus, one can apply the standard rules of calculus if one pretends that z and z^* are independent real variables. Since there is an underlying rigorous interpretation, it makes sense to use this way of speaking about it.

It certainly makes more sense than the Feynman path integral for interacting fields in Minkowski space.

Sankaku
...one can apply the standard rules of calculus if one pretends that z and z^* are independent real variables. Since there is an underlying rigorous interpretation, it makes sense to use this way of speaking about it.
It has always bothered me to think of them as independent. Could someone direct me to a text where this is explained rigorously? Thanks!

It has always bothered me to think of them as independent. Could someone direct me to a text where this is explained rigorously? Thanks!

Look at the cited wikipedia article.

The fact that $dz/dz^*=0$ and $dz^*/dz=0$ implies that in the chain rule
$$\frac{d}{du} f(g(u),h(u)) = \frac{df}{dg}\frac{dg}{du} + \frac{df}{dh}\frac{dh}{du}$$
the mixed derivatives are not present when you specialize u to $z$ or $z^*$, g to $z$, and h to $z^*$.

Sankaku
Isn't that repeating what has already been claimed several times? I understand how the formulas are generated - it is just that the reasoning looks flawed to me. I may be thick, but could you please give me an idea how to vary $$\bar{z}$$ while keeping z constant? The last time I checked, the conjugate was a function of z (or vice versa), unless there is some definition of the complex conjugate that I don't know yet (which is certainly possible). I am guessing that this is the same thing that jostpuur was complaining about earlier in the thread. In everything I have read, it seems like this is a convenient definition, not a derivation. I would be happy to be proven wrong as my complex-fu is pretty basic.

I will quote some lines from Ahlfors (3rd edition, page 27):

We present this procedure with an explicit warning to the reader that it is purely formal and does not possess the power of proof.

...snip...

With this change of variable, we can consider f(x,y) as a function of z and $$\bar{z}$$ which we will treat as independent variables (forgetting that they are in fact conjugate to each other). If the rules of calculus were applicable....

...snip...

These expressions have no convenient definition as limits, but we can introduce them as symbolic derivatives with respect to z and $$\bar{z}$$.

I certainly see that the formalism has some practical use, but even Ahlfors seems to be saying pretty clearly that it is just a handy trick and not to be taken literally. As I said, though, my ability in complex analysis is still basic so if there is a rigorous derivation of this, I would really love to see it in print (not Wikipedia).

Thanks!

Isn't that repeating what has already been claimed several times? I understand how the formulas are generated - it is just that the reasoning looks flawed to me. I may be thick, but could you please give me an idea how to vary $$\bar{z}$$ while keeping z constant? The last time I checked, the conjugate was a function of z (or vice versa), [...]

Here lies the root of the confusion. "Existence of a mapping between A and B"
does not necessarily imply "A and B are dependent on each other".

Let's take a step back and consider a simpler example. Let x and y be independent
real variables, and let f(x,y) and g(x,y) be functions on $R^2$, at least
once-differentiable thereon. Then ask the question: "Is the function f dependent
on the function g, or are they independent of each other?". If f is dependent on g,
it means (by definition) that f is a function of g, so we can evaluate the derivative
via the standard multivariate chain rule, i.e.,

$$\frac{ \partial f }{\partial g} ~=~ \frac{ \partial f }{\partial x} \frac{ \partial x }{\partial g} ~+~ \frac{ \partial f }{\partial y} \frac{ \partial y }{\partial g} ~.$$

At this point we can't say any more about whether f and g are/aren't
independent functions unless we know more about them.

Now take the specific case:

$$f(x,y) ~:=~ x + iy ~~~~~;~~~ g(x,y) ~:=~ x - iy ~,$$

and we get 0 in the above, showing that these particular two functions
are independent of each other.

(And if I still "haven't proved anything", I'd sure like to know why not.)

Staff Emeritus
Gold Member
Here lies the root of the confusion.
I think the root of the confusion is that we're talking about variables when we should be talking about functions.

Let's take a step back and consider a simpler example. Let x and y be independent
real variables, and let f(x,y) and g(x,y) be functions on $R^2$, at least
once-differentiable thereon. Then ask the question: "Is the function f dependent
on the function g, or are they independent of each other?". If f is dependent on g,
it means (by definition) that f is a function of g, so we can evaluate the derivative
via the standard multivariate chain rule, i.e.,

$$\frac{ \partial f }{\partial g} ~=~ \frac{ \partial f }{\partial x} \frac{ \partial x }{\partial g} ~+~ \frac{ \partial f }{\partial y} \frac{ \partial y }{\partial g} ~.$$

At this point we can't say any more about whether f and g are/aren't
independent functions unless we know more about them.
I'm going to nitpick every little detail, because I think this reply would get kind of incoherent if I try to be more selective. I would never call f(x,y) a function. f is the function. f(x,y) is a member of its range. If $f:\mathbb R^2\rightarrow\mathbb R$, then the claim that "x and y are independent" doesn't add any information. It just suggests that we intend to use the symbols x and y to represent real numbers and intend to put x to the left of y in ordered pairs.

I agree that phrases like "f and g are independent" must be defined, if we are going to use them at all. But I don't think we should. A function from ℝ² into ℝ is by definition a subset of ℝ²×ℝ (that satisfies a couple of conditions). It seems quite odd to describe two members of the same set (the power set of ℝ²×ℝ) as "dependent" or "independent", based on things other than the sets f,g and ℝ²×ℝ.

But OK, let's move on. The partial derivative of f with respect to the ith variable is another function, which I like to denote by $D_if$ or $f_{,i}$. The notations $\partial f/\partial x$ and $\partial f/\partial y$ are much more common in the physics literature. This is unfortunate, because I think a student is much less likely to misinterpret an expression like

$$D_1f(x,g(x,y))$$

than

$$\frac{\partial f(x,g(x,y))}{\partial x}$$

which of course means the same thing: The value of $D_1f$ at (x,g(x,y)).

OK, back to the f and g that you're talking about. What does $\partial f/\partial g$ mean? What function are you taking a partial derivative of, and which one of its partial derivatives does this expression refer to?

You're using the chain rule in a way that strongly suggests that what you call $\partial f/\partial g$ is the partial derivative with respect to the second variable of

$$(s,t)\mapsto f(x(s,t),y(s,t))$$

from ℝ² into ℝ, where x and y have been redefined to refer to two unspecified functions from ℝ² into ℝ. But why write $\partial x/\partial g$ for $D_2x$ unless you intend to "denote the second variable by g"? That means either

a) that g denotes a function of the type you mentioned, and the partial derivative is to be evaluated at a point of the form (x(s,g(a,b)),y(s,g(c,d))), or

b) that g denotes a number, and the partial derivative is to be evaluated at a point of the form (x(s,g),y(s,g)).

If we choose option b), we get

$$D_1f(x(s,g),y(s,g))D_2x(s,g)+D_2f(x(s,g),y(s,g))D_2y(s,g)$$

which can be written as

$$\frac{ \partial f }{\partial g} ~=~ \frac{ \partial f }{\partial x} \frac{ \partial x }{\partial g} ~+~ \frac{ \partial f }{\partial y} \frac{ \partial y }{\partial g} ~$$

if we suppress the points at which the functions are being evaluated, and accept the rather odd notation $\partial x/\partial g$ for $D_2x$, and similarly for y.

Now take the specific case:

$$f(x,y) ~:=~ x + iy ~~~~~;~~~ g(x,y) ~:=~ x - iy ~,$$

and we get 0 in the above, showing that these particular two functions
are independent of each other.
Ugh...how do you intend to insert this into the chain rule calculation above? Things are already messy, and it gets a lot worse if we try to insert this into the above. I think your previous attempt was much clearer, and I believe I showed why that doesn't work in my previous posts.

Isn't that repeating what has already been claimed several times? I understand how the formulas are generated - it is just that the reasoning looks flawed to me. I may be thick, but could you please give me an idea how to vary $$\bar{z}$$ while keeping z constant?

If H is an analytic function of z^* and z then
$$dH(z^*,z)/dz^*=\lim_{h\to 0} (H(z^*+h,z)-H(z^*,z))/h$$
makes perfect sense and gives the right result.

my ability in complex analysis is still basic so if there is a rigorous derivation of this, I would really love to see it in print (not Wikipedia).

The wikipedia article lists a number of references where you can see things in print.
Many of the references are in pure math, so there should be no question that this is rigorous stuff. It is very useful to make things short and comprehensible that would otherwise be somewhat messy.

For example, if you have a classical anharmonic oscillator with Hamiltonian H(a^*,a),
the dynamics defined by it is
da/dt = i dH/da^*(a^*,a).
This would become an impractical, messy, and much less comprehensible formula if one would have to interpret it in terms of the real and imaginary parts. Mathematical notation is there to make life easy!

And it is very easy to apply unambiguously. Typically, H(a^*,a) is a formula rather than an abstract function. Thus you can replace every a^* by a temporary variable u and every remaining a by another temporary variable v. This gives you an expression H(u,v) that defines a function of two variables. You take the partial derivatives, and then substitute back the a^* for u and the a for v. This and nothing else is meant by treating a^* and a as independent variables. And you get provably correct results that way.

But it is a waste of effort to actually do the substitutions since it is very clear what to do without that. E.g., if
$$H = \omega a^*a + \lambda (a^*a)^2$$
then one sees directly
$$\frac{dH}{da^*} = \omega a +2 \lambda a^* a^2,$$
without having first to write
$$H(u,v)= \omega uv+\lambda(uv)^2,$$
$$\frac{dH}{du}=\omega v + 2 \lambda uv^2,$$
$$\frac{dH}{da^*} = \frac{dH}{du}\Big|_{u=a^*,\,v=a} = \omega a +2 \lambda a^* a^2.$$
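The substitution recipe can be compared against the real-and-imaginary-part definition in a few lines. This is a sketch under stated assumptions: the values of `omega`, `lam`, and `a` are arbitrary, and the function names are made up for the example.

```python
def H(u, v, omega=1.0, lam=0.25):
    """H with every a* replaced by u and every remaining a replaced by v."""
    return omega * u * v + lam * (u * v) ** 2

def dH_du(u, v, eps=1e-6):
    """Ordinary partial derivative in the first slot, with the second slot held fixed."""
    return (H(u + eps, v) - H(u - eps, v)) / (2 * eps)

def dH_dabar(a, eps=1e-6):
    """Derivative of a -> H(a*, a) built only from its real and imaginary directions."""
    f = lambda z: H(z.conjugate(), z)
    d_re = (f(a + eps) - f(a - eps)) / (2 * eps)
    d_im = (f(a + 1j * eps) - f(a - 1j * eps)) / (2 * eps)
    return 0.5 * (d_re + 1j * d_im)

a = 0.6 + 0.9j
omega, lam = 1.0, 0.25

substituted = dH_du(a.conjugate(), a)  # differentiate H(u, v), then set u = a*, v = a
via_components = dH_dabar(a)           # never leaves the slice u = conj(v)
closed_form = omega * a + 2 * lam * a.conjugate() * a ** 2

print(substituted, via_components, closed_form)
```

All three numbers should agree up to finite-difference error. Note that `dH_du` evaluates `H` at pairs where the first argument is not the conjugate of the second, which is allowed here because `H(u, v)` is a polynomial in two unrelated variables; `dH_dabar` avoids that and still gives the same value.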

Sankaku
I apologize. This is the Physics section of the forum and I am drawing the discussion offtopic into Mathematics - I should really be asking my question in another section. It seems to be a construction that is of great use in Physics that has a convenient (but nonsensical) mnemonic using partial derivative notation. Remember that this does not take away from its practical use in Physics.

It may be derived in a more satisfactory way using other tools, but I don't think the partial derivative, limit and chain rule work the way you are using them. I will go to the library today and look at two books cited on the Wikipedia page.

If H is an analytic function of z^* and z then
$$dH(z^*,z)/dz^*=lim_{h\to 0} (H(z^*+h,z)-H(z^*,z))/h$$
makes perfect sense and gives the right result.
I am sorry, I can only say again that z cannot be fixed while you vary $$\bar{z}$$. If you can't see the circular logic in your statement, there is nothing I can do.

These expressions have no convenient definition as limits...

I apologize. This is the Physics section of the forum and I am drawing the discussion offtopic into Mathematics - I should really be asking my question in another section. It seems to be a construction that is of great use in Physics that has a convenient (but nonsensical) mnemonic using partial derivative notation. Remember that this does not take away from its practical use in Physics.

Wirtinger, who invented the notation, was a mathematician. It is standard notation used by mathematicians in the theory of several complex variables.

It may be derived in a more satisfactory way using other tools, but I don't think the partial derivative, limit and chain rule work the way you are using them.

I am using it in the standard way, as it has been used since Wirtinger defined it.

I am sorry, I can only say again that z cannot be fixed while you vary $$\bar{z}$$. If you can't see the circular logic in your statement, there is nothing I can do.

In case you haven't seen it: I am fixing both z and z^* and vary a _new_ variable h.
There is nothing circular in my argument; it has a standard, perfectly well-defined, rigorous interpretation.
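
As an illustration, take the polynomial example $H(u,v)=\omega uv+\lambda (uv)^2$ (an assumed sample, chosen only because it is easy to expand). With both arguments fixed at $u=z^*$, $v=z$ and only $h$ varying, the difference quotient can be worked out explicitly:

$$\frac{H(z^*+h,z)-H(z^*,z)}{h} = \omega z + \lambda z^2\,(2z^*+h)$$

$$\lim_{h\to 0}\frac{H(z^*+h,z)-H(z^*,z)}{h} = \omega z + 2\lambda z^* z^2$$

Here $H(z^*+h,z)$ just means the two-variable formula evaluated at the pair $(z^*+h,\,z)$; no claim is made that the first argument is still the conjugate of the second.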

so the mass*phi term in the equations of motion comes from

$$\frac{\delta L}{\delta \phi^{*}}=\frac{\delta( m^2 \phi \phi^{*})}{\delta \phi}=2 m \phi$$

Now I do not know why this last result holds, I only know it does.

It doesn't. The correct result is $m^2\phi$ in place of $2m\phi$. $m$ is constant and not differentiated, and $\phi^*$ is treated as independent of $\phi$, according to the theorems of the Wirtinger calculus.
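
A short check in terms of real components may help here. This is a sketch assuming the standard Wirtinger definitions, with $\phi=\phi_1+i\phi_2$ and $\phi_1,\phi_2$ real:

$$\frac{\partial}{\partial \phi^*} := \frac{1}{2}\left(\frac{\partial}{\partial \phi_1} + i\frac{\partial}{\partial \phi_2}\right), \qquad \frac{\partial}{\partial \phi} := \frac{1}{2}\left(\frac{\partial}{\partial \phi_1} - i\frac{\partial}{\partial \phi_2}\right)$$

$$\frac{\partial (m^2\phi\phi^*)}{\partial \phi^*} = \frac{m^2}{2}\left(\frac{\partial}{\partial \phi_1} + i\frac{\partial}{\partial \phi_2}\right)(\phi_1^2+\phi_2^2) = \frac{m^2}{2}\,(2\phi_1 + 2i\phi_2) = m^2\phi$$

$$\frac{\partial \phi}{\partial \phi} = \frac{1}{2}\left(\frac{\partial}{\partial \phi_1} - i\frac{\partial}{\partial \phi_2}\right)(\phi_1+i\phi_2) = \frac{1}{2}(1+1) = 1$$

The factor $1/2$ in the definitions is exactly what removes the spurious factors of 2.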

Sankaku
I have now looked up two of the main print references from the Wikipedia page. Interestingly, as far as I could see, neither one described it as a Wirtinger derivative or even mentioned his name.

Kaup & Kaup (1983), pages 2 and 4, has no attempt to 'prove' the construction and just states it as a definition (note the := in the equation).

$$\frac{ \partial}{\partial \bar{z}_j} ~:=~ \frac{1}{2} \left ( \frac{ \partial}{\partial x_j} ~+~ i \frac{ \partial }{\partial y_j} \right )$$

Gunning & Rossi (1965), page 4, states very clearly (emphasis added by me):
It should perhaps be remarked that the left-hand sides in (above) are defined by that equation, and have no separate meaning.
This is essentially equivalent to what I already quoted from Ahlfors. So, it is purely a definition (not a derivation) and it certainly has nothing to do with that awful limit posted earlier.

Also, books trying to justify the definition using the chain rule with things like this (and the equivalent for y),

$$x ~=~ \frac{z + \bar{z}}{2} ~ \to ~ \frac{ \partial x}{\partial \bar{z}} ~=~ \frac{1}{2}$$

are just confusing the issue. To paraphrase Ahlfors: That would work if the rules of calculus applied. But they don't.

Just use the definition as a definition. It is ok.
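
To see what the definition gives in practice, here is a minimal numerical sketch using only the Python standard library. The helper name `d_dzbar` and the sample point are hypothetical choices for illustration. It applies the defining formula $\frac{1}{2}(\partial_x + i\,\partial_y)$ directly, with finite differences in x and y, to three simple functions.

```python
# Sketch: apply the definition d/dzbar := (1/2)(d/dx + i d/dy) numerically.
# d_dzbar and z0 are illustrative names/values, not taken from any reference.

def d_dzbar(f, z, h=1e-6):
    """Apply (1/2)(d/dx + i d/dy) to f at z, using central differences."""
    df_dx = (f(z + h) - f(z - h)) / (2 * h)            # vary the real part
    df_dy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # vary the imaginary part
    return 0.5 * (df_dx + 1j * df_dy)

z0 = 0.3 + 0.8j
r_z = d_dzbar(lambda z: z, z0)                     # expected: about 0
r_zbar = d_dzbar(lambda z: z.conjugate(), z0)      # expected: about 1
r_abs2 = d_dzbar(lambda z: z * z.conjugate(), z0)  # expected: about z0
print(r_z, r_zbar, r_abs2)
```

Only real variations of x and y are used here, so nothing in the code relies on varying $\bar{z}$ with z held fixed.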

Saying that "Wirtinger was a mathematician" is the worst sort of "proof by authority." Until you can provide evidence to the contrary, I am going to assume that he just stated it as a definition as well (and then showed the power of using this particular construction).

I think the root of the confusion is that we're talking about variables when we should be talking about functions.

Well then, let's talk about variables...

If I say "x and y are independent real variables", I mean that it's
possible to vary x without necessarily inducing any change in y as a
consequence (and vice versa). Do you agree with that definition?

If so, I now propose a change of variables from x,y to a new pair u,v
(also both real), defined by:

$$u ~:=~ x + y$$
$$v ~:=~ x - y$$

Question #1: are these u,v "variables" or "functions"? (I would say
that it's ok to think of them as both of these.)

Assuming you agree at least that it's ok to call u,v "variables", then...

Question #2: are the u,v variables independent of each other?
By this I mean, is it possible to vary u (over the real line) without
necessarily inducing a change in v and vice versa. I.e., is it possible
to vary u while holding v constant?

Let's see... Set v = c, where c is an arbitrary fixed real number.
Hence c = x - y, implying y = x - c. Therefore u = 2x - c, and
so u can indeed be varied over the real line even though v is being
held constant.

Now I'll invent an involution operator denoted by a tilde:

$$\widetilde{u} ~:=~ v ~~~~~;~~~~ \widetilde{v} ~:=~ u ~~,$$

and if I now vary u while holding v constant, my involution operation
is not preserved in general. But this does not change the fact that u,v
are independent variables. This doesn't matter -- I'm just using two
different variables to parametrize the plane.

In the case of a complex field in a (Hermitian) Lagrangian, the
counter-intuitive point is that the complex-conjugate relationship need
not continue to hold as the two parts of the field are independently
varied. We'll still get two sensible field equations at the end.

jostpuur
Making a coordinate transformation

$$u = x + y$$
$$v = x - y$$

is not the same thing as making a transformation

$$z = x + iy = (x, y)$$
$$z^* = x - iy = (x, -y)$$

because

$$(u,v) \in \mathbb{R}^2$$

and

$$(z,z^*) \in \mathbb{C}^2 = \mathbb{R}^4$$

$u,v$ are independent, and $z,z^*$ are not independent.

Staff Emeritus
Gold Member
If I say "x and y are independent real variables", I mean that it's
possible to vary x without necessarily inducing any change in y as a
consequence (and vice versa). Do you agree with that definition?
I consider a variable to be a symbol that represents a member of a set. (Actually, since every member of every set (in ZFC theory) is a set, I could have said "a symbol that represents a set"). If we say that x is a variable of S, it means that x represents a member of S. In a typical situation where x and y are described as "variables", the set S isn't specified explicitly, but it's clear from the context that x and y are to be assigned values from the same set.

With this definition of "variable", the only way I can make sense of the phrase "x and y are independent" is that no assignment of a value to x will shrink the set of values that can be assigned to y, and vice versa. For example, if x and y are variables of ℝ, the condition $x^2+y^2=8$ makes them dependent, because the choice y=2 means that x can only be assigned a value from {-2,2}≠ℝ. On the other hand, if we had said that x and y are variables of {-2,2}, they would be independent even after imposing that condition on them.

I'm not sure if this means that I agree with your definition or not. Does this sound like what you had in mind?

If so, I now propose a change of variables from x,y to a new pair u,v
(also both real), defined by:

$$u ~:=~ x + y$$
$$v ~:=~ x - y$$

Question #1: are these u,v "variables" or "functions"? (I would say
that it's ok to think of them as both of these.)
If x and y are variables, then so are u and v. Since the maps $(x,y)\mapsto u$ and $(x,y)\mapsto v$ are functions, I agree that it's ok to think of u and v as functions, even though they're not, because they can be used to define functions in an obvious way. I don't even mind too much if we call those functions u and v respectively, but we need to be very careful if we do. If there's any chance that this will cause confusion, we should call them something else.

Edit: See post #36 for more comments about variables that satisfy constraints, and the maps that are implicitly defined by those constraints.

Question #2: are the u,v variables independent of each other?
By this I mean, is it possible to vary u (over the real line) without
necessarily inducing a change in v and vice versa. I.e., is it possible
to vary u while holding v constant?
Yes. They are independent even with my definition of what that means.

Now I'll invent an involution operator denoted by a tilde:

$$\widetilde{u} ~:=~ v ~~~~~;~~~~ \widetilde{v} ~:=~ u ~~,$$

and if I now vary u while holding v constant, my involution operation
is not preserved in general. But this does not change the fact that u,v
are independent variables. This doesn't matter -- I'm just using two
different variables to parametrize the plane.
I don't understand what you're doing here. I agree that $(x,y)\mapsto (y,x)$ is an involution on $\mathbb{R}^2$, but I don't see why that matters.

I have now looked up two of the main print references from the Wikipedia page. Interestingly, as far as I could see, neither one described it as a Wirtinger derivative or even mentioned his name.

Nevertheless, he invented this calculus and justified it rigorously.

Saying that "Wirtinger was a mathematician" is the worst sort of "proof by authority." Until you can provide evidence to the contrary, I am going to assume that he just stated it as a definition as well (and then showed the power of using this particular construction).

I was using it as an indicator that mathematicians (who are much more conscious about using rigorous notations than physicists) created the notation and have used it consistently for many, many years. This needs no proof; one only has to look at books and papers on complex analysis in several variables.

Of course, as often in mathematics, one has a choice what one wants to call a definition and what a theorem. One can take the non-limit relation as a definition and then prove my limit relation to be a theorem, or vice versa.

In any case, both formulas are valid and make sense rigorously.

Also, books trying to justify the definition using the chain rule with things like this (and the equivalent for y),

$$x ~=~ \frac{z + \bar{z}}{2} ~ \to ~ \frac{ \partial x}{\partial \bar{z}} ~=~ \frac{1}{2}$$

are just confusing the issue.

If you don't understand this formula it is only you who is confused.

Given that z is a complex variable, x is a function of z, and according to the Wirtinger calculus, dx/dz equals 1/2 by a trivial calculation going back to the definition. Either that in terms of limits or that in terms of real and imaginary parts; in both cases it is very easy.
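
Spelled out from the real-and-imaginary-parts definition (a sketch, with $z=x+iy$ and the standard Wirtinger operators assumed):

$$\frac{\partial x}{\partial \bar{z}} = \frac{1}{2}\left(\frac{\partial x}{\partial x} + i\frac{\partial x}{\partial y}\right) = \frac{1}{2}(1 + 0) = \frac{1}{2}$$

$$\frac{\partial x}{\partial z} = \frac{1}{2}\left(\frac{\partial x}{\partial x} - i\frac{\partial x}{\partial y}\right) = \frac{1}{2}(1 - 0) = \frac{1}{2}$$

This agrees with what one reads off formally from $x = (z+\bar{z})/2$ when $z$ and $\bar{z}$ are treated as independent.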

With this definition of "variable", the only way I can make sense of the phrase "x and y are independent" is that no assignment of a value to x will shrink the set of values that can be assigned to y, and vice versa.

You should realize that the same word may have different meanings in different contexts,
being generalized by mathematicians if they can give it a more general interpretation that still fits the formal rules.

The word number was originally reserved for a natural number. Over time it accommodated fractions, zero, negative numbers, irrational numbers, and complex numbers, because they behaved in the same way: the algorithmic formula manipulation is identical as long as you don't consider specific values.

In a similar way, the meaning of the phrase "x and y are independent" is generalized to apply to "z and z^* are independent", since the algorithmic formula manipulation is identical.