Understanding the Concept of 4-Vectors in Physics: A Mathematical Perspective

jdstokes · Nov 22, 2007

Hi all,

I had a rather poor introduction to special relativity and right now I'm refreshing myself in order to study quantum field theory.

In particular, I've always found the concept of four-vectors confusing. The problem is that from the mathematical point of view 4-vectors are nothing other than 4-tuples of real numbers, indeed the tangent space is always going to be isomorphic to \mathbb{R}^4. So it seems like everything is a 4-vector.

On the other hand, physicists define them completely differently as creatures which transform according to the Lorentz transformation. So mathematically we have our smooth manifold (ie spacetime) and to each point we assign a bunch of tangent vectors which make up our tangent space.

I'd like to try to reconcile this discrepancy by explaining it in words. I'd appreciate your comments as to whether or not this is a good way of thinking about it.

Fix a point p in spacetime and consider any physical quantity which can be described by a real variable, e.g. energy, or the x-component of momentum. Now consider the set of coordinate charts \mathcal{A}_p about p. We then get a function f : \mathbb{R} \times \mathcal{A}_p \times \mathcal{A}_p \to \mathbb{R} which describes how the value of the quantity changes when we transform from one coordinate chart to another. So now we have a completely mathematical definition of e.g. scalars: they are the quantities whose value is the same in every chart, so that f is the identity in its first argument, id : \mathbb{R} \times \mathcal{A}_p \times \mathcal{A}_p \to \mathbb{R} with id(q,x^\mu,x^{\nu'}) = q.

If we take four of these functions and line them up in a row (f^1,f^2,f^3,f^4) then we don't necessarily get a 4-vector. The condition that the result be a 4-vector is that f^\lambda(q,x^\mu,x^{\nu'}) = \sum_\mu\frac{\partial x^{\lambda'}}{\partial x^\mu}f^{\mu}(q) for each \lambda \in \{1,2,3,4 \}. This is basically saying that the physical quantity in each entry of the 4-vector must be related to the corresponding coordinate function, otherwise there is no chance of the thing being a 4-vector.
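
A minimal numerical sketch of the condition above, assuming units with c = 1 and a boost along x as the change of chart (so the Jacobian is just the constant boost matrix). All helper names (`boost_x`, `apply_jacobian`, `four_momentum`, `naive_tuple`) are hypothetical and not from the thread. It compares two 4-tuples built from a particle's velocity: (E, p_x, p_y, p_z), which obeys the rule, and (1, v_x, v_y, v_z), which does not.

```python
import math

def gamma(speed_sq):
    """Lorentz factor for a squared speed (c = 1)."""
    return 1.0 / math.sqrt(1.0 - speed_sq)

def boost_x(beta):
    """Jacobian dx'^lambda/dx^mu of a boost along x with speed beta."""
    g = gamma(beta * beta)
    return [
        [g, -g * beta, 0.0, 0.0],
        [-g * beta, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def apply_jacobian(jac, comps):
    """Right-hand side of the 4-vector rule: sum over mu of J[lambda][mu] * comps[mu]."""
    return [sum(jac[lam][mu] * comps[mu] for mu in range(4)) for lam in range(4)]

def velocity_in_boosted_frame(v, beta):
    """Relativistic velocity addition: the 3-velocity measured in the new frame."""
    vx, vy, vz = v
    g = gamma(beta * beta)
    denom = 1.0 - beta * vx
    return (vx - beta) / denom, vy / (g * denom), vz / (g * denom)

def four_momentum(mass, v):
    """(E, px, py, pz) computed directly from the velocity measured in a frame."""
    g = gamma(sum(c * c for c in v))
    return [g * mass] + [g * mass * c for c in v]

def naive_tuple(v):
    """A 4-tuple of measurable numbers that is NOT a 4-vector: (1, vx, vy, vz)."""
    return [1.0] + list(v)

def is_close(a, b, tol=1e-9):
    return all(abs(x - y) <= tol for x, y in zip(a, b))

mass, v, beta = 2.0, (0.3, 0.4, 0.0), 0.5
jac = boost_x(beta)
v_new = velocity_in_boosted_frame(v, beta)

# Does "measure in the new frame" agree with "apply the Jacobian to the old values"?
momentum_obeys_rule = is_close(apply_jacobian(jac, four_momentum(mass, v)),
                               four_momentum(mass, v_new))
naive_obeys_rule = is_close(apply_jacobian(jac, naive_tuple(v)),
                            naive_tuple(v_new))
print(momentum_obeys_rule, naive_obeys_rule)
```

With these inputs the first check passes and the second fails, which is the sense in which not every 4-tuple of physical quantities is a 4-vector.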

robphy · Nov 22, 2007

jdstokes said:

In particular, I've always found the concept of four-vectors confusing. The problem is that from the mathematical point of view 4-vectors are nothing other than 4-tuples of real numbers, indeed the tangent space is always going to be isomorphic to \mathbb{R}^4. So it seems like everything is a 4-vector.

On the other hand, physicists define them completely differently as creatures which transform according to the Lorentz transformation. So mathematically we have our smooth manifold (ie spacetime) and to each point we assign a bunch of tangent vectors which make up our tangent space.

Start with the [mathematician's] definition of a vector... not merely tuples.. but objects that can be added together and multiplied by scalars... etc. The familiar "vectors" used in PHY101 are loosely defined as something with a magnitude and direction... however, more precisely, the notion of magnitude comes from the specification of an inner-product or a metric [which is preserved under a set of transformations]. Instead of the Euclidean metric in PHY101, we have the [indefinite] Minkowskian metric in Special Relativity.
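
As a rough illustration of a metric "preserved under a set of transformations", the sketch below checks numerically that a Lorentz boost leaves the Minkowski inner product unchanged while changing the Euclidean one. It assumes c = 1 and the signature (+,-,-,-); the function names are hypothetical.

```python
import math

def boost_x(beta):
    """Lorentz boost along x with speed beta (c = 1)."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [
        [g, -g * beta, 0.0, 0.0],
        [-g * beta, g, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

def matvec(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]

def minkowski(u, w):
    """Indefinite inner product, assumed signature (+,-,-,-)."""
    return u[0] * w[0] - u[1] * w[1] - u[2] * w[2] - u[3] * w[3]

def euclidean(u, w):
    """Ordinary dot product on R^4, for comparison."""
    return sum(a * b for a, b in zip(u, w))

u = [2.0, 1.0, 0.5, 0.0]
w = [1.0, -0.3, 0.2, 0.7]
lam = boost_x(0.6)
u_new, w_new = matvec(lam, u), matvec(lam, w)

# The boost preserves the Minkowski product but not the Euclidean one.
minkowski_change = abs(minkowski(u_new, w_new) - minkowski(u, w))
euclidean_change = abs(euclidean(u_new, w_new) - euclidean(u, w))
print(minkowski_change, euclidean_change)
```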

I'm not sure about the following (but someone with more knowledge of the history of mathematics can chime in)... I think that the notion of objects "which transform according to the ... transformation" is derived, not from a physicist, but from the mathematician Felix Klein and his Erlangen program. If I'm wrong, please correct me.

This is a nice presentation:
http://books.google.com/books?id=wp2A7ZBUwDgC&pg=PA79&lpg=PA79&dq=geroch+minkowski&source=web&ots=pqh0zf25sk&sig=qLSzy3OAb6RBPDJO4L4HIzFiJEM

pmb_phy · Nov 22, 2007

jdstokes said:

Hi all,

I had a rather poor introduction to special relativity and right now I'm refreshing myself in order to study quantum field theory.

In particular, I've always found the concept of four-vectors confusing. The problem is that from the mathematical point of view 4-vectors are nothing other than 4-tuples of real numbers, indeed the tangent space is always going to be isomorphic to \mathbb{R}^4. So it seems like everything is a 4-vector...

I created two web pages which define tensors, especially the 4-vector. They are at

http://www.geocities.com/physics_world/gr_ma/tensor_via_geometric.htm
http://www.geocities.com/physics_world/gr_ma/tensors_via_analytic.htm

I think the first link will be more helpful to you. If you have any questions or comments regarding those pages please ask (questions help me perfect my web pages).

Best regards

Pete

nrqed · Nov 22, 2007

jdstokes said:

Hi all,

I had a rather poor introduction to special relativity and right now I'm refreshing myself in order to study quantum field theory.

In particular, I've always found the concept of four-vectors confusing. The problem is that from the mathematical point of view 4-vectors are nothing other than 4-tuples of real numbers, indeed the tangent space is always going to be isomorphic to \mathbb{R}^4. So it seems like everything is a 4-vector.

On the other hand, physicists define them completely differently as creatures which transform according to the Lorentz transformation. So mathematically we have our smooth manifold (ie spacetime) and to each point we assign a bunch of tangent vectors which make up our tangent space.

I'd like to try to reconcile this discrepancy by explaining it in words. I'd appreciate your comments as to whether or not this is a good way of thinking about it.

Fix a point p in spacetime and consider any physical quantity which can be described by a real variable, e.g. energy, or the x-component of momentum. Now consider the set of coordinate charts \mathcal{A}_p about p. We then get a function f : \mathbb{R} \times \mathcal{A}_p \times \mathcal{A}_p \to \mathbb{R} which describes how the value of the quantity changes when we transform from one coordinate chart to another. So now we have a completely mathematical definition of e.g. scalars: they are the quantities whose value is the same in every chart, so that f is the identity in its first argument, id : \mathbb{R} \times \mathcal{A}_p \times \mathcal{A}_p \to \mathbb{R} with id(q,x^\mu,x^{\nu'}) = q.

If we take four of these functions and line them up in a row (f^1,f^2,f^3,f^4) then we don't necessarily get a 4-vector. The condition that the result be a 4-vector is that f^\lambda(q,x^\mu,x^{\nu'}) = \sum_\mu\frac{\partial x^{\lambda'}}{\partial x^\mu}f^{\mu}(q) for each \lambda \in \{1,2,3,4 \}. This is basically saying that the physical quantity in each entry of the 4-vector must be related to the corresponding coordinate function, otherwise there is no chance of the thing being a 4-vector.

I am not very knowledgeable in differential geometry but I would say that vectors are not defined as simple n-tuples. A vector is defined as something that will map a scalar function to a number (the directional derivative in the direction of the vector). Imagine now changing coordinate system. In order for the vector to map the same scalar function to the same final number (the directional derivative, which does not depend on the coordinate system used), the components of the vector in a certain basis must transform in a certain definite way. The vector itself is a geometrical object which does not change, but its components must change.
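
A small numerical sketch of this point, under stated assumptions: a 2D spacetime (t, x) with c = 1, a boost as the change of coordinates, and a made-up scalar field. The names `scalar_field`, `field_in_new_chart` and `directional_derivative` are hypothetical. The components of the vector change under the boost, but the number it assigns to the scalar function does not.

```python
import math

BETA = 0.6  # assumed boost speed for the change of coordinates

def boost(beta):
    """2D Lorentz boost acting on (t, x), with c = 1."""
    g = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[g, -g * beta], [-g * beta, g]]

def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]

def scalar_field(t, x):
    """An arbitrary smooth scalar function, written in the old chart."""
    return math.sin(t) * x ** 2 + t

def field_in_new_chart(t_new, x_new):
    """The same scalar function expressed in the boosted coordinates."""
    t, x = matvec(boost(-BETA), [t_new, x_new])  # boost(-BETA) inverts boost(BETA)
    return scalar_field(t, x)

def directional_derivative(f, point, comps, h=1e-6):
    """Central-difference estimate of v(f) = v^mu d_mu f at `point`."""
    fwd = [p + h * c for p, c in zip(point, comps)]
    bwd = [p - h * c for p, c in zip(point, comps)]
    return (f(*fwd) - f(*bwd)) / (2.0 * h)

point, comps = [0.4, 1.2], [1.0, 0.5]
point_new = matvec(boost(BETA), point)  # same spacetime point, new coordinates
comps_new = matvec(boost(BETA), comps)  # same vector, new components

value_old = directional_derivative(scalar_field, point, comps)
value_new = directional_derivative(field_in_new_chart, point_new, comps_new)
print(comps, comps_new)      # the components differ
print(value_old, value_new)  # the directional derivative agrees
```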

pmb_phy · Nov 22, 2007

nrqed said:

I am not very knowledgeable in differential geometry but I would say that vectors are not defined as simple n-tuples. A vector is defined as something that will map a scalar function to a number (the directional derivative in the direction of the vector). Imagine now changing coordinate system. In order for the vector to map the same scalar function to the same final number (the directional derivative, which does not depend on the coordinate system used), the components of the vector in a certain basis must transform in a certain definite way. The vector itself is a geometrical object which does not change, but its components must change.

One definition of a vector is as a map from 1-forms to real numbers (scalars). It is not a map of scalar functions to numbers.

Pete

shoehorn · Nov 22, 2007

pmb_phy said:

One definition of a vector is as a map from 1-forms to real numbers (scalars). It is not a map of scalar functions to numbers.

Pete

Eh? You've got this the wrong way round. In the context of differential geometry, a nice (and correct) way to view a vector is as a map which takes a scalar function to a real number. This is obvious since, for example, it allows you to define directional derivatives of functions along curves by allowing a vector to act on the function. This is very basic stuff.

robphy · Nov 22, 2007

I think Pete is talking about v^a \omega_a, while nrqed and shoehorn are talking about v^a \nabla_a f.

One might be more appropriate for a tensor algebra [say, based at a point]... rather than tensor fields.
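
The two pairings are linked by the gradient one-form. As a sketch in the abstract-index notation used above, choosing \omega_a = \nabla_a f gives:

```latex
v^a \omega_a \,\Big|_{\omega_a = \nabla_a f} \;=\; v^a \nabla_a f \;=\; v(f)
```

So acting on a scalar function is the special case of acting on a one-form in which the one-form is a gradient. At a single point every one-form is the gradient of some function, so there the two descriptions carry the same information.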

jdstokes · Nov 22, 2007

nrqed said:

I am not very knowledgeable in differential geometry but I would say that vectors are not defined as simple n-tuples. A vector is defined as something that will map a scalar function to a number (the directional derivative in the direction of the vector). Imagine now changing coordinate system. In order for the vector to map the same scalar function to the same final number (the directional derivative, which does not depend on the coordinate system used), the components of the vector in a certain basis must transform in a certain definite way. The vector itself is a geometrical object which does not change, but its components must change.

Yes, you can define a tangent vector at a point p on a smooth manifold M as a smooth derivation at p, i.e. a linear function

v : \mathcal{C}^{\infty}(M) \to \mathbb{R}

which satisfies the product rule: v(fg) = v(f) g(p) + f(p) v(g). The problem with this definition is that it seems to be very far removed from anything physical. This is why I prefer to think of 4-vectors as 4-tuples of functions (physical quantities) which transform in a specified way.
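
A short worked step connects the derivation definition to the "tuple that transforms" picture. This is a sketch assuming a chart x^\mu around p and a second chart x^{\mu'}; the component symbols v^\mu are notation introduced here, not from the thread. Define the components by letting v act on the coordinate functions; the chain rule then gives the transformation law:

```latex
\begin{aligned}
v^{\mu} &:= v(x^{\mu}), \qquad
v(f) = \sum_{\mu} v^{\mu}\,\frac{\partial f}{\partial x^{\mu}}\bigg|_{p}, \\
v^{\lambda'} &= v(x^{\lambda'})
  = \sum_{\mu} \frac{\partial x^{\lambda'}}{\partial x^{\mu}}\bigg|_{p}\, v^{\mu}.
\end{aligned}
```

On this reading the 4-tuple of components is forced to transform by the Jacobian, with no extra postulate needed.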

shoehorn · Nov 23, 2007

jdstokes said:

Yes, you can define a tangent vector at a point p on a smooth manifold M as a smooth derivation at p, i.e. a linear function

v : \mathcal{C}^{\infty}(M) \to \mathbb{R}

which satisfies the product rule: v(fg) = v(f) g(p) + f(p) v(g). The problem with this definition is that it seems to be very far removed from anything physical. This is why I prefer to think of 4-vectors as 4-tuples of functions (physical quantities) which transform in a specified way.

This definition is certainly not removed from physical intuition; on the contrary, it is at the very heart of physical intuition in special relativity. As a hint, think about how objects are represented in Minkowski space. For example, presumably you know that a massive object will follow a timelike curve. Now, what is the defining property of a timelike curve? And how can this defining property be used to define other quantities of physical interest?

(On a totally unrelated topic, your requirement for differentiability in the above quote is very strong. Technically, you need only consider C^1 functions. The restriction to C^{k<\infty} spaces of functions can have a massive influence on one's ability to analyse the existence and uniqueness properties of the governing equations, and you shouldn't, without very good reason, require C^\infty.)

pmb_phy · Nov 23, 2007

shoehorn said:

Eh? You've got this the wrong way round.

From A First Course in General Relativity, by Bernard F. Schutz, page 110

Now a vector is defined as a linear function of one-forms into real numbers.

In the context of differential geometry, a nice (and correct) way to view a vector is as a map which takes a scalar function to a real number.

Please provide a source for your definition, especially the source from which you learned this.

This is obvious since, for example, it allows you to define directional derivatives of functions along curves by allowing a vector to act on the function. This is very basic stuff.

Why do you believe that Schutz is not very basic stuff? This is the text used at MIT for their GR course. Alan Guth recommended this text to me himself. It is obviously a well respected and well known text which is always spoken of in positive terms by most of those people who learn GR from it.

Pete

jdstokes · Nov 23, 2007

Hi pmb_phy,

This sounds like a very strange and probably circular way to define a vector. How does he define a 1-form? Usually 1-forms are defined to be real-valued linear functions on vectors. Thus it makes no sense to define a vector in terms of 1-forms!

jdstokes · Nov 23, 2007

shoehorn said:

This definition is certainly not removed from physical intuition; on the contrary, it is at the very heart of physical intuition in special relativity. As a hint, think about how objects are represented in Minkowski space. For example, presumably you know that a massive object will follow a timelike curve. Now, what is the defining property of a timelike curve? And how can this defining property be used to define other quantities of physical interest?

Let \alpha : I \subset \mathbb{R} \to M be a smooth curve in M. The requirement that \alpha be a timelike worldline is that the tangent vector to the curve have everywhere positive Minkowski norm, i.e. that the push-forward of d/dt at t \in I by \alpha satisfies \langle \alpha_\ast(t),\alpha_\ast(t) \rangle_{\alpha(t)} >0\; \forall t \in I.
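
The timelike condition can be checked numerically for concrete curves. The sketch below assumes c = 1 and the signature (+,-,-,-), which matches the ">0" convention above. The two example curves (a uniformly accelerated worldline and a light ray) and all function names are illustrative choices, not from the thread.

```python
import math

def minkowski_norm_sq(v):
    """<v, v> with assumed signature (+,-,-,-)."""
    return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2

def tangent(curve, s, h=1e-6):
    """Central-difference estimate of the tangent vector to the curve at parameter s."""
    fwd, bwd = curve(s + h), curve(s - h)
    return [(a - b) / (2.0 * h) for a, b in zip(fwd, bwd)]

def accelerated_worldline(tau):
    """Hyperbolic motion (t, x, y, z) = (sinh tau, cosh tau, 0, 0)."""
    return (math.sinh(tau), math.cosh(tau), 0.0, 0.0)

def light_ray(s):
    """A null curve along x, for contrast."""
    return (s, s, 0.0, 0.0)

def is_timelike(curve, samples, tol=1e-6):
    """True if the tangent has positive Minkowski norm at every sampled parameter."""
    return all(minkowski_norm_sq(tangent(curve, s)) > tol for s in samples)

samples = [-1.0, -0.5, 0.0, 0.5, 1.0]
massive_ok = is_timelike(accelerated_worldline, samples)
light_ok = is_timelike(light_ray, samples)
print(massive_ok, light_ok)
```

For the accelerated worldline the tangent is (cosh tau, sinh tau, 0, 0), whose norm is 1 at every tau, so that curve passes; the light ray has norm 0 and does not.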

I'm afraid I don't see how any of this is at the heart of physical intuition in SR.

shoehorn said:

(On a totally unrelated topic, your requirement for differentiability in the above quote is very strong. Technically, you need only consider C^1 functions. The restriction to C^{k<\infty} spaces of functions can have a massive influence on one's ability to analyse the existence and uniqueness properties of the governing equations, and you shouldn't, without very good reason, require C^\infty.)

Interesting. The definition I gave was given to me by a mathematician. I note this is also the definition used in Modern Differential Geometry for Physicists by Isham.

pmb_phy · Nov 23, 2007

jdstokes said:

Hi pmb_phy,

This sounds like a very strange and probably circular way to define a vector. How does he define a 1-form? Usually 1-forms are defined to be real-valued linear functions on vectors. Thus it makes no sense to define a vector in terms of 1-forms!

Yeah. I felt that way too when I first read Schutz. But later, upon more careful reading of Schutz, I realized it was not a circular definition.

On page 67 Schutz writes

Derivative of a function is a one-form. ...

The "..." means that there is a long discussion, too much to post to get the idea.

On page 127 Schutz writes

Consider a scalar field \phi. Given a coordinate system ... it is always possible to form the derivatives ... . We define the one form ... to be the geometrical object whose components are ... in the ... coordinate system. This is a general definition of an infinity of one-forms, each formed from a different scalar field. The transformation then of components is automatic from the chain rule for partial derivatives: ...

The "..." means that I didn't know how to write the symbols with LaTeX.

I've scanned the text into a PDF file for those pages and more. See
http://www.geocities.com/pmb_phy/Schutz.pdf

Best regards

Pete
