# Why doesn't using a basis which is not orthogonal work?

#### etotheipi

As far as I know, a set of vectors forms a basis so long as the vectors are linearly independent and their linear combinations span the entire space. In ##\mathbb{R}^{2}##, for instance, it's common to use an orthogonal basis of the ##\hat{x}## and ##\hat{y}## unit vectors. However, suppose I were to set up a basis (again in ##\mathbb{R}^{2}##) with two vectors which weren't orthogonal, ##\hat{a}## and ##\hat{b}##. The vectors still span the space, so everything seems fine. However, if I try to express a vector law in terms of this basis, like
$$\vec{F} = m\vec{a} \implies F_{a}\hat{a} + F_{b}\hat{b} = ma_{a}\hat{a} + ma_{b}\hat{b}$$
then a few problems arise. If we compare the coefficients of any given component, the result is valid (i.e. ##F_{a} = ma_{a}##). However, the overall expression is just wrong.

For one, the magnitude of ##\vec{a}## is not ##\sqrt{a_{a}^2 + a_{b}^{2}}##; we know this because to compute the real magnitude we'd need to use the cosine rule. However, I thought that a property of vectors was that the magnitude is the root of the sum of the squares of the components (although I'm assuming this only applies if the basis is orthogonal...).

So then I'm conflicted, since it seems valid to use a basis with basis vectors that are not orthogonal, but now the usual rules of vectors don't apply (or at least they need to be tweaked slightly). So is that to say we can only use an orthogonal basis? Thank you!
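To make the discrepancy concrete, here's a quick numerical check (a Python sketch with made-up numbers: unit basis vectors 60° apart and components ##(3, 2)##):

```python
import math

# Hypothetical non-orthogonal basis: unit vectors 60 degrees apart.
theta = math.radians(60)
a_hat = (1.0, 0.0)
b_hat = (math.cos(theta), math.sin(theta))

# A vector with components (3, 2) in this basis.
va, vb = 3.0, 2.0
v = (va * a_hat[0] + vb * b_hat[0], va * a_hat[1] + vb * b_hat[1])

true_mag = math.hypot(v[0], v[1])         # actual length, ~4.359
naive_mag = math.sqrt(va**2 + vb**2)      # sum-of-squares rule, ~3.606 (wrong here)
cosine_rule = math.sqrt(va**2 + vb**2 + 2 * va * vb * math.cos(theta))  # ~4.359

print(true_mag, naive_mag, cosine_rule)
```

The sum-of-squares rule only agrees with the true length when the basis vectors are orthonormal (##\theta = 90^\circ##).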

Some of these "usual rules" that you're thinking about (such as the one about the magnitude being the square root of the sum of the squares of the components) are orthonormal-only special cases of more general rules. If you are not using an orthonormal basis (orthogonal isn't good enough) you need to use the more powerful and general rules, not the special-case ones.

You might want to spend some time reading up on the "metric tensor" and practicing using it with various weird coordinate systems in the Euclidean plane.
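As a taste of how the metric tensor enters — a minimal sketch, assuming unit basis vectors a fixed 60° apart, so the metric components are ##g_{jk} = \hat{b}_j \cdot \hat{b}_k##. The general length rule ##|v|^2 = g_{jk} v^j v^k## reduces to the Pythagorean formula only when ##g## is the identity:

```python
# Minimal sketch of the general length rule |v|^2 = g_jk v^j v^k.
# Assumed setup: unit basis vectors with cos(60 deg) = 0.5 between them.
def norm_sq(g, v):
    """Squared length of v given the metric components g_jk."""
    n = len(v)
    return sum(g[j][k] * v[j] * v[k] for j in range(n) for k in range(n))

g_oblique = [[1.0, 0.5], [0.5, 1.0]]   # non-orthogonal unit basis
g_ortho = [[1.0, 0.0], [0.0, 1.0]]     # orthonormal basis: g is the identity

v = [3.0, 2.0]
print(norm_sq(g_oblique, v))   # 19.0 -- the cosine-rule answer
print(norm_sq(g_ortho, v))     # 13.0 -- Pythagoras, the orthonormal special case
```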

Thank you, pretty interesting stuff! The wikipedia page looks like a bit of a beast, so I'll see how far I can get...!


Mathematically you must be careful about some things here.

First, in a vector space there is no concept of orthogonality until you have defined an inner product. Nor of the length of a vector.

Once you introduce an inner product, you can define what it means for two vectors to be orthogonal and for a vector to have a certain length, including the special case of having unit length.

The usual inner product is defined in such a way that the vectors ##\hat x, \hat y, \hat z## form an orthonormal basis.

If you have the components of a vector in a different basis, then the inner product can be computed using the appropriate basis transformation matrix. Then you are into the heart of linear algebra, with the notion of unitary transformations that transform one orthonormal basis into another, etc.

Another approach is to note that vectors and tensors do not depend on any basis, but their components of course do.

A convenient framework to deal with the transformations of the vector and tensor components is the Ricci calculus. For the general non-Cartesian case, given a scalar product, you have to introduce upper and lower indices. So in the following an upper index must not be misunderstood as a power!

So we have a (real) ##n##-dimensional vector space ##V## with vectors, which I'll denote with ##\vec{x}##. These are invariant objects. There is also a scalar product defined, which is a symmetric bilinear form ##(\vec{x},\vec{y}) \mapsto \vec{x} \cdot \vec{y}## that is positive definite, i.e., ##\vec{x} \cdot \vec{x} \geq 0## for all ##\vec{x} \in V##, and ##\vec{x} \cdot \vec{x}=0## implies ##\vec{x}=0##.

Now we introduce an arbitrary basis ##\vec{b}_j##, which is a set of ##n## linearly independent vectors, which admit a unique decomposition of a vector,
$$\vec{x}=x^{j} \vec{b}_j,$$
where summation over two repeated indices with one written as a lower index and one written as an upper index over ##j \in \{1,2,\ldots,n \}## is understood.

Now you can calculate the scalar product between any two vectors given their components wrt. this basis as soon as the scalar products between the basis vectors,
$$g_{jk}=\vec{b}_j \cdot \vec{b}_k,$$
are known, because then
$$\vec{x} \cdot \vec{y}=x^j \vec{b}_j \cdot y^k \vec{b}_k = x^j y^k \vec{b}_j \cdot \vec{b}_k = g_{jk} x^j y^k.$$
From the symmetry and positive definiteness of the scalar product you have
$$g_{jk}=g_{kj}, \quad \mathrm{det} (g_{jk}) \neq 0,$$
i.e., the matrix ##\hat{g}=(g_{jk})## is invertible, i.e., there's the inverse matrix ##\hat{g}^{-1}##, whose matrix elements we denote by definition with ##g^{jk}##, i.e., you have
$$g_{jk} g^{kl}=\delta_{j}^{l}=\begin{cases} 1 & \text{for} \quad j=l, \\ 0 & \text{for} \quad j \neq l. \end{cases}$$
Now it's useful to introduce the idea of "index dragging", i.e., the definition that you can lower an index with ##g_{jk}## and raise an index with ##g^{jk}##. E.g., for the vector components with upper indices you define those with lower indices via
$$x_j=g_{jk} x^k \; \Leftrightarrow \; x^k=g^{kj} x_j.$$
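Numerically, with a made-up 2D metric (unit basis vectors 60° apart), lowering and raising an index are inverse operations, and the mixed contraction ##x_j x^j## reproduces the squared length:

```python
# Made-up 2D metric: unit basis vectors with g_12 = cos(60 deg) = 0.5.
g = [[1.0, 0.5], [0.5, 1.0]]                      # g_jk
det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_inv = [[ g[1][1] / det, -g[0][1] / det],
         [-g[1][0] / det,  g[0][0] / det]]        # g^jk, the inverse matrix

x_up = [3.0, 2.0]                                 # contravariant components x^j

# Lower the index: x_j = g_jk x^k
x_dn = [sum(g[j][k] * x_up[k] for k in range(2)) for j in range(2)]
# Raise it again: x^j = g^jk x_k  (recovers the original components)
x_back = [sum(g_inv[j][k] * x_dn[k] for k in range(2)) for j in range(2)]

print(x_dn)                                        # [4.0, 3.5]
print(x_back)                                      # [3.0, 2.0] up to rounding
# The mixed contraction x_j x^j equals g_jk x^j x^k:
print(sum(lo * up for lo, up in zip(x_dn, x_up)))  # 19.0
```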
Now we investigate how to get from one basis ##\vec{b}_j## to an arbitrary new basis ##\vec{b}_k'##. Since the new basis is again a linearly independent set of ##n## vectors there's a one-to-one-mapping between the bases,
$$\vec{b}_k'=\vec{b}_j {T^j}_k, \quad \vec{b}_j =\vec{b}_k' {U^k}_j.$$
Of course the matrices ##\hat{T}## and ##\hat{U}## are inverse to each other:
$$\vec{b}_k' = \vec{b}_j {T^j}_k =\vec{b}_l' {U^l}_j {T^j}_k \; \Rightarrow\; {U^l}_j {T^j}_k=\delta_k^l,$$
and the latter equation means in matrix notation
$$\hat{U} \hat{T} = \hat{1} \; \Rightarrow \; \hat{U}=\hat{T}^{-1}.$$
Now it's easy to see how the vector components transform. Since ##\vec{x}## is totally independent of any choice of basis we have
$$\vec{x}=x^j \vec{b}_j =\vec{b}_k' x^{\prime k}=\vec{b}_j {T^j}_k x^{\prime k} \; \Rightarrow \; x^j={T^j}_k x^{\prime k}$$
or, in the other way
$$\vec{x}=\vec{b}_k' x^{\prime k} = \vec{b}_j x^j = \vec{b}_k' {U^k}_j x^j \; \Rightarrow\; x^{\prime k}={U^k}_j x^j.$$
One says that the objects with upper indices transform contragrediently to those with the lower indices or the objects with lower indices transform covariantly and the ones with upper indices contravariantly.

This also holds for the components of the scalar product, ##g_{jk}##. By definition we have
$$g_{jk}' = \vec{b}_j' \cdot \vec{b}_k' =\vec{b}_l \cdot \vec{b}_m {T^l}_j {T^m}_k = g_{lm} {T^l}_j {T^m}_k,$$
i.e., you have to transform an object with two lower indices like the basis vectors, applying the rule to each index. Now it's clear from the formalism that ##\vec{x} \cdot \vec{y}## is independent of the choice of basis as well. Indeed you get
$$\vec{x} \cdot \vec{y} = g_{jk}' x^{\prime j} y^{\prime k} = g_{lm} {T^l}_j {T^m}_k x^{\prime j} y^{\prime k} = g_{lm} x^l y^m,$$
as it should be.
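This invariance can be checked with a small numerical sketch (made-up transformation matrix ##\hat{T}##; the old metric is taken orthonormal just for simplicity):

```python
# Made-up change of basis: T expresses the new basis vectors in the old basis.
T = [[2.0, 1.0],
     [1.0, 1.0]]                                   # T^j_k
detT = T[0][0] * T[1][1] - T[0][1] * T[1][0]
U = [[ T[1][1] / detT, -T[0][1] / detT],
     [-T[1][0] / detT,  T[0][0] / detT]]           # U = T^{-1}

g = [[1.0, 0.0], [0.0, 1.0]]                       # old metric, orthonormal for simplicity
x, y = [3.0, 2.0], [1.0, 4.0]

def mat_vec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

# Components transform contravariantly: x'^k = U^k_j x^j
xp, yp = mat_vec(U, x), mat_vec(U, y)
# The metric transforms covariantly: g'_jk = g_lm T^l_j T^m_k
gp = [[sum(g[l][m] * T[l][j] * T[m][k] for l in range(2) for m in range(2))
       for k in range(2)] for j in range(2)]

def dot(metric, a, b):
    return sum(metric[j][k] * a[j] * b[k] for j in range(2) for k in range(2))

print(dot(g, x, y), dot(gp, xp, yp))               # 11.0 11.0 -- basis independent
```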

After one has got used to the somewhat murky index gymnastics (called "Ricci calculus") everything is pretty straightforward ;-).

It's, e.g., easy to see that you can write
$$\vec{x} \cdot \vec{y} = x_j y^j=x^j y_j$$
and that the ##g^{jk}## transform contravariantly, i.e., you have to apply the rule for transforming between contravariant vector components to each upper index:
$$g^{\prime lm}={U^l}_j {U^m}_k g^{jk}.$$
It's a good exercise to prove this explicitly!
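For anyone wanting to check their answer afterwards, one route (using only relations derived above) is to verify that the proposed ##g^{\prime lm}## is indeed the inverse of ##g'_{jk}##:
$$g'_{jk} g^{\prime kl} = g_{pm} {T^p}_j {T^m}_k \, {U^k}_q {U^l}_r g^{qr} = g_{pq} {T^p}_j {U^l}_r g^{qr} = {T^p}_j {U^l}_r \delta_p^r = {U^l}_p {T^p}_j = \delta_j^l,$$
where ##{T^m}_k {U^k}_q = \delta^m_q## and ##g_{pq} g^{qr} = \delta_p^r## were used.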

Goodness, seems like that'll keep me occupied for the rest of the day! Thank you both for these insights, I'm going to try to work through this and will report back!

As an interlude from the fun with indices, I have a question of a more mundane nature. How exactly is a basis connected to a coordinate system (i.e. which comes first?).

From what I've read this morning, I'm inclined to say that a vector space is the most general and is essentially just a set of vectors (things that obey a few rules of vector spaces, like associativity of addition, commutativity of addition etc.). Furthermore, every vector space has a basis, such that every vector in the space can be written as a linear combination of the basis vectors.

It appears that we can work with vectors without even defining a coordinate system, just by decomposing them into components in a particular basis and applying vector laws (e.g. ##F_{x} = ma_{x}## doesn't appear to require a coordinate system - though I might be wrong!).

A coordinate system just seems like a construction which labels points within a space with a tuple of scalars e.g. ##(3,2,38)##. But then we also talk about ##\hat{x}, \hat{y}, \hat{z}## as being a part of that coordinate system. So in effect, do we need to define a basis first in order to define a coordinate system? And even so, does a tuple like ##(3,2,38)## make sense on its own without a basis?

Edit: I mean the tuples (0,0,1), (1,0,0) and (0,1,0) are a basis, which are essentially the same as ##\hat{x}##, ##\hat{y}## and ##\hat{z}##. So is a coordinate then a vector? I feel like I'm digging myself into a hole...

A vector space is indeed what you say, a set of objects called vectors and a field (in the algebraic mathematicians' meaning like the field of real numbers ##\mathbb{R}##) with the usual rules for adding vectors and multiplying vectors with numbers in the field.

Then it's easy to see that you can have sets of linearly independent vectors as well as complete sets of vectors, where a complete set of vectors is one such that you can write any vector as a superposition of vectors in this set. It's also clear by definition that the decomposition of a vector in terms of superpositions of a complete linearly independent set is unique. That makes the complete linearly independent sets special and thus they get a special name: basis. It's also clear that there are many possible bases, not only one.

If there's a basis with a finite number of vectors, then all bases have the same finite number of vectors, which is called the dimension of the vector space.

Now, using the above notation: If ##\vec{b}_i## is a basis of an ##n##-dimensional vector space, then there's a one-to-one mapping between vectors ##\vec{x}## and the set of ordered ##n##-tuples of real numbers, since
$$\vec{x}=x^i \vec{b}_i$$
is unique, i.e., for each ##\vec{x}## each ##x^i## is uniquely defined, and the mapping is ##\vec{x} \mapsto (x^1,x^2,\ldots,x^n)##.

It's also easy to show that the ##n##-tuples ##(x^1,\ldots,x^n)## build an ##n##-dimensional vector space when you define the addition of vectors as ##(x^1,\ldots,x^n) + (y^1,\ldots,y^n)=(x^1+y^1,\ldots,x^n+y^n)## and the multiplication with a scalar as ##\lambda(x^1,\ldots x^n)=(\lambda x^1,\ldots,\lambda x^n)##. It's also clear that the mapping from the abstract vector space ##V## to the vector space of ##n##-tuples of real numbers, ##\mathbb{R}^n##, is an isomorphism, i.e., a linear invertible mapping. This implies that each real vector space of dimension ##n## is isomorphic to ##\mathbb{R}^n##.
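The isomorphism is quite concrete: given any basis, the components of a vector are found by solving a linear system. A small sketch in ##\mathbb{R}^2## with a made-up (non-orthogonal) basis:

```python
# Hypothetical (non-orthogonal) basis of R^2: b1 = (2, 1), b2 = (1, 3).
# Find the unique components x^1, x^2 of v = x^1 b1 + x^2 b2 via Cramer's rule.
b1, b2 = (2.0, 1.0), (1.0, 3.0)
v = (7.0, 11.0)

det = b1[0] * b2[1] - b2[0] * b1[1]    # non-zero iff b1, b2 are linearly independent
x1 = (v[0] * b2[1] - b2[0] * v[1]) / det
x2 = (b1[0] * v[1] - v[0] * b1[1]) / det

print(x1, x2)                          # 2.0 3.0
# Reconstruct v, confirming the decomposition is unique:
print(x1 * b1[0] + x2 * b2[0], x1 * b1[1] + x2 * b2[1])   # 7.0 11.0
```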

Now let's briefly look at physics. There are the invariant vectors, which don't depend on a basis and which have a clear physical meaning, like in your example the acceleration ##\vec{a}## of a particle along its trajectory. Now you can introduce a basis; the corresponding components ##(a^1,a^2,a^3)## are a triplet of numbers depending on this basis, and there's a one-to-one mapping between the physical vector ##\vec{a}## and these three numbers.

So if I understand correctly, is a coordinate system just a special type of vector space? The vectors within this space would be a tuple of scalars ##(a^{1}, a^{2}, a^{3})##, and the basis would just be ##(1,0,0), (0,1,0), (0,0,1)##.

And if we position our coordinate system somewhere in a physical space, then we can create a one-to-one mapping between every location in the space and a particular tuple of scalars?

First of all, the "trivial basis" of ##\mathbb{R}^3## you just quoted above is only one of the arbitrarily many bases possible. Admittedly it's the most convenient one.

Concerning the physics, you are right. The point is that you can choose any three vectors that are not linearly dependent as a basis, and this creates a one-to-one mapping between the physical vectors and ##\mathbb{R}^3##, but that mapping is not independent of the chosen basis and thus the components of vectors depend on the basis. The physical vectors themselves don't, of course, because nature doesn't care about our descriptions of her. So physical laws must be independent of the arbitrary choice of a basis to define a reference frame, i.e., they must be expressible in terms of laws described by scalars, vectors, and tensors, which are independent of the choice of basis.

That's why the fundamental laws of, e.g., Newtonian mechanics are all formulated in terms of equations between vectors like ##\vec{F}=m \vec{a}## (which implies that the mass ##m## must be a scalar quantity since both the force ##\vec{F}## and acceleration ##\vec{a}## are vectors).

"Classical Dynamics of Particles and Systems" by Marion has a discussion of oblique coordinate systems.

I'm still a little confused. Given any vector, I understand it can be represented in terms of a coordinate vector, i.e. a list of scalars each of which is the component in a particular basis.

However, often a coordinate system is just thought of as a smooth mapping from points in space to ##\mathbb{R}^n##, in that we drop a perpendicular to each axis and read off the value in order to obtain our tuple of numbers which describes that point, maybe ##(3, 2)##. In that sense, we're not even dealing with vectors but are just labelling points.

On the one hand, it seems that if we have a vector space with physical vectors like ##\vec{r}##, ##\vec{v}## and ##\vec{a}##, we could form a basis consisting of maybe ##\hat{x}, \hat{y}, \hat{z}## and represent each vector in terms of the basis vectors.

But an arbitrary point somewhere on the x-y plane doesn't appear to be a vector, its coordinates are just labels. Like if we consider the points on the graph of ##y=x^{2}##. And we could, if we so wish, superimpose some basis vectors onto this coordinate system so that each point is now labeled by both a set of coordinates and a position vector - though the two ideas seem fairly distinct (although linked).

So I wonder whether a coordinate system can exist on its own without basis vectors?


There are a few subtleties here that I must admit I haven't ever seen fully discussed. The key point is the relationship between physics, physical quantities and physical coordinate systems on the one hand, and the mathematical systems on the other.

There was a thread recently (maybe it was one of yours?!) that showed there is no clear consensus on the mathematics underpinning 1D motion.

First, and I think this is a very important point, you shouldn't drag physics into the theory of vector spaces. Up to a point, you have to study and learn to think about vector spaces without any recourse to physical concepts, physics or the "real world". Vector spaces without an inner product generally only have bases, not coordinate systems, as such.

If you then define an inner product on the vector space, you can define a mathematical coordinate system on the vector space. It could be Cartesian or polar or something more exotic. This introduces subtleties and technicalities. For example, the polar unit vectors do not form a basis for your vector space in the usual mathematical sense. ##\hat \theta## is some sort of local unit vector that is fundamentally different from a basis vector in a vector space.
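A quick numeric illustration of that point, assuming the usual definitions ##\hat r = (\cos\theta, \sin\theta)## and ##\hat \theta = (-\sin\theta, \cos\theta)##: the pair of "basis" vectors changes from point to point, unlike ##\hat x## and ##\hat y##:

```python
import math

def polar_unit_vectors(theta):
    """Local unit vectors at a point with polar angle theta (usual convention)."""
    r_hat = (math.cos(theta), math.sin(theta))
    theta_hat = (-math.sin(theta), math.cos(theta))
    return r_hat, theta_hat

# Unlike x-hat and y-hat, the pair depends on where you stand:
for theta in (0.0, math.pi / 4, math.pi / 2):
    print(theta, polar_unit_vectors(theta))
```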

In my experience it's not until you study GR that you really have to confront this.

If you have a mathematical model of a physical theory, then you map certain aspects of your model onto certain mathematical structures. There are subtleties here too. For example, you cannot add a position vector to a velocity vector. What does it even mean to add two velocities together? And, velocity is frame dependent, so how does that tie in with the mapping to your mathematical vector spaces?

Suffice it to say that there are a lot of hidden complexities in this physics to mathematics mapping. And, I don't know a good resource that tries to straighten it all out.

My advice at this stage is not to confuse the mathematical theory of vector spaces with the physical nature of vectors and coordinate systems that are maps of space and time.

So I wonder whether a coordinate system can exist on its own without basis vectors?

I think this question is essentially meaningless. First, from a mathematical point of view:

Define a coordinate system. Does it have basis vectors as part of or implied by its definition? There's your answer.

Second, physically:

I have three cupboards, which I'll label A, B and C. That's a coordinate system for the things in my cupboards. There is no concept of vectors there at all.

That said, the coordinate systems you tend to use in physics imply some sort of mapping to a vector space, where we can talk about basis vectors. But, if for whatever reason you don't map things to a vector space, then the question is not well-defined.

For example, the polar unit vectors do not form a basis for your vector space in the usual mathematical sense. ##\hat \theta## is some sort of local unit vector that is fundamentally different from a basis vector in a vector space.

That's an interesting point, I'd never considered that before!

Define a coordinate system. Does it have basis vectors as part of or implied by its definition? There's your answer.

This seems like it should be straightforward but in fact it's quite tricky! The Wikipedia page just defines a coordinate system as measuring signed perpendicular distances from the axes - so no mention of basis vectors at all. And indeed, the concept of basis vectors only exists within vector spaces.

It's just that the two are so intertwined, since the basis vectors used in any physics problem are essentially always aligned with the coordinate axes. So it's unclear whether we impose a coordinate system onto a vector space and define the coordinates as the tuple containing the scalar components, or whether we just fix a coordinate system somewhere physically and also set up some basis vectors which are aligned with those axes.

Having re-read some of the previous posts again, here's how I'm thinking everything fits together,

Given an ##n##-dimensional vector space ##V## over a field ##F## with a basis ##B = \{\vec{v_{1}}, \vec{v_{2}}, ..., \vec{v_{n}}\}##, any vector ##\vec{v}## can be expressed as
$$\vec{v} = a_{1}\vec{v_{1}} + a_{2}\vec{v_{2}} + ... + a_{n}\vec{v_{n}}$$ where ##a_{1}, a_{2}, ..., a_{n}## are the coordinates of ##\vec{v}## with respect to the basis ##B##.

For example, if we have a vector space equipped with a Cartesian coordinate system, perhaps in ##\mathbb{R}^{3}##, then any vector ##\vec{v}## can be expressed relative to the standard basis ##B = \{\vec{e_{1}}, \vec{e_{2}}, \vec{e_{3}}\}## of this coordinate system.

This interpretation of coordinates of a vector appears distinct from the idea of coordinates of a point, whose coordinates are defined as the perpendicular distances from the axes. Although, a position vector from the origin to a point has vector coordinates which equal the coordinates of this point.

Then of course we get onto more exotic coordinate systems, such as polar. In this, the coordinates of a point are represented via a tuple of numbers ##(r, \theta)##. Here the idea of a position vector doesn't really work, since the basis vectors are not fixed.

So perhaps we could say that choosing a vector space allows you to choose a coordinate system whose axes are parallel to the basis vectors we used to define the vector space. Though this isn't a great definition, for two reasons:
• It's been suggested that we don't need a vector space to define a coordinate system
• This doesn't work well/at all for non-Cartesian coordinate systems


I put my pure mathematician's hat on and underlined all the terms in your post that are undefined for a vector space.

As vector spaces, ##\mathbb{R}^3## and the set of polynomial functions of degree two or lower, let's call it ##P_2##, are isomorphic. Anything you can say about one you can say about the other.

Moreover, if we define an inner product on ##P_2##, then we can make them isomorphic as inner product spaces.

A vector space is an abstract mathematical object. Unless and until you define things like "axis", "coordinate system", "point", "position vector", "polar coordinates", they have mathematically no meaning.

One problem with using ##\mathbb{R}^3## as an example of a vector space is that it comes with a lot of pre-existing mathematical and even physical baggage. You're tempted to start talking about things that ##\mathbb{R}^3## "obviously" has, even where these concepts and properties are undefined in terms of abstract vector spaces.

Perhaps it's best, for now at least, to just keep it simple and say that a coordinate system is defined using a set of basis vectors. Those basis vectors do indeed span a vector space, though that itself is not too important.

So for a good old 2D rectangular Cartesian system, a point in the plane is identified by a 2-tuple, which are the coordinates of the position vector of that point relative to the standard basis.

This seems fine, if a little overkill. If we're plotting a graph of ##y=f(x)##, then we're essentially just putting a dot above the ##x## value and level with the ##f(x)## value on the other axis, i.e. dropping perpendiculars, which in effect just shows all of the ordered pairs that satisfy the relation. It's questionable whether a full-blown vector formalisation is required to define such a graph? I'm not sure. After all we are still using a coordinate system, so I don't see why not...!

Of course you can also use any kind of coordinates, like spherical or cylindrical coordinates, but that's the next step. First you have to understand the simpler business with the bases.

Now when you deal with Newtonian physics you have, strictly speaking, not simply a vector space describing the physical 3D configuration space but a so-called affine space. It's a set of points together with a 3D real vector space. To any two points ##A## and ##B## you can associate a vector ##\overrightarrow{AB}##, which is just an arrow connecting these two points in a straight line (assuming you have usual Euclidean geometry established on your set of points). Two vectors of this kind, ##\overrightarrow{AB}## and ##\overrightarrow{CD}##, are considered the same when you can parallel-translate one of these arrows into the other. Vector addition works by placing the beginning of one vector at the end of the other and then drawing the arrow from the beginning of the first to the end of the second, i.e., ##\overrightarrow{AB} + \overrightarrow{BC}=\overrightarrow{AC}##.

To also define lengths and angles you need a scalar product, such that ##|AB|=\sqrt{\overrightarrow{AB} \cdot \overrightarrow{AB}}##, and the angle ##\phi## between two vectors is defined via ##\vec{a} \cdot \vec{b}=|\vec{a}| |\vec{b}| \cos \phi##, and so on.

Now in physics a reference frame consists of a fixed point ##O## ("the origin") and a basis ##\vec{b}_j##. Any point ##P## is then uniquely mapped to the position vector ##\vec{r}=\overrightarrow{OP}## and can then be mapped to ##\mathbb{R}^3## via its components ##r^j## wrt. this basis.

Of course the most convenient bases are the Cartesian bases, which we usually label ##\vec{e}_j## and which fulfill ##g_{jk}=\vec{e}_j \cdot \vec{e}_k=\delta_{jk}##, i.e., they have unit length and are perpendicular to each other. This makes life much easier than using a general basis, and that's why almost all calculations in Newtonian mechanics use Cartesian bases.

Now you can of course define arbitrary curves, like the trajectory of a particle, as a map from the real numbers (for the trajectory you use time as the parameter) to the position vectors ##t \mapsto \vec{r}(t)## along the curve.

Via the mapping to ##\mathbb{R}^3## you now can also use calculus and define derivatives, integrals, limits etc. for vectors.
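For instance, componentwise calculus on a made-up trajectory (a helix), where the derivative of the vector-valued map is just the derivative of each component:

```python
import math

def r(t):
    # Hypothetical trajectory (a helix), defined via its components in R^3.
    return (math.cos(t), math.sin(t), t)

def velocity(t, h=1e-6):
    # Central-difference derivative, taken component by component.
    return tuple((p - q) / (2 * h) for p, q in zip(r(t + h), r(t - h)))

t = 1.0
v = velocity(t)
exact = (-math.sin(t), math.cos(t), 1.0)
print(v)        # numerically close to the exact derivative
print(exact)
```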

Right, and I suppose the benefit of defining an affine space ##A\mathbb{R}^{3}## would be that you can choose ##O## arbitrarily.

Google advises me that the vector space you described is known as a Euclidean space ##E## such that
The Cartesian coordinates of a point ##P## of ##E## are the Cartesian coordinates of the vector ##\overrightarrow{OP}##.

It's actually pretty cool to think that all this time the coordinate systems used in pure maths are actually built on vector spaces, with each point mapped to real numbers via the components of its position vector. The signed-perpendicular-distance-from-the-axis definition is then just an interpretation of the point.

I don't want to burden anyone by dragging out this thread any longer since you've already provided a lot of help, though I wondered whether anyone had any suggestions for textbooks which might help iron this sort of stuff out a little bit? I do have one for linear algebra but it's definitely more of a mathematical treatment.

I did find a publication by Don Koks which I thought provides a fairly clear explanation of some of the concepts we've discussed.

Though I know @PeroK mentioned earlier that resources on this matter aren't readily available!

Luckily I think I've gotten somewhere and it seems to be equivalent to what @vanhees71 described in an earlier post.

The phrase "coordinate system" on its own, as some have mentioned earlier, doesn't have a precise meaning at all, since it just refers to a labelling system of some sort (like cupboards A/B/C etc.).

However, until you get to GR with all of the weirdness and non-Euclidean spaces, all of the work you do in elementary geometry and Newtonian physics etc. is conducted within Euclidean spaces.

And a Euclidean space can be defined as an affine space consisting of a set of points ##E## as well as a vector space (apparently some also call this a space of translations) ##\vec{E}##. An inner product is defined on ##\vec{E}##, and a subtraction operation is defined on ##E## such that subtracting one point from another in ##E## gives a vector in ##\vec{E}##.

Then it remains to impose a coordinate system on ##E## so that each point can be mapped to an ##n##-tuple. And for this we can define a Cartesian coordinate system as an affine coordinate system on the Euclidean space, an idea which has already been discussed at some length.

This is by no means a complete picture even for Euclidean spaces, however. I've made no mention of spherical coordinate systems nor (dare I bring it up again...) the concept of physical reference frames. The accepted answer to this Stack Overflow post appears to provide a coherent-looking explanation, though the downside is that it is so unbelievably complicated that I can't understand any of it...

I think for now it is best to just leave all of this as is and maybe revisit it in a couple of centuries once I've covered a bit more maths. After all, we have coherent functional definitions of coordinate systems in Euclidean spaces (i.e. measure the radial coordinate and measure the angle, then set up the unit vectors as so... etc.).

But to answer the original (second) question, I think it's safe to conclude that the choice of the basis (+ origin) induces the coordinate system, and not the other way around!

I think the manuscript you mentioned in #21 is very good, though perhaps it's indeed better to first stick with Cartesian bases, with which you come very far in mechanics and even electrodynamics.
