How do Tensors "work" in relation to linear algebraic objects?

Sciencemaster
TL;DR Summary: I understand Einstein summation on its own, but when I try to connect tensor notation directly to linear algebra objects, I get tripped up by how vectors and tensors should be represented. How should I properly think about this connection?
I've been reviewing some introductory tensor stuff, and I've come to the realization that some of the things tensors do confuse me. For example, the notes I'm reading say that the invariant interval is both ##S=\eta_{\mu\nu} x^\mu x^\nu## and ##S=x^T \eta x##. Both are fine on their own: the former sums over the vector components with the proper signs from the metric, and the latter gives the same scalar value via matrix multiplication. My issue is with equating the two. The way the former is written, I think of it as the metric matrix times two coordinate column vectors (because of their upper indices). In terms of linear algebra, not only would this be a column vector rather than a scalar, but neither of the vectors would be transposed, since ##x^\mu## and ##x^\nu## both come after the matrix and have their indices in the same place (i.e. on equal footing).
Upon further reflection, I realized that even the idea that ##a_{\mu\nu} b^\mu## would take a column vector (upper index) and matrix-multiply it to produce a row vector (lower index) seems strange to my linear algebra sensibilities. In the past I have worked with both linear algebra and tensors just fine; Einstein summation notation makes sense in itself and I've manipulated my fair share of tensor equations. However, going back and trying to conceptually connect tensor math with the presented framework of linear algebra, I'm having trouble with the connection between the two areas of math. Since (0,1)-tensors have been presented as row vectors thus far, I'm tempted to visualize a (0,2)-tensor as a collection of row vectors rather than a "normal" matrix, given how it interacts with (1,0)-tensors (summing with two column vectors to create a scalar, or summing over one column vector and leaving one row vector as the output). The overall point is, I'm struggling with the connection between linear algebra and basic tensor math.
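For concreteness, here's a small NumPy sketch of the two expressions I mean (a minimal sketch, assuming the flat metric diag(-1, 1, 1, 1); the vector components are made up):

```python
import numpy as np

# Flat Minkowski metric with signature (-,+,+,+); the sign convention is an assumption
eta = np.diag([-1.0, 1.0, 1.0, 1.0])

# An arbitrary coordinate 4-vector x^mu (made-up components)
x = np.array([2.0, 1.0, 0.5, -3.0])

# Index form: S = eta_{mu nu} x^mu x^nu, written as an explicit double sum
S_index = np.einsum('mn,m,n->', eta, x, x)

# Matrix form: S = x^T eta x
S_matrix = x @ eta @ x

print(S_index, S_matrix)  # both print the same scalar
```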
 
@Sciencemaster please note I've used magic moderator powers to edit your LaTeX a bit. You use double dollar signs to enclose inline LaTeX, not single ones, and you use curly braces, not parentheses, to enclose things like multiple subscript indexes.
 
Sciencemaster said:
I'm struggling with the connection between linear algebra and basic tensor math.
"Linear algebra" is not the same as "matrix algebra". Column vectors, row vectors, and matrices are just one possible representation of linear algebra objects (vectors and tensors).

A more general way to look at things is that what you call a "column vector" (a thingie with one upper index) is really a "tangent vector", i.e., an element of a vector space that's called a "tangent space". (In flat spacetime you can get away with ignoring this since flat Minkowski spacetime is a vector space in its own right, with vectors interpreted as displacements instead of tangents to worldlines. However, that interpretation breaks down in curved spacetime.) What you call a "row vector" (a thingie with one lower index) is actually a "covector" or a "1-form", i.e., a bounded linear map from tangent vectors to numbers. (The operation of applying a 1-form to a vector to get a number is called "contraction".) Tensors are then just generalizations of the same thing to higher ranks; a (0, 2) tensor, for example, is a bounded linear map from pairs of tangent vectors to numbers, and a (2, 0) tensor is a linear combination of bivectors, i.e., of pairs of tangent vectors.

Note that all of this works even if you don't have a metric tensor; but if you don't have a metric tensor, you can't raise or lower indexes, i.e., there is no natural correspondence between vectors and 1-forms (and similarly for higher order tensors). If you have a metric tensor, however, that gives you a natural correspondence between vectors and 1-forms, which is what your second way of writing the invariant interval is making use of (sort of).
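If it helps to see contraction and index lowering spelled out numerically, here is a minimal NumPy sketch (the metric choice and all components are made up for illustration):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # a metric tensor (Minkowski, illustrative choice)
v = np.array([1.0, 2.0, 0.0, -1.0])    # a tangent vector v^mu (made-up components)
w = np.array([0.5, 0.0, 3.0, 1.0])     # a 1-form w_mu (made-up components)

# Contraction: applying the 1-form to the vector gives a number, w_mu v^mu
number = np.einsum('m,m->', w, v)

# With a metric we can lower an index: v_mu = eta_{mu nu} v^nu,
# giving the 1-form that naturally corresponds to the vector v
v_lower = np.einsum('mn,n->m', eta, v)

# The invariant interval of v is the contraction of v with its own lowered version
print(number, np.einsum('m,m->', v_lower, v))
```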
 
PeterDonis said:
"Linear algebra" is not the same as "matrix algebra". Column vectors, row vectors, and matrices are just one possible representation of linear algebra objects (vectors and tensors).

A more general way to look at things is that what you call a "column vector" (a thingie with one upper index) is really a "tangent vector", i.e., an element of a vector space that's called a "tangent space". (In flat spacetime you can get away with ignoring this since flat Minkowski spacetime is a vector space in its own right, with vectors interpreted as displacements instead of tangents to worldlines. However, that interpretation breaks down in curved spacetime.) What you call a "row vector" (a thingie with one lower index) is actually a "covector" or a "1-form", i.e., a bounded linear map from tangent vectors to numbers. (The operation of applying a 1-form to a vector to get a number is called "contraction".) Tensors are then just generalizations of the same thing to higher ranks; a (0, 2) tensor, for example, is a bounded linear map from pairs of tangent vectors to numbers, and a (2, 0) tensor is a linear combination of bivectors, i.e., of pairs of tangent vectors.

Note that all of this works even if you don't have a metric tensor; but if you don't have a metric tensor, you can't raise or lower indexes, i.e., there is no natural correspondence between vectors and 1-forms (and similarly for higher order tensors). If you have a metric tensor, however, that gives you a natural correspondence between vectors and 1-forms, which is what your second way of writing the invariant interval is making use of (sort of).
First off, thanks for fixing my formatting. I knew about the curly brackets; I just missed them when I was typing. I keep getting tripped up on the inline formatting with #'s and $'s, though. As for the question itself, I think I get it. If a (0,2) tensor is a covector of covectors, and you sum over each covector's action on some vector, you'd be left with a covector of scalars. I suppose part of the confusion was that a matrix is *also* a linear combination of vectors (namely, where the standard basis vectors transform to under said matrix). In fact, ##c^\nu=a^{\nu}_{\mu}b^\mu## would, I believe, be identical to the common matrix transformation formula. I'm imagining our matrix as a vector of covectors, where each covector acts on the vector being transformed, giving a vector of resulting numbers. In other words, if ##A=\begin{bmatrix}b \\ c\end{bmatrix}## where b and c are covectors, A is a 2x2 matrix, and a is a vector, then ##Aa=\begin{bmatrix}ba \\ ca\end{bmatrix}##. Since ba and ca each represent a covector acting on a vector, each is simply a scalar, and Aa is a vector. On the other hand, a (2,0) tensor would instead be a vector of vectors (to be acted on, rather than the previous case where each element acts on something), which makes sense with the results. I'm not sure if I'm expressing my thought process in a readable way (I've said the words 'vector' and 'covector' a LOT), but hopefully you can get the gist of what I'm trying to say. If so, does this make any sense?
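To check my own reasoning, here's a tiny NumPy sketch of what I mean (all numbers are arbitrary): the einsum form of ##c^\nu=a^{\nu}_{\mu}b^\mu## matches ordinary matrix-vector multiplication, and each row of the matrix acts like a covector on the vector being transformed.

```python
import numpy as np

a = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # components a^nu_mu as a 2x2 array (arbitrary numbers)
b = np.array([5.0, 6.0])     # components of a vector b^mu

# Index form: c^nu = a^nu_mu b^mu
c_index = np.einsum('nm,m->n', a, b)

# Ordinary matrix-vector multiplication
c_matrix = a @ b

# Each row of `a`, treated as a covector acting on b, gives one component of c
c_rows = np.array([row @ b for row in a])

print(c_index, c_matrix, c_rows)  # all three agree
```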
 
Sciencemaster said:
If a (0,2) tensor is a covector of covectors
That's not what I said. Read my post again.

Sciencemaster said:
a (2,0) tensor would instead be a vector of vectors
That's not what I said either.
 
PeterDonis said:
That's not what I said. Read my post again.


That's not what I said either.
That's fair. I'm just trying to visualize what expanding a vector into "more dimensions" really 'looks' like. A tangent vector being represented as something like an arrow from one point on a surface outward onto a tangent plane (or higher-dimensional equivalent) makes sense. A covector being an object that, when applied to that tangent vector, gives a scalar makes sense. Expanding these to higher ranks is a bit harder to visualize. I get that (2,0), for example, is effectively a linear combination of tangent vectors (which is why I was trying to include a regular matrix analogy, as those are linear combinations of "regular" vectors). This being the case, I was using the 'vector of vectors' analogy to try to make sense of how the operations work, like how each of the vectors in this linear combination would sum with a covector, reducing the rank by one, while sort of keeping the "linear combination" itself intact so the rank stays nonzero (this specific sentence is a bad analogy, but I'm trying to illustrate the shortcomings of my intuition anyway). I can see how a linear combination of vectors doesn't equate to a "vector of vectors", though. I'm also struggling with how combinations of raised/lowered indices can be thought of. For example, if a (2,0) tensor is a linear combination of tangent vectors, then what exactly is a (1,1) tensor?
 
Sciencemaster said:
I'm just trying to visualize what expanding a vector into "more dimensions" really 'looks' like.
You're missing the point: a tensor is not "expanding a vector into more dimensions".

Sciencemaster said:
Expanding these to higher ranks is a bit harder to visualize.
That's why mathematicians and physicists don't try to visualize these things. They work with the equations that the things satisfy.

That said, there are visualizations of these things in the literature. For example, see Figure 4.1 of Misner, Thorne, & Wheeler, which gives a visualization of the 2-form that describes the electromagnetic field.

Of course at some point our powers of visualization simply aren't up to the task at all, since we can only visualize up to three dimensions, and spacetime has four. That's another reason why mathematicians and physicists work with the equations instead.

Sciencemaster said:
I get that (2,0), for example, is effectively a linear combination of tangent vectors
No, it isn't, as I've already said. You need to read more carefully. You keep confusing yourself by clinging to a wrong analogy even though you've already been told it's wrong.

Sciencemaster said:
what exactly is a (1,1) tensor?
One way of looking at it is as a linear map from vectors to vectors, i.e., from a vector space to itself.
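Here is a minimal NumPy sketch of that viewpoint (components chosen arbitrarily): the same (1,1) array can be read as a linear map from vectors to vectors, or as something that takes one 1-form and one vector and returns a number.

```python
import numpy as np

T = np.array([[2.0, 0.0],
              [1.0, 3.0]])    # components T^a_b of a (1,1) tensor (arbitrary numbers)
v = np.array([1.0, -1.0])     # a vector v^b
w = np.array([0.5, 2.0])      # a 1-form w_a

# Read as a linear map from vectors to vectors: (Tv)^a = T^a_b v^b
Tv = np.einsum('ab,b->a', T, v)

# Fed one 1-form and one vector, it returns a number: w_a T^a_b v^b
scalar = np.einsum('a,ab,b->', w, T, v)

print(Tv, scalar)
```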
 
Sciencemaster said:
TL;DR Summary: I understand Einstein summation on its own, but when I try to connect tensor notation directly to linear algebra objects, I get tripped up by how vectors and tensors should be represented. How should I properly think about this connection?

I've been reviewing some introductory tensor stuff, and I've come to the realization that some of the things tensors do confuse me. For example, the notes I'm reading say that the invariant interval is both ##S=\eta_{\mu\nu} x^\mu x^\nu## and ##S=x^T \eta x##. Both are fine on their own: the former sums over the vector components with the proper signs from the metric, and the latter gives the same scalar value via matrix multiplication. My issue is with equating the two.

Are you familiar with the concept of dual vectors? I suspect that this may be the underlying issue you are struggling with.

Here's the way I interpret things. Be warned, though: I haven't gone back to the textbooks to double-check, so hopefully my memory isn't off too much.

##x^a## is a column vector in traditional notation, which would be just ##\vec{x}## in standard vector notation.
##x_a## is a row vector, the dual of a column vector (more on this below); in standard notation it would be ##x^T##. Now let's discuss the notion of dual vectors.

Standard notation implicitly assumes we have what is called an "inner product space": https://en.wikipedia.org/wiki/Inner_product_space.

It is probably clearest to start from the assumption that we have an inner product space, which lets us compute the scalar product of two vectors, rather than from other equivalent starting points. So that's what I'll do: from here on, we will assume that we do have an inner product space.

So, what is a dual space? There are proofs showing that, given a vector space V, the set of linear maps from V to the scalars is itself another, different vector space, called the dual space, and in the finite-dimensional case it has the same dimension as the original space. At this point we have no guarantee of any particular correspondence between vectors and their duals, but when we have an inner product space we do, so we can talk about the operation of "taking the dual" of a vector. What's a bit puzzling is that taking the dual once generates a different vector space from the original, but doing it twice recovers the original vector space. I'm afraid you'll have to dig into your linear algebra textbook for why this is true; it should be discussed in the section on the dual space. But hopefully it should now make sense why we call it a dual space: we can perform the operation once and generate a different kind of vector, a dual vector. In standard notation this duality operator is the transpose operator, which takes a column vector (one sort of vector space) and converts it to a row vector (another sort of vector space); if we take the transpose again, we go back to the original vector space.
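As a very concrete illustration of that last point, here is a small NumPy sketch that uses the ordinary transpose as the duality operation (assuming the usual Euclidean inner product and made-up components):

```python
import numpy as np

# A column vector, represented as a 4x1 array
x = np.array([[1.0], [2.0], [3.0], [4.0]])

# "Taking the dual" with the Euclidean inner product is just the transpose:
# a row vector, i.e. a linear map that eats column vectors and returns numbers
x_dual = x.T

# Applying the dual vector to another column vector gives a scalar (a 1x1 array here)
y = np.array([[1.0], [0.0], [-1.0], [2.0]])
print(x_dual @ y)

# Taking the dual twice recovers the original column vector
print(np.array_equal(x_dual.T, x))   # True
```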

You'll probably have to read up more on "dual vectors" in your linear algebra textbook; you most likely encountered them but probably didn't focus on them very much. The existence of an inner product means that there is a natural mapping from the original vector space V to the dual space V*. I won't attempt to explain why; you can read about it in your textbook. Just be aware that WITHOUT the existence of an inner product, there is not necessarily any natural mapping from vectors to their duals. In an attempt to be simple and useful (and frankly because I'm so used to making this assumption that I'd probably mess up a more general exposition), I'll just state that I am making this assumption.

So, my advice is to read up on dual vector spaces and inner product spaces; I hope that will fill in the missing pieces for you and show how all of this ties in with standard notation.

Standard notation essentially always assumes that the metric is the identity matrix, so you won't see any mention of the metric in it. Introducing the idea of a metric is an important generalization of the notion of an inner product: standard notation assumes that the squared length of a vector is the specific quadratic form ##x_1^2 + x_2^2 + \dots + x_n^2##, while metric notation allows it to be any quadratic form.
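A short NumPy comparison of the two cases (illustrative numbers only): with the identity metric the squared length is the usual sum of squares, while a more general metric gives a different quadratic form in the same components.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])

# Standard notation's implicit assumption: the metric is the identity,
# so the squared length is x_1^2 + x_2^2 + x_3^2
g_euclid = np.eye(3)
print(np.einsum('ab,a,b->', g_euclid, x, x))   # 14.0, same as x @ x

# A more general (symmetric) metric gives a different quadratic form
g_general = np.array([[2.0, 0.5, 0.0],
                      [0.5, 1.0, 0.0],
                      [0.0, 0.0, 3.0]])
print(np.einsum('ab,a,b->', g_general, x, x))
```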
 