# How to Tell Operations, Operators, Functionals and Representations Apart

All these concepts belong to the physicist’s toolbox. I read them quite often on our forum, and their usage is sometimes a bit confused. Physicists learn how to apply them, but occasionally I get the impression that the concepts behind them are forgotten. So what are they? Especially when it comes to the adjoint representation in quantum field theory, it is rarely stated which one is meant: that of the Lie group or that of the Lie algebra. Both have one, and they are not the same! But why do they share a name? And what does this have to do with operations? These concepts are used the way basic arithmetic is used; however, there is more to them than arithmetic. Because this terminology is so fundamental to physics, I have provided a list of textbooks at the end of this article. The books can serve as good companions on these subjects throughout a lifetime, and you will more than once pick them from the shelf to read some chapters or look up important definitions and theorems. ^{1)}

### Arithmetic Operations and the division by zero

Everybody knows the basic arithmetic operations addition, subtraction, multiplication and division. They connect two numbers and form a new one. Let’s consider the real numbers. We basically have two groups, one which we write with a ##+## sign and one with a ##\cdot ## sign. Group here means we have associativity, a neutral element and inverse elements. The multiplicative group does not contain ##0##, so the question of how to divide by zero doesn’t even arise at this level. Let’s write these two groups ##G_+## and ##G_*##. Now we all know how to mix these two operations.
\begin{equation}\label{AO-I}

\begin{aligned}

G_* \times G_+ &\longrightarrow G_+ \\

(r,p) &\longmapsto r\cdot p

\end{aligned}

\end{equation}

There is a certain property which tells us how ##G_*## has to handle the group structure of ##G_+##

$$

(r,p+q) = (r,p) + (r,q) = r\cdot p + r \cdot q

$$

We call it the distributive law. It is actually an example of an operation of ##G_*## on ##G_+\,##. Here we have to define how elements ##r \in G_*## deal with ##0 \in G_+##. The distributive law forces us to define ##r\cdot 0 = 0\,##. The interesting point is: there is still no need to even think about division by zero. It simply does not occur in the concept. Strictly speaking, even ##0 \cdot 0## doesn’t occur, and if we’re honest, it isn’t really needed at all. But once we allow it, ##0 \cdot 0 = 0## is forced by the distributive law, too. Of course this is a rather algebraic point of view, but don’t we call these operations on real numbers algebra?

Other functions like $$(b,a) \longmapsto \log_b a\; , \; (a,n) \longmapsto a^n\; , \; (n,r)\longmapsto \sqrt[n]{r}\; , \;(n,k)\longmapsto \binom{n}{k}$$

can also be viewed as operations with certain rules.

### Linear and Non-Linear Operators, Functionals and Hilbert spaces

Operators are the subject of functional analysis ^{2) }. The terms are a bit confusing:

*Operators are functions, functionals are certain operators, and functional analysis deals with operators on spaces, where the elements are functions.*

O.k. so far? Let’s get some order in this mess. We start with the linear case.

Linear means we consider vector spaces over fields. Historically and from the point of view of applications, those fields are in general ##\mathbb{R}## or ##\mathbb{C}##. These are the most important ones and those which are well suited to describe physical processes. The basic difference to linear algebra is that our vectors are functions themselves: continuous functions, differentiable functions, bounded functions or sometimes sequences. It’s obvious that those vector spaces are generally not finite dimensional. However, they are so important that we gave them names:

A real or complex vector space with a dot product ##\langle x,y \rangle## (scalar product, inner product) which induces a norm by ##||x|| = \sqrt{\langle x,x \rangle}## is called a **Pre-Hilbert space**, and a **Hilbert space** if it is complete, i.e. if all Cauchy sequences converge. The latter means that if the elements of a sequence get closer and closer to each other, then we can find a limit. This is the difference between the rational and the real numbers: we can find sequences of rationals which get closer and closer to ##\sqrt{2}##, but this point doesn’t belong to the rational numbers. In the real numbers, it does. Therefore we call the real numbers complete and the rationals not. If we drop the requirement of a dot product, and only want to consider a complete real or complex vector space with a norm, then we call it a **Banach space**. Important examples are the Lebesgue spaces ##L^p##, whose elements are functions on measurable subsets of ##\mathbb{R}^n## for which the ##p##-th power of the absolute value is integrable. ^{3) } ^{4)}

Now a **linear operator** ##L## is simply a linear map between vector spaces: ##L : V \rightarrow W##. The elements of ##V## on which ##L## is defined form the **domain** of ##L##, and the elements of ##W## which are hit by ##L## form the **range** or **image** of ##L## in its **codomain** ##W##. The only difference to linear algebra is that if we say operator, then we usually mean (infinite dimensional) Pre-Hilbert, Hilbert, or Banach spaces. ^{5) } Certain classes of functions as those mentioned above form those vector spaces. If ##W## happens to be ##\mathbb{R}## or ##\mathbb{C}## then our operator ##L## is often called a **functional**. So in contrast to operations, where we have a pair of input parameters, we only have functions between vector spaces. The generally infinite dimension of the vector spaces, however, can make quite a difference for the theorems and methods used in comparison to basic linear algebra. Also, the scalar fields considered here are the real or complex numbers, which is another difference to a purely algebraic point of view, where finite fields may occur as well.

The main development of *the algebra of the infinite* was achieved in the 19th and early 20th century: “And, however unbelievable this may seem to us, it took quite a long time until it has been clear to mathematicians, that what the algebraists write as ##(I-\lambda L)^{-1}## for a matrix ##L##, is essentially the same as the analysts represent by ##I+\lambda L + \lambda^2 L^2 + \ldots ## for a linear operator ##L##.” ^{6) }
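The identity in the quotation can be checked numerically for a small matrix. The following is only a minimal sketch: the ##2 \times 2## matrix `L`, the value `lam` and the number of summands are illustrative assumptions, chosen such that the series converges.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def neumann_sum(L, lam, terms):
    """Partial sum I + lam*L + lam^2*L^2 + ... with `terms` summands."""
    n = len(L)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in identity]
    power = [row[:] for row in identity]           # holds (lam*L)^k
    scaled = [[lam * x for x in row] for row in L]
    for _ in range(terms - 1):
        power = matmul(power, scaled)
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return total

# Illustrative choice: the eigenvalues of lam*L are 0.4 and -0.2,
# so the powers of lam*L shrink and the series converges.
L = [[0.0, 1.0], [2.0, 1.0]]
lam = 0.2
S = neumann_sum(L, lam, 60)

# If S approximates (I - lam*L)^{-1}, then (I - lam*L) * S is close to I.
I_minus = [[(1.0 if i == j else 0.0) - lam * L[i][j] for j in range(2)]
           for i in range(2)]
check = matmul(I_minus, S)
```

For a matrix whose scaled powers do not shrink, the partial sums diverge although ##(I-\lambda L)^{-1}## may still exist; the series is the analyst’s expression only within its range of convergence.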

In the non-linear case ^{7) }, operators are (non-linear) functions between normed (in general infinite dimensional) vector spaces (Banach spaces), and (non-linear) functionals are those which map to ##\mathbb{R}## or ##\mathbb{C}##.

Important operators in physics are (among others) the density operator, the position operator, the momentum operator or the Hamiltonian (Hamilton operator) or simply the differential ##\frac{d}{dx}##, the Volterra operator ##\int_0^x dt ## or the gradient ##\nabla##. Sometimes entire classes of operators are considered, e.g. compact operators, which map bounded sets to those whose closure is compact.
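To make the difference between an operator and a functional tangible, here is a minimal sketch in Python. Functions are modelled as callables, the Volterra operator is approximated by a midpoint rule, and the names `volterra` and `integral_01` as well as the step count are assumptions made for this illustration only.

```python
def volterra(f, steps=1000):
    """Operator: maps f to the function x -> integral of f from 0 to x."""
    def F(x):
        h = x / steps
        # Midpoint rule on [0, x].
        return sum(f((k + 0.5) * h) for k in range(steps)) * h
    return F

def integral_01(f, steps=1000):
    """Functional: maps f to the single number int_0^1 f(t) dt."""
    return volterra(f, steps)(1.0)

f = lambda t: t          # illustrative test functions
g = lambda t: t * t

Vf = volterra(f)         # the result of an operator is again a function
value = Vf(2.0)          # approximates int_0^2 t dt = 2

number = integral_01(g)  # the result of a functional is a scalar, about 1/3

# Linearity of the operator, tested at one sample point.
lhs = volterra(lambda t: 2 * f(t) + 3 * g(t))(1.5)
rhs = 2 * volterra(f)(1.5) + 3 * volterra(g)(1.5)
```

The operator returns a function which can be evaluated anywhere, whereas the functional collapses the whole function into one number.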

### Operations and Representations

### The adjoint case

Here we are back at genuine operations, where the elements of one object (e.g. groups, rings, fields or algebras) transform elements of another object (e.g. sets, vector spaces, modules). This means we have an operation

\begin{equation}\label{OR-I}

\begin{aligned}

G \times V &\longrightarrow V \\

(g,v) &\longmapsto g.v

\end{aligned}

\end{equation}

Common examples are matrix groups ##G## and vector spaces ##V## ^{8) } where the operation is the application of the transformation represented by the matrix. An orthogonal matrix (representing a rotation)

\begin{equation}\label{OR-II}

\begin{aligned}

g = \begin{bmatrix}

\cos \varphi & -\sin \varphi \\

\sin \varphi & \cos \varphi

\end{bmatrix}

\end{aligned}

\end{equation}

transforms a two dimensional vector into its version rotated by ##\varphi##. For the sake of simplicity, let’s consider this example, i.e. ##G=SO(2,\mathbb{R})## is the group of rotations in the plane ##V=\mathbb{R}^2##. The major reason why operations are considered is the question: Can we find out something either about ##G## or about ##V## by means of an operation, which we otherwise would have difficulties to examine? E.g. the operation of certain matrices on vector spaces is used to describe the spin of fermions. Thus, as we did in our first example, we always have to tell how the operating elements (##g \in G##) handle the structure of the set they operate on (##v \in V##). Of course only if there is a structure. In our example the operation respects the vector space structure

\begin{equation}\label{OR-III}

\begin{aligned}

g.(\lambda v + \mu w) = \lambda (g.v) + \mu (g.w)

\end{aligned}

\end{equation}

It means that it doesn’t matter whether we rotate the result of a linear combination of vectors, or form the linear combination after we have rotated the vectors. The operation respects linearity and is called a linear operation. We will see that the same is true if ##V## carries other structures, e.g. a Lie algebra structure. Usually we require the operation to have the properties inherited from the nature of the objects we deal with: linear operations on vector spaces, isometric operations on geometric objects, continuous (or differentiable) operations on continuous (or differentiable) functions, smooth operations on smooth manifolds and so on. But in any case we have to define how the objects are handled, especially their structure.
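The linearity of the rotation can be checked directly. This is a small sketch with arbitrarily chosen values for the angle, the scalars and the vectors; the helper names `rotate` and `combine` are made up for the illustration.

```python
import math

def rotate(phi, v):
    """Apply the rotation matrix g(phi) to the vector v = (x, y)."""
    x, y = v
    return (math.cos(phi) * x - math.sin(phi) * y,
            math.sin(phi) * x + math.cos(phi) * y)

def combine(lam, v, mu, w):
    """Linear combination lam*v + mu*w of two plane vectors."""
    return (lam * v[0] + mu * w[0], lam * v[1] + mu * w[1])

phi, lam, mu = 0.7, 2.0, -3.0       # illustrative values
v, w = (1.0, -2.0), (1.0, 1.0)

# Rotate the linear combination ...
lhs = rotate(phi, combine(lam, v, mu, w))
# ... or combine the rotated vectors.
rhs = combine(lam, rotate(phi, v), mu, rotate(phi, w))
```

Up to rounding errors, `lhs` and `rhs` agree for any choice of the five inputs.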

Important elements of an operation are orbits and stabilizers. An **orbit** of ##v\in V## is the set $$G.v = \{g.v \in V \,\vert \,g \in G\}$$

It is the set of all elements of ##V## which can be reached from ##v## by the operation. In case of groups ##G##, orbits are equivalence classes. If we can reach every point ##w## from any point ##v## by a certain group element, i.e. ##w \in G.v## for all ##v,w \in V\,,## then the operation is called **transitive**. Somehow corresponding to orbits are **stabilizers**, which consist of all elements of ##G## which leave a given element of ##V## unchanged, i.e. $$G_v = \{g \in G\,\vert \, g.v=v\}$$

In case of groups, if only the neutral element stabilizes elements, i.e. ##G_v=\{e\}## for all ##v\in V##, the operation is called **free**: ##G## operates freely on ##V##. If the neutral element ##e## is the only group element which fixes all ##v \in V## simultaneously, the operation is called **faithful** or **injective**. Free operations on non-empty sets are faithful.

Let us consider our example and take ##v=(1,-2)## and ##w=(1,1)##. Then the orbit of ##v## is a circle with radius ##\sqrt{5}## and ##w## cannot be reached from ##v## by rotation, which means our operation is not transitive. We also see, that an orbit doesn’t need to be a subspace. However, concentric circles are equivalence classes.
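A short numerical sketch of this example: we sample twelve points of the orbit of ##v## (the sample size is an arbitrary choice) and compare radii. Since rotations preserve the length of a vector, ##w## with length ##\sqrt{2}## cannot lie on the orbit of ##v##.

```python
import math

def rotate(phi, v):
    """Apply the rotation by the angle phi to the vector v = (x, y)."""
    x, y = v
    return (math.cos(phi) * x - math.sin(phi) * y,
            math.sin(phi) * x + math.cos(phi) * y)

v, w = (1.0, -2.0), (1.0, 1.0)

# Sample of the orbit G.v: rotations of v by multiples of 30 degrees.
orbit_sample = [rotate(2 * math.pi * k / 12, v) for k in range(12)]
radii = [math.hypot(x, y) for x, y in orbit_sample]

radius_v = math.sqrt(5.0)       # common radius of all orbit points
radius_w = math.hypot(*w)       # sqrt(2), so w lies on another circle

# A full turn leaves v unchanged (up to rounding).
full_turn = rotate(2 * math.pi, v)
```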

For the stabilizer, which is always a subgroup, we have to be careful with the definition of ##G##. If we restrict ourselves to angles ##\varphi \in [0,2 \pi)##, then only the rotation by ##0## stabilizes a vector ##v \neq 0##, so the operation is free on ##V \setminus \{0\}## (the origin itself is fixed by every rotation). But if we allow any real number as an angle, then ##G_v= 2\pi \mathbb{Z}## for ##v \neq 0##.

If we generally consider a group ##G## which operates on a vector space ##V##, we usually require the property (in this case being a group) to be respected. This means ##e.v=v## and ##g.(h.v) =(gh).v## and ##g^{-1}.g.v=v##. Now these properties can be summarized by saying

\begin{equation}\label{OR-IV}

\begin{aligned}

\varphi \, : \, G &\longrightarrow GL(V) \\

\varphi \, : \, g &\longmapsto (v \longmapsto g.v)

\end{aligned}

\end{equation}

is a group homomorphism, where ##GL(V)## is the general linear group of ##V##, the group of all regular (invertible) linear maps ##V \longrightarrow V##. Homomorphism means that ##\varphi## maps group to group and ##\varphi(g\cdot h)=\varphi(g) \cdot \varphi(h)##. In this case ##V## is called **representation space** and ##\varphi## a **representation** of ##G##. Thus a linear operation and a representation are the same thing: it’s only a different way to look at it. One emphasizes the group side of it, the other the vector space side. ^{9) }^{10)}
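For the rotation example the homomorphism property can be tested numerically. In this sketch the group elements are parametrized by their angle, so the group multiplication becomes the addition of angles; this parametrization and the sample angles are assumptions made for the illustration.

```python
import math

def rot(phi):
    """The matrix which the representation assigns to the rotation by phi."""
    return [[math.cos(phi), -math.sin(phi)],
            [math.sin(phi),  math.cos(phi)]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

a, b = 0.4, 1.1                                # illustrative angles

product_in_G = rot(a + b)                      # image of the product g*h
product_in_GL = matmul(rot(a), rot(b))         # product of the images

neutral = rot(0.0)                             # image of the neutral element
inverse_check = matmul(rot(a), rot(-a))        # should be the identity
```

The entries of `product_in_G` and `product_in_GL` agree up to rounding, which is just the addition theorem for sine and cosine.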

Let me finish with three important examples of representations which play a crucial role in the standard model of particle physics. To this end, let ##G## be a matrix group, e.g. a Lie group like the unitary group, and ##V## a vector space on which this group operates, e.g. ##\mathbb{C}^n##.

The first example is a pure group operation on itself

\begin{equation}\label{V}

\begin{aligned}

G \times G &\longrightarrow G\\

(g,h) &\longmapsto g.h = ghg^{-1}

\end{aligned}

\end{equation}

Here ##G## operates not on a vector space, but on itself instead. A group element ##g## defines a bijective map from ##G## to ##G##, which is called **conjugation** or **inner automorphism**. If there is a Lie algebra ##\mathfrak{g}## associated with ##G##, as is the case for matrix groups (for the unitary group it consists of the skew-Hermitian matrices), we get from this conjugation a naturally induced map ##(g\in G\, , \,X\in \mathfrak{g})##

\begin{equation}\label{VI}

\begin{aligned}

\operatorname{Ad}\, : \,G &\longrightarrow GL(\mathfrak{g})\\

\operatorname{Ad}(g)(X)&=gXg^{-1}

\end{aligned}

\end{equation}

which is called **adjoint representation** of ##G##.^{11)}
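As a hedged illustration, the following sketch uses the rotation group ##SO(3)## instead of the unitary group, because its Lie algebra consists of real skew-symmetric matrices and the inverse of a group element is simply its transpose. The matrices `g`, `X`, `Y` are arbitrary sample choices. The sketch checks that ##\operatorname{Ad}(g)## maps the Lie algebra into itself and respects the commutator.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def bracket(x, y):
    """Commutator xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[p - q for p, q in zip(r, s)] for r, s in zip(xy, yx)]

def Ad(g, x):
    """g x g^{-1}; for an orthogonal g the inverse is the transpose."""
    return matmul(matmul(g, x), transpose(g))

c, s = math.cos(0.9), math.sin(0.9)
g = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]    # rotation about the z-axis
X = [[0.0, -1.0, 2.0], [1.0, 0.0, -3.0], [-2.0, 3.0, 0.0]]   # skew-symmetric
Y = [[0.0, 0.5, 0.0], [-0.5, 0.0, 1.0], [0.0, -1.0, 0.0]]    # skew-symmetric

AdX = Ad(g, X)                        # is again skew-symmetric
lhs = Ad(g, bracket(X, Y))            # Ad(g)([X, Y])
rhs = bracket(Ad(g, X), Ad(g, Y))     # [Ad(g)(X), Ad(g)(Y)]
```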

Its representation space is now the Lie algebra ##\mathfrak{g}\,##, which is the tangent space of ##G## at the neutral element, the identity matrix, and as such a vector space. It further means that ##\operatorname{Ad}## is a group homomorphism of matrix groups (Lie groups). As a matrix group, ##GL(\mathfrak{g})## also has an associated Lie algebra, called the general linear Lie algebra ##\mathfrak{gl(g)}\,##, which is basically nothing else than all square (not necessarily regular) matrices of size ##\operatorname{dim}\mathfrak{g}\,##. The Lie algebra multiplication is given by the **commutator**

$$

[X,Y] = X \cdot Y - Y \cdot X

$$

The left multiplication in ##\mathfrak{g}## gives rise to a Lie algebra operation

\begin{equation}\label{VII}

\begin{aligned}

\varphi\, : \,\mathfrak{g} \times \mathfrak{g} & \longrightarrow \mathfrak{g}\\

(X,Y) &\longmapsto [X,Y]

\end{aligned}

\end{equation}

Remember that we said an operation is required to respect the structure. Here it means that ##X \longmapsto \varphi(X) := \varphi(X,\,\cdot\,)## is a Lie algebra homomorphism

$$

\varphi([X,Y])(Z) = [\varphi(X),\varphi(Y)](Z)=\varphi(X)\varphi(Y)(Z)-\varphi(Y)\varphi(X)(Z)

$$

which is nothing else than the **Jacobi identity**. Furthermore as an operation is always a representation, we get a representation

\begin{equation}\label{VIII}

\begin{aligned}

\operatorname{ad}\, : \,\mathfrak{g}&\longrightarrow \mathfrak{gl(g)}\\

\operatorname{ad}(X)(Y)& = \varphi(X,Y) = [X,Y]

\end{aligned}

\end{equation}

which is called **adjoint representation** of ##\mathfrak{g}\,##.^{12)} It’s simply the left-multiplication in the Lie algebra. The homomorphism property, which is the Jacobi identity, is also the defining property of a derivation:

$$

\operatorname{ad}(X)([Y,Z])= [\operatorname{ad}(X)(Y),Z]+[Y,\operatorname{ad}(X)(Z)]

$$

Therefore each ##\operatorname{ad}(X)## is also called an **inner derivation** of ##\mathfrak{g}\,##, in the same way as a conjugation is called an inner automorphism of ##G##. Both adjoint representations, the one of ##G## and the one of ##\mathfrak{g}\,##, are related by the following formula ##(X \in \mathfrak{g})##

\begin{equation}\label{IX}

\begin{aligned}

\operatorname{Ad}(\exp(X)) = \exp(\operatorname{ad}(X))

\end{aligned}

\end{equation}

The defining property for a derivation is just the Leibniz rule of differentiation (product rule). Also closely related are the Lie derivative and the Levi-Civita connection. In the end all of them are just versions of the Leibniz rule we learned at school.
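The three statements of this section about ##\operatorname{ad}##, namely the homomorphism property, the derivation property and the relation between ##\operatorname{Ad}## and ##\operatorname{ad}##, can be tested numerically. The following is a sketch under assumptions: the ##2\times 2## sample matrices are arbitrary, and the exponential is approximated by a truncated power series with 40 terms.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def bracket(x, y):
    """Commutator [x, y] = xy - yx."""
    return add(matmul(x, y), scale(-1.0, matmul(y, x)))

def ad(x):
    """ad(x) as a linear map y -> [x, y]."""
    return lambda y: bracket(x, y)

def exp_series(step, start, terms=40):
    """Partial sum of sum_k step^k(start) / k! for a linear map `step`."""
    total, term = start, start
    for k in range(1, terms):
        term = scale(1.0 / k, step(term))
        total = add(total, term)
    return total

def max_diff(a, b):
    return max(abs(x - y) for r, s in zip(a, b) for x, y in zip(r, s))

I2 = [[1.0, 0.0], [0.0, 1.0]]
X = [[0.0, 1.0], [-0.5, 0.3]]        # arbitrary sample matrices
Y = [[1.0, 2.0], [0.0, -1.0]]
Z = [[0.5, 0.0], [1.0, -0.5]]

# Homomorphism: ad([X,Y])(Z) = ad(X)ad(Y)(Z) - ad(Y)ad(X)(Z)
hom_gap = max_diff(ad(bracket(X, Y))(Z),
                   add(ad(X)(ad(Y)(Z)), scale(-1.0, ad(Y)(ad(X)(Z)))))

# Derivation: ad(X)([Y,Z]) = [ad(X)(Y), Z] + [Y, ad(X)(Z)]
der_gap = max_diff(ad(X)(bracket(Y, Z)),
                   add(bracket(ad(X)(Y), Z), bracket(Y, ad(X)(Z))))

# Ad(exp X)(Y) = exp(X) Y exp(-X) compared with exp(ad X)(Y)
exp_X = exp_series(lambda m: matmul(X, m), I2)
exp_minus_X = exp_series(lambda m: matmul(scale(-1.0, X), m), I2)
Ad_side = matmul(matmul(exp_X, Y), exp_minus_X)
ad_side = exp_series(ad(X), Y)
exp_gap = max_diff(Ad_side, ad_side)
```

All three gaps are of the order of rounding errors. The first two are the Jacobi identity in two different readings; the third one is formula ##\operatorname{Ad}(\exp(X)) = \exp(\operatorname{ad}(X))## applied to ##Y##.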

### Summary

- Arithmetic operations: ##+\; , \;-\; , \;\cdot \; , \; :##
- (Linear) Operators: ##L : (V,||.||_V) \longrightarrow (W,||.||_W)##
- (Linear) Functionals: ##L : (V,||.||_V) \longrightarrow (\mathbb{R},|.|_\mathbb{R}) ## or ##(\mathbb{C},|.|_\mathbb{C}) ##
- Operation in general: ##Operator\,.\,Object_{old} = Object_{new}##
- Group Operations: ##G \times V \longrightarrow V##
- Group Representation: ##G \longrightarrow GL(V)##
- Conjugation in groups: ##g.h = ghg^{-1}##
- Adjoint representation for Lie groups: ##\operatorname{Ad} g(X) = gXg^{-1}##
- Adjoint representation for Lie algebras: ##\operatorname{ad} X(Y) = [X,Y]##
- ##\operatorname{Ad}\circ \exp = \exp \circ \operatorname{ad}##

Fresh, thank you for this work, which I believe fits in your more general – and much welcomed – effort to familiarize interested physics students with the mathematics underpinnings of their field.

I feel qualified to comment only on the section titled "Linear and Non-Linear Operators, Functionals and Hilbert spaces". Here are some comments, mostly minor. Quotes from your text are in italic.

1.

*If we drop the requirement of a dot product, and only want to consider a complete real or complex vector space with a norm, then we call it Banach space. Another important example are Lebesgue spaces, which deal with measurable subsets of ##R^n##.*

I find writing *another* here a bit confusing. People unfamiliar with ##L^p##-spaces might think that Lebesgue spaces are something else than just particular cases of Banach spaces (##1 \le p \le \infty##) or Hilbert spaces (##p = 2##).

2.

*Now a linear operator ##L## is simply a linear map between vector spaces: ##L:V \to W##. The elements of ##V## on which ##L## is defined is the domain of ##L## and its image, i.e. the elements of ##W## which are hit by ##L## is the codomain or range of ##L##.*

Since at the end of this section you give examples of bounded as well as unbounded operators (and since these classes of operators require substantially different techniques for their analysis, as you of course know), I would have preferred to write ##L : D(L) \subseteq V \to W## where ##D(L)## is the domain of ##L##, with ##D(L) = V## in case ##L## is bounded (e.g. the Volterra operator).

In this notation, for me the codomain of ##L## would be ##W## itself, while the range of ##L## would be ##R(L) := LD(L)##.

3.

*In the non-linear case, operators are (non-linear) functions between normed (in general infinite dimensional) vector spaces (Banach spaces), and (non-linear) functionals are those which map to ##\mathbb{R}## or ##\mathbb{C}##. Sometimes entire classes of operators are considered, e.g. compact operators, which map bounded sets to those whose closure is compact.*

The occurrence of the last line in a paragraph on nonlinear operators unjustly suggests that compact operators are only considered in a nonlinear context.

4. I think if I were to be unfamiliar with functional analysis, this presentation is very brief. The text could benefit from some specific references that you like, preferably per topic of discussion. This way, it really becomes an initiation for the curious student.

Hopefully you find these comments more useful than irritating.

Thanks @Krylov for the careful reading.

ad 1) Changed phrasing a bit.

ad 2) Well, "codomain" was easy to correct, but the rest was a bit tricky in order to keep it simple and not to go into details. Not sure if it's better now though. It is always a big temptation to me to say more and more about certain subjects, historically as well as technically. All the doors I have knocked on and didn't open hide so much mathematical beauty, that it's hard not to enter.

ad 3) Swapped to the next paragraph.

ad 4) I added some sources, but I admit that my library doesn't contain a lot of functional analysis, resp. in the wrong language. E.g. I have a lovely book by Courant and Hilbert (1924), Mathematical Methods of Physics I, but this is probably not a book to recommend nowadays. If you know of a "must read", I would appreciate to add it to the list.

So thank you, once more.

Typo? Should it be "functionals are certain operators" ?

A case of lost in translation. Being a list I thought a repetition of the verb wouldn't be necessary as long as it is still valid from the first part and until a new one is needed. However, I'm not sure how the English grammar deals with these cases and I applied mine. I'll add the repetition.

Heh, I'm not sure either — although English is my first language. :blushing:

To my ear, there's a difference between everyday English, and highfalutin English. :oldbiggrin:

Actually, @fresh_42, you're correct in your understanding of that particular convention for listing. However 1) it is a convention more common to UK and continental English than to U.S. English; and 2) it can become somewhat awkward if noun phrases are involved that require extra parsing by the reader. Also 3) listing of any sort is helped by what copyeditors call the "serial comma," which is a comma used in any list of three or more members; it comes immediately before "and" and the final member. This helps the reader understand that the last two members of the list aren't joined by that "and" into a separate category, but rather, are both members of the overall list. Thus rather than "X, Y and Z", which subtly implies that Y and Z are somewhat closer to each other than to X, better to write "X, Y, and Z".

So with all that in mind, you could indeed write your list like this –

– and it would just barely skate by as grammatically correct. However, "functionals certain operators" will likely cause many readers to hesitate and re-parse in order to recognize "certain operators" as a complete noun phrase; and even then some may not understand what you mean by "certain" since the word has potentially more than one meaning. Moreover, the last element in the list is a much longer phrase than the first two, which may cause additional hesitation and re-reading. So better in this case to go with @strangerep's suggestion.

Also, to introduce your list, better in this case to use a colon rather than a period in the preceding sentence – a very minor point, but helpful to alert the reader to what's coming; e.g.

I imagine that's more than enough pedantic copyediting for today . . .

Hello Fresh, thank you for your reply, and you are welcome.

I don't know if this would violate one rule or the other, but I personally would not object to foreign language references, as long as there are also references in English. It just so happens that some very good mathematics books are written in German or Russian, and not all of them have been translated.

Speaking about this, a well-known German author whose works have been translated into English is Eberhard Zeidler. His *Nonlinear Functional Analysis and its Applications* four-part series is no less than encyclopedic, but individual chapters are remarkably readable. His two-part *Applied Functional Analysis* is a good introduction that contains linear as well as non-linear material.

There is the classical book by Reed and Simon, *Functional Analysis*. @micromass has its front cover on his shirt.

There are many more personal favorites, but these may be interesting specifically for physics students.

@UsableThought Thanks, for the trip to English grammar. Yes, the Oxford comma is a discussion on its own. In German the rule is a different one: Commas only in listings without conjunctions like and, or if the conjunction is the start of a complete sentence with subject and predicate. The and substitutes the comma if used as separator only.

@Krylov Thanks for the sources. I added them to the list, although Zeidler was a bit of work to do with his many volumes. On the other hand I found that especially Zeidler is a very valuable source. This way the list got a bit long, but in comparison to the importance of the subject, a recommendable list of textbooks should be a good thing. And if some decided to use the links I integrated to buy a book, then it's even good for us. :wink:

Can we give a good *mathematical* answer to the frequently asked question "What is the physical significance of operators?" (e.g. https://www.physicsforums.com/threads/what-are-operators.919653/)?

The answers to that question in the physics section say to study more physics. Since we are in the general math section, perhaps we can take a more abstract view. We are faced with the mathematical problem of modelling a phenomenon that is probabilistic, but whose probabilistic nature is, in some sense, deterministic. Does this lead to some natural choice of the appropriate mathematical tools?

I would say: "yes". As soon as the state of a system (e.g. the wave function) is not determined by finitely many numbers, infinite-dimensional function spaces naturally occur. Since, unlike ##\mathbb{R}^n## or ##\mathbb{C}^n##, these function spaces generally have non-equivalent norm topologies, the study of their (functional) analytic properties becomes non-trivial and linear algebra alone no longer suffices.