# Understanding Superposition

### Introduction

One should first and foremost understand that superposition has both a mathematical and a physical meaning, or rather that it originates as a mathematical property of the system description which has implications for the physics. The mathematical concepts are fairly straightforward aspects of linear algebra, which will be briefly reviewed first. To understand the physical implications of the mathematics we need to understand how representations in the mathematics get translated into operationally meaningful statements about the physical phenomena we observe. So the next step is to consider if, where, and how mathematical aspects of superposition connect to the logical structure of our empirically verifiable statements about physical systems. When this is understood, one has a solid foundation to stand upon while contemplating the physical and philosophical meaning of this idea of superposition.

### Vectors and Linearity

So, what is mathematical superposition? It is simply the linearity of the system description, which is to say the system is described by some abstract form of vector in a vector space. Specifically, these vectors are, in the abstract, mathematical objects which may be added together and multiplied by scalars to form other vectors. Taken together, addition and scalar multiplication amount to the action of taking linear combinations. Given [itex]X[/itex] and [itex]Y[/itex] we also have, as a defined object, [itex]W = aX+bY[/itex] for real (or complex) numbers [itex]a[/itex] and [itex]b[/itex].

A linear combination of two vectors is what we mean by referring to a *superposition* of them. That vector [itex]W[/itex] is a superposition of vectors [itex]X[/itex] and [itex]Y[/itex] says simply that [itex]W[/itex] is a linear combination of [itex]X[/itex] and [itex]Y[/itex]. Note that where this is true and the multipliers are not zero we can likewise solve this linear equation for [itex]X[/itex] or for [itex]Y[/itex], so the more balanced statement is that each of the three is a superposition of the other two. For example, if [itex]W=aX+bY[/itex] and [itex]a\ne 0[/itex] then [itex]X = \tfrac{1}{a} W - \tfrac{b}{a} Y[/itex].
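The symmetry of this relationship can be checked numerically. The following is a minimal sketch, assuming vectors are stored as plain tuples; the helper name `lincomb` and the particular values of `a`, `b`, `X`, and `Y` are illustrative choices, not anything fixed by the discussion above.

```python
def lincomb(a, X, b, Y):
    """Return the linear combination a*X + b*Y, taken componentwise."""
    return tuple(a * x + b * y for x, y in zip(X, Y))

# Illustrative vectors and multipliers (assumed values).
X = (1.0, 0.0)
Y = (0.0, 1.0)
a, b = 2.0, -3.0

# W is a superposition of X and Y.
W = lincomb(a, X, b, Y)

# Because a != 0 we can solve for X: X = (1/a) W - (b/a) Y,
# so X is equally a superposition of W and Y.
X_recovered = lincomb(1 / a, W, -b / a, Y)
```

Here `X_recovered` reproduces `X`, which is all that the "more balanced statement" claims.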

### Classical Superposition of Fields

Now in this abstract setting we typically do not use the term “superposition”. This term is more often applied to spaces of scalar and vector valued functions, or *fields*, and in particular is used in physical applications. The crucial point is that when the dynamic equations are linear, linear combinations of solutions are also solutions. We thus express general solutions as linear combinations of, or *superpositions* of, certain standard solutions.

This is true specifically in classical electromagnetism. When we consider the electromagnetic field caused by many sources (charges and currents), since the field equations are linear, we can simply determine the field which results from each component source and add the results. We then say the electromagnetic field at a given point is a *superposition* of the fields due to the many sources.
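As a rough illustration, here is a sketch restricted to the simplest case: static point charges in a plane, with the field of each given by Coulomb's law. The function names, the dipole configuration, and the field point are all assumptions made for the example.

```python
import math

K = 8.9875517923e9  # Coulomb constant in N m^2 / C^2

def point_charge_field(q, source, point):
    """Electric field (Ex, Ey) at `point` due to a point charge q at `source`."""
    dx, dy = point[0] - source[0], point[1] - source[1]
    r = math.hypot(dx, dy)
    scale = K * q / r**3
    return (scale * dx, scale * dy)

def total_field(charges, point):
    """Superpose the fields of all (q, position) pairs by simple addition."""
    ex = ey = 0.0
    for q, pos in charges:
        fx, fy = point_charge_field(q, pos, point)
        ex += fx
        ey += fy
    return (ex, ey)

# Hypothetical configuration: equal and opposite charges on the x-axis.
charges = [(1e-9, (-0.5, 0.0)), (-1e-9, (0.5, 0.0))]
E = total_field(charges, (0.0, 1.0))
```

On the perpendicular bisector of this pair the vertical components cancel and only a horizontal component survives, which is the familiar dipole result obtained purely by adding the two single-charge fields.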

That is classical superposition. In quantum superposition we work within an abstract vector space, the Hilbert space, from the very beginning. To understand the meaning of superposition we must first have a relatively clear understanding of the operational meaning of the Hilbert space vectors.

### Quantum vs Classical Logic

Classical logic is concretely expressed using the algebra of sets. Our ontological model of classical reality is a set of states of reality. Propositions about outcomes of deterministic experiments may then be reduced to the sets of states of reality which cause the respective outcomes. Measurements reduce our set of possible states to the subset of those states consistent with the observations. The operations on sets mirror our Boolean algebra of logical operations generated by “**and**”ing, “**or**”ing and **negating** our events.

The skeleton of this algebraic structure is the lattice of subsets for the set of possible states of our system. If you had, say, a three state system with states **a**, **b**, or **c** then our “state space” is the set {**a**, **b**, **c**} and all the definite statements we can make about the system resolve as subsets. The most specific statements (which are not a contradiction) are the singleton sets {**a**}, {**b**}, {**c**}.
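The three state example is small enough to enumerate. This sketch, using only the Python standard library, builds the lattice of subsets and shows the set operations standing in for the logical ones; the events `A` and `B` are arbitrary choices for illustration.

```python
from itertools import combinations

def powerset(states):
    """All subsets of `states` as frozensets: the lattice of definite statements."""
    items = sorted(states)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

states = frozenset({"a", "b", "c"})
lattice = powerset(states)

# Two illustrative events (assumed for the example).
A = frozenset({"a", "b"})
B = frozenset({"b", "c"})

a_and_b = A & B        # "and" is intersection
a_or_b = A | B         # "or" is union
not_a = states - A     # negation is the complement
a_implies_a_or_b = A <= a_or_b   # implication is subset inclusion
```

The lattice has eight elements, running from the empty set (the contradiction) up to the full state space (the statement that is always true).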

**Stochastic Descriptions:** We may further extend our descriptions of physical systems by introducing probabilistic statements. When we account for uncertainties in our measurements, or of our knowledge of future evolution of the system we quantify that uncertainty with probabilistic statements. In the classical setting we assign a probability distribution over the lattice of events which we insist obey certain “natural” properties, namely the additive property:

[tex] P(A)+P(B)=P(A\cup B) + P(A\cap B)[/tex]

This is the book-keeping on probabilities based on the assumption that the likelihood of a deterministic measurement of any system can be calculated by summing (or integrating) a distribution function over the space of implied states. In short, probabilities form a **measure** over the state space.
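The additive property can be verified directly on the three state example. In this sketch the weights assigned to **a**, **b**, and **c** are made-up numbers, and exact fractions are used so the two sides can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical distribution over the three states (illustrative weights).
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}

def prob(event):
    """Probability of an event (a set of states): sum the weights it contains."""
    return sum((weights[s] for s in event), Fraction(0))

A = {"a", "b"}
B = {"b", "c"}

lhs = prob(A) + prob(B)            # P(A) + P(B)
rhs = prob(A | B) + prob(A & B)    # P(A or B) + P(A and B)
```

Both sides come to 4/3 here: the state **b** is counted twice on the left, once in each event, and twice on the right, once in the union and once in the intersection.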

However an important change of paradigm has occurred in this extension to probabilistic descriptions. When we present a specific probability distribution as our system description we are no longer pointing out a specific state of reality. If I give you a probability density for the position and momentum of a classical particle over a given range, the particle isn’t “spread out over space”. It is, in the classical supposition, in some exact state. So the probabilistic description can’t be referring to just that one particle. It rather is referring to a class of possible particles, or rather a mode of production of such particles such that, over many repeated instances we should see the probabilities manifest as predicted frequencies in the limit of large samples.

However we note that this stochastic description envelops the prior logical description. We recover the usual logical structure by restricting our descriptions to statements of absolute certainty. We equate “[itex]A[/itex] is True” with [itex]P(A)=1[/itex] and “[itex]A[/itex] is False” with [itex]P(A)=0[/itex], and play the exact same games of logical operations… but behind the scenes we are still working with classes of systems and their modes of production.

**Quantum Logic**

In quantum theory we replace this set of states with a complex Hilbert space whose dimension equals the number of states. In our three state example we would then have a three dimensional complex space, with one axis for each of **a**, **b**, and **c**. The lattice structure for the quantum logic is the lattice of *subspaces*. The most specific (non-contradictory) statements about the system then correspond to the one-dimensional subspaces, such as these axes. We may “**or**” such logical statements by taking the minimal subspace containing them. We can specify such a subspace with a set of vectors for which that subspace is the span. This is the role of the Hilbert-space vector. It projectively represents the state of (our sharp knowledge about) the system.

The logic of implication which classically is modeled by the subset inclusion operation is now modeled by the *subspace* inclusion operation. That “A (definitely) implies B” is expressed by the parallel relationship that the space associated with A is a subspace of the one associated with B. We can also “and” statements by taking, as with sets, the intersection. The origin point a.k.a. zero vector then corresponds to the contradictory statement, the logical equivalent to “2=3”.

This alone defines *projective* quantum mechanics. However, the operation of logical negation is problematic since, for example, in the three dimensional case there are infinitely many two dimensional subspaces which intersect a given axis only at the origin (and thus whose “**and**” with that axis results in a null statement). Resolving this is why we use a Hilbert space, with its metric structure defined by the inner product, rather than a simple linear space. The metric defines which pairs of vectors are orthogonal and thence which subspaces are orthogonal. The negation of the event associated with a given subspace is the orthogonal subspace.

The result is that if a pair of one dimensional subspaces are orthogonal, they are mutually exclusive both positively and negatively: A implies not B and B implies not A. If on the other hand they are parallel then “they” are a single subspace and the statements are equivalent: A implies B and B implies A.

But there are the in-between cases where they are neither parallel nor perpendicular. For systems where A is assured and not-A is forbidden, the event B may or may not occur, and the probability of this is determined by the angle [itex]\theta[/itex] between the two subspaces: [itex]P(B| A \text{ is assured}) = \cos^2(\theta)[/itex]. So to interpret these in-between relationships we again must extend from a language of certainty to a stochastic description. We must recognize that these descriptions are not of the system as it is but of our knowledge of how the system will or might behave when measured.
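A small numerical sketch of this rule follows. It assumes the standard generalization to complex vectors, in which the squared cosine becomes the squared magnitude of the normalized inner product; the function names and the 60 degree example are illustrative.

```python
import math

def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def prob_b_given_a(a, b):
    """P(B | A assured) = |<b|a>|^2 / (<a|a><b|b>).

    For real vectors this is cos^2 of the angle between them.
    """
    return abs(inner(b, a)) ** 2 / (inner(a, a).real * inner(b, b).real)

theta = math.radians(60)
a = (1.0, 0.0)
b = (math.cos(theta), math.sin(theta))

p_between = prob_b_given_a(a, b)               # neither parallel nor orthogonal
p_parallel = prob_b_given_a(a, a)              # same subspace
p_orthogonal = prob_b_given_a(a, (0.0, 1.0))   # mutually exclusive

# Rescaling b by any nonzero complex number leaves the subspace,
# and therefore the probability, unchanged.
b_scaled = tuple((2 - 3j) * x for x in b)
p_scaled = prob_b_given_a(a, b_scaled)
```

The last check reflects the projective character of the description: only the one-dimensional subspace matters, not the particular vector chosen to span it.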

Now classical logic is embedded in the quantum case in that orthogonal vectors (or rather subspaces) correspond to mutually exclusive sharp outcomes, and so an orthonormal basis, say {[itex]\lvert a \rangle, \lvert b \rangle, \lvert c \rangle[/itex]}, corresponds to a *complete* measurement with three corresponding outcomes. The system has three “states” when we consider the classical logic of the three possible outcomes of this measurement. You can think of this as a *classical frame*, and so long as we’re only talking about these three possibilities and their combinations we can safely apply classical logic just as if it were a *set of states*.

But the weirdness of quantum theory comes in when we arbitrarily rotate our basis and have a wholly distinct classical frame, {[itex]\lvert a' \rangle, \lvert b' \rangle, \lvert c' \rangle[/itex]}. Any action making a determination in one such frame precludes our simultaneously making an exact determination in the other frame. Making the distinct measurements in succession will yield different behaviors depending on the causal sequence. The acts of observation do not commute and we can’t make legitimate statements about circumstances in both frames at the same time.
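The failure to commute can be seen in the smallest possible case. This sketch assumes the usual representation of a sharp outcome by the projector onto its subspace, and for brevity uses a two dimensional space with the second frame rotated by 45 degrees rather than the three dimensional frames named above; the helper names are illustrative.

```python
import math

def projector(v):
    """Matrix of the projector |v><v| for a normalized vector v."""
    return [[vi * complex(vj).conjugate() for vj in v] for vi in v]

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

s = 1 / math.sqrt(2)
P = projector((1, 0))   # an outcome in the first frame
Q = projector((s, s))   # an outcome in a frame rotated by 45 degrees

PQ = matmul(P, Q)       # apply Q first, then P
QP = matmul(Q, P)       # apply P first, then Q
commutator = [[PQ[i][j] - QP[i][j] for j in range(2)] for i in range(2)]
```

The commutator is nonzero, so the order of the two determinations matters. Projectors belonging to the same orthonormal frame, by contrast, always commute, which is why classical logic is safe within a single frame.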

We can however relate one frame to the other. Definite statements in one frame will resolve as probabilistic statements in the other. In the vector space representation the representative vectors in one basis will be linear combinations of, or *superpositions of*, the vectors in the other basis. To pin this down let us consider a concrete example.

#### The Double-Slit Experiment

Now picture in your mind the classic double slit experiment where an electron is emitted from some source, passes (somehow) through a barrier containing two narrow slits, and is then detected somewhere beyond on a fluorescent screen, where we will see a point flash of light caused by the electron.

Let’s call one slit the X slit and the other slit the Y slit. If we think of the electron classically then it must either pass through the X slit or the Y slit on the way to the screen. You can then describe the logical structure in terms of the possible subsets of this set of possibilities: {X,Y}.

Those would be:

- {X,Y} corresponding to “Either the electron passed through X or through Y.”,
- {X} corresponding to “The electron passed through X.”,
- {Y} corresponding to “The electron passed through Y.”, and
- {} corresponding to “The electron passed through neither” (which never happens since we are only considering electrons we’ve later detected at the screen).

Quantum mechanically we would instead draw an X axis indicating “It went through X” and a perpendicular Y axis indicating “It went through Y”. The plane containing these two axes is the space they span, the X-Y plane, which represents the “Either X or Y” case. The “Neither” case is expressed by the zero dimensional space {0}.

But, looking at the plane, there’s no particular natural choice for the perpendicular axes. You could choose a slightly or even significantly rotated pair of axes which would be just as meaningful.

(It gets a bit more complicated in that we actually work within a complex space where we can take both real and imaginary superpositions and where we are considering 1-complex dimensional subspaces.)

Every vector in this space defines a 1-dimensional subspace, a line through the origin, which likewise represents a sharp system description. Such a system mode is a superposition of the X and Y modes. Call one rotated pair of perpendicular directions U and V. One may observe the particle in a U mode or a V mode, each as specific a description as saying either “It went through X” or “It went through Y”. However, in either case one can neither say “It went through X” nor “It went through Y” although, by being in the spanning plane, it still (sort of) makes sense to say “It either went through X or Y”.

Beyond simply stating “It is a superposition of the two cases” it is not really proper given a U or V observation to describe the electron’s behavior in terms of passage through X vs through Y. This is where we realize that we aren’t talking about a particle in the classical sense anymore. And this alternative resolution of the system, say U vs V, is not just a mathematical abstraction. You can actually build some apparatus with magnets and charged plates which will separate electrons into beams flowing only into U or V detectors, just as you can create a magnetic lens which will bring the electrons which pass through X to a given detector and those which pass through Y to a second separate detector.

There are two different frames of logic about electrons in this situation, each equally valid and neither more fundamental or more natural or more real than the other. It is just as correct to say the X and Y outcomes are superpositions of the U and V ones as it is to say that the U and V outcomes are superpositions of the X and Y ones. We have a relativity of the classical logic we can embed within the quantum logic.
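As a worked illustration, suppose (purely as a simplifying assumption that ignores the complex phases noted earlier) that the U and V directions are obtained from the X and Y directions by a real rotation through some angle [itex]\theta[/itex]. Then

[tex] \lvert U \rangle = \cos\theta \, \lvert X \rangle + \sin\theta \, \lvert Y \rangle, \qquad \lvert V \rangle = -\sin\theta \, \lvert X \rangle + \cos\theta \, \lvert Y \rangle [/tex]

and solving these two equations for the X and Y vectors gives expressions of exactly the same form:

[tex] \lvert X \rangle = \cos\theta \, \lvert U \rangle - \sin\theta \, \lvert V \rangle, \qquad \lvert Y \rangle = \sin\theta \, \lvert U \rangle + \cos\theta \, \lvert V \rangle [/tex]

Neither pair is singled out by the algebra. In this sketch the probability of finding X given that U is assured is [itex]\cos^2(\theta)[/itex], and the probability of finding U given that X is assured is the same number.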

This loss of absoluteness here is similar to the problem of asking if two separated events happened at the same time in the space-time of special relativity. Unless an observer frame is selected the relationship between space and time for events is relative. Here we are dealing with a relativity of the classical reality of the quantum system.

#### Another Example

So let’s look at another example system (again an electron but in different circumstances) for which we have a clear understanding of the meanings of some of the various quantum superpositions. If we run an electron through a Stern-Gerlach magnet oriented so as to measure the electron’s spin component in the z-direction we will see the electron deflect along one of two paths corresponding to two and only two possible values for its z-component of spin. We therefore find that one and only one binary bit of information can be encoded in the electron’s spin. The Hilbert space for this spin system is two dimensional and we can pick out one direction to correspond to “z-spin +1/2” vs an orthogonal direction for “z-spin -1/2” (the unit of 1/2 here is basically so that the spacing is 1 unit).

Now we could also try to resolve the x-component of spin. But we find that if we first measure a z-spin value and then measure the x-spin, a subsequent measurement of the z-spin need not agree with the earlier one. The x-measurement destroyed the prior z-spin value. We also note that after repeated runs of these z then x then z again experiments we see, regardless of which z value we first saw, a 50-50 split in the x-measurements and then a subsequent 50-50 split in the z measurements with no correlation to the x measurements. Mathematically this is reproduced by choosing, in the same 2-dimensional space, the x-spin up direction to be rotated 45 degrees from the z-spin up direction within the plane spanned by the z-up and z-down directions.

[tex] |x= +1/2\rangle = \tfrac{1}{\sqrt{2}} \left|z=+1/2\right\rangle +\tfrac{1}{\sqrt{2}}\left |z= -1/2\right\rangle [/tex]

[tex] |x= -1/2\rangle = \tfrac{1}{\sqrt{2}} \left|z=+1/2\right\rangle -\tfrac{1}{\sqrt{2}}\left |z= -1/2\right\rangle [/tex]
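The 50-50 splits can be read off from these two expansions. The sketch below assumes the usual rule that the probability of an outcome is the squared magnitude of its inner product with the prepared vector; the component ordering (z-up first, z-down second) and the function names are choices made for the example.

```python
import math

def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def born(outcome, state):
    """Probability of `outcome` given a normalized `state`: |<outcome|state>|^2."""
    return abs(inner(outcome, state)) ** 2

s = 1 / math.sqrt(2)
z_up, z_down = (1, 0), (0, 1)     # components in the z basis
x_up, x_down = (s, s), (s, -s)    # the two expansions given above

# Start from a definite z-up result, measure x, then measure z again.
p_x_up = born(x_up, z_up)                    # first split
p_z_down_after_x_up = born(z_down, x_up)     # second split
p_sequence = p_x_up * p_z_down_after_x_up    # z-up, then x-up, then z-down
```

Each step comes out at one half, so a run that begins with z-up ends with z-down a quarter of the time through the x-up branch alone, in line with the loss of the original z value described above.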

The expansions of the x-spin vectors mean that the spin measured as +1/2 in the x-direction is in a superposition of the + and – z-spin values. This does NOT mean we are adding z spins to get an x-spin. The physical spins are not getting added here, but rather the abstract logical amplitudes represented in the Hilbert space. And it’s not that we are adding anything per se, because the “amounts” of the things we add do not mean anything. It is the directions only that matter here. The “reality” of x-spins is rotated 45 degrees to the “reality” of z-spins for the same electron. Likewise with the y-spins, but to get enough independent directions we must work in complex space, so we end up with:

[tex] |y= +1/2\rangle = \tfrac{1}{\sqrt{2}} \left|z=+1/2\right\rangle +\tfrac{i}{\sqrt{2}}\left |z= -1/2\right\rangle [/tex]

[tex] |y= -1/2\rangle = \tfrac{1}{\sqrt{2}} \left|z=+1/2\right\rangle -\tfrac{i}{\sqrt{2}}\left |z= -1/2\right\rangle [/tex]

By symmetry, the x-spins are superpositions of the z-spins and vice versa, the y-spins are superpositions of the z-spins and vice versa, and the x-spins are superpositions of the y-spins and vice versa.
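This symmetry can be checked across all three bases at once. The sketch below takes the coefficients directly from the four expansions given above, written as components in the z basis; the dictionary layout and names are illustrative.

```python
import math
from itertools import permutations

def inner(u, v):
    """Inner product <u|v>, conjugating the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

s = 1 / math.sqrt(2)

# Each basis as (up, down), written as components in the z basis.
bases = {
    "z": [(1, 0), (0, 1)],
    "x": [(s, s), (s, -s)],
    "y": [(s, 1j * s), (s, -1j * s)],
}

# Within each basis the two outcomes are orthogonal (mutually exclusive).
orthogonality = {name: abs(inner(up, down)) for name, (up, down) in bases.items()}

# Between different bases every outcome pair has the same probability.
overlaps = {
    (n1, i, n2, j): abs(inner(u, v)) ** 2
    for n1, n2 in permutations(bases, 2)
    for i, u in enumerate(bases[n1])
    for j, v in enumerate(bases[n2])
}
```

Every cross-basis probability comes out at one half. The imaginary unit in the y expansions is what makes this possible: a third pair with purely real coefficients in the z-up, z-down plane could not sit at 45 degrees to both the z pair and the x pair.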

### Summary

The concept of superposition manifests both in classical and quantum physics. These both have their root in the linearity of the abstract vector spaces used in system descriptions. Quantum superposition is a more fundamental and less intuitive phenomenon as it occurs in the structure of the very logic of the system description rather than (or more aptly, as well as) at a higher more dynamic level.

To say a system “is in superposition” is meaningless in and of itself. The phenomenon of an electron or other quantum system “being in superposition” is totally relative to what one has chosen as a starting basis for the Hilbert space representation. One singular description will manifest as a superposition when expressed in terms of an alternate representation. To resolve the cognitive dissonance one experiences when trying to understand how a physical object can be “in a superposition of two states”, one must reflect upon the very semantics we use and the implicit premises they bring with them. We never observe physical systems in superpositions; rather, we observe physical systems with measurements which are, in their descriptions and their effects, superpositions of other acts of measurement.

MS. Applied Mathematics, PhD Physics

As I remember the article “Quantum mechanics made transparent” by Richard C. Henry, it makes the argument that probability distributions for classical states can be conveniently represented by vectors that lie on the surface of an n-dimensional sphere (or equivalence classes of rays that pass through such points). Perhaps there is a way to present the transition between classical logic and quantum mechanics in a more gradual manner instead of the jump from the logic of sets to the methods of QM. The properties of probability distributions on classical states could be an intermediate step.