FOUNDATIONS OF QUANTUM MECHANICS

JOS UFFINK

INSTITUTE FOR HISTORY AND FOUNDATIONS OF SCIENCE

UTRECHT UNIVERSITY

SEPTEMBER 2010

PREFACE

These lecture notes support the course on Foundations of Quantum Mechanics, offered by the Institute for History and Foundations of Science of the University of Utrecht. Although the text has been revised repeatedly, efforts to improve it can sometimes introduce new imperfections, making revision a never-ending process. The current version, the 11th, is slightly modified with respect to the previous one. Many thanks are due to Anne van Weerden for help with the English translation.

Remarks and comments remain very welcome.

Jos Uffink

Utrecht, August 2010

CONTENTS

I CONCEPTUAL PROBLEMS
    I.1 Introduction
    I.2 Incompleteness and locality

II THE FORMALISM
    II.1 Finite-dimensional Hilbert spaces
    II.2 Operators
    II.3 Eigenvalue problem and spectral theorem
        II.3.1 Appendix
    II.4 Functions of normal operators
    II.5 Direct sum and direct product
        II.5.1 Direct sum
        II.5.2 Direct product
    II.6 Addendum: Infinite-dimensional Hilbert spaces
        II.6.1 The structure of vector spaces
        II.6.2 Operators
            II.6.2.1 Unbounded operators
            II.6.2.2 Continuous spectra
            II.6.2.3 Spectral theorem
        II.6.3 Dirac
        II.6.4 Summary

III THE POSTULATES
    III.1 Von Neumann’s postulates
    III.2 Pure and mixed states
    III.3 The interpretation of mixed states
    III.4 Composite systems
        III.4.1 Summary
    III.5 Proper and improper mixtures
    III.6 Spin 1/2 particles
        III.6.1 Spin 1/2 and rotations in spin space
        III.6.2 Mixed spin 1/2 states
        III.6.3 Two spin 1/2 particles
            III.6.3.1 Singlet and triplet states
            III.6.3.2 Correlations
            III.6.3.3 Conditional probabilities
            III.6.3.4 Example of a mixed state of two spin 1/2 particles

IV THE COPENHAGEN INTERPRETATION
    IV.1 Heisenberg and the uncertainty principle
        IV.1.1 Remarks
    IV.2 Bohr and complementarity
        IV.2.1 Complementary phenomena
        IV.2.2 Remarks and problems
        IV.2.3 Agreement and difference between Heisenberg and Bohr
    IV.3 Debate between Einstein and Bohr
        IV.3.1 Introduction
        IV.3.2 The photon box
        IV.3.3 Einstein, Podolsky and Rosen
        IV.3.4 Heisenberg, Bohr and Einstein, Podolsky and Rosen
    IV.4 Neutron interferometry
    IV.5 The uncertainty relations
        IV.5.1 Introduction
        IV.5.2 The standard uncertainty relations
        IV.5.3 Single slit experiment
        IV.5.4 Time and energy
        IV.5.5 Double slit experiment
        IV.5.6 A new uncertainty measure
        IV.5.7 Interpretation

V HIDDEN VARIABLES
    V.1 Hidden reality
    V.2 Non-contextual hidden variables
    V.3 Kochen and Specker’s theorem
        V.3.1 Summary
    V.4 Contextual hidden variables

VI BOHMIAN MECHANICS
    VI.1 Introduction
    VI.2 The quantum potential
    VI.3 Composite systems
    VI.4 Remarks and problems
    VI.5 The Hamilton-Jacobi equation

VII BELL’S INEQUALITIES
    VII.1 Local deterministic hidden variables
        VII.1.1 Derivation of the first Bell inequality
        VII.1.2 The Bell inequality of Clauser, Horne, Shimony and Holt
        VII.1.3 Violation of the Bell inequalities by quantum mechanics
        VII.1.4 The Bell inequality in a non-contextual, local deterministic HVT
    VII.2 Local deterministic contextual hidden variables
    VII.3 Wigner’s derivation
    VII.4 The derivation of Eberhard and Stapp
        VII.4.1 Counterfactual conditional statements and indeterminism
    VII.5 Stochastic hidden variables
        VII.5.1 Outcome, parameter and source independence
        VII.5.2 Quantum mechanics as a stochastic HVT
    VII.6 An algebraic proof without inequalities
    VII.7 Miscellanea
        VII.7.1 Locality and relativity
        VII.7.2 Locality versus conditional independence
        VII.7.3 Determinism

VIII THE MEASUREMENT PROBLEM
    VIII.1 Introduction
    VIII.2 Measurement according to classical physics
    VIII.3 Measurement according to quantum mechanics
    VIII.4 The measurement problem in the narrow sense
        VIII.4.1 The projection postulate and consciousness
        VIII.4.2 Bohmian mechanics
        VIII.4.3 Spontaneous collapse
        VIII.4.4 Many worlds
        VIII.4.5 Superselection rules
        VIII.4.6 Irreversibility of measurement
        VIII.4.7 Modal interpretation
        VIII.4.8 Decoherence
    VIII.5 Incompatible quantities
    VIII.6 Comments on the theory of measurement

A GLEASON’S THEOREM
    A.1 Introduction
    A.2 Conversion to a 3-dimensional real problem
        A.2.1 Step 1
    A.3 Formulation of the problem on the surface of a sphere
        A.3.1 Step 2
            A.3.1.1 Lemma 1
            A.3.1.2 Lemma 2
            A.3.1.3 Result of lemma 1 and 2
        A.3.2 Step 3
    A.4 An analytic lemma
        A.4.1 Step 4
    A.5 Summary

WORKS CONSULTED

BIBLIOGRAPHY

LIST OF FIGURES

III.1 A discontinuous measure for dim H = 2
III.2 A rotated unit vector in the xz-plane
III.3 Spin up for particle 1 along $\vec{a}$, for particle 2 along $\vec{b}$
IV.1 Heisenberg’s γ-microscope
IV.2 The double slit interference experiment (Bohr 1949)
IV.3 Contexts of measurement in which the interference of the particles is visible, and those in which the recoil of the screen is visible, exclude each other. (Bohr 1949)
IV.4 Several perfect crystal neutron interferometers (Rauch and Werner 2000)
IV.5 The interference pattern in the neutron interferometer is acquired by measuring the intensity in the detectors at a variable optical path length difference.
IV.6 The probability distribution in position for a slit of width 2a
IV.7 The diffraction pattern for a small slit of width 2a
IV.8 The probability distribution in position for a double slit; 2a is the width of each slit and 2A the distance between the slits
IV.9 The interference pattern for the double slit
IV.10 Moving screen
V.1 A solution for dim H = 2
V.2 a) Kochen-Specker diagram b) Conway-Kochen diagram
V.3 M.C. Escher, Waterfall. Consider the 3 interpenetrating cubes on the top of the left pillar. Each cube has 4 lines from the mutual center to its vertices, 6 lines to the centers of its edges, and 3 lines to the centers of its faces. Three of the lines are shared by all three cubes, giving 3 · (4 + 6 + 3) − 6 = 33 lines. These are Peres’ vectors. (Text Meyer 2003)
V.4 $\mu(P_i) = \cos^2\theta$
VI.1 The quantum potential for the two slit system as viewed from the screen, under assumption of a Gaussian distribution at the slits (Bohm 1989)
VI.2 A simulation of the double slit experiment in Bohmian mechanics. Each particle follows a certain path between the slits and the photographic plate. All particles coming from the upper slit arrive at the upper half of the photographic plate, likewise for the lower slit and lower half of the plate. The twists in the paths are caused by the quantum potential U. (Vigier et al. 1987)
VII.1 Thought experiment of Einstein, Podolsky and Rosen on the singlet
VII.2 A configuration in which the spin quantities violate the Bell inequality
VII.3 The Bell inequality violated for every acute angle ϕ
VII.4 The configuration giving the largest violation of the Bell inequality (all vectors in the same plane)
VII.5 Unit spheres for $a_n$, $b_n$ and $a_n b_n$. In the shaded areas of the larger sphere $a_n b_n$ is positive, in the unshaded areas $a_n b_n$ is negative.
VII.6 Comparison of the quantum mechanical expectation values and those for the local deterministic HVT
VII.7 Violation of the Bell inequality again
VII.8 The Mermin pentagon
VII.9 Minkowski diagram of the EPRB experiment, where λ is in the past light cones of both A and B
VIII.1 Schrödinger’s cat paradox (DeWitt 1970)
A.1 Construction of a 3-dimensional subspace E
A.2 Rotation of s to s′ and t to t′ along a great circle around axis r
A.3 Projection of points on a great circle onto a plane P through the north pole
A.4 Projection of meridians, circles with constant latitude, and a great circle
A.5 Spiral representing a projected path from s to t along subsequent great circles, each time starting at their most northern point
A.6 Path from t to v, having the same longitude
A.7 A strictly increasing or decreasing curve C
A.8 Great circle C, coordinate system (p, q, t), and rotating pair $(s, s^{\perp})$
A.9 Great circle C and tilted great circles C′ and C″
A.10 Two continuous curves on $S^2$, intersecting in q

I CONCEPTUAL PROBLEMS

Anyone who is not shocked by quantum theory has not understood it.

— Niels Bohr

I think it is safe to say that no one understands quantum mechanics.

— Richard Feynman

I.1 INTRODUCTION

Quantum mechanics emerged at the beginning of the 20th century from an attempt to understand the interaction between atoms and radiation. The presence of discrete lines in the emission and absorption spectra of chemical elements indicates that this interaction takes the form of discrete quanta. When, in the years 1925 and 1926, a coherent theory was developed by the combined efforts of Werner Heisenberg, Paul Dirac, Max Born, Pascual Jordan, Wolfgang Pauli and Erwin Schrödinger, and this theory was axiomatized seven years later by John von Neumann, the question of the physical interpretation of the mathematical symbols of the theory arose.

The central mathematical concept in quantum mechanics is ψ, in the form of a wave function ψ(q) in Schrödinger’s wave mechanics, or of a vector |ψ⟩ in Hilbert space, à la Von Neumann. According to Born, its physical meaning is that ψ determines probabilities for results of measurements, and a key question is then how such probabilities must be interpreted. By means of four examples we will give an idea of the conceptual problems raised by quantum mechanics.

(i) Consider as a first example the decay of radioactive nuclei of a certain kind, as discussed by Einstein (Schilpp 1949, p. 667 ff.). We see the unstable nuclei decay at various times, one almost immediately, another only after a long time; the α-particles are radiated in ever different directions. Quantum mechanics describes these nuclei by a non-stationary wave function, and using this function one can calculate the expected lifetime of the nuclei.
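
As an illustration, and assuming the familiar exponential decay law (which the text above does not spell out), the expected lifetime can be written out explicitly. The symbol τ for the mean lifetime is introduced here only for this sketch.

```latex
% Assumption: exponential decay law with mean lifetime \tau (symbol introduced here).
% Probability that a nucleus has not yet decayed at time t:
S(t) = e^{-t/\tau}, \qquad t \ge 0 .
% Probability density of the decay time:
p(t) = -\frac{dS}{dt} = \frac{1}{\tau}\, e^{-t/\tau} .
% Expected lifetime:
\langle t \rangle = \int_0^\infty t\, p(t)\, dt = \tau .
% Spread of the individual decay times:
\Delta t = \sqrt{\langle t^2 \rangle - \langle t \rangle^2} = \sqrt{2\tau^2 - \tau^2} = \tau .
```

Under this assumption every nucleus is assigned the same distribution p(t), while the individual decay times scatter with a spread as large as the mean itself.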

A natural reaction is to assume that the nuclei differ from each other, and that this difference is the cause of the mutually different individual life spans and the different directions the α-particles are radiated in. In this view, the quantum mechanical expectation value would be comparable to the average life span in a population. However, this does not fit naturally into the quantum mechanical description. Quantum mechanics describes all nuclei by the same wave function. If this description is complete, the fact that quantum mechanics gives only expected life spans is not due to a lack of knowledge. Rather, there simply is nothing more to know concerning the nuclei than their wave function and the probabilities that follow from it.

On the other hand, we see before our eyes that the nuclei do not behave the same way: they decay at different times and send the α-particles in ever different directions. This suggests that more can be known about nuclei than their expected life spans, just like a more thorough investigation of the individuals of a population enables us to know more than their mere average life span; we would then be able to make a more detailed statement about their individual life spans. In this view the quantum mechanical description is not complete; there are extra, until now ‘hidden’, variables which say something about the individual case.

There is a standard answer to this problem, called the ‘Copenhagen interpretation’, after the view developed by Bohr and his coworkers. This answer is that the idea that the individual nuclei have a definite life span, independent of the observation of this life span, is incorrect. We can only speak of an individual life span within the context of an experiment in which this is measured. An experiment always entails a disturbance of the system. For this reason no conclusions can be drawn concerning the undisturbed system. It is incorrect to speak of the life span of a nucleus which is not observed. The statistical spread in the measured individual life spans is due to the quantum character of the interaction between object and measuring apparatus. As a matter of principle, what happens in this interaction cannot be described more precisely. This makes every individual measurement into a unique event.

Characteristic of the Copenhagen interpretation is, furthermore, that one cannot simply combine the description of the system, obtained within the context of a certain type of experiment, with a description of the same system, obtained in a different kind of experiment. The best-known examples of such mutually exclusive experiments are measurements of position and momentum. According to Bohr, descriptions of a system with terms like ‘position’ or ‘momentum’ are complementary; they are supplementary to each other, but they can never be united in one picture.

The main point behind the Copenhagen answer is the idea of measurement disturbance. According to this line of thought quantum mechanics is distinguished from classical physics by the quantization of the interaction between system and measuring apparatus. Every observation involves an interaction with, and therefore a disturbance of, the observed system. This disturbance cannot be made arbitrarily small; ħ ≠ 0. Therefore, one cannot identify observation results with properties the system has independently of the observation. One can only talk meaningfully about observation results which are created by the measurement. In contrast to classical physics, quantum mechanics does not deal with what exists, but with what is observed.

At first sight this reasoning seems plausible; it is, however, not without problems. Can we use the same reasoning if the observed system is macroscopic? And, as a matter of fact, what exactly is an observation? Is it essential that some conscious being takes notice of the result of the observation, or is an apparatus registering the outcome sufficient? These problems will appear in the third and fourth example.

(ii) The next example is from a letter Einstein wrote to Born in 1948 (Born 1971, pp. 169, 170). Consider a free particle described by a wave function ψ. According to the quantum mechanical description, ψ satisfies an uncertainty relation; the statistical deviations of position and momentum cannot simultaneously be made arbitrarily small. Apparently the outcomes of measurements of position and momentum of an individual particle cannot both be predicted exactly, and the question arises how to interpret this situation. Einstein distinguishes two points of view.
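
For reference, the uncertainty relation alluded to here is usually written as follows. The notation $\Delta_\psi$ for the standard deviation in a normalized state ψ is an assumption of this sketch and is not fixed by the passage above.

```latex
% Standard deviation of an observable A in the normalized state |\psi\rangle
% (notation assumed here):
(\Delta_\psi A)^2 = \langle\psi| A^2 |\psi\rangle - \langle\psi| A |\psi\rangle^2 .
% Relation for position Q and momentum P, with [Q, P] = i\hbar:
\Delta_\psi Q \,\Delta_\psi P \;\ge\; \frac{\hbar}{2} .
```

The inequality constrains the spreads of measurement outcomes in one and the same state ψ; by itself it says nothing about which of Einstein's two readings is the right one.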

(a) The (free) particle really has a definite position and a definite momentum, even if they cannot both be ascertained by measurement in the same individual case. According to this point of view, the ψ-function represents an incomplete description of the real state of affairs. [...] Its acceptance would lead to an attempt to obtain a complete description of the real state of affairs as well as the incomplete one, and to discover physical laws for such a description. The theoretical framework of quantum mechanics would then be exploded.

(b) In reality the particle has neither a definite momentum nor a definite position; the description by [the] ψ-function is, in principle, a complete description. The strictly defined position of the particle, obtained by measuring the position, cannot be interpreted as the position of the particle prior to the measurement. The sharp localization which appears as a result of the measurement is brought about only as a result of the unavoidable (but not unimportant) operation of measurement. The result of the measurement depends not only on the real particle situation but also on the nature of the measuring mechanism, which in principle is incompletely known. An analogous situation arises when the momentum or any other observable quantity relating to the particle is measured.

Interpretation (b) is accepted by the majority of physicists, and Einstein admits that “[...] it alone does justice in a natural way to the empirical state of affairs expressed in Heisenberg’s principle within the framework of quantum mechanics.”

Nevertheless, he emphasizes his preference for interpretation (a). His argument is that it is basic to physics that physical concepts refer to entities, such as particles, fields, etc., that exist independently of the observer, and are situated in space and time. Interpretation (b) renders this kind of description impossible. A second argument has to do with composite systems and will be discussed in section I.2.

(iii) The next example, also originating from the correspondence between Einstein and Born (Born 1971, pp. 188, 208-209), concerns a freely moving macroscopic object, for instance a star. A simple Schrödinger equation applies to the center of mass of such a body, namely that of a free particle. Since all wave functions which are solutions of the Schrödinger equation are admissible, one may consider as a solution a wave function with two peaks of equal size, located far from each other.
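
A minimal sketch of such a two-peak wave function at a given instant may help fix ideas. The packet shape φ and the half-separation a are symbols introduced here for illustration only.

```latex
% Assumptions: \varphi is a normalized, narrow wave packet centred at the origin,
% and the separation 2a is much larger than its width, so the peaks do not overlap.
\psi(q) = \frac{1}{\sqrt{2}} \bigl( \varphi(q - a) + \varphi(q + a) \bigr) .
% Probability of finding the center of mass near +a (and likewise near -a):
\int_{0}^{\infty} |\psi(q)|^2 \, dq \;\approx\; \frac{1}{2} \int_{-\infty}^{\infty} |\varphi(q - a)|^2 \, dq = \frac{1}{2} .
```

The approximation neglects the overlap between the two packets, which is the sense in which the peaks are located far from each other.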

Upon measurement of the position of the center of mass of such a body, the outcome is found at one peak in about half of the measurements, and at the other peak in the other half. In this case it is tempting to say that for half of these measurements the center of mass was at that one position, that the object was at that position, while in the other half the center of mass was at the other position. But according to the standard interpretation this is incorrect: prior to the measurement no position can be assigned to the center of mass. Quantum mechanics applies just as well to the center of mass of a macroscopic body as to an electron. It is, however, difficult to imagine how a measurement ‘creates’ the position of the center of mass of a star as a result of a disturbance of the order of a single quantum ħ.

According to Pauli, one of the representatives of the Copenhagen interpretation, this is a creation outside the laws of nature (ibid., p. 223). The laws of nature only say something about the statistics of the outcomes. The quantum mechanical probability description does not express our ignorance concerning the position of the center of mass of the body; the probability description corresponds to an essential indeterminacy of that position. Pauli states that the question whether the ‘position’ of a body would also exist without observation is fundamentally unanswerable and for this reason meaningless.

In this example the problem of the transition between the microscopic and the macroscopic levels arises. Our intuition tells us that somewhere along the way the quantum mechanical probability description must turn into a classical description of an ensemble, an ensemble of objects that have properties. But if we accept at the same time that quantum mechanics applies as well to macroscopic bodies as to microscopic ones, our expectation is refuted. This transition from one type of ensemble to the other is a problem which invariably emerges in considerations concerning the ‘measurement problem’. We will come back to this in chapter VIII.

The previous discussion follows rather closely the formulations of Einstein and Pauli in the years 1948-1954, as can be found in the correspondence between Born and Einstein (Born 1971). An interesting aspect is that the discussion actually takes place over Born’s head. Born saw Einstein as the one who had, in his theory of relativity, abolished the idea of absolute simultaneity by means of the argument that it is meaningless to want to speak about something you cannot measure in principle. Einstein reacts (ibid., p. 188):

“There is nothing analogous in relativity to what I call incompleteness of description in the quantum theory. Briefly it is because the ψ-function is incapable of describing certain qualities of an individual system, whose ‘reality’ we none of us doubt (such as a macroscopic parameter).”

Moreover, Born continues to believe, despite everything Einstein writes, that Einstein objects to the indeterministic character of quantum mechanics, i.e., the fact that it only provides probability statements, instead of objecting to the alleged completeness of quantum mechanics, until Pauli intervenes in the discussion and explains Einstein’s position to Born (ibid., pp. 217-219).

(iv) The last example is Schrödinger’s notorious cat paradox (Schrödinger 1935b).

“One can even set up quite ridiculous cases. A cat is penned up in a steel chamber, along with the following diabolical device (which must be secured against direct interference by the cat): in a Geiger counter there is a tiny bit of radioactive substance, so small that perhaps in the course of one hour one of the atoms decays, but also, with equal probability, perhaps none; if it happens, the counter tube discharges and through a relay releases a hammer which shatters a small flask of hydrocyanic acid. If one has left this entire system to itself for an hour, one would say that the cat still lives if meanwhile no atom has decayed. The first atomic decay would have poisoned it. The Ψ-function for the entire system would express this by having in it the living and the dead cat (pardon the expression) mixed or smeared out in equal parts.”

In this example a number of problems are combined. In the first place there is again the difference between a classical state and a quantum state. If the standard interpretation is extended consistently, the cat cannot be considered dead or alive as long as the chamber is not opened and the cat is not observed. (One may wonder what the cat itself thinks of this.)

The question whether it is permitted to extend the standard interpretation in this way coincides with the question if and to what extent the quantum mechanical description can be transferred from the microscopic to the macroscopic level. Then there is the question of what exactly an observation is. Are cats observers of their own situation? And if consciousness is essential for an observation, do cats have the correct type of consciousness?

From the examples above we can isolate the following central concepts:

1. the real state of a system independent of measurement,

2. incompleteness,

3. measurement disturbance,

4. complementarity,

5. the transition from microscopic to macroscopic,

6. consciousness.¹

I.2 INCOMPLETENESS AND LOCALITY

The previous discussion only served to get the reader in the right mood! In 1935 Albert Einstein, Boris Podolsky and Nathan Rosen, from now on abbreviated as EPR, came up with an example which considerably sharpened the discussion (EPR 1935). Using rigorous reasoning they argued that quantum mechanics is an incomplete theory. As an introduction to their argument we will first examine a simpler argument that Einstein formulated in the same year in a letter to Schrödinger, as paraphrased by A. Fine (1986, p. 37).

Consider a composite system of two particles which interacted with each other but are so widely separated in space now that they no longer interact. Suppose they are in a state |ψ⟩ which is an eigenstate of the total momentum $P_1 + P_2$ with eigenvalue 0, but is not an eigenstate of $P_1$ or $P_2$ separately,

$$(P_1 + P_2)\,|\psi\rangle = 0 \quad\text{and}\quad P_1|\psi\rangle \neq a\,|\psi\rangle, \quad P_2|\psi\rangle \neq b\,|\psi\rangle, \quad\text{for } a, b \in \mathbb{R}. \tag{I.1}$$

Through a measurement of the momentum of particle 1 we can predict with certainty what the result will be of a measurement of the momentum of particle 2. Moreover, the measurement of particle 1 has absolutely no physical influence on particle 2. But if it is possible to predict the momentum of particle 2 with certainty without any interaction with that particle, then particle 2 must already have this momentum before the measurement, and this must even be the case before the measurement of particle 1, since that measurement does not disturb particle 2 at all. However, the value of this property of particle 2 cannot be derived from the quantum mechanical description using the state |ψ⟩. Therefore, quantum mechanics is incomplete.
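
To make condition (I.1) concrete, here is one simple state that satisfies it. The choice of a two-term superposition and the momentum values p and p′ are assumptions made for this sketch; the argument itself does not depend on them.

```latex
% Assumption: two distinct momentum values p \neq p' (normalization of the improper
% momentum eigenstates is glossed over in this sketch).
|\psi\rangle = \frac{1}{\sqrt{2}} \Bigl( |p_1 = p\rangle \otimes |p_2 = -p\rangle
             + |p_1 = p'\rangle \otimes |p_2 = -p'\rangle \Bigr) .
% Total momentum: the terms have eigenvalues p + (-p) = 0 and p' + (-p') = 0, so
(P_1 + P_2)\,|\psi\rangle = 0 .
% Momentum of particle 1: the two terms pick up different factors, so
P_1 |\psi\rangle = \frac{1}{\sqrt{2}} \Bigl( p\,|p_1 = p\rangle \otimes |p_2 = -p\rangle
                 + p'\,|p_1 = p'\rangle \otimes |p_2 = -p'\rangle \Bigr) \neq a\,|\psi\rangle
                 \quad \text{for every } a \in \mathbb{R} .
```

In such a state, finding the value p for particle 1 goes together with the value −p for particle 2, and likewise for p′, which is the strict correlation the argument relies on.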

We see how Einstein succeeds, thanks to the strict correlation between the particles that quantum mechanics allows for, and thanks to the spatial separation of the particles, in refuting the argument of the measurement disturbance as a physical process. In the earlier examples we could imagine the measurement to create the outcome (although this already seemed a hardly convincing escape in Einstein’s example of macroscopic bodies), and that this outcome did not exist prior to the measurement because of the disturbance that comes with the measurement. We now see that we cannot imagine these measurement disturbances as spatially limited, ‘local’ processes. Einstein spoke of “a spooky action at a distance” and of “telepathy”.

¹ The role of consciousness is regarded as essential by mathematicians and physicists like Von Neumann, London, Heitler and Wigner. The fact that they felt forced to take this highly unusual step in physical theory illustrates how serious the situation is.

The case against the completeness of quantum mechanics gained strength with this example. However, objections can be made. (Later Einstein would be amused about the fact that everyone knew the argument was not correct but that everyone had another reason to think so.) The argument uses the fact that in quantum mechanics there are eigenstates of $P_1 + P_2$ in which the momentum of each individual particle is undetermined. It could be objected that such states are perhaps not physically realizable, that only eigenstates of $P_1 + P_2$ which are at the same time also eigenstates of both $P_1$ and $P_2$ would be realizable, and that we should therefore replace the state |ψ⟩ by a mixture of such eigenstates, in which case the argument no longer holds.

The EPR article itself gives a more balanced argumentation that does not have this shortcoming. The article deviates from the above on two points. First, not only the momentum, but also the position of the two particles is brought into the consideration. Second, EPR formulate a ‘sufficient condition of reality’ by means of the term ‘element of physical reality’, which we will call EPR(EPR). As worded by EPR, p. 777,

EPR(EPR): If, without in any way disturbing a system, we can predict with certainty (i.e., with probability equal to unity) the value of a physical quantity, then there exists an element of physical reality corresponding to this physical quantity.

How else could we explain that we are able to predict the outcomes of measurements with certainty? A necessary, and certainly sufficient, condition for a complete physical theory is that each element of physical reality must have a counterpart in the theoretical description,

COMP(T): If a physical theory T is complete, then every element of physical reality must have a counterpart in the theory T.

It is possible to choose for |ψ⟩ a state which is a simultaneous eigenstate of the commuting operators $P_1 + P_2$ and $Q_1 - Q_2$. In Dirac notation, and only considering one spatial dimension, such a state is written in the ‘p-language’ and in the ‘q-language’ as

$$|\psi\rangle = \int_{\mathbb{R}} |p_1 = p\rangle \otimes |p_2 = -p\rangle \, e^{-ilp} \, dp = \int_{\mathbb{R}} |q_1 = q\rangle \otimes |q_2 = q - l\rangle \, dq, \tag{I.2}$$

where l is the eigenvalue of the mutual distance $Q_1 - Q_2$ and can be chosen arbitrarily large, and the terms with the ‘cartwheels’ ⊗ are the direct products (see subsection II.5.2), of which the first factor refers to particle 1, and the second to particle 2. The ‘p-language’ and the ‘q-language’ can be ‘translated’ into each other by means of a Fourier transformation.²

² Without Dirac notation but in terms of Dirac’s δ-‘functions’ the wave function has, in ‘p-language’ and in ‘q-language’, the following form,

$$\psi(p_1, p_2) = e^{-ilp_1}\,\delta(p_1 + p_2) \quad\text{and}\quad \tilde{\psi}(q_1, q_2) = \delta(q_1 - q_2 - l).$$

Although this state |ψ⟩ is an eigenstate of the total momentum P 1 + P 2 of the two particles and

their mutual distance Q 1 − Q 2 , with eigenvalues 0 and l, respectively,

\[ (P_1 + P_2)\,|\psi\rangle = 0\,|\psi\rangle \quad\text{and}\quad (Q_1 - Q_2)\,|\psi\rangle = l\,|\psi\rangle, \tag{I.3} \]

it is not an eigenstate of any of the 1 - particle operators P 1 , Q 1 , P 2 or Q 2 . However, given the

outcome of a measurement of P 1 , e.g. a, we can predict the result of a measurement of P 2 with

certainty, namely −a. In the same way, from a measurement of Q 1 with outcome x, the result of a

measurement of Q 2 follows with certainty, namely x − l.
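The eigenvalue equations (I. 3) can be read off from the two integral representations in (I. 2). The following short check is only a sketch; it assumes nothing beyond the fact that each 1 - particle operator acts on its own factor of a product state:

```latex
\begin{align*}
(P_1 + P_2)\,|\psi\rangle
  &= \int_{\mathbb{R}} \bigl(p + (-p)\bigr)\, |p_1 = p\rangle \otimes |p_2 = -p\rangle\, e^{-ilp}\, dp = 0, \\
(Q_1 - Q_2)\,|\psi\rangle
  &= \int_{\mathbb{R}} \bigl(q - (q - l)\bigr)\, |q_1 = q\rangle \otimes |q_2 = q - l\rangle\, dq = l\,|\psi\rangle .
\end{align*}
```

In each integrand the two 1 - particle values are perfectly correlated, which is what allows the outcome for particle 2 to be inferred from the outcome for particle 1.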

Now the argumentation is as follows. If we would measure the momentum P 1 of particle 1,

then we could predict the value of P 2 with certainty, without disturbing particle 2. According to

the aforementioned criterion the momentum P 2 of particle 2 must then correspond to an element of

physical reality. On the other hand, if we would measure the position Q 1 of particle 1, then we could

predict the value of Q 2 with certainty, again without disturbing particle 2. In that case there must

be an element of physical reality which corresponds to Q 2 . Therefore we can, depending on which

measurement we perform on particle 1, assign an element of physical reality to particle 2.

However, because of the absence of physical interaction between the particles there can be no real

change in particle 2 as a result of what is done with particle 1. Consequently, particle 2 must have both

elements of physical reality. But such a simultaneous assignment of exact position and momentum

has no counterpart in the quantum mechanical formalism: there are no wave functions which are

simultaneous eigenfunctions of position and momentum. The conclusion is unavoidable: the answer

to the question in the title of their article ‘Can quantum - mechanical description of physical reality be

considered complete?’ must be negative.

Notice that it is not necessary to perform the measurements of P 1 and Q 1 simultaneously; the

only thing that matters is the possibility to choose whether to predict the position or momentum of

particle 2 with certainty. Because of the absence of interaction between both particles it makes no

difference for particle 2 which choice is made for particle 1. This part of the argumentation relies

on the supposition that the elements of physical reality have a local character. This implicit but

reasonable locality premise runs as follows,

LOC(EPR): Performing a measurement on a physical system S 1 does not have an instantaneous

effect on elements of physical reality belonging to any system S 2 which is spatially

separated from S 1 .

We can thus summarize the argument of EPR schematically; quantum mechanics, QM, together

with EPR(EPR) and LOC(EPR), implies that quantum mechanics is an incomplete theory,

not COMP(QM). Or:

QM ∧ EPR(EPR) ∧ LOC(EPR) → ¬ COMP(QM). (I. 4)
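Since (I. 4) is a statement of propositional logic, it can be rewritten in equivalent forms that show which options remain for someone who rejects the conclusion. The following rewriting is a sketch using only elementary logic and the abbreviations already introduced:

```latex
\begin{align*}
 & \mathrm{QM} \wedge \mathrm{EPR(EPR)} \wedge \mathrm{LOC(EPR)} \rightarrow \neg\,\mathrm{COMP(QM)} \\
\iff\ & \neg\bigl( \mathrm{QM} \wedge \mathrm{EPR(EPR)} \wedge \mathrm{LOC(EPR)} \wedge \mathrm{COMP(QM)} \bigr) \\
\iff\ & \mathrm{QM} \wedge \mathrm{COMP(QM)} \rightarrow \neg\,\mathrm{EPR(EPR)} \vee \neg\,\mathrm{LOC(EPR)} .
\end{align*}
```

Read in the last form, anyone who accepts quantum mechanics and holds it to be complete must give up either the criterion of reality or the locality premise.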

In comparison to the foregoing, the strength of this argument is, in the first place, the larger precision

with which the argumentation has been set up: the conclusion follows logically from a number

of explicitly formulated premises and conditions. Moreover, we see that we are able to attribute to

particle 2 both position and momentum without interacting with particle 2. This means that we cannot

avoid the argumentation by assuming that for the correct quantum mechanical description the

given wave function ψ must be replaced by a mixture of eigenstates. Such eigenstates of position and

momentum are simply not available in quantum mechanics. The possibility to assign values to P 2

and Q 2 strikes at the heart of the idea of complementarity.

EPR anticipated the objection that only that which has been measured is real (EPR 1935, p. 780),

Indeed, one would not arrive at our conclusion if one insisted that two or more physical

quantities can be regarded as simultaneous elements of reality only when they can be

simultaneously measured or predicted. On this point of view, since either one or the

other, but not both simultaneously, of the quantities P and Q can be predicted, they are

not simultaneously real. This makes the reality of P and Q depend upon the process of

measurement carried out on the first system, which does not disturb the second system in

any way. No reasonable definition of reality could be expected to permit this.

They conclude their article with the next paragraph,

While we have thus shown that the wave function does not provide a complete description

of physical reality, we left open the question of whether or not such a description exists.

We believe, however, that such a theory is possible.

The problem whether a complete theory is possible or not, is called the hidden variable problem. The

so - called ‘hidden variable theories’ are attempts to solve this problem. We will come back to this in

chapter V.

Bohr’s (1935a) response to the argument of EPR aims at the question to what extent the condition

for an element of ‘physical reality’, as worded by EPR, is fulfilled in their example. The next quotation

is from Bohr (1935b, p. 700),

From our point of view we now see that the wording of the aforementioned criterion of

physical reality proposed by Einstein, Podolsky and Rosen contains an ambiguity as regards

the meaning of the expression “without in any way disturbing a system.” Of course

there is in a case like that just considered no question of a mechanical disturbance of

the system under investigation during the last critical stage of the measuring procedure.

But even at this stage there is essentially the question of an influence on the very conditions

which define the possible types of predictions regarding the future behavior of

the system. Since these conditions constitute an inherent element of the description of

any phenomenon to which the term ‘physical reality’ can be properly attached, we see

that the argumentation of the mentioned authors does not justify their conclusion that

quantum mechanical description is essentially incomplete. (Emphasis added.)

It is not easy to completely comprehend what Bohr says here. Evidently, he abandons the original

idea that the measurement disturbance creates the measurement results, or, at least, that such a creation

can be understood as a physical process. It is replaced by the idea that applicability of physical concepts

depends on the context of measurement. Performing a measurement on one of the particles

is considered as determinative for the applicability of concepts to the other particle. Bohr says that

the measurement disturbance is not a mechanical disturbance; apparently LOC(EPR) continues to

apply for him if we, using the term ‘influence’, refer to a mechanical interaction, but not if we mean

by ‘influence’ the ‘defining effect’ of the context of measurement. The experimental circumstances

define what you may call physical reality. Physical reality is not defined by experiments you could do,

as is the case according to EPR, but exclusively by experiments you actually do. Under circumstances

as described in the EPR experiment this ‘defining effect’ of the experimental setup also reaches parts

of the system with which the measuring apparatus has no physical interaction.

A distinct difference between Einstein and Bohr is that Einstein wants to visualize reality independent

of observation, whereas Bohr is satisfied with complementary pictures of which the applicability

always remains dependent on the chosen measurement setup. In 1955 Einstein says (Fine 1986, p.95)

It is basic for physics that one assumes a real world existing independently from any act

of perception. But this we do not know. We take it only as a programme in our scientific

endeavors. This programme is, of course, prescientific and our ordinary language is

already based on it.

And concerning the EPR situation he says (Schilpp 1949, p. 85)

But on one supposition we should, in my opinion, absolutely hold fast: the real factual

situation of the system S 2 is independent of what is done with the system S 1 , which is

spatially separated from the former.

Bohr’s conceptions concerning physical reality are much more difficult to characterize. According

to him there is no independent reality of which the physical theory would have to give an unambiguous

representation. He writes (Schilpp 1949, p. 211)

Thus, a sentence like “we cannot know both the momentum and the position of an atomic

object” immediately raises questions as to the physical reality of two such attributes of the

object, which can be answered only by referring to the conditions for the unambiguous

use of space - time concepts, on the one hand, and dynamical conservation laws, on the

other hand.

An exhaustive description of reality must always use concepts which themselves remain dependent

on mutually excluding contexts. Bohr says (A. Petersen 1963, p. 11)

The word ‘reality’ is also a word, a word which we must learn to use correctly.

He constantly emphasizes the restricted applicability of our physical concepts, which makes the link

between description and reality very complicated. Petersen mentions (ibid., p. 12)

When asked whether the algorithm of quantum mechanics could be considered as somehow

mirroring an underlying quantum world, Bohr would answer, “There is no quantum

world. There is only an abstract quantum physical description. It is wrong to think that

the task of physics is to find out how nature is. Physics concerns what we can say about

nature.”

Einstein’s conceptions are, in a certain way, easier than those of Bohr and correspond to the

intuition of the majority of physicists. When the preponderance of the Copenhagen school started to

wane, in the 1960s, attention for Einstein’s viewpoint revived.

In 1964, John Bell gave a reconstruction of the EPR experiment (see chapter VII) satisfying Einstein’s

requirement that the real, factual situation of physical system S 2 is independent of what is

done with system S 1 , the two systems being spatially separated. He constructed a very general model

and made the surprising discovery that such a model cannot completely reproduce the quantum mechanical

predictions. Especially remarkable are the broad generality of his derivation and the fact that

the differences with quantum mechanics are large enough to be measured. Sensationally, a

‘philosophical’ issue thus came within the range of experimental physics! Abner Shimony has spoken

in this respect of experimental metaphysics.

Bell’s work is an attempt to solve the completeness problem. Hereafter attempts were undertaken

to really carry out the EPR experiment, which was thus far only a thought experiment. The first

experiment was done in 1972 by Freedman and Clauser. Later, several other experiments have been

done, the highlight of which was, in 1982, the experiment of Alain Aspect and his group in Paris. In

turn, this has been superseded by the experiments of Anton Zeilinger and his groups in Vienna and

Innsbruck (e.g. Weihs 1998). The results of these experiments are in good to excellent agreement

with quantum mechanics, and therefore in conflict with all models meeting Einstein’s requirements.

The latter conclusion applies irrespective of the validity of quantum mechanics.

These results brought about a great number of responses and are one of the main causes of the

revived interest in interpretation problems of quantum mechanics. The discussion focuses on the

question what exactly the suppositions are that lead to the result of Bell and whether his model is

indeed the most general model that meets Einstein’s requirements.

The consequences of Bell’s result seem to be considerable. It can be argued that no independent

existence can be granted to objects that at some time interacted, irrespective of how far apart they are.

This suggests that reality cannot be reduced

to the ‘sum’ of its parts and that a more holistic approach is imperative, making our picture of nature

much more complicated.

Through the discussion of the EPR argument some more basic concepts are added to our list:

7. element of physical reality,

8. separability of physical systems,

9. locality,

10. holism.

These ten concepts play a central role in the research on the foundations of quantum mechanics.

II

THE FORMALISM

As far as the laws of mathematics refer to reality, they are not certain; and as far as they

are certain, they do not refer to reality.

— Albert Einstein

In mathematics you don’t understand things. You just get used to them.

— John von Neumann

The usual mathematical formulation of quantum mechanics was developed by John von Neumann

in 1932 as an operator calculus on a Hilbert space. We will not need all details of this

calculus and therefore give only a succinct review. For our purposes we can limit ourselves to a

finite - dimensional Hilbert space, a complex vector space with an inner product. We will give an

overview of the elementary concepts of this Hilbert space, and in an addendum concisely summarize

the infinite - dimensional case. For a more extensive treatment of Hilbert spaces we refer

to the first chapters of E. Prugovečki (2006).

II. 1

FINITE - DIMENSIONAL HILBERT SPACES

We start this chapter by defining a space called a Hilbert space, denoted by H. The elements of H are

called vectors. Following Dirac’s ket notation the vectors will be written as |α⟩,|β⟩,|γ⟩,|ϕ⟩,|ψ⟩,|χ⟩, . . . ,

complex numbers will be specified by the first characters of the alphabet a, b, c ∈ C.

Vectors can be added, and multiplied with a complex number, also called a scalar; the result remains

in H, i.e., for all |ϕ⟩, |ψ⟩ ∈ H and a, b ∈ C we have

a|ϕ⟩ + b|ψ⟩ ∈ H. (II. 1)

In other words, H is closed under linear combinations.

The addition is commutative and associative,

|ϕ⟩ + |ψ⟩ = |ψ⟩ + |ϕ⟩, (II. 2)

|ϕ⟩ + ( |ψ⟩ + |χ⟩ ) = ( |ϕ⟩ + |ψ⟩ ) + |χ⟩. (II. 3)

We require the existence of a null vector, 0 ∈ H, which is provably unique and has the property

that for all |ϕ⟩ ∈ H

0 + |ϕ⟩ = |ϕ⟩, (II. 4)

and that every vector has an additive inverse, i.e., for every |ϕ⟩ ∈ H there is a vector |ϕ ′ ⟩ ∈ H, also

provably unique, such that

|ϕ⟩ + |ϕ ′ ⟩ = 0 . (II. 5)

The scalar multiplication is distributive and associative,

(a + b) ( |ϕ⟩ + |ψ⟩ ) = a |ϕ⟩ + a |ψ⟩ + b |ϕ⟩ + b |ψ⟩, (II. 6)

a ( b |ϕ⟩ ) = (a b) |ϕ⟩, (II. 7)

and we demand that

1 |ψ⟩ = |ψ⟩. (II. 8)

Incidentally we also write

a |ψ⟩ ≡ |a ψ⟩ ≡ |ψ⟩ a. (II. 9)

EXERCISE 1. Prove (a) 0|ϕ⟩ = 0 ,

(b) the additive inverse of |ϕ⟩ equals −1|ϕ⟩.

An inner product on a vector space is a mapping H × H → C, where the image in C

of ( |ϕ⟩, |ψ⟩ ) ∈ H × H is written as ⟨ϕ | ψ⟩. The inner product has the following properties:

(i) ⟨ϕ | a ψ + b χ⟩ = a ⟨ϕ | ψ⟩ + b ⟨ϕ | χ⟩,

(ii) ⟨ϕ | ψ⟩ = ⟨ψ | ϕ⟩∗ ,

(iii) ⟨ϕ | ϕ⟩ ≥ 0, (II. 10)

(iv) ⟨ϕ | ϕ⟩ = 0 iff |ϕ⟩ = 0 .

The value

∥ψ∥ := √ ⟨ψ | ψ⟩ (II. 11)

is called the norm of |ψ⟩ and meets the usual requirements for a norm: its value is positive, except

for the zero vector, which is assigned 0; it is homogeneous, in the sense that ∥aψ∥ = |a| ∥ψ∥; and it

satisfies the triangle inequality ∥ψ + ϕ∥ ≤ ∥ψ∥ + ∥ϕ∥. A vector is called a unit vector if its norm

equals 1.

An important inequality is the Cauchy - Schwarz inequality

|⟨ϕ | ψ⟩|² ≤ ⟨ϕ | ϕ⟩ ⟨ψ | ψ⟩. (II. 12)

EXERCISE 2. Prove (a) the Cauchy - Schwarz inequality (II. 12),

(b) the definition of the norm satisfies the standard requirements for a norm.
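A numerical spot check is no substitute for the proofs asked for in Exercise 2, but it can make the properties concrete. The following minimal sketch assumes plain Python lists as stand-ins for vectors in C²; the helper names `inner` and `norm` and the sample vectors are illustrative choices, not notation from the text.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>: conjugate-linear in phi, linear in psi."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def norm(psi):
    """Norm as in (II. 11): square root of <psi|psi>."""
    return math.sqrt(inner(psi, psi).real)

phi = [1 + 2j, 3j]       # illustrative vectors in C^2
psi = [2 - 1j, 1 + 1j]

overlap = inner(phi, psi)
# Property (ii): <phi|psi> is the complex conjugate of <psi|phi>.
conjugate_symmetric = overlap == inner(psi, phi).conjugate()
# Cauchy-Schwarz inequality (II. 12).
cauchy_schwarz = abs(overlap) ** 2 <= norm(phi) ** 2 * norm(psi) ** 2
# Triangle inequality for the norm.
triangle = norm([a + b for a, b in zip(phi, psi)]) <= norm(phi) + norm(psi)
```

All three flags evaluate to true for this pair of vectors; a single example of course proves nothing in general.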

The n vectors |α 1 ⟩, . . . , |α n ⟩ are called (linearly) independent if it follows from

\[ \sum_{i=1}^{n} c_i\, |\alpha_i\rangle = 0 \tag{II.13} \]

that all coefficients c i are equal to zero; otherwise the vectors are called dependent.

EXERCISE 3. Prove that mutually orthogonal vectors are linearly independent.

A set of vectors |α 1 ⟩, . . . , |α N ⟩ in H is complete 1 if every vector |ψ⟩ ∈ H can be written as a

linear combination of this set,

\[ |\psi\rangle = \sum_{i=1}^{N} c_i\, |\alpha_i\rangle. \tag{II.14} \]

A complete, independent set of vectors is called a basis. A basis is called orthonormal if

⟨α i | α j ⟩ = δ ij , (II. 15)

where δ ij is the Kronecker delta. It can be proved that every basis of a space H contains the same

number of elements; this number is, by definition, the dimension of H, and is written dim H. The

dimension of a Hilbert space is infinite if every finite set of linearly independent vectors is incomplete.

If |α 1 ⟩, . . . , |α N ⟩ is an orthonormal basis, with N = dim H, then it follows from (II. 15) that the

coefficients in (II. 14) are given by

c i = ⟨α i | ψ⟩, (II. 16)

and the vectors |ψ⟩ can thus be represented in such a basis by columns of N complex numbers.

Therefore, an N - dimensional Hilbert space can also be written as C N .
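The statement that the coefficients are obtained as c i = ⟨α i | ψ⟩ can be tried out directly. The sketch below assumes a particular orthonormal basis of C² and a sample vector, both chosen only for illustration; `inner` is a hypothetical helper name.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]          # an orthonormal basis of C^2 (illustrative)
psi = [3 + 0j, 1j]                 # sample vector

# Coefficients as in (II. 16).
coeffs = [inner(alpha, psi) for alpha in basis]

# Rebuild the vector from its coefficients as in (II. 14).
reconstructed = [
    sum(c * alpha[k] for c, alpha in zip(coeffs, basis))
    for k in range(len(psi))
]
```

Up to floating - point rounding, `reconstructed` agrees with `psi`, so the column of coefficients carries the same information as the vector itself.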

1 The use of the term ‘complete’ for a system of vectors should not be confused with the same phrase as used within the

context of the foundations of quantum mechanics, that is, as a property of a physical theory.

With (II. 16), in an orthonormal basis we have

\[ |\psi\rangle = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_N \end{pmatrix} \tag{II.17} \]

and hence \( \langle\psi| = (c_1^*, c_2^*, \ldots, c_N^*) \), therefore

\[ |\psi\rangle\langle\psi| = \begin{pmatrix} c_1 c_1^* & \cdots & c_1 c_N^* \\ \vdots & \ddots & \vdots \\ c_N c_1^* & \cdots & c_N c_N^* \end{pmatrix}, \tag{II.18} \]

from which it is evident that for the vectors of the orthonormal basis {|α i ⟩} it holds that

\[ \sum_{i=1}^{N} |\alpha_i\rangle\langle\alpha_i| = \mathbb{1}, \tag{II.19} \]

with 𝟙 the identity mapping on H,

𝟙 |ψ⟩ = |ψ⟩ ∀ |ψ⟩ ∈ H. (II. 20)
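The completeness relation (II. 19) can be verified numerically for a concrete basis. The following sketch assumes an orthonormal basis of C² with complex components, picked only as an example; `outer` is a hypothetical helper that builds the matrix of |u⟩ ⟨v| as in (II. 18).

```python
import math

def outer(u, v):
    """Matrix of |u><v|: entry (i, j) is u_i times the conjugate of v_j."""
    return [[ui * complex(vj).conjugate() for vj in v] for ui in u]

s = 1 / math.sqrt(2)
basis = [[s, 1j * s], [s, -1j * s]]    # orthonormal basis of C^2 (illustrative)

# Sum the 1-dimensional projectors |alpha_i><alpha_i| over the basis.
total = [[0j, 0j], [0j, 0j]]
for alpha in basis:
    projector = outer(alpha, alpha)
    for i in range(2):
        for j in range(2):
            total[i][j] += projector[i][j]
```

Within rounding error `total` is the 2 × 2 identity matrix; the off - diagonal entries of the two projectors cancel each other.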

Using (II. 14) and (II. 16), we see that an orthonormal basis is indeed characterized by the relation

\[ |\psi\rangle = \sum_{i=1}^{N} \langle\alpha_i | \psi\rangle\, |\alpha_i\rangle = \sum_{i=1}^{N} |\alpha_i\rangle \langle\alpha_i | \psi\rangle. \tag{II.21} \]

The definition of a finite - dimensional Hilbert space is now completed; it is a finite - dimensional

complex vector space with an inner product which is related to the norm by means of (II. 11). A real

finite - dimensional Hilbert space is obtained by replacing C everywhere by R, i.e., the set of scalars is

in R and the inner product is always real. In section II. 6 we will see that for the infinite - dimensional

case the definition must be extended with two requirements, ‘separability’ and ‘completeness’, which

can be proved to hold in the finite - dimensional case.

II. 2

OPERATORS

An operator A on a Hilbert space H is a linear mapping of H into itself,

A : H → H, |ψ⟩ ↦→ A |ψ⟩ with A ( a |ψ⟩ + b |ϕ⟩ ) = a A |ψ⟩ + b A |ϕ⟩. (II. 22)

From (II. 16) we saw that in a given orthonormal basis |α 1 ⟩, . . . , |α N ⟩ the vectors |ψ⟩ ∈ H are

unambiguously represented by columns of N complex numbers c i = ⟨α i | ψ⟩. This corresponds to the

representation of an operator A as an N × N - matrix A in a basis {|α i ⟩}, and the coefficients of the

vector A|ψ⟩ in this basis are, using (II. 19),

\[ \langle\alpha_i | A | \psi\rangle = \langle\alpha_i | A\, \mathbb{1} | \psi\rangle = \sum_{j=1}^{N} \langle\alpha_i | A | \alpha_j\rangle \langle\alpha_j | \psi\rangle = \sum_{j=1}^{N} A_{ij}\, c_j, \tag{II.23} \]

with

\[ A_{ij} := \langle\alpha_i | A | \alpha_j\rangle. \tag{II.24} \]
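Relation (II. 23) says that acting with A on a vector amounts to multiplying the matrix (A ij) with the column of coefficients. A minimal sketch, assuming a sample operator given in the standard basis of C² and an illustrative orthonormal basis; the names `inner` and `apply` are hypothetical helpers.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def apply(op, vec):
    """Apply an operator, given as a matrix in the standard basis, to a vector."""
    return [sum(row[k] * vec[k] for k in range(len(vec))) for row in op]

s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]        # orthonormal basis |alpha_1>, |alpha_2> (illustrative)
A = [[1, 2], [3, 4]]             # sample operator in the standard basis
psi = [1 + 1j, 2 - 1j]           # sample vector
n = len(basis)

# Matrix elements A_ij = <alpha_i|A|alpha_j> as in (II. 24).
A_matrix = [[inner(basis[i], apply(A, basis[j])) for j in range(n)] for i in range(n)]
# Coefficients c_j = <alpha_j|psi> as in (II. 16).
c = [inner(alpha, psi) for alpha in basis]

# Both sides of (II. 23).
lhs = [inner(alpha, apply(A, psi)) for alpha in basis]
rhs = [sum(A_matrix[i][j] * c[j] for j in range(n)) for i in range(n)]
```

The two lists `lhs` and `rhs` agree up to rounding, whichever orthonormal basis is used.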

Operators A and B can be added and multiplied,

(A + B) |ψ⟩ := A |ψ⟩ + B |ψ⟩ and (A B) |ψ⟩ := A ( B |ψ⟩ ) . (II. 25)

The adjoint A † of an operator A is defined by the following equation

⟨ψ | A † | ϕ⟩ = ⟨ϕ | A | ψ⟩ ∗ ∀ |ϕ⟩, |ψ⟩ ∈ H. (II. 26)

EXERCISE 4. Show that for the matrix representation in an orthonormal basis it holds that

\[ (A^\dagger)_{ij} = A_{ji}^{*}. \]

Every operator on a finite - dimensional vector space has a unique adjoint, and the following holds

(c A)† = c∗ A† ,

(A + B)† = A† + B† ,

(A B)† = B† A† ,

(A†)† = A. (II. 27)
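The rules (II. 27) can be checked on small matrices, using the result of Exercise 4 that the adjoint is the conjugate transpose in an orthonormal basis. The sketch below assumes two sample 2 × 2 matrices with integer components, so that the comparisons are exact; `dagger` and `matmul` are hypothetical helper names.

```python
def dagger(m):
    """Conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[i][j]).conjugate() for i in range(rows)] for j in range(cols)]

def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[1, 2j], [0, 1 - 1j]]      # sample operators on C^2 (illustrative)
B = [[2, 1], [1j, 3]]
c = 2 + 1j

# (A B)^dagger = B^dagger A^dagger: note the reversed order.
product_rule = dagger(matmul(A, B)) == matmul(dagger(B), dagger(A))
# (A^dagger)^dagger = A.
involution = dagger(dagger(A)) == A
# (c A)^dagger = c* A^dagger.
scalar_rule = (dagger([[c * x for x in row] for row in A])
               == [[c.conjugate() * x for x in row] for row in dagger(A)])
```

Swapping the order on the right-hand side of the product rule, i.e. comparing with `matmul(dagger(A), dagger(B))`, makes the check fail for these matrices.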

An operator B is called an inverse of A if

A B = B A = 𝟙. (II. 28)

In this case we write A⁻¹ for B, because the inverse, if it exists, is unique. Not every operator has an

inverse; an example in the Hilbert space C² is

\[ \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}. \tag{II.29} \]

The trace of an operator A is defined as follows,

\[ \operatorname{Tr} A := \sum_{i=1}^{N} \langle\gamma_i | A | \gamma_i\rangle, \tag{II.30} \]

where |γ 1 ⟩, . . . , |γ N ⟩ is an arbitrary orthonormal basis and N = dim H.

EXERCISE 5. Show that Tr A is independent of the choice of the orthonormal basis.

The trace has the following properties:

Tr A† = (Tr A)∗ ,

Tr (bA + cB) = b Tr A + c Tr B,

Tr AB = Tr BA. (II. 31)

EXERCISE 6. Prove the three statements in (II. 31).
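Before proving the statements of Exercises 5 and 6 it may help to see them at work numerically. The sketch below assumes two sample 2 × 2 matrices and one rotated orthonormal basis, all chosen for illustration; `dagger`, `matmul` and `trace` are hypothetical helper names.

```python
import math

def dagger(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[complex(m[i][j]).conjugate() for i in range(n)] for j in range(n)]

def matmul(a, b):
    """Matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(m):
    """Sum of the diagonal elements."""
    return sum(m[i][i] for i in range(len(m)))

A = [[1, 2j], [3, 4]]        # sample operators (illustrative)
B = [[0, 1], [1j, 2]]

cyclic = trace(matmul(A, B)) == trace(matmul(B, A))
adjoint_rule = trace(dagger(A)) == complex(trace(A)).conjugate()

# Basis independence: the columns of U form another orthonormal basis, and the
# diagonal of U^dagger A U contains the numbers <gamma_i|A|gamma_i>.
s = 1 / math.sqrt(2)
U = [[s, s], [s, -s]]
trace_rotated = trace(matmul(dagger(U), matmul(A, U)))
```

Here `trace_rotated` agrees with `trace(A)` up to rounding, although the individual diagonal elements differ between the two bases.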

We will now list the most important types of operators. An operator A is called normal if it

commutes with its adjoint,

\[ [A, A^\dagger] := A A^\dagger - A^\dagger A = 0, \tag{II.32} \]

where 0 is actually the ‘zero operator’, which maps all vectors to the zero vector 0 . An operator is called

self - adjoint or Hermitian if it is equal to its adjoint,

A † = A, (II. 33)

and with the first statement of (II. 31) we see that the trace of a self-adjoint operator is always real.

Self - adjoint operators are normal, but not all normal operators are self - adjoint; an example is a unitary

operator, which satisfies

U† = U⁻¹. (II. 34)

EXERCISE 7. Prove that a unitary operator preserves the inner product, i.e., for all |ϕ⟩,|ψ⟩ ∈ H

the following holds: if |ϕ ′ ⟩ = U |ϕ⟩ and |ψ ′ ⟩ = U |ψ⟩ then ⟨ψ ′ | ϕ ′ ⟩ = ⟨ψ | ϕ⟩.
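A numerical illustration of the claim in Exercise 7, not a proof: the sketch assumes one particular unitary 2 × 2 matrix and two sample vectors, all illustrative, with hypothetical helpers `inner` and `apply`.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(phi, psi))

def apply(op, vec):
    """Apply a matrix to a vector."""
    return [sum(row[k] * vec[k] for k in range(len(vec))) for row in op]

s = 1 / math.sqrt(2)
U = [[s, s], [1j * s, -1j * s]]    # a unitary matrix: its columns are orthonormal

phi = [1 + 2j, 3j]                 # sample vectors (illustrative)
psi = [2 - 1j, 1 + 1j]

before = inner(psi, phi)
after = inner(apply(U, psi), apply(U, phi))
```

The values `before` and `after` coincide up to rounding; in particular the norm of each vector is unchanged by U.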

An operator A is called positive, written A ≥ 0, if

⟨ψ | A | ψ⟩ ≥ 0 ∀ |ψ⟩ ∈ H. (II. 35)

An operator P is called a projection operator, or a projector for short, if it is self - adjoint and

idempotent,

P = P† and P² = P. (II. 36)

An example of a projector, apart from the obvious examples of the zero operator 0 and the identity

operator 𝟙, is the mapping P ϕ = |ϕ⟩ ⟨ϕ| which projects on a given unit vector |ϕ⟩,

P ϕ : |ψ⟩ ↦→ ⟨ϕ | ψ⟩ |ϕ⟩ = |ϕ⟩ ⟨ϕ | ψ⟩. (II. 37)

EXERCISE 8. Show that (a) every projector is positive,

(b) if P is a projector, then 𝟙 − P is one also.
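The defining properties (II. 36) and the claims of Exercise 8 can be tried out on the projector onto a concrete unit vector. A minimal sketch, assuming a sample unit vector in C²; the helper names `outer`, `matmul`, `dagger` and `close` are hypothetical.

```python
def outer(u, v):
    """Matrix of |u><v|."""
    return [[ui * complex(vj).conjugate() for vj in v] for ui in u]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(m):
    n = len(m)
    return [[complex(m[i][j]).conjugate() for i in range(n)] for j in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to rounding."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

phi = [3 / 5, 4j / 5]              # sample unit vector: |3/5|^2 + |4i/5|^2 = 1
P = outer(phi, phi)                # the projector P_phi of (II. 37)
Q = [[(1 if i == j else 0) - P[i][j] for j in range(2)] for i in range(2)]  # identity minus P

p_is_projector = close(matmul(P, P), P) and close(dagger(P), P)
q_is_projector = close(matmul(Q, Q), Q) and close(dagger(Q), Q)

# Positivity on one sample vector: <psi|P|psi> = |<phi|psi>|^2 >= 0.
psi = [1 - 1j, 2]
Ppsi = [sum(P[i][k] * psi[k] for k in range(2)) for i in range(2)]
expectation = sum(complex(a).conjugate() * b for a, b in zip(psi, Ppsi))
```

Both flags come out true, and `expectation` is real and non - negative up to rounding, as Exercise 8 (a) leads one to expect.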

Projectors are the workhorses of Hilbert space. Nearly all of our further considerations concerning

quantum mechanics can be formulated in terms of projectors, and therefore we will now discuss their

properties in somewhat more detail.

We write the set of all projectors on a Hilbert space H as P (H). Every projector P can be

characterized by means of its range, i.e. the set

H P := { P |ψ⟩ : |ψ⟩ ∈ H } . (II. 38)

This set is closed under linear combinations and thus forms another Hilbert space by itself, called a

subspace of H. Conversely, every subspace of H corresponds unambiguously to a projector. 2

The subspace corresponding to a projector is also called its eigenspace, and if the dimension of its

eigenspace is N, the projector is called N - dimensional.

Two projectors P 1 and P 2 are called mutually orthogonal, written as P 1 ⊥ P 2 , if

P 1 P 2 = 0. (II. 39)

In that case their eigenspaces are also orthogonal,

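\[ P_1 \perp P_2 \quad\text{iff}\quad \forall\, |\psi\rangle \in \mathcal{H}_{P_1},\ \forall\, |\varphi\rangle \in \mathcal{H}_{P_2} \ \text{it holds that}\ \langle\varphi | \psi\rangle = 0. \tag{II.40} \]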

EXERCISE 9. Verify that P 1 P 2 = 0 =⇒ P 2 P 1 = 0 holds for projectors.

For two orthogonal projectors P 1 ⊥ P 2 , the sum P 1 + P 2 is also a projector since it is, as can be

seen using (II. 27), self - adjoint, and it is idempotent,

\[ (P_1 + P_2)^2 = P_1^2 + P_1 P_2 + P_2 P_1 + P_2^2 = P_1^2 + P_2^2 = P_1 + P_2, \tag{II.41} \]

thereby satisfying the requirements (II. 36). The eigenspace of the projector P 1 + P 2 is the linear

space spanned by the vectors in H P1 and H P2 .
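The argument around (II. 41) can be followed on a small example in C³. The sketch assumes two sample projectors whose ranges are orthogonal; the entries are chosen so that the arithmetic is exact, and `matmul` and `add` are hypothetical helper names.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# P1 projects on (1, 0, 0); P2 projects on (0, 1, 1)/sqrt(2). Both are illustrative.
P1 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
P2 = [[0, 0, 0], [0, 0.5, 0.5], [0, 0.5, 0.5]]
zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

orthogonal = matmul(P1, P2) == zero and matmul(P2, P1) == zero
P_sum = add(P1, P2)
sum_is_idempotent = matmul(P_sum, P_sum) == P_sum

# Counterexample: P3 projects on (1, 0, 0) as well, so P1 P3 is not zero
# and P1 + P3 = 2 P1 is not idempotent.
P3 = P1
not_idempotent = matmul(add(P1, P3), add(P1, P3)) != add(P1, P3)
```

The counterexample at the end indicates why orthogonality is needed: without it the cross terms in (II. 41) do not vanish.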

2 In infinite - dimensional Hilbert spaces this only holds for closed subspaces.

A set of projectors P 1 , . . . , P N is called mutually orthogonal if

P i P j = δ ij P i for i, j = 1, . . . , N, (II. 42)

a set of mutually orthogonal projectors is called complete if

\[ \sum_{i=1}^{N} P_i = \mathbb{1}. \tag{II.43} \]

In particular, in accordance with (II. 19), for an orthonormal basis |α 1 ⟩, . . . , |α N ⟩ it holds that the

associated 1 - dimensional projectors form a complete set,

\[ \sum_{i=1}^{N} |\alpha_i\rangle\langle\alpha_i| = \mathbb{1}. \tag{II.44} \]

II. 3

EIGENVALUE PROBLEM AND SPECTRAL THEOREM

If |β 1 ⟩, . . . , |β N ⟩ is an arbitrary orthonormal basis, an operator A is represented in this basis as

an arbitrary N × N - matrix,

A ij = ⟨β i | A | β j ⟩. (II. 45)

A powerful tool for the study of such matrices is obtained if they can be ‘diagonalized’, i.e., if an

orthonormal basis |α 1 ⟩, . . . , |α N ⟩ can be found where the matrix representation of A is of the form

\[ A = \begin{pmatrix} a_1 & & 0 \\ & \ddots & \\ 0 & & a_N \end{pmatrix}, \tag{II.46} \]

or, equivalently,

A ij = a j δ ij . (II. 47)

For such a basis it holds that

A |α i ⟩ = a i |α i ⟩. (II. 48)

Equation (II. 48) is called the eigenvalue equation of the operator A, the values a i are called the

eigenvalues of A, the set of eigenvalues of A the spectrum of A, written as Spec A, the vectors |α i ⟩

are called the eigenvectors, and the system |α 1 ⟩, . . . , |α N ⟩ an eigenbasis of A. For a self - adjoint

operator it holds that the eigenvalues are all real, and the eigenvalues are not negative if the operator

is positive. For a unitary operator U all eigenvalues u i ∈ C are on the complex unit circle, |u i | = 1,

for a projector the eigenvalues are 0 or 1.

An orthonormal basis of eigenvectors does not, however, exist for every operator. See as an example operator

(II. 29), whose eigenvectors span only a 1 - dimensional subspace. The conditions under which such a basis exists are given by the next important

theorem which we mention without proof.

SPECTRAL THEOREM:

Every normal operator A has an orthonormal basis of eigenvectors |α 1 ⟩, . . . , |α N ⟩ and

associated eigenvalues a 1 , . . . , a N , not necessarily distinct, satisfying (II. 48).

The spectral theorem tells us that normal operators can be diagonalized. This can be formulated

more elegantly in Dirac notation, where we must distinguish between the case in which all eigenvalues

differ from each other, and the case in which some eigenvalues are equal. In the first case the operator

is called maximal, in the second case the operator is called degenerate.

Suppose that the operator A is maximal, i.e. all eigenvalues a i differ from each other, a i ≠ a j

if i ≠ j. In this case we often use the eigenvalues as a label for the eigenvectors and write |a i ⟩ instead

of |α i ⟩. This notation is unambiguous, since every eigenvalue belongs to exactly one eigenvector of the basis.

Now, according to the spectral theorem, there is an orthonormal basis |a 1 ⟩, . . . , |a N ⟩ such that

\[ A = \sum_{i=1}^{N} a_i\, |a_i\rangle\langle a_i|, \tag{II.49} \]

since, with (II. 44), it holds for all |ψ⟩ ∈ H that

\[ A\,|\psi\rangle = A\, \mathbb{1}\, |\psi\rangle = A \sum_{i=1}^{N} |a_i\rangle\langle a_i | \psi\rangle = \sum_{i=1}^{N} a_i\, |a_i\rangle\langle a_i | \psi\rangle. \tag{II.50} \]
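As a concrete instance of (II. 49), consider the Pauli matrix σ_y, a maximal self - adjoint operator on C² with eigenvalues +1 and −1. The example and the helper name `outer` are illustrative assumptions, not taken from the text; the eigenvectors are entered by hand rather than computed.

```python
import math

def outer(u, v):
    """Matrix of |u><v|."""
    return [[ui * complex(vj).conjugate() for vj in v] for ui in u]

s = 1 / math.sqrt(2)
sigma_y = [[0, -1j], [1j, 0]]

# Eigenvalue / eigenvector pairs of sigma_y, entered by hand.
eigenpairs = [(1, [s, 1j * s]), (-1, [s, -1j * s])]

# Build sum_i a_i |a_i><a_i| as in (II. 49).
decomposition = [[0j, 0j], [0j, 0j]]
for a, vec in eigenpairs:
    projector = outer(vec, vec)
    for i in range(2):
        for j in range(2):
            decomposition[i][j] += a * projector[i][j]

max_deviation = max(abs(decomposition[i][j] - sigma_y[i][j])
                    for i in range(2) for j in range(2))
```

The quantity `max_deviation` is zero up to rounding, so the weighted sum of the two 1 - dimensional projectors reproduces the operator.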

If the operator is degenerate there are only M < N distinct eigenvalues a 1 , . . . , a M . For every

eigenvalue a i , there exists a number n i of mutually orthogonal eigenvectors, for which we have

\[ \sum_{i=1}^{M} n_i = N. \tag{II.51} \]

The eigenvalue a i is called n i - fold degenerate. The associated eigenvectors span a n i - dimensional

subspace of eigenvectors for the value a i .

Choose, in this subspace, an orthonormal basis {|α i , j⟩} with j = 1, . . . , n i . Here we can also

use the eigenvalues a i as a label for the basis vectors because the extra label j prevents our notation

from becoming ambiguous. Now the eigenvalue equation (II. 48) becomes

A |a i , j⟩ = a i |a i , j⟩. (II. 52)

Analogous to (II. 49), we find

\[ A = \sum_{i=1}^{M} a_i \sum_{j=1}^{n_i} |a_i, j\rangle\langle a_i, j|, \tag{II.53} \]

which, in terms of the n i - dimensional eigenprojectors

\[ P_{a_i} = \sum_{j=1}^{n_i} |a_i, j\rangle\langle a_i, j|, \tag{II.54} \]

can also be written as

\[ A = \sum_{i=1}^{M} a_i\, P_{a_i}. \tag{II.55} \]

EXERCISE 10. (a) Show that P ai in (II. 54) is independent of the choice of the orthonormal

basis |a i , 1⟩, . . . , |a i , n i ⟩. (b) Show that for P ai as defined in (II. 54) and P ϕ given by (II. 37),

Tr P ai P ϕ = ⟨ϕ | P ai | ϕ⟩. (II. 56)

We summarize the two preceding cases in the following, equivalent, form of the spectral theorem,

formulated in terms of projectors.

SPECTRAL THEOREM:

For every normal operator A a unique set of mutually distinct eigenvalues a 1 , . . . , a M

exists, with M ≤ N, and an associated unique complete set of mutually orthogonal projectors

P a1 , . . . , P aM , such that

\[ A = \sum_{i=1}^{M} a_i\, P_{a_i}, \tag{II.57} \]

\[ \mathbb{1} = \sum_{i=1}^{M} P_{a_i}. \tag{II.58} \]

If the operator is non - degenerate, all of these projectors are 1 - dimensional; if it is degenerate,

dim P ai gives the degeneracy of eigenvalue a i . Equation (II. 57) is called the spectral decomposition

of A, the set of mutually orthogonal projectors P ai is called the spectral family of A, and (II. 58) a

resolution of identity.
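A small worked example may help to fix these terms. It assumes, purely for illustration, the degenerate operator A = diag(2, 2, 5) on C³, so that N = 3 and M = 2:

```latex
\[
A = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 5 \end{pmatrix}
  = 2 \underbrace{\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}}_{P_{2}}
  + 5 \underbrace{\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}}_{P_{5}},
\qquad
P_{2} + P_{5} = \mathbb{1}, \qquad P_{2} P_{5} = 0 .
\]
```

The eigenvalue 2 is 2 - fold degenerate and its eigenprojector is 2 - dimensional, while the eigenvalue 5 is non - degenerate; the degeneracies add up to 2 + 1 = 3 = N, in agreement with (II. 51).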

II. 3. 1

APPENDIX

A formulation of the spectral theorem which is equivalent to the preceding, but is more suitable

for generalizations, can be obtained if we introduce the correspondence between the eigenvalues and

the associated eigenprojectors as a mapping A of all subsets of Spec A ⊂ C to the set P (H) of

projectors on H.

We construct that mapping by demanding

{a i } ↦→ P ai , (II. 59)

and extend this with the condition

{a 1 , a 2 } ↦→ P {a1 , a 2 } := P a1 + P a2 , (II. 60)

or, more generally, if ∆ represents an arbitrary set of eigenvalues, we define

\[ \Delta \mapsto P_\Delta = \sum_{a \in \Delta} P_a. \tag{II.61} \]

A mapping A : C → P (H) is called a projection - valued measure if

(i) \( P_{\emptyset} = 0 \),

(ii) \( P_{\operatorname{Spec} A} = \mathbb{1} \),

(iii) \( P_{\cup_i \Delta_i} = \sum_i P_{\Delta_i} \), for all ∆ i mutually disjoint. (II. 62)

EXERCISE 11. Verify that \( P_{\Delta^c} = \mathbb{1} - P_\Delta \), where \( \Delta^c = \operatorname{Spec} A \setminus \Delta \) is the complement of ∆.
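The mapping ∆ ↦ P ∆ of (II. 61) is easy to realize for a concrete operator. The sketch below assumes the illustrative operator diag(2, 2, 5) on C³ with its two eigenprojectors entered by hand; the function name `pvm` is a hypothetical choice.

```python
# Eigenprojectors of the illustrative operator diag(2, 2, 5) on C^3.
projectors = {
    2: [[1, 0, 0], [0, 1, 0], [0, 0, 0]],
    5: [[0, 0, 0], [0, 0, 0], [0, 0, 1]],
}
spectrum = frozenset(projectors)

def pvm(delta):
    """P_Delta as in (II. 61): sum of the eigenprojectors P_a over all a in Delta."""
    total = [[0] * 3 for _ in range(3)]
    for a in delta:
        if a in projectors:          # only eigenvalues contribute
            for i in range(3):
                for j in range(3):
                    total[i][j] += projectors[a][i][j]
    return total

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
zero = [[0] * 3 for _ in range(3)]

empty_ok = pvm(set()) == zero                     # condition (i)
full_ok = pvm(spectrum) == identity               # condition (ii)
# Complement rule of Exercise 11 for Delta = {2}.
delta = {2}
complement = spectrum - delta
complement_ok = pvm(complement) == [
    [identity[i][j] - pvm(delta)[i][j] for j in range(3)] for i in range(3)
]
```

Additivity, condition (iii), holds here by construction, since `pvm` simply sums the eigenprojectors of the eigenvalues contained in the given set.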

The spectral theorem can now again be formulated.

SPECTRAL THEOREM:

Every normal operator A corresponds unambiguously to a projection - valued measure A.

II. 4

FUNCTIONS OF NORMAL OPERATORS

The spectral theorem makes it possible to treat functions of normal operators in a simple manner.

If f is an arbitrary function, real or complex, and A is an operator with spectral decomposition

\[ A = \sum_{i=1}^{M} a_i\, P_{a_i}, \tag{II.63} \]

then the function f (A) of A is defined as

\[ f(A) := \sum_{i=1}^{M} f(a_i)\, P_{a_i}. \tag{II.64} \]

This means that f (A) always has the same eigenvectors and eigenprojections as A, and only differs

from A in the labeling of its eigenvalues, namely by f (a i ) instead of a i . As an example, consider the

characteristic function χ a of a ∈ C,

\[ \chi_a : \mathbb{C} \to \{0, 1\}, \quad x \mapsto \chi_a(x) := \begin{cases} 1 & \text{if } x = a \\ 0 & \text{otherwise} \end{cases} \tag{II.65} \]

for which, with (II. 64), we have

\[ \chi_{a_k}(A) := \sum_{i=1}^{M} \chi_{a_k}(a_i)\, P_{a_i} = P_{a_k}, \tag{II.66} \]

and we see that the projectors from the spectral decomposition of A, (II. 63), are functions of A.
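Definition (II. 64) can be compared with ordinary matrix algebra in a case where both are available, for instance f (x) = x². The sketch assumes the illustrative self - adjoint matrix with rows (2, 1) and (1, 2), whose eigenvalues 1 and 3 and eigenvectors are entered by hand; `function_of_operator` is a hypothetical helper name.

```python
import math

def outer(u, v):
    """Matrix of |u><v| for real or complex components."""
    return [[ui * complex(vj).conjugate() for vj in v] for ui in u]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

s = 1 / math.sqrt(2)
# Spectral data of the matrix [[2, 1], [1, 2]]: eigenvalue 1 with (1, -1)/sqrt(2),
# eigenvalue 3 with (1, 1)/sqrt(2).
eigenpairs = [(1.0, [s, -s]), (3.0, [s, s])]

def function_of_operator(f):
    """f(A) as in (II. 64): sum of f(a_i) times the eigenprojector of a_i."""
    result = [[0j, 0j], [0j, 0j]]
    for a, vec in eigenpairs:
        projector = outer(vec, vec)
        for i in range(2):
            for j in range(2):
                result[i][j] += f(a) * projector[i][j]
    return result

A = function_of_operator(lambda x: x)              # reproduces the operator itself
A_squared = function_of_operator(lambda x: x * x)  # f(x) = x^2 via (II. 64)
A_times_A = matmul(A, A)                           # ordinary matrix product
```

Up to rounding, `A_squared` and `A_times_A` both equal the matrix with rows (5, 4) and (4, 5). The same helper accepts functions such as `math.exp`, for which no finite matrix product is available.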

We use the spectral decompositions in the proof of the following theorem.

THEOREM:

If two self - adjoint operators A and B commute, there is a maximal, self - adjoint operator

C of which both A and B are a function.

To prove this theorem we first prove a useful lemma.

LEMMA:

If [A, B] = 0, a basis {|γ i ⟩} exists in which A and B are simultaneously diagonal.

Proof

Let {|a i , j⟩} be an orthonormal eigenbasis of operator A, where j = 1, . . . , n i with n i the degeneracy

of eigenvalue a i , and we have

⟨a p , q | a i , j⟩ = δ pi δ qj . (II. 67)

Analogously, let there be an orthonormal eigenbasis {|b k , l⟩} for operator B. From [A, B] = 0

and (II. 63) it follows that

A ( B |a i , j⟩ ) = B A |a i , j⟩ = a i B |a i , j⟩, (II. 68)

and B |a i , j⟩ is, apparently, an eigenvector of A with the eigenvalue a i , i.e., B |a i , j⟩ is in the

eigenspace spanned by |a i , 1⟩, . . . , |a i , n i ⟩. Or, equivalently,

\[ B\,|a_i, j\rangle = \sum_{k=1}^{n_i} \Lambda^{[i]}_{k,j}\, |a_i, k\rangle \tag{II.69} \]

holds for certain numbers \( \Lambda^{[i]}_{k,j} \in \mathbb{C} \).

By assumption, B is self - adjoint and therefore the matrix Λ [i] must be Hermitian,

\[ \langle a_k, l | B | a_i, j\rangle = \Lambda^{[i]}_{l,j}\, \delta_{ki}, \tag{II.70} \]

\[ \langle a_k, l | B | a_i, j\rangle^{*} = \Lambda^{[i]\,*}_{l,j}\, \delta_{ki} = \langle a_i, j | B^\dagger | a_k, l\rangle = \Lambda^{[k]}_{j,l}\, \delta_{ik}, \tag{II.71} \]

and we see that

\[ \Lambda^{[i]\,*}_{l,j} = \Lambda^{[i]}_{j,l}. \tag{II.72} \]

Because Λ [i] is self - adjoint, it can be diagonalized by a unitary matrix S [i] ,

\[ \Lambda'^{[i]} = S^{[i]\,-1}\, \Lambda^{[i]}\, S^{[i]}. \tag{II. 73} \]

This corresponds to an orthonormal basis transformation within the n i - dimensional subspace

with eigenvalue a i . Carrying out this transformation in each of the subspaces and writing |a i , m ′ ⟩

for the transformed eigenvectors of A, we have

\[ |a_i, m'\rangle = \sum_{j=1}^{n_i} S^{[i]}_{j,m'}\, |a_i, j\rangle. \tag{II. 74} \]

In the new basis {|a i , m ′ ⟩} the matrix Λ [i] is diagonalized and therefore

\[ B\,|a_i, m'\rangle = \Lambda'^{[i]}_{m',m'}\, \delta_{m'j}\, |a_i, j\rangle = \Lambda'^{[i]}_{m',m'}\, |a_i, m'\rangle. \tag{II. 75} \]

The vectors |a i , m ′ ⟩ are not just eigenvectors of A, but also of B and form, by construction, a

basis. □
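The construction in the proof can be followed on a small example. The sketch below assumes two commuting 3×3 real symmetric matrices chosen for illustration (A with a doubly degenerate eigenvalue, B mixing the two degenerate directions); all names are hypothetical.

```python
import math

# Sketch of the lemma: diagonalize B inside the degenerate eigenspace of A.
# The matrices A and B are illustrative assumptions, not taken from the text.

def mat_mul(X, Y):
    """Ordinary matrix product X Y."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y))) for c in range(len(Y[0]))]
            for r in range(len(X))]

def mat_vec(M, v):
    """Matrix M applied to the vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def is_eigenvector(M, v, eigenvalue, tol=1e-12):
    """True if M v equals eigenvalue * v up to the tolerance tol."""
    return all(abs(x - eigenvalue * y) < tol for x, y in zip(mat_vec(M, v), v))

A = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]   # eigenvalue 1 is doubly degenerate
B = [[0, 1, 0], [1, 0, 0], [0, 0, 3]]   # acts non-diagonally inside that eigenspace

AB, BA = mat_mul(A, B), mat_mul(B, A)
commutator = [[AB[r][c] - BA[r][c] for c in range(3)] for r in range(3)]

# Within the eigenspace a = 1 the matrix of B is Lambda = [[0, 1], [1, 0]].
# The columns of the unitary (here real orthogonal) matrix S diagonalize it.
s = 1 / math.sqrt(2)
S = [[s, s], [s, -s]]

# New basis vectors, built as in (II. 74): sum over j of S[j][m] |a_1, j>.
new_basis = [[S[0][m], S[1][m], 0.0] for m in range(2)] + [[0.0, 0.0, 1.0]]
eigenvalues_A = [1, 1, 2]
eigenvalues_B = [1, -1, 3]
```

Each vector in `new_basis` is an eigenvector of both matrices, with the eigenvalues listed in `eigenvalues_A` and `eigenvalues_B`.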

Notice that it is not in contradiction to this lemma if non - commuting operators have some eigenvectors

in common. ▹

Now we come to the proof of the theorem.

Proof

Define, in the basis {|γ i ⟩} of the lemma

\[ A = \sum_i a_i\, P_{|\gamma_i\rangle} \quad \text{and} \quad B = \sum_i b_i\, P_{|\gamma_i\rangle}, \tag{II. 76} \]

where the eigenvalues a i and b i are allowed to be degenerate. Next, define a maximal self - adjoint

operator

\[ C = \sum_i c_i\, P_{|\gamma_i\rangle}, \tag{II. 77} \]

with all c i ∈ R distinct.

Then, according to (II. 66), with χ ci defined analogously to (II. 65),

\[ P_{|\gamma_i\rangle} = \chi_{c_i}(C). \tag{II. 78} \]

With $f(x) = \sum_i a_i\, \chi_{c_i}(x)$ and $g(x) = \sum_i b_i\, \chi_{c_i}(x)$, as defined in (II. 64), we now find

\[ A = \sum_i a_i\, \chi_{c_i}(C) = f(C) \quad \text{and} \quad B = \sum_i b_i\, \chi_{c_i}(C) = g(C). \tag{II. 79} \]


Thus, both self - adjoint, and mutually commuting, operators A and B are functions of the maximal,

self - adjoint operator C, which is what we set out to prove. □
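A minimal sketch of this construction, under the assumption of a concrete common eigenbasis in three dimensions and hypothetical eigenvalue lists `a`, `b` and `c`: the maximal operator C gets distinct eigenvalues, and f and g are lookup tables on the spectrum of C.

```python
import math

# Sketch of the theorem: A = f(C) and B = g(C) for a maximal C.
# The basis vectors and the eigenvalue lists are illustrative assumptions.

s = 1 / math.sqrt(2)
gammas = [[s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]]   # common eigenbasis (real)
a = [1, 1, 2]     # eigenvalues of A, with a degeneracy
b = [1, -1, 3]    # eigenvalues of B
c = [1, 2, 3]     # distinct real eigenvalues chosen for C

def projector(v):
    """P_|v> = |v><v| for a real unit vector v."""
    return [[x * y for y in v] for x in v]

def spectral_sum(values):
    """Return the sum over i of values[i] * P_|gamma_i>."""
    total = [[0.0] * 3 for _ in range(3)]
    for value, v in zip(values, gammas):
        P = projector(v)
        for r in range(3):
            for col in range(3):
                total[r][col] += value * P[r][col]
    return total

f = dict(zip(c, a))   # f(c_i) = a_i, well defined because the c_i are distinct
g = dict(zip(c, b))   # g(c_i) = b_i

C = spectral_sum(c)
f_of_C = spectral_sum([f[ci] for ci in c])   # equals A
g_of_C = spectral_sum([g[ci] for ci in c])   # equals B
```

The lookup tables only work because the eigenvalues of C are all different; with a degenerate C, two different values a i could be attached to the same c i and f would not be a function.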

Note that the choice of C in the above theorem is not unique. Indeed, suppose that

A = f (C 1 ) = g(C 2 ), (II. 80)

where C 1 and C 2 are both maximal. In general, it is not required for C 1 and C 2 to commute.

But they do commute if A itself is maximal. In that case f can be inverted,

C 1 = f − 1 (A) = f − 1 (g(C 2 )) (II. 81)

from which it follows that

[C 1 , C 2 ] = 0. ▹ (II. 82)

II. 5

DIRECT SUM AND DIRECT PRODUCT

There are two ways to construct a new Hilbert space H from two given Hilbert spaces H 1

and H 2 , or vice versa, to divide a given Hilbert space H into smaller spaces.

II. 5. 1

DIRECT SUM

Let H 1 and H 2 be two Hilbert spaces. By definition we call the space H := H 1 ⊕ H 2 the direct

sum space of H 1 and H 2 if the following requirements are satisfied:

(i) The space H 1 ⊕ H 2 contains as its elements all ordered pairs of vectors, written as |ϕ⟩ 1 ⊕ |ψ⟩ 2 ,

with |ϕ⟩ 1 ∈ H 1 and |ψ⟩ 2 ∈ H 2 .

(ii) Addition and scalar multiplication are defined on H 1 ⊕ H 2 , and obey

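\[ a \big( |\phi\rangle_1 \oplus |\psi\rangle_2 \big) + b \big( |\chi\rangle_1 \oplus |\xi\rangle_2 \big) = \big( a\,|\phi\rangle_1 + b\,|\chi\rangle_1 \big) \oplus \big( a\,|\psi\rangle_2 + b\,|\xi\rangle_2 \big). \tag{II. 83} \]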

(iii) The inner product is additive,

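\[ \big( {}_1\langle\phi| \oplus {}_2\langle\phi| \big) \big( |\psi\rangle_1 \oplus |\psi\rangle_2 \big) = {}_1\langle\phi \,|\, \psi\rangle_1 + {}_2\langle\phi \,|\, \psi\rangle_2. \tag{II. 84} \]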

(iv) H 1 ⊕ H 2 is the smallest Hilbert space spanned by the elements of the form |ϕ⟩ 1 ⊕ |ψ⟩ 2 and

their linear combinations.


A few remarks about this definition are in order. (a) According to (II. 83), an arbitrary linear
combination of elements in H 1 ⊕ H 2 is of the form

\[ \sum_i a_i \big( |\phi_i\rangle_1 \oplus |\psi_i\rangle_2 \big) = \sum_i a_i\, |\phi_i\rangle_1 \oplus \sum_i a_i\, |\psi_i\rangle_2. \tag{II. 85} \]

Consequently, with |ϕ⟩ 1 := ∑ i a i |ϕ i ⟩ 1 ∈ H 1 and |ψ⟩ 2 := ∑ i a i |ψ i ⟩ 2 ∈ H 2 , all elements in H 1 ⊕ H 2
are of the form |ϕ⟩ 1 ⊕ |ψ⟩ 2 . This means that the requirements (i) and (ii) imply that H 1 ⊕ H 2 is closed
under linear combinations.

(b) The subspace of H 1 ⊕ H 2 consisting of all vectors of the form 0 1 ⊕ |ψ⟩ 2 , with 0 1 the null vector
in H 1 and |ψ⟩ 2 ∈ H 2 arbitrary, is isomorphic to H 2 ; likewise, the subspace of vectors |ϕ⟩ 1 ⊕ 0 2 is
isomorphic to H 1 . Moreover, these two subspaces of H 1 ⊕ H 2 are mutually orthogonal, because

\[ \big( {}_1\langle\phi| \oplus 0_2 \big) \big( 0_1 \oplus |\psi\rangle_2 \big) = {}_1\langle\phi \,|\, 0\rangle_1 + {}_2\langle 0 \,|\, \psi\rangle_2 = 0. \tag{II. 86} \]

Therefore, every vector |χ⟩ ∈ H 1 ⊕ H 2 can be written uniquely as the direct sum of two orthogonal

terms,

|χ⟩ = |ϕ⟩ 1 ⊕ |ψ⟩ 2 = |ϕ⟩ 1 ⊕ 0 2 + 0 1 ⊕ |ψ⟩ 2 . (II. 87)

Vice versa, suppose that H is an arbitrary Hilbert space, and that H 1 is a subspace of H. Now

let H 2 = H 1 ⊥ be the orthocomplement of H 1 , i.e., H 2 contains all vectors in H which are perpendicular

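to all vectors in H 1 . Then H = H 1 ⊕ H 2 holds, with the identification

\[ |\phi\rangle_1 \oplus 0_2 \leftrightarrow |\phi\rangle \in \mathcal{H}_1, \tag{II. 88} \]

\[ 0_1 \oplus |\psi\rangle_2 \leftrightarrow |\psi\rangle \in \mathcal{H}_2, \tag{II. 89} \]

and

\[ |\phi\rangle \oplus |\psi\rangle = |\phi\rangle + |\psi\rangle. \tag{II. 90} \]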

In this case the direct sum ⊕ is nothing but ordinary addition in H, which was given in (II. 1) as

a general property of H. This means that every Hilbert space can be written as a direct sum of

an arbitrary subspace and its orthocomplement. We also see something that holds generally: the

dimension of H 1 ⊕ H 2 is the sum of the dimensions of H 1 and H 2 ,

dim (H 1 ⊕ H 2 ) = dim H 1 + dim H 2 . (II. 91)
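In coordinates, the direct sum of two finite-dimensional spaces can be pictured as concatenation of component lists. The following sketch, which assumes H 1 = C² and H 2 = C³ and uses hypothetical example vectors, checks the additive inner product (II. 84), the orthogonality (II. 86) and the dimension formula (II. 91).

```python
# Sketch: direct sum of C^2 and C^3 as concatenation of component lists.
# The example vectors are illustrative assumptions.

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def direct_sum(phi, psi):
    """Components of |phi>_1 (+) |psi>_2."""
    return list(phi) + list(psi)

phi1, chi1 = [1 + 0j, 1j], [2 + 0j, 0j]                 # vectors in H_1 = C^2
psi2, xi2 = [1j, 0j, 1 + 0j], [0j, 1 + 0j, 1j]          # vectors in H_2 = C^3
zero1, zero2 = [0j, 0j], [0j, 0j, 0j]                   # null vectors

# Additivity of the inner product, as in (II. 84).
lhs = inner(direct_sum(phi1, psi2), direct_sum(chi1, xi2))
rhs = inner(phi1, chi1) + inner(psi2, xi2)

# The two embedded subspaces are orthogonal, as in (II. 86).
cross_term = inner(direct_sum(phi1, zero2), direct_sum(zero1, psi2))

# The dimension is the sum of the dimensions, as in (II. 91).
dimension = len(direct_sum(phi1, psi2))
```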

II. 5. 2

DIRECT PRODUCT

There is another, actually more important, way to construct a new Hilbert space out of two given

spaces. Again, let H 1 and H 2 be two Hilbert spaces. By definition we call the space H :=

H 1 ⊗ H 2 the direct product space if the following requirements are satisfied:


(i) The space H 1 ⊗ H 2 has as its elements at least all ordered pairs ( |ϕ⟩ 1 , |ψ⟩ 2 ), with |ϕ⟩ 1 ∈ H 1
and |ψ⟩ 2 ∈ H 2 , which we now write as |ϕ⟩ 1 ⊗ |ψ⟩ 2 .

(ii) The addition and scalar multiplication on H 1 ⊗ H 2 satisfy

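\[ |\phi\rangle_1 \otimes |\psi\rangle_2 + |\phi\rangle_1 \otimes |\chi\rangle_2 = |\phi\rangle_1 \otimes \big( |\psi\rangle_2 + |\chi\rangle_2 \big) \tag{II. 92} \]

and

\[ a \big( |\phi\rangle_1 \otimes |\psi\rangle_2 \big) = a\,|\phi\rangle_1 \otimes |\psi\rangle_2 = |\phi\rangle_1 \otimes a\,|\psi\rangle_2. \tag{II. 93} \]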

(iii) The inner product is multiplicative,

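\[ \big( {}_1\langle\phi| \otimes {}_2\langle\chi| \big) \big( |\psi\rangle_1 \otimes |\xi\rangle_2 \big) = {}_1\langle\phi \,|\, \psi\rangle_1 \; {}_2\langle\chi \,|\, \xi\rangle_2. \tag{II. 94} \]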

(iv) H 1 ⊗ H 2 is the smallest Hilbert space spanned by vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 ∈ H and

their linear combinations.

If |α 1 ⟩ 1 , . . . , |α N1 ⟩ 1 is an orthonormal basis in H 1 , and |β 1 ⟩ 2 , . . . , |β N2 ⟩ 2 is likewise an
orthonormal basis in H 2 , with N 1 = dim H 1 and N 2 = dim H 2 , their direct products, i.e., the vectors
of the form |α i ⟩ 1 ⊗ |β j ⟩ 2 , provide an orthonormal set of vectors in H 1 ⊗ H 2 . Indeed, using (II. 94),

\[ \big( {}_1\langle\alpha_j| \otimes {}_2\langle\beta_k| \big) \big( |\alpha_m\rangle_1 \otimes |\beta_n\rangle_2 \big) = {}_1\langle\alpha_j \,|\, \alpha_m\rangle_1 \; {}_2\langle\beta_k \,|\, \beta_n\rangle_2 = \delta_{jm}\, \delta_{kn}. \tag{II. 95} \]

Because orthonormal vectors are independent, the dimension of H 1 ⊗ H 2 cannot be smaller than

the product of the separate dimensions. But furthermore, according to (iv), all vectors in H 1 ⊗ H 2

are obtainable as linear combinations of vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 , which in turn are linear

combinations of the vectors |α j ⟩ 1 ⊗|β k ⟩ 2 . Therefore, these vectors also span the entire space H 1 ⊗H 2 .

In other words, |α 1 ⟩ 1 ⊗ |β 1 ⟩ 2 , |α 2 ⟩ 1 ⊗ |β 1 ⟩ 2 , . . . , |α N1 ⟩ 1 ⊗ |β N2 ⟩ 2 is also a basis for H 1 ⊗ H 2 . For

the dimension of H 1 ⊗ H 2 we thus find

dim (H 1 ⊗ H 2 ) = dim H 1 · dim H 2 . (II. 96)

Consequently, an arbitrary vector |χ⟩ ∈ H 1 ⊗ H 2 can, in this product basis |α j ⟩ 1 ⊗ |β k ⟩ 2 , be written

as

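\[ |\chi\rangle = \sum_{j=1}^{N_1} \sum_{k=1}^{N_2} c_{jk}\, |\alpha_j\rangle_1 \otimes |\beta_k\rangle_2 \quad \text{with} \quad c_{jk} = \big( {}_1\langle\alpha_j| \otimes {}_2\langle\beta_k| \big)\, |\chi\rangle \in \mathbb{C}. \tag{II. 97} \]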

For vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 it holds that

\[ \sum_{j=1}^{N_1} a_j\, |\alpha_j\rangle_1 \otimes \sum_{k=1}^{N_2} b_k\, |\beta_k\rangle_2 = \sum_{j=1}^{N_1} \sum_{k=1}^{N_2} a_j b_k\, |\alpha_j\rangle_1 \otimes |\beta_k\rangle_2. \tag{II. 98} \]


We see that (II. 98) is a special case of (II. 97), that is, where c jk = a j b k . The special vectors which

can be written as (II. 98), i.e., in the form |ϕ⟩ 1 ⊗|ψ⟩ 2 , are called direct product vectors, or factorizable.

In a direct sum space H 1 ⊕H 2 all vectors can be written in the form |ϕ⟩ 1 ⊕|ψ⟩ 2 , but in a direct product

space H 1 ⊗ H 2 not all vectors can be written in the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 . Further on we will see that

states for which c jk cannot be written as a j b k give rise to typical quantum mechanical behavior, as

in the thought experiment of EPR where composite systems are considered, corresponding to states

on H 1 ⊗ H 2 which cannot be factorized. Such states are called non - factorizable or entangled states.
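For two two-dimensional factors the condition c jk = a j b k can be tested directly: a 2×2 coefficient matrix is an outer product exactly when its determinant vanishes. The sketch below assumes this special case (N 1 = N 2 = 2) and uses hypothetical helper names; for larger dimensions one would check the rank of the coefficient matrix instead.

```python
import math

# Sketch: factorizable versus entangled vectors in C^2 (x) C^2.
# Helper names and example vectors are illustrative assumptions.

def tensor(phi, psi):
    """Components c_jk = a_j b_k of |phi>_1 (x) |psi>_2, listed row by row."""
    return [a * b for a in phi for b in psi]

def coefficient_matrix(chi, n1, n2):
    """Arrange the components of chi as the n1 x n2 matrix c_jk of (II. 97)."""
    return [chi[j * n2:(j + 1) * n2] for j in range(n1)]

def is_factorizable_2x2(chi, tol=1e-12):
    """For n1 = n2 = 2: c_jk = a_j b_k holds exactly when det(c) = 0."""
    c = coefficient_matrix(chi, 2, 2)
    return abs(c[0][0] * c[1][1] - c[0][1] * c[1][0]) < tol

s = 1 / math.sqrt(2)
product_vector = tensor([s, s], [1.0, 0.0])   # a direct product vector
singlet_like = [0.0, s, -s, 0.0]              # c_12 = -c_21, not a product

product_is_factorizable = is_factorizable_2x2(product_vector)
singlet_is_factorizable = is_factorizable_2x2(singlet_like)
```

The second vector has the coefficient pattern of the singlet state used in spin versions of the EPR experiment; no choice of a j and b k reproduces it.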

If A and B are operators on H 1 and H 2 , respectively, the direct product operator A ⊗ B is the

operator on H 1 ⊗ H 2 , defined by

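\[ (A \otimes B) \big( |\phi\rangle_1 \otimes |\psi\rangle_2 \big) := A\,|\phi\rangle_1 \otimes B\,|\psi\rangle_2. \tag{II. 99} \]

It follows that, with operators C on H 1 and D on H 2 ,

\[ (A \otimes B)\,(C \otimes D) = (A C) \otimes (B D). \tag{II. 100} \]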

Similar to vectors, operators on the direct product space H 1 ⊗ H 2 are not always factorizable. The

total momentum operator P 1 + P 2 and the distance operator Q 1 − Q 2 of EPR, with P as defined in

section I. 2, (I. 1), and Q likewise, are examples of such non - factorizable direct product operators,

\[ P_1 \otimes \mathbb{1}_2 + \mathbb{1}_1 \otimes P_2 \quad \text{and} \quad Q_1 \otimes \mathbb{1}_2 - \mathbb{1}_1 \otimes Q_2. \tag{II. 101} \]

EXERCISE 12. Calculate the commutator of these operators, given that $[P_i, Q_j] = -i\,\delta_{ij}$.

The following properties of the direct product of operators will, further on, be used frequently:

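\[ \begin{aligned}
A \otimes 0 &= 0 \otimes B = 0, \\
(A_1 + A_2) \otimes B &= (A_1 \otimes B) + (A_2 \otimes B), \\
\mathbb{1} \otimes \mathbb{1} &= \mathbb{1}, \\
a A \otimes b B &= a\,b\,(A \otimes B), \\
(A \otimes B)^{-1} &= A^{-1} \otimes B^{-1}, \\
(A \otimes B)^{\dagger} &= A^{\dagger} \otimes B^{\dagger}, \\
\mathrm{Tr}\,(bA \otimes cB) &= b\,c\;\mathrm{Tr}\,A \cdot \mathrm{Tr}\,B.
\end{aligned} \tag{II. 102} \]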

EXERCISE 13. Prove the properties of ⊗ in (II. 102).


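Finally, the matrix A ⊗ B of the operator A ⊗ B in the direct product space H 1 ⊗ H 2 is of the
form

\[ A \otimes B = \begin{pmatrix}
a_{11} \begin{pmatrix} b_{11} & \cdots & b_{1N_2} \\ \vdots & \ddots & \vdots \\ b_{N_2 1} & \cdots & b_{N_2 N_2} \end{pmatrix} & \cdots & a_{1N_1} B \\
\vdots & a_{22} B & \vdots \\
a_{N_1 1} B & \cdots & a_{N_1 N_1} B
\end{pmatrix}, \tag{II. 103} \]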

where a ij = ⟨α i | A | α j ⟩ and b kl = B kl = ⟨β k | B | β l ⟩, as in (II. 24). This matrix is called the

Kronecker product of the matrices A and B.
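The block structure of (II. 103) translates into a few lines of code. The following sketch builds the Kronecker product of two small integer matrices, chosen here purely for illustration, and checks the product rule (II. 100) and the trace rule from (II. 102); all function names are hypothetical.

```python
# Sketch: Kronecker product of matrices and two of its properties.
# The example matrices are illustrative assumptions.

def kron(A, B):
    """Matrix of A (x) B: the block in position (i, j) is a_ij * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def mat_mul(X, Y):
    """Ordinary matrix product X Y."""
    return [[sum(X[r][k] * Y[k][c] for k in range(len(Y))) for c in range(len(Y[0]))]
            for r in range(len(X))]

def trace(X):
    """Sum of the diagonal entries of a square matrix."""
    return sum(X[r][r] for r in range(len(X)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 2]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, 2]]

AB_kron = kron(A, B)                                  # a 4 x 4 matrix
lhs = mat_mul(kron(A, B), kron(C, D))                 # (A (x) B)(C (x) D)
rhs = kron(mat_mul(A, C), mat_mul(B, D))              # (A C) (x) (B D)
trace_lhs = trace(kron(A, B))
trace_rhs = trace(A) * trace(B)
```

With integer entries both identities hold exactly, so `lhs == rhs` and `trace_lhs == trace_rhs` can be compared without a tolerance.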

II. 6

ADDENDUM: INFINITE - DIMENSIONAL HILBERT SPACES

This section is intended for interested readers, who wish to gain more in - depth knowledge of

Hilbert spaces.

In physical applications of quantum mechanics we nearly always need infinite - dimensional Hilbert

spaces. Indeed, this already applies to the case of a free particle in one spatial dimension.

The mathematical theory of infinite - dimensional Hilbert spaces is in some aspects more difficult

than that of finite - dimensional ones.

II. 6. 1

THE STRUCTURE OF VECTOR SPACES

An infinite - dimensional space H is a space where for every n independent vectors in H, with

n arbitrarily large, it is always possible to find still another vector in H that is independent of these

vectors. In rough approximation it can be said that all formulas of the previous sections remain valid

if we replace the sums from 1 to N by sums from 1 to infinity. But, of course, attention must be given

to the convergence of such sums. This leads to two extra assumptions which were superfluous in the

theory of finite - dimensional spaces.

(i) Separability. A Hilbert space H is called separable if it has a countable basis, i.e., a countable
set of orthonormal vectors |ϕ 1 ⟩, |ϕ 2 ⟩, . . . , |ϕ j ⟩, . . . ∈ H exists such that every vector |ϕ⟩ ∈ H
can, analogously to (II. 14), be written as

\[ |\phi\rangle = \sum_{j=1}^{\infty} c_j\, |\phi_j\rangle \quad \text{with} \quad c_j = \langle\phi_j \,|\, \phi\rangle. \tag{II. 104} \]

This equation is shorthand for

\[ \lim_{m\to\infty} \Big\| \phi - \sum_{j=1}^{m} c_j\, \phi_j \Big\| = 0. \tag{II. 105} \]


(ii) Completeness. We require that the space is complete, which means that every Cauchy sequence,
i.e., a sequence of vectors |ϕ 1 ⟩, |ϕ 2 ⟩, . . . , |ϕ j ⟩, . . . ∈ H, for which

\[ \lim_{j,k\to\infty} \|\phi_j - \phi_k\| = 0, \tag{II. 106} \]

has a limit vector |ϕ⟩ in H,

\[ \lim_{m\to\infty} \|\phi_m - \phi\| = 0. \tag{II. 107} \]

For example, in this sense Q, the set of rational numbers, is incomplete, since many Cauchy
sequences of rational terms exist which have no limit in Q, for instance the partial sums of the series expansions of π

and e. If the limiting points of all Cauchy sequences are added to Q, we obtain exactly R. Q is called

a countably infinite set, R is called an uncountably infinite set.
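The incompleteness of Q can be made tangible with exact rational arithmetic. The sketch below assumes the standard series e = ∑ 1/n! and uses Python's `fractions` module; the function name is hypothetical. Every partial sum is a rational number and successive differences shrink rapidly, yet the limit e is irrational.

```python
from fractions import Fraction

# Sketch: a Cauchy sequence of rationals whose limit (the number e) is not rational.

def partial_sum_e(m):
    """Partial sum of 1/n! for n = 0, ..., m, as an exact rational number."""
    total = Fraction(0)
    term = Fraction(1)          # 1/0!
    for n in range(m + 1):
        if n > 0:
            term /= n           # turns 1/(n-1)! into 1/n!
        total += term
    return total

s3 = partial_sum_e(3)                              # 1 + 1 + 1/2 + 1/6
gap = partial_sum_e(11) - partial_sum_e(10)        # exactly 1/11!
approx_e = float(partial_sum_e(20))                # close to e, but still rational
```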

Below, we will assume Hilbert spaces to be separable and complete.

EXERCISE 14. Prove that every finite - dimensional complex vector space with an inner product

is separable and complete.

The claim in the above exercise makes clear that in the finite - dimensional case the requirements

of separability and completeness are indeed superfluous.

The next two spaces are well - known examples of infinite - dimensional Hilbert spaces.

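(i) The space of all complex, square integrable functions,

\[ L^2(\mathbb{R}) := \Big\{ \psi : \mathbb{R} \to \mathbb{C} \;\Big|\; \int_{\mathbb{R}} |\psi(q)|^2\, dq < \infty \Big\}, \tag{II. 108} \]

with an inner product defined as

\[ \langle\psi \,|\, \phi\rangle := \int_{\mathbb{R}} \psi^{*}(q)\, \phi(q)\, dq, \tag{II. 109} \]

and likewise for L 2 (R n ) with arbitrary n ∈ N + .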

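(ii) The space of square summable sequences of complex numbers, defined by Erhard Schmidt,

\[ l^2(\mathbb{N}) := \Big\{ c : \mathbb{N} \to \mathbb{C} \;\Big|\; \sum_{j=0}^{\infty} |c_j|^2 < \infty \Big\}, \tag{II. 110} \]

with inner product

\[ \langle c \,|\, d\rangle := \sum_{j=0}^{\infty} c_j^{*}\, d_j. \tag{II. 111} \]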


The proof that these vector spaces are complete is not simple; the proof that the remaining
requirements for a Hilbert space are met, however, is.

These two spaces correspond to two versions of quantum mechanics, where L 2 (R) corresponds

to Schrödinger’s wave mechanics (1926) and l 2 (N) to the matrix mechanics of Heisenberg, Born, and

Jordan (1925), that is, if we take matrix mechanics in the enriched version of Von Neumann, since

the original version did not contain a ‘state space’. These two versions of quantum mechanics are

mathematically equivalent, see F.A. Muller (1997a, 1997b and 1999) for historical details.

II. 6. 2

OPERATORS

More serious complications occur when introducing operators on infinite - dimensional Hilbert

spaces. First, we will see that such operators are in general ‘unbounded’, which entails that

they cannot be defined on the entire Hilbert space. Consequently, the definition of sum and product

of operators, as well as their adjoints, becomes more cumbersome, and the terms ‘self - adjoint’

and ‘Hermitian’ no longer coincide. Second, these operators do not always have eigenvectors in H.

Therefore it is more difficult to give a useful version of the spectral theorem.

The second problem is independent of the first, i.e., it can also appear for bounded self - adjoint

operators. ▹

For position and momentum both complications occur together which is shown by an example.

EXAMPLE

Consider the position operator

Q : ψ(q) ↦→ q ψ(q), (II. 112)

and the momentum operator

\[ P : \psi(q) \mapsto -i\, \frac{d}{dq}\, \psi(q), \tag{II. 113} \]

both acting on L 2 (R).

The first problem is that these operators do not map every vector in L 2 (R) to another vector

in L 2 (R). For instance, every non - differentiable function in L 2 (R) is outside the domain of P .

Vice versa, taking for Q, for example, ψ (q) = (a + q) − 3 2 with a ∈ R, we have ψ ∈ L 2 (R),

but Qψ ∉ L 2 (R).

The second problem is that the eigenvalue equation for momentum, $-i\,\frac{d}{dq}\psi(q) = p\,\psi(q)$, has
solutions $\psi(q) \propto e^{ipq}$ for p ∈ R, but these functions are not square integrable and therefore
they are not in L 2 (R). Something similar applies to the eigenvalue equation Qψ(q) = q 0 ψ(q)
and its solutions ψ(q) = δ(q − q 0 ).
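The first problem can be checked on a concrete function. As an assumption made only for this illustration (it is not necessarily the function used in the example above), take ψ(q) = (1 + q²)^(−1/2). The integrals of |ψ|² and |Qψ|² over [−L, L] have closed forms, so the sketch below simply evaluates them for growing cutoff L.

```python
import math

# Sketch: psi(q) = (1 + q^2)^(-1/2) is square integrable, but Q psi is not.
# The choice of psi and the function names are illustrative assumptions.

def norm_sq_psi(cutoff):
    """Integral of |psi|^2 = 1/(1 + q^2) over [-cutoff, cutoff]."""
    return 2.0 * math.atan(cutoff)

def norm_sq_Qpsi(cutoff):
    """Integral of |q psi|^2 = q^2/(1 + q^2) over [-cutoff, cutoff]."""
    return 2.0 * (cutoff - math.atan(cutoff))

cutoffs = [1e1, 1e3, 1e6]
psi_norms = [norm_sq_psi(L) for L in cutoffs]     # converges towards pi
Qpsi_norms = [norm_sq_Qpsi(L) for L in cutoffs]   # grows without bound, like 2 L
```

The first list approaches π, so ψ ∈ L 2 (R), while the second grows linearly with the cutoff, so Qψ has no finite norm and ψ lies outside the domain of Q.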


II. 6. 2. 1

UNBOUNDED OPERATORS

Let us start with a definition: an operator A on Hilbert space H is called bounded if the set of
non-negative numbers $\|A\chi\| = \sqrt{\langle\chi \,|\, A^{\dagger} A \,|\, \chi\rangle}$ has an upper bound for all unit vectors |χ⟩, where the least
upper bound, or supremum, is called the norm of A,

\[ \|A\| = \sup \big\{ \|A\chi\| \in \mathbb{R} \;\big|\; \|\chi\| = 1 \big\}. \tag{II. 114} \]

The set of all bounded operators on H is written as B(H).

In finite - dimensional Hilbert spaces all operators are bounded, but this is not the case in infinite -

dimensional Hilbert spaces. As we want to hold on to the requirement that every vector A|ψ⟩ has a

finite norm, we have to exclude from the domain of A the set of vectors |ϕ⟩ for which

\[ \frac{\|A\chi\|}{\|\chi\|} \to \infty \quad \text{if} \quad |\chi\rangle \to |\phi\rangle. \tag{II. 115} \]

Therefore, from now on an operator A is a linear mapping from a subset of H to H. This subset is called

the domain of A, written as Dom A ⊂ H. Hence, an operator is a linear mapping

ψ ∈ Dom A, A : ψ ↦→ A ψ ∈ H. (II. 116)

We will, however, always assume that Dom A is dense in H which means that every vector ϕ in H

can be approximated arbitrarily well by vectors in Dom A. The foregoing implies that also sums and

products of operators are generally defined on a limited domain only,

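\[ \mathrm{Dom}\,(A + B) = \mathrm{Dom}\,A \cap \mathrm{Dom}\,B, \tag{II. 117} \]

\[ \mathrm{Dom}\,(A B) = \big\{ \psi \in \mathrm{Dom}\,B : B\psi \in \mathrm{Dom}\,A \big\}. \tag{II. 118} \]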

It is more difficult to introduce the adjoint A † of an operator A. The operator is again called

Hermitian if

⟨ϕ | A | ψ⟩ = ⟨ψ | A | ϕ⟩ ∗ ∀ ϕ, ψ ∈ Dom A, (II. 119)

but this definition is no longer sufficient for our purposes, as can be seen in the next example.

EXAMPLE

Consider the operator P from (II. 113), now acting on $L^2([0, \infty))$, and choose as its domain

\[ \mathrm{Dom}\,P = \Big\{ \psi : \int_0^{\infty} |\psi(q)|^2\, dq < \infty,\; \int_0^{\infty} |P\psi(q)|^2\, dq < \infty,\; \psi(0) = 0 \Big\}. \tag{II. 120} \]

This operator is indeed Hermitian, which can be checked using integration by parts, where the

non - integral term cancels out because of the boundary condition ψ(0) = 0. But the operator is

not self - adjoint, as we will see in the next exercise.


To introduce the adjoint of an operator we first delimit the domain. Let Dom A † be the set of all

vectors |ϕ⟩ such that a vector |η⟩ exists for which

⟨ϕ | A | ψ⟩ = ⟨η | ψ⟩ ∀ |ψ⟩ ∈ Dom A. (II. 121)

Using the assumption that Dom A is dense in H it is possible to show that if such a vector |η⟩ exists

it is also unique. The adjoint A † of operator A is now, by definition, the mapping

A † : |ϕ⟩ ∈ Dom A † ↦→ |η⟩ := A † |ϕ⟩, (II. 122)

and the operator is called self - adjoint if

A = A † and Dom A = Dom A † . (II. 123)

This requirement is stronger than Hermiticity; it can be shown that in general it holds for Hermitian

operators that Dom A ⊂ Dom A † , instead of (II. 123).

EXERCISE 15. Verify that the domain of P † , with P as in the example above, is indeed larger

than the domain of P .

II. 6. 2. 2

CONTINUOUS SPECTRA

Another aspect in which infinite - dimensional Hilbert spaces deviate from finite - dimensional

ones is the possibility for an operator to have a continuous spectrum, a mathematical impossibility

in the finite - dimensional case since the term ‘spectrum’ was defined as the set of eigenvalues of

operators. Examples of operators with continuous spectra are, again, the position operator and the

momentum operator, whose spectra consist of the entire line of real numbers R. Therefore, the

term ‘spectrum’ needs to be redefined. The spectrum of operator A is now defined as the set of all

values λ ∈ C for which the operator A − λ𝟙 has no inverse operator. To illustrate the deviations from

the finite - dimensional case we give two examples, the angle operator and the angular momentum

operator.

EXAMPLE

Consider the Hilbert space $L^2([0, 2\pi])$ and the angle operator

\[ Q : \psi(q) \mapsto q\,\psi(q), \qquad 0 \le q \le 2\pi. \tag{II. 124} \]

This operator has, analogous to (II. 112), eigenfunctions which are not in H, its spectrum is the

interval [0, 2π], but it is bounded, ∥Q∥ = 2π.

The angular momentum operator


\[ L : \psi(q) \mapsto -i\, \frac{d}{dq}\, \psi(q), \tag{II. 125} \]

with domain

\[ \mathrm{Dom}\,L = \big\{ \psi : \|L\psi\| < \infty,\; \psi(0) = \psi(2\pi) \big\}, \tag{II. 126} \]

does have normalized eigenfunctions,

\[ \psi(q) = \frac{1}{\sqrt{2\pi}}\, e^{i l q}, \tag{II. 127} \]

and a discrete spectrum l ∈ Z. But, since |l| can be arbitrarily large, the operator L is unbounded.

II. 6. 2. 3

SPECTRAL THEOREM

Von Neumann succeeded in proving the spectral theorem, in the version of II. 3. 1, for infinite - dimensional
Hilbert spaces. We can now formulate the theorem.

SPECTRAL THEOREM:

To every normal operator A, bounded or unbounded, corresponds a unique mapping of

subsets of Spec A to the set P (H) of projectors on H, ∆ ↦→ P A (∆), having the following

properties:

(i) $P_{\emptyset} = 0$

(ii) $P_{\mathbb{C}} = \mathbb{1}$

(iii) $P_{\cup_i \Delta_i} = \sum_i P_{\Delta_i}$ for all ∆ i mutually disjoint. (II. 128)

For the position operator Q we have an explicit expression for the spectral family of eigenprojectors

of Q,

\[ P_Q(\Delta)\, \psi(q) = \begin{cases} \psi(q) & \text{if } q \in \Delta \\ 0 & \text{otherwise} \end{cases}, \tag{II. 129} \]

hence, P Q (∆) is in fact a multiplication with the characteristic function of ∆. The spectral family of

the momentum operator is obtained by applying a Fourier transform to the aforementioned expression.

If the physical system is in the state ψ ∈ H, the probability that a measurement of the physical
quantity A, which corresponds to the normal operator A, yields a value a ∈ ∆ ⊂ R is

Prob ψ (A : ∆) = ⟨ψ | P A (∆) | ψ⟩, (II. 130)


which, using (II. 129), yields for the physical quantity position Q

\[ \mathrm{Prob}_{\psi}(Q : \Delta) = \langle\psi \,|\, P_Q(\Delta) \,|\, \psi\rangle = \int_{\Delta} |\psi(q)|^2\, dq. \tag{II. 131} \]

All empirical statements of quantum mechanics can therefore be expressed in terms of projectors, or,
more precisely, all empirical statements of quantum mechanics concerning physical quantity A can
be expressed in terms of the spectral family of A.
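For a concrete wave function the position probabilities can be evaluated in closed form. The sketch below assumes the normalized Gaussian ψ(q) = π^(−1/4) exp(−q²/2), chosen only for illustration, for which the integral of |ψ|² over an interval [a, b] is (erf(b) − erf(a))/2. It checks the analogues of properties (ii) and (iii) of (II. 128) at the level of probabilities.

```python
import math

# Sketch: position probabilities for an assumed Gaussian wave function
# psi(q) = pi**(-1/4) * exp(-q*q/2), so that |psi(q)|^2 = exp(-q*q)/sqrt(pi).

def prob_position(a, b):
    """Probability of finding the position in the interval [a, b]."""
    return 0.5 * (math.erf(b) - math.erf(a))

total = prob_position(-math.inf, math.inf)                 # whole real line
left, right = prob_position(-1.0, 0.0), prob_position(0.0, 2.0)
union = prob_position(-1.0, 2.0)                           # union of the two intervals
```

The two intervals overlap only in the single point q = 0, which carries no probability, so `left + right` agrees with `union`.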

II. 6. 3

DIRAC

Finally we remark that quantum mechanics à la Dirac willingly and knowingly violates Von Neumann’s

postulates by going outside the Hilbert space. Dirac writes (1958, p. 40)

The bra and ket vectors that we now use form a more general space than a Hilbert space.

To make Dirac’s approach mathematically expressible, the French mathematician Laurent Schwartz
developed the theory of distributions, and the Russian mathematical physicist I.M. Gel’fand developed

the theory of rigged Hilbert spaces. Contrary to Schrödinger and Von Neumann, Dirac regarded

wave mechanics as a generalization of matrix mechanics, going from a discrete index to a continuous

index, making a transition from square summable sequences of complex numbers to wave functions,

and from infinite matrices to integral kernels.

II. 6. 4

SUMMARY

A complex Hilbert space is, by definition, a complete, separable complex vector space with an inner

product which is related to the norm by ∥ψ∥ 2 = ⟨ψ | ψ⟩, its dimension is either finite or countably

infinite. Contrary to the infinite - dimensional case, the requirements of separability and completeness

are superfluous in the finite - dimensional case because they are derivable from the other properties of

a Hilbert space, but in the vast majority of physical applications infinite - dimensional Hilbert spaces

and unbounded operators are required.

III

THE POSTULATES

The sciences do not try to explain, they hardly even try to interpret, they mainly make

models. By a model is meant a mathematical construct which, with the addition of certain

verbal interpretations, describes observed phenomena. The justification of such a

mathematical construct is solely and precisely that it is expected to work [. . . ]

— John von Neumann

It would seem that the theory is exclusively concerned about ‘results of measurement’,

and has nothing to say about anything else. [. . . ] To restrict quantum mechanics to be

exclusively about piddling laboratory operations is to betray the great enterprise.

— John Bell

In this chapter we will formulate and discuss Von Neumann’s postulates. Next, we will extend the

quantum mechanical concept of ‘pure’ states by adding ‘mixed’ states, and show how quantum

mechanics treats states of subsystems of composite physical systems. Finally, we apply these

concepts to spin 1/2 particles and we derive some formulas needed in subsequent chapters.

III. 1

VON NEUMANN’S POSTULATES

We are now ready to give, in some cases in simplified fashion, Von Neumann’s postulates of

quantum mechanics, which link the physical concepts of the theory to the mathematical concepts of

its formalism.

1. State postulate, pure states. Every physical system has a corresponding Hilbert space H, the

states of the system are completely described by unit vectors in H. A composite physical

system corresponds to the direct product of the Hilbert spaces of the subsystems.

2. Observables postulate. Every physical quantity A of the system corresponds to a self - adjoint

operator A in H. Dirac called the quantities ‘observables’.

3. Spectrum postulate. The only possible outcomes which can be found upon measurement of a

physical quantity A, corresponding to an operator A, are values from the spectrum of A.

4. Born postulate, discrete case. If the system is in a state |ψ⟩ ∈ H, and a measurement is made

of a physical quantity A, corresponding to an operator A with a discrete spectrum Spec A,

the probability of finding the outcome a i ∈ Spec A is equal to

Prob |ψ⟩ (a i ) = ⟨ψ | P ai | ψ⟩, (III. 1)


where P ai is the projector from the spectral decomposition (II. 57) of A.

5. Schrödinger postulate. As long as no measurements are made on the system, the time evolution

of the system is described by a unitary transformation,

|ψ(t)⟩ = U (t, t 0 ) |ψ(t 0 )⟩. (III. 2)

6. Projection postulate, discrete case. If the system is in a state |ψ⟩ ∈ H and a measurement is

made on a physical quantity A corresponding to an operator A with discrete spectrum, and the

outcome of the measurement is the eigenvalue a i ∈ Spec A, the system is, immediately after

the measurement, in the eigenstate

\[ |\psi\rangle \longrightarrow \frac{P_{a_i}\,|\psi\rangle}{\|P_{a_i}\,|\psi\rangle\|}. \tag{III. 3} \]

The first four postulates connect the (undefined) concepts ‘physical system’, ‘state’, ‘quantity’

and ‘measurement’ to mathematical concepts. In the literature the postulates 3 and 4 are sometimes

combined into the so - called measurement postulate. The last two postulates determine the evolution

of the states in time.

Ad 1. The state postulate implies that systems with the same |ψ⟩ are in the same physical state.

The way in which this state vector |ψ⟩ is produced, is thus unimportant. Also the fact that two systems

which are described by the same |ψ⟩ can, upon measurement, have different outcomes, which is

allowed according to the measurement postulate, is no reason to regard their states as being different.

On the other hand, not every pair of mutually different unit vectors represents different states.
Usually it is assumed that vectors whose only difference is a phase factor $e^{i\theta}$, with θ ∈ R, describe

the same physical state, because they predict the same probability distributions for outcomes of all

possible measurements. Such vectors form a so - called unit ray.

The statement that all unit vectors of H describe physical states also need not be true in general.

Notice that the set of unit vectors is extremely large. Even for a particle in one spatial dimension the

Hilbert space is infinite - dimensional. Furthermore, some types of superposition, linear combinations

of two or more eigenstates, do not occur in nature, for instance superpositions of states with different

charges, i.e., electrical, baryonic etc., or superpositions of states with different spin.

It is possible to prohibit these superpositions in the theory by introducing so - called superselection

rules. The requirement that, for identical particles, only states are allowed which are symmetric or

antisymmetric under permutation of the particles is an example of such a superselection rule. In the

presence of a superselection rule the class of allowed states breaks up into a direct sum of the

eigenspaces of the superselection operator,

\[ \mathcal{H} = \bigoplus_{j=1} \mathcal{H}_j. \tag{III. 4} \]

Within one such subspace H j , called a coherent sector, superpositions of all states are allowed.


In absence of superselection rules the entire Hilbert space is one coherent sector. Then the superposition

principle is valid in general, which says that for every two states |ψ⟩ and |ϕ⟩ the linear

combination a|ψ⟩ + b|ϕ⟩, with |a| 2 + |b| 2 = 1, is a state too. Because nature apparently imposes

superselection rules, which can sometimes be derived from symmetries as was first shown by Wick,

Wightman and Wigner (1952), the superposition principle only applies for coherent sectors. Since

superpositions of vectors from different coherent sectors do not correspond to physical states, the

state postulate has to be accordingly reformulated.

As far as composite physical systems are concerned, we say that the system is in an entangled

state iff the state vector is not factorizable, see section II. 5. In the thought experiment of EPR such an

entangled state plays the principal part. Schrödinger (1935b) was the first to show that the occurrence

of entanglement is widespread in quantum mechanics and he considered this to be the cardinal distinction

between classical mechanics and quantum mechanics. In section III. 2 we will further extend

the notion of state.

Ad 2. The question whether every self - adjoint operator represents a physical quantity has, according

to some authors, a negative answer. Wigner, for instance, asked how to measure the quantity corresponding

to the self - adjoint operator P + Q. Another example is a projector which projects on

superpositions of vectors from different coherent sectors, as we saw in Ad 1.

Also the reverse question, whether every physically meaningful quantity is represented by a self -

adjoint operator, is controversial. For some physical quantities which correspond to experimentally

clear measuring procedures, such as ‘time of decay’ in case of a radioactive atom, or the ‘phase’ of

a harmonic oscillator, no associated self - adjoint operator can be found. In later generalizations of

the formalism of quantum mechanics this problem is somewhat relieved by considering more general

mathematical constructions, the so - called positive operator valued measures, which are also capable

of representing physical quantities; see for example A.S. Holevo (1982) or Busch, Grabowski and

Lahti (1995).

Another question is which operator exactly corresponds to which quantity. Again, no commonly

accepted recipe is available here. Generally, one starts with demanding that certain classical quantities

are represented by special operators. It is standard procedure to choose position and momentum to

be these quantities and to require that the corresponding operators satisfy the canonical commutation

relation of Born and Jordan (1925), and Dirac (1925),

\[ [P, Q] := P Q - Q P = -i\, \mathbb{1}. \tag{III. 5} \]

Next, a certain ‘quantization prescription’ is chosen which can be used to construct an operator

corresponding to more general physical quantities. Dirac’s mathematical prescription of replacing

Poisson brackets by commutators is famous. Unfortunately, this prescription is inconsistent. The

alternative prescriptions for quantization which have been presented for this purpose, do not mutually

agree. We will not discuss this problem further.

Ad 4. With P ψ = |ψ⟩ ⟨ψ|, as defined in (II. 37), P ai as in (II. 54), and using the relation (II. 56),

the probability of finding a value a i ∈ Spec A, in a measurement of the physical quantity A with


corresponding operator A, can also be written as

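\[ \langle\psi \,|\, P_{a_i} \,|\, \psi\rangle = \sum_{j=1}^{n_i} \langle\psi \,|\, a_i, j\rangle \langle a_i, j \,|\, \psi\rangle = \sum_{j=1}^{n_i} |\langle a_i, j \,|\, \psi\rangle|^2 = \mathrm{Tr}\, P_{a_i} P_{\psi}. \tag{III. 6} \]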

Likewise, the expectation value of A, with A as defined in (II. 55), is

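\[ \langle A\rangle_{\psi} = \langle\psi \,|\, A \,|\, \psi\rangle = \sum_{i=1}^{M} \sum_{j=1}^{n_i} \langle\psi \,|\, a_i, j\rangle\, a_i\, \langle a_i, j \,|\, \psi\rangle = \sum_{i=1}^{M} \sum_{j=1}^{n_i} a_i\, |\langle a_i, j \,|\, \psi\rangle|^2 = \mathrm{Tr}\, A P_{\psi}. \tag{III. 7} \]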

In case there is no degeneracy, (III. 6) takes the simpler form

⟨ψ | P ai | ψ⟩ = |⟨a i | ψ⟩| 2 = Tr P ψ P ai . (III. 8)

We also note that in case A has a continuous spectrum, as discussed in section II. 6, we have (II. 130),

Prob |ψ⟩ (A : ∆) = ⟨ψ | P A (∆) | ψ⟩. ▹ (III. 9)
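The trace formulas (III. 6) and (III. 7) can be verified on a small example with a degenerate eigenvalue. The sketch below assumes a three-dimensional Hilbert space, an operator A with eigenvalues 1 (twice) and 2, and a hypothetical state vector; none of these come from the text.

```python
import math

# Sketch: Born probabilities and expectation value as traces.
# The operator A, its projectors and the state psi are illustrative assumptions.

def outer(u, v):
    """Matrix of |u><v|."""
    return [[x * complex(y).conjugate() for y in v] for x in u]

def trace_of_product(X, Y):
    """Tr(X Y) for square matrices of equal size."""
    n = len(X)
    return sum(X[r][k] * Y[k][r] for r in range(n) for k in range(n))

s = 1 / math.sqrt(3)
psi = [s, 1j * s, s]                       # unit vector
P_psi = outer(psi, psi)                    # P_psi = |psi><psi|

P_a1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # projector on the eigenspace of a = 1
P_a2 = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]   # projector on the eigenspace of a = 2
A = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]      # A = 1 * P_a1 + 2 * P_a2

prob_a1 = trace_of_product(P_a1, P_psi).real    # 2/3
prob_a2 = trace_of_product(P_a2, P_psi).real    # 1/3
expectation = trace_of_product(A, P_psi).real   # 1 * 2/3 + 2 * 1/3 = 4/3
```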

Ad 5. If the system is invariant under translations in time, the unitary evolution operator U (t, t 0 )

depends only on the time difference t − t 0 , and can be written as U (t − t 0 ). The evolution operators

then form a continuous abelian Lie group, the group of translations in time, satisfying the group multiplication

structure U(t) U(t ′ ) = U(t + t ′ ). According to Stone’s theorem (1932), they can be written as

\[ U(t) = e^{-iHt} \tag{III. 10} \]

where H is a unique self - adjoint operator, the generator of the Lie group. H is called the Hamiltonian. Therefore, the evolution operator U (t − t 0 ) from the Schrödinger postulate can be written as

\[ U(t - t_0) = e^{-iH(t - t_0)} , \tag{III. 11} \]

and the Schrödinger equation is, according to (III. 2),

\[ i \frac{d}{dt} |\psi(t)\rangle = i \frac{d}{dt}\, e^{-iH(t - t_0)} |\psi(t_0)\rangle = H |\psi(t)\rangle . \tag{III. 12} \]
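As an illustration of the group property U(t) U(t ′ ) = U(t + t ′ ) and of the Schrödinger equation, the following sketch uses a hypothetical two-level Hamiltonian H = ω σ_x (with ħ = 1), for which the exponential has a simple closed form. The value of ω, the initial state and the function names are assumptions chosen for the example.

```python
import math

OMEGA = 1.3  # hypothetical frequency; H = OMEGA * sigma_x, hbar = 1
H = [[0.0, OMEGA], [OMEGA, 0.0]]

def U(t):
    """exp(-i H t) in closed form, valid because sigma_x squared is the identity."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return [[c, -1j * s], [-1j * s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(M, v):
    return [sum(M[i][k] * v[k] for k in range(len(v))) for i in range(len(M))]

# Group property: U(t1) U(t2) = U(t1 + t2)
t1, t2 = 0.4, 1.1
UU, U12 = matmul(U(t1), U(t2)), U(t1 + t2)
group_error = max(abs(UU[i][j] - U12[i][j]) for i in range(2) for j in range(2))

# Schroedinger equation, checked with a central finite difference in t
psi0 = [1.0, 0.0]
t, h = 0.7, 1e-5
plus, minus = apply(U(t + h), psi0), apply(U(t - h), psi0)
lhs = [1j * (a - b) / (2 * h) for a, b in zip(plus, minus)]  # i d/dt |psi(t)>
rhs = apply(H, apply(U(t), psi0))                            # H |psi(t)>
schrodinger_error = max(abs(x - y) for x, y in zip(lhs, rhs))
```

The finite-difference check only agrees up to the discretization error of the step h, so it is a plausibility test and not a proof.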


Ad 6. This is the notorious projection postulate. It introduces a second kind of dynamics in

the theory; a projector is, in general, not unitary and therefore it cannot be written in terms of the

Schrödinger postulate. Some authors do not regard the projection postulate to be a part of quantum

mechanics. The problem is then how to account for the measurement process using the other postulates; this will be discussed further in chapter VIII.

The version of the projection postulate we gave is a stronger version of Von Neumann’s original formulation and is due to G. Lüders (1951). Von Neumann only required that the state, directly

after a measurement of A which has a i as an outcome, is an (arbitrary) eigenstate with eigenvalue a i .

In Lüders’ version the state directly after the measurement, (III. 3), is the normalized projection of the

original state on the eigenspace of a i . Here the disturbance of the original state is as small as possible,

in the sense that the angle between the original and the final state is as small as possible.

If the operator A is maximal, both versions coincide because in that case P ai is a 1 - dimensional

projector.
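The difference between the two versions can be made concrete in a small example. The sketch below assumes a hypothetical eigenvalue a i with a two-dimensional eigenspace spanned by e0 and e1 in a three-dimensional Hilbert space; the vectors and names are illustrative only. It computes the Lüders state and compares its overlap with the original state to that of other eigenstates which Von Neumann's weaker requirement would also allow.

```python
import math

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def normalise(v):
    n = math.sqrt(sum(abs(x) ** 2 for x in v))
    return [x / n for x in v]

# Hypothetical eigenspace of a_i: spanned by e0 and e1.
e0, e1 = [1, 0, 0], [0, 1, 0]
psi = normalise([1, 1, 1])

# Lueders: project psi on the eigenspace, then normalise.
projected = [inner(e0, psi) * a + inner(e1, psi) * b for a, b in zip(e0, e1)]
psi_luders = normalise(projected)
overlap_luders = abs(inner(psi, psi_luders))

# Von Neumann's requirement also admits any other eigenstate, for example e0.
overlap_e0 = abs(inner(psi, e0))

# Scan real unit vectors cos(theta) e0 + sin(theta) e1 in the eigenspace.
best_scan = max(
    abs(inner(psi, [math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360), 0]))
    for k in range(360)
)
```

In this example no eigenstate in the scan has a larger overlap with the original state than the Lüders state, in line with the statement that the angle between original and final state is as small as possible.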

III. 2 PURE AND MIXED STATES

A state vector, a unit vector in H, provides a description of the system which is as complete as the

theory allows. In classical mechanics such a description corresponds, for a system of point particles,

to giving all coordinates of position and momentum; (q, p) := (q 1 , . . . , q n ; p 1 , . . . , p n ) is a point

in the phase space Γ. In practice, the value of these coordinates is often not known precisely and a

probability distribution ρ(q, p) is introduced over the phase space Γ. The integral of ρ(q, p) over ∆

is the probability to find the system in the subset ∆ ⊆ Γ. The probabilities have to be positive and

normalized,

\[ \rho(q, p) \geq 0 \quad\text{and}\quad \int_\Gamma \rho(q, p)\, dq\, dp = 1 . \tag{III. 13} \]

In classical physics it is also customary to extend the notion of state and also call a probability

distribution ρ a (generalized) state of the system. A physical quantity A corresponds to a real function

on the phase space, A : Γ → R, and the expectation value of A in the state ρ is

\[ \langle A \rangle_\rho := \int_\Gamma A(q, p)\, \rho(q, p)\, dq\, dp . \tag{III. 14} \]

The states ρ form a convex set, i.e., if ρ 1 and ρ 2 are states on Γ and w 1 and w 2 both are real numbers

satisfying

\[ 0 \leq w_i \leq 1 \quad\text{and}\quad w_1 + w_2 = 1 , \tag{III. 15} \]

then

\[ \rho := w_1 \rho_1 + w_2 \rho_2 \tag{III. 16} \]

also satisfies the requirements of (III. 13) and therefore it is also a state on Γ. This convex set of states

is written S (Γ).


A state which cannot be decomposed according to (III. 16) is called a pure state, otherwise it

is called a mixed state. The pure states are the states ρ concentrated on a single point of Γ, the δ -

‘functions’. Generally, the elements of a convex set which cannot be written in the form (III. 16),

with w 1 , w 2 ≠ 0, are called extreme elements of that set, therefore in our case the extreme elements

are the pure states. Every element of a convex set can always be written as a convex sum of extreme

elements. This corresponds to the expansion of ρ to δ - functions,

\[ \rho(q, p) = \int_\Gamma \rho(q', p')\, \delta(q - q')\, \delta(p - p')\, dq'\, dp' . \tag{III. 17} \]

The dynamics of an arbitrary state follows from the Hamiltonian equations of motion of the pure states, found by calculating the path of least action. This holds for conservative systems, which the systems in these lecture notes are assumed to be. We will come back to the

derivation of the equations in section VI. 5.

The Hamiltonian equations of motion are

\[ \dot q = \frac{\partial H}{\partial p} \quad\text{and}\quad \dot p = -\frac{\partial H}{\partial q} . \tag{III. 18} \]

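To find the equation of motion in terms of ρ we use Liouville’s theorem, which states that for points moving in phase space according to the Hamiltonian equations of motion the probability distribution ρ(q, p, t) is constant along the trajectories. Using (III. 18) and the Poisson brackets

\[ \{H, \rho\} := \left( \frac{\partial H}{\partial q} \frac{\partial \rho}{\partial p} - \frac{\partial \rho}{\partial q} \frac{\partial H}{\partial p} \right) , \tag{III. 19} \]

the Liouville equation, the equation of motion for the state ρ,

\[ \frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \frac{\partial \rho}{\partial q}\, \dot q + \frac{\partial \rho}{\partial p}\, \dot p = 0 , \tag{III. 20} \]

equals

\[ \frac{\partial \rho}{\partial t} = \{H, \rho\} . \tag{III. 21} \]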

Now we will consider, analogous to the classical case, a probability distribution of the state vectors

in H. With help of the state ρ we introduce a mapping µ of subsets ∆ of Γ to R,

\[ \mu(\Delta) := \int_\Delta \rho(q, p)\, dq\, dp \quad\text{with}\quad \Delta \subseteq \Gamma . \tag{III. 22} \]

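This mapping µ is additive,

\[ \mu\Big( \bigcup_i \Delta_i \Big) = \sum_{i=1} \mu(\Delta_i) \tag{III. 23} \]

for every countable sequence of disjoint ∆ i ⊂ Γ. Furthermore,

\[ 0 \leq \mu(\Delta) \leq 1 , \quad \mu(\emptyset) = 0 \quad\text{and}\quad \mu(\Gamma) = 1 . \tag{III. 24} \]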


Each mapping which maps a measurable subset of Γ to a number in the interval [0, 1], thereby

satisfying (III. 23) and (III. 24), is called a probability measure. It is not difficult to see that every probability distribution ρ corresponds uniquely to a probability measure and vice versa; this is even true for δ - functions. Therefore we can also represent a state, in the extended meaning, by a probability

measure on Γ.

Analogous to this reasoning we now aim to let the physical states in quantum mechanics correspond

to probability measures on H. Since we want to preserve the structure of H, we do not consider

arbitrary subsets of H, instead we look at the set P(H) of all subspaces of H generated by orthogonal

projectors, or, equivalently, at the projectors projecting on those subspaces. What we are thus looking

for is a probability measure on P (H), i.e. a mapping

µ : P (H) → [0, 1], (III. 25)

which is additive in the relevant manner; if P 1 , P 2 , . . . , P N is a set of pairwise orthogonal projectors, P i ⊥ P j for i ≠ j, the following holds,

\[ \mu\Big( \sum_j P_j \Big) = \sum_j \mu(P_j) , \tag{III. 26} \]

and the mapping satisfies

\[ \mu(0) = 0 \quad\text{and}\quad \mu(\mathbb{1}) = 1 . \tag{III. 27} \]

In 1957 A.M. Gleason proved the following theorem.

GLEASON’S THEOREM:

Every probability measure µ on P (H) can, under the condition that dim H > 2, be

written as

µ(P ) = Tr P W, (III. 28)

for a certain operator W satisfying the following requirements: 1

(i) W = W † ,

(ii) ⟨ψ | W | ψ⟩ ≥ 0 ∀ |ψ⟩ ∈ H,

(iii) Tr W = 1. (III. 29)

The original proof of Gleason’s theorem is extraordinarily difficult. In the appendix of these

lecture notes, p. 183, ff, we prove a simplified version of this theorem for the interested reader.

1 Conditions (i) and (ii) of (III. 29) are not mutually independent in the complex Hilbert space of the formalism of

quantum mechanics, in this space (i) is in fact superfluous. In a complex Hilbert space all positive operators are self -

adjoint, and an operator A is uniquely defined by all matrix elements of the form ⟨ψ | A | ψ⟩. This is, however, not the case

in a real space, where Gleason’s theorem is also valid. In that case (i) and (ii) are independent.


Here we prove that (III. 28) indeed satisfies the requirements (III. 25), (III. 26) and (III. 27) of a

probability measure.

Proof

Requirement (III. 27) is obvious, and verification of (III. 26) can be done with (II. 31). To prove (III. 25), i.e.

\[ \mu(P) = \mathrm{Tr}\, P W \in [0, 1] , \tag{III. 30} \]

we choose an orthonormal basis of eigenvectors of P ; P |v k ⟩ = |v k ⟩, P |u l ⟩ = 0. Then

\[ \mathrm{Tr}\, P W = \sum_k \langle v_k | P W | v_k \rangle + \sum_l \langle u_l | P W | u_l \rangle = \sum_k \langle v_k | W | v_k \rangle \geq 0 , \tag{III. 31} \]

due to the positivity of the operator W . If P is a projector, then 𝟙 − P is one also, therefore

\[ \mathrm{Tr}\, (\mathbb{1} - P) W \geq 0 , \tag{III. 32} \]

such that indeed, with (III. 29) (iii), we see that

\[ 0 \leq \mathrm{Tr}\, P W \leq \mathrm{Tr}\, P W + \mathrm{Tr}\, (\mathbb{1} - P) W = \mathrm{Tr}\, (P + \mathbb{1} - P) W = \mathrm{Tr}\, W = 1 . \tag{III. 33} \]

□
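The statement just proved can also be checked numerically. The following sketch takes a hypothetical state operator W = 0.7 |0⟩⟨0| + 0.3 |+⟩⟨+| on a real 2 - dimensional space and evaluates µ(P ) = Tr P W for the projectors on lines through the origin. The choice of W and the names are assumptions made for this illustration.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Hypothetical state operator: 0.7 |0><0| + 0.3 |+><+|
W = [[0.85, 0.15], [0.15, 0.15]]

def line_projector(theta):
    """Projector on the line through the origin at angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]

def mu(P):
    """Probability measure mu(P) = Tr P W of (III. 28)."""
    return trace(matmul(P, W))

angles = [k * math.pi / 180 for k in range(180)]
values = [mu(line_projector(a)) for a in angles]
# The orthogonal complement of a line is the line rotated by 90 degrees.
complements = [mu(line_projector(a + math.pi / 2)) for a in angles]
```

All values lie in [0, 1], and the values for a projector and its orthogonal complement add up to Tr W = 1.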

An important aspect of Gleason’s theorem is the fact that the probability measure (III. 28) is

continuous in P . For measures representing pure states, this is proved in the appendix on p. 183, ff.

If dim H = 2, discontinuous probability measures exist on P (H). To see this, consider a real H.

The 1 - dimensional subspaces are lines through the origin connecting opposite points on the circle.

Attaching values as in the diagram, figure III. 1,

[Figure III. 1: A discontinuous measure for dim H = 2. Labels in the diagram: P 1 , P 2 and the attached values 0 and 1.]

we see that, with µ(0 ) = 0 and µ(𝟙) = 1, for two arbitrary orthogonal projectors we have

\[ \mu(P_1) + \mu(P_2) = 1 = \mu(\mathbb{1}) = \mu(P_1 + P_2) . \tag{III. 34} \]


This measure is indeed additive, but we also see that it is not continuous, and consequently, Gleason’s

theorem does not hold for dim H = 2.
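A measure of this kind can be sketched in a few lines. The assignment below, value 1 for lines at an angle in [0, π/2) and value 0 for lines at an angle in [π/2, π), is an assumption made for illustration; the exact assignment in figure III. 1 may differ in detail, for instance at the boundaries.

```python
import math

def mu(theta):
    """Value attached to the line through the origin at angle theta (radians).

    Assumed assignment: 1 on the half-open quarter [0, pi/2), 0 on [pi/2, pi).
    """
    return 1 if (theta % math.pi) < math.pi / 2 else 0

# Additivity: a line and the line orthogonal to it always get values 1 and 0.
angles = [0.1 * k for k in range(60)]
additive = all(mu(a) + mu(a + math.pi / 2) == 1 for a in angles)

# Discontinuity: the value jumps when the angle crosses pi/2.
jump = mu(math.pi / 2 - 1e-9) - mu(math.pi / 2 + 1e-9)
```

The measure is additive on every orthogonal pair, yet it jumps between neighbouring lines, so it cannot be of the form Tr P W, which varies continuously with the angle.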

The operator W is known as the statistical operator, or as the density matrix, or the state operator.

In analogy with the classical case we extend the notion of state and call W a state of the physical

system. From now on states will be represented by the state operators W .

The state operators W form a set S(H) which is again convex; if W 1 and W 2 are state operators,

then

\[ W = w_1 W_1 + w_2 W_2 \quad\text{with}\quad 0 \leq w_i \leq 1 \quad\text{and}\quad w_1 + w_2 = 1 \tag{III. 35} \]

is again a state operator. The simplest example of a state operator is a 1 - dimensional projector.

A higher - dimensional projector is not a state operator.

EXERCISE 16. Why not?

Before showing how the state operators W represent states, we will prove the next theorem.

THEOREM:

The 1 - dimensional projectors in P(H) are the extreme elements of the convex set S(H)

of all state operators W on H.

Proof

To prove this theorem we first have to show that P ψ cannot be written in the form

\[ P_\psi = w W_1 + (1 - w) W_2 , \quad\text{with}\quad 0 < w < 1 . \tag{III. 36} \]

Suppose it could be done. Then, using (II. 37), it also has to hold that, for all |ϕ⟩ ⊥ |ψ⟩,

\[ \langle\phi | P_\psi | \phi\rangle = 0 = w\, \langle\phi | W_1 | \phi\rangle + (1 - w)\, \langle\phi | W_2 | \phi\rangle , \tag{III. 37} \]

which implies

\[ \langle\phi | W_1 | \phi\rangle = \langle\phi | W_2 | \phi\rangle = 0 . \tag{III. 38} \]

Now, a positive operator can always be written as the square of a self - adjoint operator, W_i = A_i^2, yielding that for all |ϕ⟩ ⊥ |ψ⟩

\[ \langle\phi | W_i | \phi\rangle = \langle\phi | A_i^2 | \phi\rangle = \| A_i |\phi\rangle \|^2 = 0 \;\Rightarrow\; A_i |\phi\rangle = 0 \;\Rightarrow\; W_i |\phi\rangle = 0 . \tag{III. 39} \]

Therefore, W 1 and W 2 map to the 1 - dimensional space spanned by |ψ⟩. They are, according

to (III. 29), therefore, both identical to the projector P ψ ,

W 1 = P ψ = W 2 . (III. 40)

We thus conclude that P ψ cannot be split up into other state operators.


Now we have to show that the 1 - dimensional projectors are the only extreme elements. A state operator is self - adjoint and has, according to the spectral theorem, p. 26, a complete orthonormal set of eigenstates |w i , j⟩, where j = 1, . . . , n i labels the degeneracy and there are M ∈ N + different eigenvalues w i . We can write an arbitrary W ∈ S (H) as

\[ W = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, W_{i,j} , \tag{III. 41} \]

where

\[ W_{i,j} := |w_i, j\rangle \langle w_i, j| , \quad\text{and}\quad \sum_{i=1}^{M} n_i = \dim H . \tag{III. 42} \]

For w i it holds that

\[ \sum_{i=1}^{M} n_i w_i = 1 \quad\text{and}\quad 0 \leq w_i \leq 1 \tag{III. 43} \]

because, according to (III. 29) (ii) and (III. 29) (iii),

\[ w_i = \langle w_i, j | W | w_i, j \rangle \geq 0 \quad\text{and}\quad \mathrm{Tr}\, W = \sum_{i=1}^{M} n_i w_i = 1 . \tag{III. 44} \]

Thus we see that the sum (III. 41) is a convex decomposition of W .

A convex decomposition W = w 1 W 1 + w 2 W 2 can always be decomposed further through

expansion of W 1 and W 2 . In case of a bounded convex set the expansion ends on extreme

elements. Therefore, if W is an extreme element, the sum has to reduce to one term. In that case

W is a 1 - dimensional projector, and we see that all extreme elements of S(H) are 1 - dimensional

projectors. □

Physical states which are represented by 1 - dimensional projectors are called pure states, whereas states which can be divided non - trivially are called mixed states or mixtures. To see that pure states

correspond to the vector states of H, consider W to be the 1 - dimensional projector P ψ projecting

on the vector |ψ⟩. The state defined by this state operator through (III. 28) behaves exactly like the

vector state |ψ⟩; for arbitrary |ϕ⟩ it holds that

µ W (P ϕ ) = Tr P ϕ W = Tr P ϕ P ψ = ⟨ψ | P ϕ | ψ⟩ = |⟨ψ | ϕ⟩| 2 , (III. 45)

which means that the probability to find the state |ϕ⟩ in the state |ψ⟩ is equal to (III. 6). 2 In particular, µ(P ψ ) = 1, and if |ϕ⟩ ⊥ |ψ⟩, then µ(P ϕ ) = 0. We see that the state P ψ assigns a probability to the orthogonal set of vectors of which |ψ⟩ is an element, which is totally concentrated on the vector |ψ⟩. In this sense P ψ is analogous to a δ - distribution on the classical phase space.

2 ‘The probability to find the state |ϕ⟩’ is shorthand for the probability to find, upon measurement of the quantity corresponding to the projector |ϕ⟩ ⟨ϕ|, the value 1.

But the 1 - dimensional projectors are, generally, not mutually orthogonal, which means that the pure state P ψ also assigns a positive probability to P ϕ if ⟨ϕ | ψ⟩ ≠ 0. This is in contrast to the classical case, where the pure state which is concentrated on (p 0 , q 0 ), i.e., δ(q − q 0 , p − p 0 ), always assigns a zero probability to every other pure state. This is characteristic of quantum mechanics and is the cause of the radical difference between quantum states and classical states.

In this section we showed that a unique correspondence exists between the pure states, the extreme

elements of the convex set S (H) of state operators, the 1 - dimensional projectors and, up to a phase

factor, the unit vectors in H. We will conclude this section with a formulation of the extended version

of the state postulate (1) and the generalization of the Born postulate (4).

1 ′ State postulate, mixed and pure states. Every physical system has a corresponding Hilbert

space. The mixed physical states of the system uniquely correspond to the state operators

within S (H), the pure physical states of the system uniquely correspond to the state operators

on the boundary ∂ S (H). States of composite physical systems correspond bijectively to state

operators on the direct product space of the state spaces H 1 and H 2 of the subsystems, i.e., with

elements of S (H 1 ⊗ H 2 ).

◃ 3

4 ′ Generalized Born postulate, discrete case. If the system is in the state W ∈ S (H), the probability

to find, upon measurement of quantity A corresponding to an operator A having a discrete

spectrum, an eigenvalue in ∆ ⊆ Spec A, is equal to

Prob W (A : ∆) = Tr P A (∆)W, (III. 46)

where P A (∆) ∈ P (H) projects on the subspace spanned by the eigenvectors having their eigenvalues

in ∆.

III. 3 THE INTERPRETATION OF MIXED STATES

The spectral decomposition (III. 41) suggests an interpretation of the state W . As we saw in (III. 45),

a pure state W = P ψ corresponds to a probability measure µ, which we call concentrated on the

eigenvector |ψ⟩ since µ(P ψ ) = 1. In the same way an arbitrary W corresponds, according to (III. 41),

to a probability measure on its orthonormal set of eigenvectors |w i , j⟩, assigning a probability w i to

the eigenvector |w i , j⟩. With the projector W i,j as in (III. 42), we have

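\[ \mu_W(W_{i,j}) = \mathrm{Tr}\, W_{i,j} W = \mathrm{Tr} \sum_{k=1}^{M} \sum_{l=1}^{n_k} W_{i,j}\, w_k\, |w_k, l\rangle \langle w_k, l| = \sum_{k=1}^{M} \sum_{l=1}^{n_k} w_k\, |\langle w_k, l | w_i, j \rangle|^2 = \sum_{k=1}^{M} \sum_{l=1}^{n_k} w_k\, \delta_{ik}\, \delta_{jl} = w_i . \tag{III. 47} \]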

3 Notice how in this extended version of the state postulate the annoying phase factor has disappeared.


The expectation value of operator A is, according to (III. 7) with P ψ replaced by W , whose eigenvectors also form an orthonormal basis,

⟨A⟩ W = Tr AW, (III. 48)

which yields, using again (III. 7) and the spectral decomposition of W , (III. 41),

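\[ \langle A \rangle_W = \mathrm{Tr} \sum_{i=1}^{M} \sum_{j=1}^{n_i} A\, w_i\, W_{i,j} = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, \mathrm{Tr}\, A W_{i,j} = \sum_{i=1}^{M} w_i \sum_{j=1}^{n_i} \langle w_i, j | A | w_i, j \rangle . \tag{III. 49} \]

This is exactly the sum of the expectation values of A in the states |w i , j⟩, weighted by the w i .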

The above suggests that W describes an ensemble of physical systems each of which is in one

of the pure states |w i , j⟩ and that w i is the fraction of systems in |w i , j⟩. This is the way Von Neumann

originally introduced state operators, in analogy to ensembles in classical statistical mechanics,

hence his terminology statistical operator. But this attractive interpretation, known as the ignorance

interpretation of mixtures, is not without problems as we will show now.

In case of degeneracy the choice of the basis vectors in (III. 41) is not unique, and the projector P i

in the subspace corresponding to the eigenvalue w i can be written in terms of basis states in arbitrarily

many ways,

\[ \sum_{j=1}^{n_i} |w_i, j\rangle \langle w_i, j| = \sum_{k=1}^{n_i} |w_i, k\rangle \langle w_i, k| , \tag{III. 50} \]

with { |w i , k⟩} another arbitrary orthonormal basis in this subspace. Therefore, given any W we

cannot say of which vector states the ensemble is composed. To see that this is a general phenomenon,

consider the operator

\[ W = \sum_{k=1}^{K} p_k\, U_k = \sum_{k=1}^{K} p_k\, |u_k\rangle \langle u_k| . \tag{III. 51} \]

Here K ∈ N + is arbitrary and {|u k ⟩} is an arbitrary set of unit vectors which are, in general, not orthogonal, but as long as the p k satisfy 0 ≤ p k ≤ 1 and ∑ p k = 1, as required in (III. 35), the operator W in (III. 51) is still a state operator.

Indeed, equation (III. 51) is an alternative decomposition of W into extreme elements, just like

the spectral decomposition. We see that, in contrast to the classical case, convex decompositions are

not unique.

According to the ignorance interpretation, W describes the ensemble as consisting of systems of

which a fraction p k is in the state |u k ⟩, e.g.

\[ \langle A \rangle_W = \mathrm{Tr}\, A W = \sum_{k=1}^{K} p_k\, \langle u_k | A | u_k \rangle , \tag{III. 52} \]

but the probability to find the system in |u k ⟩ is

\[ \mu_W(U_k) = \mathrm{Tr}\, U_k W = \mathrm{Tr} \sum_{m=1}^{K} U_k\, p_m\, |u_m\rangle \langle u_m| = \sum_{m=1}^{K} p_m\, |\langle u_k | u_m \rangle|^2 . \tag{III. 53} \]

Although the result (III. 52) is in accordance with the behavior of an ensemble of systems being in

the state |u k ⟩ with probability p k , we see that for (III. 53), contrary to (III. 47), the outcome, i.e. the

probability to find in (III. 51) the state |u k ⟩, is in general not p k , which is a consequence of the non -

orthogonality of the states |u k ⟩. On the other hand, (III. 51) can always be written in the form (III. 41),

in terms of the orthonormal set of eigenvectors of W , which leads to the conclusion that ensembles

which are interpreted as being physically completely different, are described by the same operator W .
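The non - uniqueness of convex decompositions is easy to exhibit for a spin - 1/2 system. The sketch below, with illustrative vectors and helper names that are assumptions of this example, builds the same operator from two different orthonormal ensembles and then evaluates (III. 53) for a non - orthogonal ensemble.

```python
import math

def mix(weights, vectors):
    """State operator sum_k p_k |u_k><u_k| for real unit vectors u_k."""
    n = len(vectors[0])
    W = [[0.0] * n for _ in range(n)]
    for p, v in zip(weights, vectors):
        for i in range(n):
            for j in range(n):
                W[i][j] += p * v[i] * v[j]
    return W

def prob(v, W):
    """mu_W(|v><v|) = Tr |v><v| W = <v|W|v> for a real unit vector v."""
    n = len(v)
    return sum(v[i] * W[i][j] * v[j] for i in range(n) for j in range(n))

r = 1 / math.sqrt(2)
up, down = [1.0, 0.0], [0.0, 1.0]   # spin up and down in the z-direction
plus, minus = [r, r], [r, -r]       # spin up and down in the x-direction

# Two different ensembles, one and the same state operator.
W_z = mix([0.5, 0.5], [up, down])
W_x = mix([0.5, 0.5], [plus, minus])
same_operator = all(abs(W_z[i][j] - W_x[i][j]) < 1e-12
                    for i in range(2) for j in range(2))

# Non-orthogonal ensemble: half the systems in |up>, half in |plus>.
W_no = mix([0.5, 0.5], [up, plus])
p_plus = prob(plus, W_no)   # (III. 53) for u_k = |plus>
```

For the non - orthogonal ensemble the probability to find |plus⟩ is 0.5 + 0.5 · |⟨plus | up⟩|² = 0.75 and not the weight 0.5.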

This can be compared with the fact that a pure state |ψ⟩ can be written in numerous ways as a

superposition of other pure states, which corresponds to different ways of preparation of |ψ⟩ by superposition

of other states, for instance in a tilted Stern - Gerlach apparatus in case of measurement of

spin. We can no longer see if |ψ⟩ is, for example, a superposition of spin up and down in the z - direction,

or of spin up and down in the x - direction.

For pure states this seems completely natural; it is a direct consequence of the state postulate, according to which the states form a vector space. In case of mixed states the situation is less clear. It can be

maintained that an ensemble, of which each system is in the state |u k ⟩ with probability p k , really

differs from an ensemble of systems which are in a state |w i , j⟩ with probability w i , even though the

expectation values of all physical quantities are equal for both ensembles. In that case, from

\[ W = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, |w_i, j\rangle \langle w_i, j| = \sum_{k=1}^{K} p_k\, |u_k\rangle \langle u_k| \tag{III. 54} \]

it has to be concluded that the state operator W characterizes these ensembles incompletely. There is

no postulate in quantum mechanics by which this is prohibited.

Another view is, however, that the state operator is a complete description of a state and that the different possible ways of preparation are not retrievable from the state W . Consequently, the conclusion has to be that W , in (III. 51), does not characterize an ensemble which consists of a mixture of systems in pure states |u k ⟩, but an ensemble characterized by W only presents itself as such an ensemble upon

measurement. Here we see again that, in quantum mechanics, we get in trouble if we speak in terms

of what really exists. In section III. 5 we will return to this discussion in the context of improperly

mixed states.

The dynamics of mixed states follows, as in the classical case, from the pure states. Define,

analogously to (III. 41),

\[ W(t) := \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, W_{i,j}(t) . \tag{III. 55} \]

According to the Schrödinger postulate, (III. 2),

|w i , j, t⟩ := U (t − t 0 ) |w i , j, t 0 ⟩, (III. 56)


which yields for (III. 55)

\[ W(t) = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, U(t - t_0)\, W_{i,j}(t_0)\, U^\dagger(t - t_0) , \tag{III. 57} \]

and therefore

W (t) = U (t − t 0 ) W (t 0 ) U † (t − t 0 ). (III. 58)

With (III. 11) we find

\[ i \frac{d}{dt} W(t) = [H, W(t)] , \tag{III. 59} \]

which is the analogue of the Liouville equation of motion, (III. 21), describing the time evolution of the states ρ. Equation (III. 59) is called the Liouville - Von Neumann equation; it is the generalization of the Schrödinger equation to an equation for mixed states.
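The Liouville - Von Neumann equation can be checked numerically for a simple case. The following sketch assumes a hypothetical Hamiltonian H = ω σ_x (ħ = 1) and a hypothetical mixed initial state W(t 0 ); it compares a finite - difference time derivative of W(t) = U W(t 0 ) U † with the commutator [H, W(t)]. All numerical values and names are chosen for illustration.

```python
import math

OMEGA = 0.9                       # hypothetical frequency, hbar = 1
H = [[0.0, OMEGA], [OMEGA, 0.0]]  # H = OMEGA * sigma_x
W0 = [[0.8, 0.1], [0.1, 0.2]]     # hypothetical mixed state at t0 = 0

def U(t):
    """exp(-i H t) in closed form, valid because sigma_x squared is the identity."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return [[c, -1j * s], [-1j * s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def W(t):
    """Unitarily evolved state operator U W0 U^dagger."""
    u = U(t)
    return matmul(matmul(u, W0), dagger(u))

t, h = 0.6, 1e-5
Wp, Wm, Wt = W(t + h), W(t - h), W(t)
lhs = [[1j * (Wp[i][j] - Wm[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
HW, WH = matmul(H, Wt), matmul(Wt, H)
rhs = [[HW[i][j] - WH[i][j] for j in range(2)] for i in range(2)]  # [H, W(t)]
error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
trace_t = (Wt[0][0] + Wt[1][1]).real
```

The commutator has to be taken with W(t) at the same time t as the derivative; the trace of W(t) stays equal to 1 throughout.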

The extensions of the Schrödinger postulate and the projection postulate for mixed states can now

be formulated.

5 ′ Generalized Schrödinger postulate. If no measurements are made on the physical system, the

time evolution of the state of the system is described by a unitary transformation,

W (t) = U (t − t 0 ) W (t 0 ) U † (t − t 0 ). (III. 60)

6 ′ Generalized projection postulate, discrete case. If the system is in a state W when a measurement

is made on a physical quantity A corresponding to an operator A having a discrete spectrum,

and the outcome of the measurement is the eigenvalue a i ∈ R, the system is, directly

after the measurement, in the eigenspace corresponding to the eigenvalue a i ,

\[ W \longrightarrow \frac{P_{a_i}\, W\, P_{a_i}}{\mathrm{Tr}\, P_{a_i} W P_{a_i}} . \tag{III. 61} \]

◃ Remark

Remember that, in general, the projectors P ai do not have to be 1 - dimensional. ▹
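A small numerical sketch of the generalized projection postulate follows. It assumes a hypothetical state operator W on a 3 - dimensional space and a 2 - dimensional eigenprojector P; both are illustrative choices and not taken from these notes.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Hypothetical state operator (symmetric, positive, trace 1).
W = [[0.5, 0.1, 0.1],
     [0.1, 0.3, 0.0],
     [0.1, 0.0, 0.2]]

# Projector on a 2-dimensional eigenspace of the measured quantity.
P = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 0]]

prob = trace(matmul(P, W))                 # probability of the outcome, (III. 46)
PWP = matmul(matmul(P, W), P)
norm = trace(PWP)
W_after = [[x / norm for x in row] for row in PWP]   # state after measurement, (III. 61)

prob_again = trace(matmul(P, W_after))     # repeating the measurement
```

The state after the measurement again has trace 1, it keeps the coherences inside the eigenspace, and an immediate repetition of the measurement yields the same outcome with probability 1.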

Finally, we give a theorem concerning the generalized Schrödinger postulate which is important

for the measurement problem.

VON NEUMANN’S THEOREM A:

The properties ‘pure’ and ‘mixed’ are invariant under a unitary time evolution.


Proof

We know that if W is pure, i.e. equal to a 1 - dimensional projector, then W^2 = W . Now consider the expression (sometimes called the purity of W ):

\[ \mathrm{Tr}\, W^2 = \sum_i w_i^2 . \tag{III. 62} \]

Since Tr W = 1 implies ∑ i w i = 1, and W is pure iff exactly one of the w i is equal to 1 and all others vanish, we conclude that

\[ \mathrm{Tr}\, W^2 = 1 \text{ iff } W \text{ is pure}; \quad \mathrm{Tr}\, W^2 < 1 \text{ iff } W \text{ is mixed.} \tag{III. 63} \]

But Tr W^2 is invariant under the time evolution (III. 60). Indeed, if we remember that U † (t − t 0 ) = U^{−1} (t − t 0 ) and that Tr AB = Tr BA, it follows that

\[ \mathrm{Tr}\, (W(t))^2 = \mathrm{Tr}\, U(t - t_0)\, W(t_0)\, U^\dagger(t - t_0)\, U(t - t_0)\, W(t_0)\, U^\dagger(t - t_0) = \mathrm{Tr}\, U(t - t_0)\, W(t_0)\, W(t_0)\, U^\dagger(t - t_0) = \mathrm{Tr}\, U^\dagger(t - t_0)\, U(t - t_0)\, (W(t_0))^2 = \mathrm{Tr}\, (W(t_0))^2 . \tag{III. 64} \]

□
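The invariance of the purity can be illustrated numerically. The sketch below assumes a hypothetical Hamiltonian H = ω σ_x (ħ = 1) and two hypothetical initial states, one mixed and one pure; the names and values are chosen for this example only.

```python
import math

OMEGA = 0.9  # hypothetical frequency; H = OMEGA * sigma_x, hbar = 1

def U(t):
    """exp(-i H t) in closed form, valid because sigma_x squared is the identity."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return [[c, -1j * s], [-1j * s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def evolve(W0, t):
    """W(t) = U(t) W0 U(t)^dagger, as in (III. 60) with t0 = 0."""
    u = U(t)
    return matmul(matmul(u, W0), dagger(u))

def purity(W):
    """Tr W^2, equal to 1 for pure states and smaller than 1 for mixed states."""
    return trace(matmul(W, W)).real

W_mixed = [[0.8, 0.1], [0.1, 0.2]]   # hypothetical mixed state
W_pure = [[1.0, 0.0], [0.0, 0.0]]    # projector on spin up

purity_mixed_0 = purity(W_mixed)
purity_mixed_t = purity(evolve(W_mixed, 0.7))
purity_pure_t = purity(evolve(W_pure, 0.7))
```

The mixed state keeps the purity 0.70 and the pure state keeps the purity 1, although both operators themselves change in time.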

III. 4 COMPOSITE SYSTEMS

Suppose that a system S is composed of two subsystems S I and S II . The Hilbert spaces associated with S I and S II are H I and H II , with dim H I = N I and dim H II = N II ; the Hilbert space associated with S is the direct product space H = H I ⊗ H II , with dim H = N = N I N II . If |α 1 ⟩, . . . , |α N I ⟩ and |β 1 ⟩, . . . , |β N II ⟩ are bases of the spaces H I and H II , {|α i ⟩ ⊗ |β j ⟩} forms a basis in H. An

arbitrary vector in H is a superposition of such direct products of basis vectors and is generally not of

the form |ψ⟩ ⊗ |ϕ⟩, with |ψ⟩ ∈ H I and |ϕ⟩ ∈ H II . Consequently, one cannot say for such an arbitrary

state in H that the subsystems are in some pure state in H I or H II .

This entanglement of the subsystems, when |Ψ⟩ ≠ |ψ⟩ ⊗ |ϕ⟩, with |Ψ⟩ ∈ H, which is characteristic

for quantum mechanics, has no analogue in classical mechanics. It is a consequence of the formal

requirement that the state space of a composite system is also a vector space. Entanglement is the aspect

of the quantum mechanical description that gives rise to the EPR - paradox and the measurement

problem as we shall see in later chapters.

The quantities of system S correspond to self - adjoint operators in H. We make the supposition

that quantities of the subsystem S I correspond to operators of the form A ⊗ 𝟙 in H, where A is a self - adjoint operator in H I , and quantities of S II correspond analogously to operators of the form 𝟙 ⊗ B, with B in H II . A state of S is given by a state operator W in H; W ∈ S (H). In

general, W is not a direct product of operators, but in case W can be written as a direct product, we

write W = W 1 ⊗ W 2 , with W 1 and W 2 state operators in H I and H II , respectively.


EXERCISE 17. Prove the following statements.

(a) W = W 1 ⊗ W 2 is a state operator if W 1 and W 2 are state operators.

(b) The converse of (a) is not true; give a counterexample.

EXERCISE 18. Prove that for all vectors |ψ⟩, |ψ ′ ⟩ ∈ H I and |ϕ⟩, |ϕ ′ ⟩ ∈ H II we have

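\[ \big( |\psi\rangle \otimes |\phi\rangle \big) \big( \langle\psi'| \otimes \langle\phi'| \big) = |\psi\rangle \langle\psi'| \otimes |\phi\rangle \langle\phi'| . \tag{III. 65} \]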

THEOREM:

If W is a direct product of operators, W = W 1 ⊗ W 2 , the subsystems are mutually

independent, i.e., the probability to find for A ⊗ 𝟙 the value a i and for 𝟙 ⊗ B the value b j is equal to the product of the separate probabilities. In this case the expectation values factorize too, such that \( \langle A \otimes B \rangle_{W_1 \otimes W_2} = \langle A \rangle_{W_1} \langle B \rangle_{W_2} \).

Proof

Let a i and b j be eigenvalues of A and B, respectively. Using (III. 65) we see that the projector on

the eigenstate |a i ⟩ ⊗ |b j ⟩ of A ⊗ B is P |ai⟩ ⊗ P |bj⟩. Therefore, with (II. 102), p. 33,

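\[
\begin{aligned}
\mu_W\big( P_{|a_i\rangle} \otimes P_{|b_j\rangle} \big)
&= \mathrm{Tr}\, \big( P_{|a_i\rangle} \otimes P_{|b_j\rangle} \big) \big( W_1 \otimes W_2 \big) \\
&= \mathrm{Tr}\, \big( P_{|a_i\rangle} W_1 \otimes P_{|b_j\rangle} W_2 \big) \\
&= \mathrm{Tr}\, P_{|a_i\rangle} W_1 \; \mathrm{Tr}\, P_{|b_j\rangle} W_2 \\
&= \mu_{W_1}\big( P_{|a_i\rangle} \big)\, \mu_{W_2}\big( P_{|b_j\rangle} \big) \\
&= \mu_W\big( P_{|a_i\rangle} \otimes \mathbb{1} \big)\, \mu_W\big( \mathbb{1} \otimes P_{|b_j\rangle} \big) ,
\end{aligned}
\tag{III. 66}
\]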

which proves the first part of the theorem.

For the factorization of the expectation values, we have, analogously,

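\[ \langle A \otimes B \rangle_{W_1 \otimes W_2} = \mathrm{Tr}\, (A \otimes B)(W_1 \otimes W_2) = \mathrm{Tr}\, A W_1 \; \mathrm{Tr}\, B W_2 = \langle A \rangle_{W_1} \langle B \rangle_{W_2} , \tag{III. 67} \]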

and we see that the expectation values indeed factorize. □

From (III. 67) we also see that, if W = W 1 ⊗ W 2 , then ⟨A ⊗ 𝟙⟩ W = Tr A W 1 = ⟨A⟩ W1 and ⟨𝟙 ⊗ B⟩ W = ⟨B⟩ W2 , but this does not hold for more general statistical operators W .
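The factorization for product states can be verified with explicit matrices. In the following sketch the direct product is represented by the Kronecker product of matrices; the state operators W1, W2, the quantities A = σ_z, B = σ_x and the projectors are hypothetical choices made for illustration.

```python
def kron(A, B):
    """Kronecker product: entry (i*m + k, j*n + l) equals A[i][j] * B[k][l]."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

I2 = [[1, 0], [0, 1]]
W1 = [[0.7, 0.0], [0.0, 0.3]]   # hypothetical mixed state of subsystem I
W2 = [[0.5, 0.5], [0.5, 0.5]]   # hypothetical pure state of subsystem II
A = [[1, 0], [0, -1]]           # sigma_z on subsystem I
B = [[0, 1], [1, 0]]            # sigma_x on subsystem II
P_a = [[1, 0], [0, 0]]          # eigenprojector of A
P_b = [[0.5, 0.5], [0.5, 0.5]]  # eigenprojector of B

W = kron(W1, W2)

# Expectation values, (III. 67)
exp_AB = trace(matmul(kron(A, B), W))
exp_A = trace(matmul(A, W1))
exp_B = trace(matmul(B, W2))
exp_A_full = trace(matmul(kron(A, I2), W))

# Joint probability versus product of separate probabilities, (III. 66)
joint = trace(matmul(kron(P_a, P_b), W))
separate = trace(matmul(kron(P_a, I2), W)) * trace(matmul(kron(I2, P_b), W))
```

For a state operator that is not a product, the same comparison will in general fail, which is what the remainder of this section addresses.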


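With (II. 99), for an arbitrary state operator W , hence in general W ≠ W 1 ⊗ W 2 , the expectation value of A ⊗ 𝟙 is

\[
\begin{aligned}
\langle A \otimes \mathbb{1} \rangle_W &= \mathrm{Tr}\, (A \otimes \mathbb{1}) W \\
&= \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \big( \langle\alpha_i| \otimes \langle\beta_j| \big) \big( A \otimes \mathbb{1} \big)\, W\, \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big) \\
&= \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \sum_{j=1}^{N_{II}} \big( \langle\alpha_i| \otimes \langle\beta_j| \big) \big( A\, |\alpha_k\rangle \langle\alpha_k| \otimes \mathbb{1} \big)\, W\, \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big) \\
&= \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \langle\alpha_i | A | \alpha_k\rangle \sum_{j=1}^{N_{II}} \big( \langle\alpha_k| \otimes \langle\beta_j| \big)\, W\, \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big) .
\end{aligned}
\tag{III. 68}
\]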

To find ⟨A ⊗ 𝟙⟩ W , define the operator W I in H I , called the partial trace of W in relation to H II ,

\[ W_I = \mathrm{Tr}_{II}\, W := \sum_{j=1}^{N_{II}} \langle\beta_j | W | \beta_j\rangle , \quad W_I \in S(H_I) . \tag{III. 69} \]

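For this partial trace it holds that

\[ \langle\alpha_k | W_I | \alpha_i\rangle = \sum_{j=1}^{N_{II}} \big( \langle\alpha_k| \otimes \langle\beta_j| \big)\, W\, \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big) , \quad \langle\alpha_k | W_I | \alpha_i\rangle \in \mathbb{C} , \tag{III. 70} \]

and substituting (III. 70) in (III. 68) yields

\[ \langle A \otimes \mathbb{1} \rangle_W = \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \langle\alpha_i | A | \alpha_k\rangle \langle\alpha_k | W_I | \alpha_i\rangle = \mathrm{Tr}\, A W_I = \langle A \rangle_{W_I} . \tag{III. 71} \]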

Analogously, with W II the partial trace of W in relation to H I ,

\[ W_{II} = \mathrm{Tr}_{I}\, W := \sum_{i=1}^{N_I} \langle\alpha_i | W | \alpha_i\rangle , \quad W_{II} \in S(H_{II}) , \tag{III. 72} \]

we see that

\[ \langle \mathbb{1} \otimes B \rangle_W = \mathrm{Tr}\, B W_{II} = \langle B \rangle_{W_{II}} . \tag{III. 73} \]
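The partial traces (III. 69) and (III. 72) can be written out for matrices. The sketch below assumes that the product basis vector |α i ⟩ ⊗ |β j ⟩ carries the matrix index i · N II + j, counting from 0; this ordering convention, the example state (a mixture of an entangled state and a product state) and the function names are assumptions of this illustration.

```python
def partial_trace_II(W, n1, n2):
    """Tr_II W: sum over the index of subsystem II, cf. (III. 69)."""
    return [[sum(W[i * n2 + j][k * n2 + j] for j in range(n2))
             for k in range(n1)] for i in range(n1)]

def partial_trace_I(W, n1, n2):
    """Tr_I W: sum over the index of subsystem I, cf. (III. 72)."""
    return [[sum(W[i * n2 + j][i * n2 + l] for i in range(n1))
             for l in range(n2)] for j in range(n2)]

def kron(A, B):
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Hypothetical state: 0.6 |Phi><Phi| + 0.4 |01><01|, Phi = (|00> + |11>)/sqrt(2)
W = [[0.3, 0.0, 0.0, 0.3],
     [0.0, 0.4, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [0.3, 0.0, 0.0, 0.3]]

I2 = [[1, 0], [0, 1]]
sigma_z = [[1, 0], [0, -1]]

W_I = partial_trace_II(W, 2, 2)
W_II = partial_trace_I(W, 2, 2)

exp_full_I = trace(matmul(kron(sigma_z, I2), W))    # <A (x) 1>_W
exp_reduced_I = trace(matmul(sigma_z, W_I))         # Tr A W_I, (III. 71)
exp_full_II = trace(matmul(kron(I2, sigma_z), W))   # <1 (x) B>_W
exp_reduced_II = trace(matmul(sigma_z, W_II))       # Tr B W_II, (III. 73)
```

In this example W I and W II both have trace 1 and non - negative diagonal elements, consistent with their being state operators of the subsystems.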

EXERCISE 19. Prove that Tr II W and Tr I W are state operators in H I and H II , respectively.


Concerning the expectation values of the quantities of the subsystem S I alone we can replace the state W by the partial trace, or state operator, Tr II W in H I , and analogously for S II . Therefore it is

customary to let the states of the subsystems correspond to the partial traces Tr II W and Tr I W .

For the partial traces it holds that if W is a direct product of state operators W 1 and W 2 in H I

and H II , respectively, W can also be written as a direct product of its partial traces, which we now

show in a lemma.

LEMMA:

If W is a direct product of the form W = W 1 ⊗ W 2 , where W 1 and W 2 are state operators

in H I and H II , respectively, then Tr II W = W 1 and Tr I W = W 2 .

Proof

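\[ \mathrm{Tr}_{II}\, W = \mathrm{Tr}_{II}\, (W_1 \otimes W_2) = \sum_{j=1}^{N_{II}} \langle\beta_j | W_1 \otimes W_2 | \beta_j\rangle = W_1 \sum_{j=1}^{N_{II}} \langle\beta_j | W_2 | \beta_j\rangle = W_1\, \mathrm{Tr}\, W_2 = W_1 , \tag{III. 74} \]

likewise,

\[ \mathrm{Tr}_{I}\, (W_1 \otimes W_2) = W_2 . \tag{III. 75} \]

□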

From this lemma we see that W = W 1 ⊗ W 2 = Tr II W ⊗ Tr I W , and with the first theorem

of this section, p. 56, this leads to the conclusion that if W is a direct product of its partial traces, it

can be uniquely reconstructed from its partial traces. Generally, an arbitrary state operator W of the composite system cannot be determined by its partial traces, which was shown by Von Neumann.

VON NEUMANN’S THEOREM B:

The partial traces Tr II W and Tr I W uniquely define W , iff at least one of the partial

traces is pure, in which case W is factorizable,

W = Tr II W ⊗ Tr I W. (III. 76)

Proof

Let {|u i ⟩} be a basis of eigenstates of W I having non - degenerate eigenvalues. Leaving out the

eigenvalues p n and u i which are equal to 0, expand W and Tr II W in their eigenvectors,

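\[ W = \sum_{n=1}^{N} p_n\, |\psi_n\rangle \langle\psi_n| \quad\text{with}\quad |\psi_n\rangle \in H \tag{III. 77} \]

and

\[ \mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i| \quad\text{with}\quad |u_i\rangle \in H_I . \tag{III. 78} \]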


◃ Remark

Leaving out the eigenvalues u i = 0, the eigenvectors |u i ⟩ with eigenvalue 0 do not occur in the expansion of Tr II W ; however, they do belong to the complete basis {|u i ⟩}. ▹

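Let {|v j ⟩} be a basis in H II . Then {|u i ⟩ ⊗ |v j ⟩} is a basis in H, and |ψ n ⟩ can be expanded as

\[ |\psi_n\rangle = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \psi^n_{ij}\, |u_i\rangle \otimes |v_j\rangle = \sum_{i=1}^{N_I} |u_i\rangle \otimes |\phi^n_i\rangle \tag{III. 79} \]

where

\[ |\phi^n_i\rangle := \sum_{j=1}^{N_{II}} \psi^n_{ij}\, |v_j\rangle \in H_{II} . \tag{III. 80} \]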

These |ϕ n i ⟩ are, in general, not orthogonal. Substituting (III. 79) in (III. 77) we have

\[ W = \sum_{n=1}^{N} p_n \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} |u_i\rangle \langle u_k| \otimes |\phi^n_i\rangle \langle\phi^n_k| . \tag{III. 81} \]

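Substitution of (III. 81) in (III. 69) yields

\[
\begin{aligned}
\mathrm{Tr}_{II}\, W = \sum_{l=1}^{N_{II}} \langle\beta_l | W | \beta_l\rangle
&= \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \sum_{l=1}^{N_{II}} p_n\, |u_i\rangle \langle u_k|\, \langle\beta_l | \phi^n_i\rangle \langle\phi^n_k | \beta_l\rangle \\
&= \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} p_n\, \langle\phi^n_k | \phi^n_i\rangle\, |u_i\rangle \langle u_k| .
\end{aligned}
\tag{III. 82}
\]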

With {|ψ i ⟩} a basis, the coefficients in the expansion of an operator of the form ∑ ij c ij |ψ i ⟩⟨ψ j | are unique, and comparison of (III. 82) with (III. 78) gives

\[ \sum_{n=1}^{N} p_n\, \langle\phi^n_k | \phi^n_i\rangle = u_i\, \delta_{ik} , \tag{III. 83} \]

therefore,

\[ \mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} u_i\, \delta_{ik}\, |u_i\rangle \langle u_k| = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i| . \tag{III. 84} \]

◃ Remark

In (III. 83) it follows for i = k, due to the positivity of the p n , that if u i = 0 for certain i,

then |ϕ n i ⟩ = 0 for all n and we see that in (III. 79) only the terms appear for which u i ≠ 0.

Consequently, the same terms occur in (III. 79) as in the expansion (III. 78) of Tr II W . ▹


If Tr II W is pure, there is only one term

Tr II W = |u 1 ⟩ ⟨u 1 |, (III. 85)

and substitution in (III. 79) yields

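\[ |\psi_n\rangle = |u_1\rangle \otimes |\phi^n_1\rangle . \tag{III. 86} \]

Therefore,

\[ W = \sum_{n=1}^{N} p_n\, |u_1\rangle \langle u_1| \otimes |\phi^n_1\rangle \langle\phi^n_1| = |u_1\rangle \langle u_1| \otimes \sum_{n=1}^{N} p_n\, |\phi^n_1\rangle \langle\phi^n_1| . \tag{III. 87} \]

Analogous to (III. 82) we find for

\[ \mathrm{Tr}_{I}\, W = \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} p_n\, \langle u_k | u_i\rangle\, |\phi^n_i\rangle \langle\phi^n_k| . \tag{III. 88} \]

With i = k = 1 and ⟨u 1 | u 1 ⟩ = 1 we have

\[ \mathrm{Tr}_{I}\, W = \sum_{n=1}^{N} p_n\, |\phi^n_1\rangle \langle\phi^n_1| . \tag{III. 89} \]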

Substituting (III. 89) in (III. 87) we see that W = Tr II W ⊗ Tr I W . Indeed, if one of the partial

traces is pure, W is factorizable, and therefore completely determined, by its partial traces.

To show the ‘only if’ - part of the theorem, that Tr II W and Tr I W uniquely define the state W

of the composite system only if at least one of the partial traces is pure, since only in that

case W is factorizable, we decompose them into orthogonal 1 - dimensional eigenprojectors,

where both u i , v j ∈ [0, 1] sum up to 1 as required in (III. 35) for the projectors to be state

operators,

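$$\mathrm{Tr}_{II}\,W = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i| := \sum_{i=1}^{N_I} u_i\, U_i, \tag{III.90}$$

$$\mathrm{Tr}_{I}\,W = \sum_{j=1}^{N_{II}} v_j\, |v_j\rangle \langle v_j| := \sum_{j=1}^{N_{II}} v_j\, V_j. \tag{III.91}$$

It then holds that

$$\mathrm{Tr}_{II}\,W \otimes \mathrm{Tr}_{I}\,W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} u_i v_j\, U_i \otimes V_j. \tag{III.92}$$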

Now consider an operator $W$ of the form

$$W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij}\, U_i \otimes V_j, \tag{III.93}$$

which is, in general, not factorizable.

EXERCISE 20. Prove that U i ⊗ V j is a 1 - dimensional projector in H.

The operator W , (III. 93), is a state operator if

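$$z_{ij} \in [0, 1] \quad \text{and} \quad \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij} = 1, \tag{III.94}$$

furthermore, with (III. 69) and (III. 72) we have

$$\mathrm{Tr}_{II}\,W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij}\, U_i \tag{III.95}$$

and

$$\mathrm{Tr}_{I}\,W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij}\, V_j. \tag{III.96}$$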

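Comparing (III. 95) and (III. 96) with (III. 90) and (III. 91) gives the linear system $\sum_j z_{ij} = u_i$, $\sum_i z_{ij} = v_j$. This system has an infinite number of solutions for the unknown $z_{ij}$, unless one of the partial traces is pure, e.g. $\mathrm{Tr}_{II}\,W = U_1$. In that case, according to (III. 95), it has to hold for $i = 1$ that $\sum_j z_{1j} = 1$. But then (III. 35) requires that $\sum_j z_{ij} = 0$ if $i \neq 1$, which means that, because of the non-negativity of the $z_{ij}$, it has to hold that $z_{ij} = 0$ if $i \neq 1$. Substituting $i = 1$ in (III. 93) yields

$$W = \sum_{j=1}^{N_{II}} z_{1j}\, U_1 \otimes V_j = U_1 \otimes \sum_{j=1}^{N_{II}} z_{1j}\, V_j = \mathrm{Tr}_{II}\,W \otimes \mathrm{Tr}_{I}\,W, \tag{III.97}$$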

where the last step is in accordance with (III. 96).

We conclude that only if, at least, one of the partial traces is pure, W is factorizable. □
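The counting argument of the proof can be illustrated numerically. The Python sketch below assumes N_I = N_II = 2 and illustrative weights u = v = (1/2, 1/2); these values and the helper name `marginals` are assumptions made here, not part of the text. It shows two different coefficient tables z_ij for (III. 93) that give the same partial traces through (III. 95) and (III. 96), and a third case in which a pure partial trace leaves no freedom.

```python
# Coefficient tables z[i][j] for W = sum_ij z_ij U_i (x) V_j, eq. (III. 93).
# All numerical values below are illustrative assumptions.

def marginals(z):
    """Return (u, v) with u_i = sum_j z_ij and v_j = sum_i z_ij,
    the weights of Tr_II W and Tr_I W in (III. 95) and (III. 96)."""
    u = [sum(row) for row in z]
    v = [sum(col) for col in zip(*z)]
    return u, v

# Two different state operators with the same mixed partial traces.
z_product = [[0.25, 0.25],
             [0.25, 0.25]]     # z_ij = u_i v_j, the factorizable choice (III. 92)
z_correlated = [[0.5, 0.0],
                [0.0, 0.5]]    # perfectly correlated, not of product form

# A pure partial trace Tr_II W = U_1 means u = (1, 0): the second row must
# vanish and the first row is fixed by v, so only one table is possible.
v = [0.5, 0.5]
z_pure = [v[:], [0.0, 0.0]]

for name, z in [("product", z_product),
                ("correlated", z_correlated),
                ("pure", z_pure)]:
    print(name, marginals(z))
```

The first two tables print identical marginals although they describe different composite states, which is the non-uniqueness used in the 'only if' part.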

In the foregoing we saw that the state operator $W$ of a composite system is uniquely defined by its partial traces only if it is factorizable. Contrary to classical physics, in quantum mechanics maximal knowledge of the state of the subsystems is in general not equivalent to maximal knowledge of the state of the entire system. Consequently, the state of the entire system can, generally, not be derived from measurements on the separate subsystems.⁴

If the partial traces of $W = W_1 \otimes W_2$ are both pure, $W$ is also pure, as we saw in the exercise on p. 56, and since the pure partial traces each have only one term, $W$ is of the form $|u\rangle\langle u| \otimes |v\rangle\langle v|$.

On the other hand, a pure state in H is, generally, not factorizable, which we will show in an example.

EXAMPLE

If |u i ⟩ and |v j ⟩ span a basis in H I and H II , respectively, an arbitrary vector |ψ⟩ in H = H I ⊗ H II

is of the form

$$|\psi\rangle = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} c_{ij}\, |u_i\rangle \otimes |v_j\rangle. \tag{III.98}$$

An arbitrary pure state in H is therefore of the form

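$$|\psi\rangle \langle\psi| = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \sum_{k=1}^{N_I} \sum_{l=1}^{N_{II}} c^*_{kl}\, c_{ij}\, \big( |u_i\rangle \otimes |v_j\rangle \big) \big( \langle u_k| \otimes \langle v_l| \big). \tag{III.99}$$

Consider the following pure entangled state in $H$,

$$|\Phi\rangle = \tfrac{1}{2}\sqrt{2}\, \big( |u_1\rangle \otimes |v_1\rangle + |u_2\rangle \otimes |v_2\rangle \big). \tag{III.100}$$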

The corresponding W is the 1 - dimensional projector

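$$W = |\Phi\rangle \langle\Phi| = \tfrac{1}{2} \big( |u_1\rangle \langle u_1| \otimes |v_1\rangle \langle v_1| + |u_1\rangle \langle u_2| \otimes |v_1\rangle \langle v_2| + |u_2\rangle \langle u_1| \otimes |v_2\rangle \langle v_1| + |u_2\rangle \langle u_2| \otimes |v_2\rangle \langle v_2| \big). \tag{III.101}$$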

This pure state W is not factorizable, and cannot be written in the form (III. 93). But although W

is pure, its partial traces are not pure,

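$$\mathrm{Tr}_{II}\,W = \sum_{j=1}^{N_{II}} \langle v_j | \Phi\rangle \langle\Phi | v_j \rangle = \tfrac{1}{2} \big( |u_1\rangle \langle u_1| + |u_2\rangle \langle u_2| \big), \tag{III.102}$$

$$\mathrm{Tr}_{I}\,W = \sum_{i=1}^{N_I} \langle u_i | \Phi\rangle \langle\Phi | u_i \rangle = \tfrac{1}{2} \big( |v_1\rangle \langle v_1| + |v_2\rangle \langle v_2| \big), \tag{III.103}$$

and indeed,

$$W_I \otimes W_{II} = \tfrac{1}{4} \big( |u_1\rangle \langle u_1| \otimes |v_1\rangle \langle v_1| + |u_1\rangle \langle u_1| \otimes |v_2\rangle \langle v_2| + |u_2\rangle \langle u_2| \otimes |v_1\rangle \langle v_1| + |u_2\rangle \langle u_2| \otimes |v_2\rangle \langle v_2| \big) \neq W. \tag{III.104}$$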

⁴ This aspect of the quantum mechanical state description is, however, analogous to a classical state description with a probability distribution. The two-particle distribution function $\rho(q_1, p_1; q_2, p_2)$ is not uniquely defined by the marginal distribution functions

$$\rho_1(q_1, p_1) = \int \rho(q_1, p_1; q_2, p_2)\, dq_2\, dp_2 \quad \text{and} \quad \rho_2(q_2, p_2) = \int \rho(q_1, p_1; q_2, p_2)\, dq_1\, dp_1;$$

the marginals are, after all, analogous to the partial traces.


III. 4. 1 SUMMARY

1. The state operator W ∈ S (H) of a composite system, whether pure or not, is not factorizable

in general.

2. If W is factorizable, the factors are equal to the partial traces of W ,

W = W 1 ⊗ W 2 implies W 1 = Tr II W and W 2 = Tr I W. (III. 105)

3. The partial traces uniquely define W iff, at least, one of the partial traces is pure, in which

case W is directly factorizable, W = W 1 ⊗ W 2 .

4. The partial traces of W are pure iff W is pure and of the form W = ( |u⟩ ⊗ |v⟩ )( ⟨u| ⊗ ⟨v| ) ,

with |u⟩ ∈ H I and |v⟩ ∈ H II .

III. 5 PROPER AND IMPROPER MIXTURES

The states of composite systems shed new light on the interpretation of mixtures. Suppose that $W_I$ and $W_{II}$ are the partial traces of an arbitrary state operator $W$, and, with $u_i, v_j \in [0, 1]$, it holds that

$$W_I = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i| \quad \text{and} \quad W_{II} = \sum_{j=1}^{N_{II}} v_j\, |v_j\rangle \langle v_j|. \tag{III.106}$$

W I and W II contain all quantum mechanical information about results of measurements on the subsystems

in H I and H II . The question is whether we can interpret this by assuming that the individual

subsystems are in the pure states |u i ⟩ and |v j ⟩, with probabilities u i and v j , respectively. If this were

the case, the composite system could be divided in subensembles of systems in the states |u i ⟩ ⊗ |v j ⟩

with probabilities depending on possible correlations between the values of i and j. The state would

be of the form

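$$W' = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij}\, \big( |u_i\rangle \otimes |v_j\rangle \big) \big( \langle u_i| \otimes \langle v_j| \big) = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij}\, |u_i\rangle \langle u_i| \otimes |v_j\rangle \langle v_j|. \tag{III.107}$$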

The coefficients p ij have to satisfy

$$p_{ij} \in [0, 1], \quad \sum_{j=1}^{N_{II}} p_{ij} = u_i, \quad \sum_{i=1}^{N_I} p_{ij} = v_j \quad \text{and} \quad \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij} = 1, \tag{III.108}$$

but otherwise they can be chosen freely. As far as being in one of the states $|u_i\rangle$ or $|v_j\rangle$ can be interpreted

as a property the subsystems possess, all correlations between these properties in the total state can

be expressed by the p ij . If there are no correlations, p ij = u i v j .

But $W'$ is of the special form (III. 93), and is therefore in general not equal to the arbitrary state operator $W$ we started with. Hence it cannot be said that the individual subsystems are in the pure states $|u_i\rangle$ and $|v_j\rangle$. Although $W_I$ and $W_{II}$ are state operators, they cannot be interpreted as mixtures of pure states. The mixed states $W_I$ and $W_{II}$ are called improper mixtures by B. d’Espagnat (1989, p. 61). Proper mixed states can in principle be taken as an ensemble of systems which are in pure states, whereas improper mixtures cannot.

The foregoing shows that the concept of mixed states is forced upon us by the theory of composite

systems as a natural extension of the concept of pure states. Even if the composite system is in a pure state, the subsystems are generally not in pure states. It is therefore not correct to understand mixed states in general as simple mixtures of pure states, in the way the mixture of pieces in the box of a game of chess consists of black and white pieces.

Finally, we make an observation about similar, or identical, particles. A system of similar particles

is described in quantum mechanics by symmetrized states. Consider the following symmetrized

two - particle state

$$|\Psi(1, 2)\rangle = \tfrac{1}{2}\sqrt{2}\, \big( |u\rangle \otimes |v\rangle \pm |v\rangle \otimes |u\rangle \big), \tag{III.109}$$

where the first factor in each direct product is related to particle 1, and the second to particle 2. In

this case the two subspaces are identical and |u⟩ and |v⟩ can represent states in both one and the other

subspace. The corresponding state operator is

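$$W = |\Psi(1, 2)\rangle \langle\Psi(1, 2)| = \tfrac{1}{2} \big( |u\rangle \langle u| \otimes |v\rangle \langle v| \pm |u\rangle \langle v| \otimes |v\rangle \langle u| \pm |v\rangle \langle u| \otimes |u\rangle \langle v| + |v\rangle \langle v| \otimes |u\rangle \langle u| \big), \tag{III.110}$$

the partial traces are

$$W_I = \mathrm{Tr}_{II}\,W = \tfrac{1}{2} \big( |u\rangle \langle u| + |v\rangle \langle v| \big), \tag{III.111}$$

and

$$W_{II} = \mathrm{Tr}_{I}\,W = \tfrac{1}{2} \big( |v\rangle \langle v| + |u\rangle \langle u| \big), \tag{III.112}$$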

and we see that the partial traces are identical. We have to say that both particles are in the same state; we certainly cannot say that one particle is in the state $|u\rangle$ and the other in $|v\rangle$. We cannot assign a pure state to the separate particles, although the state of the composite system is pure.

III. 6 SPIN 1/2 PARTICLES

The time - dependent Schrödinger equation for the wave function Ψ(q, t) is given by

$$i\hbar\, \frac{\partial \Psi}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \Psi + V \Psi. \tag{III.113}$$

In this equation

$$\vec{p} = -i\hbar\, \vec{\nabla} \tag{III.114}$$

is the canonical momentum operator, yielding for the components of the angular momentum $\vec{L} = \vec{q} \times \vec{p}$ for a system in 3-dimensional space

$$L_i = -i\hbar\, \epsilon_{ijk}\, q_j\, \partial_k. \tag{III.115}$$

These components do not commute,

$$[L_i, L_j] = i\hbar\, \epsilon_{ijk}\, L_k, \tag{III.116}$$

but the operator $\vec{L}^2 = L_x^2 + L_y^2 + L_z^2$ does commute with $\vec{L}$, or with any one of its components, where usually $L_z$ is taken.

The simultaneous eigenstates of $\vec{L}^2$ and $L_z$ are written as $|l, m\rangle$, and their eigenvalues are discrete,

$$\vec{L}^2\, |l, m\rangle = \hbar^2\, l(l+1)\, |l, m\rangle, \quad \text{with } l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots, \tag{III.117}$$

$$L_z\, |l, m\rangle = m\hbar\, |l, m\rangle, \quad \text{with } m = -l, -l+1, \ldots, l-1, l. \tag{III.118}$$

Although the algebraic derivation using the commutation relations allows for half integer values,

for angular momentum ⃗ L the values of l can only be integers to make sense physically. But the half

integer values are included in the description of spin.

Spin ⃗ S is an internal degree of freedom of elementary particles, which cannot easily be described

in classical terms, but is similar to ⃗ L. A main difference is that where the value of the angular momentum

of a particle can vary, the value s of spin of a particle is constant. The similarity is that spin has,

like ⃗ L, a direction ⃗n in 3 - dimensional space, and satisfies the commutation relations of (III. 116).

Writing the simultaneous eigenstates of ⃗ S 2 and S z as |s, m⟩, we can use (III. 117) and (III. 118)

again, where L 2 and L z are replaced by S 2 and S z , respectively, and l by s. The eigenvalues of ⃗ S 2

and S z are

$$s = 0: \quad \vec{S}^2 = 0, \quad S_z = 0, \tag{III.119}$$

$$s = \tfrac{1}{2}: \quad \vec{S}^2 = \tfrac{3}{4}\hbar^2, \quad S_z = -\tfrac{1}{2}\hbar,\ \tfrac{1}{2}\hbar, \tag{III.120}$$

and so on for $s = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$. In this section we restrict ourselves to the simplest non-trivial case, spin 1/2.

For spin 1/2 particles there are only two orthonormal eigenstates, $|\tfrac{1}{2}, \tfrac{1}{2}\rangle$ and $|\tfrac{1}{2}, -\tfrac{1}{2}\rangle$, called

‘spin up’ and ‘spin down’, usually written as |↑⟩ and |↓⟩, respectively. Together, these eigenstates

form a basis for a spin space, the 2 - dimensional Hilbert space H = C 2 .

According to the observables postulate, p. 41, the observable spin corresponds uniquely to a self-adjoint, or Hermitian, operator $A$ in $H$. Every Hermitian operator in $\mathbb{C}^2$ can be represented in the aforementioned basis as a 2 × 2-matrix,

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} a_0 + a_z & a_x - i a_y \\ a_x + i a_y & a_0 - a_z \end{pmatrix} = a_0\, \mathbb{1} + a_x \sigma_x + a_y \sigma_y + a_z \sigma_z = a_0\, \mathbb{1} + \vec{a} \cdot \vec{\sigma}, \tag{III.121}$$

with real coefficients $a_0$ and $\vec{a}$, and $\vec{\sigma}$ defined by the Pauli matrices,

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{III.122}$$

EXERCISE 21. Prove the aforementioned statement.

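Writing the eigenvectors of $\sigma_z$, $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, as $|z\uparrow\rangle$ and $|z\downarrow\rangle$, we have

$$\sigma_z\, |z\uparrow\rangle = |z\uparrow\rangle \quad \text{and} \quad \sigma_z\, |z\downarrow\rangle = -|z\downarrow\rangle. \tag{III.123}$$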

Analogously, let |x ↑⟩, |x ↓⟩ and |y ↑⟩, |y ↓⟩ denote eigenstates for the eigenvalues ±1 of σ x and σ y .

The Pauli matrices have the following properties:

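$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \mathbb{1}, \tag{III.124}$$

$$\sigma_i \sigma_j = i\, \epsilon_{ijk}\, \sigma_k \quad (i \neq j), \tag{III.125}$$

$$\mathrm{Tr}\, \vec{\sigma} = 0. \tag{III.126}$$

Using the anticommutation relations for the Pauli matrices, $[\sigma_i, \sigma_j]_+ = 0$ for $i \neq j$, which follow directly from (III. 125), we find a useful relation,

$$(\vec{a} \cdot \vec{\sigma})\, (\vec{b} \cdot \vec{\sigma}) = (\vec{a} \cdot \vec{b})\, \mathbb{1} + i\, \vec{\sigma} \cdot (\vec{a} \times \vec{b}) \tag{III.127}$$

from which it follows that

$$(\vec{a} \cdot \vec{\sigma})^2 = \mathbb{1} \quad \text{if } \|\vec{a}\| = 1. \tag{III.128}$$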

A 2 × 2 - matrix A has eigenvalues ±1 iff A 2 = 11, and therefore, with ⃗n a unit vector, we see

that the only operators of the form (III. 121) having eigenvalues ±1 are precisely of the form ⃗n · ⃗σ.
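The product relation (III. 127) and its consequence (III. 128) can be checked numerically. The following Python sketch uses plain nested lists for the 2 × 2 matrices; the helper names and the sample vectors are illustrative assumptions, not notation from the text.

```python
import math

# Pauli matrices (III. 122) and the identity as nested lists.
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dot_sigma(a):
    """The matrix a.sigma = a_x sigma_x + a_y sigma_y + a_z sigma_z."""
    return [[a[0] * SX[i][j] + a[1] * SY[i][j] + a[2] * SZ[i][j]
             for j in range(2)] for i in range(2)]

def cross(a, b):
    """Ordinary vector product a x b in R^3."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def max_diff(A, B):
    """Largest absolute difference between corresponding entries."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

a = [0.3, -1.2, 0.5]      # arbitrary sample vectors
b = [2.0, 0.4, -0.7]

lhs = matmul(dot_sigma(a), dot_sigma(b))
a_dot_b = sum(x * y for x, y in zip(a, b))
sigma_cross = dot_sigma(cross(a, b))
rhs = [[a_dot_b * I2[i][j] + 1j * sigma_cross[i][j] for j in range(2)]
       for i in range(2)]
print(max_diff(lhs, rhs))             # zero up to rounding, eq. (III. 127)

norm = math.sqrt(sum(x * x for x in a))
n = [x / norm for x in a]             # unit vector along a
square = matmul(dot_sigma(n), dot_sigma(n))
print(max_diff(square, I2))           # zero up to rounding, eq. (III. 128)
```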

This allows us to let spin in the direction $\vec{n}$ correspond to the operator

$$\vec{S} = \tfrac{1}{2}\hbar\, \vec{n} \cdot \vec{\sigma}. \tag{III.129}$$

We will justify this choice shortly, but first we determine the eigenvectors of the spin operator $\vec{n} \cdot \vec{\sigma}$.

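Writing $\vec{n}$ in spherical coordinates

$$\vec{n} = \begin{pmatrix} \sin\theta \cos\phi \\ \sin\theta \sin\phi \\ \cos\theta \end{pmatrix}, \tag{III.130}$$

we have

$$\vec{n} \cdot \vec{\sigma} = \begin{pmatrix} \cos\theta & e^{-i\phi} \sin\theta \\ e^{i\phi} \sin\theta & -\cos\theta \end{pmatrix}, \tag{III.131}$$

with eigenvectors

$$|\vec{n}, +\rangle = \begin{pmatrix} e^{-\frac{i}{2}\phi} \cos\tfrac{1}{2}\theta \\ e^{\frac{i}{2}\phi} \sin\tfrac{1}{2}\theta \end{pmatrix} \quad \text{and} \quad |\vec{n}, -\rangle = \begin{pmatrix} -e^{-\frac{i}{2}\phi} \sin\tfrac{1}{2}\theta \\ e^{\frac{i}{2}\phi} \cos\tfrac{1}{2}\theta \end{pmatrix} \tag{III.132}$$

for eigenvalues ±1.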

EXERCISE 22. Verify (III. 132)
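A numerical spot check can accompany the algebraic verification asked for in the exercise. The Python sketch below builds the matrix (III. 131) and the vectors (III. 132) for given angles and measures how far they are from satisfying the eigenvalue equations; the function names are illustrative assumptions, and the check does not replace the proof.

```python
import cmath
import math

def n_dot_sigma(theta, phi):
    """The matrix n.sigma of (III. 131)."""
    return [[math.cos(theta), cmath.exp(-1j * phi) * math.sin(theta)],
            [cmath.exp(1j * phi) * math.sin(theta), -math.cos(theta)]]

def eigvec_plus(theta, phi):
    """|n,+> of (III. 132)."""
    return [cmath.exp(-0.5j * phi) * math.cos(theta / 2),
            cmath.exp(0.5j * phi) * math.sin(theta / 2)]

def eigvec_minus(theta, phi):
    """|n,-> of (III. 132)."""
    return [-cmath.exp(-0.5j * phi) * math.sin(theta / 2),
            cmath.exp(0.5j * phi) * math.cos(theta / 2)]

def apply(M, v):
    """Matrix-vector product for a 2x2 matrix."""
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

def residuals(theta, phi):
    """Sizes of (n.sigma - 1)|n,+> and (n.sigma + 1)|n,->; both should vanish."""
    M = n_dot_sigma(theta, phi)
    plus, minus = eigvec_plus(theta, phi), eigvec_minus(theta, phi)
    m_plus, m_minus = apply(M, plus), apply(M, minus)
    r_plus = max(abs(m_plus[k] - plus[k]) for k in range(2))
    r_minus = max(abs(m_minus[k] + minus[k]) for k in range(2))
    return r_plus, r_minus

print(residuals(0.7, 2.1))   # both entries zero up to rounding
```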

III. 6. 1 SPIN 1/2 AND ROTATIONS IN SPIN SPACE

A rotation over an angle $\alpha \in [0, \pi)$ around an axis in the direction of the unit vector $\vec{m}$, with $\vec{m} \in \mathbb{R}^3$, can be written as a unitary matrix

$$U(\vec{m}, \alpha) = e^{-\frac{i}{\hbar} \alpha\, (\vec{m} \cdot \vec{J})}, \tag{III.133}$$

where the total angular momentum $\vec{J} = \vec{L} + \vec{S}$ is the infinitesimal generator of rotations. With $\vec{L} = 0$ and writing $S_i = \tfrac{1}{2}\hbar\, \sigma_i$, which is, using (III. 124), in accordance with (III. 120) and the still unjustified (III. 129), the Pauli matrices are the generators of rotations in $\mathbb{C}^2$, leading to

$$U(\vec{m}, \alpha) = e^{-\frac{i}{2} \alpha\, (\vec{m} \cdot \vec{\sigma})}, \tag{III.134}$$

where $\|\vec{m}\|$ is again 1. Using Taylor expansions, with (III. 128) we find for (III. 134)

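$$\begin{aligned} U(\vec{m}, \alpha) &= \sum_{k=0}^{\infty} \frac{(-i)^k\, (\vec{m} \cdot \vec{\sigma})^k}{k!}\, \big(\tfrac{1}{2}\alpha\big)^k \\ &= \sum_{\substack{k=0 \\ k\ \mathrm{even}}}^{\infty} \frac{(-1)^{\frac{1}{2}k}}{k!}\, \big(\tfrac{1}{2}\alpha\big)^k\, \mathbb{1} + i\, (\vec{m} \cdot \vec{\sigma}) \sum_{\substack{k=1 \\ k\ \mathrm{odd}}}^{\infty} \frac{(-1)^{\frac{1}{2}(k+1)}}{k!}\, \big(\tfrac{1}{2}\alpha\big)^k \\ &= \cos\tfrac{1}{2}\alpha\ \mathbb{1} - i\, (\vec{m} \cdot \vec{\sigma}) \sin\tfrac{1}{2}\alpha. \end{aligned} \tag{III.135}$$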

It can be verified that, under a rotation around an axis ⃗m over an angle α, with ⃗n R the unit vector

in the rotated direction, the eigenstates of ⃗n · ⃗σ, (III. 132), transform into the eigenstates of ⃗n R · ⃗σ,

obeying the rotational transformation rules

U (⃗m, α) |⃗n, ±⟩ = |⃗n R , ±⟩. (III. 136)
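The closed form (III. 135) can be compared with a truncated Taylor series of the exponential (III. 134). The Python sketch below is an illustration under assumed sample values for the axis and the angle; the function names and the truncation at 30 terms are choices made here, not part of the text.

```python
import math

SIGMA = ([[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]])
IDENTITY = [[1, 0], [0, 1]]

def dot_sigma(m):
    """The matrix m.sigma for a vector m in R^3."""
    return [[sum(c * s[i][j] for c, s in zip(m, SIGMA)) for j in range(2)]
            for i in range(2)]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rotation_series(m, alpha, terms=30):
    """Partial sum of sum_k (-i alpha/2)^k (m.sigma)^k / k!."""
    generator = [[-0.5j * alpha * x for x in row] for row in dot_sigma(m)]
    total = [row[:] for row in IDENTITY]
    term = [row[:] for row in IDENTITY]
    for k in range(1, terms):
        # term now holds generator^k / k!
        term = [[x / k for x in row] for row in matmul(term, generator)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

def rotation_closed(m, alpha):
    """cos(alpha/2) 1 - i (m.sigma) sin(alpha/2), the last line of (III. 135)."""
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    ms = dot_sigma(m)
    return [[c * IDENTITY[i][j] - 1j * s * ms[i][j] for j in range(2)]
            for i in range(2)]

m = (1 / 3, 2 / 3, 2 / 3)     # sample unit vector
alpha = 1.7                   # sample angle in radians
series = rotation_series(m, alpha)
closed = rotation_closed(m, alpha)
diff = max(abs(series[i][j] - closed[i][j]) for i in range(2) for j in range(2))
print(diff)                   # zero up to rounding
```

The agreement relies on $(\vec{m} \cdot \vec{\sigma})^2 = \mathbb{1}$, so the check fails if `m` is not normalized.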


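We illustrate (III. 136) using a rotation of $\vec{n}$ in the xz-plane, $\phi = 0$, over an angle $\alpha$ around the y-axis as in diagram III. 2.

Figure III. 2: A rotated unit vector in the xz-plane. (Diagram: the x, y and z axes, with $\vec{n}$ at angle $\theta$ from the z-axis and $\vec{n}_R$ rotated from $\vec{n}$ by a further angle $\alpha$.)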

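For $\vec{n}$ and $\vec{n}_R$ we have

$$\vec{n} = \begin{pmatrix} \sin\theta \\ 0 \\ \cos\theta \end{pmatrix}, \quad \vec{n}_R = \begin{pmatrix} \sin(\theta + \alpha) \\ 0 \\ \cos(\theta + \alpha) \end{pmatrix}. \tag{III.137}$$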

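The eigenstates of $\vec{n} \cdot \vec{\sigma}$, using (III. 132), are

$$|\vec{n}, +\rangle = \begin{pmatrix} \cos\tfrac{1}{2}\theta \\ \sin\tfrac{1}{2}\theta \end{pmatrix} = \cos\tfrac{1}{2}\theta\, |z\uparrow\rangle + \sin\tfrac{1}{2}\theta\, |z\downarrow\rangle \tag{III.138}$$

and

$$|\vec{n}, -\rangle = \begin{pmatrix} -\sin\tfrac{1}{2}\theta \\ \cos\tfrac{1}{2}\theta \end{pmatrix} = -\sin\tfrac{1}{2}\theta\, |z\uparrow\rangle + \cos\tfrac{1}{2}\theta\, |z\downarrow\rangle. \tag{III.139}$$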

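For a rotation around the y-axis, where $\vec{m} = \vec{e}_y$ and therefore

$$U(\vec{e}_y, \alpha) = \cos\tfrac{1}{2}\alpha\ \mathbb{1} - i\, \vec{e}_y \cdot \vec{\sigma}\, \sin\tfrac{1}{2}\alpha = \begin{pmatrix} \cos\tfrac{1}{2}\alpha & -\sin\tfrac{1}{2}\alpha \\ \sin\tfrac{1}{2}\alpha & \cos\tfrac{1}{2}\alpha \end{pmatrix}, \tag{III.140}$$

we have

$$U(\vec{e}_y, \alpha)\, |\vec{n}, +\rangle = \begin{pmatrix} \cos\tfrac{1}{2}(\theta + \alpha) \\ \sin\tfrac{1}{2}(\theta + \alpha) \end{pmatrix} = \cos\tfrac{1}{2}(\theta + \alpha)\, |z\uparrow\rangle + \sin\tfrac{1}{2}(\theta + \alpha)\, |z\downarrow\rangle \tag{III.141}$$

and

$$U(\vec{e}_y, \alpha)\, |\vec{n}, -\rangle = \begin{pmatrix} -\sin\tfrac{1}{2}(\theta + \alpha) \\ \cos\tfrac{1}{2}(\theta + \alpha) \end{pmatrix} = -\sin\tfrac{1}{2}(\theta + \alpha)\, |z\uparrow\rangle + \cos\tfrac{1}{2}(\theta + \alpha)\, |z\downarrow\rangle, \tag{III.142}$$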


and we see that (III. 141) and (III. 142) are indeed the eigenstates |⃗n R , +⟩ and |⃗n R , −⟩ of ⃗n R · ⃗σ.

Comparison of these eigenstates with the eigenstates of ⃗n · ⃗σ, (III. 138) and (III. 139), shows

that (III. 136) is satisfied. As can easily be verified, this holds in general, and we conclude that spin is represented by the spin operator $\vec{n} \cdot \vec{\sigma}$, which justifies our choice (III. 129).

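Under a rotation around the y-axis over an angle $\theta$ the eigenvectors of $\sigma_z$ transform into

$$U(\vec{e}_y, \theta)\, |z\uparrow\rangle = \big(\cos\tfrac{1}{2}\theta\ \mathbb{1} - i\, \sigma_y \sin\tfrac{1}{2}\theta\big)\, |z\uparrow\rangle = \cos\tfrac{1}{2}\theta\, |z\uparrow\rangle + \sin\tfrac{1}{2}\theta\, |z\downarrow\rangle \tag{III.143}$$

and, likewise,

$$U(\vec{e}_y, \theta)\, |z\downarrow\rangle = -\sin\tfrac{1}{2}\theta\, |z\uparrow\rangle + \cos\tfrac{1}{2}\theta\, |z\downarrow\rangle. \tag{III.144}$$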

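In particular, the eigenvectors of $\sigma_x$ are obtained by rotating the eigenvectors of $\sigma_z$ around the y-axis over $\theta = \tfrac{1}{2}\pi$,

$$U(\vec{e}_y, \tfrac{1}{2}\pi)\, |z\uparrow\rangle = \tfrac{1}{2}\sqrt{2}\, \big( |z\uparrow\rangle + |z\downarrow\rangle \big) = |x\uparrow\rangle, \tag{III.145}$$

and

$$U(\vec{e}_y, \tfrac{1}{2}\pi)\, |z\downarrow\rangle = \tfrac{1}{2}\sqrt{2}\, \big( |z\downarrow\rangle - |z\uparrow\rangle \big) = |x\downarrow\rangle. \tag{III.146}$$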

EXERCISE 23. Construct, analogously, the states |y ↑⟩ and |y ↓⟩ from |z ↑⟩ and |z ↓⟩ using a

rotation around the x - axis.

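Successively rotating over $\tfrac{1}{2}\pi$ around the y-axis carries $|z\uparrow\rangle$ into $|x\uparrow\rangle$, $|z\downarrow\rangle$, $|x\downarrow\rangle$ and finally, after a total rotation over $2\pi$, into $-|z\uparrow\rangle$, and consequently, we have to rotate $|z\uparrow\rangle$ over $4\pi$ to come back to $|z\uparrow\rangle$ again. Generally,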

a rotation over 2π transforms a state |ϕ⟩ into −|ϕ⟩. This means we cannot simply visualize particles

with spin as tiny spinning tops!
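The sign change under a full turn can be followed step by step. The Python sketch below applies the real matrix (III. 140) with $\alpha = \tfrac{1}{2}\pi$ repeatedly to $|z\uparrow\rangle$, represented as the pair of components (up, down); the function name `rotate_y` is an illustrative assumption.

```python
import math

def rotate_y(state, alpha):
    """Apply U(e_y, alpha) of (III. 140) to a spinor (up, down)."""
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    up, down = state
    return (c * up - s * down, s * up + c * down)

state = (1.0, 0.0)            # |z up>
history = [state]
for _ in range(8):            # eight quarter turns: a total rotation over 4 pi
    state = rotate_y(state, math.pi / 2)
    history.append(state)

for step, (up, down) in enumerate(history):
    print(step, round(up, 6), round(down, 6))
```

After four quarter turns the components are (-1, 0) up to rounding, and only after eight quarter turns does (1, 0) return.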

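Finally, a useful relation holds. Choosing again $\vec{e}_y$ for $\vec{m}$, we have $U(\vec{e}_y, \alpha)$ as in (III. 140), which yields for arbitrary $\vec{n}$, (III. 130),

$$\begin{aligned} \langle \vec{n}, +|\, U(\vec{e}_y, \alpha)\, |\vec{n}, +\rangle &= \cos\tfrac{1}{2}\alpha + \big(e^{-i\phi} - e^{i\phi}\big) \cos\tfrac{1}{2}\theta\, \sin\tfrac{1}{2}\theta\, \sin\tfrac{1}{2}\alpha \\ &= \cos\tfrac{1}{2}\alpha - i \sin\phi\, \sin\theta\, \sin\tfrac{1}{2}\alpha, \end{aligned} \tag{III.147}$$

from which we see that, if $\vec{n}$ and $\vec{n}_R$ are in the xz-plane, $\phi = 0$ or $\phi = \pi$,

$$\langle \vec{n}, + | \vec{n}_R, +\rangle = \cos\tfrac{1}{2}\alpha_{\vec{n}\vec{n}_R}, \tag{III.148}$$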

where α ⃗n ⃗nR is the angle between ⃗n and ⃗n R . Because ⃗n and α can be chosen arbitrarily, this relation

holds for any two vectors ⃗n and ⃗n ′ in the xz - plane, and, by freedom of choice of the coordinate

system, it holds whenever ⃗n and ⃗n ′ are in the same plane.


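EXERCISE 24. Show that the operator $\tfrac{1}{2}(\mathbb{1} + \vec{n} \cdot \vec{\sigma})$ is the projector on $|\vec{n}, +\rangle$,

$$\tfrac{1}{2}\, (\mathbb{1} + \vec{n} \cdot \vec{\sigma}) = |\vec{n}, +\rangle \langle \vec{n}, +|. \tag{III.149}$$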

◃ Remark

This holds in any matrix representation. ▹

III. 6. 2 MIXED SPIN 1/2 STATES

Every Hermitian 2 × 2 - matrix can, as stated before, be written as (III. 121), A = a 0 11 + ⃗a · ⃗σ,

with real coefficients a 0 and ⃗a. According to (III. 29), for the corresponding operator A to be a state

operator the trace of $A$ has to be 1, which means that $a_0 = \tfrac{1}{2}$. Furthermore, $A$ has to be positive. A positive matrix can be written as the square of a Hermitian matrix $B$,

$$B = b_0\, \mathbb{1} + \vec{b} \cdot \vec{\sigma} \quad \text{and} \quad B^2 = (b_0^2 + \vec{b}^2)\, \mathbb{1} + 2 b_0\, \vec{b} \cdot \vec{\sigma}. \tag{III.150}$$

Therefore,

$$a_0 = \tfrac{1}{2} = b_0^2 + \vec{b}^2 \quad \text{and} \quad \vec{a} = 2 b_0\, \vec{b}. \tag{III.151}$$

The possible values of $b_0$ are limited by (III. 151), $b_0^2 \leq \tfrac{1}{2}$, while as soon as $b_0$ is chosen $\vec{b}$ is fixed, $\vec{b} = \frac{1}{2 b_0}\, \vec{a}$, yielding

$$\vec{a}^2 = 4 b_0^2\, \vec{b}^2 = 4 b_0^2\, \big(\tfrac{1}{2} - b_0^2\big). \tag{III.152}$$

Obviously, $\vec{a}^2$ only depends on $b_0^2$, and its values for $b_0^2$ in the interval $[0, \tfrac{1}{2}]$ are between 0 and $\tfrac{1}{4}$, where $\vec{a}^2$ has a maximum for $b_0^2 = \tfrac{1}{4}$. In other words, $A$ is a state operator iff $a_0 = \tfrac{1}{2}$ and $\vec{a}^2 \leq \tfrac{1}{4}$, in which case some $b_0$ and $\vec{b}$ exist, satisfying the requirements (III. 151).

Now an arbitrary state operator is

$$W = \tfrac{1}{2}\, (\mathbb{1} + \vec{w} \cdot \vec{\sigma}), \quad \vec{w}^2 \leq 1. \tag{III.153}$$

This state operator is characterized by the vector ⃗w, called the polarization vector, which has its

endpoints within or on the surface of the unit sphere, the so - called Bloch sphere. For ∥ ⃗w∥ = 1 the

system is called completely polarized, for ⃗w = 0 it is called unpolarized, and if 0 < ∥ ⃗w∥ < 1 it is

called partially polarized.

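The state operators with $\vec{w}^2 = 1$ are the pure states, the 1-dimensional projectors,

$$W^2 = \tfrac{1}{4}\, (\mathbb{1} + 2\, \vec{w} \cdot \vec{\sigma} + \vec{w}^2\, \mathbb{1}) = \tfrac{1}{2}\, (\mathbb{1} + \vec{w} \cdot \vec{\sigma}) = W, \tag{III.154}$$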

the state operators with ⃗w 2 < 1 are mixed states. The set of state operators is a convex set as we

can now easily see. If ⃗w 1 and ⃗w 2 are within or on the surface of the unit sphere, then α ⃗w 1 + β ⃗w 2 ,

with 0 < α, β < 1 and α +β = 1, is the chord linking ⃗w 1 and ⃗w 2 , and this chord is within the sphere.
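The role of the polarization vector can be explored numerically. The Python sketch below constructs W from $\vec{w}$ as in (III. 153), computes Tr W², and uses the fact that a Hermitian 2 × 2 matrix with trace 1 and determinant $\tfrac{1}{4}(1 - \vec{w}^2)$ has eigenvalues $\tfrac{1}{2}(1 \pm \|\vec{w}\|)$. The function names and sample vectors are illustrative assumptions.

```python
import math

def state_operator(w):
    """W = (1 + w.sigma)/2 as a 2x2 matrix, eq. (III. 153)."""
    wx, wy, wz = w
    return [[0.5 * (1 + wz), 0.5 * (wx - 1j * wy)],
            [0.5 * (wx + 1j * wy), 0.5 * (1 - wz)]]

def trace_of_square(W):
    """Tr W^2, which equals 1 only for a pure state."""
    return sum(W[i][j] * W[j][i] for i in range(2) for j in range(2)).real

def eigenvalues(w):
    """Eigenvalues (1 - |w|)/2 and (1 + |w|)/2 of W."""
    r = math.sqrt(sum(x * x for x in w))
    return 0.5 * (1 - r), 0.5 * (1 + r)

def mix(w1, w2, alpha):
    """Polarization vector of the convex combination alpha W1 + (1 - alpha) W2."""
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(w1, w2))

pure_z = (0.0, 0.0, 1.0)                 # completely polarized along z
pure_x = (1.0, 0.0, 0.0)                 # completely polarized along x
mixed = mix(pure_z, pure_x, 0.5)         # midpoint of the chord, inside the sphere

for w in (pure_z, pure_x, mixed, (0.0, 0.0, 0.0)):
    print(w, trace_of_square(state_operator(w)), eigenvalues(w))
```

A vector outside the unit sphere, for example (0, 0, 1.5), gives a negative eigenvalue, so the corresponding matrix is not a state operator.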


EXERCISE 25. Prove the following statements.

(a) ⟨⃗σ⟩ W = ⃗w,

(b) det W = 1 4 (1 − ⃗w 2 ),

EXAMPLES

In the following two examples, consider vectors ⃗w with ∥ ⃗w∥ = 1, thus corresponding to pure

states.

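(a) Since in this case $\vec{w}$ equals the unit vector $\vec{n}$, for $\vec{w} = (0, 0, 1) \in \mathbb{R}^3$ we have

$$W = \tfrac{1}{2}\, (\mathbb{1} + \sigma_z) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \tag{III.155}$$

which is a 1-dimensional projector; it is the matrix representation of $W = |z\uparrow\rangle \langle z\uparrow|$. Likewise we have

$$\vec{w} = (1, 0, 0) \implies W = \tfrac{1}{2}\, (\mathbb{1} + \sigma_x) = |x\uparrow\rangle \langle x\uparrow|, \tag{III.156}$$

$$\vec{w} = (0, 1, 0) \implies W = \tfrac{1}{2}\, (\mathbb{1} + \sigma_y) = |y\uparrow\rangle \langle y\uparrow|,$$

and we see that generally $W = \tfrac{1}{2}(\mathbb{1} + \vec{n} \cdot \vec{\sigma})$ corresponds to the pure state $|\vec{n}, +\rangle$, as was already shown in (III. 149). In the same way, for $|\vec{n}, -\rangle$ we have

$$\vec{w} = (0, 0, -1) \implies W = \tfrac{1}{2}\, (\mathbb{1} - \sigma_z) = |z\downarrow\rangle \langle z\downarrow|, \tag{III.157}$$

etc.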

(b) For the probability to find spin up in the direction ⃗n ′ in the state |⃗n, +⟩, with (III. 45)

and (III. 127) we find

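$$\begin{aligned} \mu_{W_{\vec{n}}}(W_{\vec{n}'}) &= \mathrm{Tr}\, W_{\vec{n}'} W_{\vec{n}} = \mathrm{Tr}\, \big( \tfrac{1}{2}(\mathbb{1} + \vec{n}' \cdot \vec{\sigma}) \cdot \tfrac{1}{2}(\mathbb{1} + \vec{n} \cdot \vec{\sigma}) \big) \\ &= \tfrac{1}{4}\, \mathrm{Tr}\, \big( \mathbb{1} + \vec{n}' \cdot \vec{\sigma} + \vec{n} \cdot \vec{\sigma} + (\vec{n}' \cdot \vec{n})\, \mathbb{1} + i\, \vec{\sigma} \cdot (\vec{n}' \times \vec{n}) \big) \\ &= \tfrac{1}{2}\, (1 + \vec{n}' \cdot \vec{n}) = \tfrac{1}{2}\, (1 + \cos\theta) = \cos^2 \tfrac{1}{2}\theta, \end{aligned} \tag{III.158}$$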

with θ the angle between ⃗n and ⃗n ′ . This is in accordance with (III. 148).
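The trace formula can be evaluated directly for concrete directions. The Python sketch below builds the projectors of (III. 149) for two unit vectors and compares Tr W_{n'} W_n with $\tfrac{1}{2}(1 + \vec{n}' \cdot \vec{n})$; the helper names and the sample angles are illustrative assumptions.

```python
import math

def projector(n):
    """(1 + n.sigma)/2 for a unit vector n, the projector on |n,+>."""
    nx, ny, nz = n
    return [[0.5 * (1 + nz), 0.5 * (nx - 1j * ny)],
            [0.5 * (nx + 1j * ny), 0.5 * (1 - nz)]]

def trace_product(A, B):
    """Tr(A B) for 2x2 matrices."""
    return sum(A[i][j] * B[j][i] for i in range(2) for j in range(2))

def unit(theta, phi):
    """Unit vector with polar angle theta and azimuth phi, eq. (III. 130)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def prob_up(n_prime, n):
    """Probability of spin up along n_prime in the state |n,+>, eq. (III. 158)."""
    return trace_product(projector(n_prime), projector(n)).real

theta = 1.1                               # sample angle between the directions
p = prob_up(unit(theta, 0.4), unit(0.0, 0.0))
print(p, math.cos(theta / 2) ** 2)        # the two numbers agree up to rounding
```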

The following examples concern mixed state operators W , for which ⃗w has its endpoint somewhere

inside the sphere, ⃗w 2 < 1.

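(c) $\vec{w} = \tfrac{1}{2}\, (0, 1, 0)$ yields

$$W = \tfrac{1}{2}\, \big(\mathbb{1} + \tfrac{1}{2} \sigma_y\big) = \begin{pmatrix} \tfrac{1}{2} & -\tfrac{1}{4} i \\ \tfrac{1}{4} i & \tfrac{1}{2} \end{pmatrix}. \tag{III.159}$$

This can, for instance, be factorized as

$$W = \tfrac{1}{4}\, |z\uparrow\rangle \langle z\uparrow| + \tfrac{1}{4}\, |z\downarrow\rangle \langle z\downarrow| + \tfrac{1}{2}\, |y\uparrow\rangle \langle y\uparrow|, \tag{III.160}$$

which clearly is a mixture.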


The next two examples concern the center of the Bloch sphere, ⃗w = 0 .

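(d) With $\vec{w} = 0$, we have

$$W = \tfrac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{III.161}$$

The eigenvalues of this mixed state $W$ are degenerate, and various factorizations are possible, for example

$$\begin{aligned} W &= \tfrac{1}{2}\, |x\uparrow\rangle \langle x\uparrow| + \tfrac{1}{2}\, |x\downarrow\rangle \langle x\downarrow| \\ &= \tfrac{1}{2}\, |y\uparrow\rangle \langle y\uparrow| + \tfrac{1}{2}\, |y\downarrow\rangle \langle y\downarrow| \\ &= \tfrac{1}{2}\, |z\uparrow\rangle \langle z\uparrow| + \tfrac{1}{2}\, |z\downarrow\rangle \langle z\downarrow|. \end{aligned} \tag{III.162}$$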

(e) Under a rotation R, ⃗w behaves like a vector in R 3 ,

$$U(R)\, (\vec{w} \cdot \vec{\sigma})\, U^{-1}(R) = \vec{w}_R \cdot \vec{\sigma}, \tag{III.163}$$

where U (R) is given by (III. 135). Therefore, the only rotation invariant state for a 1 - particle

system is ⃗w = 0 .

The similarity between the set of density matrices W and the 3 - dimensional unit sphere of polarization

vectors is specific for spin 1/2 particles, in which case every pure state is also the eigenstate

for the spin operator in a certain spin direction. For spin 1 bosons and higher spin particles this no

longer applies.

III. 6. 3 TWO SPIN 1/2 PARTICLES

III. 6. 3. 1 SINGLET AND TRIPLET STATES

Consider a composite system of two spin 1/2 fermions. In the direct product space C 2 ⊗ C 2 = C 4

a basis is

|z ↑⟩ ⊗ |z ↑⟩, |z ↑⟩ ⊗ |z ↓⟩, |z ↓⟩ ⊗ |z ↑⟩, |z ↓⟩ ⊗ |z ↓⟩. (III. 164)

From these basis states the simultaneous eigenstates $|s, m\rangle$ of the operators $\vec{S}^2 = (\vec{S}_1 + \vec{S}_2)^2$ and $S_z = S_{1z} + S_{2z}$ can be formed, where $s$ can be 0 or 1. The eigenvalues of $\vec{S}^2$ are $\hbar^2 s(s+1)$, the eigenvalues of $S_z$ are $m\hbar$, as introduced on p. 65.

The singlet state or singlet for short, with s = 0 and therefore m = 0, is the entangled state

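$$|\Psi_0\rangle = |0, 0\rangle = \tfrac{1}{2}\sqrt{2}\, \big( |z\uparrow\rangle \otimes |z\downarrow\rangle - |z\downarrow\rangle \otimes |z\uparrow\rangle \big), \tag{III.165}$$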

which looks the same in terms of the eigenstates of S x and S y , having spherical symmetry. The singlet

is a simultaneous eigenstate of S x , S y and S z with eigenvalue 0. Hence the singlet is an eigenstate

of ⃗n · ⃗S with eigenvalue 0, which means that a rotation (III. 133) carries (III. 165) back into itself.


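The triplet states, with $s = 1$ and $m = 1, 0, -1$, are

$$\begin{aligned} |1, 1\rangle &= |z\uparrow\rangle \otimes |z\uparrow\rangle \\ |1, 0\rangle &= \tfrac{1}{2}\sqrt{2}\, \big( |z\uparrow\rangle \otimes |z\downarrow\rangle + |z\downarrow\rangle \otimes |z\uparrow\rangle \big) \\ |1, -1\rangle &= |z\downarrow\rangle \otimes |z\downarrow\rangle. \end{aligned} \tag{III.166}$$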

III. 6. 3. 2 CORRELATIONS

In chapter VII we will use the spin correlation function of the singlet,

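$$E_{QM}(\vec{a}, \vec{b}) := \langle 0, 0|\, \vec{a} \cdot \vec{\sigma}_1 \otimes \vec{b} \cdot \vec{\sigma}_2\, |0, 0\rangle, \tag{III.167}$$

where $\vec{a}, \vec{b} \in \mathbb{R}^3$ are unit vectors. $E_{QM}(\vec{a}, \vec{b})$ is the expectation value of the product of the outcomes, ±1, of spin measurements on particle 1 along $\vec{a}$ and on particle 2 along $\vec{b}$. To find $E_{QM}(\vec{a}, \vec{b})$, first choose the z-axis along $\vec{a}$ as in diagram III. 3, next choose the x-axis in such a way that $\vec{b}$ is in the xz-plane. The spherical symmetry of the singlet state allows such a choice.

Figure III. 3: Spin up for particle 1 along $\vec{a}$, for particle 2 along $\vec{b}$. (Diagram: $\vec{a}$ along the z-axis and $\vec{b}$ in the xz-plane, at angle $\theta_{\vec{a},\vec{b}}$ from $\vec{a}$.)

With $\vec{a} = \vec{e}_z$, $\vec{b}$ similar to $\vec{n}$ in (III. 137), and $\theta_{\vec{a},\vec{b}}$ the angle between $\vec{a}$ and $\vec{b}$, we have

$$E_{QM}(\vec{a}, \vec{b}) = \langle 0, 0|\, \sigma_{1z} \otimes \big( \sin\theta_{\vec{a},\vec{b}}\ \sigma_{2x} + \cos\theta_{\vec{a},\vec{b}}\ \sigma_{2z} \big)\, |0, 0\rangle. \tag{III.168}$$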

Now σ z |z ↑⟩ = |z ↑⟩, σ x |z ↑⟩ = |z ↓⟩ etc., so that we have, using (II. 100), (III. 165) and (III. 166),

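$$(\sigma_{1z} \otimes \sigma_{2x})\, |0, 0\rangle = \tfrac{1}{2}\sqrt{2}\, \big( |1, 1\rangle + |1, -1\rangle \big) \tag{III.169}$$

which is perpendicular to $|0, 0\rangle$, and

$$(\sigma_{1z} \otimes \sigma_{2z})\, |0, 0\rangle = -|0, 0\rangle, \tag{III.170}$$

from which we see that

$$E_{QM}(\vec{a}, \vec{b}) = -\cos\theta_{\vec{a},\vec{b}}. \tag{III.171}$$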


III. 6. 3. 3 CONDITIONAL PROBABILITIES

In chapter VII we will also need to know, again in case the particles are in the singlet state, the

probability for the spin of particle 2 to be found in the direction ⃗ b, given that the spin of particle 1 was

found in the direction ⃗a. This conditional probability is, by definition,

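$$\mathrm{Prob}\big( \vec{b} \cdot \vec{\sigma}_2 = 1 \mid \vec{a} \cdot \vec{\sigma}_1 = 1 \big) = \frac{\mathrm{Prob}\big( \vec{b} \cdot \vec{\sigma}_2 = 1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \big)}{\mathrm{Prob}\big( \vec{a} \cdot \vec{\sigma}_1 = 1 \big)}. \tag{III.172}$$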

Here the joint probability is

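$$\mathrm{Prob}\big( \vec{b} \cdot \vec{\sigma}_2 = 1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \big) = \big| \big( \langle \vec{a}\uparrow| \otimes \langle \vec{b}\uparrow| \big)\, |0, 0\rangle \big|^2, \tag{III.173}$$

with $|\vec{a}\uparrow\rangle \otimes |\vec{b}\uparrow\rangle$ the direct product of the eigenstates of $\vec{a} \cdot \vec{\sigma}_1$ and $\vec{b} \cdot \vec{\sigma}_2$ having eigenvalues +1. Again choosing $\vec{a}$ and $\vec{b}$ as in diagram III. 3, $|\vec{a}\uparrow\rangle = |z\uparrow\rangle$ and $|\vec{b}\uparrow\rangle$ equal to $|\vec{n}, +\rangle$, (III. 138), we find for the direct product

$$|\vec{a}\uparrow\rangle \otimes |\vec{b}\uparrow\rangle = \cos\tfrac{1}{2}\theta_{\vec{a},\vec{b}}\ |z\uparrow\rangle \otimes |z\uparrow\rangle + \sin\tfrac{1}{2}\theta_{\vec{a},\vec{b}}\ |z\uparrow\rangle \otimes |z\downarrow\rangle. \tag{III.174}$$

Therefore, with (III. 165),

$$\big( \langle \vec{a}\uparrow| \otimes \langle \vec{b}\uparrow| \big)\, |0, 0\rangle = \tfrac{1}{2}\sqrt{2}\, \sin\tfrac{1}{2}\theta_{\vec{a},\vec{b}}, \tag{III.175}$$

and we see that the joint probability is

$$\mathrm{Prob}\big( \vec{b} \cdot \vec{\sigma}_2 = 1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \big) = \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{\vec{a},\vec{b}}. \tag{III.176}$$

Likewise, again using (III. 173) with $|\vec{b}\downarrow\rangle$ equal to $|\vec{n}, -\rangle$, (III. 139), we have

$$\mathrm{Prob}\big( \vec{b} \cdot \vec{\sigma}_2 = -1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \big) = \tfrac{1}{2} \cos^2 \tfrac{1}{2}\theta_{\vec{a},\vec{b}}. \tag{III.177}$$

This yields for the marginal probability

$$\begin{aligned} \mathrm{Prob}\big( \vec{a} \cdot \vec{\sigma}_1 = 1 \big) &= \mathrm{Prob}\big( \vec{b} \cdot \vec{\sigma}_2 = 1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \big) + \mathrm{Prob}\big( \vec{b} \cdot \vec{\sigma}_2 = -1 \wedge \vec{a} \cdot \vec{\sigma}_1 = 1 \big) \\ &= \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{\vec{a},\vec{b}} + \tfrac{1}{2} \cos^2 \tfrac{1}{2}\theta_{\vec{a},\vec{b}} = \tfrac{1}{2}, \end{aligned} \tag{III.178}$$

and we see that the conditional probability (III. 172) is

$$\mathrm{Prob}\big( \vec{b} \cdot \vec{\sigma}_2 = 1 \mid \vec{a} \cdot \vec{\sigma}_1 = 1 \big) = \sin^2 \tfrac{1}{2}\theta_{\vec{a},\vec{b}}. \tag{III.179}$$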

◃ Remark

By definition there is no correlation between the two results of measurements of spin if

$$\mathrm{Prob}\big( \vec{b} \cdot \vec{\sigma}_2 = 1 \mid \vec{a} \cdot \vec{\sigma}_1 = 1 \big) = \mathrm{Prob}\big( \vec{b} \cdot \vec{\sigma}_2 = 1 \big), \tag{III.180}$$

which is the case if ⃗a and ⃗ b are perpendicular. ▹


We are now able to calculate the correlation (III. 167) directly, using a well - known formula from

probability theory,

$$E_{QM}(\vec{a}, \vec{b}) = \sum_{a, b \in \{-1, +1\}} a\, b\ \mathrm{Prob}(a, b), \tag{III.181}$$

where a, b ∈ { −1, 1} are the results of measurements of ⃗a · ⃗σ 1 and ⃗ b · ⃗σ 2 , respectively, and

Prob (a, b) is the joint probability to find a and b at measurements of the respective spin quantities.

Using (III. 176) and (III. 177) and calculating the probabilities with eigenvalues −1 for ⃗a · ⃗σ 1 we

find

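$$\begin{aligned} E_{QM}(\vec{a}, \vec{b}) &= \mathrm{Prob}(1, 1) + \mathrm{Prob}(-1, -1) - \mathrm{Prob}(1, -1) - \mathrm{Prob}(-1, 1) \\ &= 2 \cdot \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{\vec{a},\vec{b}} - 2 \cdot \tfrac{1}{2} \cos^2 \tfrac{1}{2}\theta_{\vec{a},\vec{b}} \\ &= -\cos\theta_{\vec{a},\vec{b}}. \end{aligned} \tag{III.182}$$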

This is indeed equal to the earlier result (III. 171).
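The joint probabilities and the correlation can be reproduced with a few lines of arithmetic. The Python sketch below represents the singlet (III. 165) by its four components in the basis (III. 164) and takes $\vec{a}$ along the z-axis and $\vec{b}$ in the xz-plane, as in diagram III. 3. The function names are illustrative assumptions, and all amplitudes are real for this choice of axes.

```python
import math

def kron_vec(a, b):
    """Components of the direct product |a> (x) |b> in the basis (III. 164)."""
    return [x * y for x in a for y in b]

def spin_up(theta):
    """|n,+> for n in the xz-plane at angle theta from z, eq. (III. 138)."""
    return [math.cos(theta / 2), math.sin(theta / 2)]

def spin_down(theta):
    """|n,-> for n in the xz-plane at angle theta from z, eq. (III. 139)."""
    return [-math.sin(theta / 2), math.cos(theta / 2)]

SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]   # eq. (III. 165)

def joint_prob(a_sign, b_sign, theta):
    """Prob(a.sigma_1 = a_sign and b.sigma_2 = b_sign), as in (III. 173)."""
    va = spin_up(0.0) if a_sign == 1 else spin_down(0.0)
    vb = spin_up(theta) if b_sign == 1 else spin_down(theta)
    amplitude = sum(x * y for x, y in zip(kron_vec(va, vb), SINGLET))
    return amplitude ** 2

def correlation(theta):
    """Sum over outcomes of a * b * Prob(a, b), eq. (III. 181)."""
    return sum(a * b * joint_prob(a, b, theta)
               for a in (1, -1) for b in (1, -1))

def conditional_up(theta):
    """Prob(b.sigma_2 = 1 | a.sigma_1 = 1), eq. (III. 172)."""
    marginal = joint_prob(1, 1, theta) + joint_prob(1, -1, theta)
    return joint_prob(1, 1, theta) / marginal

theta = 0.9                                   # sample angle between a and b
print(correlation(theta), -math.cos(theta))   # agree up to rounding
print(conditional_up(theta), math.sin(theta / 2) ** 2)
```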

III. 6. 3. 4 EXAMPLE OF A MIXED STATE OF TWO SPIN 1/2 PARTICLES

Consider, analogous to (III. 100), the pure entangled state

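$$|\Phi\rangle = \tfrac{1}{2}\sqrt{2}\, \big( |z\uparrow\rangle \otimes |z\uparrow\rangle + |z\downarrow\rangle \otimes |z\downarrow\rangle \big), \tag{III.183}$$

and the corresponding state $W = |\Phi\rangle \langle\Phi|$, acting in $H_I \otimes H_{II}$,

$$W = \tfrac{1}{2} \big( |z\uparrow\rangle \langle z\uparrow| \otimes |z\uparrow\rangle \langle z\uparrow| + |z\uparrow\rangle \langle z\downarrow| \otimes |z\uparrow\rangle \langle z\downarrow| + |z\downarrow\rangle \langle z\uparrow| \otimes |z\downarrow\rangle \langle z\uparrow| + |z\downarrow\rangle \langle z\downarrow| \otimes |z\downarrow\rangle \langle z\downarrow| \big), \tag{III.184}$$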

where the first factor in the direct product acts in H I , and the second factor in H II .

The representation of W in the corresponding basis (III. 164) of H = H I ⊗ H II is, using the

Kronecker product of matrices, (II. 103),

$$W = \tfrac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}. \tag{III.185}$$

This is indeed a pure state, since W is idempotent, a necessary and sufficient condition for bounded,

self - adjoint operators to be a projector.

The partial traces are

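$$W_I = \tfrac{1}{2}\, |z\uparrow\rangle \langle z\uparrow| + \tfrac{1}{2}\, |z\downarrow\rangle \langle z\downarrow| \in S(H_I), \tag{III.186}$$

$$W_{II} = \tfrac{1}{2}\, |z\uparrow\rangle \langle z\uparrow| + \tfrac{1}{2}\, |z\downarrow\rangle \langle z\downarrow| \in S(H_{II}), \tag{III.187}$$

and their matrix representation in the basis of $\sigma_z$ is

$$W_I = \tfrac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad W_{II} = \tfrac{1}{2} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{III.188}$$

Although $W$ is a pure state, the direct product of the partial traces $W_I$ and $W_{II}$ is not pure,

$$W_I \otimes W_{II} = \tfrac{1}{4} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \neq W. \tag{III.189}$$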

This conclusion is, of course, in accordance with the conclusion (III. 104) concerning the pure state

operator (III. 100).
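The matrices of this example can be generated and checked mechanically. The Python sketch below builds W from the components of (III. 183), verifies idempotence, and forms the partial traces and their Kronecker product. The function names and the index convention (row index 2i + j for basis vector i of particle 1 and j of particle 2, following the ordering of (III. 164)) are assumptions made for the illustration.

```python
import math

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def partial_trace_II(W):
    """Trace over the second spin: (W_I)[i][k] = sum_l W[2i+l][2k+l]."""
    return [[sum(W[2 * i + l][2 * k + l] for l in range(2)) for k in range(2)]
            for i in range(2)]

def partial_trace_I(W):
    """Trace over the first spin: (W_II)[j][l] = sum_i W[2i+j][2i+l]."""
    return [[sum(W[2 * i + j][2 * i + l] for i in range(2)) for l in range(2)]
            for j in range(2)]

def close(A, B, tol=1e-12):
    """True if all corresponding entries agree within tol."""
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

s = 1 / math.sqrt(2)
phi = [s, 0.0, 0.0, s]                       # components of (III. 183)
W = [[x * y for y in phi] for x in phi]      # |Phi><Phi|, real amplitudes

W_I, W_II = partial_trace_II(W), partial_trace_I(W)
product = kron(W_I, W_II)

print(close(matmul(W, W), W))    # True: W is idempotent, hence pure
print(close(product, W))         # False: W is not the product of its partial traces
```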

◃ Remark

Notice that all matrices in this example are indeed Hermitian, positive and have trace 1, the requirements

of Gleason’s theorem, p. 47, for operators W to be state operators. ▹

EXERCISE 26.

(a) In (III. 184), fill in the matrix representations of the projectors in H I and H II , and check

that forming Kronecker products indeed yields (III. 185).

(b) Is the state (III. 184) spherically symmetric?

IV THE COPENHAGEN INTERPRETATION

It is wrong to think that the task of physics is to find out how nature is. Physics concerns

what we can say about nature.

— Niels Bohr

The Heisenberg-Bohr tranquilizing philosophy - or religion? - is so delicately contrived

that, for the time being, it provides a gentle pillow for the true believer from which he

cannot very easily be aroused. So let him lie there.

— Albert Einstein

I know it is not the fault of N. B. that he did not study philosophy. But I deeply regret

that by his authority the brains of two or three generations will be upset and hindered to

think about the problems ‘He’ pretends to have solved.

— Erwin Schrödinger

Bohr’s famous institute being located in Copenhagen, the standard interpretation of quantum mechanics as explained in most textbooks is generally referred to as the Copenhagen Interpretation.

It is however worth mentioning that the conceptions of the many supporters of the

Copenhagen Interpretation, Niels Bohr, Werner Heisenberg, Wolfgang Pauli, Rudolf Peierls,

Léon Rosenfeld and John Wheeler, to name some of them, mutually differ on numerous points,

and that some of them, including Bohr himself, modified their conceptions in the course of time,

so that the name ‘Copenhagen Interpretation’ is more a collective noun than the name of one

clearly outlined vision. Moreover, important contributions to the standard interpretation of the

theory have been made by Born and Von Neumann, working independently of the Copenhagen

school. In this chapter we will evaluate the conceptions of Heisenberg and Bohr as the main

representatives of the Copenhagen Interpretation, and consider more closely the debate between

Einstein and Bohr. Finally, we will discuss the exact expression of the uncertainty principle.

IV. 1 HEISENBERG AND THE UNCERTAINTY PRINCIPLE

The history of modern quantum mechanics starts in 1925, when Heisenberg publishes his famous

transitional article ‘Über quantentheoretische Umdeutung kinematischer und mechanischer

Beziehungen’ (‘Quantum - theoretical re - interpretation of kinematic and mechanical relations’). His

summary reads

The present paper seeks to establish a basis for theoretical quantum mechanics founded

exclusively upon relationships between quantities which in principle are observable.


Obviously the theory was only allowed to speak about observable quantities; every attempt to

visualize the inside of an atom had to be avoided. In particular, one could not speak of the orbit

of an electron. Only the transitions between stationary states were ‘observable’ and therefore the

transition quantities could be characterized by two discrete indices. These ideas were developed

by Heisenberg, Born and Jordan into matrix mechanics. They represented all physical quantities

by infinite complex Hermitian matrices. The ‘quantum condition’, the fundamental equation of this

theory, is the commutation relation

$$P Q - Q P = -i\hbar\, \mathbb{1} \tag{IV.1}$$

between the matrices P and Q, which were meant to be the ‘quantum counterparts’ of the canonical

dynamical quantities, momentum and position, of classical mechanics à la Hamilton.

In 1926 matrix mechanics received unexpected competition by wave mechanics, established by

Erwin Schrödinger. He interpreted the electron as a vibrating charge cloud, continuously moving

in space. In his conception the stationary states could be understood as resonances, comparable to

the vibrations of the string of a violin. According to Schrödinger, wave mechanics was to be preferred

over matrix mechanics because wave mechanics offers a graphic picture of what takes place in

microphysical reality. This interpretation foundered on three insoluble problems:

(i) waves of physical systems consisting of more than one particle were defined in the configuration

space $\mathbb{R}^{3N}$ instead of in the three-dimensional space $\mathbb{R}^{3}$ surrounding us,

(ii) wave packets of free particles eventually fall apart and therefore, the electron cannot remain a

localized entity,

(iii) the wave function can carry complex values.

Nevertheless, wave mechanics eventually turned out to be empirically just as strong

as matrix mechanics.
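
Problem (ii) can be made quantitative with the standard result for a free Gaussian wave packet of minimal uncertainty, whose position spread grows as σ(t) = σ₀ √(1 + (ħt / 2mσ₀²)²). The sketch below is only an illustration: the formula is the textbook one, not taken from these notes, and the initial spread of 10⁻¹⁰ m (roughly atomic size) is a hypothetical choice.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s (CODATA value)
M_E = 9.1093837015e-31  # electron mass in kg (CODATA value)


def packet_width(sigma0, t, mass=M_E):
    """Position spread at time t of a free minimum-uncertainty Gaussian packet.

    sigma0 is the spread at t = 0; all quantities are in SI units.
    """
    return sigma0 * math.sqrt(1.0 + (HBAR * t / (2.0 * mass * sigma0 ** 2)) ** 2)


SIGMA0 = 1e-10  # hypothetical initial spread, about the size of an atom

# Time after which the spread has doubled: (hbar t / 2 m sigma0^2)^2 = 3.
t_double = math.sqrt(3.0) * 2.0 * M_E * SIGMA0 ** 2 / HBAR

print("doubling time in seconds:", t_double)
print("spread after one nanosecond in metres:", packet_width(SIGMA0, 1e-9))
```

For an electron localized to atomic dimensions the doubling time is of the order of 10⁻¹⁶ s, so a charge cloud of this kind cannot remain a localized entity for any macroscopic length of time.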

The fact that an approach with such radically different starting points also turned out to be possible

impelled Heisenberg to clarify his own starting points further. The result of this effort is his ‘uncertainty

principle’, formulated for the first time in his 1927 article ‘Über den anschaulichen Inhalt der

quantentheoretischen Kinematik und Dynamik’, which was translated as ‘The physical content of

quantum kinematics and mechanics’.

In this article Heisenberg wonders how the ‘orbit’ of an electron must be understood in quantum

mechanics. On the one hand, the basic equation (IV. 1) prevents granting numerical values to position

and momentum simultaneously; on the other hand, the path of a particle in, for example, a Wilson

chamber, seems to be directly perceptible. To find a way out of this dilemma, he was inspired by a

statement of Einstein (H.J. Folse 1985, p. 91),

[. . . ] it is the theory finally which decides what can be observed and what can not [. . . ]

Could it be that, if a path cannot be defined in quantum mechanics, it cannot in fact be observed either?

This idea led him to analyze what the theory has to say about observations.

He starts (1927, Eng. tr. p. 64) by linking measuring and defining operationally:

When one wants to be clear about what is to be understood by the words “position of the

object”, for example of the electron, relative to a given frame of reference, then one must

specify definite experiments with whose help one plans to measure the “position of the

electron”, otherwise this word has no meaning.

We will call this the measuring = defining principle.

One could, for example, determine the position of an electron by examining it under a microscope.

According to classical optics a microscope has a limited resolution. The Abbe criterion gives the

smallest distinguishable details as

$$\delta q \sim \frac{\lambda}{\sin\varepsilon}, \tag{IV.2}$$

where λ is the wavelength of light and ε is the aperture, the opening angle of the lens. For a precise

measurement we must therefore use a very short wavelength, i.e. gamma radiation. But in that case

the Compton effect cannot be neglected. The radiation behaves as a flow of particles, with momentum

$p_0 = h/\lambda$, which collides with the electron and causes it to recoil.

Figure IV.1: Heisenberg’s γ-microscope

To allow for an observation at least one photon has to collide with the electron, which will bring

about a change of momentum. But as we do not know anything more about the direction of the

photon after the collision than that it has gone through the lens, we cannot indicate the size of the

recoil exactly. As can be seen in figure IV.1, the transfer of momentum remains unknown by an

amount

$$\delta p \sim p_0 \sin\varepsilon = \frac{h}{\lambda}\,\sin\varepsilon \tag{IV.3}$$

and therefore

$$\delta q\,\delta p \sim h. \tag{IV.4}$$
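
The cancellation behind (IV.4) can be checked numerically: in the product of (IV.2) and (IV.3) both the wavelength and the aperture drop out. The following Python sketch is a plain evaluation of these two estimates. The function names and the sample wavelengths and apertures are illustrative assumptions, and the relations themselves are order-of-magnitude estimates, so the exact factor 1 should not be over-interpreted.

```python
import math

H = 6.62607015e-34  # Planck constant in J s (exact SI value)


def position_resolution(wavelength, aperture):
    """Abbe estimate (IV.2): smallest distinguishable detail, in metres."""
    return wavelength / math.sin(aperture)


def recoil_uncertainty(wavelength, aperture):
    """Estimate (IV.3): unknown part of the momentum transfer, in kg m/s."""
    return (H / wavelength) * math.sin(aperture)


# Illustrative values: visible light, X-rays and gamma rays (metres),
# combined with a few opening angles (radians).
ratios = []
for wavelength in (5e-7, 1e-10, 1e-12):
    for aperture in (0.1, 0.5, 1.0):
        dq = position_resolution(wavelength, aperture)
        dp = recoil_uncertainty(wavelength, aperture)
        ratios.append(dq * dp / H)
        print(f"lambda={wavelength:.0e} m  eps={aperture}  dq={dq:.2e} m  dp={dp:.2e} kg m/s")

print("dq * dp in units of h:", ratios)
```

Shortening the wavelength improves δq and worsens δp by the same factor, so no choice of λ or ε lowers the product below the order of h.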

The more precisely the position is determined (δq small), the less precisely the momentum is known

afterwards (δp large).

Quoting Heisenberg again (loc. cit.)

At the instant when position is determined - therefore, at the moment when the photon is

scattered by the electron - the electron undergoes a discontinuous change in momentum.

This change is the greater the smaller the wavelength of the light employed - that is, the

more exact the determination of the position. At the instant at which the position of the

electron is known, its momentum therefore can be known up to magnitudes which correspond

to that discontinuous change. Thus, the more precisely the position is determined,

the less precisely the momentum is known, and conversely.

This conclusion is the first formulation of the uncertainty principle. According to Heisenberg’s

own measuring = defining principle this conclusion can, however, not yet be drawn because it also

has to be specified what, in this context, must be understood by the term ‘momentum of the electron’.

In a later discussion (Heisenberg 1930), Heisenberg specifies the reasoning by also discussing the

definition of the momentum of the electron.

This reasoning goes as follows. Suppose that the momentum of the electron has been measured

in advance with an inaccuracy $\delta p_1$. Next, the position is measured with an inaccuracy $\delta q$, then the

momentum is measured again, with inaccuracy $\delta p_2$. We can assume that $\delta p_1 \ll p_1$ and $\delta p_2 \ll p_2$,

so that the momentum is very accurately known before and after the position measurement. Now it

makes sense to speak of the momentum $p_1$ of the electron shortly before the position measurement.

If now the position is measured very precisely, the position and momentum of the electron in the past

are arbitrarily well defined. Heisenberg (1930, p. 20):

[. . . ] if the velocity of the electron is at first known and the position then exactly measured,

the position for times previous to the measurement may be calculated. Then for

these past times δp δq is smaller than the usual limiting value [. . . ]

Apparently, the uncertainty relation does not apply to the past. In the example the uncertainty concerns

the unpredictability of the value of $p_2$ after the position measurement, not the inaccuracy $\delta p_2$

with which $p_2$ can be measured. This unpredictability can be determined by accurately measuring

the momentum before and after the determination of position, and the unpredictability is larger if

the determination of position was more precise. Although it is true that one can speak in a logically

consistent manner of the position and momentum of the electron in the past (loc. cit.),

[. . . ] but this knowledge of the past is of a purely speculative character, since it can never

(because of the unknown change in momentum caused by the position measurement) be

used as an initial condition in any calculation of the future progress of the electron and

thus cannot be subjected to experimental verification. It is a matter of personal belief

whether such a calculation concerning the past history of the electron can be ascribed

any physical reality or not.

For Heisenberg, such a calculation does not describe reality. But then, what is reality to him?

Heisenberg says, (1927, Eng. tr. p. 73),

The “orbit” comes into being only when we observe it.

Apparently, the measurement creates reality, instead of revealing it. This is what we call the measuring

= creating principle.

This leads to the following representation. First, we measure the momentum of the electron

precisely. Not only is the term “the momentum of the electron” hereby defined, now we also can

say, according to the measuring = creating principle, that the value of the momentum, which was

determined in this measurement, is physically real. Next, we measure the position precisely. At

this measurement the electron obtains an exact position. After this measurement the momentum of

the electron has however changed in an unpredictable manner. This can be verified with a second

precise momentum measurement. This unpredictability turns out to be all the larger as the position

measurement is more precise.

Now the question is whether the electron already had this changed momentum before the second momentum

measurement, i.e., whether this value is also physically real before this measurement. According

to Heisenberg this is not the case, because we can only predict the momentum to the order of the

size of the change. Before the second momentum measurement the electron has only a blurred, fuzzy

momentum. Only when the measurement of momentum has been carried out does the electron regain a

sharply defined momentum. ‘Fuzzy’ is meant in the ontological sense, as the sharpness of a property

the electron possesses. As one quantity is measured more precisely, the conjugate quantity becomes

more fuzzy.

◃ Remark

Directly after the measurement of momentum it is meaningful to say that the electron has this momentum,

because in that case the outcome of a next measurement of momentum can, within the accuracy

of measurement, be predicted with certainty. ▹

In later work Heisenberg uses the Aristotelian term potential. A related term by K.R. Popper

is propensity. The electron has a propensity to produce, at measurement, a certain outcome. This

propensity can be understood as a real property of the electron, even if we are not performing a

measurement. The potential and propensity interpretations are therefore ‘realistic’ interpretations, or

at least not in conflict with scientific realism which is, roughly speaking, the thesis that a scientific

theory tells us how (a part of) reality is made up.

IV.1.1 REMARKS

(a) Heisenberg derives the uncertainty relation (IV. 4) for the electron from a quantum mechanical

treatment of the photon. What he in fact hereby proves is the consistency of the uncertainty

principle.

(b) Although it is frequently written that the uncertainty relation restricts simultaneous measurements,

simultaneous measurements of position and momentum do not appear in this discussion.

(c) In terms of the formalism, this succession of measurements can, using the

projection postulate, p. 42, be described as follows. Upon measurement of p the state transforms

into the proper eigenstate of p. In that state q is unpredictable. If next q is measured, the

state transforms into the proper eigenstate of q and p becomes unpredictable. The uncertainty

principle says that that unpredictability is larger if the preceding measurement of q was more

precise.

(d) Heisenberg (1930) describes the path of an electron in a Wilson chamber as follows. Suppose

that the incoming electron can be described by a wave packet with fairly sharply defined position

and momentum. Upon free development this packet spreads out in the course of time so

that the position becomes less sharp. When the electron ionizes a molecule in the Wilson chamber

a macroscopic droplet is formed, which can be understood as a position measurement. As

a result the wave packet reduces to a packet which is rather sharply located, with a dimension

in the order of a molecule, which again spreads out until a next ionization takes place.

It can be shown that the successive spreading and contraction in position and momentum is,

according to the uncertainty principle, in agreement with the observation of a macroscopic

path. We cannot speak however of the path of an electron in an atom, not even approximately.

An observation of the position of the electron with an accuracy finer than the dimension of the

atom requires such a large recoil that the electron is generally pushed out of the atom entirely.

Therefore, of such an ‘orbit’ no more than one point is observable. Notice that observation plays

a vital role; the path in the Wilson chamber only comes into existence because we observe it.

(e) As a result of Heisenberg’s discussion of the uncertainty principle the term measurement disturbance

was introduced in quantum mechanics. Initially the inclination existed to consider this

as a more or less classical physical process; the momentum of the electron is disturbed by the

collision with a photon. This is also indicated by Heisenberg’s use of the word ‘error’ for δq.

From the beginning, Bohr resisted this explanation of Heisenberg, and he put the emphasis on

the necessity of combining mutually exclusive terms from a wave and a particle picture in one description.

Especially because of EPR it later became clear that the ‘measurement disturbance’

cannot be an ordinary error.
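
The remark above on the projection postulate can be illustrated with a toy model. The sketch below is not the continuous case of the text: as an assumption made purely for illustration, position is restricted to N = 8 sites on a ring and the momentum eigenstates are the discrete plane waves on that ring. A position measurement of limited precision is modelled by projecting onto a window of sites, and the code records the probability that a subsequent momentum measurement still yields the original value.

```python
import cmath
import math

N = 8   # hypothetical number of lattice sites
K0 = 3  # hypothetical momentum index of the state prepared by the p measurement


def momentum_eigenstate(k, n=N):
    """Position-basis components of the k-th discrete plane wave."""
    return [cmath.exp(2j * math.pi * k * x / n) / math.sqrt(n) for x in range(n)]


def coarse_position_measurement(state, start, width):
    """Project onto the sites start .. start+width-1 and renormalize."""
    n = len(state)
    window = {(start + j) % n for j in range(width)}
    projected = [amp if x in window else 0j for x, amp in enumerate(state)]
    norm = math.sqrt(sum(abs(amp) ** 2 for amp in projected))
    return [amp / norm for amp in projected]


def momentum_probabilities(state):
    """Probabilities of the outcomes of a subsequent momentum measurement."""
    n = len(state)
    probabilities = []
    for k in range(n):
        basis = momentum_eigenstate(k, n)
        amplitude = sum(b.conjugate() * s for b, s in zip(basis, state))
        probabilities.append(abs(amplitude) ** 2)
    return probabilities


initial = momentum_eigenstate(K0)  # state directly after a measurement of p

# Probability of finding the original momentum again, as a function of the
# width of the position window (width 1 is a sharp position measurement).
retained = {}
for width in (1, 2, 4, 8):
    after = coarse_position_measurement(initial, start=0, width=width)
    retained[width] = momentum_probabilities(after)[K0]
    print(f"window width {width}: P(p unchanged) = {retained[width]:.3f}")

sharp = coarse_position_measurement(initial, start=0, width=1)
p_after_sharp = momentum_probabilities(sharp)
```

In this toy model a sharp position measurement leaves all momentum values equally likely, while a coarser one leaves the original momentum more probable, in line with the statement that the unpredictability of p grows with the precision of the preceding measurement of q.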

IV.2 BOHR AND COMPLEMENTARITY

The core of the Copenhagen interpretation lies, of course, in Bohr’s work. His articles are characterized

by a style entirely his own. Remarkably, Bohr hardly uses the formalism of the theory; he

generally gives a qualitative argument instead. His difficult, and sometimes obscurely formulated,

long sentences are notorious, full of subordinate clauses and conditional definitions which do not

always clarify his intentions. A careful reconstruction and interpretation of Bohr’s point of view,

and its development in the course of time, has been given by E. Scheibe (1973, chapter 1); another

interpretation is given in the monograph by H.J. Folse (1985).

Central to Bohr’s considerations is the language we use to do physics. Bohr emphasizes that,

regardless of how abstract and refined the terms of modern physics may be, in essence they are only

an extension of everyday language, and they are nothing but means of communication we use to

communicate observational results to other people. Such an observational result, the outcome of a

measurement on a physical system in certain experimental circumstances, is therefore the basic element

of consideration. For this, Bohr uses the term phenomenon. Every phenomenon is the resultant

of a physical system S, a preparation apparatus P , a measuring apparatus M and their mutual interaction

in a concrete experimental situation.

The description of a phenomenon must always be made in unambiguous terms because of the

requirement of communicability. A statement like, for example, “the object is in a superposition

of two different states” is therefore not suitable. In classical physics a sufficient arsenal of terms is

developed for these aims.

According to Bohr, classical physics is characterized in the first place by the fact that the interaction between

object and measuring apparatus can be assumed to be negligibly small. This implies that upon

describing a phenomenon the measuring apparatus can be left out of consideration. Instead of the

statement: “Thermal interaction between a thermometer and a glass of water has, in certain circumstances,

yielded as a result that the mercury column has been found to have a certain length”, we

can also say: “The temperature of water has a certain value”. In this case we can, without objection,

transfer the description of the phenomenon onto the object itself, and speak in terms of its properties.

The essential difference between classical physics and quantum physics is, according to Bohr, that

in quantum physics the interaction is quantized. The interaction between an object and a measuring

apparatus can only consist of the exchange of one or more quanta, and cannot be made arbitrarily small.

Bohr calls this starting point the quantum postulate (Bohr 1928, p. 580).

QUANTUM POSTULATE:

[The] essence [of the quantum theory] may be expressed in the so-called quantum postulate,

which attributes to any atomic process an essential discontinuity, or rather individuality,

completely foreign to the classical theories and symbolized by Planck’s quantum

of action.

In a phenomenon the object, the measuring apparatuses, and their interaction form an indivisible

whole, and the interaction always amounts to at least one quantum h. This postulate unsettles the

procedure to convert the description of a phenomenon into a description of the object itself.

There is however a second element in Bohr’s point of view, which tempers this pessimistic conclusion.

Scheibe called it the buffer postulate (1973, p. 24) because “the function of the postulate is

to use classical physics as a buffer against the quantum-mechanical treatment of a phenomenon”:

BUFFER POSTULATE:

The description of the apparatus and of the results of observation, which forms part of

the description of a quantum phenomenon, must be expressed in the concepts of classical

physics (including those of “everyday life”), eliminating consistently the Planck quantum

of action.

The context of this requirement is again to be able to communicate our experimental findings to other

people. The reasoning is as follows (Bohr 1947, p. 59),

[. . . ] by an experiment we simply understand an event about which we are able in an

unambiguous way to state the conditions necessary for the reproduction of the phenomena.

In the account of these conditions, there can, therefore, be no question of departing

from the Newtonian way of description and, in particular, it may be stressed that by the

[. . . measuring apparatus . . . ], we simply understand some piece of machinery as regards

the working of which classical mechanics can be entirely relied upon and where,

consequently, all quantum effects have to be disregarded.

Bohr assumes that only the language and terms of classical physics are suitable for the description

of observational results. He writes (Bohr 1931, p. 692)

[. . . ] the unambiguous interpretation of any measurement must be essentially framed in

terms of the classical physical theories, and we may say that in this sense the language

of Newton and Maxwell will remain the language of physicists for all time.

This is a particularly radical point of view, and we will return to its motivation later.

The combination of both postulates now leads to the following reasoning. In all phenomena an

interaction exists between the system and the measuring apparatus which has a minimal order of magnitude

h > 0; after all, even the most minute measurements always rely on a quantum phenomenon. But

in our description of the phenomenon we are forced to use classical concepts, in which this interaction, h,

cannot occur. The consequence is that in our description the interaction is not analyzable.

At the same time the classical character of the description makes it possible to speak again in

terms of properties of the object itself. Therefore, instead of the statement “the interaction between a

particle and a photographic plate resulted in a little black dot in a certain area of the plate”, we can

also say “the particle has been found at a position in that area”, where the measuring apparatus

is no longer referred to.

But the great difference from the classical situation is that, by disregarding the interaction, we

in a certain sense make a mistake which remains without consequences within this phenomenon, but

prevents the description from being combined with the information obtained under different experimental

conditions. If the object is coupled to another measuring apparatus there will be another interaction,

which will again not be analyzable. Descriptions of the object that have been obtained under different

measurement arrangements cannot be combined into one picture which covers it all. We will illustrate

this in a more concrete case.

IV.2.1 COMPLEMENTARY PHENOMENA

The most important examples of phenomena which give supplementary, but mutually exclusive, information

on an object are measurements of position and momentum. Bohr (1939, p. 22) writes

[. . . ] any phenomenon in which we are concerned with tracing a displacement of some

atomic object in space and time necessitates the establishment of several coincidences

between the object and the rigidly connected bodies and movable devices which, in serving

as scales and clocks respectively, define the space-time frame of reference to which

the phenomenon in question is referred.

In this case, therefore, the object has an interaction with an apparatus which is firmly bolted down

or anchored, so that its position remains secured. But the consequence is that a possible exchange

of momentum between object and apparatus cannot be analyzed. Such a transfer of momentum

will be absorbed by the fixed parts of the apparatus without leaving any traces. Within this

experimental setup we are therefore prohibited from saying anything about the momentum of the object.

The opposite applies to the measurement of momentum (Bohr in Schilpp 1949, p. 219):

In the study of phenomena in the account of which we are dealing with detailed momentum

balance, certain parts of the whole device must naturally be given the freedom to

move independently of others.

Bohr assumes that a measurement of momentum is made by registering the recoil after a collision,

for example, with a test particle. In this way we can, using the conservation laws, retrieve the

momentum of the object. However, the condition that the test particle can move freely means that we

cannot guarantee that it preserves a definite position. It is therefore excluded from being used as part

of a spatial coordinate system, and now we cannot say anything about the position of the object.

In order to perform a position measurement we must therefore put the object in contact with a

part of the measuring apparatus which has been bolted down firmly, while performing a momentum

measurement we must observe the recoil of a freely movable part of the measuring apparatus, and

apply the momentum conservation law. Position and momentum measurements therefore exclude

each other, because a measuring apparatus cannot at the same time be bolted down and freely movable.

In the description of the object we must choose between assigning it a position or a momentum. As worded

by Philipp Frank (1949, p. 163)

Quantum mechanics speaks neither of particles the positions and velocities of which

exist but cannot be accurately observed, nor of particles with indefinite positions and

velocities. Rather, it speaks of experimental arrangements in the description of which the

expressions “position of a particle” and “velocity of a particle” can never be employed

simultaneously.

Bohr calls this characteristic property of quantum mechanics, where two quantities exclude each

other whereas both are necessary to describe all phenomena in which the object can participate, complementarity.

Position and momentum are examples of complementary quantities. Similar considerations

apply to time and energy, so that a general complementarity exists between, on the one hand,

a space-time description of phenomena and, on the other hand, a dynamical description, frequently

called ‘causal’ by Bohr, in which the conservation laws for energy and momentum are applicable.

◃ Remark

The complementarity between quantities like position and momentum, or between descriptions using

space-time coordination and dynamical laws, differs from, and replaces, the contrast which Bohr placed central

in his earlier work, namely between ‘wave’ and ‘particle’, because a classical particle has both position

and momentum, whereas a classical wave has neither. ▹

The role of the uncertainty relations in Bohr’s views can now be described as considering them

in the first place as symbolic expressions of the impossibility to define position and momentum at

the same time when describing an object. In a phenomenon in which the position is determined

sharply, δq = 0, the momentum must be undetermined, δp = ∞, and vice versa. But the relation

δq δp ∼ h is, of course, more general. Bohr (1934, pp. 60,61) interprets this as follows:

At the same time, however, the general character of this relation makes it possible to

a certain extent to reconcile the conservation laws with the space-time co-ordination

of observations, the idea of a coincidence of well-defined events in a space-time point

being replaced by that of unsharply defined individuals within finite space-time regions.

The meaning Bohr attaches to the uncertainty relations can be summarized this way: the sharper

we can, in a phenomenon, define the position of the object, the fuzzier the momentum must be defined,

and vice versa. The quantities δq and δp in the relation δqδp ∼ h therefore represent the fuzziness in

the definition. Bohr emphasizes the epistemological role of these quantities more strongly than their ontological

role.

IV.2.2 REMARKS AND PROBLEMS

Bohr’s supposition that classical language is a definitive means of expression for physical observations

which cannot be improved upon, is radical and at first sight even fairly unacceptable. Language

develops and history teaches us that from time to time new concepts are necessary. Aristotle had,

for example, no momentum concept, Newton knew nothing of energy, Coulomb had no theory of

fields, etc. Doesn’t it speak for itself that quantum mechanics also asks for new concepts? Bohr,

however, (ibid., p. 16), says

[. . . ] it would be a misconception to believe that the difficulties of the atomic theory may

be evaded by eventually replacing the concepts of classical physics by new conceptual

forms.

Bohr emphasizes that with this point of view he does not reject the introduction of new entities,

e.g. quarks, superstrings or black holes. The aspects of classical language which are the reason

that it cannot be improved upon are, according to him, descriptions in terms of space and time and

descriptions in terms of cause and effect. These are the only categories with which we can describe

observational results.

Another problem with the idea that the classical concepts cannot be improved upon is Bohr’s

immediate conclusion that the quantum of action cannot occur in the description of a phenomenon,

because a statement such as ‘$h = 6.6 \cdot 10^{-34}$ J s’ is also an unambiguous summary of experimental

evidence, although not of one phenomenon. The idea that h cannot appear in the language of observations

is a weak, and in fact untenable point in his argumentation. The prohibition of the use of h

in the language of observations also brought Bohr to the conclusion that the spin of an electron, $\tfrac{1}{2}$,

would be fundamentally unobservable. This conclusion has been proven to be incorrect.

In some articles Bohr gives a more abstract explanation of the quantum postulate and emphasizes

the ‘symbolic’ role of h. It does not so much represent the inevitable interaction, or measurement

disturbance, between object and measuring apparatus, as the fundamental impossibility to make a

sharp distinction between object and observation apparatus. It is, in any case, clear that Bohr does not

regard the formalism of quantum mechanics, with its wave functions and operators, as an extension

or improvement of classical language. He emphasizes that this formalism is purely symbolic and

cannot be taken as a description, as the quantum state of a system is given without reference to the

experimental setup.

It should be noted that Bohr, in emphasizing the applicability of concepts, has more in mind than

the ‘logical’ question of ‘definiteness’. For Bohr a term like ‘position of a particle’ is applicable if we

can in fact control and secure this position, using firmly bolted apparatuses. Bohr’s use of the term

‘determination’ refers both to a measurement and to a state preparation.

Speaking of ‘partially defined positions and momenta’, Bohr considers the uncertainty relation

between position and momentum as the possibility to come to a compromise with the complementarity

between position and momentum. Here we can think of a context of measurement in which the

object interacts with a part of the apparatus which is linked with the rest of the apparatus by means

of a spring with a finite spring constant, an intermediate form between ‘freely movable’ and ‘firmly

bolted’. He has, however, not developed this compromise. This point of view does in fact not fit the

usual mathematical derivation of the uncertainty relations for position and momentum. They make,

for two given (sharp) quantities p and q, a statement about the spread in quantum states, not about the

well-definedness of the quantities. Attempts have been made to establish this compromise mathematically

by the introduction of ‘blurred quantities’, e.g. Busch, Grabowski and Lahti (1995).

Of fundamental importance in Bohr’s point of view is that a phenomenon involves both an object and an experimental

setup. The setup determines which frame of concepts applies to the object. In

many cases the contrast between object and measuring apparatus coincides with that of the microscopic

and macroscopic system, respectively. But that is not necessarily so. A macroscopic system

can also be considered as an object while a microscopic system can serve as a measuring apparatus.

We can consider, for example, a macroscopic measuring apparatus to be the object of another measurement.

As soon as we do this the macroscopic system can, according to Bohr, no longer execute

its role as a measuring device. It becomes an object itself, to which the quantum formalism must be

applied. This functional contrast between object and measuring apparatus is therefore more essential

than that between microscopic and macroscopic systems.

For a good understanding of Bohr’s position, and Heisenberg’s for that matter, it is important to

notice that measurements do not require the presence of consciousness. Decisive for applicability of

classical concepts is the presence of a measurement context. Therefore, subjectivity does not play

a role in any form; for the applicability of a concept such as ‘momentum’ it does not matter whether a conscious

observer, a computer or another measuring apparatus carries out the momentum measurement.

Also, from Bohr’s refusal to assign a realistic meaning to the quantum mechanical description, the

conclusion cannot be drawn that he supports an anti-realistic or ‘instrumentalist’ view of physics.

Instrumentalism is, roughly, the thesis that a scientific theory is only an instrument for carrying out

calculations whose outcomes we compare with the indications of measuring apparatuses; in

particular, that a theory is not ‘knowledge of the world’ and does not provide a faithful picture of

what reality is. An object such as an electron has, besides its quantum mechanical state, more than

enough permanent properties, such as the super-selected quantities mass and charge which are not

subject to complementarity, to conceive it as a real, existing object.

IV.2.3 AGREEMENT AND DIFFERENCE BETWEEN HEISENBERG AND BOHR

Both Heisenberg and Bohr emphasize that quantum mechanics is a complete theory which cannot

be extended into a more detailed description with hidden variables. Bohr says (Schilpp 1949, p. 235)

[. . . ] in quantum mechanics, we are not dealing with an arbitrary renunciation of a more

detailed analysis of atomic phenomena, but with a recognition that such an analysis is in

principle excluded.

Heisenberg (1927, p. 83) also expresses himself in this sense. He sums up the uncertainty relations

as

Even in principle, we cannot know the present in all detail.

He rejects the conception that behind the statistical description of quantum mechanics there still is a

‘real world’ as a “fruitless and senseless speculation” (loc. cit.).

According to both Bohr and Heisenberg, the quantum mechanical description cannot be applied

to the whole world, because a classically described context of measurement is always necessary. The

border between the classical and quantum mechanical description can be moved at will, but cannot

be removed. Therefore, quantum mechanics is not a universal theory in the sense that there exists

something like a ‘wave function of the universe’.

Further agreement between Heisenberg and Bohr is found in the significance they attach to measurement.

The difference is that according to Heisenberg something changes in the object during

measurement; some properties are created, others disappear or become fuzzy. According to Bohr

nothing has to happen in the object. The experimental setup only enables some description of the system

which would not be allowed in another experimental setup. According to Bohr, the uncertainty relation is a symbolic, as opposed to a descriptive, expression of the impossibility of defining both position and momentum within one phenomenon.

Another difference is that Heisenberg tends, more than Bohr, to a realistic interpretation of the

mathematical quantum formalism. In an interview at the end of his life, Heisenberg admitted that he

never really understood the idea of complementarity.

IV. 3

DEBATE BETWEEN EINSTEIN AND BOHR

IV. 3. 1

INTRODUCTION

Einstein, who contributed to the development of the quantum theory until 1922, never wanted

to accept the Copenhagen interpretation. In his memoirs, Heisenberg recounts how, during a visit to Berlin, he explained his starting-point that the theory may speak exclusively about observable quantities, and how, to his surprise, Einstein would have none of it: “the theory decides what can be observed”. The main source for the course of the debate between Einstein and Bohr, which we will review here, is Bohr’s own report ‘Discussion with Einstein on Epistemological Problems in Atomic Physics’ (Bohr 1949).

The very first time Einstein made his objections public was at the 5th Solvay conference in Brussels in 1927, where he suggested that there were two conceivable conceptions of the quantum mechanical wave function.

(i) The state ψ gives a description of the individual system which is as complete as possible.

(ii) The state ψ does not characterize an individual system but an ensemble of identically prepared

systems. Therefore, as a description of the individual system ψ is incomplete, ψ is a ‘statistical

quantity’.

Conception (i) was defended by Heisenberg and Bohr. Einstein raised the following objection to this conception: when a particle travels through a narrow slit, the wave function will, by diffraction, extend

itself over a large part of space. If this is a complete description of the particle, we have to conclude

that it is potentially present everywhere in this area. But after detection of the particle on a photographic

plate it is out of the question that it can still be found elsewhere. Therefore, the wave function

must disappear suddenly there, which would imply a peculiar ‘action at a distance’. This objection

does not apply to conception (ii), because there the detection simply corresponds to the choice of an

element from the ensemble.

In his answer, Bohr emphasized that the diffraction of the wave function by a slit in a firmly bolted

screen finds its origin in the possibility of the particle to exchange momentum with the screen. But

this exchange of momentum is not analyzable within this setup, i.e., without detaching the screen.

The question whether a more detailed description of the individual case is possible found its

temporary culmination in the analysis of the thought experiment with the double slit, which is depicted

in figure IV. 2. When a monochromatic wave travels through a screen with two narrow slits, an interference

pattern is visible on a photographic plate. This is typical for wave behavior, where the waves

from both slits cooperate. An individual particle, however, can only travel through one slit, and the

wave function does not tell us through which slit it travels.

Figure IV. 2: The double slit interference experiment (Bohr 1949 )

Einstein now suggested that it was nevertheless possible to obtain information about which slit the particle travels through, for example by measuring the transfer of momentum to the first screen. If this screen receives a thrust downwards, the particle has chosen the upper slit, and vice versa.

Bohr answered that if we want to measure the momentum transfer to the screen with an exactitude

which is enough to distinguish the recoils belonging to the paths through the two slits, the momentum

of the screen itself must be very exactly known. If d represents the distance between the slits, and l

represents the distance between the screens, the angle between the two paths is of the order

$$\alpha \simeq \sin\alpha = \frac{d}{l}. \tag{IV. 5}$$

The recoil is of the order

$$p_0 \sin\alpha \simeq \frac{\hbar}{\lambda}\,\frac{d}{l}, \tag{IV. 6}$$

and therefore we have to know the momentum of the screen with an uncertainty

$$\delta p \lesssim \frac{\hbar}{\lambda}\,\frac{d}{l}. \tag{IV. 7}$$

Gaining such an exactitude is, however, only possible if the screen is movable. But in that case it is

no longer possible to fulfil its function as a screen which determines an exact position for the slit. It

is therefore no longer part of the original measuring context, as can be seen in figure IV. 3.

Figure IV. 3: Contexts of measurement in which the interference of the particles is visible, and those

in which the recoil of the screen is visible, exclude each other. (Bohr 1949 )

Actually, because now we will perform a measurement on the screen, the screen itself has to be

considered an object. This means that quantum mechanics applies to it, and the screen is, therefore,

also subject to an uncertainty relation

$$\delta q \gtrsim \frac{\lambda\, l}{d}. \tag{IV. 8}$$

But this is an indefiniteness of the same order of magnitude as the distance between the interference

bands. Bohr concludes that under these circumstances interference can no longer be seen.
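The order-of-magnitude chain (IV. 5) to (IV. 8) can be checked with a few lines of arithmetic. The following is only a sketch: the wavelength, slit separation and screen distance are hypothetical values chosen for illustration, and the quantum of action is taken to be ħ (whether one uses h or ħ does not matter, since the constant cancels in the final comparison). The fringe spacing λ l / d is the standard far-field result for two slits.

```python
import math

HBAR = 1.054571817e-34  # J s; the value cancels out in the final comparison


def double_slit_estimates(wavelength, slit_separation, screen_distance):
    """Order-of-magnitude quantities of Bohr's recoil argument (SI units)."""
    angle = slit_separation / screen_distance       # (IV. 5): alpha ~ d / l
    recoil = (HBAR / wavelength) * angle            # (IV. 6): recoil difference
    max_dp = recoil                                 # (IV. 7): need dp <~ recoil
    min_dq = HBAR / max_dp                          # (IV. 8): dq >~ hbar / dp
    fringe_spacing = wavelength * screen_distance / slit_separation
    return {
        "angle": angle,
        "recoil": recoil,
        "min_dq": min_dq,
        "fringe_spacing": fringe_spacing,
    }


if __name__ == "__main__":
    # Hypothetical numbers: 500 nm wavelength, slits 0.1 mm apart, screens 1 m apart.
    result = double_slit_estimates(5e-7, 1e-4, 1.0)
    for name, value in result.items():
        print(f"{name:15s} {value:.3e}")
```

Under these assumptions the minimal position uncertainty of the screen coincides with the fringe spacing, which is the content of Bohr's conclusion that the interference pattern is washed out.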

With this reasoning he was able to transform Einstein’s objection to an affirmation of his idea

of complementarity; as soon as we try to carry out a closer analysis of the phenomenon, we have to

modify the experimental setup in such a way that the phenomenon changes beyond recognition. Nowadays a variant of this thought experiment can actually be carried out in a laboratory, as we will discuss in section IV. 4.

IV. 3. 2

THE PHOTON BOX

At the 6th Solvay Conference in 1930 in Brussels, Einstein gave another example, which is known as ‘the photon box’. It concerns an isolated box filled with radiation and equipped with

a clock mechanism which opens a shutter during a very short interval. It is assumed that in advance

the box is weighed meticulously.

Upon closure of the shutter we have, according to Einstein, a choice: either we weigh the box

again and determine how much mass has vanished so that we can, using the relation E = m c 2 ,

retrieve the energy of the escaped photon, or we open the box and read off the clock mechanism to

determine when the shutter has been opened, which enables us to predict the time of exit of the photon

and therefore its time of arrival at a remote detector. We can choose between both options long after

the photon has left.

Bohr’s answer is not entirely clear. It may be assumed that he did not understand Einstein’s

intentions correctly. 1 He explains Einstein’s objection as an attempt to refute the uncertainty relation

between energy and time; he shows that both determinations cannot possibly be made at the same

time.

Bohr reasons as follows. Assume that the box hangs in equilibrium from a spring in a gravitational

field. When in a time interval T a mass δm escapes, it receives an upward impulse F ∆t of magnitude

g δm T. (IV. 9)

We can keep T finite by, at some moment, hanging a small weight to the box to compensate for the

loss of mass. Suppose we want to determine the mass of the photon by measuring this momentum transfer; then, again, the momentum of the box at the start of the experiment must be known exactly,

$$\delta p \lesssim g\,\delta m\,T. \tag{IV. 10}$$

But now the same argument applies as used in the double slit experiment. This precise determination

of momentum is only possible if the fixation of the position of the box is given up. The box itself

must be considered a quantum mechanical object, and therefore the uncertainty relation δp δq ≳ h applies to it. The position of the box is unknown with an uncertainty of magnitude

$$\delta q \gtrsim \frac{\hbar}{g\,\delta m\,T}, \tag{IV. 11}$$

from which it follows that the gravitational potential $\varphi_g$ to which the clock is exposed is also uncertain,

$$\delta\varphi_g \simeq g\,\delta q \gtrsim \frac{\hbar}{\delta m\,T}. \tag{IV. 12}$$

But according to the red shift formula from the general theory of relativity (!) the pace of a clock is

influenced by the gravitational potential,

$$\frac{\Delta T}{T} = \frac{\delta\varphi_g}{c^2}, \tag{IV. 13}$$

therefore, the pace of the clock is also uncertain, and consequently the time of opening of the shutter is unknown. Under the circumstances in which we can determine the energy of the photon, we cannot retrieve its exit time exactly.
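The estimates (IV. 10) to (IV. 13) can be condensed into one line. The following is a sketch under two assumptions that are not spelled out in the text: the order-of-magnitude symbols are read as ≲ and ≳, and δE = δm c² is taken to be the energy associated with the mass δm.

```latex
\Delta T \;=\; T\,\frac{\delta\varphi_g}{c^{2}}
\;\simeq\; T\,\frac{g\,\delta q}{c^{2}}
\;\gtrsim\; T\,\frac{g}{c^{2}}\,\frac{\hbar}{g\,\delta m\,T}
\;=\; \frac{\hbar}{\delta m\,c^{2}}
\;=\; \frac{\hbar}{\delta E}
\qquad\Longrightarrow\qquad
\Delta T\,\delta E \;\gtrsim\; \hbar .
```

Both the weighing time T and the gravitational acceleration g drop out, so that the result has the form of the uncertainty relation between energy and time which Bohr takes Einstein to be attacking.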

Although Bohr seems to rebuke Einstein with his own theory, Bohr’s answer evokes, among other

things, the question whether it is appropriate that the correctness of quantum mechanics relies on

the correctness of the general theory of relativity, which is a classical theory and is, strictly speaking, in contradiction with quantum mechanics.

1 That Einstein indeed had the intention to point out the freedom of choice is apparent in a letter to Bohr from Paul

Ehrenfest, who heard the argument from Einstein earlier.

EXERCISE 27. Try, using the uncertainty relation for time and energy, δt δE ≳ h, to refute

Einstein’s argumentation without appealing to other physical theories.

IV. 3. 3

EINSTEIN, PODOLSKY AND ROSEN

The thought experiment of Einstein, Podolsky and Rosen, which we discussed in section I. 2,

forms the highlight of the debate. Here Einstein’s objections emerge in their most pure form.

Given two systems which interacted with each other at some time, but are separated now, consider

two non - commuting quantities A and A ′ of one of the particles, and B and B ′ of the other

particle. Measurement of A allows us to make a prediction with certainty concerning B of the other particle; measurement of A′ allows us, analogously, to make a prediction with certainty concerning B′ of the other particle.

Einstein admits that these two measurements cannot be carried out simultaneously. But we can

choose which measurement we perform while the other particle is very far away. It is not reasonable,

EPR argue, that this other particle will be influenced by this choice. This means that although only one of the two predictions concerning the other particle can be made with certainty, both predictions are, at the same time, true, corresponding to properties of the other particle, i.e., to ‘elements of physical reality’.

IV. 3. 4

HEISENBERG, BOHR AND EINSTEIN, PODOLSKY AND ROSEN

According to Heisenberg, measurement has an essential influence. Some properties of the particle

become sharp, others fuzzy. If this consequence of measurement were understood as a physical interaction, it would suggest the following ‘natural’ requirement of locality (M.L.G. Redhead 1987, p. 77):

An unsharp value for an observable cannot be changed into a sharp value by measurements

performed at a distance.

But the analysis of EPR shows that, the particles being far removed from each other, this requirement

has not been met, making Heisenberg’s interpretation much less physically pictorial than it seemed to

be initially. The natural requirement of locality in Bohr’s interpretation reads (loc. cit.)

A previously undefined value for an observable cannot be defined by measurements performed

‘at a distance’.

This requirement has also not been fulfilled.

Bohr’s answer to EPR, and his rejection of the incompleteness claim, amounts to the notion that

the aforementioned requirement of locality can be violated without implying the existence of superluminal

physical effects. The ‘defining’ functioning of measuring apparatuses is not a process that

propagates in space and time and by means of some interaction disturbs particles that are not measured,

or creates values for properties in those particles. It concerns an epistemological role of the

measuring apparatuses. The measuring apparatuses measuring one of a pair of correlated particles

define which classical terms apply to both particles.

If the position of one of the particles is measured, we are dealing with a phenomenon in which

the term position is applicable. Thus, on the basis of the correlation between these particles the term

‘position’ is also applicable to the other particle. If the position of one of the particles is measured,

a ‘position perspective’ is opened, so to speak, to the world. Likewise, measurement of momentum

on one of the particles makes the other particle accessible to a description with the term ‘momentum’.

Even though there is no physical intervention on this particle, it is still not permitted to speak about

the particles having these properties outside the context of a phenomenon. Therefore, Bohr rejects

Einstein’s reasoning that the other particle, not being disturbed by the measurement, consequently

also possesses the properties ‘position’ and ‘momentum’ independent of measurement.

In fact, this same reasoning can be applied to the double slit experiment, as Bohr showed in his answer to Einstein. In this experiment we also have a choice: either to measure the momentum of the screen and in this way determine which path the particle has taken, thereby losing the interference pattern, or to measure its position, thereby retrieving the interference pattern. But

Bohr writes

As repeatedly stressed, the principal point is here that such measurements demand mutually

exclusive experimental arrangements.

IV. 4

NEUTRON INTERFEROMETRY

Nowadays, a variant of the thought experiment with the double slit can be carried out in the laboratory using a neutron interferometer. A neutron interferometer consists of a massive perfect silicon crystal, usually with dimensions of approximately 10 × 10 × 50 cm³. After cutting large notches in the crystal, a base with upstanding teeth remains, see figure IV. 4.

Figure IV. 4: Several perfect crystal neutron interferometers (Rauch and Werner 2000 )

Using an interferometer with three upstanding teeth, a monochromatic beam of neutrons with

a de Broglie wavelength of approximately 1 Å now hits the first tooth of this crystal. The crystal

lattice acts like a grid and lets the beam pass in very sharply defined directions. Under suitable

conditions there are exactly two emanating beams, one transmitted (T) and one reflected (R), as

shown in figure IV. 5 a.

At the second tooth this process is repeated, and both beams are again split up. Two of them are

now outside the interferometer where they are screened, no longer participating. The remaining two

beams are bent towards each other and meet at the third tooth. Here, both beams are split up again,

and now the straightforward going beam of one path is superimposed on the reflected beam of the

other path. Neutron detectors are placed in both emanating beams.

Figure IV. 5: a) A sketch of the setup, with paths 1 and 2, transmitted (T) and reflected (R) beams, and detectors A and B; b) the experimental results (Rauch and Werner 2000). The interference pattern in the neutron interferometer is acquired by measuring the intensity in the detectors at a variable optical path length difference.

If the incoming beam comes from below, and the beams are not manipulated, all neutrons turn out

to end up in the upper beam at detector A, undergoing constructive interference, while the neutrons

in the lower beam extinguish each other. For this phenomenon it is essential that the interferometer

consists of only one crystal, for in that case the waves remain coherent even though, along the way,

the beams have been separated by ‘macroscopic distances’, approximately 5 cm or ≃ 10⁹ λ. A neutron that has arrived in a detector can have traveled along either of the two paths.

Upon introducing a phase difference between the two paths by sliding a small piece of aluminium

of variable thickness in one of the paths, the intensity shifts from the upper to the lower detector. This

intensity is a periodic function of the thickness of the piece of aluminium, see figure IV. 5 b. This is

the interference pattern.

Now the question is if we can, in some way, uncover along which path the particle has traveled.

Following Bohr’s line of thought this should be possible by sawing off one of the teeth and measuring

the recoil it receives from the neutron. Such an experiment cannot, however, be carried out with the

required experimental exactitude.

Another option is to make use of the fact that the neutron is a spin 1/2 particle and therefore has

an internal degree of freedom. We can carry out such an experiment with a polarized beam, where

all neutrons have, at entry in the interferometer, spin up in the z - direction. We place the complete

setup in a homogeneous magnetic field which ensures that spin up and spin down differ in energy by ħω₀. In one of the paths we place a ‘spin flipper’, a small coil through which an alternating current runs having exactly the resonance frequency ω₀. At a suitable choice of the length of the

coil the spin of every neutron which travels through it will be flipped over. Subsequently, we place

spin analyzers in front of the detectors, so that we can not only observe in which emanating beam the

neutron is located but also its spin in the z - direction.

In this setup we can therefore uncover exactly along which path the particle has traveled; spin up

means the path without the spin flipper has been chosen, spin down means the neutron traveled along

the path with the spin flipper. But in this setup no more interference is seen! The intensity is equal in

both detectors and independent of the phase difference.

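We can describe this as follows. The wavepath function $|\varphi_0\rangle \in L^2(\mathbb{R}^2)$ of an emanating neutron consists of four terms,

$$|\varphi_0\rangle = \tfrac12\big(|\varphi_{1A}\rangle + |\varphi_{1B}\rangle + e^{i\chi}|\varphi_{2A}\rangle + e^{i\chi}|\varphi_{2B}\rangle\big). \tag{IV. 14}$$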

Here $\varphi_{iA}$ and $\varphi_{iB}$ represent the wave functions ending up in the detectors A and B, respectively; 1 and 2 refer to the two possible paths through the interferometer, as can be seen in figure IV. 5 a. The factor $e^{i\chi}$ corresponds to the phase shift by the aluminium. If χ = 0, there is maximum constructive

interference in A and total destructive interference in B, from which it follows that

|ϕ 1A ⟩ = |ϕ 2A ⟩ and |ϕ 1B ⟩ = − |ϕ 2B ⟩. (IV. 15)

The intensity in detector A is given by the expectation value of a projection $P_A$, where $P_A|\varphi_{iA}\rangle = |\varphi_{iA}\rangle$ and $P_A|\varphi_{iB}\rangle = 0$, and analogously for $P_B$. Therefore, we find for the intensity $I_A$ of the neutron beam that encounters detector A, quantum mechanically expressed as the probability to find a neutron in detector A,

$$I_A = \langle\varphi_0|P_A|\varphi_0\rangle = \tfrac14\big(\langle\varphi_{1A}| + \langle\varphi_{2A}|\,e^{-i\chi}\big)\big(|\varphi_{1A}\rangle + e^{i\chi}|\varphi_{2A}\rangle\big) = \tfrac12(1 + \cos\chi), \tag{IV. 16}$$

and likewise for $I_B$,

$$I_B = \langle\varphi_0|P_B|\varphi_0\rangle = \tfrac14\big(\langle\varphi_{1B}| + \langle\varphi_{2B}|\,e^{-i\chi}\big)\big(|\varphi_{1B}\rangle + e^{i\chi}|\varphi_{2B}\rangle\big) = \tfrac12(1 - \cos\chi). \tag{IV. 17}$$
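The intensities (IV. 16) and (IV. 17) can be reproduced numerically. The sketch below assumes, in line with (IV. 15), that the four terms of (IV. 14) collapse onto two orthonormal detector modes, here called a and b (names introduced only for this illustration), with |a⟩ = |ϕ_1A⟩ = |ϕ_2A⟩ and |b⟩ = |ϕ_1B⟩ = −|ϕ_2B⟩.

```python
import cmath
import math


def detector_intensities(chi):
    """Return (I_A, I_B) for a phase shift chi, following (IV. 14)-(IV. 17)."""
    phase = cmath.exp(1j * chi)
    amp_a = 0.5 * (1 + phase)   # amplitude of mode |a>: paths 1 and 2 add
    amp_b = 0.5 * (1 - phase)   # amplitude of mode |b>: paths 1 and 2 subtract
    return abs(amp_a) ** 2, abs(amp_b) ** 2


if __name__ == "__main__":
    for k in range(5):
        chi = k * math.pi / 4
        i_a, i_b = detector_intensities(chi)
        print(f"chi = {chi:5.3f}   I_A = {i_a:.4f}   I_B = {i_b:.4f}")
```

The two intensities always add up to 1, and the shift of intensity from detector A to detector B as χ grows from 0 to π is the interference pattern of figure IV. 5 b.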

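In this experiment the neutrons are polarized, therefore we can add the spin state to the wavepath function and thus get a Pauli spinor,

$$|\varphi_{i,\mathrm{tot}}\rangle = |\varphi_0\rangle \otimes |z\uparrow\rangle = \varphi(\vec q)\begin{pmatrix}1\\0\end{pmatrix} = \begin{pmatrix}\varphi(\vec q)\\0\end{pmatrix} \in L^2(\mathbb{R}^3)\otimes\mathbb{C}^2. \tag{IV. 18}$$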

The functioning of the spin flipper, which we assume to be completely ideal, can now be described as

follows. The component of the state traveling along path 1 does not meet a spin flipper, which means

that it remains unaltered, and we have, leaving out the cartwheels ⊗,

|ϕ 1A ⟩ |z ↑⟩ → |ϕ 1A ⟩ |z ↑⟩ and |ϕ 1B ⟩ |z ↑⟩ → |ϕ 1B ⟩ |z ↑⟩, (IV. 19)

whereas for the components traveling along path 2 the spin direction reverses,

|ϕ 2A ⟩ |z ↑⟩ → |ϕ 2A ⟩ |z ↓⟩ and |ϕ 2B ⟩ |z ↑⟩ → |ϕ 2B ⟩ |z ↓⟩. (IV. 20)

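Therefore, the total final state is

$$|\varphi_{f,\mathrm{tot}}\rangle = \tfrac12\big(|\varphi_{1A}\rangle|z\uparrow\rangle + |\varphi_{1B}\rangle|z\uparrow\rangle + e^{i\chi}|\varphi_{2A}\rangle|z\downarrow\rangle + e^{i\chi}|\varphi_{2B}\rangle|z\downarrow\rangle\big), \tag{IV. 21}$$

which means that for the intensity we have

$$I_A = \langle\varphi_{f,\mathrm{tot}}|\,P_A\otimes\mathbb{1}\,|\varphi_{f,\mathrm{tot}}\rangle = \tfrac14\big(\langle\varphi_{1A}|\langle z\uparrow| + e^{-i\chi}\langle\varphi_{2A}|\langle z\downarrow|\big)\big(|\varphi_{1A}\rangle|z\uparrow\rangle + e^{i\chi}|\varphi_{2A}\rangle|z\downarrow\rangle\big) = \tfrac12, \tag{IV. 22}$$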

and likewise for I B . We see that, because of the orthogonality of the spin states |z ↑⟩ and |z ↓⟩, the

interference term disappears.

With the neutron interferometer we can also illustrate the fact that there is always freedom of

choice because we can, instead of analyzers for spin in the z - direction, place analyzers for spin in

the x - direction.

The eigenvectors for spin in the x - direction are superpositions of those in the z - direction, see

section III. 6, equations (III. 145) and (III. 146),

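$$|x\uparrow\rangle = \tfrac12\sqrt2\,\big(|z\uparrow\rangle + |z\downarrow\rangle\big) \quad\text{and}\quad |x\downarrow\rangle = \tfrac12\sqrt2\,\big(|z\downarrow\rangle - |z\uparrow\rangle\big). \tag{IV. 23}$$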

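We can calculate the probability to find, e.g., a neutron with spin in the negative x-direction in detector A, as the expectation value of the projector $P_A\otimes|x\downarrow\rangle\langle x\downarrow|$ in the state $|\varphi_{f,\mathrm{tot}}\rangle$, (IV. 21),

$$\begin{aligned}
&\langle\varphi_{f,\mathrm{tot}}|\,\big(P_A\otimes|x\downarrow\rangle\langle x\downarrow|\big)\,|\varphi_{f,\mathrm{tot}}\rangle\\
&\quad= \tfrac14\Big(\langle\varphi_{1A}|\langle z\uparrow|\,P_A\,|x\downarrow\rangle\langle x\downarrow|\,\varphi_{1A}\rangle|z\uparrow\rangle
 + e^{i\chi}\,\langle\varphi_{1A}|\langle z\uparrow|\,P_A\,|x\downarrow\rangle\langle x\downarrow|\,\varphi_{2A}\rangle|z\downarrow\rangle\\
&\qquad\;\; + e^{-i\chi}\,\langle\varphi_{2A}|\langle z\downarrow|\,P_A\,|x\downarrow\rangle\langle x\downarrow|\,\varphi_{1A}\rangle|z\uparrow\rangle
 + \langle\varphi_{2A}|\langle z\downarrow|\,P_A\,|x\downarrow\rangle\langle x\downarrow|\,\varphi_{2A}\rangle|z\downarrow\rangle\Big)\\
&\quad= \tfrac14(1 - \cos\chi),
\end{aligned} \tag{IV. 24}$$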

and we see interference again.
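The contrast between the two choices of spin analyzer can be made concrete with a small calculation. This is a sketch under the same assumption as in (IV. 15), namely that |ϕ_1A⟩ = |ϕ_2A⟩ is a single normalized mode, so that the detector-A part of (IV. 21) is ½ |a⟩ (|z↑⟩ + e^{iχ} |z↓⟩); the function names are introduced here for illustration only.

```python
import cmath
import math

SQRT_HALF = 1 / math.sqrt(2)

# Spin states as (component along z-up, component along z-down).
Z_UP, Z_DOWN = (1.0, 0.0), (0.0, 1.0)
X_UP = (SQRT_HALF, SQRT_HALF)      # (|z up> + |z down>) / sqrt(2), cf. (IV. 23)
X_DOWN = (-SQRT_HALF, SQRT_HALF)   # (|z down> - |z up>) / sqrt(2), cf. (IV. 23)


def prob_in_A(chi, spin):
    """Probability to find the neutron in detector A with the given spin state."""
    spin_part = (0.5, 0.5 * cmath.exp(1j * chi))  # detector-A part of (IV. 21)
    overlap = sum(complex(s).conjugate() * c for s, c in zip(spin, spin_part))
    return abs(overlap) ** 2


def intensity_A(chi):
    """Total intensity in A, summed over the z spin basis, cf. (IV. 22)."""
    return prob_in_A(chi, Z_UP) + prob_in_A(chi, Z_DOWN)


if __name__ == "__main__":
    for k in range(5):
        chi = k * math.pi / 4
        print(f"chi = {chi:5.3f}   z-analysis: {intensity_A(chi):.4f}   "
              f"x-down in A: {prob_in_A(chi, X_DOWN):.4f}")
```

With analyzers along z the result is independent of χ, whereas with analyzers along x the χ-dependence of (IV. 24) reappears.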

EXERCISE 28. Verify the calculations (IV. 22) and (IV. 24).

In this case we also can choose whether we measure spin in the x - direction or in the z - direction

long after the neutron has left the interferometer, which means that the neutron seems to make the

choice whether to take one of the paths through the interferometer, or to show interference between

both paths, after it has left the interferometer. J.A. Wheeler (1978) called such experiments delayed-choice experiments. Outcomes of measurements in the future seem to determine what has happened

in the past!

Actual confirmation of this freedom of choice was not obtained until 2007, when a group in

Cachan, France, succeeded in carrying out such an experiment using linearly polarized single photons, a 48 m interferometer and two beamsplitters. In their article (Jaques 2007) they conclude that

Our realization of Wheeler’s delayed - choice gedanken experiment demonstrates that

the behavior of the photon in the interferometer depends on the choice of the observable

that is measured, even when that choice is made at a position and a time such that it is

separated from the entrance of the photon into the interferometer by a space - like interval.

EXERCISE 29. Give, concisely, Bohr’s view on such experiments.

IV. 5

THE UNCERTAINTY RELATIONS

IV. 5. 1

INTRODUCTION

Heisenberg’s original reasonings concerning the uncertainty principle resulted in ‘approximate

inequalities’ for position q and momentum p, and for energy E and time t, of the form

δq δp ∼ h and δE δt ∼ h. (IV. 25)

In this section we will focus on the mathematical meaning of δ q, δ p, δ E and δ t and their interpretation.

In his first article, Heisenberg (1927) gives the Gaussian wave packet as the only quantitative

example. Its Fourier transform is also Gaussian and the widths of these packets are inversely proportional

to each other, a general result of Fourier analysis. A suitable definition of these widths

yields q₁ p₁ = h, where q₁ and p₁ represent the widths in question. Still in the same year E.H. Kennard derived the following general inequality,

$$\Delta_\psi Q\,\Delta_\psi P \geq \tfrac12\hbar, \tag{IV. 26}$$

where ∆ ψ Q and ∆ ψ P are standard deviations of Q and P in ψ ∈ L 2 (R). In his Chicago lectures,

Heisenberg (1930) considers the Kennard inequality (IV. 26) as the mathematical expression of the

uncertainty principle. We will criticize this still widespread conception shortly, and give a derivation

of the ‘standard uncertainty inequalities’, which are a generalization of the Kennard inequality.

◃ Remark

In his discussions of the uncertainty principle, Bohr exclusively makes use of relations of the

type (IV. 25). ▹

IV. 5. 2

THE STANDARD UNCERTAINTY RELATIONS

If $\psi \in L^2(\mathbb{R})$ is the normalized wave function of a physical system in the q-language, with $\|\psi\| = 1$, the wave function $\tilde\psi(p)$ in the p-language is its Fourier transform

$$\tilde\psi(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{\mathbb{R}} e^{-ipq/\hbar}\,\psi(q)\,dq, \tag{IV. 27}$$

and its inverse Fourier transform is

$$\psi(q) = \frac{1}{\sqrt{2\pi\hbar}}\int_{\mathbb{R}} e^{ipq/\hbar}\,\tilde\psi(p)\,dp. \tag{IV. 28}$$

The norm is invariant under Fourier transformations, therefore $\|\tilde\psi\| = 1$.

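The standard deviation of position in a state $|\psi\rangle$, $\Delta_\psi Q$, is defined as

$$(\Delta_\psi Q)^2 = \langle Q^2\rangle_\psi - \langle Q\rangle_\psi^2 = \int_{\mathbb{R}} q^2\,|\psi(q)|^2\,dq - \Big(\int_{\mathbb{R}} q\,|\psi(q)|^2\,dq\Big)^2. \tag{IV. 29}$$

Likewise, for momentum, $\Delta_\psi P$, we have

$$\begin{aligned}
(\Delta_\psi P)^2 &= \langle P^2\rangle_\psi - \langle P\rangle_\psi^2\\
&= -\hbar^2\int_{\mathbb{R}} \psi^*(q)\,\frac{d^2\psi(q)}{dq^2}\,dq - \Big(-i\hbar\int_{\mathbb{R}} \psi^*(q)\,\frac{d\psi(q)}{dq}\,dq\Big)^2\\
&= \int_{\mathbb{R}} p^2\,|\tilde\psi(p)|^2\,dp - \Big(\int_{\mathbb{R}} p\,|\tilde\psi(p)|^2\,dp\Big)^2.
\end{aligned} \tag{IV. 30}$$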

Without loss of generality we can assume ⟨P⟩ and ⟨Q⟩ to equal 0, so that

$$(\Delta_\psi P)^2 = -\hbar^2\int_{\mathbb{R}} \psi^*(q)\,\frac{d^2\psi(q)}{dq^2}\,dq = \int_{\mathbb{R}} p^2\,|\tilde\psi(p)|^2\,dp. \tag{IV. 31}$$

If the wave function ψ(q) is a Gaussian wave packet, the product $\Delta_\psi Q\,\Delta_\psi P$ takes on the minimum value of $\tfrac12\hbar$. An example is the ground state of the one-dimensional harmonic oscillator having mass m,

$$\varphi_0(q) = \Big(\frac{m\,\omega_0}{\pi\hbar}\Big)^{1/4} e^{-m\,\omega_0\,q^2/(2\hbar)}, \tag{IV. 32}$$

with energy $E_0 = \tfrac12\hbar\omega_0$.
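That the Gaussian wave packet reaches the minimum value ½ħ can be checked numerically. The following sketch works in units with ħ = 1 (an assumption made for the illustration), writes k for m ω₀ / ħ, and evaluates the integrals of (IV. 29) and (IV. 31) with a simple midpoint rule; the momentum integral is rewritten by partial integration as ħ² ∫ |dψ/dq|² dq, which is valid for wave functions that vanish at infinity.

```python
import math

HBAR = 1.0  # units with hbar = 1 (assumption made for the illustration)


def gaussian(q, k):
    """Ground-state wave function (IV. 32) with k = m * omega_0 / hbar."""
    return (k / math.pi) ** 0.25 * math.exp(-0.5 * k * q * q)


def uncertainty_product(psi, half_width=12.0, steps=20000):
    """Delta Q * Delta P for a real wave function psi(q) with <Q> = <P> = 0."""
    dq = 2 * half_width / steps
    var_q = 0.0
    var_p = 0.0
    for i in range(steps):
        q = -half_width + (i + 0.5) * dq                      # midpoint of cell i
        dpsi = (psi(q + 0.5 * dq) - psi(q - 0.5 * dq)) / dq   # central difference
        var_q += q * q * psi(q) ** 2 * dq                     # (IV. 29)
        var_p += HBAR ** 2 * dpsi ** 2 * dq                   # (IV. 31), integrated by parts
    return math.sqrt(var_q * var_p)


if __name__ == "__main__":
    for k in (0.25, 1.0, 4.0):
        product = uncertainty_product(lambda q, k=k: gaussian(q, k))
        print(f"k = {k:4.2f}   Delta Q * Delta P = {product:.6f}")
```

The product comes out close to 0.5 for every value of k: a narrower packet in position is compensated by a broader one in momentum.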

Before interpreting the Kennard inequality (IV. 26), we give a still more general inequality, derived

by Schrödinger (1930). Consider two arbitrary self - adjoint operators A and B acting on a Hilbert

space H. Define, for a pure state |ψ⟩ ∈ H, the following operators:

$$A_\psi := A - \langle A\rangle_\psi\,\mathbb{1} \quad\text{and}\quad B_\psi := B - \langle B\rangle_\psi\,\mathbb{1}. \tag{IV. 33}$$

The expectation values of these operators are, in the state |ψ⟩, equal to 0,

⟨A ψ ⟩ ψ = ⟨B ψ ⟩ ψ = 0. (IV. 34)

The Cauchy - Schwarz inequality (II. 12), p. 19, for the vectors A ψ |ψ⟩ and B ψ |ψ⟩ reads

$$\langle A_\psi\psi\,|\,A_\psi\psi\rangle\,\langle B_\psi\psi\,|\,B_\psi\psi\rangle \geq \big|\langle A_\psi\psi\,|\,B_\psi\psi\rangle\big|^2. \tag{IV. 35}$$

Because A ψ and B ψ are self - adjoint, we can also write this inequality as follows,

$$\langle A_\psi^2\rangle_\psi\,\langle B_\psi^2\rangle_\psi \geq \big|\langle A_\psi B_\psi\rangle_\psi\big|^2. \tag{IV. 36}$$

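Using both the commutator $[\cdot\,,\cdot]_-$ and the anti-commutator $[\cdot\,,\cdot]_+$, we find for the right-hand side of (IV. 36)

$$\begin{aligned}
\big|\langle A_\psi B_\psi\rangle_\psi\big|^2 &= \big|\tfrac12\langle[A_\psi, B_\psi]_-\rangle_\psi + \tfrac12\langle[A_\psi, B_\psi]_+\rangle_\psi\big|^2\\
&= \tfrac14\big|\langle[A_\psi, B_\psi]_-\rangle_\psi\big|^2 + \tfrac14\langle[A_\psi, B_\psi]_+\rangle_\psi^2,
\end{aligned} \tag{IV. 37}$$

where the cross-term disappears because of

$$\langle[A_\psi, B_\psi]_-\rangle_\psi^* = -\langle[A_\psi, B_\psi]_-\rangle_\psi, \qquad \langle[A_\psi, B_\psi]_+\rangle_\psi^* = +\langle[A_\psi, B_\psi]_+\rangle_\psi. \tag{IV. 38}$$

Furthermore,

$$[A_\psi, B_\psi]_- = [A, B]_-, \tag{IV. 39}$$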

and we obtain the inequality

$$\langle A_\psi^2\rangle_\psi\,\langle B_\psi^2\rangle_\psi \geq \tfrac14\big|\langle[A, B]_-\rangle_\psi\big|^2 + \tfrac14\langle[A_\psi, B_\psi]_+\rangle_\psi^2. \tag{IV. 40}$$
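The Schrödinger inequality can be checked numerically in a finite-dimensional example. The sketch below uses the Pauli matrices σ_x and σ_y acting on C² as the operators A and B; this choice, the helper names and the parametrization of the state are assumptions made for the illustration, not part of the derivation above.

```python
import cmath
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expectation(op, psi):
    """<psi|op|psi> for a normalized state vector psi."""
    n = len(psi)
    return sum(complex(psi[i]).conjugate() * op[i][j] * psi[j]
               for i in range(n) for j in range(n))


def centred(op, psi):
    """The operator A_psi = A - <A>_psi * identity of (IV. 33)."""
    mean = expectation(op, psi).real
    n = len(op)
    return [[op[i][j] - (mean if i == j else 0) for j in range(n)]
            for i in range(n)]


def uncertainty_bounds(a, b, psi):
    """Return (variance product, Robertson bound, Schroedinger bound)."""
    a_c, b_c = centred(a, psi), centred(b, psi)
    ab = expectation(matmul(a_c, b_c), psi)
    ba = expectation(matmul(b_c, a_c), psi)
    product = (expectation(matmul(a_c, a_c), psi).real
               * expectation(matmul(b_c, b_c), psi).real)
    robertson = 0.25 * abs(ab - ba) ** 2                    # commutator term
    schroedinger = robertson + 0.25 * (ab + ba).real ** 2   # plus anti-commutator term
    return product, robertson, schroedinger


SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]


def spin_state(theta, phi):
    """Normalized state cos(theta/2)|up> + e^{i phi} sin(theta/2)|down>."""
    return [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]


if __name__ == "__main__":
    for theta, phi in [(0.3, 0.0), (1.0, 0.7), (2.0, 2.5)]:
        print(uncertainty_bounds(SIGMA_X, SIGMA_Y, spin_state(theta, phi)))
```

In every state the variance product is at least as large as the right-hand side of (IV. 40), which in turn is at least as large as the commutator term alone.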

In view of the inequalities (IV. 26) and (IV. 40), we make a few remarks.

(i) Leaving out the last term on the right - hand side of inequality (IV. 40) gives the better known

but weaker inequality, derived by H.P. Robertson (1929),

$$\langle A_\psi^2\rangle_\psi\,\langle B_\psi^2\rangle_\psi \geq \tfrac14\big|\langle[A, B]_-\rangle_\psi\big|^2. \tag{IV. 41}$$

(ii) Notice that ⟨A 2 ψ ⟩ ψ is equal to the square of the standard deviation of the quantity A in the

state |ψ⟩,

⟨A 2 ψ ⟩ ψ = ⟨(A − ⟨A⟩ ψ ) 2 ⟩ = (∆ ψ A) 2 . (IV. 42)

(iii) For the special case A = Q and B = P , the Robertson inequality (IV. 41) transforms into the

Kennard inequality (IV. 26), and the expressions (IV. 29) and (IV. 31) correspond to $\langle Q_\psi^2\rangle_\psi$ in the q-language and $\langle P_\psi^2\rangle_\psi$ in the p-language.

(iv) Notice that in deriving these uncertainty relations the interpretation of the uncertainties plays

no role.

(v) An objection to the Robertson inequality (IV. 41) and the Schrödinger inequality (IV. 40) is that

the right - hand side depends on the state, therefore, it is no absolute lower limit for all states.

If |ψ⟩ is an eigenstate of A, the right - hand side of the Robertson inequality (IV. 41) is 0 and

does not provide any restriction on ∆B. Therefore, even if A and B are not both at the same

time sharp in any state, i.e., they do not have simultaneous eigenstates, this does not follow

from the inequality (IV. 41).

Only if the right-hand side of inequality (IV. 41) is unequal to zero for all states does the Robertson inequality represent the uncertainty principle. This is the case if the commutator is a multiple of unity, as in the case of P and Q, where $[P, Q] = -i\hbar\,\mathbb{1}$, see p. 78, (IV. 1). It can, however, be proved that this canonical commutation relation can only apply to unbounded operators having no eigenstates in the, inevitably infinite dimensional, Hilbert space in which they act.

(vi) Already in 1929 E.U. Condon pointed out the following facts (Jammer 1974, p. 71). In certain

states, non - commuting operators can both be sharp. Take, for example, the ground state of the

H - atom, or any stationary state with total angular momentum l = 0. This is also an eigenstate

of L x , L y and L z with eigenvalue 0. Therefore, ∆L x ∆L y = 0, and likewise for L x and L z ,

and for L y and L z , although these operators do not mutually commute. Therefore, the fact that

operators do not commute does not guarantee an uncertainty relation. Furthermore, the product of the uncertainties can be nonzero even though the expectation value of the commutator vanishes. Take again a stationary state of the H-atom, with l = 1 and m = 0. In that state ⟨[L_x, L_y]⟩ = 0, whereas ΔL_x ≠ 0 and ΔL_y ≠ 0.
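The second of Condon's examples in remark (vi) can be verified with the standard 3 × 3 matrices of the angular momentum components for l = 1. The sketch below works in units with ħ = 1 and in the basis m = +1, 0, −1; the helper names are introduced here for illustration.

```python
import math

HBAR = 1.0                   # units with hbar = 1 (assumption for the illustration)
S = HBAR / math.sqrt(2)

# Angular momentum matrices for l = 1 in the basis m = +1, 0, -1.
L_X = [[0, S, 0], [S, 0, S], [0, S, 0]]
L_Y = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]
L_Z = [[HBAR, 0, 0], [0, 0, 0], [0, 0, -HBAR]]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expectation(op, psi):
    """<psi|op|psi> for a normalized state vector psi."""
    n = len(psi)
    return sum(complex(psi[i]).conjugate() * op[i][j] * psi[j]
               for i in range(n) for j in range(n))


def variance(op, psi):
    """Square of the standard deviation of op in the state psi, cf. (IV. 42)."""
    mean = expectation(op, psi).real
    return expectation(matmul(op, op), psi).real - mean ** 2


def commutator_expectation(a, b, psi):
    """<psi|[a, b]|psi>."""
    return expectation(matmul(a, b), psi) - expectation(matmul(b, a), psi)


if __name__ == "__main__":
    m_zero = [0, 1, 0]       # the state l = 1, m = 0
    print("<[L_x, L_y]> =", commutator_expectation(L_X, L_Y, m_zero))
    print("(Delta L_x)^2 =", variance(L_X, m_zero))
    print("(Delta L_y)^2 =", variance(L_Y, m_zero))
```

In the state m = 0 the right-hand side of the Robertson inequality (IV. 41) vanishes, while both variances equal ħ², so the inequality holds but says nothing about the actual spread.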

In conclusion, there are fundamental objections against accepting the Schrödinger inequality, and

by implication against the weaker inequalities which follow from it, to be the mathematical expression

of Heisenberg’s uncertainty principle.

And this is not all.

IV. 5. 3

SINGLE SLIT EXPERIMENT

Relations (IV. 26) and (IV. 41) are considered to be the mathematical expression of the uncertainty

principle in most textbooks on quantum mechanics. In addition to the previous criticism, we will show that this view is, remarkably enough, also inconsistent with the experiments used as illustrations

of this principle (Uffink and Hilgevoord 1985, 1988 and Hilgevoord and Uffink 1988, 1990).

Consider the diffraction of light, or of electrons, by a single slit in an absorbing screen, an example

Heisenberg also gives. Take for the wave function representing the particles passing through the

screen with the slit a simple square wave function, see figure IV. 6,

$$\psi_{ss}(q) = \begin{cases} \dfrac{1}{\sqrt{2a}} & \text{if } |q| \leq a\\[1ex] 0 & \text{elsewhere} \end{cases}, \tag{IV. 43}$$

where 2 a ∈ R + is the width of the slit, and q the Cartesian coordinate parallel to the screen and

perpendicular to the slit.

Figure IV. 6: The probability distribution $|\psi_{ss}(q)|^2$ in position for a slit of width 2a

The Fourier transform of $\psi_{ss}$ is

$$\tilde\psi_{ss}(p) = \sqrt{\frac{a}{\pi\hbar}}\;\frac{\sin(ap/\hbar)}{ap/\hbar}. \tag{IV. 44}$$

The square of this wave function, | ˜ψ ss (p)| 2 , has the same form as the diffraction pattern for the slit

which is formed on a photographic plate placed far away, see figure IV. 7.

Figure IV. 7: The diffraction pattern $|\tilde\psi_{ss}(p)|^2$ for a small slit of width 2a; the indicated width of the central peak is 2πħ/a

For the standard deviation of position and momentum in the state $\psi_{ss}$ we find

$$(\Delta_{\psi_{ss}} Q)^2 = \int_{\mathbb{R}} q^2\,|\psi_{ss}(q)|^2\,dq = \frac{1}{2a}\int_{-a}^{+a} q^2\,dq = \tfrac13 a^2 \tag{IV. 45}$$

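and

$$(\Delta_{\psi_{ss}} P)^2 = \int_{\mathbb{R}} p^2\,|\tilde\psi_{ss}(p)|^2\,dp = \frac{\hbar}{\pi a}\int_{\mathbb{R}} |\sin(ap/\hbar)|^2\,dp = \infty, \tag{IV. 46}$$

yielding

$$\Delta_{\psi_{ss}} Q\;\Delta_{\psi_{ss}} P = \tfrac13\sqrt3\,a\cdot\infty. \tag{IV. 47}$$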

This indeed satisfies the Kennard inequality (IV. 26), but in a rather uninteresting manner.

Although $\Delta_{\psi_{ss}} P = \infty$, the function $|\tilde\psi_{ss}|^2$ has in fact a very pronounced central peak, of a width of the order $a^{-1}$, in which 95% of the total probability is located. It is the inverse proportionality of the width of this central peak to the width of the slit which, according to Heisenberg, illustrates the uncertainty principle: it is impossible to make the probability densities $|\psi_{ss}(q)|^2$ and $|\tilde\psi_{ss}(p)|^2$ arbitrarily narrow at the same time.

But this conclusion cannot be inferred from the Kennard inequality (IV. 26). If $a$ goes to infinity, $|\tilde\psi_{ss}(p)|^2$ goes to the delta function $\delta(p)$. The standard deviation $\Delta_{\psi_{ss}} P$, however, remains divergent. In other words, 95% of a probability distribution can be concentrated on an arbitrarily small interval, whereas the standard deviation of the distribution remains arbitrarily large.² If nothing is given concerning the distributions $|\psi_{ss}(q)|^2$ and $|\tilde\psi_{ss}(p)|^2$ but the Kennard inequality (IV. 26), these distributions could both be very narrow. Consequently, Heisenberg’s conclusion cannot be derived from the Kennard inequality, in contrast to what is usually claimed.
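The contrast between a concentrated bulk and a divergent standard deviation can be checked numerically. The sketch below is only an illustration: it assumes the dimensionless variable x = ap/ℏ, in which the single-slit momentum density (IV. 44) becomes sin²x/(πx²), and the function names and the midpoint-rule integration are choices made here, not part of the notes.

```python
import math

def sinc2_density(x):
    """Single-slit momentum density in the variable x = a p / hbar."""
    if x == 0.0:
        return 1.0 / math.pi
    return math.sin(x) ** 2 / (math.pi * x * x)

def integrate(f, lo, hi, n=200_000):
    """Plain midpoint rule; adequate for this smooth integrand."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

def probability_within(cutoff):
    """Probability that |x| <= cutoff."""
    return integrate(sinc2_density, -cutoff, cutoff)

def truncated_second_moment(cutoff):
    """Integral of x^2 * density over |x| <= cutoff (the mean is 0 by symmetry)."""
    return integrate(lambda x: x * x * sinc2_density(x), -cutoff, cutoff)

if __name__ == "__main__":
    for cutoff in (math.pi, 2 * math.pi, 20 * math.pi, 200 * math.pi):
        print(cutoff, probability_within(cutoff), truncated_second_moment(cutoff))
```

In this variable the truncated second moment equals (X − sin(2X)/2)/π for a cutoff X, so it grows without bound, while the probability inside the cutoff saturates: the central lobe |x| ≤ π carries roughly 90%, and |x| ≤ 2π roughly 95%.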

Nevertheless, Heisenberg’s conclusion is correct for the given example of the single slit. This raises the question whether his statement is valid in general. What we are in fact interested in is a measure for the width of a probability distribution that represents the width of the unweighted distribution, i.e. one that gives no extra weight to the tails.

The most natural definition of such a measure is the length of the smallest interval that can contain a fraction $\alpha \in [0, 1]$ of the total probability, where, roughly, $\alpha = 0.95$ is taken. If $\rho$ is a probability density, the definition is

$$W_\alpha(\rho) := \min \left\{ [a, b] \subset \mathbb{R} \;\middle|\; \int_a^b \rho(x)\, dx = \alpha \right\}, \tag{IV. 48}$$

where the minimum is understood to be taken over the lengths $b - a$ of the intervals.

For position and momentum in quantum mechanics we define

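$$W_\alpha(Q, \psi) := \min \left\{ [a, b] \subset \mathbb{R} \;\middle|\; \int_a^b |\psi(q)|^2\, dq = \alpha \right\}, \tag{IV. 49}$$

$$W_\alpha(P, \psi) := \min \left\{ [a, b] \subset \mathbb{R} \;\middle|\; \int_a^b |\tilde\psi(p)|^2\, dp = \alpha \right\}. \tag{IV. 50}$$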

The product of these measures also satisfies an uncertainty relation, as was first shown by H.J. Landau and H.O. Pollak (1961), nota bene in a journal for industrial engineers of the American Bell Telephone Company,

$$W_\alpha(P, \psi)\, W_\alpha(Q, \psi) \geq c_\alpha, \tag{IV. 51}$$

where $\alpha \in (\tfrac{1}{2}, 1]$, and $c_\alpha > 0$ is a constant which depends only on $\alpha$, not on $\psi$.
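The width measure of (IV. 48) is easy to approximate for a density sampled on a grid. The sketch below is a hedged illustration, not the authors' procedure: `minimal_width` and `slit_density` are names invented here, and the condition "probability equal to α" is replaced by "at least α" because of the discretization.

```python
import math

def minimal_width(density, lo, hi, alpha, n=20_000):
    """Length of the shortest interval inside [lo, hi] that carries at least
    probability alpha, for a density sampled on n equal cells."""
    h = (hi - lo) / n
    mass = [density(lo + (k + 0.5) * h) * h for k in range(n)]
    best = None
    left = 0
    window = 0.0
    for right in range(n):
        window += mass[right]
        # Shrink from the left while the window still holds at least alpha.
        while window - mass[left] >= alpha:
            window -= mass[left]
            left += 1
        if window >= alpha:
            width = (right - left + 1) * h
            if best is None or width < best:
                best = width
    return best

def slit_density(q, a=1.0):
    """|psi_ss(q)|^2 from (IV. 43): uniform on [-a, a]."""
    return 1.0 / (2.0 * a) if abs(q) <= a else 0.0

def gaussian_density(x):
    """Standard normal density, used only as a second test case."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
```

For the slit density the result is close to 2aα, i.e. of the order of a, in line with the estimate for position given for the square wave function. Because the masses are non-negative, the sliding window finds the true minimum on the grid even for densities with several peaks.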

² Responsible for this phenomenon is the mathematical fact that the standard deviation assigns a quadratically increasing weight to the tails of a distribution. In a Gaussian distribution, e.g. the Gaussian wave packet (IV. 32), these tails go to zero rapidly enough, because a decaying exponential goes to zero more rapidly than any polynomial goes to infinity, but for many wave functions occurring in physics the standard deviation diverges.


From this inequality it follows that the probability densities of position and momentum cannot simultaneously be made arbitrarily narrow, in the sense that a fraction $\alpha$ is concentrated on an arbitrarily small interval. Thus, 34 years after the birth of the uncertainty principle, the statement that everyone had thought to follow from the standard uncertainty relations was finally proven.

For the square wave function $\psi_{ss}$ (IV. 43) and its Fourier transform (IV. 44) we find

$$W_\alpha(Q, \psi_{ss}) \simeq a \quad \text{and} \quad W_\alpha(P, \psi_{ss}) \simeq \frac{\hbar}{a}, \tag{IV. 52}$$

so that the product is of the order of magnitude of $\hbar$.

IV. 5. 4 TIME AND ENERGY

In the same article in which Heisenberg (1927) introduces the uncertainty relation for position

and momentum, he also discusses the uncertainty relation between time and energy, starting from the

‘well-known’ equation Et − tE = ih. This equation has caused many problems.

If t is taken to be the universal time parameter, the spectrum of the operator t must be the real axis.

But then the commutation relation can only be satisfied by an energy operator of which the spectrum

is the real axis also. On the other hand, we know that the energy spectrum of quantum mechanical

systems is generally bounded from below and can even be totally or partially discrete. Hence, the

conclusion was soon drawn that there is no time operator in quantum mechanics (Von Neumann 1932,

Pauli 1933). In the light of the existence of a position operator and with the theory of relativity in

mind it was felt that in quantum mechanics something strange was going on with ‘time’. This is

expressed in almost all textbooks and articles concerning this subject. Nevertheless, it has to do with

a conceptual confusion which has not been noticed for a remarkably long time.

As it happens, the comparison between q and t is faulty if t is understood to be a universal time

parameter. After all, q is a dynamic variable of a specific physical system, for example of a particle,

and therefore there are a lot of q’s in a multiple particle system. There is, however, only one time

parameter. This does not belong to a certain physical system but must be put on a par with the

universal position coordinates x, y, z, with which it is linked in the theory of relativity. The time coordinate t is no more an operator in quantum mechanics than these position coordinates are. Only the dynamic variables of physical systems can be operators, and the problem outlined above is therefore a pseudo-problem.

Nevertheless, one can wonder if dynamic variables exist which are just as ‘timelike’, literally

speaking, as q is ‘positionlike’. The answer is affirmative. Such variables exist in systems we call

‘clocks’, think, for example, of the position or the orientation of the hand of a clock. But also very

simple, microscopic systems can have such variables. In quantum mechanics these dynamic time

variables become operators. They occur in specific systems and therefore they are not universal.

And, similar to other dynamic variables, generally the spectrum of such time operators in quantum

mechanics is not the entire real axis (see further J. Hilgevoord 2002).


IV. 5. 5 DOUBLE SLIT EXPERIMENT

Even more interesting is the famous interference experiment with the double slit. The wave function

corresponding to particles passing through the screen with the slits is, in analogy with (IV. 43),

$$\psi_{ds}(q) = \begin{cases} \dfrac{1}{2\sqrt{a}} & \text{if } q \in [-A - a, -A + a] \cup [A - a, A + a], \\[4pt] 0 & \text{elsewhere,} \end{cases} \tag{IV. 53}$$

where 2a is the width of each slit, 2A is the distance between the slits, and A ≫ a, see figure IV. 8.

Figure IV. 8: The probability distribution $|\psi_{ds}(q)|^2$ in position for a double slit; $2a$ is the width of each slit and $2A$ the distance between the slits

The Fourier transform of this double square wave function ψ ds is

$$\tilde\psi_{ds}(p) = \sqrt{\frac{2a}{\pi\hbar}}\, \cos\!\left(\frac{Ap}{\hbar}\right) \frac{\sin(ap/\hbar)}{ap/\hbar}. \tag{IV. 54}$$

The function | ˜ψ ds | 2 again has the same form as the interference pattern for the slits on a photographic

plate placed far away, as can be seen in figure IV. 9.

Figure IV. 9: The interference pattern $|\tilde\psi_{ds}(p)|^2$ for the double slit


Now there are, however, two parameters playing a role. The distance $A$ between the slits is a measure for the total width of $|\psi_{ds}(q)|^2$ (it determines the ‘enveloping’ cosine factor in (IV. 54)), while the width $a$ of the slits is a measure for the ‘fine structure’ of this probability density. For $|\tilde\psi_{ds}(p)|^2$ the roles are reversed: $A^{-1}$ is a measure for the width of the interference lines, while $a^{-1}$ is a measure for the total width of the interference pattern. This shows the well-known fact that the width of the interference lines and the distance between the slits are inversely proportional. In a moment we will see that Bohr’s discussion of the double slit experiment rests exactly on this fact.
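The two scales can be read off numerically. The following sketch assumes the momentum density in the form $|\tilde\psi_{ds}(p)|^2 = \frac{2a}{\pi\hbar} \cos^2(Ap/\hbar)\, \sin^2(ap/\hbar)/(ap/\hbar)^2$; the function names, and the default ℏ = 1, are illustrative choices and not taken from the notes.

```python
import math

def double_slit_momentum_density(p, a, A, hbar=1.0):
    """Squared modulus of the double-slit Fourier transform (assumed form)."""
    x = a * p / hbar
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return (2.0 * a / (math.pi * hbar)) * (math.cos(A * p / hbar) * sinc) ** 2

def fringe_zeros(A, count, hbar=1.0):
    """First `count` positive zeros of the cosine factor: p = (n + 1/2) pi hbar / A."""
    return [(n + 0.5) * math.pi * hbar / A for n in range(count)]

def envelope_zero(a, hbar=1.0):
    """First positive zero of the single-slit factor: p = pi hbar / a."""
    return math.pi * hbar / a

if __name__ == "__main__":
    a, A = 1.0, 10.0
    zeros = fringe_zeros(A, 3)
    print("fringe spacing:", zeros[1] - zeros[0])
    print("envelope half-width:", envelope_zero(a))
```

The spacing of the interference lines scales as 1/A and the overall extent of the pattern as 1/a, so their ratio is A/a, independent of ℏ.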

◃ Remark

Consider the measures

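$$\Delta_{\psi_{ds}} Q \simeq A \quad \text{and} \quad \Delta_{\psi_{ds}} P = \infty, \tag{IV. 55}$$

$$W_\alpha(Q, \psi_{ds}) \simeq A \quad \text{and} \quad W_\alpha(P, \psi_{ds}) \simeq \frac{\hbar}{a}. \tag{IV. 56}$$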

None of these measures gives the fine structure. Therefore, Bohr’s Copenhagen reasoning, treated

in the next subsection, cannot be based on the Kennard inequality (IV. 26) nor on the inequality of

Landau and Pollak (IV. 51). ▹

EXERCISE 30. Verify the calculations (IV. 55) and (IV. 56).

IV. 5. 6 A NEW UNCERTAINTY MEASURE

Bohr’s reasoning concerning the double slit experiment goes as follows. A way to determine through which slit the particle has gone is to measure the recoil in the q-direction that the screen experiences at the passage of this particle. To this end the screen must be able to move in the q-direction. Instead of a fixed screen we therefore take a screen that is suspended from a spring, as can be seen in figure IV. 10. The incoming momentum p is perpendicular to the screen.

We assume conservation of kinetic energy, i.e. a heavy screen, which means that only the direction

of the momentum changes. Consequently, a particle arriving at position q of the photographic

plate, gives a recoil to the screen of, assuming r ≫ A and therefore sin θ ≈ tan θ,

$$\left( \frac{q \pm A}{r} \right) p, \tag{IV. 57}$$

depending on which slit it has gone through. To be able to measure the difference in recoil, the inaccuracy $\delta P$ with which the momentum of the screen was known in advance must satisfy

$$\delta P < \frac{2Ap}{r}. \tag{IV. 58}$$


Figure IV. 10: Moving screen, with $q = r \tan\theta_1 + A = r \tan\theta_2 - A$

Because of the inequality

$$\delta P\, \delta Q \gtrsim h, \tag{IV. 59}$$

the inaccuracy with which the position $Q$ of the screen was known then satisfies

$$\delta Q > \frac{h r}{2Ap}. \tag{IV. 60}$$

But the width of the interference lines on the photographic plate is

$$\frac{\lambda r}{2A} = \frac{h r}{2Ap}, \tag{IV. 61}$$

where $\lambda = \frac{h}{p}$ is the de Broglie wavelength of the electron. Bohr therefore concludes that the uncertainty in the position of the screen will result in the erasure of the interference pattern.

◃ Remarks

First, we see that Bohr applies the uncertainty principle to the screen, which means that he treats this macroscopic body quantum mechanically. Second, he uses the uncertainty principle in a qualitative manner; in particular, he does not give a definition of the uncertainties $\delta P$ and $\delta Q$. Third, the relevant uncertainty in $Q$ is of the order of magnitude of the width $A^{-1}$ of the interference lines. Bohr therefore has no use for the Kennard inequality (IV. 26) or the inequality of Landau and Pollak (IV. 51), which do not contain this width. Finally, Bohr does not show exactly how the erasure of the interference pattern takes place; evidently he considers it to be intuitively clear. ▹

From the previous it should be clear that something is still lacking in the mathematical formulation

of the uncertainty principle. One would hope that there may exist some direct relation between the


total width of a distribution in the p - language (q - language), and the fine structure of this distribution

in the q - language (p - language) as exhibited by the wave function for the double slit, assuming that

this relation has general validity. Indeed, such a relation has been found (Uffink and Hilgevoord 1985),

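$$w_\alpha(Q, \psi)\, W_\alpha(P, \psi) \geq C_\alpha \quad \text{and} \quad w_\alpha(P, \psi)\, W_\alpha(Q, \psi) \geq C_\alpha, \tag{IV. 62}$$

where $w_\alpha(\,\cdot\,, \psi) \in \mathbb{R}^+$ is a measure for the width of the fine structure of $\psi$, $W_\alpha(\,\cdot\,, \psi) \in \mathbb{R}^+$ is the measure for the total width of $\psi$ as introduced earlier, and $C_\alpha > 0$ is a constant depending on $\alpha \in (0, 1]$, but not on the state $\psi$.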

By way of illustration, if $W$ is taken as a measure of the size of the objective of a microscope and $w$ as a measure of the fine structure of the image, the inequalities express the fact that the resolving power must decrease if the aperture is reduced. Likewise, the direction of incoming radiation can be determined better by using a long array of radio telescopes than by using a short one, etc. These inequalities thus express, among other things, the well-known fact in optics that the resolving power of an apparatus improves as the apparatus becomes larger.

The inequalities (IV. 62) seem to solve the problem for Bohr. A closer consideration, however, tells us that $W_\alpha(P, \psi)$ is not the suitable measure to express whether the difference in recoil can or cannot be observed. More precisely, $W_\alpha(P, \psi) > \frac{2Ap}{r}$ does not guarantee that this difference cannot be observed. $W_\alpha(P, \psi)$ can be large in this experiment, which makes the inequality (IV. 62) ineffective. In fact, it is questionable whether Bohr’s argument can be based on an uncertainty relation at all.

Nevertheless, his conclusion is correct! A direct calculation of the double slit experiment by D. Hauschildt (unpublished) shows that the intensity of the interference, in case the screen is movable, is proportional to the factor

$$\left| \langle\chi|\, e^{\, i \frac{2Ap}{\hbar r} Q_{sc}} \,|\chi\rangle \right|. \tag{IV. 63}$$

Here $|\chi\rangle$ is the state of the screen and $Q_{sc}$ is the position operator of the screen. The state

$$|\chi\rangle' := e^{\, i \frac{2Ap}{\hbar r} Q_{sc}}\, |\chi\rangle \tag{IV. 64}$$

is the state of which the momentum spectrum is shifted by $\frac{2Ap}{r}$ with respect to the momentum spectrum of the state $|\chi\rangle$,

$$\langle p \,|\, \chi' \rangle = \left\langle p - \frac{2Ap}{r} \,\middle|\, \chi \right\rangle. \tag{IV. 65}$$

The factor (IV. 63) is, therefore, exactly the quantum mechanical expression describing to what extent

the state of the screen after the recoil can be distinguished from the state of the screen before the

recoil.

If the momentum spectrum of $|\chi\rangle$ is broad with respect to $\frac{2Ap}{r}$, the overlap (IV. 63) will be large, namely almost 1. In that case $|\chi\rangle$ and $|\chi\rangle'$ are difficult to distinguish and interference is large. If the momentum spectrum of $|\chi\rangle$ only contains peaks which are narrow with respect to $\frac{2Ap}{r}$, then (IV. 63) is small. The states $|\chi\rangle$ and $|\chi\rangle'$ are then well distinguishable and interference is small. The essence of Bohr’s reasoning is therefore correct: to the extent to which the screen can serve as a measuring apparatus to determine the slit a particle goes through, interference disappears. Whether this reasoning can be based on an uncertainty relation is unknown to this very day.
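The dependence of the overlap on the width of the momentum spectrum can be made concrete for one particular choice of screen state. The sketch below assumes, purely for illustration, that $|\chi\rangle$ is a minimum-uncertainty Gaussian with momentum spread σ_p (so that σ_q = ℏ/2σ_p); in that case the overlap with the state shifted in momentum by Δ = 2Ap/r is exp(−Δ²/8σ_p²). The function names are hypothetical.

```python
import cmath
import math

def overlap_gaussian_screen(shift, sigma_p, hbar=1.0, n=20_000):
    """|<chi| exp(i * shift * Q / hbar) |chi>| for a Gaussian screen state with
    momentum spread sigma_p, by midpoint integration over position."""
    sigma_q = hbar / (2.0 * sigma_p)      # assumption: minimum-uncertainty Gaussian
    k = shift / hbar
    lo, hi = -10.0 * sigma_q, 10.0 * sigma_q
    h = (hi - lo) / n
    norm = sigma_q * math.sqrt(2.0 * math.pi)
    total = 0j
    for j in range(n):
        q = lo + (j + 0.5) * h
        density = math.exp(-q * q / (2.0 * sigma_q ** 2)) / norm
        total += density * cmath.exp(1j * k * q) * h
    return abs(total)

def overlap_closed_form(shift, sigma_p):
    """Closed form for the same Gaussian state: exp(-shift^2 / (8 sigma_p^2))."""
    return math.exp(-shift ** 2 / (8.0 * sigma_p ** 2))

if __name__ == "__main__":
    for sigma_p in (0.1, 1.0, 10.0):
        print(sigma_p, overlap_gaussian_screen(1.0, sigma_p), overlap_closed_form(1.0, sigma_p))
```

For a momentum spread much larger than the shift the overlap is close to 1 and interference survives; for a spread much smaller than the shift the overlap is negligible, so the recoil reveals the slit and interference is lost.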


IV. 5. 7 INTERPRETATION

The statistical interpretation of the uncertainty $W_\alpha(A, \psi)$ in (IV. 62) is that it is a measure for the predictability of an outcome of measurement given a probability distribution; it is nothing but the usual statistical interpretation of the standard deviation. How we must physically understand this uncertainty depends directly on how we must physically understand quantum mechanical probabilities. We will discuss this in detail further on.

The number $w_\alpha(A, \psi)$ is a measure for the distinguishability between the state $\psi$ (probability distribution) and some other state (other probability distribution) when measuring the quantity $A$ corresponding to operator $A$. This is also nothing but the usual statistical interpretation of this measure.

V HIDDEN VARIABLES

While we have thus shown that the wave function does not provide a complete description

of the physical reality, we left open the question of whether or not such a description

exists. We believe, however, that such a theory is possible.

— Einstein, Podolsky and Rosen

You may have already suspected that I still believe in the hidden variables hypothesis.

[. . . ] Anyway, for me, the hidden variable hypothesis is still the best way to ease my

conscience about quantum mechanics.

— Gerard ’t Hooft

In this chapter we get acquainted with so-called ‘hidden variable theories’ and the motivation to consider such theories. We examine whether it is possible to slide such a ‘hidden variable theory’ underneath quantum mechanics, the way classical mechanics can be slid underneath classical statistical mechanics. We also treat the notorious impossibility theorems of Von Neumann and of Kochen and Specker.

V. 1 HIDDEN REALITY

Quantum mechanics is, roughly speaking, a theory about outcomes of measurements; about which

values can be found upon measurement and about the probability of finding a specific value in such

a measurement. Moreover, according to the Copenhagen perspective, this description is complete:

there is nothing more to say about a physical system. As a consequence, quantum mechanics is

exclusively concerned with the observable behaviour of measuring apparatuses.

In the eyes of many authors, this is bizarre. In the entire history of physics we see that the aim

of a theory has been to tell us something about how reality is organized, how to explain what we

observe around us. Measuring is the eminent scientific manner to examine whether a given theory or

hypothesis meets this aim, or to gather data to help us select theories. Measurement is not an aim, but

a tool. The subject of physical theories, physical reality, does not occur in the quantum mechanical

tale, in contrast to nearly all theories in classical physics.

From this point of view, we could hope that quantum mechanics is some sort of cloak, which must

be sustained by an underlying theory concerning physical reality. Because that underlying theory is

hidden under the quantum mechanical cloak, we will speak of a hidden variable theory.

So, let us examine the matter not from the viewpoint of quantum mechanics, but from ‘physical

reality’, taking as a working hypothesis that something like a ‘physical reality’ exists. The behavior of


radioactive atomic nuclei, as discussed in the Introduction, p. 7, suggests that individual nuclei differ from each other: they show various life spans and emit α-particles with different momenta. The natural idea is that this difference in behavior has a cause, which can be found in mutually differing properties of the physical states of the individual nuclei. Quantum mechanics does not give us these differences, but perhaps a description of the state exists that goes beyond what quantum mechanics tells us.

We would like such an additional description to show us how the phenomena observed at an

individual nucleus follow decisively from the state of that nucleus. Such a description requires extra

variables in comparison with the quantum mechanical description. It is conceivable that not all of

these variables are accessible to our present, and possibly future, possibilities of observation. They

are ‘hidden’ from us, but they must exist to explain the observed differences. If they exist, then

quantum mechanical states correspond to probability distributions over the states described by these

variables. These probability distributions would only express our ignorance concerning the exact

physical states. In this respect, the situation would be entirely analogous to that in classical statistical

mechanics. EPR believed that it must in principle be possible to construct such a theory.

Such an attempt, interpreting quantum mechanics as a statistical theory about an underlying physical

reality, is what is called a hidden variable theory, HVT for short, the support under the quantum

mechanical cloak. Assuming that quantum mechanics is empirically adequate, we will examine if it

is possible in principle to found this description on a HVT.

An important distinction between several types of HVT’s concerns the question whether the hidden

variables describing the physical state of the system can depend on which quantity of the system is

measured. Theories in which this has been permitted are called contextual, they will be discussed in

section V. 4. For the moment, we will first concentrate on the simpler case where this is not permitted,

the non-contextual theories, to be discussed in section V. 2.

Another important division has to do with determinism. Although it is the objective of a HVT to

supplement or complete the quantum mechanical description of a physical system, this does not imply

that with this supplement the precise future behavior of this system can be entirely predicted; it is conceivable that the HVT too merely determines probabilities of possible events. In that case we speak

of an indeterministic, or stochastic, HVT. In this chapter we will discuss only deterministic HVT’s,

but we will come back to stochastic HVT’s in chapter VII.

V. 2 NON-CONTEXTUAL HIDDEN VARIABLES

Let us try to reconstruct quantum mechanics in analogy with classical statistical mechanics. We

assume a space Λ analogous to the phase space Γ known from statistical physics, which we have

already met in section III. 2. An arbitrary ‘point’ in that space Λ is indicated with λ. We do not in

advance impose any restriction to the mathematical form of λ. The variable λ can represent anything,

for example a single real variable, an infinite-dimensional vector field, complex functionals, etc. The

possibilities are endless, the only restriction will be that a probability measure can be defined on Λ. It

is possible to also incorporate the quantum mechanical state as a component in the specification of λ.

Speaking about a ‘classical’ statistical model here does not mean that the HVT must look like

classical mechanics, let alone that λ specifies the position and momentum of the particles, although

we do not exclude that as a possibility.


In the HVT, a pure physical state corresponds to a single ‘point’ λ ∈ Λ. We assume that the

system is always in one of these states λ ∈ Λ, even though we do not know in which one. A general,

mixed state is a probability distribution over Λ. For any given λ every physical quantity A has an

exact value, denoted by A[λ], which is revealed upon measurement of A, and therefore a physical

quantity A can be represented as a real function on this space, A : Λ → R.

Furthermore, every quantity represented by quantum mechanics has to have a counterpart in the HVT. If such a quantity corresponds to the function $A : \Lambda \to \mathbb{R}$, the values $A[\lambda]$ can take are the eigenvalues of the self-adjoint operator $A : \mathcal{H} \to \mathcal{H}$ which, according to quantum mechanics, corresponds to quantity $A$.

It is also required that every quantum mechanical state can be represented in the HVT; for every

state operator W there must be a corresponding probability distribution ρ W over Λ. It is, however,

not necessary that pure quantum states correspond to pure hidden variable states, the idea being that

the HVT allows for a more detailed, complete description of the system. Neither is it necessary that

every probability distribution on Λ corresponds to a state operator, the HVT could easily be a theory

richer than quantum mechanics.

The requirement that the HVT has to reproduce the empirical statements of quantum mechanics

is now expressed in the requirement that the expectation values of quantity A belonging to a physical

system in a physical state, corresponding in the HVT to ρ W , and in quantum mechanics to W , coincide,

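$$\langle A \rangle_{\rho_W} := \int_\Lambda A[\lambda]\, \rho_W(\lambda)\, d\lambda = \operatorname{Tr} A W, \tag{V. 1}$$

where $\rho_W : \Lambda \to [0, \infty)$ is a probability density,

$$\int_\Lambda \rho_W(\lambda)\, d\lambda = 1. \tag{V. 2}$$

For a pure state $|\psi\rangle$, (V. 1) reduces to

$$\int_\Lambda A[\lambda]\, \rho_\psi(\lambda)\, d\lambda = \langle\psi | A | \psi\rangle. \tag{V. 3}$$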

In the discrete case the integrals are replaced by summations.

Summary

A non-contextual HVT is any theory meeting the following requirements.

(i) Every physical state of a physical system corresponds to a probability distribution ρ over Λ.

This is the state postulate.

(ii) Every physical quantity $A$ corresponds to a function $A : \Lambda \to \mathbb{R}$, $\lambda \mapsto A[\lambda]$. This is the observables postulate. The range of $A : \Lambda \to \mathbb{R}$ coincides with the spectrum of the self-adjoint operator $A$ which, according to quantum mechanics, corresponds to quantity $A$.

(iii) The expectation value of $A$ when the physical system is in the state $\rho_W$ which, according to quantum mechanics, corresponds to the state operator $W$, equals the quantum mechanical expression for the expectation value

$$\langle A \rangle_{\rho_W} := \int_\Lambda A[\lambda]\, \rho_W(\lambda)\, d\lambda = \operatorname{Tr} A W.$$

We will call this last requirement (iii) the reproduction criterion.

Since all probabilities in quantum mechanics can be written as Tr PW , with P ∈ P(H), it follows

that all probability distributions in quantum mechanics coincide with the corresponding probability

distributions in the HVT.

We can now ask whether it is possible to construct a HVT satisfying the above requirements. The

answer is that it is indeed possible, even in a quite trivial way, by choosing Λ large enough. We

illustrate this by means of a simple example.

Suppose there are only three quantities $A$, $B$, $C$, with possible values $\{a_1\}$, $\{b_1, b_2\}$, $\{c_1, c_2\}$ and represented by functions $A, B, C : \Lambda \to \mathbb{R}$. The possible value combinations are

$$(a_1, b_1, c_1), \quad (a_1, b_1, c_2), \quad (a_1, b_2, c_1), \quad (a_1, b_2, c_2). \tag{V. 4}$$

We now construct a space $\Lambda$ by identifying every value combination with a point of $\Lambda$. If we denote these points by $\lambda_1$, $\lambda_2$, $\lambda_3$ and $\lambda_4$, then

$$A[\lambda_1] = a_1, \quad B[\lambda_3] = b_2, \quad C[\lambda_4] = c_2, \quad \text{etc.} \tag{V. 5}$$

When there are more quantities, we extend Λ correspondingly.

We have to introduce a probability measure

$$\mu : \mathcal{F}(\Lambda) \to [0, 1] \quad \text{with} \quad \sum_j \mu(\lambda_j) = 1 \tag{V. 6}$$

such that (V. 1) is satisfied. In our case $\Lambda$ is discrete and consists of four points only, as a result of which the integral (V. 1) becomes a sum. For example, for quantity $B$ it must hold that

$$\operatorname{Tr} B W = \sum_{j=1}^{4} B[\lambda_j]\, \mu_W(\lambda_j) = b_1 \big( \mu_W(\lambda_1) + \mu_W(\lambda_2) \big) + b_2 \big( \mu_W(\lambda_3) + \mu_W(\lambda_4) \big). \tag{V. 7}$$

This is satisfied by

$$\mu_W(a_i, b_j, c_k) = \operatorname{Tr} P_{a_i} W \; \operatorname{Tr} P_{b_j} W \; \operatorname{Tr} P_{c_k} W, \tag{V. 8}$$


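where $P_{a_i}$ is the projector on the subspace corresponding to the eigenvalue $a_i$ of $A$, etc. Indeed, according to quantum mechanics

$$B = b_1 P_{b_1} + b_2 P_{b_2}, \tag{V. 9}$$

and therefore

$$\operatorname{Tr} B W = b_1 \operatorname{Tr} P_{b_1} W + b_2 \operatorname{Tr} P_{b_2} W, \tag{V. 10}$$

while, with

$$P_{a_1} = \mathbb{1}, \quad P_{b_1} + P_{b_2} = \mathbb{1}, \quad P_{c_1} + P_{c_2} = \mathbb{1}, \tag{V. 11}$$

we have

$$\begin{aligned} \mu_W(\lambda_1) + \mu_W(\lambda_2) &= \mu_W(a_1, b_1, c_1) + \mu_W(a_1, b_1, c_2) \\ &= \operatorname{Tr} P_{a_1} W \; \operatorname{Tr} P_{b_1} W \; (\operatorname{Tr} P_{c_1} W + \operatorname{Tr} P_{c_2} W) \\ &= \operatorname{Tr} P_{b_1} W. \end{aligned} \tag{V. 12}$$

Likewise we find

$$\mu_W(\lambda_3) + \mu_W(\lambda_4) = \operatorname{Tr} P_{b_2} W. \tag{V. 13}$$

Therefore, (V. 7) has been satisfied, and the same applies to the expectation values of $A$ and $C$.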

If we have, in general, the quantities $A, B, C, \ldots, F$, with values $a_i, b_j, c_k, \ldots, f_l$, where $i = 1, \ldots, n_A$, $j = 1, \ldots, n_B$, etc., the measure

$$\mu_W(a_i, b_j, c_k, \ldots, f_l) = \operatorname{Tr} P_{a_i} W \; \operatorname{Tr} P_{b_j} W \; \operatorname{Tr} P_{c_k} W \cdots \operatorname{Tr} P_{f_l} W, \tag{V. 14}$$

satisfies requirement (V. 3) for all quantities. For example, the probability of finding for quantity $A$ the value $a_i$ is

$$\operatorname{Prob}_{\mu_W}(A : a_i) = \sum_{j, k, \ldots, l} \mu_W(a_i, b_j, c_k, \ldots, f_l) = \operatorname{Tr} P_{a_i} W, \tag{V. 15}$$

because each of the other factors sums to 1. Here we have the required quantum mechanical result. Kochen and Specker (1967) showed how to formulate this idea in the case of an infinite number of physical quantities.
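The product-measure construction can be spelled out for a small case. The sketch below is an illustration under assumptions made here: a qubit state operator `W` with arbitrarily chosen entries, a trivial quantity A with the single value 1, B with the spectral projectors of σ_z and C with those of σ_x. All names (`SPECTRAL`, `product_measure`, `marginal`, `expectation`) are hypothetical.

```python
from itertools import product

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def prob(P, W):
    """Quantum probability Tr(P W) for a projector P and a state operator W."""
    M = matmul(P, W)
    return (M[0][0] + M[1][1]).real

def product_measure(spectral, W):
    """mu_W(a_i, b_j, ...) = Tr(P_ai W) Tr(P_bj W) ... on the space of value tuples.
    `spectral` maps each quantity name to {eigenvalue: projector}."""
    names = list(spectral)
    mu = {}
    for values in product(*(spectral[name] for name in names)):
        weight = 1.0
        for name, value in zip(names, values):
            weight *= prob(spectral[name][value], W)
        mu[values] = weight
    return names, mu

def marginal(names, mu, name, value):
    """Probability in the HVT that quantity `name` has the given value."""
    idx = names.index(name)
    return sum(w for point, w in mu.items() if point[idx] == value)

def expectation(names, mu, name):
    """Expectation value of quantity `name` under the measure mu."""
    idx = names.index(name)
    return sum(point[idx] * w for point, w in mu.items())

# Illustrative data (assumed, not from the notes).
W = [[0.7, 0.2], [0.2, 0.3]]
SPECTRAL = {
    "A": {1: [[1, 0], [0, 1]]},
    "B": {1: [[1, 0], [0, 0]], -1: [[0, 0], [0, 1]]},
    "C": {1: [[0.5, 0.5], [0.5, 0.5]], -1: [[0.5, -0.5], [-0.5, 0.5]]},
}
names, mu = product_measure(SPECTRAL, W)
```

Each marginal of `mu` coincides with the corresponding Tr PW, so all single-quantity expectation values are reproduced, while the joint weights factorize: in this model B and C are statistically independent by construction.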

This solution of the completeness problem is, however, not very interesting physically. It can be seen from the factorizable probabilities in (V. 8) that all quantities are treated here as being statistically independent, which is not in agreement with physical practice. Some quantities are functions of other quantities, e.g., kinetic energy is a function of momentum, $E_{kin} = \frac{p^2}{2m}$, while other quantities are linked with two or more other quantities, such as kinetic, potential and total energy, $E = E_{kin} + E_{pot}$. In the HVT just outlined we have ignored such links.


To illustrate this we assume that in our example $C = A + B$, so that $c_1 = a_1 + b_1$ and $c_2 = a_1 + b_2$. Now the possible value combinations in the HVT are

$$(a_1, b_1, a_1 + b_1), \quad (a_1, b_1, a_1 + b_2), \quad (a_1, b_2, a_1 + b_1), \quad (a_1, b_2, a_1 + b_2), \tag{V. 16}$$

and we see that $(A + B)[\lambda]$ is not equal to $A[\lambda] + B[\lambda]$ for all $\lambda$. Nevertheless, the HVT succeeded in reproducing, by construction, all quantum mechanical expectation values; in other words, the HVT reproduces the relation

$$\langle\psi | A + B | \psi\rangle = \langle\psi | A | \psi\rangle + \langle\psi | B | \psi\rangle, \tag{V. 17}$$

without requiring

$$(A + B)[\lambda] = A[\lambda] + B[\lambda]. \tag{V. 18}$$

If we were to require (V. 18), $\Lambda$ would only consist of the points $(a_1, b_1, a_1 + b_1)$ and $(a_1, b_2, a_1 + b_2)$, which is, of course, a strong restriction.

In the very first proof of the impossibility of a HVT, that is, of the insolubility of the completeness

problem, given by Von Neumann (1932), the requirement (V. 18) was indeed imposed on the HVT.

Von Neumann required (V. 18) for every hidden variable state, in particular also for pure hidden

variable states, which means that (V. 18) must apply to all λ ∈ Λ. We don’t need to discuss Von

Neumann’s elaborate proof of this claim in detail, since J.S. Bell (1966) has shown this impossibility

by means of a very simple example.

Since the values of $A[\lambda]$ etc. have to be the eigenvalues of the corresponding operators, it can be seen immediately that this requirement cannot be satisfied in general. Consider for example the Pauli matrices

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \quad \text{and} \quad \sigma_x + \sigma_y = \begin{pmatrix} 0 & 1 - i \\ 1 + i & 0 \end{pmatrix}. \tag{V. 19}$$

The eigenvalues of $\sigma_x$ and $\sigma_y$ are $\pm 1$, but the eigenvalues of $\sigma_x + \sigma_y$ are $\pm\sqrt{2}$, and therefore (V. 18) cannot be satisfied.
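Bell's counterexample is quick to verify by hand or by machine. The following sketch uses the closed-form eigenvalues of a 2×2 Hermitian matrix; the helper name `hermitian_2x2_eigenvalues` is introduced here for illustration only.

```python
import math

def hermitian_2x2_eigenvalues(m):
    """Eigenvalues of a 2x2 Hermitian matrix [[a, b], [conj(b), d]], ascending."""
    a, b, d = m[0][0].real, m[0][1], m[1][1].real
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + abs(b) ** 2)
    return (mean - radius, mean + radius)

sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_sum = [[sigma_x[i][j] + sigma_y[i][j] for j in range(2)] for i in range(2)]

# All sums of one eigenvalue of sigma_x and one eigenvalue of sigma_y.
possible_sums = sorted({x + y
                        for x in hermitian_2x2_eigenvalues(sigma_x)
                        for y in hermitian_2x2_eigenvalues(sigma_y)})

if __name__ == "__main__":
    print("eigenvalues of sigma_x + sigma_y:", hermitian_2x2_eigenvalues(sigma_sum))
    print("sums of individual eigenvalues:", possible_sums)
```

The sums of individual eigenvalues can only be −2, 0 or 2, whereas the operator sum has eigenvalues ±√2, so no assignment of eigenvalues to all three quantities can be additive.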

Bell argued that the requirement (V. 18) is physically unreasonable. For instance, measuring $\sigma_x$, $\sigma_y$ and $\sigma_x + \sigma_y$ requires three different measurement apparatuses, for example three Stern-Gerlach magnets in three different orientations. There is absolutely no reason to assume that an algebraic link would exist between the individual outcomes of these measurements. The fact that in quantum mechanics the relation (V. 17) exists for pure states, even in case $A$ and $B$ do not commute, must be considered as a particular property of quantum mechanics.

Since the requirement (V. 18) is unreasonably strong, one can wonder whether there are other, reasonable, requirements which can be imposed on a HVT in order to find acceptable solutions of the completeness problem. This brings us to the next section.

V. 3 KOCHEN AND SPECKER’S THEOREM


As we already proved in section II. 4, p. 28, in quantum mechanics the following theorem holds: if the operators $A, B, C, \ldots$ commute, there is a maximal operator $O$ of which they are functions,

$$A = f(O), \quad B = g(O), \quad \text{etc.} \tag{V. 20}$$

A measuring procedure for A, B, C, . . . would be to measure O and apply the function relation to the

result in order to find the values for A, B, C, . . . Kochen and Specker (1967, p. 64) call the quantities

corresponding to A, B, C, . . . commeasurable.

Now it seems reasonable to require, as Von Neumann did, that the HVT also has this structure, i.e., for $B, C : \Lambda \to \mathbb{R}$, if $B = f(C)$, it follows that $B[\lambda] = f(C[\lambda])$, or

$$f(C)[\lambda] = f\big( C[\lambda] \big). \tag{V. 21}$$

This function rule, (V. 21), yields the so-called sum rule for commuting operators,

$$[A, B] = 0 \implies (A + B)[\lambda] = A[\lambda] + B[\lambda], \tag{V. 22}$$

since, with $O$ again the maximal operator of which $A$ and $B$ are functions, $A = f(O)$, $B = g(O)$, implying

$$(A + B) = h(O) \quad \text{with} \quad h = f + g, \tag{V. 23}$$

from (V. 21) it then follows in this HVT that

$$(A + B)[\lambda] = h(O)[\lambda] = h\big( O[\lambda] \big) = f\big( O[\lambda] \big) + g\big( O[\lambda] \big) = f(O)[\lambda] + g(O)[\lambda] = A[\lambda] + B[\lambda]. \tag{V. 24}$$

EXERCISE 31. Prove, again using (V. 21), the product rule for commuting operators,

[A, B] = 0 =⇒ (A B)[λ] = A[λ] B[λ]. (V. 25)

Now we will see how the requirement, (V. 21), which at first sight is eminently reasonable, nevertheless

renders a HVT of quantum mechanics impossible.

THEOREM :

A HVT satisfying the requirements (i) - (iii), p. 111, and the function rule (V. 21), does

not exist if dim H > 2.


Proof

Consider a complete collection of mutually orthogonal projectors $P_1, \ldots, P_N$ on an $N$-dimensional Hilbert space. Such projectors mutually commute: $[P_i, P_j] = 0$. An arbitrary sum of such projectors over some subset $\Delta \subset \{1, \ldots, N\}$ is again a projector,

$$\sum_{i \in \Delta} P_i = P_\Delta \in \mathcal{P}(\mathcal{H}). \tag{V. 26}$$

Therefore, according to the sum rule (V. 22) it has to hold that

$$\sum_{i \in \Delta} P_i[\lambda] = P_\Delta[\lambda]. \tag{V. 27}$$

But the values $P_i[\lambda]$ are the eigenvalues of the operators $P_i$, therefore they are 0 or 1; likewise for $P_\Delta[\lambda]$. These values also follow from (V. 21). In particular, taking $\Delta = \{1, \ldots, N\}$, we find

$$\sum_{i=1}^{N} P_i[\lambda] = \mathbb{1}[\lambda] = 1.$$

But then the value assignment $P_i[\lambda]$ to the projectors satisfies the requirements for a probability measure on $\mathcal{P}(\mathcal{H})$, i.e.

$$\mu_\lambda(P_i) := P_i[\lambda] \in \{0, 1\} \tag{V. 28}$$

is a normalized, additive mapping on the subspaces of $\mathcal{H}$. According to Gleason’s theorem, p. 47, this probability measure can always be written as

$$\mu_\lambda(P_i) = \operatorname{Tr} P_i W_\lambda, \tag{V. 29}$$

for a certain state operator $W_\lambda$, provided that $\dim \mathcal{H} > 2$. There is, however, a contradiction between (V. 29) and (V. 28). The measure (V. 29) is continuous: a small change of the direction of $P_i$ induces a small change of $\mu(P_i)$. The measure (V. 28) is, however, necessarily discontinuous because $\mu(P_i)$ can only have the values 0 and 1.

The conclusion has to be that a value assignment to quantities satisfying (V. 21), and therefore (V. 27), is impossible. As a consequence, a HVT of this type is not possible. □

In this proof we used Gleason's theorem, which is difficult to prove; Gleason's own proof is not very transparent. Direct proofs of the impossibility of this value assignment have also been given. Bell (1966) and Kochen and Specker (1967) were the first to prove this in general, i.e., for dim H > 2 and for all states; see also Belinfante (1973). We will not discuss these proofs in detail but restrict ourselves to a number of observations. Before we do so, we formulate Kochen and Specker's theorem.

KOCHEN AND SPECKER’S THEOREM :

It is not possible to assign values to all physical quantities of an arbitrary physical system,

with a Hilbert space of dim > 2, in accordance with function rule (V. 21).

Sketch of the direct proof

We can formulate the problem as follows. Consider as a particular case of (V. 26) a resolution of the identity into 1-dimensional projectors,

P 1 + P 2 + · · · + P n = 𝟙. (V. 30)

According to (V. 21), and hence (V. 22), the following must hold

P 1 [λ] + P 2 [λ] + · · · + P n [λ] = 𝟙[λ] = 1 (V. 31)

for every resolution of the identity. Consider the 1-dimensional projectors on H as lines in all possible directions through the origin of H. Now assign to all lines the value 0 or 1, such that the sum of the values of each complete set of orthogonal lines is 1. Alternatively, consider the points of intersection of these lines with the surface of the unit sphere in H. To each point of the sphere the value 0 or 1 is assigned, antipodal points are assigned the same value, and the sum of the values of the points of intersection of an orthogonal basis with the surface of the sphere is 1.

If this problem is soluble in a complex H, it is also soluble in a real H with the same dimension.

To see this, choose a basis in H and generate, by application of real orthogonal transformations,

a structure which is isomorphic to a real H. Therefore, we can restrict ourselves to proving the

impossibility of the requested value assignment in a real H.

Furthermore, the impossibility in H N implies the impossibility in H N+1 . This can be shown

by considering the N - dimensional subspace which is orthogonal to a line having value 0. Each

orthogonal (N + 1) - tuple of which this line is a part then turns into an N - tuple with a correct

value assignment. In other words, if it is possible in an (N +1) - dimensional H, it is also possible

in an N - dimensional H and, therefore, we only have to consider a real H with a dimension as

low as possible.

Notice that the problem for a 2 - dimensional Hilbert space H 2 does have a solution, see for

example the diagram V. 1.

Figure V. 1: A solution for dim H = 2
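A valuation of this kind can also be written down explicitly. The following Python sketch gives one possible choice, not necessarily the one drawn in the figure. It assumes that a line through the origin of the real plane is labelled by its angle, measured in units of π, and it assigns the value 1 to the quadrant [0, π/2) and the value 0 to [π/2, π). The function names are illustrative only.

```python
from fractions import Fraction


def value(angle_in_units_of_pi):
    """0/1 value of the line through the origin at the given angle.

    Angles are exact fractions of pi. Reducing modulo 1 (i.e. modulo pi)
    gives a point and its antipode the same value automatically.
    """
    return 1 if angle_in_units_of_pi % 1 < Fraction(1, 2) else 0


def check_orthogonal_pairs(steps=360):
    """Check that every sampled orthogonal pair of lines has values summing to 1."""
    quarter_turn = Fraction(1, 2)            # pi/2 in units of pi
    for k in range(2 * steps):
        angle = Fraction(k, steps)           # sweeps the angles 0 .. 2*pi
        if value(angle) + value(angle + quarter_turn) != 1:
            return False
    return True


all_pairs_ok = check_orthogonal_pairs()
```

Because each orthogonal basis of a 2-dimensional space consists of just one such pair, and different bases share no lines, nothing ties the choice made for one basis to the choice made for another. This is exactly what changes in three dimensions.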

All proofs therefore aim at the case of a real, 3-dimensional Hilbert space H 3 . It seems immediately plausible that the requested value assignment in H 3 is not possible: to each point of the unit sphere in R 3 with value 1 there belong infinitely many points with value 0, namely the equator of which that point is a pole. On the other hand, of each orthogonal triad of points only two points have the value 0. But this is, of course, not a proof.

Bell (1966, pp. 450, 451) showed that points with different values cannot be arbitrarily close.

This is an independent proof of the continuity of the measure, and therefore contrary to the necessary

discontinuity of (V. 28).

Kochen and Specker (1967, p. 69) explicitly constructed a set of 117 spin quantities for which no

consistent value assignment exists. This construction is depicted on the cover of Redhead (1987)

and can be seen in figure V. 2 a. It shows that every value assignment in accordance with function

rule (V. 21) leads to contradictions.

Kochen and Conway only needed 31 quantities in the so - called Peres cube of 33 points (Peres 1993).

This construction is depicted in figure V. 2 b. □

Figure V. 2: a) Kochen-Specker diagram (Redhead 1987); b) Conway-Kochen diagram (Tkadlec 2000)

Figure V. 3: M.C. Escher, Waterfall. Consider the 3 interpenetrating cubes on the top of the

left pillar. Each cube has 4 lines from the mutual center to its vertices, 6 lines to the centers of

its edges, and 3 lines to the centers of its faces. Three of the lines are shared by all three cubes,

giving 3 · (4 + 6 + 3 ) − 6 = 33 lines. These are Peres’ vectors. (Text Meyer 2003 )
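The impossibility of a value assignment on these 33 directions can be checked by a small exhaustive search. The Python sketch below rests on two assumptions which are not spelled out in these notes. First, it takes the directions to be all permutations and sign changes of (0, 0, 1), (0, 1, 1), (0, 1, √2) and (1, 1, √2), which is the usual coordinate form of Peres' vectors. Second, besides requiring exactly one value 1 in each orthogonal triad, it requires that two orthogonal directions never both get the value 1; in three dimensions this follows from the sum rule, since any orthogonal pair can be completed to a triad. All function names are illustrative.

```python
from itertools import combinations, permutations, product

# A component a + b*sqrt(2) is stored exactly as the integer pair (a, b).
ZERO, ONE, ROOT2 = (0, 0), (1, 0), (0, 1)


def neg(c):
    return (-c[0], -c[1])


def dot_is_zero(u, v):
    """Exact orthogonality test for vectors with components a + b*sqrt(2)."""
    rational = sum(a1 * a2 + 2 * b1 * b2 for (a1, b1), (a2, b2) in zip(u, v))
    irrational = sum(a1 * b2 + a2 * b1 for (a1, b1), (a2, b2) in zip(u, v))
    return rational == 0 and irrational == 0


def canonical(v):
    """One representative per line {v, -v}: first nonzero component positive."""
    first = next(c for c in v if c != ZERO)
    return v if sum(first) > 0 else tuple(neg(c) for c in v)


def peres_rays():
    """All permutations and sign changes of four base vectors, up to overall sign."""
    base = [(ZERO, ZERO, ONE), (ZERO, ONE, ONE), (ZERO, ONE, ROOT2), (ONE, ONE, ROOT2)]
    rays = set()
    for b in base:
        for p in permutations(b):
            for signs in product((1, -1), repeat=3):
                rays.add(canonical(tuple(c if s == 1 else neg(c) for c, s in zip(p, signs))))
    return sorted(rays)


def orthogonality(rays):
    """Index pairs of orthogonal rays and index triples of mutually orthogonal rays."""
    n = len(rays)
    pairs = [p for p in combinations(range(n), 2) if dot_is_zero(rays[p[0]], rays[p[1]])]
    pair_set = set(pairs)
    triads = [t for t in combinations(range(n), 3)
              if all(p in pair_set for p in combinations(t, 2))]
    return pairs, triads


def find_valuation(n, pairs, triads):
    """Search for a 0/1 assignment with exactly one 1 per triad and no two
    orthogonal rays both equal to 1. Returns a list of values, or None.
    (Rays that occur in no triad are left as None in the returned list.)"""
    neighbours = {i: set() for i in range(n)}
    for i, j in pairs:
        neighbours[i].add(j)
        neighbours[j].add(i)

    def assign(values, i, v):
        if values[i] is not None:
            return values[i] == v
        values[i] = v
        # A ray with value 1 forces value 0 on every ray orthogonal to it.
        return v == 0 or all(assign(values, j, 0) for j in neighbours[i])

    def search(values, t):
        if t == len(triads):
            return values
        for chosen in triads[t]:             # which member of this triad gets the 1
            trial = list(values)
            if assign(trial, chosen, 1):
                result = search(trial, t + 1)
                if result is not None:
                    return result
        return None

    return search([None] * n, 0)


rays = peres_rays()
pairs, triads = orthogonality(rays)
valuation = find_valuation(len(rays), pairs, triads)
print(len(rays), "rays,", len(triads), "triads, valuation found:", valuation is not None)
```

Under these assumptions the construction yields 33 directions forming 16 orthogonal triads, and the search ends without finding a valuation. The components are kept as exact integer pairs so that orthogonality is decided without rounding errors.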

It is interesting to see what the measure (V. 29), according to Von Neumann the probability measure

of quantum mechanics, looks like in this case. For a pure state W = |ψ⟩ ⟨ψ|, with P i = |χ⟩ ⟨χ|

the measure (V. 29) is

µ(P i ) = Tr P i W = ⟨ψ | P i | ψ⟩ = |⟨χ | ψ⟩|² (V. 32)

so that in a real space we have

µ(P i ) = |⟨χ | ψ⟩|² = cos² θ, (V. 33)

with θ the angle between |ψ⟩ and |χ⟩, see figure V. 4.

Figure V. 4: µ(P i ) = cos² θ

In the appendix of these lecture notes, p. 183 ff., we will prove that, if we assign to each point of the upper half of a unit sphere a non-negative real number such that 1 is assigned to the 'north pole', 0 is assigned to the 'equator' and the sum of the values of each orthogonal triad in this half sphere is 1, there is only one possible value assignment and that is the quantum mechanical one, i.e., in accordance with cos² θ.
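That the cos² θ assignment does satisfy the triad condition is easily checked numerically. The sketch below is only an illustration under stated assumptions: the state |ψ⟩ is placed at the north pole, the orthonormal triad is obtained by rotating the standard basis, and the two rotation angles are arbitrary choices.

```python
import math


def mu(chi, psi):
    """|<chi|psi>|^2 for real unit vectors, i.e. cos^2 of the angle between them."""
    return sum(c * p for c, p in zip(chi, psi)) ** 2


def rotated_triad(alpha, beta):
    """Orthonormal triad: the rows of the rotation matrix Rx(beta) * Rz(alpha)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    return [
        (ca, -sa, 0.0),
        (cb * sa, cb * ca, -sb),
        (sb * sa, sb * ca, cb),
    ]


psi = (0.0, 0.0, 1.0)                        # the 'north pole'
triad = rotated_triad(0.4, 1.1)              # arbitrary illustrative angles
values = [mu(chi, psi) for chi in triad]
total = sum(values)                          # equals 1 up to rounding
```

The three values vary smoothly with the rotation angles, in contrast to the two-valued assignment (V. 28).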

◃ Remarks

First, illustrations of Kochen and Specker’s theorem are easy to find for Hilbert spaces of dimension

larger than 3, for example 8, in which case a handful of quantities suffices, see Mermin (1993). We

will come back to that in section VII. 6. Second, when restricted to rational angles between spin

vectors, no contradiction with quantum mechanics can be obtained, as D.A. Meyer (1999) proved. ▹

V. 3. 1 SUMMARY

According to Kochen and Specker's theorem, a HVT satisfying the state postulate and the observables postulate, p. 111 (i) and (ii), together with the function rule (V. 21), contradicts the state postulate and the observables postulate of quantum mechanics if dim H > 2, although for Hilbert spaces with dim H ≤ 2 such a HVT is possible. This conclusion shows how stringent the vector space structure of quantum mechanics is; in particular, the fact that there are many different decompositions of unity forms a heavy barrier for a HVT.

V. 4 CONTEXTUAL HIDDEN VARIABLES

Essential for Kochen and Specker's proof is the fact that a 1-dimensional projector can be part of several decompositions of unity. This is possible as long as the projectors are not maximal, i.e., if dim H > 2. The existence of degenerate projectors, apart from unity, is essential for the proof of Kochen and Specker, and for this reason it does not hold in a 2-dimensional H where all projectors, except 𝟙, are maximal. By means of degenerate projectors non-commuting operators also become connected to each other. By the requirement (V. 21) this is transferred to the quantities of the HVT, so

that via a detour we still impose a requirement for non - commeasurable quantities on the HVT. We

will consider this in detail now.

Suppose that operator A commutes with the maximal operators C 1 and C 2 , while [C 1 , C 2 ] ≠ 0.

Then we have

A = f (C 1 ) and A = g(C 2 ), (V. 34)

which implies

f (C 1 ) = g(C 2 ), (V. 35)

and we see that A is degenerate. Function rule (V. 21) leads to the same relation between the quantities of the HVT,

A[λ] = f ( C 1 [λ] ) and A[λ] = g ( C 2 [λ] ) , (V. 36)

yielding

f ( C 1 [λ] ) = g ( C 2 [λ] ) . (V. 37)

Again, this is a relation between the value assignments to quantities which do not commute in quantum

mechanics, but the relation is not one - to - one, the functions f and g are not bijective.

It may be supposed that such a requirement is unreasonable because such quantities are not

commeasurable. In other words, the structure of quantum mechanics, and particularly the proposition

that an operator can be a function of two non - commuting maximal operators, leads to relations

between quantities which cannot be measured in one single experiment.

The following is what occurs at the different decompositions of unity. Consider two bases, {|α j ⟩}

and {|β j ⟩}, in a Hilbert space H of dimension N > 2 and suppose that |α 1 ⟩ = |β 1 ⟩, while all other

basis vectors are different. Then we have

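\[ \sum_{j=1}^{N} P_{|\alpha_j\rangle} = \mathbb{1} = \sum_{j=1}^{N} P_{|\beta_j\rangle} \quad \text{and} \quad P_{|\alpha_1\rangle} = P_{|\beta_1\rangle}. \tag{V. 38} \]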

Define, as follows, two maximal operators with all coefficients c j and d j distinct,

\[ C := \sum_{j=1}^{N} c_j P_{|\alpha_j\rangle} \quad \text{and} \quad D := \sum_{j=1}^{N} d_j P_{|\beta_j\rangle}, \tag{V. 39} \]

then it follows that

P |α1 ⟩ = f (C) = g(D). (V. 40)

This leads to a connection between the non - commuting operators C and D, and using (V. 21)

this leads to a connection between the corresponding representations C[λ] and D[λ] in the HVT. It is

this type of relations which the HVT cannot satisfy.
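A concrete instance of (V. 38) to (V. 40) may help. The Python sketch below is an illustration under assumptions of our own: a real 3-dimensional space, the standard basis for {|α j ⟩}, a second basis obtained by rotating the last two basis vectors over 45°, and arbitrarily chosen distinct coefficients. The functions f and g are realized as interpolation polynomials that take the value 1 on the eigenvalue belonging to |α 1 ⟩ and 0 on the other eigenvalues.

```python
from fractions import Fraction as F

N = 3
IDENTITY = [[F(int(i == j)) for j in range(N)] for i in range(N)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(N)] for i in range(N)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]


def spectral_indicator(op, wanted, others):
    """Polynomial in op equal to 1 on the eigenvalue `wanted` and 0 on `others`."""
    result = IDENTITY
    for c in others:
        factor = scale(F(1) / (wanted - c), add(op, scale(-c, IDENTITY)))
        result = matmul(result, factor)
    return result


h = F(1, 2)
P_alpha = [[[1, 0, 0], [0, 0, 0], [0, 0, 0]],       # |alpha_1> = e1
           [[0, 0, 0], [0, 1, 0], [0, 0, 0]],       # |alpha_2> = e2
           [[0, 0, 0], [0, 0, 0], [0, 0, 1]]]       # |alpha_3> = e3
P_beta = [P_alpha[0],                                # |beta_1> = |alpha_1>
          [[0, 0, 0], [0, h, h], [0, h, h]],        # |beta_2> = (e2 + e3)/sqrt(2)
          [[0, 0, 0], [0, h, -h], [0, -h, h]]]      # |beta_3> = (e2 - e3)/sqrt(2)

c_values = [F(1), F(2), F(3)]                        # distinct, so C is maximal
d_values = [F(5), F(7), F(11)]                       # distinct, so D is maximal
C = [[F(0)] * N for _ in range(N)]
D = [[F(0)] * N for _ in range(N)]
for c, d, pa, pb in zip(c_values, d_values, P_alpha, P_beta):
    C = add(C, scale(c, pa))
    D = add(D, scale(d, pb))

f_of_C = spectral_indicator(C, c_values[0], c_values[1:])
g_of_D = spectral_indicator(D, d_values[0], d_values[1:])
commutator = add(matmul(C, D), scale(-1, matmul(D, C)))
```

In this example both f(C) and g(D) reproduce the projector on |α 1 ⟩, while the commutator of C and D does not vanish. Exact rational arithmetic is used so that the matrix equalities hold literally.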

◃ Remark

Notice that the occurrence of non-maximal operators P |αi ⟩ is indeed essential: if P |αi ⟩ were maximal, C and D would commute, as we saw in section II. 4 on p. 30. M.J. Maczynski (1971) has proved that if we exclusively consider maximal quantities, and therefore apply (V. 21) to maximal quantities only, Kochen and Specker's theorem is no longer valid, and in that case a HVT is possible. ▹

An obvious expedient is to strictly constrain requirement (V. 21) to quantities which are measurable

within one context. In our example the projector P |α1 ⟩ is commeasurable with both C and D,

while mutually C and D are not commeasurable. Therefore, we have to distinguish between a value

assignment P |αi ⟩[λ] within the context of a measurement of C, and one within the context of a measurement

of D. We can think, for example, of a measurement of C and application of the function relation

P |α1 ⟩ = f(C), or of a measurement of D and application of the function relation P |α1 ⟩ = g(D).

More generally, suppose

A = f (C) = g(D) where [C, D] ≠ 0. (V. 41)

Then we distinguish the hidden variable quantities A C [λ] and A D [λ], where the index indicates the

context of measurement. If C and D do not commute there is, according to a contextual HVT, no

reason to assume that for all λ ∈ Λ it holds that

A C [λ] = A D [λ], (V. 42)

as is the case in every HVT we have considered so far.

Kochen and Specker do assume (V. 42), however, and find a contradiction with quantum mechanics.

The remedy is therefore to 'split up' all degenerate quantities by addition of the context in which they are measured, as was first proposed by B.C. van Fraassen (1973). For the sake of convenience we here assume that a measurement of a degenerate quantity always proceeds by means of the measurement of a maximal quantity, which does not have to be split up. By definition we then have

A C [λ] = f ( C [λ] ) and A D [λ] = g ( D[λ] ) . (V. 43)

This yields a weaker form of (V. 21). Suppose A = f (C), B = g(C) and A = h(B) = h(g(C)),

then using (V. 43) we have

A C [λ] = h ( B C [λ] ) . (V. 44)

This consideration leads to a new postulate for a HVT, which, in case the HVT accommodates this

postulate, we call contextual.

CONTEXTUAL OBSERVABLES POSTULATE:

If A is a physical quantity which can be taken as a function of at least two other physical

quantities, for example A = f (C) and A = g (D), then, in the HVT, to A corresponds

a function A C : Λ → R iff quantity C is measured, and a function A D : Λ → R iff

quantity D is measured. If A, f(C) and g(D) are the corresponding quantum mechanical

operators, the following applies,

∀ λ ∈ Λ : A C [λ] = A D [λ] ⇐⇒ [C, D] = 0. (V. 45)

Although splitting up quantities is a natural consequence of the idea of commeasurability, it means

giving up a one - to - one relation between the quantities of quantum mechanics and those of the HVT in

a very drastic manner; since the operator P |α1 ⟩ is part of infinitely many decompositions of unity, there

are infinitely many contexts in which P |α1 ⟩ can be measured.

The idea that the context of the measurement must be taken into consideration can already be found in Bell (1966). In this article, which was actually written earlier than his famous article with the Bell inequality, Bell makes some observations concerning the requirements which could be imposed on a contextual HVT. They have to have a spatial meaning and enable us to interpolate a space-time picture, preferably causally, between the preparation and the measurement of states.

He then considers Bohm's theory of the quantum potential, see chapter VI, and shows that this theory is not local. He wonders whether every HVT of quantum mechanics must have this non-local character (Bell 1966, p. 452),

However, it must be stressed that, to the present writer’s knowledge, there is no proof that

any hidden variable account of quantum mechanics must have this extraordinary character.

It would therefore be interesting, perhaps, to pursue some further “impossibility

proofs,” replacing the arbitrary axioms objected to above by some condition of locality,

or of separability of distant systems.

Meanwhile, still before the delayed publication of his article, Bell (1964) himself had found such a

proof.

Now we will show how the idea of locality can be brought to expression in a contextual HVT with

'split' quantities. Consider a composite system with Hilbert space H = H I ⊗ H II and an operator of the form A ⊗ 𝟙 where A is maximal in H I . Then the operator A ⊗ 𝟙 is not maximal in H, and

A ⊗ 𝟙 = f (X), (V. 46)

where X is some maximal operator on H. In particular, consider an X of the form

X = X I ⊗ X II . (V. 47)

Suppose there is no interaction, or no longer any, between the systems I and II. Then we can raise the question whether X II must be taken to belong to the context of A ⊗ 𝟙.

Consider a second maximal operator

Y = X I ⊗ Y II (V. 48)

which only differs from X in the last factor. We then have

A ⊗ 𝟙 = f (X) = g(Y ). (V. 49)

A requirement of locality is now that

\[ (A \otimes \mathbb{1})_{X_{\mathrm{I}} \otimes X_{\mathrm{II}}}[\lambda] = (A \otimes \mathbb{1})_{X_{\mathrm{I}} \otimes Y_{\mathrm{II}}}[\lambda], \tag{V. 50} \]

in other words, a change in what is measured on system II does not result in a splitting of quantities of system I. A contextual HVT satisfying (V. 50) is called local.

The key question is if a local contextual HVT is compatible with quantum mechanics. As an

example we consider Bohm’s version of the thought experiment of EPR (Cooke and Hilgevoord 1979);

two spin 1/2 particles being in a singlet state. Measurements of the spin of each of the particles

correspond to operators of the form σ i ⊗ τ j , where σ i is the operator of the component of the spin of

the first particle in the direction i and τ j is, likewise, the operator for the second particle. In contrast

to the previously considered operators of the form X I ⊗ X II , the operators σ i ⊗ τ j are not maximal.

Let us consider three directions, i, j ∈ {1, 2, 3}, which means there are nine such measurements.

The result of a measurement of spin is either up or down, and consequently every measurement has

four possible outcomes. If we introduce a quantity in the HVT for each of the nine quantities, we can,

as we saw, reproduce the quantum mechanical predictions. Between the operators the relation

σ i ⊗ τ j = (σ i ⊗ 𝟙) (𝟙 ⊗ τ j ), with i, j ∈ {1, 2, 3} (V. 51)

holds. Now we also have to introduce quantities in the HVT for the six operators σ i ⊗ 𝟙 and 𝟙 ⊗ τ j .

In an autonomous HVT the quantities must also satisfy (V. 51), because the factors on the right -

hand side of (V. 51) commute. This means that there are only six independent quantities in the

HVT and it can be shown that with this the experimental predictions of quantum mechanics can not

be reproduced, see Wigner’s derivation in VII. 3.
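The operator statements used here can be verified directly. The following Python sketch assumes the standard Pauli matrices as representation of σ i and τ j (the notes do not fix a representation at this point) and checks three things for all nine pairs (i, j): relation (V. 51), the fact that its two factors commute, and the fact that σ i ⊗ τ j squares to the identity and has trace zero. The last two properties mean that its eigenvalues +1 and −1 each occur twice, so that the operator is indeed not maximal.

```python
def kron(a, b):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[a[i // 2][j // 2] * b[i % 2][j % 2] for j in range(4)] for i in range(4)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


I2 = [[1, 0], [0, 1]]
PAULI = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}


def check_relation(i, j):
    left = kron(PAULI[i], I2)                # sigma_i (x) 1
    right = kron(I2, PAULI[j])               # 1 (x) tau_j
    both = kron(PAULI[i], PAULI[j])          # sigma_i (x) tau_j
    factorizes = matmul(left, right) == both
    factors_commute = matmul(left, right) == matmul(right, left)
    degenerate = (matmul(both, both) == kron(I2, I2)
                  and sum(both[k][k] for k in range(4)) == 0)
    return factorizes, factors_commute, degenerate


results = {(i, j): check_relation(i, j) for i in PAULI for j in PAULI}
```

All entries are small integers or multiples of the imaginary unit, so the comparisons are exact.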

In a contextual HVT however, we consider the quantities σ i ⊗ 𝟙 and 𝟙 ⊗ τ j to be dependent on the context of the operators of which they are functions. Let χ(τ j ) be a function which assigns the value 1 to the outcome of every spin measurement τ j ,

χ(τ j ) = 𝟙, with j ∈ {1, 2, 3}. (V. 52)

We then have

\[ (\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] = (\sigma_i \otimes \chi(\tau_j))[\lambda]. \tag{V. 53} \]

This quantity represents the spin of particle 1 within the context of a measurement of σ i ⊗ τ j ,

which is a measurement of both spins followed by multiplication of the results. Since j ∈ {1, 2, 3},

this gives a 3-fold splitting of the quantity σ i ⊗ 𝟙. The product rule now only applies to quantities

in the same context, and the validity is trivial in this case. There are enough independent quantities in

the HVT again to be able to reproduce quantum mechanics. The splitting worked out.

But at the same time we see the price we have to pay; the splitting does not satisfy the weak

requirement of locality (V. 50), because for j ≠ j ′ we make a distinction between the quantities

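\[ (\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] \quad \text{and} \quad (\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_{j'}}[\lambda]. \tag{V. 54} \]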

This means that properties, quantities having values, of the one particle can no longer be specified

independent of those of the other particle, even if there is no interaction between these particles and

they are located in different galaxies. Redhead (1987, p. 135) speaks of an ontological contextuality.

The conclusion is that a contextual HVT has to be non - local to be compatible with quantum

mechanics.

◃ Remark

Notice that we did not speak of a measurement of the quantity σ i ⊗ 𝟙. We have invariably seen

this as being derived from the measurement of an operator of which it is a function. In this way the maximal operators eventually acquire a special status: they are not split up and they are the only operators which can be measured directly. This can be assumed theoretically, but the relation with experimental practice in the laboratory, where almost exclusively degenerate quantities are measured, is less clear. ▹

VI

BOHMIAN MECHANICS

My suggestion is that at each state the proper order of operation of the mind requires

an overall grasp of what is generally known, not only in formal, logical, mathematical

terms, but also intuitively, in images, feelings, poetic usage of language, etc.

— David Bohm

But why then had Born not told me of this “pilot wave?” If only to point out what was

wrong with it? [. . . ] Why is the pilot wave picture ignored in text books? Should it not be

taught, not as the only way, but as an antidote to the prevailing complacency? To show

that vagueness, subjectivity, and indeterminism, are not forced on us by experimental

facts, but by deliberate theoretical choice?

— John Bell

We briefly describe Bohm's hidden variables theory, which we will call Bohmian mechanics. Bohmian mechanics seems to have the same empirical strength as quantum mechanics, but succeeds in providing a picture in space and time of what exactly takes place in micro-physical reality.

VI. 1 INTRODUCTION

The debate between Bohr and Einstein concerning the interpretation of quantum mechanics

reached its peak in the 1935 EPR - article. Although both authors frequently returned to the problems,

neither of them afterwards introduced new elements into his point of view. For most physicists in the nineteen thirties and later it was not difficult to declare a winner of the debate: Bohr's view was accepted nearly unanimously. The question whether a physical reality hides behind quantum mechanics, consisting of objects having properties and of which we can form a picture in space and time, was put aside. It was also thought that Von Neumann's proof, as discussed in V. 2, p. 114, made a hidden variables reconstruction of quantum mechanics untenable.

It is the merit of Bohm to have made a breach in the Copenhagen interpretation for the first time, by doing exactly what was impossible or meaningless according to the Copenhageners. In 1952 he published two articles in which he presented a HVT of quantum mechanics. In the second article he describes the breach as follows (Bohm 1952 part II, p. 188):

The usual interpretation of the quantum theory implies that we must renounce the possibility

of describing an individual system in terms of a single precisely defined conceptual

model. We have, however, proposed an alternative interpretation which does not imply

such a renunciation, but which instead leads us to regard a quantum - mechanical system

as a synthesis of a precisely definable particle and a precisely definable ψ - field which

exerts a force on this particle.

Bohm's theory is strongly related to ideas which Louis de Broglie had already put forward at the Solvay Conference in 1927. However, criticism from the Copenhageners at the conference, especially as expressed by Pauli, made de Broglie abandon his theory, which indeed had not been developed completely and consistently. Bohm devised, independently of de Broglie, a fully elaborated version, which brought about de Broglie's reconversion.

We will study Bohm’s theory because it is an example of a concrete HVT, in contrast to the abstract

characterization of such theories which we discussed in the previous chapter. We will see that Bohm’s

theory shows remarkable aspects which differ thoroughly from classical physics.

VI. 2 THE QUANTUM POTENTIAL

Bohm's theory, which we will call Bohmian mechanics, starts from wave mechanics, i.e. quantum mechanics with L²(ℝⁿ) as its Hilbert space, but without the projection postulate.¹ This means that Bohm assumes that there is a wave function ψ(⃗q, t) which always satisfies the Schrödinger equation. First we consider the 1-particle case; if there are more particles, ψ has more arguments.

The idea is to interpret this wave function as a statistical description of a particle which always has

a certain position and momentum. We will see that this particle must then be subjected to dynamics

which differs from classical dynamics, by assuming that the forces acting on the particle are not

exclusively the forces known from classical physics.

The basic assumption is the Schrödinger equation for a particle with mass m in a time-independent potential V (⃗q),

\[ i\hbar\,\frac{\partial \psi(\vec q, t)}{\partial t} = -\frac{\hbar^2}{2m}\,\nabla^2 \psi(\vec q, t) + V(\vec q)\,\psi(\vec q, t), \tag{VI. 1} \]

but we will interpret the wave function differently from its usual interpretation in quantum mechanics.

To this end, we rewrite ψ, with the help of two real functions R, S : ℝ⁴ → ℝ, as

\[ \psi(\vec q, t) = R(\vec q, t)\, e^{i S(\vec q, t)/\hbar}. \tag{VI. 2} \]

It is always possible to find such functions R and S. Requiring R(⃗q, t) ≥ 0, R and S are, at given ψ, uniquely defined, except where ψ = 0. Substitution of (VI. 2) in (VI. 1), and separating the real and imaginary parts of the resulting equation, leads to two equations,

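\[ \frac{\partial R(\vec q, t)}{\partial t} = -\frac{1}{2m}\Big( R(\vec q, t)\,\nabla^2 S(\vec q, t) + 2\,\nabla R(\vec q, t)\cdot\nabla S(\vec q, t) \Big), \tag{VI. 3} \]

\[ \frac{\partial S(\vec q, t)}{\partial t} = -\frac{\big(\nabla S(\vec q, t)\big)^2}{2m} - V(\vec q) + \frac{\hbar^2}{2m}\,\frac{\nabla^2 R(\vec q, t)}{R(\vec q, t)}. \tag{VI. 4} \]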
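The intermediate steps leading to (VI. 3) and (VI. 4) can be sketched as follows. The sketch assumes the polar form (VI. 2) with S having the dimension of an action, so that the exponent reads iS/ħ, and it suppresses the arguments (⃗q, t).

```latex
% Time derivative and Laplacian of psi = R exp(iS/hbar):
\begin{align*}
i\hbar\,\partial_t \psi &= \left( i\hbar\,\partial_t R - R\,\partial_t S \right) e^{iS/\hbar}, \\
\nabla^2 \psi &= \left( \nabla^2 R
  + \frac{i}{\hbar}\left( 2\,\nabla R\cdot\nabla S + R\,\nabla^2 S \right)
  - \frac{1}{\hbar^2}\,R\,(\nabla S)^2 \right) e^{iS/\hbar}.
\end{align*}
% Insert both into (VI. 1), divide by exp(iS/hbar), and separate:
\begin{align*}
\text{imaginary part:}\quad \hbar\,\partial_t R
  &= -\frac{\hbar}{2m}\left( 2\,\nabla R\cdot\nabla S + R\,\nabla^2 S \right), \\
\text{real part:}\quad -R\,\partial_t S
  &= -\frac{\hbar^2}{2m}\,\nabla^2 R + \frac{R\,(\nabla S)^2}{2m} + V R .
\end{align*}
```

Dividing the imaginary part by ħ gives (VI. 3), and dividing the real part by −R, which is allowed wherever ψ ≠ 0, gives (VI. 4).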

¹ In the literature, 'Bohmian mechanics' is usually understood to be a 'streamlined' version of Bohm's original theory, without a quantum potential.

First we consider equation (VI. 3). Using the abbreviation ρ = R², this equation becomes, after multiplication by 2R,

\[ \frac{\partial \rho(\vec q, t)}{\partial t} + \nabla\cdot\left( \rho(\vec q, t)\,\frac{\nabla S(\vec q, t)}{m} \right) = 0, \tag{VI. 5} \]

where ρ = R² is equal to |ψ|², the quantum mechanical probability density for finding a particle at a certain position, which leads to the interpretation of ρ(⃗q, t) as the probability density to find the particle at time t at position ⃗q ∈ ℝ³. If we now interpret ∇S (⃗q, t) as the momentum of the particle, ∇S = ⃗p = m⃗v, (VI. 5) acquires a clear meaning: it is the continuity equation for a probability density ρ, which expresses that the total probability, given by the integral of ρ(⃗q, t) over ℝ³, is constant in time.

Now consider equation (VI. 4). The last term in this equation is the only term of both (VI. 3)

and (VI. 4) in which Planck’s constant appears explicitly. For this term we define the so - called

quantum potential,

\[ U(\vec q, t) := -\frac{\hbar^2}{2m}\,\frac{\nabla^2 R(\vec q, t)}{R(\vec q, t)}. \tag{VI. 6} \]

If the quantum potential U were equal to 0, equation (VI. 4) would read

\[ \frac{\partial S(\vec q, t)}{\partial t} = -\frac{\big(\nabla S(\vec q, t)\big)^2}{2m} - V(\vec q), \tag{VI. 7} \]

which is exactly the classical Hamilton - Jacobi equation for one particle. In (VI. 7), S is called the

action, and ∇S is, as mentioned above, the momentum of the particle. In other words, if U = 0,

we can interpret equations (VI. 3) and (VI. 4), and therefore also the equivalent Schrödinger equation

(VI. 1), as the statistical description of a particle moving in a potential V in accordance with the

laws of classical mechanics. We will discuss (VI. 7) more elaborately in section VI. 5, thereby also

motivating the interpretation of ∇S.

If the quantum potential U is not equal to 0, the interpretation just discussed can still be given if we assume that, next to the classical potential V , the quantum potential U is added as a correction to the equation of motion. The momentum is still given by ⃗p = ∇S, and (VI. 5) remains a continuity equation. However, (VI. 7) is replaced by (VI. 4), the Hamilton-Jacobi equation for a particle in the potential field V + U. We see that we have now adopted, besides the well-known −∇V , an extra force which acts on the particle,

\[ \vec F(\vec q, t) = \frac{d\vec p(\vec q, t)}{dt} = -\nabla\big( V(\vec q) + U(\vec q, t) \big). \tag{VI. 8} \]

If the limit ħ → 0 is taken in the Schrödinger equation, (VI. 1), the result is nonsense, but if ħ → 0 is taken in the definition (VI. 6) of the quantum potential U, we have U (⃗q, t) = 0, and (VI. 8) reduces to Newton's law of motion.

We will now discuss a simple example to illustrate the difference between Bohmian mechanics

and quantum mechanics.

EXAMPLE

A particle sits in a 1 - dimensional ‘box’ of length L, having walls which are formed by infinitely

high potential barriers. Quantum mechanics gives as stationary solutions

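\[ \psi_n(q, t) = \psi_n(q)\, e^{-i E_n t/\hbar}, \tag{VI. 9} \]

with

\[ \psi_n(q) = \sqrt{\frac{2}{L}}\,\sin\left(\frac{n\pi q}{L}\right), \quad q \in [0, L], \tag{VI. 10} \]

and energy values

\[ E_n = \frac{\hbar^2}{2m}\left(\frac{n\pi}{L}\right)^2. \tag{VI. 11} \]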

Therefore, in Bohmian mechanics for a stationary state we have

R n (q, t) = ψ n (q) and S n (q, t) = − E n t. (VI. 12)

Now it is surprising that in this example it holds that

\[ p = \frac{\partial S_n}{\partial q} = \frac{\partial(-E_n t)}{\partial q} = 0, \tag{VI. 13} \]

i.e., according to Bohmian mechanics the particle is motionless. This also applies to other cases of stationary states, for example to the ground state of the hydrogen atom. It is in direct contradiction with the statements of quantum mechanics. After all, in the case of the box quantum mechanics assigns, if the particle is in the state ψ n , a large probability to finding the momentum p with values around ±nπħ/L, in which case the particle moves with speed |p|/m > 0, although the quantum mechanical expectation value of p is zero for the particle in the box.

This example shows that the statements of quantum mechanics and Bohmian mechanics do not

coincide for all quantities. They only correspond concerning probability distributions for position

measurements. Bohmian mechanics is, therefore, not a HVT in the sense of chapter V, where it was

assumed that the statements of such a theory are similar to the statements of quantum mechanics for

all quantities. Von Neumann’s impossibility proof is therefore not applicable to Bohmian mechanics.

The explanation of the discrepancy between Bohmian mechanics and quantum mechanics lies, of

course, in the use of the quantum potential. According to Bohm, the energy of the particle in the box

has been entirely stored in the form of potential energy as a result of the quantum potential, hence,

the particle has no kinetic energy.
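This statement can be made quantitative: for the stationary states (VI. 10), the quantum potential (VI. 6) is constant inside the box and equal to the energy E n of (VI. 11). The following Python sketch checks this numerically. It assumes units in which ħ = m = L = 1 and approximates the second derivative by a central difference; the function names and the step size are illustrative choices.

```python
import math

HBAR = MASS = LENGTH = 1.0                   # assumed units, for illustration only


def energy(n):
    """Energy E_n of (VI. 11)."""
    return (HBAR ** 2 / (2 * MASS)) * (n * math.pi / LENGTH) ** 2


def amplitude(n, q):
    """R_n(q) = psi_n(q) of (VI. 10) and (VI. 12)."""
    return math.sqrt(2 / LENGTH) * math.sin(n * math.pi * q / LENGTH)


def quantum_potential(n, q, h=1e-4):
    """U of (VI. 6) in one dimension, with R'' from a central difference."""
    second = (amplitude(n, q + h) - 2 * amplitude(n, q) + amplitude(n, q - h)) / h ** 2
    return -(HBAR ** 2 / (2 * MASS)) * second / amplitude(n, q)


# Compare U with E_n at a few interior points away from the nodes of psi_n.
samples = [(1, 0.3), (2, 0.3), (3, 0.45)]
relative_errors = [abs(quantum_potential(n, q) - energy(n)) / energy(n) for n, q in samples]
```

The relative deviations are of the size expected from the finite-difference approximation. At the nodes of ψ n the ratio in (VI. 6) is undefined, which is why the sample points avoid them.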

This changes however as soon as we open the box by removing one or both barriers. The quantum

potential energy is again released, and the particle will start to move. The wave packet ψ(⃗q, t) then

spreads out in space, in exactly the same way as prescribed by the Schrödinger equation, and there

is no difference anymore between the statements of both theories concerning the movement of the

particle.

The discrepancy between Bohmian mechanics and quantum mechanics has no perceptible consequences

if we argue that all measurements are ultimately made by means of observation of position.

Every physical quantity is eventually determined by a ‘pointer’ with a certain position, and a momentum

measurement must eventually be registered by means of the displacement of some object.

◃ Remark

Notice that Bohm’s point of view deviates from that of Bohr, which says that position and momentum

measurements exclude each other in principle but are both necessary to be able to give an exhaustive

description of the system. ▹

Figure VI. 1: The quantum potential for the two slit system as viewed from the screen, under assumption

of a Gaussian distribution at the slits (Bohm 1989 )

Finally we consider a special case. Suppose that A, B ⊂ ℝ³ are disjoint regions in space, i.e. A ∩ B = ∅, that ψ A and ψ B are wave functions which are 0 outside these regions, and that the wave function has the following form,

ψ(⃗q) = a ψ A (⃗q) + b ψ B (⃗q), (VI. 14)

with a, b ∈ ℝ. Since ψ A and ψ B have no overlap, for all ⃗q ∈ ℝ³ it holds that

ψ A (⃗q) ψ B (⃗q) = 0. (VI. 15)

Therefore, the probability density belonging to (VI. 14) is

ρ(⃗q) = |a ψ A (⃗q)|² + |b ψ B (⃗q)|² , (VI. 16)

without a cross-term, and we see that the ensemble of particles described by the density |ψ (⃗q)|² behaves like a mixture.

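With

\[ S(\vec q) = \begin{cases} S_A(\vec q) & \text{for } \vec q \in A, \\ S_B(\vec q) & \text{for } \vec q \in B, \\ 0 & \text{elsewhere,} \end{cases} \tag{VI. 17} \]

and \( \psi_A(\vec q) = R_A(\vec q)\, e^{i S_A(\vec q)/\hbar} \), etc., (VI. 14) reads

\[ \psi(\vec q) = \big( a\,R_A(\vec q) + b\,R_B(\vec q) \big)\, e^{i S(\vec q)/\hbar}, \tag{VI. 18} \]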

which means that also the quantum potential, as depicted in figure VI. 1, can now be taken as a sum

of terms belonging to separate areas. The particles in area A do not perceive the wave function in

area B at all.

Figure VI. 2: A simulation of the double slit experiment in Bohmian mechanics. Each particle follows

a certain path between the slits and the photographic plate. All particles coming from the upper slit

arrive at the upper half of the photographic plate, likewise for the lower slit and lower half of the

plate. The twists in the paths are caused by the quantum potential U. (Vigier et al. 1987 )

VI. 3 COMPOSITE SYSTEMS

The technique used to rewrite the Schrödinger equation into equations describing particles with

definite position and momentum in a non - classical potential field, can easily be generalized. For

example, for a system of two particles, represented by the wave function ψ (⃗q 1 , ⃗q 2 , t), we interpret |ψ(⃗q 1 , ⃗q 2 , t)|² as the probability density that, simultaneously, particle 1 is located at position ⃗q 1 and particle 2 at position ⃗q 2 .

We write

\[ \psi(\vec q_1, \vec q_2, t) = R(\vec q_1, \vec q_2, t)\, e^{i S(\vec q_1, \vec q_2, t)/\hbar}, \tag{VI. 19} \]

and the quantum potential is now given by

\[ U(\vec q_1, \vec q_2, t) = -\frac{\hbar^2}{R(\vec q_1, \vec q_2, t)} \left( \frac{\nabla_1^2 R(\vec q_1, \vec q_2, t)}{2 m_1} + \frac{\nabla_2^2 R(\vec q_1, \vec q_2, t)}{2 m_2} \right), \tag{VI. 20} \]

where ∇ i := ∂ /∂⃗q i is the gradient with respect to the coordinates of particle i. In this expression the coordinates of both particles occur. Therefore, the force on particle 1, ⃗F 1 = −∇ 1 (V + U), also depends, by means of the quantum potential, on the position of particle 2, and vice versa. This can be compared to the situation in Newton's gravitation theory, where such a dependence appears in the classical potential V ; there is an instantaneous interaction (Latin: actio in distans) between particles: a choice of another initial position of one particle immediately influences the dynamics of the other.

Notice, however, that in Bohmian mechanics this influence does not have to decrease with the

distance between the particles. Even if R(⃗q 1 , ⃗q 2 , t) goes to 0 for ∥⃗q 1 − ⃗q 2 ∥ → ∞, the quantum

potential U(⃗q 1 , ⃗q 2 , t) does not need to do so: it depends on the second derivatives of R divided by R itself, which means that

it depends on how strongly R oscillates, not on its amplitude.

Also notice that the mutual dependence between the particles does not only appear by means of

the quantum potential. The momentum of particle 1, given by ∇ 1 S(⃗q 1 , ⃗q 2 , t), cannot be chosen independently

of the position of particle 2, and vice versa. This does not even happen in a classical theory

with an actio in distans, and it gives Bohmian mechanics a deeply ‘holistic’ character.

This mutual dependence disappears only when the total wave function is a product,

ψ(⃗q 1 , ⃗q 2 , t) = ψ 1 (⃗q 1 , t) ψ 2 (⃗q 2 , t), (VI. 21)

yielding

R(⃗q 1 , ⃗q 2 , t) = R 1 (⃗q 1 , t) R 2 (⃗q 2 , t),

S(⃗q 1 , ⃗q 2 , t) = S 1 (⃗q 1 , t) + S 2 (⃗q 2 , t) (VI. 22)

and, consequently, (VI. 20) becomes

U (⃗q 1 , ⃗q 2 , t) = U 1 (⃗q 1 , t) + U 2 (⃗q 2 , t). (VI. 23)

Each particle only feels its own potential field, and its momentum does not depend on the position

of the other particle. If now the classical potential V is also a sum of 1 - particle potentials, this

factorizability is preserved in time.
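The additivity (VI. 23) can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes one spatial dimension per particle, units with ħ = 1, and two hypothetical amplitudes (a Gaussian product and a correlated Gaussian) chosen purely for illustration. Derivatives are approximated by central finite differences.

```python
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def R1(x):
    """Hypothetical one-particle amplitude for particle 1."""
    return math.exp(-x * x / 2.0)


def R2(y):
    """Hypothetical one-particle amplitude for particle 2."""
    return math.exp(-((y - 1.0) ** 2) / 4.0)


def R_product(x, y):
    """Amplitude of a product state, as in (VI. 22)."""
    return R1(x) * R2(y)


def R_entangled(x, y):
    """Amplitude that does not factorize (illustrative choice)."""
    return math.exp(-(x * x + y * y + x * y) / 2.0)


def U_single(R, q, m, h=1e-3):
    """One-particle quantum potential -hbar^2 R'' / (2 m R)."""
    d2 = (R(q + h) - 2.0 * R(q) + R(q - h)) / (h * h)
    return -HBAR**2 * d2 / (2.0 * m * R(q))


def U_pair(R, x, y, m1, m2, h=1e-3):
    """Two-particle quantum potential in the form of (VI. 20)."""
    d2x = (R(x + h, y) - 2.0 * R(x, y) + R(x - h, y)) / (h * h)
    d2y = (R(x, y + h) - 2.0 * R(x, y) + R(x, y - h)) / (h * h)
    return -HBAR**2 * (d2x / (2.0 * m1) + d2y / (2.0 * m2)) / R(x, y)


def mixed_difference(R, m1, m2, x1=0.0, x2=1.0, y1=0.0, y2=1.0):
    """Vanishes whenever U(x, y) is a sum U1(x) + U2(y)."""
    return (U_pair(R, x1, y1, m1, m2) + U_pair(R, x2, y2, m1, m2)
            - U_pair(R, x1, y2, m1, m2) - U_pair(R, x2, y1, m1, m2))


if __name__ == "__main__":
    x, y, m1, m2 = 0.3, -0.7, 1.0, 2.0
    print(U_pair(R_product, x, y, m1, m2),
          U_single(R1, x, m1) + U_single(R2, y, m2))
    print(mixed_difference(R_product, 1.0, 1.0),
          mixed_difference(R_entangled, 1.0, 1.0))
```

For the product amplitude the two printed potentials agree and the mixed difference vanishes up to discretization error. For the correlated amplitude the mixed difference is clearly non-zero, so the potential felt by one particle depends on where the other one is.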

We know, however, that the wave function ψ (⃗q 1 , ⃗q 2 , t) is in general not a product

state, and even if it is a product state at some moment, it will generally not remain one. We

must therefore conclude that the quantum potential U represents a non - local connection between the

particles.


◃ Remark

For Bell, this observation was a reason to examine if quantum mechanical HVT’s can, in fact, be local

at all. We will come back to this in chapter VII. ▹

An intermediate form occurs if A, B, C, D ⊂ R 3 are certain areas in space, such that A ∩ C = ∅

or B ∩ D = ∅, ψ A , ψ C , ϕ B , ϕ D are wave functions which are 0 outside these areas, and the wave

function is, analogously to (VI. 14), of the form

ψ(⃗q 1 , ⃗q 2 ) = a ψ A (⃗q 1 ) ϕ B (⃗q 2 ) + b ψ C (⃗q 1 ) ϕ D (⃗q 2 ), (VI. 24)

with a, b ∈ R. Since the pair ψ A and ψ C , or the pair ϕ B and ϕ D , or both, have no overlap, for

all ⃗q 1 , ⃗q 2 ∈ R 3 we have

ψ A (⃗q 1 ) ψ C (⃗q 1 ) = 0 or ϕ B (⃗q 2 ) ϕ D (⃗q 2 ) = 0. (VI. 25)

Therefore, the probability density belonging to (VI. 24) is

ρ(⃗q 1 , ⃗q 2 ) = R 2 (⃗q 1 , ⃗q 2 ) = |a ψ A (⃗q 1 ) ϕ B (⃗q 2 )| 2 + |b ψ C (⃗q 1 ) ϕ D (⃗q 2 )| 2 , (VI. 26)

without a cross - term, and we see that the ensemble, again analogously to (VI. 14), behaves like a

mixture. In this case we call the wave function ψ(⃗q 1 , ⃗q 2 ) effectively factorizable.

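With

$$S_{\mathrm{tot}}(\vec q_1, \vec q_2) = \begin{cases} S_A(\vec q_1) + S_B(\vec q_2) & \text{for } \vec q_1 \in A,\ \vec q_2 \in B, \\ S_C(\vec q_1) + S_D(\vec q_2) & \text{for } \vec q_1 \in C,\ \vec q_2 \in D, \\ 0 & \text{elsewhere,} \end{cases} \tag{VI. 27}$$

and $\psi_A(\vec q_1) = R_A(\vec q_1)\, e^{i S_A(\vec q_1)/\hbar}$, etc., because of (VI. 25) it holds that

$$\begin{aligned} \psi(\vec q_1, \vec q_2) &= a\, R_A(\vec q_1) R_B(\vec q_2)\, e^{i (S_A(\vec q_1) + S_B(\vec q_2))/\hbar} + b\, R_C(\vec q_1) R_D(\vec q_2)\, e^{i (S_C(\vec q_1) + S_D(\vec q_2))/\hbar} \\ &= \bigl( a\, R_A(\vec q_1) R_B(\vec q_2) + b\, R_C(\vec q_1) R_D(\vec q_2) \bigr)\, e^{i S_{\mathrm{tot}}(\vec q_1, \vec q_2)/\hbar}. \end{aligned} \tag{VI. 28}$$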

Therefore, also in case of composite systems, the quantum potential can be taken as a sum of terms

belonging to the separate particles, and the momentum of a particle does not depend on the other

particle.

Consequently, we can interpret the system as being composed of a pair of particles of which one

particle is in area A and the other in B, or, likewise, in area C and D. The pair of particles is not

influenced by the wave functions or the quantum potential in the other area. For this reason, these

pilot waves are also called empty waves. They have no dynamic influence on the particles, but they

do contain energy. If, at some later time, the wave functions overlap again, they will of course

also regain their influence.


VI. 4

REMARKS AND PROBLEMS

In Bohmian mechanics the wave function plays a double role. On the one hand, we see

that ρ(⃗q, t 0 ) = R 2 = |ψ (⃗q, t 0 )| 2 is equal to the probability density to find a particle at time t 0 at

a certain position, and we use this to characterize the ensemble at t 0 . On the other hand, ψ determines

the value of R, and thereby, by means of formula (VI. 6) or (VI. 20), also the quantum potential which

has the same status as the classical potential V . This means that ψ is also connected with the dynamic

evolution of particles.

This is strange if seen from a classical perspective. In classical statistical mechanics it is always

possible to specify the form of the probability density at t 0 independently of the dynamics. Inversely,

the force acting on a particle in a classical theory does not depend on the probabilities that the particle

would be at another position than it actually is. But we saw that in Bohmian mechanics the force does

depend on the probabilities. In Bohm’s interpretation we must therefore assume that if at an initial

time t 0 the quantum mechanical probability density is |ψ (⃗q, t 0 )| 2 , the particles subsequently move

under the influence of forces which are also determined by ψ(⃗q, t 0 ).

Nonetheless, it can be proved that if this pre - established harmony is valid at one moment in

time, it remains valid at all other times. In later work, Bohm speculated that this harmony between

the quantum potential and the probability density could possibly be understood as a requirement for

equilibrium of an underlying ‘sub - quantum aether’. From this idea the expectation arises that if this

equilibrium can be disrupted, it will take some time to be restored, so that deviations

from the quantum mechanical predictions can appear in very rapid measurements. Until now such

deviations have not been found.

Bohmian mechanics gives, on the basis of the thesis that, eventually, all measurements are position

measurements, the same empirically verifiable predictions as standard quantum mechanics does.

Moreover, it provides a picture in which particles have position and momentum and it can be visualized

how the particles move through space, even if there is no measurement. Also, Bohmian

mechanics is deterministic; the evolution is determined by classical mechanics, extended with the

quantum potential. Although these properties seem to be large advantages, Bohm’s proposal evoked

no enthusiasm in the nineteen fifties.

Of course, from the side of the Copenhageners little support was to be expected. The proposal was

dismissed as ‘metaphysical speculation’, a return to the lost paradise of classical physics. Bohm parried

this argument by calling the Copenhageners’ ‘completeness’ claim untestable and metaphysical.

But Einstein also found the idea ‘too cheap’ because it leaned too much on the quantum mechanical

formalism in combination with the classical idea of particles. Einstein himself thought that a

completely new theory with a totally different perspective was necessary, such as his unified field

theory. Probably, Einstein also had objections because of the far - reaching non - locality of Bohmian

mechanics.

Others stumbled at the fact that Bohmian mechanics only relies on a rewriting of the Schrödinger

equation, and contains nothing new. Bohm had foreseen this criticism and tried to argue that his theory

presents new ideas for experiments and that on distance and energy scales which are within range of

Heisenberg’s indeterminacy principle, Bohmian mechanics will prove to be necessary. But above all,

Bohm wanted to show the possibility of a HVT and to challenge the necessity of the Copenhagen

interpretation.


Bohmian mechanics has not led to new verifiable statements, although ‘tunneling times’ are debated,

about which quantum mechanics does not say anything, but Bohmian mechanics does. Furthermore,

by the fresh look supplied by Bohmian mechanics, new extensions of the theory are suggested,

such as the suggestion of an underlying sub - quantum aether, as a result of the unexpected double

role of the wave function.

In the nineteen nineties, a growing group of physicists considered Bohmian mechanics to be a

serious alternative to the Copenhagen interpretation, see for example Holland (1993) and Cushing

(1994), who suggests a sociological explanation for the fact that the physicists’ community did not

replace quantum mechanics by the, according to Cushing, superior Bohmian mechanics.

VI. 5

THE HAMILTON - JACOBI EQUATION

In classical mechanics we assume that for a system of n point particles, with canonical positions

⃗q = (q 1 , . . . , q 3n ) ∈ R 3n and velocities ˙⃗q = ( ˙q 1 , . . . , ˙q 3n ) ∈ R 3n , a Lagrangian L(⃗q, ˙⃗q, t) can be

found, the Lagrangian L = T − V being the difference between kinetic and potential energy. Define

the following functional, called the action,

$$S_\gamma(\vec q, t; \vec q_0, t_0) := \int_\gamma L(\vec q, \dot{\vec q}, t)\, dt, \tag{VI. 29}$$

where the integral, for n particles in 3 dimensions, is taken over a continuous path γ in configuration

space R 3n between an initial configuration ⃗q 0 at time t 0 and the configuration ⃗q at time t. In case the

Lagrangian does not explicitly depend on t, we can also write S γ (⃗q, ⃗q 0 , t − t 0 ).

The equations of motion are found by applying Hamilton’s principle of least action: for the

path γ 0 which is actually followed, the action reaches an extremum in comparison to all other

continuous paths between the same end points. This requirement,

δS γ = 0, (VI. 30)

provides the 3n equations of motion of Euler and Lagrange,

$$\frac{d}{dt} \frac{\partial L}{\partial \dot q_j} - \frac{\partial L}{\partial q_j} = 0, \qquad j = 1, \dots, 3n. \tag{VI. 31}$$

The Hamiltonian, H = T + V , is defined as the Legendre transform of the Lagrangian,

$$H(\vec q, \vec p, t) := \sum_{j=1}^{3n} p_j\, \dot q_j - L(\vec q, \dot{\vec q}, t), \tag{VI. 32}$$

where

$$p_j := \frac{\partial L}{\partial \dot q_j} \tag{VI. 33}$$

is the canonical momentum.


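Substitution of (VI. 32) in (VI. 29) yields

$$S_\gamma = \int_\gamma \Bigl( \sum_{j=1}^{3n} p_j\, \dot q_j - H(\vec q, \vec p, t) \Bigr)\, dt = \sum_{j=1}^{3n} \int_\gamma p_j\, dq_j - \int_\gamma H(\vec q, \vec p, t)\, dt, \tag{VI. 34}$$

and variation of S γ in this form yields the 6n Hamiltonian equations of motion,

$$\dot q_j = \frac{\partial H}{\partial p_j}, \qquad \dot p_j = -\frac{\partial H}{\partial q_j}. \tag{VI. 35}$$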

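Now consider the action S γ along a real path γ 0 , i.e., a path satisfying the equations of motion,

and form its differential,

$$dS(\vec q, \vec q_0, t - t_0) = \sum_{j=1}^{3n} \bigl( p_j\, dq_j - p_{0j}\, dq_{0j} \bigr) - H(\vec q, \vec p, t)\, dt. \tag{VI. 36}$$

Comparison with

$$dS(\vec q, \vec q_0, t - t_0) = \sum_{j=1}^{3n} \Bigl( \frac{\partial S}{\partial q_j}\, dq_j + \frac{\partial S}{\partial q_{0j}}\, dq_{0j} \Bigr) + \frac{\partial S}{\partial t}\, dt \tag{VI. 37}$$

and using requirement (VI. 30) shows that

$$H(\vec q, \vec p, t) = -\frac{\partial S}{\partial t}, \qquad p_j = \frac{\partial S}{\partial q_j}, \qquad p_{0j} = -\frac{\partial S}{\partial q_{0j}}, \tag{VI. 38}$$

and therefore

$$\frac{\partial S}{\partial t} + H\Bigl( \vec q, \frac{\partial S}{\partial \vec q}, t \Bigr) = 0. \tag{VI. 39}$$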

This is (VI. 7), the Hamilton - Jacobi equation, as discussed on p. 129. The technique to solve the

mechanical equations of motion by means of this equation is especially due to Jacobi. Without discussing

this technique in detail, we mention the following.

For definite ⃗q 0 and t 0 it is possible to consider the action S as a function on configuration space. It

can be shown that the paths satisfying the equations of motion are always perpendicular to the hypersurfaces

of constant S, hence the frequently quoted analogy with optics: paths are comparable to rays

of light, and surfaces of constant S to wave fronts. If, for one moment in time, the values of S are given

over the complete configuration space, the Hamilton - Jacobi equation determines how they evolve in

the course of time. The problem of finding the paths of the particles is thus reduced to constructing the

curves which are normal to the surfaces of constant S.
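As a small worked check of (VI. 38) and (VI. 39), consider a single free particle in one dimension, for which the action along the real path is S = m (q − q 0 ) 2 / 2(t − t 0 ). The sketch below is an illustration added here, not taken from the notes: the function names are hypothetical, H = p 2 /2m is assumed, and the partial derivatives are approximated by central differences.

```python
def action(q, t, q0=0.0, t0=0.0, m=1.0):
    """Action of a free particle along the straight path from (q0, t0) to (q, t)."""
    return m * (q - q0) ** 2 / (2.0 * (t - t0))


def hj_residual(q, t, q0=0.0, t0=0.0, m=1.0, h=1e-5):
    """Return (dS/dt + H(q, dS/dq), p) with H = p^2 / 2m and p = dS/dq."""
    dS_dt = (action(q, t + h, q0, t0, m) - action(q, t - h, q0, t0, m)) / (2.0 * h)
    p = (action(q + h, t, q0, t0, m) - action(q - h, t, q0, t0, m)) / (2.0 * h)
    return dS_dt + p * p / (2.0 * m), p


if __name__ == "__main__":
    residual, p = hj_residual(q=1.5, t=2.0, m=2.0)
    print(residual, p)  # the residual should vanish up to discretization error
```

The momentum obtained from ∂S/∂q equals m (q − q 0 )/(t − t 0 ), the constant momentum of uniform motion, and the residual of the Hamilton - Jacobi equation vanishes to numerical accuracy.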

◃ Remark

Schrödinger originally based his derivation of wave mechanics on the idea that wave mechanics is to

classical mechanics as wave optics is to ray optics, and with the just mentioned wave fronts and the

Hamilton - Jacobi equation he came to his wave mechanics. ▹

VII

BELL’S INEQUALITIES

There is hardly a paper - nor was there any during the past two and a half decades -

which deals with the foundations of quantum mechanics and does not refer to the work

of John Stewart Bell.

— Max Jammer

Bell’s theorem is the most profound discovery of science.

— Henry Stapp

[. . . ] Bell is generally credited with having brought down a purely philosophical issue

from the lofty realms of abstract speculation to the tangible reach of empirical investigation

and of having thereby established what has been called ‘experimental metaphysics’.

— Max Jammer

The ‘Bell inequalities’ is a generic term for inequalities in terms of measurable physical quantities

which are satisfied by hidden variables theories, but are violated by quantum mechanics. We will

derive several Bell inequalities, belonging to different types of hidden variables theories. This

also includes indeterministic, stochastic HVT’s, which fell outside the scope of chapter V.

VII. 1

LOCAL DETERMINISTIC HIDDEN VARIABLES

VII. 1. 1

DERIVATION OF THE FIRST BELL INEQUALITY

Returning to the hidden variables theories, HVT’s, we focus our attention on a specific experiment.

In the article ‘On the Einstein Podolsky Rosen paradox’ (1964), J.S. Bell examines the EPR experiment,

discussed in section I. 2, in a version which was given by Bohm and Aharonov (Bohm 1957),

also called the EPRB experiment. Bohm and Aharonov proposed an experiment in which two spin

1/2 particles are prepared in the singlet state and, next, move apart in opposite directions. After they

are separated, the spin of each of the particles is measured in an arbitrary direction, where the spin of

particle 1 is measured in direction ⃗a and the remote particle 2 in direction ⃗ b, as in figure III. 3, p. 73.

In this experiment, one can follow the same argument as EPR. Using the notation of section III. 6,

if measurement of ⃗σ 1 · ⃗a yields the value +1 then, for the singlet state, measurement of ⃗σ 2 · ⃗a must

yield the value −1 and vice versa.

Since the result of a measurement of a spin component of the one particle can be predicted with

certainty by measuring the same component of the other particle, whereas the particles are far away


from each other and do not interact, it follows, according to EPR, that the result of a measurement

of any spin component is determined in advance, i.e., that it is an element of physical reality. This

suggests that there should be a more complete description of the state of the particles, including

hidden variables.

Specify this description of the pair of particles with variables λ ∈ Λ as we did in chapter V. We

write the quantities corresponding to (⃗σ 1·⃗a)⊗(⃗σ 2·⃗b) as the pair (A, B), having values a,b = ±1. In a

contextual HVT, these values are dependent on the hidden variable λ and the total measuring context,

which can be specified here by means of the measurement directions ⃗a and ⃗ b, leading to

A = A(⃗a, ⃗ b, λ) and B = B(⃗a, ⃗ b, λ). (VII. 1)

Now the essential assumption is the requirement of locality that the quantity A does not depend

on the reading ⃗ b of a remote spin meter, and vice versa for B and ⃗a. These quantities therefore only

depend upon the local context,

A(⃗a, ⃗ b, λ) = A(⃗a, λ), a = ±1,

B(⃗a, ⃗ b, λ) = B( ⃗ b, λ), b = ±1. (VII. 2)

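[Figure: a source, characterized by ρ(λ), between two spin meters. Spin meter A has settings ⃗a and ⃗a ′ with outcomes A(⃗a, λ) = a = ±1 and A(⃗a ′ , λ) = a ′ = ±1; spin meter B has settings ⃗b and ⃗b ′ with outcomes B(⃗b, λ) = b = ±1 and B(⃗b ′ , λ) = b ′ = ±1.]

Figure VII. 1: Thought experiment of Einstein, Podolsky and Rosen on the singlet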

The source emitting the particle pairs probably does not prepare the pairs in the same state λ each

time. We assume that the source can be characterized by a probability density ρ,

$$\int_\Lambda \rho(\lambda)\, d\lambda = 1, \tag{VII. 3}$$

where we also assume that this probability density does not depend on the measuring directions ⃗a

and ⃗ b, which, after all, can be established long after the particles have left the source. The expectation

value of the product of A and B in this HVT is therefore

$$E(\vec a, \vec b) = \int_\Lambda A(\vec a, \lambda)\, B(\vec b, \lambda)\, \rho(\lambda)\, d\lambda. \tag{VII. 4}$$


Quantum mechanics gives as the expectation value, with the particle pair in the singlet state, see

equation (III. 171), p. 73,

$$E_{QM}(\vec a, \vec b) = \langle\, \vec\sigma_1 \cdot \vec a \otimes \vec\sigma_2 \cdot \vec b\, \rangle = -\vec a \cdot \vec b = -\cos \theta_{\vec a, \vec b}. \tag{VII. 5}$$

But the expressions (VII. 4) and (VII. 5) cannot coincide for all directions ⃗a and ⃗ b. According

to (VII. 2), the expectation value E(⃗a, ⃗ b) of the product of A and B cannot be less than −1. Therefore,

to reach −1 at ⃗a = ⃗ b, also requiring equality between (VII. 4) and (VII. 5), it must hold for all unit

vectors ⃗n that

A(⃗n, λ) = − B(⃗n, λ), (VII. 6)

which leads to

$$E(\vec a, \vec b) = -\int_\Lambda A(\vec a, \lambda)\, A(\vec b, \lambda)\, \rho(\lambda)\, d\lambda. \tag{VII. 7}$$

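Now it follows, because of ( A(⃗n, λ) ) 2 = 1, that

$$\begin{aligned} E(\vec a, \vec b) - E(\vec a, \vec b') &= -\int_\Lambda \bigl( A(\vec a, \lambda) A(\vec b, \lambda) - A(\vec a, \lambda) A(\vec b', \lambda) \bigr)\, \rho(\lambda)\, d\lambda \\ &= \int_\Lambda A(\vec a, \lambda) A(\vec b, \lambda) \bigl( A(\vec b, \lambda) A(\vec b', \lambda) - 1 \bigr)\, \rho(\lambda)\, d\lambda, \end{aligned} \tag{VII. 8}$$

where ⃗b ′ is another setting of the remote spin meter, and A(⃗b ′ , λ) also has values ±1. Taking the

absolute value on both sides, keeping in mind that |A(⃗a, λ)A(⃗b, λ)| = 1, it follows that

$$|E(\vec a, \vec b) - E(\vec a, \vec b')| \le \int_\Lambda \bigl( 1 - A(\vec b, \lambda) A(\vec b', \lambda) \bigr)\, \rho(\lambda)\, d\lambda, \tag{VII. 9}$$

or,

$$|E(\vec a, \vec b) - E(\vec a, \vec b')| \le 1 + E(\vec b, \vec b'). \tag{VII. 10}$$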

This is the original Bell inequality.

VII. 1. 2

THE BELL INEQUALITY OF CLAUSER, HORNE, SHIMONY AND HOLT

Next, we will derive a second inequality. In (VII. 8), we replace ⃗a by ⃗a ′ and the − sign by

the + sign,

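$$\begin{aligned} E(\vec a', \vec b) + E(\vec a', \vec b') &= -\int_\Lambda \bigl( A(\vec a', \lambda) A(\vec b, \lambda) + A(\vec a', \lambda) A(\vec b', \lambda) \bigr)\, \rho(\lambda)\, d\lambda \\ &= -\int_\Lambda A(\vec a', \lambda) A(\vec b, \lambda) \bigl( 1 + A(\vec b, \lambda) A(\vec b', \lambda) \bigr)\, \rho(\lambda)\, d\lambda. \end{aligned} \tag{VII. 11}$$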


Now, in the same way as we derived (VII. 10), we obtain

|E(⃗a ′ , ⃗b) + E(⃗a ′ , ⃗b ′ )| ≤ 1 − E(⃗b, ⃗b ′ ). (VII. 12)

Combination of (VII. 10) and (VII. 12) leads to

|E(⃗a, ⃗b) − E(⃗a, ⃗b ′ )| + |E(⃗a ′ , ⃗b) + E(⃗a ′ , ⃗b ′ )| ≤ 2. (VII. 13)

This version of the Bell inequality was first derived, although under weaker assumptions than those

used here, by Clauser, Horne, Shimony and Holt (Clauser 1969), for which reason it is also called the

CHSH inequality. We will return to these assumptions in section VII. 2.
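The bound in (VII. 13) can also be seen by brute force. The sketch below is an illustration added here and uses the usual pointwise argument rather than the route via (VII. 10) and (VII. 12): for a fixed λ the four outcomes are just four numbers ±1, and the CHSH combination of their products equals 2 for every one of the 16 possible assignments, so no average over ρ(λ) can exceed 2. The function name is hypothetical, and B is not assumed to equal −A.

```python
from itertools import product


def chsh_pointwise(A_a, A_a_prime, B_b, B_b_prime):
    """CHSH combination for one fixed value of the hidden variable."""
    return (abs(A_a * B_b - A_a * B_b_prime)
            + abs(A_a_prime * B_b + A_a_prime * B_b_prime))


# All 16 deterministic assignments of +1 or -1 to the four outcomes.
values = [chsh_pointwise(*outcomes) for outcomes in product((-1, 1), repeat=4)]

if __name__ == "__main__":
    print(sorted(set(values)))
```

Since the absolute value of an average never exceeds the average of the absolute values, the pointwise value carries over to the expectation values as the upper bound 2.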

VII. 1. 3

VIOLATION OF THE BELL INEQUALITIES BY QUANTUM MECHANICS

We will now prove the following theorem.

BELL’S FIRST THEOREM:

A local deterministic HVT is empirically contradictory to quantum mechanics.

Proof

With the expression empirically contradictory we mean that the two theories make contradictory

statements in terms of measurable physical quantities. We will show that, quantum mechanically,

there are spin quantities which violate the Bell inequalities.

Consider the configuration below, where all vectors lie in the same plane.

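[Figure: coplanar directions ⃗a, ⃗a ′ = ⃗b and ⃗b ′ , each rotated by the angle ϕ with respect to the previous one.]

Figure VII. 2: A configuration in which the spin quantities violate the Bell inequality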

Using (VII. 5) for this configuration, and substituting the quantum mechanical expressions into (VII. 13), we obtain the requirement

F (ϕ) := | − cos ϕ + cos 2ϕ | + | − cos ϕ − 1| ≤ 2. (VII. 14)

This function is plotted in figure VII. 3.


[Figure: plot of F (ϕ) against ϕ for 0 ≤ ϕ ≤ π, with the value 2 marked on the vertical axis.]

Figure VII. 3: The Bell inequality violated for every acute angle ϕ

We see that (VII. 14) is violated for every ϕ ∈ (0, ½π). The maximum violation is F (60°) = 5/2,

as can be seen in the figure.
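The claims about F (ϕ) are easy to verify numerically. The following short sketch, added here as an illustration, evaluates the function of (VII. 14) on a grid of acute angles; the sampling at whole degrees is an arbitrary choice.

```python
import math


def F(phi):
    """Left-hand side of (VII. 14) for the configuration of figure VII. 2."""
    return abs(-math.cos(phi) + math.cos(2.0 * phi)) + abs(-math.cos(phi) - 1.0)


# Sample every whole degree strictly between 0 and 90 degrees.
samples = {deg: F(math.radians(deg)) for deg in range(1, 90)}
best_angle = max(samples, key=samples.get)

if __name__ == "__main__":
    print(best_angle, samples[best_angle])
```

For acute angles F (ϕ) = 2 + 2 cos ϕ (1 − cos ϕ), which exceeds 2 whenever 0 < cos ϕ < 1 and is largest for cos ϕ = ½.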

Even larger violations are possible in other configurations. The largest

violation is obtained in the configuration of figure VII. 4 (with all vectors in a single plane),

leading to

$$\begin{aligned} E_{QM}(\vec a, \vec b) &= -\cos 45^\circ = -\tfrac{1}{2}\sqrt{2}, \\ E_{QM}(\vec a, \vec b') &= -\cos 135^\circ = \tfrac{1}{2}\sqrt{2}, \\ E_{QM}(\vec a', \vec b) &= -\cos 135^\circ = \tfrac{1}{2}\sqrt{2}, \\ E_{QM}(\vec a', \vec b') &= -\cos 135^\circ = \tfrac{1}{2}\sqrt{2}, \end{aligned}$$

and therefore

$$|E_{QM}(\vec a, \vec b) - E_{QM}(\vec a, \vec b')| + |E_{QM}(\vec a', \vec b) + E_{QM}(\vec a', \vec b')| = 2\sqrt{2}. \tag{VII. 15}$$

This is a violation of 41%. □

[Figure: coplanar directions ⃗a, ⃗b, ⃗a ′ and ⃗b ′ , with the angle of 45° indicated.]

Figure VII. 4: the configuration giving the largest violation of the Bell inequality (all vectors in the

same plane)
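The value 2 √2 of (VII. 15) can be reproduced with a few lines of code. This is a sketch added for illustration: the directions are described by planar angles in degrees, and the specific assignment ⃗a = 0°, ⃗b = 45°, ⃗b ′ = 135°, ⃗a ′ = 270° is an assumption, chosen only because it reproduces the relative angles 45°, 135°, 135°, 135° listed above.

```python
import math


def E_qm(alpha_deg, beta_deg):
    """Singlet expectation value (VII. 5) for two coplanar directions."""
    return -math.cos(math.radians(alpha_deg - beta_deg))


def chsh_qm(a, a_prime, b, b_prime):
    """Left-hand side of (VII. 13) with quantum mechanical expectation values."""
    return (abs(E_qm(a, b) - E_qm(a, b_prime))
            + abs(E_qm(a_prime, b) + E_qm(a_prime, b_prime)))


value = chsh_qm(a=0.0, a_prime=270.0, b=45.0, b_prime=135.0)
excess = (value - 2.0) / 2.0  # relative excess over the bound 2

if __name__ == "__main__":
    print(value, excess)
```

The relative excess over the bound 2 is √2 − 1, which is the 41% quoted in the text.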


VII. 1. 4

THE BELL INEQUALITY IN A NON-CONTEXTUAL, LOCAL DETERMINISTIC HVT

To show that the Bell inequality, derived for a local deterministic contextual HVT, also holds for

a local deterministic autonomous HVT, we consider a local deterministic autonomous model for the

singlet.

Assume that both particles are characterized by a ‘classical’ spin vector, ⃗ J and − ⃗ J, about a

common axis. This is the hidden variable. In this HVT, we further assume that the outcome of a

measurement of spin in the direction ⃗n is determined by the sign of the component of the spin vector

in the direction ⃗n. Now let the particles fly away from each other. If the spin of the first particle in the

direction ⃗a is measured we find the outcome

$$\frac{\vec J \cdot \vec a}{|\vec J \cdot \vec a|} \in \{-1, 1\}, \tag{VII. 16}$$

for the spin of the second particle in direction ⃗b we find

$$\frac{-\vec J \cdot \vec b}{|\vec J \cdot \vec b|} \in \{-1, 1\}. \tag{VII. 17}$$

The result of the measurement of the first particle is independent of the direction ⃗ b and vice versa,

therefore, the model is local.

Now consider an ensemble of such two particle systems where ⃗ J is distributed isotropically. If a n

is the sign of ⃗ J · ⃗a in the n th pair, and likewise, b n the sign of − ⃗ J · ⃗b, then if ⃗ J pierces through the

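shaded area of the unit sphere on the right side in figure VII. 5, a n b n = +1. Otherwise, a n b n = −1.

[Figure: unit spheres showing the regions for ⃗J in which a n , b n and the product a n b n are positive (+) or negative (−), for directions ⃗a and ⃗b at an angle θ.]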

Figure VII. 5: Unit spheres for a n , b n and a n b n . In the shaded areas of the larger sphere a n b n is

positive, in the unshaded areas a n b n is negative.


The surface of the shaded area is $4\theta_{\vec a, \vec b}$, that of the remaining part is $4(\pi - \theta_{\vec a, \vec b})$. For an isotropic

distribution, averaging over the surface of the unit sphere, we therefore find

$$\langle a_n b_n \rangle = \frac{1}{4\pi} \bigl( 4\theta_{\vec a, \vec b} - 4(\pi - \theta_{\vec a, \vec b}) \bigr) = -1 + \frac{2}{\pi}\, \theta_{\vec a, \vec b}, \tag{VII. 18}$$

which is an increasing straight line through (0, −1) with slope 2/π. It runs from perfect anti - correlation

for θ = 0 to perfect correlation for θ = π.

[Figure: plot of − cos θ and of ⟨a n b n ⟩ against θ for 0 ≤ θ ≤ π, both running from −1 to 1.]

Figure VII. 6: Comparison of the quantum mechanical expectation values and those for the local

deterministic HVT

In this HVT, the expectation values (VII. 18) must satisfy the Bell inequality (VII. 13) with E (⃗a, ⃗b) = ⟨a n b n ⟩.

Choosing the angles as in the example on p. 142, figure VII. 2, substitution of (VII. 18) in (VII. 13)

yields exactly 2 for any θ ≤ π, whereas the quantum mechanical expectation values violated the

inequality for every θ ∈ (0, ½π).

In the configuration giving the largest violation of the inequality (VII. 13), see figure VII. 4, we

have

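$$\theta_{\vec a, \vec b} = \tfrac{1}{4}\pi \quad \text{and} \quad \theta_{\vec a, \vec b'} = \theta_{\vec a', \vec b} = \theta_{\vec a', \vec b'} = \tfrac{3}{4}\pi, \tag{VII. 19}$$

and therefore, (VII. 18) substituted in (VII. 13) yields

$$\Bigl| \bigl( -1 + \tfrac{1}{2} \bigr) - \bigl( -1 + \tfrac{3}{2} \bigr) \Bigr| + \Bigl| \bigl( -1 + \tfrac{3}{2} \bigr) + \bigl( -1 + \tfrac{3}{2} \bigr) \Bigr| = 1 + 1 = 2, \tag{VII. 20}$$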

where quantum mechanically, on p. 143 we found 2 √ 2.

We see that where quantum mechanics violated the inequality (VII. 13), this local deterministic

autonomous HVT satisfies it, thereby confirming Bell’s first theorem.
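The model can also be simulated directly. The following Monte Carlo sketch is an illustration added here, with hypothetical function names: it draws ⃗J isotropically on the unit sphere, applies the sign rules (VII. 16) and (VII. 17), and estimates ⟨a n b n ⟩. The measurement directions are taken in one plane and labelled by an angle in radians; the sample size and seed are arbitrary choices.

```python
import math
import random


def direction(theta):
    """Unit vector in the x-z plane at angle theta (radians) from the z axis."""
    return (math.sin(theta), 0.0, math.cos(theta))


def random_unit_vector(rng):
    """Isotropic unit vector: uniform height z and uniform azimuth."""
    z = rng.uniform(-1.0, 1.0)
    azimuth = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(1.0 - z * z)
    return (s * math.cos(azimuth), s * math.sin(azimuth), z)


def sign(x):
    return 1 if x >= 0 else -1


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def correlation(theta_a, theta_b, n=50_000, seed=1):
    """Monte Carlo estimate of <a_n b_n> for the classical spin-vector model."""
    rng = random.Random(seed)
    a, b = direction(theta_a), direction(theta_b)
    total = 0
    for _ in range(n):
        J = random_unit_vector(rng)
        total += sign(dot(J, a)) * sign(-dot(J, b))  # (VII. 16) times (VII. 17)
    return total / n


def chsh(a, a_prime, b, b_prime):
    """Left-hand side of (VII. 13) with the simulated correlations."""
    return (abs(correlation(a, b) - correlation(a, b_prime))
            + abs(correlation(a_prime, b) + correlation(a_prime, b_prime)))


if __name__ == "__main__":
    print(correlation(0.0, math.pi / 4))  # compare with -1 + (2/pi)(pi/4)
    print(chsh(0.0, 3 * math.pi / 2, math.pi / 4, 3 * math.pi / 4))
```

Up to statistical noise the estimates follow the straight line (VII. 18), and the CHSH combination stays at the bound 2 instead of reaching 2 √2.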

VII. 2

LOCAL DETERMINISTIC CONTEXTUAL HIDDEN VARIABLES

We have seen that a considerable difference exists between the empirically verifiable statements

of quantum mechanics and those of a local deterministic, autonomous HVT for a singlet state and


suitably chosen spin directions. This enables an experimental test of these statements, and therefore

of the correctness of the philosophical bases of both theories. A. Shimony (1989) spoke, concerning

the experimental testing of the Bell inequalities, of ‘experimental metaphysics’.

However, the question of experimental testing puts the derivation of the Bell inequalities in another

perspective. We no longer want to compare a HVT with quantum mechanics, but with experimental

results. In this respect (VII. 6), implying perfect anti - correlation when ⃗a = ⃗ b, is overly

idealized. In a real experiment the particle detectors are not perfectly efficient, in the sense that not

all particles are registered. Imagine a detector which, even if A(⃗a, λ) = 1, sometimes gives 0, i.e. not

measured, or even −1, i.e. wrongly measured. Moreover, in a contextual HVT the outcomes could also

depend on the measuring context, i.e. on (possibly hidden) variables of the detectors. But also in

this generalized situation it is possible to derive the inequality (VII. 13) from a locality assumption.

We will show this by proving the next theorem.

BELL’S SECOND THEOREM:

A local deterministic contextual HVT is empirically inconsistent with quantum mechanics.

Proof

Assume that the quantities A and B are functions of three arguments,

A = A(⃗a, λ, µ), B = B( ⃗ b, λ, ν) where A, B ∈ {− 1, 1}. (VII. 21)

Here the local deterministic character of the HVT is expressed; the outcome of the measurement

at the measuring apparatus measuring ⃗a · ⃗σ is determined by λ ∈ Λ, describing the source, by the

local hidden variables of that measuring device, expressed symbolically by µ ∈ Λ a , and by the

position ⃗a of the meter pointer. Therefore, the requirement of locality is that A does not depend

on ⃗ b and ν, and B does not depend on ⃗a and µ. We also assume that the hidden variables of the

apparatuses are independent of each other and of λ,

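$$\rho(\lambda, \mu, \nu) = \rho(\lambda)\, \rho_1(\mu)\, \rho_2(\nu). \tag{VII. 22}$$

Defining

$$\langle A(\vec a, \lambda) \rangle := \int_{\Lambda_a} A(\vec a, \lambda, \mu)\, \rho_1(\mu)\, d\mu \tag{VII. 23}$$

and

$$\langle B(\vec b, \lambda) \rangle := \int_{\Lambda_b} B(\vec b, \lambda, \nu)\, \rho_2(\nu)\, d\nu, \tag{VII. 24}$$

we have, instead of assumption (VII. 2), the much weaker requirements

$$|\langle A(\vec a, \lambda) \rangle| \le 1 \quad \text{and} \quad |\langle B(\vec b, \lambda) \rangle| \le 1, \tag{VII. 25}$$

and we will show now that from this it is again possible to derive the Bell inequality (VII. 13).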

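The expectation value in this HVT is

$$\begin{aligned} E(\vec a, \vec b) &= \int_\Lambda d\lambda \int_{\Lambda_a} d\mu\, A(\vec a, \lambda, \mu) \int_{\Lambda_b} d\nu\, B(\vec b, \lambda, \nu)\, \rho(\lambda, \mu, \nu) \\ &= \int_\Lambda \langle A(\vec a, \lambda) \rangle\, \langle B(\vec b, \lambda) \rangle\, \rho(\lambda)\, d\lambda, \end{aligned} \tag{VII. 26}$$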

which is an ‘averaged’ version of (VII. 4). With (VII. 25) we see that

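$$\begin{aligned} |E(\vec a, \vec b) - E(\vec a, \vec b')| &\le \int_\Lambda \bigl| \langle A(\vec a, \lambda) \rangle \bigl( \langle B(\vec b, \lambda) \rangle - \langle B(\vec b', \lambda) \rangle \bigr) \bigr|\, \rho(\lambda)\, d\lambda \\ &\le \int_\Lambda \bigl| \langle B(\vec b, \lambda) \rangle - \langle B(\vec b', \lambda) \rangle \bigr|\, \rho(\lambda)\, d\lambda. \end{aligned} \tag{VII. 27}$$

Likewise we have

$$|E(\vec a', \vec b) + E(\vec a', \vec b')| \le \int_\Lambda \bigl| \langle B(\vec b, \lambda) \rangle + \langle B(\vec b', \lambda) \rangle \bigr|\, \rho(\lambda)\, d\lambda, \tag{VII. 28}$$

and therefore

$$|E(\vec a, \vec b) - E(\vec a, \vec b')| + |E(\vec a', \vec b) + E(\vec a', \vec b')| \le 2, \tag{VII. 29}$$

since |x + y| + |x − y| ≤ 2 if |x| ≤ 1 and |y| ≤ 1. We see that (VII. 29) is, indeed, the Bell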

inequality (VII. 13).

For ⃗a ′ = ⃗ b ′ and the assumption of perfect anti - correlation E ( ⃗ b ′ , ⃗ b ′ ) = −1, from inequality

(VII. 13) follows the original Bell inequality (VII. 10). But, as we showed, (VII. 13) remains

valid under the weaker conditions (VII. 25). □
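The elementary bound used in the last step of the proof, |x + y| + |x − y| ≤ 2 for |x| ≤ 1 and |y| ≤ 1, follows from the identity |x + y| + |x − y| = 2 max(|x|, |y|). The sketch below, added here as an illustration, checks both statements on a grid of the square [−1, 1] × [−1, 1]; the grid spacing is an arbitrary choice and the check is of course no substitute for the one-line proof.

```python
def lhs(x, y):
    """The combination |x + y| + |x - y| appearing in the proof."""
    return abs(x + y) + abs(x - y)


# Grid on the square [-1, 1] x [-1, 1] with spacing 0.05.
grid = [i / 20.0 for i in range(-20, 21)]

worst = max(lhs(x, y) for x in grid for y in grid)
largest_identity_error = max(
    abs(lhs(x, y) - 2.0 * max(abs(x), abs(y))) for x in grid for y in grid
)

if __name__ == "__main__":
    print(worst, largest_identity_error)
```

The maximum is attained on the boundary of the square, where |x| = 1 or |y| = 1, which is where the deterministic values A, B = ±1 of section VII. 1 live.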

◃ Remark

It is not necessary to assume mutual independence for µ and λ or for ν and λ as in (VII. 22), the

result (VII. 25) also follows when we make the weaker assumption that the conditional probability

distributions of the apparatuses factorize the conjoint probability distribution ρ,

ρ(λ, µ, ν) = ρ(λ) ρ 1 (µ | λ) ρ 2 (ν | λ). ▹ (VII. 30)

VII. 3

WIGNER’S DERIVATION

E.P. Wigner (1970) was the first to give an elegant derivation of a Bell inequality in terms

of probabilities. We again consider the EPRB experiment from section VII. 1. Using three directions,

⃗n 1 , ⃗n 2 , ⃗n 3 ∈ R 3 , define

σ i := ⃗n i · ⃗σ and τ i := ⃗n i · ⃗τ with i ∈ {1, 2, 3}. (VII. 31)


Here ⃗σ and ⃗τ are the spin operators of particle 1 and particle 2, respectively. We assume the quantities

of particle 1 to be independent of those of particle 2 and therefore

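$$(\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] = (\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_{j'}}[\lambda], \tag{VII. 32}$$

$$(\mathbb{1} \otimes \tau_j)_{\sigma_i \otimes \tau_j}[\lambda] = (\mathbb{1} \otimes \tau_j)_{\sigma_{i'} \otimes \tau_j}[\lambda], \tag{VII. 33}$$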

for i ′ ≠ i and j ′ ≠ j. This is the requirement of locality. Without this requirement we would have

nine quantities in the HVT, namely the pairs (σ i ,τ j ), that is, as many quantities as measuring contexts.

Now we have only six: σ 1 , σ 2 , σ 3 , τ 1 , τ 2 , τ 3 .

The outcome of measurement of every spin quantity is ±1 in units of ½ħ. A HVT must assign a

probability to every combination of outcomes,

0 ≤ p (σ 1 , σ 2 , σ 3 , τ 1 , τ 2 , τ 3 ) ≤ 1, (VII. 34)

with the usual marginal distributions, for instance

$$p(\sigma_1, \tau_1) = \sum_{\sigma_2 = \pm 1}\ \sum_{\sigma_3 = \pm 1}\ \sum_{\tau_2 = \pm 1}\ \sum_{\tau_3 = \pm 1} p(\sigma_1, \sigma_2, \sigma_3, \tau_1, \tau_2, \tau_3), \tag{VII. 35}$$

and so on.

◃ Remark

Quantum mechanics does not have such joint probability distributions because these six quantities

do not all commute pairwise. The spin quantities are not jointly measurable, but in

the HVT their values are all fixed. ▹

Calling the angles between ⃗n 1 , ⃗n 2 , ⃗n 3 : θ 12 , θ 23 , θ 31 , then in the singlet state we have, see chapter

III, (III. 176) and (III. 177),

Prob (σ i = 1 ∧ τ j = 1) = ½ sin² ½θ ij , (VII. 36)

Prob (σ i = 1 ∧ τ j = − 1) = ½ cos² ½θ ij . (VII. 37)

These are the quantum mechanical probabilities and we will see that the HVT, satisfying requirement

(VII. 34), cannot reproduce this. From (VII. 36) and (VII. 37) follows the requirement

p (σ 1 , σ 2 , σ 3 , τ 1 , τ 2 , τ 3 ) = 0 unless σ 1 = − τ 1 , σ 2 = − τ 2 , σ 3 = − τ 3 , (VII. 38)

because the hidden variables cannot assume values giving a positive spin of both particles in the same

direction.

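The probability for σ 1 and τ 3 to be both +1 is, using (VII. 36) and (VII. 38),

$$\begin{aligned} \sum_{\sigma_2, \sigma_3}\ \sum_{\tau_1, \tau_2} p(+, \sigma_2, \sigma_3, \tau_1, \tau_2, +) &= \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{13} \\ &= p(+, +, -, -, -, +) + p(+, -, -, -, +, +). \end{aligned} \tag{VII. 39}$$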

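Likewise we calculate the following probabilities

$$\begin{aligned} \sum_{\sigma_1, \sigma_3}\ \sum_{\tau_1, \tau_2} p(\sigma_1, +, \sigma_3, \tau_1, \tau_2, +) &= \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{23} \\ &= p(+, +, -, -, -, +) + p(-, +, -, +, -, +) \end{aligned} \tag{VII. 40}$$

and

$$\begin{aligned} \sum_{\sigma_2, \sigma_3}\ \sum_{\tau_1, \tau_3} p(+, \sigma_2, \sigma_3, \tau_1, +, \tau_3) &= \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{12} \\ &= p(+, -, +, -, +, -) + p(+, -, -, -, +, +). \end{aligned} \tag{VII. 41}$$

From (VII. 40) and (VII. 41) it follows that

$$p(+, +, -, -, -, +) \le \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{23} \quad \text{and} \tag{VII. 42}$$

$$p(+, -, -, -, +, +) \le \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{12}, \tag{VII. 43}$$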

respectively. Consequently, we have for (VII. 39), the probability for σ 1 and τ 3 to be both +1,

$$\tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{23} + \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{12} \ge \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{13}, \tag{VII. 44}$$

which, using $\sin^2 \tfrac{1}{2}\theta = \tfrac{1}{2}(1 - \cos\theta)$, is equivalent to

$$(1 - \cos\theta_{23}) + (1 - \cos\theta_{12}) \ge (1 - \cos\theta_{13}). \tag{VII. 45}$$

This is, in essence, the same as inequality (VII. 10); rewriting (VII. 45), realizing that 1 − cos θ ≥ 0,

and comparing E(⃗a, ⃗b) to − cos θ 12 etc. yields

$$1 - \cos\theta_{23} \ge |-\cos\theta_{12} + \cos\theta_{13}|. \tag{VII. 46}$$

[Figure: coplanar directions ⃗n 1 , ⃗n 2 and ⃗n 3 , with ⃗n 2 between ⃗n 1 and ⃗n 3 at an angle ϕ from each.]

Figure VII. 7: Violation of the Bell inequality again

With θ 23 = θ 12 = ½ θ 13 = ϕ as in diagram VII. 7, (VII. 45) becomes

1 − 2 cos ϕ + cos 2ϕ ≥ 0, (VII. 47)

and using cos 2ϕ = 2 cos² ϕ − 1 we see that

cos ϕ (1 − cos ϕ) ≤ 0. (VII. 48)

Since 1 − cos ϕ ≥ 0 for every ϕ, this inequality is violated for every acute angle.
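The violation of Wigner's inequality can be checked by inserting the quantum mechanical probabilities (VII. 36) into (VII. 44). The sketch below is an illustration added here, with hypothetical function names; it uses the configuration of figure VII. 7, θ 12 = θ 23 = ϕ and θ 13 = 2ϕ.

```python
import math


def p_both_up(theta):
    """Quantum mechanical probability (VII. 36) for sigma_i = tau_j = +1."""
    return 0.5 * math.sin(theta / 2.0) ** 2


def wigner_gap(phi):
    """Left-hand side minus right-hand side of (VII. 44) in figure VII. 7."""
    return p_both_up(phi) + p_both_up(phi) - p_both_up(2.0 * phi)


# Evaluate at every whole degree strictly between 0 and 90 degrees.
gaps = {deg: wigner_gap(math.radians(deg)) for deg in range(1, 90)}

if __name__ == "__main__":
    print(max(gaps.values()))  # a negative maximum means (VII. 44) fails throughout
```

A negative gap means that (VII. 44) fails. Analytically the gap equals −½ cos ϕ (1 − cos ϕ), in agreement with (VII. 48): it is negative for every acute angle and becomes positive for obtuse angles.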


EXERCISE 32. What type of HVT is excluded by Wigner’s reasoning?

◃ Remark

Wigner (1970) makes the observation that the HVT would have been possible if the terms in (VII. 44)

had been sin ½θ instead of sin² ½θ. Apparently, our world depends on such ‘minimal’ mathematical

differences. ▹

VII. 4 THE DERIVATION OF EBERHARD AND STAPP

In the previous derivations of the Bell inequalities hidden variables were assumed, which represent

properties of the pair of particles and determine the outcomes of measurements of all physical

quantities. As a consequence, in this HVT a joint probability is defined for the values of non -

commuting quantities also, as we saw in Wigner’s derivation. This follows from the fact that at

given λ both A(⃗a, λ) and A(⃗a ′ , λ) are fixed, for example

$$p\bigl( A(\vec{a}) = 1 \wedge A(\vec{a}') = 1 \bigr) = \int_{\Delta} \rho(\lambda)\, d\lambda, \tag{VII. 49}$$

where ∆ ⊂ Λ is the area in which both A(⃗a, λ) = 1 and A(⃗a ′ , λ) = 1. Since quantum mechanics

does not acknowledge such ‘simultaneous probabilities’ for non - commuting quantities, the quantities

not being simultaneously measurable, it could be suspected that this property of the HVT is the main

reason for the deviation from quantum mechanics, instead of locality or determinism.

In the next derivation of the Bell inequality, given by P. Eberhard and H. Stapp (1977), the existence

of hidden variables is not assumed. They claim that the Bell inequality follows from an assumption

of locality only. However, what will be shown to be necessary in this derivation, is the

assumption that we can speak reasonably about the outcomes of measurements which have not actually

been carried out.

THE EBERHARD - STAPP THEOREM:

Quantum mechanics is a non - local theory.

Proof

Consider again the EPRB experiment. Let ⃗a and ⃗a ′ be two readings of the spin meter at A, and ⃗ b

and ⃗ b ′ likewise at B. We can carry out four experiments:

I : ⃗a, ⃗ b II : ⃗a, ⃗ b ′ III : ⃗a ′ , ⃗ b IV : ⃗a ′ , ⃗ b ′ . (VII. 50)

Define, for the n th pair of particles, a n (I) as the outcome of the spin measurement on the particle traveling to A in experiment I, that is, with the meter at A pointing in the direction ⃗a while the spin of the other particle, which travels to B, is measured in the direction ⃗ b; this gives a n (I) = ±1. The values a n (II), a n ′ (III), a n ′ (IV), b n (I), b n ′ (II), b n (III) and b n ′ (IV) are defined likewise. These values represent outcomes of actual or possible measurements, not properties of the particles which also exist if they are not measured.

The assumption of locality is that an outcome of measurement of spin of particle 1, in direction ⃗a,

does not depend on which spin direction, ⃗ b or ⃗ b ′ , is measured of the other, remote particle 2. This

is the supposition of locality from the Eberhard - Stapp theorem, leading to what we will call the

matching condition,

$$a_n(\mathrm{I}) = a_n(\mathrm{II}), \quad a'_n(\mathrm{III}) = a'_n(\mathrm{IV}), \quad b_n(\mathrm{I}) = b_n(\mathrm{III}), \quad b'_n(\mathrm{II}) = b'_n(\mathrm{IV}), \tag{VII. 51}$$

for all N particle pairs in the singlet state |Ψ 0 ⟩.

Now we can define the following mathematical expression

γ n := a n (I) b n (I) + a n (II) b n ′ (II) + a n ′ (III) b n (III) − a n ′ (IV) b n ′ (IV), (VII. 52)

where the first term corresponds to experiment I, the second to experiment II, etc. Because of the

value assignment ±1, γ is an even integer, and the fourth term being the product of the first three

terms, subtraction of the fourth term means that γ has only two values, as we will see. Moreover,

subtraction allows for an inequality similar to Bell’s inequality (VII. 13).

In (VII. 52) we can omit writing out the labels referring to the numbers of the experiments because

of the matching condition (VII. 51); a n := a n (I) = a n (II), etc. Rewriting (VII. 52),

γ n = a n (b n + b n ′ ) + a n ′ (b n − b n ′ ), (VII. 53)

because of the value assignment ±1 we immediately see that either the first or the second term

equals 0, yielding for all n

γ n = ± 2. (VII. 54)
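
The claim γ n = ±2 can be verified by brute force, since there are only 16 ways to assign ±1 to the four outcomes. A small sketch (the function names are ours, chosen for illustration):

```python
from itertools import product

def gamma(a, a_prime, b, b_prime):
    """Expression (VII. 53) for one pair, with the matching condition built in."""
    return a * (b + b_prime) + a_prime * (b - b_prime)

def all_gamma_values():
    """Collect gamma over all 16 assignments of +1/-1 to (a, a', b, b')."""
    return {gamma(*values) for values in product((+1, -1), repeat=4)}

print(sorted(all_gamma_values()))  # [-2, 2]
```

Note that the enumeration presupposes that all four values exist simultaneously for each pair, which is exactly what the matching condition (VII. 51) provides.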

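Averaging over N recurrences of the experiment we have

$$\left| \frac{1}{N} \sum_{n=1}^{N} \gamma_n \right| = \frac{1}{N} \left| \sum_{n=1}^{N} a_n b_n + \sum_{n=1}^{N} a_n b'_n + \sum_{n=1}^{N} a'_n b_n - \sum_{n=1}^{N} a'_n b'_n \right| \leq 2. \tag{VII. 55}$$

Defining the correlation coefficients

$$c_N(\vec{a}, \vec{b}) := \frac{1}{N} \sum_{n=1}^{N} a_n b_n \quad \text{etc.}, \tag{VII. 56}$$

we conclude

$$\left| c_N(\vec{a}, \vec{b}) + c_N(\vec{a}, \vec{b}') + c_N(\vec{a}', \vec{b}) - c_N(\vec{a}', \vec{b}') \right| \leq 2. \tag{VII. 57}$$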

This is indeed a Bell inequality again, equivalent to inequality (VII. 13) in the limit N → ∞.

The expectation value of c(⃗a, ⃗ b) = ⟨a n b n ⟩ in quantum mechanics is given by (VII. 5) and the

contradiction with (VII. 57) follows as in section VII. 2. □
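
As an illustration of the contradiction, the sketch below evaluates the left-hand side of (VII. 57) with the quantum mechanical correlations. We assume here that (VII. 5) has the form E(⃗a, ⃗ b) = − cos θ, with θ the angle between the two settings, as suggested by the comparison made after (VII. 45); the settings are restricted to one plane, and the particular angles are our own choice.

```python
import math

def singlet_correlation(angle_a, angle_b):
    """Assumed quantum prediction E(a, b) = -cos(theta_ab) for coplanar settings."""
    return -math.cos(angle_a - angle_b)

def chsh(a, a_prime, b, b_prime, corr=singlet_correlation):
    """Combination inside the modulus signs of (VII. 57)."""
    return (corr(a, b) + corr(a, b_prime)
            + corr(a_prime, b) - corr(a_prime, b_prime))

# Illustrative settings (in radians): a = 0, a' = 90 deg, b = 45 deg, b' = -45 deg.
settings = dict(a=0.0, a_prime=math.pi / 2, b=math.pi / 4, b_prime=-math.pi / 4)
value = abs(chsh(**settings))
print(value, value > 2.0)
```

For these settings the modulus equals 2√2, which exceeds the bound 2.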

◃ Remark

The derivation of (VII. 57) directly comes from expression (VII. 52) and as a result, the existence of

hidden variables does not have to be presumed, only locality was required. Sensationally, we seem to

have proved that quantum mechanics is empirically inconsistent with the requirement of locality. ▹

The experimental violation of the Bell inequalities thus leads us to the conclusion that physical

reality is not local. What we, however, have presupposed in the matching condition (VII. 51) is that

we can simultaneously assign values to a n and a n′ , although they cannot be simultaneously measured

because the spin measuring device cannot be at the same time in both positions ⃗a and ⃗a ′ ≠ ⃗a. In fact,

of the set of four terms in (VII. 52), at the most one of them is experimentally realizable. Still, we

spoke of outcomes of measurements that have not actually been carried out. Of course, the derivation

of the Bell inequality (VII. 57) from the matching condition (VII. 51) is mathematically flawless. The

question is whether the matching condition (VII. 51) follows from the requirement of locality. We

will now explore this question further.

VII. 4. 1 COUNTERFACTUAL CONDITIONAL STATEMENTS AND INDETERMINISM

Let a n be the outcome of experiment I. With their matching condition, Eberhard and Stapp claim

that this value of a n would be unaltered if we had carried out experiment II instead of experiment I

because these experiments only differ in the settings of the B - meter, which is far away. Therefore,

a n is the outcome which the spin meter A would have given for the n th pair of particles for both

experiment I and experiment II. Redhead (1987, p. 92) formulates this requirement as follows

PRINCIPLE OF LOCAL COUNTERFACTUAL DEFINITENESS (PLCD):

The result of an experiment which could be performed on a microscopic system has a

definite value which does not depend on the setting of a remote piece of apparatus.

This means that if this setting had been different, the outcome of the experiment would not have been different. Using the same mathematics as before it follows that

PLCD → Bell inequality. (VII. 58)

Since PLCD is an assumption of locality concerning outcomes of measurements, (VII. 58) seems

to be independent of the existence of hidden variables. But appearances are deceptive. In fact, PLCD

is only reasonable in a deterministic context, and not in the case of indeterminism.

Consider the following example given by Redhead (ibid.). Suppose that, at t 1 , just before the

clock strikes twelve, I raise my hand. Now I ask the question if the clock would also have struck if

I had not raised my hand at t 1 . Intuitively, the right answer is ‘Yes’, in agreement with PLCD. Now

replace the clock by a radioactive atom which decays at t 2 . Suppose I raised my hand at t 1 < t 2 ,

would the atom also have decayed if I had not done this? Now the answer is far from clear. If the

decay is purely indeterministic, a recurrence of the experiment, even if it is just a thought experiment,

does not have to have the same outcome. The supposition that the atom would not have decayed if I

had not raised my hand, is not contradictory to locality.

The assumptions that outcomes of measurements retain the same values even if they are not measured, or that measurements which are not carried out have certain outcomes in advance, are only reasonable in a deterministic context. But in a deterministic context these assumptions do not differ from each other, and an outcome of measurement is decisively linked to the value the quantity had just beforehand, and therefore to a hidden variable.

The conclusion is that the assumption of Eberhard and Stapp, PLCD, is no more general than the

assumption that the value a n is a property of the particles which is determined in advance, and which

is independent of the settings of the meter at B. This means that the derivation is no more general

than the derivation for a local deterministic HVT.

VII. 5 STOCHASTIC HIDDEN VARIABLES

In this section we will no longer require determinism in the HVT; the λ only determine the probability

that a quantity has a certain value, which is revealed by the measuring apparatus in the way a

balance reveals our weight. A stochastic HVT is linked more closely to quantum mechanics, enabling

a more well - defined comparison between the assumptions leading to the Bell inequalities on the one

hand, and quantum mechanics on the other.

In our stochastic HVT we assume the existence of a probability distribution at given directions $\vec{a}, \vec{b} \in \mathbb{R}^3$ of the spin meters in the EPRB experiment

$$p_{\vec{a},\vec{b}}(a, b, \lambda), \tag{VII. 59}$$

which is the probability to find for the quantities A = ⃗σ 1 · ⃗a and B = ⃗σ 2 · ⃗b the values a and b,

respectively, where it holds that a,b = ±1. Again, λ ∈ Λ is the hidden variable describing the source.

Such a probability distribution can always be written in terms of conditional probabilities,

$$p_{\vec{a},\vec{b}}(a, b, \lambda) = p_{\vec{a},\vec{b}}(a \mid b \wedge \lambda)\; p_{\vec{a},\vec{b}}(b \mid \lambda)\; \rho_{\vec{a},\vec{b}}(\lambda). \tag{VII. 60}$$

To be able to derive the Bell inequalities we make the following three suppositions.

1. Outcome independence

The probability to find a value a for ⃗a · ⃗σ is ‘completely’ determined by the settings of the spin meters and by λ; in particular, it is not necessary to specify the outcome b as well. The same holds for the probability to find a value b,

$$p_{\vec{a},\vec{b}}(a \mid b \wedge \lambda) = p_{\vec{a},\vec{b}}(a \mid \lambda) \quad \text{and} \quad p_{\vec{a},\vec{b}}(b \mid a \wedge \lambda) = p_{\vec{a},\vec{b}}(b \mid \lambda). \tag{VII. 61}$$

2. Parameter independence

The probability to find the outcome of measurement a or b is independent of the settings of the remote spin meter,

$$p_{\vec{a},\vec{b}}(a \mid \lambda) = p_{\vec{a}}(a \mid \lambda) \quad \text{and} \quad p_{\vec{a},\vec{b}}(b \mid \lambda) = p_{\vec{b}}(b \mid \lambda). \tag{VII. 62}$$

3. Source independence

The distribution of λ in the source does not depend on the settings of the spin meters,

$$\rho_{\vec{a},\vec{b}}(\lambda) = \rho(\lambda). \tag{VII. 63}$$

In principle we can adjust the spin meters ‘at the last moment’, long after the particles have left

the source. It is reasonable to assume that the source is not influenced by what happens to the

measuring devices in the future.

Now we will prove the next theorem.

BELL’S THIRD THEOREM:

A stochastic HVT which is in agreement with outcome, parameter and source independence

is empirically inconsistent with quantum mechanics.

Proof

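As a consequence of the aforementioned properties, in every local stochastic HVT, (VII. 60) becomes

$$p_{\vec{a},\vec{b}}(a, b, \lambda) = p_{\vec{a}}(a \mid \lambda)\; p_{\vec{b}}(b \mid \lambda)\; \rho(\lambda), \tag{VII. 64}$$

or

$$p_{\vec{a},\vec{b}}(a, b \mid \lambda) = p_{\vec{a}}(a \mid \lambda)\; p_{\vec{b}}(b \mid \lambda), \tag{VII. 65}$$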

which means that the quantities A and B are statistically independent of each other for given λ.

This statement is often called factorizability or conditional independence.

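Using (VII. 64), another Bell inequality can be derived for $E(\vec{a}, \vec{b})$ by means of the relation

$$
\begin{aligned}
E(\vec{a}, \vec{b}) &= \int_{\Lambda} \Bigl( p_{\vec{a},\vec{b}}(1, 1, \lambda) - p_{\vec{a},\vec{b}}(1, -1, \lambda) - p_{\vec{a},\vec{b}}(-1, 1, \lambda) + p_{\vec{a},\vec{b}}(-1, -1, \lambda) \Bigr)\, d\lambda \\
&= \int_{\Lambda} \bigl( p_{\vec{a}}(1 \mid \lambda) - p_{\vec{a}}(-1 \mid \lambda) \bigr) \bigl( p_{\vec{b}}(1 \mid \lambda) - p_{\vec{b}}(-1 \mid \lambda) \bigr)\, \rho(\lambda)\, d\lambda.
\end{aligned}
\tag{VII. 66}
$$

Defining

$$f(\vec{a}, \lambda) := p_{\vec{a}}(1 \mid \lambda) - p_{\vec{a}}(-1 \mid \lambda) \tag{VII. 67}$$

and

$$g(\vec{b}, \lambda) := p_{\vec{b}}(1 \mid \lambda) - p_{\vec{b}}(-1 \mid \lambda), \tag{VII. 68}$$

we see that

$$|f(\vec{a}, \lambda)| \leq 1 \quad \text{and} \quad |g(\vec{b}, \lambda)| \leq 1, \tag{VII. 69}$$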

which brings us back to (VII. 25) and the subsequent equations so that again we obtain the Bell

inequality (VII. 13). Violation of this Bell inequality means that (VII. 64) can not apply and

therefore no HVT can guarantee both outcome independence (VII. 61) and parameter independence

(VII. 62). □
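
The role of factorizability can be made concrete with a numerical experiment. The sketch below is an illustration under simplifying assumptions of our own: the hidden variable takes only finitely many values, and the functions f and g of (VII. 67) and (VII. 68) are drawn at random from [−1, 1] for two settings on each side. It then evaluates the discrete analogue of (VII. 66) and the resulting combination of correlations.

```python
import random

def random_local_model(n_lambda, rng):
    """Draw a weight rho(lambda) and response functions f, g with values in [-1, 1]."""
    weights = [rng.random() for _ in range(n_lambda)]
    total = sum(weights)
    rho = [w / total for w in weights]
    # f[s][k] plays the role of f(a_s, lambda_k); g likewise for the B side.
    f = [[rng.uniform(-1, 1) for _ in range(n_lambda)] for _ in range(2)]
    g = [[rng.uniform(-1, 1) for _ in range(n_lambda)] for _ in range(2)]
    return rho, f, g

def expectation(rho, f_row, g_row):
    """Discrete analogue of (VII. 66): sum over lambda of f * g * rho."""
    return sum(r * x * y for r, x, y in zip(rho, f_row, g_row))

def chsh_value(rho, f, g):
    """Modulus of E(a, b) + E(a, b') + E(a', b) - E(a', b')."""
    def e(i, j):
        return expectation(rho, f[i], g[j])
    return abs(e(0, 0) + e(0, 1) + e(1, 0) - e(1, 1))

def max_chsh(trials=2000, n_lambda=5, seed=0):
    """Largest value found over many randomly drawn factorizable models."""
    rng = random.Random(seed)
    return max(chsh_value(*random_local_model(n_lambda, rng))
               for _ in range(trials))

print(max_chsh() <= 2.0)  # True
```

No randomly drawn factorizable model exceeds the bound 2. This is of course no proof, but it illustrates that the bound depends only on (VII. 64) and (VII. 69).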

VII. 5. 1 OUTCOME, PARAMETER AND SOURCE INDEPENDENCE

The importance of the distinction between outcome and parameter independence was first brought

to attention by J. Jarrett (1984).

1. Outcome independence, (VII. 61), means that the probability of outcome b, for given λ, does

not depend on the outcome a. This is motivated by the idea that λ gives a complete description of

the state of the pair of particles; the variable λ contains an exhaustive specification of all factors

which are relevant for the outcomes of measurement. Therefore, specifying the extra information that

outcome a has occurred can, if λ is already known, not lead to new information on b.

The purpose of the requirement can be illustrated by giving the next example, in which it is not

satisfied. Suppose that two people, without looking, each draw a little ball out of a box containing two

little balls, one black and one white. Hereafter they separate, one travels to New York, the other to

Tokyo. Now consider a ‘stochastic hidden variable’ with probability 1/2 for the little balls to be black

or white. On arrival at Tokyo the traveler opens his hand and sees that his little ball is black, which

instantaneously enables him to predict the color of the little ball in New York, it has to be white. Here

the outcome of measurement of the one little ball does provide relevant information on the outcome

of a measurement of the other little ball.

The idea behind the requirement of outcome independence is that such a situation could only

occur because the HVT was incomplete; in a complete specification of the state of the pair of particles

which existed at the beginning of the trip also the color of the little balls should have been included,

even though the travelers did not know the color of their little ball. Then it automatically follows,

at given λ, that the little ball in New York is white and the observation in Tokyo provides no new

information.
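
The example of the little balls can be written out explicitly. In the sketch below, which is our own illustration, the joint distribution over (λ, outcome in Tokyo, outcome in New York) is tabulated twice: once with a λ that says nothing about the colors, and once with a λ that records which ball went where. The labels of the λ values are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

# Complete description: lambda records which ball went to Tokyo.
COMPLETE = {
    ("tokyo_black", "black", "white"): HALF,
    ("tokyo_white", "white", "black"): HALF,
}
# Incomplete description: lambda carries no information about the colors.
INCOMPLETE = {("unknown", t, n): p for (_, t, n), p in COMPLETE.items()}

def prob(dist, lam=None, tokyo=None, new_york=None):
    """Total probability of all entries compatible with the given values."""
    return sum(p for (l, t, n), p in dist.items()
               if lam in (None, l) and tokyo in (None, t) and new_york in (None, n))

def outcome_independent(dist):
    """Check p(n | t, lambda) == p(n | lambda) wherever the condition can occur."""
    colors = ("black", "white")
    for lam in {l for l, _, _ in dist}:
        for t in colors:
            if prob(dist, lam, t) == 0:
                continue  # conditioning on an impossible event is skipped
            for n in colors:
                lhs = prob(dist, lam, t, n) / prob(dist, lam, t)
                rhs = prob(dist, lam, new_york=n) / prob(dist, lam)
                if lhs != rhs:
                    return False
    return True

print(outcome_independent(INCOMPLETE), outcome_independent(COMPLETE))  # False True
```

With the incomplete λ the observation in Tokyo changes the probability for New York from 1/2 to 0 or 1; with the complete λ it adds nothing.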

2. Parameter independence, (VII. 62), means that the probability distribution of the outcomes

at A is independent of external changes at B, e.g. pointing the spin meter. The argumentation leading

to the assumption of parameter independence is generally associated with the possibility of signaling.

Suppose that, for example, adjustments $\vec{b}$ and $\vec{b}'$ existed such that

$$p_{\vec{a},\vec{b}}(a \mid \lambda) \neq p_{\vec{a},\vec{b}'}(a \mid \lambda), \tag{VII. 70}$$

then, in principle, it is possible to instantaneously exchange signals between experimenters located

at A and B. Since the experimenter located at B can choose if he points his spin meter in the

direction ⃗ b or ⃗ b ′ , an experimenter located at A is able, if the source emits particle pairs in a pure

hidden - variables state λ, to register the relative frequency of outcomes of A and thereby retrieve

which adjustment has been chosen by the experimenter at B. Violation of parameter independence

therefore means that the HVT enables the instantaneous exchange of signals over arbitrarily large

distances.

3. Source independence, (VII. 63), means that the probability distribution over the hidden variable

describing the particle pair cannot depend on the measuring directions chosen by the experimenters.

The argumentation leading to the assumption of source independence is often described

in terms of the ‘free will’ of the experimenters. The experimenters are considered to be completely

‘free’ in their decision how to point their spin meters, and even to make their choice just at the last

moment, when the particles have long left the source. Therefore, the probability distribution ρ(λ),

which characterizes the source of the particle pairs, cannot depend on that.

Of course, here too it applies that violation of the requirement is logically conceivable. It is

possible that this freedom does not exist, and that at emitting the particles, the directions in which

the experimenters will measure have already been determined. It is also conceivable that by some

other cause a correlation exists between λ and the directions ⃗a and ⃗ b, influencing both. The first case,

in which all relevant factors of the EPR experiment are determined in advance and the experimenters

have no free will, is called super - determinism. Therefore, in a super - deterministic HVT the Bell

inequalities can be violated also.

VII. 5. 2 QUANTUM MECHANICS AS A STOCHASTIC HVT

Exclusively giving probability statements concerning outcomes of measurements, a stochastic

HVT conceptually differs less from quantum mechanics than other HVT’s. In fact we can, without

objection, take quantum mechanics itself as an example of a stochastic HVT by identifying λ with the

quantum mechanical state and Λ with the relevant Hilbert space. Since quantum mechanics does not

satisfy the Bell inequalities, it is interesting to examine which of the aforementioned requirements is

violated inevitably by quantum mechanics.

3. Source independence. We already discussed the possibility of violation of the Bell inequalities by a super-deterministic theory without source independence. Whether we can somehow establish that we have free will is a philosophical question; violation of source independence is therefore a possibility, but not an inevitability. This leaves outcome and parameter independence.

2. Parameter independence. Describing the pairs of particles in the singlet state |Ψ 0 ⟩, (III. 165),

by a pure hidden - variables state, the probability distribution is a delta - distribution,

ρ Ψ0 (λ) = δ λ0 (λ) := δ(λ − λ 0 ), (VII. 71)

which leads to

$$\int_{\Lambda} p_{\vec{a},\vec{b},\lambda_0}(a, b, \lambda)\, \rho_{\Psi_0}(\lambda)\, d\lambda = p_{\vec{a},\vec{b},\lambda_0}(a, b). \tag{VII. 72}$$

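The probabilities for the outcomes of measurement are given by (III. 176),

$$
\begin{aligned}
p_{\vec{a},\vec{b},\lambda_0}(a = 1 \wedge b = 1) &= \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{\vec{a},\vec{b}}, \\
p_{\vec{a},\vec{b},\lambda_0}(a = 1 \wedge b = -1) &= \tfrac{1}{2} \cos^2 \tfrac{1}{2}\theta_{\vec{a},\vec{b}}.
\end{aligned}
\tag{VII. 73}
$$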

EXERCISE 33. Also calculate the other two joint probabilities, that is, for a = −1 ∧ b = −1

and a = −1 ∧ b = 1.

The marginal probabilities are, using (VII. 73),

$$
\begin{aligned}
p_{\vec{a},\vec{b}}(a \mid \lambda_0) &= p_{\vec{a},\vec{b},\lambda_0}(a = 1 \wedge b = 1) + p_{\vec{a},\vec{b},\lambda_0}(a = 1 \wedge b = -1) = \tfrac{1}{2}, \\
p_{\vec{a},\vec{b}}(b \mid \lambda_0) &= p_{\vec{a},\vec{b},\lambda_0}(a = 1 \wedge b = 1) + p_{\vec{a},\vec{b},\lambda_0}(a = -1 \wedge b = 1) = \tfrac{1}{2},
\end{aligned}
\tag{VII. 74}
$$

which means that, both being equal to 1/2, they do not depend on the settings of a remote measuring device. Consequently, even the quantum mechanical correlations in the singlet cannot be used for signaling; there is no actio in distans. This leads to the following theorem.

NO - SIGNALING THEOREM:

Quantum mechanics satisfies parameter independence, i.e., if subsystems of a composite

physical system no longer interact, the probability of finding certain outcomes of measurement

for an arbitrary quantity of subsystem 1 is independent of which quantity of

subsystem 2 is measured, and vice versa.
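
For the singlet, the content of the theorem can be checked numerically. The sketch below tabulates the four joint probabilities following the pattern of (VII. 73); we assume, by the symmetry of the singlet, that equal outcomes always have probability ½ sin² ½θ and opposite outcomes ½ cos² ½θ. The function names are ours.

```python
import math

def singlet_joint(a, b, theta):
    """Joint probability of outcomes a, b (each +1 or -1) at relative angle theta.

    Assumption: equal outcomes have probability (1/2) sin^2(theta/2),
    opposite outcomes (1/2) cos^2(theta/2), extending (VII. 73) by symmetry.
    """
    half = 0.5 * theta
    if a == b:
        return 0.5 * math.sin(half) ** 2
    return 0.5 * math.cos(half) ** 2

def marginal_a(a, theta):
    """Probability of outcome a at A, summed over the outcomes at B."""
    return sum(singlet_joint(a, b, theta) for b in (+1, -1))

for degrees in (0, 30, 90, 135, 180):
    theta = math.radians(degrees)
    print(degrees, marginal_a(+1, theta), marginal_a(-1, theta))
```

Whatever the relative angle, and hence whatever the setting chosen at B, the marginal probabilities at A remain 1/2.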

EXERCISE 34. Prove that the EPRB experiment is an example of the no - signaling theorem.

Optional: prove, in general, the no - signaling theorem using state operators. Whoever cannot

solve this problem, is advised to consult Ghirardi, Rimini and Weber (1980).

1. Outcome independence. In quantum mechanics it is indeed the requirement of outcome independence

that is not satisfied. The conditional probabilities, i.e., the probabilities for the spin of

particle 1 to be found in the direction ⃗a, given that the spin of particle 2 was found in the direction ⃗ b

and vice versa, which were defined in (III. 172), p. 74, are clearly not independent,

$$p_{\vec{a},\vec{b}}(a = 1 \mid \lambda_0 \wedge b = 1) = \sin^2 \tfrac{1}{2}\theta_{\vec{a},\vec{b}}, \tag{VII. 75}$$

$$p_{\vec{a},\vec{b}}(a = -1 \mid \lambda_0 \wedge b = 1) = \cos^2 \tfrac{1}{2}\theta_{\vec{a},\vec{b}}. \tag{VII. 76}$$

According to quantum mechanics, physical systems are inseparable. However, this interdependence

of outcomes cannot be used to exchange signals since we do not control the outcomes of spin measurements

and therefore we are unable to actively influence the probability distribution over the outcomes

of measurements from a distance. Shimony (1984, p. 227) called it passion at a distance. The experimenter at B can, on the basis of his observation, indeed make a better prediction concerning an outcome at A than is possible on the basis of the singlet state alone, but he cannot warn the observer at A; he can only watch passively.
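
The failure of outcome independence can be made quantitative. The following sketch compares the conditional probability of (VII. 75) with the marginal probability 1/2, and measures how far the joint distribution is from a product. As before, we assume that equal outcomes have probability ½ sin² ½θ and opposite outcomes ½ cos² ½θ; the function names are ours.

```python
import math

def joint(a, b, theta):
    """Singlet joint probability for outcomes a, b = +1 or -1 (assumed by symmetry)."""
    half = 0.5 * theta
    return 0.5 * math.sin(half) ** 2 if a == b else 0.5 * math.cos(half) ** 2

def conditional_a_given_b(a, b, theta):
    """p(a | b) = p(a, b) / p(b), cf. (VII. 75) and (VII. 76)."""
    marginal_b = joint(+1, b, theta) + joint(-1, b, theta)
    return joint(a, b, theta) / marginal_b

def factorization_defect(theta):
    """Largest |p(a, b) - p(a) p(b)| over the four pairs of outcomes."""
    signs = (+1, -1)
    def marginal_a(a):
        return sum(joint(a, b, theta) for b in signs)
    def marginal_b(b):
        return sum(joint(a, b, theta) for a in signs)
    return max(abs(joint(a, b, theta) - marginal_a(a) * marginal_b(b))
               for a in signs for b in signs)

theta = math.pi / 3
print(conditional_a_given_b(+1, +1, theta), factorization_defect(theta))
```

At θ = 60° the conditional probability is sin² 30° = 1/4 instead of the marginal value 1/2. The defect equals ¼ |cos θ| and vanishes only for perpendicular settings.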

The singlet |Ψ 0 ⟩ ∈ C 4 does violate the Bell inequalities for suitably chosen spin quantities.

The singlet is not factorizable, i.e., it cannot be written as a direct product of two states in C 2 , it is

entangled. We can raise the question if types of quantum mechanical states exist which do not violate

a Bell inequality for any choice of four spin quantities.

Capasso, Fortunato and Selleri (1973) proved that the CHSH inequality, (VII. 13), is upheld for

every choice of four spin quantities by all factorizable states and by all mixtures thereof. Violations

are therefore only possible for entangled states. Vice versa, Home and Selleri (1991, pp. 22 - 26)

proved that for every entangled pure state, that is, a state which cannot be written as a direct product,

it is always possible to choose spin quantities in such a way that the CHSH inequality is violated.

These results can be summarized in the statement that entanglement and violation of Bell inequalities

are equivalent. It confirms Schrödinger’s insight from 1935 (Schrödinger 1935a) that the

existence of entangled states marks the cardinal difference between classical and quantum mechanics.

VII. 6 AN ALGEBRAIC PROOF WITHOUT INEQUALITIES

The contradiction between a local deterministic or a local stochastic HVT, both either autonomous

or contextual, on the one hand, and quantum mechanics on the other hand, is statistical in nature, because

it concerns inequalities in terms of expectation values or probabilities, like all Bell’s theorems.

But Kochen and Specker’s theorem, which we discussed in V. 3, does not contain any inequalities. In

this case it is customary to speak of algebraic proof.

This raises the question whether an algebraic proof of Bell’s theorems is also possible, that is,

without appealing to the measurement postulate. The answer is affirmative. Using a spin state of a composite system of four particles, D.M. Greenberger, M.A. Horne and A. Zeilinger (1989) showed that it is mathematically impossible to locally and separably assign values to all spin quantities. Here we will show a simplified version given by N.D. Mermin (1993), where, in using |GHZ⟩, we refer to the aforementioned authors.

Consider a composite system of three spin 1/2 fermions with pure states in the direct product

Hilbert space C 2 ⊗ C 2 ⊗ C 2 = C 8 . We look at 10 physical quantities which correspond to the

spin operators represented in the Mermin pentagon, figure VII. 8. In this diagram $\sigma_y^1$ is shorthand for $\sigma_y(1) \otimes \mathbb{1}(2) \otimes \mathbb{1}(3)$, and $\sigma_y^1 \sigma_y^2 \sigma_x^3$ is likewise for $\sigma_y(1) \otimes \sigma_y(2) \otimes \sigma_x(3)$, etc. On every

straight line through the Mermin pentagon we find four commuting operators. These operators are

products of commuting operators with eigenvalues ±1 and therefore have eigenvalues ±1 also.

[Figure VII. 8: The Mermin pentagon. The ten operators shown are σ 1 x, σ 1 y, σ 2 x, σ 2 y, σ 3 x, σ 3 y, σ 1 x σ 2 x σ 3 x, σ 1 y σ 2 y σ 3 x, σ 1 y σ 2 x σ 3 y and σ 1 x σ 2 y σ 3 y.]

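Using the properties of the Pauli matrices (III. 122), p. 66, it can be shown that

$$
\begin{aligned}
\bigl( \sigma_x(1) \otimes \sigma_y(2) \otimes \sigma_y(3) \bigr) \bigl( \sigma_y(1) \otimes \sigma_x(2) \otimes \sigma_y(3) \bigr) \bigl( \sigma_y(1) \otimes \sigma_y(2) \otimes \sigma_x(3) \bigr) \\
= -\, \sigma_x(1) \otimes \sigma_x(2) \otimes \sigma_x(3),
\end{aligned}
\tag{VII. 77}
$$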

where we note that the four operators acting in C 8 commute. Consequently, they have a simultaneous

eigenstate in C 8 , having eigenvalue +1 for the three operators on the left - hand side of the equation,

and eigenvalue −1 for the operator on the right - hand side. The entangled state in C 8 ,

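$$|\mathrm{GHZ}\rangle := \tfrac{1}{2}\sqrt{2}\, \bigl( |z{\uparrow}\rangle \otimes |z{\uparrow}\rangle \otimes |z{\uparrow}\rangle - |z{\downarrow}\rangle \otimes |z{\downarrow}\rangle \otimes |z{\downarrow}\rangle \bigr), \tag{VII. 78}$$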

is such a state.
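
That |GHZ⟩ has the stated eigenvalues can be checked directly. The sketch below is a minimal illustration in plain Python: a three-particle spin state is stored as a dictionary from bit strings to amplitudes, where bit 0 stands for |z ↑⟩ and bit 1 for |z ↓⟩, and the Pauli operators act in the standard representation σ x |z ↑⟩ = |z ↓⟩, σ y |z ↑⟩ = i |z ↓⟩, σ y |z ↓⟩ = −i |z ↑⟩. The data layout and function names are our own choice.

```python
import math

def apply_pauli(kind, qubit, state):
    """Apply sigma_x or sigma_y to one particle (index 0, 1, 2) of a 3-spin state."""
    result = {}
    for bits, amplitude in state.items():
        flipped = list(bits)
        flipped[qubit] = 1 - bits[qubit]  # both sigma_x and sigma_y flip the spin
        if kind == "x":
            factor = 1
        else:  # sigma_y: +i when acting on up (0), -i when acting on down (1)
            factor = 1j if bits[qubit] == 0 else -1j
        key = tuple(flipped)
        result[key] = result.get(key, 0) + factor * amplitude
    return result

def apply_product(kinds, state):
    """Apply sigma_{kinds[0]}(1) (x) sigma_{kinds[1]}(2) (x) sigma_{kinds[2]}(3)."""
    for qubit, kind in enumerate(kinds):
        state = apply_pauli(kind, qubit, state)
    return state

def eigenvalue(kinds, state):
    """Return +1 or -1 if the state is an eigenstate of the product, else None."""
    image = apply_product(kinds, state)
    for value in (+1, -1):
        if all(abs(image.get(k, 0) - value * state.get(k, 0)) < 1e-12
               for k in set(image) | set(state)):
            return value
    return None

# The state (VII. 78): (|up up up> - |down down down>) / sqrt(2).
GHZ = {(0, 0, 0): 1 / math.sqrt(2), (1, 1, 1): -1 / math.sqrt(2)}

for kinds in ("xyy", "yxy", "yyx", "xxx"):
    print(kinds, eigenvalue(kinds, GHZ))
```

The three operators on the left-hand side of (VII. 77) yield +1 and the operator on the right-hand side yields −1, as claimed.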

We assume that the three particles are already far away from each other and are moving still further

apart, and the composite system is, as far as spin is concerned, in the state |GHZ⟩. A measurement of

two particles, of which we assume that it does not influence the third particle in any way, determines

the value of the third particle because, according to quantum mechanics, the product of the outcomes

of measurement is determined.

According to a HVT, at the moment a measurement is made the values of the spin quantities

are revealed. If we call these values w x (1) for the spin in x - direction of particle 1, etc, then, because

|GHZ⟩, (VII. 78), is a simultaneous eigenstate for the four quantities in C 8 , (VII. 77), it must

hold that

$$w_x(1)\, w_y(2)\, w_y(3) = w_y(1)\, w_x(2)\, w_y(3) = w_y(1)\, w_y(2)\, w_x(3) = +1, \tag{VII. 79}$$

and

$$w_x(1)\, w_x(2)\, w_x(3) = -1. \tag{VII. 80}$$

The product of these four factors is

$$
\begin{aligned}
\bigl( w_x(1)\, w_y(2)\, w_y(3) \bigr) \bigl( w_y(1)\, w_x(2)\, w_y(3) \bigr) \bigl( w_y(1)\, w_y(2)\, w_x(3) \bigr) \bigl( w_x(1)\, w_x(2)\, w_x(3) \bigr) \\
= (+1)\,(+1)\,(+1)\,(-1) = -1.
\end{aligned}
\tag{VII. 81}
$$

But if we consider the product as a product of the 12 values of these spin quantities we find

$$
\begin{aligned}
w_x(1)\, w_y(2)\, w_y(3)\; w_y(1)\, w_x(2)\, w_y(3)\; w_y(1)\, w_y(2)\, w_x(3)\; w_x(1)\, w_x(2)\, w_x(3) \\
= w_x^2(1)\, w_y^2(1)\, w_x^2(2)\, w_y^2(2)\, w_x^2(3)\, w_y^2(3) = 1^6 = +1.
\end{aligned}
\tag{VII. 82}
$$

This leads to +1 = −1, which is, of course, an algebraic absurdity. And indeed, this is an algebraic proof since it contains no probabilities or inequalities.
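
Because only six values w x (i), w y (i) = ±1 are involved, the impossibility can also be confirmed by simply trying all 64 assignments. A small sketch (function names are ours):

```python
from itertools import product

def satisfies_ghz_constraints(wx, wy):
    """wx, wy: tuples of three values +1/-1, one per particle.

    Returns True if the assignment obeys (VII. 79) and (VII. 80).
    """
    return (wx[0] * wy[1] * wy[2] == 1
            and wy[0] * wx[1] * wy[2] == 1
            and wy[0] * wy[1] * wx[2] == 1
            and wx[0] * wx[1] * wx[2] == -1)

def consistent_assignments():
    """List every value assignment compatible with all four product conditions."""
    signs = (+1, -1)
    return [(wx, wy)
            for wx in product(signs, repeat=3)
            for wy in product(signs, repeat=3)
            if satisfies_ghz_constraints(wx, wy)]

print(len(consistent_assignments()))  # 0
```

The search comes up empty, in agreement with the contradiction between (VII. 81) and (VII. 82).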

EXERCISE 35. What kind of HVT is excluded by the foregoing reasoning? Which postulates of

quantum mechanics are necessary to obtain the contradiction?

VII. 7 MISCELLANEA

The literature concerning the Bell inequalities has grown to an extraordinary extent since the 1970s, although its growth has slowed in recent years. To conclude this chapter we will briefly discuss some of the main topics.

VII. 7. 1 LOCALITY AND RELATIVITY

Although in these lecture notes we have restricted ourselves to non-relativistic quantum mechanics, so that the speed of light did not play a role in our considerations, it is, of course, especially the special theory of relativity which provides the inspiration to study the (im)possibility of signaling. Therefore, it is interesting to consider the EPRB experiment schematically in a Minkowski diagram, figure VII. 9.

[Figure VII. 9: Minkowski diagram of the EPRB experiment, with axes x and ct, where λ is in the past light cones of both A and B]

A natural requirement of locality for a relativistic stochastic HVT is that the probability of an

outcome A depends exclusively on the variables which specify the state in the past light cone of

the measuring event at A, and likewise for B. Bell has called it local causality. We have seen that

quantum mechanics is not a local causal theory. Indeed, the probability of an outcome at A cannot be influenced by the choice of the direction of measurement ⃗ b at B, but with the outcome at B, which can be registered there, an observer at B can make a prediction concerning the particle at A which an observer at A cannot make, even if he has complete knowledge of the state in the past light cone of A.

VII. 7. 2 LOCALITY VERSUS CONDITIONAL INDEPENDENCE

A problem that is brought up in some publications, e.g. Fine (1982), De Muynck (1986, 1996),

is to what extent locality is necessary to derive the Bell inequalities. The authors argue that ‘requirements of locality’ express only a special form of statistical independence. The distance between the measuring apparatuses is in no way manifest in the requirement. Although ‘locality’ is a term which seems to presuppose a space-time, space-time is conspicuous by its absence in the relevant locality assumptions; they are all probability statements without reference to space or time.

Indeed, strictly speaking one cannot say that these assumptions express a requirement of locality.

It could be possible to expect an analogous independence for a hypothetical pair of particles, for

example a photon and a gluon, which absolutely cannot interact with each other, but are located very

close to each other. The essence is that in a local theory the large distance between the particles can

be taken to be a sufficient, but not necessary condition for the absence of interactions.

The requirement of outcome independence in the HVT is not a representation of the requirement

of locality, it has only been motivated by it. The conclusion that is sometimes drawn from this, that

apparently locality itself is irrelevant for the Bell inequality, is, however, incorrect. Factual violation

of the Bell inequality means that every stochastic HVT satisfying the factorizability as formulated in

section VII. 5 is excluded, and therefore, also the local versions are excluded.

VII. 7. 3 DETERMINISM

Another widespread view is that the derivation of the Bell inequalities always relies on a supposition of determinism in the HVT, so that giving up determinism would be a possible way out of the Bell inequalities.

Bell himself has emphasized the inadequacy of this view. Determinism, which is the possibility

to make predictions concerning a remote object with certainty before making measurements, indeed

plays an important role in the original version. But this is a consequence of the perfect correlation

in the quantum mechanical expression (VII. 5), i.e., this determinism follows from the singlet state

itself, and is not a specific supposition of the HVT, see for instance Suppes and Zanotti (1976), and

Dieks (1983).

We saw that in a stochastic, or indeterministic, HVT the Bell inequalities are also derivable, so

that giving up determinism does not help. Moreover, the opposite is true; especially super - determinism,

the supposition that also the choice of the direction of measurement by the experimenter is

determined in advance, offers a way out of the Bell inequalities.

VIII THE MEASUREMENT PROBLEM

[. . . ] if one has to stick to these darn quantum jumps then I regret that I ever have taken

part in the whole thing.

— Erwin Schrödinger

In this final chapter we will elaborate on the most important interpretation problem, the measurement problem, which has been the subject of an ever-continuing series of publications. We will give

an introduction to Von Neumann’s quantum mechanical measurement theory and formulate the

measurement problem, we will go through a number of attempts to solve it, and finally we will

discuss some criticism of the theory.

VIII. 1 INTRODUCTION

The term ‘measurement’ plays a very special role in quantum mechanics, and we suggest a short

rereading of the first paragraphs of chapter V. It is remarkable that the term arises in the Von Neumann

postulates as described in chapter III, p. 41, ff. Both in the measurement postulate, specifying the

possible outcomes of measurement and giving a physical meaning to the probability measure which

is determined by the state vector, or the state operator, in terms of outcomes of measurement, and

in the projection postulate, establishing the evolution in time of the state at measurement, the term

‘measurement’ comes forward.

That special role also becomes apparent in the debates concerning the interpretation of the theory,

where it is frequently remarked that measurement ‘creates’ the value for a quantity, or that it causes a

sudden state change, as expressed by Dirac (1958, p. 36),

In this way we see that a measurement always causes the system to jump into an eigenstate

of the dynamical variable that is being measured, the eigenvalue this eigenstate

belongs to being equal to the result of the measurement.

From the perspective of classical physics, this is extremely unusual. In Newton’s theory of gravitation,

or the electrodynamics of Faraday and Maxwell, measurements are sometimes mentioned, as

suppliers of experimental facts, but never as specific types of operation on physical systems, needing

a separate treatment in the theory.

The point here is not only that measurements in classical physics, as is frequently stated, always

bring about a negligible or compensable disturbance of the system and therefore can remain outside

consideration; much more important is that in classical physics there is no distinction in principle


between processes which serve as measurements and processes that do not. Every physical process

or every mutual influence of physical systems can, under suitable circumstances, be considered as

a measurement. Since it is the physical theory that indicates which physical processes in nature are

possible, the theory itself also provides the criterion for the kinds of measurements which are possible.

According to Von Neumann’s postulates, in quantum mechanics this is exactly the other way

around. First we must, according to the aforementioned postulates, have a criterion to know when a

process is a measurement, before we can indicate what the theory has to say concerning the process,

before we can apply the postulates. That the term measurement in this way gets a more fundamental

status than the physical theory, is also expressed by the words of Pauli as quoted in chapter I, p. 9,

that a measurement creating values is “outside the laws of nature”.

Intuition tells us that measurements are just an ‘ordinary kind’ of physical interactions, and this

intuition cannot easily be wiped out, as the following illustration shows. Consider a photon which

has gone through a slit and is on its way to a photographic plate. If we presume the interaction with

this photographic plate to be a measurement, the wave function of the photon must, according to the

projection postulate, collapse on arrival at the plate. But we also know that the photographic plate has

a microscopic structure. It contains silver atoms in an emulsion which can be excited by the photon

and start a chemical process in such a way that we can see something when the plate is developed.

Would it not be plausible that quantum mechanics could describe such a process using a Schrödinger

equation?

In every way this event looks like a physical interaction which falls completely within the well-known
laws of nature, instead of outside them. And if this is denied, how shall we decide at all when

a microscopic interaction between a photon and an atom can and when it cannot be labeled as a

measurement? Asking an experimental physicist how her measurement setup works, one will be

given an answer in which physical interactions, generally of electromagnetic nature, are of uppermost

importance. It seems absurd to maintain that events take place in the laboratory that are “outside the laws

of nature”.

The clash between the conception that measurements do not differ from other physical interactions

on the one hand, and the fact that measurements in quantum mechanics acquired a special status

because they are not classified as physical interactions on the other hand, is called the quantum

mechanical measurement problem in the broad sense.

VIII. 2

MEASUREMENT ACCORDING TO CLASSICAL PHYSICS

Although usually no special attention is given to measurements in classical physics, it is no problem

to give a general, schematic description of how a measurement is treated classically.

A measurement brings about a correlation between a quantity A of a physical system S, which
is, within the context of a measurement, frequently called an object system, and a quantity R (the
letter R stands for reading), which is characteristic of the measuring apparatus M, the apparatus
being a physical system also. In classical physics we assume that A has a certain value $a \in \mathbb{R}$,
where $a$ is an element of a set of possible values, for instance $\{a_1, \ldots, a_n\} \subset \mathbb{R}$, and that after the
measurement process R has the value $r_j = m(a_j)$, where $m$ is a bijection from the possible values of A
before the measurement to the possible values of R after the measurement.


Take, for example, S to be yourself and M to be a balance, A is your weight and R is the reading

of the pointer of the balance. Now you have an unknown weight value, a, which is revealed by

the balance indicating r = m(a) = 63 kg. The role of a measurement is pragmatic; the value of

a physical quantity of the object system which is not directly or not easily observable, for example

mass, is correlated to a quantity that is directly observable, in this case the position of a pointer. For a

correlation to occur between A and R there must be an interaction between S and M. This interaction

can, potentially, influence the value of A in such a way that the value before the measurement can

change to another value after measurement. Measurement is a process looking towards the past and

its aim is to reveal the value of A before the interaction with M.

If it is possible to predict, from the value a and the interaction between S and M, the value a ′

which A has after measurement, then the measurement also looks at the future and acts like an apparatus

which prepares a state of S in which A has the value $a'$. Think, for example, of an ammeter
in an electric circuit with an energy source of V volt; if the current through a resistor R is $I = V/R$
without the ammeter, then, after the ammeter has been connected in series with the resistor, the current
equals $I' = V/(R + R_s)$, where $R_s$ is the internal resistance of the ammeter. In case $a'$ equals $a$, the

measurement is called non - disturbing or ideal. The measurement process thus has two aspects; what

happens to the measuring apparatus M, and what happens to the physical system S, i.e. measurement

and state preparation.
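The ammeter example can be put into numbers. The following is a minimal sketch; the function name `current` and the values for the voltage, the resistor and the internal resistance are hypothetical and chosen only for illustration.

```python
def current(voltage, resistance, ammeter_resistance=0.0):
    """Current through a resistor in series with an (optional) ammeter, by Ohm's law."""
    return voltage / (resistance + ammeter_resistance)

# Hypothetical values: 10 V source, 100 ohm resistor, 1 ohm internal resistance.
V, R, R_s = 10.0, 100.0, 1.0

I_before = current(V, R)        # value a: current without the ammeter
I_after = current(V, R, R_s)    # value a': current with the ammeter in series

# Relative disturbance caused by the measurement, equal to R_s / (R + R_s).
relative_disturbance = (I_before - I_after) / I_before
print(I_before, I_after, relative_disturbance)
```

The disturbance vanishes as the internal resistance goes to zero, which is the sense in which a classical measurement can be made ideal.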

In classical physics the measurement interaction can be taken to be arbitrarily small, in which

case the value of A is not disturbed. Therefore, the transition in such an ideal measurement process

is

$$(a_j, r_0) \longrightarrow (a_j, r_j) = \bigl( a_j, m(a_j) \bigr). \tag{VIII. 1}$$

Notice that the characteristics of the measurement are left out of the consideration. The method

of measuring does not have anything to do with the phenomenon one wants to get information about.

The motion of the planets in the gravitational field of the sun is studied by looking at them, i.e., by

using the fact that the planets reflect sunlight. The optical instruments that are used have nothing to

do with the gravitational motion under examination.

Also notice that in this consideration the question how to measure A is only transformed into the

question how to find the value of R. If we also would have to measure the value of R, this could

lead to an infinite chain of measuring apparatuses. This is avoided by assuming that the quantity R is

directly observable, hence the term pointer reading for R, where we have to take the term ‘pointer’

very generally, for instance, screens showing results of measurements or results printed on paper are

included in the term.

We appeal in our description to a distinction between two different types of quantities; the directly

observable quantities, that is, observable to the naked eye, versus the not directly observable or unobservable

quantities. But this is not a distinction which corresponds to a fundamental distinction of

these quantities; in classical physics all quantities are treated as properties of objects. The fact that we

stop at a directly observable quantity R is a decision based on purely contingent factors, particularly

human physiology and the physics of the human senses.


VIII. 3

MEASUREMENT ACCORDING TO QUANTUM MECHANICS

The following schematic representation of the measurement process in quantum mechanics is

given by Von Neumann (1932).

Suppose that A is a physical quantity of the object system S, represented quantum mechanically
by the maximal operator $A$ on the Hilbert space $\mathcal{H}_S$, having a discrete spectrum $a_1, \ldots, a_N$. Now
let S interact with a measuring apparatus M, where M is described quantum mechanically also.
For M to be able to function as a measuring apparatus, it has to have a
pointer quantity R, represented by the operator $R$ on the Hilbert space $\mathcal{H}_M$, having orthonormal eigenstates
$|r_0\rangle, \ldots, |r_N\rangle$. These eigenstates have to be orthonormal since they correspond to pointer
readings which can be distinguished by the human eye. Let $|r_0\rangle$ be the eigenstate in which the pointer
shows no deflection. The Hilbert space of the composite system SM is $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_M$ with
$\dim \mathcal{H}_M = \dim \mathcal{H}_S + 1$, since the eigenbasis of $R$ includes $|r_0\rangle$, whereas that of $A$ contains no $|a_0\rangle$.

Prior to the measurement, the measuring apparatus M is in the eigenstate $|r_0\rangle$. We want this state
to change, as a result of the measurement interaction, into the eigenstate $|r_j\rangle$ which is indicative of the
value $a_j$ of A; thus, let S initially be in the eigenstate $|a_j\rangle$ of $A$. Moreover, we want the measurement
to be ideal, so that the state $|a_j\rangle$ of S does not change.

Von Neumann showed that this transition can indeed be brought about by a unitary transformation,

which means we have to find for the composite system SM a unitary evolution operator U, inducing

the transition

$$U \bigl( |a_j\rangle \otimes |r_0\rangle \bigr) = |a_j\rangle \otimes |r_j\rangle, \tag{VIII. 2}$$

where U describes the measurement interaction lasting some unspecified time interval.

EXERCISE 36. Show that the operator

$$U = \sum_{l=1}^{N} \sum_{m=0}^{N} |a_l\rangle \otimes |r_{[l+m]}\rangle \, \langle a_l| \otimes \langle r_m| \tag{VIII. 3}$$

(a) is unitary, and (b) induces the desired transition (VIII. 2). Here, [l + m] means l + m modulo

N + 1, i.e.: [N + 1] = 0, [N + 2] = 1, etc.
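A numerical check is no substitute for the proof asked for in the exercise, but it can help to see how the operator acts. The sketch below builds (VIII. 3) as a matrix for a small, arbitrarily chosen N, assuming the product basis is ordered so that $|a_l\rangle \otimes |r_m\rangle$ sits at index $(l-1)(N+1) + m$; the helper names `von_neumann_U` and `product_state` are hypothetical.

```python
import numpy as np

def von_neumann_U(N):
    """Matrix of (VIII. 3) on H_S (dim N) tensor H_M (dim N + 1)."""
    dM = N + 1
    U = np.zeros((N * dM, N * dM))
    for l in range(1, N + 1):              # object index l = 1..N
        for m in range(0, N + 1):          # pointer index m = 0..N
            col = (l - 1) * dM + m                   # |a_l> (x) |r_m>
            row = (l - 1) * dM + (l + m) % (N + 1)   # |a_l> (x) |r_[l+m]>
            U[row, col] = 1.0
    return U

def product_state(l, m, N):
    """Basis vector |a_l> (x) |r_m> in the same ordering."""
    v = np.zeros(N * (N + 1))
    v[(l - 1) * (N + 1) + m] = 1.0
    return v

N = 3                                      # arbitrary small example
U = von_neumann_U(N)

# (a) U is real here, so unitarity reduces to U^T U = identity.
is_unitary = np.allclose(U.T @ U, np.eye(N * (N + 1)))

# (b) U maps |a_j> (x) |r_0> to |a_j> (x) |r_j> for every j = 1..N.
transitions_ok = all(
    np.allclose(U @ product_state(j, 0, N), product_state(j, j, N))
    for j in range(1, N + 1)
)
print(is_unitary, transitions_ok)
```

For each fixed l the map m to [l + m] merely permutes the pointer basis, so U is a permutation matrix; this is the observation on which an analytic proof can be built.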

The formula (VIII. 2) strongly resembles the transition (VIII. 1). Apparently, everything we desired

concerning the ideal measurement process in quantum mechanics, including the requirement

that the value of A must not be disturbed, can be achieved using a unitary operator. At first sight,

there does not seem to be any problem with a completely quantum mechanical treatment of the measurement

interaction, taken as an ordinary physical process obeying Schrödinger’s equation. As in the

classical case, the method of measuring is not discussed. We also did not appeal to the measurement

or the projection postulate.


However, at a second look, the transition (VIII. 2) turns out to have peculiar consequences. The

formula (VIII. 2) assumed that the object system S was, before the measurement, in an eigenstate of

A. But what if S is in an arbitrary state |ψ⟩ ∈ H S ?

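We can decompose this arbitrary state $|\psi\rangle$ into the orthonormal eigenstates $|a_j\rangle$ of $A$ with coefficients
$c_j = \langle a_j | \psi \rangle$. Therefore, using $|\psi\rangle = \sum_j c_j |a_j\rangle$ and the linearity of the evolution operator, it
follows that
$$\begin{aligned}
U \bigl( |\psi\rangle \otimes |r_0\rangle \bigr) &= U \sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_0\rangle = \sum_{j=1}^{N_S} c_j \, U \bigl( |a_j\rangle \otimes |r_0\rangle \bigr) \\
&= \sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_j\rangle =: |\Phi\rangle.
\end{aligned} \tag{VIII. 4}$$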

We see that the state $|\Phi\rangle$ of the composite system of object S and measuring apparatus M after the
measurement is no longer a product state; rather, it is entangled (provided more than one $c_j$ is non-zero).
This implies that we cannot describe S, nor M, by a pure state; taking the partial trace yields mixed states
for both S and M, see section III. 4.
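The mixedness of the partial states can be checked numerically. The sketch below is an illustration under stated assumptions: N = 3, a hypothetical choice of amplitudes $c_j$, and the state (VIII. 4) stored as a coefficient matrix whose rows label the $|a_j\rangle$ and whose columns label the $|r_k\rangle$. A purity $\mathrm{Tr}\,\rho^2$ below 1 signals a mixed state.

```python
import numpy as np

N = 3
dS, dM = N, N + 1
c = np.array([0.5, 0.5j, np.sqrt(0.5)])    # hypothetical amplitudes, sum |c_j|^2 = 1

# |Phi> = sum_j c_j |a_j> (x) |r_j>; Phi[j-1, k] is the coefficient of |a_j> (x) |r_k>.
Phi = np.zeros((dS, dM), dtype=complex)
for j in range(1, N + 1):
    Phi[j - 1, j] = c[j - 1]

rho_S = Phi @ Phi.conj().T     # state of S: partial trace over M
rho_M = Phi.T @ Phi.conj()     # state of M: partial trace over S

purity_S = np.trace(rho_S @ rho_S).real
purity_M = np.trace(rho_M @ rho_M).real
print(purity_S, purity_M)      # both equal sum_j |c_j|^4
```

Both reduced states are diagonal with entries $|c_j|^2$, so all phase information about the $c_j$ has disappeared from the description of S or M alone.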

This aspect has no classical analogue. We will come back to this, but first we consider the question

whether this quantum mechanical description of the measurement process is compatible with the

measurement postulate. Or, more precisely, whether application of the measurement postulate to R on M
leads to the same result as its direct application to A on S. And we ask whether the desired correlation

between the values of A and R is achieved. We will show now that this is indeed the case.

The quantity R of the measuring apparatus M is represented on the Hilbert space $\mathcal{H}_S \otimes \mathcal{H}_M$ of
the composite system SM as $\mathbb{1} \otimes R$. The probability to find for this quantity the value $r_k$ is, according
to the measurement postulate,
$$\mathrm{Prob}_{|\Phi\rangle}(R : r_k) = \langle\Phi| \bigl( \mathbb{1} \otimes |r_k\rangle \langle r_k| \bigr) |\Phi\rangle. \tag{VIII. 5}$$
With (VIII. 4) this yields
$$\mathrm{Prob}_{|\Phi\rangle}(R : r_k) = |c_k|^2, \tag{VIII. 6}$$

where we have used the orthonormality of the $|r_k\rangle \in \mathcal{H}_M$. This is the same result as yielded by
direct application of the measurement postulate to the arbitrary state $|\psi\rangle$ of S used in (VIII. 4). Apparently, the
probability to find an outcome $r_k$ when measuring R on M is always equal to the probability to find
the outcome $a_k$ of A on S. The former measurement can therefore be regarded as a substitute for the
latter.

The validity of (VIII. 6) itself does not show that a correlation between the value of A and R has

been established. To show that such a correlation exists, we have to know the probability of a certain

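pair of outcomes $(a_i, r_k)$ for $A \otimes R$, in the state $|\Phi\rangle$ of (VIII. 4). The joint probability to find this pair
of outcomes is
$$\begin{aligned}
\mathrm{Prob}_{|\Phi\rangle}(A : a_i \wedge R : r_k) &= \langle\Phi| \bigl( |a_i\rangle \langle a_i| \otimes |r_k\rangle \langle r_k| \bigr) |\Phi\rangle \\
&= \bigl| \bigl( \langle a_i| \otimes \langle r_k| \bigr) |\Phi\rangle \bigr|^2 = |c_i|^2 \, \delta_{ik}.
\end{aligned} \tag{VIII. 7}$$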


The conditional probability to find for A the value $a_i$, given that for R the value $r_k$ has been found,
is therefore
$$\mathrm{Prob}_{|\Phi\rangle}(A : a_i \mid R : r_k) = \frac{\mathrm{Prob}(A : a_i \wedge R : r_k)}{\mathrm{Prob}(R : r_k)} = \frac{|c_i|^2 \, \delta_{ik}}{|c_k|^2} = \delta_{ik}. \tag{VIII. 8}$$

In other words, in the state |Φ⟩ a strict correlation exists between the quantities A and R, represented

quantum mechanically by the operators A and R.
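The probabilities (VIII. 6) to (VIII. 8) can be reproduced for a concrete case. The sketch below assumes N = 3 and hypothetical amplitudes $c_j$; because the projectors in (VIII. 7) are built from the product basis, the joint probability is simply the squared modulus of the corresponding coefficient of $|\Phi\rangle$.

```python
import numpy as np

N = 3
dS, dM = N, N + 1
c = np.array([0.5, 0.5j, np.sqrt(0.5)])    # hypothetical amplitudes, sum |c_j|^2 = 1

# Coefficient matrix of |Phi> = sum_j c_j |a_j> (x) |r_j>.
Phi = np.zeros((dS, dM), dtype=complex)
for j in range(1, N + 1):
    Phi[j - 1, j] = c[j - 1]

# (VIII. 7): joint[i-1, k] = Prob(A: a_i and R: r_k) = |(<a_i| (x) <r_k|) |Phi>|^2.
joint = np.abs(Phi) ** 2

# (VIII. 6): marginal distribution of the pointer readings r_0..r_N.
prob_R = joint.sum(axis=0)

# (VIII. 8): conditional distribution of A given R = r_k, for readings that can occur.
conditional = {k: joint[:, k] / prob_R[k] for k in range(dM) if prob_R[k] > 0}
for k, dist in conditional.items():
    print(k, dist)
```

The undeflected reading $r_0$ has probability zero in the final state, which is why it is excluded from the conditional distributions.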

The schematic representation of the ideal measurement process is, as we have seen, consistent

with the measurement postulate in the sense that a measurement on M can be a substitute for a measurement

on S. Notice that, to answer this question, we did appeal to the measurement postulate.

This is unavoidable, since the final state after the measurement process, (VIII. 4), is an entangled

quantum state. We can only specify its empirical consequences by appealing to the meaning quantum

mechanics attributes to such quantum states, and in Von Neumann’s postulates that meaning is established

by means of the measurement postulate. Unfortunately, this postulate forces us to consider a

measurement again, namely, a measurement on the measuring apparatus M itself, by reading off the

position of the pointer. Now we have to ask if this second measurement can also be represented as a

normal interaction.

Suppose that we introduce a second measuring apparatus M′ which we use to read off the result
of M using a new pointer quantity R′, represented by the operator $R'$ on $\mathcal{H}_{M'}$. As an example, we
can think of a quantum mechanical description of our eye. Schematically, we then have the process
$$|r_j\rangle \otimes |r'_0\rangle \longrightarrow |r_j\rangle \otimes |r'_j\rangle, \tag{VIII. 9}$$
where the $|r'_j\rangle$ are the eigenstates of $R'$ of M′. Let $U'$ be the unitary operator describing the measurement
by M′ on M, lasting again some unspecified amount of time. Now we have, for the composite
system SMM′ in the Hilbert space $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_M \otimes \mathcal{H}_{M'}$,
$$|a_j\rangle \otimes |r_0\rangle \otimes |r'_0\rangle \;\xrightarrow{\;U\;}\; |a_j\rangle \otimes |r_j\rangle \otimes |r'_0\rangle \;\xrightarrow{\;U'\;}\; |a_j\rangle \otimes |r_j\rangle \otimes |r'_j\rangle, \tag{VIII. 10}$$
and therefore, if we start from a general initial state $|\psi\rangle \otimes |r_0\rangle \otimes |r'_0\rangle$, the final state will be
$$|\Phi'\rangle = U' U \bigl( |\psi\rangle \otimes |r_0\rangle \otimes |r'_0\rangle \bigr) = \sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_j\rangle \otimes |r'_j\rangle. \tag{VIII. 11}$$
Again, one can argue that all this is consistent with the measurement postulate. That is, upon measurement
of R′, the probability of finding the value $r'_k$ is equal to $|c_k|^2$, etc.

We can extend this type of reasoning ad nauseam, by incorporating more and more systems in

the chain of measurement apparatuses, even including a photon scattered by the pointer and entering

the eye of the observer, his retina, the nerve fibres of his brain, etc. All this is consistent with the

measurement postulate, and you can, if you want to, be satisfied with this.

However, the argument does not show that we can take measurements to be on an entirely equal

footing with other physical interactions. No matter how far we extend the chain of apparatuses, the

final state will always be a superposition of the form (VIII. 4) or (VIII. 11), the meaning of which
can only be specified by saying what we will find at yet another measurement. The transition to the

conclusion that a certain state has been actually found, sometimes called the ‘Heisenberg cut’ (e.g.

Primas 1993), cannot be made within the formalism. Rudolf Haag has expressed this situation as

follows (Haag 1990, p. 246),

Indeed the problem faced in the development in quantum theory has [. . . ] been [. . . ] the

inability of devising any coherent realistic picture conforming with the observed phenomena.

We can shift the place where we want to make the Heisenberg cut at will, by incorporating more

and more systems in the quantum mechanical description. But the transition itself, exchanging the

quantum mechanical description for a description in terms of observed facts, must come from outside

quantum mechanics.

One can of course, in analogy to the classical measurement scheme, simply postulate that this

quantum mechanical description of the measurement process ends as soon as we can couple the system

S, perhaps by means of many intermediate steps, to some measuring apparatus M whose pointer

quantity R is directly observable. But here we are dealing with the fundamental issue in the theory

and therefore we cannot be satisfied with a pragmatical point of view. Also, we would be faced

with the question which quantities deserve to have the special status of being “directly observable”.

Furthermore, there is the problem that the final states of (VIII. 4) or (VIII. 11) are entangled superpositions

of states with different pointer positions. As we mentioned above, this has no classical analogue.

Without further analysis it is hard to imagine what a direct observation on such states would look like.

An example in which these issues emerge sharply is Schrödinger’s famous cat paradox (1935b),

which we discussed already in the introduction. Schrödinger imagined that a living cat is locked up in

a hermetically closed box, together with a radioactive substance of which perhaps one atom decays

in the course of one hour. The box is provided with a Geiger counter which can register the decay of

the atom, and upon decay activates a device which releases a deadly gas.

Assume that initially the quantum mechanical state of this total system is a product state with a

very large number of factors. The state of the radioactive atom evolves in the course of the hour we

agreed upon to wait into a superposition of the atom before and after decay. The evolution of the total

state then takes on the same form as the state $|\Phi'\rangle$ in (VIII. 11), i.e., the state evolves into something
like
$$\begin{aligned}
& c_1(t) \, |A : 1\rangle \otimes |\nu : 0\rangle \otimes \cdots \otimes |\mathrm{cat} : \smile\rangle \\
{} + {} & c_2(t) \, |A : 0\rangle \otimes |\nu : 1\rangle \otimes \cdots \otimes |\mathrm{cat} : \dagger\rangle,
\end{aligned} \tag{VIII. 12}$$
where $t$ is the time we wait while the system evolves, $|A : 1\rangle$ and $|A : 0\rangle$ are the states of the radioactive
atom before and after decay, $|\nu : 1\rangle$ and $|\nu : 0\rangle$ are the states of the electromagnetic field with
and without a photon, etc.

The composite system is therefore in a gigantic superposition of states in which the cat is living

and in which it is dead. If we want to hold on to the orthodox interpretation of quantum mechanics

to the bitter end, we have to say that in this state the cat is neither living nor dead, and that only at

measuring, which is perhaps lifting the lid of the box after one hour, there is a certain probability,

namely $|c_2|^2$ versus $|c_1|^2$, to find the cat dead or alive. It is the observer, the opener of the box, who

determines the fate of the cat.


Figure VIII. 1: Schrödinger’s cat paradox (DeWitt 1970)

VIII. 4

THE MEASUREMENT PROBLEM IN THE NARROW SENSE

In the previous section we have seen how the measurement process, (VIII. 4), brings the composite
system into a superposition of macroscopically different states, e.g. pointer positions. The development of

such superpositions is a consequence of the linearity of the evolution operator. An example is given in

the discussion between Einstein and Pauli, described in the introduction, p. 9, concerning the center of

mass of a macroscopic body. The strangeness of a superposition comes from our tacit presupposition

that the macroscopic pointer positions not only act as possible outcomes of a measurement, but can

also be taken as properties of the pointer. We think that pointers of a measuring apparatus indicate

something, even if we are not in the act of reading them off.

Assuming, for the sake of convenience, that observing something is sufficient to decide that there

is an element of physical reality which is responsible for the observation, we expect that if the quantum

state presents a complete description of the system, i.e., if every element of physical reality has a

counterpart in quantum mechanics, then those macroscopic properties should be represented by it.

That is, however, not the case in the state (VIII. 4).

The idea playing a background role is the next postulate, often called the ‘eigenstate-eigenvalue

link’. It was explicitly supported both by Dirac (1958, p. 46) and Von Neumann (1955, p. 253).

EIGENSTATE-EIGENVALUE LINK, PURE CASE:

A physical system S has the property that quantity A has a definite value iff its state is an

eigenstate of the operator A which, according to the observables postulate, corresponds

to A.

It is also conceivable that a system possesses a definite but unknown value for a quantity. If we use

the ‘ignorance interpretation of mixtures’, as discussed in chapter III, p. 52, we obtain the variation

EIGENSTATE-EIGENVALUE LINK, MIXED CASE:

A physical system S has the property that quantity A has a definite but unknown value
iff its state is a mixture of eigenstates of the operator A which, according to the observables
postulate, corresponds to A.

These postulates speak about the existence of properties, about physical quantities having values,

independent of a measurement or a measuring context.

EXERCISE 37. Discuss the link between the property postulates and the sufficient condition of

reality of Einstein, Podolsky and Rosen (EPR), section I. 2, p. 12, ff.

From this point of view it would be good to have a quantum mechanical description of the measurement

process in which, in any case, the measuring apparatus has a certain property after completion

of the measurement. This means that, instead of the superposition (VIII. 4), we require, as a final

state, the mixture

$$W' = \sum_{j=1}^{N} |c_j|^2 \, |a_j\rangle \otimes |r_j\rangle \, \langle a_j| \otimes \langle r_j|. \tag{VIII. 13}$$

Some authors, e.g. Landau and Lifshitz (1958, pp. 21-24), go still further and require as a final
state an eigenstate $|r_k\rangle$ of the pointer quantity R, corresponding to the pointer position found after
measurement. According to them the measuring interaction finishes with an indeterministic jump,
with probability $|c_j|^2$, to one of the states $|a_j\rangle \otimes |r_j\rangle$.

Summarizing, we have the following options for the description of the measurement process. For
the initial state there is no controversy,
$$|\psi\rangle \otimes |r_0\rangle = \sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_0\rangle. \tag{VIII. 14}$$
For the final state there are three possibilities,

1. $$\sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_j\rangle, \tag{VIII. 15}$$

2. $$W' = \sum_j |c_j|^2 \, |a_j\rangle \otimes |r_j\rangle \, \langle a_j| \otimes \langle r_j|, \tag{VIII. 16}$$

3. $$|a_j\rangle \otimes |r_j\rangle \quad \text{with probability } |c_j|^2. \tag{VIII. 17}$$

According to the foregoing line of reasoning we require that, at the end of a measuring interaction,

the pointer of the measuring apparatus, which is of course macroscopic, designates something.

The state (VIII. 15) does not satisfy this requirement. On the contrary, the quantum mechanical superposition
$|\psi\rangle$ of eigenstates $|a_j\rangle$ of the measured quantity, which prevented us from ascribing,
prior to the measurement, a definite value of A to the object system S, proves to be contagious;
after the interaction the pointer quantity of the measuring apparatus no longer has a definite value either,
and if the composite system SM is coupled to another measuring apparatus M′, this also
becomes infected with ‘property loss’. This is why (VIII. 16) and (VIII. 17) are preferred as final
states over (VIII. 15).

The problem of giving a treatment of the measurement process which produces one of these two

final states, and which therefore ‘creates’ the definite values by means of the measuring interaction,

is the measurement problem in the narrow sense. Notice that (VIII. 16) and (VIII. 17) cannot be

obtained from the initial state by means of a unitary transformation. Therefore, we have to adjust or

extend the first five Von Neumann postulates. We will discuss some proposals for a solution.
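The claim that neither (VIII. 16) nor (VIII. 17) can be reached unitarily can be made explicit with a short purity argument. The following is a sketch, assuming the initial state is the pure state (VIII. 14) with at least two non-zero coefficients $c_j$; the symbols $\rho_0$ and $\rho_U$ are introduced here only for illustration.

```latex
\begin{align*}
\rho_0 &:= |\psi\rangle\langle\psi| \otimes |r_0\rangle\langle r_0| ,
  & \mathrm{Tr}\,\rho_0^2 &= 1 , \\
\rho_U &:= U \rho_0 U^\dagger ,
  & \mathrm{Tr}\,\rho_U^2 &= \mathrm{Tr}\bigl( U \rho_0 U^\dagger U \rho_0 U^\dagger \bigr)
     = \mathrm{Tr}\,\rho_0^2 = 1 , \\
W' &= \sum_j |c_j|^2 \, |a_j\rangle\langle a_j| \otimes |r_j\rangle\langle r_j| ,
  & \mathrm{Tr}\,W'^2 &= \sum_j |c_j|^4 < \Bigl( \sum_j |c_j|^2 \Bigr)^2 = 1 .
\end{align*}
```

Since every unitary $U$ preserves $\mathrm{Tr}\,\rho^2$, no unitary maps the pure initial state to the mixture $W'$. For (VIII. 17) the obstacle is different: a unitary map is deterministic, so it sends a given initial state to one fixed final state, and linearity together with (VIII. 2) forces that state to be (VIII. 15) rather than one of the states $|a_j\rangle \otimes |r_j\rangle$ selected at random.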

VIII. 4. 1

THE PROJECTION POSTULATE AND CONSCIOUSNESS

By adding the projection postulate to the first five postulates, p. 41, Von Neumann gave the standard

solution to the measurement problem in the narrow sense. He distinguished two ways in which

a state can change in time,

Process 1. The discontinuous, non - unitary, indeterministic projection occurring at a

measurement; the projection postulate.

Process 2. The continuous, unitary, deterministic evolution which is consistent with the

Schrödinger equation or its generalization to mixed states, as long as no measurement is

made on the system; the Schrödinger postulate.

At measurement the state undergoes a transition into the eigenstate belonging to the outcome of

measurement. Therefore, this brings about the final state (VIII. 17) and gives, in accordance with the

eigenstate-eigenvalue link, p. 170, definite properties to both the object system and the pointer of the

measuring apparatus.

Although the measurement problem in the narrow sense is solved with these two types of evolution,

the measurement problem in the broad sense, p. 164, comes into prominence more than ever.

We would now like to have an explanation for the particular nature of a measurement, or at least a

criterion with which it can be distinguished from other processes.

Such a criterion is provided by Von Neumann and, for instance, by Wigner, W. Heitler (1970, p. 42),

and F. London and E. Bauer (1939), in terms of the consciousness of an observer. London and Bauer

reason as follows.

Consider an object system S, a measuring apparatus M and a conscious observer B. The state of

the composite system after measurement is, according to (VIII. 11),

$$|\Phi\rangle = \sum_j c_j \, |a_j\rangle \otimes |r_j\rangle \otimes |b_j\rangle. \tag{VIII. 18}$$

According to London and Bauer, this is the description of the state for us. But for the conscious

observer B it is not the same, because B has the characteristic capacity of introspection. By introspection

he knows in which eigenstate he is, he perceives one certain pointer position. This breaks the


quantum mechanical chain. If he knows that he is in the state $|b_k\rangle$ and sees the meter indicating something
which corresponds to the pointer state $|r_k\rangle$, then from that moment on the state has immediately
become $|a_k\rangle \otimes |r_k\rangle \otimes |b_k\rangle$. Conscious introspection of the observer therefore causes the collapse

of the wave packet. This strange situation is expressed in the thought experiment called ‘Wigner’s

friend’, in which the measuring device is replaced by a friend who communicates the outcome of

measurement to Wigner.

The aforementioned authors emphasize the role of consciousness in the interpretation of quantum

mechanics. It need hardly be emphasized that for the majority of physicists something like this is

unacceptable. They are of the opinion that a measurement is finished as soon as the result is registered

somewhere in the equipment. It is not necessary that it subsequently comes to the attention of a conscious

being. But of course, then the question remains again which criterion can be given for a permanent

registration.

VIII. 4. 2

BOHMIAN MECHANICS

An important advantage of the theory of chapter VI is its avoidance of the projection postulate.

This has consequences for the treatment of measurements. ‘Measuring’ is not a primitive concept in

Bohmian mechanics, measurements are treated on an equal footing with all other physical interactions.

The measuring apparatus is treated in the same manner as the measured object system, namely

with the Bohmian equations, which are derived from the Schrödinger equations. As a consequence,

the interaction between an object system and a measuring apparatus can be given according to the

measurement scheme (VIII. 4).

If, for the sake of simplicity, we limit ourselves to two terms, the interaction is of the
form (VI. 24), p. 134, where $\phi_B$ and $\phi_D$ are the eigen-wave functions of the pointer quantity, corresponding
to the various pointer positions. It is plausible to assume that $\phi_B$ and $\phi_D$ have no overlap.

Consequently, the wave function of the object system and the measuring apparatus is effectively factorizable

and we can regard the superposition as a mixture. There is no measurement problem in

Bohmian mechanics.

◃ Remark

The requirement that $\phi_B$ and $\phi_D$ in (VI. 24) have no overlap is stronger than what is required in
Von Neumann’s model. There it suffices that the wave functions are orthogonal, i.e., $\langle \phi_B | \phi_D \rangle = 0$,
instead of $\phi_B(\vec{q}) \, \phi_D(\vec{q}) = 0$ for all $\vec{q} \in \mathbb{R}^3$. ▹
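The difference between orthogonality and absence of overlap can be illustrated with a one-dimensional toy example. The sketch below is not taken from chapter VI; the interval [0, 1], the grid, and the four wave functions are assumptions chosen for illustration. The first pair is orthogonal but overlaps almost everywhere, the second pair has disjoint supports.

```python
import numpy as np

q = np.linspace(0.0, 1.0, 2001)            # grid on the interval [0, 1]
dq = q[1] - q[0]

def inner(f, g):
    """Riemann-sum approximation of the L2 inner product on [0, 1]."""
    return np.sum(np.conj(f) * g) * dq

# Orthogonal but overlapping: the two lowest standing waves on [0, 1].
phi_B = np.sqrt(2) * np.sin(np.pi * q)
phi_D = np.sqrt(2) * np.sin(2 * np.pi * q)

# Disjoint supports: one bump on the left half, one on the right half.
chi_B = np.where(q < 0.5, 2 * np.sin(2 * np.pi * q), 0.0)
chi_D = np.where(q >= 0.5, 2 * np.sin(2 * np.pi * (q - 0.5)), 0.0)

print(abs(inner(phi_B, phi_D)), np.max(np.abs(phi_B * phi_D)))   # ~0, but product is not 0
print(abs(inner(chi_B, chi_D)), np.max(np.abs(chi_B * chi_D)))   # both exactly 0
```

Only the second pair satisfies the pointwise condition of the remark; the first pair would be acceptable as pointer states in Von Neumann's model but not for the effective factorization used in the Bohmian treatment.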

VIII. 4. 3

SPONTANEOUS COLLAPSE

The next option has been developed by G.C. Ghirardi, A. Rimini, and T. Weber (1986), a related

proposal comes from F.A. Bopp (1947). In this view the evolution from the Schrödinger postulate has

to be replaced by an indeterministic evolution. A stochastic term is added, making the Schrödinger

equation non - linear. This has as a consequence that every physical system from time to time spontaneously

makes a small jump, so that the wave function collapses to, almost, a position eigenstate.

The new constant of nature characterizing the relevant time scale is such that the probability of a

spontaneous collapse of the wave function for a single elementary particle is extremely small, in the
order of once every $10^{10}$ years, leaving the continuous Schrödinger equation an excellent approximation
for such a physical system.

In this theory it can be shown that in case of composite systems a collapse of the state of a partial

system brings about a collapse of the state of the entire composite system. This has as a consequence

that the average frequency of these spontaneous jumps per unit of time increases with the number of

degrees of freedom, and for a macroscopic system with approximately $10^{25}$ particles the average time
between two jumps, and therefore two collapses, will only be $10^{-5}$ milliseconds. Hence, in good

approximation, macroscopic systems always have a definite position where microscopic systems do

not.
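The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch assumes that the collapse rate of a composite system is simply the single-particle rate multiplied by the number of particles, and it uses a Julian year of 365.25 days; both are simplifying assumptions made here, and the variable names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600       # Julian year, about 3.16e7 s

mean_time_single_years = 1e10                # single-particle figure quoted in the text
n_particles = 1e25                           # macroscopic system, figure quoted in the text

# Collapse rate per second for one particle, and for the whole system
# (assumption: rates of the individual particles add up).
rate_single = 1.0 / (mean_time_single_years * SECONDS_PER_YEAR)
rate_macro = n_particles * rate_single

mean_time_macro_ms = 1e3 / rate_macro        # mean time between collapses in milliseconds
print(f"{mean_time_macro_ms:.1e} ms")
```

Under these assumptions the result is a few times $10^{-5}$ milliseconds, in agreement with the order of magnitude stated in the text.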

The difference between this approach and that of Von Neumann is that in the first place there

is no fundamental difference between measurements and other interactions, consciousness plays no

role. Moreover, by adapting the evolution equation, this theory leads to predictions which differ from

quantum mechanics, making it verifiable. By means of experiments it is possible to obtain upper

and lower limits for the collapse frequency. Ghirardi, Rimini and Weber are of the opinion that the

experimental data we have at present are still compatible with a finite interval for their new constant

of nature.

VIII. 4. 4 MANY WORLDS

Another option is the many - worlds interpretation of H. Everett (1957), J.A. Wheeler (1957) and,

especially, B.S. DeWitt (1970, 1971). In this view it is posed that the quantum mechanics of the

first five postulates gives a universally valid description of reality. Therefore, in principle the wave

function of the universe can be written down. There is no part of the world, including the context of

measurement, which is described classically. Moreover, there is no projection postulate. The wave

function develops according to a unitary evolution, which means that it remains a pure state for all

time.

Everett models a measurement process by assuming that a certain system has a complete set

of orthonormal eigenstates, which are interpreted to signify that certain outcomes of measurement

have occurred and are permanently registered in a memory. They are analogous to the previously

mentioned pointer positions |r j ⟩. The state |Ψ⟩ of the composite system of object system S and

measuring apparatus M remains in the superposition form (VIII. 15) for all time. To every state |ϕ i ⟩

of the object system corresponds a relative state of the measuring apparatus,

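$$|\psi\rangle^{\mathrm{rel}}_{\Psi,\varphi_i} := N_i \sum_j c_{ij}\, |r_j\rangle \quad \text{with} \quad c_{ij} = \big( \langle\varphi_i| \otimes \langle r_j| \big)\, |\Psi\rangle, \tag{VIII. 19}$$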

where N i is a normalization constant and {|ϕ i ⟩} and {|r j ⟩} are arbitrary orthonormal bases of the

Hilbert spaces H S and H M of the object system and measuring apparatus, respectively. It can simply

be shown that this definition is independent of the choice of this basis, so that the relative state is

uniquely defined by |Ψ⟩ and |ϕ i ⟩.

In case of an ideal measurement we have

$$|\psi\rangle^{\mathrm{rel}}_{\Psi,\varphi_i} = |r_i\rangle. \tag{VIII. 20}$$

This relative state yields the usual conditional probability distribution for the possible outcomes of

measurement of a quantity in case the object system is found in the state |ϕ i ⟩. This is substantiated by

Everett by showing that, if we set the right conditions for the state |ϕ i ⟩, all predictions for quantities

which only refer to the object system S can be determined using the relative state. Therefore, we can

act as if a projection to that state has taken place. In reality, however, the superposition (VIII. 15)

remains.
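
The definition (VIII. 19) can be tried out numerically. The following is a minimal sketch, assuming that the composite state is stored as an array `Psi[s, m]` of components in a product basis; the function name `relative_state`, the dimensions and the random test states are our own illustrative choices. It checks the ideal case (VIII. 20) for real, positive coefficients, and the independence of the relative state from the choice of basis in the Hilbert space of the measuring apparatus.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_s, dim_m = 3, 3

def relative_state(Psi, phi):
    """Relative state of M given the object state phi, cf. (VIII. 19).

    Psi[s, m] holds the components of |Psi> in a product basis of
    H_S (x) H_M; phi holds the components of |phi_i> in H_S.
    """
    unnormalized = phi.conj() @ Psi          # sum_s conj(phi_s) Psi[s, m]
    return unnormalized / np.linalg.norm(unnormalized)

# Ideal measurement: |Psi> = sum_j c_j |a_j> (x) |r_j>, with real c_j > 0.
c = np.array([0.6, 0.48, 0.64])
Psi_ideal = np.diag(c).astype(complex)       # |a_j>, |r_j> are basis vectors
a_1 = np.eye(dim_s)[1]
rel_ideal = relative_state(Psi_ideal, a_1)   # should coincide with |r_1>

# Basis independence: expand in a random orthonormal basis {|r_j>} of H_M.
Psi = rng.normal(size=(dim_s, dim_m)) + 1j * rng.normal(size=(dim_s, dim_m))
Psi /= np.linalg.norm(Psi)
phi = rng.normal(size=dim_s) + 1j * rng.normal(size=dim_s)
phi /= np.linalg.norm(phi)

Q, _ = np.linalg.qr(rng.normal(size=(dim_m, dim_m))
                    + 1j * rng.normal(size=(dim_m, dim_m)))
r = Q.T                                      # r[j] is the basis vector |r_j>
coeffs = np.array([phi.conj() @ Psi @ r[j].conj() for j in range(dim_m)])
rel_from_basis = sum(coeffs[j] * r[j] for j in range(dim_m))
rel_from_basis /= np.linalg.norm(rel_from_basis)
```

For complex coefficients the ideal case reproduces $|r_i\rangle$ only up to a phase factor, which is why the example uses positive $c_j$.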

Now the question is, of course, how this superposition must be interpreted. DeWitt in particular

has advocated a radical view: all terms in this superposition represent real, existing worlds. The

transition during the measurement process is a division of the world into uncountably many copies,

with a different result registered in each of them. All these worlds exist and develop further alongside

one another, without any possibility of mutual contact. The problem of how to select the one term

that is actually realized, as we do using the projection postulate, is avoided because all

terms are realized.

Postulating the existence of such a multiplicity of worlds, with which, moreover, we absolutely

cannot make contact, is acceptable to only a small number of people. But probably worse is the

idea that any decay process in a star in a remote part of the universe can split up our local world into

millions of copies of itself.

Moreover, a difficult point in this theory is how the ‘splitting’ must be understood exactly. It

seems that DeWitt intends a special kind of physical process which emerges at registration. This

would look like adopting a second type of process besides the Schrödinger evolution, in contrast to

the objective of the interpretation; the measurement problem in the broad sense would not be solved.

There is also the problem of which process we have to propose for the reversed evolution; a ‘melting’ of

worlds? In Everett’s original work the idea of a physical splitting of the universe does not occur. He

only regards this as a ‘bookkeeping’ transition to a relative state.

Finally there is the supposition that to a set of states |r j ⟩ of the measuring apparatus the interpretation

can be given that herewith an outcome of measurement is permanently registered. This

supposition cannot without problems be brought into conformity with quantum mechanics because it

still concerns superpositions.

VIII. 4. 5 SUPERSELECTION RULES

Yet another option is to introduce superselection rules. Certain superpositions of microscopic

states do not seem to occur in nature, for example, superpositions of states with unequal charge (e.g.

electric or baryonic), or superpositions of states with integer and half-integer spin. Therefore, it could be

assumed that superpositions of macroscopically different states do not occur either, and the dynamics

of quantum mechanics must then be adapted to account for this.

In such a setup of quantum mechanics, i.e., one in which the superposition principle is not valid

in general, it is possible to have $W'$, (VIII. 16), as the final state of the measurement process, see

Beltrametti and Cassinelli (1981, p. 57). More precisely, in the presence of superselection rules the

mixture (VIII. 16) and the pure state (VIII. 15) become equivalent; the superselection rules provide

the same expectation values for all physical quantities allowed by the superselection operators.

An example of this approach is the suggestion of R. Penrose (1996) that in a future unified theory

for quantum gravitation a superselection rule would apply to the space - time metric. Because

the gravitational field is taken into account in the metric, also the positions of massive bodies such

as pointers of measuring apparatuses are superselected since the field depends on the positions of

massive macroscopic bodies.

VIII. 4. 6 IRREVERSIBILITY OF MEASUREMENT

The next option is to appeal to the special characteristic properties of measuring apparatuses,

and to the theory of irreversible processes, as is done in the work of A. Daneri, A. Loinger and

G.M. Prosperi (1962). According to these authors it is characteristic for measuring apparatuses that

they are in a metastable state. An interaction with a microscopic system then causes, by means of a

chain reaction, an irreversible response of the measuring apparatus.

The description of such an irreversible process within quantum mechanics is not straightforward,

because the unitary evolution is always reversible. It is necessary to make special assumptions concerning

the structure of the macroscopic measuring apparatus and its observable quantities; all matrices

corresponding to these quantities have to be almost diagonal in the energy representation. Then

it can be shown that, as regards the empirical statements for these observable quantities, the final

state (VIII. 15) can be replaced by that of (VIII. 16).

The elegance of this approach is that the details and construction of the measuring apparatus are

discussed. The presence of a metastable state indeed seems to be an essential aspect, as in, for example,

the Geiger counter, or the bubble chamber using superheated liquids. But the introduction of irreversible

processes asks for a modification of the unitary evolution and therefore of the Schrödinger

postulate. Just as in the quantum theory of Ghirardi, Rimini and Weber, this is a fundamental modification

of quantum mechanics.

VIII. 4. 7 MODAL INTERPRETATION

This option to solve the measurement problem is provided by the so-called modal interpretation,

introduced by B.C. van Fraassen (1979) and developed by S. Kochen (1985), D. Dieks (1989) and

R. Healey (1989). Overviews are given by Vermaas (1999), and Dieks and Vermaas (1998).

In the modal interpretation the projection postulate is removed together with a part of the property

postulate, while the measurement postulate is replaced by a postulate saying that every vector of the

form

$$|\psi\rangle = \sum_j c_j\, |a_j\rangle \otimes |r_j\rangle \tag{VIII. 21}$$

describes the situation in which system 1 has, as a property, the value a j for the quantity A corresponding

to the operator which is determined by the basis {|a j (t)⟩} and in which, similarly, system 2

has the value $r_j$. Each of these states has a probability $|c_j|^2$ of being realized. This is no different from

the usual ‘ignorance interpretation’ of probabilities. Finally, the Schrödinger postulate is declared to

be valid universally; it is, therefore, also effective during the measurement process.

An important theorem by E. Schmidt, the so-called (biorthogonal) decomposition theorem, says

that for every composite system the expansion of a state $|\psi\rangle$ in the form (VIII. 21) is unique as long

as $|c_j| \neq |c_k|$ for $j \neq k$. Therefore it is possible, for every state $|\psi\rangle$ for which this holds, to indicate

exactly the corresponding possible properties. A generalization to mixed states can be achieved by

taking the spectral decomposition of W of the composite system as the preferred decomposition; the

Schmidt decomposition (VIII. 21) is then found for the special case of pure states.
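
The biorthogonal form (VIII. 21) can be computed for a given state with a singular value decomposition. The sketch below is an illustration under our own assumptions: the state is stored as a coefficient matrix `Psi[m, n]` in a product basis, the dimensions and the random state are arbitrary, and the tolerance used to test for degeneracy is a numerical convenience.

```python
import numpy as np

rng = np.random.default_rng(1)
dim_1, dim_2 = 3, 4

# Components Psi[m, n] of a random pure state in a product basis.
Psi = rng.normal(size=(dim_1, dim_2)) + 1j * rng.normal(size=(dim_1, dim_2))
Psi /= np.linalg.norm(Psi)

# Singular value decomposition: Psi = U diag(s) Vh.
U, s, Vh = np.linalg.svd(Psi, full_matrices=False)

# Column U[:, j] holds |a_j>, row Vh[j] holds |r_j>, and s[j] = |c_j| >= 0,
# so that |psi> = sum_j s[j] |a_j> (x) |r_j>.
reconstructed = sum(s[j] * np.outer(U[:, j], Vh[j]) for j in range(len(s)))
probabilities = s**2

# The decomposition is unique only if all |c_j| are different.
nondegenerate = np.all(np.diff(np.sort(s)) > 1e-12)
```

Here the phases of the $c_j$ have been absorbed into the basis vectors, so the singular values play the role of $|c_j|$ and their squares are the probabilities assigned by the modal interpretation.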

The idea that the meaning of the state vector can be exclusively formulated in terms of measurements

is rejected; the state vector describes factual properties. The description by the wave function

is, however, incomplete: $|\psi\rangle$ determines the possibilities and the probabilities of the possibilities, but

the real physical situation is not determined. Quantum mechanics is fundamentally indeterministic

because sometimes one possibility, at other times another one occurs.

Moreover, in this interpretation the ‘only if’ part of the property postulate is rejected: if a system

is in an eigenstate it does indeed have the corresponding eigenvalue, but not ‘only if’; a system which is

in a superposition of eigenstates, (VIII. 21), nevertheless has one of the properties. In the first case

a composite physical system necessarily has the property, in the second case contingently. In logic

the words ‘necessarily’ and ‘contingently’ are called ‘modalities’, hence the name modal interpretation. The projection

postulate is now superfluous.

If, however, the singlet state, being a state of a composite system also, is considered in the modal

interpretation, this interpretation tells us less than quantum mechanics with the property postulate

does.

◃ Remarks

In this interpretation, the metastability or possibly permanent nature of the quantities of system 2 plays

no role in attributing properties. Another point in this interpretation is that, besides the Schrödinger

dynamics for the state, there seems to be a need for a dynamics describing how properties change in

time. Several attempts have been made to that end. ▹

EXERCISE 38. What does quantum mechanics with the property postulate say about the EPRB

experiment, p. 139, that the modal interpretation does not say, and why? Does it help to couple a

measuring apparatus to the composite system of the two spin particles?

VIII. 4. 8 DECOHERENCE

Finally we will discuss the option which is possibly supported by the majority of physicists, see

H.J. Groenewold (1946), K. Gottfried (1989), N.G. van Kampen (1988), W.H. Zurek (1981 and 1982).

Bell (1990) named this option the For All Practical Purposes solution, briefly FAPP. The idea is to

show that the difference between the pure state (VIII. 15) and the mixed state (VIII. 16) is hardly

perceptible in practice.

A measuring apparatus is a macroscopic system which is in continuous interaction with its surroundings.

A more realistic representation of the measurement process will therefore be of the

form (VIII. 11), but with a very large number of terms and factors, e.g.

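$$|\psi\rangle \otimes |r_0\rangle \otimes |s_0\rangle \otimes \cdots \otimes |t_0\rangle \;\longrightarrow\; \sum_j c_j\, |a_j\rangle \otimes |r_j\rangle \otimes |s_j\rangle \otimes \cdots \otimes |t_j\rangle. \tag{VIII. 22}$$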

In practice, the coherence between the various terms of the superposition will rapidly be lost because

this coherence can only be revealed if the expectation values of the quantities contain cross

terms. To see this, consider a quantity which is a product of quantities of the various partial systems

S, M, M′, . . . , M′′, for instance, of the form $\tilde{A} \otimes \tilde{R} \otimes \tilde{S} \otimes \cdots \otimes \tilde{T}$, or a summation thereof,

which contains non-zero off-diagonal matrix elements. This means that we assume that

$$\big( \langle a_{i'}| \otimes \langle r_{j'}| \otimes \cdots \otimes \langle t_{k'}| \big)\, \tilde{A} \otimes \tilde{R} \otimes \tilde{S} \otimes \cdots \otimes \tilde{T}\, \big( |a_i\rangle \otimes |r_j\rangle \otimes \cdots \otimes |t_k\rangle \big) = \langle a_{i'} | \tilde{A} | a_i \rangle\, \langle r_{j'} | \tilde{R} | r_j \rangle \cdots \langle t_{k'} | \tilde{T} | t_k \rangle \tag{VIII. 23}$$

does not contain only diagonal terms. However, in practice such quantities cannot be measured;

as soon as we do not measure one of the partial systems the coherence is already broken.

For example, because of the orthogonality of the states $|s_j\rangle$, the expectation value of the quantity

$\tilde{Q} \otimes \tilde{R} \otimes \mathbb{1} \otimes \cdots \otimes \tilde{T}$ in the state (VIII. 22) is equal to that in the mixed state

$$W'' = \sum_j |c_j|^2\, \big( |a_j\rangle \otimes |r_j\rangle \otimes |s_j\rangle \otimes \cdots \otimes |t_j\rangle \big) \big( \langle a_j| \otimes \langle r_j| \otimes \langle s_j| \otimes \cdots \otimes \langle t_j| \big). \tag{VIII. 24}$$

The step from the pure state (VIII. 22) to the mixture (VIII. 24) is therefore justified by limiting

ourselves to practically realizable states.
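
The loss of coherence described above can be illustrated with three two-level systems S, M and M′. The sketch below is a toy model under our own assumptions: the coefficients, the randomly chosen Hermitian operators and the helper names `kron`, `expect_pure` and `expect_mixed` are illustrative. It compares expectation values in the pure state (VIII. 22) and in the mixture (VIII. 24), once for a quantity that leaves M′ unmeasured and once for a product quantity with off-diagonal elements in every factor.

```python
import numpy as np
from functools import reduce

rng = np.random.default_rng(2)

def kron(*factors):
    """Tensor product of several vectors or matrices."""
    return reduce(np.kron, factors)

def random_hermitian(dim):
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (m + m.conj().T) / 2

basis = np.eye(2)
c = np.array([0.6, 0.8])

# Pure final state of the form (VIII. 22) for the systems S, M, M'.
Psi = sum(c[j] * kron(basis[j], basis[j], basis[j]) for j in range(2))
# Mixture of the form (VIII. 24) with the same weights |c_j|^2.
W = sum(abs(c[j])**2 * np.outer(kron(basis[j], basis[j], basis[j]),
                                 kron(basis[j], basis[j], basis[j]))
        for j in range(2))

def expect_pure(op):
    return np.real(Psi.conj() @ op @ Psi)

def expect_mixed(op):
    return np.real(np.trace(W @ op))

# M' is not measured: the operator acts as the identity on it.
partial = kron(random_hermitian(2), random_hermitian(2), np.eye(2))
# All three systems are measured, with off-diagonal elements in each factor.
X = np.array([[0.0, 1.0], [1.0, 0.0]])
full = kron(X, X, X)

print(expect_pure(partial), expect_mixed(partial))   # equal
print(expect_pure(full), expect_mixed(full))         # different
```

Only the second quantity, which requires control over every partial system, reveals the cross terms.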

At first sight, this reasoning is in every way reasonable. Of course, the reasoning only refers to

a particular class of quantities; a physical quantity for a composite system is certainly not always a

direct product or a summation thereof. But it can be maintained that quantities which are not direct

products are even harder to measure in practice. It is, however, beyond doubt that experimentally

distinguishing the pure state (VIII. 22) from the mixed state (VIII. 24) using macroscopic quantities

will be extremely difficult.

Bell regards this FAPP solution as a pitfall; he speaks of the FAPP-trap. He emphasizes that the

measurement problem is not a practical but a fundamental problem. The core of the problem is whether,

after the measurement process, certain properties are present in the measuring apparatus. The FAPP

reasoning shows that, generally, in practice the system behaves as if it had those properties, but it

leaves untouched the fact that ‘in reality’ the system does not have those properties, and that, if our

experimental capabilities were greater, this could also be demonstrated experimentally.

EXERCISE 39. Show that, using the physical quantity corresponding to the operator |Ψ⟩ ⟨Ψ|, in

which |Ψ⟩ is the right - hand side of (VIII. 22), experimental distinction can be made between the

pure state (VIII. 22) and the mixed state (VIII. 24).

VIII. 5 INCOMPATIBLE QUANTITIES

So far we considered measuring a single physical quantity or two compatible, or commensurable,

physical quantities of the object system, where compatible quantities are quantities corresponding

to commuting operators. The simple measurement theory (VIII. 2), however, enables us to discuss

also the measurement of incompatible quantities.

Let A and B be two arbitrary, incompatible quantities of the object system S corresponding

to the maximal operators A and B. Measuring apparatus M 1 measures A and apparatus M 2 measures

B. The pointer observables of the apparatuses are R and T , corresponding to the operators

R and T , the eigenstates are |a j ⟩, |b j ⟩, |r j ⟩, |t j ⟩, respectively. The initial state is |ψ⟩ ⊗ |r 0 ⟩ ⊗ |t 0 ⟩

in H = H S ⊗ H 1 ⊗ H 2 , and with dim H S = N S ,

$$|\psi\rangle = \sum_{j=1}^{N_S} \langle a_j | \psi \rangle\, |a_j\rangle = \sum_{k=1}^{N_S} \langle b_k | \psi \rangle\, |b_k\rangle. \tag{VIII. 25}$$

Now we first measure A and next B. The measurement scheme (VIII. 4) gives

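$$|\psi\rangle \otimes |r_0\rangle \otimes |t_0\rangle \;\xrightarrow{\;A\;}\; \sum_{j=1}^{N_S} \langle a_j | \psi \rangle\, |a_j\rangle \otimes |r_j\rangle \otimes |t_0\rangle \;\xrightarrow{\;B\;}\; \sum_{j=1}^{N_S} \sum_{k=1}^{N_S} \langle a_j | \psi \rangle \langle b_k | a_j \rangle\, |b_k\rangle \otimes |r_j\rangle \otimes |t_k\rangle. \tag{VIII. 26}$$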

If we first measure B and then A, we have

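$$|\psi\rangle \otimes |r_0\rangle \otimes |t_0\rangle \;\xrightarrow{\;B\;}\; \sum_{k=1}^{N_S} \langle b_k | \psi \rangle\, |b_k\rangle \otimes |r_0\rangle \otimes |t_k\rangle \;\xrightarrow{\;A\;}\; \sum_{k=1}^{N_S} \sum_{j=1}^{N_S} \langle b_k | \psi \rangle \langle b_k | a_j \rangle^{*}\, |a_j\rangle \otimes |r_j\rangle \otimes |t_k\rangle. \tag{VIII. 27}$$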

We see that the final states (VIII. 26) and (VIII. 27) differ from each other. For the probability to get

for A the outcome $a_j$ and for B the outcome $b_k$ we find

$$\mathrm{Prob}_{A,B}(R : r_j \wedge T : t_k) = |\langle a_j | \psi \rangle|^2\, |\langle b_k | a_j \rangle|^2 \tag{VIII. 28}$$

and

$$\mathrm{Prob}_{B,A}(T : t_k \wedge R : r_j) = |\langle b_k | \psi \rangle|^2\, |\langle a_j | b_k \rangle|^2. \tag{VIII. 29}$$

The good thing is that the measurement theory enables us to make a statement about measurements

of the incompatible quantities A and B which are done after each other, on the basis of the,

possibly simultaneous, measurements of the compatible quantities R and T .

EXERCISE 40. Why are R and T compatible?

We see that the order in which A and B are measured is important. Here the result of the ‘measurement

disturbance’ develops within the framework of the unitary time evolution of the state.

For the conditional probability to find b k if we have found a j , and vice versa, we find,

with (VIII. 28) and (VIII. 29), |⟨b k | a j ⟩| 2 and |⟨a j | b k ⟩| 2 , respectively, and we see that they are equal.
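
The dependence on the order of measurement, and the equality of the conditional probabilities, can be made concrete for a two-level system. The sketch below is an illustration under our own assumptions: A and B are taken to have the eigenbases of spin along z and along x, and the initial state is an arbitrary real superposition. It evaluates (VIII. 28) and (VIII. 29) directly.

```python
import numpy as np

# Eigenbases as rows: a[j] = |a_j>, b[k] = |b_k>.
a = np.eye(2, dtype=complex)                                  # spin along z
b = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)   # spin along x

psi = np.array([np.cos(0.3), np.sin(0.3)], dtype=complex)

amp_a = a.conj() @ psi            # amp_a[j] = <a_j|psi>
amp_b = b.conj() @ psi            # amp_b[k] = <b_k|psi>
overlap = b.conj() @ a.T          # overlap[k, j] = <b_k|a_j>

# Joint probabilities, indexed [j, k], cf. (VIII. 28) and (VIII. 29).
prob_AB = (np.abs(amp_a)**2)[:, None] * np.abs(overlap.T)**2
prob_BA = (np.abs(amp_b)**2)[None, :] * np.abs(overlap.T)**2

# Conditional probabilities: b_k given a_j, and a_j given b_k.
cond_b_given_a = prob_AB / prob_AB.sum(axis=1, keepdims=True)
cond_a_given_b = prob_BA / prob_BA.sum(axis=0, keepdims=True)
```

The two joint distributions differ, while the two arrays of conditional probabilities coincide, as stated in the text.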

This can be generalized easily. If we successively measure the discrete quantities A, A′, A′′, . . . ,

having eigenvalues $a_i$, $a'_j$, $a''_k$, . . . , the probability to find, given that measurement of A yielded the

outcome $a_i$, for A′ the outcome $a'_j$ and for A′′ the outcome $a''_k$, etc. is equal to

$$\begin{aligned} \mathrm{Prob}(\cdots A'' : a''_k \wedge A' : a'_j \mid A : a_i) &= \cdots |\langle a''_k | a'_j \rangle|^2\, |\langle a'_j | a_i \rangle|^2 = \langle a_i | a'_j \rangle \langle a'_j | a''_k \rangle \cdots \langle a''_k | a'_j \rangle \langle a'_j | a_i \rangle \\ &= \langle a_i | P'_j P''_k \cdots P''_k P'_j | a_i \rangle = \mathrm{Tr}\, P_i P'_j P''_k \cdots P''_k P'_j. \end{aligned} \tag{VIII. 30}$$

This result does not apply to degenerate eigenvalues.
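
As a numerical sanity check of (VIII. 30) for three consecutive measurements, the sketch below compares the product of transition probabilities with the trace expression. The three randomly generated orthonormal bases, the dimension and the chosen indices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
dim = 3

def random_basis():
    """Rows of the returned array form a random orthonormal basis."""
    q, _ = np.linalg.qr(rng.normal(size=(dim, dim))
                        + 1j * rng.normal(size=(dim, dim)))
    return q.T

def projector(v):
    return np.outer(v, v.conj())

a, a1, a2 = random_basis(), random_basis(), random_basis()   # A, A', A''
i, j, k = 0, 1, 2

# Product of transition probabilities |<a''_k|a'_j>|^2 |<a'_j|a_i>|^2 ...
product = abs(np.vdot(a2[k], a1[j]))**2 * abs(np.vdot(a1[j], a[i]))**2
# ... and the trace expression Tr P_i P'_j P''_k P'_j.
P, P1, P2 = projector(a[i]), projector(a1[j]), projector(a2[k])
trace = np.trace(P @ P1 @ P2 @ P1).real
```

Note that `np.vdot` conjugates its first argument, so `np.vdot(x, y)` corresponds to the inner product ⟨x | y⟩.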

We can consider (VIII. 30) to be the most general statement of quantum mechanics for maximal,

discrete quantities; a probability statement concerning the occurrence of correlations between the

outcomes of consecutive measurements. Empirically speaking, all of physics is about such statements,

including classical physics. But classical physics permits us to associate with it a picture of physical

systems as scraps and pieces of matter with properties, moving through space, while in quantum

mechanics such a picture is not available.

◃ Remark

If we measure on S, as in (VIII. 2), the same quantity for a number of times, we will always find the

same outcome. Then the projections in (VIII. 30) are orthogonal and

$$\mathrm{Tr}\, P_i P_j P_k \cdots P_k P_j = \delta_{ij}\, \delta_{jk} \cdots . \tag{VIII. 31}$$ ▹

VIII. 6 COMMENTS ON THE THEORY OF MEASUREMENT

Although the measurement scheme (VIII. 2) seems evident, it is not entirely so. To show this, we

start by deriving a desired consequence from it: only physical quantities which correspond to normal

operators are measurable. The pointer states of the measuring apparatus have to be macroscopically

distinguishable, which means that the eigenstates |r j ⟩ of operator R are orthonormal, ⟨r j | r k ⟩ = δ jk ,

since R corresponds to the observable pointer position R of the measuring apparatus. Because the

measurement interaction is unitary, it holds that

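$$\big( \langle a_i| \otimes \langle r_0| \big) \big( |a_j\rangle \otimes |r_0\rangle \big) = \big( \langle a_i| \otimes \langle r_i| \big) \big( |a_j\rangle \otimes |r_j\rangle \big), \tag{VIII. 32}$$

or

$$\langle a_i | a_j \rangle \langle r_0 | r_0 \rangle = \langle a_i | a_j \rangle \langle r_i | r_j \rangle = \delta_{ij}, \tag{VIII. 33}$$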

and therefore, ⟨a i | a j ⟩ = 0 if i ≠ j, where the |a j ⟩ are again the eigenvectors of the maximal operator

A, introduced on p. 166. The |a j ⟩ are thus orthonormal and can therefore be a basis. According

to the spectral theorem of p. 26, every basis generates, by means of the projectors projecting on the

elements of the basis, a normal operator. The physical quantity A indeed corresponds to the normal

operator A and, representing a physical quantity, the eigenvalues of A are real. Consequently, A is,

on finite dimensional Hilbert spaces, self - adjoint.

Now we will discuss some points of criticism. The measurement scheme (VIII. 2) is strongly

idealized. It does not say anything about the physical nature of measurements, which are nearly

always of electromagnetic nature. In case of a concrete description, the evolution operator U (t) will

have to represent something, i.e., a Hamiltonian H is needed which generates this evolution by means

of $U(t) = e^{-iHt}$. In general, A and U will not commute, in which case the $|a_j\rangle$ do not transform

into themselves, unless the duration of the measurement is ‘sufficiently short’. But the question what,

in this connection, is sufficiently short cannot be answered without discussing the characteristics of U

and H.

Likewise, complying with the conservation laws evokes problems as is shown in a theorem by

Wigner (1952) and Araki and Yanase (1960).

THE WIGNER - ARAKI - YANASE THEOREM:

The evolution U(τ), which brings about the measurement transition (VIII. 4) when measuring

physical quantity A, is possible iff A commutes with all additive conserved quantities

of the composite system of object system and measuring apparatus. In other words,

conserved physical quantities which are not additive, additive physical quantities which

are not conserved, and physical quantities which are neither conserved nor additive, cannot

be measured exactly.

Proof

Here we will only prove the ‘only if’ part of the theorem. Let B be an additive conserved quantity of

the composite system SM, i.e., B is, by definition, of the form

$$B = B_1 \otimes \mathbb{1} + \mathbb{1} \otimes B_2, \tag{VIII. 34}$$

which is conserved. This means that B commutes with the Hamiltonian H of the composite

system,

[B, H] = 0. (VIII. 35)

Then B also commutes with every function of H and therefore with $U(\tau) = e^{-iH\tau}$,

$$[B, U(\tau)] = 0 \;\Longrightarrow\; B = U^\dagger(\tau)\, B\, U(\tau). \tag{VIII. 36}$$

Consider the matrix element

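$$B_{jk} := \langle a_j| \otimes \langle r_0|\, B\, |a_k\rangle \otimes |r_0\rangle. \tag{VIII. 37}$$

On the one hand, because of the additivity of B, (VIII. 34), we have

$$\begin{aligned} B_{jk} &= \langle a_j| \otimes \langle r_0|\, (B_1 \otimes \mathbb{1} + \mathbb{1} \otimes B_2)\, |a_k\rangle \otimes |r_0\rangle \\ &= \langle a_j | B_1 | a_k \rangle + \delta_{jk}\, \langle r_0 | B_2 | r_0 \rangle, \end{aligned} \tag{VIII. 38}$$

while on the other hand, using (VIII. 36) and the measurement transition (VIII. 4), we see that

$$\begin{aligned} B_{jk} &= \langle a_j| \otimes \langle r_0|\, U^\dagger(\tau)\, B\, U(\tau)\, |a_k\rangle \otimes |r_0\rangle = \langle a_j| \otimes \langle r_j|\, B\, |a_k\rangle \otimes |r_k\rangle \\ &= \delta_{jk}\, \langle a_j | B_1 | a_k \rangle + \delta_{jk}\, \langle r_j | B_2 | r_k \rangle. \end{aligned} \tag{VIII. 39}$$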

Comparison of these two results shows that

$$\langle a_j | B_1 | a_k \rangle = 0 \quad \text{for } j \neq k, \tag{VIII. 40}$$

which means that in the basis {|a j ⟩} of H S , B 1 is in diagonal form and therefore A commutes

with B 1 . □
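
A small two-level example may help to see what the theorem demands. In the sketch below, which is an illustration under our own assumptions and not part of the proof, the measurement interaction for A = Z is modelled by a single unitary (a controlled-NOT gate) instead of being generated by a specified Hamiltonian, and "conserved" is checked as commutation with this unitary. The helper names `additive` and `commutes` are hypothetical.

```python
import numpy as np

I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.array([[1.0, 0.0], [0.0, -1.0]])

# Measurement interaction for A = Z: |a_j>|r_0> -> |a_j>|r_j> (a CNOT gate).
U = np.kron(np.diag([1.0, 0.0]), I2) + np.kron(np.diag([0.0, 1.0]), X)

def additive(B1, B2):
    """Additive quantity B = B1 (x) 1 + 1 (x) B2, cf. (VIII. 34)."""
    return np.kron(B1, I2) + np.kron(I2, B2)

def commutes(M, N):
    return np.allclose(M @ N, N @ M)

B_good = additive(Z, X)   # B1 = Z is diagonal in the basis {|a_j>}
B_bad = additive(X, X)    # B1 = X does not commute with A = Z

conserved_good = commutes(B_good, U)   # True
conserved_bad = commutes(B_bad, U)     # False
```

The additive quantity whose object part commutes with A is compatible with this measurement interaction, whereas the one whose object part does not commute with A is not, in agreement with (VIII. 40).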

This theorem shows that the scheme (VIII. 4) can, strictly speaking, apply only to measurement of

quantities which commute with all additive conserved quantities. However, the measurement scheme

remains approximately valid if the value of the conserved quantity is large, which will easily be the

case for macroscopic apparatuses. We therefore see that, whereas the U (t) in (VIII. 2) exists, a more

concrete interpretation can run into problems. The shortcomings of the conventional formalism

of quantum mechanics with regard to giving a faithful description of the measurement process have

led to interesting extensions of the formalism, see, for instance, Busch, Lahti and Mittelstaedt (1991).

A

GLEASON’S THEOREM

Proofs really aren’t there to convince you that something is true - they’re there to show

you why it is true.

— Andrew Gleason

Of course mathematics works in physics! It is designed to discuss exactly the situation

that physics confronts; namely, that there seems to be some order out there - let’s find

out what it is.

— Andrew Gleason

In section III. 2 we mentioned that Von Neumann suggested for a quantum mechanical probability

measure the trace formula Tr P W , with P a projector. Gleason’s theorem shows that

this probability measure in fact characterizes all probability measures on P (H), the set of all

projectors on H. Since Gleason’s original proof is very difficult, in this appendix we will give a

simplified version by proving the theorem for pure states only.

A. 1 INTRODUCTION

Let H be a real or complex Hilbert space with dim H > 2, and P (H) the set of all projectors

on H. Let µ be a mapping µ : P (H) → [0, 1]. This µ is called a measure on H if it is additive,

satisfying

$$P_i \perp P_j \;\Longrightarrow\; \mu(P_i + P_j) = \mu(P_i) + \mu(P_j) \quad \forall\, P_i, P_j \in P(H), \tag{A. 1}$$

$$\mu(0) = 0 \quad \text{and} \quad \mu(\mathbb{1}) = 1. \tag{A. 2}$$

Combination of (A. 1) and the last requirement of (A. 2) implies that µ attributes the value 1 to any

orthogonal decomposition of unity.

In section III. 2, p. 46, we saw that pure states are represented by the extreme elements of a convex

set, and by proving the theorem on p. 49 we showed that the extreme elements of the convex set S(H)

of state operators on H are the 1 - dimensional projectors in P (H). Consequently, the measure µ is

called extreme if there exists a 1 - dimensional projector P such that

µ(P ) = 1. (A. 3)

This is also expressed by saying that µ is concentrated on P . We can now formulate Gleason’s

theorem for pure states.

GLEASON’S THEOREM FOR PURE STATES:

Under the condition that dim H > 2, a 1 - dimensional projector P 0 ∈ P (H) exists on

which the measure µ : P (H) → [0, 1] is concentrated, such that

µ(P ) = Tr P 0 P (A. 4)

for all P ∈ P (H).

The original proof by A.M. Gleason uses sophisticated mathematical methods and is rather

opaque. Several authors have attempted to give a simpler proof, particularly

C. Piron (1976), J. Dorling (unpublished) and R. Cooke, M. Keane and B. Moran (1985); the

commentaries on the ‘elementary proof’ of Cooke, Keane and Moran by R.I.G. Hughes (1989)

are illuminating.

The following proof is a mixture of all this work. It consists of four steps, which, incidentally, do

not coincide with the sections.

A. 2 CONVERSION TO A 3 - DIMENSIONAL REAL PROBLEM

Before taking the first step, we discuss a number of simple observations. First, the probability

measure of Gleason’s theorem has to be continuous in P . In section III. 1, p. 48, we showed that

discontinuous probability measures exist for dim H = 2. Therefore, the requirement dim H > 2

holds without further mentioning throughout this appendix. Second, since the trace of a projector P

is equal to the dimension of the subspace onto which it projects, the trace of a 1 - dimensional projector

is 1, which yields for µ being concentrated on P 0

$$\mu(\mathbb{1}) = \mathrm{Tr}\, P_0 \mathbb{1} = 1, \tag{A. 5}$$

in accordance with (A. 2) and (A. 4). Third, every measure is entirely determined by giving its values

on the 1 - dimensional projectors, and, since every higher - dimensional projector P is the sum

of orthogonal 1 - dimensional projectors P i we can, with (A. 1), determine µ (P ) from the values

of $\mu(P_i)$. Fourth, for every Hilbert space, (A. 4) at the same time defines an extreme measure µ on H

which is concentrated on $P_0$, and from now on we will denote this measure by $\mu_0$,

$$\mu_0(P) := \mathrm{Tr}\, P_0 P. \tag{A. 6}$$

Using the idempotence of $P_0$ we have

$$\mu_0(P_0) = \mathrm{Tr}\, P_0^2 = \mathrm{Tr}\, P_0 = 1, \tag{A. 7}$$

from which we see that (A. 4) holds for $\mu = \mu_0$ and $P = P_0$. Since this measure, being concentrated

on $P_0$, assigns the value 0 to all projectors orthogonal to $P_0$, it can also easily be verified that this

measure satisfies the requirements (A. 1) and (A. 2).
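
The verification mentioned above can also be carried out numerically. The following sketch assumes a real 3-dimensional space, takes the direction on which $\mu_0$ is concentrated along the third coordinate axis, and uses a randomly generated orthonormal triad; these choices and the helper names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

def projector(v):
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

e0 = np.array([0.0, 0.0, 1.0])          # direction on which mu_0 is concentrated
P0 = projector(e0)

def mu0(P):
    """Extreme measure mu_0(P) = Tr P_0 P, cf. (A. 6)."""
    return np.trace(P0 @ P)

# Random orthonormal triad in real 3-dimensional space (rows of q.T).
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
P_r, P_s, P_t = (projector(v) for v in q.T)

# Normalization on an orthogonal decomposition of unity, cf. (A. 2).
total = mu0(P_r) + mu0(P_s) + mu0(P_t)
# Additivity on orthogonal projectors, cf. (A. 1).
additive_gap = mu0(P_r + P_s) - (mu0(P_r) + mu0(P_s))
```

Since the trace is linear, additivity holds exactly; the check mainly confirms that the values on any orthonormal triad add up to 1.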

The foregoing observations lead to the conclusion that to prove Gleason’s theorem for pure states

we have to prove that µ = µ 0 for all P ∈ P (H). Now we will take the first step.

A. 2. 1 STEP 1

THEOREM 1:

If Gleason’s theorem for pure states is true for any 3 - dimensional real Hilbert space, it

is also true for any complex Hilbert space with dim H > 2.

We will prove theorem 1 using a proof by contradiction.

Proof

Let H be a complex Hilbert space with dim H > 3 for which Gleason’s theorem is not true. Since

all higher - dimensional projectors can be decomposed to 1 - dimensional projectors, it suffices to

prove this theorem for 1 - dimensional projectors.

Assume a measure µ on H exists, which is concentrated on P 0 ∈ P (H) such that µ(P 0 ) = 1,

but differs from the measure µ 0 defined by (A. 6) in the sense that there is some 1 - dimensional

projector P 1 for which the theorem does not hold,

µ 0 (P 1 ) := Tr P 0 P 1 ≠ µ(P 1 ). (A. 8)

First we will show that, if these measures differ on a higher - dimensional Hilbert space, they also

differ on a 3 - dimensional Hilbert space.

Using the projectors P 0 and P 1 , we can construct a set of three orthogonal 1 - dimensional projectors

P 0 , ˜P 1 , P 2 in the following way. With P 0 = |e 0 ⟩ ⟨e 0 | and P 1 = |e 1 ⟩ ⟨e 1 |, construct a

unit vector |ẽ 1 ⟩ in the plane spanned by |e 0 ⟩ and |e 1 ⟩ which is perpendicular to |e 0 ⟩, i.e.

|ẽ 1 ⟩ ∝ (11 − P 0 ) |e 1 ⟩, (A. 9)

as can be seen in figure A. 1. Then the projector ˜P 1 := |ẽ 1 ⟩ ⟨ẽ 1 | is perpendicular to P 0 . 1

Figure A. 1: Construction of a 3-dimensional subspace E

Let P 2 be a 1 - dimensional projector which is perpendicular to both P 0 and ˜P 1 , it is always possible

to choose such a projector because dim H > 3. With P 2 = |e 2 ⟩ ⟨e 2 |, the three orthonormal

vectors |e 0 ⟩,|ẽ 1 ⟩ and |e 2 ⟩ together span a 3 - dimensional Hilbert space, which is a subspace of H.

We will call this space E, and, by construction, P 0 , P 1 , ˜P 1 , P 2 ∈ P (E).

1 To be exact, $\tilde{P}_1 = (1 - \mathrm{Tr}\, P_0 P_1)^{-1} (P_1 + P_0 P_1 P_0 - P_1 P_0 - P_0 P_1)$.
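
The construction (A. 9) and the closed form given in the footnote can be compared numerically. The sketch below is an illustration under our own assumptions: it uses two random unit vectors in a 4-dimensional complex space, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
dim = 4

def random_unit_vector():
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

e0, e1 = random_unit_vector(), random_unit_vector()
P0 = np.outer(e0, e0.conj())
P1 = np.outer(e1, e1.conj())
identity = np.eye(dim)

# (A. 9): project |e_1> onto the orthogonal complement of |e_0>, then normalize.
e1_tilde = (identity - P0) @ e1
e1_tilde /= np.linalg.norm(e1_tilde)
P1_tilde = np.outer(e1_tilde, e1_tilde.conj())

# Closed form given in the footnote.
overlap = np.trace(P0 @ P1).real
P1_tilde_closed = (P1 + P0 @ P1 @ P0 - P1 @ P0 - P0 @ P1) / (1 - overlap)
```

Both expressions give the same 1-dimensional projector, and its product with $P_0$ vanishes, as required for orthogonality.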

Now we have the following statements,

(a) P (E) ⊂ P (H),

(b) the restriction of µ 0 to P (E) is a measure on P (E),

(c) the restriction of µ to P (E) is a measure on P (E),

(d) the measures µ 0 and µ differ on P (E).

Statement (a) follows immediately from E ⊂ H. The statements (b) and (c) follow from the

fact that both µ 0 and µ, being concentrated on P 0 , assign the value 1 to E, thereby assigning

the value 0 to all subspaces of H perpendicular to E. Statement (d) follows from our assumption

(A. 8).

Next, we have to show that the Hilbert space E can be real. A Hilbert space is real if scalar

multiplication and linear combinations of vectors are only carried out with real coefficients and

the inner products are real. Choosing the vectors |e 0 ⟩, |ẽ 1 ⟩ and |e 2 ⟩, we have the freedom to

absorb an arbitrary phase factor, which means that we can also take them real. Furthermore, we

can exploit that freedom to bring about that the vector |e 1 ⟩, lying in the plane spanned by |e 0 ⟩

and |ẽ 1 ⟩, becomes a linear combination with real coefficients, i.e.,

|e 1 ⟩ = a |e 0 ⟩ + b |ẽ 1 ⟩ with a, b ∈ R. (A. 10)

All inner products of the four vectors |e 0 ⟩, |ẽ 1 ⟩, |e 2 ⟩ and |e 1 ⟩ now have a real value. The required

real Hilbert space is obtained by taking all linear combinations of |e 0 ⟩, |ẽ 1 ⟩ and |e 2 ⟩ with real

coefficients. Because both |e 0 ⟩ and |e 1 ⟩ are elements of this Hilbert space, (a) through (d) remain

valid.

We see that, if Gleason’s theorem for pure states is not true for a complex Hilbert space with

dim > 3, it is also not true for a real 3 - dimensional Hilbert space. Now assume that the theorem

is proven to be true for a real Hilbert space with dim = 3. At the same time supposing that it is

not true for a Hilbert space with dim > 3, so that it would, as we showed, also not be true for a

real H with dim = 3, yields a contradiction. Therefore, theorem 1 is true. □

A. 3 FORMULATION OF THE PROBLEM ON THE SURFACE OF A SPHERE

While by proving theorem 1 we showed that, if Gleason’s theorem for pure states is true for a

real, 3 - dimensional Hilbert space, it is also true for a complex Hilbert space with dim > 2, we did

not prove that µ = µ 0 . In this section we will take the next steps towards proving that indeed µ = µ 0

for all P ∈ P (H) in a real, 3 - dimensional Hilbert space.

Conversion of an arbitrary complex Hilbert space to a 3 - dimensional real Hilbert space is convenient

because this space is isomorphic with the usual 3 - dimensional Euclidean space R 3 . Here,

the 1 - dimensional projectors correspond to lines through the origin, and we can identify them with

points on the surface of a unit sphere, or actually, with half of the unit sphere because |e⟩ and −|e⟩ represent

the same state. Those points will be designated by means of their spherical coordinates (θ, ϕ),

or as points, or directions, on the surface of the unit sphere p, q, r, s, t, . . . , ∈ S 2 , where S 2 is the

standard notation for this surface, and the index 2 refers to the fact that it is 2 - dimensional.

Letting lines through the origin represent 1 - dimensional projectors, the mapping µ is represented
by a function, also denoted µ, of the points on S 2 or, equivalently, of the spherical coordinates
of those points. This function has the following characteristics.

The point p 0 , corresponding to the projector P 0 for which the measure µ is extreme, so that

µ(p 0 ) = 1, (A. 11)

is called the north pole by convention. The other 1 - dimensional projectors are represented by points on
the northern hemisphere. The set of all 1 - dimensional projectors perpendicular to a given direction r
is called a great circle with axis r; see figure A. 2. The great circle representing the
projectors perpendicular to P 0 is called the equator. Any point s on the equator forms, together with
the equatorial point t perpendicular to it and the north pole p 0 , a mutually perpendicular triple, so
that according to (A. 1), (A. 11), and the requirement 0 ≤ µ ≤ 1,

µ(s) = 0. (A. 12)

The requirement (A. 1) for µ to be a measure is, if µ is taken to be a function of points on the

surface of the unit sphere, that for arbitrary, mutually perpendicular axes (r, s, t) in the northern

hemisphere it holds that

µ(r) + µ(s) + µ(t) = 1, (A. 13)

while for µ taken as a function of the spherical coordinates (θ, ϕ) of the points of intersection of the

arbitrary axes (r, s, t) with the surface of the unit sphere we have

µ(θ r , ϕ r ) + µ(θ s , ϕ s ) + µ(θ t , ϕ t ) = 1 (A. 14)

where for any ϕ it holds that

µ(0, ϕ) = 1 and µ(π/2, ϕ) = 0, (A. 15)

assigning the required values to the north pole and the equator.

Since we are working in a real, 3 - dimensional Hilbert space, we can assign values to the special

measure (A. 6) in accordance with Von Neumann’s value assignment (V. 33), p. 119. Using (III. 45),

with P 0 = |e 0 ⟩ ⟨e 0 | and P s = |ψ⟩ ⟨ψ|,

µ 0 (P s ) = Tr P 0 P s = |⟨ψ | e 0 ⟩|² , (A. 16)

with θ s the angle between s and the north pole, the special measure can be written as

µ 0 (s) = cos² θ s . (A. 17)

We will come back to this value assignment in section A. 4. In the next two steps we will prove

that any measure µ (s) satisfying the requirements (A. 11) to (A. 13), or (A. 14) and (A. 15), is a

nonincreasing function in θ s , and does not depend on ϕ.
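As a numerical sanity check, the following sketch samples random triples of mutually perpendicular directions and confirms that the special measure (A. 17) satisfies the requirement (A. 13). The helper names (random_frame, mu0) are illustrative assumptions and do not appear in these notes. The check only covers µ 0 and says nothing about a general measure µ.

```python
import math
import random

NORTH_POLE = (0.0, 0.0, 1.0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def random_frame(rng):
    """Return three mutually perpendicular unit vectors (r, s, t)."""
    r = normalize(tuple(rng.gauss(0.0, 1.0) for _ in range(3)))
    a = tuple(rng.gauss(0.0, 1.0) for _ in range(3))
    overlap = dot(a, r)
    # Gram-Schmidt step: remove the component of a along r.
    s = normalize(tuple(ai - overlap * ri for ai, ri in zip(a, r)))
    t = cross(r, s)
    return r, s, t

def mu0(direction):
    """Special measure (A. 17): cos^2 of the angle with the north pole."""
    return dot(direction, NORTH_POLE) ** 2

rng = random.Random(0)
frame_sums = [sum(mu0(axis) for axis in random_frame(rng)) for _ in range(1000)]
max_deviation = max(abs(total - 1.0) for total in frame_sums)
print(max_deviation)
```

Because mu0 depends only on the square of the z component, it takes the same value on a direction and on its antipode, in line with the identification of |e⟩ and −|e⟩.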

A. 3. 1 STEP 2

THEOREM 2:

If the function µ (s) or, equivalently, µ (θ s , ϕ s ), satisfies the requirements (A. 11)

to (A. 15), then µ(s) is a nonincreasing function in θ s .

We will prove this theorem using two lemmas.

A. 3. 1. 1 LEMMA 1

A LITTLE LEMMA:

Let {s ∈ S 2 | s ⊥ r} be the great circle with axis r ≠ p 0 . Furthermore, let s 0 represent

the most northern point of this circle. Then for all points s of this great circle it holds

that

µ(s 0 ) ≥ µ(s), (A. 18)

i.e., if we let s travel along a great circle, µ(s) will have its maximum value in the most

northern point s 0 .

Proof

Choose a set of three orthogonal directions r, s, t, with s ∈ S 2 an arbitrary point on the great

circle around axis r. From (A. 13) we have

µ(r) + µ(s) + µ(t) = 1. (A. 19)

Now carry out a rotation of the orthogonal pair s and t around the axis r until s arrives at the most

northern point s 0 of the great circle. Under this rotation t arrives at a point t ′ at the equator as

can be seen in figure A. 2.

Figure A. 2: Rotation of s to s 0 and t to t ′ along a great circle around axis r

Since r, s 0 and t ′ are still mutually orthogonal, we have

µ(r) + µ(s 0 ) + µ(t ′ ) = 1, (A. 20)

and combination with (A. 19) gives

µ(s) + µ(t) = µ(s 0 ) + µ(t ′ ). (A. 21)

But t ′ is on the equator, where, according to (A. 12), µ(t ′ ) = 0, and with 0 ≤ µ ≤ 1 we see that

µ(s) = µ(s 0 ) − µ(t) ≤ µ(s 0 ). (A. 22)

Therefore, on the great circle µ(s) has its largest value in the most northern point. □
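The geometry used in this proof can be checked numerically. The sketch below takes one arbitrarily chosen axis r, constructs the most northern point s 0 of its great circle, and verifies that the partner direction t ′ lies on the equator. It then checks (A. 21) and (A. 22) for the special measure µ 0 only, used here as an example. The function names and the chosen axis are assumptions made for illustration.

```python
import math

NORTH_POLE = (0.0, 0.0, 1.0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def most_northern_point(r):
    """Project the north pole onto the plane perpendicular to r and normalize."""
    overlap = dot(NORTH_POLE, r)
    return normalize(tuple(n - overlap * ri for n, ri in zip(NORTH_POLE, r)))

def mu0(direction):
    """Example measure: cos^2 of the angle with the north pole."""
    return dot(direction, NORTH_POLE) ** 2

r = normalize((1.0, 2.0, 2.0))   # arbitrary axis, not the north pole
s0 = most_northern_point(r)
t_prime = cross(r, s0)           # completes the triple (r, s0, t')

def rotated_pair(angle):
    """Rotate the pair (s0, t') around r by the given angle."""
    s = tuple(math.cos(angle) * a + math.sin(angle) * b
              for a, b in zip(s0, t_prime))
    t = cross(r, s)
    return s, t

pairs = [rotated_pair(math.radians(k)) for k in range(0, 360, 5)]
equator_height = t_prime[2]      # z component of t', expected to vanish
print(equator_height, mu0(s0))
```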

A. 3. 1. 2 LEMMA 2

PIRON’S GEOMETRIC LEMMA :

If s and t are points on the northern hemisphere, and s lies more northwards than t,
a curve from s to t can be found, consisting entirely of segments of great circles, each
segment starting at the most northern point of its great circle.

The following proof of Piron’s geometric lemma using projective geometry has been given by

Cooke, Keane and Moran (1985).

Proof

The surface of the northern hemisphere of the unit sphere can be projected bijectively from the

origin onto the horizontal plane P tangent to the north pole, as can be seen in figure A. 3. Therefore,

we can also formulate our problem in this plane.

Figure A. 3: Projection of points on a great circle onto a plane P through the north pole

All great circles, except the equator, are projected onto this plane as straight lines. The most

northern point of such a great circle is projected onto the point of its corresponding line that is

closest to the north pole. The line connecting the image of the north pole, Im(p 0 ), and the image

of s 0 , Im(s 0 ), therefore intersects this line at a right angle.
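The two claims of this paragraph can be illustrated with a short numerical sketch: points of a great circle are projected from the origin onto the tangent plane z = 1, their images are checked to lie on one straight line, and the image of s 0 is checked to be the point of that line closest to Im(p 0 ). The axis r and the helper names are assumptions chosen for illustration.

```python
import math

NORTH_POLE = (0.0, 0.0, 1.0)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def project(s):
    """Central projection from the origin onto the plane z = 1."""
    return (s[0] / s[2], s[1] / s[2])

r = normalize((1.0, 2.0, 2.0))   # axis of the great circle (arbitrary choice)
overlap = dot(NORTH_POLE, r)
s0 = normalize(tuple(n - overlap * ri for n, ri in zip(NORTH_POLE, r)))
t_prime = cross(r, s0)

# Points on the northern half of the great circle, within 80 degrees of s0.
angles = [math.radians(k) for k in range(-80, 81, 10)]
images = [project(tuple(math.cos(a) * u + math.sin(a) * w
                        for u, w in zip(s0, t_prime)))
          for a in angles]
image_s0 = project(s0)

def line_residual(point):
    """Vanishes exactly on the line r_x X + r_y Y + r_z = 0 in the plane."""
    return r[0] * point[0] + r[1] * point[1] + r[2]

distances = [math.hypot(*point) for point in images]
line_direction = (images[-1][0] - images[0][0], images[-1][1] - images[0][1])
right_angle_residual = (image_s0[0] * line_direction[0]
                        + image_s0[1] * line_direction[1])
print(min(distances), math.hypot(*image_s0), right_angle_residual)
```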

The projection plane therefore contains three kinds of curves: circles around the projected north
pole, corresponding to circles of constant northern latitude, where θ is constant; lines through the
projected north pole, corresponding to meridians, which are lines of constant ϕ; and straight lines
that are projected great circles. One of those great circles is depicted in figure A. 4 by the thick
grey line, while the projection of its most northern point is connected with the projected north pole
by the thin grey line.

Figure A. 4: Projection of meridians, circles with constant latitude, and a great circle

A continuous path from s to t, with s more northern than t, therefore θ s < θ t , along a series of

segments of great circles while always starting at their most northern point, is represented in this

way by a spiral consisting of straight line segments as shown in figure A. 5.

Figure A. 5: Spiral representing a projected path from s to t along subsequent great circles, each time starting at their most northern point

By increasing the number of segments between s and t, we can let this spiral approach a circle

with the north pole as its center. This means that on the northern hemisphere we can travel

every desired distance in longitude by changing over to other great circles, while by changing

over frequently enough we can make the decrease in northern latitude arbitrarily small, leaving θ

constant or nearly constant.
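The limiting behaviour can be made quantitative in the projection plane. A segment that starts at the point of its line closest to the projected north pole is perpendicular to the radius there, so sweeping a longitude δ along it stretches the radius ρ = tan θ by a factor 1/cos δ. The sketch below, with hypothetical function names and arbitrarily chosen numbers, iterates this for n equal segments covering a fixed total longitude and shows that the descent in θ shrinks as n grows.

```python
import math

def final_colatitude(theta_start, total_longitude, n_segments):
    """Colatitude reached after n equal great-circle segments.

    Each segment starts at the most northern point of its great circle and
    covers a longitude of total_longitude / n_segments.
    """
    delta = total_longitude / n_segments
    if not 0.0 < delta < math.pi / 2:
        raise ValueError("each segment must cover less than a quarter turn")
    rho = math.tan(theta_start)      # radius in the projection plane
    for _ in range(n_segments):
        rho /= math.cos(delta)       # perpendicular start: rho -> rho / cos(delta)
    return math.atan(rho)

theta_start = 0.5                    # radians, arbitrary example
total_longitude = math.pi            # travel half way around the pole
segment_counts = [4, 16, 64, 256]
descents = [final_colatitude(theta_start, total_longitude, n) - theta_start
            for n in segment_counts]
for n, descent in zip(segment_counts, descents):
    print(n, descent)
```

In this example the descent decreases monotonically with the number of segments, which is the sense in which the spiral approaches a circle of constant latitude.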

It is also possible to travel from a point t to a more southern point v having the same longitude,
ϕ t = ϕ v . Of course, this can be done by traveling along a nearly circular path as described
above while taking ϕ from 0 to 2π and changing over just often enough to descend the required
distance, but we will show it can also be done taking a path along two great circles only, again
starting in their most northern points.

As we saw, on the plane P paths of constant latitude are represented by circles around the north

pole p 0 . Taking t as the starting point, choose it to be the most northern point of a great circle and

travel along a segment, projected onto P as a straight line, to arrive at u, with θ u > θ t . From u,

also choosing it to be the most northern point of a great circle, travel along a segment in opposite

rotational direction, to arrive at v, the projection of which can be seen in figure A. 6.

Figure A. 6: Path from t to v, having the same longitude

By traveling far enough along the great circle through t, u can always be chosen such that v can
be reached from t in two steps. This means that we can always combine paths with constant latitude
and constant longitude to create a path between two points s and t, where s is more northern
than t, consisting entirely of segments of great circles, each starting at its most northern point,
thereby satisfying Piron’s lemma. □
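The two-step descent along a meridian can be worked out explicitly in the projection plane. If both segments sweep the same longitude β in opposite rotational directions, the radius ρ = tan θ grows by 1/cos β on each segment, so ρ v = ρ t / cos² β. The following sketch, with a hypothetical function name and arbitrary example angles, solves for β and constructs the projected points t, u and v. It assumes that t is not the north pole and that v lies above the equator.

```python
import math

def two_step_path(theta_t, theta_v):
    """Projected points (t, u, v) of a two-segment path with equal longitude.

    t is placed on the positive x axis of the projection plane.
    """
    if not 0.0 < theta_t < theta_v < math.pi / 2:
        raise ValueError("need 0 < theta_t < theta_v < pi/2")
    rho_t = math.tan(theta_t)
    rho_v = math.tan(theta_v)
    # rho_v = rho_t / cos(beta)^2 fixes the longitude swept by each segment.
    beta = math.acos(math.sqrt(rho_t / rho_v))
    t = (rho_t, 0.0)
    # First segment: perpendicular to the radius at t, counterclockwise.
    u = (rho_t, rho_t * math.tan(beta))
    rho_u = math.hypot(*u)
    step = rho_u * math.tan(beta)
    # Second segment: perpendicular to the radius at u, clockwise.
    v = (u[0] + step * math.sin(beta), u[1] - step * math.cos(beta))
    return t, u, v

t, u, v = two_step_path(0.3, 1.0)
print(t, u, v)
print(math.atan(math.hypot(*v)))     # colatitude reached at v
```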

A. 3. 1. 3 RESULT OF LEMMA 1 AND 2

By proving the first lemma, we showed that µ(s 0 ), with s 0 the most northern point of the great
circle through s, is always larger than or equal to µ(s). Consequently, µ can only remain constant or
decrease along a great circle if traveling along the circle starts from its most northern point.

According to lemma 2, when traveling from s to t, where s is more northern than t, it is always possible
to follow a path along subsequent great circles, each time starting at their most northern points.

Combining the two lemmas, we can find a sequence of points s, s ′ , s ′′ , . . . , t, with

µ(s) ≥ µ(s ′ ) ≥ . . . ≥ µ(t) for θ s < θ s ′ < . . . < θ t , (A. 23)

and therefore

µ(s) ≥ µ(t) for θ s < θ t , (A. 24)

which proves theorem 2. □

A. 3. 2 STEP 3

THEOREM 3:

The function µ is constant at constant latitude and hence does not depend on ϕ,

θ s = θ t ⇒ µ(θ s , ϕ s ) = µ(θ t , ϕ t ). (A. 25)

Proof, first part

Again, we will use a proof by contradiction.

Suppose a latitude exists, i.e., there is a horizontal circle B on the surface of the unit sphere,

B(θ 0 ) = {s ∈ S 2 | θ s = θ 0 }, (A. 26)

for which µ is not constant. Here we assume that B(θ 0 ) is not the north pole or the equator, where
theorem 3 is obvious. Now let

M (θ 0 ) := sup{µ(s) ∈ [0, 1] | s ∈ B(θ 0 )} (A. 27)

and

m(θ 0 ) := inf{µ(s) ∈ [0, 1] | s ∈ B(θ 0 )}, (A. 28)

where M (θ 0 ) is the least upper bound, or supremum, and m(θ 0 ) is the greatest lower bound, or
infimum, of all values of µ over B(θ 0 ). If µ does not remain constant on B(θ 0 ), then, for some ε > 0,

M (θ 0 ) − m(θ 0 ) = ε. (A. 29)

Now let C be an arbitrary continuous curve which intersects each circle of constant latitude at
most once, i.e., θ is strictly increasing or strictly decreasing along C.

Figure A. 7: A strictly increasing or decreasing curve C

Let p be the point where the curve C intersects the latitude (A. 26),

p = C ∩ B(θ 0 ), (A. 30)

as can be seen in figure A. 7.

For every point s 1 on this curve north of p we have θ s1 < θ 0 , which means that according
to (A. 24) it holds that µ(s 1 ) ≥ µ(s) for every point s ∈ B(θ 0 ). Consequently, it also holds that

µ(θ s1 ) ≥ M (θ 0 ). (A. 31)

Likewise, for all points s 2 of C south of B(θ 0 ) we see that

µ(θ s2 ) ≤ m(θ 0 ). (A. 32)

This reasoning holds no matter how close to B(θ 0 ) the points s 1 and s 2 are chosen.

Because of (A. 29) we conclude that the value of µ, when traveling from north to south along

the curve C, makes a discontinuous jump of at least

M (θ 0 ) − m(θ 0 ) = ε (A. 33)

to a lower value when passing B (θ 0 ). This conclusion applies to every continuous, strictly increasing
or decreasing curve intersecting B (θ 0 ), which means we can also choose the curve C to be a
meridian,

C = {s ∈ S 2 | ϕ s = ϕ 0 }, (A. 34)

which is a great circle through the north pole having its axis t at the equator, see figure A. 8.

Figure A. 8: Great circle C, coordinate system (p, q, t), and rotating pair (s, s ⊥ )

Let q ∈ C be orthogonal to the point of intersection p of C and B (θ 0 ), such that t, p and q
are mutually orthogonal. Choose an orthogonal pair (s, s ⊥ ) of points on C and treat it as a rigid
coordinate system. Rotating this system around axis t, we move s from north to south through point p,
whereby, according to (A. 33), the value of µ jumps discontinuously by at least ε while crossing over the
latitude of B(θ 0 ). Since the pair s and s ⊥ forms a rigid system, we know that

µ(s) + µ(s ⊥ ) + µ(t) = 1, (A. 35)

where µ(t) = 0 because the axis t is on the equator.

Therefore, if s moves southwards, passing through p, and simultaneously s ⊥ moves northwards,
passing through q, the value of µ(s ⊥ ) also has to jump discontinuously. If µ(s) jumps by −ε,
then µ(s ⊥ ) jumps by ε.

Now choose another great circle C ′ with axis t ′ , which intersects B(θ 0 ) in p under a slightly tilted

angle, as can be seen in figure A. 9.

Figure A. 9: Great circle C and tilted great circles C ′ and C ′′

For this great circle we can repeat the same argument, and conclude that for s ′ ∈ C ′ , while

passing the latitude of B(θ 0 ), µ(s ′ ) makes a jump of at least ε, and an equally valued but opposite

jump is made by µ (s ′⊥ ) in a point q ′ ∈ C ′ which is again perpendicular to p. Notice that,

because C ′ is tilted with respect to C, θ(q) ≠ θ(q ′ ).

This argument can be repeated endlessly, with great circles C ′′ , C ′′′ , . . . , C n , intersecting B(θ 0 )

in p, always under different angles. We therefore find a series of points q, q ′ , q ′′ , . . . , q n where,

in passing through one of them while traveling along one of the great circles through p, the value

of µ jumps discontinuously. □
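That the points q, q ′ , q ′′ , . . . all have different latitudes can be made explicit. In the sketch below, the great circle through p is parametrized by a tilt angle α between its tangent at p and the meridian. This parametrization and the function names are assumptions of the sketch, not notation from the text. The northern-hemisphere point of the circle perpendicular to p then satisfies cos θ q = cos α sin θ 0 , so different tilts give different θ q .

```python
import math

def point_p(theta0):
    """Point p on the latitude B(theta0), placed at longitude 0."""
    return (math.sin(theta0), 0.0, math.cos(theta0))

def point_q(theta0, tilt):
    """Northern point perpendicular to p on the great circle through p.

    tilt is the angle between the circle's tangent at p and the meridian;
    tilt = 0 gives the meridian C itself.
    """
    return (-math.cos(tilt) * math.cos(theta0),
            -math.sin(tilt),
            math.cos(tilt) * math.sin(theta0))

def colatitude(s):
    return math.acos(s[2])

theta0 = math.radians(40.0)                       # arbitrary example latitude
tilts = [math.radians(k) for k in range(0, 50, 10)]
p = point_p(theta0)
q_points = [point_q(theta0, tilt) for tilt in tilts]
q_colatitudes = [colatitude(q) for q in q_points]
for tilt, theta_q in zip(tilts, q_colatitudes):
    print(round(math.degrees(tilt)), math.degrees(theta_q))
```

For tilt 0 the point q is the most northern point of the great circle with axis p, at θ q = π/2 − θ 0 . Increasing the tilt moves q southwards along that circle.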

Here we briefly pause from the proof of theorem 3 to prove a simple lemma.

ACCESSORY LEMMA:

Let C 1 and C 2 be two continuous curves on S 2 , intersecting in q, where q is not the

most northern point of either curve. For some s ∈ C 1 , with s more northern than q,

suppose that, traveling south, µ(s) makes a discontinuous jump of −ε < 0 in the point

of intersection q.

This means that for all s ∈ C 1 and some constant a,

θ s < θ q ⇒ µ(s) ≥ a, (A. 36)

and

θ s > θ q ⇒ µ(s) ≤ a − ε, (A. 37)

and consequently, for all t ∈ C 2 , µ(t) also makes a discontinuous jump in q of at least ε.

Proof

For every pair of points (s 1 , s 2 ) ∈ C 1 , where θ s1 < θ q and θ s2 > θ q , we can always find a

pair (t 1 , t 2 ) ∈ C 2 , such that θ s1 < θ t1 < θ q and θ s2 > θ t2 > θ q , see figure A. 10.

Figure A. 10: Two continuous curves on S 2 , intersecting in q

Using (A. 24), (A. 36) and (A. 37), we have for t ∈ C 2

θ s < θ t < θ q ⇒ µ(s) ≥ µ(t) ≥ a, (A. 38)

and

θ s > θ t > θ q ⇒ µ(s) ≤ µ(t) ≤ a − ε. (A. 39)

This holds no matter how close to q the points s and t are chosen, which proves the lemma. □

Proof, second part

Now we continue the proof of theorem 3. In the first part of the proof we showed for the pair
(s, s ⊥ ) that if µ jumps by at least ε in p, it also jumps by at least ε in q. Since the same rigidity
holds for any pair (s i , s i⊥ ) ∈ C i , we concluded that µ jumps in every point q, q ′ , q ′′ , . . . , q n
by at least ε. With the accessory lemma, we proved that, if µ makes a jump of at least ε at some point
on one curve C, it does so on any curve C i through that point.

Since we chose the directions q, q ′ , q ′′ , . . . , q n perpendicular to p, see figure A. 9, they all lie

on C p , a great circle with axis p. Starting in its most northern point q, upon descending along this

great circle C p towards the equator, µ(s) remains constant or decreases, as we showed by proving

theorem 2.

But according to the first part of this proof and the accessory lemma, upon descending along
this great circle C p towards the equator, in each of the points q, q ′ , q ′′ , . . . , q n , µ drops by
at least ε while passing their various latitudes. Since we can choose n arbitrarily large, we can
choose n > 1/ε, making the total drop nε > 1. This leads to µ acquiring values
smaller than 0, which contradicts the requirement that 0 ≤ µ ≤ 1. We have to conclude
that ε = 0, which yields M (θ 0 ) = m(θ 0 ).

We proved that if on the surface of the unit sphere a horizontal circle B exists for which µ is not
constant, then µ takes values outside [0, 1]. Hence µ is constant on constant latitude and does not
depend on ϕ, which proves theorem 3. □

A. 4 AN ANALYTIC LEMMA

We have to take one more step to prove that µ = µ 0 , but first we prove a lemma using results

from previous sections.

LEMMA:

The special measure µ 0 can be written as

µ 0 (χ s ) = χ s . (A. 40)

Proof

The special measure (A. 6) can, as we saw in (A. 17), be written as

µ 0 (P ) = Tr P 0 P = |⟨ψ | e 0 ⟩|² = cos² θ. (A. 41)

As we proved that µ is a nonincreasing function in θ, and does not depend on ϕ, we can take µ
to be a function of a function of θ. To already make a connection with the analytic lemma
of step 4, which will follow shortly, we choose this function to be the continuous, nonincreasing
function χ : [0, π/2] → [0, 1], χ(θ s ) := cos² θ s , abbreviated as χ s , where θ s is the angle between
the direction s and the north pole. In the next step we will show that this measure satisfies the
requirements for µ to be a measure.

For the special measure µ 0 (θ s ), with s representing an arbitrary P , (A. 41) now reads

µ 0 (χ(θ s )) = cos² θ s (A. 42)

which can be written as

µ 0 (χ s ) = χ s . □ (A. 43)

What is left for us to do is to see whether a measure exists, not equal to µ 0 , for which this does

not hold for some P ∈ P (H), as was our assumption (A. 8) in A. 2. 1. This will be the final step,

where, by proving the next theorem, we will see that such a measure does not exist.

A. 4. 1 STEP 4

THEOREM 4:

The only form of µ satisfying (A. 24),

θ s < θ t ⇒ µ(s) ≥ µ(t), (A. 44)

is

µ(χ s ) = χ s . (A. 45)

To prove this theorem, we will use an analytic lemma given by Cooke, Keane and Moran (1985).
Before doing so, we make some observations.

First, for any triple of mutually perpendicular directions (r, s, t) and some direction q it holds in
general that

cos² θ r + cos² θ s + cos² θ t = 1, (A. 46)

where θ r is the angle between the direction q and axis r, so that cos θ r is the component of q
along r, and likewise for s and t. We can easily see that (A. 46) holds in general if we express the
direction q in the usual spherical coordinates (θ, ϕ) with respect to the axes (r, s, t),

cos θ r = cos ϕ sin θ, cos θ s = sin ϕ sin θ, and cos θ t = cos θ, (A. 47)

from which we readily see that their squares add up to 1.

With χ(θ r ) = cos² θ r etc., we can write (A. 46) as

χ r + χ s + χ t = 1. (A. 48)

Second, for µ as a function of χ(θ s ), µ : [0, 1] → [0, 1], it holds that although µ is nonincreasing

in θ, it is nondecreasing in χ s . The requirements for µ to be a measure, (A. 14) and (A. 15), can now

be rewritten as

µ(χ r ) + µ(χ s ) + µ(χ t ) = 1, (A. 49)

µ(0) = 0 and µ(1) = 1. (A. 50)

With these properties, µ equals the function f in the analytic lemma which now follows.

ANALYTIC LEMMA:

If f : [0, 1] → [0, 1] is a function such that

(1) f (0) = 0,

(2) f is nondecreasing, i.e., if a < b then f (a) ≤ f (b),

(3) if a, b, c ∈ [0, 1] and a + b + c = 1, then f (a) + f (b) + f (c) = 1,

then f is the identity function: f (a) = a for all a ∈ [0, 1].

Proof

Choosing c = 0, from (3) we have b = 1 − a, yielding

f (a) = 1 − f (1 − a) (A. 51)

for all values a ∈ [0, 1]. Next, choose c = 1 − (a + b); then (3) and (A. 51) give

f (a) + f (b) = 1 − f (1 − (a + b)) = 1 − (1 − f (a + b)) = f (a + b) (A. 52)

for all a, b, a + b ∈ [0, 1].

Iteration of (A. 52) yields, for n ∈ N + ,

n f (a) = f (na) for n a ≤ 1. (A. 53)

Taking a = 1/n we see that

f (1/n) = f (1)/n = 1/n, (A. 54)

and iterating again, we have

f (m/n) = m/n for m, n ∈ N, m < n, (A. 55)

or, indeed,

f (a) = a ∀ a ∈ Q. (A. 56)

From (2) we see that

lim_{a→0} f (a) = sup_{a→0} f (a) = 0, (A. 57)

and, using again (A. 52),

lim_{a→0} f (a + b) = f (b) ∀ 0 ≤ b ≤ 1. (A. 58)

Therefore, f is continuous, and

f (a) = a ∀ a. □ (A. 59)
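The role of the three conditions can be explored with exact rational arithmetic. The sketch below tests candidate functions against conditions (1) to (3) on a finite grid of rationals; the function names and the candidates are illustrative assumptions. A finite check of this kind is no substitute for the proof, but it shows how condition (3) rules out monotone functions that still satisfy the two-term symmetry (A. 51).

```python
from fractions import Fraction

def satisfies_conditions(f, denominator=12):
    """Check conditions (1)-(3) of the analytic lemma on the grid k/denominator."""
    grid = [Fraction(k, denominator) for k in range(denominator + 1)]
    # (1) f(0) = 0
    if f(Fraction(0)) != 0:
        return False
    # (2) f is nondecreasing (adjacent grid points suffice by transitivity)
    if any(f(a) > f(b) for a, b in zip(grid, grid[1:])):
        return False
    # (3) a + b + c = 1 implies f(a) + f(b) + f(c) = 1
    for a in grid:
        for b in grid:
            c = 1 - a - b
            if c < 0:
                continue
            if f(a) + f(b) + f(c) != 1:
                return False
    return True

def identity(a):
    return a

def square(a):
    return a * a

def smoothstep(a):
    # Monotone on [0, 1] and satisfies f(a) + f(1 - a) = 1, yet fails (3).
    return 3 * a * a - 2 * a * a * a

results = {name: satisfies_conditions(func)
           for name, func in [("identity", identity),
                              ("square", square),
                              ("smoothstep", smoothstep)]}
print(results)
```

The smoothstep candidate already fails at a = b = c = 1/3, where it gives 3 · 7/27 = 7/9 instead of 1.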

A. 5 SUMMARY

In this appendix we proved Gleason’s theorem for pure states, represented by extreme measures µ.

In section A. 2 we proved that if Gleason’s theorem for pure states is true for any 3 - dimensional

real Hilbert space, it is also true for any complex Hilbert space with dim H > 2. In A. 3. 1 we showed

that µ is a nonincreasing function in θ, and in A. 3. 2 we proved that µ does not depend on ϕ.

Finally, by proving the analytic lemma we showed that there can only be one form for the measure
µ which satisfies these requirements for all P ∈ P (H) and that is the quantum mechanical one,
i.e., in accordance with cos² θ.

WORKS CONSULTED

Most subjects in these lecture notes are also found in Redhead (1987), Krips (1987), Hughes (1989),

D’Espagnat (1989) and Bub (1997).

Dickson (1998) is an accessible monograph.

Jammer (1974) is a survey of the research in foundations of quantum mechanics in historical perspective

from the beginnings of quantum mechanics until 1974. However, Jammer remains indispensable

for every student seriously studying foundations of quantum mechanics.

Bell (1987) contains his articles on quantum mechanics.

Von Neumann’s Grundlagen (1932) is a masterpiece, which is still fully worth studying.

Prugovečki (2006) is a modernized and more systematic version, but it evades subjects of interpretation

and is mainly a mathematical reference book.

Busch, Lahti and Mittelstaedt (1996) is a monograph on quantum mechanical measurement theory.

Hooker (1975) is a collection of important articles of algebraic and logical signature.

Wheeler and Zurek (1983) is an extensive collection of photocopies of important articles (EPR, Bohr,

Bohm, Everett, etc.).

Fine (1986) is the unequalled monograph on Einstein and quantum mechanics.

Contributions to the research of foundations of quantum mechanics from Utrecht University are the

work of Hilgevoord and Uffink and vice versa, of Dieks and Vermaas about the modal interpretation

of quantum mechanics and Uffink’s thesis (1990) about uncertainty relations.

BIBLIOGRAPHY

Albers, D.J., Alexanderson, G.L., Reid, C. (1990) More Mathematical People : Contemporary Conversations

Boston: Harcourt Brace Jovanovich

Araki, H., Yanase, M.M. (1960) ‘Measurement of Quantum Mechanical Operators’

Physical Review 120 (2) pp. 622-626

Aspect, A., Dalibard, J., Roger, G. (1982) ‘Experimental Test of Bell’s Inequalities Using Time -

Varying Analyzers’

Physical Review Letters 49 (25) pp. 1804-1807

Belinfante, F.J. (1973) A Survey of Hidden - Variables Theories

Oxford: Pergamon Press

Bell, J.S. (1964) ‘On the Einstein Podolsky Rosen Paradox’

Physics 1 (3) pp. 195-200, repr. in Wheeler and Zurek (1983)

Bell, J.S. (1966) ‘On the Problem of Hidden Variables in Quantum Mechanics’

Reviews of modern physics 38 pp. 447-452

Bell, J.S. (1971) ‘Introduction to the hidden - variables question’

In d’Espagnat (1971), repr. in Bell (1987)

Bell, J.S. (1975) ‘The Theory of Local Beables’

Presented at the sixth GIFT Seminar, Jaca, 2 - 7 June 1975, repr. in Bell (1987)

Bell, J.S. (1982) ‘On the impossible pilot wave’

Foundations of Physics 12 (10) pp. 989-999

Bell, J.S. (1987) Speakable and Unspeakable in Quantum Mechanics

Cambridge: Cambridge University Press

Bell, J.S. (1990) ‘Against measurement’

Physics World (August) pp. 33-40

Beltrametti, E.G., Cassinelli, G. (1981) The Logic of Quantum Mechanics

Reading: Addison - Wesley Publishing Company

Birkhoff, G., Von Neumann, J. (1936) ‘The Logic of Quantum Mechanics’

The Annals of Mathematics, Second Series 37 (4) pp. 823-843

Bohm, D.J. (1952) ‘A Suggested Interpretation of the Quantum Theory in Terms of “Hidden” Variables.

I, II’

Physical Review 85 (2) pp. 166-179, pp. 180-193

Bohm, D.J., Aharonov, Y. (1957) ‘Discussion of Experimental Proof for the Paradox of Einstein,

Rosen, and Podolsky’

Physical Review 108 (4) pp. 1070-1076

Bohm, D.J. (1981) Wholeness and the implicate order

London: Routledge & Kegan Paul

Bohm, D.J., Peat, F.D. (1989) Science, order, and creativity

London: Routledge

Bohr, N.H.D. (1928) ‘The Quantum Postulate and the Recent Development of Atomic Theory’

Nature 121 (3050) pp. 580-590

Bohr, N.H.D. (1931) ‘Maxwell and Modern Theoretical Physics’

Nature 128 (3234) pp. 691-692

Bohr, N.H.D. (1934) Atomic Theory and the Description of Nature

New York: The Macmillan Company

Bohr, N.H.D. (1935a) ‘Quantum Mechanics and Physical Reality’

Nature 136 p. 65

Bohr, N.H.D. (1935b) ‘Can Quantum - Mechanical Description of Physical Reality Be Considered

Complete?’

Physical Review 48 (8) pp. 696-702

Bohr, N.H.D. (1939) ‘The causality problem in atomic physics’

In Bohr, N.H.D. (1939) New Theories in Physics

Paris: International Institute of Intellectual Co - operation

Bohr, N.H.D. (1947) ‘Newton’s Principles and Modern Atomic Mechanics’

In The Royal Society of London (1947) Newton Tercentenary Celebrations. 15 - 19 July 1946

Cambridge: Cambridge University Press

Bohr, N.H.D. (1949) ‘Discussion with Einstein on epistemological problems in atomic physics’

In Schilpp (1949), repr. in Wheeler and Zurek (1983)

Bopp, F.A. (1947) ‘Quantenmechanische Statistik und Korrelationsrechnung’

Zeitschrift für Naturforschung A 2 pp. 202-216

Born, M., Jordan, P. (1925) ‘Zur Quantenmechanik’

Zeitschrift für Physik 34 (1) pp. 858-888

Eng. tr. (abridged): ‘On Quantum mechanics’

In Van der Waerden (1967)

Bródy, F., Vámos, T. (eds) (1995) The Neumann Compendium

Singapore: World Scientific Publishing Company

Broglie, L.V.P.R. de (1928) ‘La nouvelle dynamique des quanta’

La Commission Administrative de l’Institut International de Physique Solvay (1928) Électrons et

Photons: Rapports et Discussions du Cinquième Conseil de Physique tenu à Bruxelles du 24

au 29 Octobre 1927 sous les Auspices de l’Institut International de Physique Solvay

Paris: Gauthier - Villars

Eng. tr.: ‘The new dynamics of quanta’

In Bacciagaluppi, G., Valentini, A. (2009) Quantum Theory at the Crossroads : Reconsidering

the 1927 Solvay Conference

Cambridge: Cambridge University Press

Bub, J., Clifton, R.K. (1996) ‘A Uniqueness Theorem for ‘No Collapse’ Interpretations of Quantum

Mechanics’

Studies in the History and Philosophy of Modern Physics B 27 (2) pp. 181-219

Bub, J., Clifton, R.K., Goldstein, S. (2000) ‘Revised Proof of the Uniqueness Theorem for ‘No Collapse’

Interpretations of Quantum Mechanics’

Studies in the History and Philosophy of Modern Physics B 31 pp. 95-98

Bub, J. (1997) Interpreting the Quantum World

Cambridge: Cambridge University Press

Busch, P.S., Grabowski, M.P., and Lahti, P.J. (1995) Operational Quantum physics

Berlin: Springer - Verlag

Busch, P., Lahti, P.J., Mittelstaedt, P. (1991) The Quantum Theory of Measurement

Berlin: Springer - Verlag

Capasso, V., Fortunato, D., Selleri, F. (1973) ‘Sensitive Observables of Quantum Mechanics’

International Journal of Theoretical Physics 7 (5) pp. 319-326

Clauser, J.F., Horne, M.A., Shimony, A., Holt, R.A. (1969) ‘Proposed Experiment to test Local Hidden

- Variable Theories’

Physical Review Letters 23 (15) pp. 880-884

Clifton, R.K., Butterfield, J.N., Redhead, M.L.G. (1990) ‘Nonlocal Influences and Possible Worlds –

A Stapp in the Wrong Direction’

British Journal for the Philosophy of Science 41 (1) pp. 5-58

Condon, E.U. (1929) ‘Remarks on uncertainty principles’

Science 69 pp. 573-574

Cooke, R.M, Hilgevoord, J. (1979) ‘Correspondence, Equivalence and Completeness’

Epistemological Letters (March) pp. 42-54

Cooke, R.M., Keane, M.S., Moran, W. (1985) ‘An elementary proof of Gleason’s theorem’

Mathematical Proceedings of the Cambridge Philosophical Society 98 pp. 117-128

Cushing, J.T. (1994) Quantum Mechanics : Historical Contingency and the Copenhagen Hegemony

Chicago: The University of Chicago Press

Daneri, A., Loinger, A., Prosperi, G.M. (1962) ‘Quantum Theory of Measurement and Ergodicity

Conditions’

Nuclear Physics 33 (1962) pp. 297-319

De Muynck, W.M. (1986) ‘The Bell Inequalities and their Irrelevance to the Problem of Locality in

Quantum Mechanics’

Physics Letters A 114 (2) pp. 65-67

De Muynck, W.M. (1996) ‘Can We Escape from Bell’s Conclusion that Quantum Mechanics Describes

a Non - Local Reality?’

Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of

Modern Physics 27 (3) pp. 315-330

DeWitt, B.S. (1970) ‘Quantum mechanics and reality’

Physics Today 23 (9) pp. 30-40

DeWitt, B.S. (1971) ‘The Many - Universes Interpretation of Quantum Mechanics’

In d’Espagnat (1971), repr. in DeWitt and Graham (1973)

DeWitt, B.S., Graham, R.N. (eds) (1973) The Many - Worlds Interpretation of Quantum Mechanics

Princeton: Princeton University Press

Dickson, W.M. (1998) Quantum Chance and Non - locality : Probability and Non - locality in the

Interpretations of Quantum Mechanics

Cambridge: Cambridge University Press

Dieks, D.G.B.J. (1983) ‘Stochastic Locality and Conservation Laws’

Lettere al Nuovo Cimento 38 (13) pp. 443-447

Dieks, D.G.B.J. (1989) ‘Resolution of the Measurement Problem through Decoherence of the Quantum

State’

Physics Letters A 142 (8,9) pp. 439-446

Dieks, D.G.B.J. and Vermaas, P.E. (eds) (1998) The Modal Interpretation of Quantum Mechanics

Dordrecht: Kluwer Academic Publishers

Dirac, P.A.M. (1925) ‘The Fundamental Equations of Quantum Mechanics’

Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical

and Physical Character 109 (752) pp. 642-653

Dirac, P.A.M. (1958) The Principles of Quantum Mechanics

Oxford: at the Clarendon Press

Dirac, P.A.M. (1963) ‘The Evolution of the Physicist’s Picture of Nature’

Scientific American 208 (5) pp. 45-53

Eberhard, P.H. (1977) ‘Bell’s Theorem without Hidden Variables’

Il Nuovo Cimento B 38 (1) pp. 75-80

Einstein, A. (1921) ‘Geometrie und Erfahrung’

Sitzungsberichte der Preussischen Akademie der Wissenschaften pp. 123-130

Eng. tr.: Bargmann, S. (transl) ‘Geometry and Experience’

In Janssen, M., Schulmann, R., Illy, J., Lehner, C., Buchwald, D. (eds) (2002) The Collected

Papers of Albert Einstein, Volume 7 : The Berlin Years: Writings, 1918 - 1921

Princeton: Princeton University Press

Einstein, A. (1934) Mein Weltbild

Amsterdam: Querido Verlag

Eng. tr.: Bargmann, S. (transl), Seelig, C. (ed) (1954) Ideas and opinions

New York: Bonanza Books

Einstein, A., Podolsky, B., Rosen, N. (1935) ‘Can Quantum - Mechanical Description of Physical

Reality Be Considered Complete?’

Physical Review 47 (10) pp. 777-780

Einstein, A., Born, M., Born, H. (1971) The Born - Einstein letters : correspondence between Albert

Einstein and Max and Hedwig Born from 1916 to 1955 with commentaries by Max Born

London: The Macmillan Press

Espagnat, B. d’ (ed) (1971) Foundations of Quantum Mechanics : Proceedings of the International

School of Physics ”Enrico Fermi”, held at Varenna, 29th June-11th July, 1970, Course IL

New York: Academic Press

Espagnat, B. d’ (1989) Conceptual Foundations Of Quantum Mechanics

New York: Perseus Books

Everett, H. III (1957) ‘The Theory of the Universal Wave Function’

In DeWitt and Graham (1973)

Everett, H. III (1957) “‘Relative State” Formulation of Quantum Mechanics’

Reviews of Modern Physics 29 (3) pp. 454-462

Fine, A.I. (1982) ‘Hidden Variables, Joint Probability, and the Bell Inequalities’

Physical Review Letters 48 (5) pp. 291-295

Fine, A. (1986) The shaky game : Einstein, realism and the quantum theory

Chicago: University of Chicago Press

Folse, H.J. (1985) The philosophy of Niels Bohr : the framework of complementarity

Amsterdam: North - Holland Physics Publishing

Fraassen, B.C. van (1973) ‘Semantic Analysis of Quantum Logic’

In Hooker, C.A. (ed) (1973) Contemporary Research in the Foundations and Philosophy of

Quantum Theory

Dordrecht: D. Reidel Publishing Company

Fraassen, B.C. van (1979) ‘Hidden Variables and the Modal Interpretation of Quantum Theory’

Synthese 42 (1) pp. 155-165

Frank, P.G. (1949) Modern Science and Its Philosophy

Cambridge: Harvard University Press

Freedman, S.J., Clauser, J.F. (1972) ‘Experimental Test of Local Hidden-Variable Theories’

Physical Review Letters 28 (14) pp. 938-941

Ghirardi, G.C., Rimini, A., Weber, T. (1980) ‘A General Argument against Superluminal Transmission

through the Quantum Mechanical Measurement Process’

Lettere al Nuovo Cimento 27 (10) pp. 293-298

Ghirardi, G.C., Rimini, A., Weber, T. (1986) ‘Unified dynamics for microscopic and macroscopic

systems’

Physical Review D 34 (2) pp. 470-491

Gleason, A.M. (1957) ‘Measures on the Closed Subspaces of a Hilbert space’

Journal of Mathematics and Mechanics 6 pp. 885-893

Gottfried, K. (1989) ‘Does Quantum Mechanics describe the Collapse of the Wavefunction?’

Unpublished contribution to the 1989 Conference, International School of History of Science,

Erice, Italy, 5-14 August

Greenberger, D.M., Horne, M.A., Zeilinger, A. (1989) ‘Going Beyond Bell’s Theorem’

In Kafatos, M.C. (ed) (1989) Bell’s Theorem, Quantum Theory and Conceptions of the Universe

Dordrecht: Kluwer Academic Publishers

http://arxiv.org/abs/0712.0921

Groenewold, H.J. (1946) ‘On the Principles of Elementary Quantum Mechanics’

Physica 12 (7) pp. 405-460

Haag, R. (1990) ‘Fundamental Irreversibility and the Concept of Events’

Communications in Mathematical Physics 132 pp. 245-251

Healey, R.A. (1989) The philosophy of quantum mechanics : An interactive interpretation

Cambridge: Cambridge University Press

Heisenberg, W. (1925) ‘Über quantentheoretische Umdeutung kinematischer und mechanischer

Beziehungen’

Zeitschrift für Physik 33 (1) pp. 879-893

Eng. tr.: ‘Quantum-theoretical re-interpretation of kinematic and mechanical relations’

In Van der Waerden, B.L. (1967)

Heisenberg, W.K. (1927) ‘Über den anschaulichen Inhalt der quantentheoretischen Kinematik und

Mechanik’

Zeitschrift für Physik 43 (3/4) pp. 172-198

Eng. tr.: ‘The physical content of quantum kinematics and mechanics’

In Wheeler and Zurek (1983)

Heisenberg, W.K. (1930) Die Physikalischen Prinzipien der Quantentheorie

Leipzig: Verlag von S. Hirzel

Eng. tr.: Eckart, C., Hoyt, F.C. (transl) (1930) The Physical Principles of Quantum Theory

New York: Dover Publications

Heisenberg, W. (1963) Niels Bohr Library and Archives

Interview with Werner Heisenberg by T. S. Kuhn at the Max Planck Institute, Munich, Germany,

February 25. Transcript Session VIII

http://www.aip.org/history/ohilist/4661_8.html

Heitler, W.H. (1970) Der Mensch und die naturwissenschaftliche Erkenntnis

Braunschweig: Friedrich Vieweg & Sohn Verlagsgesellschaft

Hey, T., Walters, P. (2003) The New Quantum Universe

Cambridge: Cambridge University Press

Hilgevoord, J., Uffink, J.B.M. (1988) ‘The mathematical expression of the uncertainty principle’

In Merwe, A. van der, Selleri, F., Tarozzi, G. (eds) (1988) Microphysical Reality and Quantum

Formalism. Volume I

Dordrecht: Kluwer Academic Publishers

Hilgevoord, J., Uffink, J.B.M. (1990) ‘A new view on the uncertainty principle’

In Miller, A.I. (ed) (1990) Sixty-Two Years of Uncertainty : Historical, Philosophical and

Physical Inquiries into the Foundations of Quantum Mechanics

New York: Plenum Press

Hilgevoord, J. (2002) ‘Time in quantum mechanics’

American Journal of Physics 70 (3) pp. 301-306

Holevo, A.S. (1982) Probabilistic and Statistical Aspects of Quantum Theory

Amsterdam: North-Holland Publishing Company

Holland, P.R. (1993) The Quantum Theory of Motion : An Account of the de Broglie-Bohm Causal

Interpretation of Quantum Mechanics

Cambridge: Cambridge University Press

Home, D., Selleri, F. (1991) ‘Bell’s Theorem and the EPR Paradox’

La Rivista del Nuovo Cimento 14 (9) pp. 1-95

’t Hooft, G. (1997) In search of the ultimate building blocks

Cambridge: Cambridge University Press

Hooker, C.A. (ed) (1975) The Logico-Algebraic Approach to Quantum Mechanics. Volume I: the

Historical Evolution

Dordrecht: D. Reidel Publishing Company

Hughes, R.I.G. (1989) The Structure and Interpretation of Quantum Mechanics

Cambridge: Harvard University Press

Isham, C.J. (1995) Lectures on Quantum Theory : Mathematical and Structural Foundations

River Edge: Imperial College Press

Jacques, V., Wu, E., Grosshans, F., Treussart, F., Grangier, P., Aspect, A., Roch, J-F. (2007) ‘Experimental

realization of Wheeler’s delayed-choice gedanken experiment’

Science 315 (5814) pp. 966-968

Jammer, M. (1974) The Philosophy of Quantum Mechanics : The Interpretations of Quantum Mechanics

in Historical Perspective

New York: John Wiley & Sons

Jammer, M. (1990) ‘John Stewart Bell and His Work - On the Occasion of His Sixtieth Birthday’

Foundations of Physics 20 (10) pp. 1139-1145

Jammer, M. (1992) ‘John Stewart Bell and the Debate on the Significance of His Contributions to

the Foundations of Quantum Mechanics’

In Merwe, A. van der, Selleri, F., Tarozzi, G. (eds) (1992) International Conference on Bell’s

Theorem and the Foundations of Modern Physics

Singapore: World Scientific Publishing

Jarrett, J.P. (1984) ‘On the Physical Significance of the Locality Conditions in the Bell Arguments’

Noûs 18 (4) pp. 569-589

Jauch, J.M. (1968) Foundations of Quantum Mechanics

Reading: Addison-Wesley Educational Publishers

Kalckar, J. (ed) (1996) Niels Bohr - Collected Works : Volume 7 - Foundations of Quantum Physics II

(1933-1958)

Amsterdam: Elsevier Science

Kampen, N.G. van (1988) ‘Ten Theorems about Quantum Mechanical Measurements’

Physica A 153 pp. 97-113

Kennard, E.H. (1927) ‘Zur Quantenmechanik einfacher Bewegungstypen’

Zeitschrift für Physik 44 (4/5) pp. 326-352

Kochen, S., Specker, E.P. (1967) ‘The Problem of Hidden Variables in Quantum Mechanics’

Journal of Mathematics and Mechanics 17 (1) pp. 59-87

Kochen, S. (1985) ‘A New Interpretation of Quantum Mechanics’

In Lahti, P.J., Mittelstaedt, P. (eds) (1985) Symposium on the foundations of modern physics

1985 : 50 years of the Einstein-Podolsky-Rosen Gedankenexperiment

Singapore: World Scientific Publishing Company

Krips, H. (1987) The Metaphysics of Quantum Theory

Oxford: Clarendon Press

Landau, L.D., Lifshitz, E.M. (1958) Quantum Mechanics : Non-Relativistic theory

London: Pergamon Press

Landau, H.J., Pollak, H.O. (1961) ‘Prolate Spheroidal Wave Functions, Fourier Analysis and Uncertainty - II’

The Bell System Technical Journal 40 pp. 65-84

London, F., Bauer, E. (1939) La Théorie de l’Observation en Mécanique Quantique

Paris: Hermann

Eng. tr.: ‘The Theory of Observation in Quantum Mechanics’

In Wheeler and Zurek (1983)

Lüders, G. (1951) ‘Über die Zustandsänderung durch den Meßprozeß’

Annalen der Physik 443 (5-8) pp. 322-328

Eng. tr.: Kirkpatrick, K.A. (transl) (2006) ‘Concerning the state-change due to the measurement

process’

Annalen der Physik 15 (9) pp. 663-670

Maczynski, M.J. (1971) ‘Boolean Properties of Observables in Axiomatic Quantum Mechanics’

Reports on Mathematical Physics 2 (2) pp. 135-150

Mermin, N.D. (1993) ‘Hidden variables and the two theorems of John Bell’

Reviews of Modern Physics 65 (3) pp. 803-815

Meyer, D.A. (1999) ‘Finite precision measurement nullifies the Kochen-Specker theorem’

Physical Review Letters 83 pp. 3751-3754

Meyer, D.A. (2003) ‘Coloring, quantum mechanics, and Euclid’

Pdf file: math.ucsd.edu/~dmeyer/research/talks/cqmE.pdf

Miller, A.I. (ed) (1990) Sixty-Two Years of Uncertainty : Historical, Philosophical and Physical

Inquiries into the Foundations of Quantum Mechanics

New York: Plenum Press

Miller, W.A., Wheeler, J.A. (1984) ‘Delayed-Choice Experiments and Bohr’s Elementary Quantum

Phenomenon’

In Nakajima, S., Murayama, Y., Tonomura, A. (eds) (1996) Foundations of Quantum Mechanics

in the Light of New Technology

Singapore: World Scientific Publishing

Muller, F.A. (1997a) ‘The Equivalence Myth of Quantum Mechanics–Part I’

Studies in History and Philosophy of Modern Physics 28 (1) pp. 35-61

(1997b) ‘Part II’

ibid. 28 (2) pp. 219-247

(1999) ‘(Addendum)’

ibid. 30 (4) pp. 543-545

Neumann, J. von (1932) Mathematische Grundlagen der Quantenmechanik

Berlin: Verlag von Julius Springer

Eng. tr.: Beyer, R.T. (transl) (1955) The Mathematical Foundations of Quantum Mechanics

Princeton: Princeton University Press

Pauli, W.E. (1933) Die allgemeinen Prinzipien der Wellenmechanik

Berlin: Verlag von Julius Springer

Eng. tr.: (1950) The General principles of wave mechanics

Urbana-Champaign: University of Illinois Press

Penrose, R. (1996) ‘On Gravity’s Role in Quantum State Reduction’

General Relativity and Gravitation 28 (5) pp. 581-600

Peres, A. (1993) Quantum Theory: Concepts and Methods

Dordrecht: Kluwer Academic Publishers

Petersen, A. (1963) ‘The Philosophy of Niels Bohr’

Bulletin of the Atomic Scientists 19 (7) pp. 8-14

Petersen, A. (1968) Quantum Physics and the Philosophical Tradition

Cambridge: M.I.T. Press

Piron, C. (1976) Foundations of Quantum Physics

Reading: W.A. Benjamin

Prugovečki, E. (2006) Quantum Mechanics in Hilbert Space

Mineola: Dover Publications

Przibram, K. (ed) (1963) Briefe zur Wellenmechanik : Schrödinger, Planck, Einstein, Lorentz

Wien: Springer-Verlag

Eng. tr.: Przibram, K. (ed) (1963) Letters on wave mechanics : Schrödinger, Planck, Einstein,

Lorentz

New York: Philosophical Library

Rauch, H., Werner, S.A. (2000) Neutron Interferometry : Lessons in Experimental Quantum Mechanics

Oxford: Oxford University Press

Redhead, M.L.G. (1987) Incompleteness, Nonlocality and Realism : A Prolegomenon to the Philosophy

of Quantum Mechanics

Oxford: Clarendon Press

Robertson, H.P. (1929) ‘The Uncertainty Principle’

Physical Review 34 p. 163

Scheibe, E., Sykes, J.B. (transl) (1973) The Logical Analysis of Quantum Mechanics

Oxford: Pergamon Press

Schiff, L.I. (1949) Quantum Mechanics

New York: McGraw-Hill

Schilpp, P.A. (ed) (1949) Albert Einstein : Philosopher-Scientist

Evanston: The Library of Living Philosophers

Schmidt, E. (1907) ‘Zur Theorie der linearen und nichtlinearen Integralgleichungen. I. Teil’

Mathematische Annalen 63 pp. 433-476

(1907) ‘Zweite Abhandlung’

ibid. 64 pp. 161-174

(1907) ‘III. Teil’

ibid. 65 pp. 370-399

Schrödinger, E.R.J.A. (1926) ‘An Undulatory Theory of the Mechanics of Atoms and Molecules’

The Physical Review 28 (6) pp. 1049-1070

Schrödinger, E.R.J.A. (1930) ‘Zum Heisenbergschen Unschärfeprinzip’

Sitzungsberichte der Preußischen Akademie der Wissenschaften. Physikalisch-mathematische

Klasse pp. 296-303

Schrödinger, E.R.J.A. (1935a) ‘Discussion of Probability Relations between Separated Systems’

Mathematical Proceedings of the Cambridge Philosophical Society 31 (4) pp. 555-563

Schrödinger, E.R.J.A. (1935b) ‘Die gegenwärtige Situation in der Quantenmechanik’

Naturwissenschaften 23 (48) pp. 807-812, (49) pp. 823-828, (50) pp. 844-849

Eng. tr.: Trimmer, J.D. (transl) (1980) ‘The Present Situation in Quantum Mechanics: A Translation

of Schrödinger’s “Cat Paradox”’

Proceedings of the American Philosophical Society 124 (5) pp. 323-338

Repr. in Wheeler and Zurek (1983)

Shimony, A. (1984) ‘Controllable and Uncontrollable Non-Locality’

In Kamefuchi, S., et al. (eds) Proceedings of the International Symposium : Foundations of

Quantum Mechanics in the Light of New Technology

Tokyo: Physical Society of Japan

Shimony, A. (1989) ‘Search for a Worldview Which Can Accommodate Our Knowledge of Microphysics’

In Cushing, J.T., McMullin, E. (eds) Philosophical Consequences of Quantum Theory : Reflections

on Bell’s Theorem

Notre Dame: University of Notre Dame Press

Shimony, A. (1995) ‘Degree of entanglement’

In Greenberger, D.M., Zeilinger, A. (eds) Fundamental Problems in Quantum Theory : In

Honor of Professor John A. Wheeler

New York: New York Academy of Sciences

Stapp, H.P. (1975) ‘Bell’s Theorem and World Process’

Il Nuovo Cimento B 29 (2) pp. 270-276

Stapp, H.P. (1977) ‘Are Superluminal Connections Necessary?’

Il Nuovo Cimento B 40 (1) pp. 191-205

Stone, M.H. (1932) ‘On One-Parameter Unitary Groups in Hilbert Space’

The Annals of Mathematics, Second Series 33 (3) pp. 643-648

Suppes, P., Zanotti, M. (1976) ‘On the Determinism of Hidden Variable Theories with Strict Correlation

and Conditional Statistical Independence of Observables’

In Suppes, P. (ed) Logic and Probability in Quantum Mechanics

Dordrecht: D. Reidel Publishing Company

Svetlichny, G., Redhead, M.L.G., Brown, H.R., Butterfield, J. (1988) ‘Do the Bell Inequalities Require

the Existence of Joint Probability Distributions?’

Philosophy of Science 55 (3) pp. 387-401

Tkadlec, J. (2000) ‘Diagrams of Kochen-Specker Type Constructions’

International Journal of Theoretical Physics 39 (3) pp. 921-926

Uffink, J.B.M., Hilgevoord, J. (1985) ‘Uncertainty Principle and Uncertainty Relations’

Foundations of Physics 15 (9) pp. 925-944

Uffink, J.B.M., Hilgevoord, J. (1988) ‘Interference and Distinguishability in Quantum Mechanics’

Physica B 151 pp. 309-313

Uffink, J.B.M. (1990) Measures of Uncertainty and the Uncertainty Principle

Utrecht: Rijksuniversiteit te Utrecht, Dissertation

Vermaas, P.E., Dieks, D.G.B.J. (1995) ‘The Modal Interpretation of Quantum Mechanics and its

Generalization to Density Operators’

Foundations of Physics 25 (1) pp. 145-158

Vermaas, P.E. (1999) A Philosopher’s Understanding of Quantum Mechanics : Possibilities and Impossibilities

of a Modal Interpretation

Cambridge: Cambridge University Press

Vigier, J.-P., Dewdney, C., Holland, P.R., Kyprianidis, A. (1987) ‘Causal particle trajectories and the

interpretation of quantum mechanics’

In Hiley, B.J., Peat, F.D. (eds) (1987) Quantum implications : essays in honour of David Bohm

London: Routledge & Kegan Paul

Waerden, B.L. van der (ed) (1967) Sources of Quantum mechanics

Amsterdam: North-Holland Publishing Company

Weihs, G., Jennewein, T., Simon, C., Weinfurter, H., Zeilinger, A. (1998) ‘Violation of Bell’s Inequality

under Strict Einstein Locality Conditions’

Physical Review Letters 81 (23) pp. 5039-5043

Wheatley, M.J. (2001) Leadership and the New Science : Discovering Order in a Chaotic World

San Francisco: Berrett-Koehler Publishers

Wheeler, J.A. (1957) ‘Assessment of Everett’s “Relative State” Formulation of Quantum Theory’

Reviews of Modern Physics 29 (3) pp. 463-465

Wheeler, J.A., Zurek, W.H. (eds) (1983) Quantum Theory and Measurement

Princeton: Princeton University Press

Wick, G.C., Wightman, A.S., Wigner, E.P. (1952) ‘The Intrinsic Parity of Elementary Particles’

Physical Review 88 (1) pp. 101-105

Wigner, E.P. (1952) ‘Die Messung quantenmechanischer Operatoren’

Zeitschrift für Physik 133 pp. 101-108

Wigner, E.P. ‘Remarks on the mind - body question’

In Good, I.J. (1962) The scientist speculates : an anthology of partly-baked ideas

London: Heinemann

Repr. in Wheeler and Zurek (1983)

Wigner, E.P. (1963) ‘The problem of measurement’

American Journal of Physics 31 (1) pp. 6-15

Wigner, E.P. (1970) ‘On Hidden Variables and Quantum Mechanical Probabilities’

American Journal of Physics 38 (8) pp. 1005-1009

Wigner, E.P. (1983) ‘Interpretation of Quantum Mechanics’

In Wheeler and Zurek (1983)

Zukav, G. (1984) The Dancing Wu Li Masters : An Overview of the New Physics

New York: Bantam Books

Zurek, W.H. (1981) ‘Pointer basis of quantum apparatus: Into what mixture does the wave packet

collapse?’

Physical Review D 24 (6) pp. 1516-1525

Zurek, W.H. (1982) ‘Environment-induced superselection rules’

Physical Review D 26 (8) pp. 1862-1880
