FOUNDATIONS OF QUANTUM MECHANICS


JOS UFFINK

INSTITUTE FOR HISTORY AND FOUNDATIONS OF SCIENCE

UTRECHT UNIVERSITY

SEPTEMBER 2010


PREFACE

These lecture notes serve as a support for the course on Foundations of Quantum Mechanics, provided by the Institute for History and Foundations of Science of the University of Utrecht. Although the text has been revised repeatedly, efforts to improve can sometimes bring along new imperfections, making revision a never-ending process. The current version, the 11th, is slightly modified with respect to the previous one. Many thanks are due to Anne van Weerden for help in the English translation.

Remarks and comments remain very welcome.

Jos Uffink

Utrecht, August 2010


CONTENTS

I CONCEPTUAL PROBLEMS
  I.1 Introduction
  I.2 Incompleteness and locality

II THE FORMALISM
  II.1 Finite-dimensional Hilbert spaces
  II.2 Operators
  II.3 Eigenvalue problem and spectral theorem
    II.3.1 Appendix
  II.4 Functions of normal operators
  II.5 Direct sum and direct product
    II.5.1 Direct sum
    II.5.2 Direct product
  II.6 Addendum: Infinite-dimensional Hilbert spaces
    II.6.1 The structure of vector spaces
    II.6.2 Operators
      II.6.2.1 Unbounded operators
      II.6.2.2 Continuous spectra
      II.6.2.3 Spectral theorem
    II.6.3 Dirac
    II.6.4 Summary

III THE POSTULATES
  III.1 Von Neumann’s postulates
  III.2 Pure and mixed states
  III.3 The interpretation of mixed states
  III.4 Composite systems
    III.4.1 Summary
  III.5 Proper and improper mixtures
  III.6 Spin 1/2 particles
    III.6.1 Spin 1/2 and rotations in spin space
    III.6.2 Mixed spin 1/2 states
    III.6.3 Two spin 1/2 particles
      III.6.3.1 Singlet and triplet states
      III.6.3.2 Correlations
      III.6.3.3 Conditional probabilities
      III.6.3.4 Example of a mixed state of two spin 1/2 particles

IV THE COPENHAGEN INTERPRETATION
  IV.1 Heisenberg and the uncertainty principle
    IV.1.1 Remarks
  IV.2 Bohr and complementarity
    IV.2.1 Complementary phenomena
    IV.2.2 Remarks and problems
    IV.2.3 Agreement and difference between Heisenberg and Bohr
  IV.3 Debate between Einstein and Bohr
    IV.3.1 Introduction
    IV.3.2 The photon box
    IV.3.3 Einstein, Podolsky and Rosen
    IV.3.4 Heisenberg, Bohr and Einstein, Podolsky and Rosen
  IV.4 Neutron interferometry
  IV.5 The uncertainty relations
    IV.5.1 Introduction
    IV.5.2 The standard uncertainty relations
    IV.5.3 Single slit experiment
    IV.5.4 Time and energy
    IV.5.5 Double slit experiment
    IV.5.6 A new uncertainty measure
    IV.5.7 Interpretation

V HIDDEN VARIABLES
  V.1 Hidden reality
  V.2 Non-contextual hidden variables
  V.3 Kochen and Specker’s theorem
    V.3.1 Summary
  V.4 Contextual hidden variables

VI BOHMIAN MECHANICS
  VI.1 Introduction
  VI.2 The quantum potential
  VI.3 Composite systems
  VI.4 Remarks and problems
  VI.5 The Hamilton-Jacobi equation

VII BELL’S INEQUALITIES
  VII.1 Local deterministic hidden variables
    VII.1.1 Derivation of the first Bell inequality
    VII.1.2 The Bell inequality of Clauser, Horne, Shimony and Holt
    VII.1.3 Violation of the Bell inequalities by quantum mechanics
    VII.1.4 The Bell inequality in a non-contextual, local deterministic HVT
  VII.2 Local deterministic contextual hidden variables
  VII.3 Wigner’s derivation
  VII.4 The derivation of Eberhard and Stapp
    VII.4.1 Counterfactual conditional statements and indeterminism
  VII.5 Stochastic hidden variables
    VII.5.1 Outcome, parameter and source independence
    VII.5.2 Quantum mechanics as a stochastic HVT
  VII.6 An algebraic proof without inequalities
  VII.7 Miscellanea
    VII.7.1 Locality and relativity
    VII.7.2 Locality versus conditional independence
    VII.7.3 Determinism

VIII THE MEASUREMENT PROBLEM
  VIII.1 Introduction
  VIII.2 Measurement according to classical physics
  VIII.3 Measurement according to quantum mechanics
  VIII.4 The measurement problem in the narrow sense
    VIII.4.1 The projection postulate and consciousness
    VIII.4.2 Bohmian mechanics
    VIII.4.3 Spontaneous collapse
    VIII.4.4 Many worlds
    VIII.4.5 Superselection rules
    VIII.4.6 Irreversibility of measurement
    VIII.4.7 Modal interpretation
    VIII.4.8 Decoherence
  VIII.5 Incompatible quantities
  VIII.6 Comments on the theory of measurement

A GLEASON’S THEOREM
  A.1 Introduction
  A.2 Conversion to a 3-dimensional real problem
    A.2.1 Step 1
  A.3 Formulation of the problem on the surface of a sphere
    A.3.1 Step 2
      A.3.1.1 Lemma 1
      A.3.1.2 Lemma 2
      A.3.1.3 Result of lemma 1 and 2
    A.3.2 Step 3
  A.4 An analytic lemma
    A.4.1 Step 4
  A.5 Summary

WORKS CONSULTED

BIBLIOGRAPHY


LIST OF FIGURES

III.1 A discontinuous measure for dim H = 2
III.2 A rotated unit vector in the xz-plane
III.3 Spin up for particle 1 along $\vec{a}$, for particle 2 along $\vec{b}$

IV.1 Heisenberg’s γ-microscope
IV.2 The double slit interference experiment (Bohr 1949)
IV.3 Contexts of measurement in which the interference of the particles is visible, and those in which the recoil of the screen is visible, exclude each other. (Bohr 1949)
IV.4 Several perfect crystal neutron interferometers (Rauch and Werner 2000)
IV.5 The interference pattern in the neutron interferometer is acquired by measuring the intensity in the detectors at a variable optical path length difference.
IV.6 The probability distribution in position for a slit of width $2a$
IV.7 The diffraction pattern for a small slit of width $2a$
IV.8 The probability distribution in position for a double slit; $2a$ is the width of each slit and $2A$ the distance between the slits
IV.9 The interference pattern for the double slit
IV.10 Moving screen

V.1 A solution for dim H = 2
V.2 a) Kochen-Specker diagram b) Conway-Kochen diagram
V.3 M.C. Escher, Waterfall. Consider the 3 interpenetrating cubes on the top of the left pillar. Each cube has 4 lines from the mutual center to its vertices, 6 lines to the centers of its edges, and 3 lines to the centers of its faces. Three of the lines are shared by all three cubes, giving $3 \cdot (4 + 6 + 3) - 6 = 33$ lines. These are Peres’ vectors. (Text Meyer 2003)
V.4 $\mu(P_i) = \cos^2 \theta$

VI.1 The quantum potential for the two slit system as viewed from the screen, under assumption of a Gaussian distribution at the slits (Bohm 1989)
VI.2 A simulation of the double slit experiment in Bohmian mechanics. Each particle follows a certain path between the slits and the photographic plate. All particles coming from the upper slit arrive at the upper half of the photographic plate, likewise for the lower slit and lower half of the plate. The twists in the paths are caused by the quantum potential U. (Vigier et al. 1987)

VII.1 Thought experiment of Einstein, Podolsky and Rosen on the singlet
VII.2 A configuration in which the spin quantities violate the Bell inequality
VII.3 The Bell inequality violated for every acute angle ϕ
VII.4 The configuration giving the largest violation of the Bell inequality (all vectors in the same plane)
VII.5 Unit spheres for $a_n$, $b_n$ and $a_n b_n$. In the shaded areas of the larger sphere $a_n b_n$ is positive, in the unshaded areas $a_n b_n$ is negative.
VII.6 Comparison of the quantum mechanical expectation values and those for the local deterministic HVT
VII.7 Violation of the Bell inequality again
VII.8 The Mermin pentagon
VII.9 Minkowski diagram of the EPRB experiment, where λ is in the past light cones of both A and B

VIII.1 Schrödinger’s cat paradox (DeWitt 1970)

A.1 Construction of a 3-dimensional subspace E
A.2 Rotation of $s$ to $s'$ and $t$ to $t'$ along a great circle around axis $r$
A.3 Projection of points on a great circle onto a plane P through the north pole
A.4 Projection of meridians, circles with constant latitude, and a great circle
A.5 Spiral representing a projected path from s to t along subsequent great circles, each time starting at their most northern point
A.6 Path from t to v, having the same longitude
A.7 A strictly increasing or decreasing curve C
A.8 Great circle C, coordinate system (p, q, t), and rotating pair $(s, s^{\perp})$
A.9 Great circle C and tilted great circles $C'$ and $C''$
A.10 Two continuous curves on $S^2$, intersecting in q


I CONCEPTUAL PROBLEMS

Anyone who is not shocked by quantum theory has not understood it.<br />

— Niels Bohr<br />

I think it is safe to say that no one understands quantum mechanics.<br />

— Richard Feynman<br />

I.1 INTRODUCTION

Quantum mechanics emerged at the beginning of the 20th century from an attempt to understand the interaction between atoms and radiation. The presence of discrete lines in the emission and absorption spectra of chemical elements indicates that this interaction takes the form of discrete quanta. In 1925 and 1926 a coherent theory was developed through the combined efforts of Werner Heisenberg, Paul Dirac, Max Born, Pascual Jordan, Wolfgang Pauli and Erwin Schrödinger, and seven years later this theory was axiomatized by John von Neumann. With that, the question of the physical interpretation of the mathematical symbols of the theory arose.

The central mathematical concept in quantum mechanics is ψ, in the form of a wave function ψ(q) in Schrödinger’s wave mechanics, or of a vector |ψ⟩ in Hilbert space, à la Von Neumann. According to Born, its physical meaning is that ψ determines probabilities for results of measurements, and a key question is then how such probabilities must be interpreted. By means of four examples we will give an idea of the conceptual problems raised by quantum mechanics.

(i) Consider as a first example the decay of radioactive nuclei of a certain kind, as discussed by Einstein (P.A. Schilpp 1949, p. 667 ff.). We see the unstable nuclei decay at various times, one almost immediately, another only after a long time; the α-particles are radiated in ever different directions. Quantum mechanics describes these nuclei by a non-stationary wave function, and using this function one can calculate the expected lifetime of the nuclei.
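The statistical content of this description can be made concrete with a short Python sketch. It draws a decay time for each of many nuclei from one and the same probability law. The exponential decay law, the function name `sample_decay_times`, the mean lifetime of 1.0 (arbitrary units) and the sample size are all illustrative assumptions and are not taken from the text above.

```python
import random

def sample_decay_times(mean_lifetime, n_nuclei, seed=0):
    """Draw one decay time per nucleus, all from the same exponential law.

    Assumption: exponential decay with a single mean lifetime, the same
    for every nucleus (no extra variables distinguish the nuclei).
    """
    rng = random.Random(seed)
    # random.expovariate expects the rate, i.e. 1 / mean lifetime
    return [rng.expovariate(1.0 / mean_lifetime) for _ in range(n_nuclei)]

times = sample_decay_times(mean_lifetime=1.0, n_nuclei=100_000)
average = sum(times) / len(times)

print("shortest life span:", min(times))
print("longest life span: ", max(times))
print("average life span: ", average)
```

Every simulated nucleus is assigned exactly the same description, yet the individual life spans range from almost zero to many mean lifetimes. Only the average is fixed by the description.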

A natural reaction is to assume that the nuclei differ from each other, and that this difference is the cause of the mutually different individual life spans and the different directions the α-particles are radiated in. In this view, the quantum mechanical expectation value would be comparable to the average life span in a population. However, this does not fit naturally into the quantum mechanical description. Quantum mechanics describes all nuclei by the same wave function. If this description is complete, the fact that quantum mechanics gives only expected life spans is not due to a lack of knowledge. Rather, there simply is nothing more to know concerning the nuclei than their wave function and the probabilities that follow from it.

On the other hand, we see before our eyes that the nuclei do not behave the same way: they decay at different times and send the α-particles in ever different directions. This suggests that more can be known about nuclei than their expected life spans, just as a more thorough investigation of the individuals of a population enables us to know more than their mere average life span; we would then be able to make a more detailed statement about their individual life spans. In this view the quantum mechanical description is not complete: there are extra, until now ‘hidden’, variables which say something about the individual case.

There is a standard answer to this problem, called the ‘Copenhagen interpretation’, after the view developed by Bohr and his coworkers. According to this answer, the idea that the individual nuclei have a definite life span, independent of the observation of this life span, is incorrect. We can only speak of an individual life span within the context of an experiment in which this is measured. An experiment always entails a disturbance of the system. For this reason no conclusions can be drawn concerning the undisturbed system. It is incorrect to speak of the life span of a nucleus which is not observed. The statistical spread in the measured individual life spans is due to the quantum character of the interaction between object and measuring apparatus. As a matter of principle, what happens in this interaction cannot be described more precisely. This makes every individual measurement into a unique event.

Characteristic of the Copenhagen interpretation is, furthermore, that one cannot simply combine the description of the system, obtained within the context of a certain type of experiment, with a description of the same system, obtained in a different kind of experiment. The best-known examples of such mutually exclusive experiments are measurements of position and momentum. According to Bohr, descriptions of a system with terms like ‘position’ or ‘momentum’ are complementary; they are supplementary to each other, but they can never be united in one picture.

The main point behind the Copenhagen answer is the idea of measurement disturbance. According to this line of thought quantum mechanics is distinguished from classical physics by the quantization of the interaction between system and measuring apparatus. Every observation involves an interaction with, and therefore a disturbance of, the observed system. This disturbance cannot be made arbitrarily small; ℏ ≠ 0. Therefore, one cannot identify observation results with properties the system has independently of the observation. One can only talk meaningfully about observation results which are created by the measurement. In contrast to classical physics, quantum mechanics does not deal with what exists, but with what is observed.

At first sight this reasoning seems plausible. It is, however, not without problems. Can we use the same reasoning if the observed system is macroscopic? And, as a matter of fact, what exactly is an observation? Is it essential that some conscious being takes notice of the result of the observation, or is an apparatus registering the outcome sufficient? These problems will appear in the third and fourth example.

(ii) The next example is from a letter Einstein wrote to Born in 1948 (Born 1971, pp. 169, 170). Consider a free particle described by a wave function ψ. According to the quantum mechanical description, ψ satisfies an uncertainty relation; the statistical deviations of position and momentum cannot simultaneously be made arbitrarily small. Apparently the outcomes of measurements of position and momentum of an individual particle cannot both be predicted exactly, and the question arises how to interpret this situation. Einstein distinguishes two points of view:

(a) The (free) particle really has a definite position and a definite momentum, even if they cannot both be ascertained by measurement in the same individual case. According to this point of view, the ψ-function represents an incomplete description of the real state of affairs. [...] Its acceptance would lead to an attempt to obtain a complete description of the real state of affairs as well as the incomplete one, and to discover physical laws for such a description. The theoretical framework of quantum mechanics would then be exploded.

(b) In reality the particle has neither a definite momentum nor a definite position; the description by [the] ψ-function is, in principle, a complete description. The strictly defined position of the particle, obtained by measuring the position, cannot be interpreted as the position of the particle prior to the measurement. The sharp localization which appears as a result of the measurement is brought about only as a result of the unavoidable (but not unimportant) operation of measurement. The result of the measurement depends not only on the real particle situation but also on the nature of the measuring mechanism, which in principle is incompletely known. An analogous situation arises when the momentum or any other observable quantity relating to the particle is measured.

Interpretation (b) is accepted by the majority of physicists, and Einstein admits:

[...] it alone does justice in a natural way to the empirical state of affairs expressed in Heisenberg’s principle within the framework of quantum mechanics.

Nevertheless, he emphasizes his preference for interpretation (a). His argument is that it is basic to physics that physical concepts refer to entities, such as particles, fields, etc., that exist independently of the observer, and are situated in space and time. Interpretation (b) renders this kind of description impossible. A second argument has to do with composite systems and will be discussed in section I.2.

(iii) The next example, also originating from the correspondence between Einstein and Born (Born 1971, pp. 188, 208-209), concerns a freely moving macroscopic object, for instance a star. A simple Schrödinger equation applies to the center of mass of such a body, namely that of a free particle. Since all wave functions which are solutions of the Schrödinger equation are admissible, one may consider as a solution a wave function with two peaks of equal size, located far from each other.

Upon measurement of the position of the center of mass of such a body, the outcome is found at one peak in about half of the measurements and at the other peak in the other half. In this case it is tempting to say that in half of these measurements the center of mass, and hence the object, was at the one position, while in the other half it was at the other position. But according to the standard interpretation this is incorrect: prior to the measurement no position can be assigned to the center of mass. Quantum mechanics applies just as well to the center of mass of a macroscopic body as to an electron. It is, however, difficult to imagine how a measurement ‘creates’ the position of the center of mass of a star as a result of a disturbance of the order of the size of one single quantum.

According to Pauli, one of the representatives of the Copenhagen interpretation, this is a creation outside the laws of nature (ibid., p. 223). The laws of nature only say something about the statistics of the outcomes. The quantum mechanical probability description does not express our ignorance concerning the position of the center of mass of the body; the probability description corresponds to an essential indeterminacy of that position. Pauli states that the question whether the ‘position’ of a body would also exist without observation is fundamentally unanswerable and for this reason meaningless.

In this example the problem of the transition between the microscopic and the macroscopic levels arises. Our intuition tells us that somewhere along the way the quantum mechanical probability description must turn into a classical description of an ensemble, an ensemble of objects that have properties. But if we accept at the same time that quantum mechanics applies to macroscopic bodies just as well as to microscopic ones, this expectation is refuted. This transition from one type of ensemble to the other is a problem which invariably emerges in considerations concerning the ‘measurement problem’. We will come back to this in chapter VIII.

The previous discussion follows rather closely the formulations of Einstein and Pauli in the years 1948-1954, as can be found in the correspondence between Born and Einstein (Born 1971). An interesting aspect is that the discussion actually takes place over Born’s head. Born saw Einstein as the one who had, in his theory of relativity, abolished the idea of absolute simultaneity by means of the argument that it is meaningless to want to speak about something you cannot measure in principle. Einstein reacts (ibid., p. 188):

There is nothing analogous in relativity to what I call incompleteness of description in the quantum theory. Briefly it is because the ψ-function is incapable of describing certain qualities of an individual system, whose ‘reality’ we none of us doubt (such as a macroscopic parameter).

Moreover, despite everything Einstein writes, Born continues to believe that Einstein objects to the indeterministic character of quantum mechanics, i.e., the fact that it only provides probability statements, rather than to the alleged completeness of quantum mechanics. This lasts until Pauli intervenes in the discussion and explains Einstein’s position to Born (ibid., pp. 217-219).

(iv) The last example is Schrödinger’s notorious cat paradox (Schrödinger 1935b).

One can even set up quite ridiculous cases. A cat is penned up in a steel chamber, along with the following diabolical device (which must be secured against direct interference by the cat): in a Geiger counter there is a tiny bit of radioactive substance, so small that perhaps in the course of one hour one of the atoms decays, but also, with equal probability, perhaps none; if it happens, the counter tube discharges and through a relay releases a hammer which shatters a small flask of hydrocyanic acid. If one has left this entire system to itself for an hour, one would say that the cat still lives if meanwhile no atom has decayed. The first atomic decay would have poisoned it. The Ψ-function for the entire system would express this by having in it the living and the dead cat (pardon the expression) mixed or smeared out in equal parts.

In this example several problems are combined. In the first place there is again the difference between a classical state and a quantum state. If the standard interpretation is extended consistently, the cat cannot be considered dead or alive as long as the chamber is not opened and the cat is not observed. (One may wonder what the cat itself thinks of this.)

The question whether it is permitted to extend the standard interpretation in this way coincides with the question whether, and to what extent, the quantum mechanical description can be transferred from the microscopic to the macroscopic level. Then there is the question what exactly an observation is. Are cats observers of their own situation? And if consciousness is essential for an observation, do cats have the correct type of consciousness?

From the examples above we can isolate the following central concepts:<br />

1. the real state of a system independent of measurement,<br />

2. incompleteness,<br />

3. measurement disturbance,<br />

4. complementarity,<br />

5. the transition from microscopic to macroscopic,<br />

6. consciousness.¹

I.2 INCOMPLETENESS AND LOCALITY

The previous discussion only served to get the reader in the right mood! In 1935 Albert Einstein, Boris Podolsky and Nathan Rosen, from now on abbreviated as EPR, came up with an example which considerably sharpened the discussion (EPR 1935). Using rigorous reasoning they argued that quantum mechanics is an incomplete theory. As an introduction to their argument we will first examine a simpler argument that Einstein formulated in the same year in a letter to Schrödinger, as paraphrased by A. Fine (1986, p. 37).

Consider a composite system of two particles which interacted with each other but are so widely separated in space now that they no longer interact. Suppose they are in a state |ψ⟩ which is an eigenstate of the total momentum $P_1 + P_2$ with eigenvalue 0, but is not an eigenstate of $P_1$ or $P_2$ separately,

$$(P_1 + P_2)\,|\psi\rangle = 0 \quad\text{and}\quad P_1|\psi\rangle \neq a\,|\psi\rangle, \quad P_2|\psi\rangle \neq b\,|\psi\rangle, \quad\text{for all } a, b \in \mathbb{R}. \tag{I.1}$$

Through a measurement of the momentum of particle 1 we can predict with certainty what the result will be of a measurement of the momentum of particle 2. Moreover, the measurement of particle 1 has absolutely no physical influence on particle 2. But if it is possible to predict the momentum of particle 2 with certainty without any interaction with that particle, then particle 2 must already have this momentum before the measurement, and this must even be the case before the measurement of particle 1, since the measurement absolutely does not disturb particle 2. However, the value of this property of particle 2 cannot be derived from the quantum mechanical description using the state |ψ⟩. Therefore, quantum mechanics is incomplete.
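The structure of this argument can be checked in a finite toy model. The Python sketch below is an illustration under strong simplifying assumptions that are not part of the text: each particle's momentum is restricted to the two values +1 and −1, and the state is taken to be (|+1,−1⟩ + |−1,+1⟩)/√2. The names `BASIS`, `apply_diagonal`, `is_eigenstate` and `conditional_p2` are hypothetical helpers introduced only for this example.

```python
import math

# Toy assumption: each momentum takes only the values +1 or -1.
# Basis states |p1, p2>, listed in a fixed order.
BASIS = [(+1, +1), (+1, -1), (-1, +1), (-1, -1)]

# psi = (|+1,-1> + |-1,+1>) / sqrt(2): the total momentum is zero in both terms.
s = 1 / math.sqrt(2)
psi = [0.0, s, s, 0.0]

def p_one(b):
    """Eigenvalue of P1 on the basis state b = (p1, p2)."""
    return b[0]

def p_two(b):
    """Eigenvalue of P2 on the basis state b = (p1, p2)."""
    return b[1]

def p_total(b):
    """Eigenvalue of P1 + P2 on the basis state b = (p1, p2)."""
    return b[0] + b[1]

def apply_diagonal(eigenvalue, state):
    """Apply an operator that is diagonal in BASIS to a state vector."""
    return [eigenvalue(b) * c for b, c in zip(BASIS, state)]

def is_eigenstate(eigenvalue, state, tol=1e-12):
    """A diagonal operator has `state` as an eigenvector exactly when it takes
    a single value on all basis states occurring with nonzero amplitude."""
    values = {eigenvalue(b) for b, c in zip(BASIS, state) if abs(c) > tol}
    return len(values) == 1

def conditional_p2(p1_outcome, state):
    """Probabilities for p2, given that a measurement of p1 gave p1_outcome."""
    weights = {b[1]: c * c for b, c in zip(BASIS, state) if b[0] == p1_outcome}
    norm = sum(weights.values())
    return {p2: w / norm for p2, w in weights.items()}

total_on_psi = apply_diagonal(p_total, psi)
print("(P1 + P2)|psi> vanishes:", all(abs(x) < 1e-12 for x in total_on_psi))
print("eigenstate of P1:", is_eigenstate(p_one, psi))
print("eigenstate of P2:", is_eigenstate(p_two, psi))
print("p2 distribution given p1 = +1:", conditional_p2(+1, psi))
```

In this toy state neither momentum has a definite value, yet finding p1 = +1 fixes p2 = −1 with probability one, which is the kind of strict correlation the argument relies on.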

¹ The role of consciousness is regarded as essential by mathematicians and physicists like Von Neumann, London, Heitler and Wigner. The fact that they felt forced to take this highly unusual step in physical theory illustrates how serious the situation is.

We see how Einstein succeeds, thanks to the strict correlation between the particles that quantum mechanics allows for, and thanks to the spatial separation of the particles, in refuting the argument of the measurement disturbance as a physical process. In the earlier examples we could imagine the measurement to create the outcome (although this already seemed a hardly convincing escape in Einstein’s example of macroscopic bodies), and that this outcome did not exist prior to the measurement because of the disturbance that comes with the measurement. We now see that we cannot imagine these measurement disturbances as spatially limited, ‘local’ processes. Einstein spoke of “a spooky action at a distance” and of “telepathy”.

The case against the completeness of quantum mechanics gained strength with this example. However, objections can be made. (Later Einstein would be amused by the fact that everyone knew the argument was not correct, but that everyone had a different reason to think so.) The argument uses the fact that in quantum mechanics there are eigenstates of $P_1 + P_2$ in which the momentum of each individual particle is undetermined. It could be objected that such states are perhaps not physically realizable, that only eigenstates of $P_1 + P_2$ which are at the same time also eigenstates of both $P_1$ and $P_2$ would be realizable, and that we should therefore replace the state |ψ⟩ by a mixture of such eigenstates, in which case the argument no longer holds.

The EPR article itself gives a more balanced argumentation that does not have this shortcoming.<br />

The article deviates from the above on two points. First, not only the momentum, but also the position<br />

of the two particles is brought into the consideration. Second, EPR formulate a ‘sufficient condition of<br />

reality’ by means of the term ‘element of physical reality’, which we will call EPR(EPR). As worded<br />

by EPR, p. 777,<br />

EPR(EPR): If, without in any way disturbing a system, we can predict with certainty<br />

(i.e., with probability equal to unity) the value of a physical quantity, then there exists an<br />

element of physical reality corresponding to this physical quantity.<br />

How else could we explain that we are able to predict the outcomes of measurements with certainty?<br />

A necessary (though not necessarily sufficient) condition for a complete physical theory is that each<br />

element of physical reality must have a counterpart in the theoretical description,<br />

COMP(T): If a physical theory T is complete, then every element of physical reality<br />

must have a counterpart in the theory T .<br />

It is possible to choose for |ψ⟩ a state which is a simultaneous eigenstate of the commuting operators<br />

P 1 + P 2 and Q 1 − Q 2 . In Dirac - notation, and only considering one spatial dimension, such a<br />

state is written in the ‘p - language’ and in the ‘q - language’, as<br />

$|\psi\rangle = \int_{\mathbb{R}} |p_1 = p\rangle \otimes |p_2 = -p\rangle \, e^{-i l p} \, dp = \int_{\mathbb{R}} |q_1 = q\rangle \otimes |q_2 = q - l\rangle \, dq$, (I. 2)<br />

where l is the eigenvalue of the mutual distance Q 1 − Q 2 and can be chosen arbitrarily large, and<br />

the terms with the ‘cartwheels’ are the direct products, see subsection II. 5. 2, p. 31, of which the first<br />

factor refers to particle 1, and the second to particle 2. The ‘p - language’ and the ‘q - language’ can<br />

be ‘translated’ into each other by means of a Fourier - transformation. 2<br />

2 Without Dirac - notation but in terms of Dirac’s δ - ‘functions’ the wave function has, in ‘p - language’ and in ‘q - language’,<br />

the following form,<br />

$\psi(p_1, p_2) = e^{-i l p_1}\, \delta(p_1 + p_2)$ and $\tilde{\psi}(q_1, q_2) = \delta(q_1 - q_2 - l)$.



Although this state |ψ⟩ is an eigenstate of the total momentum P 1 + P 2 of the two particles and<br />

their mutual distance Q 1 − Q 2 , with eigenvalues 0 and l, respectively,<br />

$(P_1 + P_2)\,|\psi\rangle = 0\,|\psi\rangle$ and $(Q_1 - Q_2)\,|\psi\rangle = l\,|\psi\rangle$, (I. 3)<br />

it is not an eigenstate of any of the 1 - particle operators P 1 , Q 1 , P 2 or Q 2 . However, given the<br />

outcome of a measurement of P 1 , e.g. a, we can predict the result of a measurement of P 2 with<br />

certainty, namely −a. In the same way, from a measurement of Q 1 with outcome x, the result of a<br />

measurement of Q 2 follows with certainty, namely x − l.<br />

Now the argumentation is as follows. If we would measure the momentum P 1 of particle 1,<br />

then we could predict the value of P 2 with certainty, without disturbing particle 2. According to<br />

the aforementioned criterion the momentum P 2 of particle 2 must then correspond to an element of<br />

physical reality. On the other hand, if we would measure the position Q 1 of particle 1, then we could<br />

predict the value of Q 2 with certainty, again without disturbing particle 2. In that case there must<br />

be an element of physical reality which corresponds to Q 2 . Therefore we can, depending on which<br />

measurement we perform on particle 1, assign an element of physical reality to particle 2.<br />

However, because of the absence of physical interaction between the particles there can be no real<br />

change in particle 2 as a result of what is done with particle 1. Consequently, particle 2 must have both<br />

elements of physical reality. But such a simultaneous assignment of exact position and momentum<br />

has no counterpart in the quantum mechanical formalism: there are no wave functions which are<br />

simultaneous eigenfunctions of position and momentum. The conclusion is unavoidable: the answer<br />

to the question in the title of their article ‘Can quantum - mechanical description of physical reality be<br />

considered complete?’ must be negative.<br />

Notice that it is not necessary to perform the measurements of P 1 and Q 1 simultaneously; the<br />

only thing that matters is the possibility to choose whether to predict the position or momentum of<br />

particle 2 with certainty. Because of the absence of interaction between both particles it makes no<br />

difference for particle 2 which choice is made for particle 1. This part of the argumentation relies<br />

on the supposition that the elements of physical reality have a local character. This implicit, but<br />

reasonable locality premise, runs as follows,<br />

LOC(EPR): Performing a measurement on a physical system S 1 does not have an instantaneous<br />

effect on elements of physical reality belonging to any system S 2 which is spatially<br />

separated from S 1 .<br />

We can thus summarize the argument of EPR schematically: quantum mechanics, QM, together<br />

with EPR(EPR) and LOC(EPR), implies that quantum mechanics is an incomplete theory, i.e.,<br />

not COMP(QM). Or:<br />

QM ∧ EPR(EPR) ∧ LOC(EPR) → ¬ COMP(QM). (I. 4)<br />
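The logical form of (I. 4) can be checked mechanically. The following minimal sketch treats QM, EPR(EPR), LOC(EPR) and COMP(QM) as plain Boolean variables (an idealization made here for illustration only) and confirms that (I. 4) is equivalent to saying that the four statements cannot all be true together. The names `implies`, `assignments` and `ways_to_keep_comp` are illustrative and do not occur in the notes.

```python
from itertools import product

def implies(p, q):
    """Material implication p -> q."""
    return (not p) or q

# All 16 truth assignments to (QM, EPR, LOC, COMP).
assignments = list(product([False, True], repeat=4))

# (I. 4): QM and EPR(EPR) and LOC(EPR) together imply not COMP(QM).
# Check that this says the same as: the four statements are never all true.
equivalent = all(
    implies(qm and epr and loc, not comp) == (not (qm and epr and loc and comp))
    for qm, epr, loc, comp in assignments
)

# Assignments that respect (I. 4) while keeping COMP(QM) true.
ways_to_keep_comp = [
    (qm, epr, loc)
    for qm, epr, loc, comp in assignments
    if comp and implies(qm and epr and loc, not comp)
]

print(equivalent)                               # True
print((True, True, True) in ways_to_keep_comp)  # False
```

In other words, anyone who wants to retain completeness has to give up at least one of the three premises.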

In comparison to the foregoing, the strength of this argument is, in the first place, the larger precision<br />

with which the argumentation has been set up: the conclusion follows logically from a number<br />

of explicitly formulated premises and conditions. Moreover, we see that we are able to attribute to<br />

particle 2 both position and momentum without interacting with particle 2. This means that we cannot<br />

avoid the argumentation by assuming that for the correct quantum mechanical description the



given wave function ψ must be replaced by a mixture of eigenstates. Such eigenstates of position and<br />

momentum are simply not available in quantum mechanics. The possibility to assign values to P 2<br />

and Q 2 strikes at the heart of the complementarity idea.<br />

EPR anticipated the objection that only that which has been measured is real (EPR 1935, p. 780),<br />

Indeed, one would not arrive at our conclusion if one insisted that two or more physical<br />

quantities can be regarded as simultaneous elements of reality only when they can be<br />

simultaneously measured or predicted. On this point of view, since either one or the<br />

other, but not both simultaneously, of the quantities P and Q can be predicted, they are<br />

not simultaneously real. This makes the reality of P and Q depend upon the process of<br />

measurement carried out on the first system, which does not disturb the second system in<br />

any way. No reasonable definition of reality could be expected to permit this.<br />

They conclude their article with the next paragraph,<br />

While we have thus shown that the wave function does not provide a complete description<br />

of physical reality, we left open the question of whether or not such a description exists.<br />

We believe, however, that such a theory is possible.<br />

The problem whether a complete theory is possible or not, is called the hidden variable problem. The<br />

so - called ‘hidden variable theories’ are attempts to solve this problem. We will come back to this in<br />

chapter V.<br />

Bohr’s (1935a) response to the argument of EPR aims at the question to what extent the condition<br />

for an element of ‘physical reality’, as worded by EPR, is fulfilled in their example. The next quotation<br />

is from Bohr (1935b, p. 700),<br />

From our point of view we now see that the wording of the aforementioned criterion of<br />

physical reality proposed by Einstein, Podolsky and Rosen contains an ambiguity as regards<br />

the meaning of the expression “without in any way disturbing a system.” Of course<br />

there is in a case like that just considered no question of a mechanical disturbance of<br />

the system under investigation during the last critical stage of the measuring procedure.<br />

But even at this stage there is essentially the question of an influence on the very conditions<br />

which define the possible types of predictions regarding the future behavior of<br />

the system. Since these conditions constitute an inherent element of the description of<br />

any phenomenon to which the term ‘physical reality’ can be properly attached, we see<br />

that the argumentation of the mentioned authors does not justify their conclusion that<br />

quantum mechanical description is essentially incomplete. (Emphasis added.)<br />

It is not easy to completely comprehend what Bohr says here. Evidently, he abandons the original<br />

idea that the measurement disturbance creates the measurement results, or, at least, that such a creation<br />

can be understood as a physical process. It is replaced by the idea that applicability of physical concepts<br />

depends on the context of measurement. Performing a measurement on one of the particles<br />

is considered as determinative for the applicability of concepts to the other particle. Bohr says that<br />

the measurement disturbance is not a mechanical disturbance; apparently LOC(EPR) continues to<br />

apply for him if we, using the term ‘influence’, refer to a mechanical interaction, but not if we mean



by ‘influence’ the ‘defining effect’ of the context of measurement. The experimental circumstances<br />

define what you may call physical reality. Physical reality is not defined by experiments you could do,<br />

as is the case according to EPR, but exclusively by experiments you actually do. Under circumstances<br />

as described in the EPR experiment this ‘defining effect’ of the experimental setup also reaches parts<br />

of the system with which the measuring apparatus has no physical interaction.<br />

A distinct difference between Einstein and Bohr is that Einstein wants to visualize reality independent<br />

of observation, whereas Bohr is satisfied with complementary pictures of which the applicability<br />

always remains dependent on the chosen measurement setup. In 1955 Einstein says (Fine 1986, p.95)<br />

It is basic for physics that one assumes a real world existing independently from any act<br />

of perception. But this we do not know. We take it only as a programme in our scientific<br />

endeavors. This programme is, of course, prescientific and our ordinary language is<br />

already based on it.<br />

And concerning the EPR situation he says (Schilpp 1949, p. 85)<br />

But on one supposition we should, in my opinion, absolutely hold fast: the real factual<br />

situation of the system S 2 is independent of what is done with the system S 1 , which is<br />

spatially separated from the former.<br />

Bohr’s conceptions concerning physical reality are much more difficult to characterize. According<br />

to him there is no independent reality of which the physical theory would have to give an unambiguous<br />

representation. He writes (Schilpp 1949, p. 211)<br />

Thus, a sentence like “we cannot know both the momentum and the position of an atomic<br />

object” immediately raises questions as to the physical reality of two such attributes of the<br />

object, which can be answered only by referring to the conditions for the unambiguous<br />

use of space - time concepts, on the one hand, and dynamical conservation laws, on the<br />

other hand.<br />

An exhaustive description of reality must always use concepts which themselves remain dependent<br />

on mutually excluding contexts. Bohr says (A. Petersen 1963, p. 11)<br />

The word ‘reality’ is also a word, a word which we must learn to use correctly.<br />

He constantly emphasizes the restricted applicability of our physical concepts, which makes the link<br />

between description and reality very complicated. Petersen mentions (ibid., p. 12)<br />

When asked whether the algorithm of quantum mechanics could be considered as somehow<br />

mirroring an underlying quantum world, Bohr would answer, “There is no quantum<br />

world. There is only an abstract quantum physical description. It is wrong to think that<br />

the task of physics is to find out how nature is. Physics concerns what we can say about<br />

nature.”



Einstein’s conceptions are, in a certain way, easier than those of Bohr and correspond to the<br />

intuition of the majority of physicists. When the preponderance of the Copenhagen school started to<br />

wane, in the 1960s, attention for Einstein’s viewpoint revived.<br />

In 1964, John Bell gave a reconstruction of the EPR experiment (see chapter VII) satisfying Einstein’s<br />

requirement that the real, factual situation of physical system S 2 is independent of what is<br />

done with system S 1 , the two systems being spatially separated. He constructed a very general model<br />

and made the surprising discovery that such a model cannot completely reproduce the quantum mechanical<br />

predictions. Especially remarkable are the broad generality of his derivation and the fact that<br />

the differences with quantum mechanics are large enough to be measured. Sensationally, a<br />

‘philosophical’ issue thus came within the range of experimental physics! Abner Shimony has spoken<br />

in this respect of experimental metaphysics.<br />

Bell’s work is an attempt to solve the completeness problem. Hereafter attempts were undertaken<br />

to really carry out the EPR experiment, which was thus far only a thought experiment. The first<br />

experiment was done in 1972 by Freedman and Clauser. Later, several other experiments have been<br />

done, the highlight of which was, in 1982, the experiment of Alain Aspect and his group in Paris. In<br />

turn, this has been superseded by the experiments of Anton Zeilinger and his groups in Vienna and<br />

Innsbruck (e.g. Weihs 1998). The results of these experiments are in good to excellent agreement<br />

with quantum mechanics, and therefore in conflict with all models meeting Einstein’s requirements.<br />

The latter conclusion applies irrespective of the validity of quantum mechanics.<br />

These results brought about a great number of responses and are one of the main causes of the<br />

revived interest for interpretation problems of quantum mechanics. The discussion focusses on the<br />

question what exactly the suppositions are that lead to the result of Bell and whether his model is<br />

indeed the most general model that meets Einstein’s requirements.<br />

The consequences of Bell’s result seem to be considerable. It can be argued that no independent<br />

existence can be granted to objects that at some time interacted, irrespective of how far apart they are.<br />

This suggests that reality cannot be reduced<br />

to the ‘sum’ of its parts and that a more holistic approach is imperative, making our picture of nature<br />

much more complicated.<br />

Through the discussion of the EPR argument some more basic concepts are added to our list:<br />

7. element of physical reality,<br />

8. separability of physical systems,<br />

9. locality,<br />

10. holism.<br />

These ten concepts play a central role in the research on the foundations of quantum mechanics.


II<br />

THE FORMALISM<br />

As far as the laws of mathematics refer to reality, they are not certain; and as far as they<br />

are certain, they do not refer to reality.<br />

— Albert Einstein<br />

In mathematics you don’t understand things. You just get used to them.<br />

— John von Neumann<br />

The usual mathematical formulation of quantum mechanics was developed by John von Neumann<br />

in 1932 as an operator calculus on a Hilbert space. We will not need all details of this<br />

calculus and therefore give only a succinct review. For our purposes we can limit ourselves to a<br />

finite - dimensional Hilbert space, a complex vector space with an inner product. We will give an<br />

overview of the elementary concepts of this Hilbert space, and in an addendum concisely summarize<br />

the infinite - dimensional case. For a more extensive treatment of Hilbert spaces we refer<br />

to the first chapters of E. Prugovečki (2006).<br />

II. 1<br />

FINITE - DIMENSIONAL HILBERT SPACES<br />

We start this chapter by defining a space called a Hilbert space, denoted by H. The elements of H are<br />

called vectors. Following Dirac’s ket notation the vectors will be written as |α⟩,|β⟩,|γ⟩,|ϕ⟩,|ψ⟩,|χ⟩, . . . ,<br />

complex numbers will be specified by the first characters of the alphabet a, b, c ∈ C.<br />

Vectors can be added and multiplied by a complex number, also called a scalar, and the result remains<br />

in H, i.e., for all |ϕ⟩, |ψ⟩ ∈ H and a, b ∈ C we have<br />

a|ϕ⟩ + b|ψ⟩ ∈ H. (II. 1)<br />

In other words, H is closed under linear combinations.<br />

The addition is commutative and associative,<br />

|ϕ⟩ + |ψ⟩ = |ψ⟩ + |ϕ⟩, (II. 2)<br />

|ϕ⟩ + ( |ψ⟩ + |χ⟩ ) = ( |ϕ⟩ + |ψ⟩ ) + |χ⟩. (II. 3)<br />

We require the existence of a null vector, 0 ∈ H, which is provably unique and has the property<br />

that for all |ϕ⟩ ∈ H<br />

0 + |ϕ⟩ = |ϕ⟩, (II. 4)



and that every vector has an additive inverse, i.e., for every |ϕ⟩ ∈ H there is a vector |ϕ ′ ⟩ ∈ H, also<br />

provably unique, such that<br />

|ϕ⟩ + |ϕ ′ ⟩ = 0 . (II. 5)<br />

The scalar multiplication is distributive and associative,<br />

(a + b) ( |ϕ⟩ + |ψ⟩ ) = a |ϕ⟩ + a |ψ⟩ + b |ϕ⟩ + b |ψ⟩, (II. 6)<br />

a ( b |ϕ⟩ ) = (a b) |ϕ⟩, (II. 7)<br />

and we demand that<br />

1 |ψ⟩ = |ψ⟩. (II. 8)<br />

Incidentally we also write<br />

a |ψ⟩ ≡ |a ψ⟩ ≡ |ψ⟩ a. (II. 9)<br />

EXERCISE 1. Prove (a) 0|ϕ⟩ = 0 ,<br />

(b) the additive inverse of |ϕ⟩ equals −1|ϕ⟩.<br />

An inner product on a vector space is a mapping H × H → C, where the image in C<br />

of ( |ϕ⟩, |ψ⟩ ) ∈ H × H is written as ⟨ϕ | ψ⟩. The inner product has the following properties:<br />

(i) ⟨ϕ | a ψ + b χ⟩ = a ⟨ϕ | ψ⟩ + b ⟨ϕ | χ⟩,<br />

(ii) ⟨ϕ | ψ⟩ = ⟨ψ | ϕ⟩ ∗ ,<br />

(iii) ⟨ϕ | ϕ⟩ ≥ 0, (II. 10)<br />

(iv) ⟨ϕ | ϕ⟩ = 0 iff |ϕ⟩ = 0 .<br />

The value<br />

∥ψ∥ := √ ⟨ψ | ψ⟩ (II. 11)<br />

is called the norm of |ψ⟩ and meets the usual requirements for a norm; its value is positive, except<br />

for the zero vector which is assigned 0, it is homogeneous, in the sense that ∥aψ∥ = |a|∥ψ∥, and it<br />

satisfies the triangle inequality ∥ψ + ϕ∥ ≤ ∥ψ∥ + ∥ϕ∥. A vector is called a unit vector if the norm<br />

equals 1.



An important inequality is the Cauchy - Schwarz inequality<br />

|⟨ϕ | ψ⟩|² ≤ ⟨ϕ | ϕ⟩ ⟨ψ | ψ⟩. (II. 12)<br />
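As a numerical illustration of the properties (II. 10) to (II. 12), the following minimal sketch uses the standard inner product on C³, which is conjugate-linear in its first argument and linear in its second. The helper names `inner` and `norm` and the two sample vectors are chosen here for illustration only; a check on one pair of vectors is of course no substitute for the proofs asked for in the exercises.

```python
import math

def inner(phi, psi):
    """<phi|psi> on C^N: linear in psi, conjugate-linear in phi."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def norm(psi):
    """||psi|| = sqrt(<psi|psi>), as in (II. 11)."""
    return math.sqrt(inner(psi, psi).real)

# Two arbitrary sample vectors in C^3.
phi = [1 + 2j, 0.5j, -1.0]
psi = [2 - 1j, 3.0, 1j]
total = [a + b for a, b in zip(phi, psi)]

# Property (ii): <phi|psi> equals the complex conjugate of <psi|phi>.
conjugate_symmetric = abs(inner(phi, psi) - inner(psi, phi).conjugate()) < 1e-12

# Cauchy-Schwarz inequality (II. 12).
cauchy_schwarz = abs(inner(phi, psi)) ** 2 <= inner(phi, phi).real * inner(psi, psi).real

# Triangle inequality for the norm.
triangle = norm(total) <= norm(phi) + norm(psi)

print(conjugate_symmetric, cauchy_schwarz, triangle)  # True True True
```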

EXERCISE 2. Prove (a) the Cauchy - Schwarz inequality (II. 12),<br />

(b) the definition of the norm satisfies the standard requirements for a norm.<br />

The n vectors |α 1 ⟩, . . . , |α n ⟩ are called (linearly) independent if it follows from<br />

$\sum_{i=1}^{n} c_i\, |\alpha_i\rangle = 0$ (II. 13)<br />

that all coefficients c i are equal to zero, otherwise the vectors are called dependent.<br />

EXERCISE 3. Prove that mutually orthogonal vectors are linearly independent.<br />

A set of vectors |α 1 ⟩, . . . , |α N ⟩ in H is complete 1 if every vector |ψ⟩ ∈ H can be written as a<br />

linear combination of this set,<br />

$|\psi\rangle = \sum_{i=1}^{N} c_i\, |\alpha_i\rangle$. (II. 14)<br />

A complete, independent set of vectors is called a basis. A basis is called orthonormal if<br />

⟨α i | α j ⟩ = δ ij , (II. 15)<br />

where δ ij is the Kronecker delta. It can be proved that every basis of a space H contains the same<br />

number of elements; this number is, by definition, the dimension of H, and is written dim H. The<br />

dimension of a Hilbert space is infinite if every finite set of linearly independent vectors is incomplete.<br />

If |α 1 ⟩, . . . , |α N ⟩ is an orthonormal basis, with N = dim H, then it follows from (II. 15) that the<br />

coefficients in (II. 14) are given by<br />

c i = ⟨α i | ψ⟩, (II. 16)<br />

and the vectors |ψ⟩ can thus be represented in such a basis by columns of N complex numbers.<br />

Therefore, an N - dimensional Hilbert space can also be written as C N .<br />
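The relation between (II. 14) and (II. 16) can be made concrete with a small computation. The sketch below, under the assumption of a particular orthonormal basis of C² picked purely for illustration, computes the coefficients c i = ⟨α i | ψ⟩ and then rebuilds |ψ⟩ from them. The names `inner`, `basis`, `coeffs` and `rebuilt` are hypothetical.

```python
import math

def inner(phi, psi):
    """<phi|psi> on C^N: linear in psi, conjugate-linear in phi."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

s = 1 / math.sqrt(2)

# An orthonormal basis of C^2, chosen only for illustration.
basis = [[s, s], [s, -s]]

# An arbitrary vector, given by its components in the standard basis.
psi = [3 + 1j, 1 - 2j]

# Coefficients c_i = <alpha_i|psi>, as in (II. 16).
coeffs = [inner(alpha, psi) for alpha in basis]

# Rebuild |psi> = sum_i c_i |alpha_i>, as in (II. 14).
rebuilt = [sum(c * alpha[k] for c, alpha in zip(coeffs, basis)) for k in range(2)]

max_error = max(abs(r - p) for r, p in zip(rebuilt, psi))
print(coeffs)
print(max_error < 1e-12)  # True
```

The list `coeffs` is the column of N complex numbers that represents |ψ⟩ in the chosen basis.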

1 The use of the term ‘complete’ for a system of vectors should not be confused with the same phrase as used within the<br />

context of the foundations of quantum mechanics, that is, as a property of a physical theory.



With (II. 16), in an orthonormal basis we have<br />

$|\psi\rangle = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_N \end{pmatrix}$ (II. 17)<br />

and hence $\langle\psi| = (c_1^*, c_2^*, \ldots, c_N^*)$, therefore<br />

$|\psi\rangle \langle\psi| = \begin{pmatrix} c_1 c_1^* & \cdots & c_1 c_N^* \\ \vdots & \ddots & \vdots \\ c_N c_1^* & \cdots & c_N c_N^* \end{pmatrix}$, (II. 18)<br />

from which it is evident that for the vectors of the orthonormal basis {|α i ⟩} it holds that<br />

$\sum_{i=1}^{N} |\alpha_i\rangle \langle\alpha_i| = \mathbb{1}$, (II. 19)<br />

with 𝟙 the identity mapping on H,<br />

𝟙 |ψ⟩ = |ψ⟩ ∀ |ψ⟩ ∈ H. (II. 20)<br />
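The completeness relation (II. 19) can be verified numerically for a concrete basis. The sketch below assumes an orthonormal basis of C² with complex components, chosen here only as an example, builds the matrices |α i ⟩ ⟨α i | according to (II. 18), and adds them up. The helper name `outer` is hypothetical.

```python
import math

def outer(phi, psi):
    """Matrix of |phi><psi|: entry (i, j) is phi_i times the conjugate of psi_j."""
    return [[a * b.conjugate() for b in psi] for a in phi]

s = 1 / math.sqrt(2)

# An orthonormal basis of C^2 with complex entries, chosen only for illustration.
basis = [[s, 1j * s], [s, -1j * s]]
n = len(basis)

# Sum of the one-dimensional projectors |alpha_i><alpha_i|.
total = [[0j] * n for _ in range(n)]
for alpha in basis:
    projector = outer(alpha, alpha)
    for i in range(n):
        for j in range(n):
            total[i][j] += projector[i][j]

identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
max_error = max(abs(total[i][j] - identity[i][j]) for i in range(n) for j in range(n))
print(max_error < 1e-12)  # True
```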

Using (II. 14) and (II. 16), we see that an orthonormal basis is indeed characterized by the relation<br />

$|\psi\rangle = \sum_{i=1}^{N} \langle\alpha_i | \psi\rangle\, |\alpha_i\rangle = \sum_{i=1}^{N} |\alpha_i\rangle \langle\alpha_i | \psi\rangle$. (II. 21)<br />

The definition of a finite - dimensional Hilbert space is now completed; it is a finite - dimensional<br />

complex vector space with an inner product which is related to the norm by means of (II. 11). A real<br />

finite - dimensional Hilbert space is obtained by replacing C everywhere by R, i.e., the set of scalars is<br />

in R and the inner product is always real. In section II. 6 we will see that for the infinite - dimensional<br />

case the definition must be extended with two requirements, ‘separability’ and ‘completeness’, which<br />

we can prove in the finite - dimensional case.<br />

II. 2<br />

OPERATORS<br />

An operator A on a Hilbert space H is a linear mapping of H into itself,<br />

A : H → H, |ψ⟩ ↦→ A |ψ⟩ with A ( a |ψ⟩ + b |ϕ⟩ ) = a A |ψ⟩ + b A |ϕ⟩. (II. 22)<br />

From (II. 16) we saw that in a given orthonormal basis |α 1 ⟩, . . . , |α N ⟩ the vectors |ψ⟩ ∈ H are<br />

unambiguously represented by columns of N complex numbers c i = ⟨α i | ψ⟩. This corresponds to the



representation of an operator A as an N × N - matrix A in a basis {|α i ⟩}, and the coefficients of the<br />

vector A|ψ⟩ in this basis are, using (II. 19),<br />

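$\langle\alpha_i | A | \psi\rangle = \langle\alpha_i | A\, \mathbb{1} | \psi\rangle = \sum_{j=1}^{N} \langle\alpha_i | A | \alpha_j\rangle \langle\alpha_j | \psi\rangle = \sum_{j=1}^{N} A_{ij}\, c_j$, (II. 23)<br />

with<br />

$A_{ij} := \langle\alpha_i | A | \alpha_j\rangle$. (II. 24)<br />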

Operators A and B can be added and multiplied,<br />

(A + B) |ψ⟩ := A |ψ⟩ + B |ψ⟩ and (A B) |ψ⟩ := A ( B |ψ⟩ ) . (II. 25)<br />

The adjoint A † of an operator A is defined by the following equation<br />

⟨ψ | A † | ϕ⟩ = ⟨ϕ | A | ψ⟩ ∗ ∀ |ϕ⟩, |ψ⟩ ∈ H. (II. 26)<br />

EXERCISE 4. Show that for the matrix representation in an orthonormal basis it holds that<br />

$(A^\dagger)_{ij} = A_{ji}^*$.<br />

Every operator on a finite - dimensional vector space has a unique adjoint, and the following holds<br />

(c A) † = c ∗ A † ,<br />

(A + B) † = A † + B † ,<br />

(A B) † = B † A † ,<br />

( A † ) † = A. (II. 27)<br />
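Two of the rules in (II. 27) can be checked on explicit matrices. The following minimal sketch takes for granted the statement of Exercise 4, namely that in an orthonormal basis the adjoint is represented by the conjugate transpose. The helpers `adjoint`, `matmul` and `max_diff` and the sample matrices `A` and `B` are illustrative assumptions, not taken from the notes.

```python
def adjoint(m):
    """Conjugate transpose: entry (i, j) of the result is the conjugate of m[j][i]."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Matrix product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def max_diff(a, b):
    """Largest absolute difference between corresponding entries."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

# Two arbitrary 2 x 2 sample matrices.
A = [[1 + 1j, 2], [0, 3 - 1j]]
B = [[0, 1j], [1, 2 + 2j]]

# (A B)† = B† A†: note the reversed order on the right-hand side.
product_rule_holds = max_diff(adjoint(matmul(A, B)), matmul(adjoint(B), adjoint(A))) < 1e-12

# (A†)† = A.
involution_holds = max_diff(adjoint(adjoint(A)), A) < 1e-12

print(product_rule_holds, involution_holds)  # True True
```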

An operator B is called an inverse of A if<br />

A B = B A = 𝟙. (II. 28)<br />

In this case we write A⁻¹ for B, because the inverse, if it exists, is unique. Not every operator has an<br />

inverse, an example in the Hilbert space C 2 is<br />

$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. (II. 29)<br />

The trace of an operator A is defined as follows,<br />

$\mathrm{Tr}\, A := \sum_{i=1}^{N} \langle\gamma_i | A | \gamma_i\rangle$, (II. 30)<br />

where |γ 1 ⟩, . . . , |γ N ⟩ is an arbitrary orthonormal basis and N = dim H.



EXERCISE 5. Show that Tr A is independent of the choice of the orthonormal basis.<br />

The trace has the following properties:<br />

Tr A † = (Tr A) ∗ ,<br />

Tr (bA + cB) = b Tr A + c Tr B,<br />

Tr AB = Tr BA. (II. 31)<br />

EXERCISE 6. Prove the three statements in (II. 31).<br />
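Before proving the statements in (II. 31), it can help to see them hold for concrete matrices. The sketch below evaluates the trace in the standard basis of C² for two arbitrary sample matrices; the helpers `trace`, `matmul` and `adjoint` and all numerical values are chosen here for illustration and are not part of the notes.

```python
def matmul(a, b):
    """Matrix product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def adjoint(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(m):
    """Sum of the diagonal entries, i.e. (II. 30) in the standard basis."""
    return sum(m[i][i] for i in range(len(m)))

A = [[1 + 1j, 2], [0, 3 - 2j]]
B = [[0, 1j], [1, 2 + 2j]]
b, c = 2 - 1j, 0.5j

# Tr A† = (Tr A)*
conjugation_ok = abs(trace(adjoint(A)) - complex(trace(A)).conjugate()) < 1e-12

# Tr (bA + cB) = b Tr A + c Tr B
combo = [[b * A[i][j] + c * B[i][j] for j in range(2)] for i in range(2)]
linearity_ok = abs(trace(combo) - (b * trace(A) + c * trace(B))) < 1e-12

# Tr AB = Tr BA, although AB and BA themselves differ here.
cyclic_ok = abs(trace(matmul(A, B)) - trace(matmul(B, A))) < 1e-12

print(conjugation_ok, linearity_ok, cyclic_ok)  # True True True
```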

We will now list the most important types of operators. An operator A is called normal if it<br />

commutes with its adjoint,<br />

[ A, A † ] := A A † − A † A = 0 , (II. 32)<br />

where 0 is actually the ‘zero operator’, which maps all vectors to the zero vector 0 . An operator is called<br />

self - adjoint or Hermitian if it is equal to its adjoint,<br />

A † = A, (II. 33)<br />

and with the first statement of (II. 31) we see that the trace of a self-adjoint operator is always real.<br />

Self - adjoint operators are normal, but not all normal operators are self - adjoint; an example is a unitary<br />

operator U, defined by<br />

$U^\dagger = U^{-1}$. (II. 34)<br />

EXERCISE 7. Prove that a unitary operator preserves the inner product, i.e., for all |ϕ⟩,|ψ⟩ ∈ H<br />

the following holds: if |ϕ ′ ⟩ = U |ϕ⟩ and |ψ ′ ⟩ = U |ψ⟩ then ⟨ψ ′ | ϕ ′ ⟩ = ⟨ψ | ϕ⟩.<br />
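The statement of Exercise 7 can be illustrated numerically before it is proved. The sketch below assumes one particular unitary 2 × 2 matrix and two arbitrary vectors, all chosen here only as an example, and compares the inner product before and after applying U. The helper names `inner` and `apply` are hypothetical.

```python
import math

def inner(phi, psi):
    """<phi|psi> on C^N: linear in psi, conjugate-linear in phi."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(m, v):
    """Matrix m acting on column vector v."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

s = 1 / math.sqrt(2)

# A sample unitary matrix: its conjugate transpose is its inverse.
U = [[s, 1j * s], [1j * s, s]]

phi = [1 + 2j, -1j]
psi = [0.5, 3 - 1j]

before = inner(psi, phi)
after = inner(apply(U, psi), apply(U, phi))
difference = abs(before - after)
print(difference < 1e-12)  # True
```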

An operator A is called positive, written A ≥ 0, if<br />

⟨ψ | A | ψ⟩ ≥ 0 ∀ |ψ⟩ ∈ H. (II. 35)<br />

An operator P is called a projection operator, or a projector for short, if it is self - adjoint and<br />

idempotent,<br />

P = P † and P 2 = P. (II. 36)



An example of a projector, apart from the obvious examples of the zero operator 0 and the identity<br />

operator 𝟙, is the mapping P ϕ = |ϕ⟩ ⟨ϕ| which projects on a given unit vector |ϕ⟩,<br />

P ϕ : |ψ⟩ ↦→ ⟨ϕ | ψ⟩ |ϕ⟩ = |ϕ⟩ ⟨ϕ | ψ⟩. (II. 37)<br />

EXERCISE 8. Show that (a) every projector is positive,<br />

(b) if P is a projector, then 𝟙 − P is one also.<br />

Projectors are the workhorses of Hilbert space. Nearly all of our further considerations concerning<br />

quantum mechanics can be formulated in terms of projectors, and therefore we will now discuss their<br />

properties in somewhat more detail.<br />

We write the set of all projectors on a Hilbert space H as P (H). Every projector P can be<br />

characterized by means of its range, i.e. the set<br />

H P := { P |ψ⟩ : |ψ⟩ ∈ H } . (II. 38)<br />

This set is closed under linear combinations and thus forms another Hilbert space by itself, called a<br />

subspace of H. Conversely, every subspace of H corresponds unambiguously to a projector. 2<br />

The subspace corresponding to a projector is also called its eigenspace, and if the dimension of its<br />

eigenspace is N, the projector is called N - dimensional.<br />

Two projectors P 1 and P 2 are called mutually orthogonal, written as P 1 ⊥ P 2 , if<br />

P 1 P 2 = 0. (II. 39)<br />

In that case their eigenspaces are also orthogonal,<br />

$P_1 \perp P_2$ iff for all $|\psi\rangle \in H_{P_1}$ and all $|\varphi\rangle \in H_{P_2}$ it holds that $\langle\varphi | \psi\rangle = 0$. (II. 40)<br />

EXERCISE 9. Verify that P 1 P 2 = 0 =⇒ P 2 P 1 = 0 holds for projectors.<br />

For two orthogonal projectors P 1 ⊥ P 2 , the sum P 1 + P 2 is also a projector since it is, as can be<br />

seen using (II. 27), self - adjoint, and it is idempotent,<br />

(P 1 + P 2 ) 2 = P 1 2 + P 1 P 2 + P 2 P 1 + P 2 2 = P 1 2 + P 2 2 = P 1 + P 2 , (II. 41)<br />

thereby satisfying the requirements (II. 36). The eigenspace of the projector P 1 + P 2 is the linear<br />

space spanned by the vectors in H P1 and H P2 .<br />
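The argument around (II. 41) can be followed on a concrete example. The sketch below builds two one - dimensional projectors of the form (II. 37) from two mutually orthogonal unit vectors in C³ (chosen here only for illustration), and checks that their product vanishes and that their sum satisfies the two requirements (II. 36). The helpers `outer`, `matmul`, `adjoint` and `max_diff` are hypothetical names.

```python
import math

def outer(phi, psi):
    """Matrix of |phi><psi|."""
    return [[a * complex(b).conjugate() for b in psi] for a in phi]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def adjoint(m):
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def max_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
phi = [s, 1j * s, 0]   # unit vector
chi = [0, 0, 1]        # unit vector orthogonal to phi

P1 = outer(phi, phi)   # projector on phi, as in (II. 37)
P2 = outer(chi, chi)   # projector on chi
P_sum = [[P1[i][j] + P2[i][j] for j in range(3)] for i in range(3)]
zero = [[0] * 3 for _ in range(3)]

orthogonal = max_diff(matmul(P1, P2), zero) < 1e-12          # (II. 39)
idempotent = max_diff(matmul(P_sum, P_sum), P_sum) < 1e-12   # (II. 41)
self_adjoint = max_diff(adjoint(P_sum), P_sum) < 1e-12
print(orthogonal, idempotent, self_adjoint)  # True True True
```

Replacing `chi` by a vector that is not orthogonal to `phi` makes the idempotency check fail, which shows that the orthogonality condition is actually used.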

2 In infinite - dimensional Hilbert spaces this only holds for closed subspaces.



A set of projectors P 1 , . . . , P N is called mutually orthogonal if<br />

P i P j = δ ij P i for i, j = 1, . . . , N, (II. 42)<br />

a set of mutually orthogonal projectors is called complete if<br />

$\sum_{i=1}^{N} P_i = \mathbb{1}$. (II. 43)<br />

In particular, in accordance with (II. 19), for an orthonormal basis |α 1 ⟩, . . . , |α N ⟩ it holds that the<br />

associated 1 - dimensional projectors form a complete set,<br />

$\sum_{i=1}^{N} |\alpha_i\rangle \langle\alpha_i| = \mathbb{1}$. (II. 44)<br />

II. 3<br />

EIGENVALUE PROBLEM AND SPECTRAL THEOREM<br />

If |β 1 ⟩, . . . , |β N ⟩ is an arbitrary orthonormal basis, an operator A is represented in this basis as<br />

an arbitrary N × N - matrix,<br />

A ij = ⟨β i | A | β j ⟩. (II. 45)<br />

A powerful tool for the study of such matrices is obtained if they can be ‘diagonalized’, i.e., if an<br />

orthonormal basis |α 1 ⟩, . . . , |α N ⟩ can be found where the matrix representation of A is of the form<br />

$A = \begin{pmatrix} a_1 & & 0 \\ & \ddots & \\ 0 & & a_N \end{pmatrix}$, (II. 46)<br />

or, equivalently,<br />

A ij = a j δ ij . (II. 47)<br />

For such a basis it holds that<br />

A |α i ⟩ = a i |α i ⟩. (II. 48)<br />

Equation (II. 48) is called the eigenvalue equation of the operator A, the values a i are called the<br />

eigenvalues of A, the set of eigenvalues of A the spectrum of A, written as Spec A, the vectors |α i ⟩<br />

are called the eigenvectors, and the system |α 1 ⟩, . . . , |α N ⟩ an eigenbasis of A. For a self - adjoint<br />

operator it holds that the eigenvalues are all real, and the eigenvalues are not negative if the operator<br />

is positive. For a unitary operator U all eigenvalues u i ∈ C are on the complex unit circle, |u i | = 1,<br />

for a projector the eigenvalues are 0 or 1.<br />

The eigenvalue equation does, however, not always have a solution in the form of an orthonormal<br />

eigenbasis; see as an example operator (II. 29). The conditions under which such a basis exists are given by the next important<br />

theorem which we mention without proof.



SPECTRAL THEOREM:<br />

Every normal operator A has an orthonormal basis of eigenvectors |α 1 ⟩, . . . , |α N ⟩ and<br />

associated eigenvalues a 1 , . . . , a N , not necessarily distinct, satisfying (II. 48).<br />

The spectral theorem tells us that normal operators can be diagonalized. This can be formulated<br />

more elegantly in Dirac notation, where we must distinguish between the case in which all eigenvalues<br />

differ from each other, and the case in which some eigenvalues are equal. In the first case the operator<br />

is called maximal, in the second case the operator is called degenerate.<br />

Suppose that the operator A is maximal, i.e. all eigenvalues a i differ from each other, a i ≠ a j<br />

if i ≠ j. In this case we often use the eigenvalues as a label for the eigenvectors and write |a i ⟩ instead<br />

of |α i ⟩. This notation is unambiguous, since, up to a scalar factor, there is exactly one eigenvector for every eigenvalue.<br />

Now, according to the spectral theorem, there is an orthonormal basis |a 1 ⟩, . . . , |a N ⟩ such that<br />

$A = \sum_{i=1}^{N} a_i\, |a_i\rangle \langle a_i|$, (II. 49)<br />

since, with (II. 44), it holds for all |ψ⟩ ∈ H that<br />

$A\,|\psi\rangle = A\, \mathbb{1}\, |\psi\rangle = A \sum_{i=1}^{N} |a_i\rangle \langle a_i | \psi\rangle = \sum_{i=1}^{N} a_i\, |a_i\rangle \langle a_i | \psi\rangle$. (II. 50)<br />

If the operator is degenerate there are only M < N distinct eigenvalues a 1 , . . . , a M . For every<br />

eigenvalue a i , there exists a number n i of mutually orthogonal eigenvectors, for which we have<br />

$$\sum_{i=1}^{M} n_i = N .$$ (II. 51)<br />

The eigenvalue a i is called n i - fold degenerate. The associated eigenvectors span a n i - dimensional<br />

subspace of eigenvectors for the value a i .<br />

Choose, in this subspace, an orthonormal basis {|α i , j⟩} with j = 1, . . . , n i . Here we can also<br />

use the eigenvalues a i as a label for the basis vectors because the extra label j prevents our notation<br />

from becoming ambiguous. Now the eigenvalue equation (II. 48) becomes<br />

A |a i , j⟩ = a i |a i , j⟩. (II. 52)<br />

Analogous to (II. 49), we find<br />

$$A = \sum_{i=1}^{M} a_i \sum_{j=1}^{n_i} |a_i, j\rangle \langle a_i, j| ,$$ (II. 53)<br />

which, in terms of the n i - dimensional eigenprojectors<br />

$$P_{a_i} = \sum_{j=1}^{n_i} |a_i, j\rangle \langle a_i, j| ,$$ (II. 54)<br />



can also be written as<br />

$$A = \sum_{i=1}^{M} a_i \, P_{a_i} .$$ (II. 55)<br />

EXERCISE 10. (a.) Show that P ai in (II. 54) is independent of the choice of the orthonormal<br />

basis |a i , 1⟩, . . . , |a i , n i ⟩. (b.) Show that for P ai as defined in (II. 54) and P ϕ given by (II. 37):<br />

$$\mathrm{Tr}\, P_{a_i} P_{\phi} = \langle\phi \,|\, P_{a_i} \,|\, \phi\rangle$$ (II. 56)<br />

We summarize the two preceding cases in the following, equivalent, form of the spectral theorem,<br />

formulated in terms of projectors.<br />

SPECTRAL THEOREM:<br />

For every normal operator A a unique set of mutually distinct eigenvalues a 1 , . . . , a M<br />

exists, with M ≤ N, and an associated unique complete set of mutually orthogonal projectors<br />

P a1 , . . . , P aM , such that<br />

$$A = \sum_{i=1}^{M} a_i \, P_{a_i} ,$$ (II. 57)<br />

$$\mathbb{1} = \sum_{i=1}^{M} P_{a_i} .$$ (II. 58)<br />

If the operator is non - degenerate, all of these projectors are 1 - dimensional; if it is degenerate,<br />

dim P ai gives the degeneracy of eigenvalue a i . Equation (II. 57) is called the spectral decomposition<br />

of A, the set of mutually orthogonal projectors P ai is called the spectral family of A, and (II. 58) a<br />

resolution of identity.<br />
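The spectral decomposition and the resolution of identity can be checked numerically for a small case. The following is a minimal sketch in plain Python, assuming as an illustrative operator the 2 × 2 matrix with rows (0, 1) and (1, 0), whose eigenvalues are +1 and −1; the helper names `outer`, `add`, `scale`, `matmul` and `close` are introduced here for illustration only and are not part of the notes.

```python
import math

def outer(v):
    """Projector |v><v| onto a normalized vector v."""
    return [[vi * vj.conjugate() for vj in v] for vi in v]

def add(X, Y):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Multiply every entry of X by the number c."""
    return [[c * x for x in row] for row in X]

def matmul(X, Y):
    """Ordinary matrix product X Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def close(X, Y, tol=1e-12):
    """True if all entries of X and Y agree within tol."""
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

s = 1 / math.sqrt(2)
A = [[0, 1], [1, 0]]                 # self-adjoint, hence normal
eigen = {+1: [s, s], -1: [s, -s]}    # eigenvalue -> normalized eigenvector
P = {a: outer(v) for a, v in eigen.items()}

A_rebuilt = add(scale(+1, P[+1]), scale(-1, P[-1]))   # right-hand side of (II. 57)
identity = add(P[+1], P[-1])                          # right-hand side of (II. 58)
product = matmul(P[+1], P[-1])                        # mutual orthogonality

print(close(A_rebuilt, A))
print(close(identity, [[1, 0], [0, 1]]))
print(close(product, [[0, 0], [0, 0]]))
```

Since this operator is non-degenerate, both projectors are 1 - dimensional; a degenerate example would require summing several outer products per eigenvalue, as in (II. 54).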

II. 3. 1<br />

APPENDIX<br />

A formulation of the spectral theorem which is equivalent to the preceding, but is more suitable<br />

for generalizations, can be obtained if we introduce the correspondence between the eigenvalues and<br />

the associated eigenprojectors as a mapping A of all subsets of Spec A ⊂ C to the set P (H) of<br />

projectors on H.<br />

We construct that mapping by demanding<br />

{a i } ↦→ P ai , (II. 59)



and extend this with the condition<br />

{a 1 , a 2 } ↦→ P {a1 , a 2 } := P a1 + P a2 , (II. 60)<br />

or, more generally, if ∆ represents an arbitrary set of eigenvalues, we define<br />

$$\Delta \mapsto P_{\Delta} = \sum_{a \in \Delta} P_a .$$ (II. 61)<br />

A mapping A : C → P (H) is called a projection - valued measure if<br />

(i) P ∅ = 0<br />

(ii) P Spec A = 11<br />

(iii) $P_{\cup_i \Delta_i} = \sum_i P_{\Delta_i}$ for all ∆ i mutually disjoint. (II. 62)<br />

EXERCISE 11. Verify that: P ∆ c = 11 − P ∆ where ∆ c = Spec A \ ∆ is the complement of ∆.<br />

The spectral theorem can now again be formulated.<br />

SPECTRAL THEOREM:<br />

Every normal operator A corresponds unambiguously to a projection - valued measure A.<br />

II. 4<br />

FUNCTIONS <strong>OF</strong> NORMAL OPERATORS<br />

The spectral theorem makes it possible to treat functions of normal operators in a simple manner.<br />

If f is an arbitrary function, real or complex, and A is an operator with spectral decomposition<br />

$$A = \sum_{i=1}^{M} a_i \, P_{a_i} ,$$ (II. 63)<br />

then the function f (A) of A is defined as<br />

$$f(A) := \sum_{i=1}^{M} f(a_i) \, P_{a_i} .$$ (II. 64)<br />

This means that f (A) always has the same eigenvectors and eigenprojections as A, and only differs<br />

from A in the labeling of its eigenvalues, namely by f (a i ) instead of a i . As an example, consider the<br />

characteristic function χ a of a ∈ C,<br />

$$\chi_a : \mathbb{C} \to \{0, 1\}, \quad x \mapsto \chi_a(x) := \begin{cases} 1 & \text{if } x = a \\ 0 & \text{otherwise} \end{cases}$$ (II. 65)<br />



for which, with (II. 64), we have<br />

$$\chi_{a_k}(A) := \sum_{i=1}^{M} \chi_{a_k}(a_i) \, P_{a_i} = P_{a_k} ,$$ (II. 66)<br />

and we see that the projectors from the spectral decomposition of A, (II. 63), are functions of A.<br />
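Definition (II. 64) translates directly into a few lines of code. The sketch below, in plain Python, assumes a spectral decomposition given as a list of pairs (a i , P ai ) for an illustrative 2 × 2 operator with eigenvalues +1 and −1; the function name `func_of_operator` and the variable names are hypothetical and chosen only for this example.

```python
def func_of_operator(f, spectral):
    """Return f(A) = sum_i f(a_i) P_i for spectral = [(a_i, P_i), ...]."""
    n = len(spectral[0][1])
    result = [[0.0] * n for _ in range(n)]
    for a, P in spectral:
        for r in range(n):
            for c in range(n):
                result[r][c] += f(a) * P[r][c]
    return result

# Eigenprojectors of the matrix with rows (0, 1) and (1, 0)
P_plus = [[0.5, 0.5], [0.5, 0.5]]
P_minus = [[0.5, -0.5], [-0.5, 0.5]]
spectral = [(1.0, P_plus), (-1.0, P_minus)]

A = func_of_operator(lambda x: x, spectral)              # f(x) = x gives A itself
A_squared = func_of_operator(lambda x: x ** 2, spectral) # f(x) = x^2
chi_plus = func_of_operator(lambda x: 1.0 if x == 1.0 else 0.0, spectral)

print(A)
print(A_squared)
print(chi_plus == P_plus)
```

The last line mirrors (II. 66): applying the characteristic function of an eigenvalue to A returns the corresponding eigenprojector.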

We use the spectral decompositions in the proof of the following theorem.<br />

THEOREM:<br />

If two self - adjoint operators A and B commute, there is a maximal, self - adjoint operator<br />

C of which both A and B are a function.<br />

To prove this theorem we first prove a useful lemma.<br />

LEMMA:<br />

If [A, B] = 0, a basis {|γ i ⟩} exists in which A and B are simultaneously diagonal.<br />

Proof<br />

Let {|a i , j⟩} be an orthonormal eigenbasis of operator A, where n i is the degeneracy of eigenvalue a i<br />

and j = 1, . . . , n i labels the eigenvectors belonging to it, so that<br />

⟨a p , q | a i , j⟩ = δ pi δ qj . (II. 67)<br />

Analogously, let there be an orthonormal eigenbasis {|b k , l⟩} for operator B. From [A, B] = 0<br />

and (II. 63) it follows that<br />

A ( B |a i , j⟩ ) = B A |a i , j⟩ = a i B |a i , j⟩, (II. 68)<br />

and B |a i , j⟩ is, apparently, an eigenvector of A with the eigenvalue a i , i.e., B |a i , j⟩ is in the<br />

eigenspace spanned by |a i , 1⟩, . . . , |a i , n i ⟩. Or, equivalently,<br />

$$B\,|a_i, j\rangle = \sum_{k=1}^{n_i} \Lambda^{[i]}_{k,j} \, |a_i, k\rangle$$ (II. 69)<br />

holds for certain numbers $\Lambda^{[i]}_{k,j} \in \mathbb{C}$.<br />

By assumption, B is self - adjoint and therefore the matrix Λ [i] must be Hermitian,<br />

and we see that<br />

$$\langle a_k, l \,|\, B \,|\, a_i, j\rangle = \Lambda^{[i]}_{l,j} \, \delta_{ki} = \Lambda^{[i]}_{l,j} ,$$ (II. 70)<br />

$$\langle a_k, l \,|\, B \,|\, a_i, j\rangle^{*} = \Lambda^{[i]\,*}_{l,j} = \langle a_i, j \,|\, B^{\dagger} \,|\, a_k, l\rangle = \Lambda^{[k]}_{j,l} \, \delta_{ik} = \Lambda^{[i]}_{j,l} ,$$ (II. 71)<br />

$$\Lambda^{[i]\,*}_{l,j} = \Lambda^{[i]}_{j,l} .$$ (II. 72)<br />



Because Λ [i] is self - adjoint, it can be diagonalized by a unitary matrix S [i] ,<br />

$$\Lambda'^{[i]} = S^{[i]\,-1} \, \Lambda^{[i]} \, S^{[i]} .$$ (II. 73)<br />

This corresponds to an orthonormal basis transformation within the n i - dimensional subspace<br />

with eigenvalue a i . Carrying out this transformation in each of the subspaces and writing |a i , m ′ ⟩<br />

for the transformed eigenvectors of A, we have<br />

$$|a_i, m'\rangle = \sum_{j=1}^{n_i} S^{[i]}_{j,m'} \, |a_i, j\rangle .$$ (II. 74)<br />

In the new basis {|a i , m ′ ⟩} the matrix Λ [i] is diagonalized and therefore<br />

$$B\,|a_i, m'\rangle = \sum_{j'=1}^{n_i} \Lambda'^{[i]}_{m',m'} \, \delta_{m'j'} \, |a_i, j'\rangle = \Lambda'^{[i]}_{m',m'} \, |a_i, m'\rangle .$$ (II. 75)<br />

The vectors |a i , m ′ ⟩ are not just eigenvectors of A, but also of B and form, by construction, a<br />

basis. □<br />

Notice that it is not in contradiction to this lemma if non - commuting operators have some eigenvectors<br />

in common. ▹<br />

Now we come to the proof of the theorem.<br />

Proof<br />

Define, in the basis {|γ i ⟩} of the lemma<br />

$$A = \sum_i a_i \, P_{|\gamma_i\rangle} \quad \text{and} \quad B = \sum_i b_i \, P_{|\gamma_i\rangle} ,$$ (II. 76)<br />

where the eigenvalues a i and b i are allowed to be degenerate. Next, define a maximal self - adjoint<br />

operator<br />

$$C = \sum_i c_i \, P_{|\gamma_i\rangle} ,$$ (II. 77)<br />

with all c i ∈ R distinct.<br />

Then, according to (II. 66), with χ ci defined analogously to (II. 65),<br />

P |γi⟩ = χ ci (C). (II. 78)<br />

With $f(x) = \sum_i a_i \, \chi_{c_i}(x)$ and $g(x) = \sum_i b_i \, \chi_{c_i}(x)$, as defined in (II. 64), we now find<br />

$$A = \sum_i a_i \, \chi_{c_i}(C) = f(C) \quad \text{and} \quad B = \sum_i b_i \, \chi_{c_i}(C) = g(C) .$$ (II. 79)<br />



Thus, both self - adjoint, and mutually commuting, operators A and B are functions of the maximal,<br />

self - adjoint operator C, which is what we set out to prove. □<br />
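The construction in this proof can be illustrated with a small numerical sketch. It assumes, purely for illustration, that A, B and C are already written in the common eigenbasis of the lemma, so that each operator is represented by its list of diagonal entries; the particular eigenvalues are arbitrary choices and not taken from the notes.

```python
# Diagonal entries in the common eigenbasis {|gamma_i>} (illustrative values)
a = [1.0, 1.0, 2.0]   # eigenvalues of A, degenerate
b = [0.0, 5.0, 5.0]   # eigenvalues of B, degenerate
c = [1.0, 2.0, 3.0]   # eigenvalues of C, all distinct, so C is maximal

def chi(c_i, x):
    """Characteristic function of the point c_i, as in (II. 65)."""
    return 1.0 if x == c_i else 0.0

def f(x):
    """f(x) = sum_i a_i chi_{c_i}(x)."""
    return sum(a_i * chi(c_i, x) for a_i, c_i in zip(a, c))

def g(x):
    """g(x) = sum_i b_i chi_{c_i}(x)."""
    return sum(b_i * chi(c_i, x) for b_i, c_i in zip(b, c))

# Applying a function to a diagonal operator acts on each eigenvalue separately
f_of_C = [f(c_i) for c_i in c]
g_of_C = [g(c_i) for c_i in c]

print(f_of_C == a, g_of_C == b)
```

The functions f and g only need to be specified on the spectrum of C; because the c i are distinct, each c i can be sent to an arbitrary pair of values a i and b i .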

Note that the choice of C in the above theorem is not unique. Indeed, suppose that<br />

A = f (C 1 ) = g(C 2 ), (II. 80)<br />

where C 1 and C 2 are both maximal. In general, it is not required for C 1 and C 2 to commute.<br />

But they do commute if A itself is maximal. In that case f can be inverted,<br />

$$C_1 = f^{-1}(A) = f^{-1}(g(C_2)) ,$$ (II. 81)<br />

from which it follows that<br />

[C 1 , C 2 ] = 0. ▹ (II. 82)<br />

II. 5<br />

DIRECT SUM AND DIRECT PRODUCT<br />

There are two ways to construct a new Hilbert space H from two given Hilbert spaces H 1<br />

and H 2 , or vice versa, to divide a given Hilbert space H into smaller spaces.<br />

II. 5. 1<br />

DIRECT SUM<br />

Let H 1 and H 2 be two Hilbert spaces. By definition we call the space H := H 1 ⊕ H 2 the direct<br />

sum space of H 1 and H 2 if the following requirements are satisfied:<br />

(i) The space H 1 ⊕ H 2 contains as its elements all ordered pairs of vectors, written as |ϕ⟩ 1 ⊕ |ψ⟩ 2 ,<br />

with |ϕ⟩ 1 ∈ H 1 and |ψ⟩ 2 ∈ H 2 .<br />

(ii) Addition and scalar multiplication are defined on H 1 ⊕ H 2 , and obey<br />

$$a \left( |\phi\rangle_1 \oplus |\psi\rangle_2 \right) + b \left( |\chi\rangle_1 \oplus |\xi\rangle_2 \right) = \left( a\,|\phi\rangle_1 + b\,|\chi\rangle_1 \right) \oplus \left( a\,|\psi\rangle_2 + b\,|\xi\rangle_2 \right) .$$ (II. 83)<br />

(iii) The inner product is additive,<br />

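$$\left( {}_1\langle\phi| \oplus {}_2\langle\phi| \right) \left( |\psi\rangle_1 \oplus |\psi\rangle_2 \right) = {}_1\langle\phi \,|\, \psi\rangle_1 + {}_2\langle\phi \,|\, \psi\rangle_2 .$$ (II. 84)<br />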

(iv) H 1 ⊕ H 2 is the smallest Hilbert space spanned by the elements of the form |ϕ⟩ 1 ⊕ |ψ⟩ 2 and<br />

their linear combinations.



A few remarks about this definition are in order. (a) According to (II. 83), an arbitrary linear<br />

combination of elements in H 1 ⊕ H 2 is of the form<br />

$$\sum_i a_i \left( |\phi_i\rangle_1 \oplus |\psi_i\rangle_2 \right) = \sum_i a_i \, |\phi_i\rangle_1 \oplus \sum_i a_i \, |\psi_i\rangle_2 .$$ (II. 85)<br />

Consequently, with |ϕ⟩ 1 := ∑ i a i |ϕ i ⟩ 1 ∈ H 1 and |ψ⟩ 2 := ∑ i a i |ψ i ⟩ 2 ∈ H 2 , all elements in H 1 ⊕ H 2<br />

are of the form |ϕ⟩ 1 ⊕ |ψ⟩ 2 . This means that the requirements (i) and (ii) imply that H 1 ⊕ H 2 is closed<br />

under linear combinations.<br />

(b) The subspace of H 1 ⊕ H 2 consisting of all vectors of the form 0 1 ⊕ |ψ⟩ 2 , with 0 1 the null vector<br />

in H 1 and |ψ⟩ 2 ∈ H 2 arbitrary, is isomorphic to H 2 ; likewise for |ϕ⟩ 1 ⊕ 0 2 and H 1 . Moreover, these<br />

two subspaces of H 1 ⊕ H 2 are mutually orthogonal, because<br />

$$\left( {}_1\langle\phi| \oplus 0_2 \right) \left( 0_1 \oplus |\psi\rangle_2 \right) = {}_1\langle\phi \,|\, 0\rangle_1 + {}_2\langle 0 \,|\, \psi\rangle_2 = 0 .$$ (II. 86)<br />

Therefore, every vector |χ⟩ ∈ H 1 ⊕ H 2 can be written uniquely as the direct sum of two orthogonal<br />

terms,<br />

|χ⟩ = |ϕ⟩ 1 ⊕ |ψ⟩ 2 = |ϕ⟩ 1 ⊕ 0 2 + 0 1 ⊕ |ψ⟩ 2 . (II. 87)<br />

Vice versa, suppose that H is an arbitrary Hilbert space, and that H 1 is a subspace of H. Now<br />

let H 2 = H 1 ⊥ be the orthocomplement of H 1 , i.e., H 2 contains all vectors in H which are perpendicular<br />

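to all vectors in H 1 . Then H = H 1 ⊕ H 2 holds, with the identification<br />

$$|\phi\rangle_1 \oplus 0_2 \leftrightarrow |\phi\rangle \in \mathcal{H}_1 ,$$ (II. 88)<br />

$$0_1 \oplus |\psi\rangle_2 \leftrightarrow |\psi\rangle \in \mathcal{H}_2 ,$$ (II. 89)<br />

and<br />

$$|\phi\rangle \oplus |\psi\rangle = |\phi\rangle + |\psi\rangle .$$ (II. 90)<br />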

In this case the direct sum ⊕ is nothing but ordinary addition in H, which was given in (II. 1) as<br />

a general property of H. This means that every Hilbert space can be written as a direct sum of<br />

an arbitrary subspace and its orthocomplement. We also see something that holds generally: the<br />

dimension of H 1 ⊕ H 2 is the sum of the dimensions of H 1 and H 2 ,<br />

dim (H 1 ⊕ H 2 ) = dim H 1 + dim H 2 . (II. 91)<br />
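For finite - dimensional spaces the direct sum can be made concrete by concatenating component lists. The following plain Python sketch assumes H 1 = C 2 and H 2 = C 3 with arbitrarily chosen vectors; the helper names `inner` and `direct_sum` are introduced here for illustration only.

```python
def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(ui.conjugate() * vi for ui, vi in zip(u, v))

def direct_sum(phi, psi):
    """Represent |phi>_1 (+) |psi>_2 by concatenating the components."""
    return list(phi) + list(psi)

# Illustrative vectors in H1 = C^2 and H2 = C^3
phi1, chi1 = [1 + 0j, 1j], [2 + 0j, 0j]
psi2, xi2 = [1 + 0j, 0j, -1j], [0j, 3 + 0j, 1 + 0j]

# Additivity of the inner product, requirement (iii)
lhs = inner(direct_sum(phi1, psi2), direct_sum(chi1, xi2))
rhs = inner(phi1, chi1) + inner(psi2, xi2)

# Orthogonality of the two embedded subspaces, as in (II. 86)
zero1, zero2 = [0j, 0j], [0j, 0j, 0j]
overlap = inner(direct_sum(phi1, zero2), direct_sum(zero1, psi2))

# Dimensions add, as in (II. 91)
dim = len(direct_sum(phi1, psi2))

print(lhs, rhs, overlap, dim)
```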

II. 5. 2<br />

DIRECT PRODUCT<br />

There is another, actually more important, way to construct a new Hilbert space out of two given<br />

spaces. Again, let H 1 and H 2 be two Hilbert spaces. By definition we call the space H :=<br />

H 1 ⊗ H 2 the direct product space if the following requirements have been satisfied.



(i) The space H 1 ⊗ H 2 has as its elements at least all ordered pairs ( |ϕ⟩ 1 , |ψ⟩ 2 ), with |ϕ⟩ 1 ∈ H 1<br />

and |ψ⟩ 2 ∈ H 2 , which we now write as |ϕ⟩ 1 ⊗ |ψ⟩ 2 .<br />

(ii) The addition and scalar multiplication on H 1 ⊗ H 2 satisfy<br />

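$$|\phi\rangle_1 \otimes |\psi\rangle_2 + |\phi\rangle_1 \otimes |\chi\rangle_2 = |\phi\rangle_1 \otimes \left( |\psi\rangle_2 + |\chi\rangle_2 \right)$$ (II. 92)<br />

and<br />

$$a \left( |\phi\rangle_1 \otimes |\psi\rangle_2 \right) = a\,|\phi\rangle_1 \otimes |\psi\rangle_2 = |\phi\rangle_1 \otimes a\,|\psi\rangle_2 .$$ (II. 93)<br />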

(iii) The inner product is multiplicative,<br />

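$$\left( {}_1\langle\phi| \otimes {}_2\langle\chi| \right) \left( |\psi\rangle_1 \otimes |\xi\rangle_2 \right) = {}_1\langle\phi \,|\, \psi\rangle_1 \; {}_2\langle\chi \,|\, \xi\rangle_2 .$$ (II. 94)<br />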

(iv) H 1 ⊗ H 2 is the smallest Hilbert space spanned by vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 ∈ H and<br />

their linear combinations.<br />

If |α 1 ⟩ 1 , . . . , |α N1 ⟩ 1 is an orthonormal basis in H 1 , and |β 1 ⟩ 2 , . . . , |β N2 ⟩ 2 is likewise in H 2 ,<br />

with N 1 = dim H 1 , N 2 = dim H 2 , their direct products, i.e. the vectors of the form |α i ⟩ 1 ⊗ |β j ⟩ 2 ,<br />

provide an orthonormal set of vectors in H 1 ⊗ H 2 . Indeed, using (II. 94),<br />

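$$\left( {}_1\langle\alpha_j| \otimes {}_2\langle\beta_k| \right) \left( |\alpha_m\rangle_1 \otimes |\beta_n\rangle_2 \right) = {}_1\langle\alpha_j \,|\, \alpha_m\rangle_1 \; {}_2\langle\beta_k \,|\, \beta_n\rangle_2 = \delta_{jm} \, \delta_{kn} .$$ (II. 95)<br />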

Because orthonormal vectors are independent, the dimension of H 1 ⊗ H 2 cannot be smaller than<br />

the product of the separate dimensions. But furthermore, according to (iv), all vectors in H 1 ⊗ H 2<br />

are obtainable as linear combinations of vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 , which in turn are linear<br />

combinations of the vectors |α j ⟩ 1 ⊗|β k ⟩ 2 . Therefore, these vectors also span the entire space H 1 ⊗H 2 .<br />

In other words, |α 1 ⟩ 1 ⊗ |β 1 ⟩ 2 , |α 2 ⟩ 1 ⊗ |β 1 ⟩ 2 , . . . , |α N1 ⟩ 1 ⊗ |β N2 ⟩ 2 is also a basis for H 1 ⊗ H 2 . For<br />

the dimension of H 1 ⊗ H 2 we thus find<br />

dim (H 1 ⊗ H 2 ) = dim H 1 · dim H 2 . (II. 96)<br />

Consequently, an arbitrary vector |χ⟩ ∈ H 1 ⊗ H 2 can, in this product basis |α j ⟩ 1 ⊗ |β k ⟩ 2 , be written<br />

as<br />

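$$|\chi\rangle = \sum_{j=1}^{N_1} \sum_{k=1}^{N_2} c_{jk} \, |\alpha_j\rangle_1 \otimes |\beta_k\rangle_2 \quad \text{with} \quad c_{jk} = \left( {}_1\langle\alpha_j| \otimes {}_2\langle\beta_k| \right) |\chi\rangle \in \mathbb{C} .$$ (II. 97)<br />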

For vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 it holds that<br />

$$\sum_{j=1}^{N_1} a_j \, |\alpha_j\rangle_1 \otimes \sum_{k=1}^{N_2} b_k \, |\beta_k\rangle_2 = \sum_{j=1}^{N_1} \sum_{k=1}^{N_2} a_j b_k \, |\alpha_j\rangle_1 \otimes |\beta_k\rangle_2 .$$ (II. 98)<br />



We see that (II. 98) is a special case of (II. 97), that is, where c jk = a j b k . The special vectors which<br />

can be written as (II. 98), i.e., in the form |ϕ⟩ 1 ⊗|ψ⟩ 2 , are called direct product vectors, or factorizable.<br />

In a direct sum space H 1 ⊕H 2 all vectors can be written in the form |ϕ⟩ 1 ⊕|ψ⟩ 2 , but in a direct product<br />

space H 1 ⊗ H 2 not all vectors can be written in the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 . Further on we will see that<br />

states for which c jk cannot be written as a j b k give rise to typical quantum mechanical behavior, as<br />

in the thought experiment of EPR where composite systems are considered, corresponding to states<br />

on H 1 ⊗ H 2 which cannot be factorized. Such states are called non - factorizable or entangled states.<br />
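Whether a given coefficient matrix c jk is of the form a j b k can be tested mechanically. The sketch below treats only the smallest case N 1 = N 2 = 2, where c jk = a j b k holds exactly when the determinant of the 2 × 2 matrix c vanishes (the matrix then has rank at most one); for larger dimensions a rank computation would be needed instead. The function name `is_factorizable` and the example states are assumptions made for illustration.

```python
import math

def is_factorizable(c, tol=1e-12):
    """For a 2x2 coefficient matrix c[j][k]: product vector iff det(c) = 0."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol

s = 1 / math.sqrt(2)

# c_jk = a_j b_k with a = (1, 0) and b = (s, s): a direct product vector
product_state = [[a * b for b in (s, s)] for a in (1.0, 0.0)]

# c_12 = s, c_21 = -s: cannot be written as a_j b_k
entangled_state = [[0.0, s], [-s, 0.0]]

print(is_factorizable(product_state))
print(is_factorizable(entangled_state))
```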

If A and B are operators on H 1 and H 2 , respectively, the direct product operator A ⊗ B is the<br />

operator on H 1 ⊗ H 2 , defined by<br />

$$(A \otimes B) \left( |\phi\rangle_1 \otimes |\psi\rangle_2 \right) := A\,|\phi\rangle_1 \otimes B\,|\psi\rangle_2 .$$ (II. 99)<br />

It follows that, with operators C on H 1 and D on H 2 ,<br />

(A ⊗ B) (C ⊗ D) = (A C) ⊗ (B D). (II. 100)<br />

Similar to vectors, operators on the direct product space H 1 ⊗ H 2 are not always factorizable. The<br />

total momentum operator P 1 + P 2 and the distance operator Q 1 − Q 2 of EPR, with P as defined in<br />

section I. 2, (I. 1), and Q likewise, are examples of such non - factorizable direct product operators,<br />

P 1 ⊗ 11 2 + 11 1 ⊗ P 2 and Q 1 ⊗ 11 2 − 11 1 ⊗ Q 2 . (II. 101)<br />

EXERCISE 12. Calculate the commutator of these operators, given that $[P_i, Q_j] = -i\,\delta_{ij}$.<br />

The following properties of the direct product of operators will, further on, be used frequently:<br />

A ⊗ 0 = 0 ⊗ B = 0 ,<br />

(A 1 + A 2 ) ⊗ B = (A 1 ⊗ B) + (A 2 ⊗ B),<br />

11 ⊗ 11 = 11,<br />

a A ⊗ b B = a b (A ⊗ B), (II. 102)<br />

$(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$,<br />

(A ⊗ B) † = A † ⊗ B † ,<br />

Tr ( bA ⊗ cB ) = b c Tr A · Tr B.<br />

EXERCISE 13. Prove the properties of ⊗ in (II. 102).



Finally, the matrix A ⊗ B of the operator A ⊗ B in the direct product space H 1 ⊗ H 2 is of the<br />

form<br />

$$A \otimes B = \begin{pmatrix}
a_{11} \begin{pmatrix} b_{11} & \cdots & b_{1N_2} \\ \vdots & \ddots & \vdots \\ b_{N_2 1} & \cdots & b_{N_2 N_2} \end{pmatrix} & \cdots & a_{1N_1} B \\
\vdots & a_{22} B & \vdots \\
a_{N_1 1} B & \cdots & a_{N_1 N_1} B
\end{pmatrix} ,$$ (II. 103)<br />

where a ij = ⟨α i | A | α j ⟩ and b kl = B kl = ⟨β k | B | β l ⟩, as in (II. 24). This matrix is called the<br />

Kronecker product of the matrices A and B.<br />
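The block structure of (II. 103) and the product rule (II. 100) can be verified on small integer matrices. The following is a minimal sketch in plain Python; the functions `kron` and `matmul` and the four example matrices are introduced here for illustration, with the basis vectors ordered as |α 1 ⟩ ⊗ |β 1 ⟩, |α 1 ⟩ ⊗ |β 2 ⟩, |α 2 ⟩ ⊗ |β 1 ⟩, . . . so that the blocks a ij B appear as in (II. 103).

```python
def kron(A, B):
    """Kronecker product: the block in position (i, j) is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def matmul(X, Y):
    """Ordinary matrix product X Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

# Illustrative integer matrices, so that all comparisons are exact
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, 2]]

AB = kron(A, B)
lhs = matmul(kron(A, B), kron(C, D))      # (A (x) B)(C (x) D)
rhs = kron(matmul(A, C), matmul(B, D))    # (A C) (x) (B D)

print(AB)
print(lhs == rhs)
print(trace(AB) == trace(A) * trace(B))
```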

II. 6<br />

ADDENDUM: INFINITE - DIMENSIONAL HILBERT SPACES<br />

This section is intended for interested readers, who wish to gain more in - depth knowledge of<br />

Hilbert spaces.<br />

In physical applications of quantum mechanics we nearly always need infinite - dimensional Hilbert<br />

spaces. Indeed, this already applies to the case of a free particle in one spatial dimension.<br />

The mathematical theory of infinite - dimensional Hilbert spaces is in some aspects more difficult<br />

than that of finite - dimensional ones.<br />

II. 6. 1<br />

THE STRUCTURE <strong>OF</strong> VECTOR SPACES<br />

An infinite - dimensional space H is a space where for every n independent vectors in H, with<br />

n arbitrarily large, it is always possible to find still another vector in H that is independent of these<br />

vectors. In rough approximation it can be said that all formulas of the previous sections remain valid<br />

if we replace the sums from 1 to N by sums from 1 to infinity. But, of course, attention must be given<br />

to the convergence of such sums. This leads to two extra assumptions which were superfluous in the<br />

theory of finite - dimensional spaces.<br />

(i) Separability. A Hilbert space H is called separable if it has a countable basis, i.e., a countable<br />

set of independent vectors |ϕ 1 ⟩, |ϕ 2 ⟩, . . . , |ϕ j ⟩, . . . ∈ H exists such that every vector |ϕ⟩ ∈ H<br />

can, analogously to (II. 14), be written as<br />

$$|\phi\rangle = \sum_{j=1}^{\infty} c_j \, |\phi_j\rangle \quad \text{with} \quad c_j = \langle\phi_j \,|\, \phi\rangle .$$ (II. 104)<br />

This equation is shorthand for<br />

$$\lim_{m \to \infty} \Big\| \phi - \sum_{j=1}^{m} c_j \, \phi_j \Big\| = 0 .$$ (II. 105)<br />



(ii) Completeness. We require that the space is complete, which means that every Cauchy sequence,<br />

i.e., a sequence of vectors |ϕ 1 ⟩, |ϕ 2 ⟩, . . . , |ϕ j ⟩, . . . ∈ H, for which<br />

$$\lim_{j, k \to \infty} \| \phi_j - \phi_k \| = 0 ,$$ (II. 106)<br />

has a limit vector |ϕ⟩ in H,<br />

$$\lim_{m \to \infty} \| \phi_m - \phi \| = 0 .$$ (II. 107)<br />

For example, in this sense Q, the set of rational numbers, is incomplete, since many Cauchy<br />

sequences of rational terms exist which have no limit in Q, for instance the series expansions of π<br />

and e. If the limiting points of all Cauchy sequences are added to Q, we obtain exactly R. Q is called<br />

a countably infinite set, R is called an uncountably infinite set.<br />
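The incompleteness of Q can be made tangible with exact rational arithmetic. The sketch below does not use the series for π or e mentioned above but, as a swapped-in and simpler example, the Newton iteration x ↦ (x + 2/x)/2 starting from 1: every term is rational, successive terms approach each other, yet the limit √2 is not in Q. The function name `newton_sqrt2` is hypothetical, and the shrinking gaps shown here illustrate the Cauchy property without proving it.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Rational Newton iterates for sqrt(2), starting from 1."""
    x = Fraction(1)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2      # stays an exact Fraction
        seq.append(x)
    return seq

seq = newton_sqrt2(5)
gaps = [abs(seq[n + 1] - seq[n]) for n in range(len(seq) - 1)]

for x, gap in zip(seq[1:], gaps):
    print(x, float(gap))

# No term squares to 2 exactly: the limit is missing from Q
print(all(x * x != 2 for x in seq))
```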

Below, we will assume Hilbert spaces to be separable and complete.<br />

EXERCISE 14. Prove that every finite - dimensional complex vector space with an inner product<br />

is separable and complete.<br />

The claim in the above exercise makes clear that in the finite - dimensional case the requirements<br />

of separability and completeness are indeed superfluous.<br />

The next two spaces are well - known examples of infinite - dimensional Hilbert spaces.<br />

(i) The space of all complex, square integrable functions,<br />

$$L^2(\mathbb{R}) := \left\{ \psi : \mathbb{R} \to \mathbb{C} \;\Big|\; \int_{\mathbb{R}} |\psi(q)|^2 \, dq < \infty \right\} ,$$ (II. 108)<br />

with an inner product defined as<br />

$$\langle\psi \,|\, \phi\rangle := \int_{\mathbb{R}} \psi^{*}(q) \, \phi(q) \, dq ,$$ (II. 109)<br />

and likewise for L 2 (R n ) with arbitrary n ∈ N + .<br />

(ii) The space of square summable sequences of complex numbers, defined by Erhard Schmidt,<br />

$$l^2(\mathbb{N}) := \left\{ c : \mathbb{N} \to \mathbb{C} \;\Big|\; \sum_{j=0}^{\infty} |c_j|^2 < \infty \right\} ,$$ (II. 110)<br />

with inner product<br />

$$\langle c \,|\, d\rangle := \sum_{j=0}^{\infty} c_j^{*} \, d_j .$$ (II. 111)<br />



The proof that these vector spaces are complete is not simple; the proof that the remaining<br />

requirements for a Hilbert space have been met, however, is.<br />

These two spaces correspond to two versions of quantum mechanics, where L 2 (R) corresponds<br />

to Schrödinger's wave mechanics (1926) and l 2 (N) to the matrix mechanics of Heisenberg, Born, and<br />

Jordan (1925), that is, if we take matrix mechanics in the enriched version of Von Neumann, since<br />

the original version did not contain a ‘state space’. These two versions of quantum mechanics are<br />

mathematically equivalent, see F.A. Muller (1997a, 1997b and 1999) for historical details.<br />

II. 6. 2<br />

OPERATORS<br />

More serious complications occur when introducing operators on infinite - dimensional Hilbert<br />

spaces. First, we will see that such operators are in general ‘unbounded’, which entails that<br />

they cannot be defined on the entire Hilbert space. Consequently, the definition of sum and product<br />

of operators, as well as their adjoints, becomes more cumbersome, and the terms ‘self - adjoint’<br />

and ‘Hermitian’ no longer coincide. Second, these operators do not always have eigenvectors in H.<br />

Therefore it is more difficult to give a useful version of the spectral theorem.<br />

The second problem is independent of the first, i.e., it can also appear for bounded self - adjoint<br />

operators. ▹<br />

For position and momentum both complications occur together which is shown by an example.<br />

EXAMPLE<br />

Consider the position operator<br />

Q : ψ(q) ↦→ q ψ(q), (II. 112)<br />

and the momentum operator<br />

$$P : \psi(q) \mapsto -i \, \frac{d}{dq} \psi(q) ,$$ (II. 113)<br />

both acting on L 2 (R).<br />

The first problem is that these operators do not map every vector in L 2 (R) to another vector<br />

in L 2 (R). For instance, every non - differentiable function in L 2 (R) is outside the domain of P .<br />

Vice versa, taking for Q, for example, $\psi(q) = (a + |q|)^{-3/2}$ with a > 0, we have ψ ∈ L 2 (R),<br />

but Qψ ∉ L 2 (R).<br />

The second problem is that the eigenvalue equation for momentum, $-i \, \frac{d}{dq} \psi(q) = p \, \psi(q)$, has<br />

solutions $\psi(q) \propto e^{i p q}$ for p ∈ R, but these functions are not square integrable and therefore<br />

they are not in L 2 (R). Something similar applies to the eigenvalue equation Qψ(q) = q 0 ψ(q)<br />

and its solutions ψ(q) = δ(q − q 0 ).<br />



II. 6. 2. 1<br />

UNBOUNDED OPERATORS<br />

Let us start with a definition: an operator A on Hilbert space H is called bounded if the set of<br />

non - negative numbers $\|A\chi\| = \sqrt{\langle\chi \,|\, A^{\dagger} A \,|\, \chi\rangle}$ has an upper bound for all unit vectors |χ⟩, where the least<br />

upper bound, or supremum, is called the norm of A,<br />

$$\|A\| = \sup \left\{ \|A\chi\| \in \mathbb{R} \;\Big|\; \|\chi\| = 1 \right\} .$$ (II. 114)<br />

The set of all bounded operators on H is written as B(H).<br />

In finite - dimensional Hilbert spaces all operators are bounded, but this is not the case in infinite -<br />

dimensional Hilbert spaces. As we want to hold on to the requirement that every vector A|ψ⟩ has a<br />

finite norm, we have to exclude from the domain of A the set of vectors |ϕ⟩ for which<br />

$$\frac{\|A\chi\|}{\|\chi\|} \to \infty \quad \text{if} \quad |\chi\rangle \to |\phi\rangle .$$ (II. 115)<br />

Therefore, from now on an operator A is a linear mapping from a subset of H to H. This subset is called<br />

the domain of A, written as Dom A ⊂ H. Hence, an operator is a linear mapping<br />

ψ ∈ Dom A, A : ψ ↦→ A ψ ∈ H. (II. 116)<br />

We will, however, always assume that Dom A is dense in H which means that every vector ϕ in H<br />

can be approximated arbitrarily well by vectors in Dom A. The foregoing implies that also sums and<br />

products of operators are generally defined on a limited domain only,<br />

$$\mathrm{Dom}\,(A + B) = \mathrm{Dom}\,A \cap \mathrm{Dom}\,B ,$$ (II. 117)<br />

$$\mathrm{Dom}\,(A B) = \left\{ \psi \in \mathrm{Dom}\,B : B\psi \in \mathrm{Dom}\,A \right\} .$$ (II. 118)<br />

It is more difficult to introduce the adjoint A † of an operator A. The operator is again called<br />

Hermitian if<br />

⟨ϕ | A | ψ⟩ = ⟨ψ | A | ϕ⟩ ∗ ∀ ϕ, ψ ∈ Dom A, (II. 119)<br />

but this definition is no longer sufficient for our purposes, as can be seen in the next example.<br />

EXAMPLE<br />

Consider the operator P from (II. 113), now acting on $L^2([0, \infty))$, and choose as its domain<br />

$$\mathrm{Dom}\,P = \left\{ \psi : \int_0^{\infty} |\psi(q)|^2 \, dq < \infty, \; \int_0^{\infty} |P\psi(q)|^2 \, dq < \infty, \; \psi(0) = 0 \right\} .$$ (II. 120)<br />

This operator is indeed Hermitian, which can be checked using integration by parts, where the<br />

non - integral term cancels out because of the boundary condition ψ(0) = 0. But the operator is<br />

not self - adjoint, as we will see in the next exercise.



To introduce the adjoint of an operator we first delimit the domain. Let Dom A † be the set of all<br />

vectors |ϕ⟩ such that a vector |η⟩ exists for which<br />

⟨ϕ | A | ψ⟩ = ⟨η | ψ⟩ ∀ |ψ⟩ ∈ Dom A. (II. 121)<br />

Using the assumption that Dom A is dense in H it is possible to show that if such a vector |η⟩ exists<br />

it is also unique. The adjoint A † of operator A is now, by definition, the mapping<br />

A † : |ϕ⟩ ∈ Dom A † ↦→ |η⟩ := A † |ϕ⟩, (II. 122)<br />

and the operator is called self - adjoint if<br />

A = A † and Dom A = Dom A † . (II. 123)<br />

This requirement is stronger than Hermiticity; it can be shown that in general it holds for Hermitian<br />

operators that Dom A ⊂ Dom A † , instead of (II. 123).<br />

EXERCISE 15. Verify that the domain of P † , with P as in the example above, is indeed larger<br />

than the domain of P .<br />

II. 6. 2. 2<br />

CONTINUOUS SPECTRA<br />

Another aspect in which infinite - dimensional Hilbert spaces deviate from finite - dimensional<br />

ones is the possibility for an operator to have a continuous spectrum, a mathematical impossibility<br />

in the finite - dimensional case since the term ‘spectrum’ was defined as the set of eigenvalues of<br />

operators. Examples of operators with continuous spectra are, again, the position operator and the<br />

momentum operator, whose spectra consist of the entire line of real numbers R. Therefore, the<br />

term ‘spectrum’ needs to be redefined. The spectrum of operator A is now defined as the set of all<br />

values λ ∈ C for which the operator A − λ11 has no inverse operator. To illustrate the deviations from<br />

the finite - dimensional case we give two examples, the angle operator and the angular momentum<br />

operator.<br />

EXAMPLE<br />

Consider the Hilbert space $L^2([0, 2\pi])$ and the angle operator<br />

$$Q : \psi(q) \mapsto q \, \psi(q), \quad 0 \le q \le 2\pi .$$ (II. 124)<br />

This operator has, analogous to (II. 112), eigenfunctions which are not in H, its spectrum is the<br />

interval [0, 2π], but it is bounded, ∥Q∥ = 2π.


The angular momentum operator<br />


$$L : \psi(q) \mapsto -i \, \frac{d}{dq} \psi(q) ,$$ (II. 125)<br />

with domain<br />

$$\mathrm{Dom}\,L = \left\{ \psi : \|L\psi\| < \infty, \; \psi(0) = \psi(2\pi) \right\} ,$$ (II. 126)<br />

does have normalized eigenfunctions,<br />

$$\psi(q) = \frac{1}{\sqrt{2\pi}} \, e^{i l q} ,$$ (II. 127)<br />

and a discrete spectrum l ∈ Z. But, since l can be arbitrarily large, it is unbounded.<br />
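The unboundedness of L can be spelled out in one line, using the eigenfunctions (II. 127) and the definition of the norm in (II. 114). The label ψ l for the eigenfunction with eigenvalue l is introduced here only for this sketch.

```latex
L\,\psi_l = -i\,\frac{d}{dq}\,\frac{1}{\sqrt{2\pi}}\,e^{i l q} = l\,\psi_l ,
\qquad
\|L\,\psi_l\| = |l|\,\|\psi_l\| = |l| ,
\qquad
\sup_{\|\chi\| = 1} \|L\,\chi\| \;\ge\; \sup_{l \in \mathbb{Z}} |l| = \infty .
```

Each ψ l is a unit vector in Dom L, so the set of numbers ∥Lχ∥ over unit vectors has no upper bound, in contrast with the angle operator Q of the previous example.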

II. 6. 2. 3<br />

SPECTRAL THEOREM<br />

Von Neumann succeeded in proving the spectral theorem, in the version of II. 3. 1, for infinite -<br />

dimensional Hilbert spaces. We can now formulate this theorem.<br />

SPECTRAL THEOREM:<br />

To every normal operator A, bounded or unbounded, corresponds a unique mapping of<br />

subsets of Spec A to the set P (H) of projectors on H, ∆ ↦→ P A (∆), having the following<br />

properties:<br />

(i) P ∅ = 0<br />

(ii) P C = 11<br />

(iii) $P_{\cup_i \Delta_i} = \sum_i P_{\Delta_i}$ for all ∆ i mutually disjoint. (II. 128)<br />

For the position operator Q we have an explicit expression for the spectral family of eigenprojectors<br />

of Q,<br />

$$P_Q(\Delta) \, \psi(q) = \begin{cases} \psi(q) & \text{if } q \in \Delta \\ 0 & \text{otherwise} \end{cases} ,$$ (II. 129)<br />

hence, P Q (∆) is in fact a multiplication with the characteristic function of ∆. The spectral family of<br />

the momentum operator is obtained by applying a Fourier transform to the aforementioned expression.<br />

The probability of finding upon measurement for the physical quantity A, which corresponds to<br />

the normal operator A if the physical system is in the state ψ ∈ H, a value a ∈ ∆ ⊂ R, is<br />

Prob ψ (A : ∆) = ⟨ψ | P A (∆) | ψ⟩, (II. 130)



which, using (II. 129), yields for the physical quantity position Q<br />

$$\mathrm{Prob}_{\psi}(Q : \Delta) = \langle\psi(q) \,|\, P_Q(\Delta) \,|\, \psi(q)\rangle = \int_{\Delta} |\psi(q)|^2 \, dq .$$ (II. 131)<br />

All empirical statements of quantum mechanics can therefore be expressed in terms of projectors, or,<br />

more precisely, all empirical statements of quantum mechanics concerning physical quantity A can<br />

be expressed in terms of the spectral family of A.<br />
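Since P Q (∆) acts as multiplication with the characteristic function of ∆, the probability of finding the position in ∆ is the integral of |ψ(q)| 2 over ∆. The plain Python sketch below evaluates this for an assumed normalized Gaussian state ψ(q) = π −1/4 exp(−q 2 /2), which is an illustrative choice and not taken from the notes; the function names and the midpoint rule are likewise assumptions of this example. For this state the exact value on [a, b] is (erf b − erf a)/2.

```python
import math

def density(q):
    """|psi(q)|^2 for the assumed Gaussian state psi(q) = pi^(-1/4) exp(-q^2/2)."""
    return math.exp(-q * q) / math.sqrt(math.pi)

def prob_interval(a, b, steps=20000):
    """Midpoint-rule approximation of the integral of |psi|^2 over [a, b]."""
    h = (b - a) / steps
    return h * sum(density(a + (k + 0.5) * h) for k in range(steps))

p_numeric = prob_interval(-1.0, 1.0)
p_exact = (math.erf(1.0) - math.erf(-1.0)) / 2

# Additivity over disjoint intervals, property (iii) of the spectral family
p_left = prob_interval(-1.0, 0.0)
p_right = prob_interval(0.0, 1.0)

# The whole real line carries probability 1 (tails beyond |q| = 10 are negligible)
p_total = prob_interval(-10.0, 10.0)

print(p_numeric, p_exact)
print(p_left + p_right)
print(p_total)
```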

II. 6. 3<br />

DIRAC<br />

Finally we remark that quantum mechanics à la Dirac willingly and knowingly violates Von Neumann’s<br />

postulates by going outside the Hilbert space. Dirac writes (1958, p. 40)<br />

The bra and ket vectors that we now use form a more general space than a Hilbert space.<br />

To make Dirac’s approach mathematically expressible, the French mathematician Laurent Schwartz<br />

developed the theory of distributions, and the Russian mathematician I.M. Gel’fand developed<br />

the theory of rigged Hilbert spaces. Contrary to Schrödinger and Von Neumann, Dirac regarded<br />

wave mechanics as a generalization of matrix mechanics, going from a discrete index to a continuous<br />

index, making a transition from square summable sequences of complex numbers to wave functions,<br />

and from infinite matrices to integral kernels.<br />

II. 6. 4<br />

SUMMARY<br />

A complex Hilbert space is, by definition, a complete, separable complex vector space with an inner<br />

product which is related to the norm by ∥ψ∥ 2 = ⟨ψ | ψ⟩, its dimension is either finite or countably<br />

infinite. Contrary to the infinite - dimensional case, the requirements of separability and completeness<br />

are superfluous in the finite - dimensional case because they are derivable from the other properties of<br />

a Hilbert space, but in the vast majority of physical applications infinite - dimensional Hilbert spaces<br />

and unbounded operators are required.


III<br />

THE POSTULATES<br />

The sciences do not try to explain, they hardly even try to interpret, they mainly make<br />

models. By a model is meant a mathematical construct which, with the addition of certain<br />

verbal interpretations, describes observed phenomena. The justification of such a<br />

mathematical construct is solely and precisely that it is expected to work [. . . ]<br />

— John von Neumann<br />

It would seem that the theory is exclusively concerned about ‘results of measurement’,<br />

and has nothing to say about anything else. [. . . ] To restrict quantum mechanics to be<br />

exclusively about piddling laboratory operations is to betray the great enterprise.<br />

— John Bell<br />

In this chapter we will formulate and discuss Von Neumann’s postulates. Next, we will extend the<br />

quantum mechanical concept of ‘pure’ states by adding ‘mixed’ states, and show how quantum<br />

mechanics treats states of subsystems of composite physical systems. Finally, we apply these<br />

concepts to spin 1/2 particles and we derive some formulas needed in subsequent chapters.<br />

III. 1<br />

VON NEUMANN’S POSTULATES<br />

We are now ready to give, in some cases in simplified fashion, Von Neumann’s postulates of<br />

quantum mechanics, which link the physical concepts of the theory to the mathematical concepts of<br />

its formalism.<br />

1. State postulate, pure states. Every physical system has a corresponding Hilbert space H, the<br />

states of the system are completely described by unit vectors in H. A composite physical<br />

system corresponds to the direct product of the Hilbert spaces of the subsystems.<br />

2. Observables postulate. Every physical quantity A of the system corresponds to a self - adjoint<br />

operator A in H. Dirac called the quantities ‘observables’.<br />

3. Spectrum postulate. The only possible outcomes which can be found upon measurement of a<br />

physical quantity A, corresponding to an operator A, are values from the spectrum of A.<br />

4. Born postulate, discrete case. If the system is in a state |ψ⟩ ∈ H, and a measurement is made<br />

of a physical quantity A, corresponding to an operator A with a discrete spectrum Spec A, the<br />

probability to find the outcome a_i ∈ Spec A is equal to<br />

\[ \mathrm{Prob}_{|\psi\rangle}(a_i) = \langle\psi | P_{a_i} | \psi\rangle, \] (III. 1)



where P_{a_i} is the projector from the spectral decomposition (II. 57) of A.<br />

5. Schrödinger postulate. As long as no measurements are made on the system, the time evolution<br />

of the system is described by a unitary transformation,<br />

\[ |\psi(t)\rangle = U(t, t_0)\, |\psi(t_0)\rangle. \] (III. 2)<br />

6. Projection postulate, discrete case. If the system is in a state |ψ⟩ ∈ H and a measurement is<br />

made on a physical quantity A corresponding to an operator A with discrete spectrum, and the<br />

outcome of the measurement is the eigenvalue a_i ∈ Spec A, the system is, immediately after<br />

the measurement, in the eigenstate<br />

\[ |\psi\rangle \;\longrightarrow\; \frac{P_{a_i} |\psi\rangle}{\| P_{a_i} |\psi\rangle \|}. \] (III. 3)<br />

The first four postulates connect the (undefined) concepts ‘physical system’, ‘state’, ‘quantity’<br />

and ‘measurement’ to mathematical concepts. In the literature the postulates 3 and 4 are sometimes<br />

combined into the so - called measurement postulate. The last two postulates determine the evolution<br />

of the states in time.<br />

Ad 1. The state postulate implies that systems with the same |ψ⟩ are in the same physical state.<br />

The way in which this state vector |ψ⟩ is produced, is thus unimportant. Also the fact that two systems<br />

which are described by the same |ψ⟩ can, upon measurement, have different outcomes, which is<br />

allowed according to the measurement postulate, is no reason to regard their states as being different.<br />

On the other hand, not every pair of mutually different unit vectors also represent different states.<br />

Usually it is assumed that vectors whose only difference is a phase factor e^{iθ}, with θ ∈ R, describe<br />

the same physical state, because they predict the same probability distributions for outcomes of all<br />

possible measurements. Such vectors form a so - called unit ray.<br />

The statement that all unit vectors of H describe physical states also need not be true in general.<br />

Notice that the set of unit vectors is extremely large. Even for a particle in one spatial dimension the<br />

Hilbert space is infinite - dimensional. Furthermore, some types of superposition, linear combinations<br />

of two or more eigenstates, do not occur in nature, for instance superpositions of states with different<br />

charges, i.e., electrical, baryonic etc., or superpositions of states with integer and half-integer spin.<br />

It is possible to prohibit these superpositions in the theory by introducing so - called superselection<br />

rules. The requirement that, for identical particles, only states are allowed which are symmetric or<br />

antisymmetric under permutation of the particles is an example of such a superselection rule. In the<br />

presence of a superselection rule the class of allowed states breaks up into a direct sum of the<br />

eigenspaces of the superselection operator,<br />

\[ H = \bigoplus_{j=1} H_j. \] (III. 4)<br />

Within one such subspace H j , called a coherent sector, superpositions of all states are allowed.



In absence of superselection rules the entire Hilbert space is one coherent sector. Then the superposition<br />

principle is valid in general, which says that for every two states |ψ⟩ and |ϕ⟩ the linear<br />

combination a|ψ⟩ + b|ϕ⟩, with |a|² + |b|² = 1, is a state too. Because nature apparently imposes<br />

superselection rules, which can sometimes be derived from symmetries as was first shown by Wick,<br />

Wightman and Wigner (1952), the superposition principle only applies for coherent sectors. Since<br />

superpositions of vectors from different coherent sectors do not correspond to physical states, the<br />

state postulate has to be accordingly reformulated.<br />

As far as composite physical systems are concerned, we say that the system is in an entangled<br />

state iff the state vector is not factorizable, see section II. 5. In the thought experiment of EPR such an<br />

entangled state plays the principal part. Schrödinger (1935b) was the first to show that the occurrence<br />

of entanglement is widespread in quantum mechanics and he considered this to be the cardinal distinction<br />

between classical mechanics and quantum mechanics. In section III. 2 we will further extend<br />

the notion of state.<br />

Ad 2. The question if every self - adjoint operator represents a physical quantity, has, according<br />

to some authors, a negative answer. Wigner, for instance, asked how to measure the quantity corresponding<br />

to the self - adjoint operator P + Q. Another example is a projector which projects on<br />

superpositions of vectors from different coherent sectors, as we saw in Ad 1.<br />

Also the reverse question, whether every physically meaningful quantity is represented by a self -<br />

adjoint operator, is controversial. For some physical quantities which correspond to experimentally<br />

clear measuring procedures, such as ‘time of decay’ in case of a radioactive atom, or the ‘phase’ of<br />

a harmonic oscillator, no associated self - adjoint operator can be found. In later generalizations of<br />

the formalism of quantum mechanics this problem is somewhat relieved by considering more general<br />

mathematical constructions, the so - called positive operator valued measures, which are also capable<br />

of representing physical quantities; see for example A.S. Holevo (1982) or Busch, Grabowski and<br />

Lahti (1995).<br />

Another question is which operator exactly corresponds to which quantity. Again, no commonly<br />

accepted recipe is available here. Generally, one starts with demanding that certain classical quantities<br />

are represented by special operators. It is standard procedure to choose position and momentum to<br />

be these quantities and to require that the corresponding operators satisfy the canonical commutation<br />

relation of Born and Jordan (1925), and Dirac (1925),<br />

\[ [P, Q] := PQ - QP = -i\, \mathbb{1}. \] (III. 5)<br />

Next, a certain ‘quantization prescription’ is chosen which can be used to construct an operator<br />

corresponding to more general physical quantities. Dirac’s mathematical prescription of replacing<br />

Poisson brackets by commutators is famous. Unfortunately, this prescription is inconsistent. The<br />

alternative prescriptions for quantization which have been presented for this purpose, do not mutually<br />

agree. We will not discuss this problem further.<br />

Ad 4. With P_ψ = |ψ⟩ ⟨ψ|, as defined in (II. 37), P_{a_i} as in (II. 54), and using the relation (II. 56),<br />

the probability of finding a value a_i ∈ Spec A, in a measurement of the physical quantity A with<br />

corresponding operator A, can also be written as<br />

\[ \langle\psi | P_{a_i} | \psi\rangle = \sum_{j=1}^{n_i} \langle\psi | a_i, j\rangle \langle a_i, j | \psi\rangle = \sum_{j=1}^{n_i} |\langle a_i, j | \psi\rangle|^2 = \mathrm{Tr}\, P_{a_i} P_\psi. \] (III. 6)<br />

Likewise, the expectation value of A, with A as defined in (II. 55), is<br />

\[ \langle A\rangle_\psi = \langle\psi | A | \psi\rangle = \sum_{i=1}^{M} \sum_{j=1}^{n_i} \langle\psi | a_i, j\rangle\, a_i\, \langle a_i, j | \psi\rangle = \sum_{i=1}^{M} \sum_{j=1}^{n_i} a_i\, |\langle a_i, j | \psi\rangle|^2 = \mathrm{Tr}\, A P_\psi. \] (III. 7)<br />

In case there is no degeneracy, (III. 6) takes the simpler form<br />

\[ \langle\psi | P_{a_i} | \psi\rangle = |\langle a_i | \psi\rangle|^2 = \mathrm{Tr}\, P_\psi P_{a_i}. \] (III. 8)<br />

We also note that in case A has a continuous spectrum, as discussed in section II. 6, we have (II. 130),<br />

\[ \mathrm{Prob}_{|\psi\rangle}(A : \Delta) = \langle\psi | P_A(\Delta) | \psi\rangle. \] ▹ (III. 9)<br />
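The equality of the two forms of the Born probability in (III. 6) and (III. 8) can be checked numerically. The following is a minimal sketch, not part of the original notes: it assumes a two-dimensional Hilbert space, an observable with the non-degenerate eigenvectors `up` and `down`, and an arbitrarily chosen angle `theta`; all function names are hypothetical helpers.

```python
import math

def inner(u, v):
    """Inner product <u|v> of two vectors given as lists."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def projector(u):
    """Matrix of the 1-dimensional projector |u><u|."""
    return [[a * b.conjugate() for b in u] for a in u]

def trace_product(a, b):
    """Tr(ab) for two square matrices of equal size."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

# Assumed state: |psi> = cos(theta/2)|up> + sin(theta/2)|down>
theta = math.pi / 3
psi = [math.cos(theta / 2), math.sin(theta / 2)]
up, down = [1.0, 0.0], [0.0, 1.0]  # eigenvectors of the assumed observable

P_psi = projector(psi)
# Left: Tr(P_a P_psi); right: |<a|psi>|^2, as in (III. 8)
probs_trace = [trace_product(projector(e), P_psi) for e in (up, down)]
probs_amplitude = [abs(inner(e, psi)) ** 2 for e in (up, down)]

print(probs_trace)
print(probs_amplitude)
```

Both lists agree up to rounding and sum to 1, as required of a probability distribution over Spec A.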

Ad 5. If the system is invariant under translations in time, the unitary evolution operator U (t, t 0 )<br />

depends only on the time difference t − t 0 , and can be written as U (t − t 0 ). The evolution operators<br />

then form a continuous abelian Lie group, the group of translations in time, satisfying the group multiplication<br />

structure U(t) U(t ′ ) = U(t + t ′ ). According to Stone’s theorem (1932),<br />

they can be written as<br />

\[ U(t) = e^{-iHt}, \] (III. 10)<br />

where the unique self-adjoint operator H is the generator of the group. H is called the<br />

Hamiltonian. Therefore, the evolution operator U(t − t_0) from the Schrödinger postulate can be<br />

written as<br />

\[ U(t - t_0) = e^{-iH(t - t_0)}, \] (III. 11)<br />

and the Schrödinger equation is, according to (III. 2),<br />

\[ i \frac{d}{dt} |\psi(t)\rangle = i \frac{d}{dt} e^{-iH(t - t_0)} |\psi(t_0)\rangle = H |\psi(t)\rangle. \] (III. 12)
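The properties of the evolution operator used in Ad 5 can be illustrated for a concrete case. The sketch below is an illustration under stated assumptions, not the general argument: it takes the hypothetical Hamiltonian H = σ_x on a two-dimensional space, for which e^{−iHt} = cos(t) 𝟙 − i sin(t) σ_x because σ_x² = 𝟙, and checks the group property, unitarity, and the Schrödinger equation by a finite difference. The helper names and the step size `h` are assumptions.

```python
import math

def U(t):
    """exp(-i H t) for the assumed H = sigma_x, using sigma_x**2 = identity."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

identity = [[1, 0], [0, 1]]
sigma_x = [[0, 1], [1, 0]]

# Group property U(t) U(t') = U(t + t') and unitarity U U^dagger = 1
group_ok = close(matmul(U(0.3), U(0.5)), U(0.8))
unitary_ok = close(matmul(U(0.3), dagger(U(0.3))), identity)

# Schrodinger equation i dU/dt = H U, checked with a central difference
t, h = 0.3, 1e-5
lhs = [[1j * (U(t + h)[i][j] - U(t - h)[i][j]) / (2 * h) for j in range(2)]
       for i in range(2)]
schrodinger_ok = close(lhs, matmul(sigma_x, U(t)), tol=1e-6)

print(group_ok, unitary_ok, schrodinger_ok)
```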



Ad 6. This is the notorious projection postulate. It introduces a second kind of dynamics in<br />

the theory; a projector is, in general, not unitary and therefore it cannot be written in terms of the<br />

Schrödinger postulate. Some authors do not regard the projection postulate to be a part of quantum<br />

mechanics. The problem is then how to account for the measurement process using the other<br />

postulates, this will be discussed further in chapter VIII.<br />

The version of the projection postulate we gave is a stronger version of Von Neumann’s original<br />

formulation and is due to G. Lüders (1951). Von Neumann only required that the state, directly<br />

after a measurement of A which has a_i as an outcome, is an (arbitrary) eigenstate with eigenvalue a_i.<br />

In Lüders’ version the state directly after the measurement, (III. 3), is the normalized projection of the<br />

original state on the eigenspace of a i . Here the disturbance of the original state is as small as possible,<br />

in the sense that the angle between the original and the final state is as small as possible.<br />

If the operator A is maximal, both versions coincide because in that case P ai is a 1 - dimensional<br />

projector.<br />
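The difference between the two versions of the projection postulate shows up only for degenerate eigenvalues. The following sketch assumes a hypothetical observable A = diag(1, 1, 2) on a three-dimensional space, whose eigenvalue 1 is doubly degenerate, and a hypothetical initial state; it computes the Lüders state (III. 3) and compares its overlap with the original state to that of another eigenstate which Von Neumann's weaker requirement would also allow.

```python
import math

def apply(matrix, vec):
    """Apply a matrix (list of rows) to a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def norm(vec):
    return math.sqrt(sum(abs(v) ** 2 for v in vec))

def overlap(u, v):
    """|<u|v>|, the cosine of the angle between two unit vectors."""
    return abs(sum(a.conjugate() * b for a, b in zip(u, v)))

# Assumed example: A = diag(1, 1, 2); P_1 projects on the eigenspace of eigenvalue 1.
P_1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
psi = [1 / math.sqrt(3)] * 3

projected = apply(P_1, psi)
prob = norm(projected) ** 2                        # Born probability <psi|P_1|psi>
luders = [v / norm(projected) for v in projected]  # normalized projection, (III. 3)

other = [1.0, 0.0, 0.0]  # a different eigenstate with the same eigenvalue

print(prob, overlap(psi, luders), overlap(psi, other))
```

In this example the Lüders state has the larger overlap with the original state, in line with the statement that the disturbance is as small as possible.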

III. 2<br />

PURE AND MIXED STATES<br />

A state vector, a unit vector in H, provides a description of the system which is as complete as the<br />

theory allows. In classical mechanics such a description corresponds, for a system of point particles,<br />

to giving all coordinates of position and momentum; (q, p) := (q 1 , . . . , q n ; p 1 , . . . , p n ) is a point<br />

in the phase space Γ. In practice, the value of these coordinates is often not known precisely and a<br />

probability distribution ρ(q, p) is introduced over the phase space Γ. The integral of ρ(q, p) over ∆<br />

is the probability to find the system in the subset ∆ ⊆ Γ. The probabilities have to be positive and<br />

normalized,<br />

\[ \rho(q, p) \geq 0 \quad\text{and}\quad \int_\Gamma \rho(q, p)\, dq\, dp = 1. \] (III. 13)<br />

In classical physics it is also customary to extend the notion of state and also call a probability<br />

distribution ρ a (generalized) state of the system. A physical quantity A corresponds to a real function<br />

on the phase space, A : Γ → R, and the expectation value of A in the state ρ is<br />

\[ \langle A\rangle_\rho := \int_\Gamma A(q, p)\, \rho(q, p)\, dq\, dp. \] (III. 14)<br />

The states ρ form a convex set, i.e., if ρ_1 and ρ_2 are states on Γ and w_1 and w_2 both are real numbers<br />

satisfying<br />

\[ 0 \leq w_i \leq 1 \quad\text{and}\quad w_1 + w_2 = 1, \] (III. 15)<br />

then<br />

\[ \rho := w_1 \rho_1 + w_2 \rho_2 \] (III. 16)<br />

also satisfies the requirements of (III. 13) and therefore it is also a state on Γ. This convex set of states<br />

is written S (Γ).



A state which cannot be decomposed according to (III. 16) is called a pure state, otherwise it<br />

is called a mixed state. The pure states are the states ρ concentrated on a single point of Γ, the δ -<br />

‘functions’. Generally, the elements of a convex set which cannot be written in the form (III. 16),<br />

with w 1 , w 2 ≠ 0, are called extreme elements of that set, therefore in our case the extreme elements<br />

are the pure states. Every element of a convex set can always be written as a convex sum of extreme<br />

elements. This corresponds to the expansion of ρ in δ - functions,<br />

\[ \rho(q, p) = \int_\Gamma \rho(q', p')\, \delta(q - q')\, \delta(p - p')\, dq'\, dp'. \] (III. 17)<br />

The dynamics of an arbitrary state follows from the Hamiltonian equations of motion of the pure<br />

states, found from the principle of least action. This holds for conservative systems, which<br />

the quantum mechanical systems in these lecture notes are assumed to be. We will come back to the<br />

derivation of the equations in section VI. 5.<br />

The Hamiltonian equations of motion are<br />

\[ \dot{q} = \frac{\partial H}{\partial p} \quad\text{and}\quad \dot{p} = -\frac{\partial H}{\partial q}. \] (III. 18)<br />

To find the equation of motion in terms of ρ we use Liouville’s theorem, which states that for points<br />

moving in phase space according to the Hamiltonian equations of motion the probability<br />

distribution ρ(q, p, t) is constant along the trajectories. Using (III. 18) and the Poisson brackets<br />

\[ \{H, \rho\} := \frac{\partial H}{\partial q} \frac{\partial \rho}{\partial p} - \frac{\partial \rho}{\partial q} \frac{\partial H}{\partial p}, \] (III. 19)<br />

the Liouville equation, the equation of motion for the state ρ,<br />

\[ \frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \frac{\partial \rho}{\partial q}\, \dot{q} + \frac{\partial \rho}{\partial p}\, \dot{p} = 0, \] (III. 20)<br />

equals<br />

\[ \frac{\partial \rho}{\partial t} = \{H, \rho\}. \] (III. 21)<br />
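The step from (III. 20) to (III. 21) is a direct substitution of the Hamiltonian equations (III. 18) into (III. 20). As a short worked sketch, using only the definitions (III. 18) and (III. 19):

```latex
\begin{align*}
0 = \frac{d\rho}{dt}
  &= \frac{\partial \rho}{\partial t}
   + \frac{\partial \rho}{\partial q} \frac{\partial H}{\partial p}
   - \frac{\partial \rho}{\partial p} \frac{\partial H}{\partial q} \\
  &= \frac{\partial \rho}{\partial t} - \{H, \rho\}
  \quad\Longrightarrow\quad
  \frac{\partial \rho}{\partial t} = \{H, \rho\}.
\end{align*}
```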

Now we will consider, analogous to the classical case, a probability distribution of the state vectors<br />

in H. With help of the state ρ we introduce a mapping µ of subsets ∆ of Γ to R,<br />

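\[ \mu(\Delta) := \int_\Delta \rho(q, p)\, dq\, dp \quad\text{with}\quad \Delta \subseteq \Gamma. \] (III. 22)<br />

This mapping µ is additive,<br />

\[ \mu\Big( \bigcup_i \Delta_i \Big) = \sum_i \mu(\Delta_i) \] (III. 23)<br />

for every countable sequence of disjoint ∆_i ⊂ Γ. Furthermore,<br />

\[ 0 \leq \mu(\Delta) \leq 1, \quad \mu(\emptyset) = 0 \quad\text{and}\quad \mu(\Gamma) = 1. \] (III. 24)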



Each mapping which maps a measurable subset of Γ to a number in the interval [0, 1], thereby<br />

satisfying (III. 23) and (III. 24), is called a probability measure. It is not difficult to see that every<br />

probability distribution ρ corresponds univocally to a probability measure and vice versa, this is even<br />

true for δ - functions. Therefore we can also represent a state, in the extended meaning, by a probability<br />

measure on Γ.<br />

Analogous to this reasoning we now aim to let the physical states in quantum mechanics correspond<br />

to probability measures on H. Since we want to preserve the structure of H, we do not consider<br />

arbitrary subsets of H, instead we look at the set P(H) of all subspaces of H generated by orthogonal<br />

projectors, or, equivalently, at the projectors projecting on those subspaces. What we are thus looking<br />

for is a probability measure on P (H), i.e. a mapping<br />

µ : P (H) → [0, 1], (III. 25)<br />

which is additive in the relevant manner; if P_1, P_2, . . . , P_N is a set of pairwise orthogonal projectors,<br />

P_i ⊥ P_j for i ≠ j, the following holds,<br />

\[ \mu\Big( \sum_j P_j \Big) = \sum_j \mu(P_j), \] (III. 26)<br />

and the mapping satisfies<br />

\[ \mu(0) = 0 \quad\text{and}\quad \mu(\mathbb{1}) = 1. \] (III. 27)<br />

In 1957 A.M. Gleason proved the following theorem.<br />

GLEASON’S THEOREM:<br />

Every probability measure µ on P (H) can, under the condition that dim H > 2, be<br />

written as<br />

µ(P ) = Tr P W, (III. 28)<br />

for a certain operator W satisfying the following requirements: 1<br />

(i) W = W†,<br />

(ii) ⟨ψ | W | ψ⟩ ≥ 0 ∀ |ψ⟩ ∈ H,<br />

(iii) Tr W = 1. (III. 29)<br />

The original proof of Gleason’s theorem is extraordinarily difficult. In the appendix of these<br />

lecture notes, p. 183, ff, we prove a simplified version of this theorem for the interested reader.<br />

1 Conditions (i) and (ii) of (III. 29) are not mutually independent in the complex Hilbert space of the formalism of<br />

quantum mechanics, in this space (i) is in fact superfluous. In a complex Hilbert space all positive operators are self -<br />

adjoint, and an operator A is uniquely defined by all matrix elements of the form ⟨ψ | A | ψ⟩. This is, however, not the case<br />

in a real space, where Gleason’s theorem is also valid. In that case (i) and (ii) are independent.



Here we prove that (III. 28) indeed satisfies the requirements (III. 25), (III. 26) and (III. 27) of a<br />

probability measure.<br />

Proof<br />

Requirement (III. 27) is obvious, and verification of (III. 26) can be done with (II. 31). To<br />

prove (III. 25), i.e.<br />

µ(P ) = Tr P W ∈ [0, 1], (III. 30)<br />

we choose an orthonormal basis of eigenvectors of P; P |v_k⟩ = |v_k⟩, P |u_l⟩ = 0. Then<br />

\[ \mathrm{Tr}\, P W = \sum_k \langle v_k | P W | v_k\rangle + \sum_l \langle u_l | P W | u_l\rangle = \sum_k \langle v_k | W | v_k\rangle \geq 0, \] (III. 31)<br />

due to the positivity of the operator W. If P is a projector, then 𝟙 − P is one also, therefore<br />

\[ \mathrm{Tr}\, (\mathbb{1} - P) W \geq 0, \] (III. 32)<br />

such that indeed, with (III. 29) (iii), we see that<br />

\[ 0 \leq \mathrm{Tr}\, P W \leq \mathrm{Tr}\, P W + \mathrm{Tr}\, (\mathbb{1} - P) W = \mathrm{Tr}\, (P + \mathbb{1} - P) W = \mathrm{Tr}\, W = 1. \] □ (III. 33)<br />
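The properties just proved can be checked numerically for a concrete state operator. The sketch below is an illustration only: the matrix `W` (an equal mixture of the hypothetical vectors a = (1, 0, 0) and b = (1, 1, 0)/√2) and the projectors are assumptions chosen for simplicity, and the helper names are hypothetical.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def mu(P, W):
    """The measure mu(P) = Tr(P W) of (III. 28)."""
    return trace(matmul(P, W))

# Assumed state operator: W = 0.5 |a><a| + 0.5 |b><b|,
# with a = (1, 0, 0) and b = (1, 1, 0)/sqrt(2).
W = [[0.75, 0.25, 0.0],
     [0.25, 0.25, 0.0],
     [0.0, 0.0, 0.0]]

# Pairwise orthogonal 1-dimensional projectors and two of their sums
P1 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
P2 = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
P3 = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]
P12 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # P1 + P2
one = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # identity

values = [mu(P, W) for P in (P1, P2, P3)]
print(values, mu(P12, W), mu(one, W))
```

Each value lies in [0, 1], µ(P1 + P2) equals µ(P1) + µ(P2), and µ(𝟙) = 1, as required by (III. 25), (III. 26) and (III. 27).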

An important aspect of Gleason’s theorem is the fact that the probability measure (III. 28) is<br />

continuous in P . For measures representing pure states, this is proved in the appendix on p. 183, ff.<br />

If dim H = 2, discontinuous probability measures exist on P (H). To see this, consider a real H.<br />

The 1 - dimensional subspaces are lines through the origin connecting opposite points on the circle.<br />

Attaching values as in the diagram, figure III. 1,<br />

[Figure III. 1: A discontinuous measure for dim H = 2. Labels in the diagram: P_1, P_2; attached values: 0 and 1.]<br />

we see that, with µ(0) = 0 and µ(𝟙) = 1, for two arbitrary orthogonal projectors we have<br />

\[ \mu(P_1) + \mu(P_2) = 1 = \mu(\mathbb{1}) = \mu(P_1 + P_2). \] (III. 34)



This measure is indeed additive, but we also see that it is not continuous, and consequently, Gleason’s<br />

theorem does not hold for dim H = 2.<br />

The operator W is known as the statistical operator, or as the density matrix, or the state operator.<br />

In analogy with the classical case we extend the notion of state and call W a state of the physical<br />

system. From now on states will be represented by the state operators W .<br />

The state operators W form a set S(H) which is again convex; if W 1 and W 2 are state operators,<br />

then<br />

\[ W = w_1 W_1 + w_2 W_2 \quad\text{with}\quad 0 \leq w_i \leq 1 \quad\text{and}\quad w_1 + w_2 = 1 \] (III. 35)<br />

is again a state operator. The simplest example of a state operator is a 1 - dimensional projector.<br />

A higher - dimensional projector is not a state operator.<br />

EXERCISE 16. Why not?<br />

Before showing how the state operators W represent states, we will prove the next theorem.<br />

THEOREM:<br />

The 1 - dimensional projectors in P(H) are the extreme elements of the convex set S(H)<br />

of all state operators W on H.<br />

Proof<br />

To prove this theorem we first have to show that P ψ cannot be written in the form<br />

\[ P_\psi = w W_1 + (1 - w) W_2, \quad\text{with}\quad 0 < w < 1. \] (III. 36)<br />

Suppose it could be done. Then, using (II. 37), it also has to hold that, for all |ϕ⟩ ⊥ |ψ⟩,<br />

\[ \langle\phi | P_\psi | \phi\rangle = 0 = w\, \langle\phi | W_1 | \phi\rangle + (1 - w)\, \langle\phi | W_2 | \phi\rangle, \] (III. 37)<br />

which implies, since both terms are non-negative,<br />

\[ \langle\phi | W_1 | \phi\rangle = \langle\phi | W_2 | \phi\rangle = 0. \] (III. 38)<br />

Now, a positive operator can always be written as the square of a self-adjoint operator, W_i = A_i²,<br />

yielding that for all |ϕ⟩ ⊥ |ψ⟩<br />

\[ \langle\phi | W_i | \phi\rangle = \langle\phi | A_i^2 | \phi\rangle = \| A_i |\phi\rangle \|^2 = 0 \;\Rightarrow\; A_i |\phi\rangle = 0 \;\Rightarrow\; W_i |\phi\rangle = 0. \] (III. 39)<br />

Therefore, W_1 and W_2 map to the 1 - dimensional space spanned by |ψ⟩. According<br />

to (III. 29), they are both identical to the projector P_ψ,<br />

\[ W_1 = P_\psi = W_2. \] (III. 40)<br />

We thus conclude that P ψ cannot be split up into other state operators.



Now we have to show that the 1 - dimensional projectors are the only extreme elements. A state<br />

operator is self-adjoint and has, according to the spectral theorem, p. 26, a complete orthonormal<br />

set of eigenstates |w_i, j⟩, where j labels the degeneracy, j = 1, . . . , n_i, and which has M ∈ N+<br />

different eigenvalues w_i. We can write an arbitrary W ∈ S (H) as<br />

\[ W = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i W_{i,j}, \] (III. 41)<br />

where<br />

\[ W_{i,j} := |w_i, j\rangle \langle w_i, j|, \quad\text{and}\quad \sum_{i=1}^{M} n_i = \dim H. \] (III. 42)<br />

For w_i it holds that<br />

\[ \sum_{i=1}^{M} n_i w_i = 1 \quad\text{and}\quad 0 \leq w_i \leq 1 \] (III. 43)<br />

because, according to (III. 29) (ii) and (III. 29) (iii),<br />

\[ w_i = \langle w_i, j | W | w_i, j\rangle \geq 0 \quad\text{and}\quad \mathrm{Tr}\, W = \sum_{i=1}^{M} n_i w_i = 1. \] (III. 44)<br />

Thus we see that the sum (III. 41) is a convex decomposition of W.<br />

A convex decomposition W = w 1 W 1 + w 2 W 2 can always be decomposed further through<br />

expansion of W 1 and W 2 . In case of a bounded convex set the expansion ends on extreme<br />

elements. Therefore, if W is an extreme element, the sum has to reduce to one term. In that case<br />

W is a 1 - dimensional projector, and we see that all extreme elements of S(H) are 1 - dimensional<br />

projectors. □<br />

Physical states which are represented by 1 - dimensional projectors are called pure states, whereas<br />

states which can be divided non - trivially are called mixed states or mixtures. To see that pure states<br />

correspond to the vector states of H, consider W to be the 1 - dimensional projector P ψ projecting<br />

on the vector |ψ⟩. The state defined by this state operator through (III. 28) behaves exactly like the<br />

vector state |ψ⟩; for arbitrary |ϕ⟩ it holds that<br />

\[ \mu_W(P_\phi) = \mathrm{Tr}\, P_\phi W = \mathrm{Tr}\, P_\phi P_\psi = \langle\psi | P_\phi | \psi\rangle = |\langle\psi | \phi\rangle|^2, \] (III. 45)<br />

which means that the probability to find the state |ϕ⟩ in the state |ψ⟩ is equal to (III. 6). 2 It holds<br />

especially that µ(P ψ ) = 1, and if |ϕ⟩ ⊥ |ψ⟩, then µ(P ϕ ) = 0. We see that the state P ψ assigns a<br />

2 ‘The probability to find the state |ϕ⟩’ is shorthand for the probability to find, upon measurement of the quantity corresponding<br />

to the projector |ϕ⟩ ⟨ϕ|, the value 1.



probability to the orthogonal set of vectors from which |ψ⟩ is an element, which is totally concentrated<br />

on the vector |ψ⟩. In this sense P ψ is analogous to a δ - distribution on the classical phase space.<br />

But the 1 - dimensional projectors are, generally, not mutually orthogonal which means that the<br />

pure state P_ψ also assigns a positive probability to P_ϕ if ⟨ϕ | ψ⟩ ≠ 0. This is in contrast to the<br />

classical case, where the pure state, which is concentrated on (p_0, q_0), i.e., δ(q − q_0, p − p_0), always<br />

assigns a zero probability to every other pure state. This is characteristic for quantum mechanics and<br />

is the cause for the radical difference between quantum states and classical states.<br />

In this section we showed that a unique correspondence exists between the pure states, the extreme<br />

elements of the convex set S (H) of state operators, the 1 - dimensional projectors and, up to a phase<br />

factor, the unit vectors in H. We will conclude this section with a formulation of the extended version<br />

of the state postulate (1) and the generalization of the Born postulate (4).<br />

1 ′ State postulate, mixed and pure states. Every physical system has a corresponding Hilbert<br />

space. The mixed physical states of the system uniquely correspond to the state operators<br />

within S (H), the pure physical states of the system uniquely correspond to the state operators<br />

on the boundary ∂ S (H). States of composite physical systems correspond bijectively to state<br />

operators on the direct product space of the state spaces H 1 and H 2 of the subsystems, i.e., with<br />

elements of S (H 1 ⊗ H 2 ).<br />

◃ 3<br />

4 ′ Generalized Born postulate, discrete case. If the system is in the state W ∈ S (H), the probability<br />

to find, upon measurement of quantity A corresponding to an operator A having a discrete<br />

spectrum, an eigenvalue in ∆ ⊆ Spec A, is equal to<br />

\[ \mathrm{Prob}_W(A : \Delta) = \mathrm{Tr}\, P_A(\Delta) W, \] (III. 46)<br />

where P_A(∆) ∈ P (H) projects on the subspace spanned by the eigenvectors having their eigenvalues<br />

in ∆.<br />

III. 3<br />

THE INTERPRETATION OF MIXED STATES<br />

The spectral decomposition (III. 41) suggests an interpretation of the state W . As we saw in (III. 45),<br />

a pure state W = P ψ corresponds to a probability measure µ, which we call concentrated on the<br />

eigenvector |ψ⟩ since µ(P ψ ) = 1. In the same way an arbitrary W corresponds, according to (III. 41),<br />

to a probability measure on its orthonormal set of eigenvectors |w i , j⟩, assigning a probability w i to<br />

the eigenvector |w_i, j⟩. With the projector W_{i,j} as in (III. 42), we have<br />

\[ \mu_W(W_{i,j}) = \mathrm{Tr}\, W_{i,j} W = \mathrm{Tr} \sum_{k=1}^{M} \sum_{l=1}^{n_k} W_{i,j}\, w_k\, |w_k, l\rangle \langle w_k, l| = \sum_{k=1}^{M} \sum_{l=1}^{n_k} w_k\, |\langle w_k, l | w_i, j\rangle|^2 = \sum_{k=1}^{M} \sum_{l=1}^{n_k} w_k\, \delta_{ik}\, \delta_{jl} = w_i. \] (III. 47)<br />

3 Notice how in this extended version of the state postulate the annoying phase factor has disappeared.



The expectation value of the operator A is, according to (III. 7) with P_ψ replaced by W, whose eigenvectors also form<br />

an orthonormal basis,<br />

\[ \langle A\rangle_W = \mathrm{Tr}\, A W, \] (III. 48)<br />

which yields, using again (III. 7) and the spectral decomposition of W , (III. 41),<br />

\[ \langle A\rangle_W = \mathrm{Tr} \sum_{i=1}^{M} \sum_{j=1}^{n_i} A\, w_i W_{i,j} = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, \mathrm{Tr}\, A W_{i,j} = \sum_{i=1}^{M} w_i \sum_{j=1}^{n_i} \langle w_i, j | A | w_i, j\rangle. \] (III. 49)<br />

This is exactly the sum of the expectation values of A in the states |w_i, j⟩, weighted by w_i.<br />

The above suggests that W describes an ensemble of physical systems each of which is in one<br />

of the pure states |w i , j⟩ and that w i is the fraction of systems in |w i , j⟩. This is the way Von Neumann<br />

originally introduced state operators, in analogy to ensembles in classical statistical mechanics,<br />

hence his terminology statistical operator. But this attractive interpretation, known as the ignorance<br />

interpretation of mixtures, is not without problems as we will show now.<br />

In case of degeneracy the choice of the basis vectors in (III. 41) is not unique, and the projector P_i<br />

on the subspace corresponding to the eigenvalue w_i can be written in terms of basis states in arbitrarily<br />

many ways,<br />

\[ \sum_{j=1}^{n_i} |w_i, j\rangle \langle w_i, j| = \sum_{k=1}^{n_i} |w_i, k\rangle' \, {}'\langle w_i, k|, \] (III. 50)<br />

with { |w_i, k⟩′ } another arbitrary orthonormal basis in this subspace. Therefore, given any W we<br />

cannot say of which vector states the ensemble is composed. To see that this is a general phenomenon,<br />

consider the operator<br />

\[ W = \sum_{k=1}^{K} p_k U_k = \sum_{k=1}^{K} p_k\, |u_k\rangle \langle u_k|. \] (III. 51)<br />

Here K ∈ N+ is arbitrary and {|u_k⟩} is an arbitrary set of unit vectors which are, in general,<br />

not orthogonal, but as long as the p_k satisfy 0 ≤ p_k ≤ 1 and ∑ p_k = 1, as required in (III. 35), the<br />

operator W in (III. 51) is still a state operator.<br />

Indeed, equation (III. 51) is an alternative decomposition of W into extreme elements, just like<br />

the spectral decomposition. We see that, in contrast to the classical case, convex decompositions are<br />

not unique.<br />

According to the ignorance interpretation, W describes the ensemble as consisting of systems of<br />

which a fraction p k is in the state |u k ⟩, e.g.<br />

\[ \langle A\rangle_W = \mathrm{Tr}\, A W = \sum_{k=1}^{K} p_k\, \langle u_k | A | u_k\rangle, \] (III. 52)<br />

but the probability to find the system in |u_k⟩ is<br />

\[ \mu_W(U_k) = \mathrm{Tr}\, U_k W = \mathrm{Tr} \sum_{m=1}^{K} U_k\, p_m\, |u_m\rangle \langle u_m| = \sum_{m=1}^{K} p_m\, |\langle u_k | u_m\rangle|^2. \] (III. 53)<br />

Although the result (III. 52) is in accordance with the behavior of an ensemble of systems being in<br />

the state |u k ⟩ with probability p k , we see that for (III. 53), contrary to (III. 47), the outcome, i.e. the<br />

probability to find in (III. 51) the state |u k ⟩, is in general not p k , which is a consequence of the non -<br />

orthogonality of the states |u k ⟩. On the other hand, (III. 51) can always be written in the form (III. 41),<br />

in terms of the orthonormal set of eigenvectors of W , which leads to the conclusion that ensembles<br />

which are interpreted as being physically completely different, are described by the same operator W .<br />
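Both points of this paragraph can be made concrete for a spin 1/2 system. The following is a minimal sketch under stated assumptions: the vectors `up_z`, `down_z`, `up_x`, `down_x` stand for spin up and down along z and x in the usual two-component representation, the weights are chosen as 1/2, and the helper names are hypothetical. It shows two different ensembles with the same state operator, and a non-orthogonal ensemble for which (III. 53) does not return p_k.

```python
import math

def projector(u):
    """Matrix of the 1-dimensional projector |u><u|."""
    return [[a * b.conjugate() for b in u] for a in u]

def mix(weights, vectors):
    """Return W = sum_k p_k |u_k><u_k| as in (III. 51)."""
    n = len(vectors[0])
    W = [[0.0] * n for _ in range(n)]
    for p, u in zip(weights, vectors):
        U = projector(u)
        for i in range(n):
            for j in range(n):
                W[i][j] += p * U[i][j]
    return W

def trace_product(a, b):
    """Tr(ab) for two square matrices of equal size."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

s = 1 / math.sqrt(2)
up_z, down_z = [1.0, 0.0], [0.0, 1.0]
up_x, down_x = [s, s], [s, -s]

# Two physically different preparations with the same state operator
W_z = mix([0.5, 0.5], [up_z, down_z])
W_x = mix([0.5, 0.5], [up_x, down_x])

# A non-orthogonal ensemble: p_1 = p_2 = 1/2 with |u_1> = up_z, |u_2> = up_x
W_mixed = mix([0.5, 0.5], [up_z, up_x])
prob_up_z = trace_product(projector(up_z), W_mixed)   # (III. 53) with k = 1

print(W_z, W_x)
print(prob_up_z)
```

Here `W_z` and `W_x` agree up to rounding, while `prob_up_z` exceeds the weight 1/2 assigned to `up_z`, because `up_x` has a non-zero overlap with `up_z`.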

This can be compared with the fact that a pure state |ψ⟩ can be written in numerous ways as a<br />

superposition of other pure states, which corresponds to different ways of preparation of |ψ⟩ by superposition<br />

of other states, for instance in a tilted Stern - Gerlach apparatus in case of measurement of<br />

spin. We can no longer see if |ψ⟩ is, for example, a superposition of spin up and down in the z - direction,<br />

or of spin up and down in the x - direction.<br />

For pure states this seems completely natural; it is a direct consequence of the state postulate,<br />

according to which the states form a vector space. In case of mixed states the situation is less clear. It can be<br />

maintained that an ensemble, of which each system is in the state |u k ⟩ with probability p k , really<br />

differs from an ensemble of systems which are in a state |w i , j⟩ with probability w i , even though the<br />

expectation values of all physical quantities are equal for both ensembles. In that case, from<br />

\[ W = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, |w_i, j\rangle \langle w_i, j| = \sum_{k=1}^{K} p_k\, |u_k\rangle \langle u_k| \] (III. 54)<br />

it has to be concluded that the state operator W characterizes these ensembles incompletely. There is<br />

no postulate in quantum mechanics by which this is prohibited.<br />

Another view is, however, that the state operator is a complete description of a state, so that the different<br />

possible ways of preparation are not retrievable from the state W . Consequently, the conclusion has<br />

to be that W , in (III. 51), does not characterize an ensemble which consists of a mixture of systems in<br />

pure states |u k ⟩; rather, an ensemble characterized by W only presents itself as such an ensemble upon<br />

measurement. Here we see again that, in quantum mechanics, we get in trouble if we speak in terms<br />

of what really exists. In section III. 5 we will return to this discussion in the context of improperly<br />

mixed states.<br />

The dynamics of mixed states follows, as in the classical case, from the pure states. Define,<br />

analogously to (III. 41),<br />

$$ W(t) := \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, W_{i,j}(t). \tag{III. 55} $$<br />

According to the Schrödinger postulate, (III. 2),<br />

$$ |w_i, j, t\rangle := U(t - t_0)\, |w_i, j, t_0\rangle, \tag{III. 56} $$<br />



which yields for (III. 55)<br />

$$ W(t) = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, U(t - t_0)\, W_{i,j}(t_0)\, U^\dagger(t - t_0), \tag{III. 57} $$<br />

and therefore<br />

$$ W(t) = U(t - t_0)\, W(t_0)\, U^\dagger(t - t_0). \tag{III. 58} $$<br />

With (III. 11) we find<br />

$$ i\hbar\, \frac{d}{dt} W(t) = [H, W(t)], \tag{III. 59} $$<br />

which is the analogue of the Liouville equation of motion, (III. 21), describing the time evolution of<br />

the states ρ. Equation (III. 59) is called the Liouville - Von Neumann equation; it is the generalization<br />

of the Schrödinger equation to an equation for mixed states.<br />
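The Liouville - Von Neumann equation can be checked numerically for a small system. The sketch below assumes that the evolution operator has the usual form U(t) = exp(−iHt/ħ) with t 0 = 0, uses units in which ħ = 1, and picks an arbitrary 2 × 2 Hamiltonian and initial state operator; all names and numbers are illustrative.<br />

```python
import numpy as np

hbar = 1.0  # units with hbar = 1 (assumption)

# Arbitrary Hermitian Hamiltonian and initial state operator (illustrative)
H = np.array([[1.0, 0.5], [0.5, -1.0]])
W0 = np.array([[0.7, 0.2], [0.2, 0.3]])

def evolution_operator(t):
    """U(t) = exp(-i H t / hbar), built from the eigendecomposition of H."""
    energies, vecs = np.linalg.eigh(H)
    return vecs @ np.diag(np.exp(-1j * energies * t / hbar)) @ vecs.conj().T

def W(t):
    """W(t) = U(t) W(0) U(t)^dagger, i.e. (III. 58) with t_0 = 0."""
    U = evolution_operator(t)
    return U @ W0 @ U.conj().T

# Compare i*hbar*dW/dt (central difference) with the commutator [H, W(t)]
t, dt = 0.8, 1e-5
lhs = 1j * hbar * (W(t + dt) - W(t - dt)) / (2 * dt)
rhs = H @ W(t) - W(t) @ H
print(np.max(np.abs(lhs - rhs)))  # small, limited by the finite-difference step
```

The commutator is evaluated with W at the same time t as the derivative, not at t 0 .<br />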

The extensions of the Schrödinger postulate and the projection postulate for mixed states can now<br />

be formulated.<br />

5 ′ Generalized Schrödinger postulate. If no measurements are made on the physical system, the<br />

time evolution of the state of the system is described by a unitary transformation,<br />

$$ W(t) = U(t - t_0)\, W(t_0)\, U^\dagger(t - t_0). \tag{III. 60} $$<br />

6 ′ Generalized projection postulate, discrete case. If the system is in a state W when a measurement<br />

is made on a physical quantity A corresponding to an operator A having a discrete spectrum,<br />

and the outcome of the measurement is the eigenvalue a i ∈ R, the system is, directly<br />

after the measurement, in the eigenspace corresponding to the eigenvalue a i ,<br />

$$ W \longrightarrow \frac{P_{a_i} W P_{a_i}}{\mathrm{Tr}\, P_{a_i} W P_{a_i}}. \tag{III. 61} $$<br />

◃ Remark<br />

Remember that, in general, the projectors P ai do not have to be 1 - dimensional. ▹<br />
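A minimal sketch of the state change (III. 61), with a 2 - dimensional projector to match the remark above. The function name `project_state` and the 3 - level example state are hypothetical choices for illustration.<br />

```python
import numpy as np

def project_state(W, P):
    """Return (P W P / Tr(P W P), Tr(P W P)) for a projector P."""
    PWP = P @ W @ P
    prob = np.trace(PWP).real
    if prob <= 0.0:
        raise ValueError("outcome has zero probability in state W")
    return PWP / prob, prob

# Illustrative 3-level state with a coherence between levels 0 and 2
W = np.array([[0.5, 0.0, 0.1],
              [0.0, 0.3, 0.0],
              [0.1, 0.0, 0.2]])
# 2-dimensional eigenprojector: the measured eigenvalue is doubly degenerate
P = np.diag([1.0, 1.0, 0.0])

W_after, prob = project_state(W, P)
print(prob)
print(W_after)
```

The normalization by Tr P W P keeps the trace of the new state operator equal to 1; the same trace is the probability of the outcome.<br />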

Finally, we give a theorem concerning the generalized Schrödinger postulate which is important<br />

for the measurement problem.<br />

VON NEUMANN’S THEOREM A:<br />

The properties ‘pure’ and ‘mixed’ are invariant under a unitary time evolution.



Proof<br />

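We know that if W is pure, i.e. equal to a 1-dimensional projector, then $W^2 = W$.<br />

Now consider the expression (sometimes called the purity of W ):<br />

$$ \mathrm{Tr}\, W^2 = \sum_{i,j} w_i^2, \tag{III. 62} $$<br />

where the sum runs over i = 1, . . . , M and j = 1, . . . , n i as in (III. 55). Since $\mathrm{Tr}\, W = 1$ means $\sum_{i,j} w_i = 1$, and W is pure iff exactly one term in this sum is equal to 1 and all<br />

others vanish, we conclude that<br />

$$ \mathrm{Tr}\, W^2 = 1 \ \text{iff } W \text{ is pure}; \qquad \mathrm{Tr}\, W^2 < 1 \ \text{iff } W \text{ is mixed}. \tag{III. 63} $$<br />

But $\mathrm{Tr}\, W^2$ is invariant under the time evolution (III. 60). Indeed, if we remember that $U^\dagger(t - t_0) = U^{-1}(t - t_0)$<br />

and that Tr AB = Tr BA, it follows that<br />

$$ \begin{aligned} \mathrm{Tr}\, (W(t))^2 &= \mathrm{Tr}\, U(t - t_0)\, W(t_0)\, U^\dagger(t - t_0)\, U(t - t_0)\, W(t_0)\, U^\dagger(t - t_0) \\ &= \mathrm{Tr}\, U(t - t_0)\, W(t_0)\, W(t_0)\, U^\dagger(t - t_0) \\ &= \mathrm{Tr}\, U^\dagger(t - t_0)\, U(t - t_0)\, (W(t_0))^2 = \mathrm{Tr}\, (W(t_0))^2. \end{aligned} \tag{III. 64} $$<br />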

□<br />
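The invariance of the purity can also be observed numerically. The sketch below is only an illustration under assumed inputs: the unitary is a random one obtained from a QR decomposition rather than an actual time evolution, and the two diagonal state operators are arbitrary examples.<br />

```python
import numpy as np

def purity(W):
    """Tr W^2: equal to 1 for pure states, smaller than 1 for mixed states."""
    return np.trace(W @ W).real

# A random unitary: the Q factor of the QR decomposition of a complex matrix
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
U, _ = np.linalg.qr(A)

W_pure = np.diag([1.0, 0.0, 0.0]).astype(complex)
W_mixed = np.diag([0.5, 0.3, 0.2]).astype(complex)

for W0 in (W_pure, W_mixed):
    Wt = U @ W0 @ U.conj().T
    print(purity(W0), purity(Wt))  # the two values agree up to rounding
```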

III. 4 COMPOSITE SYSTEMS<br />

Suppose that a system S is composed of two subsystems S I and S II . The Hilbert spaces associated<br />

with S I and S II are H I and H II , with dim H I = N I and dim H II = N II ; the Hilbert space<br />

associated with S is the direct product space H = H I ⊗ H II , with dim H = N = N I N II . If $|\alpha_1\rangle, \ldots, |\alpha_{N_I}\rangle$<br />

and $|\beta_1\rangle, \ldots, |\beta_{N_{II}}\rangle$ are bases of the spaces H I and H II , {|α i ⟩ ⊗ |β j ⟩} forms a basis in H. An<br />

arbitrary vector in H is a superposition of such direct products of basis vectors and is generally not of<br />

the form |ψ⟩ ⊗ |ϕ⟩, with |ψ⟩ ∈ H I and |ϕ⟩ ∈ H II . Consequently, one cannot say for such an arbitrary<br />

state in H that the subsystems are in some pure state in H I or H II .<br />

This entanglement of the subsystems, when |Ψ⟩ ̸= |ψ⟩ ⊗ |ϕ⟩, with |Ψ⟩ ∈ H, which is characteristic<br />

for quantum mechanics, has no analogue in classical mechanics. It is a consequence of the formal<br />

requirement that the state space of a composite system is also a vector space. Entanglement is the aspect<br />

of the quantum mechanical description that gives rise to the EPR - paradox and the measurement<br />

problem as we shall see in later chapters.<br />

The quantities of system S correspond to self - adjoint operators in H. We make the supposition<br />

that quantities of the subsystem S I correspond to operators of the form $A \otimes \mathbb{1}$ in H, where A is<br />

a self - adjoint operator in H I , and quantities of S II correspond analogously to operators of the<br />

form $\mathbb{1} \otimes B$, with B in H II . A state of S is given by a state operator W in H; W ∈ S (H). In<br />

general, W is not a direct product of operators, but in case W can be written as a direct product, we<br />

write W = W 1 ⊗ W 2 , with W 1 and W 2 state operators in H I and H II , respectively.



EXERCISE 17. Prove the following statements.<br />

(a) W = W 1 ⊗ W 2 is a state operator if W 1 and W 2 are state operators.<br />

(b) The converse of (a) is not true; give a counterexample.<br />

(c) W = W 1 ⊗ W 2 is pure iff both W 1 and W 2 are pure.<br />

EXERCISE 18. Prove that for all vectors |ψ⟩, |ψ ′ ⟩ ∈ H I and |ϕ⟩, |ϕ ′ ⟩ ∈ H II we have<br />

$$ \big( |\psi\rangle \otimes |\phi\rangle \big) \big( \langle \psi'| \otimes \langle \phi'| \big) = |\psi\rangle \langle \psi'| \otimes |\phi\rangle \langle \phi'|. \tag{III. 65} $$<br />

THEOREM:<br />

If W is a direct product of operators, W = W 1 ⊗ W 2 , the subsystems are mutually<br />

independent, i.e., the probability to find for $A \otimes \mathbb{1}$ the value a i and for $\mathbb{1} \otimes B$ the value b j<br />

is equal to the product of the separate probabilities. In this case the expectation values<br />

factorize too, such that $\langle A \otimes B \rangle_{W_1 \otimes W_2} = \langle A \rangle_{W_1} \langle B \rangle_{W_2}$.<br />

Proof<br />

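Let a i and b j be eigenvalues of A and B, respectively. Using (III. 65) we see that the projector on<br />

the eigenstate |a i ⟩ ⊗ |b j ⟩ of A ⊗ B is $P_{|a_i\rangle} \otimes P_{|b_j\rangle}$. Therefore, with (II. 102), p. 33,<br />

$$ \begin{aligned} \mu_W\big( P_{|a_i\rangle} \otimes P_{|b_j\rangle} \big) &= \mathrm{Tr}\, \big( P_{|a_i\rangle} \otimes P_{|b_j\rangle} \big) \big( W_1 \otimes W_2 \big) \\ &= \mathrm{Tr}\, \big( P_{|a_i\rangle} W_1 \otimes P_{|b_j\rangle} W_2 \big) \\ &= \mathrm{Tr}\, P_{|a_i\rangle} W_1 \; \mathrm{Tr}\, P_{|b_j\rangle} W_2 \\ &= \mu_{W_1}\big( P_{|a_i\rangle} \big)\, \mu_{W_2}\big( P_{|b_j\rangle} \big) \\ &= \mu_W\big( P_{|a_i\rangle} \otimes \mathbb{1} \big)\, \mu_W\big( \mathbb{1} \otimes P_{|b_j\rangle} \big), \end{aligned} \tag{III. 66} $$<br />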

which proves the first part of the theorem.<br />

For the factorization of the expectation values, we have, analogously,<br />

$$ \langle A \otimes B \rangle_{W_1 \otimes W_2} = \mathrm{Tr}\, (A \otimes B)(W_1 \otimes W_2) = \mathrm{Tr}\, A W_1 \; \mathrm{Tr}\, B W_2 = \langle A \rangle_{W_1} \langle B \rangle_{W_2}, \tag{III. 67} $$<br />

and we see that the expectation values indeed factorize. □<br />

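From (III. 67) we also see that, if W = W 1 ⊗ W 2 , then $\langle A \otimes \mathbb{1} \rangle_W = \mathrm{Tr}\, A W_1 = \langle A \rangle_{W_1}$<br />

and $\langle \mathbb{1} \otimes B \rangle_W = \langle B \rangle_{W_2}$, but this does not hold for more general statistical operators W .<br />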
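The factorization of expectation values for a product state can be reproduced with Kronecker products. In this sketch the operators A = σ z and B = σ x and the two state operators are arbitrary illustrative choices, and `expectation` is a hypothetical helper for Tr AW.<br />

```python
import numpy as np

sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])

# Illustrative state operators of the two subsystems
W1 = np.diag([0.7, 0.3])
W2 = np.array([[0.5, 0.5], [0.5, 0.5]])
W = np.kron(W1, W2)  # W = W1 (x) W2

def expectation(A, W):
    """Expectation value Tr(A W)."""
    return np.trace(A @ W).real

joint = expectation(np.kron(sigma_z, sigma_x), W)
product = expectation(sigma_z, W1) * expectation(sigma_x, W2)
single = expectation(np.kron(sigma_z, np.eye(2)), W)

print(joint, product)                    # equal for a product state
print(single, expectation(sigma_z, W1))  # <A (x) 1>_W equals <A>_W1
```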



With (II. 99), for an arbitrary state operator W , hence in general W ≠ W 1 ⊗ W 2 , the expectation<br />

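value of $A \otimes \mathbb{1}$ is<br />

$$ \begin{aligned} \langle A \otimes \mathbb{1} \rangle_W &= \mathrm{Tr}\, (A \otimes \mathbb{1}) W \\ &= \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \big( \langle \alpha_i | \otimes \langle \beta_j | \big) \big( A \otimes \mathbb{1} \big) W \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big) \\ &= \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \sum_{j=1}^{N_{II}} \big( \langle \alpha_i | \otimes \langle \beta_j | \big) \big( A\, |\alpha_k\rangle \langle \alpha_k | \otimes \mathbb{1} \big) W \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big) \\ &= \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \langle \alpha_i | A | \alpha_k \rangle \sum_{j=1}^{N_{II}} \big( \langle \alpha_k | \otimes \langle \beta_j | \big) W \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big). \end{aligned} \tag{III. 68} $$<br />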

To find $\langle A \otimes \mathbb{1} \rangle_W$, define the operator W I in H I , called the partial trace of W in relation to H II ,<br />

$$ W_I = \mathrm{Tr}_{II}\, W := \sum_{j=1}^{N_{II}} \langle \beta_j | W | \beta_j \rangle, \qquad W_I \in S(\mathcal{H}_I). \tag{III. 69} $$<br />

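For this partial trace it holds that<br />

$$ \langle \alpha_k | W_I | \alpha_i \rangle = \sum_{j=1}^{N_{II}} \big( \langle \alpha_k | \otimes \langle \beta_j | \big) W \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big), \qquad \langle \alpha_k | W_I | \alpha_i \rangle \in \mathbb{C}, \tag{III. 70} $$<br />

and substituting (III. 70) in (III. 68) yields<br />

$$ \langle A \otimes \mathbb{1} \rangle_W = \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \langle \alpha_i | A | \alpha_k \rangle \langle \alpha_k | W_I | \alpha_i \rangle = \mathrm{Tr}\, A W_I = \langle A \rangle_{W_I}. \tag{III. 71} $$<br />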

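Analogously, with W II the partial trace of W in relation to H I ,<br />

$$ W_{II} = \mathrm{Tr}_{I}\, W := \sum_{i=1}^{N_I} \langle \alpha_i | W | \alpha_i \rangle, \qquad W_{II} \in S(\mathcal{H}_{II}), \tag{III. 72} $$<br />

we see that<br />

$$ \langle \mathbb{1} \otimes B \rangle_W = \mathrm{Tr}\, B W_{II} = \langle B \rangle_{W_{II}}. \tag{III. 73} $$<br />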
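The partial traces (III. 69) and (III. 72) are straightforward to compute once W is stored as a matrix in a product basis. The sketch below assumes the ordering of basis vectors used by `np.kron`; the function name `partial_traces`, the random state operator and the Hermitian matrix A are illustrative assumptions.<br />

```python
import numpy as np

def partial_traces(W, n1, n2):
    """Return (Tr_II W, Tr_I W) for W on C^n1 (x) C^n2 in np.kron ordering."""
    W4 = W.reshape(n1, n2, n1, n2)    # W4[i, j, k, l] = <a_i b_j| W |a_k b_l>
    W_I = np.einsum('ijkj->ik', W4)   # sum over the basis of H_II
    W_II = np.einsum('ijil->jl', W4)  # sum over the basis of H_I
    return W_I, W_II

# A random, generally non-factorizable state operator on C^2 (x) C^3
rng = np.random.default_rng(1)
M = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
W = M @ M.conj().T       # positive by construction
W /= np.trace(W).real    # normalized to trace 1

W_I, W_II = partial_traces(W, 2, 3)

A = np.array([[1.0, 2.0 - 1.0j], [2.0 + 1.0j, -0.5]])  # arbitrary Hermitian A
lhs = np.trace(np.kron(A, np.eye(3)) @ W).real  # <A (x) 1>_W
rhs = np.trace(A @ W_I).real                    # <A>_{W_I}, as in (III. 71)
print(lhs, rhs)
```

For a product state, `partial_traces(np.kron(W1, W2), n1, n2)` returns the factors W1 and W2 themselves, provided both have trace 1.<br />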

EXERCISE 19. Prove that Tr II W and Tr I W are state operators in H I and H II , respectively.



Concerning the expectation values of the quantities of the subsystem S I alone we can replace the<br />

state W by the partial trace, or state operator, Tr II W in H I , analogously for S II . Therefore it is<br />

customary to let the states of the subsystems correspond to the partial traces Tr II W and Tr I W .<br />

For the partial traces it holds that if W is a direct product of state operators W 1 and W 2 in H I<br />

and H II , respectively, W can also be written as a direct product of its partial traces, which we now<br />

show in a lemma.<br />

LEMMA:<br />

If W is a direct product of the form W = W 1 ⊗ W 2 , where W 1 and W 2 are state operators<br />

in H I and H II , respectively, then Tr II W = W 1 and Tr I W = W 2 .<br />

Proof<br />

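$$ \mathrm{Tr}_{II}\, W = \mathrm{Tr}_{II}\, (W_1 \otimes W_2) = \sum_{j=1}^{N_{II}} \langle \beta_j | W_1 \otimes W_2 | \beta_j \rangle = W_1 \sum_{j=1}^{N_{II}} \langle \beta_j | W_2 | \beta_j \rangle = W_1\, \mathrm{Tr}\, W_2 = W_1, \tag{III. 74} $$<br />

likewise,<br />

$$ \mathrm{Tr}_{I}\, (W_1 \otimes W_2) = W_2. \qquad \square \tag{III. 75} $$<br />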

From this lemma we see that W = W 1 ⊗ W 2 = Tr II W ⊗ Tr I W , and with the first theorem<br />

of this section, p. 56, this leads to the conclusion that if W is a direct product of its partial traces, it<br />

can be uniquely reconstructed from its partial traces. Generally, an arbitrary state operator W of the<br />

composite system can not be defined by its partial traces, which was shown by Von Neumann.<br />

VON NEUMANN’S THEOREM B:<br />

The partial traces Tr II W and Tr I W uniquely define W , iff at least one of the partial<br />

traces is pure, in which case W is factorizable,<br />

W = Tr II W ⊗ Tr I W. (III. 76)<br />

Proof<br />

Let {|u i ⟩} be a basis of eigenstates of W I having non - degenerate eigenvalues. Leaving out the<br />

eigenvalues p n and u i which are equal to 0, expand W and Tr II W in their eigenvectors,<br />

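$$ W = \sum_{n=1}^{N} p_n\, |\psi_n\rangle \langle \psi_n | \quad \text{with } |\psi_n\rangle \in \mathcal{H} \tag{III. 77} $$<br />

and<br />

$$ \mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i | \quad \text{with } |u_i\rangle \in \mathcal{H}_I. \tag{III. 78} $$<br />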



◃ Remark<br />

Leaving out the eigenvalues u i = 0, the eigenvectors |u i ⟩ with eigenvalue 0 do not occur in the<br />

expansion of Tr II W ; however, they do belong to the complete basis {|u i ⟩}. ▹<br />

Let {|v j ⟩} be a basis in H II . Then {|u i ⟩ ⊗ |v j ⟩} is a basis in H, and |ψ n ⟩ can be expanded as<br />

$$ |\psi_n\rangle = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \psi^n_{ij}\, |u_i\rangle \otimes |v_j\rangle = \sum_{i=1}^{N_I} |u_i\rangle \otimes |\phi^n_i\rangle, \tag{III. 79} $$<br />

where<br />

$$ |\phi^n_i\rangle := \sum_{j=1}^{N_{II}} \psi^n_{ij}\, |v_j\rangle \in \mathcal{H}_{II}. \tag{III. 80} $$<br />

These $|\phi^n_i\rangle$ are, in general, not orthogonal. Substituting (III. 79) in (III. 77) we have<br />

$$ W = \sum_{n=1}^{N} p_n \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} |u_i\rangle \langle u_k | \otimes |\phi^n_i\rangle \langle \phi^n_k |. \tag{III. 81} $$<br />

Substitution of (III. 81) in (III. 69) yields<br />

$$ \begin{aligned} \mathrm{Tr}_{II}\, W = \sum_{l=1}^{N_{II}} \langle \beta_l | W | \beta_l \rangle &= \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \sum_{l=1}^{N_{II}} p_n\, |u_i\rangle \langle u_k |\, \langle \beta_l | \phi^n_i \rangle \langle \phi^n_k | \beta_l \rangle \\ &= \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} p_n\, \langle \phi^n_k | \phi^n_i \rangle\, |u_i\rangle \langle u_k |. \end{aligned} \tag{III. 82} $$<br />

With {|ψ i ⟩} a basis, the coefficients in the expansion of an operator of the form $\sum_{ij} c_{ij}\, |\psi_i\rangle \langle \psi_j |$<br />

are unique, and comparison of (III. 82) with (III. 78) gives<br />

$$ \sum_{n=1}^{N} p_n\, \langle \phi^n_k | \phi^n_i \rangle = u_i\, \delta_{ik}, \tag{III. 83} $$<br />

therefore,<br />

$$ \mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} u_i\, \delta_{ik}\, |u_i\rangle \langle u_k | = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i |. \tag{III. 84} $$<br />

◃ Remark<br />

In (III. 83) it follows for i = k, due to the positivity of the p n , that if u i = 0 for certain i,<br />

then |ϕ n i ⟩ = 0 for all n and we see that in (III. 79) only the terms appear for which u i ≠ 0.<br />

Consequently, the same terms occur in (III. 79) as in the expansion (III. 78) of Tr II W . ▹



If Tr II W is pure, there is only one term<br />

$$ \mathrm{Tr}_{II}\, W = |u_1\rangle \langle u_1 |, \tag{III. 85} $$<br />

and substitution in (III. 79) yields<br />

$$ |\psi_n\rangle = |u_1\rangle \otimes |\phi^n_1\rangle. \tag{III. 86} $$<br />

Therefore,<br />

$$ W = \sum_{n=1}^{N} p_n\, |u_1\rangle \langle u_1 | \otimes |\phi^n_1\rangle \langle \phi^n_1 | = |u_1\rangle \langle u_1 | \otimes \sum_{n=1}^{N} p_n\, |\phi^n_1\rangle \langle \phi^n_1 |. \tag{III. 87} $$<br />

Analogous to (III. 82) we find<br />

$$ \mathrm{Tr}_{I}\, W = \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} p_n\, \langle u_k | u_i \rangle\, |\phi^n_i\rangle \langle \phi^n_k |. \tag{III. 88} $$<br />

With i = k = 1 and ⟨u 1 | u 1 ⟩ = 1 we have<br />

$$ \mathrm{Tr}_{I}\, W = \sum_{n=1}^{N} p_n\, |\phi^n_1\rangle \langle \phi^n_1 |. \tag{III. 89} $$<br />

Substituting (III. 89) in (III. 87) we see that W = Tr II W ⊗ Tr I W . Indeed, if one of the partial<br />

traces is pure, W is factorizable, and therefore completely determined, by its partial traces.<br />

To show the ‘only if’ - part of the theorem, namely that Tr II W and Tr I W uniquely define the state W<br />

of the composite system only if at least one of the partial traces is pure, we decompose the partial<br />

traces into orthogonal 1 - dimensional eigenprojectors,<br />

$$ \mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i | := \sum_{i=1}^{N_I} u_i\, U_i, \tag{III. 90} $$<br />

$$ \mathrm{Tr}_{I}\, W = \sum_{j=1}^{N_{II}} v_j\, |v_j\rangle \langle v_j | := \sum_{j=1}^{N_{II}} v_j\, V_j, \tag{III. 91} $$<br />

where the u i ∈ [0, 1] and the v j ∈ [0, 1] each sum up to 1, as required in (III. 35) for the partial traces to be state<br />

operators. It then holds that<br />

$$ \mathrm{Tr}_{II}\, W \otimes \mathrm{Tr}_{I}\, W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} u_i v_j\, U_i \otimes V_j. \tag{III. 92} $$<br />


Now consider an operator W of the form<br />

$$ W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij}\, U_i \otimes V_j, \tag{III. 93} $$<br />

which is, in general, not factorizable.<br />

EXERCISE 20. Prove that U i ⊗ V j is a 1 - dimensional projector in H.<br />

The operator W , (III. 93), is a state operator if<br />

$$ z_{ij} \in [0, 1] \quad \text{and} \quad \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij} = 1, \tag{III. 94} $$<br />

furthermore, with (III. 69) and (III. 72) we have<br />

$$ \mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij}\, U_i \tag{III. 95} $$<br />

and<br />

$$ \mathrm{Tr}_{I}\, W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij}\, V_j. \tag{III. 96} $$<br />

Comparing the coefficients of U i and V j with (III. 90) and (III. 91) gives the conditions $\sum_j z_{ij} = u_i$ and $\sum_i z_{ij} = v_j$.<br />

This system has an infinite number of solutions for the unknown z ij , unless one of the partial<br />

traces is pure, e.g. Tr II W = U 1 . In that case, according to (III. 95) it has to hold for i = 1<br />

that $\sum_j z_{1j} = 1$. But then (III. 35) requires that $\sum_j z_{ij} = 0$ if i ≠ 1, which means that, because<br />

of the non - negativity of the z ij , it has to hold that z ij = 0 if i ≠ 1. Substituting this in (III. 93)<br />

yields<br />

$$ W = \sum_{j=1}^{N_{II}} z_{1j}\, U_1 \otimes V_j = U_1 \otimes \sum_{j=1}^{N_{II}} z_{1j}\, V_j = \mathrm{Tr}_{II}\, W \otimes \mathrm{Tr}_{I}\, W, \tag{III. 97} $$<br />

where the last step is in accordance with (III. 96).<br />

We conclude that only if, at least, one of the partial traces is pure, W is factorizable. □<br />
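The non - uniqueness used in this proof can be made concrete for two 2 - dimensional subsystems. In the sketch below the projectors U i and V j are taken diagonal in a fixed basis, and the coefficient matrices `z_correlated` and `z_product` are hypothetical examples of two solutions with the same row and column sums.<br />

```python
import numpy as np

# 1-dimensional projectors U_i and V_j on C^2 (illustrative basis choice)
U = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
V = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

def W_from_z(z):
    """W = sum_ij z_ij U_i (x) V_j, as in (III. 93)."""
    return sum(z[i, j] * np.kron(U[i], V[j])
               for i in range(2) for j in range(2))

def partial_traces(W):
    """Return (Tr_II W, Tr_I W) for a state operator on C^2 (x) C^2."""
    W4 = W.reshape(2, 2, 2, 2)
    return np.einsum('ijkj->ik', W4), np.einsum('ijil->jl', W4)

# Two different coefficient matrices with identical row and column sums
z_correlated = np.array([[0.5, 0.0], [0.0, 0.5]])
z_product = np.full((2, 2), 0.25)
W_a, W_b = W_from_z(z_correlated), W_from_z(z_product)

same_marginals = all(np.allclose(p, q) for p, q in
                     zip(partial_traces(W_a), partial_traces(W_b)))
different_states = not np.allclose(W_a, W_b)

# With a pure partial trace (only z_1j nonzero) W is forced to factorize
W_c = W_from_z(np.array([[0.3, 0.7], [0.0, 0.0]]))
W_c_I, W_c_II = partial_traces(W_c)
factorizes = np.allclose(W_c, np.kron(W_c_I, W_c_II))

print(same_marginals, different_states, factorizes)
```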

In the foregoing we saw that the state operator W of a composite system is uniquely determined by its<br />

partial traces only if it is factorizable. Contrary to classical physics, in quantum mechanics maximal knowledge of<br />

the state of the subsystems is in general not equivalent to maximal knowledge of the state of the entire<br />

system. Consequently, the state of the entire system can, generally, not be derived from measurements<br />

on the separate subsystems. 4<br />

If the partial traces of W = W 1 ⊗ W 2 are both pure, W is also pure, as we saw in the exercise<br />

on p. 56, and since the pure partial traces each have only one term, W is of the form |u⟩ ⟨u| ⊗ |v⟩ ⟨v|.<br />

On the other hand, a pure state in H is, generally, not factorizable, which we will show in an example.<br />

EXAMPLE<br />

If |u i ⟩ and |v j ⟩ span a basis in H I and H II , respectively, an arbitrary vector |ψ⟩ in H = H I ⊗ H II<br />

is of the form<br />

$$ |\psi\rangle = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} c_{ij}\, |u_i\rangle \otimes |v_j\rangle. \tag{III. 98} $$<br />

An arbitrary pure state in H is therefore of the form<br />

$$ |\psi\rangle \langle \psi | = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \sum_{k=1}^{N_I} \sum_{l=1}^{N_{II}} c^*_{kl}\, c_{ij}\, \big( |u_i\rangle \otimes |v_j\rangle \big) \big( \langle u_k | \otimes \langle v_l | \big). \tag{III. 99} $$<br />

Consider the following pure entangled state in H,<br />

$$ |\Phi\rangle = \tfrac{1}{2} \sqrt{2}\, \big( |u_1\rangle \otimes |v_1\rangle + |u_2\rangle \otimes |v_2\rangle \big). \tag{III. 100} $$<br />

The corresponding W is the 1 - dimensional projector<br />

$$ \begin{aligned} W = |\Phi\rangle \langle \Phi | = \tfrac{1}{2} \big( &|u_1\rangle \langle u_1 | \otimes |v_1\rangle \langle v_1 | + |u_1\rangle \langle u_2 | \otimes |v_1\rangle \langle v_2 | \\ + &|u_2\rangle \langle u_1 | \otimes |v_2\rangle \langle v_1 | + |u_2\rangle \langle u_2 | \otimes |v_2\rangle \langle v_2 | \big). \end{aligned} \tag{III. 101} $$<br />

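This pure state W is not factorizable, and cannot be written in the form (III. 93). But although W<br />

is pure, its partial traces are not pure,<br />

$$ \mathrm{Tr}_{II}\, W = \sum_{j=1}^{N_{II}} \langle v_j | \Phi \rangle \langle \Phi | v_j \rangle = \tfrac{1}{2} \big( |u_1\rangle \langle u_1 | + |u_2\rangle \langle u_2 | \big), \tag{III. 102} $$<br />

$$ \mathrm{Tr}_{I}\, W = \sum_{i=1}^{N_I} \langle u_i | \Phi \rangle \langle \Phi | u_i \rangle = \tfrac{1}{2} \big( |v_1\rangle \langle v_1 | + |v_2\rangle \langle v_2 | \big), \tag{III. 103} $$<br />

and indeed,<br />

$$ \begin{aligned} W_I \otimes W_{II} = \tfrac{1}{4} \big( &|u_1\rangle \langle u_1 | \otimes |v_1\rangle \langle v_1 | + |u_1\rangle \langle u_1 | \otimes |v_2\rangle \langle v_2 | \\ + &|u_2\rangle \langle u_2 | \otimes |v_1\rangle \langle v_1 | + |u_2\rangle \langle u_2 | \otimes |v_2\rangle \langle v_2 | \big) \neq W. \end{aligned} \tag{III. 104} $$<br />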
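The example can be checked numerically for N I = N II = 2. In this sketch the basis vectors |u i ⟩ and |v j ⟩ are represented by the standard unit vectors of C 2 , which is an assumption made for convenience; `purity` is a hypothetical helper for Tr W 2 .<br />

```python
import numpy as np

u = np.eye(2)  # u[0], u[1]: basis of H_I (standard unit vectors, assumed)
v = np.eye(2)  # v[0], v[1]: basis of H_II

# |Phi> = (|u1>|v1> + |u2>|v2>) / sqrt(2), as in (III. 100)
Phi = (np.kron(u[0], v[0]) + np.kron(u[1], v[1])) / np.sqrt(2)
W = np.outer(Phi, Phi.conj())

# Partial traces via the index structure W4[i, j, k, l] = <u_i v_j| W |u_k v_l>
W4 = W.reshape(2, 2, 2, 2)
W_I = np.einsum('ijkj->ik', W4)   # Tr_II W
W_II = np.einsum('ijil->jl', W4)  # Tr_I W

def purity(X):
    """Tr X^2."""
    return np.trace(X @ X).real

print(purity(W), purity(W_I), purity(W_II))
print(np.allclose(np.kron(W_I, W_II), W))  # the product of the marginals is not W
```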

4 This aspect of the quantum mechanical state description is, however, analogous to a classical state description with a<br />

probability distribution. The two - particle distribution function ρ(q 1 , p 1 ; q 2 , p 2 ) is not uniquely defined by the marginal<br />

distribution functions<br />

$$ \rho_1(q_1, p_1) = \int \rho(q_1, p_1; q_2, p_2)\, dq_2\, dp_2 \quad \text{and} \quad \rho_2(q_2, p_2) = \int \rho(q_1, p_1; q_2, p_2)\, dq_1\, dp_1; $$<br />

the marginals are, after all, analogous to the partial traces.<br />



III. 4. 1 SUMMARY<br />

1. The state operator W ∈ S (H) of a composite system, whether pure or not, is not factorizable<br />

in general.<br />

2. If W is factorizable, the factors are equal to the partial traces of W ,<br />

W = W 1 ⊗ W 2 implies W 1 = Tr II W and W 2 = Tr I W. (III. 105)<br />

3. The partial traces uniquely define W iff, at least, one of the partial traces is pure, in which<br />

case W is directly factorizable, W = W 1 ⊗ W 2 .<br />

4. The partial traces of W are pure iff W is pure and of the form W = ( |u⟩ ⊗ |v⟩ )( ⟨u| ⊗ ⟨v| ) ,<br />

with |u⟩ ∈ H I and |v⟩ ∈ H II .<br />

III. 5 PROPER AND IMPROPER MIXTURES<br />

The states of composite systems shed new light on the interpretation of mixtures. Suppose that<br />

W I and W II are the partial traces of an arbitrary state operator W , and, with u i , v j ∈ [0, 1], it holds<br />

that<br />

$$ W_I = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i | \quad \text{and} \quad W_{II} = \sum_{j=1}^{N_{II}} v_j\, |v_j\rangle \langle v_j |. \tag{III. 106} $$<br />

W I and W II contain all quantum mechanical information about results of measurements on the subsystems<br />

in H I and H II . The question is whether we can interpret this by assuming that the individual<br />

subsystems are in the pure states |u i ⟩ and |v j ⟩, with probabilities u i and v j , respectively. If this were<br />

the case, the composite system could be divided in subensembles of systems in the states |u i ⟩ ⊗ |v j ⟩<br />

with probabilities depending on possible correlations between the values of i and j. The state would<br />

be of the form<br />

$$ W' = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij}\, \big( |u_i\rangle \otimes |v_j\rangle \big) \big( \langle u_i | \otimes \langle v_j | \big) = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij}\, |u_i\rangle \langle u_i | \otimes |v_j\rangle \langle v_j |. \tag{III. 107} $$<br />

The coefficients p ij have to satisfy<br />

$$ p_{ij} \in [0, 1], \qquad \sum_{j=1}^{N_{II}} p_{ij} = u_i, \qquad \sum_{i=1}^{N_I} p_{ij} = v_j \qquad \text{and} \qquad \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij} = 1, \tag{III. 108} $$<br />



but otherwise they can be chosen freely. As far as being in one of the states |u i ⟩ or |v j ⟩ can be interpreted<br />

as a property the subsystems possess, all correlations between these properties in the total state can<br />

be expressed by the p ij . If there are no correlations, p ij = u i v j .<br />

But we see that W ′ is of the special form (III. 93), and therefore in general not equal to the arbitrary<br />

state operator W we started with. Hence it cannot, in general, be said that the individual subsystems are in the pure<br />

states |u i ⟩ and |v j ⟩. Although W I and W II are state operators, they cannot be interpreted as mixtures of<br />

pure states. The mixed states W I and W II are called improper mixtures by B. d’Espagnat (1989, p.61).<br />

Proper mixtures can in principle be taken as an ensemble of systems which are in pure states,<br />

whereas improper mixtures cannot.<br />

The foregoing shows that the concept of mixed states is forced upon us by the theory of composite<br />

systems as a natural extension of the concept of pure states. Even if the composite system is in a pure<br />

state, the states of the subsystems are generally not pure. It is therefore not correct to understand mixed states in general as<br />

simple mixtures of pure states, in the way the mixture of pieces in the box of a game of chess consists<br />

of black and white pieces.<br />

Finally, we make an observation about similar, or identical, particles. A system of similar particles<br />

is described in quantum mechanics by symmetrized states. Consider the following symmetrized<br />

two - particle state<br />

$$ |\Psi(1, 2)\rangle = \tfrac{1}{2} \sqrt{2}\, \big( |u\rangle \otimes |v\rangle \pm |v\rangle \otimes |u\rangle \big), \tag{III. 109} $$<br />

where the first factor in each direct product is related to particle 1, and the second to particle 2. In<br />

this case the two subspaces are identical and |u⟩ and |v⟩ can represent states in both one and the other<br />

subspace. The corresponding state operator is<br />

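$$ \begin{aligned} W = |\Psi(1, 2)\rangle \langle \Psi(1, 2)| = \tfrac{1}{2} \big( &|u\rangle \langle u | \otimes |v\rangle \langle v | \pm |u\rangle \langle v | \otimes |v\rangle \langle u | \\ \pm\, &|v\rangle \langle u | \otimes |u\rangle \langle v | + |v\rangle \langle v | \otimes |u\rangle \langle u | \big), \end{aligned} \tag{III. 110} $$<br />

the partial traces are<br />

$$ W_I = \mathrm{Tr}_{II}\, W = \tfrac{1}{2} \big( |u\rangle \langle u | + |v\rangle \langle v | \big), \tag{III. 111} $$<br />

and<br />

$$ W_{II} = \mathrm{Tr}_{I}\, W = \tfrac{1}{2} \big( |v\rangle \langle v | + |u\rangle \langle u | \big), \tag{III. 112} $$<br />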

and we see that the partial traces are identical. We have to say that both particles are in the same state;<br />

we certainly cannot say that one particle is in the state |u⟩ and the other in |v⟩. We cannot assign a<br />

pure state to the separate particles, although the state of the composite system is pure.<br />

III. 6 SPIN 1/2 PARTICLES<br />

The time - dependent Schrödinger equation for the wave function Ψ(q, t) is given by<br />

$$ i\hbar\, \frac{\partial \Psi}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \Psi + V \Psi. \tag{III. 113} $$<br />



In this equation<br />

$$ \vec{p} = -i\hbar\, \vec{\nabla} \tag{III. 114} $$<br />

is the canonical momentum operator, yielding for the components of the angular momentum $\vec{L} = \vec{q} \times \vec{p}$<br />

for a system in 3 - dimensional space<br />

$$ L_i = -i\hbar\, \epsilon_{ijk}\, q_j\, \partial_k. \tag{III. 115} $$<br />

These components do not commute,<br />

$$ [L_i, L_j] = i\hbar\, \epsilon_{ijk}\, L_k, \tag{III. 116} $$<br />

but the operator $\vec{L}^2 = L_x^2 + L_y^2 + L_z^2$ does commute with $\vec{L}$, that is, with each of its components,<br />

where usually $L_z$ is taken.<br />

The simultaneous eigenstates of $\vec{L}^2$ and $L_z$ are written as |l, m⟩, and their eigenvalues are discrete,<br />

$$ \vec{L}^2\, |l, m\rangle = \hbar^2\, l (l + 1)\, |l, m\rangle, \quad \text{with } l = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots, \tag{III. 117} $$<br />

$$ L_z\, |l, m\rangle = m \hbar\, |l, m\rangle, \quad \text{with } m = -l, -l + 1, \ldots, l - 1, l. \tag{III. 118} $$<br />

Although the algebraic derivation using the commutation relations allows for half integer values,<br />

for angular momentum ⃗ L the values of l can only be integers to make sense physically. But the half<br />

integer values are included in the description of spin.<br />

Spin ⃗ S is an internal degree of freedom of elementary particles, which cannot easily be described<br />

in classical terms, but is similar to ⃗ L. A main difference is that where the value of the angular momentum<br />

of a particle can vary, the value s of spin of a particle is constant. The similarity is that spin has,<br />

like ⃗ L, a direction ⃗n in 3 - dimensional space, and satisfies the commutation relations of (III. 116).<br />

Writing the simultaneous eigenstates of ⃗ S 2 and S z as |s, m⟩, we can use (III. 117) and (III. 118)<br />

again, where L 2 and L z are replaced by S 2 and S z , respectively, and l by s. The eigenvalues of ⃗ S 2<br />

and S z are<br />

$$ s = 0: \quad \vec{S}^2 = 0, \quad S_z = 0, \tag{III. 119} $$<br />

$$ s = \tfrac{1}{2}: \quad \vec{S}^2 = \tfrac{3}{4} \hbar^2, \quad S_z = -\tfrac{1}{2} \hbar,\ \tfrac{1}{2} \hbar, \tag{III. 120} $$<br />

and so on for $s = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$. In this section we restrict ourselves to the simplest non - trivial<br />

case, spin 1/2.<br />

For spin 1/2 particles there are only two orthonormal eigenstates, $|\tfrac{1}{2}, \tfrac{1}{2}\rangle$ and $|\tfrac{1}{2}, -\tfrac{1}{2}\rangle$, called<br />

‘spin up’ and ‘spin down’, usually written as |↑⟩ and |↓⟩, respectively. Together, these eigenstates<br />

form a basis for a spin space, the 2 - dimensional Hilbert space $\mathcal{H} = \mathbb{C}^2$.<br />

According to the observables postulate, p. 41, the observable spin corresponds uniquely to a self -<br />

adjoint, or Hermitian, operator A in H. Every Hermitian operator in C 2 can be represented in the<br />

aforementioned basis as a 2 × 2 - matrix,<br />

$$ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} a_0 + a_z & a_x - i a_y \\ a_x + i a_y & a_0 - a_z \end{pmatrix} = a_0 \mathbb{1} + a_x \sigma_x + a_y \sigma_y + a_z \sigma_z = a_0 \mathbb{1} + \vec{a} \cdot \vec{\sigma}, \tag{III. 121} $$<br />



with real coefficients $a_0$ and $\vec{a}$, and $\vec{\sigma}$ defined by the Pauli matrices,<br />

$$ \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{III. 122} $$<br />

EXERCISE 21. Prove the aforementioned statement.<br />

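Writing the eigenvectors of $\sigma_z$, $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, as |z ↑⟩ and |z ↓⟩, we have<br />

$$ \sigma_z\, |z \uparrow\rangle = |z \uparrow\rangle \quad \text{and} \quad \sigma_z\, |z \downarrow\rangle = -|z \downarrow\rangle. \tag{III. 123} $$<br />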

Analogously, let |x ↑⟩, |x ↓⟩ and |y ↑⟩, |y ↓⟩ denote eigenstates for the eigenvalues ±1 of σ x and σ y .<br />

The Pauli matrices have the following properties:<br />

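$$ \sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \mathbb{1}, \tag{III. 124} $$<br />

$$ \sigma_i \sigma_j = i\, \epsilon_{ijk}\, \sigma_k \quad (i \neq j), \tag{III. 125} $$<br />

$$ \mathrm{Tr}\, \vec{\sigma} = 0. \tag{III. 126} $$<br />

Using the anticommutation relations for the Pauli matrices, $[\sigma_i, \sigma_j]_+ = 0$ for $i \neq j$, which follow directly<br />

from (III. 125), we find a useful relation,<br />

$$ (\vec{a} \cdot \vec{\sigma}) (\vec{b} \cdot \vec{\sigma}) = (\vec{a} \cdot \vec{b})\, \mathbb{1} + i\, \vec{\sigma} \cdot (\vec{a} \times \vec{b}), \tag{III. 127} $$<br />

from which it follows that<br />

$$ (\vec{a} \cdot \vec{\sigma})^2 = \mathbb{1} \quad \text{if } \|\vec{a}\| = 1. \tag{III. 128} $$<br />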
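Relations (III. 127) and (III. 128) can be verified numerically for arbitrary real vectors. In the sketch below the helper `dot_sigma` and the two test vectors are illustrative assumptions.<br />

```python
import numpy as np

# Pauli matrices stacked as sigma[0] = sigma_x, sigma[1] = sigma_y, sigma[2] = sigma_z
sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def dot_sigma(a):
    """Return a . sigma = a_x sigma_x + a_y sigma_y + a_z sigma_z."""
    return np.einsum('i,ijk->jk', a, sigma)

a = np.array([1.0, -2.0, 0.5])
b = np.array([0.3, 0.4, 2.0])

# (a . sigma)(b . sigma) = (a . b) 1 + i sigma . (a x b)
lhs = dot_sigma(a) @ dot_sigma(b)
rhs = np.dot(a, b) * np.eye(2) + 1j * dot_sigma(np.cross(a, b))
print(np.allclose(lhs, rhs))

# For a unit vector n the square of n . sigma is the identity
n = a / np.linalg.norm(a)
print(np.allclose(dot_sigma(n) @ dot_sigma(n), np.eye(2)))
```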

A Hermitian 2 × 2 - matrix A other than $\pm \mathbb{1}$ has eigenvalues ±1 iff $A^2 = \mathbb{1}$, and therefore, with $\vec{n}$ a unit vector, we see<br />

that the only operators of the form (III. 121) having eigenvalues ±1 are precisely of the form $\vec{n} \cdot \vec{\sigma}$.<br />

This allows us to let spin in the direction $\vec{n}$ correspond to the operator<br />

$$ \vec{n} \cdot \vec{S} = \tfrac{1}{2} \hbar\, \vec{n} \cdot \vec{\sigma}. \tag{III. 129} $$<br />

We will justify this choice shortly, but first we determine the eigenvectors of the spin operator $\vec{n} \cdot \vec{\sigma}$.<br />

Writing $\vec{n}$ in spherical coordinates<br />

$$ \vec{n} = \begin{pmatrix} \sin\theta \cos\phi \\ \sin\theta \sin\phi \\ \cos\theta \end{pmatrix}, \tag{III. 130} $$<br />



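we have<br />

$$ \vec{n} \cdot \vec{\sigma} = \begin{pmatrix} \cos\theta & e^{-i\phi} \sin\theta \\ e^{i\phi} \sin\theta & -\cos\theta \end{pmatrix}, \tag{III. 131} $$<br />

with eigenvectors<br />

$$ |\vec{n}, +\rangle = \begin{pmatrix} e^{-\frac{i}{2}\phi} \cos\tfrac{1}{2}\theta \\ e^{\frac{i}{2}\phi} \sin\tfrac{1}{2}\theta \end{pmatrix} \quad \text{and} \quad |\vec{n}, -\rangle = \begin{pmatrix} -e^{-\frac{i}{2}\phi} \sin\tfrac{1}{2}\theta \\ e^{\frac{i}{2}\phi} \cos\tfrac{1}{2}\theta \end{pmatrix} \tag{III. 132} $$<br />

for eigenvalues ±1.<br />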

EXERCISE 22. Verify (III. 132)<br />
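A numerical spot check can accompany the analytic verification asked for in the exercise. The sketch below evaluates (III. 131) and (III. 132) at one arbitrary pair of angles; the function names and the chosen angles are assumptions of this example, and a check at a single point is of course not a proof.<br />

```python
import numpy as np

def n_dot_sigma(theta, phi):
    """Matrix of n . sigma in the sigma_z basis, as in (III. 131)."""
    return np.array([[np.cos(theta), np.exp(-1j * phi) * np.sin(theta)],
                     [np.exp(1j * phi) * np.sin(theta), -np.cos(theta)]])

def spin_eigenvectors(theta, phi):
    """Return (|n,+>, |n,->) as given in (III. 132)."""
    plus = np.array([np.exp(-0.5j * phi) * np.cos(theta / 2),
                     np.exp(0.5j * phi) * np.sin(theta / 2)])
    minus = np.array([-np.exp(-0.5j * phi) * np.sin(theta / 2),
                      np.exp(0.5j * phi) * np.cos(theta / 2)])
    return plus, minus

theta, phi = 1.1, 2.3  # arbitrary test angles
M = n_dot_sigma(theta, phi)
plus, minus = spin_eigenvectors(theta, phi)

print(np.allclose(M @ plus, plus))     # eigenvalue +1
print(np.allclose(M @ minus, -minus))  # eigenvalue -1
print(abs(np.vdot(plus, minus)))       # orthogonality
```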

III. 6. 1 SPIN 1/2 AND ROTATIONS IN SPIN SPACE<br />

A rotation over an angle α ∈ [0, π) around an axis in the direction of the unit vector ⃗m,<br />

with $\vec{m} \in \mathbb{R}^3$, can be written as a unitary matrix<br />

$$ U(\vec{m}, \alpha) = e^{-\frac{i}{\hbar} \alpha\, (\vec{m} \cdot \vec{J})}, \tag{III. 133} $$<br />

where the total angular momentum $\vec{J} = \vec{L} + \vec{S}$ is the infinitesimal generator of rotations. With $\vec{L} = 0$<br />

and writing $S_i = \tfrac{1}{2} \hbar\, \sigma_i$, which is, using (III. 124), in accordance with (III. 120) and the still unjustified<br />

(III. 129), the Pauli matrices are the generators of rotations in $\mathbb{C}^2$, leading to<br />

$$ U(\vec{m}, \alpha) = e^{-\frac{i}{2} \alpha\, (\vec{m} \cdot \vec{\sigma})}, \tag{III. 134} $$<br />

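where $\|\vec{m}\|$ is again 1. Using Taylor expansions, with (III. 128) we find for (III. 134)<br />

$$ \begin{aligned} U(\vec{m}, \alpha) &= \sum_{k=0}^{\infty} \frac{(-i)^k\, (\vec{m} \cdot \vec{\sigma})^k}{k!} \big( \tfrac{1}{2} \alpha \big)^k \\ &= \sum_{\substack{k=0 \\ k\ \mathrm{even}}}^{\infty} \frac{(-1)^{\frac{1}{2} k}}{k!} \big( \tfrac{1}{2} \alpha \big)^k\, \mathbb{1} + i\, (\vec{m} \cdot \vec{\sigma}) \sum_{\substack{k=1 \\ k\ \mathrm{odd}}}^{\infty} \frac{(-1)^{\frac{1}{2} (k+1)}}{k!} \big( \tfrac{1}{2} \alpha \big)^k \\ &= \cos\tfrac{1}{2}\alpha\ \mathbb{1} - i\, (\vec{m} \cdot \vec{\sigma}) \sin\tfrac{1}{2}\alpha. \end{aligned} \tag{III. 135} $$<br />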
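The closed form (III. 135) can be compared with a direct evaluation of the exponential (III. 134). In this sketch the exponential is computed from the eigendecomposition of the Hermitian matrix m · σ instead of summing the Taylor series; the function names, the rotation axis and the angle are illustrative assumptions.<br />

```python
import numpy as np

sigma = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def rotation_closed_form(m, alpha):
    """cos(alpha/2) 1 - i (m . sigma) sin(alpha/2), for a unit vector m."""
    m_sigma = np.einsum('i,ijk->jk', m, sigma)
    return np.cos(alpha / 2) * np.eye(2) - 1j * np.sin(alpha / 2) * m_sigma

def rotation_by_exponential(m, alpha):
    """exp(-i alpha/2 m . sigma), via the eigendecomposition of m . sigma."""
    m_sigma = np.einsum('i,ijk->jk', m, sigma)
    eigvals, vecs = np.linalg.eigh(m_sigma)
    return vecs @ np.diag(np.exp(-0.5j * alpha * eigvals)) @ vecs.conj().T

m = np.array([1.0, 2.0, -2.0]) / 3.0  # unit vector
alpha = 0.9

U_closed = rotation_closed_form(m, alpha)
U_exp = rotation_by_exponential(m, alpha)
print(np.allclose(U_closed, U_exp))
print(np.allclose(U_closed.conj().T @ U_closed, np.eye(2)))  # unitarity
```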

It can be verified that, under a rotation around an axis ⃗m over an angle α, with ⃗n R the unit vector<br />

in the rotated direction, the eigenstates of ⃗n · ⃗σ, (III. 132), transform into the eigenstates of ⃗n R · ⃗σ,<br />

obeying the rotational transformation rules<br />

U (⃗m, α) |⃗n, ±⟩ = |⃗n R , ±⟩. (III. 136)



We illustrate (III. 136) using a rotation of $\vec{n}$ in the x z - plane, ϕ = 0, over an angle α around<br />

the y - axis as in diagram III. 2.<br />

[Figure III. 2: A rotated unit vector in the xz - plane. Axes x, y, z; the vector $\vec{n}$ at angle θ from the z - axis is rotated over α into $\vec{n}_R$.]<br />

For $\vec{n}$ and $\vec{n}_R$ we have<br />

$$ \vec{n} = \begin{pmatrix} \sin\theta \\ 0 \\ \cos\theta \end{pmatrix}, \qquad \vec{n}_R = \begin{pmatrix} \sin(\theta + \alpha) \\ 0 \\ \cos(\theta + \alpha) \end{pmatrix}. \tag{III. 137} $$<br />

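The eigenstates of $\vec{n} \cdot \vec{\sigma}$, using (III. 132), are<br />

$$ |\vec{n}, +\rangle = \begin{pmatrix} \cos\tfrac{1}{2}\theta \\ \sin\tfrac{1}{2}\theta \end{pmatrix} = \cos\tfrac{1}{2}\theta\, |z \uparrow\rangle + \sin\tfrac{1}{2}\theta\, |z \downarrow\rangle \tag{III. 138} $$<br />

and<br />

$$ |\vec{n}, -\rangle = \begin{pmatrix} -\sin\tfrac{1}{2}\theta \\ \cos\tfrac{1}{2}\theta \end{pmatrix} = -\sin\tfrac{1}{2}\theta\, |z \uparrow\rangle + \cos\tfrac{1}{2}\theta\, |z \downarrow\rangle. \tag{III. 139} $$<br />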

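For a rotation around the y - axis we have $\vec{m} = \vec{e}_y$ and therefore<br />

$$ U(\vec{e}_y, \alpha) = \cos\tfrac{1}{2}\alpha\ \mathbb{1} - i\, \vec{e}_y \cdot \vec{\sigma}\, \sin\tfrac{1}{2}\alpha = \begin{pmatrix} \cos\tfrac{1}{2}\alpha & -\sin\tfrac{1}{2}\alpha \\ \sin\tfrac{1}{2}\alpha & \cos\tfrac{1}{2}\alpha \end{pmatrix}, \tag{III. 140} $$<br />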

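we have<br />

$$ U(\vec{e}_y, \alpha)\, |\vec{n}, +\rangle = \begin{pmatrix} \cos\tfrac{1}{2}(\theta + \alpha) \\ \sin\tfrac{1}{2}(\theta + \alpha) \end{pmatrix} = \cos\tfrac{1}{2}(\theta + \alpha)\, |z \uparrow\rangle + \sin\tfrac{1}{2}(\theta + \alpha)\, |z \downarrow\rangle \tag{III. 141} $$<br />

and<br />

$$ U(\vec{e}_y, \alpha)\, |\vec{n}, -\rangle = \begin{pmatrix} -\sin\tfrac{1}{2}(\theta + \alpha) \\ \cos\tfrac{1}{2}(\theta + \alpha) \end{pmatrix} = -\sin\tfrac{1}{2}(\theta + \alpha)\, |z \uparrow\rangle + \cos\tfrac{1}{2}(\theta + \alpha)\, |z \downarrow\rangle, \tag{III. 142} $$<br />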



and we see that (III. 141) and (III. 142) are indeed the eigenstates |⃗n R , +⟩ and |⃗n R , −⟩ of ⃗n R · ⃗σ.<br />

Comparison of these eigenstates with the eigenstates of ⃗n · ⃗σ, (III. 138) and (III. 139), shows<br />

that (III. 136) is satisfied. As can easily be verified, this holds in general, and we conclude that spin<br />

is represented by the spin operator $\vec{n} \cdot \vec{\sigma}$, which justifies our choice (III. 129).<br />

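Under a rotation around the y - axis over an angle θ the eigenvectors of σ z transform into<br />

$$U(\vec{e}_y,\theta)\,|z\uparrow\rangle = (\cos\tfrac12\theta\,\mathbb{1} - i\,\sigma_y \sin\tfrac12\theta)\,|z\uparrow\rangle = \cos\tfrac12\theta\,|z\uparrow\rangle + \sin\tfrac12\theta\,|z\downarrow\rangle \tag{III. 143}$$<br />

and, likewise,<br />

$$U(\vec{e}_y,\theta)\,|z\downarrow\rangle = -\sin\tfrac12\theta\,|z\uparrow\rangle + \cos\tfrac12\theta\,|z\downarrow\rangle. \tag{III. 144}$$<br />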

In particular, the eigenvectors of σ x are obtained from the eigenvectors of σ z by a rotation<br />

around the y - axis over $\theta = \tfrac12\pi$,<br />

$$U(\vec{e}_y,\tfrac12\pi)\,|z\uparrow\rangle = \tfrac12\sqrt{2}\,\bigl(|z\uparrow\rangle + |z\downarrow\rangle\bigr) = |x\uparrow\rangle, \tag{III. 145}$$<br />

and<br />

$$U(\vec{e}_y,\tfrac12\pi)\,|z\downarrow\rangle = \tfrac12\sqrt{2}\,\bigl(|z\downarrow\rangle - |z\uparrow\rangle\bigr) = |x\downarrow\rangle. \tag{III. 146}$$<br />

EXERCISE 23. Construct, analogously, the states |y ↑⟩ and |y ↓⟩ from |z ↑⟩ and |z ↓⟩ using a<br />

rotation around the x - axis.<br />

Successively rotating over $\tfrac12\pi$ transforms |z ↑⟩ via |x ↑⟩, |z ↓⟩ and |x ↓⟩ into −|z ↑⟩, instead of<br />

into |z ↑⟩, and consequently, we have to rotate |z ↑⟩ over 4π to come back to |z ↑⟩ again. Generally,<br />

a rotation over 2π transforms a state |ϕ⟩ into −|ϕ⟩. This means we cannot simply visualize particles<br />

with spin as tiny spinning tops!<br />
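The sign change under a full turn can be followed step by step. The sketch below is an illustration with assumed names and standard-library Python only; it applies the quarter turn U (⃗e y , ½π) of (III. 140) eight times to |z ↑⟩ and records the state after each step.

```python
import math

def rotation_y(alpha):
    """U(e_y, alpha) of (III. 140) in the sigma_z basis (a real 2x2 matrix)."""
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    return [[c, -s], [s, c]]

def apply(matrix, vector):
    """Matrix-vector product for small dense lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

quarter_turn = rotation_y(math.pi / 2)
state = [1.0, 0.0]   # |z up> in the sigma_z basis
states = []          # states[k-1] is the state after k quarter turns
for _ in range(8):
    state = apply(quarter_turn, state)
    states.append(state)

# Print the components (coefficient of |z up>, coefficient of |z down>).
for turns, components in enumerate(states, start=1):
    print(f"{turns} x pi/2: ({components[0]:+.3f}, {components[1]:+.3f})")
```

After four quarter turns (a rotation over 2π) the components are those of −|z ↑⟩, and only after eight (4π) does |z ↑⟩ itself return, up to rounding.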

Finally a useful relation holds. Choosing again ⃗e y for ⃗m, we have U (⃗e y , α) as in (III. 140) which<br />

yields for arbitrary ⃗n, (III. 130),<br />

$$\langle\vec{n},+|\,U(\vec{e}_y,\alpha)\,|\vec{n},+\rangle = \cos\tfrac12\alpha + (e^{-i\phi} - e^{i\phi}) \cos\tfrac12\theta\, \sin\tfrac12\theta\, \sin\tfrac12\alpha = \cos\tfrac12\alpha - i \sin\phi\, \sin\theta\, \sin\tfrac12\alpha, \tag{III. 147}$$<br />

from which we see that, if ⃗n and ⃗n R are in the xz - plane, ϕ = 0 or ϕ = π,<br />

$$\langle\vec{n},+\,|\,\vec{n}_R,+\rangle = \cos\tfrac12\alpha_{\vec{n}\vec{n}_R}, \tag{III. 148}$$<br />

where $\alpha_{\vec{n}\vec{n}_R}$ is the angle between ⃗n and ⃗n R . Because ⃗n and α can be chosen arbitrarily, this relation<br />

holds for any two vectors ⃗n and ⃗n ′ in the xz - plane, and, by freedom of choice of the coordinate<br />

system, it holds whenever ⃗n and ⃗n ′ are in the same plane.



EXERCISE 24. Show that the operator $\tfrac12(\mathbb{1} + \vec{n}\cdot\vec{\sigma})$ is the projector on |⃗n, +⟩,<br />

$$\tfrac12(\mathbb{1} + \vec{n}\cdot\vec{\sigma}) = |\vec{n},+\rangle\langle\vec{n},+|. \tag{III. 149}$$<br />

◃ Remark<br />

This holds in any matrix representation. ▹<br />

III. 6. 2 MIXED SPIN 1/2 STATES<br />

Every Hermitian 2 × 2 - matrix can, as stated before, be written as (III. 121), $A = a_0\,\mathbb{1} + \vec{a}\cdot\vec{\sigma}$,<br />

with real coefficients $a_0$ and $\vec{a}$. According to (III. 29), for the corresponding operator A to be a state<br />

operator the trace of A has to be 1, which means that $a_0 = \tfrac12$. Furthermore, A has to be positive.<br />

A positive matrix can be written as the square of a Hermitian matrix B,<br />

$$B = b_0\,\mathbb{1} + \vec{b}\cdot\vec{\sigma} \quad\text{and}\quad B^2 = (b_0^2 + \vec{b}^{\,2})\,\mathbb{1} + 2\,b_0\,\vec{b}\cdot\vec{\sigma}. \tag{III. 150}$$<br />

Therefore,<br />

$$a_0 = \tfrac12 = b_0^2 + \vec{b}^{\,2} \quad\text{and}\quad \vec{a} = 2\,b_0\,\vec{b}. \tag{III. 151}$$<br />

The possible values of $b_0$ are limited by (III. 151), $b_0^2 \le \tfrac12$, while as soon as $b_0$ is chosen $\vec{b}$ is<br />

fixed, $\vec{b} = \vec{a}/(2\,b_0)$, yielding<br />

$$\vec{a}^{\,2} = 4\,b_0^2\,\vec{b}^{\,2} = 4\,b_0^2\,\bigl(\tfrac12 - b_0^2\bigr). \tag{III. 152}$$<br />

Obviously, $\vec{a}^{\,2}$ only depends on $b_0^2$; for $b_0^2$ in the interval $[0, \tfrac12]$ its values lie between 0 and $\tfrac14$, where $\vec{a}^{\,2}$<br />

has its maximum for $b_0^2 = \tfrac14$. In other words, A is a state operator iff $a_0 = \tfrac12$ and $\vec{a}^{\,2} \le \tfrac14$, in which<br />

case some $b_0$ and $\vec{b}$ exist, satisfying the requirements (III. 151).<br />

Writing $\vec{w} = 2\,\vec{a}$, an arbitrary state operator is therefore<br />

$$W = \tfrac12(\mathbb{1} + \vec{w}\cdot\vec{\sigma}), \qquad \vec{w}^{\,2} \le 1. \tag{III. 153}$$<br />

This state operator is characterized by the vector ⃗w, called the polarization vector, which has its<br />

endpoints within or on the surface of the unit sphere, the so - called Bloch sphere. For ∥ ⃗w∥ = 1 the<br />

system is called completely polarized, for ⃗w = 0 it is called unpolarized, and if 0 < ∥ ⃗w∥ < 1 it is<br />

called partially polarized.<br />

The state operators with $\vec{w}^{\,2} = 1$ are the pure states, the 1 - dimensional projectors, since then<br />

$$W^2 = \tfrac14(\mathbb{1} + 2\,\vec{w}\cdot\vec{\sigma} + \vec{w}^{\,2}\,\mathbb{1}) = \tfrac12(\mathbb{1} + \vec{w}\cdot\vec{\sigma}) = W, \tag{III. 154}$$<br />

while the state operators with $\vec{w}^{\,2} < 1$ are mixed states. The set of state operators is a convex set as we<br />

can now easily see. If $\vec{w}_1$ and $\vec{w}_2$ are within or on the surface of the unit sphere, then $\alpha\,\vec{w}_1 + \beta\,\vec{w}_2$,<br />

with 0 < α, β < 1 and α + β = 1, is a point on the chord linking $\vec{w}_1$ and $\vec{w}_2$, and this chord lies within the sphere.
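To make the classification by ∥ ⃗w∥ concrete, the following sketch builds W from (III. 153) for three sample polarization vectors and reports its eigenvalues and Tr W². The helper names, the sample vectors and the use of Tr W² as a purity measure are illustrative assumptions; only the Python standard library is used.

```python
import math

def state_operator(w):
    """W = (1 + w . sigma)/2 in the sigma_z basis, cf. (III. 153)."""
    wx, wy, wz = w
    return [[(1 + wz) / 2, (wx - 1j * wy) / 2],
            [(wx + 1j * wy) / 2, (1 - wz) / 2]]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def eigenvalues(m):
    """Eigenvalues of a Hermitian 2x2 matrix from its trace and determinant."""
    t = complex(trace(m)).real
    d = complex(m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    root = math.sqrt(max(t * t - 4 * d, 0.0))
    return ((t - root) / 2, (t + root) / 2)

def purity(m):
    """Tr W^2; it equals 1 exactly when W is idempotent, i.e. pure."""
    return complex(trace(matmul(m, m))).real

examples = {
    "completely polarized": (0.0, 0.0, 1.0),
    "partially polarized": (0.0, 0.5, 0.0),
    "unpolarized": (0.0, 0.0, 0.0),
}
results = {}
for name, w in examples.items():
    W = state_operator(w)
    results[name] = (eigenvalues(W), purity(W))
    print(name, results[name])
```

In each case the trace is 1 and both eigenvalues are non-negative, as required of a state operator; Tr W² reaches 1 only on the surface of the sphere.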



EXERCISE 25. Prove the following statements.<br />

(a) $\langle\vec{\sigma}\rangle_W = \vec{w}$,<br />

(b) $\det W = \tfrac14(1 - \vec{w}^{\,2})$,<br />

(c) the eigenvalues of W are $\tfrac12 \pm \tfrac12\|\vec{w}\|$.<br />

EXAMPLES<br />

In the following two examples, consider vectors ⃗w with ∥ ⃗w∥ = 1, thus corresponding to pure<br />

states.<br />

(a) Since in this case ⃗w equals the unit vector ⃗n, for ⃗w = (0, 0, 1) ∈ R 3 we have<br />

$$W = \tfrac12(\mathbb{1} + \sigma_z) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \tag{III. 155}$$<br />

which is a 1 - dimensional projector: it is the matrix representation of W = |z ↑⟩ ⟨z ↑|.<br />

Likewise we have<br />

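$$\vec{w} = (1, 0, 0) \implies W = \tfrac12(\mathbb{1} + \sigma_x) = |x\uparrow\rangle\langle x\uparrow|, \tag{III. 156}$$<br />

$$\vec{w} = (0, 1, 0) \implies W = \tfrac12(\mathbb{1} + \sigma_y) = |y\uparrow\rangle\langle y\uparrow|,$$<br />

and we see that generally $W = \tfrac12(\mathbb{1} + \vec{n}\cdot\vec{\sigma})$ corresponds to the pure state |⃗n, +⟩, as was<br />

already shown in (III. 149).<br />

In the same way, for |⃗n, −⟩ we have<br />

$$\vec{w} = (0, 0, -1) \implies W = \tfrac12(\mathbb{1} - \sigma_z) = |z\downarrow\rangle\langle z\downarrow|, \tag{III. 157}$$<br />

etc.<br />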

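(b) For the probability to find spin up in the direction ⃗n ′ in the state |⃗n, +⟩, with (III. 45)<br />

and (III. 127) we find<br />

$$\begin{aligned} \mu_{W_{\vec{n}}}(W_{\vec{n}'}) &= \mathrm{Tr}\, W_{\vec{n}'} W_{\vec{n}} = \mathrm{Tr}\bigl( \tfrac12(\mathbb{1} + \vec{n}'\cdot\vec{\sigma}) \cdot \tfrac12(\mathbb{1} + \vec{n}\cdot\vec{\sigma}) \bigr) \\ &= \tfrac14 \mathrm{Tr}\bigl( \mathbb{1} + \vec{n}'\cdot\vec{\sigma} + \vec{n}\cdot\vec{\sigma} + (\vec{n}'\cdot\vec{n})\,\mathbb{1} + i\,\vec{\sigma}\cdot(\vec{n}'\times\vec{n}) \bigr) \\ &= \tfrac12(1 + \vec{n}'\cdot\vec{n}) = \tfrac12(1 + \cos\theta) = \cos^2\tfrac12\theta, \end{aligned} \tag{III. 158}$$<br />

with θ the angle between ⃗n and ⃗n ′ . This is in accordance with (III. 148).<br />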
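The trace in (III. 158) can be evaluated numerically for two arbitrary directions. This is a minimal sketch with assumed helper names and sample angles, using the projector form (III. 149) and standard-library Python only.

```python
import math

def projector(n):
    """(1 + n . sigma)/2, the projector on |n,+> according to (III. 149)."""
    nx, ny, nz = n
    return [[(1 + nz) / 2, (nx - 1j * ny) / 2],
            [(nx + 1j * ny) / 2, (1 - nz) / 2]]

def trace_of_product(a, b):
    """Tr(a b) for 2x2 matrices, without forming the product explicitly."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

def unit_vector(theta, phi):
    """Unit vector with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

n = unit_vector(0.4, 1.1)         # arbitrary sample directions
n_prime = unit_vector(2.0, -0.3)
cos_angle = sum(x * y for x, y in zip(n, n_prime))
angle = math.acos(cos_angle)      # angle between n and n'

probability = complex(trace_of_product(projector(n_prime), projector(n))).real
# The two printed numbers should agree: Tr(W_n' W_n) and cos^2(angle/2).
print(probability, math.cos(angle / 2) ** 2)
```

The directions are deliberately chosen outside a common coordinate plane, so the cross-product term in (III. 158) is present and is seen to drop out of the trace.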

The following examples concern mixed state operators W , for which ⃗w has its endpoint somewhere<br />

inside the sphere, ⃗w 2 < 1.<br />

(c) Choosing ⃗w to be $\tfrac12(0, 1, 0)$ yields<br />

$$W = \tfrac12(\mathbb{1} + \tfrac12\sigma_y) = \begin{pmatrix} \tfrac12 & -\tfrac14 i \\ \tfrac14 i & \tfrac12 \end{pmatrix}. \tag{III. 159}$$<br />

This can, for instance, be factorized as<br />

$$W = \tfrac14\,|z\uparrow\rangle\langle z\uparrow| + \tfrac14\,|z\downarrow\rangle\langle z\downarrow| + \tfrac12\,|y\uparrow\rangle\langle y\uparrow|, \tag{III. 160}$$<br />

which clearly is a mixture.



The next two examples concern the center of the Bloch sphere, ⃗w = 0 .<br />

(d) With ⃗w = 0 , we have<br />

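$$W = \tfrac12 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{III. 161}$$<br />

The eigenvalues of this mixed state W are degenerate, and various factorizations are possible,<br />

for example<br />

$$\begin{aligned} W &= \tfrac12\,|x\uparrow\rangle\langle x\uparrow| + \tfrac12\,|x\downarrow\rangle\langle x\downarrow| \\ &= \tfrac12\,|y\uparrow\rangle\langle y\uparrow| + \tfrac12\,|y\downarrow\rangle\langle y\downarrow| \\ &= \tfrac12\,|z\uparrow\rangle\langle z\uparrow| + \tfrac12\,|z\downarrow\rangle\langle z\downarrow|. \end{aligned} \tag{III. 162}$$<br />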

(e) Under a rotation R, ⃗w behaves like a vector in R 3 ,<br />

$$U(R)\,(\vec{w}\cdot\vec{\sigma})\,U^{-1}(R) = \vec{w}_R\cdot\vec{\sigma} \tag{III. 163}$$<br />

where U (R) is given by (III. 135). Therefore, the only rotation invariant state for a 1 - particle<br />

system is ⃗w = 0 .<br />

The similarity between the set of density matrices W and the 3 - dimensional unit sphere of polarization<br />

vectors is specific for spin 1/2 particles, in which case every pure state is also the eigenstate<br />

for the spin operator in a certain spin direction. For spin 1 bosons and higher spin particles this no<br />

longer applies.<br />
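The factorizations in examples (c) and (d) can be checked by adding weighted projectors. The sketch below represents each pure state |⃗n, +⟩ by its projector (III. 149), so that |⃗n, −⟩ is the projector for −⃗n; the function names are illustrative assumptions.

```python
def projector(n):
    """(1 + n . sigma)/2, the projector on |n,+> according to (III. 149)."""
    nx, ny, nz = n
    return [[(1 + nz) / 2, (nx - 1j * ny) / 2],
            [(nx + 1j * ny) / 2, (1 - nz) / 2]]

def mixture(components):
    """Convex combination sum_i p_i |n_i,+><n_i,+| of pure states."""
    total = [[0j, 0j], [0j, 0j]]
    for weight, direction in components:
        p = projector(direction)
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * p[i][j]
    return total

def same(a, b, tol=1e-12):
    """Entry-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def opposite(n):
    return tuple(-c for c in n)

X, Y, Z = (1, 0, 0), (0, 1, 0), (0, 0, 1)

# Example (d): three different equal-weight mixtures, cf. (III. 162).
centre = [[0.5, 0], [0, 0.5]]
decompositions_d = [mixture([(0.5, axis), (0.5, opposite(axis))])
                    for axis in (X, Y, Z)]

# Example (c): the factorization (III. 160) against the matrix (III. 159).
example_c = mixture([(0.25, Z), (0.25, opposite(Z)), (0.5, Y)])
target_c = [[0.5, -0.25j], [0.25j, 0.5]]

print(all(same(m, centre) for m in decompositions_d), same(example_c, target_c))
```

Different ensembles of pure states thus give one and the same state operator, which is why W alone does not single out a particular factorization.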

III. 6. 3 TWO SPIN 1/2 PARTICLES<br />

III. 6. 3. 1 SINGLET AND TRIPLET STATES<br />

Consider a composite system of two spin 1/2 fermions. In the direct product space $\mathbb{C}^2 \otimes \mathbb{C}^2 = \mathbb{C}^4$<br />

a basis is<br />

$$|z\uparrow\rangle \otimes |z\uparrow\rangle, \quad |z\uparrow\rangle \otimes |z\downarrow\rangle, \quad |z\downarrow\rangle \otimes |z\uparrow\rangle, \quad |z\downarrow\rangle \otimes |z\downarrow\rangle. \tag{III. 164}$$<br />

From these basis states the simultaneous eigenstates |s, m⟩ of the operators $\vec{S}^{\,2} = (\vec{S}_1 + \vec{S}_2)^2$<br />

and $S_z = S_{1z} + S_{2z}$ can be formed, where s can be 0 or 1. The eigenvalues of $\vec{S}^{\,2}$ are $\hbar^2 s(s+1)$, the<br />

eigenvalues of $S_z$ are $m\hbar$, as introduced on p. 65.<br />

The singlet state or singlet for short, with s = 0 and therefore m = 0, is the entangled state<br />

$$|\Psi_0\rangle = |0,0\rangle = \tfrac12\sqrt{2}\,\bigl( |z\uparrow\rangle \otimes |z\downarrow\rangle - |z\downarrow\rangle \otimes |z\uparrow\rangle \bigr), \tag{III. 165}$$<br />

which looks the same in terms of the eigenstates of S x and S y , having spherical symmetry. The singlet<br />

is a simultaneous eigenstate of S x , S y and S z with eigenvalue 0. Hence the singlet is an eigenstate<br />

of ⃗n · ⃗S with eigenvalue 0, which means that a rotation (III. 133) carries (III. 165) back into itself.



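The triplet states, with s = 1 and m = 1, 0, −1 are<br />

$$\begin{aligned} |1,1\rangle &= |z\uparrow\rangle \otimes |z\uparrow\rangle \\ |1,0\rangle &= \tfrac12\sqrt{2}\,\bigl( |z\uparrow\rangle \otimes |z\downarrow\rangle + |z\downarrow\rangle \otimes |z\uparrow\rangle \bigr) \\ |1,-1\rangle &= |z\downarrow\rangle \otimes |z\downarrow\rangle. \end{aligned} \tag{III. 166}$$<br />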
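That the singlet and triplet states are simultaneous eigenstates of the total spin operators can be confirmed with explicit 4 × 4 matrices in the basis (III. 164). The sketch below works in units with ħ = 1 and takes the spin operators as half the Pauli matrices; the helper names are illustrative assumptions.

```python
import math

SIGMA = [
    [[0, 1], [1, 0]],       # sigma_x
    [[0, -1j], [1j, 0]],    # sigma_y
    [[1, 0], [0, -1]],      # sigma_z
]
ONE = [[1, 0], [0, 1]]

def kron(a, b):
    """Kronecker product; rows and columns follow the basis order (III. 164)."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def total_spin(k):
    """S_k = (sigma_k (x) 1 + 1 (x) sigma_k)/2 with hbar = 1."""
    s = add(kron(SIGMA[k], ONE), kron(ONE, SIGMA[k]))
    return [[x / 2 for x in row] for row in s]

S = [total_spin(k) for k in range(3)]
S_SQUARED = [[0] * 4 for _ in range(4)]
for component in S:
    S_SQUARED = add(S_SQUARED, matmul(component, component))

r = 1 / math.sqrt(2)
states = {               # label (s, m) -> components in the basis (III. 164)
    (0, 0): [0, r, -r, 0],
    (1, 1): [1, 0, 0, 0],
    (1, 0): [0, r, r, 0],
    (1, -1): [0, 0, 0, 1],
}

def is_eigenvector(operator, vector, eigenvalue, tol=1e-12):
    image = apply(operator, vector)
    return all(abs(x - eigenvalue * y) < tol for x, y in zip(image, vector))

# For each |s, m>: eigenvalue s(s+1) of S^2 and eigenvalue m of S_z.
checks = {
    label: (is_eigenvector(S_SQUARED, v, label[0] * (label[0] + 1)),
            is_eigenvector(S[2], v, label[1]))
    for label, v in states.items()
}
print(checks)
```

The same `is_eigenvector` helper can be applied to `S[0]` and `S[1]` with the singlet to see that it has eigenvalue 0 for every component of the total spin.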

III. 6. 3. 2 CORRELATIONS<br />

In chapter VII we will use the spin correlation function of the singlet,<br />

$$E_{QM}(\vec{a}, \vec{b}) := \langle 0,0|\,\vec{a}\cdot\vec{\sigma}_1 \otimes \vec{b}\cdot\vec{\sigma}_2\,|0,0\rangle, \tag{III. 167}$$<br />

where $\vec{a}, \vec{b} \in \mathbb{R}^3$ are unit vectors. $E_{QM}(\vec{a}, \vec{b})$ is the expectation value of the product of the outcomes, +1 for spin<br />

up and −1 for spin down, of spin measurements on particle 1 along ⃗a and on particle 2 along ⃗b. To find $E_{QM}(\vec{a}, \vec{b})$, first choose the z - axis along ⃗a<br />

as in diagram III. 3, next choose the x - axis in such a way that ⃗b is in the xz - plane. The spherical<br />

symmetry of the singlet state allows such a choice.<br />

Figure III. 3: Spin up for particle 1 along ⃗a, for particle 2 along ⃗b (⃗a along the z - axis, ⃗b in the xz - plane at angle θ ⃗a,⃗b from ⃗a)<br />

With $\vec{a} = \vec{e}_z$, $\vec{b}$ similar to $\vec{n}$ in (III. 137), and $\theta_{\vec{a},\vec{b}}$ the angle between ⃗a and ⃗b, we have<br />

$$E_{QM}(\vec{a}, \vec{b}) = \langle 0,0|\,\sigma_{1z} \otimes (\sin\theta_{\vec{a},\vec{b}}\;\sigma_{2x} + \cos\theta_{\vec{a},\vec{b}}\;\sigma_{2z})\,|0,0\rangle. \tag{III. 168}$$<br />

Now σ z |z ↑⟩ = |z ↑⟩, σ x |z ↑⟩ = |z ↓⟩ etc., so that we have, using (II. 100), (III. 165) and (III. 166),<br />

$$(\sigma_{1z} \otimes \sigma_{2x})\,|0,0\rangle = \tfrac12\sqrt{2}\,\bigl( |1,1\rangle + |1,-1\rangle \bigr), \tag{III. 169}$$<br />

which is perpendicular to |0, 0⟩, and<br />

$$(\sigma_{1z} \otimes \sigma_{2z})\,|0,0\rangle = -\,|0,0\rangle, \tag{III. 170}$$<br />

from which we see that<br />

$$E_{QM}(\vec{a}, \vec{b}) = -\cos\theta_{\vec{a},\vec{b}}. \tag{III. 171}$$
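The result (III. 171) can be compared with a direct evaluation of (III. 167) for directions that are not aligned with the coordinate axes. This is a sketch under the assumption that the operators are represented by Kronecker products in the basis (III. 164); all names and sample angles are illustrative.

```python
import math

def spin_along(n):
    """n . sigma in the sigma_z basis."""
    nx, ny, nz = n
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]

def kron(a, b):
    """Kronecker product; rows and columns follow the basis order (III. 164)."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def expectation(state, operator):
    """<state| operator |state> for a normalized state vector."""
    size = len(state)
    return sum(state[i].conjugate() * operator[i][j] * state[j]
               for i in range(size) for j in range(size))

def unit_vector(theta, phi):
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

r = 1 / math.sqrt(2)
SINGLET = [0.0, r, -r, 0.0]   # (III. 165) in the basis (III. 164)

def correlation(a, b):
    """E_QM(a, b) of (III. 167)."""
    return complex(expectation(SINGLET, kron(spin_along(a), spin_along(b)))).real

a = unit_vector(0.9, 0.2)     # arbitrary sample directions
b = unit_vector(2.3, 1.7)
minus_cos = -sum(x * y for x, y in zip(a, b))
# The two printed numbers should agree: E_QM(a, b) and -cos(angle between a and b).
print(correlation(a, b), minus_cos)
```

Because the sample directions are generic, the agreement also illustrates the spherical symmetry of the singlet that was used to choose the axes.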



III. 6. 3. 3 CONDITIONAL PROBABILITIES<br />

In chapter VII we will also need to know, again in case the particles are in the singlet state, the<br />

probability for the spin of particle 2 to be found in the direction ⃗ b, given that the spin of particle 1 was<br />

found in the direction ⃗a. This conditional probability is, by definition,<br />

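$$\mathrm{Prob}\bigl( \vec{b}\cdot\vec{\sigma}_2 = 1 \,\big|\, \vec{a}\cdot\vec{\sigma}_1 = 1 \bigr) = \frac{\mathrm{Prob}\bigl( \vec{b}\cdot\vec{\sigma}_2 = 1 \,\wedge\, \vec{a}\cdot\vec{\sigma}_1 = 1 \bigr)}{\mathrm{Prob}\bigl( \vec{a}\cdot\vec{\sigma}_1 = 1 \bigr)}. \tag{III. 172}$$<br />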

Here the joint probability is<br />

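$$\mathrm{Prob}\bigl( \vec{b}\cdot\vec{\sigma}_2 = 1 \,\wedge\, \vec{a}\cdot\vec{\sigma}_1 = 1 \bigr) = \bigl|\, \bigl( \langle\vec{a}\uparrow| \otimes \langle\vec{b}\uparrow| \bigr)\,|0,0\rangle \,\bigr|^2, \tag{III. 173}$$<br />

with |⃗a ↑⟩ ⊗ |⃗b ↑⟩ the direct product of the eigenstates of ⃗a · ⃗σ 1 and ⃗b · ⃗σ 2 having eigenvalues +1.<br />

Again choosing ⃗a and ⃗b as in diagram III. 3, |⃗a ↑⟩ = |z ↑⟩ and |⃗b ↑⟩ equal to |⃗n, +⟩, (III. 138), we find<br />

for the direct product<br />

$$|\vec{a}\uparrow\rangle \otimes |\vec{b}\uparrow\rangle = |z\uparrow\rangle \otimes \bigl( \cos\tfrac12\theta_{\vec{a},\vec{b}}\,|z\uparrow\rangle + \sin\tfrac12\theta_{\vec{a},\vec{b}}\,|z\downarrow\rangle \bigr). \tag{III. 174}$$<br />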

Therefore, with (III. 165),<br />

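$$\bigl( \langle\vec{a}\uparrow| \otimes \langle\vec{b}\uparrow| \bigr)\,|0,0\rangle = \tfrac12\sqrt{2}\,\sin\tfrac12\theta_{\vec{a},\vec{b}}, \tag{III. 175}$$<br />

and we see that the joint probability is<br />

$$\mathrm{Prob}\bigl( \vec{b}\cdot\vec{\sigma}_2 = 1 \,\wedge\, \vec{a}\cdot\vec{\sigma}_1 = 1 \bigr) = \tfrac12 \sin^2\tfrac12\theta_{\vec{a},\vec{b}}. \tag{III. 176}$$<br />

Likewise, again using (III. 173) with |⃗b ↓⟩ equal to |⃗n, −⟩, (III. 139), we have<br />

$$\mathrm{Prob}\bigl( \vec{b}\cdot\vec{\sigma}_2 = -1 \,\wedge\, \vec{a}\cdot\vec{\sigma}_1 = 1 \bigr) = \tfrac12 \cos^2\tfrac12\theta_{\vec{a},\vec{b}}. \tag{III. 177}$$<br />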

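This yields for the marginal probability<br />

$$\begin{aligned} \mathrm{Prob}\bigl( \vec{a}\cdot\vec{\sigma}_1 = 1 \bigr) &= \mathrm{Prob}\bigl( \vec{b}\cdot\vec{\sigma}_2 = 1 \,\wedge\, \vec{a}\cdot\vec{\sigma}_1 = 1 \bigr) + \mathrm{Prob}\bigl( \vec{b}\cdot\vec{\sigma}_2 = -1 \,\wedge\, \vec{a}\cdot\vec{\sigma}_1 = 1 \bigr) \\ &= \tfrac12 \sin^2\tfrac12\theta_{\vec{a},\vec{b}} + \tfrac12 \cos^2\tfrac12\theta_{\vec{a},\vec{b}} = \tfrac12, \end{aligned} \tag{III. 178}$$<br />

and we see that the conditional probability (III. 172) is<br />

$$\mathrm{Prob}\bigl( \vec{b}\cdot\vec{\sigma}_2 = 1 \,\big|\, \vec{a}\cdot\vec{\sigma}_1 = 1 \bigr) = \sin^2\tfrac12\theta_{\vec{a},\vec{b}}. \tag{III. 179}$$<br />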

◃ Remark<br />

By definition there is no correlation between the two results of measurements of spin if<br />

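$$\mathrm{Prob}\bigl( \vec{b}\cdot\vec{\sigma}_2 = 1 \,\big|\, \vec{a}\cdot\vec{\sigma}_1 = 1 \bigr) = \mathrm{Prob}\bigl( \vec{b}\cdot\vec{\sigma}_2 = 1 \bigr), \tag{III. 180}$$<br />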

which is the case if ⃗a and ⃗ b are perpendicular. ▹
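The chain from joint to marginal to conditional probability, (III. 173) to (III. 179), can be reproduced with the geometry of diagram III. 3, where all amplitudes are real. The names and the sample angle below are illustrative assumptions.

```python
import math

def spin_up(theta):
    """|n,+> for n in the xz-plane at polar angle theta, cf. (III. 138)."""
    return [math.cos(theta / 2), math.sin(theta / 2)]

def spin_down(theta):
    """|n,-> for n in the xz-plane at polar angle theta, cf. (III. 139)."""
    return [-math.sin(theta / 2), math.cos(theta / 2)]

def product_state(first, second):
    """Components of first (x) second in the basis order (III. 164)."""
    return [x * y for x in first for y in second]

r = 1 / math.sqrt(2)
SINGLET = [0.0, r, -r, 0.0]   # (III. 165)

def joint_probability(ket_1, ket_2):
    """|(<ket_1| (x) <ket_2|) |0,0>|^2 as in (III. 173); amplitudes are real here."""
    amplitude = sum(p * s for p, s in zip(product_state(ket_1, ket_2), SINGLET))
    return amplitude ** 2

theta = 1.1               # sample angle between a and b, in radians
a_up = spin_up(0.0)       # a along the z-axis, so |a up> = |z up>

p_up_up = joint_probability(a_up, spin_up(theta))       # cf. (III. 176)
p_up_down = joint_probability(a_up, spin_down(theta))   # cf. (III. 177)
marginal = p_up_up + p_up_down                          # cf. (III. 178)
conditional = p_up_up / marginal                        # cf. (III. 179)

print(p_up_up, p_up_down, marginal, conditional)
```

Setting `theta = math.pi / 2` gives a conditional probability equal to the marginal probability of spin up for particle 2, the uncorrelated case of the remark.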



We are now able to calculate the correlation (III. 167) directly, using a well - known formula from<br />

probability theory,<br />

$$E_{QM}(\vec{a}, \vec{b}) = \sum_{a = \pm 1}\; \sum_{b = \pm 1} a\, b\; \mathrm{Prob}(a, b), \tag{III. 181}$$<br />

where a, b ∈ { −1, 1} are the results of measurements of ⃗a · ⃗σ 1 and ⃗ b · ⃗σ 2 , respectively, and<br />

Prob (a, b) is the joint probability to find a and b at measurements of the respective spin quantities.<br />

Using (III. 176) and (III. 177) and calculating the probabilities with eigenvalues −1 for ⃗a · ⃗σ 1 we<br />

find<br />

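$$\begin{aligned} E_{QM}(\vec{a}, \vec{b}) &= \mathrm{Prob}(1, 1) + \mathrm{Prob}(-1, -1) - \mathrm{Prob}(1, -1) - \mathrm{Prob}(-1, 1) \\ &= 2 \cdot \tfrac12 \sin^2\tfrac12\theta_{\vec{a},\vec{b}} - 2 \cdot \tfrac12 \cos^2\tfrac12\theta_{\vec{a},\vec{b}} \\ &= -\cos\theta_{\vec{a},\vec{b}}. \end{aligned} \tag{III. 182}$$<br />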

This is indeed equal to the earlier result (III. 171).<br />
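The sum (III. 181) can also be evaluated for arbitrary directions. The sketch below assumes that each joint probability is the singlet expectation value of the product of the two projectors $\tfrac12(\mathbb{1} \pm \vec{n}\cdot\vec{\sigma})$, in line with (III. 149); this way of computing Prob (a, b), like the helper names, is an assumption of the sketch rather than the route taken in the text.

```python
import math

def projector(n, sign):
    """(1 + sign * n . sigma)/2: projector on the eigenvalue `sign` of n . sigma."""
    nx, ny, nz = (sign * c for c in n)
    return [[(1 + nz) / 2, (nx - 1j * ny) / 2],
            [(nx + 1j * ny) / 2, (1 - nz) / 2]]

def kron(a, b):
    """Kronecker product; rows and columns follow the basis order (III. 164)."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def expectation(state, operator):
    size = len(state)
    return sum(state[i].conjugate() * operator[i][j] * state[j]
               for i in range(size) for j in range(size))

r = 1 / math.sqrt(2)
SINGLET = [0.0, r, -r, 0.0]   # (III. 165)

def joint_probability(a, sign_a, b, sign_b):
    """Prob(sign_a, sign_b) for measurements along a (particle 1) and b (particle 2)."""
    operator = kron(projector(a, sign_a), projector(b, sign_b))
    return complex(expectation(SINGLET, operator)).real

def correlation_from_probabilities(a, b):
    """The double sum of (III. 181) over outcomes +1 and -1."""
    return sum(sa * sb * joint_probability(a, sa, b, sb)
               for sa in (-1, 1) for sb in (-1, 1))

a = (0.0, 0.0, 1.0)
b = (math.sin(1.1), 0.0, math.cos(1.1))   # angle 1.1 rad from a, in the xz-plane
total = sum(joint_probability(a, sa, b, sb) for sa in (-1, 1) for sb in (-1, 1))
# The first two printed numbers should agree; the third is the total probability.
print(correlation_from_probabilities(a, b), -math.cos(1.1), total)
```

The four joint probabilities add up to 1, and their signed sum reproduces −cos θ ⃗a,⃗b.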

III. 6. 3. 4 EXAMPLE OF A MIXED STATE OF TWO SPIN 1/2 PARTICLES<br />

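Consider, analogous to (III. 100), the pure entangled state<br />

$$|\Phi\rangle = \tfrac12\sqrt{2}\,\bigl( |z\uparrow\rangle \otimes |z\uparrow\rangle + |z\downarrow\rangle \otimes |z\downarrow\rangle \bigr), \tag{III. 183}$$<br />

and the corresponding state W = |Φ⟩ ⟨Φ|, acting in H I ⊗ H II ,<br />

$$\begin{aligned} W = \tfrac12 \bigl( &|z\uparrow\rangle\langle z\uparrow| \otimes |z\uparrow\rangle\langle z\uparrow| + |z\uparrow\rangle\langle z\downarrow| \otimes |z\uparrow\rangle\langle z\downarrow| \\ + {} &|z\downarrow\rangle\langle z\uparrow| \otimes |z\downarrow\rangle\langle z\uparrow| + |z\downarrow\rangle\langle z\downarrow| \otimes |z\downarrow\rangle\langle z\downarrow| \bigr), \end{aligned} \tag{III. 184}$$<br />

where the first factor in the direct product acts in H I , and the second factor in H II .<br />

The representation of W in the corresponding basis (III. 164) of H = H I ⊗ H II is, using the<br />

Kronecker product of matrices, (II. 103),<br />

$$W = \tfrac12 \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}. \tag{III. 185}$$<br />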

This is indeed a pure state, since W is idempotent, a necessary and sufficient condition for bounded,<br />

self - adjoint operators to be a projector.<br />

The partial traces are<br />

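$$W_I = \tfrac12\,|z\uparrow\rangle\langle z\uparrow| + \tfrac12\,|z\downarrow\rangle\langle z\downarrow| \in S(H_I), \tag{III. 186}$$<br />

$$W_{II} = \tfrac12\,|z\uparrow\rangle\langle z\uparrow| + \tfrac12\,|z\downarrow\rangle\langle z\downarrow| \in S(H_{II}), \tag{III. 187}$$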



and their matrix representation in the basis of σ z is<br />

$$W_I = \tfrac12 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad\text{and}\quad W_{II} = \tfrac12 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{III. 188}$$<br />

Although W is a pure state, the direct product of the partial traces W I and W II is not pure,<br />

$$W_I \otimes W_{II} = \tfrac14 \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \neq W. \tag{III. 189}$$<br />

This conclusion is, of course, in accordance with the conclusion (III. 104) concerning the pure state<br />

operator (III. 100).<br />

◃ Remark<br />

Notice that all matrices in this example are indeed Hermitian, positive and have trace 1, the requirements<br />

of Gleason’s theorem, p. 47, for operators W to be state operators. ▹<br />
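The steps of this example, from |Φ⟩ to W, its partial traces and their direct product, can be retraced with small matrices. The sketch below uses the basis order (III. 164), so that the component index is 2i + k for basis vector i of H I and k of H II ; the function names are illustrative assumptions.

```python
import math

r = 1 / math.sqrt(2)
PHI = [r, 0.0, 0.0, r]   # (III. 183) in the basis (III. 164)

def outer(v):
    """|v><v| as a matrix."""
    return [[x * y.conjugate() for y in v] for x in v]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def kron(a, b):
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def partial_trace_second(w):
    """Trace over H_II: (W_I)_{ij} = sum_k W_{(i,k),(j,k)}."""
    return [[sum(w[2 * i + k][2 * j + k] for k in range(2)) for j in range(2)]
            for i in range(2)]

def partial_trace_first(w):
    """Trace over H_I: (W_II)_{kl} = sum_i W_{(i,k),(i,l)}."""
    return [[sum(w[2 * i + k][2 * i + l] for i in range(2)) for l in range(2)]
            for k in range(2)]

def same(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

W = outer(PHI)                      # cf. (III. 185)
W_I = partial_trace_second(W)       # cf. (III. 186), (III. 188)
W_II = partial_trace_first(W)       # cf. (III. 187), (III. 188)
product = kron(W_I, W_II)           # cf. (III. 189)

is_pure = same(matmul(W, W), W)                           # idempotent: pure
product_is_pure = same(matmul(product, product), product)
print(is_pure, product_is_pure, same(product, W))
```

Idempotency distinguishes the entangled pure state W from the product of its partial traces, which is a multiple of the unit matrix.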

EXERCISE 26.<br />

(a) In (III. 184), fill in the matrix representations of the projectors in H I and H II , and check<br />

that forming Kronecker products indeed yields (III. 185).<br />

(b) Is the state (III. 184) spherically symmetric?


IV THE COPENHAGEN INTERPRETATION<br />

It is wrong to think that the task of physics is to find out how nature is. Physics concerns<br />

what we can say about nature.<br />

— Niels Bohr<br />

The Heisenberg-Bohr tranquilizing philosophy - or religion? - is so delicately contrived<br />

that, for the time being, it provides a gentle pillow for the true believer from which he<br />

cannot very easily be aroused. So let him lie there.<br />

— Albert Einstein<br />

I know it is not the fault of N. B. that he did not study philosophy. But I deeply regret<br />

that by his authority the brains of two or three generations will be upset and hindered to<br />

think about the problems ‘He’ pretends to have solved.<br />

— Erwin Schrödinger<br />

Bohr’s famous institute being located in Copenhagen, the standard interpretation of quantum<br />

mechanics as explained in most textbooks is generally referred to as the Copenhagen Interpretation.<br />

It is however worth mentioning that the conceptions of the many supporters of the<br />

Copenhagen Interpretation, Niels Bohr, Werner Heisenberg, Wolfgang Pauli, Rudolf Peierls,<br />

Léon Rosenfeld and John Wheeler, to name some of them, mutually differ on numerous points,<br />

and that some of them, including Bohr himself, modified their conceptions in the course of time,<br />

so that the name ‘Copenhagen Interpretation’ is more a collective noun than the name of one<br />

clearly outlined vision. Moreover, important contributions to the standard interpretation of the<br />

theory have been made by Born and Von Neumann, working independently of the Copenhagen<br />

school. In this chapter we will evaluate the conceptions of Heisenberg and Bohr as the main<br />

representatives of the Copenhagen Interpretation, and consider more closely the debate between<br />

Einstein and Bohr. Finally, we will discuss the exact expression of the uncertainty principle.<br />

IV. 1 HEISENBERG AND THE UNCERTAINTY PRINCIPLE<br />

The history of modern quantum mechanics starts in 1925, when Heisenberg publishes his famous<br />

transitional article ‘Über quantentheoretische Umdeutung kinematischer und mechanischer<br />

Beziehungen’ (‘Quantum - theoretical re - interpretation of kinematic and mechanical relations’). His<br />

summary reads<br />

The present paper seeks to establish a basis for theoretical quantum mechanics founded<br />

exclusively upon relationships between quantities which in principle are observable.



Obviously the theory was only allowed to speak about observable quantities; every attempt to<br />

visualize the inside of an atom had to be avoided. In particular, one could not speak of the orbit<br />

of an electron. Only the transitions between stationary states were ‘observable’ and therefore the<br />

transition quantities could be characterized by two discrete indices. These ideas were developed<br />

by Heisenberg, Born and Jordan into matrix mechanics. They represented all physical quantities<br />

by infinite complex Hermitian matrices. The ‘quantum condition’, the fundamental equation of this<br />

theory, is the commutation relation<br />

$$P\,Q - Q\,P = -\,i\hbar\,\mathbb{1} \tag{IV. 1}$$<br />

between the matrices P and Q, which were meant to be the ‘quantum counterparts’ of the canonical<br />

dynamical quantities, momentum and position, of classical mechanics à la Hamilton.<br />

In 1926 matrix mechanics received unexpected competition from wave mechanics, established by<br />

Erwin Schrödinger. He interpreted the electron as a vibrating charge cloud, continuously moving<br />

in space. In his conception the stationary states could be understood as resonances, comparable to<br />

the vibrations of the string of a violin. According to Schrödinger, wave mechanics was to be preferred<br />

over matrix mechanics because wave mechanics offers a graphic picture of what takes place in<br />

microphysical reality. This interpretation foundered on three insoluble problems:<br />

(i) waves of physical systems consisting of more than one particle were defined in the configuration<br />

space $\mathbb{R}^{3N}$ instead of in the three - dimensional space $\mathbb{R}^3$ surrounding us,<br />

(ii) wave packets of free particles eventually fall apart and therefore, the electron cannot remain a<br />

localized entity,<br />

(iii) the wave function can carry complex values.<br />

Nevertheless, eventually the empirical strength of wave mechanics turned out to be just as strong<br />

as that of matrix mechanics.<br />

The fact that an approach with such radically different starting points also turned out to be possible<br />

impelled Heisenberg to clarify his own starting points further. The result of this effort is his ‘uncertainty<br />

principle’, formulated for the first time in his 1927 article ‘Über den anschaulichen Inhalt der<br />

quantentheoretischen Kinematik und Dynamik’, which was translated as ‘The physical content of<br />

quantum kinematics and mechanics’.<br />

In this article Heisenberg wonders how the ‘orbit’ of an electron must be understood in quantum<br />

mechanics. On the one hand, the basic equation (IV. 1) prevents granting numerical values to position<br />

and momentum simultaneously, on the other hand, the path of a particle in, for example, a Wilson<br />

chamber, seems to be directly perceptible. To find a way out of this dilemma, he was inspired by a<br />

statement of Einstein (H.J. Folse 1985, p. 91),<br />

[. . . ] it is the theory finally which decides what can be observed and what can not [. . . ]<br />

Could it be that, if a path cannot be defined in quantum mechanics, it in fact cannot be observed either?<br />

This idea led him to analyze what the theory has to say about observations.



He starts (1927, Eng. tr. p. 64) with linking measuring and defining operationally,<br />

When one wants to be clear about what is to be understood by the words “position of the<br />

object”, for example of the electron, relative to a given frame of reference, then one must<br />

specify definite experiments with whose help one plans to measure the “position of the<br />

electron”, otherwise this word has no meaning.<br />

We will call this the measuring = defining principle.<br />

One could, for example, determine the position of an electron by examining it under a microscope.<br />

According to classical optics a microscope has a limited resolution. The Abbe criterion gives the<br />

smallest distinguishable details as<br />

$$\delta q \sim \frac{\lambda}{\sin\varepsilon}, \tag{IV. 2}$$<br />

where λ is the wavelength of light and ε is the aperture, the opening angle of the lens. For a precise<br />

measurement we must therefore use a very short wavelength, i.e. gamma radiation. But in that case<br />

the Compton effect cannot be neglected. The radiation behaves as a flow of particles, each with momentum<br />

$p_0 = h/\lambda$, which collide with the electron and cause it to recoil.<br />

Figure IV. 1: Heisenberg’s γ - microscope<br />

To allow for an observation at least one photon has to collide with the electron, which will bring<br />

about a change of momentum. But as we do not know anything more about the direction of the<br />

photon after the collision than that it has gone through the lens, we cannot indicate the size of the<br />

recoil exactly. As can be seen in figure IV. 1, the transfer of momentum remains unknown to an<br />

amount<br />

$$\delta p \sim p_0 \sin\varepsilon = \frac{h}{\lambda}\,\sin\varepsilon \tag{IV. 3}$$<br />

and therefore<br />

δq δp ∼ h. (IV. 4)



The more precisely the position is determined (δq small), the less accurately the momentum is known<br />

afterwards (δp large).<br />
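A few numbers illustrate the trade-off in (IV. 2) to (IV. 4). The sketch below treats the order-of-magnitude relations as equalities, which is an assumption made for illustration only; the wavelengths and the aperture are hypothetical sample values.

```python
import math

PLANCK_H = 6.62607015e-34   # Planck's constant in J s (exact in the SI since 2019)

def position_resolution(wavelength, aperture):
    """delta q from the Abbe criterion (IV. 2), taken as an equality."""
    return wavelength / math.sin(aperture)

def momentum_uncertainty(wavelength, aperture):
    """delta p from (IV. 3), with photon momentum p_0 = h / wavelength."""
    return PLANCK_H / wavelength * math.sin(aperture)

aperture = math.radians(30)   # hypothetical opening angle of the lens
rows = []
# Sample wavelengths in metres: visible light, X-rays, gamma radiation.
for wavelength in (5e-7, 1e-10, 1e-12):
    dq = position_resolution(wavelength, aperture)
    dp = momentum_uncertainty(wavelength, aperture)
    rows.append((wavelength, dq, dp, dq * dp / PLANCK_H))
    print(f"lambda = {wavelength:.1e} m: dq = {dq:.2e} m, "
          f"dp = {dp:.2e} kg m/s, dq*dp/h = {dq * dp / PLANCK_H:.3f}")
```

Shortening the wavelength improves δq and worsens δp by the same factor, so the product stays at h whatever the wavelength or the aperture.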

Quoting Heisenberg again (loc. cit.)<br />

At the instant when position is determined - therefore, at the moment when the photon is<br />

scattered by the electron - the electron undergoes a discontinuous change in momentum.<br />

This change is the greater the smaller the wavelength of the light employed - that is, the<br />

more exact the determination of the position. At the instant at which the position of the<br />

electron is known, its momentum therefore can be known up to magnitudes which correspond<br />

to that discontinuous change. Thus, the more precisely the position is determined,<br />

the less precisely the momentum is known, and conversely.<br />

This conclusion is the first formulation of the uncertainty principle. According to Heisenberg’s<br />

own measuring = defining principle this conclusion can, however, not yet be drawn because it also<br />

has to be specified what, in this context, must be understood by the term ‘momentum of the electron’.<br />

In a later discussion (Heisenberg 1930), Heisenberg specifies the reasoning by also discussing the<br />

definition of the momentum of the electron.<br />

This reasoning goes as follows. Suppose that the momentum of the electron has been measured<br />

in advance with an inaccuracy δ p 1 . Next, the position is measured with an inaccuracy δ q, then the<br />

momentum is measured again, with inaccuracy δp 2 . We can assume that δp 1 ≪ p 1 and δp 2 ≪ p 2 ,<br />

so that the momentum is very accurately known before and after the position measurement. Now it<br />

makes sense to speak of the momentum p 1 of the electron shortly before the position measurement.<br />

If now the position is measured very precisely, the position and momentum of the electron in the past<br />

are arbitrarily well defined. Heisenberg (1930, p. 20):<br />

[. . . ] if the velocity of the electron is at first known and the position then exactly measured,<br />

the position for times previous to the measurement may be calculated. Then for<br />

these past times δp δq is smaller than the usual limiting value [. . . ]<br />

Apparently, the uncertainty relation does not apply to the past. In the example the uncertainty concerns<br />

the unpredictability of the value of p 2 after the position measurement, not the inaccuracy δp 2<br />

with which p 2 can be measured. This unpredictability can be determined by accurately measuring<br />

the momentum before and after the determination of position, and the unpredictability is larger if<br />

the determination of position was more precise. Although it is true that one can speak in a logically<br />

consistent manner of the position and momentum of the electron in the past (loc. cit.),<br />

[. . . ] but this knowledge of the past is of a purely speculative character, since it can never<br />

(because of the unknown change in momentum caused by the position measurement) be<br />

used as an initial condition in any calculation of the future progress of the electron and<br />

thus cannot be subjected to experimental verification. It is a matter of personal belief<br />

whether such a calculation concerning the past history of the electron can be ascribed<br />

any physical reality or not.<br />

For Heisenberg, such a calculation does not describe reality. But then, what is reality to him?<br />

Heisenberg says, (1927, Eng. tr. p. 73),<br />

The “orbit” comes into being only when we observe it.



Apparently, the measurement creates reality, instead of revealing it. This is what we call the measuring<br />

= creating principle.<br />

This leads to the following representation. First, we measure the momentum of the electron<br />

precisely. Not only is the term “the momentum of the electron” hereby defined, now we also can<br />

say, according to the measuring = creating principle, that the value of the momentum, which was<br />

determined in this measurement, is physically real. Next, we measure the position precisely. At<br />

this measurement the electron obtains an exact position. After this measurement the momentum of<br />

the electron has however changed in an unpredictable manner. This can be verified with a second<br />

precise momentum measurement. This unpredictability turns out to be all the larger as the position<br />

measurement is more precise.<br />

Now the question is whether the electron had this changed momentum already before the second momentum<br />

measurement, i.e., whether this value is also physically real before this measurement. According<br />

to Heisenberg this is not the case, because we can only predict the momentum to the order of the<br />

size of the change. Before the second momentum measurement the electron has only a blurred, fuzzy<br />

momentum. Only when the measurement of momentum has been carried out does the electron regain a<br />

sharply defined momentum. ‘Fuzzy’ is meant in the ontological sense, as the sharpness of a property<br />

the electron possesses. As one quantity is measured more precisely, the conjugate quantity becomes<br />

more fuzzy.<br />

◃ Remark<br />

Directly after the measurement of momentum it is meaningful to say that the electron has this momentum,<br />

because in that case the outcome of a next measurement of momentum can, within the accuracy<br />

of measurement, be predicted with certainty. ▹<br />

In later work Heisenberg uses the Aristotelian term potential. A related term by K.R. Popper<br />

is propensity. The electron has a propensity to produce, at measurement, a certain outcome. This<br />

propensity can be understood as a real property of the electron, even if we are not performing a<br />

measurement. The potential and propensity interpretations are therefore ‘realistic’ interpretations, or<br />

at least not in conflict with scientific realism which is, roughly speaking, the thesis that a scientific<br />

theory tells us how (a part of) reality is made up.<br />

IV. 1. 1 REMARKS<br />

(a) Heisenberg derives the uncertainty relation (IV. 4) for the electron from a quantum mechanical<br />

treatment of the photon. What he in fact hereby proves is the consistency of the uncertainty<br />

principle.<br />

(b) Although it is frequently written that the uncertainty relation restricts simultaneous measurements,<br />

simultaneous measurements of position and momentum do not appear in this discussion.<br />

(c) Creation of the sharp value of a quantity upon measurement can, in the terminology of the<br />

projection postulate, p. 42, be described as follows. Upon measurement of p the state transforms<br />

into the proper eigenstate of p. In that state q is unpredictable. If next q is measured, the<br />

state transforms into the proper eigenstate of q and p becomes unpredictable. The uncertainty<br />

principle says that that unpredictability is larger if the preceding measurement of q was more<br />

precise.



(d) Heisenberg (1930) describes the path of an electron in a Wilson chamber as follows. Suppose<br />

that the incoming electron can be described by a wave packet with fairly sharply defined position<br />

and momentum. Upon free development this packet spreads out in the course of time so<br />

that the position becomes less sharp. When the electron ionizes a molecule in the Wilson chamber<br />

a macroscopic droplet is formed, which can be understood as a position measurement. As<br />

a result the wave packet reduces to a packet which is rather sharply located, with a dimension<br />

in the order of a molecule, which again spreads out until a next ionization takes place.<br />

It can be shown that the successive spreading and contraction in position and momentum is,<br />

according to the uncertainty principle, in agreement with the observation of a macroscopic<br />

path. We cannot speak however of the path of an electron in an atom, not even approximately.<br />

An observation of the position of the electron with an accuracy finer than the dimension of the<br />

atom requires such a large recoil that the electron is generally pushed out of the atom entirely.<br />

Therefore, of such an ‘orbit’ no more than one point is observable. Notice that observation plays<br />

a vital role; the path in the Wilson chamber only comes into existence because we observe it.<br />

(e) As a result of Heisenberg’s discussion of the uncertainty principle the term measurement disturbance<br />

was introduced in quantum mechanics. Initially the inclination existed to consider this<br />

as a more or less classical physical process; the momentum of the electron is disturbed by the<br />

collision with a photon. This is also indicated by Heisenberg’s use of the word ‘error’ for δq.<br />

From the beginning, Bohr resisted this explanation of Heisenberg, and he put the emphasis on<br />

the necessity to combine mutually exclusive terms from a wave and particle picture in one description.<br />

Especially because of EPR it later became clear that the ‘measurement disturbance’<br />

cannot be an ordinary error.<br />
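The spreading invoked in remark (d) can be made quantitative with the standard textbook result for a free Gaussian wave packet, whose position spread grows as σ(t) = σ₀ √(1 + (ħt / 2mσ₀²)²).<br />
The following Python sketch is only an illustration: the Gaussian shape of the packet, the electron speed and the distance between ionizations are assumptions chosen here, not values taken from Heisenberg.<br />

```python
import math

HBAR = 1.054571817e-34         # reduced Planck constant, J s
M_ELECTRON = 9.1093837015e-31  # electron mass, kg


def packet_width(sigma0, t, mass=M_ELECTRON):
    """Position spread of a free Gaussian packet a time t after it had spread sigma0."""
    return sigma0 * math.sqrt(1.0 + (HBAR * t / (2.0 * mass * sigma0 ** 2)) ** 2)


def momentum_spread(sigma0):
    """Momentum spread of a minimum-uncertainty Gaussian packet of position spread sigma0."""
    return HBAR / (2.0 * sigma0)


# Assumed, purely illustrative numbers (not from the text):
sigma0 = 1e-10  # packet reduced to roughly molecular size, m
speed = 1e7     # electron speed, m/s
gap = 1e-4      # distance travelled between two ionizations, m

t = gap / speed                  # time of free evolution between ionizations, s
width = packet_width(sigma0, t)  # spread just before the next ionization, m

print(f"spread after {t:.1e} s: {width:.2e} m")
print(f"spread relative to the gap between droplets: {width / gap:.2e}")
```

With these assumed numbers the packet grows from molecular size to the order of micrometres before the next ionization, which is still small compared with the assumed distance between droplets, so the successive droplets line up along a macroscopic track.<br />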

IV. 2 BOHR AND COMPLEMENTARITY<br />

The core of the Copenhagen interpretation lies, of course, in Bohr’s work. His articles are characterized<br />

by a style entirely his own. Remarkably, Bohr hardly uses the formalism of the theory; he<br />

generally gives a qualitative argument instead. His difficult, and sometimes obscurely formulated,<br />

long sentences are notorious, full of subordinate clauses and conditional definitions which do not<br />

always clarify his intentions. A careful reconstruction and interpretation of Bohr’s point of view,<br />

and its development in the course of time, has been given by E. Scheibe (1973, chapter 1); another<br />

interpretation is given in the monograph of H.J. Folse (1985).<br />

Central to Bohr’s considerations is the language we use to do physics. Bohr emphasizes that,<br />

regardless of how abstract and refined the terms of modern physics may be, in essence they are only<br />

an extension of everyday language, and they are nothing but means of communication we use to<br />

communicate observational results to other people. Such an observational result, the outcome of a<br />

measurement on a physical system in certain experimental circumstances, is therefore the basic element<br />

of consideration. For this, Bohr uses the term phenomenon. Every phenomenon is the resultant<br />

of a physical system S, a preparation apparatus P, a measuring apparatus M and their mutual interaction<br />

in a concrete experimental situation.<br />

The description of a phenomenon must always be made in unambiguous terms because of the<br />

requirement of communicability. A statement like, for example, “the object is in a superposition



of two different states” is therefore not suitable. In classical physics a sufficient arsenal of terms has<br />

been developed for this purpose.<br />

According to Bohr, characteristic of classical physics is in the first place that the interaction between<br />

object and measuring apparatus can be assumed to be negligibly small. This implies that upon<br />

describing a phenomenon the measuring apparatus can be left out of consideration. Instead of the<br />

statement: “Thermal interaction between a thermometer and a glass of water has, in certain circumstances,<br />

yielded as a result that the mercury column has been found to have a certain length”, we<br />

can also say: “The temperature of water has a certain value”. In this case we can, without objection,<br />

transfer the description of the phenomenon onto the object itself, and speak in terms of its properties.<br />

The essential difference between classical physics and quantum physics is, according to Bohr, that<br />

in quantum physics the interaction is quantized. The interaction between an object and a measuring<br />

apparatus can only consist of the exchange of one or more quanta, and cannot be made arbitrarily small.<br />

Bohr calls this starting point the quantum postulate (Bohr 1928, p. 580).<br />

QUANTUM POSTULATE:<br />

[The] essence [of the quantum theory] may be expressed in the so-called quantum postulate,<br />

which attributes to any atomic process an essential discontinuity, or rather individuality,<br />

completely foreign to the classical theories and symbolized by Planck’s quantum<br />

of action.<br />

In a phenomenon the object, the measuring apparatuses, and their interaction form an indivisible<br />

whole, and the interaction always amounts to at least one quantum h. This postulate unsettles the<br />

procedure to convert the description of a phenomenon into a description of the object itself.<br />

There is however a second element in Bohr’s point of view, which tempers this pessimistic conclusion.<br />

Scheibe called it the buffer postulate (1973, p. 24) because “the function of the postulate is<br />

to use classical physics as a buffer against the quantum-mechanical treatment of a phenomenon”:<br />

BUFFER POSTULATE:<br />

The description of the apparatus and of the results of observation, which forms part of<br />

the description of a quantum phenomenon, must be expressed in the concepts of classical<br />

physics (including those of “everyday life”), eliminating consistently the Planck quantum<br />

of action.<br />

The context of this requirement is again to be able to communicate our experimental findings to other<br />

people. The reasoning is as follows (Bohr 1947, p. 59),<br />

[. . . ] by an experiment we simply understand an event about which we are able in an<br />

unambiguous way to state the conditions necessary for the reproduction of the phenomena.<br />

In the account of these conditions, there can, therefore, be no question of departing<br />

from the Newtonian way of description and, in particular, it may be stressed that by the<br />

[. . . measuring apparatus . . . ], we simply understand some piece of machinery as regards<br />

the working of which classical mechanics can be entirely relied upon and where,<br />

consequently, all quantum effects have to be disregarded.



Bohr assumes that only the language and terms of classical physics are suitable for the description<br />

of observational results. He writes (Bohr 1931, p. 692)<br />

[. . . ] the unambiguous interpretation of any measurement must be essentially framed in<br />

terms of the classical physical theories, and we may say that in this sense the language<br />

of Newton and Maxwell will remain the language of physicists for all time.<br />

This is a particularly radical point of view, and we will return to its motivation later.<br />

The combination of both postulates now leads to the following reasoning. In all phenomena an<br />

interaction exists between the system and the measuring apparatus which has a minimal order of magnitude<br />

h > 0; after all, even the most minute measurements always rely on a quantum phenomenon. But<br />

in our description of the phenomenon we are forced to use classical concepts, in which this interaction, h,<br />

cannot occur. The consequence is that in our description the interaction is not analyzable.<br />

At the same time the classical character of the description makes it possible to speak again in<br />

terms of properties of the object itself. Therefore, instead of the statement “the interaction between a<br />

particle and a photographic plate resulted in a little black dot in a certain area of the plate”, we can<br />

also say “the particle has been found at a position in that area”, a statement that no longer refers to the<br />

measuring apparatus.<br />

But the major difference from the classical situation is that, by disregarding the interaction, we<br />

in a certain sense make a mistake which remains without consequences within this phenomenon, but<br />

prevents the description from being combined with the information obtained under different experimental<br />

conditions. If the object is coupled to another measuring apparatus there will be another interaction,<br />

which will again not be analyzable. Descriptions of the object that have been obtained under different<br />

measurement arrangements cannot be combined into one all-encompassing picture. We will illustrate<br />

this in a more concrete case.<br />

IV. 2. 1 COMPLEMENTARY PHENOMENA<br />

The most important examples of phenomena which give additional, but mutually exclusive information<br />

on an object are measurements of position and momentum. Bohr (1939, p. 22) writes<br />

[. . . ] any phenomenon in which we are concerned with tracing a displacement of some<br />

atomic object in space and time necessitates the establishment of several coincidences<br />

between the object and the rigidly connected bodies and movable devices which, in serving<br />

as scales and clocks respectively, define the space-time frame of reference to which<br />

the phenomenon in question is referred.<br />

In this case, therefore, the object has an interaction with an apparatus which is firmly bolted down<br />

or anchored, so that its position remains secured. But the consequence is that a possible exchange<br />

of momentum between object and apparatus cannot be analyzed. Such a transfer of momentum<br />

will be absorbed by the fixed parts of the apparatus without leaving behind any traces. Within this<br />

experimental setup we are therefore prohibited from saying anything about the momentum of the object.



The opposite applies to the measurement of momentum (Bohr in Schilpp 1949, p. 219):<br />

In the study of phenomena in the account of which we are dealing with detailed momentum<br />

balance, certain parts of the whole device must naturally be given the freedom to<br />

move independently of others.<br />

Bohr assumes that a measurement of momentum is made by registering the recoil after a collision,<br />

for example, with a test particle. In this way we can, using the conservation laws, retrieve the<br />

momentum of the object. However, the condition that the test particle can move freely means that we<br />

cannot guarantee that it preserves a definite position. It is therefore excluded from being used as part<br />

of a spatial coordinate system, and now we cannot say anything about the position of the object.<br />

In order to perform a position measurement we must therefore put the object in contact with a<br />

part of the measuring apparatus which has been bolted down firmly, while performing a momentum<br />

measurement we must observe the recoil of a freely movable part of the measuring apparatus, and<br />

apply the momentum conservation law. Position and momentum measurements therefore exclude<br />

each other, because a measuring apparatus cannot at the same time be bolted down and freely movable.<br />

In the description of the object we must choose between attributing a position or a momentum to it. As worded<br />

by Philipp Frank (1949, p. 163)<br />

Quantum mechanics speaks neither of particles the positions and velocities of which<br />

exist but cannot be accurately observed, nor of particles with indefinite positions and<br />

velocities. Rather, it speaks of experimental arrangements in the description of which the<br />

expressions “position of a particle” and “velocity of a particle” can never be employed<br />

simultaneously.<br />

Bohr calls this characteristic property of quantum mechanics, where two quantities exclude each<br />

other whereas both are necessary to describe all phenomena in which the object can participate, complementarity.<br />

Position and momentum are examples of complementary quantities. Similar considerations<br />

apply to time and energy, such that a general complementarity exists between on the one hand<br />

a space-time description of phenomena, and on the other hand a dynamical description, frequently<br />

referred to by Bohr as ‘causal’, in which the conservation laws for energy and momentum are applicable.<br />

◃ Remark<br />

The complementarity between quantities like position and momentum, or between descriptions using space-time<br />

coordination or dynamic laws, differs from, and replaces, the contrast which Bohr placed central<br />

in his earlier work, namely between ‘wave’ and ‘particle’, because a classical particle has both position<br />

and momentum, whereas a classical wave has neither. ▹<br />

The role of the uncertainty relations in Bohr’s views can now be described as considering them<br />

in the first place as symbolic expressions of the impossibility to define position and momentum at<br />

the same time when describing an object. In a phenomenon in which the position is determined<br />

sharply, δq = 0, the momentum must be undetermined, δp = ∞, and vice versa. But the relation<br />

δq δp ∼ h is, of course, more general. Bohr (1934, pp. 60,61) interprets this as follows:<br />

At the same time, however, the general character of this relation makes it possible to<br />

a certain extent to reconcile the conservation laws with the space-time co-ordination<br />

of observations, the idea of a coincidence of well-defined events in a space-time point<br />

being replaced by that of unsharply defined individuals within finite space-time regions.



The meaning Bohr attaches to the uncertainty relations can be summarized this way: the sharper<br />

we can, in a phenomenon, define the position of the object, the fuzzier the momentum must be defined,<br />

and vice versa. The quantities δq and δp in the relation δqδp ∼ h therefore represent the fuzziness in<br />

the definition. Bohr emphasizes the epistemological role of these quantities more strongly than an ontological<br />

role.<br />

IV. 2. 2 REMARKS AND PROBLEMS<br />

Bohr’s supposition that classical language is a definitive means of expression for physical observations<br />

which cannot be improved upon, is radical and at first sight even fairly unacceptable. Language<br />

develops and history teaches us that from time to time new concepts are necessary. Aristotle had,<br />

for example, no momentum concept, Newton knew nothing of energy, Coulomb had no theory of<br />

fields, etc. Is it not self-evident that quantum mechanics also calls for new concepts? Bohr,<br />

however (ibid., p. 16), says<br />

[. . . ] it would be a misconception to believe that the difficulties of the atomic theory may<br />

be evaded by eventually replacing the concepts of classical physics by new conceptual<br />

forms.<br />

Bohr emphasizes that with this point of view he does not reject the introduction of new entities<br />

(modern examples would be quarks, superstrings or black holes). The aspects of classical language which are the reason<br />

that it cannot be improved upon are, according to him, descriptions in terms of space and time and<br />

descriptions in terms of cause and effect. These are the only categories with which we can describe<br />

observational results.<br />

Another problem with the idea that the classical concepts cannot be improved upon is Bohr’s<br />

immediate conclusion that the quantum of action cannot occur in the description of a phenomenon,<br />

because a statement such as ‘$h = 6.6 \cdot 10^{-34}$ Js’ is also an unambiguous summary of experimental<br />

evidence, although not of one phenomenon. The idea that h cannot appear in the language of observations<br />

is a weak, and in fact untenable point in his argumentation. The prohibition of the use of h<br />

in the language of observations also brought Bohr to the conclusion that the spin of an electron, $\frac{1}{2}$,<br />

would be fundamentally unobservable. This conclusion has been proven to be incorrect.<br />

In some articles Bohr gives a more abstract explanation of the quantum postulate and emphasizes<br />

the ‘symbolic’ role of h. It does not so much represent the inevitable interaction, or measurement<br />

disturbance, between object and measuring apparatus, as the fundamental impossibility to make a<br />

sharp distinction between object and observation apparatus. It is, in any case, clear that Bohr does not<br />

regard the formalism of quantum mechanics, with its wave functions and operators, as an extension<br />

or improvement of classical language. He emphasizes that this formalism is purely symbolic and<br />

cannot be taken as a description, as the quantum state of a system is given without reference to the<br />

experimental setup.<br />

It should be noted that Bohr, in emphasizing the applicability of concepts, has more in mind than<br />

the ‘logical’ question of ‘definiteness’. For Bohr a term like ‘position of a particle’ is applicable if we<br />

can in fact control and secure this position, using firmly bolted apparatuses. Bohr’s use of the term<br />

‘determination’ refers both to a measurement and to a state preparation.



Speaking of ‘partially defined positions and momenta’, Bohr considers the uncertainty relation<br />

between position and momentum as offering the possibility of a compromise with the complementarity<br />

between position and momentum. Here we can think of a context of measurement in which the<br />

object interacts with a part of the apparatus which is linked with the rest of the apparatus by means<br />

of a spring with a finite spring constant, an intermediate form between ‘freely movable’ and ‘firmly<br />

bolted’. He did not, however, develop this compromise. This point of view does in fact not fit the<br />

usual mathematical derivation of the uncertainty relations for position and momentum. These make,<br />

for two given (sharp) quantities p and q, a statement about the spread in quantum states, not about the<br />

well-definedness of the quantities. Attempts have been made to establish this compromise mathematically<br />

by introducing ‘blurred quantities’; see, e.g., Busch, Grabowski and Lahti (1995).<br />
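For comparison, the relation obtained in the usual mathematical derivation can be written as follows. This is the standard textbook form for a normalized state ψ; the symbol Δ_ψ is introduced here for illustration and need not match the notation used elsewhere in these notes:<br />

```latex
\Delta_\psi q \,\Delta_\psi p \;\geq\; \tfrac{1}{2}\hbar ,
\qquad
(\Delta_\psi A)^2 = \langle \psi | A^2 | \psi \rangle - \langle \psi | A | \psi \rangle^2 .
```

Both spreads are properties of the state ψ, while q and p themselves remain sharply defined quantities, which is why this inequality by itself says nothing about ‘partially defined’ positions and momenta.<br />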

Of fundamental importance in Bohr’s point of view is that in a phenomenon an object and experimental<br />

setup are involved. The setup determines which frame of concepts applies to the object. In<br />

many cases the contrast between object and measuring apparatus coincides with that of the microscopic<br />

and macroscopic system, respectively. But that is not necessarily so. A macroscopic system<br />

can also be considered as an object while a microscopic system can serve as a measuring apparatus.<br />

We can consider, for example, a macroscopic measuring apparatus to be the object of another measurement.<br />

As soon as we do this the macroscopic system can, according to Bohr, no longer execute<br />

its role as a measuring device. It becomes an object itself, to which the quantum formalism must be<br />

applied. This functional contrast between object and measuring apparatus is therefore more essential<br />

than that between microscopic and macroscopic systems.<br />

For a good understanding of Bohr’s position, and Heisenberg’s for that matter, it is important to<br />

notice that measurements do not require the presence of consciousness. Decisive for applicability of<br />

classical concepts is the presence of a measurement context. Therefore, subjectivity does not play<br />

a role in any form; for the applicability of a concept such as ‘momentum’ it does not matter whether a conscious<br />

observer, a computer or another measuring apparatus carries out the momentum measurement.<br />

Also, from Bohr’s refusal to assign a realistic meaning to the quantum mechanical description, the<br />

conclusion cannot be drawn that he supports an anti-realistic or ‘instrumentalist’ view on physics,<br />

where instrumentalism is roughly the thesis that a scientific theory is only an instrument to carry out<br />

calculations of which we compare the outcomes with the indications of measuring apparatuses, in<br />

particular, that a theory is no ‘knowledge of the world’, that it does not provide a faithful picture of<br />

what reality is. An object such as an electron has, besides its quantum mechanical state, more than<br />

enough permanent properties, such as the super-selected quantities mass and charge which are not<br />

subject to complementarity, to conceive it as a real, existing object.<br />

IV. 2. 3 AGREEMENT AND DIFFERENCE BETWEEN HEISENBERG AND BOHR<br />

Both Heisenberg and Bohr emphasize that quantum mechanics is a complete theory which cannot<br />

be extended into a more detailed description with hidden variables. Bohr says (Schilpp 1949, p. 235)<br />

[. . . ] in quantum mechanics, we are not dealing with an arbitrary renunciation of a more<br />

detailed analysis of atomic phenomena, but with a recognition that such an analysis is in<br />

principle excluded.



Heisenberg (1927, p. 83) also expresses himself in this sense. He summarizes the import of the uncertainty relations<br />

as follows:<br />

Even in principle, we cannot know the present in all detail.<br />

He rejects the conception that behind the statistical description of quantum mechanics there still is a<br />

‘real world’ as a “fruitless and senseless speculation” (loc. cit.).<br />

According to both Bohr and Heisenberg, the quantum mechanical description cannot be applied<br />

to the whole world, because a classically described context of measurement is always necessary. The<br />

border between the classical and quantum mechanical description can be moved at will, but cannot<br />

be removed. Therefore, quantum mechanics is not a universal theory in the sense that there exists<br />

something like a ‘wave function of the universe’.<br />

Further agreement between Heisenberg and Bohr is found in the significance they attach to measurement.<br />

The difference is that according to Heisenberg something changes in the object during<br />

measurement; some properties are created, others disappear or become fuzzy. According to Bohr<br />

nothing has to happen in the object. The experimental setup only enables some description of the system<br />

which would not be allowed in another experimental setup. According to Bohr, the uncertainty<br />

relation is a symbolic, as opposed to a descriptive, expression of the impossibility of defining position and<br />

momentum in one phenomenon.<br />

Another difference is that Heisenberg tends, more than Bohr, to a realistic interpretation of the<br />

mathematical quantum formalism. In an interview at the end of his life, Heisenberg admitted that he<br />

never really understood the idea of complementarity.<br />

IV. 3 DEBATE BETWEEN EINSTEIN AND BOHR<br />

IV. 3. 1 INTRODUCTION<br />

Einstein, who contributed to the development of the quantum theory until 1922, never wanted<br />

to accept the Copenhagen interpretation. In his memoirs, Heisenberg mentions how he, on a visit to<br />

Berlin, explained his starting point that the theory may speak exclusively about observable quantities,<br />

and, to his surprise, Einstein would have none of it: “the theory decides what can be<br />

observed”. The main source for the course of the debate between Einstein and Bohr, which we will<br />

review here, is Bohr’s own report ‘Discussion with Einstein on Epistemological Problems in Atomic<br />

Physics’ (Bohr 1949).<br />

The very first time Einstein made his objections public was at the 5th Solvay conference in<br />

Brussels in 1927 where he suggested there were two conceivable conceptions concerning the quantum<br />

mechanical wave function.<br />

(i) The state ψ gives a description of the individual system which is as complete as possible.<br />

(ii) The state ψ does not characterize an individual system but an ensemble of identically prepared<br />

systems. Therefore, as a description of the individual system ψ is incomplete; ψ is a ‘statistical<br />

quantity’.



Conception (i) was defended by Heisenberg and Bohr. Einstein raised the following objection to this<br />

conception: when a particle travels through a narrow slit, the wave function will, by diffraction, spread<br />

over a large part of space. If this is a complete description of the particle, we have to conclude<br />

that it is potentially present everywhere in this area. But after detection of the particle on a photographic<br />

plate it is out of the question that it can still be found elsewhere. Therefore, the wave function<br />

must disappear suddenly there, which would imply a peculiar ‘action at a distance’. This objection<br />

does not apply to conception (ii), because there the detection simply corresponds to the choice of an<br />

element from the ensemble.<br />

In his answer, Bohr emphasized that the diffraction of the wave function by a slit in a firmly bolted<br />

screen finds its origin in the possibility for the particle to exchange momentum with the screen. But<br />

this exchange of momentum is not analyzable within this setup, i.e., without detaching the screen.<br />

The question whether a more detailed description of the individual case is possible found its<br />

temporary culmination in the analysis of the thought experiment with the double slit, which is depicted<br />

in figure IV. 2. When a monochromatic wave travels through a screen with two narrow slits, an interference<br />

pattern is visible on a photographic plate. This is typical of wave behavior, where the waves<br />

from both slits cooperate. An individual particle, however, can only travel through one slit, and the<br />

wave function does not tell us through which slit it travels.<br />

Figure IV. 2: The double slit interference experiment (Bohr 1949)<br />

Einstein now suggested that it was nevertheless possible to obtain information about<br />

which slit the particle travels through, for example by measuring the transfer of momentum to the first screen.<br />

If this screen receives a downward thrust, the particle has chosen the upper slit, and vice versa.<br />

Bohr answered that if we want to measure the momentum transfer to the screen with a precision<br />

sufficient to distinguish the recoils belonging to the paths through the two slits, the momentum<br />

of the screen itself must be known very precisely. If d represents the distance between the slits, and l<br />

represents the distance between the screens, the angle between the two paths is of the order<br />

$$\alpha \simeq \sin\alpha = \frac{d}{l}. \qquad \text{(IV. 5)}$$<br />

The recoil is of the order<br />

$$p_0 \sin\alpha \simeq \frac{h}{\lambda}\,\frac{d}{l}, \qquad \text{(IV. 6)}$$<br />



and therefore we have to know the momentum of the screen with an uncertainty<br />

$$\delta p \lesssim \frac{h}{\lambda}\,\frac{d}{l}. \qquad \text{(IV. 7)}$$<br />

Attaining such precision is, however, only possible if the screen is movable. But in that case it can<br />

no longer fulfil its function as a screen which determines an exact position for the slit. It<br />

is therefore no longer part of the original measuring context, as can be seen in figure IV. 3.<br />

Figure IV. 3: Contexts of measurement in which the interference of the particles is visible, and those<br />

in which the recoil of the screen is visible, exclude each other. (Bohr 1949)<br />

Actually, because now we will perform a measurement on the screen, the screen itself has to be<br />

considered an object. This means that quantum mechanics applies to it, and the screen is, therefore,<br />

also subject to an uncertainty relation<br />

$$\delta q \gtrsim \frac{\lambda\, l}{d}. \qquad \text{(IV. 8)}$$<br />

But this is an indefiniteness of the same order of magnitude as the distance between the interference<br />

bands. Bohr concludes that under these circumstances interference can no longer be seen.<br />
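Bohr's estimate can be checked numerically. The sketch below is illustrative only: it takes the de Broglie relation p₀ = h/λ, evaluates the recoil difference of (IV. 6) and the resulting position uncertainty δq ≳ h/δp of the screen, and compares this with the fringe spacing λl/d that would result if the photographic plate stood at a comparable distance l behind the double slit.<br />
The function name, the placement of the plate and the numerical values are assumptions made here.<br />

```python
H = 6.62607015e-34  # Planck constant, J s


def which_slit_tradeoff(wavelength, d, l):
    """Return (recoil difference, minimal screen position uncertainty, fringe spacing)."""
    p0 = H / wavelength                  # de Broglie momentum of the incoming particle
    recoil_difference = p0 * d / l       # order of magnitude of (IV. 6)
    dq_min = H / recoil_difference       # from dq * dp >~ h, with dp set to the recoil difference
    fringe_spacing = wavelength * l / d  # assumes the plate is a distance ~l behind the slits
    return recoil_difference, dq_min, fringe_spacing


# Assumed illustrative values: wavelength 50 pm, slits 1 micrometre apart, screens 1 m apart.
recoil, dq_min, spacing = which_slit_tradeoff(wavelength=5e-11, d=1e-6, l=1.0)

print(f"recoil difference          ~ {recoil:.2e} kg m/s")
print(f"screen position uncertainty >~ {dq_min:.2e} m")
print(f"fringe spacing             ~ {spacing:.2e} m")
```

Whatever values are chosen, the minimal position uncertainty of the screen comes out equal to the fringe spacing, because Planck's constant cancels between the two estimates.<br />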

With this reasoning he was able to transform Einstein’s objection into an affirmation of his idea<br />

of complementarity; as soon as we try to carry out a closer analysis of the phenomenon, we have to<br />

modify the experimental setup in such a way that the phenomenon changes beyond recognition. Nowadays<br />

a variant of this thought experiment can actually be carried out in a laboratory, as we will<br />

discuss in section IV. 4.<br />

IV. 3. 2 THE PHOTON BOX<br />

At the 6th Solvay Conference in 1930 in Brussels, Einstein gave another example, which is known<br />

under the name ‘the photon box’. It concerns an isolated box filled with radiation and equipped with<br />

a clock mechanism which opens a shutter during a very short interval. It is assumed that in advance<br />

the box is weighed meticulously.



Upon closure of the shutter we have, according to Einstein, a choice: either we weigh the box<br />

again and determine how much mass has vanished so that we can, using the relation $E = mc^2$,<br />

retrieve the energy of the escaped photon, or we open the box and read off the clock mechanism to<br />

determine when the shutter has been opened, which enables us to predict the time of exit of the photon<br />

and therefore its time of arrival at a remote detector. We can choose between both options long after<br />

the photon has left.<br />

Bohr’s answer is not entirely clear. It may be assumed that he did not understand Einstein’s<br />

intentions correctly.¹ He interprets Einstein’s objection as an attempt to refute the uncertainty relation<br />

between energy and time, and shows that both determinations cannot possibly be made at the same<br />

time.<br />

Bohr reasons as follows. Assume that the box hangs in equilibrium from a spring in a gravitational<br />

field. When in a time interval T a mass δm escapes, the box receives an upward impulse F ∆t of magnitude<br />

$$g\,\delta m\,T. \qquad \text{(IV. 9)}$$<br />

We can keep T finite by, at some moment, hanging a small weight on the box to compensate for the<br />

loss of mass. Suppose we want to determine the mass of the photon by measuring this momentum<br />

transfer. Then, again, the momentum of the box at the start of the experiment must be known precisely,<br />

$$\delta p \lesssim g\,\delta m\,T. \qquad \text{(IV. 10)}$$<br />

But now the same argument applies as used in the double slit experiment. This precise determination<br />

of momentum is only possible if the fixation of the position of the box is given up. The box itself<br />

must be considered a quantum mechanical object, and therefore the uncertainty relation $\delta p\,\delta q \gtrsim h$<br />

applies to it. The position of the box is unknown with an uncertainty of magnitude<br />

$$\delta q \gtrsim \frac{h}{g\,\delta m\,T}, \qquad \text{(IV. 11)}$$<br />

from which it follows that the gravitational potential $\varphi_g$ to which the clock is exposed is also uncertain,<br />

$$\delta\varphi_g \simeq g\,\delta q \gtrsim \frac{h}{\delta m\,T}. \qquad \text{(IV. 12)}$$<br />

But according to the red shift formula from the general theory of relativity (!) the rate of a clock is<br />

influenced by the gravitational potential,<br />

$$\frac{\Delta T}{T} = \frac{\delta\varphi_g}{c^2}, \qquad \text{(IV. 13)}$$<br />

therefore, the rate of the clock is also uncertain, and consequently the time of opening of the shutter is<br />

unknown. Under the circumstances in which we can determine the energy of the photon, we cannot<br />

retrieve its exit time exactly.<br />
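The chain of estimates can be completed in one line. As a sketch, assuming that the energy uncertainty of the photon is identified with $\delta E = \delta m\,c^2$ (using the relation $E = mc^2$ quoted above), substituting (IV. 12) into (IV. 13) gives:<br />

```latex
\Delta T = T\,\frac{\delta\varphi_g}{c^2}
\;\gtrsim\; T\,\frac{h}{\delta m\,T\,c^2}
= \frac{h}{\delta m\,c^2}
= \frac{h}{\delta E}
\quad\Longrightarrow\quad
\Delta T\,\delta E \gtrsim h .
```

The weighing time T drops out, and the result has the form of the uncertainty relation between energy and time that Bohr took Einstein to be attacking.<br />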

Although Bohr seems to rebut Einstein with his own theory, Bohr’s answer evokes, among other<br />

things, the question whether it is appropriate that the correctness of quantum mechanics relies on<br />

the correctness of the general theory of relativity, which is a classical theory, and is, strictly speaking,<br />

in conflict with quantum mechanics.<br />

¹ That Einstein indeed had the intention to point out the freedom of choice is apparent in a letter to Bohr from Paul<br />

Ehrenfest, who heard the argument from Einstein earlier.



EXERCISE 27. Try, using the uncertainty relation for time and energy, $\delta t\,\delta E \gtrsim h$, to refute<br />

Einstein’s argumentation without appealing to other physical theories.<br />

IV. 3. 3 EINSTEIN, PODOLSKY AND ROSEN<br />

The thought experiment of Einstein, Podolsky and Rosen, which we discussed in section I. 2,<br />

forms the highlight of the debate. Here Einstein’s objections emerge in their purest form.<br />

Given two systems which interacted with each other at some time, but are separated now, consider<br />
two non-commuting quantities A and A′ of one of the particles, and B and B′ of the other<br />
particle. Measurement of A allows us to make a prediction with certainty concerning B of the other particle;<br />
measurement of A′ allows us, analogously, to make a prediction with certainty concerning B′ of the other<br />
particle.<br />

Einstein admits that these two measurements cannot be carried out simultaneously. But we can<br />
choose which measurement we perform while the other particle is very far away. It is not reasonable,<br />
EPR argue, that this other particle will be influenced by this choice. This means that although<br />
only one of the two predictions concerning the other particle can be made with certainty, both predictions<br />
are, at the same time, true, corresponding to properties of the other particle, i.e., to ‘elements of<br />
physical reality’.<br />

IV. 3. 4<br />

HEISENBERG, BOHR AND EINSTEIN, PODOLSKY AND ROSEN<br />

According to Heisenberg, measurement has an essential influence. Some properties of the particle<br />
become sharp, others fuzzy. If this consequence of measurement were understood to be a physical<br />
interaction, it would suggest the following ‘natural’ requirement of locality (M.L.G. Redhead 1987, p. 77):<br />

An unsharp value for an observable cannot be changed into a sharp value by measurements<br />

performed at a distance.<br />

But the analysis of EPR shows that, the particles being far removed from each other, this requirement<br />

has not been met, making Heisenberg’s interpretation much less physically pictorial than it seemed to<br />

be initially. The natural requirement of locality in Bohr’s interpretation reads (loc. cit.)<br />

A previously undefined value for an observable cannot be defined by measurements performed<br />

‘at a distance’.<br />

This requirement has also not been fulfilled.<br />

Bohr’s answer to EPR, and his rejection of the incompleteness claim, amounts to the notion that<br />

the aforementioned requirement of locality can be violated without implying the existence of superluminal<br />

physical effects. The ‘defining’ functioning of measuring apparatuses is not a process that<br />

propagates in space and time and by means of some interaction disturbs particles that are not measured,<br />

or creates values for properties in those particles. It concerns an epistemological role of the<br />

measuring apparatuses. The measuring apparatuses measuring one of a pair of correlated particles<br />

define which classical terms apply to both particles.



If the position of one of the particles is measured, we are dealing with a phenomenon in which<br />

the term position is applicable. Thus, on the basis of the correlation between these particles the term<br />

‘position’ is also applicable to the other particle. If the position of one of the particles is measured,<br />

a ‘position perspective’ is opened, so to speak, to the world. Likewise, measurement of momentum<br />

on one of the particles makes the other particle accessible to a description with the term ‘momentum’.<br />

Even though there is no physical intervention on this particle, it is still not permitted to speak about<br />

the particles having these properties outside the context of a phenomenon. Therefore, Bohr rejects<br />

Einstein’s reasoning that the other particle, not being disturbed by the measurement, consequently<br />

also possesses the properties ‘position’ and ‘momentum’ independent of measurement.<br />

In fact, this same reasoning can be applied to the double slit experiment, as Bohr showed<br />
in his answer to Einstein. In this experiment we also have a choice: either to measure the<br />
momentum of the screen and in this way determine which path the particle has taken, thereby losing the<br />
interference pattern, or to measure its position, thereby retrieving the interference pattern. But<br />
Bohr writes<br />

As repeatedly stressed, the principal point is here that such measurements demand mutually<br />

exclusive experimental arrangements.<br />

IV. 4<br />

NEUTRON INTERFEROMETRY<br />

Nowadays, a variant of the thought experiment with the double slit can be carried out in<br />
the laboratory using a neutron interferometer. A neutron interferometer consists of a massive perfect<br />
silicon crystal, usually with dimensions of approximately 10 × 10 × 50 cm³. After cutting large<br />
notches in the crystal, a basis with upstanding teeth remains, see figure IV. 4.<br />

Figure IV. 4: Several perfect crystal neutron interferometers (Rauch and Werner 2000 )



In an interferometer with three upstanding teeth, a monochromatic beam of neutrons with<br />
a de Broglie wavelength of approximately 1 Å hits the first tooth of the crystal. The crystal<br />

lattice acts like a grid and lets the beam pass in very sharply defined directions. Under suitable<br />

conditions there are exactly two emanating beams, one transmitted (T) and one reflected (R), as<br />

shown in figure IV. 5 a.<br />

At the second tooth this process is repeated, and both beams are again split up. Two of them are<br />

now outside the interferometer where they are screened, no longer participating. The remaining two<br />

beams are bent towards each other and meet at the third tooth. Here, both beams are split up again,<br />

and now the transmitted beam of one path is superimposed on the reflected beam of the<br />

other path. Neutron detectors are placed in both emanating beams.<br />

Figure IV. 5: a) A sketch of the setup, showing paths 1 and 2, the transmitted (T) and reflected (R) beams, and the detectors A and B; b) the experimental results (Rauch and Werner 2000). The interference pattern in the neutron interferometer is acquired by measuring the<br />
intensity in the detectors at a variable optical path length difference.<br />

If the incoming beam comes from below, and the beams are not manipulated, all neutrons turn out<br />
to end up in the upper beam at detector A, undergoing constructive interference, while the neutrons<br />
in the lower beam extinguish each other. For this phenomenon it is essential that the interferometer<br />
consists of only one crystal, for in that case the waves remain coherent even though, along the way,<br />
the beams have been separated by ‘macroscopic distances’, approximately 5 cm or ≃ 10⁹ λ. When a<br />
neutron has arrived in a detector it can have traveled along either of the two paths.<br />

Upon introducing a phase difference between the two paths by sliding a small piece of aluminium<br />

of variable thickness in one of the paths, the intensity shifts from the upper to the lower detector. This<br />

intensity is a periodic function of the thickness of the piece of aluminium, see figure IV. 5 b. This is<br />

the interference pattern.<br />

Now the question is whether we can, in some way, uncover along which path the particle has traveled.<br />
Following Bohr’s line of thought this should be possible by sawing off one of the teeth and measuring<br />
the recoil it receives from the neutron. Such an experiment cannot, however, be carried out with the<br />
required experimental exactitude.<br />

Another option is to make use of the fact that the neutron is a spin 1/2 particle and therefore has<br />

an internal degree of freedom. We can carry out such an experiment with a polarized beam, where



all neutrons have, at entry in the interferometer, spin up in the z-direction. We place the complete<br />
setup in a homogeneous magnetic field which ensures that spin up and spin down differ in<br />
energy by ℏω₀. In one of the paths we place a ‘spin flipper’, a small coil through which an alternating<br />
current runs having exactly the resonance frequency ω₀. With a suitable choice of the length of the<br />
coil the spin of every neutron which travels through it will be flipped over. Subsequently, we place<br />
spin analyzers in front of the detectors, so that we can observe not only in which emanating beam the<br />
neutron is located but also its spin in the z-direction.<br />

In this setup we can therefore uncover exactly along which path the particle has traveled; spin up<br />

means the path without the spin flipper has been chosen, spin down means the neutron traveled along<br />

the path with the spin flipper. But in this setup no more interference is seen! The intensity is equal in<br />

both detectors and independent of the phase difference.<br />

We can describe this as follows. The wavepath function $|\phi_0\rangle \in L^2(\mathbb{R}^2)$ of an emanating neutron<br />
consists of four terms,<br />

$$|\phi_0\rangle = \tfrac{1}{2}\bigl(|\phi_{1A}\rangle + |\phi_{1B}\rangle + e^{i\chi}|\phi_{2A}\rangle + e^{i\chi}|\phi_{2B}\rangle\bigr). \qquad \text{(IV. 14)}$$<br />

Here ϕ iA and ϕ iB represent the wave functions ending up in the detectors A and B, respectively, 1<br />

and 2 refer to the two possible paths through the interferometer, as can be seen in figure IV. 5 a. The<br />

factor e iχ corresponds to the phase shift by the aluminium. If χ = 0, there is maximum constructive<br />

interference in A and total destructive interference in B, from which it follows that<br />

|ϕ 1A ⟩ = |ϕ 2A ⟩ and |ϕ 1B ⟩ = − |ϕ 2B ⟩. (IV. 15)<br />

The intensity in detector A is given by the expectation value of a projection P A , where<br />

P A |ϕ iA ⟩ = |ϕ iA ⟩ and P A |ϕ iB ⟩ = 0, analogously for P B . Therefore, we find for the intensity I A<br />

of the neutron beam that encounters detector A, quantum mechanically expressed as the probability<br />

to find a neutron in detector A,<br />

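$$I_A = \langle\phi_0|\,P_A\,|\phi_0\rangle = \tfrac{1}{4}\bigl(\langle\phi_{1A}| + \langle\phi_{2A}|\,e^{-i\chi}\bigr)\bigl(|\phi_{1A}\rangle + e^{i\chi}|\phi_{2A}\rangle\bigr) = \tfrac{1}{2}(1 + \cos\chi), \qquad \text{(IV. 16)}$$<br />

and likewise for $I_B$,<br />

$$I_B = \langle\phi_0|\,P_B\,|\phi_0\rangle = \tfrac{1}{4}\bigl(\langle\phi_{1B}| + \langle\phi_{2B}|\,e^{-i\chi}\bigr)\bigl(|\phi_{1B}\rangle + e^{i\chi}|\phi_{2B}\rangle\bigr) = \tfrac{1}{2}(1 - \cos\chi). \qquad \text{(IV. 17)}$$<br />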
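The two intensities can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that the detector states are unit vectors, so that by (IV. 15) the state (IV. 14) has amplitude ½(1 + e^{iχ}) along detector A and ½(1 − e^{iχ}) along detector B. The function name `intensities` is hypothetical.

```python
import cmath
import math

def intensities(chi):
    """Return (I_A, I_B) for phase shift chi, using (IV. 14) and (IV. 15)."""
    phase = cmath.exp(1j * chi)
    # Amplitudes of |phi_0> along the detector directions A and B:
    amp_A = 0.5 * (1 + phase)   # |phi_1A> + e^{i chi}|phi_2A>, with |phi_2A> = |phi_1A>
    amp_B = 0.5 * (1 - phase)   # |phi_1B> + e^{i chi}|phi_2B>, with |phi_2B> = -|phi_1B>
    return abs(amp_A) ** 2, abs(amp_B) ** 2

for chi in (0.0, math.pi / 2, math.pi):
    I_A, I_B = intensities(chi)
    # Compare with the closed forms (1 + cos chi)/2 and (1 - cos chi)/2.
    print(chi, I_A, 0.5 * (1 + math.cos(chi)), I_B, 0.5 * (1 - math.cos(chi)))
```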

In this experiment the neutrons are polarized, therefore we can add the spin state to the wavepath<br />

function and thus get a Pauli spinor,<br />

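$$|\phi_{i,\mathrm{tot}}\rangle = |\phi_0\rangle \otimes |z\uparrow\rangle = \phi(\vec q)\begin{pmatrix}1\\0\end{pmatrix} = \begin{pmatrix}\phi(\vec q)\\0\end{pmatrix} \in L^2(\mathbb{R}^3)\otimes\mathbb{C}^2. \qquad \text{(IV. 18)}$$<br />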

The functioning of the spin flipper, which we assume to be completely ideal, can now be described as<br />

follows. The component of the state traveling along path 1 does not meet a spin flipper, which means



that it remains unaltered, and we have, leaving out the cartwheels ⊗,<br />

|ϕ 1A ⟩ |z ↑⟩ → |ϕ 1A ⟩ |z ↑⟩ and |ϕ 1B ⟩ |z ↑⟩ → |ϕ 1B ⟩ |z ↑⟩, (IV. 19)<br />

whereas for the components traveling along path 2 the spin direction reverses,<br />

|ϕ 2A ⟩ |z ↑⟩ → |ϕ 2A ⟩ |z ↓⟩ and |ϕ 2B ⟩ |z ↑⟩ → |ϕ 2B ⟩ |z ↓⟩. (IV. 20)<br />

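Therefore, the total final state is<br />

$$|\phi_{f,\mathrm{tot}}\rangle = \tfrac{1}{2}\bigl(|\phi_{1A}\rangle|z\uparrow\rangle + |\phi_{1B}\rangle|z\uparrow\rangle + e^{i\chi}|\phi_{2A}\rangle|z\downarrow\rangle + e^{i\chi}|\phi_{2B}\rangle|z\downarrow\rangle\bigr), \qquad \text{(IV. 21)}$$<br />

which means that for the intensity we have<br />

$$I_A = \langle\phi_{f,\mathrm{tot}}|\,P_A\otimes\mathbb{1}\,|\phi_{f,\mathrm{tot}}\rangle = \tfrac{1}{2}\,\langle\phi_{f,\mathrm{tot}}|\bigl(|\phi_{1A}\rangle|z\uparrow\rangle + e^{i\chi}|\phi_{2A}\rangle|z\downarrow\rangle\bigr) = \tfrac{1}{2}, \qquad \text{(IV. 22)}$$<br />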

and likewise for I B . We see that, because of the orthogonality of the spin states |z ↑⟩ and |z ↓⟩, the<br />

interference term disappears.<br />

With the neutron interferometer we can also illustrate the fact that there is always freedom of<br />

choice because we can, instead of analyzers for spin in the z - direction, place analyzers for spin in<br />

the x - direction.<br />

The eigenvectors for spin in the x - direction are superpositions of those in the z - direction, see<br />

section III. 6, equations (III. 145) and (III. 146),<br />

$$|x\uparrow\rangle = \tfrac{1}{2}\sqrt{2}\,\bigl(|z\uparrow\rangle + |z\downarrow\rangle\bigr) \quad\text{and}\quad |x\downarrow\rangle = \tfrac{1}{2}\sqrt{2}\,\bigl(|z\downarrow\rangle - |z\uparrow\rangle\bigr). \qquad \text{(IV. 23)}$$<br />

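We can calculate the probability to find, e.g., a neutron with spin in the negative x-direction in detector<br />
A, as the expectation value of the projector $P_A \otimes |x\downarrow\rangle\langle x\downarrow|$ in the state $|\phi_{f,\mathrm{tot}}\rangle$, (IV. 21). Writing $\Pi := P_A \otimes |x\downarrow\rangle\langle x\downarrow|$,<br />

$$\begin{aligned}
\langle\phi_{f,\mathrm{tot}}|\,\Pi\,|\phi_{f,\mathrm{tot}}\rangle
= \tfrac{1}{4}\Bigl( &\langle\phi_{1A}|\langle z\uparrow|\,\Pi\,|\phi_{1A}\rangle|z\uparrow\rangle + e^{i\chi}\,\langle\phi_{1A}|\langle z\uparrow|\,\Pi\,|\phi_{2A}\rangle|z\downarrow\rangle \\
&+ e^{-i\chi}\,\langle\phi_{2A}|\langle z\downarrow|\,\Pi\,|\phi_{1A}\rangle|z\uparrow\rangle + \langle\phi_{2A}|\langle z\downarrow|\,\Pi\,|\phi_{2A}\rangle|z\downarrow\rangle \Bigr)
= \tfrac{1}{4}(1 - \cos\chi),
\end{aligned} \qquad \text{(IV. 24)}$$<br />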

and we see interference again.<br />
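Both results can be reproduced numerically. The following sketch is an illustration under stated assumptions: the path states are taken as unit vectors obeying (IV. 15), so that the part of (IV. 21) reaching detector A carries the spinor ½(1, e^{iχ}) in the (z↑, z↓) basis. The names `spinor_at_A` and `prob_at_A` are hypothetical.

```python
import cmath
import math

SQRT_HALF = 1 / math.sqrt(2)
Z_UP, Z_DOWN = (1, 0), (0, 1)
X_UP = (SQRT_HALF, SQRT_HALF)       # (IV. 23)
X_DOWN = (-SQRT_HALF, SQRT_HALF)    # (IV. 23)

def spinor_at_A(chi):
    """Spin part of the component of |phi_f,tot> (IV. 21) that reaches detector A."""
    # Path 1 keeps z-up; path 2 is flipped to z-down and carries the phase e^{i chi}.
    return (0.5, 0.5 * cmath.exp(1j * chi))

def prob_at_A(chi, analyzer):
    """Probability that detector A fires behind an analyzer for the given spin state."""
    up, down = spinor_at_A(chi)
    overlap = analyzer[0].conjugate() * up + analyzer[1].conjugate() * down
    return abs(overlap) ** 2

for chi in (0.0, math.pi / 2, math.pi):
    total_z = prob_at_A(chi, Z_UP) + prob_at_A(chi, Z_DOWN)   # I_A of (IV. 22)
    x_down = prob_at_A(chi, X_DOWN)                           # (IV. 24)
    print(chi, total_z, x_down, 0.25 * (1 - math.cos(chi)))
```

With z-analyzers the two contributions add without a cross term, whereas the x-analyzer projects both spin components onto the same state, which is what restores the dependence on χ.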

EXERCISE 28. Verify the calculations (IV. 22) and (IV. 24).<br />

In this case we also can choose whether we measure spin in the x - direction or in the z - direction<br />

long after the neutron has left the interferometer, which means that the neutron seems to make the<br />

choice whether to take one of the paths through the interferometer, or to show interference between



both paths, after it has left the interferometer. J.A. Wheeler (1978) called such experiments<br />
delayed-choice experiments. Outcomes of measurements in the future seem to determine what has happened<br />

in the past!<br />

Actual confirmation of this freedom of choice was not obtained until 2007, when a group in<br />
Cachan, France, succeeded in carrying out such an experiment using linearly polarized single photons,<br />
a 48 m interferometer and two beamsplitters. In their article (Jacques 2007) they conclude that<br />

Our realization of Wheeler’s delayed-choice gedanken experiment demonstrates that<br />
the behavior of the photon in the interferometer depends on the choice of the observable<br />
that is measured, even when that choice is made at a position and a time such that it is<br />
separated from the entrance of the photon into the interferometer by a space-like interval.<br />

EXERCISE 29. Give, concisely, Bohr’s view on such experiments.<br />

IV. 5<br />

THE UNCERTAINTY RELATIONS<br />

IV. 5. 1<br />

INTRODUCTION<br />

Heisenberg’s original reasonings concerning the uncertainty principle resulted in ‘approximate<br />

inequalities’ for position q and momentum p, and for energy E and time t, of the form<br />

δq δp ∼ h and δE δt ∼ h. (IV. 25)<br />

In this section we will focus on the mathematical meaning of δq, δp, δE and δt and their interpretation.<br />

In his first article, Heisenberg (1927) gives the Gaussian wave packet as the only quantitative<br />
example. Its Fourier transform is also Gaussian and the widths of these packets are inversely proportional<br />
to each other, a general result of Fourier analysis. A suitable definition of these widths<br />
yields q₁ p₁ = h, where q₁ and p₁ represent the widths in question. Still in the same year E.H. Kennard<br />
derived the following general inequality,<br />

$$\Delta_\psi Q\,\Delta_\psi P \geq \tfrac{1}{2}\hbar, \qquad \text{(IV. 26)}$$<br />

where ∆ ψ Q and ∆ ψ P are standard deviations of Q and P in ψ ∈ L 2 (R). In his Chicago lectures,<br />

Heisenberg (1930) considers the Kennard inequality (IV. 26) as the mathematical expression of the<br />

uncertainty principle. We will criticize this still widespread conception shortly, and give a derivation<br />

of the ‘standard uncertainty inequalities’, which are a generalization of the Kennard inequality.<br />

◃ Remark<br />

In his discussions of the uncertainty principle, Bohr exclusively makes use of relations of the<br />

type (IV. 25). ▹



IV. 5. 2<br />

THE STANDARD UNCERTAINTY RELATIONS<br />

If ψ ∈ L 2 (R) is the normalized wave function of a physical system in the q - language,<br />

with ∥ψ∥ = 1, the wave function ˜ψ(p) in the p - language is its Fourier transform<br />

$$\tilde\psi(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{\mathbb{R}} e^{-ipq/\hbar}\,\psi(q)\,dq, \qquad \text{(IV. 27)}$$<br />

and its inverse Fourier transform is<br />

$$\psi(q) = \frac{1}{\sqrt{2\pi\hbar}}\int_{\mathbb{R}} e^{ipq/\hbar}\,\tilde\psi(p)\,dp. \qquad \text{(IV. 28)}$$<br />

The norm is invariant under Fourier transformations, therefore ∥ ˜ψ∥ = 1.<br />

The standard deviation of position in a state |ψ⟩, ∆ ψ Q, is defined as<br />

$$(\Delta_\psi Q)^2 = \langle Q^2\rangle_\psi - \langle Q\rangle_\psi^2 = \int_{\mathbb{R}} q^2\,|\psi(q)|^2\,dq - \Bigl(\int_{\mathbb{R}} q\,|\psi(q)|^2\,dq\Bigr)^2. \qquad \text{(IV. 29)}$$<br />

Likewise, for momentum, ∆ ψ P , we have<br />

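$$\begin{aligned}
(\Delta_\psi P)^2 = \langle P^2\rangle_\psi - \langle P\rangle_\psi^2
&= -\hbar^2\int_{\mathbb{R}} \psi^*(q)\,\frac{d^2\psi(q)}{dq^2}\,dq - \Bigl(-i\hbar\int_{\mathbb{R}} \psi^*(q)\,\frac{d\psi(q)}{dq}\,dq\Bigr)^2 \\
&= \int_{\mathbb{R}} p^2\,|\tilde\psi(p)|^2\,dp - \Bigl(\int_{\mathbb{R}} p\,|\tilde\psi(p)|^2\,dp\Bigr)^2.
\end{aligned} \qquad \text{(IV. 30)}$$<br />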

Without loss of generality we can assume ⟨P⟩ and ⟨Q⟩ to equal 0, so that<br />

$$(\Delta_\psi P)^2 = -\hbar^2\int_{\mathbb{R}} \psi^*(q)\,\frac{d^2\psi(q)}{dq^2}\,dq = \int_{\mathbb{R}} p^2\,|\tilde\psi(p)|^2\,dp. \qquad \text{(IV. 31)}$$<br />

If the wave function ψ(q) is a Gaussian wave packet, the product $\Delta_\psi Q\,\Delta_\psi P$ takes on the minimum value<br />
of ½ℏ. An example is the ground state of the one-dimensional harmonic oscillator having mass m,<br />

$$\phi_0(q) = \Bigl(\frac{m\,\omega_0}{\pi\hbar}\Bigr)^{1/4} e^{-m\,\omega_0\,q^2/2\hbar}, \qquad \text{(IV. 32)}$$<br />

with energy $E_0 = \tfrac{1}{2}\hbar\omega_0$.<br />
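The minimum value ½ℏ for the Gaussian (IV. 32) can be checked numerically. The sketch below is a rough illustration, not part of the derivation: it uses a plain Riemann sum, units with ℏ = 1 by default, and the identity ⟨P²⟩ = ℏ² ∫ |ψ′(q)|² dq, which follows from (IV. 31) by partial integration. The function name and the grid parameters are arbitrary choices.

```python
import math

def gaussian_spreads(m=1.0, omega=1.0, hbar=1.0, half_width=12.0, n=20001):
    """Riemann-sum estimate of (Delta Q, Delta P) for the Gaussian (IV. 32)."""
    kappa = m * omega / hbar
    norm = (kappa / math.pi) ** 0.25
    dq = 2 * half_width / (n - 1)
    var_q = var_p = 0.0
    for i in range(n):
        q = -half_width + i * dq
        psi = norm * math.exp(-kappa * q * q / 2)
        dpsi = -kappa * q * psi                  # analytic derivative of psi
        var_q += q * q * psi * psi * dq          # <Q^2>, since <Q> = 0
        var_p += hbar ** 2 * dpsi * dpsi * dq    # <P^2> = hbar^2 * integral of |psi'|^2
    return math.sqrt(var_q), math.sqrt(var_p)

delta_q, delta_p = gaussian_spreads()
print(delta_q * delta_p)   # should be close to hbar / 2
```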

Before interpreting the Kennard inequality (IV. 26), we give a still more general inequality, derived<br />

by Schrödinger (1930). Consider two arbitrary self - adjoint operators A and B acting on a Hilbert<br />

space H. Define, for a pure state |ψ⟩ ∈ H, the following operators:<br />

$$A_\psi := A - \langle A\rangle_\psi\,\mathbb{1} \quad\text{and}\quad B_\psi := B - \langle B\rangle_\psi\,\mathbb{1}. \qquad \text{(IV. 33)}$$<br />

The expectation values of these operators are, in the state |ψ⟩, equal to 0,<br />

⟨A ψ ⟩ ψ = ⟨B ψ ⟩ ψ = 0. (IV. 34)



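The Cauchy-Schwarz inequality (II. 12), p. 19, for the vectors $A_\psi|\psi\rangle$ and $B_\psi|\psi\rangle$ reads<br />

$$\langle A_\psi\psi\,|\,A_\psi\psi\rangle\,\langle B_\psi\psi\,|\,B_\psi\psi\rangle \geq \bigl|\langle A_\psi\psi\,|\,B_\psi\psi\rangle\bigr|^2. \qquad \text{(IV. 35)}$$<br />

Because $A_\psi$ and $B_\psi$ are self-adjoint, we can also write this inequality as follows,<br />

$$\langle A_\psi^2\rangle_\psi\,\langle B_\psi^2\rangle_\psi \geq \bigl|\langle A_\psi B_\psi\rangle_\psi\bigr|^2. \qquad \text{(IV. 36)}$$<br />

Using both the commutator $[\cdot\,,\cdot]_-$ and the anti-commutator $[\cdot\,,\cdot]_+$, we find for the right-hand side<br />
of (IV. 36)<br />

$$\begin{aligned}
\bigl|\langle A_\psi B_\psi\rangle_\psi\bigr|^2
&= \bigl|\tfrac{1}{2}\langle[A_\psi, B_\psi]_-\rangle_\psi + \tfrac{1}{2}\langle[A_\psi, B_\psi]_+\rangle_\psi\bigr|^2 \\
&= \tfrac{1}{4}\bigl|\langle[A_\psi, B_\psi]_-\rangle_\psi\bigr|^2 + \tfrac{1}{4}\langle[A_\psi, B_\psi]_+\rangle_\psi^2,
\end{aligned} \qquad \text{(IV. 37)}$$<br />

where the cross-term disappears because of<br />

$$\langle[A_\psi, B_\psi]_-\rangle_\psi^* = -\langle[A_\psi, B_\psi]_-\rangle_\psi, \qquad \langle[A_\psi, B_\psi]_+\rangle_\psi^* = +\langle[A_\psi, B_\psi]_+\rangle_\psi. \qquad \text{(IV. 38)}$$<br />

Furthermore,<br />

$$[A_\psi, B_\psi]_- = [A, B]_-, \qquad \text{(IV. 39)}$$<br />

and we obtain the inequality<br />

$$\langle A_\psi^2\rangle_\psi\,\langle B_\psi^2\rangle_\psi \geq \tfrac{1}{4}\bigl|\langle[A, B]_-\rangle_\psi\bigr|^2 + \tfrac{1}{4}\langle[A_\psi, B_\psi]_+\rangle_\psi^2. \qquad \text{(IV. 40)}$$<br />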
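As a sanity check, the terms of (IV. 40) can be evaluated for a concrete case. The sketch below is an illustration only: it takes the Pauli matrices σ_x and σ_y as A and B and an arbitrarily chosen spin-1/2 state, and the helper names are hypothetical. For a pure state in a two-dimensional Hilbert space the inequality is expected to hold with equality, since $A_\psi|\psi\rangle$ and $B_\psi|\psi\rangle$ are both orthogonal to $|\psi\rangle$ and hence parallel.

```python
import cmath
import math

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))

def expect(M, psi):
    return inner(psi, matvec(M, psi))

def schroedinger_terms(A, B, psi):
    """Return (lhs, commutator term, anti-commutator term) of (IV. 40)."""
    a, b = expect(A, psi).real, expect(B, psi).real
    A_psi, B_psi = matvec(A, psi), matvec(B, psi)
    var_A = inner(A_psi, A_psi).real - a * a
    var_B = inner(B_psi, B_psi).real - b * b
    ab = inner(A_psi, B_psi)                       # <AB>, using self-adjointness of A
    comm = ab - ab.conjugate()                     # <[A, B]_->, purely imaginary
    anti = (ab + ab.conjugate()).real - 2 * a * b  # <[A_psi, B_psi]_+>
    return var_A * var_B, abs(comm) ** 2 / 4, anti ** 2 / 4

psi = [math.cos(0.4), cmath.exp(0.9j) * math.sin(0.4)]   # arbitrary normalized state
lhs, comm_term, anti_term = schroedinger_terms(SIGMA_X, SIGMA_Y, psi)
print(lhs, comm_term + anti_term, comm_term)
```

Dropping `anti_term` gives the weaker bound that uses only the commutator.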

In view of the inequalities (IV. 26) and (IV. 40), we make a few remarks.<br />

(i) Leaving out the last term on the right - hand side of inequality (IV. 40) gives the better known<br />

but weaker inequality, derived by H.P. Robertson (1929),<br />

$$\langle A_\psi^2\rangle_\psi\,\langle B_\psi^2\rangle_\psi \geq \tfrac{1}{4}\bigl|\langle[A, B]_-\rangle_\psi\bigr|^2. \qquad \text{(IV. 41)}$$<br />

(ii) Notice that ⟨A 2 ψ ⟩ ψ is equal to the square of the standard deviation of the quantity A in the<br />

state |ψ⟩,<br />

⟨A 2 ψ ⟩ ψ = ⟨(A − ⟨A⟩ ψ ) 2 ⟩ = (∆ ψ A) 2 . (IV. 42)<br />

(iii) For the special case A = Q and B = P , the Robertson inequality (IV. 41) transforms into the<br />

Kennard inequality (IV. 26), and the expressions (IV. 29) and (IV. 31) correspond to $\langle Q_\psi^2\rangle_\psi$ in<br />
the q-language and $\langle P_\psi^2\rangle_\psi$ in the p-language.<br />

(iv) Notice that in deriving these uncertainty relations the interpretation of the uncertainties plays<br />

no role.



(v) An objection to the Robertson inequality (IV. 41) and the Schrödinger inequality (IV. 40) is that<br />
the right-hand side depends on the state; it is therefore not an absolute lower bound for all states.<br />

If |ψ⟩ is an eigenstate of A, the right - hand side of the Robertson inequality (IV. 41) is 0 and<br />

does not provide any restriction on ∆B. Therefore, even if A and B are not both at the same<br />

time sharp in any state, i.e., they do not have simultaneous eigenstates, this does not follow<br />

from the inequality (IV. 41).<br />

Only if the right-hand side of inequality (IV. 41) is unequal to zero for all states does the Robertson<br />
inequality express the uncertainty principle. This is the case if the commutator is a multiple<br />
of unity, as in the case of P and Q, where [P, Q] = −iℏ𝟙, see p. 78, (IV. 1). It can, however, be<br />
proved that this canonical commutation relation can only apply to unbounded operators<br />
having no eigenstates in the, inevitably infinite-dimensional, Hilbert space in which they act.<br />

(vi) Already in 1929 E.U. Condon pointed out the following facts (Jammer 1974, p. 71). In certain<br />

states, non - commuting operators can both be sharp. Take, for example, the ground state of the<br />

H - atom, or any stationary state with total angular momentum l = 0. This is also an eigenstate<br />

of L x , L y and L z with eigenvalue 0. Therefore, ∆L x ∆L y = 0, and likewise for L x and L z ,<br />

and for L y and L z , although these operators do not mutually commute. Therefore, the fact that<br />
operators do not commute does not guarantee an uncertainty relation. Furthermore, sometimes<br />
both standard deviations are nonzero although the Robertson inequality (IV. 41) gives no bound. Take again a stationary state of the H-atom,<br />
with l = 1 and m = 0. In that state ⟨[L x , L y ]⟩ = 0, whereas ∆L x ≠ 0 and ∆L y ≠ 0.<br />
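The second of Condon's examples can be made concrete with the standard l = 1 matrices. The sketch below assumes units with ℏ = 1 and the basis ordering m = +1, 0, −1; the helper names are hypothetical. It suggests that in the state m = 0 the commutator expectation vanishes while neither standard deviation does.

```python
import math

S = 1 / math.sqrt(2)
# Angular momentum matrices for l = 1 in the basis (m = +1, 0, -1), with hbar = 1.
L_X = [[0, S, 0], [S, 0, S], [0, S, 0]]
L_Y = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))

def std_dev(M, psi):
    """Standard deviation of the self-adjoint matrix M in the state psi."""
    M_psi = matvec(M, psi)
    mean = inner(psi, M_psi).real
    return math.sqrt(max(inner(M_psi, M_psi).real - mean ** 2, 0.0))

def commutator_expectation(A, B, psi):
    """<[A, B]> in the state psi, for self-adjoint A and B."""
    ab = inner(matvec(A, psi), matvec(B, psi))   # <AB>
    return ab - ab.conjugate()                   # <AB> - <BA>

psi_m0 = [0, 1, 0]   # the state with l = 1, m = 0
print(abs(commutator_expectation(L_X, L_Y, psi_m0)))   # right-hand side of (IV. 41) vanishes
print(std_dev(L_X, psi_m0), std_dev(L_Y, psi_m0))      # both spreads are nonzero
```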

In conclusion, there are fundamental objections against accepting the Schrödinger inequality, and<br />

by implication against the weaker inequalities which follow from it, to be the mathematical expression<br />

of Heisenberg’s uncertainty principle.<br />

And this is not all.<br />

IV. 5. 3<br />

SINGLE SLIT EXPERIMENT<br />

Relations (IV. 26) and (IV. 41) are considered to be the mathematical expression of the uncertainty<br />
principle in most textbooks on quantum mechanics. In addition to the previous criticism, we<br />
will show that this view is, remarkably enough, also inconsistent with the experiments used as illustrations<br />
of this principle (Uffink and Hilgevoord 1985, 1988 and Hilgevoord and Uffink 1988, 1990).<br />

Consider the deflection of light, or of electrons, by a single slit in an absorbing screen, an example<br />

Heisenberg also gives. Take for the wave function representing the particles passing through the<br />

screen with the slit a simple square wave function, see figure IV. 6,<br />

$$\psi_{ss}(q) = \begin{cases} \dfrac{1}{\sqrt{2a}} & \text{if } |q| \leq a \\[1ex] 0 & \text{elsewhere} \end{cases}, \qquad \text{(IV. 43)}$$<br />

where 2 a ∈ R + is the width of the slit, and q the Cartesian coordinate parallel to the screen and<br />

perpendicular to the slit.



Figure IV. 6: The probability distribution in position, |ψ ss (q)|², for a slit of width 2a<br />

The Fourier transform of ψ ss is<br />

$$\tilde\psi_{ss}(p) = \sqrt{\frac{a}{\pi\hbar}}\;\frac{\sin(ap/\hbar)}{ap/\hbar}. \qquad \text{(IV. 44)}$$<br />

The square of this wave function, | ˜ψ ss (p)| 2 , has the same form as the diffraction pattern for the slit<br />

which is formed on a photographic plate placed far away, see figure IV. 7.<br />

Figure IV. 7: The diffraction pattern, | ˜ψ ss (p)|², for a small slit of width 2a; the indicated width of the central peak is 2π/a<br />

For the standard deviation of position and momentum in the state ψ ss we find<br />

$$(\Delta_{\psi_{ss}} Q)^2 = \int_{\mathbb{R}} q^2\,|\psi_{ss}(q)|^2\,dq = \frac{1}{2a}\int_{-a}^{+a} q^2\,dq = \tfrac{1}{3}a^2 \qquad \text{(IV. 45)}$$<br />



and<br />

$$(\Delta_{\psi_{ss}} P)^2 = \int_{\mathbb{R}} p^2\,|\tilde\psi_{ss}(p)|^2\,dp = \frac{\hbar}{\pi a}\int_{\mathbb{R}} |\sin(ap/\hbar)|^2\,dp = \infty, \qquad \text{(IV. 46)}$$<br />

yielding<br />

$$\Delta_{\psi_{ss}} Q\;\Delta_{\psi_{ss}} P = \tfrac{1}{3}\sqrt{3}\,a\cdot\infty. \qquad \text{(IV. 47)}$$<br />

This indeed satisfies the Kennard inequality (IV. 26), but in a rather uninteresting manner.<br />

Although ∆ ψss P = ∞, the function | ˜ψ ss |² has in fact a very pronounced central peak, of a width<br />
of the order a⁻¹, in which 95% of the total probability is located. It is the inverse proportionality<br />
of the width of this central peak to the width of the slit, which, according to Heisenberg, illustrates<br />
the uncertainty principle; it is impossible to make the probability densities |ψ ss (q)|² and | ˜ψ ss (p)|²<br />
arbitrarily narrow at the same time.<br />

But this conclusion cannot be inferred from the Kennard inequality (IV. 26). If a goes to infinity,<br />
| ˜ψ ss (p)|² goes to the delta function δ(p). The standard deviation ∆ ψss P , however, remains<br />
divergent. In other words, 95% of a probability distribution can be concentrated on an arbitrarily<br />
small interval, whereas the standard deviation of the distribution remains arbitrarily large. 2 If nothing<br />
is given concerning the distributions |ψ ss (q)|² and | ˜ψ ss (p)|² but the Kennard inequality (IV. 26),<br />
these distributions could both be very narrow, and, consequently, Heisenberg’s conclusion cannot be<br />
derived from the Kennard inequality, in contrast to what is usually claimed.<br />
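The contrast between the bulk of the distribution and its standard deviation can be illustrated numerically. The sketch below assumes units with ℏ = 1 and a = 1 and uses a simple midpoint rule; the function names and cut-off values are illustrative. It estimates, for a cut-off |p| ≤ K, both the probability inside the cut-off and the truncated second moment: the former approaches 1 while the latter keeps growing roughly linearly with K.

```python
import math

def momentum_density(p, a=1.0, hbar=1.0):
    """|psi_ss~(p)|^2 following (IV. 44)."""
    x = a * p / hbar
    sinc = 1.0 if x == 0 else math.sin(x) / x
    return a / (math.pi * hbar) * sinc ** 2

def truncated_moments(K, a=1.0, hbar=1.0, n=100000):
    """Midpoint-rule estimates of the probability and of <P^2>, restricted to |p| <= K."""
    dp = 2 * K / n
    prob = second = 0.0
    for i in range(n):
        p = -K + (i + 0.5) * dp
        rho = momentum_density(p, a, hbar)
        prob += rho * dp
        second += p * p * rho * dp
    return prob, second

for K in (10.0, 100.0, 1000.0):
    prob, second = truncated_moments(K)
    print(K, prob, second)   # prob tends to 1, second grows without bound
```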

Nevertheless, Heisenberg’s conclusion is correct for the given example of the single slit. This<br />
raises the question whether his statement is valid in general. What we are in fact interested in is a measure<br />
for the width of a probability distribution that represents the width of the bulk of the distribution, without extra weight on its tails.<br />
The most natural definition of such a measure is the length of the smallest interval that contains a fraction α ∈ [0, 1] of<br />
the total probability, where, roughly, α = 0.95 is taken. If ρ is a probability density, the<br />
definition is<br />

$$W_\alpha(\rho) := \min\Bigl\{\,|b-a| \;\Big|\; [a,b]\subset\mathbb{R},\ \int_a^b \rho(x)\,dx = \alpha \Bigr\}. \qquad \text{(IV. 48)}$$<br />

For position and momentum in quantum mechanics we define<br />

$$W_\alpha(Q,\psi) := \min\Bigl\{\,|b-a| \;\Big|\; [a,b]\subset\mathbb{R},\ \int_a^b |\psi(q)|^2\,dq = \alpha \Bigr\}, \qquad \text{(IV. 49)}$$<br />

$$W_\alpha(P,\psi) := \min\Bigl\{\,|b-a| \;\Big|\; [a,b]\subset\mathbb{R},\ \int_a^b |\tilde\psi(p)|^2\,dp = \alpha \Bigr\}. \qquad \text{(IV. 50)}$$<br />

The product of these measures also satisfies an uncertainty relation, as was shown for the first time by<br />
H.J. Landau and H.O. Pollak (1961), nota bene in a journal for industrial engineers of the American<br />
Bell Telephone Company,<br />

$$W_\alpha(P,\psi)\;W_\alpha(Q,\psi) \geq c_\alpha, \qquad \text{(IV. 51)}$$<br />

where $\alpha \in (\tfrac{1}{2}, 1]$, and $c_\alpha > 0$ is a constant which only depends on α, not on ψ.<br />

2 Responsible for this phenomenon is the mathematical fact that the standard deviation assigns a quadratically increasing<br />

weight to the tails of a distribution. In a Gaussian distribution, e.g. the Gaussian wave packet (IV. 32), these tails go to zero<br />

rapidly enough because an exponential power goes to zero more rapidly than any polynomial goes to infinity, but for many<br />

wave functions occurring in physics the standard deviation diverges.



From this inequality it follows that the probability densities of position and momentum cannot<br />
simultaneously be made arbitrarily narrow, in the sense that a fraction α is concentrated on an arbitrarily<br />
small interval. Thus, 34 years after the birth of the uncertainty principle, what everyone had<br />
assumed to follow from the standard uncertainty relations was finally proven.<br />

For the square wave function ψ ss (IV. 43) and its Fourier transform (IV. 44) we find<br />

$$W_\alpha(Q,\psi_{ss}) \simeq a \quad\text{and}\quad W_\alpha(P,\psi_{ss}) \simeq \frac{\hbar}{a}, \qquad \text{(IV. 52)}$$<br />

so that the product is of the order of magnitude of ℏ.<br />
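The width measure W α can be approximated on a grid. The sketch below is one possible discretization, not the method of Landau and Pollak: it samples a density with the midpoint rule and uses a sliding window to find the shortest run of cells holding at least a fraction α. Units with ℏ = 1 are assumed, and the function names, grid sizes and integration ranges are arbitrary choices.

```python
import math

def smallest_interval(rho, lo, hi, alpha, n=20000):
    """Approximate W_alpha: length of the shortest interval in [lo, hi] holding mass alpha."""
    dx = (hi - lo) / n
    mass = [rho(lo + (i + 0.5) * dx) * dx for i in range(n)]   # midpoint rule per cell
    best, left, running = math.inf, 0, 0.0
    for right in range(n):
        running += mass[right]
        while running - mass[left] >= alpha:   # shrink from the left while mass stays >= alpha
            running -= mass[left]
            left += 1
        if running >= alpha:
            best = min(best, (right - left + 1) * dx)
    return best

def slit_density(q, a=1.0):
    """|psi_ss(q)|^2 for the square wave function of a slit of width 2a."""
    return 1 / (2 * a) if abs(q) <= a else 0.0

def momentum_density(p, a=1.0, hbar=1.0):
    """|psi_ss~(p)|^2 for the Fourier transform of the square wave function."""
    x = a * p / hbar
    sinc = 1.0 if x == 0 else math.sin(x) / x
    return a / (math.pi * hbar) * sinc ** 2

for a in (1.0, 2.0):
    w_q = smallest_interval(lambda q: slit_density(q, a), -2 * a, 2 * a, 0.95)
    w_p = smallest_interval(lambda p: momentum_density(p, a), -200.0, 200.0, 0.95)
    print(a, w_q, w_p, w_q * w_p)   # w_q scales with a, w_p with 1/a
```

In this sketch the product of the two widths stays the same when a is doubled, which is the behaviour the inequality of Landau and Pollak leads one to expect.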

IV. 5. 4<br />

TIME AND ENERGY<br />

In the same article in which Heisenberg (1927) introduces the uncertainty relation for position<br />

and momentum, he also discusses the uncertainty relation between time and energy, starting from the<br />

‘well - known’ equation Et − tE = ih. This equation has caused many problems.<br />

If t is taken to be the universal time parameter, the spectrum of the operator t must be the real axis.<br />

But then the commutation relation can only be satisfied by an energy operator of which the spectrum<br />

is the real axis also. On the other hand, we know that the energy spectrum of quantum mechanical<br />

systems is generally bounded from below and can even be totally or partially discrete. Hence, the<br />

conclusion was soon drawn that there is no time operator in quantum mechanics (Von Neumann 1932,<br />

Pauli 1933). In the light of the existence of a position operator and with the theory of relativity in<br />

mind it was felt that in quantum mechanics something strange was going on with ‘time’. This is<br />

expressed in almost all textbooks and articles concerning this subject. Nevertheless, this rests on<br />
a conceptual confusion which went unnoticed for a remarkably long time.<br />

As it happens, the comparison between q and t is faulty if t is understood to be a universal time<br />

parameter. After all, q is a dynamic variable of a specific physical system, for example of a particle,<br />

and therefore there are a lot of q’s in a multiple particle system. There is, however, only one time<br />

parameter. This does not belong to a certain physical system but must be put on a par with the<br />

universal position coordinates x, y, z, with which it is linked in the theory of relativity. The time<br />
coordinate t is no more an operator in quantum mechanics than these position coordinates are. Only the<br />
dynamic variables of physical systems can be operators, and the problem outlined above is therefore<br />
a pseudo-problem.<br />

Nevertheless, one can wonder if dynamic variables exist which are just as ‘timelike’, literally<br />

speaking, as q is ‘positionlike’. The answer is affirmative. Such variables exist in systems we call<br />

‘clocks’, think, for example, of the position or the orientation of the hand of a clock. But also very<br />

simple, microscopic systems can have such variables. In quantum mechanics these dynamic time<br />

variables become operators. They occur in specific systems and therefore they are not universal.<br />

And, similar to other dynamic variables, generally the spectrum of such time operators in quantum<br />

mechanics is not the entire real axis (see further J. Hilgevoord 2002).



IV. 5. 5<br />

DOUBLE SLIT EXPERIMENT<br />

Even more interesting is the famous interference experiment with the double slit. The wave function<br />

corresponding to particles passing through the screen with the slits is, in analogy with (IV. 43),<br />

$$\psi_{ds}(q) = \begin{cases} \dfrac{1}{2\sqrt{a}} & \text{if } q \in [-A-a,\,-A+a] \cup [A-a,\,A+a] \\[1ex] 0 & \text{elsewhere} \end{cases}, \qquad \text{(IV. 53)}$$<br />

where 2a is the width of each slit, 2A is the distance between the slits, and A ≫ a, see figure IV. 8.<br />

Figure IV. 8: The probability distribution in position, |ψ ds (q)|², for a double slit; 2a is the width of each slit and<br />
2A the distance between the slits<br />

The Fourier transform of this double square wave function ψ ds is<br />

$$\tilde\psi_{ds}(p) = \sqrt{\frac{2a}{\pi\hbar}}\;\cos\Bigl(\frac{Ap}{\hbar}\Bigr)\,\frac{\sin(ap/\hbar)}{ap/\hbar}. \qquad \text{(IV. 54)}$$<br />

The function | ˜ψ ds | 2 again has the same form as the interference pattern for the slits on a photographic<br />

plate placed far away, as can be seen in figure IV. 9.<br />

Figure IV. 9: The interference pattern, | ˜ψ ds (p)|², for the double slit; the indicated widths are 2π/a for the envelope and 2π/A for the interference lines<br />



Now there are, however, two parameters playing a role. The distance between the slits, A, is a measure for<br />

the total width of |ψ_ds(q)|², while the width of the slits, a, is a<br />

measure for the ‘fine structure’ of this probability density. For |ψ̃_ds(p)|² the roles are reversed: A⁻¹,<br />

which sets the period of the cosine factor in (IV. 54), is a measure for the width of the interference lines,<br />

while a⁻¹, which sets the width of the enveloping factor, is a measure for the total width of the<br />

interference pattern. This shows the well - known fact that the width of the interference lines and the<br />

distance between the slits are inversely proportional. In a moment we will see that Bohr’s discussion<br />

of the double slit experiment rests on exactly this fact.<br />

◃ Remark<br />

Consider the measures<br />

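$$\Delta_{\psi_{ds}} Q \simeq A \quad\text{and}\quad \Delta_{\psi_{ds}} P = \infty,$$ (IV. 55)<br />

$$W_\alpha(Q, \psi_{ds}) \simeq A \quad\text{and}\quad W_\alpha(P, \psi_{ds}) \simeq \frac{\hbar}{a}.$$ (IV. 56)<br />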

None of these measures gives the fine structure. Therefore, Bohr’s Copenhagen reasoning, treated<br />

in the next subsection, cannot be based on the Kennard inequality (IV. 26) nor on the inequality of<br />

Landau and Pollak (IV. 51). ▹<br />

EXERCISE 30. Verify the calculations (IV. 55) and (IV. 56).<br />

IV. 5. 6<br />

A NEW UNCERTAINTY MEASURE<br />

Bohr’s reasoning concerning the double slit experiment goes as follows. A way to determine<br />

through which slit the particle has gone is measuring the recoil in the q - direction that the screen<br />

experiences at the passage of this particle. To this end the screen must be able to move in the q - direction.<br />

Instead of a fixed screen we take therefore a screen that is suspended from a spring, as can be<br />

seen in figure IV. 10. The incoming momentum p is perpendicular to the screen.<br />

We assume conservation of kinetic energy, i.e. a heavy screen, which means that only the direction<br />

of the momentum changes. Consequently, a particle arriving at position q of the photographic<br />

plate, gives a recoil to the screen of, assuming r ≫ A and therefore sin θ ≈ tan θ,<br />

$$\left( \frac{q \pm A}{r} \right) p,$$ (IV. 57)<br />

depending on which slit it has gone through. To be able to measure the difference in recoil, the<br />

inaccuracy δP with which the momentum of the screen was known in advance must satisfy<br />

$$\delta P < \frac{2Ap}{r}.$$ (IV. 58)



[Figure IV. 10: Moving screen. The slits 1 and 2 lie a distance 2A apart; a particle arriving at position q on the plate at distance r satisfies q = r tan θ_1 + A = r tan θ_2 − A]<br />

Because of the inequality<br />

$$\delta P \, \delta Q \gtrsim \hbar,$$ (IV. 59)<br />

the inaccuracy δQ with which the position Q of the screen was known then satisfies<br />

$$\delta Q > \frac{\hbar r}{2Ap}.$$ (IV. 60)<br />

But the width of the interference lines on the photographic plate is<br />

$$\frac{\lambda r}{2A} = \frac{\hbar r}{2Ap},$$ (IV. 61)<br />

where λ = ħ/p is the de Broglie wavelength of the electron. Bohr therefore concludes that the uncertainty<br />

in the position of the screen will result in the erasure of the interference pattern.<br />
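The chain (IV. 58) to (IV. 61) can be followed with numbers. This is a small sketch under stated assumptions: the values of A, r and p are purely illustrative, λ is taken as ħ/p as in (IV. 61), and the function name is hypothetical.<br />

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant in J s


def bohr_bounds(A, r, p):
    """Return (largest allowed delta_P, resulting lower bound on delta_Q, fringe width)."""
    delta_p_max = 2.0 * A * p / r                 # (IV.58): recoil difference to be resolved
    delta_q_min = HBAR / delta_p_max              # (IV.59) and (IV.60)
    fringe_width = (HBAR / p) * r / (2.0 * A)     # (IV.61) with lambda = hbar / p
    return delta_p_max, delta_q_min, fringe_width


if __name__ == "__main__":
    # Illustrative (assumed) values: half slit distance 1 micrometre, plate at 1 m,
    # particle momentum 1e-24 kg m/s.
    dp, dq, width = bohr_bounds(A=1e-6, r=1.0, p=1e-24)
    print(dp, dq, width)
```

For any choice of A, r and p the lower bound on δQ coincides with the fringe width, which is the core of Bohr’s argument.<br />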

◃ Remarks<br />

First, we see that Bohr applies the uncertainty principle to the screen, which means that he treats this<br />

macroscopic body quantum mechanically. Second, he uses the uncertainty principle in a qualitative<br />

manner; in particular, he does not give a definition of the uncertainties δP and δQ. Third, the relevant<br />

uncertainty in Q is of the order of magnitude of the width A⁻¹ of the interference lines. Bohr<br />

therefore has no use for the Kennard inequality (IV. 26) or the inequality of Landau and Pollak (IV. 51),<br />

which do not contain this width. Finally, Bohr does not show exactly how the erasure of the interference pattern<br />

takes place; apparently he considers it to be intuitively evident. ▹<br />

From the previous it should be clear that something is still lacking in the mathematical formulation<br />

of the uncertainty principle. One would hope that there may exist some direct relation between the



total width of a distribution in the p - language (q - language), and the fine structure of this distribution<br />

in the q - language (p - language) as exhibited by the wave function for the double slit, assuming that<br />

this relation has general validity. Indeed, such a relation has been found (Uffink and Hilgevoord 1985),<br />

$$w_\alpha(Q, \psi)\, W_\alpha(P, \psi) \geq C_\alpha \quad\text{and}\quad w_\alpha(P, \psi)\, W_\alpha(Q, \psi) \geq C_\alpha,$$ (IV. 62)<br />

where w_α( · , ψ) ∈ R⁺ is a measure for the width of the fine structure of ψ, W_α( · , ψ) ∈ R⁺ is<br />

the measure for the total width of ψ as introduced earlier, and C_α > 0 is a constant depending<br />

on α ∈ (0, 1], but not on the state ψ.<br />

Illustratively, if W is taken as a measure of the size of the objective of a microscope and w as a<br />

measure of the fine structure of the image, the inequalities express the fact that the resolving power<br />

must decrease if the aperture is reduced. Likewise, the direction of incoming radiation can be better<br />

determined by using a long array of radio telescopes than by using a short one, etc. These inequalities<br />

thus express, among other things, the well-known fact in optics that the resolving power of an<br />

apparatus improves as the apparatus is larger.<br />

The inequalities (IV. 62) seem to solve the problem for Bohr. A closer consideration, however,<br />

tells us that W_α(P, ψ) is not a suitable measure to express whether the difference in recoil can or<br />

cannot be observed. More precisely, W_α(P, ψ) > 2Ap/r does not guarantee that this difference cannot be<br />

observed. W_α(P, ψ) can be large in this experiment, which makes the inequality (IV. 62) ineffective.<br />

In fact, it is questionable whether Bohr’s argument can be based on an uncertainty relation at all.<br />

Nevertheless, his conclusion is correct! The fact is that a direct calculation of the double slit<br />

experiment by D. Hauschildt (unpublished) shows that the intensity of the interference, in case the<br />

screen is movable, is proportional to the factor<br />

$$\left| \langle\chi|\, e^{\, i \frac{2Ap}{\hbar r} Q_{sc}} \,|\chi\rangle \right|.$$ (IV. 63)<br />

Here |χ⟩ is the state of the screen and Q_sc is the position operator of the screen. The state<br />

$$|\chi\rangle' := e^{\, i \frac{2Ap}{\hbar r} Q_{sc}}\, |\chi\rangle$$ (IV. 64)<br />

is the state whose momentum spectrum is shifted by 2Ap/r with respect to the momentum spectrum<br />

of the state |χ⟩,<br />

$$\langle p \,|\, \chi' \rangle = \left\langle p - \tfrac{2Ap}{r} \,\middle|\, \chi \right\rangle.$$ (IV. 65)<br />

The factor (IV. 63) is, therefore, exactly the quantum mechanical expression describing to what extent<br />

the state of the screen after the recoil can be distinguished from the state of the screen before the<br />

recoil.<br />

If the momentum spectrum of |χ⟩ is broad with respect to 2Ap/r, the overlap (IV. 63) will be large,<br />

namely almost 1. In that case |χ⟩ and |χ⟩′ are difficult to distinguish and interference is large. If the<br />

momentum spectrum of |χ⟩ only contains peaks which are narrow with respect to 2Ap/r, then (IV. 63)<br />

is small. The states |χ⟩ and |χ⟩′ are then well distinguishable and interference is small. The essence<br />

of Bohr’s reasoning is therefore correct: to the extent to which the screen can serve as a measuring apparatus<br />

to determine the slit a particle goes through, interference disappears. Whether this reasoning<br />

can be based on an uncertainty relation is unknown to this very day.
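The behaviour of the overlap can be illustrated for one particular screen state. The sketch below assumes a Gaussian momentum amplitude with standard deviation sigma_p (an assumption made here for illustration, not a state used in the notes); for that case the overlap of the state with its copy shifted by s in momentum equals exp(−s² / (8 sigma_p²)), which the code compares with a direct numerical integral. All names are hypothetical.<br />

```python
import math


def overlap_gaussian(shift, sigma_p):
    """Closed-form overlap of a Gaussian momentum amplitude with its shifted copy."""
    return math.exp(-shift ** 2 / (8.0 * sigma_p ** 2))


def overlap_numeric(shift, sigma_p, n=4000, span=12.0):
    """Midpoint-rule integral of chi(p) * chi(p - shift) for the same Gaussian."""
    norm = (2.0 * math.pi * sigma_p ** 2) ** -0.25

    def chi(p):
        # Real amplitude whose square is a normal density with deviation sigma_p.
        return norm * math.exp(-p ** 2 / (4.0 * sigma_p ** 2))

    lo = -span * sigma_p + min(0.0, shift)
    hi = span * sigma_p + max(0.0, shift)
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        p = lo + (k + 0.5) * h
        total += chi(p) * chi(p - shift) * h
    return total


if __name__ == "__main__":
    shift = 1.0  # plays the role of 2Ap/r, in arbitrary units
    for sigma_p in (10.0, 0.5, 0.1):
        print(sigma_p, overlap_gaussian(shift, sigma_p), overlap_numeric(shift, sigma_p))
```

A spectrum that is broad compared with the shift gives an overlap close to 1 (interference survives); a narrow one gives an overlap close to 0 (the slit can be identified).<br />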



IV. 5. 7<br />

INTERPRETATION<br />

The statistical interpretation of the uncertainty W_α(A, ψ) in (IV. 62) is that it is a measure for<br />

the predictability of an outcome of measurement given a probability distribution; it is nothing but<br />

the usual statistical interpretation of the standard deviation. How we must physically understand<br />

this uncertainty depends directly on how we must physically understand quantum mechanical probabilities.<br />

We will discuss this in detail later on.<br />

The number w α (A, ψ) is a measure for the distinguishability between the state ψ (probability<br />

distribution) and some other state (other probability distribution) when measuring quantity A corresponding<br />

to operator A. This is also nothing but the usual statistical interpretation of this measure.


V<br />

HIDDEN VARIABLES<br />

While we have thus shown that the wave function does not provide a complete description<br />

of the physical reality, we left open the question of whether or not such a description<br />

exists. We believe, however, that such a theory is possible.<br />

— Einstein, Podolsky and Rosen<br />

You may have already suspected that I still believe in the hidden variables hypothesis.<br />

[. . . ] Anyway, for me, the hidden variable hypothesis is still the best way to ease my<br />

conscience about quantum mechanics.<br />

— Gerard ’t Hooft<br />

In this chapter we get acquainted with so-called ‘hidden variable theories’ and the motivation<br />

to consider such theories. We examine whether it is possible to place such a ‘hidden variable theory’<br />

underneath quantum mechanics, the way classical mechanics underlies classical statistical<br />

mechanics. We also treat the notorious impossibility theorems of Von Neumann and of Kochen<br />

and Specker.<br />

V. 1 HIDDEN REALITY<br />

Quantum mechanics is, roughly speaking, a theory about outcomes of measurements; about which<br />

values can be found upon measurement and about the probability of finding a specific value in such<br />

a measurement. Moreover, according to the Copenhagen perspective, this description is complete:<br />

there is nothing more to say about a physical system. As a consequence, quantum mechanics is<br />

exclusively concerned with the observable behaviour of measuring apparatuses.<br />

In the eyes of many authors, this is bizarre. In the entire history of physics we see that the aim<br />

of a theory has been to tell us something about how reality is organized, how to explain what we<br />

observe around us. Measuring is the pre-eminent scientific way to examine whether a given theory or<br />

hypothesis meets this aim, or to gather data to help us select theories. Measurement is not an aim, but<br />

a tool. The subject of physical theories, physical reality, does not occur in the quantum mechanical<br />

tale, in contrast to nearly all theories in classical physics.<br />

From this point of view, we could hope that quantum mechanics is some sort of cloak, which must<br />

be sustained by an underlying theory concerning physical reality. Because that underlying theory is<br />

hidden under the quantum mechanical cloak, we will speak of a hidden variable theory.<br />

So, let us examine the matter not from the viewpoint of quantum mechanics, but from ‘physical<br />

reality’, taking as a working hypothesis that something like a ‘physical reality’ exists. The behavior of



radioactive atomic nuclei, as discussed in the Introduction, p. 7, suggests that individual nuclei differ<br />

from each other: they show various life spans and emit α - particles with distinct momenta. The<br />

natural idea is that this difference in behavior has a cause, which can be found in mutually differing<br />

properties of the physical states of the individual nuclei. Quantum mechanics does not give us these<br />

differences, but perhaps a description of the state exists that goes beyond what quantum mechanics tells us.<br />

We would like such an additional description to show us how the phenomena observed at an<br />

individual nucleus follow decisively from the state of that nucleus. Such a description requires extra<br />

variables in comparison with the quantum mechanical description. It is conceivable that not all of<br />

these variables are accessible to our present, and possibly future, possibilities of observation. They<br />

are ‘hidden’ from us, but they must exist to explain the observed differences. If they exist, then<br />

quantum mechanical states correspond to probability distributions over the states described by these<br />

variables. These probability distributions would only express our ignorance concerning the exact<br />

physical states. In this respect, the situation would be entirely analogous to that in classical statistical<br />

mechanics. EPR believed that it must in principle be possible to construct such a theory.<br />

Such an attempt, interpreting quantum mechanics as a statistical theory about an underlying physical<br />

reality, is what is called a hidden variable theory, HVT for short, the support under the quantum<br />

mechanical cloak. Assuming that quantum mechanics is empirically adequate, we will examine if it<br />

is possible in principle to found this description on a HVT.<br />

An important distinction between several types of HVT’s concerns the question whether the hidden<br />

variables describing the physical state of the system can depend on which quantity of the system is<br />

measured. Theories in which this is permitted are called contextual; they will be discussed in<br />

section V. 4. For the moment, we will first concentrate on the simpler case where this is not permitted,<br />

the non - contextual theories, to be discussed in section V. 2.<br />

Another important division has to do with determinism. Although it is the objective of a HVT to<br />

supplement or complete the quantum mechanical description of a physical system, this does not imply<br />

that with this supplement the precise future behavior of this system can be entirely predicted; it is<br />

conceivable that the HVT too merely determines probabilities of possible events. In that case we speak<br />

of an indeterministic, or stochastic, HVT. In this chapter we will discuss only deterministic HVT’s,<br />

but we will come back to stochastic HVT’s in chapter VII.<br />

V. 2 NON - CONTEXTUAL HIDDEN VARIABLES<br />

Let us try to reconstruct quantum mechanics in analogy with classical statistical mechanics. We<br />

assume a space Λ analogous to the phase space Γ known from statistical physics, which we have<br />

already met in section III. 2. An arbitrary ‘point’ in that space Λ is indicated with λ. We do not in<br />

advance impose any restriction to the mathematical form of λ. The variable λ can represent anything,<br />

for example a single real variable, an infinite - dimensional vector field, complex functionals, etc. The<br />

possibilities are endless, the only restriction will be that a probability measure can be defined on Λ. It<br />

is possible to also incorporate the quantum mechanical state as a component in the specification of λ.<br />

Speaking about a ‘classical’ statistical model here does not mean that the HVT must look like<br />

classical mechanics, let alone that λ specifies the position and momentum of the particles, although<br />

we do not exclude that as a possibility.



In the HVT, a pure physical state corresponds to a single ‘point’ λ ∈ Λ. We assume that the<br />

system is always in one of these states λ ∈ Λ, even though we do not know in which one. A general,<br />

mixed state is a probability distribution over Λ. For any given λ every physical quantity A has an<br />

exact value, denoted by A[λ], which is revealed upon measurement of A, and therefore a physical<br />

quantity A can be represented as a real function on the space Λ, A : Λ → R.<br />

Furthermore, every quantity represented by quantum mechanics has to have a counterpart in the<br />

HVT. If such a quantity corresponds to the function A : Λ → R, the values that A[λ] can take are<br />

the eigenvalues of the self - adjoint operator A : H → H which, according to quantum mechanics,<br />

corresponds to quantity A.<br />

It is also required that every quantum mechanical state can be represented in the HVT; for every<br />

state operator W there must be a corresponding probability distribution ρ W over Λ. It is, however,<br />

not necessary that pure quantum states correspond to pure hidden variable states, the idea being that<br />

the HVT allows for a more detailed, complete description of the system. Neither is it necessary that<br />

every probability distribution on Λ corresponds to a state operator, the HVT could easily be a theory<br />

richer than quantum mechanics.<br />

The requirement that the HVT has to reproduce the empirical statements of quantum mechanics<br />

is now expressed in the requirement that the expectation values of quantity A belonging to a physical<br />

system in a physical state, corresponding in the HVT to ρ W , and in quantum mechanics to W , coincide,<br />

$$\langle A \rangle_{\rho_W} := \int_\Lambda A[\lambda]\, \rho_W(\lambda)\, d\lambda = \mathrm{Tr}\, A W,$$ (V. 1)<br />

where ρ_W : Λ → [0, ∞) is a probability density,<br />

$$\int_\Lambda \rho_W(\lambda)\, d\lambda = 1.$$ (V. 2)<br />

For a pure state |ψ⟩, (V. 1) reduces to<br />

$$\int_\Lambda A[\lambda]\, \rho_\psi(\lambda)\, d\lambda = \langle \psi \,|\, A \,|\, \psi \rangle.$$ (V. 3)<br />

In the discrete case the integrals are replaced by summations.<br />

Summary<br />

A non - contextual HVT is any theory meeting the following requirements.<br />

(i) Every physical state of a physical system corresponds to a probability distribution ρ over Λ.<br />

This is the state postulate.<br />

(ii) Every physical quantity A corresponds to a function A : Λ → R, λ ↦→ A[λ]. This is the<br />

observables postulate.



(iii) The range of A : Λ → R coincides with the spectrum of the self - adjoint operator A which,<br />

according to quantum mechanics, corresponds to quantity A.<br />

The expectation value of A when the physical system is in the state ρ W which, according<br />

to quantum mechanics, corresponds to the state operator W , equals the quantum mechanical<br />

expression for the expectation value<br />

$$\langle A \rangle_{\rho_W} := \int_\Lambda A[\lambda]\, \rho_W(\lambda)\, d\lambda = \mathrm{Tr}\, A W.$$<br />

We will call this last requirement (iii) the reproduction criterion.<br />

Since all probabilities in quantum mechanics can be written as Tr PW , with P ∈ P(H), it follows<br />

that all probability distributions in quantum mechanics coincide with the corresponding probability<br />

distributions in the HVT.<br />

We can now ask whether it is possible to construct a HVT satisfying the above requirements. The<br />

answer is that it is indeed possible, even in a quite trivial way, by choosing Λ large enough. We<br />

illustrate this by means of a simple example.<br />

Suppose there are only three quantities A, B, C, with possible values {a 1 }, {b 1 , b 2 }, {c 1 , c 2 } and<br />

represented by functions A, B, C : Λ → R. The possible value combinations are<br />

(a 1 , b 1 , c 1 ), (a 1 , b 1 , c 2 ), (a 1 , b 2 , c 1 ), (a 1 , b 2 , c 2 ). (V. 4)<br />

We now construct a space Λ by identifying every value combination with a point of Λ. If we denote<br />

these points by λ 1 , λ 2 , λ 3 and λ 4 , then<br />

A[λ 1 ] = a 1 , B[λ 3 ] = b 2 , C [λ 4 ] = c 2 , etc. (V. 5)<br />

When there are more quantities, we extend Λ correspondingly.<br />

We have to introduce a probability measure<br />

$$\mu : F(\Lambda) \to [0, 1] \quad\text{with}\quad \sum_j \mu(\lambda_j) = 1$$ (V. 6)<br />

such that (V. 1) is satisfied. In our case Λ is discrete and consists of four points only, as a result of<br />

which the integral (V. 1) becomes a sum. For example, to quantity B it must apply that<br />

$$\mathrm{Tr}\, B W = \sum_{j=1}^{4} B[\lambda_j]\, \mu_W(\lambda_j) = b_1 \big( \mu_W(\lambda_1) + \mu_W(\lambda_2) \big) + b_2 \big( \mu_W(\lambda_3) + \mu_W(\lambda_4) \big).$$ (V. 7)<br />

This is satisfied by<br />

$$\mu_W(a_i, b_j, c_k) = \mathrm{Tr}\, P_{a_i} W \;\, \mathrm{Tr}\, P_{b_j} W \;\, \mathrm{Tr}\, P_{c_k} W,$$ (V. 8)



where P_{a_i} is the projector on the subspace corresponding to the eigenvalue a_i of A, etc. Indeed,<br />

according to quantum mechanics<br />

$$B = b_1 P_{b_1} + b_2 P_{b_2},$$ (V. 9)<br />

and therefore<br />

$$\mathrm{Tr}\, B W = b_1\, \mathrm{Tr}\, P_{b_1} W + b_2\, \mathrm{Tr}\, P_{b_2} W,$$ (V. 10)<br />

while, with<br />

$$P_{a_1} = \mathbb{1}, \quad P_{b_1} + P_{b_2} = \mathbb{1}, \quad P_{c_1} + P_{c_2} = \mathbb{1},$$ (V. 11)<br />

we have<br />

$$\mu_W(\lambda_1) + \mu_W(\lambda_2) = \mu_W(a_1, b_1, c_1) + \mu_W(a_1, b_1, c_2) = \mathrm{Tr}\, P_{a_1} W \;\, \mathrm{Tr}\, P_{b_1} W \, \big( \mathrm{Tr}\, P_{c_1} W + \mathrm{Tr}\, P_{c_2} W \big) = \mathrm{Tr}\, P_{b_1} W.$$ (V. 12)<br />

Likewise we find<br />

$$\mu_W(\lambda_3) + \mu_W(\lambda_4) = \mathrm{Tr}\, P_{b_2} W.$$ (V. 13)<br />

Therefore, (V. 7) has been satisfied, and the same applies to the expectation values of A and C.<br />

If we have, in general, the quantities A, B, C, . . . , F, with values a_i, b_j, c_k, . . . , f_l, where<br />

i = 1, . . . , n_A, j = 1, . . . , n_B, etc., the measure<br />

$$\mu_W(a_i, b_j, c_k, \ldots, f_l) = \mathrm{Tr}\, P_{a_i} W \;\, \mathrm{Tr}\, P_{b_j} W \;\, \mathrm{Tr}\, P_{c_k} W \cdots \mathrm{Tr}\, P_{f_l} W,$$ (V. 14)<br />

satisfies requirement (V. 3) for all quantities. For example, the probability of finding for quantity A<br />

the value a_i is<br />

$$\mathrm{Prob}_{\mu_W}(A : a_i) = \sum_{j,\, k, \ldots,\, l} \mu_W(a_i, b_j, c_k, \ldots, f_l) = \mathrm{Tr}\, P_{a_i} W,$$ (V. 15)<br />

because all others sum up to 1. Here we have the required quantum mechanical result. Kochen<br />

and Specker (1967) showed how to formulate this idea in the case of an infinite number of physical<br />

quantities.<br />
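The construction can be made concrete in a few lines. The sketch below builds the product measure for the three-quantity example and checks that its marginals return the probabilities it was built from. The numerical probabilities stand in for the quantum mechanical values Tr P W and are illustrative assumptions, as are the function names.<br />

```python
from itertools import product


def product_measure(born):
    """Build the factorizing measure on value combinations.

    born maps each quantity name to a dict {value: probability}, where the
    probabilities play the role of the quantum mechanical numbers Tr(P W).
    Returns the list of quantity names and a dict {combination: weight}.
    """
    names = list(born)
    mu = {}
    for combo in product(*(born[name].items() for name in names)):
        point = tuple(value for value, _ in combo)
        weight = 1.0
        for _, prob in combo:
            weight *= prob
        mu[point] = weight
    return names, mu


def marginal(names, mu, quantity, value):
    """Probability that the given quantity takes the given value under mu."""
    idx = names.index(quantity)
    return sum(w for point, w in mu.items() if point[idx] == value)


# Illustrative (assumed) probabilities for the example with A, B and C.
born = {
    "A": {"a1": 1.0},
    "B": {"b1": 0.3, "b2": 0.7},
    "C": {"c1": 0.6, "c2": 0.4},
}
names, mu = product_measure(born)

if __name__ == "__main__":
    for point, weight in mu.items():
        print(point, weight)
    print(marginal(names, mu, "B", "b1"), marginal(names, mu, "C", "c2"))
```

Each marginal reproduces its input probability because the remaining factors sum to 1, which is the step used in the text.<br />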

This solution of the completeness problem is, however, not very interesting physically. It can be<br />

seen from the factorizable probabilities in (V. 8) that all quantities are treated here as being statistically<br />

independent, which is not in agreement with physical practice. Some quantities are functions of<br />

other quantities, e.g., kinetic energy is a function of momentum, E_kin = p²/(2m), while other quantities<br />

are linked with two or more other quantities, such as kinetic, potential and total energy, E = E_kin + E_pot.<br />

In the HVT just outlined we have ignored such links.



To illustrate this we assume that in our example C = A + B, so that c_1 = a_1 + b_1 and c_2 = a_1 + b_2.<br />

Now the possible value combinations in the HVT are<br />

(a_1, b_1, a_1 + b_1), (a_1, b_1, a_1 + b_2), (a_1, b_2, a_1 + b_1), (a_1, b_2, a_1 + b_2), (V. 16)<br />

and we see that (A + B)[λ] = A[λ] + B[λ] does not hold for all λ: it fails for the second and third combination. Nevertheless, the HVT succeeded<br />

in reproducing, by construction, all quantum mechanical expectation values, in other words,<br />

the HVT reproduces the relation<br />

⟨ψ | A + B | ψ⟩ = ⟨ψ | A | ψ⟩ + ⟨ψ | B | ψ⟩, (V. 17)<br />

without requiring<br />

(A + B)[λ] = A[λ] + B[λ]. (V. 18)<br />

If we would require (V. 18), Λ would only consist of the points (a 1 , b 1 , a 1 + b 1 ) and (a 1 , b 2 , a 1 + b 2 )<br />

which is, of course, a strong restriction.<br />
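The following sketch works through this example numerically. The eigenvalues and probabilities are illustrative assumptions. Because A has the single value a_1, the operator A + B has the same eigenprojectors as B, so the same probabilities are used for the values of C.<br />

```python
from itertools import product

# Illustrative (assumed) eigenvalues and probabilities Tr(P_b W).
a1, b1, b2 = 1.0, -1.0, 1.0
p_b = {b1: 0.3, b2: 0.7}

# C = A + B has eigenvalues a1 + b with the same probabilities as the values b.
p_c = {a1 + b: prob for b, prob in p_b.items()}

# Points of Lambda are value combinations (a, b, c), weighted by the product measure.
mu = {(a1, b, c): p_b[b] * p_c[c] for b, c in product(p_b, p_c)}


def expectation(f):
    """Average of the function f over Lambda with respect to mu."""
    return sum(weight * f(point) for point, weight in mu.items())


mean_a = expectation(lambda point: point[0])
mean_b = expectation(lambda point: point[1])
mean_c = expectation(lambda point: point[2])

# Points at which the value of C differs from the sum of the values of A and B.
violating = [point for point in mu if point[2] != point[0] + point[1]]

if __name__ == "__main__":
    print(mean_a, mean_b, mean_c)
    print(violating)
```

The averages are additive although two of the four points violate the pointwise sum rule, and these points carry non-zero weight.<br />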

In the very first proof of the impossibility of a HVT, that is, of the insolubility of the completeness<br />

problem, given by Von Neumann (1932), the requirement (V. 18) was indeed imposed on the HVT.<br />

Von Neumann required (V. 18) for every hidden variable state, in particular also for pure hidden<br />

variable states, which means that (V. 18) must apply to all λ ∈ Λ. We don’t need to discuss Von<br />

Neumann’s elaborate proof of this claim in detail, since J.S. Bell (1966) has shown this impossibility<br />

by means of a very simple example.<br />

Since the values of A[λ] etc. have to be the eigenvalues of the corresponding operators, it can be<br />

seen immediately that this requirement cannot be satisfied in general. Consider for example the Pauli<br />

matrices<br />

$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \quad\text{and}\quad \sigma_x + \sigma_y = \begin{pmatrix} 0 & 1 - i \\ 1 + i & 0 \end{pmatrix}.$$ (V. 19)<br />

The eigenvalues of σ_x and σ_y are ±1, but the eigenvalues of σ_x + σ_y are ±√2, and therefore (V. 18)<br />

cannot be satisfied.<br />
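These eigenvalues are easily verified. The sketch below uses the standard closed form for the eigenvalues of a 2 × 2 Hermitian matrix; the helper name is hypothetical.<br />

```python
import math


def eigenvalues_hermitian_2x2(m):
    """Eigenvalues of a 2x2 Hermitian matrix [[a, b], [conj(b), d]], in ascending order."""
    a, b, d = m[0][0].real, m[0][1], m[1][1].real
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + abs(b) ** 2)
    return (mean - radius, mean + radius)


sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_sum = [[sigma_x[i][j] + sigma_y[i][j] for j in range(2)] for i in range(2)]

ev_x = eigenvalues_hermitian_2x2(sigma_x)
ev_y = eigenvalues_hermitian_2x2(sigma_y)
ev_sum = eigenvalues_hermitian_2x2(sigma_sum)

# All sums of one eigenvalue of sigma_x and one eigenvalue of sigma_y.
possible_sums = sorted({x + y for x in ev_x for y in ev_y})

if __name__ == "__main__":
    print(ev_x, ev_y, ev_sum, possible_sums)
```

The possible sums of individual values are −2, 0 and 2, none of which is an eigenvalue of the sum operator.<br />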

Bell argued that the requirement (V. 18) is physically unreasonable. For instance, measuring<br />

σ x ,σ y and σ x +σ y requires three different measurement apparatuses, for example three Stern - Gerlach<br />

magnets in three different orientations. There is absolutely no reason to assume that an algebraical<br />

link would exist between the individual outcomes of these measurements. The fact that in quantum<br />

mechanics the relation (V. 17) exists for pure states, even in case A and B do not commute, must be<br />

considered as a particular property of quantum mechanics.<br />

Since the requirement (V. 18) is unreasonably strong, one can wonder whether there are other,<br />

reasonable, requirements which can be imposed to a HVT in order to find acceptable solutions of the<br />

completeness problem. This brings us to the next section.


V. 3 KOCHEN AND SPECKER’S THEOREM<br />


As we already proved in section II. 4, p. 28, in quantum mechanics the next theorem holds: if the<br />

operators A, B, C, . . . commute, there is a maximal operator O of which they are a function,<br />

A = f (O), B = g(O), etc. (V. 20)<br />

A measuring procedure for A, B, C, . . . would be to measure O and apply the function relation to the<br />

result in order to find the values for A, B, C, . . . Kochen and Specker (1967, p. 64) call the quantities<br />

corresponding to A, B, C, . . . commeasurable.<br />

Now it seems reasonable to require, as Von Neumann did, that the HVT also has this structure, i.e.,<br />

for B, C : Λ → R, if B = f (C), it follows that B[λ] = f ( C [λ] ) , or<br />

f (C)[λ] = f ( C [λ] ) . (V. 21)<br />

This function rule, (V. 21), yields the so - called sum rule for commuting operators,<br />

[A, B] = 0 =⇒ (A + B)[λ] = A[λ] + B[λ], (V. 22)<br />

since, with O again the maximal operator of which A and B are a function, A = f (O), B = g(O),<br />

implying<br />

(A + B) = h(O) with h = f + g, (V. 23)<br />

from (V. 21) it then follows in this HVT that<br />

$$(A + B)[\lambda] = h(O)[\lambda] = h\big( O[\lambda] \big) = f\big( O[\lambda] \big) + g\big( O[\lambda] \big) = f(O)[\lambda] + g(O)[\lambda] = A[\lambda] + B[\lambda].$$ (V. 24)<br />

EXERCISE 31. Prove, again using (V. 21), the product rule for commuting operators,<br />

[A, B] = 0 =⇒ (A B)[λ] = A[λ] B[λ]. (V. 25)<br />

Now we will see how the requirement, (V. 21), which at first sight is eminently reasonable, nevertheless<br />

renders a HVT of quantum mechanics impossible.<br />

THEOREM :<br />

A HVT satisfying the requirements (i) - (iii), p. 111, and the function rule (V. 21), does<br />

not exist if dim H > 2.



Proof<br />

Consider a complete collection of mutually orthogonal projectors P 1 , . . . ,P N on a N - dimensional<br />

Hilbert space. Such projectors mutually commute; [P i , P j ] = 0. An arbitrary sum of such projectors<br />

over some subset ∆ ⊂ {1, . . . , N} is again a projector,<br />

$$\sum_{i \in \Delta} P_i = P_\Delta \in P(H).$$ (V. 26)<br />

Therefore, according to the sum rule (V. 22) it has to hold that<br />

$$\sum_{i \in \Delta} P_i[\lambda] = P_\Delta[\lambda].$$ (V. 27)<br />

But the values P i [λ] are the eigenvalues of the operators P i , therefore they are 0 or 1, likewise<br />

for P ∆ [λ], these values also follow from (V. 21). In particular, taking ∆ = {1, . . . , N}, we find<br />

$$\sum_{i=1}^{N} P_i[\lambda] = \mathbb{1}[\lambda] = 1.$$<br />

But then the value assignment P i [λ] to the projectors satisfies the requirements for a probability<br />

measure on P (H), i.e.<br />

µ λ (P i ) := P i [λ] ∈ {0, 1} (V. 28)<br />

is a normalized, additive mapping on the subspaces of H. According to Gleason’s theorem, p. 47,<br />

this probability measure can always be written as<br />

µ λ (P i ) = Tr P i W λ , (V. 29)<br />

for a certain state operator W λ , provided that dim H > 2. There is, however, a contradiction<br />

between (V. 29) and (V. 28). The measure (V. 29) is continuous; a small change of the direction<br />

of P i induces a small change of µ(P i ). The measure (V. 28) is however necessarily discontinuous<br />

because µ(P i ) can only have the values 0 and 1.<br />

The conclusion has to be that a value assignment to quantities satisfying (V. 21), and therefore<br />

(V. 27), is impossible. As a consequence, a HVT of this type is not possible. □<br />

In this proof we used Gleason’s theorem, which is difficult to prove; his own proof is not very<br />

transparent. Direct proofs of the impossibility of this value assignment have also been given.<br />

Bell (1966) and Kochen and Specker (1967) were the first to prove this in general, i.e., for dim H > 2<br />

and for all states; see also Belinfante (1973). We will not discuss these proofs in detail but restrict<br />

ourselves to a number of observations. Before we do so, we formulate Kochen and Specker’s theorem.<br />

KOCHEN AND SPECKER’S THEOREM :<br />

It is not possible to assign values to all physical quantities of an arbitrary physical system,<br />

with a Hilbert space of dim > 2, in accordance with function rule (V. 21).



Sketch of the direct proof<br />

We can formulate the problem as follows. Consider as a particular case of (V. 26) a resolution of<br />

identity into 1 - dimensional projectors,<br />

P 1 + P 2 + · · · + P n = 11. (V. 30)<br />

According to (V. 21), thence (V. 22), the following must hold<br />

P 1 [λ] + P 2 [λ] + · · · + P n [λ] = 11[λ] = 1 (V. 31)<br />

for every resolution of identity. Consider the 1 - dimensional projectors on H as lines in all possible<br />

directions through the origin of H. Now assign to all lines the value 0 or 1, such that the sum<br />

of the values of each complete set of orthogonal lines is 1. Alternatively, consider the points of<br />

intersection of these lines with the surface of the unit sphere in H. To each point of the sphere the<br />

value 0 or 1 is assigned, antipodal points are assigned the same value, and the sum of the values<br />

of the points of intersection of an orthogonal basis with the surface of the sphere is 1.<br />

If this problem is soluble in a complex H, it is also soluble in a real H with the same dimension.<br />

To see this, choose a basis in H and generate, by application of real orthogonal transformations,<br />

a structure which is isomorphic to a real H. Therefore, we can restrict ourselves to proving the<br />

impossibility of the requested value assignment in a real H.<br />

Furthermore, the impossibility in H N implies the impossibility in H N+1 . This can be shown<br />

by considering the N - dimensional subspace which is orthogonal to a line having value 0. Each<br />

orthogonal (N + 1) - tuple of which this line is a part then turns into an N - tuple with a correct<br />

value assignment. In other words, if it is possible in an (N +1) - dimensional H, it is also possible<br />

in an N - dimensional H and, therefore, we only have to consider a real H with a dimension as<br />

low as possible.<br />

Notice that the problem for a 2 - dimensional Hilbert space H 2 does have a solution, see for<br />

example the diagram V. 1.<br />

[Figure V. 1: A solution for dim H = 2]<br />
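One explicit solution in two real dimensions can be written down directly. The sketch below is an illustrative choice made here and is not necessarily the assignment drawn in the figure: a line through the origin with direction angle φ gets the value 1 if φ modulo π lies in [0, π/2), and 0 otherwise.<br />

```python
import math


def value_of_line(phi):
    """Value 0 or 1 assigned to the line through the origin with direction angle phi."""
    return 1 if (phi % math.pi) < math.pi / 2 else 0


def check(phi):
    """True if the line at phi and the orthogonal line have values summing to 1,
    and the antipodal direction phi + pi gets the same value as phi."""
    orthogonal_ok = value_of_line(phi) + value_of_line(phi + math.pi / 2) == 1
    antipodal_ok = value_of_line(phi) == value_of_line(phi + math.pi)
    return orthogonal_ok and antipodal_ok


if __name__ == "__main__":
    for phi in (0.1, 0.9, 2.0, 4.0, -0.7):
        print(phi, value_of_line(phi), check(phi))
```

The construction works because in two dimensions every line belongs to exactly one orthogonal pair, so the pairs can be valued independently of each other. In three dimensions each line belongs to infinitely many triads, and this freedom is lost.<br />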

All proofs therefore aim at the case of a real, 3 - dimensional Hilbert space H 3 . Now it immediately<br />

seems plausible that the requested value assignment in H 3 is not possible: to each point of<br />

the unit sphere in R³ with value 1 there belong infinitely many points having value 0, namely the equator<br />

of which that point is a pole. On the other hand, of each orthogonal triad of points only two points<br />

have the value 0. But this is, of course, not a proof.<br />

Bell (1966, pp. 450, 451) showed that points with different values cannot be arbitrarily close.<br />

This is an independent proof of the continuity of the measure, and therefore contrary to the necessary<br />

discontinuity of (V. 28).



Kochen and Specker (1967, p. 69) explicitly constructed a set of 117 spin quantities for which no<br />

consistent value assignment exists. This construction is depicted on the cover of Redhead (1987)<br />

and can be seen in figure V. 2 a. It shows that every value assignment in accordance with function<br />

rule (V. 21) leads to contradictions.<br />

Kochen and Conway only needed 31 quantities in the so-called Peres cube of 33 points (Peres 1993).<br />
This construction is depicted in figure V. 2 b. □<br />

Figure V. 2: a) Kochen-Specker diagram (Redhead 1987); b) Conway-Kochen diagram (Tkadlec 2000)<br />



Figure V. 3: M.C. Escher, Waterfall. Consider the 3 interpenetrating cubes on the top of the<br />

left pillar. Each cube has 4 lines from the mutual center to its vertices, 6 lines to the centers of<br />

its edges, and 3 lines to the centers of its faces. Three of the lines are shared by all three cubes,<br />

giving 3 · (4 + 6 + 3 ) − 6 = 33 lines. These are Peres’ vectors. (Text Meyer 2003 )<br />

It is interesting to see what the measure (V. 29), according to Von Neumann the probability measure<br />
of quantum mechanics, looks like in this case. For a pure state $W = |\psi\rangle\langle\psi|$, with $P_i = |\chi\rangle\langle\chi|$,<br />
the measure (V. 29) is<br />
$$\mu(P_i) = \operatorname{Tr} P_i W = \langle\psi|P_i|\psi\rangle = |\langle\chi|\psi\rangle|^2 \tag{V. 32}$$
so that in a real space we have<br />
$$\mu(P_i) = |\langle\chi|\psi\rangle|^2 = \cos^2\theta, \tag{V. 33}$$



with θ the angle between |ψ⟩ and |χ⟩, see figure V. 4.<br />

Figure V. 4: $\mu(P_i) = \cos^2\theta$<br />
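As a numerical illustration of (V. 32) and (V. 33), the following sketch evaluates Tr P W for a real 3-dimensional example and checks that the values over an orthogonal triad add up to 1. The state `psi` and the triad are arbitrary choices made for this example, and the small matrix helpers are written out only to keep the sketch self-contained.

```python
import math

def outer(u, v):
    """Matrix |u><v| for real vectors u and v."""
    return [[ui * vj for vj in v] for ui in u]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

psi = normalize([1.0, 2.0, 2.0])   # illustrative pure state
W = outer(psi, psi)                # W = |psi><psi|

# An orthonormal triad of directions chi (illustrative choice)
triad = [normalize([1.0, 1.0, 0.0]),
         normalize([1.0, -1.0, 0.0]),
         [0.0, 0.0, 1.0]]

measures = []
cosines_squared = []
for chi in triad:
    P = outer(chi, chi)                        # P = |chi><chi|
    measures.append(trace(matmul(P, W)))       # Tr P W
    cos_theta = sum(c * p for c, p in zip(chi, psi))
    cosines_squared.append(cos_theta ** 2)     # cos^2 of the angle

total = sum(measures)
print(measures, total)
```

The triad sum equals 1 because the three projectors form a decomposition of unity and W has trace 1.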

In the appendix of these lecture notes, p. 183 ff., we will prove that, if we assign to each point<br />
of the upper half of a unit sphere a non-negative real number such that 1 is assigned to the ‘north<br />
pole’, 0 is assigned to the ‘equator’ and the sum of the values of each orthogonal triad in this half<br />
sphere is 1, there is only one possible value assignment and that is the quantum mechanical one,<br />
i.e., the one in accordance with $\cos^2\theta$.<br />

◃ Remarks<br />

First, illustrations of Kochen and Specker’s theorem are easy to find for Hilbert spaces of dimension<br />

larger than 3, for example 8, in which case a handful of quantities suffices, see Mermin (1993). We<br />

will come back to that in section VII. 6. Second, when restricted to rational angles between spin<br />

vectors, no contradiction with quantum mechanics can be obtained, as D.A. Meyer (1999) proved. ▹<br />

V. 3. 1 SUMMARY<br />

According to Kochen and Specker’s theorem, a HVT satisfying the state postulate and the observables<br />
postulate, p. 111 (i) and (ii), together with the function rule (V. 21), contradicts the state<br />
postulate and the observables postulate of quantum mechanics if dim H > 2, although for Hilbert<br />
spaces with dim H ≤ 2 such a HVT is possible. This conclusion shows how stringent the vector space structure<br />
of quantum mechanics is; in particular, the fact that there are many different decompositions of<br />
unity forms a heavy barrier for a HVT.<br />

V. 4 CONTEXTUAL HIDDEN VARIABLES<br />

Essential for Kochen and Specker’s proof is the fact that a 1-dimensional projector can be part<br />
of several decompositions of unity. This is possible as long as the projectors are not maximal, i.e.,<br />
if dim H > 2. The existence of degenerate projectors, apart from unity, is essential for the proof of<br />
Kochen and Specker, and for this reason it does not hold in a 2-dimensional H, where all projectors<br />
except $\mathbb{1}$ are maximal. By means of degenerate projectors, non-commuting operators also become<br />
connected to each other. By the requirement (V. 21) this is transferred to the quantities of the HVT, so



that via a detour we still impose a requirement for non-commeasurable quantities on the HVT. We<br />
will consider this in detail now.<br />

Suppose that operator $A$ commutes with the maximal operators $C_1$ and $C_2$, while $[C_1, C_2] \neq 0$.<br />
Then we have<br />
$$A = f(C_1) \quad\text{and}\quad A = g(C_2), \tag{V. 34}$$
which implies<br />
$$f(C_1) = g(C_2), \tag{V. 35}$$

and we see that A is degenerate. Function rule (V. 21) leads to the same relation between the quantities<br />
of the HVT,<br />
$$A[\lambda] = f\big(C_1[\lambda]\big) \quad\text{and}\quad A[\lambda] = g\big(C_2[\lambda]\big), \tag{V. 36}$$
yielding<br />
$$f\big(C_1[\lambda]\big) = g\big(C_2[\lambda]\big). \tag{V. 37}$$

Again, this is a relation between the value assignments to quantities which do not commute in quantum<br />
mechanics, but the relation is not one-to-one: the functions f and g are not bijective.<br />

It can be argued that such a requirement is unreasonable because such quantities are not<br />
commeasurable. In other words, the structure of quantum mechanics, and particularly the proposition<br />
that an operator can be a function of two non-commuting maximal operators, leads to relations<br />
between quantities which cannot be measured in one single experiment.<br />

The following is what occurs at the different decompositions of unity. Consider two bases, $\{|\alpha_j\rangle\}$<br />
and $\{|\beta_j\rangle\}$, in a Hilbert space H of dimension N > 2 and suppose that $|\alpha_1\rangle = |\beta_1\rangle$, while all other<br />
basis vectors are different. Then we have<br />
$$\sum_{j=1}^{N} P_{|\alpha_j\rangle} = \mathbb{1} = \sum_{j=1}^{N} P_{|\beta_j\rangle} \quad\text{and}\quad P_{|\alpha_1\rangle} = P_{|\beta_1\rangle}. \tag{V. 38}$$
Define, as follows, two maximal operators with all coefficients $c_j$ and $d_j$ distinct,<br />
$$C := \sum_{j=1}^{N} c_j\, P_{|\alpha_j\rangle} \quad\text{and}\quad D := \sum_{j=1}^{N} d_j\, P_{|\beta_j\rangle}, \tag{V. 39}$$
then it follows that<br />
$$P_{|\alpha_1\rangle} = f(C) = g(D). \tag{V. 40}$$

This leads to a connection between the non-commuting operators C and D, and using (V. 21)<br />
this leads to a connection between the corresponding representations C[λ] and D[λ] in the HVT. It is<br />
this type of relation which the HVT cannot satisfy.
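The situation of (V. 38) to (V. 40) can be made concrete in a small sketch for N = 3. The two bases, the coefficients `c` and `d`, and the helper names are illustrative assumptions; the functions f and g are taken to be the indicator functions of the eigenvalues $c_1$ and $d_1$, one possible choice satisfying (V. 40).

```python
import math

def outer(u, v):
    """Matrix |u><v| for real vectors u and v."""
    return [[ui * vj for vj in v] for ui in u]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(coeffs, basis):
    """Operator sum_j coeffs[j] |basis[j]><basis[j]| (spectral form)."""
    n = len(basis)
    total = [[0.0] * n for _ in range(n)]
    for coeff, vec in zip(coeffs, basis):
        proj = outer(vec, vec)
        for i in range(n):
            for j in range(n):
                total[i][j] += coeff * proj[i][j]
    return total

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

s = 1 / math.sqrt(2)
alpha = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
beta = [[1.0, 0.0, 0.0], [0.0, s, s], [0.0, s, -s]]   # beta[0] equals alpha[0]

c = [1.0, 2.0, 3.0]   # distinct eigenvalues, so C is maximal
d = [5.0, 6.0, 7.0]   # distinct eigenvalues, so D is maximal
C = combine(c, alpha)
D = combine(d, beta)

# f(C) and g(D): apply indicator functions to the eigenvalues
f_of_C = combine([1.0 if cj == c[0] else 0.0 for cj in c], alpha)
g_of_D = combine([1.0 if dj == d[0] else 0.0 for dj in d], beta)

commutator_size = max_abs_diff(matmul(C, D), matmul(D, C))
same_projector = max_abs_diff(f_of_C, g_of_D) < 1e-12

print(commutator_size, same_projector)
```

Here C and D do not commute, yet the same projector is a function of both, which is the connection that the function rule carries over to the HVT.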



◃ Remark<br />
Notice that the occurrence of the non-maximal operators $P_{|\alpha_i\rangle}$ is indeed essential: if $P_{|\alpha_i\rangle}$ were<br />
maximal, C and D would commute, as we saw in section II. 4 on p. 30. M.J. Maczynski (1971) has<br />
proved that if we exclusively consider maximal quantities, and therefore apply (V. 21) to<br />
maximal quantities only, Kochen and Specker’s theorem is no longer valid, and in that case a HVT is<br />
possible. ▹<br />

An obvious expedient is to strictly constrain requirement (V. 21) to quantities which are measurable<br />
within one context. In our example the projector $P_{|\alpha_1\rangle}$ is commeasurable with both C and D,<br />
while C and D are not mutually commeasurable. Therefore, we have to distinguish between a value<br />
assignment $P_{|\alpha_i\rangle}[\lambda]$ within the context of a measurement of C, and one within the context of a measurement<br />
of D. We can think, for example, of a measurement of C and application of the function relation<br />
$P_{|\alpha_1\rangle} = f(C)$, or of a measurement of D and application of the function relation $P_{|\alpha_1\rangle} = g(D)$.<br />

More generally, suppose<br />
$$A = f(C) = g(D) \quad\text{where}\quad [C, D] \neq 0. \tag{V. 41}$$

Then we distinguish the hidden variable quantities $A_C[\lambda]$ and $A_D[\lambda]$, where the index indicates the<br />
context of measurement. If C and D do not commute there is, according to a contextual HVT, no<br />
reason to assume that for all $\lambda \in \Lambda$ it holds that<br />
$$A_C[\lambda] = A_D[\lambda], \tag{V. 42}$$
as is the case in every HVT we have considered so far.<br />

Kochen and Specker do assume (V. 42), however, and find a contradiction with quantum mechanics.<br />
The remedy is therefore to ‘split up’ all degenerate quantities by adding the context in which<br />
they are measured, as was first proposed by B.C. van Fraassen (1973). For the sake of convenience<br />
we here assume that a measurement of a degenerate quantity always proceeds by means of the measurement<br />
of a maximal quantity, which does not have to be split up. By definition we then have<br />
$$A_C[\lambda] = f\big(C[\lambda]\big) \quad\text{and}\quad A_D[\lambda] = g\big(D[\lambda]\big). \tag{V. 43}$$
This yields a weaker form of (V. 21). Suppose A = f(C), B = g(C) and A = h(B) = h(g(C));<br />
then using (V. 43) we have<br />
$$A_C[\lambda] = h\big(B_C[\lambda]\big). \tag{V. 44}$$
This consideration leads to a new postulate for a HVT; a HVT which accommodates this<br />
postulate we call contextual.<br />

CONTEXTUAL OBSERVABLES POSTULATE:<br />
If A is a physical quantity which can be taken as a function of at least two other physical<br />
quantities, for example A = f(C) and A = g(D), then, in the HVT, to A corresponds<br />
a function $A_C : \Lambda \to \mathbb{R}$ iff quantity C is measured, and a function $A_D : \Lambda \to \mathbb{R}$ iff<br />
quantity D is measured. If A, f(C) and g(D) are the corresponding quantum mechanical<br />
operators, the following applies,<br />
$$\forall\, \lambda \in \Lambda : \; A_C[\lambda] = A_D[\lambda] \iff [C, D] = 0. \tag{V. 45}$$



Although splitting up quantities is a natural consequence of the idea of commeasurability, it means<br />
giving up a one-to-one relation between the quantities of quantum mechanics and those of the HVT in<br />
a very drastic manner: since the operator $P_{|\alpha_1\rangle}$ is part of infinitely many decompositions of unity, there<br />
are infinitely many contexts in which $P_{|\alpha_1\rangle}$ can be measured.<br />

The idea that the context of the measurement must be taken into consideration can already be<br />
found in Bell (1966). In this article, which was actually written earlier than his famous article with the<br />
Bell inequality, Bell makes some observations concerning the requirements which could be imposed<br />
on a contextual HVT. They have to have a spatial meaning and enable us to interpolate a space-time<br />
picture, preferably causally, between the preparation and the measurement of states.<br />

He then considers Bohm’s theory of the quantum potential, see chapter VI, and shows that this<br />
theory is not local. He wonders whether every HVT of quantum mechanics must have this non-local character<br />
(Bell 1966, p. 452):<br />

However, it must be stressed that, to the present writer’s knowledge, there is no proof that<br />

any hidden variable account of quantum mechanics must have this extraordinary character.<br />

It would therefore be interesting, perhaps, to pursue some further “impossibility<br />

proofs,” replacing the arbitrary axioms objected to above by some condition of locality,<br />

or of separability of distant systems.<br />

Meanwhile, still before the delayed publication of his article, Bell (1964) himself had found such a<br />

proof.<br />

Now we will show how the idea of locality can be expressed in a contextual HVT with<br />
‘split’ quantities. Consider a composite system with Hilbert space $\mathcal{H} = \mathcal{H}_I \otimes \mathcal{H}_{II}$ and an operator of<br />
the form $A \otimes \mathbb{1}$, where A is maximal in $\mathcal{H}_I$. Then the operator $A \otimes \mathbb{1}$ is not maximal in $\mathcal{H}$, and<br />
$$A \otimes \mathbb{1} = f(X), \tag{V. 46}$$
where X is some maximal operator on $\mathcal{H}$. In particular, consider an X of the form<br />
$$X = X_I \otimes X_{II}. \tag{V. 47}$$
Suppose there is no interaction, or no longer any interaction, between the systems I and II. Then we can raise the<br />
question whether $X_{II}$ must be taken to belong to the context of $A \otimes \mathbb{1}$.<br />

Consider a second maximal operator<br />
$$Y = X_I \otimes Y_{II} \tag{V. 48}$$
which only differs from X in the last factor. We then have<br />
$$A \otimes \mathbb{1} = f(X) = g(Y). \tag{V. 49}$$
A requirement of locality is now that<br />
$$(A \otimes \mathbb{1})_{X_I \otimes X_{II}}[\lambda] = (A \otimes \mathbb{1})_{X_I \otimes Y_{II}}[\lambda], \tag{V. 50}$$
in other words, a change in what is measured of system II does not result in a splitting of<br />
quantities of system I. A contextual HVT satisfying (V. 50) is called local.



The key question is whether a local contextual HVT is compatible with quantum mechanics. As an<br />
example we consider Bohm’s version of the thought experiment of EPR (Cooke and Hilgevoord 1979):<br />
two spin-1/2 particles in a singlet state. Measurements of the spin of each of the particles<br />
correspond to operators of the form $\sigma_i \otimes \tau_j$, where $\sigma_i$ is the operator of the component of the spin of<br />
the first particle in the direction i and $\tau_j$ is, likewise, the operator for the second particle. In contrast<br />
to the previously considered operators of the form $X_I \otimes X_{II}$, the operators $\sigma_i \otimes \tau_j$ are not maximal.<br />

Let us consider three directions, $i, j \in \{1, 2, 3\}$, which means there are nine such measurements.<br />
The result of a measurement of spin is either up or down, and consequently every measurement has<br />
four possible outcomes. If we introduce a quantity in the HVT for each of the nine quantities, we can,<br />
as we saw, reproduce the quantum mechanical predictions. Between the operators the relation<br />
$$\sigma_i \otimes \tau_j = (\sigma_i \otimes \mathbb{1})\,(\mathbb{1} \otimes \tau_j), \quad\text{with } i, j \in \{1, 2, 3\} \tag{V. 51}$$
holds. Now we also have to introduce quantities in the HVT for the six operators $\sigma_i \otimes \mathbb{1}$ and $\mathbb{1} \otimes \tau_j$.<br />
In an autonomous HVT the quantities must also satisfy (V. 51), because the factors on the right-hand<br />
side of (V. 51) commute. This means that there are only six independent quantities in the<br />
HVT and it can be shown that with this the experimental predictions of quantum mechanics cannot<br />
be reproduced, see Wigner’s derivation in VII. 3.<br />
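The operator relation (V. 51) and the commutation of its two factors can be verified with a short sketch. It assumes that the three directions are mutually orthogonal and that the spin components are represented by the Pauli matrices (spin in units of ħ/2). The helper names are illustrative. The singlet expectation values computed at the end should reproduce the standard result −1 for equal directions and 0 for orthogonal ones.

```python
import math

def kron(a, b):
    """Kronecker (tensor) product of two square matrices."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

IDENTITY = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],        # direction 1
    [[0, -1j], [1j, 0]],     # direction 2
    [[1, 0], [0, -1]],       # direction 3
]

s = 1 / math.sqrt(2)
singlet = [0, s, -s, 0]      # (|up,down> - |down,up>) / sqrt(2), real components

def expectation(op, state):
    """<state|op|state> for a state with real components."""
    return sum(state[r] * op[r][c] * state[c]
               for r in range(4) for c in range(4))

worst_product_error = 0.0
worst_commutator = 0.0
correlations = {}
for i, sigma in enumerate(PAULI, start=1):
    for j, tau in enumerate(PAULI, start=1):
        left = kron(sigma, IDENTITY)     # sigma_i (x) 1
        right = kron(IDENTITY, tau)      # 1 (x) tau_j
        product = kron(sigma, tau)       # sigma_i (x) tau_j
        worst_product_error = max(
            worst_product_error, max_abs_diff(product, matmul(left, right)))
        worst_commutator = max(
            worst_commutator,
            max_abs_diff(matmul(left, right), matmul(right, left)))
        correlations[(i, j)] = expectation(product, singlet)

print(worst_product_error, worst_commutator)
```

The nine product operators are thus fixed by the six one-particle operators, which is the source of the count of six independent quantities in an autonomous HVT.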

In a contextual HVT, however, we consider the quantities $\sigma_i \otimes \mathbb{1}$ and $\mathbb{1} \otimes \tau_j$ to be dependent on<br />
the context of the operators of which they are functions. Let $\chi(\tau_j)$ be a function which assigns the<br />
value 1 to the outcome of every spin measurement $\tau_j$,<br />
$$\chi(\tau_j) = \mathbb{1}, \quad\text{with } j \in \{1, 2, 3\}. \tag{V. 52}$$
We then have<br />
$$(\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] = \big(\sigma_i \otimes \chi(\tau_j)\big)[\lambda]. \tag{V. 53}$$
This quantity represents the spin of particle 1 within the context of a measurement of $\sigma_i \otimes \tau_j$,<br />
which is a measurement of both spins followed by multiplication of the results. Since $j \in \{1, 2, 3\}$,<br />
this gives a 3-fold splitting of the quantity $\sigma_i \otimes \mathbb{1}$. The product rule now only applies to quantities<br />
in the same context, and its validity is trivial in this case. There are again enough independent quantities in<br />
the HVT to be able to reproduce quantum mechanics. The splitting has worked.<br />

But at the same time we see the price we have to pay: the splitting does not satisfy the weak<br />
requirement of locality (V. 50), because for $j \neq j'$ we make a distinction between the quantities<br />
$$(\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] \quad\text{and}\quad (\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_{j'}}[\lambda]. \tag{V. 54}$$
This means that properties, i.e. quantities having values, of the one particle can no longer be specified<br />
independently of those of the other particle, even if there is no interaction between these particles and<br />
they are located in different galaxies. Redhead (1987, p. 135) speaks of an ontological contextuality.<br />

The conclusion is that a contextual HVT has to be non-local to be compatible with quantum<br />
mechanics.<br />

◃ Remark<br />
Notice that we did not speak of a measurement of the quantity $\sigma_i \otimes \mathbb{1}$. We have invariably seen



this as being derived from the measurement of an operator of which it is a function. In this way the<br />
maximal operators eventually acquire a special status: they are not split up and they are the<br />
only operators which can be measured directly. This can be assumed theoretically, but the relation<br />
with the experimental practice in the laboratory, where almost exclusively degenerate quantities are<br />
measured, is less clear. ▹


VI<br />

BOHMIAN <strong>MECHANICS</strong><br />

My suggestion is that at each state the proper order of operation of the mind requires<br />

an overall grasp of what is generally known, not only in formal, logical, mathematical<br />

terms, but also intuitively, in images, feelings, poetic usage of language, etc.<br />

— David Bohm<br />

But why then had Born not told me of this “pilot wave?” If only to point out what was<br />

wrong with it? [. . . ] Why is the pilot wave picture ignored in text books? Should it not be<br />

taught, not as the only way, but as an antidote to the prevailing complacency? To show<br />

that vagueness, subjectivity, and indeterminism, are not forced on us by experimental<br />

facts, but by deliberate theoretical choice?<br />

— John Bell<br />

We briefly describe Bohm’s hidden variables theory, which we will call Bohmian mechanics.<br />
Bohmian mechanics seems to have the same empirical strength as quantum mechanics, but succeeds<br />
in providing an image in space and time of what exactly takes place in micro-physical reality.<br />

VI. 1<br />

INTRODUCTION<br />

The debate between Bohr and Einstein concerning the interpretation of quantum mechanics<br />
reached its peak in the 1935 EPR article. Although both authors frequently returned to the problems,<br />
neither of them afterwards introduced new elements into his point of view. For most<br />
physicists in the nineteen thirties and later it was not difficult to declare a winner of the debate:<br />
Bohr’s view was accepted nearly unanimously. The question whether a physical reality hides behind<br />
quantum mechanics, which consists of objects having properties and of which we can form<br />
a picture in space and time, was put aside. It was also thought that Von Neumann’s proof, as discussed<br />
in V. 2, p. 114, made a hidden variables reconstruction of quantum mechanics untenable.<br />

It is the merit of Bohm to have made a breach in the Copenhagen interpretation for the first time,<br />
by doing exactly what was impossible or meaningless according to the Copenhageners. In 1952<br />
he published two articles in which he presented a HVT of quantum mechanics. In the second article<br />
he describes the breach as follows (Bohm 1952 part II, p. 188):<br />

The usual interpretation of the quantum theory implies that we must renounce the possibility<br />

of describing an individual system in terms of a single precisely defined conceptual<br />

model. We have, however, proposed an alternative interpretation which does not imply



such a renunciation, but which instead leads us to regard a quantum - mechanical system<br />

as a synthesis of a precisely definable particle and a precisely definable ψ - field which<br />

exerts a force on this particle.<br />

Bohm’s theory is strongly related to ideas which Louis de Broglie already put forward at the<br />

Solvay Conference in 1927. However, criticism from the Copenhageners at the conference, especially<br />

expressed by Pauli, made de Broglie abandon his theory, which was indeed not quite completely and<br />

consistently developed. Bohm devised, independently of de Broglie, an entirely elaborated version,<br />

which brought about a reconversion of de Broglie.<br />

We will study Bohm’s theory because it is an example of a concrete HVT, in contrast to the abstract<br />

characterization of such theories which we discussed in the previous chapter. We will see that Bohm’s<br />

theory shows remarkable aspects which differ thoroughly from classical physics.<br />

VI. 2<br />

THE <strong>QUANTUM</strong> POTENTIAL<br />

Bohm’s theory, which we will call Bohmian mechanics, starts from wave mechanics, i.e. quantum<br />
mechanics with $L^2(\mathbb{R}^n)$ as its Hilbert space, but without the projection postulate.¹ This means that<br />
Bohm assumes that there is a wave function $\psi(\vec q, t)$ which always satisfies the Schrödinger equation.<br />
First we consider the 1-particle case; if there are more particles, ψ has more arguments.<br />

The idea is to interpret this wave function as a statistical description of a particle which always has<br />

a certain position and momentum. We will see that this particle must then be subjected to dynamics<br />

which differs from classical dynamics, by assuming that the forces acting on the particle are not<br />

exclusively the forces known from classical physics.<br />

The basic assumption is the Schrödinger equation for a particle with mass m in a time-independent<br />
potential $V(\vec q)$,<br />
$$i\hbar\,\frac{\partial \psi(\vec q, t)}{\partial t} = -\frac{\hbar^2}{2m}\,\nabla^2 \psi(\vec q, t) + V(\vec q)\,\psi(\vec q, t), \tag{VI. 1}$$

but we will interpret the wave function differently from its usual interpretation in quantum mechanics.<br />
To this end, we rewrite ψ, with the help of two real functions $R, S : \mathbb{R}^4 \to \mathbb{R}$, as<br />
$$\psi(\vec q, t) = R(\vec q, t)\, e^{\,i S(\vec q, t)/\hbar}. \tag{VI. 2}$$
It is always possible to find such functions R and S. Requiring $R(\vec q, t) \geq 0$, R and S are, at given ψ,<br />
uniquely defined, except where ψ = 0. Substitution of (VI. 2) in (VI. 1), and separating the real and<br />
imaginary parts of the resulting equation, leads to two equations,<br />

$$\frac{\partial R(\vec q, t)}{\partial t} = -\frac{1}{2m}\Big( R(\vec q, t)\,\nabla^2 S(\vec q, t) + 2\,\nabla R(\vec q, t) \cdot \nabla S(\vec q, t) \Big), \tag{VI. 3}$$
$$\frac{\partial S(\vec q, t)}{\partial t} = -\frac{\big(\nabla S(\vec q, t)\big)^2}{2m} - V(\vec q) + \frac{\hbar^2}{2m}\,\frac{\nabla^2 R(\vec q, t)}{R(\vec q, t)}. \tag{VI. 4}$$
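The intermediate steps behind (VI. 3) and (VI. 4) can be sketched as follows. The sketch assumes the polar form $\psi = R\,e^{iS/\hbar}$ with $R \neq 0$, suppresses the arguments $(\vec q, t)$, and is a reconstruction of the standard calculation rather than a quotation from Bohm's article.

```latex
\begin{align*}
i\hbar\,\frac{\partial\psi}{\partial t}
  &= \Big( i\hbar\,\frac{\partial R}{\partial t} - R\,\frac{\partial S}{\partial t} \Big)\, e^{iS/\hbar}, \\
\nabla^2\psi
  &= \Big( \nabla^2 R + \frac{2i}{\hbar}\,\nabla R\cdot\nabla S
      + \frac{i}{\hbar}\,R\,\nabla^2 S - \frac{1}{\hbar^2}\,R\,(\nabla S)^2 \Big)\, e^{iS/\hbar}.
\end{align*}
% Insert both into the Schroedinger equation, divide by e^{iS/\hbar},
% and compare imaginary and real parts:
\begin{align*}
\text{Im:}\quad \hbar\,\frac{\partial R}{\partial t}
  &= -\frac{\hbar}{2m}\,\big( R\,\nabla^2 S + 2\,\nabla R\cdot\nabla S \big), \\
\text{Re:}\quad -R\,\frac{\partial S}{\partial t}
  &= -\frac{\hbar^2}{2m}\,\nabla^2 R + \frac{R\,(\nabla S)^2}{2m} + V R .
\end{align*}
```

Dividing the imaginary part by $\hbar$ gives (VI. 3), and dividing the real part by $-R$ gives (VI. 4).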

¹ In the literature, ‘Bohmian mechanics’ is understood to mean a ‘streamlined’ version of Bohm’s original theory, without a<br />
quantum potential.



First we consider equation (VI. 3). Using the abbreviation $\rho = R^2$ this equation becomes<br />
$$\frac{\partial \rho(\vec q, t)}{\partial t} + \nabla \cdot \Big( \rho(\vec q, t)\, \frac{\nabla S(\vec q, t)}{m} \Big) = 0, \tag{VI. 5}$$

where $\rho = R^2$ is equal to $|\psi|^2$, the quantum mechanical probability density for finding a particle<br />
at a certain position. This leads to the interpretation of $\rho(\vec q, t)$ as the probability density of finding<br />
the particle at time t at position $\vec q \in \mathbb{R}^3$. If we now interpret $\nabla S(\vec q, t)$ as the momentum of the<br />
particle, $\nabla S = \vec p = m \vec v$, (VI. 5) acquires a clear meaning: it is the continuity equation for a probability<br />
density ρ, which expresses that the total probability, given by the integral of $\rho(\vec q, t)$ over $\mathbb{R}^3$, is<br />
constant in time.<br />
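For completeness, the step from (VI. 3) to (VI. 5) can be written out. This is a short reconstruction using only $\rho = R^2$ and the product rule, with the arguments $(\vec q, t)$ suppressed.

```latex
\begin{align*}
\frac{\partial\rho}{\partial t}
  = 2R\,\frac{\partial R}{\partial t}
  &= -\frac{1}{m}\,\big( R^2\,\nabla^2 S + 2R\,\nabla R\cdot\nabla S \big) \\
  &= -\frac{1}{m}\,\big( \rho\,\nabla^2 S + \nabla\rho\cdot\nabla S \big)
  = -\nabla\cdot\Big( \rho\,\frac{\nabla S}{m} \Big).
\end{align*}
```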

Now consider equation (VI. 4). The last term in this equation is the only term of both (VI. 3)<br />
and (VI. 4) in which Planck’s constant appears explicitly. For this term we define the so-called<br />
quantum potential,<br />
$$U(\vec q, t) := -\frac{\hbar^2}{2m}\,\frac{\nabla^2 R(\vec q, t)}{R(\vec q, t)}. \tag{VI. 6}$$

If the quantum potential U were equal to 0, equation (VI. 4) would read<br />
$$\frac{\partial S(\vec q, t)}{\partial t} = -\frac{\big(\nabla S(\vec q, t)\big)^2}{2m} - V(\vec q), \tag{VI. 7}$$

which is exactly the classical Hamilton - Jacobi equation for one particle. In (VI. 7), S is called the<br />

action, and ∇S is, as mentioned above, the momentum of the particle. In other words, if U = 0,<br />

we can interpret equations (VI. 3) and (VI. 4), and therefore also the equivalent Schrödinger equation<br />

(VI. 1), as the statistical description of a particle moving in a potential V in accordance with the<br />

laws of classical mechanics. We will discuss (VI. 7) more elaborately in section VI. 5, thereby also<br />

motivating the interpretation of ∇S.<br />

If the quantum potential U is not equal to 0, the interpretation just discussed can<br />
still be given if we assume that, next to the classical potential V, the quantum potential U is added<br />
as a correction to the equation of motion. The momentum is still given by $\vec p = \nabla S$, and (VI. 5)<br />
remains a continuity equation. However, (VI. 7) is replaced by (VI. 4), the Hamilton-Jacobi<br />
equation for a particle in the potential field V + U. We see that we have now adopted, besides the<br />
well-known $-\nabla V$, an extra force which acts on the particle,<br />
$$\vec F(\vec q, t) = \frac{d\vec p(\vec q, t)}{dt} = -\nabla \big( V(\vec q) + U(\vec q, t) \big). \tag{VI. 8}$$
If the limit $\hbar \to 0$ is taken in the Schrödinger equation (VI. 1), the result is nonsense, but if $\hbar \to 0$ is<br />
taken in the definition (VI. 6) of the quantum potential U, we have $U(\vec q, t) = 0$, and (VI. 8) reduces<br />
to Newton’s law of motion.<br />

We will now discuss a simple example to illustrate the difference between Bohmian mechanics<br />

and quantum mechanics.



EXAMPLE<br />

A particle sits in a 1-dimensional ‘box’ of length L, having walls which are formed by infinitely<br />
high potential barriers. Quantum mechanics gives as stationary solutions<br />
$$\psi_n(q, t) = \psi_n(q)\, e^{-i E_n t/\hbar}, \tag{VI. 9}$$
with<br />
$$\psi_n(q) = \sqrt{\frac{2}{L}}\, \sin\Big( \frac{n \pi q}{L} \Big), \quad q \in [0, L], \tag{VI. 10}$$
and energy values<br />
$$E_n = \frac{\hbar^2}{2m} \Big( \frac{n \pi}{L} \Big)^2. \tag{VI. 11}$$
Therefore, in Bohmian mechanics for a stationary state we have<br />
$$R_n(q, t) = \psi_n(q) \quad\text{and}\quad S_n(q, t) = -E_n t. \tag{VI. 12}$$
Now it is surprising that in this example it holds that<br />
$$p = \frac{\partial S_n}{\partial q} = \frac{\partial(-E_n t)}{\partial q} = 0, \tag{VI. 13}$$

i.e., according to Bohmian mechanics the particle is motionless. This also applies to other cases of<br />
stationary states, for example to the ground state of the hydrogen atom. It is in direct contradiction<br />
to the statements of quantum mechanics. After all, in the case of the box quantum mechanics<br />
assigns, if the particle is in the state $\psi_n$, a large probability to finding the momentum p with values<br />
around $\pm n\pi\hbar/L$, in which case the particle moves with speed $|p|/m > 0$, although the quantum mechanical<br />
expectation value of p is zero for the particle in the box.<br />

This example shows that the statements of quantum mechanics and Bohmian mechanics do not<br />

coincide for all quantities. They only correspond concerning probability distributions for position<br />

measurements. Bohmian mechanics is, therefore, not a HVT in the sense of chapter V, where it was<br />

assumed that the statements of such a theory are similar to the statements of quantum mechanics for<br />

all quantities. Von Neumann’s impossibility proof is therefore not applicable to Bohmian mechanics.<br />

The explanation of the discrepancy between Bohmian mechanics and quantum mechanics lies, of<br />

course, in the use of the quantum potential. According to Bohm, the energy of the particle in the box<br />

has been entirely stored in the form of potential energy as a result of the quantum potential, hence,<br />

the particle has no kinetic energy.<br />
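The claim that the whole energy sits in the quantum potential can be checked numerically. The sketch below assumes units with ħ = m = L = 1 and evaluates the definition (VI. 6) of U by a central finite difference at one sample point inside the box. The function names, the step size and the sample point are illustrative choices.

```python
import math

HBAR = 1.0     # units with hbar = 1 (assumption of this sketch)
MASS = 1.0     # particle mass
LENGTH = 1.0   # box length L

def amplitude(n, q):
    """R_n(q) = psi_n(q) for the particle in the box."""
    return math.sqrt(2 / LENGTH) * math.sin(n * math.pi * q / LENGTH)

def energy(n):
    """Energy eigenvalue E_n of the box."""
    return HBAR**2 / (2 * MASS) * (n * math.pi / LENGTH) ** 2

def quantum_potential(n, q, h=1e-4):
    """U = -(hbar^2 / 2m) R'' / R, with R'' from a central difference."""
    second = (amplitude(n, q + h) - 2 * amplitude(n, q)
              + amplitude(n, q - h)) / h**2
    return -HBAR**2 / (2 * MASS) * second / amplitude(n, q)

q_sample = 0.3 * LENGTH   # away from the nodes of psi_1, psi_2, psi_3
for n in (1, 2, 3):
    print(n, quantum_potential(n, q_sample), energy(n))
```

Within the finite-difference error, U equals $E_n$ at the sample point, consistent with a particle at rest whose energy is entirely quantum potential energy.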

This changes however as soon as we open the box by removing one or both barriers. The quantum<br />

potential energy is again released, and the particle will start to move. The wave packet ψ(⃗q, t) then<br />

spreads out in space, in exactly the same way as prescribed by the Schrödinger equation, and there<br />

is no difference anymore between the statements of both theories concerning the movement of the<br />

particle.



The discrepancy between Bohmian mechanics and quantum mechanics has no perceptible consequences<br />

if we argue that all measurements are ultimately made by means of observation of position.<br />

Every physical quantity is eventually determined by a ‘pointer’ with a certain position, and a momentum<br />

measurement must eventually be registered by means of the displacement of some object.<br />

◃ Remark<br />

Notice that Bohm’s point of view deviates from that of Bohr, which says that position and momentum<br />

measurements exclude each other in principle but are both necessary to be able to give an exhaustive<br />

description of the system. ▹<br />

Figure VI. 1: The quantum potential for the two slit system as viewed from the screen, under assumption<br />

of a Gaussian distribution at the slits (Bohm 1989 )<br />

Finally we consider a special case. Suppose that $A, B \subset \mathbb{R}^3$ are disjoint areas in space,<br />
i.e. $A \cap B = \emptyset$, that $\psi_A$ and $\psi_B$ are wave functions which are 0 outside these areas, and that the wave<br />
function has the following form,<br />
$$\psi(\vec q) = a\, \psi_A(\vec q) + b\, \psi_B(\vec q), \tag{VI. 14}$$
with $a, b \in \mathbb{R}$. Since $\psi_A$ and $\psi_B$ have no overlap, for all $\vec q \in \mathbb{R}^3$ it holds that<br />
$$\psi_A(\vec q)\, \psi_B(\vec q) = 0. \tag{VI. 15}$$
Therefore, the probability density belonging to (VI. 14) is<br />
$$\rho(\vec q) = |a\, \psi_A(\vec q)|^2 + |b\, \psi_B(\vec q)|^2, \tag{VI. 16}$$
without a cross-term, and we see that the ensemble of particles described by the density $|\psi(\vec q)|^2$<br />
behaves like a mixture.
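The absence of the cross-term can be illustrated in one dimension. The sketch below uses real, unnormalised sine-shaped bumps as stand-ins for $\psi_A$ and $\psi_B$, which is an assumption made only for this example, and compares the disjoint case with an overlapping one.

```python
import math

def psi_a(q):
    """Bump supported on the interval [0, 1] (stand-in for psi_A)."""
    return math.sin(math.pi * q) if 0.0 <= q <= 1.0 else 0.0

def psi_b(q, shift=2.0):
    """Bump supported on [shift, shift + 1] (stand-in for psi_B)."""
    x = q - shift
    return math.sin(math.pi * x) if 0.0 <= x <= 1.0 else 0.0

a, b = 0.6, 0.8
grid = [0.01 * i for i in range(-100, 401)]

def cross_term_size(shift):
    """Largest |rho - (|a psi_A|^2 + |b psi_B|^2)| on the grid."""
    worst = 0.0
    for q in grid:
        full = (a * psi_a(q) + b * psi_b(q, shift)) ** 2
        mixture = (a * psi_a(q)) ** 2 + (b * psi_b(q, shift)) ** 2
        worst = max(worst, abs(full - mixture))
    return worst

disjoint = cross_term_size(2.0)      # supports [0, 1] and [2, 3]
overlapping = cross_term_size(0.5)   # supports [0, 1] and [0.5, 1.5]
print(disjoint, overlapping)
```

Only in the overlapping case does an interference term appear, so only there does the density differ from that of a mixture.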



With<br />
$$S(\vec q) = \begin{cases} S_A(\vec q) & \text{for } \vec q \in A, \\ S_B(\vec q) & \text{for } \vec q \in B, \\ 0 & \text{elsewhere}, \end{cases} \tag{VI. 17}$$
and $\psi_A(\vec q) = R_A(\vec q)\, e^{\,i S_A(\vec q)/\hbar}$, etc., (VI. 14) reads<br />
$$\psi(\vec q) = \big( a\, R_A(\vec q) + b\, R_B(\vec q) \big)\, e^{\,i S(\vec q)/\hbar}, \tag{VI. 18}$$
which means that also the quantum potential, as depicted in figure VI. 1, can now be taken as a sum<br />
of terms belonging to separate areas. The particles in area A do not perceive the wave function in<br />
area B at all.<br />

Figure VI. 2: A simulation of the double slit experiment in Bohmian mechanics. Each particle follows<br />

a certain path between the slits and the photographic plate. All particles coming from the upper slit<br />

arrive at the upper half of the photographic plate, likewise for the lower slit and lower half of the<br />

plate. The twists in the paths are caused by the quantum potential U. (Vigier et al. 1987 )<br />

VI. 3<br />

COMPOSITE SYSTEMS<br />

The technique used to rewrite the Schrödinger equation into equations describing particles with<br />

definite position and momentum in a non - classical potential field, can easily be generalized. For



example, for a system of two particles, represented by the wave function $\psi(\vec q_1, \vec q_2, t)$, we interpret<br />
$|\psi(\vec q_1, \vec q_2, t)|^2$ as the probability density that, simultaneously, particle 1 is located at position $\vec q_1$<br />
and particle 2 at position $\vec q_2$.<br />

We write<br />
$$\psi(\vec q_1, \vec q_2, t) = R(\vec q_1, \vec q_2, t)\, e^{\,i S(\vec q_1, \vec q_2, t)/\hbar}, \tag{VI. 19}$$
and the quantum potential is now given by<br />
$$U(\vec q_1, \vec q_2, t) = -\frac{\hbar^2}{R(\vec q_1, \vec q_2, t)} \left( \frac{\nabla_1^2 R(\vec q_1, \vec q_2, t)}{2 m_1} + \frac{\nabla_2^2 R(\vec q_1, \vec q_2, t)}{2 m_2} \right), \tag{VI. 20}$$

where $\nabla_i := \partial/\partial \vec q_i$ is the gradient with respect to the coordinates of particle i. In this expression the coordinates<br />
of both particles occur. Therefore, the force on particle 1, $\vec F_1 = -\nabla_1 (V + U)$, also depends,<br />
by means of the quantum potential, on the position of particle 2, and vice versa. This can be compared<br />
to the situation in Newton’s gravitation theory, where such a dependence appears in the classical<br />
potential V: there is an instantaneous interaction (Latin: actio in distans) between particles, and a choice<br />
of another initial position of one particle immediately influences the dynamics of the other.<br />

Notice, however, that in Bohmian mechanics this influence does not have to decrease with the<br />
distance between the particles. Even if $R(\vec q_1, \vec q_2, t)$ goes to 0 for $\|\vec q_1 - \vec q_2\| \to \infty$, the quantum<br />
potential $U(\vec q_1, \vec q_2)$ does not need to do so: it depends on the second derivative of R relative to R itself, which means that<br />
it depends on the strength of the oscillation of R, not on the amplitude.<br />

Also notice that the mutual dependence between the particles does not only appear by means of<br />

the quantum potential. The momentum of particle 1, given by ∇ 1 S(⃗q 1 , ⃗q 2 , t), cannot be chosen independently<br />

of the position of particle 2, and vice versa. This does not even happen in a classical theory<br />

with an actio in distans, and it gives Bohmian mechanics a deeply ‘holistic’ character.<br />

Only when the total wave function is a product does this mutual dependence disappear, because then<br />
$$\psi(\vec q_1, \vec q_2, t) = \psi_1(\vec q_1, t)\, \psi_2(\vec q_2, t), \qquad \text{(VI. 21)}$$<br />
yielding<br />
$$R(\vec q_1, \vec q_2, t) = R_1(\vec q_1, t)\, R_2(\vec q_2, t), \qquad S(\vec q_1, \vec q_2, t) = S_1(\vec q_1, t) + S_2(\vec q_2, t) \qquad \text{(VI. 22)}$$<br />

and, consequently, (VI. 20) becomes<br />

U (⃗q 1 , ⃗q 2 , t) = U 1 (⃗q 1 , t) + U 2 (⃗q 2 , t). (VI. 23)<br />

Each particle only feels its own potential field, and its momentum does not depend on the position<br />

of the other particle. If now the classical potential V is also a sum of 1 - particle potentials, this<br />

factorizability is preserved in time.<br />
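The step from the product form of R to the additive form (VI. 23) can be written out as follows. This is a sketch that assumes the quantum potential (VI. 20) carries the prefactor ħ² and that U i denotes the one-particle quantum potential built from R i alone.

```latex
\begin{align*}
U(\vec q_1, \vec q_2, t)
  &= -\frac{\hbar^2}{R_1 R_2}
     \left( \frac{R_2\, \nabla_1^2 R_1}{2 m_1} + \frac{R_1\, \nabla_2^2 R_2}{2 m_2} \right) \\
  &= -\frac{\hbar^2}{2 m_1} \frac{\nabla_1^2 R_1(\vec q_1, t)}{R_1(\vec q_1, t)}
     -\frac{\hbar^2}{2 m_2} \frac{\nabla_2^2 R_2(\vec q_2, t)}{R_2(\vec q_2, t)}
   = U_1(\vec q_1, t) + U_2(\vec q_2, t).
\end{align*}
```

The factor that does not depend on the differentiated coordinate cancels against the same factor in the denominator, which is why each term depends on one particle only. Likewise ∇ 1 S = ∇ 1 S 1 depends on ⃗q 1 alone.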

We know, however, that the wave function ψ (⃗q 1 , ⃗q 2 , t) is in general not a product<br />
state, and even if it is a product state at some moment, it will generally not remain one. We<br />
must therefore conclude that the quantum potential U represents a non - local connection between the<br />
particles.



◃ Remark<br />

For Bell, this observation was a reason to examine if quantum mechanical HVT’s can, in fact, be local<br />

at all. We will come back to this in chapter VII. ▹<br />

An intermediate form occurs if A, B, C, D ⊂ R 3 are certain areas in space, such that A ∩ C = ∅<br />

or B ∩ D = ∅, ψ A , ψ C , ϕ B , ϕ D are wave functions which are 0 outside these areas, and the wave<br />

function is, analogously to (VI. 14), of the form<br />

ψ(⃗q 1 , ⃗q 2 ) = a ψ A (⃗q 1 ) ϕ B (⃗q 2 ) + b ψ C (⃗q 1 ) ϕ D (⃗q 2 ), (VI. 24)<br />

with a, b ∈ R. Since the pair ψ A and ψ C , or the pair ϕ B and ϕ D , or both, have no overlap, for<br />

all ⃗q 1 , ⃗q 2 ∈ R 3 we have<br />

ψ A (⃗q 1 ) ψ C (⃗q 1 ) = 0 or ϕ B (⃗q 2 ) ϕ D (⃗q 2 ) = 0. (VI. 25)<br />

Therefore, the probability density belonging to (VI. 24) is<br />

ρ(⃗q 1 , ⃗q 2 ) = R 2 (⃗q 1 , ⃗q 2 ) = |a ψ A (⃗q 1 ) ϕ B (⃗q 2 )| 2 + |b ψ C (⃗q 1 ) ϕ D (⃗q 2 )| 2 , (VI. 26)<br />

without a cross - term, and we see that the ensemble, again analogously to (VI. 14), behaves like a<br />

mixture. In this case we call the wave function ψ(⃗q 1 , ⃗q 2 ) effectively factorizable.<br />
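The disappearance of the cross term can be checked on a grid. The sketch below is illustrative only: it assumes one spatial dimension per particle, and the profiles, phases, and the helper `bump` are hypothetical choices. The regions A = (0, 1) and C = (2, 3) are disjoint, while B and D overlap.

```python
import numpy as np


def bump(x, lo, hi):
    """Real profile that is exactly 0 outside the open interval (lo, hi)."""
    inside = (x > lo) & (x < hi)
    out = np.zeros_like(x)
    out[inside] = np.sin(np.pi * (x[inside] - lo) / (hi - lo))
    return out


q1 = np.linspace(-1.0, 4.0, 501)
q2 = np.linspace(-1.0, 4.0, 501)

psi_A = bump(q1, 0.0, 1.0) * np.exp(2.0j * q1)   # supported on A
psi_C = bump(q1, 2.0, 3.0) * np.exp(-1.0j * q1)  # supported on C, disjoint from A
phi_B = bump(q2, 0.0, 3.0) * np.exp(0.5j * q2)   # supported on B
phi_D = bump(q2, 1.0, 4.0) * np.exp(3.0j * q2)   # supported on D, overlaps B
a, b = 0.6, 0.8


def cross_term(psi_1, phi_1, psi_2, phi_2):
    """|a psi_1 phi_1 + b psi_2 phi_2|^2 minus the sum of the separate squares."""
    t1 = a * np.outer(psi_1, phi_1)
    t2 = b * np.outer(psi_2, phi_2)
    return np.abs(t1 + t2) ** 2 - (np.abs(t1) ** 2 + np.abs(t2) ** 2)


cross_disjoint = cross_term(psi_A, phi_B, psi_C, phi_D)

# Control case: shift C so that it overlaps A; the cross term reappears.
psi_C_overlap = bump(q1, 0.5, 1.5) * np.exp(-1.0j * q1)
cross_overlap = cross_term(psi_A, phi_B, psi_C_overlap, phi_D)

print(np.max(np.abs(cross_disjoint)), np.max(np.abs(cross_overlap)))
```

It is enough that one of the two pairs has no overlap, as (VI. 25) states; here the pair ϕ B , ϕ D overlaps and the density still behaves like a mixture.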

With<br />
$$S_{\text{tot}}(\vec q_1, \vec q_2) = \begin{cases} S_A(\vec q_1) + S_B(\vec q_2) & \text{for } \vec q_1 \in A,\ \vec q_2 \in B \\ S_C(\vec q_1) + S_D(\vec q_2) & \text{for } \vec q_1 \in C,\ \vec q_2 \in D \\ 0 & \text{elsewhere,} \end{cases} \qquad \text{(VI. 27)}$$<br />

and $\psi_A(\vec q_1) = R_A(\vec q_1)\, e^{\,i S_A(\vec q_1)/\hbar}$, etc., because of (VI. 25) it holds that<br />
$$\begin{aligned} \psi(\vec q_1, \vec q_2) &= a\, R_A(\vec q_1) R_B(\vec q_2)\, e^{\,i (S_A(\vec q_1) + S_B(\vec q_2))/\hbar} + b\, R_C(\vec q_1) R_D(\vec q_2)\, e^{\,i (S_C(\vec q_1) + S_D(\vec q_2))/\hbar} \\ &= \bigl( a\, R_A(\vec q_1) R_B(\vec q_2) + b\, R_C(\vec q_1) R_D(\vec q_2) \bigr)\, e^{\,i S_{\text{tot}}(\vec q_1, \vec q_2)/\hbar}. \end{aligned} \qquad \text{(VI. 28)}$$<br />

Therefore, also in case of composite systems, the quantum potential can be taken as a sum of terms<br />

belonging to the separate particles, and the momentum of a particle does not depend on the other<br />

particle.<br />

Consequently, we can interpret the system as being composed of a pair of particles of which one<br />

particle is in area A and the other in B, or, likewise, in area C and D. The pair of particles is not<br />

influenced by the wave functions or the quantum potential in the other area. For this reason, these<br />

pilot waves are also called empty waves. They have no dynamical influence on the particles, but they<br />
do contain energy. If, at some later time, the wave functions overlap again, they will of course<br />
also regain their influence.



VI. 4<br />

REMARKS AND PROBLEMS<br />

In Bohmian mechanics the wave function plays a double role. On the one hand, we see<br />

that ρ(⃗q, t 0 ) = R 2 = |ψ (⃗q, t 0 )| 2 is equal to the probability density to find a particle at time t 0 at<br />

a certain position, and we use this to characterize the ensemble at t 0 . On the other hand, ψ determines<br />

the value of R, and thereby, by means of formula (VI. 6) or (VI. 20), also the quantum potential which<br />

has the same status as the classical potential V . This means that ψ is also connected with the dynamic<br />

evolution of particles.<br />

This is strange if seen from a classical perspective. In classical statistical mechanics it is always<br />

possible to specify the form of the probability density at t 0 independently of the dynamics. Conversely,<br />
the force acting on a particle in a classical theory does not depend on the probability that the particle<br />
is at a position other than its actual one. But we saw that in Bohmian mechanics the force does<br />

depend on the probabilities. In Bohm’s interpretation we must therefore assume that if at an initial<br />

time t 0 the quantum mechanical probability density is |ψ (⃗q, t 0 )| 2 , the particles subsequently move<br />

under the influence of forces which are also determined by ψ(⃗q, t 0 ).<br />

Nonetheless, it can be proved that if this pre - established harmony is valid at one moment in<br />

time, it remains valid at all other times. In later work, Bohm speculated that this harmony between<br />

the quantum potential and the probability density could possibly be understood as a requirement for<br />

equilibrium of an underlying ‘sub - quantum aether’. This idea leads to the expectation that if this<br />
equilibrium is disturbed, it is restored only after some time, so that deviations<br />
from the quantum mechanical predictions could appear in sufficiently rapid measurements. Until now such<br />
deviations have not been found.<br />

Bohmian mechanics gives, on the basis of the thesis that, eventually, all measurements are position<br />

measurements, the same empirically verifiable predictions as standard quantum mechanics does.<br />

Moreover, it provides a picture in which particles have position and momentum and it can be visualized<br />

how the particles move through space, even if there is no measurement. Also, Bohmian<br />

mechanics is deterministic; the evolution is determined by classical mechanics, extended with the<br />

quantum potential. Although these properties seem to be large advantages, Bohm’s proposal evoked<br />

no enthusiasm in the nineteen fifties.<br />

Of course, from the side of the Copenhageners little support was to be expected. The proposal was<br />

dismissed as ‘metaphysical speculation’, a return to the lost paradise of classical physics. Bohm parried<br />

this argument by calling the Copenhageners’ ‘completeness’ claim untestable and metaphysical.<br />

But Einstein also found the idea ‘too cheap’ because it leaned too much on the quantum mechanical<br />

formalism in combination with the classical idea of particles. Einstein himself thought that a<br />

completely new theory with a totally different perspective was necessary, such as his unified field<br />

theory. Probably, Einstein also had objections because of the far - reaching non - locality of Bohmian<br />

mechanics.<br />

Others objected that Bohmian mechanics relies only on a rewriting of the Schrödinger<br />

equation, and contains nothing new. Bohm had foreseen this criticism and tried to argue that his theory<br />

presents new ideas for experiments and that on distance and energy scales which are within range of<br />

Heisenberg’s indeterminacy principle, Bohmian mechanics will prove to be necessary. But above all,<br />

Bohm wanted to show the possibility of a HVT and to challenge the necessity of the Copenhagen<br />

interpretation.



Bohmian mechanics has not led to new verifiable statements, although ‘tunneling times’ are debated,<br />

about which quantum mechanics does not say anything, but Bohmian mechanics does. Furthermore,<br />

by the fresh look supplied by Bohmian mechanics, new extensions of the theory are suggested,<br />

such as the suggestion of an underlying sub - quantum aether, as a result of the unexpected double<br />

role of the wave function.<br />

In the nineteen nineties, a growing group of physicists considered Bohmian mechanics to be a<br />

serious alternative to the Copenhagen interpretation, see for example Holland (1993) and Cushing<br />

(1994), who suggests a sociological explanation for the fact that the physicists’ community did not<br />

replace quantum mechanics by the, according to Cushing, superior Bohmian mechanics.<br />

VI. 5<br />

THE HAMILTON - JACOBI EQUATION<br />

In classical mechanics we assume that for a system of n point particles, with canonical positions<br />

⃗q = (q 1 , . . . , q n ) ∈ R 3n and speeds ˙⃗q = ( ˙q 1 , . . . , ˙q n ) ∈ R 3n , a Lagrangian L(⃗q, ˙⃗q, t) can be<br />

found, the Lagrangian L = T − V being the difference between kinetic and potential energy. Define<br />

the following functional, called the action<br />

$$S_\gamma(\vec q, t; \vec q_0, t_0) := \int_\gamma L(\vec q, \dot{\vec q}, t)\, dt, \qquad \text{(VI. 29)}$$<br />

where the integral, for n particles in 3 dimensions, is taken over a continuous path γ in configuration<br />

space R 3n between an initial configuration ⃗q 0 at time t 0 and the configuration ⃗q at time t. In case the<br />

Lagrangian does not explicitly depend on t, we can also write S γ (⃗q, ⃗q 0 , t − t 0 ).<br />

The equations of motion are found by application of Hamilton’s principle of least action; for the<br />

path γ 0 which is actually followed, the action reaches an extremum in comparison to all possible<br />

continuous paths. This requirement,<br />

δS γ = 0, (VI. 30)<br />

provides the 3n equations of motion of Euler and Lagrange,<br />
$$\frac{d}{dt} \frac{\partial L}{\partial \dot q_j} - \frac{\partial L}{\partial q_j} = 0, \qquad j = 1, \dots, 3n. \qquad \text{(VI. 31)}$$<br />

The Hamiltonian, H = T + V , is defined as the Legendre transform of the Lagrangian,<br />

$$H(\vec q, \vec p, t) := \sum_{j=1}^{3n} p_j\, \dot q_j - L(\vec q, \dot{\vec q}, t), \qquad \text{(VI. 32)}$$<br />
where<br />
$$p_j := \frac{\partial L}{\partial \dot q_j} \qquad \text{(VI. 33)}$$<br />
is the canonical momentum.
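As a worked illustration, not taken from the notes, consider a single coordinate q with the harmonic oscillator Lagrangian; the mass m and spring constant k are assumed symbols.

```latex
\begin{align*}
L &= \tfrac{1}{2} m \dot q^2 - \tfrac{1}{2} k q^2,
  &\frac{d}{dt}\frac{\partial L}{\partial \dot q} - \frac{\partial L}{\partial q}
  &= m \ddot q + k q = 0, \\
p &= \frac{\partial L}{\partial \dot q} = m \dot q,
  &H &= p\, \dot q - L = \frac{p^2}{2m} + \tfrac{1}{2} k q^2 = T + V.
\end{align*}
```

The Legendre transform thus reproduces H = T + V for this example, in agreement with the general statement above.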



Substitution of (VI. 32) in (VI. 29) yields<br />

$$S_\gamma = \int_\gamma \Bigl( \sum_{j=1}^{3n} p_j\, \dot q_j - H(\vec q, \vec p, t) \Bigr)\, dt = \sum_{j=1}^{3n} \int_\gamma p_j\, dq_j - \int_\gamma H(\vec q, \vec p, t)\, dt, \qquad \text{(VI. 34)}$$<br />

and variation of S γ in this form yields the 6n Hamiltonian equations of motion,<br />
$$\dot q_j = \frac{\partial H}{\partial p_j}, \qquad \dot p_j = -\frac{\partial H}{\partial q_j}. \qquad \text{(VI. 35)}$$<br />

Now consider the action S γ along a real path γ 0 , i.e., a path satisfying the equations of motion,<br />

and form its differential,<br />

$$dS(\vec q, \vec q_0, t - t_0) = \sum_{j=1}^{3n} \bigl( p_j\, dq_j - p_{0j}\, dq_{0j} \bigr) - H(\vec q, \vec p, t)\, dt. \qquad \text{(VI. 36)}$$<br />

Comparison with<br />

$$dS(\vec q, \vec q_0, t - t_0) = \sum_{j=1}^{3n} \Bigl( \frac{\partial S}{\partial q_j}\, dq_j + \frac{\partial S}{\partial q_{0j}}\, dq_{0j} \Bigr) + \frac{\partial S}{\partial t}\, dt \qquad \text{(VI. 37)}$$<br />

and using requirement (VI. 30) shows that<br />

$$H(\vec q, \vec p, t) = -\frac{\partial S}{\partial t}, \qquad p_j = \frac{\partial S}{\partial q_j}, \qquad p_{0j} = -\frac{\partial S}{\partial q_{0j}}, \qquad \text{(VI. 38)}$$<br />

and therefore<br />

$$\frac{\partial S}{\partial t} + H\Bigl( \vec q, \frac{\partial S}{\partial \vec q}, t \Bigr) = 0. \qquad \text{(VI. 39)}$$<br />

This is (VI. 7), the Hamilton - Jacobi equation, as discussed on p. 129. The technique to solve the<br />

mechanical equations of motion by means of this equation is especially due to Jacobi. Without discussing<br />

this technique in detail, we mention the following.<br />

For definite q 0 and t 0 it is possible to consider the action S as a function on configuration space. It<br />

can be shown that the paths satisfying the equations of motion are always perpendicular to the hyperplanes<br />

of constant S, hence the frequently quoted analogy with optics; paths are comparable to rays<br />

of light, and planes of constant S to wave fronts. If, for one moment in time, the values S are given<br />

over the complete configuration space, the Hamilton - Jacobi equation determines how they evolve in<br />

the course of time. The problem of finding the paths of the particles is thus reduced to constructing the<br />
curves which are normal to the planes of constant S.<br />
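A simple check of these relations, added here as an illustration and assuming a free particle of mass m in one dimension (V = 0), uses the action along the straight path from q 0 at t 0 to q at t:

```latex
\begin{align*}
S(q, q_0, t - t_0) &= \frac{m (q - q_0)^2}{2 (t - t_0)}, \\
p = \frac{\partial S}{\partial q} &= \frac{m (q - q_0)}{t - t_0}, \qquad
p_0 = -\frac{\partial S}{\partial q_0} = \frac{m (q - q_0)}{t - t_0}, \\
\frac{\partial S}{\partial t} + \frac{1}{2m} \Bigl( \frac{\partial S}{\partial q} \Bigr)^2
  &= -\frac{m (q - q_0)^2}{2 (t - t_0)^2} + \frac{m (q - q_0)^2}{2 (t - t_0)^2} = 0.
\end{align*}
```

The momentum obtained from the gradient of S is the constant momentum of uniform motion, p = p 0 , and the Hamilton - Jacobi equation is satisfied identically.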

◃ Remark<br />

Schrödinger originally based his derivation of wave mechanics on the idea that wave mechanics is to<br />

classical mechanics as wave optics is to ray optics, and with the just mentioned wave fronts and the<br />

Hamilton - Jacobi equation he came to his wave mechanics. ▹


VII<br />

BELL’S INEQUALITIES<br />

There is hardly a paper - nor was there any during the past two and a half decades -<br />

which deals with the foundations of quantum mechanics and does not refer to the work<br />

of John Stewart Bell.<br />

— Max Jammer<br />

Bell’s theorem is the most profound discovery of science.<br />

— Henry Stapp<br />

[. . . ] Bell is generally credited with having brought down a purely philosophical issue<br />

from the lofty realms of abstract speculation to the tangible reach of empirical investigation<br />

and of having thereby established what has been called ‘experimental metaphysics’.<br />

— Max Jammer<br />

The ‘Bell inequalities’ is a generic term for inequalities in terms of measurable physical quantities<br />

which are satisfied by hidden variables theories, but are violated by quantum mechanics. We will<br />

derive several Bell inequalities, belonging to different types of hidden variables theories. This<br />

also includes indeterministic, stochastic HVT’s, which fell outside the scope of chapter V.<br />

VII. 1<br />

LOCAL DETERMINISTIC HIDDEN VARIABLES<br />

VII. 1. 1<br />

DERIVATION <strong>OF</strong> THE FIRST BELL INEQUALITY<br />

Returning to the hidden variables theories, HVT’s, we focus our attention on a specific experiment.<br />

In the article ‘On the Einstein Podolsky Rosen paradox’ (1964), J.S. Bell examines the EPR experiment,<br />

discussed in section I. 2, in a version which was given by Bohm and Aharonov (Bohm 1957),<br />

also called the EPRB experiment. Bohm and Aharonov proposed an experiment in which two spin<br />

1/2 particles are prepared in the singlet state and, next, move apart in opposite directions. After they<br />

are separated, the spin of each of the particles is measured in an arbitrary direction, where the spin of<br />

particle 1 is measured in direction ⃗a and the remote particle 2 in direction ⃗ b, as in figure III. 3, p. 73.<br />

In this experiment, one can follow the same argument as EPR. Using the notation of section III. 6,<br />

if measurement of ⃗σ 1 · ⃗a yields the value +1 then, for the singlet state, measurement of ⃗σ 2 · ⃗a must<br />

yield the value −1 and vice versa.<br />

Since the result of a measurement of a spin component of the one particle can be predicted with<br />

certainty by measuring the same component of the other particle, whereas the particles are far away



from each other and do not interact, it follows, according to EPR, that the result of a measurement<br />

of any spin component is determined in advance, i.e., that it is an element of physical reality. This<br />

suggests that there should be a more complete description of the state of the particles, including<br />

hidden variables.<br />

Specify this description of the pair of particles with variables λ ∈ Λ as we did in chapter V. We<br />

write the quantities corresponding to (⃗σ 1·⃗a)⊗(⃗σ 2·⃗b) as the pair (A, B), having values a,b = ±1. In a<br />

contextual HVT, these values are dependent on the hidden variable λ and the total measuring context,<br />

which can be specified here by means of the measurement directions ⃗a and ⃗ b, leading to<br />

A = A(⃗a, ⃗ b, λ) and B = B(⃗a, ⃗ b, λ). (VII. 1)<br />

Now the essential assumption is the requirement of locality that the quantity A does not depend<br />

on the reading ⃗ b of a remote spin meter, and vice versa for B and ⃗a. These quantities therefore only<br />

depend upon the local context,<br />

A(⃗a, ⃗ b, λ) = A(⃗a, λ), a = ±1,<br />

B(⃗a, ⃗ b, λ) = B( ⃗ b, λ), b = ±1. (VII. 2)<br />

[Figure: a source with distribution ρ(λ) between two spin meters. Spin meter A has settings ⃗a and ⃗a ′ with outcomes A(⃗a, λ) = a = ±1 and A(⃗a ′ , λ) = a ′ = ±1; spin meter B has settings ⃗b and ⃗b ′ with outcomes B(⃗b, λ) = b = ±1 and B(⃗b ′ , λ) = b ′ = ±1.]<br />

Figure VII. 1: Thought experiment of Einstein, Podolsky and Rosen on the singlet<br />

The source emitting the particle pairs probably does not prepare the pairs in the same state λ each<br />

time. We assume that the source can be characterized by a probability density ρ,<br />

$$\int_\Lambda \rho(\lambda)\, d\lambda = 1, \qquad \text{(VII. 3)}$$<br />

where we also assume that this probability density does not depend on the measuring directions ⃗a<br />

and ⃗ b, which, after all, can be established long after the particles have left the source. The expectation<br />

value of the product of A and B in this HVT is therefore<br />

$$E(\vec a, \vec b) = \int_\Lambda A(\vec a, \lambda)\, B(\vec b, \lambda)\, \rho(\lambda)\, d\lambda. \qquad \text{(VII. 4)}$$



Quantum mechanics gives as the expectation value, with the particle pair in the singlet state, see<br />

equation (III. 171), p. 73,<br />

$$E_{QM}(\vec a, \vec b) = \langle\, \vec\sigma_1 \cdot \vec a \otimes \vec\sigma_2 \cdot \vec b\, \rangle = -\vec a \cdot \vec b = -\cos\theta_{\vec a, \vec b}. \qquad \text{(VII. 5)}$$<br />

But the expressions (VII. 4) and (VII. 5) cannot coincide for all directions ⃗a and ⃗ b. According<br />

to (VII. 2), the expectation value E(⃗a, ⃗ b) of the product of A and B cannot be less than −1. Therefore,<br />

to reach −1 at ⃗a = ⃗ b, also requiring equality between (VII. 4) and (VII. 5), it must hold for all unit<br />

vectors ⃗n that<br />

A(⃗n, λ) = − B(⃗n, λ), (VII. 6)<br />

which leads to<br />

$$E(\vec a, \vec b) = -\int_\Lambda A(\vec a, \lambda)\, A(\vec b, \lambda)\, \rho(\lambda)\, d\lambda. \qquad \text{(VII. 7)}$$<br />

Now it follows, because of ( A(⃗n, λ) ) 2 = 1, that<br />

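$$\begin{aligned} E(\vec a, \vec b) - E(\vec a, \vec b') &= -\int_\Lambda \bigl( A(\vec a, \lambda) A(\vec b, \lambda) - A(\vec a, \lambda) A(\vec b', \lambda) \bigr)\, \rho(\lambda)\, d\lambda \\ &= \int_\Lambda A(\vec a, \lambda) A(\vec b, \lambda) \bigl( A(\vec b, \lambda) A(\vec b', \lambda) - 1 \bigr)\, \rho(\lambda)\, d\lambda, \end{aligned} \qquad \text{(VII. 8)}$$<br />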

where ⃗ b ′ is another setting of the remote spin meter, and A( ⃗ b ′ , λ) also has values ±1. Taking the<br />

absolute value on both sides, keeping in mind that |A(⃗a, λ)A( ⃗ b, λ)| = 1, it follows that<br />

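$$|E(\vec a, \vec b) - E(\vec a, \vec b')| \le \int_\Lambda \bigl( 1 - A(\vec b, \lambda) A(\vec b', \lambda) \bigr)\, \rho(\lambda)\, d\lambda, \qquad \text{(VII. 9)}$$<br />
or,<br />
$$|E(\vec a, \vec b) - E(\vec a, \vec b')| \le 1 + E(\vec b, \vec b'). \qquad \text{(VII. 10)}$$<br />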

This is the original Bell inequality.<br />
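The inequality can be checked by brute force for a finite hidden variable space. The sketch below is an illustration under stated assumptions: Λ is taken to consist of the eight possible sign assignments for (A(⃗a), A(⃗b), A(⃗b ′)), the weights ρ are drawn at random, and the function names are hypothetical.

```python
import itertools
import random

# All deterministic assignments (A(a), A(b), A(b')) of outcomes +1 or -1.
ASSIGNMENTS = list(itertools.product((-1, 1), repeat=3))


def correlation(weights, i, j):
    """E = -sum over lambda of rho(lambda) A_i(lambda) A_j(lambda), as in (VII. 7)."""
    return -sum(w * lam[i] * lam[j] for w, lam in zip(weights, ASSIGNMENTS))


def bell_1964_gap(weights):
    """Right-hand side minus left-hand side of (VII. 10); non-negative if it holds."""
    lhs = abs(correlation(weights, 0, 1) - correlation(weights, 0, 2))
    rhs = 1.0 + correlation(weights, 1, 2)
    return rhs - lhs


rng = random.Random(1)
gaps = []
for _ in range(1000):
    raw = [rng.random() for _ in ASSIGNMENTS]
    total = sum(raw)
    gaps.append(bell_1964_gap([r / total for r in raw]))

print(min(gaps))  # smallest gap found over the random distributions
```

For a distribution concentrated on a single assignment the two sides are equal, which reflects the pointwise identity used in going from (VII. 8) to (VII. 9).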

VII. 1. 2<br />

THE BELL INEQUALITY <strong>OF</strong> CLAUSER, HORNE, SHIMONY AND HOLT<br />

Next, we will derive a second inequality. In (VII. 8), we replace ⃗a by ⃗a ′ and the − sign by<br />

the + sign,<br />

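$$\begin{aligned} E(\vec a', \vec b) + E(\vec a', \vec b') &= -\int_\Lambda \bigl( A(\vec a', \lambda) A(\vec b, \lambda) + A(\vec a', \lambda) A(\vec b', \lambda) \bigr)\, \rho(\lambda)\, d\lambda \\ &= -\int_\Lambda A(\vec a', \lambda) A(\vec b, \lambda) \bigl( 1 + A(\vec b, \lambda) A(\vec b', \lambda) \bigr)\, \rho(\lambda)\, d\lambda. \end{aligned} \qquad \text{(VII. 11)}$$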



Now, in the same way as we derived (VII. 10), we obtain<br />

$$|E(\vec a', \vec b) + E(\vec a', \vec b')| \le 1 - E(\vec b, \vec b'). \qquad \text{(VII. 12)}$$<br />
Combination of (VII. 10) and (VII. 12) leads to<br />
$$|E(\vec a, \vec b) - E(\vec a, \vec b')| + |E(\vec a', \vec b) + E(\vec a', \vec b')| \le 2. \qquad \text{(VII. 13)}$$<br />

This version of the Bell inequality was first derived, although under weaker assumptions than those<br />
used here, by Clauser, Horne, Shimony and Holt (Clauser 1969), for which reason it is also called the<br />
CHSH inequality. We will return to these assumptions in section VII. 2.<br />
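The bound of 2 can also be verified by enumeration. The following sketch assumes the general local form (VII. 4), without the anti - correlation condition (VII. 6), and a finite set of sixteen deterministic strategies; `chsh_value` and the other names are hypothetical.

```python
import itertools
import random


def chsh_value(strategy_weights):
    """|E(a,b) - E(a,b')| + |E(a',b) + E(a',b')| for a mixture of deterministic strategies.

    strategy_weights maps (A_a, A_a2, B_b, B_b2), each +1 or -1, to a probability.
    """
    def E(i, j):
        return sum(w * s[i] * s[j] for s, w in strategy_weights.items())

    return abs(E(0, 2) - E(0, 3)) + abs(E(1, 2) + E(1, 3))


strategies = list(itertools.product((-1, 1), repeat=4))

# Every single deterministic strategy on its own.
pure_values = [chsh_value({s: 1.0}) for s in strategies]

# A random probabilistic mixture of the sixteen strategies.
rng = random.Random(0)
raw = [rng.random() for _ in strategies]
total = sum(raw)
mixture = {s: r / total for s, r in zip(strategies, raw)}
mixed_value = chsh_value(mixture)

print(set(pure_values), mixed_value)
```

Each pure strategy saturates the bound, and by the triangle inequality a mixture cannot exceed it.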

VII. 1. 3<br />

VIOLATION <strong>OF</strong> THE BELL INEQUALITIES BY <strong>QUANTUM</strong> <strong>MECHANICS</strong><br />

We will now prove the following theorem.<br />

BELL’S FIRST THEOREM:<br />

A local deterministic HVT is empirically contradictory to quantum mechanics.<br />

Proof<br />

With the expression empirically contradictory we mean that the two theories make contradictory<br />

statements in terms of measurable physical quantities. We will show that, quantum mechanically,<br />

there are spin quantities which violate the Bell inequalities.<br />

Consider the configuration below, where all vectors lie in the same plane.<br />

[Figure: ⃗a, ⃗a ′ = ⃗b and ⃗b ′ in one plane; the angle between ⃗a and ⃗b is ϕ, and the angle between ⃗b and ⃗b ′ is also ϕ.]<br />

Figure VII. 2: A configuration in which the spin quantities violate the Bell inequality<br />

Using (VII. 5) for this configuration, and substituting the quantum mechanical expression into (VII. 13),<br />
$$F(\phi) := |-\cos\phi + \cos 2\phi| + |-\cos\phi - 1| \le 2. \qquad \text{(VII. 14)}$$<br />

This function is plotted in figure VII. 3.



[Figure: plot of F (ϕ) for ϕ from 0 to π, with the value 2 marked on the vertical axis.]<br />

Figure VII. 3: The Bell inequality violated for every acute angle ϕ<br />

We see that (VII. 14) is violated for every ϕ ∈ (0, ½π). The maximum violation is F (60°) = 5/2,<br />
as can be seen in the figure.<br />
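These statements about F can be reproduced with a short scan. This is a minimal sketch; the grid of 0.1° steps is an arbitrary choice made here.

```python
import math


def F(phi):
    """Left-hand side of (VII. 14) for the planar configuration of figure VII. 2."""
    return abs(-math.cos(phi) + math.cos(2.0 * phi)) + abs(-math.cos(phi) - 1.0)


# Angles from 0 to pi in steps of 0.1 degree.
grid = [k * math.pi / 1800 for k in range(1801)]
phi_max = max(grid, key=F)

print(math.degrees(phi_max), F(phi_max))

# Every grid angle strictly between 0 and 90 degrees violates the bound of 2.
violated_for_acute = all(F(k * math.pi / 1800) > 2.0 for k in range(1, 900))
print(violated_for_acute)
```

On [0, ½π] the function reduces to F (ϕ) = 2 + 2 cos ϕ (1 − cos ϕ), which makes both the violation and the location of the maximum at cos ϕ = ½ explicit.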

Even larger violations are possible in other configurations. The largest<br />
violation is obtained in the configuration of figure VII. 4 (with all vectors in a single plane),<br />
leading to<br />

$$\begin{aligned} E_{QM}(\vec a, \vec b) &= -\cos 45^\circ = -\tfrac{1}{2}\sqrt{2}, \\ E_{QM}(\vec a, \vec b') &= -\cos 135^\circ = \tfrac{1}{2}\sqrt{2}, \\ E_{QM}(\vec a', \vec b) &= -\cos 135^\circ = \tfrac{1}{2}\sqrt{2}, \\ E_{QM}(\vec a', \vec b') &= -\cos 135^\circ = \tfrac{1}{2}\sqrt{2}, \end{aligned}$$<br />
$$|E_{QM}(\vec a, \vec b) - E_{QM}(\vec a, \vec b')| + |E_{QM}(\vec a', \vec b) + E_{QM}(\vec a', \vec b')| = 2\sqrt{2}. \qquad \text{(VII. 15)}$$<br />

This is a violation of 41%. □<br />
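The value 2√2 can be reproduced from explicit unit vectors. In the sketch below the planar angles 0°, 45°, 135° and −90° for ⃗a, ⃗b, ⃗b ′ and ⃗a ′ are an assumption: they are one choice that realizes the relative angles of 45° and 135° used above, not necessarily the layout drawn in the figure.

```python
import math


def unit(deg):
    """Unit vector in the plane at the given angle in degrees."""
    r = math.radians(deg)
    return (math.cos(r), math.sin(r))


def E_qm(u, v):
    """Singlet expectation value (VII. 5): minus the inner product of the directions."""
    return -(u[0] * v[0] + u[1] * v[1])


# Assumed planar directions: theta(a,b) = 45 deg, all other relevant angles 135 deg.
a, b, b_prime, a_prime = unit(0), unit(45), unit(135), unit(-90)

chsh = abs(E_qm(a, b) - E_qm(a, b_prime)) + abs(E_qm(a_prime, b) + E_qm(a_prime, b_prime))
excess = chsh / 2.0 - 1.0  # relative violation of the bound 2

print(chsh, excess)
```

The relative excess over the bound is √2 − 1, which is the 41% quoted above.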

[Figure: ⃗a, ⃗b, ⃗a ′ and ⃗b ′ in one plane, with an angle of 45° indicated.]<br />

Figure VII. 4: The configuration giving the largest violation of the Bell inequality (all vectors in the<br />
same plane)



VII. 1. 4<br />

THE BELL INEQUALITY IN A NON-CONTEXTUAL, LOCAL DETERMINISTIC HVT<br />

To show that the Bell inequality, derived for a local deterministic contextual HVT, also holds for<br />

a local deterministic autonomous HVT, we consider a local deterministic autonomous model for the<br />

singlet.<br />

Assume that both particles are characterized by a ‘classical’ spin vector, ⃗ J and − ⃗ J, about a<br />

common axis. This is the hidden variable. In this HVT, we further assume that the outcome of a<br />

measurement of spin in the direction ⃗n is determined by the sign of the component of the spin vector<br />

in the direction ⃗n. Now let the particles fly away from each other. If the spin of the first particle in the<br />

direction ⃗a is measured we find the outcome<br />

$$\frac{\vec J \cdot \vec a}{|\vec J \cdot \vec a|} \in \{-1, 1\}, \qquad \text{(VII. 16)}$$<br />
for the spin of the second particle in direction ⃗b we find<br />
$$-\frac{\vec J \cdot \vec b}{|\vec J \cdot \vec b|} \in \{-1, 1\}. \qquad \text{(VII. 17)}$$<br />

The result of the measurement of the first particle is independent of the direction ⃗ b and vice versa,<br />

therefore, the model is local.<br />

Now consider an ensemble of such two particle systems where ⃗ J is distributed isotropically. If a n<br />

is the sign of ⃗ J · ⃗a in the n th pair, and likewise, b n the sign of − ⃗ J · ⃗b, then if ⃗ J pierces through the<br />

shaded area of the unit sphere on the right side in figure VII. 5, a n b n = +1. Otherwise, a n b n = −1.<br />

[Figure: unit spheres showing the signs of a n and b n for directions ⃗a and ⃗b separated by the angle θ, and a larger sphere with ⃗J and − ⃗J on which the regions of positive a n b n are shaded.]<br />

Figure VII. 5: Unit spheres for a n , b n and a n b n . In the shaded areas of the larger sphere a n b n is<br />
positive, in the unshaded areas a n b n is negative.



The surface of the shaded area is $4\theta_{\vec a, \vec b}$, that of the remaining part is $4(\pi - \theta_{\vec a, \vec b})$. For an isotropic<br />
distribution, averaging over the surface of the unit sphere, we therefore find<br />
$$\langle a_n b_n \rangle = \frac{1}{4\pi} \bigl( 4\theta_{\vec a, \vec b} - 4(\pi - \theta_{\vec a, \vec b}) \bigr) = -1 + \frac{2}{\pi}\, \theta_{\vec a, \vec b}, \qquad \text{(VII. 18)}$$<br />
which is an increasing line through (0, −1) having slope 2/π. This runs from perfect anti - correlation<br />
for θ = 0 to perfect correlation for θ = π.<br />
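The linear correlation can be checked by sampling the model directly. The following Monte Carlo sketch is illustrative: the sample size, the seed, the placement of ⃗a along the z axis, and the function name `hvt_correlation` are all choices made here.

```python
import numpy as np


def hvt_correlation(theta, n_pairs=200_000, seed=0):
    """Estimate <a_n b_n> for the classical spin-vector model at angle theta."""
    rng = np.random.default_rng(seed)
    # Gaussian vectors are isotropically distributed; their length does not affect the signs.
    J = rng.normal(size=(n_pairs, 3))
    a = np.array([0.0, 0.0, 1.0])
    b = np.array([np.sin(theta), 0.0, np.cos(theta)])
    a_n = np.sign(J @ a)      # outcome for particle 1, as in (VII. 16)
    b_n = np.sign(-(J @ b))   # outcome for particle 2, as in (VII. 17)
    return float(np.mean(a_n * b_n))


for theta in (0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4):
    estimate = hvt_correlation(theta)
    predicted = -1.0 + 2.0 * theta / np.pi  # equation (VII. 18)
    print(theta, estimate, predicted)
```

The estimates agree with the straight line of (VII. 18) up to statistical fluctuations of order 1/√N.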

[Figure: plot of − cos θ and of ⟨a n b n ⟩ against θ from 0 to π, with the vertical axis running from −1 to 1.]<br />

Figure VII. 6: Comparison of the quantum mechanical expectation values and those for the local<br />
deterministic HVT<br />

In this HVT, equation (VII. 18) must satisfy the Bell inequality (VII. 13) for E (⃗a, ⃗b) = ⟨a n b n ⟩.<br />
Choosing the angles as in the example on p. 142, figure VII. 2, if (VII. 18) is substituted in (VII. 13)<br />
it yields exactly 2 for any θ ≤ ½π, where the quantum mechanical expectation values violated the<br />
inequality for every θ ∈ (0, ½π).<br />

In the configuration giving the largest violation of the inequality (VII. 13), see figure VII. 4, we<br />

have<br />

$$\theta_{\vec a, \vec b} = \tfrac{1}{4}\pi \quad \text{and} \quad \theta_{\vec a, \vec b'} = \theta_{\vec a', \vec b} = \theta_{\vec a', \vec b'} = \tfrac{3}{4}\pi, \qquad \text{(VII. 19)}$$<br />

and therefore, (VII. 18) substituted in (VII. 13) yields<br />

$$\bigl| \bigl( -1 + \tfrac{1}{2} \bigr) - \bigl( -1 + \tfrac{3}{2} \bigr) \bigr| + \bigl| \bigl( -1 + \tfrac{3}{2} \bigr) + \bigl( -1 + \tfrac{3}{2} \bigr) \bigr| = 1 + 1 = 2, \qquad \text{(VII. 20)}$$<br />

where quantum mechanically, on p. 143 we found 2 √ 2.<br />

We see that where quantum mechanics violated the inequality (VII. 13), this local deterministic<br />

autonomous HVT satisfies it, thereby confirming Bell’s first theorem.<br />
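The comparison can be summarized in a few lines of code. This is a sketch with hypothetical function names; the configuration of figure VII. 2 is evaluated only for ϕ up to ½π, so that 2ϕ is still the angle between ⃗a and ⃗b ′.

```python
import math


def E_hvt(theta):
    """Correlation of the classical spin-vector model, equation (VII. 18)."""
    return -1.0 + 2.0 * theta / math.pi


def E_qm(theta):
    """Singlet correlation, equation (VII. 5)."""
    return -math.cos(theta)


def chsh(e_ab, e_ab2, e_a2b, e_a2b2):
    """Left-hand side of (VII. 13)."""
    return abs(e_ab - e_ab2) + abs(e_a2b + e_a2b2)


def chsh_fig2(phi, E):
    """Figure VII. 2: angles phi, 2 phi, 0 and phi (valid for 2 phi <= pi)."""
    return chsh(E(phi), E(2.0 * phi), E(0.0), E(phi))


def chsh_fig4(E):
    """Figure VII. 4: angle pi/4 for (a, b) and 3 pi/4 for the other three pairs."""
    return chsh(E(math.pi / 4), E(3 * math.pi / 4), E(3 * math.pi / 4), E(3 * math.pi / 4))


print(chsh_fig2(math.pi / 3, E_hvt), chsh_fig2(math.pi / 3, E_qm))
print(chsh_fig4(E_hvt), chsh_fig4(E_qm))
```

In both configurations the model sits exactly on the bound, while the quantum mechanical values exceed it.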

VII. 2<br />

LOCAL DETERMINISTIC CONTEXTUAL HIDDEN VARIABLES<br />

We have seen that a considerable difference exists between the empirically verifiable statements<br />

of quantum mechanics and those of a local deterministic, autonomous HVT for a singlet state and



suitably chosen spin directions. This enables an experimental test of these statements, and therefore<br />

of the correctness of the philosophical bases of both theories. A. Shimony (1989) spoke, concerning<br />

the experimental testing of the Bell inequalities, of ‘experimental metaphysics’.<br />

However, the question of experimental testing puts the derivation of the Bell inequalities in another<br />

perspective. We no longer want to compare a HVT with quantum mechanics, but with experimental<br />

results. In this respect (VII. 6), implying perfect anti - correlation when ⃗a = ⃗ b, is overly<br />

idealized. In a real experiment the particle detectors are not perfectly efficient, in the sense that not<br />

all particles are registered. Imagine a detector which, even if A(⃗a, λ) = 1, sometimes gives 0, i.e. not<br />

measured, or even −1, i.e. wrongly measured. Moreover, in a contextual HVT the outcomes could also<br />

depend on the measuring context, i.e. on (possibly hidden) variables of the detectors. But also in<br />

this generalized situation it is possible to derive the inequality (VII. 13) from a locality assumption.<br />

We will show this by proving the next theorem.<br />

BELL’S SECOND THEOREM:<br />

A local deterministic contextual HVT is empirically inconsistent with quantum mechanics.<br />

Proof<br />

Assume that the quantities A and B are functions of three arguments,<br />

A = A(⃗a, λ, µ), B = B( ⃗ b, λ, ν) where A, B ∈ {− 1, 1}. (VII. 21)<br />

Here the local deterministic character of the HVT is expressed; the outcome of the measurement<br />

at the measuring apparatus measuring ⃗a · ⃗σ is determined by λ ∈ Λ, describing the source, by the<br />

local hidden variables of that measuring device, expressed symbolically by µ ∈ Λ a , and by the<br />

position ⃗a of the meter pointer. Therefore, the requirement of locality is that A does not depend<br />

on ⃗ b and ν, and B does not depend on ⃗a and µ. We also assume that the hidden variables of the<br />

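apparatuses are independent of each other and of λ,<br />
$$\rho(\lambda, \mu, \nu) = \rho(\lambda)\, \rho_1(\mu)\, \rho_2(\nu). \qquad \text{(VII. 22)}$$<br />
Defining<br />
$$\langle A(\vec a, \lambda) \rangle := \int_{\Lambda_a} A(\vec a, \lambda, \mu)\, \rho_1(\mu)\, d\mu \qquad \text{(VII. 23)}$$<br />
and<br />
$$\langle B(\vec b, \lambda) \rangle := \int_{\Lambda_b} B(\vec b, \lambda, \nu)\, \rho_2(\nu)\, d\nu, \qquad \text{(VII. 24)}$$<br />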

we have, instead of assumption (VII. 2), the much weaker requirements<br />

$$|\langle A(\vec a, \lambda) \rangle| \le 1 \quad \text{and} \quad |\langle B(\vec b, \lambda) \rangle| \le 1, \qquad \text{(VII. 25)}$$<br />

and we will show now that from this it is again possible to derive the Bell inequality (VII. 13).


The expectation value in this HVT is<br />
$$E(\vec a, \vec b) = \int_\Lambda d\lambda \int_{\Lambda_a} d\mu\, A(\vec a, \lambda, \mu) \int_{\Lambda_b} d\nu\, B(\vec b, \lambda, \nu)\, \rho(\lambda, \mu, \nu) = \int_\Lambda \langle A(\vec a, \lambda) \rangle\, \langle B(\vec b, \lambda) \rangle\, \rho(\lambda)\, d\lambda, \qquad \text{(VII. 26)}$$<br />

which is an ‘averaged’ version of (VII. 4). With (VII. 25) we see that<br />

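$$|E(\vec a, \vec b) - E(\vec a, \vec b')| \le \int_\Lambda \bigl| \langle A(\vec a, \lambda) \rangle \bigl( \langle B(\vec b, \lambda) \rangle - \langle B(\vec b', \lambda) \rangle \bigr) \bigr|\, \rho(\lambda)\, d\lambda \le \int_\Lambda \bigl| \langle B(\vec b, \lambda) \rangle - \langle B(\vec b', \lambda) \rangle \bigr|\, \rho(\lambda)\, d\lambda. \qquad \text{(VII. 27)}$$<br />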

Likewise we have<br />

$$|E(\vec a', \vec b) + E(\vec a', \vec b')| \le \int_\Lambda \bigl| \langle B(\vec b, \lambda) \rangle + \langle B(\vec b', \lambda) \rangle \bigr|\, \rho(\lambda)\, d\lambda, \qquad \text{(VII. 28)}$$<br />

and therefore<br />

$$|E(\vec a, \vec b) - E(\vec a, \vec b')| + |E(\vec a', \vec b) + E(\vec a', \vec b')| \le 2, \qquad \text{(VII. 29)}$$<br />
since |x + y| + |x − y| ≤ 2 if |x| ≤ 1 and |y| ≤ 1. We see that (VII. 29) is, indeed, the Bell<br />
inequality (VII. 13).<br />
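The elementary inequality used in the last step can be seen as follows; this short argument is added for completeness and is applied with x = ⟨B(⃗b, λ)⟩ and y = ⟨B(⃗b ′ , λ)⟩.

```latex
\begin{align*}
&\text{The expression } |x + y| + |x - y| \text{ is unchanged under } x \to -x,\ y \to -y
  \text{ and } x \leftrightarrow y, \\
&\text{so we may assume } x \ge y \ge 0. \text{ Then} \\
&|x + y| + |x - y| = (x + y) + (x - y) = 2x = 2 \max(|x|, |y|) \le 2
  \quad \text{if } |x| \le 1 \text{ and } |y| \le 1.
\end{align*}
```

Integrating this pointwise bound against ρ(λ), which is normalized according to (VII. 3), gives the right-hand side 2.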

For ⃗a ′ = ⃗ b ′ and the assumption of perfect anti - correlation E ( ⃗ b ′ , ⃗ b ′ ) = −1, from inequality<br />

(VII. 13) follows the original Bell inequality (VII. 10). But, as we showed, (VII. 13) remains<br />

valid under the weaker conditions (VII. 25). □<br />

◃ Remark<br />

It is not necessary to assume mutual independence for µ and λ or for ν and λ as in (VII. 22), the<br />

result (VII. 25) also follows when we make the weaker assumption that the conditional probability<br />

distributions of the apparatuses factorize the conjoint probability distribution ρ,<br />

ρ(λ, µ, ν) = ρ(λ) ρ 1 (µ | λ) ρ 2 (ν | λ). ▹ (VII. 30)<br />

VII. 3<br />

WIGNER’S DERIVATION<br />

E.P. Wigner (1970) was the first to give an elegant derivation of a Bell inequality in terms<br />

of probabilities. We again consider the EPRB experiment from section VII. 1. Using three directions,<br />

⃗n 1 , ⃗n 2 , ⃗n 3 ∈ R 3 , define<br />

σ i := ⃗n i · ⃗σ and τ i := ⃗n i · ⃗τ with i ∈ {1, 2, 3}. (VII. 31)



Here ⃗σ and ⃗τ are the spin operators of particle 1 and particle 2, respectively. We assume the quantities<br />

of particle 1 to be independent of those of particle 2 and therefore<br />

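$$(\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] = (\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_{j'}}[\lambda], \qquad \text{(VII. 32)}$$<br />
$$(\mathbb{1} \otimes \tau_j)_{\sigma_i \otimes \tau_j}[\lambda] = (\mathbb{1} \otimes \tau_j)_{\sigma_{i'} \otimes \tau_j}[\lambda] \qquad \text{(VII. 33)}$$<br />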

for i ′ ≠ i and j ′ ≠ j. This is the requirement of locality. Without this requirement we would have<br />

nine quantities in the HVT, namely the pairs (σ i ,τ j ), that is, as many quantities as measuring contexts.<br />

Now we have only six: σ 1 , σ 2 , σ 3 , τ 1 , τ 2 , τ 3 .<br />

The outcome of measurement of every spin quantity is ±1 in units of ½ħ. A HVT must assign a<br />
probability to every combination of outcomes,<br />
$$0 \le p(\sigma_1, \sigma_2, \sigma_3, \tau_1, \tau_2, \tau_3) \le 1, \qquad \text{(VII. 34)}$$<br />

with the usual marginal distributions, for instance<br />

$$p(\sigma_1, \tau_1) = \sum_{\sigma_2 = \pm 1}\ \sum_{\sigma_3 = \pm 1}\ \sum_{\tau_2 = \pm 1}\ \sum_{\tau_3 = \pm 1} p(\sigma_1, \sigma_2, \sigma_3, \tau_1, \tau_2, \tau_3), \qquad \text{(VII. 35)}$$<br />

and so on.<br />

◃ Remark<br />

Quantum mechanics does not have such joint probability distributions because these six quantities<br />

do not all commute pairwise. The spin quantities are not jointly measurable, but in<br />

the HVT their values are all fixed. ▹<br />

Denoting the angles between $\vec{n}_1, \vec{n}_2, \vec{n}_3$ by $\theta_{12}, \theta_{23}, \theta_{31}$, we have in the singlet state, see chapter<br />

III, (III. 176) and (III. 177),<br />

$$\mathrm{Prob}(\sigma_i = 1 \wedge \tau_j = 1) = \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{ij},$$ (VII. 36)<br />

$$\mathrm{Prob}(\sigma_i = 1 \wedge \tau_j = -1) = \tfrac{1}{2} \cos^2 \tfrac{1}{2}\theta_{ij}.$$ (VII. 37)<br />

These are the quantum mechanical probabilities, and we will see that the HVT, satisfying requirement<br />

(VII. 34), cannot reproduce them. From (VII. 36) and (VII. 37) with $\theta_{ii} = 0$ follows the requirement<br />

$$p(\sigma_1, \sigma_2, \sigma_3, \tau_1, \tau_2, \tau_3) = 0 \quad\text{unless}\quad \sigma_1 = -\tau_1,\ \sigma_2 = -\tau_2,\ \sigma_3 = -\tau_3,$$ (VII. 38)<br />

because the hidden variables cannot assume values giving both particles the same spin value along the same<br />

direction.<br />

The probability for $\sigma_1$ and $\tau_3$ to be both +1 is, using (VII. 36),<br />

$$\begin{aligned} \sum_{\sigma_2, \sigma_3} \sum_{\tau_1, \tau_2} p(+, \sigma_2, \sigma_3, \tau_1, \tau_2, +) &= \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{13} \\ &= p(+, +, -, -, -, +) + p(+, -, -, -, +, +). \end{aligned}$$ (VII. 39)


Likewise we calculate the following probabilities<br />

$$\begin{aligned} \sum_{\sigma_1, \sigma_3} \sum_{\tau_1, \tau_2} p(\sigma_1, +, \sigma_3, \tau_1, \tau_2, +) &= \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{23} \\ &= p(+, +, -, -, -, +) + p(-, +, -, +, -, +) \end{aligned}$$ (VII. 40)<br />

and<br />

$$\begin{aligned} \sum_{\sigma_2, \sigma_3} \sum_{\tau_1, \tau_3} p(+, \sigma_2, \sigma_3, \tau_1, +, \tau_3) &= \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{12} \\ &= p(+, -, +, -, +, -) + p(+, -, -, -, +, +). \end{aligned}$$ (VII. 41)<br />

From (VII. 40) and (VII. 41) it follows that<br />

$$p(+, +, -, -, -, +) \le \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{23}$$ and (VII. 42)<br />

$$p(+, -, -, -, +, +) \le \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{12},$$ (VII. 43)<br />

respectively. Consequently, we have for (VII. 39), the probability for $\sigma_1$ and $\tau_3$ to be both +1,<br />

$$\tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{23} + \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{12} \ge \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{13},$$ (VII. 44)<br />

which, using $\sin^2 \tfrac{1}{2}\theta = \tfrac{1}{2}(1 - \cos\theta)$, is equivalent to<br />

$$(1 - \cos\theta_{23}) + (1 - \cos\theta_{12}) \ge (1 - \cos\theta_{13}).$$ (VII. 45)<br />

This is, in essence, the same as inequality (VII. 10); rewriting (VII. 45), realizing that $1 - \cos\theta \ge 0$,<br />

and comparing $E(\vec{a}, \vec{b})$ to $-\cos\theta_{12}$ etc. yields<br />

$$1 - \cos\theta_{23} \ge |-\cos\theta_{12} + \cos\theta_{13}|.$$ (VII. 46)<br />
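The step from (VII. 38) to (VII. 44) can be checked by brute force. The following is a minimal sketch, not part of Wigner's argument: it draws random joint distributions that obey the perfect anticorrelation (VII. 38) and verifies that the three marginals always satisfy the inequality behind (VII. 44). The function names and the random sampling are illustrative assumptions.

```python
import itertools
import random

def random_local_hvt(rng):
    """Random joint distribution over (s1, s2, s3, t1, t2, t3) obeying (VII. 38)."""
    weights = {}
    for s in itertools.product((+1, -1), repeat=3):
        t = tuple(-x for x in s)  # perfect anticorrelation: tau_i = -sigma_i
        weights[s + t] = rng.random()
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def prob_both_plus(p, i, j):
    """Marginal probability that sigma_i = +1 and tau_j = +1 (i, j in 1..3)."""
    return sum(w for k, w in p.items() if k[i - 1] == +1 and k[3 + j - 1] == +1)

def wigner_holds(p, tol=1e-12):
    """Check P(s2+, t3+) + P(s1+, t2+) >= P(s1+, t3+), the content of (VII. 44)."""
    lhs = prob_both_plus(p, 2, 3) + prob_both_plus(p, 1, 2)
    return lhs + tol >= prob_both_plus(p, 1, 3)

rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
all_ok = all(wigner_holds(random_local_hvt(rng)) for _ in range(1000))
print(all_ok)
```

Every distribution of this kind passes, because the two terms on the right of (VII. 39) each appear in one of the sums (VII. 40) and (VII. 41).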

[Figure: the directions $\vec{n}_1$, $\vec{n}_2$, $\vec{n}_3$, with angle ϕ between neighbouring directions]<br />

Figure VII. 7: Violation of the Bell inequality again<br />

With $\theta_{23} = \theta_{12} = \tfrac{1}{2}\theta_{13} = \varphi$ as in figure VII. 7, (VII. 45) becomes<br />

$$1 - 2\cos\varphi + \cos 2\varphi \ge 0,$$ (VII. 47)<br />

and using $\cos 2\varphi = 2\cos^2\varphi - 1$ we see that<br />

$$\cos\varphi\,(1 - \cos\varphi) \le 0.$$ (VII. 48)<br />

Since $1 - \cos\varphi \ge 0$ for every ϕ, this inequality is violated for every acute angle.
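As a numerical illustration of this violation, the sketch below inserts the quantum mechanical probabilities (VII. 36) into (VII. 44) for the symmetric configuration $\theta_{12} = \theta_{23} = \varphi$, $\theta_{13} = 2\varphi$. The helper names are hypothetical; a negative gap means that (VII. 44) fails.

```python
import math

def qm_prob_both_plus(theta):
    """Singlet-state probability (VII. 36) for sigma_i = tau_j = +1 at angle theta."""
    return 0.5 * math.sin(theta / 2) ** 2

def wigner_gap(phi):
    """Left-hand side minus right-hand side of (VII. 44) for
    theta_12 = theta_23 = phi and theta_13 = 2 * phi."""
    return 2 * qm_prob_both_plus(phi) - qm_prob_both_plus(2 * phi)

for degrees in (10, 30, 60, 80, 100, 120, 150):
    phi = math.radians(degrees)
    print(degrees, round(wigner_gap(phi), 4), wigner_gap(phi) >= 0)
```

Analytically the gap equals $-\tfrac{1}{2}\cos\varphi\,(1 - \cos\varphi)$, so it is negative exactly for acute ϕ, in agreement with (VII. 48).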



EXERCISE 32. What type of HVT is excluded by Wigner’s reasoning?<br />

◃ Remark<br />

Wigner (1970) makes the observation that the HVT would have been possible if the terms in (VII. 44)<br />

had been $\sin \tfrac{1}{2}\theta$ instead of $\sin^2 \tfrac{1}{2}\theta$. Apparently, our world depends on such ‘minimal’ mathematical<br />

differences. ▹<br />

VII. 4<br />

THE DERIVATION OF EBERHARD AND STAPP<br />

In the previous derivations of the Bell inequalities hidden variables were assumed, which represent<br />

properties of the pair of particles and determine the outcomes of measurements of all physical<br />

quantities. As a consequence, in this HVT a joint probability is defined for the values of non -<br />

commuting quantities also, as we saw in Wigner’s derivation. This follows from the fact that at<br />

given λ both $A(\vec{a}, \lambda)$ and $A(\vec{a}', \lambda)$ are fixed, for example<br />

$$p\big(A(\vec{a}) = 1 \wedge A(\vec{a}') = 1\big) = \int_{\Delta} \rho(\lambda)\, d\lambda,$$ (VII. 49)<br />

where $\Delta \subset \Lambda$ is the region in which both $A(\vec{a}, \lambda) = 1$ and $A(\vec{a}', \lambda) = 1$. Since quantum mechanics<br />

does not acknowledge such ‘simultaneous probabilities’ for non - commuting quantities, the quantities<br />

not being simultaneously measurable, it could be suspected that this property of the HVT is the main<br />

reason for the deviation from quantum mechanics, instead of locality or determinism.<br />

In the next derivation of the Bell inequality, given by P. Eberhard and H. Stapp (1977), the existence<br />

of hidden variables is not assumed. They claim that the Bell inequality follows from an assumption<br />

of locality only. However, what will be shown to be necessary in this derivation, is the<br />

assumption that we can speak reasonably about the outcomes of measurements which have not actually<br />

been carried out.<br />

THE EBERHARD - STAPP THEOREM:<br />

Quantum mechanics is a non - local theory.<br />

Proof<br />

Consider again the EPRB experiment. Let $\vec{a}$ and $\vec{a}'$ be two settings of the spin meter at A, and $\vec{b}$<br />

and $\vec{b}'$ likewise at B. We can carry out four experiments:<br />

$$\mathrm{I}: \vec{a}, \vec{b} \qquad \mathrm{II}: \vec{a}, \vec{b}' \qquad \mathrm{III}: \vec{a}', \vec{b} \qquad \mathrm{IV}: \vec{a}', \vec{b}'.$$ (VII. 50)<br />

Define, for the n-th pair of particles, $a_n(\mathrm{I})$ as the outcome of the spin measurement on the particle<br />

traveling to A when the meter at A points in the direction $\vec{a}$ and the spin of the other<br />

particle, which travels to B, is measured in the direction $\vec{b}$; thus $a_n(\mathrm{I}) = \pm 1$ in<br />

experiment I. The quantities $a_n(\mathrm{II})$, $a'_n(\mathrm{III})$, $a'_n(\mathrm{IV})$, $b_n(\mathrm{I})$, $b'_n(\mathrm{II})$, $b_n(\mathrm{III})$ and $b'_n(\mathrm{IV})$ are defined likewise.<br />

These values represent outcomes of actual or possible measurements, not actual<br />

properties of the particles which also exist if they are not measured.



The assumption of locality is that an outcome of measurement of spin of particle 1, in direction $\vec{a}$,<br />

does not depend on which spin direction, $\vec{b}$ or $\vec{b}'$, is measured on the other, remote particle 2. This<br />

is the supposition of locality in the Eberhard - Stapp theorem, leading to what we will call the<br />

matching condition,<br />

$$a_n(\mathrm{I}) = a_n(\mathrm{II}), \quad a'_n(\mathrm{III}) = a'_n(\mathrm{IV}), \quad b_n(\mathrm{I}) = b_n(\mathrm{III}), \quad b'_n(\mathrm{II}) = b'_n(\mathrm{IV}),$$ (VII. 51)<br />

for all N particle pairs in the singlet state $|\Psi_0\rangle$.<br />

Now we can define the following mathematical expression<br />

$$\gamma_n := a_n(\mathrm{I})\, b_n(\mathrm{I}) + a_n(\mathrm{II})\, b'_n(\mathrm{II}) + a'_n(\mathrm{III})\, b_n(\mathrm{III}) - a'_n(\mathrm{IV})\, b'_n(\mathrm{IV}),$$ (VII. 52)<br />

where the first term corresponds to experiment I, the second to experiment II, etc. Because of the<br />

value assignment ±1, every term is ±1 and $\gamma_n$ is an even integer. Since, given the matching condition, the fourth term is the product of the first three<br />

terms, subtraction of the fourth term means that $\gamma_n$ has only two values, as we will see. Moreover,<br />

subtraction allows for an inequality similar to Bell’s inequality (VII. 13).<br />

In (VII. 52) we can omit writing out the labels referring to the numbers of the experiments because<br />

of the matching condition (VII. 51); $a_n := a_n(\mathrm{I}) = a_n(\mathrm{II})$, etc. Rewriting (VII. 52) gives<br />

$$\gamma_n = a_n (b_n + b'_n) + a'_n (b_n - b'_n).$$ (VII. 53)<br />

Because of the value assignment ±1, either $b_n + b'_n$ or $b_n - b'_n$ equals 0 while the other equals ±2, so that either the first or the second term<br />

vanishes, yielding for all n<br />

$$\gamma_n = \pm 2.$$ (VII. 54)<br />
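The claim (VII. 54) concerns only sixteen cases, so it can be confirmed by enumeration. A minimal sketch, with `gamma` as a hypothetical helper implementing (VII. 53):

```python
import itertools

def gamma(a, a_prime, b, b_prime):
    """gamma_n of (VII. 53) for one particle pair, given the four +/-1 values."""
    return a * (b + b_prime) + a_prime * (b - b_prime)

# Enumerate all 2**4 assignments of +/-1 to (a_n, a'_n, b_n, b'_n).
values = {gamma(*w) for w in itertools.product((+1, -1), repeat=4)}
print(sorted(values))
```

Note that the enumeration presupposes what the matching condition (VII. 51) grants: that all four values exist together for the same pair.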

Averaging over N repetitions of the experiment we have<br />

$$\left| \frac{1}{N} \sum_{n=1}^{N} \gamma_n \right| = \frac{1}{N} \left| \sum_{n=1}^{N} a_n b_n + \sum_{n=1}^{N} a_n b'_n + \sum_{n=1}^{N} a'_n b_n - \sum_{n=1}^{N} a'_n b'_n \right| \le 2.$$ (VII. 55)<br />

Defining the correlation coefficients<br />

$$c_N(\vec{a}, \vec{b}) := \frac{1}{N} \sum_{n=1}^{N} a_n b_n \quad\text{etc.},$$ (VII. 56)<br />

we conclude<br />

$$|c_N(\vec{a}, \vec{b}) + c_N(\vec{a}, \vec{b}') + c_N(\vec{a}', \vec{b}) - c_N(\vec{a}', \vec{b}')| \le 2.$$ (VII. 57)<br />

This is indeed a Bell inequality again, equivalent to inequality (VII. 13) in the limit N → ∞.<br />

The expectation value of c(⃗a, ⃗ b) = ⟨a n b n ⟩ in quantum mechanics is given by (VII. 5) and the<br />

contradiction with (VII. 57) follows as in section VII. 2. □
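To make the contradiction concrete, the sketch below evaluates the left-hand side of (VII. 57) with the singlet correlation. It assumes that (VII. 5) has the standard form $E(\vec{a}, \vec{b}) = -\cos\theta_{\vec{a},\vec{b}}$, consistent with the comparison made in section VII. 3, and it uses one particular choice of coplanar settings; both the helper names and the angles are illustrative.

```python
import math

def E(angle_a, angle_b):
    """Assumed singlet correlation -cos(theta) for coplanar settings (angles in radians)."""
    return -math.cos(angle_a - angle_b)

def chsh(a, a_prime, b, b_prime):
    """Left-hand side of (VII. 57) with c_N replaced by the expectation value E."""
    return abs(E(a, b) + E(a, b_prime) + E(a_prime, b) - E(a_prime, b_prime))

# Settings a = 0, a' = 90 degrees, b = 45 degrees, b' = -45 degrees.
a, a_prime = 0.0, math.pi / 2
b, b_prime = math.pi / 4, -math.pi / 4
value = chsh(a, a_prime, b, b_prime)
print(value, value > 2)
```

For these settings the value is $2\sqrt{2}$, which exceeds the bound 2.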



◃ Remark<br />

The derivation of (VII. 57) directly comes from expression (VII. 52) and as a result, the existence of<br />

hidden variables does not have to be presumed, only locality was required. Sensationally, we seem to<br />

have proved that quantum mechanics is empirically inconsistent with the requirement of locality. ▹<br />

The experimental violation of the Bell inequalities thus leads us to the conclusion that physical<br />

reality is not local. What we, however, have presupposed in the matching condition (VII. 51) is that<br />

we can simultaneously assign values to $a_n$ and $a'_n$, although they cannot be simultaneously measured<br />

because the spin measuring device cannot be in both positions $\vec{a}$ and $\vec{a}' \neq \vec{a}$ at the same time. In fact,<br />

of the set of four terms in (VII. 52), at most one is experimentally realizable for a given pair. Still, we<br />

spoke of outcomes of measurements that have not actually been carried out. Of course, the derivation<br />

of the Bell inequality (VII. 57) from the matching condition (VII. 51) is mathematically flawless. The<br />

question is whether the matching condition (VII. 51) follows from the requirement of locality. We<br />

will now explore this question further.<br />

VII. 4. 1<br />

COUNTERFACTUAL CONDITIONAL STATEMENTS AND INDETERMINISM<br />

Let a n be the outcome of experiment I. With their matching condition, Eberhard and Stapp claim<br />

that this value of a n would be unaltered if we had carried out experiment II instead of experiment I<br />

because these experiments only differ in the settings of the B - meter, which is far away. Therefore,<br />

a n is the outcome which the spin meter A would have given for the n th pair of particles for both<br />

experiment I and experiment II. Redhead (1987, p. 92) formulates this requirement as follows:<br />

PRINCIPLE OF LOCAL COUNTERFACTUAL DEFINITENESS (PLCD):<br />

The result of an experiment which could be performed on a microscopic system has a<br />

definite value which does not depend on the setting of a remote piece of apparatus.<br />

This means that if this setting had been different, the outcome of the experiment would<br />

not have been different. Using the same mathematics as before it follows that<br />

PLCD → Bell inequality. (VII. 58)<br />

Since PLCD is an assumption of locality concerning outcomes of measurements, (VII. 58) seems<br />

to be independent of the existence of hidden variables. But appearances are deceptive. In fact, PLCD<br />

is only reasonable in a deterministic context, and not in the case of indeterminism.<br />

Consider the following example given by Redhead (ibid.). Suppose that, at $t_1$, just before the<br />

clock strikes twelve, I raise my hand. Now I ask whether the clock would also have struck if<br />

I had not raised my hand at $t_1$. Intuitively, the right answer is ‘Yes’, in agreement with PLCD. Now<br />

replace the clock by a radioactive atom which decays at $t_2$. Suppose I raised my hand at $t_1 < t_2$;<br />

would the atom also have decayed if I had not done this? Now the answer is far from clear. If the<br />

decay is purely indeterministic, a repetition of the experiment, even if it is just a thought experiment,<br />

does not have to have the same outcome. The supposition that the atom would not have decayed if I<br />

had not raised my hand does not contradict locality.<br />

The assumptions that outcomes of measurements retain the same values even if they are<br />

not measured, or that measurements which are not carried out have certain outcomes in advance, are<br />

only reasonable in a deterministic context. But in a deterministic context these assumptions do not<br />

differ from each other, and an outcome of measurement is decisively linked to the value the quantity<br />

had just beforehand, and therefore to a hidden variable.<br />

The conclusion is that the assumption of Eberhard and Stapp, PLCD, is no more general than the<br />

assumption that the value a n is a property of the particles which is determined in advance, and which<br />

is independent of the settings of the meter at B. This means that the derivation is no more general<br />

than the derivation for a local deterministic HVT.<br />

VII. 5<br />

STOCHASTIC HIDDEN VARIABLES<br />

In this section we will no longer require determinism in the HVT; the λ only determine the probability<br />

that a quantity has a certain value, which is revealed by the measuring apparatus in the way a<br />

balance reveals our weight. A stochastic HVT is linked more closely to quantum mechanics, enabling<br />

a more well - defined comparison between the assumptions leading to the Bell inequalities on the one<br />

hand, and quantum mechanics on the other.<br />

In our stochastic HVT we assume the existence of a probability distribution at given directions<br />

$\vec{a}, \vec{b} \in \mathbb{R}^3$ of the spin meters in the EPRB experiment<br />

$$p_{\vec{a}, \vec{b}}(a, b, \lambda),$$ (VII. 59)<br />

which is the probability to find for the quantities $A = \vec{\sigma}_1 \cdot \vec{a}$ and $B = \vec{\sigma}_2 \cdot \vec{b}$ the values a and b,<br />

respectively, where it holds that $a, b = \pm 1$. Again, $\lambda \in \Lambda$ is the hidden variable describing the source.<br />

Such a probability distribution can always be written in terms of conditional probabilities,<br />

$$p_{\vec{a}, \vec{b}}(a, b, \lambda) = p_{\vec{a}, \vec{b}}(a \mid b \wedge \lambda)\; p_{\vec{a}, \vec{b}}(b \mid \lambda)\; \rho_{\vec{a}, \vec{b}}(\lambda).$$ (VII. 60)<br />

To be able to derive the Bell inequalities we make the following three suppositions.



1. Outcome independence<br />

The probability to find a value a for $\vec{a} \cdot \vec{\sigma}$ is ‘completely’ determined by the settings of the spin<br />

meters and by λ; in particular, it is not necessary to also give outcome b. Likewise for finding a<br />

value b,<br />

$$p_{\vec{a}, \vec{b}}(a \mid b \wedge \lambda) = p_{\vec{a}, \vec{b}}(a \mid \lambda) \quad\text{and}\quad p_{\vec{a}, \vec{b}}(b \mid a \wedge \lambda) = p_{\vec{a}, \vec{b}}(b \mid \lambda).$$ (VII. 61)<br />

2. Parameter independence<br />

The probability to find the outcome of measurement a or b is independent of the settings of the<br />

remote spin meter,<br />

$$p_{\vec{a}, \vec{b}}(a \mid \lambda) = p_{\vec{a}}(a \mid \lambda) \quad\text{and}\quad p_{\vec{a}, \vec{b}}(b \mid \lambda) = p_{\vec{b}}(b \mid \lambda).$$ (VII. 62)<br />

3. Source independence<br />

The distribution of λ in the source does not depend on the settings of the spin meters,<br />

$$\rho_{\vec{a}, \vec{b}}(\lambda) = \rho(\lambda).$$ (VII. 63)<br />

In principle we can adjust the spin meters ‘at the last moment’, long after the particles have left<br />

the source. It is reasonable to assume that the source is not influenced by what happens to the<br />

measuring devices in the future.<br />

Now we will prove the next theorem.<br />

BELL’S THIRD THEOREM:<br />

A stochastic HVT which is in agreement with outcome, parameter and source independence<br />

is empirically inconsistent with quantum mechanics.<br />

Proof<br />

As a consequence of the aforementioned properties, in every local stochastic HVT, (VII. 60) becomes<br />

$$p_{\vec{a}, \vec{b}}(a, b, \lambda) = p_{\vec{a}}(a \mid \lambda)\; p_{\vec{b}}(b \mid \lambda)\; \rho(\lambda),$$ (VII. 64)<br />

or<br />

$$p_{\vec{a}, \vec{b}}(a, b \mid \lambda) = p_{\vec{a}}(a \mid \lambda)\; p_{\vec{b}}(b \mid \lambda),$$ (VII. 65)<br />

which means that the quantities A and B are statistically independent of each other for given λ.<br />

This statement is often called factorizability or conditional independence.



Using (VII. 64), another Bell inequality can be derived for $E(\vec{a}, \vec{b})$ by means of the relation<br />

$$\begin{aligned} E(\vec{a}, \vec{b}) &= \int_{\Lambda} \Big( p_{\vec{a}, \vec{b}}(1, 1, \lambda) - p_{\vec{a}, \vec{b}}(1, -1, \lambda) - p_{\vec{a}, \vec{b}}(-1, 1, \lambda) + p_{\vec{a}, \vec{b}}(-1, -1, \lambda) \Big)\, d\lambda \\ &= \int_{\Lambda} \big( p_{\vec{a}}(1 \mid \lambda) - p_{\vec{a}}(-1 \mid \lambda) \big) \big( p_{\vec{b}}(1 \mid \lambda) - p_{\vec{b}}(-1 \mid \lambda) \big)\, \rho(\lambda)\, d\lambda. \end{aligned}$$ (VII. 66)<br />

Defining<br />

$$f(\vec{a}, \lambda) := p_{\vec{a}}(1 \mid \lambda) - p_{\vec{a}}(-1 \mid \lambda)$$ (VII. 67)<br />

and<br />

$$g(\vec{b}, \lambda) := p_{\vec{b}}(1 \mid \lambda) - p_{\vec{b}}(-1 \mid \lambda),$$ (VII. 68)<br />

we see that<br />

$$|f(\vec{a}, \lambda)| \le 1 \quad\text{and}\quad |g(\vec{b}, \lambda)| \le 1,$$ (VII. 69)<br />

which brings us back to (VII. 25) and the subsequent equations so that again we obtain the Bell<br />

inequality (VII. 13). Violation of this Bell inequality means that (VII. 64) cannot apply and<br />

therefore no HVT which satisfies source independence (VII. 63) can guarantee both outcome independence (VII. 61) and parameter independence<br />

(VII. 62). □<br />
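The role of factorizability in the proof can be illustrated numerically. The sketch below is a toy model under stated assumptions: Λ is replaced by a finite set of five values, ρ and the response probabilities $p(+1 \mid \text{setting}, \lambda)$ are drawn at random, and two settings per side are used. All names are hypothetical. The correlation is computed as in (VII. 66) to (VII. 68), with $f = 2p - 1$, and the combination of (VII. 57) never exceeds 2.

```python
import random

def random_model(rng, n_lambda=5, n_settings=2):
    """Finite hidden-variable space with random response probabilities
    p(+1 | setting, lambda) for each side; rho is normalized to sum to 1."""
    rho = [rng.random() for _ in range(n_lambda)]
    total = sum(rho)
    rho = [r / total for r in rho]
    p_a = [[rng.random() for _ in range(n_lambda)] for _ in range(n_settings)]
    p_b = [[rng.random() for _ in range(n_lambda)] for _ in range(n_settings)]
    return rho, p_a, p_b

def correlation(rho, p_a_row, p_b_row):
    """Discrete analogue of (VII. 66): sum over lambda of f * g * rho, with f = 2p - 1."""
    return sum(r * (2 * pa - 1) * (2 * pb - 1)
               for r, pa, pb in zip(rho, p_a_row, p_b_row))

def chsh(rho, p_a, p_b):
    """|E(a,b) + E(a,b') + E(a',b) - E(a',b')| for settings indexed 0 and 1."""
    e = lambda i, j: correlation(rho, p_a[i], p_b[j])
    return abs(e(0, 0) + e(0, 1) + e(1, 0) - e(1, 1))

rng = random.Random(1)  # fixed seed, an arbitrary choice for reproducibility
worst = max(chsh(*random_model(rng)) for _ in range(2000))
print(worst <= 2)
```

A random search of this kind proves nothing by itself; the bound follows from (VII. 69), since $|f(g + g') + f'(g - g')| \le |g + g'| + |g - g'| \le 2$ at every λ.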

VII. 5. 1<br />

OUTCOME, PARAMETER AND SOURCE INDEPENDENCE<br />

The importance of the distinction between outcome and parameter independence was first brought<br />

to attention by J. Jarrett (1984).<br />

1. Outcome independence, (VII. 61), means that the probability of outcome b, for given λ, does<br />

not depend on the outcome a. This is motivated by the idea that λ gives a complete description of<br />

the state of the pair of particles; the variable λ contains an exhaustive specification of all factors<br />

which are relevant for the outcomes of measurement. Therefore, the extra information that<br />

outcome a has occurred cannot, if λ is already known, yield new information on b.<br />

The purpose of the requirement can be illustrated by the following example, in which it is not<br />

satisfied. Suppose that two people, without looking, each draw a little ball out of a box containing two<br />

little balls, one black and one white. Then they separate: one travels to New York, the other to<br />

Tokyo. Now consider a ‘stochastic hidden variable’ which assigns probability $\tfrac{1}{2}$ to each little ball being black<br />

or white. On arrival in Tokyo the traveler opens his hand and sees that his little ball is black, which<br />

instantaneously enables him to predict the color of the little ball in New York: it has to be white. Here<br />

the outcome of measurement of the one little ball does provide relevant information on the outcome<br />

of a measurement of the other little ball.



The idea behind the requirement of outcome independence is that such a situation could only<br />

occur because the HVT was incomplete. A complete specification of the state of the pair<br />

at the beginning of the trip should also have included the color of each little ball,<br />

even though the travelers did not know the color of their little ball. Then it automatically follows,<br />

at given λ, that the little ball in New York is white and the observation in Tokyo provides no new<br />

information.<br />
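The two descriptions of the little balls can be compared in a few lines. In this sketch, which is only an illustration of the example above, the complete state λ is taken to be the pair of colours (Tokyo, New York); the names `STATES` and `prob` are hypothetical.

```python
from fractions import Fraction

# Complete states lambda: which colour went to (Tokyo, New York), each with probability 1/2.
STATES = {("black", "white"): Fraction(1, 2), ("white", "black"): Fraction(1, 2)}

def prob(event, given=lambda s: True):
    """Conditional probability P(event | given) over the complete states."""
    norm = sum(w for s, w in STATES.items() if given(s))
    return sum(w for s, w in STATES.items() if given(s) and event(s)) / norm

ny_white = lambda s: s[1] == "white"
tokyo_black = lambda s: s[0] == "black"

# Incomplete description: lambda unknown.
p_marginal = prob(ny_white)
p_given_tokyo = prob(ny_white, tokyo_black)

# Complete description: lambda fixed.
lam = ("black", "white")
p_given_lambda = prob(ny_white, lambda s: s == lam)
p_given_lambda_and_tokyo = prob(ny_white, lambda s: s == lam and tokyo_black(s))

print(p_marginal, p_given_tokyo, p_given_lambda, p_given_lambda_and_tokyo)
```

Without λ the Tokyo outcome changes the probability for New York from 1/2 to 1; once λ is given, conditioning on the Tokyo outcome changes nothing, which is what outcome independence (VII. 61) demands.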

2. Parameter independence, (VII. 62), means that the probability distribution of the outcomes<br />

at A is independent of external changes at B, e.g. the orientation of the spin meter. The argument leading<br />

to the assumption of parameter independence is generally associated with the possibility of signaling.<br />

Suppose that, for example, settings $\vec{b}$ and $\vec{b}'$ existed such that<br />

$$p_{\vec{a}, \vec{b}}(a \mid \lambda) \neq p_{\vec{a}, \vec{b}'}(a \mid \lambda),$$ (VII. 70)<br />

then, in principle, it is possible to instantaneously exchange signals between experimenters located<br />

at A and B. Since the experimenter located at B can choose whether he points his spin meter in the<br />

direction $\vec{b}$ or $\vec{b}'$, an experimenter located at A is able, if the source emits particle pairs in a pure<br />

hidden - variables state λ, to register the relative frequency of outcomes at A and thereby infer<br />

which setting has been chosen by the experimenter at B. Violation of parameter independence<br />

therefore means that the HVT enables the instantaneous exchange of signals over arbitrarily large<br />

distances.<br />

3. Source independence, (VII. 63), means that the probability distribution over the hidden variable<br />

describing the particle pair cannot depend on the measuring directions chosen by the experimenters.<br />

The argumentation leading to the assumption of source independence is often described<br />

in terms of the ‘free will’ of the experimenters. The experimenters are considered to be completely<br />

‘free’ in their decision how to point their spin meters, and even to make their choice just at the last<br />

moment, when the particles have long left the source. Therefore, the probability distribution ρ(λ),<br />

which characterizes the source of the particle pairs, cannot depend on that.<br />

Of course, here too a violation of the requirement is logically conceivable. It is<br />

possible that this freedom does not exist, and that at the moment the particles are emitted, the directions in which<br />

the experimenters will measure have already been determined. It is also conceivable that some<br />

other cause influences both λ and the directions $\vec{a}$ and $\vec{b}$, producing a correlation between them. The first case,<br />

in which all relevant factors of the EPR experiment are determined in advance and the experimenters<br />

have no free will, is called super - determinism. In a super - deterministic HVT the Bell<br />

inequalities can therefore be violated as well.<br />

VII. 5. 2<br />

QUANTUM MECHANICS AS A STOCHASTIC HVT<br />

Exclusively giving probability statements concerning outcomes of measurements, a stochastic<br />

HVT conceptually differs less from quantum mechanics than other HVT’s. In fact we can, without<br />

objection, take quantum mechanics itself as an example of a stochastic HVT by identifying λ with the<br />

quantum mechanical state and Λ with the relevant Hilbert space. Since quantum mechanics does not<br />

satisfy the Bell inequalities, it is interesting to examine which of the aforementioned requirements is<br />

violated inevitably by quantum mechanics.



3. Source independence. We already discussed the possibility of violation of the Bell inequalities<br />

by a super-deterministic theory without source independence. It is a philosophical question whether<br />

we can somehow establish if we have free will or not. Violation of source independence is therefore a possibility, but not an inevitability,<br />

which leaves outcome and parameter independence.<br />

2. Parameter independence. Describing the pairs of particles in the singlet state $|\Psi_0\rangle$, (III. 165),<br />

by a pure hidden - variables state, the probability distribution is a delta - distribution,<br />

$$\rho_{\Psi_0}(\lambda) = \delta_{\lambda_0}(\lambda) := \delta(\lambda - \lambda_0),$$ (VII. 71)<br />

which leads to<br />

$$\int_{\Lambda} p_{\vec{a}, \vec{b}, \lambda_0}(a, b, \lambda)\, \rho_{\Psi_0}(\lambda)\, d\lambda = p_{\vec{a}, \vec{b}, \lambda_0}(a, b).$$ (VII. 72)<br />

The probabilities for the outcomes of measurement are given by (III. 176),<br />

$$p_{\vec{a}, \vec{b}, \lambda_0}(a = 1 \wedge b = 1) = \tfrac{1}{2} \sin^2 \tfrac{1}{2}\theta_{\vec{a}, \vec{b}}, \qquad p_{\vec{a}, \vec{b}, \lambda_0}(a = 1 \wedge b = -1) = \tfrac{1}{2} \cos^2 \tfrac{1}{2}\theta_{\vec{a}, \vec{b}}.$$ (VII. 73)<br />

EXERCISE 33. Also calculate the other two joint probabilities, that is, for a = −1 ∧ b = −1<br />

and a = −1 ∧ b = 1.<br />

The marginal probabilities are, using (VII. 73),<br />

$$p_{\vec{a}, \vec{b}}(a \mid \lambda_0) = p_{\vec{a}, \vec{b}, \lambda_0}(a = 1 \wedge b = 1) + p_{\vec{a}, \vec{b}, \lambda_0}(a = 1 \wedge b = -1) = \tfrac{1}{2},$$<br />

$$p_{\vec{a}, \vec{b}}(b \mid \lambda_0) = p_{\vec{a}, \vec{b}, \lambda_0}(a = 1 \wedge b = 1) + p_{\vec{a}, \vec{b}, \lambda_0}(a = -1 \wedge b = 1) = \tfrac{1}{2},$$ (VII. 74)<br />

which means that, both being equal to $\tfrac{1}{2}$, they do not depend on the settings of a remote measuring<br />

device. Consequently, even the quantum mechanical correlations in the singlet state cannot be used for<br />

signaling; there is no actio in distans. This leads to the following theorem.<br />

NO - SIGNALING THEOREM:<br />

Quantum mechanics satisfies parameter independence, i.e., if subsystems of a composite<br />

physical system no longer interact, the probability of finding certain outcomes of measurement<br />

for an arbitrary quantity of subsystem 1 is independent of which quantity of<br />

subsystem 2 is measured, and vice versa.<br />
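For the singlet state the content of the theorem can be checked numerically. This sketch assumes the four joint probabilities implied by (VII. 73) together with the symmetry of the singlet state, namely $\tfrac{1}{2}\sin^2\tfrac{1}{2}\theta$ for equal outcomes and $\tfrac{1}{2}\cos^2\tfrac{1}{2}\theta$ for opposite outcomes; the function names are hypothetical. It illustrates the special case only and is not the general proof.

```python
import math

def joint(a, b, theta):
    """Singlet joint probability for outcomes a, b = +/-1 at relative angle theta,
    cf. (VII. 73); the (-1, -1) and (-1, +1) cases are assumed by symmetry."""
    if a == b:
        return 0.5 * math.sin(theta / 2) ** 2
    return 0.5 * math.cos(theta / 2) ** 2

def marginal_a(a, theta):
    """Probability of outcome a at A, summed over the outcomes at B."""
    return sum(joint(a, b, theta) for b in (+1, -1))

# The marginal at A for several relative orientations of the remote meter.
for degrees in (0, 30, 90, 135, 180):
    theta = math.radians(degrees)
    print(degrees, marginal_a(+1, theta), marginal_a(-1, theta))
```

Whatever the relative angle, the marginal at A stays at 1/2, so the statistics at A carry no trace of the setting chosen at B.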

EXERCISE 34. Prove that the EPRB experiment is an example of the no - signaling theorem.<br />

Optional: prove, in general, the no - signaling theorem using state operators. Whoever cannot<br />

solve this problem, is advised to consult Ghirardi, Rimini and Weber (1980).



1. Outcome independence. In quantum mechanics it is indeed the requirement of outcome independence<br />

that is not satisfied. The conditional probabilities, i.e., the probabilities for the spin of<br />

particle 1 to be found in the direction ⃗a, given that the spin of particle 2 was found in the direction ⃗ b<br />

and vice versa, which were defined in (III. 172), p. 74, are clearly not independent,<br />

$$p_{\vec{a}, \vec{b}}(a = 1 \mid \lambda_0 \wedge b = 1) = \sin^2 \tfrac{1}{2}\theta_{\vec{a}, \vec{b}},$$ (VII. 75)<br />

$$p_{\vec{a}, \vec{b}}(a = -1 \mid \lambda_0 \wedge b = 1) = \cos^2 \tfrac{1}{2}\theta_{\vec{a}, \vec{b}}.$$ (VII. 76)<br />
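The conditional probabilities (VII. 75) and (VII. 76) follow from the joint probabilities by dividing by the marginal. A minimal sketch, again assuming the symmetric completion of (VII. 73) and using hypothetical function names:

```python
import math

def joint(a, b, theta):
    """Singlet joint probability for outcomes a, b = +/-1 at relative angle theta
    (symmetric completion of (VII. 73), an assumption of this sketch)."""
    if a == b:
        return 0.5 * math.sin(theta / 2) ** 2
    return 0.5 * math.cos(theta / 2) ** 2

def conditional_a_given_b(a, b, theta):
    """P(a | b) = P(a and b) / P(b) at fixed settings."""
    p_b = sum(joint(x, b, theta) for x in (+1, -1))
    return joint(a, b, theta) / p_b

theta = math.radians(60)
print(conditional_a_given_b(+1, +1, theta))  # compare with sin^2(theta / 2)
print(conditional_a_given_b(-1, +1, theta))  # compare with cos^2(theta / 2)
```

Unless θ is a right angle, these values differ from the unconditional probability 1/2, which is the failure of outcome independence.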

According to quantum mechanics, physical systems are inseparable. However, this interdependence<br />

of outcomes cannot be used to exchange signals since we do not control the outcomes of spin measurements<br />

and therefore we are unable to actively influence the probability distribution over the outcomes<br />

of measurements from a distance. Shimony (1984, p. 227) called it passion at a distance. The experimenter<br />

at B can, on the basis of his observation, indeed make a better prediction concerning an outcome<br />

at A than is possible on the basis of the singlet state alone, but he cannot warn the<br />

observer at A; he can only watch passively.<br />

The singlet $|\Psi_0\rangle \in \mathbb{C}^4$ does violate the Bell inequalities for suitably chosen spin quantities.<br />

The singlet is not factorizable, i.e., it cannot be written as a direct product of two states in $\mathbb{C}^2$; it is<br />

entangled. We can raise the question whether there are types of quantum mechanical states which do not violate<br />

a Bell inequality for any choice of four spin quantities.<br />

Capasso, Fortunato and Selleri (1973) proved that the CHSH inequality, (VII. 13), is upheld for<br />

every choice of four spin quantities by all factorizable states and by all mixtures thereof. Violations<br />

are therefore only possible for entangled states. Vice versa, Home and Selleri (1991, pp. 22 - 26)<br />

proved that for every entangled pure state, that is, a state which cannot be written as a direct product,<br />

it is always possible to choose spin quantities in such a way that the CHSH inequality is violated.<br />

These results can be summarized in the statement that, for pure states, entanglement and violation of Bell inequalities<br />

are equivalent. It confirms Schrödinger’s insight from 1935 (Schrödinger 1935a) that the<br />

existence of entangled states marks the cardinal difference between classical and quantum mechanics.<br />

VII. 6<br />

AN ALGEBRAIC PROOF WITHOUT INEQUALITIES<br />

The contradiction between a local deterministic or a local stochastic HVT, both either autonomous<br />

or contextual, on the one hand, and quantum mechanics on the other hand, is statistical in nature, because<br />

it concerns inequalities in terms of expectation values or probabilities, like all Bell’s theorems.<br />

But Kochen and Specker’s theorem, which we discussed in V. 3, does not contain any inequalities. In<br />

this case it is customary to speak of an algebraic proof.<br />

This raises the question whether an algebraic proof of Bell’s theorems is also possible, that is,<br />

without appealing to the measurement postulate. The answer is affirmative. Using a spin state of a<br />

composite system of four particles, D.M. Greenberger, M.A. Horne and A. Zeilinger (1989) showed<br />

that it is mathematically impossible to locally and separably assign values to all spin quantities. Here<br />

we will show a simplified version given by N.D. Mermin (1993), where, in using |GHZ⟩, we refer to<br />

the aforementioned authors.



Consider a composite system of three spin 1/2 fermions with pure states in the direct product<br />

Hilbert space $\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2 = \mathbb{C}^8$. We look at 10 physical quantities which correspond to the<br />

spin operators represented in the Mermin pentagon, figure VII. 8. In this diagram $\sigma_y^1$ is shorthand<br />

for $\sigma_y(1) \otimes \mathbb{1}(2) \otimes \mathbb{1}(3)$, and $\sigma_y^1 \sigma_y^2 \sigma_x^3$ is likewise shorthand for $\sigma_y(1) \otimes \sigma_y(2) \otimes \sigma_x(3)$, etc. On every<br />

straight line through the Mermin pentagon we find four commuting operators. These operators are<br />

products of commuting operators with eigenvalues ±1 and therefore have eigenvalues ±1 also.<br />

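[Figure: the ten operators $\sigma_y^1$; $\sigma_x^1 \sigma_x^2 \sigma_x^3$, $\sigma_y^1 \sigma_y^2 \sigma_x^3$, $\sigma_y^1 \sigma_x^2 \sigma_y^3$, $\sigma_x^1 \sigma_y^2 \sigma_y^3$; $\sigma_x^3$, $\sigma_y^3$, $\sigma_x^1$, $\sigma_y^2$, $\sigma_x^2$, placed on the lines of the pentagon]<br />

Figure VII. 8: The Mermin pentagon<br />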

Using the properties of the Pauli matrices (III. 122), p. 66, it can be shown that<br />

$$\big( \sigma_x(1) \otimes \sigma_y(2) \otimes \sigma_y(3) \big) \big( \sigma_y(1) \otimes \sigma_x(2) \otimes \sigma_y(3) \big) \big( \sigma_y(1) \otimes \sigma_y(2) \otimes \sigma_x(3) \big) = - \sigma_x(1) \otimes \sigma_x(2) \otimes \sigma_x(3),$$ (VII. 77)<br />

where we note that the four operators acting in $\mathbb{C}^8$ commute. Consequently, they have a simultaneous<br />

eigenstate in $\mathbb{C}^8$, having eigenvalue +1 for the three operators on the left - hand side of the equation,<br />

and eigenvalue −1 for the operator on the right - hand side. The entangled state in $\mathbb{C}^8$,<br />

$$|\mathrm{GHZ}\rangle := \tfrac{1}{2}\sqrt{2}\, \big( |z{\uparrow}\rangle \otimes |z{\uparrow}\rangle \otimes |z{\uparrow}\rangle - |z{\downarrow}\rangle \otimes |z{\downarrow}\rangle \otimes |z{\downarrow}\rangle \big),$$ (VII. 78)<br />

is such a state.<br />
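That |GHZ⟩ has the stated eigenvalues can be verified directly. The sketch below uses only the standard action of the Pauli matrices on the $\sigma_z$ eigenstates, $\sigma_x |z{\uparrow}\rangle = |z{\downarrow}\rangle$, $\sigma_x |z{\downarrow}\rangle = |z{\uparrow}\rangle$, $\sigma_y |z{\uparrow}\rangle = i |z{\downarrow}\rangle$ and $\sigma_y |z{\downarrow}\rangle = -i |z{\uparrow}\rangle$. The representation of states as dictionaries and the function names are illustrative choices.

```python
import math

def apply_pauli_string(ops, state):
    """Apply a tensor product of Pauli operators ('x' or 'y', one per particle)
    to a three-spin state.

    A state is a dict mapping bit tuples (0 = z-up, 1 = z-down) to amplitudes.
    """
    out = {}
    for bits, amp in state.items():
        new_bits = []
        for op, bit in zip(ops, bits):
            if op == "y":
                # sigma_y |up> = i |down>, sigma_y |down> = -i |up>
                amp = amp * (1j if bit == 0 else -1j)
            new_bits.append(1 - bit)  # both sigma_x and sigma_y flip the spin
        key = tuple(new_bits)
        out[key] = out.get(key, 0) + amp
    return out

def eigenvalue(ops, state, tol=1e-12):
    """Return c if ops|state> = c|state>, otherwise None."""
    image = apply_pauli_string(ops, state)
    first = next(iter(state))
    c = image.get(first, 0) / state[first]
    keys = set(state) | set(image)
    ok = all(abs(image.get(k, 0) - c * state.get(k, 0)) < tol for k in keys)
    return c if ok else None

norm = 1 / math.sqrt(2)
ghz = {(0, 0, 0): norm, (1, 1, 1): -norm}  # the state (VII. 78)
eigs = {ops: eigenvalue(ops, ghz) for ops in ("xyy", "yxy", "yyx", "xxx")}
print(eigs)
```

The three mixed products return eigenvalue +1 and the product of three $\sigma_x$ returns −1, in agreement with (VII. 77).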

We assume that the three particles are already far away from each other and are moving still further<br />

apart, and the composite system is, as far as spin is concerned, in the state |GHZ⟩. A measurement on<br />

two particles, of which we assume that it does not influence the third particle in any way, determines<br />

the spin value of the third particle because, according to quantum mechanics, the product of the outcomes<br />

of measurement is determined.



According to a HVT, at the moment a measurement is made the values of the spin quantities<br />

are revealed. If we call these values $w_x(1)$ for the spin in x - direction of particle 1, etc., then, because<br />

|GHZ⟩, (VII. 78), is a simultaneous eigenstate of the four operators in (VII. 77), it must<br />

hold that<br />

$$w_x(1)\, w_y(2)\, w_y(3) = w_y(1)\, w_x(2)\, w_y(3) = w_y(1)\, w_y(2)\, w_x(3) = +1,$$ (VII. 79)<br />

and<br />

$$w_x(1)\, w_x(2)\, w_x(3) = -1.$$ (VII. 80)<br />

The product of these four factors is<br />

$$\big( w_x(1)\, w_y(2)\, w_y(3) \big) \big( w_y(1)\, w_x(2)\, w_y(3) \big) \big( w_y(1)\, w_y(2)\, w_x(3) \big) \big( w_x(1)\, w_x(2)\, w_x(3) \big) = (+1)(+1)(+1)(-1) = -1.$$ (VII. 81)<br />

But if we consider the product as a product of the 12 values of these spin quantities we find<br />

$$\begin{aligned} & w_x(1)\, w_y(2)\, w_y(3)\, w_y(1)\, w_x(2)\, w_y(3)\, w_y(1)\, w_y(2)\, w_x(3)\, w_x(1)\, w_x(2)\, w_x(3) \\ &\quad = w_x^2(1)\, w_y^2(1)\, w_x^2(2)\, w_y^2(2)\, w_x^2(3)\, w_y^2(3) = 1^6 = +1. \end{aligned}$$ (VII. 82)<br />

This leads to +1 = −1, which is, of course, an algebraic absurdity. And indeed, this is an algebraic<br />

proof since it contains no probabilities or inequalities.<br />
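Since only six values of ±1 are involved, the impossibility can also be confirmed by exhaustive search. A minimal sketch, with hypothetical names, that tests all 64 assignments against (VII. 79) and (VII. 80):

```python
import itertools

def satisfies_ghz_constraints(wx, wy):
    """wx[i], wy[i] are the +/-1 values assigned to sigma_x, sigma_y of particle i + 1."""
    return (
        wx[0] * wy[1] * wy[2] == +1      # first product in (VII. 79)
        and wy[0] * wx[1] * wy[2] == +1  # second product in (VII. 79)
        and wy[0] * wy[1] * wx[2] == +1  # third product in (VII. 79)
        and wx[0] * wx[1] * wx[2] == -1  # (VII. 80)
    )

solutions = [
    (wx, wy)
    for wx in itertools.product((+1, -1), repeat=3)
    for wy in itertools.product((+1, -1), repeat=3)
    if satisfies_ghz_constraints(wx, wy)
]
print(len(solutions))
```

The list of solutions is empty, as the algebraic argument requires. Dropping any one of the four conditions leaves assignments that satisfy the remaining three.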

EXERCISE 35. What kind of HVT is excluded by the foregoing reasoning? Which postulates of<br />

quantum mechanics are necessary to obtain the contradiction?<br />

VII. 7<br />

MISCELLANEA<br />

The literature concerning the Bell inequalities has grown to an extraordinary extent since the<br />

seventies of the 20th century, although its growth has slowed in recent years. To conclude this<br />

chapter we will briefly discuss some of the main topics.<br />

VII. 7. 1<br />

LOCALITY AND RELATIVITY<br />

Although in these lecture notes we have restricted ourselves to non-relativistic quantum mechanics,<br />

and the speed of light did not play a role in our considerations, it is, of course, especially the<br />

special theory of relativity which provides the inspiration to study the (im)possibility of signaling.



Therefore, it is interesting to consider the EPRB experiment schematically in a Minkowski diagram,<br />

figure VII. 9.<br />

Figure VIII placeholder removed; caption follows.<br />

Figure VII. 9: Minkowski diagram of the EPRB experiment (axes $x$ and $ct$, measuring events A and B), where λ is in the past light cones of both<br />

A and B<br />

A natural requirement of locality for a relativistic stochastic HVT is that the probability of an<br />

outcome A depends exclusively on the variables which specify the state in the past light cone of<br />

the measuring event at A, and likewise for B. Bell has called this local causality. We have seen that<br />

quantum mechanics is not a locally causal theory. To be sure, the probability of an outcome at A cannot be<br />

influenced by the choice of the direction of measurement $\vec{b}$ at B; but with the outcome at B, which<br />

can be registered there, an observer at B can make a prediction concerning the particle at A<br />

which an observer at A cannot make, even if he has complete knowledge of the state in the past light<br />

cone of A.<br />


VII. 7. 2<br />

LOCALITY VERSUS CONDITIONAL INDEPENDENCE<br />

A problem that is brought up in some publications, e.g. Fine (1982), De Muynck (1986, 1996),<br />

is to what extent locality is necessary to derive the Bell inequalities. The authors argue that the<br />

‘requirements of locality’ express only a special form of statistical independence. The distance<br />

between the measuring apparatuses is in no way manifest in the requirement. Although<br />

‘locality’ is a term which seems to presuppose a space-time, space-times are conspicuous by<br />

their absence in the relevant locality assumptions; these are all probability statements without reference to<br />

space or time.<br />

Indeed, strictly speaking one cannot say that these assumptions express a requirement of locality.<br />

It could be possible to expect an analogous independence for a hypothetical pair of particles, for<br />

example a photon and a gluon, which absolutely cannot interact with each other, but are located very<br />

close to each other. The essence is that in a local theory the large distance between the particles can<br />

be taken to be a sufficient, but not necessary condition for the absence of interactions.<br />

The requirement of outcome independence in the HVT is not a representation of the requirement<br />

of locality; it has only been motivated by it. The conclusion that is sometimes drawn from this, that<br />

apparently locality itself is irrelevant for the Bell inequality, is, however, incorrect. An actual violation<br />

of the Bell inequality means that every stochastic HVT satisfying the factorizability condition formulated in<br />

section VII. 5 is excluded, and therefore the local versions are excluded as well.



VII. 7. 3<br />

DETERMINISM<br />

Another widespread view is that the derivation of the Bell inequalities always relies on a supposition<br />

of determinism in the HVT, so that giving up determinism would be a possible way out of<br />

the Bell inequalities.<br />

Bell himself has emphasized the inadequacy of this view. Determinism, which is the possibility<br />

to make predictions concerning a remote object with certainty before making measurements, indeed<br />

plays an important role in the original version. But this is a consequence of the perfect correlation<br />

in the quantum mechanical expression (VII. 5), i.e., this determinism follows from the singlet state<br />

itself, and is not a specific supposition of the HVT, see for instance Suppes and Zanotti (1976), and<br />

Dieks (1983).<br />

We saw that in a stochastic, or indeterministic, HVT the Bell inequalities are also derivable, so<br />

that giving up determinism does not help. If anything, the opposite is true: it is super-determinism,<br />

the supposition that the choice of the direction of measurement by the experimenter is also<br />

determined in advance, that offers a way out of the Bell inequalities.


VIII<br />

THE MEASUREMENT PROBLEM<br />

[. . . ] if one has to stick to these darn quantum jumps then I regret that I ever have taken<br />

part in the whole thing.<br />

— Erwin Schrödinger<br />

In this final chapter we will elaborate on the most important interpretation problem, the measurement<br />

problem, which has been the subject of an ever-continuing series of publications. We will give<br />

an introduction to Von Neumann’s quantum mechanical measurement theory and formulate the<br />

measurement problem. We will then go through a number of attempts to solve it, and finally we will<br />

discuss some criticism of the theory.<br />

VIII. 1<br />

INTRODUCTION<br />

The term ‘measurement’ plays a very special role in quantum mechanics, and we suggest a short<br />

rereading of the first paragraphs of chapter V. It is remarkable that the term appears in the Von Neumann<br />

postulates as described in chapter III, p. 41, ff. It occurs both in the measurement postulate, which specifies the<br />

possible outcomes of measurement and gives a physical meaning to the probability measure<br />

determined by the state vector, or the state operator, in terms of outcomes of measurement, and<br />

in the projection postulate, which establishes the evolution in time of the state at measurement.<br />

That special role also becomes apparent in the debates concerning the interpretation of the theory,<br />

where it is frequently remarked that measurement ‘creates’ the value for a quantity, or that it causes a<br />

sudden state change, as expressed by Dirac (1958, p. 36),<br />

In this way we see that a measurement always causes the system to jump into an eigenstate<br />

of the dynamical variable that is being measured, the eigenvalue this eigenstate<br />

belongs to being equal to the result of the measurement.<br />

From the perspective of classical physics, this is extremely unusual. In Newton’s theory of gravitation,<br />

or the electrodynamics of Faraday and Maxwell, measurements are sometimes mentioned, as<br />

suppliers of experimental facts, but never as specific types of operation on physical systems, needing<br />

a separate treatment in the theory.<br />

The point here is not only that measurements in classical physics, as is frequently stated, always<br />

bring about a negligible or compensable disturbance of the system and can therefore remain outside<br />

consideration. Much more important is that in classical physics there is no distinction in principle



between processes which serve as measurements and processes that do not. Every physical process<br />

or every mutual influence of physical systems can, under suitable circumstances, be considered as<br />

a measurement. Since it is the physical theory that indicates which physical processes in nature are<br />

possible, the theory itself also provides the criterion for the kinds of measurements which are possible.<br />

According to Von Neumann’s postulates, in quantum mechanics this is exactly the other way<br />

around. First we must, according to the aforementioned postulates, have a criterion to know when a<br />

process is a measurement, before we can indicate what the theory has to say concerning the process,<br />

before we can apply the postulates. That the term measurement in this way gets a more fundamental<br />

status than the physical theory, is also expressed by the words of Pauli as quoted in chapter I, p. 9,<br />

that a measurement creating values is “outside the laws of nature”.<br />

Intuition tells us that measurements are just an ‘ordinary kind’ of physical interaction, and this<br />

intuition cannot easily be wiped out, as the following illustration shows. Consider a photon which<br />

has gone through a slit and is on its way to a photographic plate. If we presume the interaction with<br />

this photographic plate to be a measurement, the wave function of the photon must, according to the<br />

projection postulate, collapse on arrival at the plate. But we also know that the photographic plate has<br />

a microscopic structure. It contains silver atoms in an emulsion which can be excited by the photon<br />

and start a chemical process in such a way that we can see something when the plate is developed.<br />

Would it not be plausible that quantum mechanics could describe such a process using a Schrödinger<br />

equation?<br />

In every way this event looks like a physical interaction which falls completely within the well-known<br />

laws of nature, rather than outside them. And if this is denied, how shall we decide at all when<br />

a microscopic interaction between a photon and an atom can and when it cannot be labeled as a<br />

measurement? Asking an experimental physicist how her measurement setup works, one will be<br />

given an answer in which physical interactions, generally of electromagnetic nature, are of uppermost<br />

importance. It seems absurd to maintain that events take place in the laboratory that are “outside the laws<br />

of nature”.<br />

The clash between the conception that measurements do not differ from other physical interactions<br />

on the one hand, and the fact that measurements in quantum mechanics have acquired a special status<br />

because they are not classified as physical interactions on the other hand, is called the quantum<br />

mechanical measurement problem in the broad sense.<br />

VIII. 2<br />

MEASUREMENT ACCORDING TO CLASSICAL PHYSICS<br />

Although usually no special attention is given to measurements in classical physics, it is no problem<br />

to give a general, schematic description of how a measurement is treated classically.<br />

A measurement brings about a correlation between a quantity A of a physical system S, which<br />

is, within the context of a measurement, frequently called an object system, and a quantity R, where<br />

the R comes from reading, which is characteristic for the measuring apparatus M, the apparatus<br />

being a physical system also. In classical physics we assume that A has a certain value $a \in \mathbb{R}$,<br />

where $a$ is an element from a set of possible values, for instance $\{a_1, \ldots, a_n\} \subset \mathbb{R}$, and that after the<br />

measurement process R has a value $r_j = m(a_j)$, where $m$ is a bijection from the possible values of A<br />

before the measurement to the possible values of R after the measurement.



Take, for example, S to be yourself and M to be a balance, A is your weight and R is the reading<br />

of the pointer of the balance. Now you have an unknown weight value, a, which is revealed by<br />

the balance indicating r = m(a) = 63 kg. The role of a measurement is pragmatic; the value of<br />

a physical quantity of the object system which is not directly or not easily observable, for example<br />

mass, is correlated to a quantity that is directly observable, in this case the position of a pointer. For a<br />

correlation to occur between A and R there must be an interaction between S and M. This interaction<br />

can, potentially, influence the value of A in such a way that the value before the measurement can<br />

change to another value after measurement. Measurement is a process looking towards the past and<br />

its aim is to reveal the value of A before the interaction with M.<br />

If it is possible to predict, from the value $a$ and the interaction between S and M, the value $a'$<br />

which A has after measurement, then the measurement also looks at the future and acts like an apparatus<br />

which prepares a state of S in which A has the value $a'$. Think, for example, of an ammeter<br />

in an electric circuit with an energy source of $V$ volt; if the current through a resistor $R$ is $I = V/R$<br />

without the ammeter, then, after the ammeter has been connected in series with the resistor, the current<br />

equals $I' = V/(R + R_s)$, where $R_s$ is the internal resistance of the ammeter. In case $a'$ equals $a$, the<br />

measurement is called non-disturbing or ideal. The measurement process thus has two aspects: what<br />

happens to the measuring apparatus M, and what happens to the physical system S, i.e. measurement<br />

and state preparation.<br />
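The ammeter example can be made concrete with a few numbers. The sketch below uses purely illustrative values for the voltage and the resistances (they are assumptions, not taken from the notes) and shows how the disturbance $I - I'$ shrinks as the internal resistance $R_s$ goes to zero, which is the classical sense in which a measurement approaches the ideal case $a' = a$.

```python
def current_without_ammeter(voltage, resistance):
    """I = V / R: the value of the current before the ammeter is connected."""
    return voltage / resistance


def current_with_ammeter(voltage, resistance, internal_resistance):
    """I' = V / (R + R_s): the value after the ammeter is connected in series."""
    return voltage / (resistance + internal_resistance)


V, R = 10.0, 100.0  # illustrative values in volt and ohm
for R_s in (10.0, 1.0, 0.0):
    I = current_without_ammeter(V, R)
    I_prime = current_with_ammeter(V, R, R_s)
    # The difference I - I' is the disturbance caused by the measurement.
    print(R_s, I, I_prime, I - I_prime)
```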

In classical physics the measurement interaction can be taken to be arbitrarily small, in which<br />

case the value of A is not disturbed. Therefore, the transition in such an ideal measurement process<br />

is<br />

$$(a_j, r_0) \longrightarrow (a_j, r_j) = \bigl( a_j, m(a_j) \bigr). \tag{VIII. 1}$$<br />

Notice that the characteristics of the measurement are left out of the consideration. The method<br />

of measuring does not have anything to do with the phenomenon one wants to get information about.<br />

The motion of the planets in the gravitational field of the sun is studied by looking at them, i.e., by<br />

using the fact that the planets reflect sunlight. The optical instruments that are used have nothing to<br />

do with the gravitational motion under examination.<br />

Also notice that in this consideration the question how to measure A is only transformed into the<br />

question how to find the value of R. If we also would have to measure the value of R, this could<br />

lead to an infinite chain of measuring apparatuses. This is avoided by assuming that the quantity R is<br />

directly observable, hence the term pointer reading for R, where we have to take the term ‘pointer’<br />

very generally, for instance, screens showing results of measurements or results printed on paper are<br />

included in the term.<br />

We appeal in our description to a distinction between two different types of quantities; the directly<br />

observable quantities, that is, observable to the naked eye, versus the not directly observable or unobservable<br />

quantities. But this is not a distinction which corresponds to a fundamental distinction between<br />

these quantities; in classical physics all quantities are treated as properties of objects. The fact that we<br />

stop at a directly observable quantity R is a decision based on purely contingent factors, particularly<br />

human physiology and the physics of the human senses.



VIII. 3<br />

MEASUREMENT ACCORDING TO QUANTUM MECHANICS<br />

The following schematic representation of the measurement process in quantum mechanics is<br />

given by Von Neumann (1932).<br />

Suppose that A is a physical quantity of the object system S, represented quantum mechanically<br />

by the maximal operator $A$ on Hilbert space $\mathcal{H}_S$, having a discrete spectrum $a_1, \ldots, a_N$. Now<br />

let S interact with a measuring apparatus M, where M is described quantum mechanically also.<br />

For M to be able to function as a measuring apparatus, it has to have a<br />

pointer quantity R, represented by the operator $R$ on Hilbert space $\mathcal{H}_M$, having orthonormal eigenstates<br />

$|r_0\rangle, \ldots, |r_N\rangle$. These eigenstates have to be orthonormal since they correspond to pointer<br />

readings which can be distinguished by the human eye. Let $|r_0\rangle$ be the eigenstate in which the pointer<br />

shows no deflection. The Hilbert space of the composite system SM is $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_M$ with<br />

$\dim \mathcal{H}_M = \dim \mathcal{H}_S + 1$, since the basis of $R$ includes $|r_0\rangle$, whereas that of $A$ does not include a state $|a_0\rangle$.<br />

Prior to the measurement, the measuring apparatus M is in the eigenstate |r 0 ⟩. We want this state<br />

to change, as a result of the measurement interaction, into the eigenstate |r j ⟩ which is indicative of the<br />

value a j of A, thus, let S initially be in the eigenstate |a j ⟩ of A. Moreover, we want the measurement<br />

to be ideal, so that the state |a j ⟩ of S does not change.<br />

Von Neumann showed that this transition can indeed be brought about by a unitary transformation,<br />

which means we have to find for the composite system SM a unitary evolution operator U, inducing<br />

the transition<br />

$$U \bigl( |a_j\rangle \otimes |r_0\rangle \bigr) = |a_j\rangle \otimes |r_j\rangle, \tag{VIII. 2}$$<br />

where U describes the measurement interaction lasting some unspecified time interval.<br />

EXERCISE 36. Show that the operator<br />

$$U = \sum_{l=1}^{N} \sum_{m=0}^{N} |a_l\rangle \otimes |r_{[l+m]}\rangle \, \langle a_l| \otimes \langle r_m| \tag{VIII. 3}$$<br />

(a) is unitary, and (b) induces the desired transition (VIII. 2). Here, $[l + m]$ means $l + m$ modulo<br />

$N + 1$, i.e.: $[N + 1] = 0$, $[N + 2] = 1$, etc.<br />
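Before attempting the proof, the two claims of the exercise can be checked numerically for small $N$. The sketch below is an illustration only, not a solution of the exercise: it builds the matrix of (VIII. 3) in the product basis $|a_l\rangle \otimes |r_m\rangle$, using an index convention (`basis_index`) that is an assumption of this example, and tests $U^\dagger U = \mathbb{1}$ and the transition (VIII. 2).

```python
def basis_index(l, m, N):
    """Index of |a_l> (x) |r_m> in the product basis, with l = 1..N and m = 0..N."""
    return (l - 1) * (N + 1) + m


def build_U(N):
    """Matrix of (VIII. 3): |a_l> (x) |r_m> is mapped to |a_l> (x) |r_[l+m]>."""
    dim = N * (N + 1)
    U = [[0] * dim for _ in range(dim)]
    for l in range(1, N + 1):
        for m in range(N + 1):
            col = basis_index(l, m, N)
            row = basis_index(l, (l + m) % (N + 1), N)
            U[row][col] = 1
    return U


def is_unitary(U):
    """Check U^dagger U = 1; the entries are real, so the adjoint is the transpose."""
    dim = len(U)
    for i in range(dim):
        for j in range(dim):
            entry = sum(U[k][i] * U[k][j] for k in range(dim))
            if entry != (1 if i == j else 0):
                return False
    return True


def induces_transition(U, N):
    """Check that U maps |a_j> (x) |r_0> to |a_j> (x) |r_j> for j = 1..N."""
    for j in range(1, N + 1):
        col = basis_index(j, 0, N)
        image = [U[row][col] for row in range(len(U))]
        expected = [0] * len(U)
        expected[basis_index(j, j, N)] = 1
        if image != expected:
            return False
    return True


N = 3
U = build_U(N)
print(is_unitary(U), induces_transition(U, N))  # True True
```

The matrix is a permutation of the basis vectors, which is the structural reason behind part (a); the general argument for arbitrary $N$ is left to the exercise.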

The formula (VIII. 2) strongly resembles the transition (VIII. 1). Apparently, everything we desired<br />

concerning the ideal measurement process in quantum mechanics, including the requirement<br />

that the value of A must not be disturbed, can be achieved using a unitary operator. At first sight,<br />

there does not seem to be any problem with a completely quantum mechanical treatment of the measurement<br />

interaction, taken as an ordinary physical process obeying Schrödinger’s equation. As in the<br />

classical case, the method of measuring is not discussed. We also did not appeal to the measurement<br />

or the projection postulate.



However, at a second look, the transition (VIII. 2) turns out to have peculiar consequences. The<br />

formula (VIII. 2) assumed that the object system S was, before the measurement, in an eigenstate of<br />

A. But what if S is in an arbitrary state |ψ⟩ ∈ H S ?<br />

We can decompose this arbitrary state $|\psi\rangle$ into the orthonormal eigenstates $|a_j\rangle$ of $A$ with coefficients<br />

$c_j = \langle a_j | \psi\rangle$. Therefore, using $|\psi\rangle = \sum_j c_j |a_j\rangle$ and the linearity of the evolution operator it<br />

follows that<br />

$$U \bigl( |\psi\rangle \otimes |r_0\rangle \bigr) = U \sum_{j=1}^{N_S} c_j |a_j\rangle \otimes |r_0\rangle = \sum_{j=1}^{N_S} c_j \, U \bigl( |a_j\rangle \otimes |r_0\rangle \bigr) = \sum_{j=1}^{N_S} c_j |a_j\rangle \otimes |r_j\rangle =: |\Phi\rangle. \tag{VIII. 4}$$<br />

We see that the state |Φ⟩ of the composite system of object S and measuring apparatus M after the<br />

measurement is in general no longer a product state; rather, it is entangled. This implies that we cannot describe<br />

S, nor M, with a pure state; the partial traces over M and over S yield mixed states for S and M, respectively, see section III. 4.<br />
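That the object system alone is left in a mixed state can be illustrated numerically. The sketch below assumes a two-level object system with the illustrative coefficients $c_1 = 0.6$ and $c_2 = 0.8i$ (an assumption of this example); it takes the partial trace of $|\Phi\rangle\langle\Phi|$ over M and computes $\mathrm{Tr}\,\rho_S^2$, which equals 1 only for a pure state.

```python
def entangled_state(c):
    """Amplitudes of |Phi> = sum_j c_j |a_j> (x) |r_j>, as a dict {(j, k): amplitude}.

    The key (j, k) labels |a_j> (x) |r_k>, with j = 1..N and k = 0..N.
    """
    N = len(c)
    return {(j, k): (c[j - 1] if j == k else 0j)
            for j in range(1, N + 1) for k in range(N + 1)}


def reduced_state_S(phi, N):
    """Partial trace over M: rho_S[i][j] = sum_k Phi(i, k) * conj(Phi(j, k))."""
    return [[sum(phi[(i, k)] * phi[(j, k)].conjugate() for k in range(N + 1))
             for j in range(1, N + 1)]
            for i in range(1, N + 1)]


def purity(rho):
    """Tr(rho^2); for a Hermitian matrix this is the sum of |rho_ij|^2."""
    return sum(abs(x) ** 2 for row in rho for x in row)


c = [0.6, 0.8j]  # illustrative coefficients with |c_1|^2 + |c_2|^2 = 1
rho_S = reduced_state_S(entangled_state(c), len(c))
print(rho_S)          # diagonal entries |c_j|^2, vanishing off-diagonal entries
print(purity(rho_S))  # smaller than 1, so rho_S is a mixed state
```

If only one coefficient is non-zero, the state is a product state and the purity returns to 1, in line with the eigenstate case (VIII. 2).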

This aspect has no classical analogue. We will come back to this, but first we consider the question<br />

whether this quantum mechanical description of the measurement process is compatible with the<br />

measurement postulate. Or, more precisely, whether application of the measurement postulate to R on M<br />

leads to the same result as its direct application to A on S. And we ask whether the desired correlation<br />

between the values of A and R is achieved. We will show now that this is indeed the case.<br />

The quantity R of the measuring apparatus M is represented on the Hilbert space $\mathcal{H}_S \otimes \mathcal{H}_M$ of<br />

the composite system SM as $\mathbb{1} \otimes R$. The probability to find for this quantity the value $r_k$ is, according<br />

to the measurement postulate,<br />

$$\mathrm{Prob}_{|\Phi\rangle}(R : r_k) = \langle\Phi| \bigl( \mathbb{1} \otimes |r_k\rangle\langle r_k| \bigr) |\Phi\rangle. \tag{VIII. 5}$$<br />

With (VIII. 4) this yields<br />

$$\mathrm{Prob}_{|\Phi\rangle}(R : r_k) = |c_k|^2, \tag{VIII. 6}$$<br />

where we have used the orthonormality of the $|r_k\rangle \in \mathcal{H}_M$. This is the same result as yielded by<br />

direct application of the measurement postulate to the arbitrary state $|\psi\rangle$ from (VIII. 4). Apparently, the<br />

probability to find an outcome $r_k$ when measuring R of M is always equal to the probability to find<br />

the outcome $a_k$ of A on S. The former measurement can therefore be regarded as a substitute for the<br />

latter.<br />

The validity of (VIII. 6) itself does not show that a correlation between the value of A and R has<br />

been established. To show that such a correlation exists, we have to know the probability of a certain<br />

pair of outcomes $(a_i, r_k)$ for $A \otimes R$, in the state $|\Phi\rangle$ of (VIII. 4). The joint probability to find this pair<br />

of outcomes is<br />

$$\mathrm{Prob}_{|\Phi\rangle}(A : a_i \wedge R : r_k) = \langle\Phi| \bigl( |a_i\rangle\langle a_i| \otimes |r_k\rangle\langle r_k| \bigr) |\Phi\rangle = \bigl| \bigl( \langle a_i| \otimes \langle r_k| \bigr) |\Phi\rangle \bigr|^2 = |c_i|^2 \delta_{ik}. \tag{VIII. 7}$$



The conditional probability to find for A the value $a_i$, given that for R the value $r_k$ has been found,<br />

is therefore<br />

$$\mathrm{Prob}_{|\Phi\rangle}(A : a_i \mid R : r_k) = \frac{\mathrm{Prob}(A : a_i \wedge R : r_k)}{\mathrm{Prob}(R : r_k)} = \frac{|c_i|^2 \delta_{ik}}{|c_k|^2} = \delta_{ik}. \tag{VIII. 8}$$<br />

In other words, in the state |Φ⟩ a strict correlation exists between the quantities A and R, represented<br />

quantum mechanically by the operators A and R.<br />
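The probabilities (VIII. 6) to (VIII. 8) can be reproduced from the amplitudes of $|\Phi\rangle$ in the product basis. The following sketch again assumes the illustrative coefficients $c_1 = 0.6$, $c_2 = 0.8i$; the function names are hypothetical. Because the basis states are orthonormal, each joint probability is simply the squared modulus of one amplitude.

```python
def measured_state(c):
    """Amplitudes of |Phi> = sum_j c_j |a_j> (x) |r_j> in the product basis.

    Keys are (i, k) for |a_i> (x) |r_k>, with i = 1..N and k = 0..N.
    """
    N = len(c)
    return {(i, k): (c[i - 1] if i == k else 0.0)
            for i in range(1, N + 1) for k in range(N + 1)}


def joint_probability(phi, i, k):
    """Prob(A: a_i and R: r_k) = |(<a_i| (x) <r_k|) |Phi>|^2, cf. (VIII. 7)."""
    return abs(phi[(i, k)]) ** 2


def pointer_probability(phi, k):
    """Prob(R: r_k): the joint probability summed over all a_i, cf. (VIII. 6)."""
    return sum(abs(amp) ** 2 for (_, kk), amp in phi.items() if kk == k)


def conditional_probability(phi, i, k):
    """Prob(A: a_i | R: r_k), cf. (VIII. 8); only defined when c_k is non-zero."""
    return joint_probability(phi, i, k) / pointer_probability(phi, k)


c = [0.6, 0.8j]  # illustrative coefficients
phi = measured_state(c)
for k in (1, 2):
    print(k, pointer_probability(phi, k),
          [conditional_probability(phi, i, k) for i in (1, 2)])
```

For each pointer reading $r_k$ the conditional distribution over the $a_i$ is concentrated on $i = k$, which is the strict correlation stated above.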

The schematic representation of the ideal measurement process is, as we have seen, consistent<br />

with the measurement postulate in the sense that a measurement on M can be a substitute for a measurement<br />

on S. Notice that, to answer this question, we did appeal to the measurement postulate.<br />

This is unavoidable, since the final state after the measurement process, (VIII. 4), is an entangled<br />

quantum state. We can only specify its empirical consequences by appealing to the meaning quantum<br />

mechanics attributes to such quantum states, and in Von Neumann’s postulates that meaning is established<br />

by means of the measurement postulate. Unfortunately, this postulate forces us to consider a<br />

measurement again, namely, a measurement on the measuring apparatus M itself, by reading off the<br />

position of the pointer. Now we have to ask if this second measurement can also be represented as a<br />

normal interaction.<br />

Suppose that we introduce a second measuring apparatus M′ which we use to read off the result<br />

of M using a new pointer quantity R′, represented by the operator $R'$ in $\mathcal{H}_{M'}$. As an example, we<br />

can think of a quantum mechanical description of our eye. Schematically, we then have the process<br />

$$|r_j\rangle \otimes |r'_0\rangle \longrightarrow |r_j\rangle \otimes |r'_j\rangle, \tag{VIII. 9}$$<br />

where the $|r'_j\rangle$ are the eigenstates of $R'$ of M′. Let $U'$ be the unitary operator describing the measurement<br />

by M′ on M, lasting again some unspecified amount of time. Now we have, for the composite<br />

system SMM′ in the Hilbert space $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_M \otimes \mathcal{H}_{M'}$,<br />

$$|a_j\rangle \otimes |r_0\rangle \otimes |r'_0\rangle \xrightarrow{\;U\;} |a_j\rangle \otimes |r_j\rangle \otimes |r'_0\rangle \xrightarrow{\;U'\;} |a_j\rangle \otimes |r_j\rangle \otimes |r'_j\rangle, \tag{VIII. 10}$$<br />

and therefore, if we start from a general initial state $|\psi\rangle \otimes |r_0\rangle \otimes |r'_0\rangle$, the final state will be<br />

$$|\Phi'\rangle = U' U \bigl( |\psi\rangle \otimes |r_0\rangle \otimes |r'_0\rangle \bigr) = \sum_{j=1}^{N_S} c_j |a_j\rangle \otimes |r_j\rangle \otimes |r'_j\rangle. \tag{VIII. 11}$$<br />

Again, one can argue that all this is consistent with the measurement postulate. That is, upon measurement<br />

of R′, the probability of finding the value $r'_k$ is equal to $|c_k|^2$, etc.<br />
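The two-step chain can be simulated by tracking amplitudes over basis labels. This is a sketch under an explicit assumption: the notes do not specify $U'$, so here it is built in analogy with (VIII. 3), shifting the index of the second pointer by the index of the first. Both operators then act as permutations of the basis labels $(j, m, m')$ for $|a_j\rangle \otimes |r_m\rangle \otimes |r'_{m'}\rangle$.

```python
def apply_permutation(state, rule):
    """Apply a basis permutation to a state given as {basis_label: amplitude}."""
    return {rule(label): amp for label, amp in state.items()}


def measurement_chain(c):
    """Final state U' U (|psi> (x) |r_0> (x) |r'_0>) as {(j, m, m'): amplitude}."""
    N = len(c)
    state = {(j, 0, 0): c[j - 1] for j in range(1, N + 1)}

    def U(label):
        # First measurement: the pointer of M is shifted by the index of a_j.
        j, m, mp = label
        return (j, (j + m) % (N + 1), mp)

    def U_prime(label):
        # Second measurement (assumed analogous): M' is shifted by the index of r_m.
        j, m, mp = label
        return (j, m, (m + mp) % (N + 1))

    return apply_permutation(apply_permutation(state, U), U_prime)


def prob_second_pointer(state, k):
    """Probability of finding r'_k when R' is measured on the final state."""
    return sum(abs(amp) ** 2 for (_, _, mp), amp in state.items() if mp == k)


final = measurement_chain([0.6, 0.8j])  # illustrative coefficients
print(final)
print(prob_second_pointer(final, 1), prob_second_pointer(final, 2))
```

Only the labels $(j, j, j)$ carry amplitude in the final state, which is the form of (VIII. 11), and the probabilities for the second pointer are again the $|c_k|^2$.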

We can extend this type of reasoning ad nauseam, by incorporating more and more systems in<br />

the chain of measurement apparatuses, even including a photon scattered by the pointer and entering<br />

the eye of the observer, his retina, the nerve fibres of his brain, etc. All this is consistent with the<br />

measurement postulate, and you can, if you want to, be satisfied with this.<br />

However, the argument does not show that we can take measurements to be on an entirely equal<br />

footing with other physical interactions. No matter how far we extend the chain of apparatuses, the<br />

final state will always be a superposition of the form (VIII. 4) or (VIII. 11), the meaning of which<br />

can only be specified by saying what we will find at yet another measurement. The transition to the



conclusion that a certain state has been actually found, sometimes called the ‘Heisenberg cut’ (e.g.<br />

Primas 1993), cannot be made within the formalism. Rudolf Haag has expressed this situation as<br />

follows (Haag 1990, p. 246),<br />

Indeed the problem faced in the development in quantum theory has [. . . ] been [. . . ] the<br />

inability of devising any coherent realistic picture conforming with the observed phenomena.<br />

We can shift the place where we want to make the Heisenberg cut at will, by incorporating more<br />

and more systems in the quantum mechanical description. But the transition itself, exchanging the<br />

quantum mechanical description for a description in terms of observed facts, must come from outside<br />

quantum mechanics.<br />

One can of course, in analogy to the classical measurement scheme, simply postulate that this<br />

quantum mechanical description of the measurement process ends as soon as we can couple the system<br />

S, perhaps by means of many intermediate steps, to some measuring apparatus M whose pointer<br />

quantity R is directly observable. But here we are dealing with the fundamental issue in the theory<br />

and therefore we cannot be satisfied with a pragmatic point of view. Also, we would be faced<br />

with the question which quantities deserve to have the special status of being “directly observable”.<br />

Furthermore, there is the problem that the final states of (VIII. 4) or (VIII. 11) are entangled superpositions<br />

of states with different pointer positions. As we mentioned above, this has no classical analogue.<br />

Without further analysis it is hard to imagine what a direct observation on such states would look like.<br />

An example in which these issues emerge sharply is Schrödinger’s famous cat paradox (1935b),<br />

which we discussed already in the introduction. Schrödinger imagined that a living cat is locked up in<br />

a hermetically closed box, together with a radioactive substance of which perhaps one atom decays<br />

in the course of one hour. The box is provided with a Geiger counter which can register the decay of<br />

the atom and which, upon decay, activates a device that releases a deadly gas.<br />

Assume that initially the quantum mechanical state of this total system is a product state with a<br />

very large number of factors. In the course of the hour we agreed to wait, the state of the radioactive<br />

atom evolves into a superposition of the atom before and after decay. The total<br />

state then takes on the same form as the state $|\Phi'\rangle$ in (VIII. 11), i.e., it evolves into something<br />

like<br />

$$c_1(t) \, |A : 1\rangle \otimes |\nu : 0\rangle \otimes \cdots \otimes |\mathrm{cat} : \smile\rangle + c_2(t) \, |A : 0\rangle \otimes |\nu : 1\rangle \otimes \cdots \otimes |\mathrm{cat} : \dagger\rangle, \tag{VIII. 12}$$<br />

where $t$ is the time during which we wait and the system evolves, $|A : 1\rangle$ and $|A : 0\rangle$ are the states of the radioactive<br />

atom before and after decay, $|\nu : 1\rangle$ and $|\nu : 0\rangle$ are the states of the electromagnetic field with<br />

and without a photon, etc.<br />

The composite system is therefore in a gigantic superposition of states in which the cat is living<br />

and in which it is dead. If we want to hold on to the orthodox interpretation of quantum mechanics<br />

to the bitter end, we have to say that in this state the cat is neither living nor dead, and that only at<br />

measuring, which is perhaps lifting the lid of the box after one hour, there is a certain probability,<br />

namely |c 2 | 2 versus |c 1 | 2 , to find the cat dead or alive. It is the observer, the opener of the box, who<br />

determines the fate of the cat.



Figure VIII. 1: Schrödinger’s cat paradox (DeWitt 1970)<br />

VIII. 4<br />

THE MEASUREMENT PROBLEM IN THE NARROW SENSE<br />

In the previous section we have seen how the measurement process, (VIII. 4), brings the composite<br />

system into a superposition of macroscopically different states, e.g. pointer positions. The development of<br />

such superpositions is a consequence of the linearity of the evolution operator. An example is given in<br />

the discussion between Einstein and Pauli, described in the introduction, p. 9, concerning the center of<br />

mass of a macroscopic body. The strangeness of a superposition comes from our tacit presupposition<br />

that the macroscopic pointer positions not only act as possible outcomes of a measurement, but can<br />

also be taken as properties of the pointer. We think that pointers of a measuring apparatus indicate<br />

something, even if we are not in the act of reading them off.<br />

Assuming, for the sake of convenience, that observing something is sufficient to decide that there<br />

is an element of physical reality which is responsible for the observation, we expect that if the quantum<br />

state presents a complete description of the system, i.e., if every element of physical reality has a<br />

counterpart in quantum mechanics, then those macroscopic properties should be represented by it.<br />

That is, however, not the case in the state (VIII. 4).<br />

The idea in the background is the following postulate, often called the ‘eigenstate-eigenvalue<br />

link’. It was explicitly supported both by Dirac (1958, p. 46) and Von Neumann (1955, p. 253).<br />

EIGENSTATE-EIGENVALUE LINK, PURE CASE:<br />

A physical system S has the property that quantity A has a definite value iff its state is an<br />

eigenstate of the operator A which, according to the observables postulate, corresponds<br />

to A.<br />

It is also conceivable that a system possesses a definite but unknown value for a quantity. If we use<br />

the ‘ignorance interpretation of mixtures’, as discussed in chapter III, p. 52, we obtain the variation<br />

EIGENSTATE-EIGENVALUE LINK, MIXED CASES:<br />

A physical system S has the property that quantity A has a definite but unknown value



iff its state is a mixture of eigenstates of the operator A which, according to the observables<br />

postulate, corresponds to A.<br />

These postulates speak about the existence of properties, about physical quantities having values,<br />

independent of a measurement or a measuring context.<br />

EXERCISE 37. Discuss the link between the property postulates and the sufficient condition of<br />

reality EPR(EPR) of Einstein, Podolsky and Rosen, section I. 2, p. 12, ff.<br />

From this point of view it would be good to have a quantum mechanical description of the measurement<br />

process in which, in any case, the measuring apparatus has a certain property after completion<br />

of the measurement. This means that, instead of the superposition (VIII. 4), we require, as a final<br />

state, the mixture<br />

$$W' = \sum_{j=1}^{N} |c_j|^2 \, |a_j\rangle \otimes |r_j\rangle \, \langle a_j| \otimes \langle r_j|. \tag{VIII. 13}$$<br />

Some authors, e.g. Landau and Lifshitz (1958, pp. 21 - 24), go still further and require as a final<br />

state an eigenstate |r k ⟩ of the pointer quantity R, corresponding to the pointer position found after<br />

measurement. According to them the measuring interaction finishes with an indeterministic jump,<br />

with probability |c j | 2 , to one of the states |a j ⟩ ⊗ |r j ⟩.<br />

Summarizing, we have the following options for the description of the measurement process. For<br />

the initial state there is no controversy,<br />

$$|\psi\rangle \otimes |r_0\rangle = \sum_{j=1}^{N_S} c_j |a_j\rangle \otimes |r_0\rangle. \tag{VIII. 14}$$<br />

For the final state there are three possibilities,<br />

1. $\displaystyle \sum_{j=1}^{N_S} c_j |a_j\rangle \otimes |r_j\rangle$, (VIII. 15)<br />

2. $\displaystyle W' = \sum_j |c_j|^2 \, |a_j\rangle \otimes |r_j\rangle \, \langle a_j| \otimes \langle r_j|$, (VIII. 16)<br />

3. $|a_j\rangle \otimes |r_j\rangle$ with probability $|c_j|^2$. (VIII. 17)<br />

According to the foregoing line of reasoning we require that, at the end of a measuring interaction,<br />

the pointer of the measuring apparatus, which is of course macroscopic, indicates something.<br />

The state (VIII. 15) does not satisfy this requirement. On the contrary, the quantum mechanical superposition<br />

$|\psi\rangle$ of eigenstates $|a_j\rangle$ of the measured quantity, which prevented us from ascribing,<br />

prior to the measurement, a definite value of A to the object system S, proves to be contagious;<br />

after the interaction the pointer quantity of the measuring apparatus no longer has a definite value either,<br />

and if the composite system SM is coupled to another measuring apparatus $M'$, this also<br />

becomes infected with ‘property loss’. This is why (VIII. 16) and (VIII. 17) are preferred as final<br />

states over (VIII. 15).<br />

The problem of giving a treatment of the measurement process which produces one of these two<br />

final states, and which therefore ‘creates’ the definite values by means of the measuring interaction,<br />

is the measurement problem in the narrow sense. Notice that (VIII. 16) and (VIII. 17) cannot be<br />

obtained from the initial state by means of a unitary transformation. Therefore, we have to adjust or<br />

extend the first five Von Neumann postulates. We will discuss some proposals for a solution.<br />
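
The obstruction can be made concrete in the smallest case. The following Python sketch is an illustration only and not part of the original argument: it assumes N = 2, real coefficients c = (0.6, 0.8), and a basis ordering in which index 2j + k stands for $|a_j\rangle \otimes |r_k\rangle$. The helper names and the use of the purity $\mathrm{Tr}\, W^2$ are introduced here for the purpose. Since $\mathrm{Tr}\,(U W U^\dagger)^2 = \mathrm{Tr}\, W^2$ for every unitary U, a pure state (purity 1) cannot evolve unitarily into the mixture (VIII. 16), whose purity is $\sum_j |c_j|^4 < 1$.<br />

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def projector(vec):
    """Statistical operator |v><v| of a normalized vector v."""
    return [[x * y.conjugate() for y in vec] for x in vec]

def purity(w):
    """Tr W^2: equals 1 for a pure state, is smaller for a proper mixture."""
    ww = matmul(w, w)
    return sum(ww[i][i] for i in range(len(w))).real

# Assumed example: N = 2, basis index 2*j + k stands for |a_j> (x) |r_k>.
c = [0.6, 0.8]                          # |c_0|^2 + |c_1|^2 = 1
final_pure = [c[0], 0.0, 0.0, c[1]]     # the superposition (VIII. 15)
w_pure = projector(final_pure)

w_mixture = [[0.0] * 4 for _ in range(4)]   # the mixture (VIII. 16)
w_mixture[0][0] = c[0] ** 2
w_mixture[3][3] = c[1] ** 2

# One arbitrary unitary: a rotation in the plane of |a_0 r_0> and |a_1 r_1>.
theta = 0.7
u = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
u[0][0], u[0][3] = math.cos(theta), -math.sin(theta)
u[3][0], u[3][3] = math.sin(theta), math.cos(theta)
w_rotated = matmul(matmul(u, w_pure), dagger(u))

print(purity(w_pure), purity(w_rotated), purity(w_mixture))
```

The rotation is only one example of a unitary; the invariance of the purity holds for all of them, which is why no choice of U can produce (VIII. 16) from a pure initial state.<br />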

VIII. 4. 1<br />

THE PROJECTION POSTULATE AND CONSCIOUSNESS<br />

By adding the projection postulate to the first five postulates, p. 41, Von Neumann gave the standard<br />

solution to the measurement problem in the narrow sense. He distinguished two ways in which<br />

a state can change in time,<br />

Process 1. The discontinuous, non-unitary, indeterministic projection occurring at a<br />

measurement; the projection postulate.<br />

Process 2. The continuous, unitary, deterministic evolution which is consistent with the<br />

Schrödinger equation or its generalization to mixed states, as long as no measurement is<br />

made on the system; the Schrödinger postulate.<br />

At measurement the state undergoes a transition into the eigenstate belonging to the outcome of<br />

measurement. Therefore, this brings about the final state (VIII. 17) and gives, in accordance with the<br />

eigenstate-eigenvalue link, p. 170, definite properties to both the object system and the pointer of the<br />

measuring apparatus.<br />

Although the measurement problem in the narrow sense is solved with these two types of evolution,<br />

the measurement problem in the broad sense, p. 164, comes into prominence more than ever.<br />

We would now like to have an explanation for the particular nature of a measurement, or at least a<br />

criterion with which it can be distinguished from other processes.<br />

Such a criterion is provided by Von Neumann and, for instance, Wigner, W. Heitler (1970, p. 42),<br />

and F. London and E. Bauer (1939), in terms of the consciousness of an observer. London and Bauer<br />

reason as follows.<br />

Consider an object system S, a measuring apparatus M and a conscious observer B. The state of<br />

the composite system after measurement is, according to (VIII. 11),<br />

$$|\Phi\rangle = \sum_j c_j \, |a_j\rangle \otimes |r_j\rangle \otimes |b_j\rangle. \qquad \text{(VIII. 18)}$$<br />

According to London and Bauer, this is the description of the state for us. But for the conscious<br />

observer B it is not the same, because B has the characteristic capacity of introspection. By introspection<br />

he knows in which eigenstate he is, he perceives one certain pointer position. This breaks the



quantum mechanical chain. If he knows that he is in the state $|b_k\rangle$ and sees the meter indicating something<br />

which corresponds to the pointer state $|r_k\rangle$, then from that moment on the state has immediately<br />

become $|a_k\rangle \otimes |r_k\rangle \otimes |b_k\rangle$. Conscious introspection of the observer therefore causes the collapse<br />

of the wave packet. This strange situation is expressed in the thought experiment called ‘Wigner’s<br />

friend’, in which the measuring device is replaced by a friend who communicates the outcome of<br />

measurement to Wigner.<br />

The aforementioned authors emphasize the role of consciousness in the interpretation of quantum<br />

mechanics. It need hardly be emphasized that for the majority of physicists something like this is<br />

unacceptable. They are of the opinion that a measurement is finished as soon as the result is registered<br />

somewhere in the equipment. It is not necessary that it subsequently comes to the attention of a conscious<br />

being. But of course, then the question remains again which criterion can be given for a permanent<br />

registration.<br />

VIII. 4. 2<br />

BOHMIAN MECHANICS<br />

An important advantage of the theory of chapter VI is its avoidance of the projection postulate.<br />

This has consequences for the treatment of measurements. ‘Measuring’ is not a primitive concept in<br />

Bohmian mechanics; measurements are treated on an equal footing with all other physical interactions.<br />

The measuring apparatus is treated in the same manner as the measured object system, namely<br />

with the Bohmian equations, which are derived from the Schrödinger equations. As a consequence,<br />

the interaction between an object system and a measuring apparatus can be given according to the<br />

measurement scheme (VIII. 4).<br />

If, for the sake of simplicity, we limit ourselves to two terms, the interaction is of the<br />

form (VI. 24), p. 134, where $\varphi_B$ and $\varphi_D$ are the eigen-wave functions of the pointer quantity, corresponding<br />

to the various pointer positions. It is plausible to assume that $\varphi_B$ and $\varphi_D$ have no overlap.<br />

Consequently, the wave function of the object system and the measuring apparatus is effectively factorizable<br />

and we can regard the superposition as a mixture. There is no measurement problem in<br />

Bohmian mechanics.<br />

◃ Remark<br />

The requirement that $\varphi_B$ and $\varphi_D$ in (VI. 24) have no overlap is stronger than what is required in<br />

Von Neumann’s model. There it suffices that the wave functions are orthogonal, i.e., $\langle \varphi_B | \varphi_D \rangle = 0$<br />

instead of $\varphi_B(\vec{q})\,\varphi_D(\vec{q}) = 0$ for all $\vec{q} \in \mathbb{R}^3$. ▹<br />
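
The difference between the two conditions in the remark can be illustrated numerically. The sketch below is an assumption-laden toy example, not taken from the text: it works in one dimension instead of $\mathbb{R}^3$ and uses an even Gaussian and an odd companion function as stand-ins for $\varphi_B$ and $\varphi_D$. Their inner product vanishes by symmetry, yet their pointwise product is non-zero almost everywhere, so they are orthogonal without having disjoint supports.<br />

```python
import math

def phi_b(q):
    """Assumed first wave function: normalized Gaussian centred at q = 0."""
    return math.pi ** -0.25 * math.exp(-q * q / 2.0)

def phi_d(q):
    """Assumed second wave function: odd in q, hence orthogonal to phi_b."""
    return math.sqrt(2.0) * q * phi_b(q)

def inner_product(f, g, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoidal estimate of the integral of f(q) g(q) for real f and g."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) * g(lo) + f(hi) * g(hi))
    for i in range(1, steps):
        q = lo + i * h
        total += f(q) * g(q)
    return total * h

overlap_integral = inner_product(phi_b, phi_d)   # orthogonality: close to 0
norm_b = inner_product(phi_b, phi_b)             # normalization: close to 1
pointwise_product = phi_b(1.0) * phi_d(1.0)      # clearly non-zero at q = 1

print(overlap_integral, norm_b, pointwise_product)
```

Functions of this kind satisfy Von Neumann’s orthogonality condition but not the stronger no-overlap condition used in the Bohmian treatment.<br />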

VIII. 4. 3<br />

SPONTANEOUS COLLAPSE<br />

The next option has been developed by G.C. Ghirardi, A. Rimini, and T. Weber (1986), a related<br />

proposal comes from F.A. Bopp (1947). In this view the evolution from the Schrödinger postulate has<br />

to be replaced by an indeterministic evolution. A stochastic term is added, making the Schrödinger<br />

equation non-linear. This has as a consequence that every physical system from time to time spontaneously<br />

makes a small jump, so that the wave function collapses to, almost, a position eigenstate.<br />

The new constant of nature characterizing the relevant time scale is such that the probability of a<br />

spontaneous collapse of the wave function for a single elementary particle is extremely small, in the



order of once every $10^{10}$ years, leaving the continuous Schrödinger equation an excellent approximation<br />

for such a physical system.<br />

In this theory it can be shown that in case of composite systems a collapse of the state of a partial<br />

system brings about a collapse of the state of the entire composite system. This has as a consequence<br />

that the average frequency of these spontaneous jumps per unit of time increases with the number of<br />

degrees of freedom, and for a macroscopic system with approximately $10^{25}$ particles the average time<br />

between two jumps, and therefore two collapses, will only be $10^{-5}$ milliseconds. Hence, in good<br />

approximation, macroscopic systems always have a definite position where microscopic systems do<br />

not.<br />
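
The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch assumes, as a simplification, that the particles jump independently, so that the jump rate of the whole system is the single-particle rate multiplied by the number of particles; the variable names are illustrative.<br />

```python
# Figures taken from the text: one jump per particle in about 1e10 years,
# and a macroscopic body of about 1e25 particles.
SECONDS_PER_YEAR = 365.25 * 24 * 3600          # Julian year in seconds

single_particle_interval_s = 1e10 * SECONDS_PER_YEAR
n_particles = 1e25

# Assumption: independent jumps, so the rates add and the mean waiting
# time for the composite system is divided by the particle number.
macroscopic_interval_s = single_particle_interval_s / n_particles
macroscopic_interval_ms = macroscopic_interval_s * 1e3

print(macroscopic_interval_ms)   # a few times 1e-5 milliseconds
```

The result is of the order of $10^{-5}$ milliseconds, in agreement with the estimate in the text.<br />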

The difference between this approach and that of Von Neumann is that in the first place there<br />

is no fundamental difference between measurements and other interactions, consciousness plays no<br />

role. Moreover, by adapting the evolution equation, this theory leads to predictions which differ from<br />

quantum mechanics, making it verifiable. By means of experiments it is possible to obtain upper<br />

and lower limits for the collapse frequency. Ghirardi, Rimini and Weber are of the opinion that the<br />

experimental data we have at present are still compatible with a finite interval for their new constant<br />

of nature.<br />

VIII. 4. 4<br />

MANY WORLDS<br />

Another option is the many-worlds interpretation of H. Everett (1957), J.A. Wheeler (1957) and,<br />

especially, B.S. DeWitt (1970, 1971). In this view it is posed that the quantum mechanics of the<br />

first five postulates gives a universally valid description of reality. Therefore, in principle the wave<br />

function of the universe can be written down. There is no part of the world, including the context of<br />

measurement, which is described classically. Moreover, there is no projection postulate. The wave<br />

function develops according to a unitary evolution, which means that it remains a pure state for all<br />

time.<br />

Everett models a measurement process by assuming that a certain system has a complete set<br />

of orthonormal eigenstates, which are interpreted to signify that certain outcomes of measurement<br />

have occurred and are permanently registered in a memory. They are analogous to the previously<br />

mentioned pointer positions $|r_j\rangle$. The state $|\Psi\rangle$ of the composite system of object system S and<br />

measuring apparatus M remains in the superposition form (VIII. 15) for all time. To every state $|\varphi_i\rangle$<br />

of the object system corresponds a relative state of the measuring apparatus,<br />

$$|\psi\rangle^{\mathrm{rel}}_{\Psi,\,\varphi_i} := N_i \sum_j c_{ij} \, |r_j\rangle \quad \text{with} \quad c_{ij} = \bigl( \langle \varphi_i| \otimes \langle r_j| \bigr) |\Psi\rangle, \qquad \text{(VIII. 19)}$$<br />

where $N_i$ is a normalization constant and $\{|\varphi_i\rangle\}$ and $\{|r_j\rangle\}$ are arbitrary orthonormal bases of the<br />

Hilbert spaces $\mathcal{H}_S$ and $\mathcal{H}_M$ of the object system and measuring apparatus, respectively. It can easily<br />

be shown that this definition is independent of the choice of this basis, so that the relative state is<br />

uniquely defined by $|\Psi\rangle$ and $|\varphi_i\rangle$.<br />

In case of an ideal measurement we have<br />

$$|\psi\rangle^{\mathrm{rel}}_{\Psi,\,\varphi_i} = |r_i\rangle. \qquad \text{(VIII. 20)}$$<br />



This relative state yields the usual conditional probability distribution for the possible outcomes of<br />

measurement of a quantity in case the object system is found in the state $|\varphi_i\rangle$. This is substantiated by<br />

Everett by showing that, if we set the right conditions for the state $|\varphi_i\rangle$, all predictions for quantities<br />

which only refer to the object system S can be determined using the relative state. Therefore, we can<br />

act as if a projection to that state has taken place. In reality, however, the superposition (VIII. 15)<br />

remains.<br />
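
The definition (VIII. 19) can be tried out in a two-dimensional example. The sketch below is a hedged illustration: the function name, the coefficient tables and the non-ideal case are assumptions made here, and the table entry `coeffs[i][j]` plays the role of $c_{ij} = ( \langle \varphi_i| \otimes \langle r_j| ) |\Psi\rangle$. For an ideal measurement the coefficient table is diagonal and the relative state reduces to the pointer state $|r_i\rangle$, as in (VIII. 20).<br />

```python
import math

def relative_state(coeffs, i):
    """Components along |r_j> of the relative state with respect to phi_i.

    coeffs[i][j] stands for c_ij = (<phi_i| (x) <r_j|) |Psi>.
    The normalization constant N_i of (VIII. 19) is 1 / norm.
    """
    row = coeffs[i]
    norm = math.sqrt(sum(abs(c) ** 2 for c in row))
    if norm == 0.0:
        raise ValueError("phi_i does not occur in Psi; relative state undefined")
    return [c / norm for c in row]

# Ideal measurement with phi_i = a_i, cf. (VIII. 15): c_ij = c_i * delta_ij.
ideal = [[0.6, 0.0],
         [0.0, 0.8]]

# Assumed non-ideal example: the pointer is only partly correlated with S.
non_ideal = [[0.6, 0.2],
             [0.1, math.sqrt(1.0 - 0.36 - 0.04 - 0.01)]]

rel_ideal_0 = relative_state(ideal, 0)
rel_ideal_1 = relative_state(ideal, 1)
rel_non_ideal_0 = relative_state(non_ideal, 0)

print(rel_ideal_0, rel_ideal_1, rel_non_ideal_0)
```

In the non-ideal case the relative state is a superposition of pointer states, which shows that (VIII. 20) is a special property of ideal measurements.<br />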

Now the question is, of course, how this superposition must be interpreted. DeWitt in particular<br />

has advocated a radical view: all terms in this superposition represent real, existing worlds. The<br />

transition during the measurement process is a division of the world into uncountably many copies,<br />

where a different result is registered in each of them. All these worlds exist and develop further next<br />

to one another, without being able to have mutual contact. The problem how to choose one really<br />

realized term from the superposition, as we do using the projection postulate, is avoided because all<br />

terms are realized.<br />

Postulating the existence of such a multiplicity of worlds, with which, moreover, we absolutely<br />

cannot make contact, is acceptable only for a small number of people. But probably worse is the<br />

idea that any decay process in a star in a remote part of the universe can split up our local world into<br />

millions of copies of itself.<br />

Moreover, a difficult point in this theory is how the ‘splitting’ must be understood exactly. It<br />

seems that DeWitt intends a special kind of physical process which emerges at registration. This<br />

would look like adopting a second type of process besides the Schrödinger evolution, in contrast to<br />

the objective of the interpretation; the measurement problem in the broad sense would not be solved.<br />

There is also the problem which process we have to suggest for the reversed evolution; a ‘melting’ of<br />

worlds? In Everett’s original work the idea of a physical splitting of the universe does not occur. He<br />

only regards this as a ‘bookkeeping’ transition to a relative state.<br />

Finally there is the supposition that to a set of states $|r_j\rangle$ of the measuring apparatus the interpretation<br />

can be given that herewith an outcome of measurement is permanently registered. This<br />

supposition cannot without problems be brought into conformity with quantum mechanics because it<br />

still concerns superpositions.<br />

VIII. 4. 5<br />

SUPERSELECTION RULES<br />

Again another option is to introduce superselection rules. Certain superpositions of microscopic<br />

states do not seem to occur in nature, for example, superpositions of states with unequal charge, e.g.<br />

electric, baryonic, or superpositions of states with integer and half integer spin. Therefore, it could be<br />

assumed that superpositions of macroscopically different states do not occur either, and the dynamics<br />

of quantum mechanics must then be adapted to account for this.<br />

In such a setup of quantum mechanics, e.g., in which the superposition principle is not valid<br />

in general, it is possible to have $W'$, (VIII. 16), as the final state of the measurement process, see<br />

Beltrametti and Cassinelli (1981, p. 57). More precisely, in the presence of superselection rules the<br />

mixture (VIII. 16) and the pure state (VIII. 15) become equivalent; the superselection rules provide<br />

the same expectation values for all physical quantities allowed by the superselection operators.<br />

An example of this approach is the suggestion of R. Penrose (1996) that in a future unified theory<br />

for quantum gravitation a superselection rule would apply to the space - time metric. Because



the gravitational field is taken into account in the metric, also the positions of massive bodies such<br />

as pointers of measuring apparatuses are superselected since the field depends on the positions of<br />

massive macroscopic bodies.<br />

VIII. 4. 6<br />

IRREVERSIBILITY OF MEASUREMENT<br />

The next option is to appeal to the special characteristic properties of measuring apparatuses,<br />

and to the theory of irreversible processes, as is done in the work of A. Daneri, A. Loinger and<br />

G.M. Prosperi (1962). According to these authors it is characteristic for measuring apparatuses that<br />

they are in a metastable state. An interaction with a microscopic system then causes, by means of a<br />

chain reaction, an irreversible response of the measuring apparatus.<br />

The description of such an irreversible process within quantum mechanics is not straightforward,<br />

because the unitary evolution is always reversible. It is necessary to make special assumptions concerning<br />

the structure of the macroscopic measuring apparatus and its observable quantities; all matrices<br />

corresponding to these quantities have to be almost diagonal in the energy representation. Then<br />

it can be shown that, as regards the empirical statements for these observable quantities, the final<br />

state (VIII. 15) can be replaced by that of (VIII. 16).<br />

The elegance of this approach is that the details and construction of the measuring apparatus are<br />

discussed. The presence of a metastable state indeed seems to be an essential aspect, as in, for example,<br />

the Geiger counter, or the bubble chamber using superheated liquids. But the introduction of irreversible<br />

processes asks for a modification of the unitary evolution and therefore of the Schrödinger<br />

postulate. Just as in the quantum theory of Ghirardi, Rimini and Weber, this is a fundamental modification<br />

of quantum mechanics.<br />

VIII. 4. 7<br />

MODAL INTERPRETATION<br />

This option to solve the measurement problem is provided by the so-called modal interpretation,<br />

introduced by B.C. van Fraassen (1979) and developed by S. Kochen (1985), D. Dieks (1989) and<br />

R. Healey (1989). Overviews are given by Vermaas (1999), and Dieks and Vermaas (1998).<br />

In the modal interpretation the projection postulate is removed together with a part of the property<br />

postulate, while the measurement postulate is replaced by a postulate saying that every vector of the<br />

form<br />

$$|\psi\rangle = \sum_j c_j \, |a_j\rangle \otimes |r_j\rangle \qquad \text{(VIII. 21)}$$<br />

describes the situation in which system 1 has, as a property, the value $a_j$ for the quantity A corresponding<br />

to the operator which is determined by the basis $\{|a_j(t)\rangle\}$ and in which, similarly, system 2<br />

has the value $r_j$. Each of these states has a probability $|c_j|^2$ to be realized. This is not different from<br />

the usual ‘ignorance interpretation’ of probabilities. Finally, the Schrödinger postulate is declared to<br />

be valid universally, it is, therefore, also effective during the measurement process.<br />

An important theorem by E. Schmidt, the so-called (biorthogonal) decomposition theorem, says<br />

that for every composite system the expansion of a state $|\psi\rangle$ in the form (VIII. 21) is unique as long<br />

as $|c_j| \neq |c_k|$ for $j \neq k$. Therefore it is possible for every state $|\psi\rangle$ for which this holds to exactly<br />

indicate the potential corresponding properties. A generalization to mixed states can be achieved by<br />

taking the spectral decomposition of W of the composite system as the preferred decomposition, the<br />

Schmidt decomposition (VIII. 21) is then found for the special case of pure states.<br />

The idea that the meaning of the state vector can be exclusively formulated in terms of measurements<br />

is rejected, the state vector describes factual properties. The description by the wave function<br />

is, however, incomplete, |ψ⟩ determines the possibilities and the probabilities of the possibilities, but<br />

the real physical situation is not determined. Quantum mechanics is fundamentally indeterministic<br />

because sometimes one possibility, at other times another one occurs.<br />

Moreover, in this interpretation the ‘only if’ part of the property postulate is rejected, if a system<br />

is in an eigenstate it has indeed the corresponding eigenvalue, but not ‘only if’; a system which is<br />

in a superposition of eigenstates, (VIII. 21), nevertheless has one of the properties. In the first case<br />

a composite physical system necessarily has the property, in the second case contingently. In logic<br />

the words ‘necessarily’ and ‘contingently’ are called ‘modalities’, hence the name modal interpretation. The projection<br />

postulate is now superfluous.<br />

If, however, the singlet state, being a state of a composite system also, is considered in the modal<br />

interpretation, this interpretation tells us less than quantum mechanics with the property postulate<br />

does.<br />

◃ Remarks<br />

In this interpretation, the metastability or possibly permanent nature of the quantities of system 2 plays<br />

no role in attributing properties. Another point in this interpretation is that, besides the Schrödinger<br />

dynamics for the state, there seems to be a need for a dynamics describing how properties change in<br />

time. Several attempts have been made to that end. ▹<br />

EXERCISE 38. What does quantum mechanics with the property postulate say about the EPRB<br />

experiment, p. 139, that the modal interpretation does not say, and why? Does it help to couple a<br />

measuring apparatus to the composite system of the two spin particles?<br />

VIII. 4. 8<br />

DECOHERENCE<br />

Finally we will discuss the option which is possibly supported by the majority of physicists, see<br />

H.J. Groenewold (1946), K. Gottfried (1989), N.G. van Kampen (1988), W.H. Zurek (1981 and 1982).<br />

Bell (1990) named this option the For All Practical Purposes solution, briefly FAPP. The idea is to<br />

show that the difference between the pure state (VIII. 15) and the mixed state (VIII. 16) is hardly<br />

perceptible in practice.<br />

A measuring apparatus is a macroscopic system which is in continuous interaction with its surroundings.<br />

A more realistic representation of the measurement process will therefore be of the



form (VIII. 11), but with a very large number of terms and factors, e.g.<br />

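$$|\psi\rangle \otimes |r_0\rangle \otimes |s_0\rangle \otimes \cdots \otimes |t_0\rangle \;\longrightarrow\; \sum_j c_j \, |a_j\rangle \otimes |r_j\rangle \otimes |s_j\rangle \otimes \cdots \otimes |t_j\rangle. \qquad \text{(VIII. 22)}$$<br />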

In practice, the coherence between the various terms of the superposition will rapidly be lost because<br />

this coherence can only be revealed if the expectation values of the quantities contain cross<br />

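terms. To see this, consider a quantity which is a product of quantities of the various partial systems<br />

$S, M, M', \ldots, M''$, for instance, of the form $\tilde{A} \otimes \tilde{R} \otimes \tilde{S} \otimes \cdots \otimes \tilde{T}$, or a summation thereof,<br />

which contains non-zero off-diagonal matrix elements. This means that we assume that<br />

$$\bigl( \langle a_{i'}| \otimes \langle r_{j'}| \otimes \cdots \otimes \langle t_{k'}| \bigr) \, \tilde{A} \otimes \tilde{R} \otimes \tilde{S} \otimes \cdots \otimes \tilde{T} \, \bigl( |a_i\rangle \otimes |r_j\rangle \otimes \cdots \otimes |t_k\rangle \bigr)$$<br />

$$= \langle a_{i'} | \tilde{A} | a_i \rangle \, \langle r_{j'} | \tilde{R} | r_j \rangle \cdots \langle t_{k'} | \tilde{T} | t_k \rangle \qquad \text{(VIII. 23)}$$<br />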

does not exclusively contain diagonal terms. However, in practice such quantities cannot be measured;<br />

as soon as we do not measure one of the partial systems the coherence is already broken.<br />

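For example, because of the orthogonality of the states $|s_j\rangle$, the expectation value of the quantity<br />

$\tilde{Q} \otimes \tilde{R} \otimes \mathbb{1} \otimes \cdots \otimes \tilde{T}$ in the state (VIII. 22) is equal to that in the mixed state<br />

$$W'' = \sum_j |c_j|^2 \, \bigl( |a_j\rangle \otimes |r_j\rangle \otimes |s_j\rangle \otimes \cdots \otimes |t_j\rangle \bigr) \bigl( \langle a_j| \otimes \langle r_j| \otimes \langle s_j| \otimes \cdots \otimes \langle t_j| \bigr). \qquad \text{(VIII. 24)}$$<br />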



The step from the pure state (VIII. 22) to the mixture (VIII. 24) is therefore justified by limiting<br />

ourselves to practically realizable states.<br />
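
The loss of coherence described above can be checked in a toy model. The following sketch makes several assumptions that are not in the text: the systems S, M and M′ are each two-dimensional, the coefficients are c = (0.6, 0.8), and the matrices standing in for $\tilde{A}$, $\tilde{R}$ and $\tilde{S}$ are arbitrary Hermitian matrices with non-zero off-diagonal elements. It compares expectation values in the pure state of the form (VIII. 22) with those in the mixture of the form (VIII. 24), once with the identity on M′ and once with all three systems measured.<br />

```python
def kron(a, b):
    """Kronecker (direct) product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def expect_pure(op, vec):
    """<Psi| op |Psi> for a real or complex vector Psi."""
    n = len(vec)
    return sum(vec[m].conjugate() * op[m][k] * vec[k]
               for m in range(n) for k in range(n))

def expect_mixture(op, weights, indices):
    """Tr(W op) for W = sum_j weights[j] |e_j><e_j| on basis vectors e_j."""
    return sum(w * op[i][i] for w, i in zip(weights, indices))

# Basis index 4*j + 2*k + l stands for |a_j> (x) |r_k> (x) |s_l>.
c = [0.6, 0.8]
psi = [0.0] * 8
psi[0], psi[7] = c[0], c[1]      # c_0 |a_0 r_0 s_0> + c_1 |a_1 r_1 s_1>
weights = [c[0] ** 2, c[1] ** 2]
indices = [0, 7]

# Assumed Hermitian matrices with non-zero off-diagonal elements.
a_tilde = [[1.0, 1.0], [1.0, 2.0]]
r_tilde = [[1.0, 1.0], [1.0, 3.0]]
s_tilde = [[1.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

skip_one = kron(kron(a_tilde, r_tilde), identity)    # M' is not measured
measure_all = kron(kron(a_tilde, r_tilde), s_tilde)  # every system measured

pure_skip = expect_pure(skip_one, psi)
mixed_skip = expect_mixture(skip_one, weights, indices)
pure_all = expect_pure(measure_all, psi)
mixed_all = expect_mixture(measure_all, weights, indices)

print(pure_skip, mixed_skip, pure_all, mixed_all)
```

With the identity on M′ the two expectation values coincide, whereas the quantity that involves all three systems still distinguishes the pure state from the mixture through its cross terms.<br />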

At first sight, this reasoning is in every way reasonable. Of course, the reasoning only refers to<br />

a particular class of quantities; a physical quantity for a composite system is certainly not always a<br />

direct product or a summation thereof. But it can be maintained that quantities which are not direct<br />

products are even harder to measure in practice. It is, however, beyond doubt that experimentally<br />

distinguishing the pure state (VIII. 22) from the mixed state (VIII. 24) using macroscopic quantities<br />

will be extremely difficult.<br />

Bell considers this FAPP solution a pitfall; he speaks of the FAPP-trap. He emphasizes that the<br />

measurement problem is not a practical but a fundamental problem. The core of the problem is whether,<br />

after the measurement process, certain properties are present in the measuring apparatus. The FAPP<br />

reasoning shows that, generally, in practice the system behaves as if it had those properties, but it<br />

leaves untouched the fact that ‘in reality’ the system does not have those properties, and that, if our<br />

experimental possibilities were more ample, this could also be demonstrated experimentally.<br />

EXERCISE 39. Show that, using the physical quantity corresponding to the operator $|\Psi\rangle \langle\Psi|$, in<br />

which $|\Psi\rangle$ is the right-hand side of (VIII. 22), an experimental distinction can be made between the<br />

pure state (VIII. 22) and the mixed state (VIII. 24).<br />

VIII. 5<br />

INCOMPATIBLE QUANTITIES<br />

So far we considered measuring a single physical quantity or two compatible, or commeasurable,<br />

physical quantities of the object system, where compatible quantities are quantities corresponding<br />

to commuting operators. The simple measurement theory (VIII. 2), however, enables us to discuss<br />

also the measurement of incompatible quantities.<br />

Let A and B be two arbitrary, incompatible quantities of the object system S corresponding<br />

to the maximal operators A and B. Measuring apparatus M 1 measures A and apparatus M 2 measures<br />

B. The pointer observables of the apparatuses are R and T , corresponding to the operators<br />

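R and T, the eigenstates are $|a_j\rangle$, $|b_j\rangle$, $|r_j\rangle$, $|t_j\rangle$, respectively. The initial state is $|\psi\rangle \otimes |r_0\rangle \otimes |t_0\rangle$<br />

in $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_1 \otimes \mathcal{H}_2$, and with $\dim \mathcal{H}_S = N_S$,<br />

$$|\psi\rangle = \sum_{j=1}^{N_S} \langle a_j | \psi \rangle \, |a_j\rangle = \sum_{k=1}^{N_S} \langle b_k | \psi \rangle \, |b_k\rangle. \qquad \text{(VIII. 25)}$$<br />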



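Now we first measure A and next B. The measurement scheme (VIII. 4) gives<br />

$$|\psi\rangle \otimes |r_0\rangle \otimes |t_0\rangle \;\xrightarrow{\;A\;}\; \sum_{j=1}^{N_S} \langle a_j | \psi \rangle \, |a_j\rangle \otimes |r_j\rangle \otimes |t_0\rangle$$<br />

$$\xrightarrow{\;B\;}\; \sum_{j=1}^{N_S} \sum_{k=1}^{N_S} \langle a_j | \psi \rangle \langle b_k | a_j \rangle \, |b_k\rangle \otimes |r_j\rangle \otimes |t_k\rangle. \qquad \text{(VIII. 26)}$$<br />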

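If we first measure B and then A, we have<br />

$$|\psi\rangle \otimes |r_0\rangle \otimes |t_0\rangle \;\xrightarrow{\;B\;}\; \sum_{k=1}^{N_S} \langle b_k | \psi \rangle \, |b_k\rangle \otimes |r_0\rangle \otimes |t_k\rangle$$<br />

$$\xrightarrow{\;A\;}\; \sum_{k=1}^{N_S} \sum_{j=1}^{N_S} \langle b_k | \psi \rangle \langle b_k | a_j \rangle^{*} \, |a_j\rangle \otimes |r_j\rangle \otimes |t_k\rangle. \qquad \text{(VIII. 27)}$$<br />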

We see that the final states (VIII. 26) and (VIII. 27) differ from each other. For the probability to get<br />

for A the outcome $a_j$ and for B the outcome $b_k$ we find<br />

$$\mathrm{Prob}_{A,B}(R : r_j \wedge T : t_k) = |\langle a_j | \psi \rangle|^2 \, |\langle b_k | a_j \rangle|^2 \qquad \text{(VIII. 28)}$$<br />

and<br />

$$\mathrm{Prob}_{B,A}(T : t_k \wedge R : r_j) = |\langle b_k | \psi \rangle|^2 \, |\langle a_j | b_k \rangle|^2. \qquad \text{(VIII. 29)}$$<br />

The good thing is that the measurement theory enables us to make a statement about measurements<br />

of the incompatible quantities A and B which are done after each other, on the basis of the,<br />

possibly simultaneous, measurements of the compatible quantities R and T .<br />

EXERCISE 40. Why are R and T compatible?<br />

We see that the order in which A and B are measured is important. Here the result of the ‘measurement<br />

disturbance’ develops within the framework of the unitary time evolution of the state.<br />

For the conditional probability to find $b_k$ if we have found $a_j$, and vice versa, we find,<br />

with (VIII. 28) and (VIII. 29), $|\langle b_k | a_j \rangle|^2$ and $|\langle a_j | b_k \rangle|^2$, respectively, and we see that they are equal.<br />
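
A small numerical example may help to see both statements at once: the joint probabilities depend on the order of the measurements, while the conditional probabilities do not. The sketch below assumes a two-dimensional object system with two mutually unbiased bases as eigenvectors of A and B and an initial state with real components (0.6, 0.8); these choices and all function names are illustrative and not taken from the text.<br />

```python
import math

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

s = 1.0 / math.sqrt(2.0)
a_basis = [[1.0, 0.0], [0.0, 1.0]]   # assumed eigenvectors |a_j> of A
b_basis = [[s, s], [s, -s]]          # assumed eigenvectors |b_k> of B
psi = [0.6, 0.8]                     # assumed initial state of S

def prob_ab(j, k):
    """(VIII. 28): first A with outcome a_j, then B with outcome b_k."""
    return (abs(inner(a_basis[j], psi)) ** 2
            * abs(inner(b_basis[k], a_basis[j])) ** 2)

def prob_ba(j, k):
    """(VIII. 29): first B with outcome b_k, then A with outcome a_j."""
    return (abs(inner(b_basis[k], psi)) ** 2
            * abs(inner(a_basis[j], b_basis[k])) ** 2)

def cond_b_given_a(j, k):
    """Probability of b_k given that the earlier A measurement gave a_j."""
    return prob_ab(j, k) / sum(prob_ab(j, m) for m in range(2))

def cond_a_given_b(j, k):
    """Probability of a_j given that the earlier B measurement gave b_k."""
    return prob_ba(j, k) / sum(prob_ba(m, k) for m in range(2))

for j in range(2):
    for k in range(2):
        print(j, k, prob_ab(j, k), prob_ba(j, k),
              cond_b_given_a(j, k), cond_a_given_b(j, k))
```

In this example the joint probabilities for j = k = 0 are 0.18 and 0.49 for the two orders, whereas both conditional probabilities equal 0.5.<br />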

This can be generalized easily. If we successively measure the discrete quantities $A, A', A'', \ldots$,<br />

having eigenvalues $a_i, a'_j, a''_k, \ldots$, the probability to find, given that measurement of A yielded the<br />

outcome $a_i$, for $A'$ the outcome $a'_j$ and for $A''$ the outcome $a''_k$, etc. is equal to<br />

$$\mathrm{Prob}\,( \cdots A'' : a''_k \wedge A' : a'_j \mid A : a_i )$$<br />

$$= \cdots |\langle a''_k | a'_j \rangle|^2 \, |\langle a'_j | a_i \rangle|^2 = \langle a_i | a'_j \rangle \langle a'_j | a''_k \rangle \cdots \langle a''_k | a'_j \rangle \langle a'_j | a_i \rangle$$<br />

$$= \langle a_i | P'_j P''_k \cdots P''_k P'_j | a_i \rangle = \mathrm{Tr}\, P_i P'_j P''_k \cdots P''_k P'_j. \qquad \text{(VIII. 30)}$$<br />

This result does not apply to degenerate eigenvalues.<br />



We can consider (VIII. 30) to be the most general statement of quantum mechanics for maximal,<br />

discrete quantities; a probability statement concerning the occurrence of correlations between the<br />

outcomes of consecutive measurements. Empirically speaking, all of physics is about such statements,<br />

including classical physics. But classical physics permits us to associate with it a picture of physical<br />

systems as scraps and pieces of matter with properties, moving through space, while in quantum<br />

mechanics such a picture is not available.<br />

◃ Remark<br />

If we measure on S, as in (VIII. 2), the same quantity for a number of times, we will always find the<br />

same outcome. Then the projections in (VIII. 30) are orthogonal and<br />

$$\mathrm{Tr}\, P_i P_j P_k \cdots P_k P_j = \delta_{ij} \delta_{jk} \cdots. \qquad \text{(VIII. 31)}$$ ▹<br />

VIII. 6<br />

COMMENTS ON THE THEORY OF MEASUREMENT<br />

Although the measurement scheme (VIII. 2) seems evident, it is not entirely so. To show this, we<br />

start with deriving a desired consequence from it; only physical quantities which correspond to normal<br />

operators are measurable. The pointer states of the measuring apparatus have to be macroscopically<br />

distinguishable, which means that the eigenstates $|r_j\rangle$ of operator R are orthonormal, $\langle r_j | r_k \rangle = \delta_{jk}$,<br />

since R corresponds to the observable pointer position R of the measuring apparatus. Because the<br />

measurement interaction is unitary, it holds that<br />

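$$\bigl( \langle a_i| \otimes \langle r_0| \bigr) \bigl( |a_j\rangle \otimes |r_0\rangle \bigr) = \bigl( \langle a_i| \otimes \langle r_i| \bigr) \bigl( |a_j\rangle \otimes |r_j\rangle \bigr), \qquad \text{(VIII. 32)}$$<br />

or<br />

$$\langle a_i | a_j \rangle \langle r_0 | r_0 \rangle = \langle a_i | a_j \rangle \langle r_i | r_j \rangle = \delta_{ij}, \qquad \text{(VIII. 33)}$$<br />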

and therefore, $\langle a_i | a_j \rangle = 0$ if $i \neq j$, where the $|a_j\rangle$ are again the eigenvectors of the maximal operator<br />

A, introduced on p. 166. The $|a_j\rangle$ are thus orthonormal and can therefore be a basis. According<br />

to the spectral theorem of p. 26, every basis generates, by means of the projectors projecting on the<br />

elements of the basis, a normal operator. The physical quantity A indeed corresponds to the normal<br />

operator A and, representing a physical quantity, the eigenvalues of A are real. Consequently, A is,<br />

on finite-dimensional Hilbert spaces, self-adjoint.<br />

Now we will discuss some points of criticism. The measurement scheme (VIII. 2) is strongly<br />

idealized. It does not say anything about the physical nature of measurements, which are nearly<br />

always of electromagnetic nature. In case of a concrete description, the evolution operator U (t) will<br />

have to represent something, i.e., a Hamiltonian H is needed which generates this evolution by means<br />

of $U(t) = e^{-iHt}$. In general, A and U will not commute, in which case the $|a_j\rangle$ do not transform<br />

into themselves, unless the duration of the measurement is ‘sufficiently short’. But the question what,<br />

in this connection, is sufficiently short cannot be answered without discussing the characteristics of U<br />

and H.<br />

Likewise, complying with the conservation laws evokes problems as is shown in a theorem by<br />

Wigner (1952) and Araki and Yanase (1960).



THE WIGNER-ARAKI-YANASE THEOREM:<br />

The evolution U(τ), which brings about the measurement transition (VIII. 4) when measuring<br />

physical quantity A, is possible iff A commutes with all additive conserved quantities<br />

of the composite system of object system and measuring apparatus. In other words,<br />

conserved physical quantities which are not additive, additive physical quantities which<br />

are not conserved, and physical quantities which are neither conserved nor additive, cannot<br />

be measured exactly.<br />

Proof<br />

Here we will only prove the ‘only if’ part of the theorem. Let B be an additive conserved quantity of<br />

the composite system SM, i.e., B is, by definition, of the form<br />

B = B 1 ⊗ 11 + 11 ⊗ B 2 , (VIII. 34)<br />

which is conserved. This means that B commutes with the Hamiltonian H of the composite<br />

system,<br />

[B, H] = 0. (VIII. 35)<br />

Then B also commutes with every function of H and therefore with U (τ) = e^(−i H τ) ,<br />

[B, U (τ)] = 0 =⇒ B = U † (τ) B U (τ). (VIII. 36)<br />

Consider the matrix element<br />

B jk := ⟨a j | ⊗ ⟨r 0 | B |a k ⟩ ⊗ |r 0 ⟩. (VIII. 37)<br />

On the one hand, because of the additivity of B, (VIII. 34), we have<br />

B jk = ⟨a j | ⊗ ⟨r 0 | (B 1 ⊗ 11 + 11 ⊗ B 2 ) |a k ⟩ ⊗ |r 0 ⟩<br />

= ⟨a j | B 1 | a k ⟩ + δ jk ⟨r 0 | B 2 | r 0 ⟩, (VIII. 38)<br />

while on the other hand, using (VIII. 36), we see that<br />

B jk = ⟨a j | ⊗ ⟨r 0 | U † (τ) B U (τ) |a k ⟩ ⊗ |r 0 ⟩ = ⟨a j | ⊗ ⟨r j | B | a k ⟩ ⊗ |r k ⟩<br />

= δ jk ⟨a j | B 1 | a k ⟩ + δ jk ⟨r j | B 2 | r k ⟩. (VIII. 39)<br />

Comparison of these two results shows that<br />

⟨a j | B 1 | a k ⟩ = 0 for j ≠ k, (VIII. 40)<br />

which means that in the basis {|a j ⟩} of H S , B 1 is in diagonal form and therefore A commutes<br />

with B 1 . □<br />
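The final inference, that B 1 being diagonal in the eigenbasis of A is the same as A commuting with B 1 , can be checked on a small example. The sketch below assumes a spin-1/2 object system with B 1 = σ_z and compares A = σ_z with A = σ_x; the matrices and helper names are illustrative choices, not taken from the text.

```python
def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(x, y):
    """[x, y] = xy - yx."""
    xy, yx = matmul(x, y), matmul(y, x)
    return [[xy[i][j] - yx[i][j] for j in range(len(x))] for i in range(len(x))]

def is_zero(m, tol=1e-12):
    return all(abs(v) < tol for row in m for v in row)

def matrix_element(bra, op, ket):
    """<bra| op |ket> for real vectors."""
    return sum(bra[i] * op[i][j] * ket[j]
               for i in range(len(bra)) for j in range(len(ket)))

SIGMA_Z = [[1.0, 0.0], [0.0, -1.0]]
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]
B1 = SIGMA_Z  # assumed object part of the additive conserved quantity

# Eigenbases |a_j> of A = sigma_z and of A = sigma_x.
basis_z = [[1.0, 0.0], [0.0, 1.0]]
r = 2 ** -0.5
basis_x = [[r, r], [r, -r]]

off_diag_z = matrix_element(basis_z[0], B1, basis_z[1])  # <a_0|B1|a_1> for A = sigma_z
off_diag_x = matrix_element(basis_x[0], B1, basis_x[1])  # <a_0|B1|a_1> for A = sigma_x

print("A = sigma_z: off-diagonal element", off_diag_z,
      "commutes:", is_zero(commutator(SIGMA_Z, B1)))
print("A = sigma_x: off-diagonal element", off_diag_x,
      "commutes:", is_zero(commutator(SIGMA_X, B1)))
```

In this example the off-diagonal element required to vanish by (VIII. 40) is nonzero for A = σ_x, so under the assumed conservation law the scheme (VIII. 4) cannot hold exactly for that quantity.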

This theorem shows that the scheme (VIII. 4) can, strictly speaking, apply only to measurement of<br />

quantities which commute with all additive conserved quantities. However, the measurement scheme<br />

remains approximately valid if the value of the conserved quantity is large, which will easily be the<br />

case for macroscopic apparatuses. We therefore see that, whereas the U (t) in (VIII. 2) exists, a more<br />

concrete interpretation can run into problems. The shortcomings of the conventional formalism<br />

of quantum mechanics with regard to giving a faithful description of the measurement process have<br />

led to interesting extensions of the formalism, see, for instance, Busch, Lahti and Mittelstaedt (1991).


A<br />

GLEASON’S THEOREM<br />

Proofs really aren’t there to convince you that something is true - they’re there to show<br />

you why it is true.<br />

— Andrew Gleason<br />

Of course mathematics works in physics! It is designed to discuss exactly the situation<br />

that physics confronts; namely, that there seems to be some order out there - let’s find<br />

out what it is.<br />

— Andrew Gleason<br />

In section III. 2 we mentioned that Von Neumann suggested for a quantum mechanical probability<br />

measure the trace formula Tr P W , with P a projector. Gleason’s theorem shows that<br />

this probability measure in fact characterizes all probability measures on P (H), the set of all<br />

projectors on H. Since Gleason’s original proof is very difficult, in this appendix we will give a<br />

simplified version by proving the theorem for pure states only.<br />

A. 1 INTRODUCTION<br />

Let H be a real or complex Hilbert space with dim H > 2, and P (H) the set of all projectors<br />

on H. Let µ be a mapping µ : P (H) → [0, 1]. This µ is called a measure on H if it is additive,<br />

satisfying<br />

P i ⊥ P j =⇒ µ(P i + P j ) = µ(P i ) + µ(P j ) ∀ P i , P j ∈ P (H) (A. 1)<br />

µ(0 ) = 0 and µ(11) = 1. (A. 2)<br />

Combination of (A. 1) and the last requirement of (A. 2) implies that µ attributes the value 1 to any<br />

orthogonal decomposition of unity.<br />

In section III. 2, p. 46, we saw that pure states are represented by the extreme elements of a convex<br />

set, and by proving the theorem on p. 49 we showed that the extreme elements of the convex set S(H)<br />

of state operators on H are the 1 - dimensional projectors in P (H). Consequently, the measure µ is<br />

called extreme if there exists a 1 - dimensional projector P such that<br />

µ(P ) = 1. (A. 3)<br />

This is also expressed by saying that µ is concentrated on P . We can now formulate Gleason’s<br />

theorem for pure states.



GLEASON’S THEOREM FOR PURE STATES:<br />

Under the condition that dim H > 2, a 1 - dimensional projector P 0 ∈ P (H) exists on<br />

which the measure µ : P (H) → [0, 1] is concentrated, such that<br />

µ(P ) = Tr P 0 P (A. 4)<br />

for all P ∈ P (H).<br />

The original proof by A.M. Gleason uses sophisticated mathematical methods and is rather<br />

opaque. Several authors have attempted to give a simpler proof, particularly<br />

C. Piron (1976), J. Dorling (unpublished) and R. Cooke, M. Keane and B. Moran (1985), where<br />

the commentaries on the ‘elementary proof’ of Cooke, Keane and Moran by R.I.G. Hughes (1989)<br />

are clarifying.<br />

The following proof is a mixture of all this work. It consists of four steps, which, incidentally, do<br />

not coincide with the sections.<br />

A. 2 CONVERSION TO A 3 - DIMENSIONAL REAL PROBLEM<br />

Before taking the first step, we discuss a number of simple observations. First, the probability<br />

measure of Gleason’s theorem has to be continuous in P . In section III. 1, p. 48, we showed that<br />

discontinuous probability measures exist for dim H = 2. Therefore, the requirement dim H > 2<br />

holds without further mentioning throughout this appendix. Second, since the trace of a projector P<br />

is equal to the dimension of the subspace onto which it projects, the trace of a 1 - dimensional projector<br />

is 1, which yields for µ being concentrated on P 0<br />

µ(11) = Tr P 0 11 = 1, (A. 5)<br />

in accordance with (A. 2) and (A. 4). Third, every measure is entirely determined by giving its values<br />

on the 1 - dimensional projectors, and, since every higher - dimensional projector P is the sum<br />

of orthogonal 1 - dimensional projectors P i we can, with (A. 1), determine µ (P ) from the values<br />

of µ(P i ). Fourth, for every Hilbert space, (A. 4) at the same time defines an extreme measure µ on H<br />

which is concentrated on P 0 , and as of now we will indicate this measure by µ 0 ,<br />

µ 0 (P ) := Tr P 0 P. (A. 6)<br />

Using the idempotence of P 0 we have<br />

µ 0 (P 0 ) = Tr P 0 ² = Tr P 0 = 1, (A. 7)<br />

from which we see that (A. 4) holds for µ = µ 0 and P = P 0 . Since this measure, being concentrated<br />

on P 0 , assigns the value 0 to all projectors orthogonal to P 0 , it can also easily be verified that this<br />

measure satisfies the requirements (A. 1) and (A. 2).<br />
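A small numerical check of these observations may be helpful. The sketch below works in a real 3 - dimensional space, with an arbitrarily chosen unit vector for P 0 and an arbitrarily chosen orthonormal basis; these vectors and the helper names are assumptions made for illustration only.

```python
import math

def projector(v):
    """Rank-one projector |v><v| for a real unit vector v."""
    return [[a * b for b in v] for a in v]

def trace_product(p, q):
    """Tr(P Q) for square matrices given as nested lists."""
    n = len(p)
    return sum(p[i][k] * q[k][i] for i in range(n) for k in range(n))

def add(p, q):
    """Matrix sum P + Q."""
    return [[a + b for a, b in zip(row_p, row_q)] for row_p, row_q in zip(p, q)]

# Assumed example: P0 projects on a tilted unit vector e0, and (f1, f2, f3)
# is an orthonormal basis of R^3 obtained by a rotation about the third axis.
e0 = [1 / math.sqrt(3)] * 3
c, s = math.cos(0.7), math.sin(0.7)
f1, f2, f3 = [c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]

P0 = projector(e0)
P1, P2, P3 = projector(f1), projector(f2), projector(f3)

def mu0(p):
    """The special measure (A. 6): mu0(P) = Tr P0 P."""
    return trace_product(P0, p)

print("mu0(P0)            =", mu0(P0))
print("mu0(P1 + P2)       =", mu0(add(P1, P2)))
print("mu0(P1) + mu0(P2)  =", mu0(P1) + mu0(P2))
print("sum over the basis =", mu0(P1) + mu0(P2) + mu0(P3))
```

Additivity (A. 1) follows here from the linearity of the trace, and the sum over an orthonormal basis equals Tr P 0 = 1.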

The foregoing observations lead to the conclusion that to prove Gleason’s theorem for pure states<br />

we have to prove that µ = µ 0 for all P ∈ P (H). Now we will take the first step.



A. 2. 1 STEP 1<br />

THEOREM 1:<br />

If Gleason’s theorem for pure states is true for any 3 - dimensional real Hilbert space, it<br />

is also true for any complex Hilbert space with dim H > 2.<br />

We will prove theorem 1 using a proof by contradiction.<br />

Proof<br />

Let H be a complex Hilbert space with dim H > 3 for which Gleason’s theorem is not true. Since<br />

all higher - dimensional projectors can be decomposed to 1 - dimensional projectors, it suffices to<br />

prove this theorem for 1 - dimensional projectors.<br />

Assume a measure µ on H exists, which is concentrated on P 0 ∈ P (H) such that µ(P 0 ) = 1,<br />

but differs from the measure µ 0 defined by (A. 6) in the sense that there is some 1 - dimensional<br />

projector P 1 for which the theorem does not hold,<br />

µ 0 (P 1 ) := Tr P 0 P 1 ≠ µ(P 1 ). (A. 8)<br />

First we will show that, if these measures differ on a higher - dimensional Hilbert space, they also<br />

differ on a 3 - dimensional Hilbert space.<br />

Using the projectors P 0 and P 1 , we can construct a set of three orthogonal 1 - dimensional projectors<br />

P 0 , ˜P 1 , P 2 in the following way. With P 0 = |e 0 ⟩ ⟨e 0 | and P 1 = |e 1 ⟩ ⟨e 1 |, construct a<br />

unit vector |ẽ 1 ⟩ in the plane spanned by |e 0 ⟩ and |e 1 ⟩ which is perpendicular to |e 0 ⟩, i.e.<br />

|ẽ 1 ⟩ ∝ (11 − P 0 ) |e 1 ⟩, (A. 9)<br />

as can be seen in figure A. 1. Then the projector ˜P 1 := |ẽ 1 ⟩ ⟨ẽ 1 | is perpendicular to P 0 . 1<br />

Figure A. 1: Construction of a 3 - dimensional subspace E<br />

Let P 2 be a 1 - dimensional projector which is perpendicular to both P 0 and ˜P 1 ; it is always possible<br />

to choose such a projector because dim H > 3. With P 2 = |e 2 ⟩ ⟨e 2 |, the three orthonormal<br />

vectors |e 0 ⟩,|ẽ 1 ⟩ and |e 2 ⟩ together span a 3 - dimensional Hilbert space, which is a subspace of H.<br />

We will call this space E, and, by construction, P 0 , P 1 , ˜P 1 , P 2 ∈ P (E).<br />
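The construction of the orthonormal triple can be sketched numerically. The example below assumes real vectors in R 3 and an arbitrary choice of |e 0 ⟩ and |e 1 ⟩ (both hypothetical); it builds |ẽ 1 ⟩ from (A. 9), picks |e 2 ⟩ as a cross product, which is only available in three real dimensions, and compares |ẽ 1 ⟩ ⟨ẽ 1 | with the closed expression (1 − Tr P 0 P 1 )^(−1) (P 1 + P 0 P 1 P 0 − P 1 P 0 − P 0 P 1 ).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(a / n for a in v)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def outer(v):
    """Projector |v><v| for a real unit vector."""
    return [[a * b for b in v] for a in v]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trace(p):
    return sum(p[i][i] for i in range(3))

# Assumed example vectors.
e0 = (0.0, 0.0, 1.0)
e1 = normalize((1.0, 2.0, 2.0))

# (A. 9): remove the component of e1 along e0, then normalise.
overlap = dot(e0, e1)
e1_tilde = normalize(tuple(b - overlap * a for a, b in zip(e0, e1)))
e2 = cross(e0, e1_tilde)  # third orthonormal direction

# Closed expression for the projector on e1_tilde.
P0, P1 = outer(e0), outer(e1)
P0P1, P1P0 = matmul(P0, P1), matmul(P1, P0)
P0P1P0 = matmul(P0P1, P0)
scale = 1.0 / (1.0 - trace(P0P1))
P1_tilde_formula = [[scale * (P1[i][j] + P0P1P0[i][j] - P1P0[i][j] - P0P1[i][j])
                     for j in range(3)] for i in range(3)]
P1_tilde_direct = outer(e1_tilde)

max_diff = max(abs(P1_tilde_formula[i][j] - P1_tilde_direct[i][j])
               for i in range(3) for j in range(3))
print("e1_tilde =", e1_tilde)
print("e2       =", e2)
print("largest deviation between the two projectors:", max_diff)
```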

1 To be exact,<br />

˜P 1 = (1 − Tr P 0 P 1 )^(−1) (P 1 + P 0 P 1 P 0 − P 1 P 0 − P 0 P 1 ).



Now we have the following statements,<br />

(a) P (E) ⊂ P (H),<br />

(b) the restriction of µ 0 to P (E) is a measure on P (E),<br />

(c) the restriction of µ to P (E) is a measure on P (E),<br />

(d) the measures µ 0 and µ differ on P (E).<br />

Statement (a) follows immediately from E ⊂ H. The statements (b) and (c) follow from the<br />

fact that both µ 0 and µ, being concentrated on P 0 , assign the value 1 to E, thereby assigning<br />

the value 0 to all subspaces of H perpendicular to E. Statement (d) follows from our assumption<br />

(A. 8).<br />

Next, we have to show that the Hilbert space E can be real. A Hilbert space is real if scalar<br />

multiplication and linear combinations of vectors are only carried out with real coefficients and<br />

the inner products are real. Choosing the vectors |e 0 ⟩, |ẽ 1 ⟩ and |e 2 ⟩, we have the freedom to<br />

absorb an arbitrary phase factor, which means that we can also take them real. Furthermore, we<br />

can exploit that freedom to bring about that the vector |e 1 ⟩, lying in the plane spanned by |e 0 ⟩<br />

and |ẽ 1 ⟩, becomes a linear combination with real coefficients, i.e.,<br />

|e 1 ⟩ = a |e 0 ⟩ + b |ẽ 1 ⟩ with a, b ∈ R. (A. 10)<br />

All inner products of the four vectors |e 0 ⟩, |ẽ 1 ⟩, |e 2 ⟩ and |e 1 ⟩ now have a real value. The required<br />

real Hilbert space is obtained by taking all linear combinations of |e 0 ⟩, |ẽ 1 ⟩ and |e 2 ⟩ with real<br />

coefficients. Because both |e 0 ⟩ and |e 1 ⟩ are elements of this Hilbert space, (a) through (d) remain<br />

valid.<br />
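How the phase freedom makes the coefficients in (A. 10) real can be seen in a short sketch. The complex vectors below are arbitrary illustrative choices; the phase of ⟨e 0 | e 1 ⟩ is absorbed into |e 0 ⟩, which leaves the projector P 0 unchanged, and |ẽ 1 ⟩ is normalised with a positive factor.

```python
import math

def inner(u, v):
    """<u|v>, antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def normalize(v):
    n = math.sqrt(inner(v, v).real)
    return [x / n for x in v]

# Assumed example vectors in C^3.
e0 = normalize([1 + 1j, 0.5j, 2])
e1 = normalize([1, 1 - 1j, 0.3j])

# Absorb the phase of <e0|e1> into e0; |e0><e0| is unchanged by this.
overlap = inner(e0, e1)
phase = overlap / abs(overlap)
e0_real = [phase * x for x in e0]

a = inner(e0_real, e1)                       # now equal to |<e0|e1>|
residual = [y - a * x for x, y in zip(e0_real, e1)]
e1_tilde = normalize(residual)               # positive normalisation factor
b = inner(e1_tilde, e1)

reconstructed = [a * x + b * y for x, y in zip(e0_real, e1_tilde)]
print("a =", a)
print("b =", b)
```

Both printed coefficients have vanishing imaginary part up to rounding, and a |e 0 ⟩ + b |ẽ 1 ⟩ reproduces |e 1 ⟩.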

We see that, if Gleason’s theorem for pure states is not true for a complex Hilbert space with<br />

dim > 3, it is also not true for a real 3 - dimensional Hilbert space. Now assume that the theorem<br />

is proven to be true for a real Hilbert space with dim = 3. At the same time supposing that it is<br />

not true for a Hilbert space with dim > 3, so that it would, as we showed, also not be true for a<br />

real H with dim = 3, yields a contradiction. Therefore, theorem 1 is true. □<br />

A. 3 FORMULATION OF THE PROBLEM ON THE SURFACE OF A SPHERE<br />

While by proving theorem 1 we showed that, if Gleason’s theorem for pure states is true for a<br />

real, 3 - dimensional Hilbert space, it is also true for a complex Hilbert space with dim > 2, we did<br />

not prove that µ = µ 0 . In this section we will take the next steps towards proving that indeed µ = µ 0<br />

for all P ∈ P (H) in a real, 3 - dimensional Hilbert space.<br />

Conversion of an arbitrary complex Hilbert space to a 3 - dimensional real Hilbert space is convenient<br />

because this space is isomorphic with the usual 3 - dimensional Euclidean space R 3 . Here,<br />

the 1 - dimensional projectors correspond to lines through the origin, and we can identify them with<br />

points on the surface of a unit sphere, or actually, with half of the unit sphere because |e⟩ and −|e⟩ represent<br />

the same state. Those points will be designated by means of their spherical coordinates (θ, ϕ),<br />

or as points, or directions, on the surface of the unit sphere p, q, r, s, t, . . . , ∈ S 2 , where S 2 is the<br />

standard notation for this surface, and the index 2 refers to the fact that it is 2 - dimensional.



Letting lines through the origin represent 1 - dimensional projectors, the mapping µ is represented<br />

by a function µ which is a function of the points on S 2 or of the spherical coordinates of those points,<br />

having the following characteristics.<br />

The point p 0 , corresponding to the projector P 0 for which the measure µ is extreme, so that<br />

µ(p 0 ) = 1, (A. 11)<br />

is called the north pole by convention; the other 1 - dimensional projectors are represented by points on<br />

the northern hemisphere. The set of all 1 - dimensional projectors perpendicular to a given direction r<br />

is called a great circle with axis r, as can be seen in figure A. 2. The great circle representing the<br />

projectors perpendicular to P 0 is called the equator; for any point s on the<br />

equator it holds, according to (A. 1), (A. 11), and the requirement 0 ≤ µ ≤ 1, that<br />

µ(s) = 0. (A. 12)<br />

The requirement (A. 1) for µ to be a measure is, if µ is taken to be a function of points on the<br />

surface of the unit sphere, that for arbitrary, mutually perpendicular axes (r, s, t) in the northern<br />

hemisphere it holds that<br />

µ(r) + µ(s) + µ(t) = 1, (A. 13)<br />

while for µ taken as a function of the spherical coordinates (θ, ϕ) of the points of intersection of the<br />

arbitrary axes (r, s, t) with the surface of the unit sphere we have<br />

µ(θ r , ϕ r ) + µ(θ s , ϕ s ) + µ(θ t , ϕ t ) = 1 (A. 14)<br />

where for any ϕ it holds that<br />

µ(0, ϕ) = 1 and µ(π/2, ϕ) = 0, (A. 15)<br />

assigning the required values to the north pole and the equator.<br />

Since we are working in a real, 3 - dimensional Hilbert space, we can assign values to the special<br />

measure (A. 6) in accordance with Von Neumann’s value assignment (V. 33), p. 119. Using (III. 45),<br />

with P 0 = |e 0 ⟩ ⟨e 0 | and P s = |ψ⟩ ⟨ψ|,<br />

µ 0 (P s ) = Tr P 0 P s = |⟨ψ | e 0 ⟩|² , (A. 16)<br />

with θ s the angle between s and the north pole, the special measure can be written as<br />

µ 0 (s) = cos² θ s . (A. 17)<br />

We will come back to this value assignment in section A. 4. In the next two steps we will prove<br />

that any measure µ (s) satisfying the requirements (A. 11) to (A. 13), or (A. 14) and (A. 15), is a<br />

nonincreasing function in θ s , and does not depend on ϕ.



A. 3. 1 STEP 2<br />

THEOREM 2:<br />

If the function µ (s) or, equivalently, µ (θ s , ϕ s ), satisfies the requirements (A. 11)<br />

to (A. 15), then µ(s) is a nonincreasing function in θ s .<br />

We will prove this theorem using two lemmas.<br />

A. 3. 1. 1 LEMMA 1<br />

A LITTLE LEMMA:<br />

Let {s ∈ S 2 | s ⊥ r} be the great circle with axis r ≠ p 0 . Furthermore, let s 0 represent<br />

the most northern point of this circle. Then for all points s of this great circle it holds<br />

that<br />

µ(s 0 ) ≥ µ(s), (A. 18)<br />

i.e., if we let s travel along a great circle, µ(s) will have its maximum value in the most<br />

northern point s 0 .<br />

Proof<br />

Choose a set of three orthogonal directions r, s, t, with s ∈ S 2 an arbitrary point on the great<br />

circle around axis r. From (A. 13) we have<br />

µ(r) + µ(s) + µ(t) = 1. (A. 19)<br />

Now carry out a rotation of the orthogonal pair s and t around the axis r until s arrives at the most<br />

northern point s 0 of the great circle. Under this rotation t arrives at a point t ′ at the equator as<br />

can be seen in figure A. 2.<br />

Figure A. 2: Rotation of s to s 0 and t to t ′ along a great circle around axis r



Since r, s 0 and t ′ are still mutually orthogonal, we have<br />

µ(r) + µ(s 0 ) + µ(t ′ ) = 1, (A. 20)<br />

and combination with (A. 19) gives<br />

µ(s) + µ(t) = µ(s 0 ) + µ(t ′ ). (A. 21)<br />

But t ′ is on the equator, where, according to (A. 12), µ(t ′ ) = 0, and with 0 ≤ µ ≤ 1 we see that<br />

µ(s) = µ(s 0 ) − µ(t) ≤ µ(s 0 ). (A. 22)<br />

Therefore, on the great circle µ(s) has its largest value in the most northern point. □<br />
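For the special measure µ 0 (s) = cos² θ s the statement of the lemma can be checked directly. The sketch below is only an illustration for this one measure, not a substitute for the proof, which holds for any µ; the axis r is an arbitrary assumed choice.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(a / n for a in v)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

NORTH = (0.0, 0.0, 1.0)

def mu0(s):
    """Special measure: squared cosine of the angle between s and the north pole."""
    return dot(s, NORTH) ** 2

r = normalize((1.0, 2.0, 0.5))  # assumed axis of the great circle

# Most northern point: the north pole with its component along r removed.
s0 = normalize(tuple(p - dot(NORTH, r) * c for p, c in zip(NORTH, r)))
t_prime = cross(r, s0)          # completes the frame and lies on the equator

def point_on_circle(alpha):
    """Point of the great circle with axis r, at angle alpha from s0."""
    return tuple(math.cos(alpha) * a + math.sin(alpha) * b
                 for a, b in zip(s0, t_prime))

samples = [point_on_circle(2 * math.pi * k / 360) for k in range(360)]
print("mu0(s0)                 =", mu0(s0))
print("largest mu0 on samples  =", max(mu0(s) for s in samples))
print("height of t' (equator)  =", t_prime[2])
```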

A. 3. 1. 2 LEMMA 2<br />

PIRON’S GEOMETRIC LEMMA :<br />

If s and t are points on the northern hemisphere, and s lies more northwards than t,<br />

a curve from s to t can be found, consisting entirely of segments of great circles, each<br />

starting at its most northern point.<br />

The following proof of Piron’s geometric lemma using projective geometry has been given by<br />

Cooke, Keane and Moran (1985).<br />

Proof<br />

The surface of the northern hemisphere of the unit sphere can be projected bijectively from the<br />

origin onto the horizontal plane P tangent to the north pole, as can be seen in figure A. 3. Therefore,<br />

we can also formulate our problem in this plane.<br />

Figure A. 3: Projection of points on a great circle onto a plane P through the north pole



All great circles, except the equator, are projected onto this plane as straight lines. The most<br />

northern point of such a great circle is projected onto the point of its corresponding line that is<br />

closest to the north pole. The line connecting the image of the north pole, Im(p 0 ), and the image<br />

of s 0 , Im(s 0 ), therefore intersects this line at a right angle.<br />
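These properties of the central projection can be verified numerically. In the sketch below the plane P is taken to be z = 1 and the axis r is an arbitrary assumed choice; a point (x, y, z) with z > 0 is mapped to (x/z, y/z).

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(a / n for a in v)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def gnomonic(s):
    """Central projection of a northern-hemisphere point onto the plane z = 1."""
    x, y, z = s
    return (x / z, y / z)

NORTH = (0.0, 0.0, 1.0)
r = normalize((1.0, -0.5, 0.8))  # assumed axis of the great circle
s0 = normalize(tuple(p - dot(NORTH, r) * c for p, c in zip(NORTH, r)))
t_eq = cross(r, s0)              # point of the great circle on the equator

def circle_point(alpha):
    return tuple(math.cos(alpha) * a + math.sin(alpha) * b
                 for a, b in zip(s0, t_eq))

# Sample the northern half of the great circle and project it.
images = [gnomonic(circle_point(0.1 * k)) for k in range(-12, 13)]

# Every image (X, Y) should satisfy r_x X + r_y Y + r_z = 0, a straight line.
line_residuals = [r[0] * X + r[1] * Y + r[2] for X, Y in images]

# The image of s0 should be the point of that line closest to Im(p0) = (0, 0).
foot = gnomonic(s0)
direction = (-r[1], r[0])        # direction vector of the line
print("largest line residual         :", max(abs(v) for v in line_residuals))
print("Im(s0) . line direction       :", foot[0] * direction[0] + foot[1] * direction[1])
print("distance of Im(s0) from Im(p0):", math.hypot(*foot))
```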

The projection plane therefore contains circles around the projected north pole corresponding to<br />

circles of constant northern latitude, where θ is constant, lines through the projected north pole<br />

corresponding to meridians which are lines of constant ϕ, and projected great circles, where one<br />

of those great circles is depicted in figure A. 4 by the thick grey line, while the projection of its<br />

most northern point is connected with the projected north pole by the thin grey line.<br />

Figure A. 4: Projection of meridians, circles with constant latitude, and a great circle<br />

A continuous path from s to t, with s more northern than t, i.e., θ s < θ t , along a series of<br />

segments of great circles while always starting at their most northern point, is represented in this<br />

way by a spiral consisting of straight line segments as shown in figure A. 5.<br />

Figure A. 5: Spiral representing a projected path from s to t along subsequent great circles, each time starting at their most northern point<br />

By increasing the number of segments between s and t, we can let this spiral approach a circle<br />

with the north pole as its center. This means that on the northern hemisphere we can travel<br />

every desired distance in longitude by changing over to other great circles, while by changing<br />

over frequently enough we can make the decrease in northern latitude arbitrarily small, leaving θ<br />

constant or nearly constant.<br />
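The limiting behaviour of the spiral can be made quantitative in the projection plane. The following sketch assumes n segments of equal longitude step; the function name and parameters are hypothetical. A point at distance ρ = tan θ from the projected north pole, moved along the straight line perpendicular to its radius through a longitude step Δϕ, ends up at distance ρ / cos Δϕ, so after n steps covering a total longitude Φ the distance is ρ / cos^n (Φ/n).

```python
import math

def descend(theta_start, total_longitude, n_segments):
    """Polar angle reached after n great-circle segments, each started at its
    most northern point and covering total_longitude / n_segments in longitude."""
    step = total_longitude / n_segments
    if step >= math.pi / 2:
        raise ValueError("each segment must cover less than a quarter turn")
    rho = math.tan(theta_start)   # distance from Im(p0) in the tangent plane
    for _ in range(n_segments):
        rho /= math.cos(step)     # move along the line perpendicular to the radius
    return math.atan(rho)

theta_start = math.pi / 4
results = {n: descend(theta_start, math.pi, n) for n in (4, 16, 64, 1024)}
for n, theta_end in results.items():
    print(f"n = {n:5d}   theta_end - theta_start = {theta_end - theta_start:.6f}")
```

The loss in latitude shrinks as n grows, in line with the statement that the spiral approaches a circle around the north pole.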

It is also possible to travel from a point t to a more southern point v having the same longitude,<br />

ϕ t = ϕ v . Of course, this can be done by traveling along a nearly circular path as described<br />


above while taking ϕ from 0 to 2π and changing over just often enough to descend the required<br />

distance, but we will show it can also be done taking a path along two great circles only, again<br />

starting in their most northern points.<br />

As we saw, on the plane P paths of constant latitude are represented by circles around the north<br />

pole p 0 . Taking t as the starting point, choose it to be the most northern point of a great circle and<br />

travel along a segment, projected onto P as a straight line, to arrive at u, with θ u > θ t . From u,<br />

also choosing it to be the most northern point of a great circle, travel along a segment in opposite<br />

rotational direction, to arrive at v, the projection of which can be seen in figure A. 6.<br />

Figure A. 6: Path from t to v, having the same longitude<br />

By traveling far enough along the great circle through t, u can always be chosen such that v can<br />

be reached from t in two steps. This means that we can always combine paths with constant latitude<br />

and constant longitude to create a path between two points s and t, where s is more northern<br />

than t, consisting entirely of segments of great circles, always starting at their most northern point,<br />

thereby satisfying Piron’s lemma. □<br />


A. 3. 1. 3 RESULT OF LEMMAS 1 AND 2<br />

By proving the first lemma, we showed that µ(s 0 ), with s 0 the most northern point of the great<br />

circle through s, is always larger than, or equal to, µ(s). Consequently, µ can only remain constant or<br />

decrease along a great circle if traveling along the circle starts from its most northern point.<br />

According to lemma 2, when traveling from s to t, where s is more northern than t, it is always possible<br />

to follow a path along subsequent great circles, each time starting at their most northern points.<br />

Combining the two lemmas, we can therefore find a sequence of<br />

points s, s ′ , s ′′ , . . . , t, with<br />

µ(s) ≥ µ(s ′ ) ≥ . . . ≥ µ(t) for θ s < θ s ′ < . . . < θ t , (A. 23)<br />

and therefore<br />

µ(s) ≥ µ(t) for θ s < θ t , (A. 24)<br />

which proves theorem 2. □



A. 3. 2 STEP 3<br />

THEOREM 3:<br />

The function µ is constant at constant latitude and hence does not depend on ϕ,<br />

θ s = θ t ⇒ µ(θ s , ϕ s ) = µ(θ t , ϕ t ). (A. 25)<br />

Proof, first part<br />

Again, we will use a proof by contradiction.<br />

Suppose a latitude exists, i.e., there is a horizontal circle B on the surface of the unit sphere,<br />

B(θ 0 ) = {s ∈ S 2 | θ s = θ 0 }, (A. 26)<br />

for which µ is not constant. Here we assume that B(θ 0 ) is not the north pole or the equator, where<br />

theorem 3 is obvious. Now let<br />

M (θ 0 ) := sup{µ(s) ∈ [0, 1] | s ∈ B(θ 0 )} (A. 27)<br />

and<br />

m(θ 0 ) := inf{µ(s) ∈ [0, 1] | s ∈ B(θ 0 )}, (A. 28)<br />

where M (θ 0 ) is the least upper bound, or supremum, and m(θ 0 ) is the greatest lower bound, or<br />

infimum, of all values of µ over B(θ 0 ). If µ is not constant on B(θ 0 ), then, for some ε > 0,<br />

M (θ 0 ) − m(θ 0 ) = ε. (A. 29)<br />

Now let C be an arbitrary continuous curve which intersects each circle of constant latitude at<br />

most once, i.e., C is strictly increasing or decreasing in latitude.<br />

Figure A. 7: A strictly increasing or decreasing curve C<br />

Let p be the point where the curve C intersects the latitude (A. 26),<br />

p = C ∩ B(θ 0 ), (A. 30)<br />

as can be seen in figure A. 7.



For every point s 1 on this curve north of p we have θ s1 < θ 0 , which means that according<br />

to (A. 24) it holds that µ(s 1 ) ≥ µ(s) for every point s ∈ B(θ 0 ). Consequently, it also holds that<br />

µ(θ s1 ) ≥ M (θ 0 ). (A. 31)<br />

Likewise, for all points s 2 of C south of B(θ 0 ) we see that<br />

µ(θ s2 ) ≤ m(θ 0 ). (A. 32)<br />

This reasoning holds no matter how close to B(θ 0 ) the points s 1 and s 2 are chosen.<br />

Because of (A. 29) we conclude that the value of µ, when traveling from north to south along<br />

the curve C, makes a discontinuous jump of at least<br />

M (θ 0 ) − m(θ 0 ) = ε (A. 33)<br />

to a lower value when passing B (θ 0 ). This conclusion applies to every continuous, strictly increasing<br />

or decreasing curve intersecting B (θ 0 ), which means we can also choose the curve C to be a<br />

meridian,<br />

C = {s ∈ S 2 | ϕ s = ϕ 0 }, (A. 34)<br />

which is a great circle through the north pole having its axis t at the equator, see figure A. 8.<br />

Figure A. 8: Great circle C, coordinate system (p, q, t), and rotating pair (s, s ⊥ )<br />

Let q ∈ C be orthogonal to the point of intersection p of C and B (θ 0 ), such that t, p and q<br />

are mutually orthogonal. Choose an orthogonal pair (s, s ⊥ ) ∈ C to be a rigid coordinate system.<br />

Rotating this system around axis t, we move s from north to south through point p, whereby,<br />

according to (A. 33), the value of µ jumps discontinuously with at least ε while crossing over the<br />

latitude of B(θ 0 ). The pair s and s ⊥ forming a rigid system, we know that<br />

µ(s) + µ(s ⊥ ) + µ(t) = 1, (A. 35)<br />

where µ(t) = 0 because the axis t is on the equator.



Therefore, if s moves southwards, passing through p, and simultaneously s ⊥ moves northwards,<br />

passing through q, the value of µ(s ⊥ ) also has to jump discontinuously. If µ(s) jumps by −ε,<br />

then µ(s ⊥ ) jumps by ε.<br />

Now choose another great circle C ′ with axis t ′ , which intersects B(θ 0 ) in p under a slightly tilted<br />

angle, as can be seen in figure A. 9.<br />

Figure A. 9: Great circle C and tilted great circles C ′ and C ′′<br />

For this great circle we can repeat the same argument, and conclude that for s ′ ∈ C ′ , while<br />

passing the latitude of B(θ 0 ), µ(s ′ ) makes a jump of at least ε, and an equally valued but opposite<br />

jump is made by µ (s ′⊥ ) in a point q ′ ∈ C ′ which is again perpendicular to p. Notice that,<br />

because C ′ is tilted with respect to C, θ(q) ≠ θ(q ′ ).<br />

This argument can be repeated endlessly, with great circles C ′′ , C ′′′ , . . . , C n , intersecting B(θ 0 )<br />

in p, always under different angles. We therefore find a series of points q, q ′ , q ′′ , . . . , q n where,<br />

in passing through one of them while traveling along one of the great circles through p, the value<br />

of µ jumps discontinuously. □<br />

Here we briefly pause from the proof of theorem 3 to prove a simple lemma.<br />

ACCESSORY LEMMA:<br />

Let C 1 and C 2 be two continuous curves on S 2 , intersecting in q, where q is not the<br />

most northern point of either curve. Suppose that, for s traveling south along C 1 ,<br />

µ(s) makes a discontinuous jump of −ε < 0 in the point<br />

of intersection q.<br />

This means that for all s ∈ C 1 and some constant a,<br />

θ s < θ q ⇒ µ(s) ≥ a, (A. 36)<br />

and<br />

θ s > θ q ⇒ µ(s) ≤ a − ε. (A. 37)<br />

Consequently, for t traveling south along C 2 , µ(t) also makes a discontinuous jump in q of at least ε.



Proof<br />

For every pair of points (s 1 , s 2 ) ∈ C 1 , where θ s1 < θ q and θ s2 > θ q , we can always find a<br />

pair (t 1 , t 2 ) ∈ C 2 , such that θ s1 < θ t1 < θ q and θ s2 > θ t2 > θ q , see figure A. 10.<br />

Figure A. 10: Two continuous curves on S 2 , intersecting in q<br />

Using (A. 24), (A. 36) and (A. 37), we have for t ∈ C 2<br />

θ s < θ t < θ q ⇒ µ(s) ≥ µ(t) ≥ a, (A. 38)<br />

and<br />

θ s > θ t > θ q ⇒ µ(s) ≤ µ(t) ≤ a − ε. (A. 39)<br />

This holds no matter how close to q the points s and t are chosen, which proves the lemma. □<br />

Proof, second part<br />

Now we continue the proof of theorem 3. In the first part of the proof we showed for the pair<br />

(s, s ⊥ ) that if µ jumps with ε in p, it also jumps with ε in q. The same rigidity holding for any<br />

pair (s i , s i⊥ ) ∈ C i , we concluded that µ jumps in every point q, q ′ , q ′′ , . . . , q n with at least ε.<br />

With the accessory lemma, we proved that, if µ makes a jump of at least ε at some point on one<br />

curve C, it does so on any curve C i through that point.<br />

Since we chose the directions q, q ′ , q ′′ , . . . , q n perpendicular to p, see figure A. 9, they all lie<br />

on C p , a great circle with axis p. Starting in its most northern point q, upon descending along this<br />

great circle C p towards the equator, µ(s) remains constant or decreases, as we showed by proving<br />

theorem 2.<br />

But according to the first part of this proof and the accessory lemma, upon descending along<br />

this great circle C p towards the equator, in each of the points q, q ′ , q ′′ , . . . , q n , µ jumps with<br />

at least −ε while passing their various latitudes. Since we can choose n arbitrarily large, we can<br />

choose n > 1/ε, making the total jump nε > 1. This leads to µ acquiring values<br />

smaller than 0, which contradicts the requirement that 0 ≤ µ ≤ 1. We have to conclude<br />

that ε = 0, which yields M (θ 0 ) = m(θ 0 ).<br />

We proved that if on the surface of the unit sphere a horizontal circle B exists for which µ is not<br />

constant, then µ ∉ [0, 1]; hence µ is constant on constant latitude and does not depend on ϕ,<br />

which proves theorem 3. □



A. 4 AN ANALYTIC LEMMA<br />

We have to take one more step to prove that µ = µ 0 , but first we prove a lemma using results<br />

from previous sections.<br />

LEMMA:<br />

The special measure µ 0 can be written as<br />

µ 0 (χ s ) = χ s . (A. 40)<br />

Proof<br />

The special measure (A. 6) can, as we saw in (A. 17), be written as<br />

µ 0 (P ) = Tr P 0 P = |⟨ψ | e 0 ⟩|² = cos² θ. (A. 41)<br />

As we proved that µ is a nonincreasing function in θ, and does not depend on ϕ, we can take µ<br />

to be a function of a function of θ, and to already make a connection with the analytic lemma<br />

of step 4 which will follow shortly, we choose this function to be the continuous, nonincreasing<br />

function χ s : [0, π/2] → [0, 1], χ(θ s ) := cos² θ s , where θ s is the angle between the direction s<br />

and the north pole. In the next step we will show that this measure satisfies the requirements for µ<br />

to be a measure.<br />

For the special measure µ 0 (θ s ), with s representing an arbitrary P , (A. 41) now reads<br />

µ 0 (χ(θ s )) = cos² θ s (A. 42)<br />

which can be written as<br />

µ 0 (χ s ) = χ s . □ (A. 43)<br />

What is left for us to do is to see whether a measure exists, not equal to µ 0 , for which this does<br />

not hold for some P ∈ P (H), as was our assumption (A. 8) in A. 2. 1. This will be the final step,<br />

where, by proving the next theorem, we will see that such a measure does not exist.<br />

A. 4. 1 STEP 4<br />

THEOREM 4:<br />

The only form of µ satisfying (A. 24),<br />

θ s < θ t ⇒ µ(s) ≥ µ(t), (A. 44)<br />

is<br />

µ(χ s ) = χ s . (A. 45)



To prove this theorem, we will use an analytic lemma given by Cooke, Keane and Moran (1985).<br />

Before doing so, we make some observations.<br />

First, for any triple of mutually perpendicular directions (r, s, t) and some direction q it holds in<br />

general that<br />

cos² θ r + cos² θ s + cos² θ t = 1, (A. 46)<br />

where θ r is the angle between the direction q and axis r, and r corresponds to cos θ r , likewise for s<br />

and t. We can easily see that (A. 46) holds in general if we express the directions in the usual spherical<br />

coordinates,<br />

cos θ r = cos ϕ sin θ, cos θ s = sin ϕ sin θ, and cos θ t = cos θ, (A. 47)<br />

from which we readily know that their squares add up to 1.<br />
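The identity (A. 46) can also be checked numerically. The following is only a minimal sketch, assuming the spherical parametrisation (A. 47); the function names `direction_cosines` and `chi_sum` are illustrative and do not appear in the text.

```python
import math
import random


def direction_cosines(theta, phi):
    """Cosines of the angles between a unit vector q and the axes r, s, t, as in (A. 47)."""
    cos_r = math.cos(phi) * math.sin(theta)
    cos_s = math.sin(phi) * math.sin(theta)
    cos_t = math.cos(theta)
    return cos_r, cos_s, cos_t


def chi_sum(theta, phi):
    """Left-hand side of (A. 46): the sum of the squared direction cosines."""
    return sum(c * c for c in direction_cosines(theta, phi))


# Sample random directions q and record the largest deviation from 1.
random.seed(0)
samples = [(random.uniform(0.0, math.pi), random.uniform(0.0, 2.0 * math.pi))
           for _ in range(1000)]
max_deviation = max(abs(chi_sum(theta, phi) - 1.0) for theta, phi in samples)
print(max_deviation)  # expected to be of the order of floating-point rounding error
```

The deviation is not exactly zero only because of floating-point rounding; algebraically the sum is $\sin^2\theta(\cos^2\varphi + \sin^2\varphi) + \cos^2\theta = 1$.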

With $\chi(\theta_r) = \cos^2\theta_r$ etc., we can write (A. 46) as<br />

$\chi_r + \chi_s + \chi_t = 1$. (A. 48)<br />

Second, for µ as a function of $\chi(\theta_s)$, $\mu : [0, 1] \to [0, 1]$, it holds that although µ is nonincreasing<br />

in θ, it is nondecreasing in $\chi_s$. The requirements for µ to be a measure, (A. 14) and (A. 15), can now<br />

be rewritten as<br />

$\mu(\chi_r) + \mu(\chi_s) + \mu(\chi_t) = 1$, (A. 49)<br />

$\mu(0) = 0$ and $\mu(1) = 1$. (A. 50)<br />

With these properties, µ satisfies the conditions on the function f in the analytic lemma which now follows.<br />

ANALYTIC LEMMA:<br />

If $f : [0, 1] \to [0, 1]$ is a function such that<br />

(1) $f(0) = 0$,<br />

(2) f is nondecreasing, i.e., if $a < b$ then $f(a) \leq f(b)$,<br />

(3) if $a, b, c \in [0, 1]$ and $a + b + c = 1$, then $f(a) + f(b) + f(c) = 1$,<br />

then f is the identity function: $f(a) = a$ for all $a \in [0, 1]$.<br />
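To get a feeling for how restrictive condition (3) is, one can test a few candidate functions on a grid of rational points. The sketch below is only an illustration: the helper names and the candidate functions `square` and `smoothstep` are hypothetical choices, and a check on a finite grid is a necessary test, not a proof.

```python
from fractions import Fraction


def triples(n):
    """All (a, b, c) on the grid {0, 1/n, ..., 1} with a + b + c = 1."""
    for i in range(n + 1):
        for j in range(n + 1 - i):
            yield Fraction(i, n), Fraction(j, n), Fraction(n - i - j, n)


def satisfies_lemma_conditions(f, n=12):
    """Check conditions (1)-(3) of the lemma exactly on a rational grid."""
    grid = [Fraction(k, n) for k in range(n + 1)]
    starts_at_zero = f(Fraction(0)) == 0                                  # condition (1)
    nondecreasing = all(f(x) <= f(y) for x, y in zip(grid, grid[1:]))     # condition (2)
    sums_to_one = all(f(a) + f(b) + f(c) == 1 for a, b, c in triples(n))  # condition (3)
    return starts_at_zero and nondecreasing and sums_to_one


def identity(a):
    return a


def square(a):
    return a * a


def smoothstep(a):
    # Nondecreasing on [0, 1] with f(a) + f(1 - a) = 1, yet it violates (3).
    return 3 * a**2 - 2 * a**3


print(satisfies_lemma_conditions(identity))    # True
print(satisfies_lemma_conditions(square))      # False
print(satisfies_lemma_conditions(smoothstep))  # False
```

The `smoothstep` candidate is instructive: it satisfies (1), (2) and the two-term relation $f(a) = 1 - f(1 - a)$, but fails (3) already at $a = b = c = \tfrac{1}{3}$, where the three values sum to $\tfrac{7}{9}$.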

Proof<br />

Choosing c = 0 in (3), we have b = 1 − a, and with (1) this yields<br />

$f(a) = 1 - f(1 - a)$ (A. 51)<br />

for all values $a \in [0, 1]$. Next, choosing $c = 1 - (a + b)$ in (3) and applying (A. 51) to $a + b$ gives<br />

$f(a) + f(b) = 1 - f(1 - (a + b)) = 1 - (1 - f(a + b)) = f(a + b)$ (A. 52)<br />

for all $a, b, a + b \in [0, 1]$.
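
Written out step by step, the two substitutions into condition (3) read as follows. This is only a restatement of the argument above; the middle line, $f(1) = 1$, is not spelled out in the text but follows directly from the first line together with condition (1).

```latex
\begin{align*}
c = 0:         &\quad f(a) + f(1-a) + f(0) = 1 \;\Rightarrow\; f(a) = 1 - f(1-a), \\
a = 1:         &\quad f(1) = 1 - f(0) = 1, \\
c = 1 - (a+b): &\quad f(a) + f(b) = 1 - f\bigl(1 - (a+b)\bigr)
                 = 1 - \bigl(1 - f(a+b)\bigr) = f(a+b).
\end{align*}
```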


Iteration of (A. 52) yields, for $n \in \mathbb{N}^+$,<br />

$n f(a) = f(na)$ for $na \leq 1$. (A. 53)<br />

Taking $a = \frac{1}{n}$ we see that<br />

$f\left(\frac{1}{n}\right) = \frac{f(1)}{n} = \frac{1}{n}$, (A. 54)<br />

and iterating again, we have<br />

$f\left(\frac{m}{n}\right) = \frac{m}{n}$ for $m, n \in \mathbb{N}$, $m < n$, (A. 55)<br />

or, indeed,<br />

$f(a) = a \quad \forall\, a \in \mathbb{Q}$. (A. 56)<br />

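From (2) and (A. 54) we see that<br />

$\lim_{a \to 0} f(a) = \inf_{a > 0} f(a) = 0$, (A. 57)<br />

and, using again (A. 52),<br />

$\lim_{a \to 0} f(a + b) = f(b) \quad \forall\, 0 \leq b \leq 1$. (A. 58)<br />

Therefore, f is continuous, and since f coincides with the identity on the rationals, which are dense in [0, 1],<br />

$f(a) = a \quad \forall\, a$. □ (A. 59)<br />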
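The last step can also be illustrated with a squeeze argument that uses only monotonicity (2) and the rational result (A. 56), instead of the continuity argument given above. The sketch below is an illustration under that assumption; the helper `rational_bracket` is a hypothetical name, and the irrational point is represented by a floating-point approximation.

```python
import math
from fractions import Fraction


def rational_bracket(a, n):
    """Return rationals lo <= a <= hi with denominator n and hi - lo <= 1/n."""
    lo = Fraction(math.floor(a * n), n)
    hi = Fraction(math.ceil(a * n), n)
    return lo, hi


a = 1 / math.sqrt(2)  # float approximation of an irrational point in [0, 1]
for n in (10, 100, 1000, 10000):
    lo, hi = rational_bracket(a, n)
    # By (A. 56), f(lo) = lo and f(hi) = hi; by (2), lo <= f(a) <= hi.
    # Since lo <= a <= hi as well, |f(a) - a| <= hi - lo <= 1/n.
    print(n, float(lo), float(hi), float(hi - lo))
```

As n grows the bracket closes in on a, so f(a) cannot differ from a by any positive amount.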

A. 5 SUMMARY<br />

In this appendix we proved Gleason’s theorem for pure states, represented by extreme measures µ.<br />

In section A. 2 we proved that if Gleason’s theorem for pure states is true for any 3 - dimensional<br />

real Hilbert space, it is also true for any complex Hilbert space with dim H > 2. In A. 3. 1 we showed<br />

that µ is a nonincreasing function in θ, and in A. 3. 2 we proved that µ does not depend on ϕ.<br />

Finally, by proving the analytic lemma we showed that there can only be one form for the measure<br />

µ which satisfies these requirements for all P ∈ P (H) and that is the quantum mechanical one,<br />

i.e., in accordance with $\cos^2\theta$.


WORKS CONSULTED<br />

Most subjects in these lecture notes are also found in Redhead (1987), Krips (1987), Hughes (1989),<br />

D’Espagnat (1989) and Bub (1997).<br />

Dickson (1998) is an accessible monograph.<br />

Jammer (1974) is a survey of the research in foundations of quantum mechanics in historical perspective<br />

from the beginnings of quantum mechanics until 1974. Although it is no longer up to date, Jammer remains indispensable<br />

for every student seriously studying foundations of quantum mechanics.<br />

Bell (1987) contains his articles on quantum mechanics.<br />

Von Neumann’s Grundlagen (1932) is a masterpiece, which is still fully worth studying.<br />

Prugovečki (2006) is a modernized and more systematic version, but it evades subjects of interpretation<br />

and is mainly a mathematical reference book.<br />

Busch, Lahti and Mittelstaedt (1996) is a monograph on quantum mechanical measurement theory.<br />

Hooker (1975) is a collection of important articles of an algebraic and logical character.<br />

Wheeler and Zurek (1983) is an extensive collection of photocopies of important articles (EPR, Bohr,<br />

Bohm, Everett, etc.).<br />

Fine (1986) is the unequalled monograph on Einstein and quantum mechanics.<br />

Contributions to the research of foundations of quantum mechanics from Utrecht University are the<br />

work of Hilgevoord and Uffink and vice versa, of Dieks and Vermaas about the modal interpretation<br />

of quantum mechanics and Uffink’s thesis (1990) about uncertainty relations.


BIBLIOGRAPHY<br />

Albers, D.J., Alexanderson, G.L., Reid, C. (1990) More Mathematical People : Contemporary Conversations<br />

Boston: Harcourt Brace Jovanovich<br />

Araki, H., Yanase, M.M. (1960) ‘Measurement of Quantum Mechanical Operators’<br />

Physical Review 120 (2) pp. 622-626<br />

Aspect, A., Dalibard, J., Roger, G. (1982) ‘Experimental Test of Bell’s Inequalities Using Time -<br />

Varying Analyzers’<br />

Physical Review Letters 49 (25) pp. 1804-1807<br />

Belinfante, F.J. (1973) A Survey of Hidden - Variables Theories<br />

Oxford: Pergamon Press<br />

Bell, J.S. (1964) ‘On the Einstein Podolsky Rosen Paradox’<br />

Physics 1 (3) pp. 195-200, repr. in Wheeler and Zurek (1983)<br />

Bell, J.S. (1966) ‘On the Problem of Hidden Variables in Quantum Mechanics’<br />

Reviews of Modern Physics 38 pp. 447-452<br />

Bell, J.S. (1971) ‘Introduction to the hidden - variables question’<br />

In d’Espagnat (1971), repr. in Bell (1987)<br />

Bell, J.S. (1975) ‘The Theory of Local Beables’<br />

Presented at the sixth GIFT Seminar, Jaca, 2 - 7 June 1975, repr. in Bell (1987)<br />

Bell, J.S. (1982) ‘On the impossible pilot wave’<br />

Foundations of Physics 12 (10) pp. 989-999<br />

Bell, J.S. (1987) Speakable and Unspeakable in Quantum Mechanics<br />

Cambridge: Cambridge University Press<br />

Bell, J.S. (1990) ‘Against measurement’<br />

Physics World (August) pp. 33-40<br />

Beltrametti, E.G., Cassinelli, G. (1981) The Logic of Quantum Mechanics<br />

Reading: Addison - Wesley Publishing Company<br />

Birkhoff, G., Von Neumann, J. (1936) ‘The Logic of Quantum Mechanics’<br />

The Annals of Mathematics, Second Series 37 (4) pp. 823-843<br />

Bohm, D.J. (1952) ‘A Suggested Interpretation of the Quantum Theory in Terms of “Hidden” Variables.<br />

I, II’<br />

Physical Review 85 (2) pp. 166-179, pp. 180-193


Bohm, D.J., Aharonov, Y. (1957) ‘Discussion of Experimental Proof for the Paradox of Einstein,<br />

Rosen, and Podolsky’<br />

Physical Review 108 (4) pp. 1070-1076<br />

Bohm, D.J. (1981) Wholeness and the implicate order<br />

London: Routledge & Kegan Paul<br />

Bohm, D.J., Peat, F.D. (1989) Science, order, and creativity<br />

London: Routledge<br />

Bohr, N.H.D. (1928) ‘The Quantum Postulate and the Recent Development of Atomic Theory’<br />

Nature 121 (3050) pp. 580-590<br />

Bohr, N.H.D. (1931) ‘Maxwell and Modern Theoretical Physics’<br />

Nature 128 (3234) pp. 691-692<br />

Bohr, N.H.D. (1934) Atomic Theory and the Description of Nature<br />

New York: The Macmillan Company<br />

Bohr, N.H.D. (1935a) ‘Quantum Mechanics and Physical Reality’<br />

Nature 136 p. 65<br />

Bohr, N.H.D. (1935b) ‘Can Quantum - Mechanical Description of Physical Reality Be Considered<br />

Complete?’<br />

Physical Review 48 (8) pp. 696-702<br />

Bohr, N.H.D. (1939) ‘The causality problem in atomic physics’<br />

In Bohr, N.H.D. (1939) New Theories in Physics<br />

Paris: International Institute of Intellectual Co - operation<br />

Bohr, N.H.D. (1947) ‘Newton’s Principles and Modern Atomic Mechanics’<br />

In The Royal Society of London (1947) Newton Tercentenary Celebrations. 15 - 19 July 1946<br />

Cambridge: Cambridge University Press<br />

Bohr, N.H.D. (1949) ‘Discussion with Einstein on epistemological problems in atomic physics’<br />

In Schilpp (1949), repr. in Wheeler and Zurek (1983)<br />

Bopp, F.A. (1947) ‘Quantenmechanische Statistik und Korrelationsrechnung’<br />

Zeitschrift für Naturforschung A 2 pp. 202-216<br />

Born, M., Jordan, P. (1925) ‘Zur Quantenmechanik’<br />

Zeitschrift für Physik 34 (1) pp. 858-888<br />

Eng. tr. (abridged): ‘On Quantum mechanics’<br />

In Van der Waerden (1967)<br />

Bródy, F., Vámos, T. (eds) (1995) The Neumann Compendium<br />

Singapore: World Scientific Publishing Company


Broglie, L.V.P.R. de (1928) ‘La nouvelle dynamique des quanta’<br />

La Commission Administrative de l’Institut International de Physique Solvay (1928) Électrons et<br />

Photons: Rapports et Discussions du Cinquième Conseil de Physique tenu à Bruxelles du 24<br />

au 29 Octobre 1927 sous les Auspices de l’Institut International de Physique Solvay<br />

Paris: Gauthier - Villars<br />

Eng. tr.: ‘The new dynamics of quanta’<br />

In Bacciagaluppi, G., Valentini, A. (2009) Quantum Theory at the Crossroads : Reconsidering<br />

the 1927 Solvay Conference<br />

Cambridge: Cambridge University Press<br />

Bub, J., Clifton, R.K. (1996) ‘A Uniqueness Theorem for ‘No Collapse’ Interpretations of Quantum<br />

Mechanics’<br />

Studies in the History and Philosophy of Modern Physics B 27 (2) pp. 181-219<br />

Bub, J., Clifton, R.K., Goldstein, S. (2000) ‘Revised Proof of the Uniqueness Theorem for ‘No Collapse’<br />

Interpretations of Quantum Mechanics’<br />

Studies in the History and Philosophy of Modern Physics B 31 pp. 95-98<br />

Bub, J. (1997) Interpreting the Quantum World<br />

Cambridge: Cambridge University Press<br />

Busch, P.S., Grabowski, M.P., Lahti, P.J. (1995) Operational Quantum Physics<br />

Berlin: Springer - Verlag<br />

Busch, P., Lahti, P.J., Mittelstaedt, P. (1991) The Quantum Theory of Measurement<br />

Berlin: Springer - Verlag<br />

Capasso, V., Fortunato, D., Selleri, F. (1973) ‘Sensitive Observables of Quantum Mechanics’<br />

International Journal of Theoretical Physics 7 (5) pp. 319-326<br />

Clauser, J.F., Horne, M.A., Shimony, A., Holt, R.A. (1969) ‘Proposed Experiment to test Local Hidden<br />

- Variable Theories’<br />

Physical Review Letters 23 (15) pp. 880-884<br />

Clifton, R.K., Butterfield, J.N., Redhead, M.L.G. (1990) ‘Nonlocal Influences and Possible Worlds –<br />

A Stapp in the Wrong Direction’<br />

British Journal for the Philosophy of Science 41 (1) pp. 5-58<br />

Condon, E.U. (1929) ‘Remarks on uncertainty principles’<br />

Science 69 pp. 573-574<br />

Cooke, R.M, Hilgevoord, J. (1979) ‘Correspondence, Equivalence and Completeness’<br />

Epistemological Letters (March) pp. 42-54<br />

Cooke, R.M., Keane, M.S., Moran, W. (1985) ‘An elementary proof of Gleason’s theorem’<br />

Mathematical Proceedings of the Cambridge Philosophical Society 98 pp. 117-128<br />

Cushing, J.T. (1994) Quantum Mechanics : Historical Contingency and the Copenhagen Hegemony<br />

Chicago: The University of Chicago Press


Daneri, A., Loinger, A., Prosperi, G.M. (1962) ‘Quantum Theory of Measurement and Ergodicity<br />

Conditions’<br />

Nuclear Physics 33 (1962) pp. 297-319<br />

De Muynck, W.M. (1986) ‘The Bell Inequalities and their Irrelevance to the Problem of Locality in<br />

Quantum Mechanics’<br />

Physics Letters A 114 (2) pp. 65-67<br />

De Muynck, W.M. (1996) ‘Can We Escape from Bell’s Conclusion that Quantum Mechanics Describes<br />

a Non - Local Reality?’<br />

Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of<br />

Modern Physics 27 (3) pp. 315-330<br />

DeWitt, B.S. (1970) ‘Quantum mechanics and reality’<br />

Physics Today 23 (9) pp. 30-40<br />

DeWitt, B.S. (1971) ‘The Many - Universes Interpretation of Quantum Mechanics’<br />

In d’Espagnat (1971), repr. in DeWitt and Graham (1973)<br />

DeWitt, B.S., Graham, R.N. (eds) (1973) The Many - Worlds Interpretation of Quantum Mechanics<br />

Princeton: Princeton University Press<br />

Dickson, W.M. (1998) Quantum Chance and Non - locality : Probability and Non - locality in the<br />

Interpretations of Quantum Mechanics<br />

Cambridge: Cambridge University Press<br />

Dieks, D.G.B.J. (1983) ‘Stochastic Locality and Conservation Laws’<br />

Lettere al Nuovo Cimento 38 (13) pp. 443-447<br />

Dieks, D.G.B.J. (1989) ‘Resolution of the Measurement Problem through Decoherence of the Quantum<br />

State’<br />

Physics Letters A 142 (8,9) pp. 439-446<br />

Dieks, D.G.B.J. and Vermaas, P.E. (eds) (1998) The Modal Interpretation of Quantum Mechanics<br />

Dordrecht: Kluwer Academic Publishers<br />

Dirac, P.A.M. (1925) ‘The Fundamental Equations of Quantum Mechanics’<br />

Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical<br />

and Physical Character 109 (752) pp. 642-653<br />

Dirac, P.A.M. (1958) The Principles of Quantum Mechanics<br />

Oxford: at the Clarendon Press<br />

Dirac, P.A.M. (1963) ‘The Evolution of the Physicist’s Picture of Nature’<br />

Scientific American 208 (5) pp. 45-53<br />

Eberhard, P.H. (1977) ‘Bell’s Theorem without Hidden Variables’<br />

Il Nuovo Cimento B 38 (1) pp. 75-80


Einstein, A. (1921) ‘Geometrie und Erfahrung’<br />

Sitzungsberichte der Preussischen Akademie der Wissenschaften pp. 123-130<br />

Eng. tr.: Bargmann, S. (transl) ‘Geometry and Experience’<br />

In Janssen, M., Schulmann, R., Illy, J., Lehner, C., Buchwald, D. (eds) (2002) The Collected<br />

Papers of Albert Einstein, Volume 7 : The Berlin Years: Writings, 1918 - 1921<br />

Princeton: Princeton University Press<br />

Einstein, A. (1934) Mein Weltbild<br />

Amsterdam: Querido Verlag<br />

Eng. tr.: Bargmann, S. (transl), Seelig, C. (ed) (1954) Ideas and opinions<br />

New York: Bonanza Books<br />

Einstein, A., Podolsky, B., Rosen, N. (1935) ‘Can Quantum - Mechanical Description of Physical<br />

Reality Be Considered Complete?’<br />

Physical Review 47 (10) pp. 777-780<br />

Einstein, A., Born, M., Born, H. (1971) The Born - Einstein letters : correspondence between Albert<br />

Einstein and Max and Hedwig Born from 1916 to 1955 with commentaries by Max Born<br />

London: The Macmillan Press<br />

Espagnat, B. d’ (ed) (1971) Foundations of Quantum Mechanics : Proceedings of the International<br />

School of Physics ”Enrico Fermi”, held at Varenna, 29th June-11th July, 1970, Course IL<br />

New York: Academic Press<br />

Espagnat, B. d’ (1989) Conceptual Foundations Of Quantum Mechanics<br />

New York: Perseus Books<br />

Everett, H. III (1957) ‘The Theory of the Universal Wave Function’<br />

In DeWitt and Graham (1973)<br />

Everett, H. III (1957) “‘Relative State” Formulation of Quantum Mechanics’<br />

Reviews of Modern Physics 29 (3) pp. 454-462<br />

Fine, A.I. (1982) ‘Hidden Variables, Joint Probability, and the Bell Inequalities’<br />

Physical Review Letters 48 (5) pp. 291-295<br />

Fine, A. (1986) The shaky game : Einstein, realism and the quantum theory<br />

Chicago: University of Chicago Press<br />

Folse, H.J. (1985) The philosophy of Niels Bohr : the framework of complementarity<br />

Amsterdam: North - Holland Physics Publishing<br />

Fraassen, B.C. van (1973) ‘Semantic Analysis of Quantum Logic’<br />

In Hooker, C.A. (ed) (1973) Contemporary Research in the Foundations and Philosophy of<br />

Quantum Theory<br />

Dordrecht: D. Reidel Publishing Company<br />

Fraassen, B.C. van (1979) ‘Hidden Variables and the Modal Interpretation of Quantum Theory’<br />

Synthese 42 (1) pp. 155-165


Frank, P.G. (1949) Modern Science and Its Philosophy<br />

Cambridge: Harvard University Press<br />

Freedman, S.J., Clauser, J.F. (1972) ‘Experimental Test of Local Hidden - Variable Theories’<br />

Physical Review Letters 28 (14) pp. 938-941<br />

Ghirardi, G.C., Rimini, A., Weber, T. (1980) ‘A General Argument against Superluminal Transmission<br />

through the Quantum Mechanical Measurement Process’<br />

Lettere al Nuovo Cimento 27 (10) pp. 293-298<br />

Ghirardi, G.C., Rimini, A., Weber, T. (1986) ‘Unified dynamics for microscopic and macroscopic<br />

systems’<br />

Physical Review D 34 (2) pp. 470-491<br />

Gleason, A.M. (1957) ‘Measures on the Closed Subspaces of a Hilbert space’<br />

Journal of Mathematics and Mechanics 6 pp. 885-893<br />

Gottfried, K. (1989) ‘Does Quantum Mechanics describe the Collapse of the Wavefunction?’<br />

Unpublished contribution to the 1989 Conference, International School of History of Science,<br />

Erice, Italy, 5 - 14 August<br />

Greenberger, D.M., Horne, M.A., Zeilinger, A. (1989) ‘Going Beyond Bell’s Theorem’<br />

In Kafatos, M.C. (ed) (1989) Bell’s Theorem, Quantum Theory and Conceptions of the Universe<br />

Dordrecht: Kluwer Academic Publishers<br />

http://arxiv.org/abs/0712.0921<br />

Groenewold, H.J. (1946) ‘On the Principles of Elementary Quantum Mechanics’<br />

Physica 12 (7) pp. 405-460<br />

Haag, R. (1990) ‘Fundamental Irreversibility and the Concept of Events’<br />

Communications in Mathematical Physics 132 pp. 245-251<br />

Healey, R.A. (1989) The philosophy of quantum mechanics : An interactive interpretation<br />

Cambridge: Cambridge University Press<br />

Heisenberg, W. (1925) ‘Über quantentheoretische Umdeutung kinematischer und mechanischer<br />

Beziehungen’<br />

Zeitschrift für Physik 33 (1) pp. 879-893<br />

Eng. tr.: ‘Quantum - theoretical re - interpretation of kinematic and mechanical relations’<br />

In Van der Waerden, B.L. (1967)<br />

Heisenberg, W.K. (1927) ‘Über den anschaulichen Inhalt der quantentheoretischen Kinematik und<br />

Mechanik’<br />

Zeitschrift für Physik 43 (3/4) pp. 172-198<br />

Eng. tr.: ‘The physical content of quantum kinematics and mechanics’<br />

In Wheeler and Zurek (1983)


Heisenberg, W.K. (1930) Die Physikalischen Prinzipien der Quantentheorie<br />

Leipzig: Verlag von S. Hirzel<br />

Eng. tr.: Eckart, C., Hoyt, F.C. (transl) (1930) The Physical Principles of Quantum Theory<br />

New York: Dover Publications<br />

Heisenberg, W. (1963) Niels Bohr Library and Archives<br />

Interview with Werner Heisenberg by T. S. Kuhn at the Max Planck Institute, Munich, Germany,<br />

February 25. Transcript Session VIII<br />

http://www.aip.org/history/ohilist/4661_8.html<br />

Heitler, W.H. (1970) Der Mensch und die naturwissenschaftliche Erkenntnis<br />

Braunschweig: Friedrich Vieweg & Sohn Verlagsgesellschaft<br />

Hey, T., Walters, P. (2003) The New Quantum Universe<br />

Cambridge: Cambridge University Press<br />

Hilgevoord, J., Uffink, J.B.M. (1988) ‘The mathematical expression of the uncertainty principle’<br />

In Merwe, A. van der, Selleri, F., Tarozzi, G. (eds) (1988) Microphysical Reality and Quantum<br />

Formalism. Volume I<br />

Dordrecht: Kluwer Academic Publishers<br />

Hilgevoord, J., Uffink, J.B.M. (1990) ‘A new view on the uncertainty principle’<br />

In Miller, A.I. (ed) (1990) Sixty - Two years of Uncertainty : Historical, Philosophical and<br />

Physical Inquiries into the Foundations of Quantum Mechanics<br />

New York: Plenum Press<br />

Hilgevoord, J. (2002) ‘Time in quantum mechanics’<br />

American Journal of Physics 70 (3) pp. 301-306<br />

Holevo, A.S. (1982) Probabilistic and Statistical Aspects of Quantum Theory<br />

Amsterdam: North - Holland Publishing Company<br />

Holland, P.R. (1993) The Quantum Theory of Motion : An Account of the de Broglie - Bohm Causal<br />

Interpretation of Quantum Mechanics<br />

Cambridge: Cambridge University Press<br />

Home, D., Selleri, F. (1991) ‘Bell’s Theorem and the EPR Paradox’<br />

La Rivista del Nuovo Cimento 14 (9) pp. 1-95<br />

’t Hooft, G. (1997) In search of the ultimate building blocks<br />

Cambridge: Cambridge University Press<br />

Hooker, C.A. (ed) (1975) The Logico - Algebraic Approach to Quantum Mechanics. Volume I: the<br />

Historical Evolution<br />

Dordrecht: D. Reidel Publishing Company<br />

Hughes, R.I.G. (1989) The Structure and Interpretation of Quantum Mechanics<br />

Cambridge: Harvard University Press


Isham, C.J. (1995) Lectures on Quantum Theory : Mathematical and Structural Foundations<br />

River Edge: Imperial College Press<br />

Jacques, V., Wu, E., Grosshans, F., Treussart, F., Grangier, P., Aspect, A., Roch, J-F. (2007) ‘Experimental<br />

realization of Wheeler’s delayed - choice gedanken experiment’<br />

Science 315 (5814) pp. 966-968<br />

Jammer, M. (1974) The Philosophy of Quantum Mechanics : The Interpretations of Quantum Mechanics<br />

in Historical Perspective<br />

New York: John Wiley & Sons<br />

Jammer, M. (1990) ‘John Stewart Bell and His Work - On the Occasion of His Sixtieth Birthday’<br />

Foundations of Physics 20 (10) pp. 1139-1145<br />

Jammer, M. (1992) ‘John Stewart Bell and the Debate on the Significance of His Contributions to<br />

the Foundations of Quantum Mechanics’<br />

In Merwe, A. van der, Selleri, F., Tarozzi, G. (eds) (1992) International Conference on Bell’s<br />

Theorem and the Foundations of Modern Physics<br />

Singapore: World Scientific Publishing<br />

Jarrett, J.P. (1984) ‘On the Physical Significance of the Locality Conditions in the Bell Arguments’<br />

Noûs 18 (4) pp. 569-589<br />

Jauch, J.M. (1968) Foundations of Quantum Mechanics<br />

Reading: Addison - Wesley Educational Publishers<br />

Kalckar, J. (ed) (1996) Niels Bohr - Collected Works : Volume 7 - Foundations of Quantum Physics II<br />

(1933 - 1958)<br />

Amsterdam: Elsevier Science<br />

Kampen, N.G. van (1988) ‘Ten Theorems about Quantum Mechanical Measurements’<br />

Physica A 153 pp. 97-113<br />

Kennard, E.H. (1927) ‘Zur Quantenmechanik einfacher Bewegungstypen’<br />

Zeitschrift für Physik 44 (4/5) pp. 326-352<br />

Kochen, S., Specker, E.P. (1967) ‘The Problem of Hidden Variables in Quantum Mechanics’<br />

Journal of Mathematics and Mechanics 17 (1) pp. 59-87<br />

Kochen, S. (1985) ‘A New Interpretation of Quantum Mechanics’<br />

In Lahti, P.J., Mittelstaedt, P. (eds) (1985) Symposium on the foundations of modern physics<br />

1985 : 50 years of the Einstein - Podolsky - Rosen Gedankenexperiment<br />

Singapore: World Scientific Publishing Company<br />

Krips, H. (1987) The Metaphysics of Quantum Theory<br />

Oxford: Clarendon Press<br />

Landau, L.D., Lifshitz, E.M. (1958) Quantum Mechanics : Non - Relativistic theory<br />

London: Pergamon Press


Landau, H.J., Pollak, H.O. (1961) ‘Prolate Spheroidal Wave Functions, Fourier Analysis and Uncertainty<br />

- II’<br />

The Bell System Technical Journal 40 pp. 65-84<br />

London, F., Bauer, E. (1939) La Théorie de l’Observation en Mécanique Quantique<br />

Paris: Hermann<br />

Eng. tr.: ‘The Theory of Observation in Quantum Mechanics’<br />

In Wheeler and Zurek (1983)<br />

Lüders, G. (1951) ‘Über die Zustandsänderung durch den Meßprozeß’<br />

Annalen der Physik 443 (5 - 8) pp. 322-328<br />

Eng. tr.: Kirkpatrick, K.A. (transl) (2006) ‘Concerning the state - change due to the measurement<br />

process’<br />

Annalen der Physik 15 (9) pp. 663-670<br />

Maczynski, M.J. (1971) ‘Boolean Properties of Observables in Axiomatic Quantum Mechanics’<br />

Reports on Mathematical Physics 2 (2) pp. 135-150<br />

Mermin, N.D. (1993) ‘Hidden variables and the two theorems of John Bell’<br />

Reviews of Modern Physics 65 (3) pp. 803-815<br />

Meyer, D.A. (1999) ‘Finite precision measurement nullifies the Kochen - Specker theorem’<br />

Physical Review Letters 83 pp. 3751-3754<br />

Meyer, D.A. (2003) ‘Coloring, quantum mechanics, and Euclid’<br />

Pdf file: math.ucsd.edu/ dmeyer/research/talks/cqmE.pdf<br />

Miller, A.I. (ed) (1990) Sixty - two Years of Uncertainty : Historical, Philosophical and Physical<br />

Inquiries into the Foundations of Quantum Mechanics<br />

New York: Plenum Press<br />

Miller, W.A., Wheeler, J.A. (1984) ‘Delayed - Choice Experiments and Bohr’s Elementary Quantum<br />

Phenomenon’<br />

In Nakajima, S., Murayama, Y., Tonomura, A. (eds) (1996) Foundations of Quantum Mechanics<br />

in the Light of New Technology<br />

Singapore: World Scientific Publishing<br />

Muller, F.A. (1997a) ‘The Equivalence Myth of Quantum Mechanics–Part I’<br />

Studies in History and Philosophy of Modern Physics 28 (1) pp. 35-61<br />

(1997b) ‘Part II’<br />

ibid. 28 (2) pp. 219-247<br />

(1999) ‘(Addendum)’<br />

ibid. 30 (4) pp. 543-545<br />

Neumann, J. Von (1932) Mathematische Grundlagen der Quantenmechanik<br />

Berlin: Verlag von Julius Springer<br />

Eng. tr.: Beyer, R.T. (transl) (1955) The Mathematical Foundations of Quantum Mechanics<br />

Princeton: Princeton University Press


Pauli, W.E. (1933) Die allgemeinen Prinzipien der Wellenmechanik<br />

Berlin: Verlag von Julius Springer<br />

Eng. tr.: (1950) The General principles of wave mechanics<br />

Urbana - Champaign: University of Illinois Press<br />

Penrose, R. (1996) ‘On Gravity’s Role in Quantum State Reduction’<br />

General Relativity and Gravitation 28 (5) pp. 581-600<br />

Peres, A. (1993) Quantum Theory: Concepts and Methods<br />

Dordrecht: Kluwer Academic Publishers<br />

Petersen, A. (1963) ‘The Philosophy of Niels Bohr’<br />

Bulletin of the Atomic Scientists 19 (7) pp. 8-14<br />

Petersen, A. (1968) Quantum Physics and the Philosophical Tradition<br />

Cambridge: M.I.T. Press<br />

Piron, C. (1976) Foundations of Quantum Physics<br />

Reading: W.A. Benjamin<br />

Prugovečki, E. (2006) Quantum Mechanics in Hilbert Space<br />

Mineola: Dover Publications<br />

Przibram, K. (ed) (1963) Briefe zur Wellenmechanik : Schrödinger, Planck, Einstein, Lorentz<br />

Wien: Springer - Verlag<br />

Eng. tr.: Przibram, K. (ed) (1963) Letters on wave mechanics : Schrödinger, Planck, Einstein,<br />

Lorentz<br />

New York: Philosophical Library<br />

Rauch, H., Werner, S.A. (2000) Neutron Interferometry : Lessons in Experimental Quantum Mechanics<br />

Oxford: Oxford University Press<br />

Redhead, M.L.G. (1987) Incompleteness, Nonlocality and Realism : A Prolegomenon to the Philosophy<br />

of Quantum Mechanics<br />

Oxford: Clarendon Press<br />

Robertson, H.P. (1929) ‘The Uncertainty Principle’<br />

Physical Review 34 p. 163<br />

Scheibe, E., Sykes, J.B., (transl) (1973) The Logical Analysis of Quantum Mechanics<br />

Oxford: Pergamon Press<br />

Schiff, L.I. (1949) Quantum Mechanics<br />

New York: McGraw - Hill<br />

Schilpp, P.A. (ed) (1949) Albert Einstein : Philosopher - Scientist<br />

Evanston: The Library of Living Philosophers


Schmidt, E. (1907) ‘Zur Theorie der linearen und nichtlinearen Integralgleichungen. I. Teil’<br />

Mathematische Annalen 63 pp. 433-476<br />

(1907) ‘Zweite Abhandlung’<br />

ibid. 64 pp. 161-174<br />

(1907) ‘III. Teil’<br />

ibid. 65 pp. 370-399<br />

Schrödinger, E.R.J.A. (1926) ‘An Undulatory Theory of the Mechanics of Atoms and Molecules’<br />

The Physical Review 28 (6) pp. 1049-1070<br />

Schrödinger, E.R.J.A. (1930) ‘Zum Heisenbergschen Unschärfeprinzip’<br />

Sitzungsberichte der Preußischen Akademie der Wissenschaften. Physikalisch - mathematische<br />

Klasse pp. 296-303<br />

Schrödinger, E.R.J.A. (1935a) ‘Discussion of Probability Relations between Separated Systems’<br />

Mathematical Proceedings of the Cambridge Philosophical Society 31 (4) pp. 555-563<br />

Schrödinger, E.R.J.A. (1935b) ‘Die gegenwärtige Situation in der Quantenmechanik’<br />

Naturwissenschaften 23 (48) pp. 807-812, (49) pp. 823-828, (50) pp. 844-849<br />

Eng. tr.: Trimmer, J.D. (transl) (1980) ‘The Present Situation in Quantum Mechanics: A Translation<br />

of Schrödinger’s “Cat Paradox”’<br />

Proceedings of the American Philosophical Society 124 (5) pp. 323-338<br />

Repr. in Wheeler and Zurek (1983)<br />

Shimony, A. (1984) ‘Controllable and Uncontrollable Non - Locality’<br />

In Kamefuchi, S., et al. (eds) Proceedings of the International Symposium : Foundations of<br />

Quantum Mechanics in the Light of New Technology<br />

Tokyo: Physical Society of Japan<br />

Shimony, A. (1989) ‘Search for a Worldview Which Can Accommodate Our Knowledge of Microphysics’<br />

In Cushing, J.T., McMullin, E. (eds) Philosophical Consequences of Quantum Theory : Reflections<br />

on Bell’s Theorem<br />

Notre Dame: University of Notre Dame Press<br />

Shimony, A. (1995) ‘Degree of entanglement’<br />

In Greenberger, D.M., Zeilinger, A. (eds) Fundamental Problems in Quantum Theory : In<br />

Honor of Professor John A. Wheeler<br />

New York: New York Academy of Sciences<br />

Stapp, H.P. (1975) ‘Bell’s Theorem and World Process’<br />

Il Nuovo Cimento B 29 (2) pp. 270-276<br />

Stapp, H.P. (1977) ‘Are Superluminal Connections Necessary?’<br />

Il Nuovo Cimento B 40 (1) pp. 191-205<br />

Stone, M.H. (1932) ‘On One - Parameter Unitary Groups in Hilbert Space’<br />

The Annals of Mathematics, Second Series 33 (3) pp. 643-648


Suppes, P., Zanotti, M. (1976) ‘On the Determinism of Hidden Variable Theories with Strict Correlation<br />

and Conditional Statistical Independence of Observables’<br />

In Suppes, P. (ed) Logic and Probability in Quantum Mechanics<br />

Dordrecht: D. Reidel Publishing Company<br />

Svetlichny, G., Redhead, M.L.G., Brown, H.R., Butterfield, J. (1988) ‘Do the Bell Inequalities Require<br />

the Existence of Joint Probability Distributions?’<br />

Philosophy of Science 55 (3) pp. 387-401<br />

Tkadlec, J. (2000) ‘Diagrams of Kochen - Specker Type Constructions’<br />

International Journal of Theoretical Physics 39 (3) pp. 921-926<br />

Uffink, J.B.M., Hilgevoord, J. (1985) ‘Uncertainty Principle and Uncertainty Relations’<br />

Foundations of Physics 15 (9) pp. 925-944<br />

Uffink, J.B.M., Hilgevoord, J. (1988) ‘Interference and Distinguishability in Quantum Mechanics’<br />

Physica B 151 pp. 309-313<br />

Uffink, J.B.M. (1990) Measures of Uncertainty and the Uncertainty Principle<br />

Utrecht: Rijksuniversiteit te Utrecht, Dissertation<br />

Vermaas, P.E., Dieks, D.G.B.J. (1995) ‘The Modal Interpretation of Quantum Mechanics and its<br />

Generalization to Density Operators’<br />

Foundations of Physics 25 (1) pp. 145-158<br />

Vermaas, P.E. (1999) A Philosopher’s Understanding of Quantum Mechanics : Possibilities and Impossibilities<br />

of a Modal Interpretation<br />

Cambridge: Cambridge University Press<br />

Vigier, J.-P., Dewdney, C., Holland, P.R., Kyprianidis, A. (1987) ‘Causal particle trajectories and the<br />

interpretation of quantum mechanics’<br />

In Hiley, B.J., Peat, F.D. (eds) (1987) Quantum implications : essays in honour of David Bohm<br />

London: Routledge & Kegan Paul<br />

Waerden, B.L. Van der (ed) (1967) Sources of Quantum mechanics<br />

Amsterdam: North - Holland Publishing Company<br />

Weihs, G., Jennewein, T., Simon, C., Weinfurter, H., Zeilinger, A. (1998) ‘Violation of Bell’s Inequality<br />

under Strict Einstein Locality Conditions’<br />

Physical Review Letters 81 (23) pp. 5039-5043<br />

Wheatley, M.J. (2001) Leadership and the New Science : Discovering Order in a Chaotic World<br />

San Francisco: Berrett - Koehler Publishers<br />

Wheeler, J.A. (1957) ‘Assessment of Everett’s “Relative State” Formulation of Quantum Theory’<br />

Reviews of Modern Physics 29 (3) pp. 463-465<br />

Wheeler, J.A., Zurek, W.H. (eds) (1983) Quantum Theory and Measurement<br />

Princeton: Princeton University Press


Wick, G.C., Wightman, A.S., Wigner, E.P. (1952) ‘The Intrinsic Parity of Elementary Particles’<br />

Physical Review 88 (1) pp. 101-105<br />

Wigner, E.P. (1952) ‘Die Messung quantenmechanischer Operatoren’<br />

Zeitschrift für Physik 133 pp. 101-108<br />

Wigner, E.P. ‘Remarks on the mind - body question’<br />

In Good, I.J. (1962) The scientist speculates : an anthology of partly - baked ideas<br />

London: Heinemann<br />

Repr. in Wheeler and Zurek (1983)<br />

Wigner, E.P. (1963) ‘The problem of measurement’<br />

American Journal of Physics 31 (6) pp. 6-15<br />

Wigner, E.P. (1970) ‘On Hidden Variables and Quantum Mechanical Probabilities’<br />

American Journal of Physics 38 (8) pp. 1005-1009<br />

Wigner, E.P. (1983) ‘Interpretation of Quantum Mechanics’<br />

In Wheeler and Zurek (1983)<br />

Zukav, G. (1984) The Dancing Wu Li Masters : An Overview of the New Physics<br />

New York: Bantam Books<br />

Zurek, W.H. (1981) ‘Pointer basis of quantum apparatus: Into what mixture does the wave packet<br />

collapse?’<br />

Physical Review D 24 (6) pp. 1516-1525<br />

Zurek, W.H. (1982) ‘Environment - induced superselection rules’<br />

Physical Review D 26 (8) pp. 1862-1880
