FOUNDATIONS OF QUANTUM MECHANICS
FOUNDATIONS OF QUANTUM MECHANICS JOS UFFINK INSTITUTE FOR HISTORY AND FOUNDATIONS OF SCIENCE UTRECHT UNIVERSITY SEPTEMBER 2010
<strong>FOUNDATIONS</strong><br />
<strong>OF</strong><br />
<strong>QUANTUM</strong> <strong>MECHANICS</strong><br />
JOS UFFINK<br />
INSTITUTE FOR HISTORY AND <strong>FOUNDATIONS</strong><br />
<strong>OF</strong><br />
SCIENCE<br />
UTRECHT UNIVERSITY<br />
SEPTEMBER 2010
PREFACE<br />
These lecture notes serve as a support for the course on Foundations of Quantum Mechanics, provided<br />
by the Institute for History and Foundations of Science of the University of Utrecht. Although<br />
the text has been revised repeatedly, efforts to improve can sometimes bring along new imperfections,<br />
making revision a never-ending process. The current version, the 11th, is slightly modified<br />
with respect to the previous one. Many thanks are due to Anne van Weerden for help in the English<br />
translation.<br />
Remarks and comments remain very welcome.<br />
Jos Uffink<br />
Utrecht, August 2010
CONTENTS<br />
I CONCEPTUAL PROBLEMS 7<br />
I. 1 Introduction . . . . . . . . . . . . . . . 7<br />
I. 2 Incompleteness and locality . . . . . . . . . . . 11<br />
II THE FORMALISM 17<br />
II. 1 Finite - dimensional Hilbert spaces . . . . . . . . . . 17<br />
II. 2 Operators . . . . . . . . . . . . . . . . 20<br />
II. 3 Eigenvalue problem and spectral theorem . . . . . . . . 24<br />
II. 3. 1 Appendix . . . . . . . . . . . . . . . 26<br />
II. 4 Functions of normal operators . . . . . . . . . . . 27<br />
II. 5 Direct sum and direct product . . . . . . . . . . . 30<br />
II. 5. 1 Direct sum . . . . . . . . . . . . . . 30<br />
II. 5. 2 Direct product . . . . . . . . . . . . . 31<br />
II. 6 Addendum: Infinite - dimensional Hilbert spaces . . . . . . 34<br />
II. 6. 1 The structure of vector spaces . . . . . . . . . . 34<br />
II. 6. 2 Operators . . . . . . . . . . . . . . . 36<br />
II. 6. 2. 1 Unbounded operators . . . . . . . . . . 37<br />
II. 6. 2. 2 Continuous spectra . . . . . . . . . . . 38<br />
II. 6. 2. 3 Spectral theorem . . . . . . . . . . . . 39<br />
II. 6. 3 Dirac . . . . . . . . . . . . . . . . 40<br />
II. 6. 4 Summary . . . . . . . . . . . . . . . 40<br />
III THE POSTULATES 41<br />
III. 1 Von Neumann’s postulates . . . . . . . . . . . . 41<br />
III. 2 Pure and mixed states . . . . . . . . . . . . . 45<br />
III. 3 The interpretation of mixed states . . . . . . . . . . 51<br />
III. 4 Composite systems . . . . . . . . . . . . . 55<br />
III. 4. 1 Summary . . . . . . . . . . . . . . . 63<br />
III. 5 Proper and improper mixtures . . . . . . . . . . . 63<br />
III. 6 Spin 1/2 particles . . . . . . . . . . . . . . 64
III. 6. 1 Spin 1/2 and rotations in spin space . . . . . . . . 67<br />
III. 6. 2 Mixed spin 1/2 states . . . . . . . . . . . . 70<br />
III. 6. 3 Two spin 1/2 particles . . . . . . . . . . . . 72<br />
III. 6. 3. 1 Singlet and triplet states . . . . . . . . . . 72<br />
III. 6. 3. 2 Correlations . . . . . . . . . . . . . 73<br />
III. 6. 3. 3 Conditional probabilities . . . . . . . . . . 74<br />
III. 6. 3. 4 Example of a mixed state of two spin 1/2 particles . . . 75<br />
IV THE COPENHAGEN INTERPRETATION 77<br />
IV. 1 Heisenberg and the uncertainty principle . . . . . . . . 77<br />
IV. 1. 1 Remarks . . . . . . . . . . . . . . . 81<br />
IV. 2 Bohr and complementarity . . . . . . . . . . . . 82<br />
IV. 2. 1 Complementary phenomena . . . . . . . . . . 84<br />
IV. 2. 2 Remarks and problems . . . . . . . . . . . 86<br />
IV. 2. 3 Agreement and difference between Heisenberg and Bohr . . . 87<br />
IV. 3 Debate between Einstein and Bohr . . . . . . . . . . 88<br />
IV. 3. 1 Introduction . . . . . . . . . . . . . . 88<br />
IV. 3. 2 The photon box . . . . . . . . . . . . . 90<br />
IV. 3. 3 Einstein, Podolsky and Rosen . . . . . . . . . . 92<br />
IV. 3. 4 Heisenberg, Bohr and Einstein, Podolsky and Rosen . . . . 92<br />
IV. 4 Neutron interferometry . . . . . . . . . . . . 93<br />
IV. 5 The uncertainty relations . . . . . . . . . . . . 97<br />
IV. 5. 1 Introduction . . . . . . . . . . . . . . 97<br />
IV. 5. 2 The standard uncertainty relations . . . . . . . . 98<br />
IV. 5. 3 Single slit experiment . . . . . . . . . . . . 100<br />
IV. 5. 4 Time and energy . . . . . . . . . . . . . 103<br />
IV. 5. 5 Double slit experiment . . . . . . . . . . . 104<br />
IV. 5. 6 A new uncertainty measure . . . . . . . . . . 105<br />
IV. 5. 7 Interpretation . . . . . . . . . . . . . . 108<br />
V HIDDEN VARIABLES 109<br />
V. 1 Hidden reality . . . . . . . . . . . . . . . 109<br />
V. 2 Non - contextual hidden variables . . . . . . . . . . 110<br />
V. 3 Kochen and Specker’s theorem . . . . . . . . . . 115<br />
V. 3. 1 Summary . . . . . . . . . . . . . . . 120<br />
V. 4 Contextual hidden variables . . . . . . . . . . . 120
VI BOHMIAN <strong>MECHANICS</strong> 127<br />
VI. 1 Introduction . . . . . . . . . . . . . . . 127<br />
VI. 2 The quantum potential . . . . . . . . . . . . . 128<br />
VI. 3 Composite systems . . . . . . . . . . . . . 132<br />
VI. 4 Remarks and problems . . . . . . . . . . . . 135<br />
VI. 5 The Hamilton - Jacobi equation . . . . . . . . . . 136<br />
VII BELL’S INEQUALITIES 139<br />
VII. 1 Local deterministic hidden variables . . . . . . . . . 139<br />
VII. 1. 1 Derivation of the first Bell inequality . . . . . . . . 139<br />
VII. 1. 2 The Bell inequality of Clauser, Horne, Shimony and Holt . . . 141<br />
VII. 1. 3 Violation of the Bell inequalities by quantum mechanics . . . 142<br />
VII. 1. 4 The Bell inequality in a non-contextual, local deterministic HVT . 144<br />
VII. 2 Local deterministic contextual hidden variables . . . . . . 145<br />
VII. 3 Wigner’s derivation . . . . . . . . . . . . . 147<br />
VII. 4 The derivation of Eberhard and Stapp . . . . . . . . . 150<br />
VII. 4. 1 Counterfactual conditional statements and indeterminism . . . 152<br />
VII. 5 Stochastic hidden variables . . . . . . . . . . . 153<br />
VII. 5. 1 Outcome, parameter and source independence . . . . . 155<br />
VII. 5. 2 Quantum mechanics as a stochastic HVT . . . . . . . 156<br />
VII. 6 An algebraic proof without inequalities . . . . . . . . 158<br />
VII. 7 Miscellanea . . . . . . . . . . . . . . . 160<br />
VII. 7. 1 Locality and relativity . . . . . . . . . . . 160<br />
VII. 7. 2 Locality versus conditional independence . . . . . . . 161<br />
VII. 7. 3 Determinism . . . . . . . . . . . . . . 161<br />
VIII THE MEASUREMENT PROBLEM 163<br />
VIII. 1 Introduction . . . . . . . . . . . . . . . 163<br />
VIII. 2 Measurement according to classical physics . . . . . . . 164<br />
VIII. 3 Measurement according to quantum mechanics . . . . . . 166<br />
VIII. 4 The measurement problem in the narrow sense . . . . . . 170<br />
VIII. 4. 1 The projection postulate and consciousness . . . . . . 172<br />
VIII. 4. 2 Bohmian mechanics . . . . . . . . . . . . 173<br />
VIII. 4. 3 Spontaneous collapse . . . . . . . . . . . . 173<br />
VIII. 4. 4 Many worlds . . . . . . . . . . . . . . 174<br />
VIII. 4. 5 Superselection rules . . . . . . . . . . . . 175<br />
VIII. 4. 6 Irreversibility of measurement . . . . . . . . . 176
VIII. 4. 7 Modal interpretation . . . . . . . . . . . . 176<br />
VIII. 4. 8 Decoherence . . . . . . . . . . . . . . 177<br />
VIII. 5 Incompatible quantities . . . . . . . . . . . . 179<br />
VIII. 6 Comments on the theory of measurement . . . . . . . . 181<br />
A GLEASON’S THEOREM 183<br />
A. 1 Introduction . . . . . . . . . . . . . . . 183<br />
A. 2 Conversion to a 3 - dimensional real problem . . . . . . . 184<br />
A. 2. 1 Step 1 . . . . . . . . . . . . . . . 185<br />
A. 3 Formulation of the problem on the surface of a sphere . . . . . 186<br />
A. 3. 1 Step 2 . . . . . . . . . . . . . . . 188<br />
A. 3. 1. 1 Lemma 1 . . . . . . . . . . . . . 188<br />
A. 3. 1. 2 Lemma 2 . . . . . . . . . . . . . 189<br />
A. 3. 1. 3 Result of lemma 1 and 2 . . . . . . . . . . 191<br />
A. 3. 2 Step 3 . . . . . . . . . . . . . . . 192<br />
A. 4 An analytic lemma . . . . . . . . . . . . . 196<br />
A. 4. 1 Step 4 . . . . . . . . . . . . . . . 196<br />
A. 5 Summary . . . . . . . . . . . . . . . . 198<br />
WORKS CONSULTED 199<br />
BIBLIOGRAPHY 200
LIST <strong>OF</strong> FIGURES<br />
III. 1 A discontinuous measure for dim H = 2 . . . . . . . . . 48<br />
III. 2 A rotated unit vector in the xz - plane . . . . . . . . . . 68<br />
III. 3 Spin up for particle 1 along ⃗a, for particle 2 along ⃗ b . . . . . . 73<br />
IV. 1 Heisenberg’s γ - microscope . . . . . . . . . . . . 79<br />
IV. 2 The double slit interference experiment (Bohr 1949 ) . . . . . . 89<br />
IV. 3 Contexts of measurement in which the interference of the particles is visible,<br />
and those in which the recoil of the screen is visible, exclude each other. (Bohr<br />
1949 ) . . . . . . . . . . . . . . . . . . 90<br />
IV. 4 Several perfect crystal neutron interferometers (Rauch and Werner 2000 ) . 93<br />
IV. 5 The interference pattern in the neutron interferometer is acquired by measuring<br />
the intensity in the detectors at a variable optical path length difference. . 94<br />
IV. 6 The probability distribution in position for a slit of width 2 a . . . . 101<br />
IV. 7 The diffraction pattern for a small slit of width 2 a . . . . . . . 101<br />
IV. 8 The probability distribution in position for a double slit, 2 a is the width of each<br />
slit and 2 A the distance between the slits . . . . . . . . . 104<br />
IV. 9 The interference pattern for the double slit . . . . . . . . . 104<br />
IV. 10 Moving screen . . . . . . . . . . . . . . . . 106<br />
V. 1 A solution for dim H = 2 . . . . . . . . . . . . . 117<br />
V. 2 a) Kochen - Specker diagram b) Conway - Kochen diagram . . . . 118<br />
V. 3 M.C. Escher, Waterfall. Consider the 3 interpenetrating cubes on the top of<br />
the left pillar. Each cube has 4 lines from the mutual center to its vertices, 6<br />
lines to the centers of its edges, and 3 lines to the centers of its faces. Three of<br />
the lines are shared by all three cubes, giving 3 · (4 + 6 + 3 ) − 6 = 33 lines.<br />
These are Peres’ vectors. (Text Meyer 2003 ) . . . . . . . . 119<br />
V. 4 µ(Pᵢ) = cos² θ . . . . . . . . . . . . . . . 120<br />
VI. 1 The quantum potential for the two slit system as viewed from the screen, under<br />
assumption of a Gaussian distribution at the slits (Bohm 1989) . . . 131<br />
VI. 2 A simulation of the double slit experiment in Bohmian mechanics. Each particle<br />
follows a certain path between the slits and the photographic plate. All<br />
particles coming from the upper slit arrive at the upper half of the photographic<br />
plate, likewise for the lower slit and lower half of the plate. The twists in the<br />
paths are caused by the quantum potential U. (Vigier et al. 1987) . . . 132<br />
VII. 1 Thought experiment of Einstein, Podolsky and Rosen on the singlet . . . 140<br />
VII. 2 A configuration in which the spin quantities violate the Bell inequality . . 142<br />
VII. 3 The Bell inequality violated for every acute angle ϕ . . . . . . 143<br />
VII. 4 the configuration giving the largest violation of the Bell inequality (all vectors<br />
in the same plane) . . . . . . . . . . . . . . . 143
VII. 5 Unit spheres for aₙ, bₙ and aₙbₙ. In the shaded areas of the larger sphere aₙbₙ<br />
is positive, in the unshaded areas aₙbₙ is negative. . . . . . . . 144<br />
VII. 6 Comparison of the quantum mechanical expectation values and those for the<br />
local deterministic HVT . . . . . . . . . . . . . . 145<br />
VII. 7 Violation of the Bell inequality again . . . . . . . . . . 149<br />
VII. 8 The Mermin pentagon . . . . . . . . . . . . . . 159<br />
VII. 9 Minkowski diagram of the EPRB experiment, where λ is in the past light cones<br />
of both A and B . . . . . . . . . . . . . . . 160<br />
VIII. 1 Schrödinger’s cat paradox (DeWitt 1970 ) . . . . . . . . . 170<br />
A. 1 Construction of a 3 - dimensional subspace E . . . . . . . . 185<br />
A. 2 Rotation of s to s 0 and t to t ′ along a great circle around axis r . . . 188<br />
A. 3 Projection of points on a great circle onto a plane P through the north pole 189<br />
A. 4 Projection of meridians, circles with constant latitude, and a great circle . 190<br />
A. 5 Spiral representing a projected path from s to t along subsequent great circles,<br />
each time starting at their most northern point . . . . . . . . 190<br />
A. 6 Path from t to v, having the same longitude . . . . . . . . 191<br />
A. 7 A strictly in - or decreasing curve C . . . . . . . . . . 192<br />
A. 8 Great circle C, coordinate system (p, q, t), and rotating pair (s, s ⊥ ) . . 193<br />
A. 9 Great circle C and tilted great circles C ′ and C ′′ . . . . . . . 194<br />
A. 10 Two continuous curves on S 2 , intersecting in q . . . . . . . 195
I<br />
CONCEPTUAL PROBLEMS<br />
Anyone who is not shocked by quantum theory has not understood it.<br />
— Niels Bohr<br />
I think it is safe to say that no one understands quantum mechanics.<br />
— Richard Feynman<br />
I. 1 INTRODUCTION<br />
Quantum mechanics emerged at the beginning of the 20th century from an attempt to understand<br />
the interaction between atoms and radiation. The presence of discrete lines in the emission<br />
and absorption spectra of chemical elements indicates that this interaction takes the form of discrete<br />
quanta. When, in the years 1925 and 1926, a coherent theory was developed by the unified efforts of<br />
Werner Heisenberg, Paul Dirac, Max Born, Pascual Jordan, Wolfgang Pauli and Erwin Schrödinger,<br />
and this theory was axiomatized seven years later by John von Neumann, the question about the<br />
physical interpretation of the mathematical symbols of the theory arose.<br />
The central mathematical concept in quantum mechanics is ψ, in the form of a wave function ψ(q)<br />
in Schrödinger’s wave mechanics, or of a vector |ψ⟩ in Hilbert space, à la Von Neumann. According<br />
to Born, its physical meaning is that ψ determines probabilities for results of measurements, and a<br />
key question is then how such probabilities must be interpreted. By means of four examples we will<br />
give an idea of the conceptual problems raised by quantum mechanics.<br />
(i) Consider as a first example the decay of radioactive nuclei of a certain kind, as discussed by<br />
Einstein (P.A. Schilpp 1949, p. 667 ff.). We see the unstable nuclei decay at various times, one almost<br />
immediately, another only after a long time; the α-particles are radiated in ever different directions.<br />
Quantum mechanics describes these nuclei by a non-stationary wave function, and using this function<br />
one can calculate the expected lifetime of the nuclei.<br />
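As a hedged illustration of how an expected lifetime follows from such a description, assume that the non-stationary wave function leads to the standard exponential decay law. The symbols τ (mean lifetime), P(t) (survival probability) and p(t) (probability density of the decay time) are introduced here for illustration only.<br />

```latex
% Assumption: exponential decay law with mean lifetime \tau
\[
  P(t) = e^{-t/\tau}, \qquad
  p(t) = -\frac{dP}{dt} = \frac{1}{\tau}\, e^{-t/\tau}, \qquad
  \langle t \rangle = \int_0^\infty t\, p(t)\, dt = \tau .
\]
```

Under this assumption every nucleus is assigned the same P(t): the description fixes the distribution of decay times, but not the decay time of any individual nucleus.<br />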
A natural reaction is to assume that the nuclei differ from each other, and that this difference is<br />
the cause of the mutually different individual life spans and the different directions the α-particles<br />
are radiated in. In this view, the quantum mechanical expectation value would be comparable to<br />
the average life span in a population. However, this does not fit in a natural way in the quantum<br />
mechanical description. Quantum mechanics describes all nuclei by the same wave function. If this<br />
description is complete, the fact that quantum mechanics gives only expected life spans is not due to<br />
a lack of knowledge. Rather, there simply is nothing more to know concerning the nuclei than their<br />
wave function and the probabilities that follow from it.<br />
On the other hand, we see before our eyes that the nuclei do not behave the same way: they decay<br />
at different times and send the α-particles in ever different directions. This suggests that more can<br />
be known about nuclei than their expected life spans, just as a more thorough investigation of the<br />
individuals of a population enables us to know more than their mere average life span; we would<br />
then be able to make a more detailed statement about their individual life spans. In this view the<br />
quantum mechanical description is not complete: there are extra, until now ‘hidden’, variables which<br />
say something about the individual case.<br />
There is a standard answer to this problem, called the ‘Copenhagen interpretation’, after the view<br />
developed by Bohr and his coworkers. This answer is that the idea that the individual nuclei have a<br />
definite life span, independent of the observation of this life span, is incorrect. We can only speak of<br />
an individual life span within the context of an experiment in which this is measured. An experiment<br />
always entails a disturbance of the system. For this reason no conclusions can be drawn concerning<br />
the undisturbed system. It is incorrect to speak of the life span of a nucleus which is not observed.<br />
The statistical spread in the measured individual life spans is due to the quantum character of the<br />
interaction between object and measuring apparatus. As a matter of principle, what happens in this<br />
interaction cannot be described more precisely. This makes every individual measurement into a<br />
unique event.<br />
Characteristic of the Copenhagen interpretation is, furthermore, that one cannot simply combine<br />
the description of the system, obtained within the context of a certain type of experiment, with a<br />
description of the same system, obtained in a different kind of experiment. The best-known examples<br />
of such mutually exclusive experiments are measurements of position and momentum. According to<br />
Bohr, descriptions of a system with terms like ‘position’ or ‘momentum’ are complementary; they are<br />
supplementary to each other, but they can never be united in one picture.<br />
The main point behind the Copenhagen answer is the idea of measurement disturbance. According<br />
to this line of thought quantum mechanics is distinguished from classical physics by the quantization<br />
of the interaction between system and measuring apparatus. Every observation involves an<br />
interaction with, and therefore a disturbance of, the observed system. This disturbance cannot be<br />
made arbitrarily small, because ℏ ≠ 0. Therefore, one cannot identify observation results with properties<br />
the system has independently of the observation. One can only talk meaningfully about observation<br />
results which are created by the measurement. In contrast to classical physics, quantum mechanics<br />
does not deal with what exists, but with what is observed.<br />
At first sight this reasoning seems plausible; it is, however, not without problems. Can<br />
we use the same reasoning if the observed system is macroscopic? And, as a matter of fact, what<br />
exactly is an observation? Is it essential that some conscious being takes notice of the result of the<br />
observation, or is an apparatus registering the outcome sufficient? These problems will appear in the<br />
third and fourth example.<br />
(ii) The next example is from a letter Einstein wrote to Born in 1948 (Born 1971, pp. 169, 170).<br />
Consider a free particle described by a wave function ψ. According to the quantum mechanical<br />
description, ψ satisfies an uncertainty relation; the statistical deviations of position and momentum<br />
cannot simultaneously be made arbitrarily small. Apparently the outcomes of measurements of position<br />
and momentum of an individual particle cannot both be predicted exactly, and the question arises<br />
how to interpret this situation. Einstein distinguishes two points of view.<br />
(a) The (free) particle really has a definite position and a definite momentum, even if they<br />
cannot both be ascertained by measurement in the same individual case. According to<br />
this point of view, the ψ-function represents an incomplete description of the real state<br />
of affairs. [...] Its acceptance would lead to an attempt to obtain a complete description<br />
of the real state of affairs as well as the incomplete one, and to discover physical laws<br />
for such a description. The theoretical framework of quantum mechanics would then be<br />
exploded.<br />
(b) In reality the particle has neither a definite momentum nor a definite position; the description<br />
by [the] ψ-function is, in principle, a complete description. The strictly defined<br />
position of the particle, obtained by measuring the position, cannot be interpreted as the<br />
position of the particle prior to the measurement. The sharp localization which appears as<br />
a result of the measurement is brought about only as a result of the unavoidable (but not<br />
unimportant) operation of measurement. The result of the measurement depends not only<br />
on the real particle situation but also on the nature of the measuring mechanism, which<br />
in principle is incompletely known. An analogous situation arises when the momentum<br />
or any other observable quantity relating to the particle is measured.<br />
Interpretation (b) is accepted by the majority of physicists, and Einstein admits that<br />
[...] it alone does justice in a natural way to the empirical state of affairs expressed in<br />
Heisenberg’s principle within the framework of quantum mechanics.<br />
Nevertheless, he emphasizes his preference for interpretation (a). His argument is that it is basic<br />
to physics that physical concepts refer to entities, such as particles, fields, etc., that exist independently<br />
of the observer, and are situated in space and time. Interpretation (b) renders this kind of<br />
description impossible. A second argument has to do with composite systems and will be discussed<br />
in section I. 2.<br />
(iii) The next example, also originating from the correspondence between Einstein and Born<br />
(Born 1971, pp. 188, 208-209), concerns a freely moving macroscopic object, for instance a star.<br />
A simple Schrödinger equation applies to the center of mass of such a body, namely that of a free<br />
particle. Since all wave functions which are solutions of the Schrödinger equation are admissible,<br />
one may consider as a solution a wave function with two peaks of equal size, located far from each<br />
other.<br />
Upon measurement of the position of the center of mass of such a body, the outcome is found at<br />
one peak in about half of the measurements; in the other half the outcome is found at the other peak.<br />
In this case it is tempting to say that for half of these measurements the center of mass, and hence<br />
the object, was at the one position, while for the other half it was at the<br />
other position. But according to the standard interpretation this is incorrect: prior to the measurement<br />
no position can be assigned to the center of mass. Quantum mechanics applies just as well to the<br />
center of mass of a macroscopic body as to an electron. It is, however, difficult to imagine how a<br />
measurement ‘creates’ the position of the center of mass of a star as a result of a disturbance of<br />
the order of a single quantum.<br />
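A minimal sketch of such a two-peaked state at the time of measurement may make the statistics explicit. The single-peak profile φ, the separation 2a and the coordinate x are illustrative assumptions, as is the requirement that the two peaks overlap negligibly.<br />

```latex
% Assumptions: \varphi is a normalized single-peak profile; the peaks at
% x = +a and x = -a are far enough apart that their overlap is negligible.
\[
  \psi(x) = \frac{1}{\sqrt{2}}\,\bigl[\varphi(x-a) + \varphi(x+a)\bigr],
  \qquad \int |\varphi(x)|^2\,dx = 1,
\]
\[
  |\psi(x)|^2 \approx \tfrac12\,|\varphi(x-a)|^2 + \tfrac12\,|\varphi(x+a)|^2
  \quad\Longrightarrow\quad
  \mathrm{Prob}(\text{near } +a) \approx \mathrm{Prob}(\text{near } -a) \approx \tfrac12 .
\]
```

The position statistics are those of an ensemble in which half of the bodies sit near +a and half near −a, yet ψ itself assigns neither position to the individual body before measurement.<br />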
According to Pauli, one of the representatives of the Copenhagen interpretation, this is a creation<br />
outside the laws of nature (ibid., p. 223). The laws of nature only say something about the statistics<br />
of the outcomes. The quantum mechanical probability description does not express our ignorance<br />
concerning the position of the center of mass of the body; the probability description corresponds
to an essential indeterminacy of that position. Pauli states that the question whether the ‘position’<br />
of a body would also exist without observation is fundamentally unanswerable and for this reason<br />
meaningless.<br />
In this example the problem of the transition between the microscopic and the macroscopic levels<br />
arises. Our intuition tells us that somewhere along the way the quantum mechanical probability<br />
description must turn into a classical description of an ensemble, an ensemble of objects that have<br />
properties. But if we accept at the same time that quantum mechanics applies as well to macroscopic<br />
bodies as to microscopic ones, our expectation is refuted. This transition from one type of ensemble<br />
to the other is a problem which invariably emerges in considerations concerning the ‘measurement<br />
problem’. We will come back to this in chapter VIII.<br />
The previous discussion follows rather closely the formulations of Einstein and Pauli in the<br />
years 1948-1954, as can be found in the correspondence between Born and Einstein (Born 1971).<br />
An interesting aspect is that the discussion actually takes place over Born’s head. Born saw Einstein<br />
as the one who had, in his theory of relativity, abolished the idea of absolute simultaneity by means of<br />
the argument that it is meaningless to want to speak about something you cannot measure in principle.<br />
Einstein reacts (ibid., p. 188):<br />
There is nothing analogous in relativity to what I call incompleteness of description in<br />
the quantum theory. Briefly it is because the ψ-function is incapable of describing certain<br />
qualities of an individual system, whose ‘reality’ we none of us doubt (such as a<br />
macroscopic parameter).<br />
Moreover, Born continues to believe, despite everything Einstein writes, that Einstein objects<br />
to the indeterministic character of quantum mechanics, i.e., the fact that it only provides probability<br />
statements, instead of objecting to the alleged completeness of quantum mechanics, until Pauli<br />
intervenes in the discussion and explains Einstein’s position to Born (ibid., pp. 217-219).<br />
(iv) The last example is Schrödinger’s notorious cat paradox (Schrödinger 1935b).<br />
One can even set up quite ridiculous cases. A cat is penned up in a steel chamber, along<br />
with the following diabolical device (which must be secured against direct interference<br />
by the cat): in a Geiger counter there is a tiny bit of radioactive substance, so small<br />
that perhaps in the course of one hour one of the atoms decays, but also, with equal<br />
probability, perhaps none; if it happens, the counter tube discharges and through a relay<br />
releases a hammer which shatters a small flask of hydrocyanic acid. If one has left this<br />
entire system to itself for an hour, one would say that the cat still lives if meanwhile no<br />
atom has decayed. The first atomic decay would have poisoned it. The Ψ-function for<br />
the entire system would express this by having in it the living and the dead cat (pardon<br />
the expression) mixed or smeared out in equal parts.<br />
In this example several problems are combined. In the first place there is again the difference<br />
between a classical state and a quantum state. If the standard interpretation is extended consistently,<br />
the cat cannot be considered dead or alive as long as the chamber is not opened and the cat is not<br />
observed. (One may wonder what the cat itself thinks of this.)<br />
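The ‘smeared out’ Ψ-function that Schrödinger describes can be written schematically as follows. This is a sketch only: the state labels are hypothetical shorthand for the atom and the cat after one hour, and the equal weights and relative phase are an assumption matching the ‘equal probability’ in the quotation.<br />

```latex
% Schematic state after one hour; labels and phase are illustrative.
\[
  |\Psi\rangle = \frac{1}{\sqrt{2}}\Bigl(
     |\text{undecayed}\rangle \otimes |\text{alive}\rangle
   + |\text{decayed}\rangle \otimes |\text{dead}\rangle \Bigr)
\]
```

A classical description would instead say that the cat is either alive or dead, each with probability 1/2; the superposition assigns neither property to the unobserved cat.<br />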
The question whether it is permitted to extend the standard interpretation in this way coincides<br />
with the question if and to what extent the quantum mechanical description can be transferred from
the microscopic to the macroscopic level. Then there is the question of what exactly an observation is.<br />
Are cats observers of their own situation? And if consciousness is essential for an observation, do<br />
cats have the correct type of consciousness?<br />
From the examples above we can isolate the following central concepts:<br />
1. the real state of a system independent of measurement,<br />
2. incompleteness,<br />
3. measurement disturbance,<br />
4. complementarity,<br />
5. the transition from microscopic to macroscopic,<br />
6. consciousness.¹<br />
I. 2 INCOMPLETENESS AND LOCALITY<br />
The previous discussion only served to get the reader in the right mood! In 1935 Albert Einstein,<br />
Boris Podolsky and Nathan Rosen, from now on abbreviated as EPR, came up with an example<br />
which considerably sharpened the discussion (EPR 1935). Using rigorous reasoning they argued that<br />
quantum mechanics is an incomplete theory. As an introduction to their argumentation we will first<br />
examine a simpler argument that Einstein formulated in the same year in a letter to Schrödinger,<br />
as paraphrased by A. Fine (1986, p. 37).<br />
Consider a composite system of two particles which interacted with each other but are so widely<br />
separated in space now that they no longer interact. Suppose they are in a state |ψ⟩ which is an eigenstate<br />
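of the total momentum P₁ + P₂ with eigenvalue 0, but is not an eigenstate of P₁ or P₂ separately,<br />
\[ (P_1 + P_2)\,|\psi\rangle = 0 \quad\text{and}\quad P_1\,|\psi\rangle \neq a\,|\psi\rangle, \quad P_2\,|\psi\rangle \neq b\,|\psi\rangle \quad \text{for all } a, b \in \mathbb{R}. \tag{I.1} \]<br />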
Through a measurement of the momentum of particle 1 we can predict with certainty what the result<br />
will be of a measurement of the momentum of particle 2. Moreover, the measurement of particle 1<br />
has absolutely no physical influence on particle 2. But if it is possible to predict the momentum of<br />
particle 2 with certainty without any interaction with that particle, then particle 2 must already have<br />
this momentum before the measurement, and this must even be the case before the measurement of<br />
particle 1, since the measurement absolutely does not disturb particle 2. However, the value of this<br />
property of particle 2 cannot be derived from the quantum mechanical description using the state |ψ⟩.<br />
Therefore, quantum mechanics is incomplete.<br />
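A simple state of the kind used in this argument can be sketched as follows. It is an idealization, not taken from Einstein's letter: momentum eigenstates are not normalizable, and the labels p and p′ are illustrative.<br />

```latex
% Idealized example: superposition of two momentum-anticorrelated product states.
\[
  |\psi\rangle = \frac{1}{\sqrt{2}}\Bigl(
      |p\rangle_1 \otimes |{-p}\rangle_2 + |p'\rangle_1 \otimes |{-p'}\rangle_2 \Bigr),
  \qquad p \neq p' ,
\]
\[
  (P_1 + P_2)\,|\psi\rangle
  = \frac{1}{\sqrt{2}}\Bigl( (p - p)\,|p\rangle_1 |{-p}\rangle_2
  + (p' - p')\,|p'\rangle_1 |{-p'}\rangle_2 \Bigr) = 0 ,
\]
\[
  P_1\,|\psi\rangle
  = \frac{1}{\sqrt{2}}\Bigl( p\,|p\rangle_1 |{-p}\rangle_2
  + p'\,|p'\rangle_1 |{-p'}\rangle_2 \Bigr)
  \neq a\,|\psi\rangle \quad \text{for any } a \in \mathbb{R} .
\]
```

In this state a momentum measurement on particle 1 yields p or p′, and the corresponding result for particle 2 is then −p or −p′ with certainty, although |ψ⟩ itself assigns no definite momentum to either particle.<br />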
We see how Einstein succeeds, thanks to the strict correlation between the particles that quantum<br />
mechanics allows for, and thanks to the spatial separation of the particles, in refuting the argument of<br />
the measurement disturbance as a physical process. In the earlier examples we could imagine the<br />
measurement to create the outcome (although this already seemed a hardly convincing escape in Einstein’s<br />
example of macroscopic bodies), and that this outcome did not exist prior to the measurement<br />
because of the disturbance that comes with the measurement. We now see that we cannot imagine<br />
these measurement disturbances as spatially limited, ‘local’ processes. Einstein spoke of “a spooky<br />
action at a distance” and of “telepathy”.<br />
¹ The role of consciousness is regarded as essential by mathematicians and physicists like Von Neumann, London,<br />
Heitler and Wigner. The fact that they felt forced to take this highly unusual step in physical theory illustrates how serious<br />
the situation is.<br />
The case against the completeness of quantum mechanics gained strength with this example.<br />
However, objections can be made. (Later Einstein would be amused about the fact that everyone knew<br />
the argumentation was not correct but that everyone had a different reason to think so.) The argument<br />
uses the fact that in quantum mechanics there are eigenstates of P 1 + P 2 in which the momentum<br />
of each individual particle is undetermined. It could be objected that such states are perhaps not<br />
physically realizable, that only eigenstates of P 1 + P 2 which are at the same time also eigenstates of<br />
both P 1 and P 2 would be realizable, and that we should therefore replace the state |ψ⟩ by a mixture<br />
of such eigenstates, in which case the argumentation does not hold any longer.<br />
The EPR article itself gives a more balanced argumentation that does not have this shortcoming.<br />
The article deviates from the above on two points. First, not only the momentum, but also the position<br />
of the two particles is brought into the consideration. Second, EPR formulate a ‘sufficient condition of<br />
reality’ by means of the term ‘element of physical reality’, which we will call EPR(EPR). As worded<br />
by EPR, p. 777,<br />
EPR(EPR): If, without in any way disturbing a system, we can predict with certainty<br />
(i.e., with probability equal to unity) the value of a physical quantity, then there exists an<br />
element of physical reality corresponding to this physical quantity.<br />
How else could we explain that we are able to predict the outcomes of measurements with certainty?<br />
A necessary, and certainly sufficient, condition for a complete physical theory is, that each<br />
element of physical reality must have a counterpart in the theoretical description,<br />
COMP(T): If a physical theory T is complete, then every element of physical reality<br />
must have a counterpart in the theory T .<br />
It is possible to choose for |ψ⟩ a state which is a simultaneous eigenstate of the commuting operators<br />
P 1 + P 2 and Q 1 − Q 2 . In Dirac - notation, and only considering one spatial dimension, such a<br />
state is written in the ‘p - language’ and in the ‘q - language’, as<br />
$$|\psi\rangle = \int_{\mathbb{R}} |p_1 = p\rangle \otimes |p_2 = -p\rangle \, e^{-i l p} \, dp = \int_{\mathbb{R}} |q_1 = q\rangle \otimes |q_2 = q - l\rangle \, dq, \tag{I. 2}$$<br />
where l is the eigenvalue of the mutual distance Q 1 − Q 2 and can be chosen arbitrarily large, and<br />
the terms with the ‘cartwheels’ are the direct products, see subsection II. 5. 2, p. 31, of which the first<br />
factor refers to particle 1, and the second to particle 2. The ‘p - language’ and the ‘q - language’ can<br />
be ‘translated’ into each other by means of a Fourier - transformation. 2<br />
2 Without Dirac - notation but in terms of Dirac’s δ - ‘functions’ the wave function has, in ‘p - language’ and in ‘q - language’,<br />
the following form,<br />
$$\psi(p_1, p_2) = e^{-i l p_1}\, \delta(p_1 + p_2) \quad\text{and}\quad \tilde\psi(q_1, q_2) = \delta(q_1 - q_2 - l).$$<br />
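That P 1 + P 2 and Q 1 − Q 2 commute can be checked in one line. The sketch below assumes the canonical commutation relations [Q j , P k ] = iħ δ jk , which this section does not state explicitly, so that operators belonging to different particles commute:<br />

```latex
\begin{aligned}
[\,Q_1 - Q_2,\; P_1 + P_2\,]
  &= [Q_1, P_1] + [Q_1, P_2] - [Q_2, P_1] - [Q_2, P_2] \\
  &= i\hbar + 0 - 0 - i\hbar \;=\; 0 .
\end{aligned}
```

The two single-particle contributions cancel, which is why a simultaneous eigenstate of both operators can exist even though Q 1 and P 1 themselves do not commute.<br />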
Although this state |ψ⟩ is an eigenstate of the total momentum P 1 + P 2 of the two particles and<br />
of their mutual distance Q 1 − Q 2 , with eigenvalues 0 and l, respectively,<br />
$$(P_1 + P_2)\,|\psi\rangle = 0\,|\psi\rangle \quad\text{and}\quad (Q_1 - Q_2)\,|\psi\rangle = l\,|\psi\rangle, \tag{I. 3}$$<br />
it is not an eigenstate of any of the 1 - particle operators P 1 , Q 1 , P 2 or Q 2 . However, given the<br />
outcome of a measurement of P 1 , e.g. a, we can predict the result of a measurement of P 2 with<br />
certainty, namely −a. In the same way, from a measurement of Q 1 with outcome x, the result of a<br />
measurement of Q 2 follows with certainty, namely x − l.<br />
Now the argument runs as follows. If we were to measure the momentum P 1 of particle 1,<br />
then we could predict the value of P 2 with certainty, without disturbing particle 2. According to<br />
the aforementioned criterion the momentum P 2 of particle 2 must then correspond to an element of<br />
physical reality. On the other hand, if we were to measure the position Q 1 of particle 1, then we could<br />
predict the value of Q 2 with certainty, again without disturbing particle 2. In that case there must<br />
be an element of physical reality which corresponds to Q 2 . Therefore we can, depending on which<br />
measurement we perform on particle 1, assign an element of physical reality to particle 2.<br />
However, because of the absence of physical interaction between the particles there can be no real<br />
change in particle 2 as a result of what is done with particle 1. Consequently, particle 2 must have both<br />
elements of physical reality. But such a simultaneous assignment of exact position and momentum<br />
has no counterpart in the quantum mechanical formalism: there are no wave functions which are<br />
simultaneous eigenfunctions of position and momentum. The conclusion is unavoidable: the answer<br />
to the question in the title of their article, ‘Can quantum - mechanical description of physical reality be<br />
considered complete?’, must be negative.<br />
Notice that it is not necessary to measure P 1 and Q 1 simultaneously; the<br />
only thing that matters is the possibility to choose whether to predict the position or the momentum of<br />
particle 2 with certainty. Because of the absence of interaction between the two particles it makes no<br />
difference for particle 2 which choice is made for particle 1. This part of the argumentation relies<br />
on the supposition that the elements of physical reality have a local character. This implicit but<br />
reasonable locality premise runs as follows,<br />
LOC(EPR): Performing a measurement on a physical system S 1 does not have an instantaneous<br />
effect on elements of physical reality belonging to any system S 2 which is spatially<br />
separated from S 1 .<br />
We can thus summarize the argument of EPR schematically: quantum mechanics, QM, together<br />
with EPR(EPR) and LOC(EPR), implies that quantum mechanics is an incomplete theory, i.e.,<br />
not COMP(QM). In symbols:<br />
QM ∧ EPR(EPR) ∧ LOC(EPR) → ¬ COMP(QM). (I. 4)<br />
In comparison to the foregoing, the strength of this argument is, in the first place, the greater precision<br />
with which the argumentation has been set up: the conclusion follows logically from a number<br />
of explicitly formulated premises and conditions. Moreover, we see that we are able to attribute to<br />
particle 2 both position and momentum without interacting with particle 2. This means that we cannot<br />
avoid the argumentation by assuming that for the correct quantum mechanical description the
given wave function ψ must be replaced by a mixture of eigenstates. Simultaneous eigenstates of position and<br />
momentum are simply not available in quantum mechanics. The possibility to assign values to both P 2<br />
and Q 2 strikes at the heart of the idea of complementarity.<br />
EPR anticipated the objection that only that which has been measured is real (EPR 1935, p. 780),<br />
Indeed, one would not arrive at our conclusion if one insisted that two or more physical<br />
quantities can be regarded as simultaneous elements of reality only when they can be<br />
simultaneously measured or predicted. On this point of view, since either one or the<br />
other, but not both simultaneously, of the quantities P and Q can be predicted, they are<br />
not simultaneously real. This makes the reality of P and Q depend upon the process of<br />
measurement carried out on the first system, which does not disturb the second system in<br />
any way. No reasonable definition of reality could be expected to permit this.<br />
They conclude their article with the next paragraph,<br />
While we have thus shown that the wave function does not provide a complete description<br />
of physical reality, we left open the question of whether or not such a description exists.<br />
We believe, however, that such a theory is possible.<br />
The question whether a complete theory is possible is called the hidden variable problem. The<br />
so - called ‘hidden variable theories’ are attempts to solve this problem. We will come back to this in<br />
chapter V.<br />
Bohr’s (1935a) response to the argument of EPR aims at the question to what extent the condition<br />
for an element of ‘physical reality’, as worded by EPR, is fulfilled in their example. The next quotation<br />
is from Bohr (1935b, p. 700),<br />
From our point of view we now see that the wording of the aforementioned criterion of<br />
physical reality proposed by Einstein, Podolsky and Rosen contains an ambiguity as regards<br />
the meaning of the expression “without in any way disturbing a system.” Of course<br />
there is in a case like that just considered no question of a mechanical disturbance of<br />
the system under investigation during the last critical stage of the measuring procedure.<br />
But even at this stage there is essentially the question of an influence on the very conditions<br />
which define the possible types of predictions regarding the future behavior of<br />
the system. Since these conditions constitute an inherent element of the description of<br />
any phenomenon to which the term ‘physical reality’ can be properly attached, we see<br />
that the argumentation of the mentioned authors does not justify their conclusion that<br />
quantum mechanical description is essentially incomplete. (Emphasis added.)<br />
It is not easy to fully comprehend what Bohr says here. Evidently, he abandons the original<br />
idea that the measurement disturbance creates the measurement results, or, at least, that such a creation<br />
can be understood as a physical process. It is replaced by the idea that the applicability of physical concepts<br />
depends on the context of measurement. Performing a measurement on one of the particles<br />
is considered as determinative for the applicability of concepts to the other particle. Bohr says that<br />
the measurement disturbance is not a mechanical disturbance; apparently LOC(EPR) continues to<br />
apply for him if we, using the term ‘influence’, refer to a mechanical interaction, but not if we mean
by ‘influence’ the ‘defining effect’ of the context of measurement. The experimental circumstances<br />
define what you may call physical reality. Physical reality is not defined by experiments you could do,<br />
as is the case according to EPR, but exclusively by experiments you actually do. Under circumstances<br />
as described in the EPR experiment this ‘defining effect’ of the experimental setup also reaches parts<br />
of the system with which the measuring apparatus has no physical interaction.<br />
A distinct difference between Einstein and Bohr is that Einstein wants to visualize reality independent<br />
of observation, whereas Bohr is satisfied with complementary pictures of which the applicability<br />
always remains dependent on the chosen measurement setup. In 1955 Einstein says (Fine 1986, p. 95)<br />
It is basic for physics that one assumes a real world existing independently from any act<br />
of perception. But this we do not know. We take it only as a programme in our scientific<br />
endeavors. This programme is, of course, prescientific and our ordinary language is<br />
already based on it.<br />
And concerning the EPR situation he says (Schilpp 1949, p. 85)<br />
But on one supposition we should, in my opinion, absolutely hold fast: the real factual<br />
situation of the system S 2 is independent of what is done with the system S 1 , which is<br />
spatially separated from the former.<br />
Bohr’s conceptions concerning physical reality are much more difficult to characterize. According<br />
to him there is no independent reality of which the physical theory would have to give an unambiguous<br />
representation. He writes (Schilpp 1949, p. 211)<br />
Thus, a sentence like “we cannot know both the momentum and the position of an atomic<br />
object” immediately raises questions as to the physical reality of two such attributes of the<br />
object, which can be answered only by referring to the conditions for the unambiguous<br />
use of space - time concepts, on the one hand, and dynamical conservation laws, on the<br />
other hand.<br />
An exhaustive description of reality must always use concepts which themselves remain dependent<br />
on mutually excluding contexts. Bohr says (A. Petersen 1963, p. 11)<br />
The word ‘reality’ is also a word, a word which we must learn to use correctly.<br />
He constantly emphasizes the restricted applicability of our physical concepts, which makes the link<br />
between description and reality very complicated. Petersen mentions (ibid., p. 12)<br />
When asked whether the algorithm of quantum mechanics could be considered as somehow<br />
mirroring an underlying quantum world, Bohr would answer, “There is no quantum<br />
world. There is only an abstract quantum physical description. It is wrong to think that<br />
the task of physics is to find out how nature is. Physics concerns what we can say about<br />
nature.”
Einstein’s conceptions are, in a certain way, simpler than those of Bohr and correspond to the<br />
intuition of the majority of physicists. When the dominance of the Copenhagen school started to<br />
wane, in the 1960s, attention for Einstein’s viewpoint revived.<br />
In 1964, John Bell gave a reconstruction of the EPR experiment (see chapter VII) satisfying Einstein’s<br />
requirement that the real, factual situation of physical system S 2 is independent of what is<br />
done with system S 1 , the two systems being spatially separated. He constructed a very general model<br />
and made the surprising discovery that such a model cannot completely reproduce the quantum mechanical<br />
predictions. Especially remarkable are the broad generality of his derivation and the fact that<br />
the differences with quantum mechanics are large enough to be measured. Sensationally, a<br />
‘philosophical’ issue thus came within the range of experimental physics! Abner Shimony has spoken<br />
in this respect of experimental metaphysics.<br />
Bell’s work is an attempt to solve the completeness problem. Thereafter attempts were undertaken<br />
to actually carry out the EPR experiment, which had thus far been only a thought experiment. The first<br />
experiment was done in 1972 by Freedman and Clauser. Later, several other experiments were<br />
done, the highlight of which was, in 1982, the experiment of Alain Aspect and his group in Paris. In<br />
turn, this has been superseded by the experiments of Anton Zeilinger and his groups in Vienna and<br />
Innsbruck (e.g. Weihs 1998). The results of these experiments are in good to excellent agreement<br />
with quantum mechanics, and therefore in conflict with all models meeting Einstein’s requirements.<br />
The latter conclusion applies irrespective of the validity of quantum mechanics.<br />
These results brought about a great number of responses and are one of the main causes of the<br />
revived interest in interpretation problems of quantum mechanics. The discussion focusses on the<br />
question what exactly the suppositions are that lead to the result of Bell and whether his model is<br />
indeed the most general model that meets Einstein’s requirements.<br />
The consequences of Bell’s result seem to be considerable. It can be argued that no independent<br />
existence can be granted to objects that have at some time interacted, irrespective of how far apart they<br />
now are. This suggests that reality cannot be reduced<br />
to the ‘sum’ of its parts and that a more holistic approach is imperative, making our picture of nature<br />
much more complicated.<br />
Through the discussion of the EPR argument some more basic concepts are added to our list:<br />
7. element of physical reality,<br />
8. separability of physical systems,<br />
9. locality,<br />
10. holism.<br />
These ten concepts play a central role in the research on the foundations of quantum mechanics.
II<br />
THE FORMALISM<br />
As far as the laws of mathematics refer to reality, they are not certain; and as far as they<br />
are certain, they do not refer to reality.<br />
— Albert Einstein<br />
In mathematics you don’t understand things. You just get used to them.<br />
— John von Neumann<br />
The usual mathematical formulation of quantum mechanics was developed by John von Neumann<br />
in 1932 as an operator calculus on a Hilbert space. We will not need all details of this<br />
calculus and therefore give only a succinct review. For our purposes we can limit ourselves to a<br />
finite - dimensional Hilbert space, a complex vector space with an inner product. We will give an<br />
overview of the elementary concepts of this Hilbert space, and in an addendum concisely summarize<br />
the infinite - dimensional case. For a more extensive treatment of Hilbert spaces we refer<br />
to the first chapters of E. Prugovečki (2006).<br />
II. 1<br />
FINITE - DIMENSIONAL HILBERT SPACES<br />
We start this chapter by defining a space called a Hilbert space, denoted by H. The elements of H are<br />
called vectors. Following Dirac’s ket notation the vectors will be written as |α⟩,|β⟩,|γ⟩,|ϕ⟩,|ψ⟩,|χ⟩, . . . ;<br />
complex numbers will be denoted by the first letters of the alphabet, a, b, c ∈ C.<br />
Vectors can be added, and multiplied by a complex number, also called a scalar, and we then remain<br />
in H, i.e., for all |ϕ⟩, |ψ⟩ ∈ H and a, b ∈ C we have<br />
a|ϕ⟩ + b|ψ⟩ ∈ H. (II. 1)<br />
In other words, H is closed under linear combinations.<br />
The addition is commutative and associative,<br />
|ϕ⟩ + |ψ⟩ = |ψ⟩ + |ϕ⟩, (II. 2)<br />
|ϕ⟩ + ( |ψ⟩ + |χ⟩ ) = ( |ϕ⟩ + |ψ⟩ ) + |χ⟩. (II. 3)<br />
We require the existence of a null vector, 0 ∈ H, which is provably unique and has the property<br />
that for all |ϕ⟩ ∈ H<br />
0 + |ϕ⟩ = |ϕ⟩, (II. 4)<br />
and that every vector has an additive inverse, i.e., for every |ϕ⟩ ∈ H there is a vector |ϕ ′ ⟩ ∈ H, also<br />
provably unique, such that<br />
|ϕ⟩ + |ϕ ′ ⟩ = 0 . (II. 5)<br />
The scalar multiplication is distributive and associative,<br />
(a + b) ( |ϕ⟩ + |ψ⟩ ) = a |ϕ⟩ + a |ψ⟩ + b |ϕ⟩ + b |ψ⟩, (II. 6)<br />
a ( b |ϕ⟩ ) = (a b) |ϕ⟩, (II. 7)<br />
and we demand that<br />
1 |ψ⟩ = |ψ⟩. (II. 8)<br />
Occasionally we also write<br />
a |ψ⟩ ≡ |a ψ⟩ ≡ |ψ⟩ a. (II. 9)<br />
EXERCISE 1. Prove (a) 0|ϕ⟩ = 0 ,<br />
(b) the additive inverse of |ϕ⟩ equals −1|ϕ⟩.<br />
An inner product on a vector space is a mapping H × H → C, where the image in C<br />
of ( |ϕ⟩, |ψ⟩ ) ∈ H × H is written as ⟨ϕ | ψ⟩. The inner product has the following properties:<br />
(i) ⟨ϕ | a ψ + b χ⟩ = a ⟨ϕ | ψ⟩ + b ⟨ϕ | χ⟩,<br />
(ii) ⟨ϕ | ψ⟩ = ⟨ψ | ϕ⟩ ∗ ,<br />
(iii) ⟨ϕ | ϕ⟩ ≥ 0, (II. 10)<br />
(iv) ⟨ϕ | ϕ⟩ = 0 iff |ϕ⟩ = 0 .<br />
The value<br />
∥ψ∥ := √ ⟨ψ | ψ⟩ (II. 11)<br />
is called the norm of |ψ⟩ and meets the usual requirements for a norm: its value is positive, except<br />
for the zero vector, which is assigned 0; it is homogeneous, in the sense that ∥aψ∥ = |a| ∥ψ∥; and it<br />
satisfies the triangle inequality ∥ψ + ϕ∥ ≤ ∥ψ∥ + ∥ϕ∥. A vector is called a unit vector if its norm<br />
equals 1.<br />
An important inequality is the Cauchy - Schwarz inequality<br />
|⟨ϕ | ψ⟩|² ≤ ⟨ϕ | ϕ⟩ ⟨ψ | ψ⟩. (II. 12)<br />
EXERCISE 2. Prove (a) the Cauchy - Schwarz inequality (II. 12),<br />
(b) the definition of the norm satisfies the standard requirements for a norm.<br />
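The properties of the inner product and the norm can be checked numerically on C 2 . The following is a minimal sketch, not a proof: the two vectors and the scalar are arbitrary illustrative choices, and the helper names `inner` and `norm` are not taken from the text.<br />

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>: conjugate-linear in phi, linear in psi."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def norm(psi):
    """Norm derived from the inner product, as in (II. 11)."""
    return math.sqrt(inner(psi, psi).real)

# Two arbitrary example vectors in C^2 (illustrative values only).
phi = [1 + 2j, 3 - 1j]
psi = [2 - 1j, 1j]

# Property (ii): <phi|psi> equals the complex conjugate of <psi|phi>.
symmetric = abs(inner(phi, psi) - inner(psi, phi).conjugate()) < 1e-12

# Cauchy-Schwarz inequality (II. 12).
cauchy_schwarz = (abs(inner(phi, psi)) ** 2
                  <= inner(phi, phi).real * inner(psi, psi).real + 1e-12)

# Triangle inequality for the norm.
total = [a + b for a, b in zip(phi, psi)]
triangle = norm(total) <= norm(phi) + norm(psi) + 1e-12

# Homogeneity: ||a psi|| = |a| ||psi|| for an arbitrary scalar a.
a = 2 - 3j
homogeneous = abs(norm([a * c for c in phi]) - abs(a) * norm(phi)) < 1e-12

print(symmetric, cauchy_schwarz, triangle, homogeneous)
```

A numerical check on one pair of vectors only illustrates the statements; Exercise 2 still asks for the general proofs.<br />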
The n vectors |α 1 ⟩, . . . , |α n ⟩ are called (linearly) independent if it follows from<br />
$$\sum_{i=1}^{n} c_i\,|\alpha_i\rangle = 0 \tag{II. 13}$$<br />
that all coefficients c i are equal to zero, otherwise the vectors are called dependent.<br />
EXERCISE 3. Prove that mutually orthogonal nonzero vectors are linearly independent.<br />
A set of vectors |α 1 ⟩, . . . , |α N ⟩ in H is complete 1 if every vector |ψ⟩ ∈ H can be written as a<br />
linear combination of this set,<br />
$$|\psi\rangle = \sum_{i=1}^{N} c_i\,|\alpha_i\rangle. \tag{II. 14}$$<br />
A complete, independent set of vectors is called a basis. A basis is called orthonormal if<br />
⟨α i | α j ⟩ = δ ij , (II. 15)<br />
where δ ij is the Kronecker delta. It can be proved that every basis of a space H contains the same<br />
number of elements, this number is, by definition, the dimension of H, and is written dim H. The<br />
dimension of a Hilbert space is infinite if every finite set of linearly independent vectors is incomplete.<br />
If |α 1 ⟩, . . . , |α N ⟩ is an orthonormal basis, with N = dim H, then it follows from (II. 15) that the<br />
coefficients in (II. 14) are given by<br />
c i = ⟨α i | ψ⟩, (II. 16)<br />
and the vectors |ψ⟩ can thus be represented in such a basis by columns of N complex numbers.<br />
Therefore, an N - dimensional Hilbert space can also be written as C^N .<br />
1 The use of the term ‘complete’ for a system of vectors should not be confused with the same phrase as used within the<br />
context of the foundations of quantum mechanics, that is, as a property of a physical theory.
With (II. 16), in an orthonormal basis we have<br />
$$|\psi\rangle = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_N \end{pmatrix} \tag{II. 17}$$<br />
and hence $\langle\psi| = (c_1^*, c_2^*, \ldots, c_N^*)$, therefore<br />
$$|\psi\rangle\langle\psi| = \begin{pmatrix} c_1 c_1^* & \cdots & c_1 c_N^* \\ \vdots & \ddots & \vdots \\ c_N c_1^* & \cdots & c_N c_N^* \end{pmatrix}, \tag{II. 18}$$<br />
from which it is evident that for the vectors of the orthonormal basis {|α i ⟩} it holds that<br />
$$\sum_{i=1}^{N} |\alpha_i\rangle\langle\alpha_i| = \mathbb{1}, \tag{II. 19}$$<br />
with 𝟙 the identity mapping on H,<br />
𝟙 |ψ⟩ = |ψ⟩ ∀ |ψ⟩ ∈ H. (II. 20)<br />
Using (II. 14) and (II. 16), we see that an orthonormal basis is indeed characterized by the relation<br />
$$|\psi\rangle = \sum_{i=1}^{N} \langle\alpha_i|\psi\rangle\,|\alpha_i\rangle = \sum_{i=1}^{N} |\alpha_i\rangle\langle\alpha_i|\psi\rangle. \tag{II. 21}$$<br />
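The relations (II. 15), (II. 16), (II. 19) and (II. 21) can be illustrated with a small numerical sketch. The basis and the vector below are hypothetical choices made only for this example; any orthonormal basis of C 2 would serve equally well.<br />

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

s = 1 / math.sqrt(2)
# Hypothetical orthonormal basis of C^2, chosen only for illustration.
basis = [[s, s * 1j], [s, -s * 1j]]
psi = [1 + 1j, 2 - 3j]
dim = len(psi)

# Gram matrix <alpha_i|alpha_j>; should equal the Kronecker delta (II. 15).
gram = [[inner(u, v) for v in basis] for u in basis]

# Expansion coefficients c_i = <alpha_i|psi> (II. 16).
coeffs = [inner(alpha, psi) for alpha in basis]

# Rebuild |psi> = sum_i c_i |alpha_i> (II. 21).
rebuilt = [sum(c * alpha[k] for c, alpha in zip(coeffs, basis))
           for k in range(dim)]

# Matrix of sum_i |alpha_i><alpha_i|; should be the identity (II. 19).
resolution = [[sum(alpha[r] * alpha[c].conjugate() for alpha in basis)
               for c in range(dim)] for r in range(dim)]

print(gram)
print(rebuilt)
print(resolution)
```

Up to floating-point rounding, `gram` and `resolution` are the 2 × 2 identity matrix and `rebuilt` reproduces `psi`.<br />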
The definition of a finite - dimensional Hilbert space is now completed; it is a finite - dimensional<br />
complex vector space with an inner product, which is related to the norm by means of (II. 11). A real<br />
finite - dimensional Hilbert space is obtained by replacing C everywhere by R, i.e., the set of scalars is<br />
R and the inner product is always real. In section II. 6 we will see that for the infinite - dimensional<br />
case the definition must be extended with two requirements, ‘separability’ and ‘completeness’, which<br />
can be proved in the finite - dimensional case.<br />
II. 2<br />
OPERATORS<br />
An operator A on a Hilbert space H is a linear mapping of H into itself,<br />
A : H → H, |ψ⟩ ↦→ A |ψ⟩ with A ( a |ψ⟩ + b |ϕ⟩ ) = a A |ψ⟩ + b A |ϕ⟩. (II. 22)<br />
From (II. 16) we saw that in a given orthonormal basis |α 1 ⟩, . . . , |α N ⟩ the vectors |ψ⟩ ∈ H are<br />
unambiguously represented by columns of N complex numbers c i = ⟨α i | ψ⟩. This corresponds to the<br />
representation of an operator A as an N × N - matrix in the basis {|α i ⟩}, and the coefficients of the<br />
vector A|ψ⟩ in this basis are, using (II. 19),<br />
$$\langle\alpha_i|A|\psi\rangle = \langle\alpha_i|A\,\mathbb{1}|\psi\rangle = \sum_{j=1}^{N} \langle\alpha_i|A|\alpha_j\rangle\langle\alpha_j|\psi\rangle = \sum_{j=1}^{N} A_{ij}\,c_j, \tag{II. 23}$$<br />
with<br />
$$A_{ij} := \langle\alpha_i|A|\alpha_j\rangle. \tag{II. 24}$$<br />
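As an illustration of (II. 23) and (II. 24), the sketch below computes the matrix elements of an operator in a basis other than the standard one and checks that they reproduce the coefficients of A|ψ⟩. The operator, basis and vector are hypothetical examples, and `apply` is a helper name introduced here.<br />

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(op, psi):
    """Apply an operator, given as a matrix in the standard basis, to a column."""
    return [sum(row[k] * psi[k] for k in range(len(psi))) for row in op]

s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]   # hypothetical orthonormal basis |alpha_1>, |alpha_2>
A = [[0, 1], [1, 0]]        # hypothetical operator: swaps the two components
psi = [2, 1j]

# Matrix elements A_ij = <alpha_i|A|alpha_j> (II. 24).
A_alpha = [[inner(ai, apply(A, aj)) for aj in basis] for ai in basis]

# Coefficients c_j = <alpha_j|psi> of |psi> in the alpha basis (II. 16).
c = [inner(aj, psi) for aj in basis]

# Coefficients of A|psi>: computed directly, and via sum_j A_ij c_j (II. 23).
direct = [inner(ai, apply(A, psi)) for ai in basis]
via_matrix = [sum(A_alpha[i][j] * c[j] for j in range(2)) for i in range(2)]

print(A_alpha)
print(direct, via_matrix)
```

In this particular basis the matrix comes out diagonal, with entries 1 and −1 up to rounding, because the chosen basis vectors happen to be eigenvectors of the chosen operator.<br />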
Operators A and B can be added and multiplied,<br />
(A + B) |ψ⟩ := A |ψ⟩ + B |ψ⟩ and (A B) |ψ⟩ := A ( B |ψ⟩ ) . (II. 25)<br />
The adjoint A † of an operator A is defined by the following equation<br />
⟨ψ | A † | ϕ⟩ = ⟨ϕ | A | ψ⟩ ∗ ∀ |ϕ⟩, |ψ⟩ ∈ H. (II. 26)<br />
EXERCISE 4. Show that for the matrix representation in an orthonormal basis it holds that<br />
$$(A^\dagger)_{ij} = A_{ji}^*.$$<br />
Every operator on a finite - dimensional vector space has a unique adjoint, and the following holds<br />
$$(cA)^\dagger = c^* A^\dagger, \qquad (A + B)^\dagger = A^\dagger + B^\dagger, \qquad (AB)^\dagger = B^\dagger A^\dagger, \qquad (A^\dagger)^\dagger = A. \tag{II. 27}$$<br />
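The defining relation (II. 26) and the product rule in (II. 27) can be checked numerically. The sketch below represents the adjoint as the conjugate transpose in the standard basis of C 2 , which is the content of Exercise 4; the matrices and vectors are arbitrary illustrative choices.<br />

```python
def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(op, psi):
    """Apply a matrix to a column vector."""
    return [sum(row[k] * psi[k] for k in range(len(psi))) for row in op]

def dagger(op):
    """Conjugate transpose of a square matrix."""
    n = len(op)
    return [[op[c][r].conjugate() for c in range(n)] for r in range(n)]

def matmul(X, Y):
    """Product of two square matrices."""
    n = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

# Arbitrary example operators and vectors on C^2 (illustrative values only).
A = [[1 + 1j, 2], [0, 3 - 1j]]
B = [[0, 1j], [1, 1]]
phi = [1, 1j]
psi = [2 - 1j, 3]

# Defining relation (II. 26): <psi|A^dagger|phi> = <phi|A|psi>^*.
lhs = inner(psi, apply(dagger(A), phi))
rhs = inner(phi, apply(A, psi)).conjugate()

# Product rule from (II. 27): (A B)^dagger = B^dagger A^dagger.
left = dagger(matmul(A, B))
right = matmul(dagger(B), dagger(A))
product_rule_holds = all(abs(left[r][c] - right[r][c]) < 1e-12
                         for r in range(2) for c in range(2))

print(lhs, rhs, product_rule_holds)
```

For these values both sides of (II. 26) evaluate to 6 + 8i.<br />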
An operator B is called an inverse of A if<br />
A B = B A = 𝟙. (II. 28)<br />
In this case we write A⁻¹ for B, because the inverse, if it exists, is unique. Not every operator has an<br />
inverse; an example in the Hilbert space C^2 is<br />
$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}. \tag{II. 29}$$<br />
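A short argument shows why the operator (II. 29) cannot have an inverse. The label T for this operator is introduced here only for convenience:<br />

```latex
T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\qquad
T^2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
      \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
    = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.
% Suppose B were an inverse, so that T B = \mathbb{1}. Then
T = T\,\mathbb{1} = T\,(T B) = T^2 B = 0,
% which contradicts T \neq 0. Hence no inverse exists.
```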
The trace of an operator A is defined as follows,<br />
$$\mathrm{Tr}\,A := \sum_{i=1}^{N} \langle\gamma_i|A|\gamma_i\rangle, \tag{II. 30}$$<br />
where |γ 1 ⟩, . . . , |γ N ⟩ is an arbitrary orthonormal basis and N = dim H.
EXERCISE 5. Show that Tr A is independent of the choice of the orthonormal basis.<br />
The trace has the following properties:<br />
Tr A † = (Tr A) ∗ ,<br />
Tr (bA + cB) = b Tr A + c Tr B,<br />
Tr AB = Tr BA. (II. 31)<br />
EXERCISE 6. Prove the three statements in (II. 31).<br />
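Before proving these statements it can be helpful to see them hold in a concrete case. The sketch below checks basis independence of the trace and two of the properties in (II. 31) for arbitrary example matrices on C 2 ; the second orthonormal basis `gamma` is a hypothetical choice, and the check is an illustration, not a proof.<br />

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(op, psi):
    """Apply a matrix to a column vector."""
    return [sum(row[k] * psi[k] for k in range(len(psi))) for row in op]

def dagger(op):
    """Conjugate transpose of a square matrix."""
    n = len(op)
    return [[op[c][r].conjugate() for c in range(n)] for r in range(n)]

def matmul(X, Y):
    """Product of two square matrices."""
    n = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def trace(op):
    """Trace in the standard basis: sum of the diagonal elements."""
    return sum(op[i][i] for i in range(len(op)))

A = [[1 + 1j, 2], [0, 3 - 1j]]   # arbitrary example operators
B = [[0, 1j], [1, 1]]

# Trace computed as in (II. 30) with a different orthonormal basis.
s = 1 / math.sqrt(2)
gamma = [[s, s * 1j], [s, -s * 1j]]
trace_gamma = sum(inner(g, apply(A, g)) for g in gamma)

trace_adjoint = trace(dagger(A))   # should equal the conjugate of Tr A
trace_AB = trace(matmul(A, B))     # should equal Tr BA
trace_BA = trace(matmul(B, A))

print(trace(A), trace_gamma, trace_adjoint, trace_AB, trace_BA)
```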
We will now list the most important types of operators. An operator A is called normal if it<br />
commutes with its adjoint,<br />
$$[A, A^\dagger] := A A^\dagger - A^\dagger A = 0, \tag{II. 32}$$<br />
where 0 is actually the ‘zero operator’, which maps all vectors to the zero vector 0 . An operator is called<br />
self - adjoint or Hermitian if it is equal to its adjoint,<br />
A † = A, (II. 33)<br />
and with the first statement of (II. 31) we see that the trace of a self-adjoint operator is always real.<br />
Self - adjoint operators are normal, but not all normal operators are self - adjoint. An example is a unitary<br />
operator, which is defined by<br />
U † = U⁻¹ . (II. 34)<br />
EXERCISE 7. Prove that a unitary operator preserves the inner product, i.e., for all |ϕ⟩,|ψ⟩ ∈ H<br />
the following holds: if |ϕ ′ ⟩ = U |ϕ⟩ and |ψ ′ ⟩ = U |ψ⟩ then ⟨ψ ′ | ϕ ′ ⟩ = ⟨ψ | ϕ⟩.<br />
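A concrete unitary operator makes these statements tangible. The sketch below uses a rotation matrix multiplied by a phase factor, with hypothetical angles, and checks that it is unitary, that it is not self - adjoint, and that it preserves one particular inner product. It illustrates Exercise 7 but does not replace the proof.<br />

```python
import cmath
import math

def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(op, psi):
    """Apply a matrix to a column vector."""
    return [sum(row[k] * psi[k] for k in range(len(psi))) for row in op]

def dagger(op):
    """Conjugate transpose of a square matrix."""
    n = len(op)
    return [[op[c][r].conjugate() for c in range(n)] for r in range(n)]

def matmul(X, Y):
    """Product of two square matrices."""
    n = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two square matrices."""
    n = len(X)
    return all(abs(X[r][c] - Y[r][c]) < tol for r in range(n) for c in range(n))

theta, alpha = 0.3, 0.5          # hypothetical angles, illustrative only
phase = cmath.exp(1j * alpha)
U = [[phase * math.cos(theta), -phase * math.sin(theta)],
     [phase * math.sin(theta), phase * math.cos(theta)]]
identity = [[1, 0], [0, 1]]

# Unitarity (II. 34): U^dagger U = U U^dagger = identity, so U is also normal.
is_unitary = (close(matmul(dagger(U), U), identity)
              and close(matmul(U, dagger(U)), identity))
is_self_adjoint = close(dagger(U), U)

phi = [1, 1j]
psi = [2 - 1j, 3]
before = inner(psi, phi)
after = inner(apply(U, psi), apply(U, phi))

print(is_unitary, is_self_adjoint, before, after)
```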
An operator A is called positive, written A ≥ 0, if<br />
⟨ψ | A | ψ⟩ ≥ 0 ∀ |ψ⟩ ∈ H. (II. 35)<br />
An operator P is called a projection operator, or a projector for short, if it is self - adjoint and<br />
idempotent,<br />
P = P † and P² = P. (II. 36)<br />
An example of a projector, apart from the obvious examples of the zero operator 0 and the identity<br />
operator 𝟙, is the mapping P ϕ = |ϕ⟩ ⟨ϕ| which projects on a given unit vector |ϕ⟩,<br />
P ϕ : |ψ⟩ ↦→ ⟨ϕ | ψ⟩ |ϕ⟩ = |ϕ⟩ ⟨ϕ | ψ⟩. (II. 37)<br />
EXERCISE 8. Show that (a) every projector is positive,<br />
(b) if P is a projector, then 𝟙 − P is one also.<br />
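The projector P ϕ of (II. 37) can be written out as a matrix and its defining properties (II. 36) checked directly. In the sketch below the unit vector and the test vector are hypothetical choices; the checks illustrate the statements of Exercise 8 for this one case without proving them.<br />

```python
def inner(phi, psi):
    """Inner product <phi|psi>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def apply(op, psi):
    """Apply a matrix to a column vector."""
    return [sum(row[k] * psi[k] for k in range(len(psi))) for row in op]

def dagger(op):
    """Conjugate transpose of a square matrix."""
    n = len(op)
    return [[op[c][r].conjugate() for c in range(n)] for r in range(n)]

def matmul(X, Y):
    """Product of two square matrices."""
    n = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two square matrices."""
    n = len(X)
    return all(abs(X[r][c] - Y[r][c]) < tol for r in range(n) for c in range(n))

phi = [0.6, 0.8j]   # hypothetical unit vector: 0.36 + 0.64 = 1
P = [[phi[r] * phi[c].conjugate() for c in range(2)] for r in range(2)]
identity = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
Q = [[identity[r][c] - P[r][c] for c in range(2)] for r in range(2)]

idempotent = close(matmul(P, P), P)          # P^2 = P
self_adjoint = close(dagger(P), P)           # P = P^dagger
complement_is_projector = close(matmul(Q, Q), Q) and close(dagger(Q), Q)
orthogonal_to_complement = close(matmul(P, Q), zero)

# Positivity in this case: <psi|P|psi> = |<phi|psi>|^2, which is never negative.
psi = [1 + 2j, -1j]
expectation = inner(psi, apply(P, psi))

print(idempotent, self_adjoint, complement_is_projector, expectation)
```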
Projectors are the workhorses of Hilbert space. Nearly all of our further considerations concerning<br />
quantum mechanics can be formulated in terms of projectors, and therefore we will now discuss their<br />
properties in somewhat more detail.<br />
We write the set of all projectors on a Hilbert space H as P (H). Every projector P can be<br />
characterized by means of its range, i.e. the set<br />
H P := { P |ψ⟩ : |ψ⟩ ∈ H } . (II. 38)<br />
This set is closed under linear combinations and thus forms another Hilbert space by itself, called a<br />
subspace of H. Conversely, every subspace of H corresponds unambiguously to a projector. 2<br />
The subspace corresponding to a projector is also called its eigenspace, and if the dimension of its<br />
eigenspace is N, the projector is called N - dimensional.<br />
Two projectors P 1 and P 2 are called mutually orthogonal, written as P 1 ⊥ P 2 , if<br />
P 1 P 2 = 0. (II. 39)<br />
In that case their eigenspaces are also orthogonal,<br />
$$P_1 \perp P_2 \quad\text{iff}\quad \forall\,|\psi\rangle \in H_{P_1},\ \forall\,|\phi\rangle \in H_{P_2}:\ \langle\phi|\psi\rangle = 0. \tag{II. 40}$$<br />
EXERCISE 9. Verify that P 1 P 2 = 0 =⇒ P 2 P 1 = 0 holds for projectors.<br />
For two orthogonal projectors P 1 ⊥ P 2 , the sum P 1 + P 2 is also a projector since it is, as can be<br />
seen using (II. 27), self - adjoint, and it is idempotent,<br />
$$(P_1 + P_2)^2 = P_1^2 + P_1 P_2 + P_2 P_1 + P_2^2 = P_1^2 + P_2^2 = P_1 + P_2, \tag{II. 41}$$<br />
thereby satisfying the requirements (II. 36). The eigenspace of the projector P 1 + P 2 is the linear<br />
space spanned by the vectors in H P1 and H P2 .<br />
2 In infinite - dimensional Hilbert spaces this only holds for closed subspaces.
A set of projectors P 1 , . . . , P N is called mutually orthogonal if<br />
P i P j = δ ij P i for i, j = 1, . . . , N, (II. 42)<br />
a set of mutually orthogonal projectors is called complete if<br />
$$\sum_{i=1}^{N} P_i = \mathbb{1}. \tag{II. 43}$$<br />
In particular, in accordance with (II. 19), for an orthonormal basis |α 1 ⟩, . . . , |α N ⟩ it holds that the<br />
associated 1 - dimensional projectors form a complete set,<br />
$$\sum_{i=1}^{N} |\alpha_i\rangle\langle\alpha_i| = \mathbb{1}. \tag{II. 44}$$<br />
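The notions of mutually orthogonal and complete sets of projectors can be made concrete on C 3 . The sketch below builds the 1 - dimensional projectors of a hypothetical orthonormal basis and checks (II. 42), (II. 43) and the idempotency (II. 41) of a sum of two orthogonal projectors; the helper names are introduced here for illustration.<br />

```python
import math

def outer(v):
    """Matrix of the 1-dimensional projector |v><v| for a unit vector v."""
    return [[v[r] * v[c].conjugate() for c in range(len(v))]
            for r in range(len(v))]

def matmul(X, Y):
    """Product of two square matrices."""
    n = len(X)
    return [[sum(X[r][k] * Y[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def close(X, Y, tol=1e-12):
    """Entrywise comparison of two square matrices."""
    n = len(X)
    return all(abs(X[r][c] - Y[r][c]) < tol for r in range(n) for c in range(n))

s = 1 / math.sqrt(2)
basis = [[s, s, 0], [s, -s, 0], [0, 0, 1]]   # hypothetical orthonormal basis
n = len(basis)
zero = [[0] * n for _ in range(n)]
identity = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
projectors = [outer(v) for v in basis]

# (II. 42): P_i P_j = delta_ij P_i.
mutually_orthogonal = all(
    close(matmul(projectors[i], projectors[j]),
          projectors[i] if i == j else zero)
    for i in range(n) for j in range(n))

# (II. 43): the projectors sum to the identity.
total = [[sum(P[r][c] for P in projectors) for c in range(n)] for r in range(n)]
complete = close(total, identity)

# (II. 41): the sum of two orthogonal projectors is again idempotent.
P12 = [[projectors[0][r][c] + projectors[1][r][c] for c in range(n)]
       for r in range(n)]
sum_is_projector = close(matmul(P12, P12), P12)

print(mutually_orthogonal, complete, sum_is_projector)
```

Here `P12` projects onto the plane spanned by the first two basis vectors, which in this example is the plane of the first two coordinates.<br />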
II. 3<br />
EIGENVALUE PROBLEM AND SPECTRAL THEOREM<br />
If |β 1 ⟩, . . . , |β N ⟩ is an arbitrary orthonormal basis, an operator A is represented in this basis as<br />
an arbitrary N × N - matrix,<br />
A ij = ⟨β i | A | β j ⟩. (II. 45)<br />
A powerful tool for the study of such matrices is obtained if they can be ‘diagonalized’, i.e., if an<br />
orthonormal basis |α 1 ⟩, . . . , |α N ⟩ can be found where the matrix representation of A is of the form<br />
$$A = \begin{pmatrix} a_1 & & 0 \\ & \ddots & \\ 0 & & a_N \end{pmatrix}, \tag{II. 46}$$<br />
or, equivalently,<br />
A ij = a j δ ij . (II. 47)<br />
For such a basis it holds that<br />
A |α i ⟩ = a i |α i ⟩. (II. 48)<br />
Equation (II. 48) is called the eigenvalue equation of the operator A, the values a i are called the<br />
eigenvalues of A, the set of eigenvalues of A the spectrum of A, written as Spec A, the vectors |α i ⟩<br />
are called the eigenvectors, and the system |α 1 ⟩, . . . , |α N ⟩ an eigenbasis of A. For a self - adjoint<br />
operator it holds that the eigenvalues are all real, and the eigenvalues are not negative if the operator<br />
is positive. For a unitary operator U all eigenvalues u i ∈ C are on the complex unit circle, |u i | = 1,<br />
for a projector the eigenvalues are 0 or 1.<br />
Such an orthonormal eigenbasis does not, however, always exist. An example is the operator<br />
(II. 29), whose eigenvectors span only a 1 - dimensional subspace. The conditions under which an<br />
eigenbasis exists are given by the next important theorem which we mention without proof.<br />
SPECTRAL THEOREM:<br />
Every normal operator A has an orthonormal basis of eigenvectors |α 1 ⟩, . . . , |α N ⟩ and<br />
associated eigenvalues a 1 , . . . , a N , not necessarily distinct, satisfying (II. 48).<br />
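The operator (II. 29) shows what goes wrong without normality. In the worked check below the label T and the components x, y, λ are notation introduced only for this example:<br />

```latex
T = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
T \begin{pmatrix} x \\ y \end{pmatrix}
  = \begin{pmatrix} y \\ 0 \end{pmatrix}
  = \lambda \begin{pmatrix} x \\ y \end{pmatrix}
\;\Longrightarrow\; y = \lambda x, \quad 0 = \lambda y .
% If \lambda \neq 0 then y = 0 and hence x = 0, which is not an eigenvector.
% So \lambda = 0, y = 0, and every eigenvector is a multiple of (1, 0)^T:
% there is no basis of eigenvectors. Consistently, T is not normal:
T T^\dagger = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}
\;\neq\;
\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = T^\dagger T .
```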
The spectral theorem tells us that normal operators can be diagonalized. This can be formulated<br />
more elegantly in Dirac notation, where we must distinguish between the case in which all eigenvalues<br />
differ from each other, and the case in which some eigenvalues are equal. In the first case the operator<br />
is called maximal, in the second case the operator is called degenerate.<br />
Suppose that the operator A is maximal, i.e. all eigenvalues a i differ from each other, a i ≠ a j<br />
if i ≠ j. In this case we often use the eigenvalues as a label for the eigenvectors and write |a i ⟩ instead<br />
of |α i ⟩. This notation is unambiguous, since each eigenvalue belongs to exactly one eigenvector of the basis.<br />
Now, according to the spectral theorem, there is an orthonormal basis |a 1 ⟩, . . . , |a N ⟩ such that<br />
$A = \sum_{i=1}^{N} a_i \, |a_i\rangle \langle a_i|$, (II. 49)<br />
since, with (II. 44), it holds for all |ψ⟩ ∈ H that<br />
$A |\psi\rangle = A \, \mathbb{1} \, |\psi\rangle = A \sum_{i=1}^{N} |a_i\rangle \langle a_i | \psi\rangle = \sum_{i=1}^{N} a_i \, |a_i\rangle \langle a_i | \psi\rangle$. (II. 50)<br />
If the operator is degenerate there are only M < N distinct eigenvalues a 1 , . . . , a M . For every<br />
eigenvalue a i , there exists a number n i of mutually orthogonal eigenvectors, for which we have<br />
$\sum_{i=1}^{M} n_i = N$. (II. 51)<br />
The eigenvalue a i is called n i - fold degenerate. The associated eigenvectors span a n i - dimensional<br />
subspace of eigenvectors for the value a i .<br />
Choose, in this subspace, an orthonormal basis {|α i , j⟩} with j = 1, . . . , n i . Here we can also<br />
use the eigenvalues a i as a label for the basis vectors because the extra label j prevents our notation<br />
from becoming ambiguous. Now the eigenvalue equation (II. 48) becomes<br />
A |a i , j⟩ = a i |a i , j⟩. (II. 52)<br />
Analogous to (II. 49), we find<br />
$A = \sum_{i=1}^{M} a_i \sum_{j=1}^{n_i} |a_i, j\rangle \langle a_i, j|$, (II. 53)<br />
which, in terms of the n i - dimensional eigenprojectors<br />
$P_{a_i} = \sum_{j=1}^{n_i} |a_i, j\rangle \langle a_i, j|$, (II. 54)<br />
can also be written as<br />
$A = \sum_{i=1}^{M} a_i \, P_{a_i}$. (II. 55)<br />
EXERCISE 10. (a.) Show that P ai in (II. 54) is independent of the choice of the orthonormal<br />
basis |a i , 1⟩, . . . , |a i , n i ⟩. (b.) Show that for P ai as defined in (II. 54) and P ϕ given by (II. 37):<br />
Tr P ai P ϕ = ⟨ϕ|P ai |ϕ⟩. (II. 56)<br />
We summarize the two preceding cases in the following, equivalent, form of the spectral theorem,<br />
formulated in terms of projectors.<br />
SPECTRAL THEOREM:<br />
For every normal operator A a unique set of mutually distinct eigenvalues a 1 , . . . , a M<br />
exists, with M ≤ N, and an associated unique complete set of mutually orthogonal projectors<br />
P a1 , . . . , P aM , such that<br />
$A = \sum_{i=1}^{M} a_i \, P_{a_i}$, (II. 57)<br />
$\mathbb{1} = \sum_{i=1}^{M} P_{a_i}$. (II. 58)<br />
If the operator is non - degenerate, all of these projectors are 1 - dimensional; if it is degenerate,<br />
dim P ai gives the degeneracy of eigenvalue a i . Equation (II. 57) is called the spectral decomposition<br />
of A, the set of mutually orthogonal projectors P ai is called the spectral family of A, and (II. 58) a<br />
resolution of identity.<br />
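As a small numerical illustration of the spectral decomposition and the resolution of identity, the following Python sketch checks both for the Pauli matrix σ x , whose eigenvalues ±1 and eigenvectors (1, ±1)/√2 are standard results. The matrix is chosen here only as an example, and the helper names (outer, add, scale, close) are illustrative, not notation from these notes.<br />

```python
import math

def outer(v):
    """Projector |v><v| for a normalized vector v (list of numbers)."""
    return [[vi * complex(vj).conjugate() for vj in v] for vi in v]

def add(X, Y):
    """Entrywise sum of two matrices given as nested lists."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Multiply every entry of X by the number c."""
    return [[c * x for x in row] for row in X]

def close(X, Y, tol=1e-12):
    """True if X and Y agree entrywise up to the tolerance tol."""
    return all(abs(x - y) < tol for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

s = 1 / math.sqrt(2)
sigma_x = [[0, 1], [1, 0]]
eigenpairs = [(+1, [s, s]), (-1, [s, -s])]   # (eigenvalue, normalized eigenvector)

# One-dimensional eigenprojectors P_a = |a><a|
projectors = {a: outer(v) for a, v in eigenpairs}

# Spectral decomposition: sum_i a_i P_{a_i}
A = add(scale(+1, projectors[+1]), scale(-1, projectors[-1]))
# Resolution of identity: sum_i P_{a_i}
identity = add(projectors[+1], projectors[-1])

decomposition_ok = close(A, sigma_x)
resolution_ok = close(identity, [[1, 0], [0, 1]])
print(decomposition_ok, resolution_ok)
```

Since σ x is maximal, both projectors are 1 - dimensional; for a degenerate operator the same check would use the sums (II. 54) over each eigenspace.<br />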
II. 3. 1<br />
APPENDIX<br />
A formulation of the spectral theorem which is equivalent to the preceding, but is more suitable<br />
for generalizations, can be obtained if we introduce the correspondence between the eigenvalues and<br />
the associated eigenprojectors as a mapping A of all subsets of Spec A ⊂ C to the set P (H) of<br />
projectors on H.<br />
We construct that mapping by demanding<br />
{a i } ↦→ P ai , (II. 59)
and extend this with the condition<br />
{a 1 , a 2 } ↦→ P {a1 , a 2 } := P a1 + P a2 , (II. 60)<br />
or, more generally, if ∆ represents an arbitrary set of eigenvalues, we define<br />
$\Delta \mapsto P_\Delta = \sum_{a \in \Delta} P_a$. (II. 61)<br />
A mapping A : C → P (H) is called a projection - valued measure if<br />
(i) P ∅ = 0<br />
(ii) P Spec A = 11<br />
(iii) $P_{\cup_i \Delta_i} = \sum_i P_{\Delta_i}$, for all ∆ i mutually disjoint. (II. 62)<br />
EXERCISE 11. Verify that: P ∆ c = 11 − P ∆ where ∆ c = Spec A \ ∆ is the complement of ∆.<br />
The spectral theorem can now again be formulated.<br />
SPECTRAL THEOREM:<br />
Every normal operator A corresponds unambiguously to a projection - valued measure A.<br />
II. 4<br />
FUNCTIONS OF NORMAL OPERATORS<br />
The spectral theorem makes it possible to treat functions of normal operators in a simple manner.<br />
If f is an arbitrary function, real or complex, and A is an operator with spectral decomposition<br />
$A = \sum_{i=1}^{M} a_i \, P_{a_i}$, (II. 63)<br />
then the function f (A) of A is defined as<br />
$f(A) := \sum_{i=1}^{M} f(a_i) \, P_{a_i}$. (II. 64)<br />
This means that f (A) always has the same eigenvectors and eigenprojections as A, and only differs<br />
from A in the labeling of its eigenvalues, namely by f (a i ) instead of a i . As an example, consider the<br />
characteristic function χ a of a ∈ C,<br />
$\chi_a : \mathbb{C} \to \{0, 1\}, \quad x \mapsto \chi_a(x) := \begin{cases} 1 & \text{if } x = a \\ 0 & \text{otherwise} \end{cases}$ (II. 65)<br />
for which, with (II. 64), we have<br />
$\chi_{a_k}(A) := \sum_{i=1}^{M} \chi_{a_k}(a_i) \, P_{a_i} = P_{a_k}$, (II. 66)<br />
and we see that the projectors from the spectral decomposition of A, (II. 63), are functions of A.<br />
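The definition of f (A) through the spectral family can be sketched in a few lines of Python. The example operator (the Pauli matrix σ x with its standard eigenprojectors), the angle theta and the helper name apply_function are assumptions made for this illustration only. The sketch evaluates f (x) = e^{iθx} on σ x , compares the result with the well-known closed form cos θ · 11 + i sin θ · σ x , and confirms that the characteristic function of the eigenvalue +1 returns the corresponding eigenprojector.<br />

```python
import cmath
import math

# Eigenprojectors of sigma_x for the eigenvalues +1 and -1 (standard result).
P_plus = [[0.5, 0.5], [0.5, 0.5]]
P_minus = [[0.5, -0.5], [-0.5, 0.5]]
spectral_family = [(+1, P_plus), (-1, P_minus)]

def apply_function(f, family):
    """Return f(A) = sum_i f(a_i) P_{a_i} as a nested list of complex numbers."""
    n = len(family[0][1])
    result = [[0j] * n for _ in range(n)]
    for a, P in family:
        for r in range(n):
            for c in range(n):
                result[r][c] += f(a) * P[r][c]
    return result

theta = 0.3  # arbitrary illustrative angle
U = apply_function(lambda x: cmath.exp(1j * theta * x), spectral_family)

# Closed form for comparison: cos(theta) * identity + i sin(theta) * sigma_x
expected = [[math.cos(theta), 1j * math.sin(theta)],
            [1j * math.sin(theta), math.cos(theta)]]
max_error = max(abs(U[r][c] - expected[r][c]) for r in range(2) for c in range(2))

# The characteristic function of the eigenvalue +1 gives back P_plus.
chi = apply_function(lambda x: 1 if x == 1 else 0, spectral_family)
print(max_error, chi == P_plus)
```

Only the eigenvalues are changed by f; the eigenprojectors are reused unchanged.<br />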
We use the spectral decompositions in the proof of the following theorem.<br />
THEOREM:<br />
If two self - adjoint operators A and B commute, there is a maximal, self - adjoint operator<br />
C of which both A and B are a function.<br />
To prove this theorem we first prove a useful lemma.<br />
LEMMA:<br />
If [A, B] = 0, a basis {|γ i ⟩} exists in which A and B are simultaneously diagonal.<br />
Proof<br />
Let {|a i , j⟩} be an orthonormal eigenbasis of operator A, where j = 1, . . . , n i is the degeneracy<br />
of eigenvalue a i , and we have<br />
⟨a p , q | a i , j⟩ = δ pi δ qj . (II. 67)<br />
Analogously, let there be an orthonormal eigenbasis {|b k , l⟩} for operator B. From [A, B] = 0<br />
and (II. 63) it follows that<br />
A ( B |a i , j⟩ ) = B A |a i , j⟩ = a i B |a i , j⟩, (II. 68)<br />
and B |a i , j⟩ is, apparently, an eigenvector of A with the eigenvalue a i , i.e., B |a i , j⟩ is in the<br />
eigenspace spanned by |a i , 1⟩, . . . , |a i , n i ⟩. Or, equivalently,<br />
$B |a_i, j\rangle = \sum_{k=1}^{n_i} \Lambda^{[i]}_{k,j} \, |a_i, k\rangle$ (II. 69)<br />
holds for certain numbers $\Lambda^{[i]}_{k,j} \in \mathbb{C}$.<br />
By assumption, B is self-adjoint and therefore the matrix Λ [i] must be Hermitian,<br />
and we see that<br />
$\langle a_k, l | B | a_i, j\rangle = \Lambda^{[i]}_{l,j} \, \delta_{ki} = \Lambda^{[i]}_{l,j}$, (II. 70)<br />
$\langle a_k, l | B | a_i, j\rangle^{*} = \Lambda^{[i]\,*}_{l,j} = \langle a_i, j | B^{\dagger} | a_k, l\rangle = \Lambda^{[k]}_{j,l} \, \delta_{ik} = \Lambda^{[i]}_{j,l}$, (II. 71)<br />
$\Lambda^{[i]\,*}_{l,j} = \Lambda^{[i]}_{j,l}$. (II. 72)<br />
Because Λ [i] is self - adjoint, it can be diagonalized by a unitary matrix S [i] ,<br />
$\Lambda'^{[i]} = S^{[i]\,-1} \, \Lambda^{[i]} \, S^{[i]}$. (II. 73)<br />
This corresponds to an orthonormal basis transformation within the n i - dimensional subspace<br />
with eigenvalue a i . Carrying out this transformation in each of the subspaces and writing |a i , m ′ ⟩<br />
for the transformed eigenvectors of A, we have<br />
$|a_i, m'\rangle = \sum_{j=1}^{n_i} S^{[i]}_{j,m'} \, |a_i, j\rangle$. (II. 74)<br />
In the new basis {|a i , m ′ ⟩} the matrix Λ [i] is diagonalized and therefore<br />
$B |a_i, m'\rangle = \Lambda'^{[i]}_{m',m'} \, \delta_{m'j} \, |a_i, j\rangle = \Lambda'^{[i]}_{m',m'} \, |a_i, m'\rangle$. (II. 75)<br />
The vectors |a i , m ′ ⟩ are not just eigenvectors of A, but also of B and form, by construction, a<br />
basis. □<br />
Notice that it is not in contradiction to this lemma if non - commuting operators have some eigenvectors<br />
in common. ▹<br />
Now we come to the proof of the theorem.<br />
Proof<br />
Define, in the basis {|γ i ⟩} of the lemma<br />
$A = \sum_i a_i \, P_{|\gamma_i\rangle}$ and $B = \sum_i b_i \, P_{|\gamma_i\rangle}$, (II. 76)<br />
where the eigenvalues a i and b i are allowed to be degenerate. Next, define a maximal self - adjoint<br />
operator<br />
$C = \sum_i c_i \, P_{|\gamma_i\rangle}$, (II. 77)<br />
with all c i ∈ R distinct; the c i must be real because C is self-adjoint.<br />
Then, according to (II. 66), with χ ci defined analogously to (II. 65),<br />
P |γi⟩ = χ ci (C). (II. 78)<br />
With $f(x) = \sum_i a_i \, \chi_{c_i}(x)$ and $g(x) = \sum_i b_i \, \chi_{c_i}(x)$, as defined in (II. 64), we now find<br />
$A = \sum_i a_i \, \chi_{c_i}(C) = f(C)$ and $B = \sum_i b_i \, \chi_{c_i}(C) = g(C)$. (II. 79)<br />
Thus the self-adjoint, mutually commuting operators A and B are both functions of the maximal,<br />
self-adjoint operator C, which is what we set out to prove. □<br />
Note that the choice of C in the above theorem is not unique. Indeed, suppose that<br />
A = f (C 1 ) = g(C 2 ), (II. 80)<br />
where C 1 and C 2 are both maximal. In general, it is not required for C 1 and C 2 to commute.<br />
But they do commute if A itself is maximal. In that case f can be inverted,<br />
C 1 = f − 1 (A) = f − 1 (g(C 2 )) (II. 81)<br />
from which it follows that<br />
[C 1 , C 2 ] = 0. ▹ (II. 82)<br />
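A minimal worked example of the theorem may help. The diagonal matrices below are chosen here purely for illustration and do not appear elsewhere in these notes; C and C ′ are two different maximal operators of which A and B are both functions.<br />

```latex
\begin{gathered}
A = \mathrm{diag}(1, 1, -1, -1), \qquad B = \mathrm{diag}(1, -1, 1, -1), \qquad [A, B] = 0, \\
C := 2A + B = \mathrm{diag}(3, 1, -1, -3), \qquad C' := A + 2B = \mathrm{diag}(3, -1, 1, -3), \\
f(3) = f(1) = 1, \quad f(-1) = f(-3) = -1 \quad \Rightarrow \quad A = f(C), \\
g(3) = g(-1) = 1, \quad g(1) = g(-3) = -1 \quad \Rightarrow \quad B = g(C).
\end{gathered}
```

Both A and B are degenerate, whereas C and C ′ each have four distinct eigenvalues, so neither f nor g can be inverted. In this particular example C and C ′ happen to commute because both are diagonal in the same basis.<br />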
II. 5<br />
DIRECT SUM AND DIRECT PRODUCT<br />
There are two ways to construct a new Hilbert space H from two given Hilbert spaces H 1<br />
and H 2 , or vice versa, to divide a given Hilbert space H into smaller spaces.<br />
II. 5. 1<br />
DIRECT SUM<br />
Let H 1 and H 2 be two Hilbert spaces. By definition we call the space H := H 1 ⊕ H 2 the direct<br />
sum space of H 1 and H 2 if the following requirements are satisfied:<br />
(i) The space H 1 ⊕ H 2 contains as its elements all ordered pairs of vectors, written as |ϕ⟩ 1 ⊕ |ψ⟩ 2 ,<br />
with |ϕ⟩ 1 ∈ H 1 and |ψ⟩ 2 ∈ H 2 .<br />
(ii) Addition and scalar multiplication are defined on H 1 ⊕ H 2 , and obey<br />
$a \left( |\phi\rangle_1 \oplus |\psi\rangle_2 \right) + b \left( |\chi\rangle_1 \oplus |\xi\rangle_2 \right) = \left( a |\phi\rangle_1 + b |\chi\rangle_1 \right) \oplus \left( a |\psi\rangle_2 + b |\xi\rangle_2 \right)$. (II. 83)<br />
(iii) The inner product is additive,<br />
$\left( {}_1\langle\phi| \oplus {}_2\langle\phi| \right) \left( |\psi\rangle_1 \oplus |\psi\rangle_2 \right) = {}_1\langle\phi | \psi\rangle_1 + {}_2\langle\phi | \psi\rangle_2$. (II. 84)<br />
(iv) H 1 ⊕ H 2 is the smallest Hilbert space spanned by the elements of the form |ϕ⟩ 1 ⊕ |ψ⟩ 2 and<br />
their linear combinations.
A few remarks about this definition are in order. (a) According to (II. 83), an arbitrary linear<br />
combination of elements in H 1 ⊕ H 2 is of the form<br />
$\sum_i a_i \left( |\phi_i\rangle_1 \oplus |\psi_i\rangle_2 \right) = \sum_i a_i \, |\phi_i\rangle_1 \oplus \sum_i a_i \, |\psi_i\rangle_2$. (II. 85)<br />
Consequently, with |ϕ⟩ 1 := ∑ i a i |ϕ i ⟩ 1 ∈ H 1 and |ψ⟩ 2 := ∑ i a i |ψ i ⟩ 2 ∈ H 2 , all elements in H 1 ⊕ H 2<br />
are of the form |ϕ⟩ 1 ⊕|ψ⟩ 2 . This means that the requirements (i) and (ii) imply that H 1 ⊕H 2 is closed<br />
under linear combinations.<br />
(b) The subspace of H 1 ⊕ H 2 consisting of all vectors of the form 0 1 ⊕ |ψ⟩ 2 , with 0 1 the null vector<br />
in H 1 , and |ψ⟩ 2 ∈ H 2 arbitrary, is isomorphic to H 2 , likewise for |ϕ⟩ 1 ⊕ 0 2 and H 1 . Moreover, these<br />
two subspaces of H 1 ⊕ H 2 are mutually orthogonal, because<br />
$\left( {}_1\langle\phi| \oplus 0_2 \right) \left( 0_1 \oplus |\psi\rangle_2 \right) = {}_1\langle\phi | 0\rangle_1 + {}_2\langle 0 | \psi\rangle_2 = 0$. (II. 86)<br />
Therefore, every vector |χ⟩ ∈ H 1 ⊕ H 2 can be written uniquely as the direct sum of two orthogonal<br />
terms,<br />
|χ⟩ = |ϕ⟩ 1 ⊕ |ψ⟩ 2 = |ϕ⟩ 1 ⊕ 0 2 + 0 1 ⊕ |ψ⟩ 2 . (II. 87)<br />
Vice versa, suppose that H is an arbitrary Hilbert space, and that H 1 is a subspace of H. Now<br />
let H 2 = H 1 ⊥ be the orthocomplement of H 1 , i.e., H 2 contains all vectors in H which are perpendicular<br />
to all vectors in H 1 . Then H = H 1 ⊕ H 2 holds, with the identification<br />
|ϕ⟩ 1 ⊕ 0 2 ↔ |ϕ⟩ ∈ H 1 , (II. 88)<br />
0 1 ⊕ |ψ⟩ 2 ↔ |ψ⟩ ∈ H 2 , (II. 89)<br />
and<br />
|ϕ⟩ ⊕ |ψ⟩ = |ϕ⟩ + |ψ⟩. (II. 90)<br />
In this case the direct sum ⊕ is nothing but ordinary addition in H, which was given in (II. 1) as<br />
a general property of H. This means that every Hilbert space can be written as a direct sum of<br />
an arbitrary subspace and its orthocomplement. We also see something that holds generally: the<br />
dimension of H 1 ⊕ H 2 is the sum of the dimensions of H 1 and H 2 ,<br />
dim (H 1 ⊕ H 2 ) = dim H 1 + dim H 2 . (II. 91)<br />
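For finite - dimensional spaces the direct sum can be sketched concretely by concatenating component lists. The following Python fragment is only an illustration under that assumption (H 1 = C^2 and H 2 = C^3 , with arbitrarily chosen vectors and hypothetical helper names); it checks the additivity of the inner product, the orthogonality of the two embedded subspaces, and the dimension formula.<br />

```python
def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def direct_sum(u, v):
    """Represent |u>_1 (+) |v>_2 by concatenating the component lists."""
    return list(u) + list(v)

# Arbitrary illustrative vectors in H1 = C^2 and H2 = C^3
phi1, phi2 = [1, 2j], [0, 1, 1j]
psi1, psi2 = [3, 1], [2, 0, 1]

# Additivity of the inner product
lhs = inner(direct_sum(phi1, phi2), direct_sum(psi1, psi2))
rhs = inner(phi1, psi1) + inner(phi2, psi2)

# dim(H1 (+) H2) = dim H1 + dim H2
dimension = len(direct_sum(phi1, phi2))

# Vectors of the form |phi>_1 (+) 0 and 0 (+) |psi>_2 are orthogonal.
overlap = inner(direct_sum(phi1, [0, 0, 0]), direct_sum([0, 0], psi2))
print(lhs, rhs, dimension, overlap)
```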
II. 5. 2<br />
DIRECT PRODUCT<br />
There is another, actually more important, way to construct a new Hilbert space out of two given<br />
spaces. Again, let H 1 and H 2 be two Hilbert spaces. By definition we call the space H :=<br />
H 1 ⊗ H 2 the direct product space if the following requirements have been satisfied.
(i) The space H 1 ⊗ H 2 has as its elements at least all ordered pairs ( |ϕ⟩ 1 , |ψ⟩ 2 ), with |ϕ⟩ 1 ∈ H 1<br />
and |ψ⟩ 2 ∈ H 2 , which we now write as |ϕ⟩ 1 ⊗ |ψ⟩ 2 .<br />
(ii) The addition and scalar multiplication on H 1 ⊗ H 2 satisfy<br />
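$|\phi\rangle_1 \otimes |\psi\rangle_2 + |\phi\rangle_1 \otimes |\chi\rangle_2 = |\phi\rangle_1 \otimes \left( |\psi\rangle_2 + |\chi\rangle_2 \right)$ (II. 92)<br />
and<br />
$a \left( |\phi\rangle_1 \otimes |\psi\rangle_2 \right) = a |\phi\rangle_1 \otimes |\psi\rangle_2 = |\phi\rangle_1 \otimes a |\psi\rangle_2$. (II. 93)<br />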
(iii) The inner product is multiplicative,<br />
$\left( {}_1\langle\phi| \otimes {}_2\langle\chi| \right) \left( |\psi\rangle_1 \otimes |\xi\rangle_2 \right) = {}_1\langle\phi | \psi\rangle_1 \; {}_2\langle\chi | \xi\rangle_2$. (II. 94)<br />
(iv) H 1 ⊗ H 2 is the smallest Hilbert space spanned by vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 ∈ H and<br />
their linear combinations.<br />
If |α 1 ⟩ 1 , . . . , |α N1 ⟩ 1 is an orthonormal basis in H 1 , and |β 1 ⟩ 2 , . . . , |β N2 ⟩ 2 is likewise in H 2 ,<br />
with N 1 = dim H 1 , N 2 = dim H 2 , their direct products, i.e. the vectors of the form |α i ⟩ 1 ⊗ |β j ⟩ 2 ,<br />
provide an orthonormal set of vectors in H 1 ⊗ H 2 . Indeed, using (II. 94),<br />
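$\left( {}_1\langle\alpha_j| \otimes {}_2\langle\beta_k| \right) \left( |\alpha_m\rangle_1 \otimes |\beta_n\rangle_2 \right) = {}_1\langle\alpha_j | \alpha_m\rangle_1 \; {}_2\langle\beta_k | \beta_n\rangle_2 = \delta_{jm} \, \delta_{kn}$. (II. 95)<br />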
Because orthonormal vectors are independent, the dimension of H 1 ⊗ H 2 cannot be smaller than<br />
the product of the separate dimensions. But furthermore, according to (iv), all vectors in H 1 ⊗ H 2<br />
are obtainable as linear combinations of vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 , which in turn are linear<br />
combinations of the vectors |α j ⟩ 1 ⊗|β k ⟩ 2 . Therefore, these vectors also span the entire space H 1 ⊗H 2 .<br />
In other words, |α 1 ⟩ 1 ⊗ |β 1 ⟩ 2 , |α 2 ⟩ 1 ⊗ |β 1 ⟩ 2 , . . . , |α N1 ⟩ 1 ⊗ |β N2 ⟩ 2 is also a basis for H 1 ⊗ H 2 . For<br />
the dimension of H 1 ⊗ H 2 we thus find<br />
dim (H 1 ⊗ H 2 ) = dim H 1 · dim H 2 . (II. 96)<br />
Consequently, an arbitrary vector |χ⟩ ∈ H 1 ⊗ H 2 can, in this product basis |α j ⟩ 1 ⊗ |β k ⟩ 2 , be written<br />
as<br />
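$|\chi\rangle = \sum_{j=1}^{N_1} \sum_{k=1}^{N_2} c_{jk} \, |\alpha_j\rangle_1 \otimes |\beta_k\rangle_2$ with $c_{jk} = \left( {}_1\langle\alpha_j| \otimes {}_2\langle\beta_k| \right) |\chi\rangle \in \mathbb{C}$. (II. 97)<br />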
For vectors of the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 it holds that<br />
$\sum_{j=1}^{N_1} a_j \, |\alpha_j\rangle_1 \otimes \sum_{k=1}^{N_2} b_k \, |\beta_k\rangle_2 = \sum_{j=1}^{N_1} \sum_{k=1}^{N_2} a_j b_k \, |\alpha_j\rangle_1 \otimes |\beta_k\rangle_2$. (II. 98)<br />
We see that (II. 98) is a special case of (II. 97), that is, where c jk = a j b k . The special vectors which<br />
can be written as (II. 98), i.e., in the form |ϕ⟩ 1 ⊗|ψ⟩ 2 , are called direct product vectors, or factorizable.<br />
In a direct sum space H 1 ⊕H 2 all vectors can be written in the form |ϕ⟩ 1 ⊕|ψ⟩ 2 , but in a direct product<br />
space H 1 ⊗ H 2 not all vectors can be written in the form |ϕ⟩ 1 ⊗ |ψ⟩ 2 . Further on we will see that<br />
states for which c jk cannot be written as a j b k give rise to typical quantum mechanical behavior, as<br />
in the thought experiment of EPR where composite systems are considered, corresponding to states<br />
on H 1 ⊗ H 2 which cannot be factorized. Such states are called non - factorizable or entangled states.<br />
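For two 2 - dimensional factors the distinction can be tested directly: a 2 × 2 coefficient matrix c jk has the product form a j b k exactly when its determinant vanishes (a standard linear algebra fact about rank - one matrices, not a statement from these notes). The Python sketch below applies this test to one product state and one singlet - like state; the function name and the example states are illustrative assumptions.<br />

```python
import math

def is_factorizable(c, tol=1e-12):
    """For a 2x2 coefficient matrix c[j][k], test whether c[j][k] = a[j] * b[k].

    A 2x2 matrix has this outer-product form exactly when its determinant vanishes.
    """
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol

s = 1 / math.sqrt(2)

# Product state: c_jk = a_j * b_k with a = (1, 0) and b = (s, s)
product_state = [[a * b for b in (s, s)] for a in (1, 0)]

# Singlet-like state (|alpha_1>|beta_2> - |alpha_2>|beta_1>) / sqrt(2)
singlet_like = [[0, s], [-s, 0]]

print(is_factorizable(product_state), is_factorizable(singlet_like))
```

For larger dimensions the determinant test no longer suffices; one would instead check whether the matrix c jk has rank one.<br />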
If A and B are operators on H 1 and H 2 , respectively, the direct product operator A ⊗ B is the<br />
operator on H 1 ⊗ H 2 , defined by<br />
$(A \otimes B) \left( |\phi\rangle_1 \otimes |\psi\rangle_2 \right) := A |\phi\rangle_1 \otimes B |\psi\rangle_2$. (II. 99)<br />
It follows that, with operators C on H 1 and D on H 2 ,<br />
(A ⊗ B) (C ⊗ D) = (A C) ⊗ (B D). (II. 100)<br />
Similar to vectors, operators on the direct product space H 1 ⊗ H 2 are not always factorizable. The<br />
total momentum operator P 1 + P 2 and the distance operator Q 1 − Q 2 of EPR, with P as defined in<br />
section I. 2, (I. 1), and Q likewise, are examples of such non - factorizable direct product operators,<br />
P 1 ⊗ 11 2 + 11 1 ⊗ P 2 and Q 1 ⊗ 11 2 − 11 1 ⊗ Q 2 . (II. 101)<br />
EXERCISE 12. Calculate the commutator of these operators, given that $[P_i, Q_j] = -i \, \delta_{ij}$.<br />
The following properties of the direct product of operators will, further on, be used frequently:<br />
A ⊗ 0 = 0 ⊗ B = 0 ,<br />
(A 1 + A 2 ) ⊗ B = (A 1 ⊗ B) + (A 2 ⊗ B),<br />
11 ⊗ 11 = 11,<br />
a A ⊗ b B = a b (A ⊗ B), (II. 102)<br />
(A ⊗ B) − 1 = A − 1 ⊗ B − 1 ,<br />
(A ⊗ B) † = A † ⊗ B † ,<br />
Tr ( bA ⊗ cB ) = b c Tr A · Tr B.<br />
EXERCISE 13. Prove the properties of ⊗ in (II. 102).
Finally, the matrix A ⊗ B of the operator A ⊗ B in the direct product space H 1 ⊗ H 2 is of the<br />
form<br />
$A \otimes B = \begin{pmatrix} a_{11} \begin{pmatrix} b_{11} & \cdots & b_{1 N_2} \\ \vdots & \ddots & \vdots \\ b_{N_2 1} & \cdots & b_{N_2 N_2} \end{pmatrix} & \cdots & a_{1 N_1} B \\ \vdots & a_{22} B \;\; \ddots & \vdots \\ a_{N_1 1} B & \cdots & a_{N_1 N_1} B \end{pmatrix}$, (II. 103)<br />
where a ij = ⟨α i | A | α j ⟩ and b kl = B kl = ⟨β k | B | β l ⟩, as in (II. 24). This matrix is called the<br />
Kronecker product of the matrices A and B.<br />
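The block structure of the Kronecker product is easy to reproduce in code. The following pure - Python sketch (helper names kron, matmul and trace, and the integer test matrices, are chosen here for illustration) builds the Kronecker product block by block and checks the product rule (II. 100) and the trace property from (II. 102) on small examples.<br />

```python
def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(X, Y):
    """Ordinary matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

# Arbitrary integer matrices, so that all comparisons are exact
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, 2]]

AB = kron(A, B)

# (A (x) B)(C (x) D) = (A C) (x) (B D)
mixed_product_ok = matmul(kron(A, B), kron(C, D)) == kron(matmul(A, C), matmul(B, D))

# Tr(A (x) D) = Tr A * Tr D
trace_ok = trace(kron(A, D)) == trace(A) * trace(D)
print(AB, mixed_product_ok, trace_ok)
```

A numerical check on particular matrices is of course no substitute for the proofs asked for in Exercise 13.<br />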
II. 6<br />
ADDENDUM: INFINITE - DIMENSIONAL HILBERT SPACES<br />
This section is intended for interested readers, who wish to gain more in - depth knowledge of<br />
Hilbert spaces.<br />
In physical applications of quantum mechanics we nearly always need infinite - dimensional Hilbert<br />
spaces. Indeed, this already applies to the case of a free particle in one spatial dimension.<br />
The mathematical theory of infinite - dimensional Hilbert spaces is in some aspects more difficult<br />
than that of finite - dimensional ones.<br />
II. 6. 1<br />
THE STRUCTURE OF VECTOR SPACES<br />
An infinite - dimensional space H is a space where for every n independent vectors in H, with<br />
n arbitrarily large, it is always possible to find still another vector in H that is independent of these<br />
vectors. In rough approximation it can be said that all formulas of the previous sections remain valid<br />
if we replace the sums from 1 to N by sums from 1 to infinity. But, of course, attention must be given<br />
to the convergence of such sums. This leads to two extra assumptions which were superfluous in the<br />
theory of finite - dimensional spaces.<br />
(i) Separability. A Hilbert space H is called separable if it has a countable basis, i.e., a countable<br />
set of independent vectors |ϕ 1 ⟩, |ϕ 2 ⟩, . . . , |ϕ j ⟩, . . . ∈ H exists such that every vector |ϕ⟩ ∈ H<br />
can, analogously to (II. 14), be written as<br />
$|\phi\rangle = \sum_{j=1}^{\infty} c_j \, |\phi_j\rangle$ with $c_j = \langle\phi_j | \phi\rangle$. (II. 104)<br />
This equation is shorthand for<br />
$\lim_{m \to \infty} \left\| \phi - \sum_{j=1}^{m} c_j \, \phi_j \right\| = 0$. (II. 105)<br />
(ii) Completeness. We require that the space is complete, which means that every Cauchy sequence,<br />
i.e., a sequence of vectors |ϕ 1 ⟩, |ϕ 2 ⟩, . . . , |ϕ j ⟩, . . . ∈ H, for which<br />
$\lim_{j, k \to \infty} \| \phi_j - \phi_k \| = 0$, (II. 106)<br />
has a limit vector |ϕ⟩ in H,<br />
$\lim_{m \to \infty} \| \phi_m - \phi \| = 0$. (II. 107)<br />
For example, in this sense Q, the set of rational numbers, is incomplete, since many Cauchy<br />
sequences of rational terms exist which have no limit in Q, for instance the partial sums of the series expansions of π<br />
and e. If the limiting points of all Cauchy sequences are added to Q, we obtain exactly R. Q is called<br />
a countably infinite set, R is called an uncountably infinite set.<br />
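The incompleteness of Q can be made tangible with exact rational arithmetic. The sketch below uses a different example from the ones named above, namely the Newton iteration for √2 (a choice made here for illustration): every term is a rational number, successive terms approach each other, yet the square of a term never equals 2, so the limit lies outside Q.<br />

```python
from fractions import Fraction

def newton_sqrt2_terms(n):
    """First n terms of the rational sequence x_{k+1} = (x_k + 2/x_k)/2, x_0 = 1."""
    terms = [Fraction(1)]
    while len(terms) < n:
        x = terms[-1]
        terms.append((x + 2 / x) / 2)
    return terms

terms = newton_sqrt2_terms(6)

# Distances between successive terms shrink, as expected for a Cauchy sequence.
gaps = [abs(b - a) for a, b in zip(terms, terms[1:])]

# |x_k^2 - 2| becomes tiny but is never exactly zero for a rational x_k.
residuals = [abs(x * x - 2) for x in terms]

for x, r in zip(terms, residuals):
    print(x, float(r))
```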
Below, we will assume Hilbert spaces to be separable and complete.<br />
EXERCISE 14. Prove that every finite - dimensional complex vector space with an inner product<br />
is separable and complete.<br />
The claim in the above exercise makes clear that in the finite - dimensional case the requirements<br />
of separability and completeness are indeed superfluous.<br />
The next two spaces are well - known examples of infinite - dimensional Hilbert spaces.<br />
(i) The space of all complex, square integrable functions,<br />
$L^2(\mathbb{R}) := \left\{ \psi : \mathbb{R} \to \mathbb{C} \;\middle|\; \int_{\mathbb{R}} |\psi(q)|^2 \, dq < \infty \right\}$, (II. 108)<br />
with an inner product defined as<br />
$\langle\psi | \phi\rangle := \int_{\mathbb{R}} \psi^{*}(q) \, \phi(q) \, dq$, (II. 109)<br />
and likewise for L 2 (R n ) with arbitrary n ∈ N + .<br />
(ii) The space of square summable sequences of complex numbers, defined by Erhard Schmidt,<br />
$l^2(\mathbb{N}) := \left\{ c : \mathbb{N} \to \mathbb{C} \;\middle|\; \sum_{j=0}^{\infty} |c_j|^2 < \infty \right\}$, (II. 110)<br />
with inner product<br />
$\langle c | d\rangle := \sum_{j=0}^{\infty} c_j^{*} \, d_j$. (II. 111)<br />
The proof that these vector spaces are complete is not simple; the proof that the remaining<br />
requirements for a Hilbert space are met, however, is.<br />
These two spaces correspond to two versions of quantum mechanics, where L 2 (R) corresponds<br />
to Schrödinger’s wave mechanics (1926) and l 2 (N) to the matrix mechanics of Heisenberg, Born, and<br />
Jordan (1925), that is, if we take matrix mechanics in the enriched version of Von Neumann, since<br />
the original version did not contain a ‘state space’. These two versions of quantum mechanics are<br />
mathematically equivalent, see F.A. Muller (1997a, 1997b and 1999) for historical details.<br />
II. 6. 2<br />
OPERATORS<br />
More serious complications occur when introducing operators on infinite - dimensional Hilbert<br />
spaces. First, we will see that such operators are in general ‘unbounded’, which entails that<br />
they cannot be defined on the entire Hilbert space. Consequently, the definition of sum and product<br />
of operators, as well as their adjoints, becomes more cumbersome, and the terms ‘self - adjoint’<br />
and ‘Hermitian’ no longer coincide. Second, these operators do not always have eigenvectors in H.<br />
Therefore it is more difficult to give a useful version of the spectral theorem.<br />
The second problem is independent of the first, i.e., it can also appear for bounded self - adjoint<br />
operators. ▹<br />
For position and momentum both complications occur together which is shown by an example.<br />
EXAMPLE<br />
Consider the position operator<br />
Q : ψ(q) ↦→ q ψ(q), (II. 112)<br />
and the momentum operator<br />
$P : \psi(q) \mapsto -i \, \frac{d}{dq} \psi(q)$, (II. 113)<br />
both acting on L 2 (R).<br />
The first problem is that these operators do not map every vector in L 2 (R) to another vector<br />
in L 2 (R). For instance, every non - differentiable function in L 2 (R) is outside the domain of P .<br />
Vice versa, taking for Q, for example, ψ (q) = (a + q) − 3 2 with a ∈ R, we have ψ ∈ L 2 (R),<br />
but Qψ ∉ L 2 (R).<br />
The second problem is that the eigenvalue equation for momentum, $-i \, \frac{d}{dq} \psi(q) = p \, \psi(q)$, has<br />
solutions $\psi(q) \propto e^{ipq}$ for p ∈ R, but these functions are not square integrable and therefore<br />
they are not in L 2 (R). Something similar applies to the eigenvalue equation Qψ(q) = q 0 ψ(q)<br />
and its solutions ψ(q) = δ(q − q 0 ).
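<br />
To make the failure of square integrability explicit, one can compute the norm of a plane wave on a finite interval and let the interval grow. The cutoff R is an auxiliary symbol introduced only for this check:<br />

```latex
\int_{-R}^{R} \left| e^{ipq} \right|^2 dq = \int_{-R}^{R} 1 \, dq = 2R \;\longrightarrow\; \infty \quad (R \to \infty), \qquad p \in \mathbb{R}.
```

Hence no choice of normalization constant places a plane wave in L 2 (R).<br />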
II. 6. 2. 1<br />
UNBOUNDED OPERATORS<br />
Let us start with a definition: an operator A on Hilbert space H is called bounded if the set of<br />
non-negative numbers $\|A\chi\| = \sqrt{\langle A\chi | A\chi\rangle}$ has an upper bound for all unit vectors |χ⟩, where the least<br />
upper bound, or supremum, is called the norm of A,<br />
$\|A\| = \sup \left\{ \|A\chi\| \in \mathbb{R} \;\middle|\; \|\chi\| = 1 \right\}$. (II. 114)<br />
The set of all bounded operators on H is written as B(H).<br />
In finite - dimensional Hilbert spaces all operators are bounded, but this is not the case in infinite -<br />
dimensional Hilbert spaces. As we want to hold on to the requirement that every vector A|ψ⟩ has a<br />
finite norm, we have to exclude from the domain of A the set of vectors |ϕ⟩ for which<br />
$\frac{\|A\chi\|}{\|\chi\|} \to \infty$ if $|\chi\rangle \to |\phi\rangle$. (II. 115)<br />
Therefore, from now on an operator A is a linear mapping from a subset of H to H. This subset is called<br />
the domain of A, written as Dom A ⊂ H. Hence, an operator is a linear mapping<br />
ψ ∈ Dom A, A : ψ ↦→ A ψ ∈ H. (II. 116)<br />
We will, however, always assume that Dom A is dense in H which means that every vector ϕ in H<br />
can be approximated arbitrarily well by vectors in Dom A. The foregoing implies that also sums and<br />
products of operators are generally defined on a limited domain only,<br />
Dom (A + B) = Dom A ∩ Dom B (II. 117)<br />
$\mathrm{Dom}\,(A B) = \left\{ \psi \in \mathrm{Dom}\, B : B \psi \in \mathrm{Dom}\, A \right\}$. (II. 118)<br />
It is more difficult to introduce the adjoint A † of an operator A. The operator is again called<br />
Hermitian if<br />
⟨ϕ | A | ψ⟩ = ⟨ψ | A | ϕ⟩ ∗ ∀ ϕ, ψ ∈ Dom A, (II. 119)<br />
but this definition is no longer sufficient for our purposes, as can be seen in the next example.<br />
EXAMPLE<br />
Consider the operator P from (II. 113), now acting on L 2( [0, ∞⟩ ) , and choose as its domain<br />
$\mathrm{Dom}\, P = \left\{ \psi : \int_0^\infty |\psi(q)|^2 \, dq < \infty, \;\; \int_0^\infty |P \psi(q)|^2 \, dq < \infty, \;\; \psi(0) = 0 \right\}$. (II. 120)<br />
This operator is indeed Hermitian, which can be checked using integration by parts, where the<br />
non - integral term cancels out because of the boundary condition ψ(0) = 0. But the operator is<br />
not self - adjoint, as we will see in the next exercise.
To introduce the adjoint of an operator we first delimit the domain. Let Dom A † be the set of all<br />
vectors |ϕ⟩ such that a vector |η⟩ exists for which<br />
⟨ϕ | A | ψ⟩ = ⟨η | ψ⟩ ∀ |ψ⟩ ∈ Dom A. (II. 121)<br />
Using the assumption that Dom A is dense in H it is possible to show that if such a vector |η⟩ exists<br />
it is also unique. The adjoint A † of operator A is now, by definition, the mapping<br />
A † : |ϕ⟩ ∈ Dom A † ↦→ |η⟩ := A † |ϕ⟩, (II. 122)<br />
and the operator is called self - adjoint if<br />
A = A † and Dom A = Dom A † . (II. 123)<br />
This requirement is stronger than Hermiticity; it can be shown that in general it holds for Hermitian<br />
operators that Dom A ⊂ Dom A † , instead of (II. 123).<br />
EXERCISE 15. Verify that the domain of P † , with P as in the example above, is indeed larger<br />
than the domain of P .<br />
II. 6. 2. 2<br />
CONTINUOUS SPECTRA<br />
Another aspect in which infinite - dimensional Hilbert spaces deviate from finite - dimensional<br />
ones is the possibility for an operator to have a continuous spectrum, a mathematical impossibility<br />
in the finite - dimensional case since the term ‘spectrum’ was defined as the set of eigenvalues of<br />
operators. Examples of operators with continuous spectra are, again, the position operator and the<br />
momentum operator, whose spectra consist of the entire line of real numbers R. Therefore, the<br />
term ‘spectrum’ needs to be redefined. The spectrum of operator A is now defined as the set of all<br />
values λ ∈ C for which the operator A − λ11 has no inverse operator. To illustrate the deviations from<br />
the finite - dimensional case we give two examples, the angle operator and the angular momentum<br />
operator.<br />
EXAMPLE<br />
Consider the Hilbert space L 2( [0, 2π] ) and the angle operator<br />
Q : ψ(q) ↦→ q ψ(q), 0 ≤ q ≤ 2 π. (II. 124)<br />
This operator has, analogous to (II. 112), eigenfunctions which are not in H, its spectrum is the<br />
interval [0, 2π], but it is bounded, ∥Q∥ = 2π.
The angular momentum operator<br />
$L : \psi(q) \mapsto -i \, \frac{d}{dq} \psi(q)$, (II. 125)<br />
with domain<br />
$\mathrm{Dom}\, L = \left\{ \psi : \|L \psi\| < \infty, \;\; \psi(0) = \psi(2\pi) \right\}$, (II. 126)<br />
does have normalized eigenfunctions,<br />
$\psi(q) = \frac{1}{\sqrt{2\pi}} \, e^{i l q}$, (II. 127)<br />
and a discrete spectrum l ∈ Z. But, since l can be arbitrarily large, L is unbounded.<br />
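Both claims can be verified by a short calculation. Writing ψ l for the eigenfunction belonging to the integer l (a label introduced here for convenience), one finds<br />

```latex
\langle \psi_l | \psi_{l'} \rangle = \frac{1}{2\pi} \int_0^{2\pi} e^{-ilq} \, e^{il'q} \, dq = \delta_{l l'}, \qquad
\| L \psi_l \| = \| l \, \psi_l \| = |l|, \qquad l, l' \in \mathbb{Z}.
```

The eigenfunctions are thus orthonormal unit vectors in the Hilbert space, while the norms of their images under the operator grow without bound as |l| increases, so no finite supremum as in (II. 114) exists.<br />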
II. 6. 2. 3<br />
SPECTRAL THEOREM<br />
Von Neumann succeeded in proving the spectral theorem, in the version of II. 3. 1, for<br />
infinite-dimensional Hilbert spaces. We can now formulate the theorem.<br />
SPECTRAL THEOREM:<br />
To every normal operator A, bounded or unbounded, corresponds a unique mapping of<br />
subsets of Spec A to the set P (H) of projectors on H, ∆ ↦→ P A (∆), having the following<br />
properties:<br />
(i) P ∅ = 0<br />
(ii) P C = 11<br />
(iii) $P_{\cup_i \Delta_i} = \sum_i P_{\Delta_i}$ for all ∆ i mutually disjoint. (II. 128)<br />
For the position operator Q we have an explicit expression for the spectral family of eigenprojectors<br />
of Q,<br />
$P_Q(\Delta) \, \psi(q) = \begin{cases} \psi(q) & \text{if } q \in \Delta \\ 0 & \text{otherwise} \end{cases}$, (II. 129)<br />
hence, P Q (∆) is in fact a multiplication with the characteristic function of ∆. The spectral family of<br />
the momentum operator is obtained by applying a Fourier transform to the aforementioned expression.<br />
If the physical system is in the state ψ ∈ H, the probability that a measurement of the physical quantity A,<br />
which corresponds to the normal operator A, yields a value a ∈ ∆ ⊂ R is<br />
Prob ψ (A : ∆) = ⟨ψ | P A (∆) | ψ⟩, (II. 130)
which, using (II. 129), yields for the physical quantity position Q<br />
$\mathrm{Prob}_\psi (Q : \Delta) = \langle\psi(q) | P_Q(\Delta) | \psi(q)\rangle = \int_\Delta |\psi(q)|^2 \, dq$. (II. 131)<br />
All empirical statements of quantum mechanics can therefore be expressed in terms of projectors, or,<br />
more precisely, all empirical statements of quantum mechanics concerning physical quantity A can<br />
be expressed in terms of the spectral family of A.<br />
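As a numerical illustration of the position probability, the probability of finding the position in an interval ∆ = [a, b] is the integral of |ψ(q)|^2 over that interval. The Python sketch below evaluates it for a normalized Gaussian wave function, which is an example chosen here and not taken from these notes, and compares a simple midpoint rule with the closed form in terms of the error function. The function names and the number of integration steps are illustrative assumptions.<br />

```python
import math

def density(q):
    """|psi(q)|^2 for the normalized Gaussian psi(q) = pi^(-1/4) exp(-q^2 / 2)."""
    return math.exp(-q * q) / math.sqrt(math.pi)

def prob_position(a, b, steps=20_000):
    """Midpoint-rule estimate of the integral of |psi|^2 over the interval [a, b]."""
    h = (b - a) / steps
    return h * sum(density(a + (k + 0.5) * h) for k in range(steps))

numeric = prob_position(-1.0, 1.0)
exact = 0.5 * (math.erf(1.0) - math.erf(-1.0))   # closed form for this Gaussian
total = prob_position(-10.0, 10.0)               # should be close to 1

print(numeric, exact, total)
```

The additivity property (iii) of the spectral family shows up here as the additivity of the integral over disjoint intervals.<br />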
II. 6. 3<br />
DIRAC<br />
Finally we remark that quantum mechanics à la Dirac willingly and knowingly violates Von Neumann’s<br />
postulates by going outside the Hilbert space. Dirac writes (1958, p. 40)<br />
The bra and ket vectors that we now use form a more general space than a Hilbert space.<br />
To make Dirac’s approach mathematically expressible, the French mathematician Laurent Schwartz<br />
developed the theory of distributions, and the Russian mathematical physicist I.M. Gel’fand developed<br />
the theory of rigged Hilbert spaces. Contrary to Schrödinger and Von Neumann, Dirac regarded<br />
wave mechanics as a generalization of matrix mechanics, going from a discrete index to a continuous<br />
index, making a transition from square summable sequences of complex numbers to wave functions,<br />
and from infinite matrices to integral kernels.<br />
II. 6. 4<br />
SUMMARY<br />
A complex Hilbert space is, by definition, a complete, separable complex vector space with an inner<br />
product which is related to the norm by ∥ψ∥ 2 = ⟨ψ | ψ⟩, its dimension is either finite or countably<br />
infinite. Contrary to the infinite - dimensional case, the requirements of separability and completeness<br />
are superfluous in the finite - dimensional case because they are derivable from the other properties of<br />
a Hilbert space, but in the vast majority of physical applications infinite - dimensional Hilbert spaces<br />
and unbounded operators are required.
III<br />
THE POSTULATES<br />
The sciences do not try to explain, they hardly even try to interpret, they mainly make<br />
models. By a model is meant a mathematical construct which, with the addition of certain<br />
verbal interpretations, describes observed phenomena. The justification of such a<br />
mathematical construct is solely and precisely that it is expected to work [. . . ]<br />
— John von Neumann<br />
It would seem that the theory is exclusively concerned about ‘results of measurement’,<br />
and has nothing to say about anything else. [. . . ] To restrict quantum mechanics to be<br />
exclusively about piddling laboratory operations is to betray the great enterprise.<br />
— John Bell<br />
In this chapter we will formulate and discuss Von Neumann’s postulates. Next, we will extend the<br />
quantum mechanical concept of ‘pure’ states by adding ‘mixed’ states, and show how quantum<br />
mechanics treats states of subsystems of composite physical systems. Finally, we apply these<br />
concepts to spin 1/2 particles and we derive some formulas needed in subsequent chapters.<br />
III. 1<br />
VON NEUMANN’S POSTULATES<br />
We are now ready to give, in some cases in simplified fashion, Von Neumann’s postulates of<br />
quantum mechanics, which link the physical concepts of the theory to the mathematical concepts of<br />
its formalism.<br />
1. State postulate, pure states. Every physical system has a corresponding Hilbert space H; the<br />
states of the system are completely described by unit vectors in H. A composite physical<br />
system corresponds to the direct product of the Hilbert spaces of the subsystems.<br />
2. Observables postulate. Every physical quantity A of the system corresponds to a self - adjoint<br />
operator A in H. Dirac called the quantities ‘observables’.<br />
3. Spectrum postulate. The only possible outcomes which can be found upon measurement of a<br />
physical quantity A, corresponding to an operator A, are values from the spectrum of A.<br />
4. Born postulate, discrete case. If the system is in a state |ψ⟩ ∈ H, and a measurement is made<br />
of a physical quantity A, corresponding to an operator A with a discrete spectrum Spec A, the<br />
probability of finding the outcome $a_i$ ∈ Spec A is equal to<br />
\[ \mathrm{Prob}_{|\psi\rangle}(a_i) = \langle\psi | P_{a_i} | \psi\rangle, \tag{III.1} \]<br />
where $P_{a_i}$ is the projector from the spectral decomposition (II. 57) of A.<br />
5. Schrödinger postulate. As long as no measurements are made on the system, the time evolution<br />
of the system is described by a unitary transformation,<br />
|ψ(t)⟩ = U (t, t 0 ) |ψ(t 0 )⟩. (III. 2)<br />
6. Projection postulate, discrete case. If the system is in a state |ψ⟩ ∈ H and a measurement is<br />
made on a physical quantity A corresponding to an operator A with discrete spectrum, and the<br />
outcome of the measurement is the eigenvalue a i ∈ Spec A, the system is, immediately after<br />
the measurement, in the eigenstate<br />
\[ |\psi\rangle \;\rightsquigarrow\; \frac{P_{a_i} |\psi\rangle}{\| P_{a_i} |\psi\rangle \|}. \tag{III.3} \]<br />
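Postulates 4 and 6 can be illustrated with a minimal Python sketch that evaluates (III. 1) and (III. 3) for a hypothetical three-level system.<br />
The projectors `P_a1` and `P_a2`, the state `psi` and all helper names are assumptions introduced for this example only; they do not appear in the notes.<br />

```python
import math

def inner(u, v):
    """Inner product <u|v>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(M, v):
    """Apply the matrix M (a list of rows) to the vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def norm(v):
    """Norm ||v|| = sqrt(<v|v>)."""
    return math.sqrt(inner(v, v).real)

# Assumed observable: eigenvalue a_1 is twofold degenerate (span of e0, e1),
# eigenvalue a_2 is non-degenerate (span of e2).
P_a1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
P_a2 = [[0, 0, 0], [0, 0, 0], [0, 0, 1]]

# Assumed unit vector: |1/2|^2 + |i/2|^2 + |1/sqrt(2)|^2 = 1.
psi = [complex(1, 0) / 2, complex(0, 1) / 2, complex(1, 0) / math.sqrt(2)]

def born_probability(P, state):
    """Prob(a_i) = <psi| P_{a_i} |psi>, as in (III. 1)."""
    return inner(state, apply(P, state)).real

def lueders_update(P, state):
    """Normalized projection P|psi> / ||P|psi>||, as in (III. 3)."""
    projected = apply(P, state)
    n = norm(projected)
    return [x / n for x in projected]

p1 = born_probability(P_a1, psi)
p2 = born_probability(P_a2, psi)
post = lueders_update(P_a1, psi)

print("Prob(a_1) =", p1, " Prob(a_2) =", p2)
print("state after outcome a_1:", post)
```

In this sketch the two probabilities add up to 1, and a repeated measurement on `post` yields the outcome a_1 with probability 1, as the projection postulate requires.<br />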
The first four postulates connect the (undefined) concepts ‘physical system’, ‘state’, ‘quantity’<br />
and ‘measurement’ to mathematical concepts. In the literature the postulates 3 and 4 are sometimes<br />
combined into the so - called measurement postulate. The last two postulates determine the evolution<br />
of the states in time.<br />
Ad 1. The state postulate implies that systems with the same |ψ⟩ are in the same physical state.<br />
The way in which this state vector |ψ⟩ is produced is thus irrelevant. Likewise, the fact that two systems<br />
which are described by the same |ψ⟩ can yield different outcomes upon measurement, which is<br />
allowed according to the measurement postulate, is no reason to regard their states as being different.<br />
On the other hand, not every pair of distinct unit vectors represents different states.<br />
Usually it is assumed that vectors which differ only by a phase factor $e^{i\theta}$, with θ ∈ R, describe<br />
the same physical state, because they predict the same probability distributions for outcomes of all<br />
possible measurements. Such vectors form a so - called unit ray.<br />
The statement that all unit vectors of H describe physical states also need not be true in general.<br />
Notice that the set of unit vectors is extremely large; even for a particle in one spatial dimension the<br />
Hilbert space is infinite-dimensional. Furthermore, some types of superposition, i.e. linear combinations<br />
of two or more eigenstates, do not occur in nature, for instance superpositions of states with different<br />
charges (electric, baryonic, etc.) or superpositions of states with integer and half-integer spin.<br />
It is possible to prohibit these superpositions in the theory by introducing so - called superselection<br />
rules. The requirement that, for identical particles, only states are allowed which are symmetric or<br />
antisymmetric under permutation of the particles is an example of such a superselection rule. In the<br />
presence of a superselection rule the class of allowed states breaks up into a direct sum of the<br />
eigenspaces of the superselection operator,<br />
\[ \mathcal{H} = \bigoplus_{j} \mathcal{H}_j . \tag{III.4} \]<br />
Within one such subspace H j , called a coherent sector, superpositions of all states are allowed.
In the absence of superselection rules the entire Hilbert space is one coherent sector. Then the superposition<br />
principle is valid in general, which says that for every two states |ψ⟩ and |ϕ⟩ the normalized linear<br />
combination a|ψ⟩ + b|ϕ⟩, with $|a|^2 + |b|^2 = 1$, is a state too. Because nature apparently imposes<br />
superselection rules, which can sometimes be derived from symmetries as was first shown by Wick,<br />
Wightman and Wigner (1952), the superposition principle applies only within coherent sectors. Since<br />
superpositions of vectors from different coherent sectors do not correspond to physical states, the<br />
state postulate has to be accordingly reformulated.<br />
As far as composite physical systems are concerned, we say that the system is in an entangled<br />
state iff the state vector is not factorizable, see section II. 5. In the thought experiment of EPR such an<br />
entangled state plays the principal part. Schrödinger (1935b) was the first to show that the occurrence<br />
of entanglement is widespread in quantum mechanics and he considered this to be the cardinal distinction<br />
between classical mechanics and quantum mechanics. In section III. 2 we will further extend<br />
the notion of state.<br />
Ad 2. The question whether every self-adjoint operator represents a physical quantity has, according<br />
to some authors, a negative answer. Wigner, for instance, asked how to measure the quantity corresponding<br />
to the self - adjoint operator P + Q. Another example is a projector which projects on<br />
superpositions of vectors from different coherent sectors, as we saw in Ad 1.<br />
Also the reverse question, whether every physically meaningful quantity is represented by a self -<br />
adjoint operator, is controversial. For some physical quantities which correspond to experimentally<br />
clear measuring procedures, such as ‘time of decay’ in case of a radioactive atom, or the ‘phase’ of<br />
a harmonic oscillator, no associated self - adjoint operator can be found. In later generalizations of<br />
the formalism of quantum mechanics this problem is somewhat relieved by considering more general<br />
mathematical constructions, the so - called positive operator valued measures, which are also capable<br />
of representing physical quantities; see for example A.S. Holevo (1982) or Busch, Grabowski and<br />
Lahti (1995).<br />
Another question is which operator exactly corresponds to which quantity. Again, no commonly<br />
accepted recipe is available here. Generally, one starts with demanding that certain classical quantities<br />
are represented by special operators. It is standard procedure to choose position and momentum to<br />
be these quantities and to require that the corresponding operators satisfy the canonical commutation<br />
relation of Born and Jordan (1925), and Dirac (1925),<br />
\[ [P, Q] := PQ - QP = -i\,\mathbb{1}. \tag{III.5} \]<br />
Next, a certain ‘quantization prescription’ is chosen which can be used to construct an operator<br />
corresponding to more general physical quantities. Dirac’s mathematical prescription of replacing<br />
Poisson brackets by commutators is famous. Unfortunately, this prescription is inconsistent. The<br />
alternative prescriptions for quantization which have been presented for this purpose, do not mutually<br />
agree. We will not discuss this problem further.<br />
Ad 4. With P ψ = |ψ⟩ ⟨ψ|, as defined in (II. 37), P ai as in (II. 54), and using the relation (II. 56),<br />
the probability of finding a value a i ∈ Spec A, in a measurement of the physical quantity A with
corresponding operator A, can also be written as<br />
\[ \langle\psi | P_{a_i} | \psi\rangle = \sum_{j=1}^{n_i} \langle\psi | a_i, j\rangle \langle a_i, j | \psi\rangle = \sum_{j=1}^{n_i} |\langle a_i, j | \psi\rangle|^2 = \operatorname{Tr} P_{a_i} P_\psi . \tag{III.6} \]<br />
Likewise, the expectation value of A, with A as defined in (II. 55), is<br />
\[ \langle A\rangle_\psi = \langle\psi | A | \psi\rangle = \sum_{i=1}^{M} \sum_{j=1}^{n_i} \langle\psi | a_i, j\rangle\, a_i\, \langle a_i, j | \psi\rangle = \sum_{i=1}^{M} \sum_{j=1}^{n_i} a_i\, |\langle a_i, j | \psi\rangle|^2 = \operatorname{Tr} A P_\psi . \tag{III.7} \]<br />
In case there is no degeneracy, (III. 6) takes the simpler form<br />
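\[ \langle\psi | P_{a_i} | \psi\rangle = |\langle a_i | \psi\rangle|^2 = \operatorname{Tr} P_\psi P_{a_i} . \tag{III.8} \]<br />
We also note that in case A has a continuous spectrum, as discussed in section II. 6, we have (II. 130),<br />
\[ \mathrm{Prob}_{|\psi\rangle}(A : \Delta) = \langle\psi | P_A(\Delta) | \psi\rangle. \tag{III.9} \] ▹<br />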
Ad 5. If the system is invariant under translations in time, the unitary evolution operator U (t, t 0 )<br />
depends only on the time difference t − t 0 , and can be written as U (t − t 0 ). The evolution operators<br />
then form a continuous abelian Lie group, the group of translations in time, satisfying the group multiplication<br />
law $U(t)\, U(t') = U(t + t')$. According to Stone's theorem on one-parameter unitary groups (1932),<br />
they can be written as<br />
\[ U(t) = e^{-iHt}, \tag{III.10} \]<br />
where H is a unique self-adjoint operator, the generator of the group. H is called the<br />
Hamiltonian. Therefore, the evolution operator $U(t - t_0)$ from the Schrödinger postulate can be<br />
written as<br />
\[ U(t - t_0) = e^{-iH(t - t_0)}, \tag{III.11} \]<br />
and the Schrödinger equation is, according to (III. 2),<br />
\[ i \frac{d}{dt} |\psi(t)\rangle = i \frac{d}{dt}\, e^{-iH(t - t_0)} |\psi(t_0)\rangle = H |\psi(t)\rangle. \tag{III.12} \]<br />
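The relations (III. 10) and (III. 12) can be checked numerically in a small Python sketch for a two-level system.<br />
The Hamiltonian H = σ_x, the initial state `psi0` and the helper names are assumptions chosen for this illustration; units are such that ħ = 1, as in (III. 10).<br />

```python
import math

def apply(M, v):
    """Apply the matrix M (a list of rows) to the vector v."""
    return [sum(M[r][c] * v[c] for c in range(len(v))) for r in range(len(M))]

def matmul(A, B):
    """Matrix product A B for square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Assumed Hamiltonian for this sketch: H = sigma_x.
H = [[0, 1], [1, 0]]

def U(t):
    """U(t) = exp(-i H t). Because H*H is the identity, the exponential
    series sums to cos(t) * 1 - i sin(t) * H."""
    c, s = math.cos(t), math.sin(t)
    return [[complex(c, 0), complex(0, -s)],
            [complex(0, -s), complex(c, 0)]]

def max_abs_diff(A, B):
    """Largest entrywise deviation between two 2x2 matrices."""
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

# Group property U(t) U(t') = U(t + t').
group_error = max_abs_diff(matmul(U(0.3), U(0.5)), U(0.8))

# Schroedinger equation (III. 12), checked with a central finite difference.
psi0 = [1, 0]
t, h = 0.7, 1e-6
psi_plus = apply(U(t + h), psi0)
psi_minus = apply(U(t - h), psi0)
lhs = [1j * (a - b) / (2 * h) for a, b in zip(psi_plus, psi_minus)]
rhs = apply(H, apply(U(t), psi0))
schroedinger_error = max(abs(x - y) for x, y in zip(lhs, rhs))

# Unitary evolution preserves the norm of the state vector.
norm_sq = sum(abs(x) ** 2 for x in apply(U(t), psi0))

print(group_error, schroedinger_error, norm_sq)
```

The closed form used in `U` is specific to Hamiltonians whose square is the identity; for a general H one would need a matrix exponential instead.<br />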
Ad 6. This is the notorious projection postulate. It introduces a second kind of dynamics in<br />
the theory; a projector is, in general, not unitary and therefore this state change cannot be described by the<br />
Schrödinger postulate. Some authors do not regard the projection postulate as a part of quantum<br />
mechanics. The problem is then how to account for the measurement process using the other<br />
postulates; this will be discussed further in chapter VIII.<br />
The version of the projection postulate we gave is a stronger version of Von Neumann’s original<br />
formulation and is due to G. Lüders (1951). Von Neumann only required that the state, directly<br />
after a measurement of A which has a i as an outcome, is an (arbitrary) eigenstate with eigenvalue a i .<br />
In Lüders’ version the state directly after the measurement, (III. 3), is the normalized projection of the<br />
original state on the eigenspace of a i . Here the disturbance of the original state is as small as possible,<br />
in the sense that the angle between the original and the final state is as small as possible.<br />
If the operator A is maximal, both versions coincide because in that case P ai is a 1 - dimensional<br />
projector.<br />
III. 2<br />
PURE AND MIXED STATES<br />
A state vector, a unit vector in H, provides a description of the system which is as complete as the<br />
theory allows. In classical mechanics such a description corresponds, for a system of point particles,<br />
to giving all coordinates of position and momentum; (q, p) := (q 1 , . . . , q n ; p 1 , . . . , p n ) is a point<br />
in the phase space Γ. In practice, the value of these coordinates is often not known precisely and a<br />
probability distribution ρ(q, p) is introduced over the phase space Γ. The integral of ρ(q, p) over ∆<br />
is the probability to find the system in the subset ∆ ⊆ Γ. The probability density has to be nonnegative and<br />
normalized,<br />
\[ \rho(q, p) \geq 0 \quad\text{and}\quad \int_\Gamma \rho(q, p)\, dq\, dp = 1. \tag{III.13} \]<br />
In classical physics it is also customary to extend the notion of state and also call a probability<br />
distribution ρ a (generalized) state of the system. A physical quantity A corresponds to a real function<br />
on the phase space, A : Γ → R, and the expectation value of A in the state ρ is<br />
\[ \langle A\rangle_\rho := \int_\Gamma A(q, p)\, \rho(q, p)\, dq\, dp. \tag{III.14} \]<br />
The states ρ form a convex set, i.e., if $\rho_1$ and $\rho_2$ are states on Γ and $w_1$ and $w_2$ are both real numbers<br />
satisfying<br />
\[ 0 \leq w_i \leq 1 \quad\text{and}\quad w_1 + w_2 = 1, \tag{III.15} \]<br />
then<br />
\[ \rho := w_1 \rho_1 + w_2 \rho_2 \tag{III.16} \]<br />
also satisfies the requirements of (III. 13) and therefore it is also a state on Γ. This convex set of states<br />
is written S (Γ).<br />
A state which cannot be decomposed according to (III. 16) is called a pure state, otherwise it<br />
is called a mixed state. The pure states are the states ρ concentrated on a single point of Γ, the δ -<br />
‘functions’. Generally, the elements of a convex set which cannot be written in the form (III. 16),<br />
with $w_1, w_2 \neq 0$, are called extreme elements of that set; in our case the extreme elements<br />
are therefore the pure states. Every element of a convex set can always be written as a convex sum of extreme<br />
elements. This corresponds to the expansion of ρ in δ-functions,<br />
\[ \rho(q, p) = \int_\Gamma \rho(q', p')\, \delta(q - q')\, \delta(p - p')\, dq'\, dp'. \tag{III.17} \]<br />
The dynamics of an arbitrary state follows from the Hamiltonian equations of motion of the pure<br />
states, found by applying the principle of least action. This holds for conservative systems, which<br />
the quantum mechanical systems in these lecture notes are assumed to be. We will come back to the<br />
derivation of the equations in section VI. 5.<br />
The Hamiltonian equations of motion are<br />
\[ \dot{q} = \frac{\partial H}{\partial p} \quad\text{and}\quad \dot{p} = -\frac{\partial H}{\partial q}. \tag{III.18} \]<br />
To find the equation of motion in terms of ρ we use Liouville’s theorem, which states that for points<br />
moving in phase space according to the Hamiltonian equations of motion the probability<br />
distribution ρ(q, p, t) is constant along the trajectories. Using (III. 18) and the Poisson brackets<br />
\[ \{H, \rho\} := \left( \frac{\partial H}{\partial q} \frac{\partial \rho}{\partial p} - \frac{\partial \rho}{\partial q} \frac{\partial H}{\partial p} \right), \tag{III.19} \]<br />
the Liouville equation, the equation of motion for the state ρ,<br />
\[ \frac{d\rho}{dt} = \frac{\partial \rho}{\partial t} + \frac{\partial \rho}{\partial q}\, \dot{q} + \frac{\partial \rho}{\partial p}\, \dot{p} = 0, \tag{III.20} \]<br />
becomes<br />
\[ \frac{\partial \rho}{\partial t} = \{H, \rho\}. \tag{III.21} \]<br />
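As a quick consistency check of (III. 19) and (III. 21), consider a distribution that depends on the phase-space point only through the Hamiltonian, ρ = f(H).<br />
Here f is an arbitrary differentiable function, an assumption made for this example only. The chain rule then gives<br />

```latex
\begin{align*}
\{H, \rho\}
  &= \frac{\partial H}{\partial q}\, f'(H)\, \frac{\partial H}{\partial p}
   - f'(H)\, \frac{\partial H}{\partial q}\, \frac{\partial H}{\partial p}
   = 0
  \quad\Longrightarrow\quad
  \frac{\partial \rho}{\partial t} = 0 ,
\end{align*}
```

so every distribution of this form is stationary; the canonical distribution ρ ∝ exp(−βH) is a familiar instance.<br />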
Now we will consider, analogous to the classical case, a probability distribution of the state vectors<br />
in H. With help of the state ρ we introduce a mapping µ of subsets ∆ of Γ to R,<br />
\[ \mu(\Delta) := \int_\Delta \rho(q, p)\, dq\, dp \quad\text{with}\quad \Delta \subseteq \Gamma. \tag{III.22} \]<br />
This mapping µ is additive,<br />
\[ \mu\Bigl( \bigcup_i \Delta_i \Bigr) = \sum_{i \geq 1} \mu(\Delta_i), \tag{III.23} \]<br />
for every countable sequence of disjoint $\Delta_i \subset \Gamma$. Furthermore,<br />
\[ 0 \leq \mu(\Delta) \leq 1, \quad \mu(\emptyset) = 0 \quad\text{and}\quad \mu(\Gamma) = 1. \tag{III.24} \]<br />
Each mapping which maps a measurable subset of Γ to a number in the interval [0, 1], thereby<br />
satisfying (III. 23) and (III. 24), is called a probability measure. It is not difficult to see that every<br />
probability distribution ρ corresponds univocally to a probability measure and vice versa; this is even<br />
true for δ-functions. Therefore we can also represent a state, in the extended meaning, by a probability<br />
measure on Γ.<br />
Analogous to this reasoning we now aim to let the physical states in quantum mechanics correspond<br />
to probability measures on H. Since we want to preserve the structure of H, we do not consider<br />
arbitrary subsets of H, instead we look at the set P(H) of all subspaces of H generated by orthogonal<br />
projectors, or, equivalently, at the projectors projecting on those subspaces. What we are thus looking<br />
for is a probability measure on P (H), i.e. a mapping<br />
µ : P (H) → [0, 1], (III. 25)<br />
which is additive in the relevant manner; if P 1 , P 2 , . . . , P N is a set of pairwise orthogonal projectors,<br />
P i ⊥ P j for i ≠ j, the following holds,<br />
\[ \mu\Bigl( \sum_j P_j \Bigr) = \sum_j \mu(P_j), \tag{III.26} \]<br />
and the mapping satisfies<br />
\[ \mu(0) = 0 \quad\text{and}\quad \mu(\mathbb{1}) = 1. \tag{III.27} \]<br />
In 1957 A.M. Gleason proved the following theorem.<br />
GLEASON’S THEOREM:<br />
Every probability measure µ on P (H) can, under the condition that dim H > 2, be<br />
written as<br />
µ(P ) = Tr P W, (III. 28)<br />
for a certain operator W satisfying the following requirements: 1<br />
(i) W = W † ,<br />
(ii) ⟨ψ | W | ψ⟩ ≥ 0 ∀ |ψ⟩ ∈ H,<br />
(iii) Tr W = 1. (III. 29)<br />
The original proof of Gleason’s theorem is extraordinarily difficult. In the appendix of these<br />
lecture notes, p. 183, ff, we prove a simplified version of this theorem for the interested reader.<br />
1 Conditions (i) and (ii) of (III. 29) are not mutually independent in the complex Hilbert space of the formalism of<br />
quantum mechanics; in this space (i) is in fact superfluous. In a complex Hilbert space all positive operators are<br />
self-adjoint, and an operator A is uniquely defined by all matrix elements of the form ⟨ψ | A | ψ⟩. This is, however, not the case<br />
in a real space, where Gleason’s theorem is also valid. In that case (i) and (ii) are independent.<br />
Here we prove that (III. 28) indeed satisfies the requirements (III. 25), (III. 26) and (III. 27) of a<br />
probability measure.<br />
Proof<br />
Requirement (III. 27) is obvious, and verification of (III. 26) can be done with (II. 31). To<br />
prove (III. 25), i.e.<br />
\[ \mu(P) = \operatorname{Tr} PW \in [0, 1], \tag{III.30} \]<br />
we choose an orthonormal basis of eigenvectors of P: $P |v_k\rangle = |v_k\rangle$, $P |u_l\rangle = 0$. Then<br />
\[ \operatorname{Tr} PW = \sum_k \langle v_k | PW | v_k\rangle + \sum_l \langle u_l | PW | u_l\rangle = \sum_k \langle v_k | W | v_k\rangle \geq 0, \tag{III.31} \]<br />
due to the positivity of the operator W. If P is a projector, then $\mathbb{1} - P$ is one also, therefore<br />
\[ \operatorname{Tr} (\mathbb{1} - P) W \geq 0, \tag{III.32} \]<br />
such that indeed, with (III. 29) (iii), we see that<br />
\[ 0 \leq \operatorname{Tr} PW \leq \operatorname{Tr} PW + \operatorname{Tr} (\mathbb{1} - P) W = \operatorname{Tr} (P + \mathbb{1} - P) W = \operatorname{Tr} W = 1. \quad \square \tag{III.33} \]<br />
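The properties just proved can also be verified numerically. The following Python sketch builds a hypothetical state operator `W` on a three-dimensional space and checks (III. 25), (III. 26) and (III. 27) for the measure µ(P) = Tr PW.<br />
The vectors `a`, `b`, `c`, the weights 0.5, 0.3, 0.2 and all helper names are assumptions made for this illustration.<br />

```python
import math

def outer(u, v):
    """The operator |u><v| as a list of rows."""
    return [[x * y.conjugate() for y in v] for x in u]

def add(A, B):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    """Multiply every entry of A by the number c."""
    return [[c * x for x in row] for row in A]

def matmul(A, B):
    """Matrix product A B for square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Assumed state operator: a convex sum of three orthogonal 1-dimensional projectors.
r = 1 / math.sqrt(2)
a, b, c = [r, r, 0.0], [r, -r, 0.0], [0.0, 0.0, 1.0]
W = add(add(scale(0.5, outer(a, a)), scale(0.3, outer(b, b))),
        scale(0.2, outer(c, c)))

def mu(P):
    """Probability measure mu(P) = Tr P W, as in (III. 28)."""
    return trace(matmul(P, W))

# Pairwise orthogonal projectors onto the standard basis vectors.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
P1, P2, P3 = (outer(v, v) for v in basis)
ZERO = scale(0.0, P1)
ONE = add(add(P1, P2), P3)

print("mu(P1), mu(P2), mu(P3):", mu(P1), mu(P2), mu(P3))
print("additivity:", mu(add(P1, P2)), "=", mu(P1) + mu(P2))
print("mu(0), mu(1):", mu(ZERO), mu(ONE))
```

Note that `W` is not diagonal in the standard basis, so the values µ(P1), µ(P2), µ(P3) differ from the weights used to build it.<br />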
An important aspect of Gleason’s theorem is the fact that the probability measure (III. 28) is<br />
continuous in P . For measures representing pure states, this is proved in the appendix on p. 183, ff.<br />
If dim H = 2, discontinuous probability measures exist on P (H). To see this, consider a real H.<br />
The 1 - dimensional subspaces are lines through the origin connecting opposite points on the circle.<br />
Attaching values as in the diagram, figure III. 1,<br />
Figure III. 1: A discontinuous measure for dim H = 2 (diagram not reproduced; its labels are P 1, P 2 and the values 0 and 1)<br />
we see that, with $\mu(0) = 0$ and $\mu(\mathbb{1}) = 1$, for two arbitrary orthogonal projectors we have<br />
\[ \mu(P_1) + \mu(P_2) = 1 = \mu(\mathbb{1}) = \mu(P_1 + P_2). \tag{III.34} \]<br />
This measure is indeed additive, but we also see that it is not continuous, and consequently, Gleason’s<br />
theorem does not hold for dim H = 2.<br />
The operator W is known as the statistical operator, or as the density matrix, or the state operator.<br />
In analogy with the classical case we extend the notion of state and call W a state of the physical<br />
system. From now on states will be represented by the state operators W .<br />
The state operators W form a set S(H) which is again convex; if W 1 and W 2 are state operators,<br />
then<br />
\[ W = w_1 W_1 + w_2 W_2 \quad\text{with}\quad 0 \leq w_i \leq 1 \quad\text{and}\quad w_1 + w_2 = 1 \tag{III.35} \]<br />
is again a state operator. The simplest example of a state operator is a 1-dimensional projector.<br />
A higher - dimensional projector is not a state operator.<br />
EXERCISE 16. Why not?<br />
Before showing how the state operators W represent states, we will prove the next theorem.<br />
THEOREM:<br />
The 1 - dimensional projectors in P(H) are the extreme elements of the convex set S(H)<br />
of all state operators W on H.<br />
Proof<br />
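To prove this theorem we first have to show that $P_\psi$ cannot be written non-trivially in the form<br />
\[ P_\psi = w\, W_1 + (1 - w)\, W_2, \quad\text{with}\quad 0 \leq w \leq 1. \tag{III.36} \]<br />
Suppose it could be done. Then, using (II. 37), it also has to hold that, for all |ϕ⟩ ⊥ |ψ⟩,<br />
\[ \langle\phi | P_\psi | \phi\rangle = 0 = w\, \langle\phi | W_1 | \phi\rangle + (1 - w)\, \langle\phi | W_2 | \phi\rangle, \tag{III.37} \]<br />
which, since both terms are nonnegative and 0 < w < 1 in a non-trivial decomposition, implies<br />
\[ \langle\phi | W_1 | \phi\rangle = \langle\phi | W_2 | \phi\rangle = 0. \tag{III.38} \]<br />
Now, a positive operator can always be written as the square of a self-adjoint operator, $W_i = A_i^2$,<br />
yielding that for all |ϕ⟩ ⊥ |ψ⟩<br />
\[ \langle\phi | W_i | \phi\rangle = \langle\phi | A_i^2 | \phi\rangle = \| A_i |\phi\rangle \|^2 = 0 \;\Rightarrow\; A_i |\phi\rangle = 0 \;\Rightarrow\; W_i |\phi\rangle = 0. \tag{III.39} \]<br />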
Therefore, $W_1$ and $W_2$ map into the 1-dimensional space spanned by |ψ⟩. Being self-adjoint with unit trace according<br />
to (III. 29), they are both identical to the projector $P_\psi$,<br />
\[ W_1 = P_\psi = W_2. \tag{III.40} \]<br />
We thus conclude that $P_\psi$ cannot be split up into other state operators.<br />
Now we have to show that the 1 - dimensional projectors are the only extreme elements. A state<br />
operator is self - adjoint and has, according to the spectral theorem, p. 26, a complete orthonormal<br />
set of eigenstates $|w_i, j\rangle$, where j labels the degeneracy, $j = 1, \ldots, n_i$, and which has $M \in \mathbb{N}^+$<br />
different eigenvalues $w_i$. We can write an arbitrary W ∈ S (H) as<br />
\[ W = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, W_{i,j}, \tag{III.41} \]<br />
where<br />
\[ W_{i,j} := |w_i, j\rangle \langle w_i, j|, \quad\text{and}\quad \sum_{i=1}^{M} n_i = \dim \mathcal{H}. \tag{III.42} \]<br />
For $w_i$ it holds that<br />
\[ \sum_{i=1}^{M} n_i w_i = 1 \quad\text{and}\quad 0 \leq w_i \leq 1, \tag{III.43} \]<br />
because, according to (III. 29) (ii) and (III. 29) (iii),<br />
\[ w_i = \langle w_i, j | W | w_i, j\rangle \geq 0 \quad\text{and}\quad \operatorname{Tr} W = \sum_{i=1}^{M} n_i w_i = 1. \tag{III.44} \]<br />
Thus we see that the sum (III. 41) is a convex decomposition of W.<br />
A convex decomposition W = w 1 W 1 + w 2 W 2 can always be decomposed further through<br />
expansion of W 1 and W 2 . In case of a bounded convex set the expansion ends on extreme<br />
elements. Therefore, if W is an extreme element, the sum has to reduce to one term. In that case<br />
W is a 1 - dimensional projector, and we see that all extreme elements of S(H) are 1 - dimensional<br />
projectors. □<br />
Physical states which are represented by 1-dimensional projectors are called pure states, whereas<br />
states which can be decomposed non-trivially are called mixed states or mixtures. To see that pure states<br />
correspond to the vector states of H, consider W to be the 1 - dimensional projector P ψ projecting<br />
on the vector |ψ⟩. The state defined by this state operator through (III. 28) behaves exactly like the<br />
vector state |ψ⟩; for arbitrary |ϕ⟩ it holds that<br />
µ W (P ϕ ) = Tr P ϕ W = Tr P ϕ P ψ = ⟨ψ | P ϕ | ψ⟩ = |⟨ψ | ϕ⟩| 2 , (III. 45)<br />
which means that the probability to find the state |ϕ⟩ in the state |ψ⟩ is equal to (III. 6). 2 In particular,<br />
$\mu(P_\psi) = 1$, and if |ϕ⟩ ⊥ |ψ⟩, then $\mu(P_\phi) = 0$. We see that the state $P_\psi$ assigns a<br />
probability to the orthogonal set of vectors of which |ψ⟩ is an element, which is totally concentrated<br />
on the vector |ψ⟩. In this sense $P_\psi$ is analogous to a δ-distribution on the classical phase space.<br />
2 ‘The probability to find the state |ϕ⟩’ is shorthand for the probability to find, upon measurement of the quantity corresponding<br />
to the projector |ϕ⟩ ⟨ϕ|, the value 1.<br />
But the 1-dimensional projectors are, generally, not mutually orthogonal, which means that the<br />
pure state $P_\psi$ also assigns a positive probability to $P_\phi$ if ⟨ϕ | ψ⟩ ≠ 0. This is in contrast to the<br />
classical case, where the pure state which is concentrated on $(q_0, p_0)$, i.e., $\delta(q - q_0, p - p_0)$, always<br />
assigns a zero probability to every other pure state. This is characteristic of quantum mechanics and<br />
is the cause of the radical difference between quantum states and classical states.<br />
In this section we showed that a unique correspondence exists between the pure states, the extreme<br />
elements of the convex set S (H) of state operators, the 1 - dimensional projectors and, up to a phase<br />
factor, the unit vectors in H. We will conclude this section with a formulation of the extended version<br />
of the state postulate (1) and the generalization of the Born postulate (4).<br />
1 ′ State postulate, mixed and pure states. Every physical system has a corresponding Hilbert<br />
space H. The mixed physical states of the system uniquely correspond to the state operators<br />
within S (H); the pure physical states of the system uniquely correspond to the state operators<br />
on the extreme boundary ∂ S (H). States of composite physical systems correspond bijectively to state<br />
operators on the direct product space of the state spaces H 1 and H 2 of the subsystems, i.e., with<br />
elements of S (H 1 ⊗ H 2 ).<br />
◃ 3<br />
4 ′ Generalized Born postulate, discrete case. If the system is in the state W ∈ S (H), the probability<br />
to find, upon measurement of quantity A corresponding to an operator A having a discrete<br />
spectrum, an eigenvalue in ∆ ⊆ Spec A, is equal to<br />
Prob W (A : ∆) = Tr P A (∆)W, (III. 46)<br />
where P A (∆) ∈ P (H) projects on the subspace spanned by the eigenvectors having their eigenvalues<br />
in ∆.<br />
III. 3<br />
THE INTERPRETATION <strong>OF</strong> MIXED STATES<br />
The spectral decomposition (III. 41) suggests an interpretation of the state W . As we saw in (III. 45),<br />
a pure state W = P ψ corresponds to a probability measure µ, which we call concentrated on the<br />
eigenvector |ψ⟩ since µ(P ψ ) = 1. In the same way an arbitrary W corresponds, according to (III. 41),<br />
to a probability measure on its orthonormal set of eigenvectors |w i , j⟩, assigning a probability w i to<br />
the eigenvector |w i , j⟩. With the projector W i,j as in (III. 42), we have<br />
\[ \mu_W(W_{i,j}) = \operatorname{Tr} W_{i,j} W = \operatorname{Tr} \sum_{k=1}^{M} \sum_{l=1}^{n_k} W_{i,j}\, w_k\, |w_k, l\rangle \langle w_k, l| = \sum_{k=1}^{M} \sum_{l=1}^{n_k} w_k\, |\langle w_k, l | w_i, j\rangle|^2 = \sum_{k=1}^{M} \sum_{l=1}^{n_k} w_k\, \delta_{ik}\, \delta_{jl} = w_i. \tag{III.47} \]<br />
3 Notice how in this extended version of the state postulate the annoying phase factor has disappeared.
The expectation value of the operator A is, according to (III. 7) with $P_\psi$ replaced by W,<br />
\[ \langle A\rangle_W = \operatorname{Tr} AW, \tag{III.48} \]<br />
which yields, using the spectral decomposition (III. 41) of W, whose eigenvectors also form<br />
an orthonormal basis,<br />
\[ \langle A\rangle_W = \operatorname{Tr} \sum_{i=1}^{M} \sum_{j=1}^{n_i} A\, w_i\, W_{i,j} = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i \operatorname{Tr} A W_{i,j} = \sum_{i=1}^{M} w_i \sum_{j=1}^{n_i} \langle w_i, j | A | w_i, j\rangle. \tag{III.49} \]<br />
This is exactly the sum of the expectation values of A in the states $|w_i, j\rangle$, weighted by $w_i$.<br />
The above suggests that W describes an ensemble of physical systems each of which is in one<br />
of the pure states |w i , j⟩ and that w i is the fraction of systems in |w i , j⟩. This is the way Von Neumann<br />
originally introduced state operators, in analogy to ensembles in classical statistical mechanics,<br />
hence his terminology statistical operator. But this attractive interpretation, known as the ignorance<br />
interpretation of mixtures, is not without problems as we will show now.<br />
In case of degeneracy the choice of the basis vectors in (III. 41) is not unique, and the projector $P_i$<br />
onto the subspace corresponding to the eigenvalue $w_i$ can be written in terms of basis states in arbitrarily<br />
many ways,<br />
\[ \sum_{j=1}^{n_i} |w_i, j\rangle \langle w_i, j| = \sum_{k=1}^{n_i} |w_i, k\rangle' \, {}'\langle w_i, k|, \tag{III.50} \]<br />
with $\{ |w_i, k\rangle' \}$ another arbitrary orthonormal basis in this subspace. Therefore, given any W we<br />
cannot say of which vector states the ensemble is composed. To see that this is a general phenomenon,<br />
consider the operator<br />
\[ W = \sum_{k=1}^{K} p_k U_k = \sum_{k=1}^{K} p_k\, |u_k\rangle \langle u_k|. \tag{III.51} \]<br />
Here $K \in \mathbb{N}^+$ is arbitrary and $\{|u_k\rangle\}$ is an arbitrary set of unit vectors which are, in general,<br />
not orthogonal, but as long as the $p_k$ satisfy $0 \leq p_k \leq 1$ and $\sum_k p_k = 1$, as required in (III. 35), the<br />
operator W in (III. 51) is still a state operator.<br />
Indeed, equation (III. 51) is an alternative decomposition of W into extreme elements, just like<br />
the spectral decomposition. We see that, in contrast to the classical case, convex decompositions are<br />
not unique.<br />
According to the ignorance interpretation, W describes the ensemble as consisting of systems of<br />
which a fraction $p_k$ is in the state $|u_k\rangle$, so that, e.g.,<br />
\[ \langle A\rangle_W = \operatorname{Tr} AW = \sum_{k=1}^{K} p_k\, \langle u_k | A | u_k\rangle, \tag{III.52} \]<br />
but the probability to find the system in $|u_k\rangle$ is<br />
\[ \mu_W(U_k) = \operatorname{Tr} U_k W = \operatorname{Tr} \sum_{m=1}^{K} U_k\, p_m\, |u_m\rangle \langle u_m| = \sum_{m=1}^{K} p_m\, |\langle u_k | u_m\rangle|^2. \tag{III.53} \]<br />
Although the result (III. 52) is in accordance with the behavior of an ensemble of systems being in<br />
the state |u k ⟩ with probability p k , we see that for (III. 53), contrary to (III. 47), the outcome, i.e. the<br />
probability to find in (III. 51) the state |u k ⟩, is in general not p k , which is a consequence of the non -<br />
orthogonality of the states |u k ⟩. On the other hand, (III. 51) can always be written in the form (III. 41),<br />
in terms of the orthonormal set of eigenvectors of W , which leads to the conclusion that ensembles<br />
which are interpreted as being physically completely different, are described by the same operator W .<br />
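The non-uniqueness of convex decompositions can be made concrete for a single qubit. The Python sketch below mixes two non-orthogonal pure states with equal weights, evaluates (III. 53), and then rebuilds the same operator from its spectral decomposition (III. 41).<br />
The states `ket0` and `ket_plus`, the weights `p` and the helper names are assumptions chosen for this example; the eigenvector formula used is valid only for a real symmetric 2×2 matrix with a nonzero off-diagonal entry.<br />

```python
import math

def outer(u, v):
    """The operator |u><v| as a list of rows."""
    return [[x * y.conjugate() for y in v] for x in u]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

# Assumed ensemble: |0> and |+> = (|0> + |1>)/sqrt(2), each with weight 1/2.
r = 1 / math.sqrt(2)
ket0, ket_plus = [1.0, 0.0], [r, r]
p = [0.5, 0.5]
U = [outer(ket0, ket0), outer(ket_plus, ket_plus)]
W = add(scale(p[0], U[0]), scale(p[1], U[1]))

# (III. 53): probability to find |u_0> = |0> in the state W; it is not p[0].
prob_u0 = trace(matmul(U[0], W))

# Spectral decomposition of the real symmetric matrix W = [[a, b], [b, d]].
a, b, d = W[0][0], W[0][1], W[1][1]
disc = math.sqrt((a - d) ** 2 + 4 * b * b)
eigenvalues = [(a + d + disc) / 2, (a + d - disc) / 2]

def eigenvector(lam):
    """Normalized solution of (W - lam) v = 0, assuming b != 0."""
    x, y = b, lam - a
    n = math.hypot(x, y)
    return [x / n, y / n]

W_spectral = [[0.0, 0.0], [0.0, 0.0]]
for lam in eigenvalues:
    v = eigenvector(lam)
    W_spectral = add(W_spectral, scale(lam, outer(v, v)))

reconstruction_error = max(abs(W[i][j] - W_spectral[i][j])
                           for i in range(2) for j in range(2))

print("p_0 =", p[0], "but Tr U_0 W =", prob_u0)
print("eigenvalues of W:", eigenvalues)
print("max deviation between the two decompositions:", reconstruction_error)
```

Both decompositions give the same operator W up to rounding, although the ensembles they suggest, two non-orthogonal states with equal weights versus two orthogonal eigenstates with unequal weights, look quite different.<br />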
This can be compared with the fact that a pure state |ψ⟩ can be written in numerous ways as a<br />
superposition of other pure states, which corresponds to different ways of preparation of |ψ⟩ by superposition<br />
of other states, for instance in a tilted Stern - Gerlach apparatus in case of measurement of<br />
spin. We can no longer see if |ψ⟩ is, for example, a superposition of spin up and down in the z - direction,<br />
or of spin up and down in the x - direction.<br />
For pure states this seems completely natural; it is a direct consequence of the state postulate,<br />
according to which the state vectors belong to a vector space. In case of mixed states the situation is less clear. It can be<br />
maintained that an ensemble, of which each system is in the state |u k ⟩ with probability p k , really<br />
differs from an ensemble of systems which are in a state |w i , j⟩ with probability w i , even though the<br />
expectation values of all physical quantities are equal for both ensembles. In that case, from<br />
\[ W = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, |w_i, j\rangle \langle w_i, j| = \sum_{k=1}^{K} p_k\, |u_k\rangle \langle u_k| \tag{III.54} \]<br />
it has to be concluded that the state operator W characterizes these ensembles incompletely. There is<br />
no postulate in quantum mechanics by which this is prohibited.<br />
Another view is, however, that the state operator is a complete description of a state and that the different<br />
possible ways of preparation are not retrievable from the state W. Consequently, the conclusion has<br />
to be that W, in (III. 51), does not characterize an ensemble which consists of a mixture of systems in<br />
pure states $|u_k\rangle$, but that an ensemble characterized by W only presents itself as such an ensemble upon<br />
measurement. Here we see again that, in quantum mechanics, we get in trouble if we speak in terms<br />
of what really exists. In section III. 5 we will return to this discussion in the context of improperly<br />
mixed states.<br />
The dynamics of mixed states follows, as in the classical case, from the pure states. Define,<br />
analogously to (III. 41),<br />
\[ W(t) := \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, W_{i,j}(t). \tag{III.55} \]<br />
According to the Schrödinger postulate, (III. 2),<br />
$$|w_i, j, t\rangle := U(t - t_0)\, |w_i, j, t_0\rangle, \qquad \text{(III. 56)}$$<br />
which yields for (III. 55)<br />
$$W(t) = \sum_{i=1}^{M} \sum_{j=1}^{n_i} w_i\, U(t - t_0)\, W_{i,j}(t_0)\, U^\dagger(t - t_0), \qquad \text{(III. 57)}$$<br />
and therefore<br />
$$W(t) = U(t - t_0)\, W(t_0)\, U^\dagger(t - t_0). \qquad \text{(III. 58)}$$<br />
With (III. 11) we find<br />
$$i\hbar\, \frac{d}{dt} W(t) = [H, W(t)], \qquad \text{(III. 59)}$$<br />
which is the analogue of the Liouville equation of motion, (III. 21), describing the time evolution of<br />
the states ρ. Equation (III. 59) is called the Liouville - Von Neumann equation, it is the generalization<br />
of the Schrödinger equation to an equation for mixed states.<br />
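The step from (III. 58) to the Liouville - Von Neumann equation can be spelled out. The sketch below assumes that (III. 11), which is not reproduced here, is the evolution equation $i\hbar\, \frac{d}{dt} U = H U$ for $U = U(t - t_0)$, so that its adjoint reads $i\hbar\, \frac{d}{dt} U^\dagger = - U^\dagger H$:<br />

```latex
i\hbar \frac{d}{dt} W(t)
  = \Big( i\hbar \frac{dU}{dt} \Big) W(t_0)\, U^\dagger
  + U\, W(t_0) \Big( i\hbar \frac{dU^\dagger}{dt} \Big)
  = H\, U W(t_0) U^\dagger - U W(t_0) U^\dagger H
  = [H, W(t)] .
```

The commutator therefore contains the state operator at time t, not at the initial time.<br />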
The extensions of the Schrödinger postulate and the projection postulate for mixed states can now<br />
be formulated.<br />
5 ′ Generalized Schrödinger postulate. If no measurements are made on the physical system, the<br />
time evolution of the state of the system is described by a unitary transformation,<br />
W (t) = U (t − t 0 ) W (t 0 ) U † (t − t 0 ). (III. 60)<br />
6 ′ Generalized projection postulate, discrete case. If the system is in a state W when a measurement<br />
is made on a physical quantity A corresponding to an operator A having a discrete spectrum,<br />
and the outcome of the measurement is the eigenvalue a i ∈ R, the system is, directly<br />
after the measurement, in the eigenspace corresponding to the eigenvalue a i ,<br />
$$W \;\longmapsto\; \frac{P_{a_i} W P_{a_i}}{\mathrm{Tr}\, P_{a_i} W P_{a_i}}. \qquad \text{(III. 61)}$$<br />
◃ Remark<br />
Remember that, in general, the projectors P ai do not have to be 1 - dimensional. ▹<br />
Finally, we give a theorem concerning the generalized Schrödinger postulate which is important<br />
for the measurement problem.<br />
VON NEUMANN’S THEOREM A:<br />
The properties ‘pure’ and ‘mixed’ are invariant under a unitary time evolution.
Proof<br />
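We know that if W is pure, i.e. equal to a 1-dimensional projector, then $W^2 = W$.<br />
Now consider the expression (sometimes called the purity of W):<br />
$$\mathrm{Tr}\, W^2 = \sum_i w_i^2. \qquad \text{(III. 62)}$$<br />
Since $\mathrm{Tr}\, W = 1$ implies $\sum_i w_i = 1$, and W is pure iff exactly one of the $w_i$ is equal to 1 and all<br />
others vanish, we conclude that<br />
$$\mathrm{Tr}\, W^2 = 1 \ \text{iff } W \text{ is pure}; \qquad \mathrm{Tr}\, W^2 < 1 \ \text{iff } W \text{ is mixed}. \qquad \text{(III. 63)}$$<br />
But $\mathrm{Tr}\, W^2$ is invariant under the time evolution (III. 60). Indeed, if we remember that<br />
$U^\dagger(t - t_0) = U^{-1}(t - t_0)$ and that $\mathrm{Tr}\, AB = \mathrm{Tr}\, BA$, it follows that<br />
$$\begin{aligned} \mathrm{Tr}\, \big(W(t)\big)^2 &= \mathrm{Tr}\, U(t-t_0)\, W(t_0)\, U^\dagger(t-t_0)\, U(t-t_0)\, W(t_0)\, U^\dagger(t-t_0) \\ &= \mathrm{Tr}\, U(t-t_0)\, W(t_0)\, W(t_0)\, U^\dagger(t-t_0) = \mathrm{Tr}\, U^\dagger(t-t_0)\, U(t-t_0)\, \big(W(t_0)\big)^2 = \mathrm{Tr}\, \big(W(t_0)\big)^2. \end{aligned} \qquad \text{(III. 64)}$$<br />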
□<br />
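Theorem A can be illustrated numerically for a 2-dimensional Hilbert space. The sketch below is only an illustration under stated assumptions: the unitary is taken to be a spin rotation of the form (III. 135), standing in for U (t − t 0 ), and the two test states `W_pure` and `W_mixed` are chosen arbitrarily.<br />

```python
import math

def matmul(A, B):
    """Matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(A):
    """Hermitian conjugate of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]

def purity(W):
    """Tr W^2, cf. (III. 62)."""
    W2 = matmul(W, W)
    return complex(sum(W2[i][i] for i in range(len(W)))).real

def rotation(m, alpha):
    """cos(alpha/2) 1 - i sin(alpha/2) m.sigma for a unit vector m."""
    mx, my, mz = m
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    return [[c - 1j * s * mz, -1j * s * (mx - 1j * my)],
            [-1j * s * (mx + 1j * my), c + 1j * s * mz]]

def evolve(W, U):
    """W -> U W U^dagger, cf. (III. 60)."""
    return matmul(matmul(U, W), dagger(U))

W_pure = [[1, 0], [0, 0]]          # 1-dimensional projector
W_mixed = [[0.75, 0], [0, 0.25]]   # weights 3/4 and 1/4
U = rotation((0.6, 0.0, 0.8), 1.3)

for W in (W_pure, W_mixed):
    print(purity(W), purity(evolve(W, U)))
```

In both cases the purity before and after the transformation agrees up to rounding, so a pure state stays pure and a mixed state stays mixed.<br />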
III. 4 COMPOSITE SYSTEMS<br />
Suppose that a system S is composed of two subsystems $S_I$ and $S_{II}$. The Hilbert spaces associated<br />
with $S_I$ and $S_{II}$ are $H_I$ and $H_{II}$, with $\dim H_I = N_I$ and $\dim H_{II} = N_{II}$; the Hilbert space<br />
associated with S is the direct product space $H = H_I \otimes H_{II}$, with $\dim H = N = N_I N_{II}$. If $|\alpha_1\rangle, \ldots, |\alpha_{N_I}\rangle$<br />
and $|\beta_1\rangle, \ldots, |\beta_{N_{II}}\rangle$ are bases of $H_I$ and $H_{II}$, then $\{|\alpha_i\rangle \otimes |\beta_j\rangle\}$ forms a basis in H. An<br />
arbitrary vector in H is a superposition of such direct products of basis vectors and is generally not of<br />
the form $|\psi\rangle \otimes |\phi\rangle$, with $|\psi\rangle \in H_I$ and $|\phi\rangle \in H_{II}$. Consequently, one cannot say for such an arbitrary<br />
state in H that the subsystems are in some pure state in $H_I$ or $H_{II}$.<br />
This entanglement of the subsystems, when |Ψ⟩ ≠ |ψ⟩ ⊗ |ϕ⟩, with |Ψ⟩ ∈ H, which is characteristic<br />
for quantum mechanics, has no analogue in classical mechanics. It is a consequence of the formal<br />
requirement that the state space of a composite system is also a vector space. Entanglement is the aspect<br />
of the quantum mechanical description that gives rise to the EPR - paradox and the measurement<br />
problem as we shall see in later chapters.<br />
The quantities of system S correspond to self - adjoint operators in H. We make the supposition<br />
that quantities of the subsystem S I correspond to operators of the form A ⊗ 11 in H, where A is<br />
a self - adjoint operator in H I , and quantities of S II correspond analogously to operators of the<br />
form 11 ⊗ B, with B in H II . A state of S is given by a state operator W in H; W ∈ S (H). In<br />
general, W is not a direct product of operators, but in case W can be written as a direct product, we<br />
write W = W 1 ⊗ W 2 , with W 1 and W 2 state operators in H I and H II , respectively.
EXERCISE 17. Prove the following statements.<br />
(a) W = W 1 ⊗ W 2 is a state operator if W 1 and W 2 are state operators.<br />
(b) The opposite of (a) is not true; give a counterexample.<br />
(c) W = W 1 ⊗ W 2 is pure iff both W 1 and W 2 are pure.<br />
EXERCISE 18. Prove that for all vectors |ψ⟩, |ψ ′ ⟩ ∈ H I and |ϕ⟩, |ϕ ′ ⟩ ∈ H II we have<br />
$$\big( |\psi\rangle \otimes |\phi\rangle \big) \big( \langle \psi'| \otimes \langle \phi'| \big) = |\psi\rangle \langle \psi'| \otimes |\phi\rangle \langle \phi'|. \qquad \text{(III. 65)}$$<br />
THEOREM:<br />
If W is a direct product of operators, W = W 1 ⊗ W 2 , the subsystems are mutually<br />
independent, i.e., the probability to find for A⊗11 the value a i and for 11⊗B the value b j<br />
is equal to the product of the separate probabilities. In this case the expectation values<br />
factorize too, such that $\langle A \otimes B \rangle_{W_1 \otimes W_2} = \langle A \rangle_{W_1} \langle B \rangle_{W_2}$.<br />
Proof<br />
Let a i and b j be eigenvalues of A and B, respectively. Using (III. 65) we see that the projector on<br />
the eigenstate |a i ⟩ ⊗ |b j ⟩ of A ⊗ B is P |ai⟩ ⊗ P |bj⟩. Therefore, with (II. 102), p. 33,<br />
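$$\begin{aligned} \mu_W\big(P_{|a_i\rangle} \otimes P_{|b_j\rangle}\big) &= \mathrm{Tr}\, \big(P_{|a_i\rangle} \otimes P_{|b_j\rangle}\big)\big(W_1 \otimes W_2\big) = \mathrm{Tr}\, \big(P_{|a_i\rangle} W_1 \otimes P_{|b_j\rangle} W_2\big) \\ &= \mathrm{Tr}\, P_{|a_i\rangle} W_1 \; \mathrm{Tr}\, P_{|b_j\rangle} W_2 = \mu_{W_1}\big(P_{|a_i\rangle}\big)\, \mu_{W_2}\big(P_{|b_j\rangle}\big) \\ &= \mu_W\big(P_{|a_i\rangle} \otimes \mathbb{1}\big)\, \mu_W\big(\mathbb{1} \otimes P_{|b_j\rangle}\big), \end{aligned} \qquad \text{(III. 66)}$$<br />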
which proves the first part of the theorem.<br />
For the factorization of the expectation values, we have, analogously,<br />
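$$\langle A \otimes B \rangle_{W_1 \otimes W_2} = \mathrm{Tr}\, (A \otimes B)(W_1 \otimes W_2) = \mathrm{Tr}\, A W_1 \; \mathrm{Tr}\, B W_2 = \langle A \rangle_{W_1} \langle B \rangle_{W_2}, \qquad \text{(III. 67)}$$<br />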
and we see that the expectation values indeed factorize. □<br />
From (III. 67) we also see that, if W = W 1 ⊗ W 2 , then ⟨A ⊗ 11⟩ W = Tr A W 1 = ⟨A⟩ W1<br />
and ⟨11 ⊗ B⟩ W = ⟨B⟩ W2 , but this does not hold for more general statistical operators W .
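<br />
The factorization (III. 67) can be checked on a small numerical instance. This is a sketch for two 2-dimensional subsystems; the state operators `W1`, `W2` and the choice A = σ x , B = σ z are arbitrary illustrations, and the index convention of `kron` (first factor varies slowest) is an assumption of this sketch.<br />

```python
def matmul(A, B):
    """Matrix product of nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def kron(A, B):
    """Direct (Kronecker) product A (x) B; row index is i*len(B) + k."""
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

identity = [[1, 0], [0, 1]]
sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]

# Two illustrative state operators (Hermitian, trace 1, positive)
W1 = [[0.75, 0.25], [0.25, 0.25]]
W2 = [[0.6, 0.2j], [-0.2j, 0.4]]
W = kron(W1, W2)

A, B = sigma_x, sigma_z
lhs = trace(matmul(kron(A, B), W))                  # <A (x) B>_W
rhs = trace(matmul(A, W1)) * trace(matmul(B, W2))   # <A>_W1 <B>_W2
local = trace(matmul(kron(A, identity), W))         # <A (x) 1>_W

print(lhs, rhs, local)
```

For a product state the joint expectation value equals the product of the separate ones, and the expectation value of A ⊗ 11 depends on W 1 alone.<br />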
With (II. 99), for an arbitrary state operator W , hence in general W ≠ W 1 ⊗ W 2 , the expectation<br />
value of A ⊗ 11 is<br />
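$$\begin{aligned} \langle A \otimes \mathbb{1} \rangle_W &= \mathrm{Tr}\, (A \otimes \mathbb{1}) W \\ &= \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \big( \langle \alpha_i| \otimes \langle \beta_j| \big) \big( A \otimes \mathbb{1} \big)\, W\, \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big) \\ &= \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \sum_{j=1}^{N_{II}} \big( \langle \alpha_i| \otimes \langle \beta_j| \big) \big( A\, |\alpha_k\rangle \langle \alpha_k| \otimes \mathbb{1} \big)\, W\, \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big) \\ &= \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \langle \alpha_i|\, A\, |\alpha_k\rangle \sum_{j=1}^{N_{II}} \big( \langle \alpha_k| \otimes \langle \beta_j| \big)\, W\, \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big). \end{aligned} \qquad \text{(III. 68)}$$<br />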
To find ⟨A ⊗ 11⟩ W , define the operator W I in H I , called the partial trace of W in relation to H II ,<br />
$$W_I = \mathrm{Tr}_{II}\, W := \sum_{j=1}^{N_{II}} \langle \beta_j|\, W\, |\beta_j\rangle, \qquad W_I \in S(H_I). \qquad \text{(III. 69)}$$<br />
For this partial trace it holds that<br />
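$$\langle \alpha_k|\, W_I\, |\alpha_i\rangle = \sum_{j=1}^{N_{II}} \big( \langle \alpha_k| \otimes \langle \beta_j| \big)\, W\, \big( |\alpha_i\rangle \otimes |\beta_j\rangle \big), \qquad \langle \alpha_k|\, W_I\, |\alpha_i\rangle \in \mathbb{C}, \qquad \text{(III. 70)}$$<br />
and substituting (III. 70) in (III. 68) yields<br />
$$\langle A \otimes \mathbb{1} \rangle_W = \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \langle \alpha_i|\, A\, |\alpha_k\rangle \langle \alpha_k|\, W_I\, |\alpha_i\rangle = \mathrm{Tr}\, A W_I = \langle A \rangle_{W_I}. \qquad \text{(III. 71)}$$<br />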
Analogously, with W II the partial trace of W in relation to H I ,<br />
$$W_{II} = \mathrm{Tr}_{I}\, W := \sum_{i=1}^{N_I} \langle \alpha_i|\, W\, |\alpha_i\rangle, \qquad W_{II} \in S(H_{II}), \qquad \text{(III. 72)}$$<br />
we see that<br />
⟨11 ⊗ B⟩ W = Tr BW II = ⟨B⟩ WII . (III. 73)<br />
EXERCISE 19. Prove that Tr II W and Tr I W are state operators in H I and H II , respectively.
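<br />
The partial trace (III. 69) and the relation (III. 71) can be sketched in a few lines. The code below assumes that the product basis vector |α i ⟩ ⊗ |β j ⟩ is stored at index i·N II + j; the state vector `psi` and the operator `A` are arbitrary illustrations, not taken from the text.<br />

```python
def matmul(A, B):
    """Matrix product of nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def kron(A, B):
    """Direct product A (x) B; row index is i*len(B) + k."""
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def partial_trace_II(W, n1, n2):
    """<alpha_k| W_I |alpha_i> = sum_j W[(k, j)][(i, j)], cf. (III. 70)."""
    return [[sum(W[k * n2 + j][i * n2 + j] for j in range(n2)) for i in range(n1)]
            for k in range(n1)]

# An illustrative normalized, non-factorizable vector in a 2 x 2 dimensional space
psi = [0.8, 0.0, 0.36, 0.48]
W = [[a * b for b in psi] for a in psi]   # |psi><psi| (real coefficients)

identity = [[1, 0], [0, 1]]
A = [[1.0, 0.5], [0.5, -1.0]]             # a Hermitian operator on the first subsystem

W_I = partial_trace_II(W, 2, 2)
full = trace(matmul(kron(A, identity), W))    # Tr (A (x) 1) W
reduced = trace(matmul(A, W_I))               # Tr A W_I

print(W_I)
print(full, reduced)
```

The two expectation values coincide, and W I has trace 1, as expected of a state operator in H I .<br />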
Concerning the expectation values of the quantities of the subsystem S I alone we can replace the<br />
state W by the partial trace, or state operator, Tr II W in H I , analogously for S II . Therefore it is<br />
customary to let the states of the subsystems correspond to the partial traces Tr II W and Tr I W .<br />
For the partial traces it holds that if W is a direct product of state operators W 1 and W 2 in H I<br />
and H II , respectively, W can also be written as a direct product of its partial traces, which we now<br />
show in a lemma.<br />
LEMMA:<br />
If W is a direct product of the form W = W 1 ⊗ W 2 , where W 1 and W 2 are state operators<br />
in H I and H II , respectively, then Tr II W = W 1 and Tr I W = W 2 .<br />
Proof<br />
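$$\mathrm{Tr}_{II}\, W = \mathrm{Tr}_{II}\, (W_1 \otimes W_2) = \sum_{j=1}^{N_{II}} \langle \beta_j|\, W_1 \otimes W_2\, |\beta_j\rangle = W_1 \sum_{j=1}^{N_{II}} \langle \beta_j|\, W_2\, |\beta_j\rangle = W_1\, \mathrm{Tr}\, W_2 = W_1, \qquad \text{(III. 74)}$$<br />
likewise,<br />
$$\mathrm{Tr}_{I}\, (W_1 \otimes W_2) = W_2. \qquad \square \qquad \text{(III. 75)}$$<br />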
From this lemma we see that W = W 1 ⊗ W 2 = Tr II W ⊗ Tr I W , and with the first theorem<br />
of this section, p. 56, this leads to the conclusion that if W is a direct product of its partial traces, it<br />
can be uniquely reconstructed from its partial traces. Generally, an arbitrary state operator W of the<br />
composite system can not be defined by its partial traces, which was shown by Von Neumann.<br />
VON NEUMANN’S THEOREM B:<br />
The partial traces Tr II W and Tr I W uniquely define W , iff at least one of the partial<br />
traces is pure, in which case W is factorizable,<br />
W = Tr II W ⊗ Tr I W. (III. 76)<br />
Proof<br />
Let {|u i ⟩} be a basis of eigenstates of W I having non - degenerate eigenvalues. Leaving out the<br />
eigenvalues p n and u i which are equal to 0, expand W and Tr II W in their eigenvectors,<br />
$$W = \sum_{n=1}^{N} p_n\, |\psi_n\rangle \langle \psi_n| \quad \text{with } |\psi_n\rangle \in H \qquad \text{(III. 77)}$$<br />
and<br />
$$\mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i| \quad \text{with } |u_i\rangle \in H_I. \qquad \text{(III. 78)}$$<br />
◃ Remark<br />
Because the eigenvalues $u_i = 0$ are left out, the eigenvectors $|u_i\rangle$ with eigenvalue 0 do not occur in the<br />
expansion of $\mathrm{Tr}_{II}\, W$; they do, however, belong to the complete basis $\{|u_i\rangle\}$. ▹<br />
Let {|v j ⟩} be a basis in H II . Then {|u i ⟩ ⊗ |v j ⟩} is a basis in H, and |ψ n ⟩ can be expanded as<br />
$$|\psi_n\rangle = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \psi^n_{ij}\, |u_i\rangle \otimes |v_j\rangle = \sum_{i=1}^{N_I} |u_i\rangle \otimes |\phi^n_i\rangle \qquad \text{(III. 79)}$$<br />
where<br />
$$|\phi^n_i\rangle := \sum_{j=1}^{N_{II}} \psi^n_{ij}\, |v_j\rangle \in H_{II}. \qquad \text{(III. 80)}$$<br />
These $|\phi^n_i\rangle$ are, in general, not orthogonal. Substituting (III. 79) in (III. 77) we have<br />
$$W = \sum_{n=1}^{N} p_n \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} |u_i\rangle \langle u_k| \otimes |\phi^n_i\rangle \langle \phi^n_k|. \qquad \text{(III. 81)}$$<br />
Substitution of (III. 81) in (III. 69) yields<br />
$$\begin{aligned} \mathrm{Tr}_{II}\, W = \sum_{l=1}^{N_{II}} \langle \beta_l|\, W\, |\beta_l\rangle &= \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} \sum_{l=1}^{N_{II}} p_n\, |u_i\rangle \langle u_k|\, \langle \beta_l | \phi^n_i \rangle \langle \phi^n_k | \beta_l \rangle \\ &= \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} p_n\, \langle \phi^n_k | \phi^n_i \rangle\, |u_i\rangle \langle u_k|. \end{aligned} \qquad \text{(III. 82)}$$<br />
With $\{|\psi_i\rangle\}$ a basis, the coefficients in the expansion of an operator of the form $\sum_{ij} c_{ij}\, |\psi_i\rangle \langle \psi_j|$<br />
are unique, and comparison of (III. 82) with (III. 78) gives<br />
$$\sum_{n=1}^{N} p_n\, \langle \phi^n_k | \phi^n_i \rangle = u_i\, \delta_{ik}, \qquad \text{(III. 83)}$$<br />
therefore,<br />
$$\mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} u_i\, \delta_{ik}\, |u_i\rangle \langle u_k| = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i|. \qquad \text{(III. 84)}$$<br />
◃ Remark<br />
In (III. 83) it follows for i = k, due to the positivity of the p n , that if u i = 0 for certain i,<br />
then |ϕ n i ⟩ = 0 for all n and we see that in (III. 79) only the terms appear for which u i ≠ 0.<br />
Consequently, the same terms occur in (III. 79) as in the expansion (III. 78) of Tr II W . ▹
If Tr II W is pure, there is only one term<br />
Tr II W = |u 1 ⟩ ⟨u 1 |, (III. 85)<br />
and substitution in (III. 79) yields<br />
$$|\psi_n\rangle = |u_1\rangle \otimes |\phi^n_1\rangle. \qquad \text{(III. 86)}$$<br />
Therefore,<br />
$$W = \sum_{n=1}^{N} p_n\, |u_1\rangle \langle u_1| \otimes |\phi^n_1\rangle \langle \phi^n_1| = |u_1\rangle \langle u_1| \otimes \sum_{n=1}^{N} p_n\, |\phi^n_1\rangle \langle \phi^n_1|. \qquad \text{(III. 87)}$$<br />
Analogous to (III. 82) we find for<br />
$$\mathrm{Tr}_{I}\, W = \sum_{n=1}^{N} \sum_{i=1}^{N_I} \sum_{k=1}^{N_I} p_n\, \langle u_k | u_i \rangle\, |\phi^n_i\rangle \langle \phi^n_k|. \qquad \text{(III. 88)}$$<br />
With i = k = 1 and ⟨u 1 | u 1 ⟩ = 1 we have<br />
$$\mathrm{Tr}_{I}\, W = \sum_{n=1}^{N} p_n\, |\phi^n_1\rangle \langle \phi^n_1|. \qquad \text{(III. 89)}$$<br />
Substituting (III. 89) in (III. 87) we see that W = Tr II W ⊗ Tr I W . Indeed, if one of the partial<br />
traces is pure, W is factorizable, and therefore completely determined, by its partial traces.<br />
To show the ‘only if’ - part of the theorem, that Tr II W and Tr I W uniquely define the state W<br />
of the composite system only if at least one of the partial traces is pure, since only in that<br />
case W is factorizable, we decompose them into orthogonal 1 - dimensional eigenprojectors,<br />
where both u i , v j ∈ [0, 1] sum up to 1 as required in (III. 35) for the projectors to be state<br />
operators,<br />
$$\mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i| := \sum_{i=1}^{N_I} u_i\, U_i, \qquad \text{(III. 90)}$$<br />
$$\mathrm{Tr}_{I}\, W = \sum_{j=1}^{N_{II}} v_j\, |v_j\rangle \langle v_j| := \sum_{j=1}^{N_{II}} v_j\, V_j. \qquad \text{(III. 91)}$$<br />
It then holds that<br />
$$\mathrm{Tr}_{II}\, W \otimes \mathrm{Tr}_{I}\, W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} u_i v_j\, U_i \otimes V_j. \qquad \text{(III. 92)}$$<br />
Now consider an operator W of the form<br />
$$W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij}\, U_i \otimes V_j, \qquad \text{(III. 93)}$$<br />
which is, in general, not factorizable.<br />
EXERCISE 20. Prove that U i ⊗ V j is a 1 - dimensional projector in H.<br />
The operator W , (III. 93), is a state operator if<br />
$$z_{ij} \in [0, 1] \quad \text{and} \quad \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij} = 1, \qquad \text{(III. 94)}$$<br />
furthermore, with (III. 69) and (III. 72) we have<br />
$$\mathrm{Tr}_{II}\, W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij}\, U_i \qquad \text{(III. 95)}$$<br />
and<br />
$$\mathrm{Tr}_{I}\, W = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} z_{ij}\, V_j. \qquad \text{(III. 96)}$$<br />
This system has an infinite number of solutions for the unknown z ij , unless one of the partial<br />
traces is pure, e.g. Tr II W = U 1 . In that case, according to (III. 95) it has to hold for i = 1<br />
that ∑ j z 1j = 1. But then (III. 35) requires that ∑ j z ij = 0 if i ≠ 1, which means that, because<br />
of the non - negativity of the z ij , it has to hold that z ij = 0 if i ≠ 1. Substituting i = 1 in (III. 93)<br />
yields<br />
$$W = \sum_{j=1}^{N_{II}} z_{1j}\, U_1 \otimes V_j = U_1 \otimes \sum_{j=1}^{N_{II}} z_{1j}\, V_j = \mathrm{Tr}_{II}\, W \otimes \mathrm{Tr}_{I}\, W, \qquad \text{(III. 97)}$$<br />
where the last step is in accordance with (III. 96).<br />
We conclude that only if, at least, one of the partial traces is pure, W is factorizable. □<br />
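A small instance may make the non-uniqueness concrete. The following sketch assumes $N_I = N_{II} = 2$; the two coefficient choices $z_{ij}$ and $z'_{ij}$ are illustrative and not taken from the text:<br />

```latex
z_{ij} = \tfrac14 \ \ (i, j = 1, 2), \qquad z'_{ij} = \tfrac12\, \delta_{ij}
\quad \Longrightarrow \quad
\sum_{j} z_{ij} = \sum_{j} z'_{ij} = \tfrac12 , \qquad
\sum_{i} z_{ij} = \sum_{i} z'_{ij} = \tfrac12 .
```

Both choices satisfy (III. 94) and, through (III. 95) and (III. 96), give the same partial traces $\tfrac12 (U_1 + U_2)$ and $\tfrac12 (V_1 + V_2)$, yet $W = \tfrac14 \sum_{ij} U_i \otimes V_j$ differs from $W' = \tfrac12 (U_1 \otimes V_1 + U_2 \otimes V_2)$.<br />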
In the foregoing we saw that the state operator W of a composite system is uniquely defined by its<br />
partial traces only if it is factorizable. Contrary to classical physics, in quantum mechanics maximal knowledge of<br />
the state of the subsystems is in general not equivalent to maximal knowledge of the state of the entire<br />
system. Consequently, the state of the entire system can, generally, not be derived from measurements<br />
on the separate subsystems. 4<br />
If the partial traces of $W = W_1 \otimes W_2$ are both pure, W is also pure, as we saw in the exercise<br />
on p. 56, and since the pure partial traces each have only one term, W is of the form |u⟩ ⟨u| ⊗ |v⟩ ⟨v|.<br />
On the other hand, a pure state in H is, generally, not factorizable, which we will show in an example.<br />
EXAMPLE<br />
If |u i ⟩ and |v j ⟩ span a basis in H I and H II , respectively, an arbitrary vector |ψ⟩ in H = H I ⊗ H II<br />
is of the form<br />
$$|\psi\rangle = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} c_{ij}\, |u_i\rangle \otimes |v_j\rangle. \qquad \text{(III. 98)}$$<br />
An arbitrary pure state in H is therefore of the form<br />
$$|\psi\rangle \langle \psi| = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} \sum_{k=1}^{N_I} \sum_{l=1}^{N_{II}} c^*_{kl}\, c_{ij}\, \big( |u_i\rangle \otimes |v_j\rangle \big) \big( \langle u_k| \otimes \langle v_l| \big). \qquad \text{(III. 99)}$$<br />
Consider the following pure entangled state in H,<br />
$$|\Phi\rangle = \tfrac12 \sqrt{2}\, \big( |u_1\rangle \otimes |v_1\rangle + |u_2\rangle \otimes |v_2\rangle \big). \qquad \text{(III. 100)}$$<br />
The corresponding W is the 1 - dimensional projector<br />
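$$\begin{aligned} W = |\Phi\rangle \langle \Phi| = \tfrac12 \big( &|u_1\rangle \langle u_1| \otimes |v_1\rangle \langle v_1| + |u_1\rangle \langle u_2| \otimes |v_1\rangle \langle v_2| \\ &+ |u_2\rangle \langle u_1| \otimes |v_2\rangle \langle v_1| + |u_2\rangle \langle u_2| \otimes |v_2\rangle \langle v_2| \big). \end{aligned} \qquad \text{(III. 101)}$$<br />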
This pure state W is not factorizable, and cannot be written in the form (III. 93). But although W<br />
is pure, its partial traces are not pure,<br />
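$$\mathrm{Tr}_{II}\, W = \sum_{j=1}^{N_{II}} \langle v_j | \Phi \rangle \langle \Phi | v_j \rangle = \tfrac12 \big( |u_1\rangle \langle u_1| + |u_2\rangle \langle u_2| \big), \qquad \text{(III. 102)}$$<br />
$$\mathrm{Tr}_{I}\, W = \sum_{i=1}^{N_I} \langle u_i | \Phi \rangle \langle \Phi | u_i \rangle = \tfrac12 \big( |v_1\rangle \langle v_1| + |v_2\rangle \langle v_2| \big), \qquad \text{(III. 103)}$$<br />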
and indeed,<br />
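$$\begin{aligned} W_I \otimes W_{II} = \tfrac14 \big( &|u_1\rangle \langle u_1| \otimes |v_1\rangle \langle v_1| + |u_1\rangle \langle u_1| \otimes |v_2\rangle \langle v_2| \\ &+ |u_2\rangle \langle u_2| \otimes |v_1\rangle \langle v_1| + |u_2\rangle \langle u_2| \otimes |v_2\rangle \langle v_2| \big) \neq W. \end{aligned} \qquad \text{(III. 104)}$$<br />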
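The example can be reproduced numerically. The sketch below assumes 2-dimensional subsystems with |u i ⟩ ⊗ |v j ⟩ stored at index 2i + j (counting from 0); the function names are illustrative only.<br />

```python
import math

def kron(A, B):
    """Direct product A (x) B; row index is i*len(B) + k."""
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def partial_trace_II(W, n1, n2):
    """Trace out the second subsystem, cf. (III. 69)."""
    return [[sum(W[k * n2 + j][i * n2 + j] for j in range(n2)) for i in range(n1)]
            for k in range(n1)]

def partial_trace_I(W, n1, n2):
    """Trace out the first subsystem, cf. (III. 72)."""
    return [[sum(W[i * n2 + k][i * n2 + l] for i in range(n1)) for l in range(n2)]
            for k in range(n2)]

def purity(W):
    """Tr W^2 for a real symmetric matrix W."""
    n = len(W)
    return sum(W[a][b] * W[b][a] for a in range(n) for b in range(n))

s = 1 / math.sqrt(2)
Phi = [s, 0.0, 0.0, s]                     # (|u1 v1> + |u2 v2>)/sqrt(2), cf. (III. 100)
W = [[a * b for b in Phi] for a in Phi]    # |Phi><Phi|, cf. (III. 101)

W_I = partial_trace_II(W, 2, 2)
W_II = partial_trace_I(W, 2, 2)
product = kron(W_I, W_II)
diff = max(abs(W[a][b] - product[a][b]) for a in range(4) for b in range(4))

print(purity(W), purity(W_I), purity(W_II))
print("largest entry of W - W_I (x) W_II:", diff)
```

The composite state has purity 1 while both partial traces have purity 1/2, and the product of the partial traces differs from W, in line with (III. 104).<br />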
4 This aspect of the quantum mechanical state description is, however, analogous to a classical state description with a<br />
probability distribution. The two - particle distribution function $\rho(q_1, p_1; q_2, p_2)$ is not uniquely defined by the marginal<br />
distribution functions<br />
$$\rho_1(q_1, p_1) = \int \rho(q_1, p_1; q_2, p_2)\, dq_2\, dp_2 \quad \text{and} \quad \rho_2(q_2, p_2) = \int \rho(q_1, p_1; q_2, p_2)\, dq_1\, dp_1;$$<br />
the marginals are, after all, analogous to the partial traces.
III. 4. 1 SUMMARY<br />
1. The state operator W ∈ S (H) of a composite system, whether pure or not, is not factorizable<br />
in general.<br />
2. If W is factorizable, the factors are equal to the partial traces of W ,<br />
W = W 1 ⊗ W 2 implies W 1 = Tr II W and W 2 = Tr I W. (III. 105)<br />
3. The partial traces uniquely define W iff, at least, one of the partial traces is pure, in which<br />
case W is directly factorizable, W = W 1 ⊗ W 2 .<br />
4. The partial traces of W are pure iff W is pure and of the form W = ( |u⟩ ⊗ |v⟩ )( ⟨u| ⊗ ⟨v| ) ,<br />
with |u⟩ ∈ H I and |v⟩ ∈ H II .<br />
III. 5 PROPER AND IMPROPER MIXTURES<br />
The states of composite systems shed new insight on the interpretation of mixtures. Suppose that<br />
W I and W II are the partial traces of an arbitrary state operator W , and, with u i , v j ∈ [0, 1], it holds<br />
that<br />
$$W_I = \sum_{i=1}^{N_I} u_i\, |u_i\rangle \langle u_i| \quad \text{and} \quad W_{II} = \sum_{j=1}^{N_{II}} v_j\, |v_j\rangle \langle v_j|. \qquad \text{(III. 106)}$$<br />
W I and W II contain all quantum mechanical information about results of measurements on the subsystems<br />
in H I and H II . The question is whether we can interpret this by assuming that the individual<br />
subsystems are in the pure states |u i ⟩ and |v j ⟩, with probabilities u i and v j , respectively. If this were<br />
the case, the composite system could be divided in subensembles of systems in the states |u i ⟩ ⊗ |v j ⟩<br />
with probabilities depending on possible correlations between the values of i and j. The state would<br />
be of the form<br />
$$W' = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij}\, \big( |u_i\rangle \otimes |v_j\rangle \big) \big( \langle u_i| \otimes \langle v_j| \big) = \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij}\, |u_i\rangle \langle u_i| \otimes |v_j\rangle \langle v_j|. \qquad \text{(III. 107)}$$<br />
The coefficients p ij have to satisfy<br />
$$p_{ij} \in [0, 1], \qquad \sum_{j=1}^{N_{II}} p_{ij} = u_i, \qquad \sum_{i=1}^{N_I} p_{ij} = v_j \qquad \text{and} \qquad \sum_{i=1}^{N_I} \sum_{j=1}^{N_{II}} p_{ij} = 1, \qquad \text{(III. 108)}$$<br />
but otherwise they can be chosen freely. As far as being in one of the states $|u_i\rangle$ or $|v_j\rangle$ can be interpreted<br />
as a property the subsystems possess, all correlations between these properties in the total state can<br />
be expressed by the p ij . If there are no correlations, p ij = u i v j .<br />
But we see that W ′ is of the special form (III. 93), and therefore in general not equal to the arbitrary<br />
state operator W we started with. Hence it cannot be said that the individual subsystems are in the pure<br />
states $|u_i\rangle$ and $|v_j\rangle$. Although $W_I$ and $W_{II}$ are state operators, they cannot be interpreted as mixtures of<br />
pure states. The mixed states $W_I$ and $W_{II}$ are called improper mixtures by B. d’Espagnat (1989, p.61).<br />
Proper mixed states can in principle be taken as an ensemble of systems which are in pure states,<br />
whereas improper mixtures cannot.<br />
The foregoing shows that the concept of mixed states is forced upon us by the theory of composite<br />
systems as a natural extension of the concept of pure states. Even if the composite system is in a pure<br />
state, the subsystems are generally not in pure states. It is therefore not correct to understand mixed states in general as<br />
simple mixtures of pure states, in the way the mixture of pieces in the box of a game of chess consists<br />
of black and white pieces.<br />
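The difference between an improper mixture and a state of the form (III. 107) shows up only in correlations. As a sketch, the code below compares the entangled state (III. 101) with the choice p ij = δ ij /2 in (III. 107), for 2-dimensional subsystems in which |u 1 ⟩, |u 2 ⟩ and |v 1 ⟩, |v 2 ⟩ are taken to be the σ z eigenvectors; these identifications and the variable names are assumptions of the illustration.<br />

```python
def matmul(A, B):
    """Matrix product of nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def kron(A, B):
    """Direct product A (x) B; row index is i*len(B) + k."""
    return [[A[i][j] * B[k][l] for j in range(len(A[0])) for l in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def expectation(op, W):
    """<op>_W = Tr op W."""
    return trace(matmul(op, W))

identity = [[1, 0], [0, 1]]
sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]

# Entangled pure state |Phi><Phi|, cf. (III. 101)
W_entangled = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]
# Mixture of product states with p_11 = p_22 = 1/2, cf. (III. 107)
W_mixture = [[0.5, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0.5]]

local_x = kron(sigma_x, identity)
xx = kron(sigma_x, sigma_x)
zz = kron(sigma_z, sigma_z)

for name, W in (("entangled", W_entangled), ("mixture", W_mixture)):
    print(name, expectation(local_x, W), expectation(zz, W), expectation(xx, W))
```

Local expectation values and the σ z ⊗ σ z correlation agree for both states, but the σ x ⊗ σ x correlation is 1 for the entangled state and 0 for the mixture, so the two state operators are physically distinguishable on the composite system.<br />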
Finally, we make an observation about similar, or identical, particles. A system of similar particles<br />
is described in quantum mechanics by symmetrized states. Consider the following symmetrized<br />
two - particle state<br />
$$|\Psi(1, 2)\rangle = \tfrac12 \sqrt{2}\, \big( |u\rangle \otimes |v\rangle \pm |v\rangle \otimes |u\rangle \big), \qquad \text{(III. 109)}$$<br />
where the first factor in each direct product is related to particle 1, and the second to particle 2. In<br />
this case the two subspaces are identical and |u⟩ and |v⟩ can represent states in both one and the other<br />
subspace. The corresponding state operator is<br />
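$$W = |\Psi(1, 2)\rangle \langle \Psi(1, 2)| = \tfrac12 \big( |u\rangle \langle u| \otimes |v\rangle \langle v| \pm |u\rangle \langle v| \otimes |v\rangle \langle u| \pm |v\rangle \langle u| \otimes |u\rangle \langle v| + |v\rangle \langle v| \otimes |u\rangle \langle u| \big), \qquad \text{(III. 110)}$$<br />
the partial traces are<br />
$$W_I = \mathrm{Tr}_{II}\, W = \tfrac12 \big( |u\rangle \langle u| + |v\rangle \langle v| \big), \qquad \text{(III. 111)}$$<br />
and<br />
$$W_{II} = \mathrm{Tr}_{I}\, W = \tfrac12 \big( |v\rangle \langle v| + |u\rangle \langle u| \big), \qquad \text{(III. 112)}$$<br />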
and we see that the partial traces are identical. We have to say that both particles are in the same state;<br />
we certainly cannot say that one particle is in the state |u⟩ and the other in |v⟩. We cannot assign a<br />
pure state to the separate particles, although the state of the composite system is pure.<br />
III. 6 SPIN 1/2 PARTICLES<br />
The time - dependent Schrödinger equation for the wave function Ψ(q, t) is given by<br />
$$i\hbar\, \frac{\partial \Psi}{\partial t} = - \frac{\hbar^2}{2m} \nabla^2 \Psi + V \Psi. \qquad \text{(III. 113)}$$<br />
In this equation<br />
$$\vec{p} = - i\hbar\, \vec{\nabla} \qquad \text{(III. 114)}$$<br />
is the canonical momentum operator, yielding for the components of the angular momentum $\vec{L} = \vec{q} \times \vec{p}$<br />
for a system in 3 - dimensional space<br />
$$L_i = - i\hbar\, \epsilon_{ijk}\, q_j\, \partial_k. \qquad \text{(III. 115)}$$<br />
These components do not commute,<br />
$$[L_i, L_j] = i\hbar\, \epsilon_{ijk}\, L_k, \qquad \text{(III. 116)}$$<br />
but the operator $\vec{L}^2 = L_x^2 + L_y^2 + L_z^2$ does commute with $\vec{L}$, that is, with each of its components,<br />
of which usually $L_z$ is taken.<br />
The simultaneous eigenstates of $\vec{L}^2$ and $L_z$ are written as |l, m⟩, and their eigenvalues are discrete,<br />
$$\vec{L}^2\, |l, m\rangle = \hbar^2\, l (l + 1)\, |l, m\rangle, \quad \text{with } l = 0, \tfrac12, 1, \tfrac32, \ldots, \qquad \text{(III. 117)}$$<br />
$$L_z\, |l, m\rangle = \hbar\, m\, |l, m\rangle \quad \text{with } m = -l, -l + 1, \ldots, l - 1, l. \qquad \text{(III. 118)}$$<br />
Although the algebraic derivation using the commutation relations allows for half integer values,<br />
for angular momentum ⃗ L the values of l can only be integers to make sense physically. But the half<br />
integer values are included in the description of spin.<br />
Spin $\vec{S}$ is an internal degree of freedom of elementary particles, which cannot easily be described<br />
in classical terms, but is similar to $\vec{L}$. A main difference is that where the value of the angular momentum<br />
of a particle can vary, the value s of spin of a particle is constant. The similarity is that spin has,<br />
like $\vec{L}$, a direction $\vec{n}$ in 3 - dimensional space, and satisfies the commutation relations of (III. 116).<br />
Writing the simultaneous eigenstates of $\vec{S}^2$ and $S_z$ as |s, m⟩, we can use (III. 117) and (III. 118)<br />
again, where $\vec{L}^2$ and $L_z$ are replaced by $\vec{S}^2$ and $S_z$, respectively, and l by s. The eigenvalues of $\vec{S}^2$<br />
and $S_z$ are<br />
$$s = 0: \quad \vec{S}^2 = 0, \quad S_z = 0, \qquad \text{(III. 119)}$$<br />
$$s = \tfrac12: \quad \vec{S}^2 = \tfrac34 \hbar^2, \quad S_z = -\tfrac12 \hbar,\ \tfrac12 \hbar, \qquad \text{(III. 120)}$$<br />
and so on for $s = 0, \tfrac12, 1, \tfrac32, \ldots$. In this section we restrict ourselves to the simplest non - trivial<br />
case, spin 1/2.<br />
For spin 1/2 particles there are only two orthonormal eigenstates, $|\tfrac12, \tfrac12\rangle$ and $|\tfrac12, -\tfrac12\rangle$, called<br />
‘spin up’ and ‘spin down’, usually written as |↑⟩ and |↓⟩, respectively. Together, these eigenstates<br />
form a basis for a spin space, the 2 - dimensional Hilbert space $H = \mathbb{C}^2$.<br />
According to the observables postulate, p. 41, the observable spin corresponds uniquely to a self -<br />
adjoint, or Hermitian, operator A in H. Every Hermitian operator in C 2 can be represented in the<br />
aforementioned basis as a 2 × 2 - matrix,<br />
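$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} a_0 + a_z & a_x - i a_y \\ a_x + i a_y & a_0 - a_z \end{pmatrix} = a_0 \mathbb{1} + a_x \sigma_x + a_y \sigma_y + a_z \sigma_z = a_0 \mathbb{1} + \vec{a} \cdot \vec{\sigma}, \qquad \text{(III. 121)}$$<br />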
with real coefficients a 0 and ⃗a, and ⃗σ defined by the Pauli matrices,<br />
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad \text{(III. 122)}$$<br />
EXERCISE 21. Prove the aforementioned statement.<br />
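Writing the eigenvectors of $\sigma_z$, $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$, as |z ↑⟩ and |z ↓⟩, we have<br />
$$\sigma_z\, |z \uparrow\rangle = |z \uparrow\rangle \quad \text{and} \quad \sigma_z\, |z \downarrow\rangle = - |z \downarrow\rangle. \qquad \text{(III. 123)}$$<br />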
Analogously, let |x ↑⟩, |x ↓⟩ and |y ↑⟩, |y ↓⟩ denote eigenstates for the eigenvalues ±1 of σ x and σ y .<br />
The Pauli matrices have the following properties:<br />
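$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = \mathbb{1}, \qquad \text{(III. 124)}$$<br />
$$\sigma_i \sigma_j = i\, \epsilon_{ijk}\, \sigma_k \quad (i \neq j), \qquad \text{(III. 125)}$$<br />
$$\mathrm{Tr}\, \vec{\sigma} = 0. \qquad \text{(III. 126)}$$<br />
Using the anticommutation relations for the Pauli matrices, $[\sigma_i, \sigma_j]_+ = 0$ for $i \neq j$, which follow directly<br />
from (III. 125), we find a useful relation,<br />
$$(\vec{a} \cdot \vec{\sigma}) (\vec{b} \cdot \vec{\sigma}) = (\vec{a} \cdot \vec{b})\, \mathbb{1} + i\, \vec{\sigma} \cdot (\vec{a} \times \vec{b}) \qquad \text{(III. 127)}$$<br />
from which it follows that<br />
$$(\vec{a} \cdot \vec{\sigma})^2 = \mathbb{1} \quad \text{if } \|\vec{a}\| = 1. \qquad \text{(III. 128)}$$<br />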
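The product relation (III. 127) and its consequence (III. 128) are easy to check numerically. The following is a minimal sketch; the vectors `a`, `b` and `n` are arbitrary test values and the helper names are illustrative.<br />

```python
def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dot_sigma(a):
    """a . sigma = a_x sigma_x + a_y sigma_y + a_z sigma_z as a 2x2 matrix."""
    ax, ay, az = a
    return [[az, ax - 1j * ay], [ax + 1j * ay, -az]]

def cross(a, b):
    """Vector product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

a = (1.0, 2.0, 3.0)
b = (-1.0, 0.5, 2.0)

lhs = matmul(dot_sigma(a), dot_sigma(b))
a_dot_b = sum(x * y for x, y in zip(a, b))
sigma_cross = dot_sigma(cross(a, b))
rhs = [[a_dot_b * (1 if i == j else 0) + 1j * sigma_cross[i][j] for j in range(2)]
       for i in range(2)]
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))

# (n . sigma)^2 for a unit vector n
n = (0.6, 0.0, 0.8)
square = matmul(dot_sigma(n), dot_sigma(n))

print(max_diff)
print(square)
```

The first printed number vanishes up to rounding, and the square of ⃗n · ⃗σ is the unit matrix.<br />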
A 2 × 2 - matrix A has eigenvalues ±1 iff A 2 = 11, and therefore, with ⃗n a unit vector, we see<br />
that the only operators of the form (III. 121) having eigenvalues ±1 are precisely of the form ⃗n · ⃗σ.<br />
This allows us to let spin in the direction ⃗n correspond to the operator<br />
$$\vec{S} = \tfrac12 \hbar\, \vec{n} \cdot \vec{\sigma}. \qquad \text{(III. 129)}$$<br />
We will justify this choice shortly, but first we determine the eigenvectors of the spin operator $\vec{n} \cdot \vec{\sigma}$.<br />
Writing $\vec{n}$ in spherical coordinates<br />
$$\vec{n} = \begin{pmatrix} \sin\theta \cos\phi \\ \sin\theta \sin\phi \\ \cos\theta \end{pmatrix}, \qquad \text{(III. 130)}$$<br />
we have<br />
$$\vec{n} \cdot \vec{\sigma} = \begin{pmatrix} \cos\theta & e^{-i\phi} \sin\theta \\ e^{i\phi} \sin\theta & -\cos\theta \end{pmatrix}, \qquad \text{(III. 131)}$$<br />
with eigenvectors<br />
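$$|\vec{n}, +\rangle = \begin{pmatrix} e^{-\frac{i}{2}\phi} \cos\tfrac12\theta \\ e^{\frac{i}{2}\phi} \sin\tfrac12\theta \end{pmatrix} \quad \text{and} \quad |\vec{n}, -\rangle = \begin{pmatrix} - e^{-\frac{i}{2}\phi} \sin\tfrac12\theta \\ e^{\frac{i}{2}\phi} \cos\tfrac12\theta \end{pmatrix} \qquad \text{(III. 132)}$$<br />
for eigenvalues ±1.<br />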
EXERCISE 22. Verify (III. 132)<br />
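<br />
Before attempting the analytic verification, a numerical spot check of (III. 132) may be helpful. The sketch below evaluates the residual of the eigenvalue equation at one arbitrarily chosen pair of angles; the function names are illustrative.<br />

```python
import cmath
import math

def n_dot_sigma(theta, phi):
    """The matrix (III. 131) for polar angle theta and azimuth phi."""
    return [[math.cos(theta), cmath.exp(-1j * phi) * math.sin(theta)],
            [cmath.exp(1j * phi) * math.sin(theta), -math.cos(theta)]]

def eigenvector(theta, phi, sign):
    """The vectors |n, +> (sign = +1) and |n, -> (sign = -1) of (III. 132)."""
    if sign > 0:
        return [cmath.exp(-0.5j * phi) * math.cos(theta / 2),
                cmath.exp(0.5j * phi) * math.sin(theta / 2)]
    return [-cmath.exp(-0.5j * phi) * math.sin(theta / 2),
            cmath.exp(0.5j * phi) * math.cos(theta / 2)]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][k] * v[k] for k in range(2)) for i in range(2)]

def residual(theta, phi, sign):
    """Largest component of (n . sigma) v - sign * v."""
    v = eigenvector(theta, phi, sign)
    Mv = apply(n_dot_sigma(theta, phi), v)
    return max(abs(Mv[i] - sign * v[i]) for i in range(2))

theta, phi = 1.1, 0.7   # arbitrary test angles
print(residual(theta, phi, +1), residual(theta, phi, -1))
```

Both residuals vanish up to rounding, which is consistent with eigenvalues +1 and −1; this does not replace the general proof asked for in the exercise.<br />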
III. 6. 1 SPIN 1/2 AND ROTATIONS IN SPIN SPACE<br />
A rotation over an angle α ∈ [0, π) around an axis in the direction of the unit vector ⃗m,<br />
with ⃗m ∈ R 3 , can be written as a unitary matrix<br />
$$U(\vec{m}, \alpha) = e^{-\frac{i}{\hbar} \alpha\, (\vec{m} \cdot \vec{J})}, \qquad \text{(III. 133)}$$<br />
where the total angular momentum $\vec{J} = \vec{L} + \vec{S}$ is the infinitesimal generator of rotations. With $\vec{L} = 0$<br />
and writing $S_i = \tfrac12 \hbar\, \sigma_i$, which is, using (III. 124), in accordance with (III. 120) and the not yet justified<br />
(III. 129), the Pauli matrices are the generators of rotations in $\mathbb{C}^2$, leading to<br />
$$U(\vec{m}, \alpha) = e^{-\frac{i}{2} \alpha\, (\vec{m} \cdot \vec{\sigma})}, \qquad \text{(III. 134)}$$<br />
where $\|\vec{m}\|$ is again 1. Using Taylor expansions, with (III. 128) we find for (III. 134)<br />
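$$\begin{aligned} U(\vec{m}, \alpha) &= \sum_{k=0}^{\infty} \frac{(-i)^k\, (\vec{m} \cdot \vec{\sigma})^k}{k!} \big( \tfrac12 \alpha \big)^k \\ &= \sum_{\substack{k=0 \\ k\ \mathrm{even}}}^{\infty} \frac{(-1)^{\frac12 k}}{k!} \big( \tfrac12 \alpha \big)^k\, \mathbb{1} + i\, (\vec{m} \cdot \vec{\sigma}) \sum_{\substack{k=1 \\ k\ \mathrm{odd}}}^{\infty} \frac{(-1)^{\frac12 (k+1)}}{k!} \big( \tfrac12 \alpha \big)^k \\ &= \cos\tfrac12\alpha\ \mathbb{1} - i\, (\vec{m} \cdot \vec{\sigma}) \sin\tfrac12\alpha. \end{aligned} \qquad \text{(III. 135)}$$<br />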
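The closed form (III. 135) can be compared with a truncated Taylor series of the exponential (III. 134). The sketch below uses an arbitrary unit vector and angle; the truncation at 30 terms and the helper names are choices of this illustration.<br />

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dot_sigma(m):
    """m . sigma as a 2x2 matrix."""
    mx, my, mz = m
    return [[mz, mx - 1j * my], [mx + 1j * my, -mz]]

def exp_series(M, terms=30):
    """Truncated Taylor series sum_k M^k / k! for a 2x2 matrix M."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, M)]   # M^k / k!
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

m = (0.6, 0.0, 0.8)   # unit vector
alpha = 2.0

generator = [[-0.5j * alpha * x for x in row] for row in dot_sigma(m)]
series = exp_series(generator)

c, s = math.cos(alpha / 2), math.sin(alpha / 2)
ms = dot_sigma(m)
closed = [[c * (1 if i == j else 0) - 1j * s * ms[i][j] for j in range(2)] for i in range(2)]

max_diff = max(abs(series[i][j] - closed[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```

The difference is at the level of rounding errors, as expected when the series is summed far enough.<br />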
It can be verified that, under a rotation around an axis ⃗m over an angle α, with ⃗n R the unit vector<br />
in the rotated direction, the eigenstates of ⃗n · ⃗σ, (III. 132), transform into the eigenstates of ⃗n R · ⃗σ,<br />
obeying the rotational transformation rules<br />
U (⃗m, α) |⃗n, ±⟩ = |⃗n R , ±⟩. (III. 136)
We illustrate (III. 136) using a rotation of ⃗n in the x z - plane, ϕ = 0, over an angle α around<br />
the y - axis as in diagram III. 2.<br />
[Figure III. 2: A rotated unit vector in the xz - plane. Axes x, y, z; the vector ⃗n makes an angle θ with the z - axis and is rotated over α into ⃗n R .]<br />
For ⃗n and ⃗n R we have<br />
$$\vec{n} = \begin{pmatrix} \sin\theta \\ 0 \\ \cos\theta \end{pmatrix}, \qquad \vec{n}_R = \begin{pmatrix} \sin(\theta + \alpha) \\ 0 \\ \cos(\theta + \alpha) \end{pmatrix}. \qquad \text{(III. 137)}$$<br />
The eigenstates of ⃗n · ⃗σ, using (III. 132), are<br />
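$$|\vec{n}, +\rangle = \begin{pmatrix} \cos\tfrac12\theta \\ \sin\tfrac12\theta \end{pmatrix} = \cos\tfrac12\theta\, |z \uparrow\rangle + \sin\tfrac12\theta\, |z \downarrow\rangle \qquad \text{(III. 138)}$$<br />
and<br />
$$|\vec{n}, -\rangle = \begin{pmatrix} - \sin\tfrac12\theta \\ \cos\tfrac12\theta \end{pmatrix} = - \sin\tfrac12\theta\, |z \uparrow\rangle + \cos\tfrac12\theta\, |z \downarrow\rangle. \qquad \text{(III. 139)}$$<br />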
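Rotating around the y - axis, so that $\vec{m} = \vec{e}_y$ and therefore<br />
$$U(\vec{e}_y, \alpha) = \cos\tfrac12\alpha\ \mathbb{1} - i\, \vec{e}_y \cdot \vec{\sigma}\, \sin\tfrac12\alpha = \begin{pmatrix} \cos\tfrac12\alpha & - \sin\tfrac12\alpha \\ \sin\tfrac12\alpha & \cos\tfrac12\alpha \end{pmatrix}, \qquad \text{(III. 140)}$$<br />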
we have<br />
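$$U(\vec{e}_y, \alpha)\, |\vec{n}, +\rangle = \begin{pmatrix} \cos\tfrac12(\theta + \alpha) \\ \sin\tfrac12(\theta + \alpha) \end{pmatrix} = \cos\tfrac12(\theta + \alpha)\, |z \uparrow\rangle + \sin\tfrac12(\theta + \alpha)\, |z \downarrow\rangle \qquad \text{(III. 141)}$$<br />
and<br />
$$U(\vec{e}_y, \alpha)\, |\vec{n}, -\rangle = \begin{pmatrix} - \sin\tfrac12(\theta + \alpha) \\ \cos\tfrac12(\theta + \alpha) \end{pmatrix} = - \sin\tfrac12(\theta + \alpha)\, |z \uparrow\rangle + \cos\tfrac12(\theta + \alpha)\, |z \downarrow\rangle, \qquad \text{(III. 142)}$$<br />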
and we see that (III. 141) and (III. 142) are indeed the eigenstates |⃗n R , +⟩ and |⃗n R , −⟩ of ⃗n R · ⃗σ.<br />
Comparison of these eigenstates with the eigenstates of ⃗n · ⃗σ, (III. 138) and (III. 139), shows<br />
that (III. 136) is satisfied. As can easily be verified, this holds in general, and we conclude that spin<br />
is represented by the spin operator ⃗n · ⃗σ, founding our choice (III. 129).<br />
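Under a rotation around the y - axis over an angle θ the eigenvectors of $\sigma_z$ transform into<br />
$$U(\vec{e}_y, \theta)\, |z \uparrow\rangle = \big( \cos\tfrac12\theta\ \mathbb{1} - i\, \sigma_y \sin\tfrac12\theta \big)\, |z \uparrow\rangle = \cos\tfrac12\theta\, |z \uparrow\rangle + \sin\tfrac12\theta\, |z \downarrow\rangle \qquad \text{(III. 143)}$$<br />
and, likewise,<br />
$$U(\vec{e}_y, \theta)\, |z \downarrow\rangle = - \sin\tfrac12\theta\, |z \uparrow\rangle + \cos\tfrac12\theta\, |z \downarrow\rangle. \qquad \text{(III. 144)}$$<br />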
Especially, it holds that the eigenvectors of σ x correspond to a rotation of the eigenvectors of σ z<br />
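around the y - axis over $\theta = \tfrac12\pi$,<br />
$$U(\vec{e}_y, \tfrac12\pi)\, |z \uparrow\rangle = \tfrac12 \sqrt{2}\, \big( |z \uparrow\rangle + |z \downarrow\rangle \big) = |x \uparrow\rangle, \qquad \text{(III. 145)}$$<br />
and<br />
$$U(\vec{e}_y, \tfrac12\pi)\, |z \downarrow\rangle = \tfrac12 \sqrt{2}\, \big( |z \downarrow\rangle - |z \uparrow\rangle \big) = |x \downarrow\rangle. \qquad \text{(III. 146)}$$<br />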
EXERCISE 23. Construct, analogously, the states |y ↑⟩ and |y ↓⟩ from |z ↑⟩ and |z ↓⟩ using a<br />
rotation around the x - axis.<br />
Successively rotating over $\tfrac12\pi$ transforms |z ↑⟩ via |x ↑⟩, |z ↓⟩ and |x ↓⟩ into −|z ↑⟩, instead of<br />
into |z ↑⟩, and consequently, we have to rotate |z ↑⟩ over 4π to come back to |z ↑⟩ again. Generally,<br />
a rotation over 2π transforms a state |ϕ⟩ into −|ϕ⟩. This means we cannot simply visualize particles<br />
with spin as tiny spinning tops!<br />
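The sign change under a full turn can be traced step by step. The sketch below applies the matrix (III. 140) for a quarter turn repeatedly to |z ↑⟩, represented as the column vector (1, 0); the names `U_y`, `apply` and `history` are illustrative.<br />

```python
import math

def U_y(alpha):
    """Rotation over alpha around the y-axis in spin space, cf. (III. 140)."""
    c, s = math.cos(alpha / 2), math.sin(alpha / 2)
    return [[c, -s], [s, c]]

def apply(U, v):
    """Matrix-vector product."""
    return [sum(U[i][k] * v[k] for k in range(2)) for i in range(2)]

quarter_turn = U_y(math.pi / 2)
state = [1.0, 0.0]       # |z up>
history = []
for step in range(8):    # eight quarter turns, i.e. a total rotation over 4 pi
    state = apply(quarter_turn, state)
    history.append(state)

for step, vec in enumerate(history, start=1):
    print(step, [round(x, 6) for x in vec])
```

After four quarter turns (2π) the vector is −|z ↑⟩, and only after eight (4π) does it return to |z ↑⟩.<br />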
Finally a useful relation holds. Choosing again ⃗e y for ⃗m, we have U (⃗e y , α) as in (III. 140) which<br />
yields for arbitrary ⃗n, (III. 130),<br />
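$$\begin{aligned} \langle \vec{n}, +|\, U(\vec{e}_y, \alpha)\, |\vec{n}, +\rangle &= \cos\tfrac12\alpha + \big( e^{-i\phi} - e^{i\phi} \big) \cos\tfrac12\theta\, \sin\tfrac12\theta\, \sin\tfrac12\alpha \\ &= \cos\tfrac12\alpha - i \sin\phi\, \sin\theta\, \sin\tfrac12\alpha, \end{aligned} \qquad \text{(III. 147)}$$<br />
from which we see that, if $\vec{n}$ and $\vec{n}_R$ are in the xz - plane, ϕ = 0 or ϕ = π,<br />
$$\langle \vec{n}, + | \vec{n}_R, + \rangle = \cos\tfrac12 \alpha_{\vec{n}\, \vec{n}_R}, \qquad \text{(III. 148)}$$<br />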
where α ⃗n ⃗nR is the angle between ⃗n and ⃗n R . Because ⃗n and α can be chosen arbitrarily, this relation<br />
holds for any two vectors ⃗n and ⃗n ′ in the xz - plane, and, by freedom of choice of the coordinate<br />
system, it holds whenever ⃗n and ⃗n ′ are in the same plane.
EXERCISE 24. Show that the operator $\tfrac12(\mathbb{1} + \vec n\cdot\vec\sigma)$ is the projector on |⃗n, +⟩,<br />
$\tfrac12(\mathbb{1} + \vec n\cdot\vec\sigma) = |\vec n,+\rangle\langle\vec n,+|$. (III. 149)<br />
◃ Remark<br />
This holds in any matrix representation. ▹<br />
III. 6. 2<br />
MIXED SPIN 1/2 STATES<br />
Every Hermitian 2 × 2 - matrix can, as stated before, be written as (III. 121), $A = a_0\,\mathbb{1} + \vec a\cdot\vec\sigma$,<br />
with real coefficients $a_0$ and ⃗a. According to (III. 29), for the corresponding operator A to be a state<br />
operator the trace of A has to be 1, which means that $a_0 = \tfrac12$. Furthermore, A has to be positive.<br />
A positive matrix can be written as the square of a Hermitian matrix B,<br />
$B = b_0\,\mathbb{1} + \vec b\cdot\vec\sigma$ and $B^2 = (b_0^2 + \vec b^{\,2})\,\mathbb{1} + 2\,b_0\,\vec b\cdot\vec\sigma$. (III. 150)<br />
Therefore,<br />
$a_0 = \tfrac12 = b_0^2 + \vec b^{\,2}$ and $\vec a = 2\,b_0\,\vec b$. (III. 151)<br />
The possible values of $b_0$ are limited by (III. 151), $b_0^2 \le \tfrac12$, while as soon as $b_0$ is chosen ⃗ b is<br />
fixed, $\vec b = \vec a/(2\,b_0)$, yielding<br />
$\vec a^{\,2} = 4\,b_0^2\,\vec b^{\,2} = 4\,b_0^2\left(\tfrac12 - b_0^2\right)$. (III. 152)<br />
Obviously, $\vec a^{\,2}$ only depends on $b_0^2$, and for $b_0^2$ in the interval $[0, \tfrac12]$ its values are between 0 and $\tfrac14$, where $\vec a^{\,2}$<br />
has a maximum for $b_0^2 = \tfrac14$. In other words, A is a state operator iff $a_0 = \tfrac12$ and $\vec a^{\,2} \le \tfrac14$, in which<br />
case some $b_0$ and ⃗ b exist, satisfying the requirements (III. 151).<br />
Now an arbitrary state operator is<br />
$W = \tfrac12\,(\mathbb{1} + \vec w\cdot\vec\sigma)$, $\vec w^{\,2} \le 1$. (III. 153)<br />
This state operator is characterized by the vector ⃗w, called the polarization vector, which has its<br />
endpoint within or on the surface of the unit sphere, the so - called Bloch sphere. For ∥ ⃗w∥ = 1 the<br />
system is called completely polarized, for ⃗w = 0 it is called unpolarized, and if 0 < ∥ ⃗w∥ < 1 it is<br />
called partially polarized.<br />
The state operators with $\vec w^{\,2} = 1$ are the pure states, the 1 - dimensional projectors, since then<br />
$W^2 = \tfrac14\left(\mathbb{1} + 2\,\vec w\cdot\vec\sigma + \vec w^{\,2}\,\mathbb{1}\right) = \tfrac12\left(\mathbb{1} + \vec w\cdot\vec\sigma\right) = W$, (III. 154)<br />
while the state operators with $\vec w^{\,2} < 1$ are mixed states. The set of state operators is a convex set, as we<br />
can now easily see. If ⃗w 1 and ⃗w 2 are within or on the surface of the unit sphere, then α ⃗w 1 + β ⃗w 2 ,<br />
with 0 < α, β < 1 and α + β = 1, is a point on the chord linking ⃗w 1 and ⃗w 2 , and this chord lies within the sphere.<br />
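As a numerical illustration of (III. 153) and (III. 154), the following Python sketch builds W from a polarization vector and inspects its eigenvalues and Tr W².<br />
It is a minimal sketch added for illustration: the helper names state_operator, eigenvalues, trace_of_square and mix are assumptions of this example, and the matrices are written in the σ z basis.<br />

```python
import math

def state_operator(w):
    """W = (1 + w . sigma) / 2 as a 2x2 matrix in the sigma_z basis."""
    wx, wy, wz = w
    return [[(1 + wz) / 2, (wx - 1j * wy) / 2],
            [(wx + 1j * wy) / 2, (1 - wz) / 2]]

def eigenvalues(W):
    """Eigenvalues of a Hermitian 2x2 matrix from its trace and determinant."""
    tr = (W[0][0] + W[1][1]).real
    det = (W[0][0] * W[1][1] - W[0][1] * W[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return ((tr - disc) / 2, (tr + disc) / 2)

def trace_of_square(W):
    """Tr W^2, which equals 1 for a pure state and is smaller for a mixed state."""
    return sum(W[i][j] * W[j][i] for i in range(2) for j in range(2)).real

def mix(alpha, w1, w2):
    """Point alpha w1 + (1 - alpha) w2 on the chord between two polarization vectors."""
    return tuple(alpha * a + (1 - alpha) * b for a, b in zip(w1, w2))

pure = state_operator((0, 0, 1))       # ||w|| = 1
mixed = state_operator((0, 0.5, 0))    # ||w|| = 1/2
chord_point = mix(0.5, (0, 0, 1), (1, 0, 0))

print(eigenvalues(pure), trace_of_square(pure))
print(eigenvalues(mixed), trace_of_square(mixed))
print(chord_point, math.sqrt(sum(c * c for c in chord_point)))
```

For ∥ ⃗w∥ = 1 one eigenvalue vanishes and Tr W² = 1; for ∥ ⃗w∥ < 1 both eigenvalues are positive and Tr W² < 1. The chord point has norm smaller than 1, in line with the convexity argument.<br />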
EXERCISE 25. Prove the following statements.<br />
(a) $\langle\vec\sigma\rangle_W = \vec w$,<br />
(b) $\det W = \tfrac14\,(1 - \vec w^{\,2})$,<br />
(c) the eigenvalues of W are $\tfrac12 \pm \tfrac12\,\|\vec w\|$.<br />
EXAMPLES<br />
In the following two examples, consider vectors ⃗w with ∥ ⃗w∥ = 1, thus corresponding to pure<br />
states.<br />
(a) Since in this case ⃗w equals the unit vector ⃗n, for $\vec w = (0, 0, 1) \in \mathbb{R}^3$ we have<br />
$W = \tfrac12(\mathbb{1} + \sigma_z) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, (III. 155)<br />
which is a 1 - dimensional projector; it is the matrix representation of W = |z ↑⟩ ⟨z ↑|.<br />
Likewise we have<br />
$\vec w = (1, 0, 0) \implies W = \tfrac12(\mathbb{1} + \sigma_x) = |x\uparrow\rangle\langle x\uparrow|$, (III. 156)<br />
$\vec w = (0, 1, 0) \implies W = \tfrac12(\mathbb{1} + \sigma_y) = |y\uparrow\rangle\langle y\uparrow|$,<br />
and we see that generally $W = \tfrac12(\mathbb{1} + \vec n\cdot\vec\sigma)$ corresponds to the pure state |⃗n, +⟩, as was<br />
already shown in (III. 149).<br />
In the same way, for |⃗n, −⟩ we have<br />
$\vec w = (0, 0, -1) \implies W = \tfrac12(\mathbb{1} - \sigma_z) = |z\downarrow\rangle\langle z\downarrow|$, (III. 157)<br />
etc.<br />
(b) For the probability to find spin up in the direction ⃗n ′ in the state |⃗n, +⟩, with (III. 45)<br />
and (III. 127) we find<br />
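$\mu_{W_{\vec n}}(W_{\vec n'}) = \mathrm{Tr}\, W_{\vec n'} W_{\vec n} = \mathrm{Tr}\left(\tfrac12(\mathbb{1} + \vec n'\cdot\vec\sigma)\cdot\tfrac12(\mathbb{1} + \vec n\cdot\vec\sigma)\right)$<br />
$\qquad = \tfrac14\,\mathrm{Tr}\left(\mathbb{1} + \vec n'\cdot\vec\sigma + \vec n\cdot\vec\sigma + (\vec n'\cdot\vec n)\,\mathbb{1} + i\,\vec\sigma\cdot(\vec n'\times\vec n)\right)$<br />
$\qquad = \tfrac12\,(1 + \vec n'\cdot\vec n) = \tfrac12\,(1 + \cos\theta) = \cos^2\tfrac12\theta$, (III. 158)<br />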
with θ the angle between ⃗n and ⃗n ′ . This is in accordance with (III. 148).<br />
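The trace formula (III. 158) can be checked for arbitrary directions with a few lines of Python.<br />
This is a sketch for illustration only; the helper names projector, trace_of_product and unit_vector, and the two sample directions, are assumptions of this example.<br />

```python
import math

def projector(n):
    """W_n = (1 + n . sigma) / 2, the projector on |n, +>, in the sigma_z basis."""
    nx, ny, nz = n
    return [[(1 + nz) / 2, (nx - 1j * ny) / 2],
            [(nx + 1j * ny) / 2, (1 - nz) / 2]]

def trace_of_product(A, B):
    """Tr(A B) for 2x2 matrices."""
    return sum(A[i][j] * B[j][i] for i in range(2) for j in range(2)).real

def unit_vector(theta, phi):
    """Unit vector with polar angle theta and azimuth phi."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

n = unit_vector(0.4, 0.7)          # arbitrary sample direction
n_prime = unit_vector(1.3, 2.0)    # arbitrary sample direction

cos_angle = sum(a * b for a, b in zip(n, n_prime))
angle = math.acos(cos_angle)       # angle between n and n'

probability = trace_of_product(projector(n_prime), projector(n))
print(probability, math.cos(angle / 2) ** 2)
```

Both printed numbers agree up to rounding, as (III. 158) requires.<br />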
The following examples concern mixed state operators W , for which ⃗w has its endpoint somewhere<br />
inside the sphere, ⃗w 2 < 1.<br />
(c) Choosing ⃗w to be $\tfrac12\,(0, 1, 0)$ yields<br />
$W = \tfrac12\left(\mathbb{1} + \tfrac12\,\sigma_y\right) = \begin{pmatrix} \tfrac12 & -\tfrac14 i \\ \tfrac14 i & \tfrac12 \end{pmatrix}$. (III. 159)<br />
This can, for instance, be factorized as<br />
$W = \tfrac14\,|z\uparrow\rangle\langle z\uparrow| + \tfrac14\,|z\downarrow\rangle\langle z\downarrow| + \tfrac12\,|y\uparrow\rangle\langle y\uparrow|$, (III. 160)<br />
which clearly is a mixture.
The next two examples concern the center of the Bloch sphere, ⃗w = 0 .<br />
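(d) With ⃗w = 0 , we have<br />
$W = \tfrac12 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. (III. 161)<br />
The eigenvalues of this mixed state W are degenerate, and various factorizations are possible,<br />
for example<br />
$W = \tfrac12\,|x\uparrow\rangle\langle x\uparrow| + \tfrac12\,|x\downarrow\rangle\langle x\downarrow|$<br />
$\quad = \tfrac12\,|y\uparrow\rangle\langle y\uparrow| + \tfrac12\,|y\downarrow\rangle\langle y\downarrow|$<br />
$\quad = \tfrac12\,|z\uparrow\rangle\langle z\uparrow| + \tfrac12\,|z\downarrow\rangle\langle z\downarrow|$. (III. 162)<br />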
(e) Under a rotation R, ⃗w behaves like a vector in $\mathbb{R}^3$,<br />
$U(R)\,(\vec w\cdot\vec\sigma)\,U^{-1}(R) = \vec w_R\cdot\vec\sigma$, (III. 163)<br />
where U (R) is given by (III. 135). Therefore, the only rotation invariant state for a 1 - particle<br />
system is ⃗w = 0 .<br />
The similarity between the set of density matrices W and the 3 - dimensional unit sphere of polarization<br />
vectors is specific for spin 1/2 particles, in which case every pure state is also the eigenstate<br />
for the spin operator in a certain spin direction. For spin 1 bosons and higher spin particles this no<br />
longer applies.<br />
III. 6. 3<br />
TWO SPIN 1/2 PARTICLES<br />
III. 6. 3. 1<br />
SINGLET AND TRIPLET STATES<br />
Consider a composite system of two spin 1/2 fermions. In the direct product space $\mathbb{C}^2 \otimes \mathbb{C}^2 = \mathbb{C}^4$<br />
a basis is<br />
$|z\uparrow\rangle\otimes|z\uparrow\rangle,\ \ |z\uparrow\rangle\otimes|z\downarrow\rangle,\ \ |z\downarrow\rangle\otimes|z\uparrow\rangle,\ \ |z\downarrow\rangle\otimes|z\downarrow\rangle$. (III. 164)<br />
From these basis states the simultaneous eigenstates |s, m⟩ of the operators $\vec S^{\,2} = (\vec S_1 + \vec S_2)^2$<br />
and $S_z = S_{1z} + S_{2z}$ can be formed, where s can be 0 or 1. The eigenvalues of $\vec S^{\,2}$ are $\hbar^2 s(s+1)$, the<br />
eigenvalues of $S_z$ are $m\hbar$, as introduced on p. 65.<br />
The singlet state or singlet for short, with s = 0 and therefore m = 0, is the entangled state<br />
$|\Psi_0\rangle = |0,0\rangle = \tfrac12\sqrt2\,\big(|z\uparrow\rangle\otimes|z\downarrow\rangle - |z\downarrow\rangle\otimes|z\uparrow\rangle\big)$, (III. 165)<br />
which looks the same in terms of the eigenstates of S x and S y , having spherical symmetry. The singlet<br />
is a simultaneous eigenstate of S x , S y and S z with eigenvalue 0. Hence the singlet is an eigenstate<br />
of ⃗n · ⃗S with eigenvalue 0, which means that a rotation (III. 133) carries (III. 165) back into itself.
The triplet states, with s = 1 and m = 1, 0, −1 are<br />
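$|1,1\rangle = |z\uparrow\rangle\otimes|z\uparrow\rangle$<br />
$|1,0\rangle = \tfrac12\sqrt2\,\big(|z\uparrow\rangle\otimes|z\downarrow\rangle + |z\downarrow\rangle\otimes|z\uparrow\rangle\big)$<br />
$|1,-1\rangle = |z\downarrow\rangle\otimes|z\downarrow\rangle$. (III. 166)<br />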
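That the states (III. 165) and (III. 166) are simultaneous eigenstates of the total spin operators can be verified numerically.<br />
The Python sketch below is an illustration under stated assumptions: it uses units with ħ = 1, the basis order (III. 164), and the identity that the square of the total spin equals 3/2 + ½ ⃗σ 1 · ⃗σ 2 for two spin 1/2 particles; all helper names are hypothetical.<br />

```python
import math

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def kron(A, B):
    """Kronecker product of 2x2 matrices; rows and columns follow the basis order (III. 164)."""
    return [[A[i][k] * B[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]

def apply(M, v):
    """Matrix-vector product for a 4x4 matrix and a 4-component state."""
    return [sum(M[r][c] * v[c] for c in range(4)) for r in range(4)]

def total_spin_squared(v):
    """S^2 v for S = S_1 + S_2 with hbar = 1, using S^2 = 3/2 + (sigma_1 . sigma_2) / 2."""
    out = [1.5 * x for x in v]
    for s in (SX, SY, SZ):
        out = [o + 0.5 * t for o, t in zip(out, apply(kron(s, s), v))]
    return out

def total_sz(v):
    """S_z v = (sigma_1z + sigma_2z) v / 2 with hbar = 1."""
    first = apply(kron(SZ, I2), v)
    second = apply(kron(I2, SZ), v)
    return [0.5 * (a + b) for a, b in zip(first, second)]

def is_eigenvector(operator, v, eigenvalue, tol=1e-12):
    """True if operator(v) equals eigenvalue * v componentwise, within tol."""
    return all(abs(x - eigenvalue * y) < tol for x, y in zip(operator(v), v))

r = 1 / math.sqrt(2)
states = {
    (0, 0): [0, r, -r, 0],    # singlet (III. 165)
    (1, 1): [1, 0, 0, 0],     # triplet (III. 166)
    (1, 0): [0, r, r, 0],
    (1, -1): [0, 0, 0, 1],
}

for (s, m), v in states.items():
    ok = (is_eigenvector(total_spin_squared, v, s * (s + 1))
          and is_eigenvector(total_sz, v, m))
    print(f"|{s}, {m}>: eigenvalues s(s+1) = {s * (s + 1)} and m = {m} confirmed: {ok}")
```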
III. 6. 3. 2<br />
CORRELATIONS<br />
In chapter VII we will use the spin correlation function of the singlet,<br />
$E_{QM}(\vec a, \vec b) := \langle 0,0|\,\vec a\cdot\vec\sigma_1 \otimes \vec b\cdot\vec\sigma_2\,|0,0\rangle$, (III. 167)<br />
where $\vec a, \vec b \in \mathbb{R}^3$ are unit vectors. $E_{QM}(\vec a, \vec b)$ is the expectation value of the product of the outcomes ±1<br />
of a spin measurement on particle 1 along ⃗a and on particle 2 along ⃗ b. To find $E_{QM}(\vec a, \vec b)$, first choose the z - axis along ⃗a<br />
as in diagram III. 3, next choose the x - axis in such a way that ⃗ b is in the xz - plane. The spherical<br />
symmetry of the singlet state allows such a choice.<br />
Figure III. 3: Spin up for particle 1 along ⃗a, for particle 2 along ⃗ b (⃗a along the z - axis, ⃗ b in the xz - plane at an angle θ ⃗a, ⃗ b to ⃗a)<br />
With $\vec a = \vec e_z$, ⃗ b similar to ⃗n in (III. 137), and $\theta_{\vec a,\vec b}$ the angle between ⃗a and ⃗ b, we have<br />
$E_{QM}(\vec a,\vec b) = \langle 0,0|\,\sigma_{1z}\otimes\left(\sin\theta_{\vec a,\vec b}\,\sigma_{2x} + \cos\theta_{\vec a,\vec b}\,\sigma_{2z}\right)|0,0\rangle$. (III. 168)<br />
Now σ z |z ↑⟩ = |z ↑⟩, σ x |z ↑⟩ = |z ↓⟩ etc., so that we have, using (II. 100), (III. 165) and (III. 166),<br />
$(\sigma_{1z}\otimes\sigma_{2x})\,|0,0\rangle = \tfrac12\sqrt2\,\big(|1,1\rangle + |1,-1\rangle\big)$, (III. 169)<br />
which is perpendicular to |0, 0⟩, and<br />
$(\sigma_{1z}\otimes\sigma_{2z})\,|0,0\rangle = -\,|0,0\rangle$, (III. 170)<br />
from which we see that<br />
$E_{QM}(\vec a,\vec b) = -\cos\theta_{\vec a,\vec b}$. (III. 171)<br />
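The result (III. 171) can be confirmed by evaluating the expectation value (III. 167) directly as a 4 × 4 matrix element.<br />
The following Python sketch is an illustration only; the helper names spin_component, kron and correlation are assumptions of this example, and the basis order is that of (III. 164).<br />

```python
import math

def spin_component(n):
    """n . sigma as a 2x2 matrix in the sigma_z basis."""
    nx, ny, nz = n
    return [[nz, nx - 1j * ny], [nx + 1j * ny, -nz]]

def kron(A, B):
    """Kronecker product of 2x2 matrices in the basis order (III. 164)."""
    return [[A[i][k] * B[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]

def correlation(a, b):
    """E_QM(a, b) = <0,0| a.sigma_1 (x) b.sigma_2 |0,0> for the singlet (III. 165)."""
    r = 1 / math.sqrt(2)
    singlet = [0.0, r, -r, 0.0]
    M = kron(spin_component(a), spin_component(b))
    Mv = [sum(M[row][col] * singlet[col] for col in range(4)) for row in range(4)]
    # The singlet amplitudes are real, so no complex conjugation is needed here.
    return sum(x * y for x, y in zip(singlet, Mv)).real

theta = 1.1                                    # arbitrary sample angle
a = (0.0, 0.0, 1.0)                            # a along the z-axis
b = (math.sin(theta), 0.0, math.cos(theta))    # b in the xz-plane

print(correlation(a, b), -math.cos(theta))
```

The same function can be called with any pair of unit vectors; by the spherical symmetry of the singlet the result only depends on the angle between them.<br />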
III. 6. 3. 3<br />
CONDITIONAL PROBABILITIES<br />
In chapter VII we will also need to know, again in case the particles are in the singlet state, the<br />
probability for the spin of particle 2 to be found in the direction ⃗ b, given that the spin of particle 1 was<br />
found in the direction ⃗a. This conditional probability is, by definition,<br />
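$\mathrm{Prob}\left(\vec b\cdot\vec\sigma_2 = 1 \mid \vec a\cdot\vec\sigma_1 = 1\right) = \dfrac{\mathrm{Prob}\left(\vec b\cdot\vec\sigma_2 = 1 \wedge \vec a\cdot\vec\sigma_1 = 1\right)}{\mathrm{Prob}\left(\vec a\cdot\vec\sigma_1 = 1\right)}$. (III. 172)<br />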
Here the joint probability is<br />
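$\mathrm{Prob}\left(\vec b\cdot\vec\sigma_2 = 1 \wedge \vec a\cdot\vec\sigma_1 = 1\right) = \left|\big(\langle\vec a\uparrow|\otimes\langle\vec b\uparrow|\big)\,|0,0\rangle\right|^2$, (III. 173)<br />
with |⃗a ↑⟩ ⊗ | ⃗ b ↑⟩ the direct product of the eigenstates of ⃗a · ⃗σ 1 and ⃗ b · ⃗σ 2 having eigenvalues +1.<br />
Again choosing ⃗a and ⃗ b as in diagram III. 3, |⃗a ↑⟩ = |z ↑⟩ and | ⃗ b ↑⟩ equal to |⃗n, +⟩, (III. 138), we find<br />
for the direct product<br />
$|\vec a\uparrow\rangle\otimes|\vec b\uparrow\rangle = |z\uparrow\rangle\otimes\left(\cos\tfrac12\theta_{\vec a,\vec b}\,|z\uparrow\rangle + \sin\tfrac12\theta_{\vec a,\vec b}\,|z\downarrow\rangle\right)$. (III. 174)<br />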
Therefore, with (III. 165),<br />
$\big(\langle\vec a\uparrow|\otimes\langle\vec b\uparrow|\big)\,|0,0\rangle = \tfrac12\sqrt2\,\sin\tfrac12\theta_{\vec a,\vec b}$, (III. 175)<br />
and we see that the joint probability is<br />
$\mathrm{Prob}\left(\vec b\cdot\vec\sigma_2 = 1 \wedge \vec a\cdot\vec\sigma_1 = 1\right) = \tfrac12\sin^2\tfrac12\theta_{\vec a,\vec b}$. (III. 176)<br />
Likewise, again using (III. 173) with $|\vec b\downarrow\rangle$ equal to |⃗n, −⟩, (III. 139), we have<br />
$\mathrm{Prob}\left(\vec b\cdot\vec\sigma_2 = -1 \wedge \vec a\cdot\vec\sigma_1 = 1\right) = \tfrac12\cos^2\tfrac12\theta_{\vec a,\vec b}$. (III. 177)<br />
This yields for the marginal probability<br />
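$\mathrm{Prob}\left(\vec a\cdot\vec\sigma_1 = 1\right) = \mathrm{Prob}\left(\vec b\cdot\vec\sigma_2 = 1 \wedge \vec a\cdot\vec\sigma_1 = 1\right) + \mathrm{Prob}\left(\vec b\cdot\vec\sigma_2 = -1 \wedge \vec a\cdot\vec\sigma_1 = 1\right)$<br />
$\qquad = \tfrac12\sin^2\tfrac12\theta_{\vec a,\vec b} + \tfrac12\cos^2\tfrac12\theta_{\vec a,\vec b} = \tfrac12$, (III. 178)<br />
and we see that the conditional probability (III. 172) is<br />
$\mathrm{Prob}\left(\vec b\cdot\vec\sigma_2 = 1 \mid \vec a\cdot\vec\sigma_1 = 1\right) = \sin^2\tfrac12\theta_{\vec a,\vec b}$. (III. 179)<br />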
◃ Remark<br />
By definition there is no correlation between the two results of measurements of spin if<br />
$\mathrm{Prob}\left(\vec b\cdot\vec\sigma_2 = 1 \mid \vec a\cdot\vec\sigma_1 = 1\right) = \mathrm{Prob}\left(\vec b\cdot\vec\sigma_2 = 1\right)$, (III. 180)<br />
which is the case if ⃗a and ⃗ b are perpendicular. ▹
We are now able to calculate the correlation (III. 167) directly, using a well - known formula from<br />
probability theory,<br />
$E_{QM}(\vec a,\vec b) = \sum_{a=-1}^{+1}\ \sum_{b=-1}^{+1} a\,b\ \mathrm{Prob}(a, b)$, (III. 181)<br />
where a, b ∈ { −1, 1} are the results of measurements of ⃗a · ⃗σ 1 and ⃗ b · ⃗σ 2 , respectively, and<br />
Prob (a, b) is the joint probability to find a and b at measurements of the respective spin quantities.<br />
Using (III. 176) and (III. 177) and calculating the probabilities with eigenvalues −1 for ⃗a · ⃗σ 1 we<br />
find<br />
$E_{QM}(\vec a,\vec b) = \mathrm{Prob}(1, 1) + \mathrm{Prob}(-1, -1) - \mathrm{Prob}(1, -1) - \mathrm{Prob}(-1, 1)$<br />
$\qquad = 2\cdot\tfrac12\sin^2\tfrac12\theta_{\vec a,\vec b} - 2\cdot\tfrac12\cos^2\tfrac12\theta_{\vec a,\vec b}$<br />
$\qquad = -\cos\theta_{\vec a,\vec b}$. (III. 182)<br />
This is indeed equal to the earlier result (III. 171).<br />
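The probabilities (III. 176) to (III. 179) and the sum (III. 181) lend themselves to a short numerical check.<br />
The Python sketch below is illustrative only and restricts both directions to the xz - plane, as in diagram III. 3; the helper names spin_state and joint_probability are assumptions of this example, and the overall phases of the eigenstates are chosen for convenience since they do not affect the probabilities.<br />

```python
import math

def spin_state(theta, outcome):
    """Eigenstate of n . sigma (n in the xz-plane at polar angle theta) for outcome +1 or -1."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (c, s) if outcome == 1 else (-s, c)

def joint_probability(theta_a, theta_b, a, b):
    """Prob(a, b) = |(<a| (x) <b|) |0,0>|^2 for the singlet (III. 165)."""
    u = spin_state(theta_a, a)
    v = spin_state(theta_b, b)
    # All amplitudes are real here, so the inner product needs no conjugation.
    amplitude = (u[0] * v[1] - u[1] * v[0]) / math.sqrt(2)
    return amplitude ** 2

theta = 0.9   # arbitrary sample angle between a (along z) and b
prob = {(a, b): joint_probability(0.0, theta, a, b) for a in (1, -1) for b in (1, -1)}

marginal_a_up = prob[(1, 1)] + prob[(1, -1)]               # (III. 178)
conditional = prob[(1, 1)] / marginal_a_up                 # (III. 179)
correlation = sum(a * b * p for (a, b), p in prob.items()) # (III. 181)

print(prob)
print(marginal_a_up, conditional, math.sin(theta / 2) ** 2)
print(correlation, -math.cos(theta))
```

For θ = ½π the conditional probability equals the marginal probability ½, which is the uncorrelated case mentioned in the remark on (III. 180).<br />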
III. 6. 3. 4<br />
EXAMPLE OF A MIXED STATE OF TWO SPIN 1/2 PARTICLES<br />
Consider, analogous to (III. 100), the pure entangled state<br />
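$|\Phi\rangle = \tfrac12\sqrt2\,\big(|z\uparrow\rangle\otimes|z\uparrow\rangle + |z\downarrow\rangle\otimes|z\downarrow\rangle\big)$, (III. 183)<br />
and the corresponding state W = |Φ⟩ ⟨Φ|, acting in H I ⊗ H II ,<br />
$W = \tfrac12\,\big(|z\uparrow\rangle\langle z\uparrow|\otimes|z\uparrow\rangle\langle z\uparrow| + |z\uparrow\rangle\langle z\downarrow|\otimes|z\uparrow\rangle\langle z\downarrow| + |z\downarrow\rangle\langle z\uparrow|\otimes|z\downarrow\rangle\langle z\uparrow| + |z\downarrow\rangle\langle z\downarrow|\otimes|z\downarrow\rangle\langle z\downarrow|\big)$, (III. 184)<br />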
where the first factor in the direct product acts in H I , and the second factor in H II .<br />
The representation of W in the corresponding basis (III. 164) of H = H I ⊗ H II is, using the<br />
Kronecker product of matrices, (II. 103),<br />
$W = \tfrac12 \begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}$. (III. 185)<br />
This is indeed a pure state, since W is idempotent, a necessary and sufficient condition for bounded,<br />
self - adjoint operators to be a projector.<br />
The partial traces are<br />
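$W_I = \tfrac12\,|z\uparrow\rangle\langle z\uparrow| + \tfrac12\,|z\downarrow\rangle\langle z\downarrow| \in S(\mathcal{H}_I)$, (III. 186)<br />
$W_{II} = \tfrac12\,|z\uparrow\rangle\langle z\uparrow| + \tfrac12\,|z\downarrow\rangle\langle z\downarrow| \in S(\mathcal{H}_{II})$, (III. 187)<br />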
and their matrix representation in the basis of σ z is<br />
$W_I = \tfrac12 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $W_{II} = \tfrac12 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$. (III. 188)<br />
Although W is a pure state, the direct product of the partial traces W I and W II is not pure,<br />
$W_I \otimes W_{II} = \tfrac14 \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \neq W$. (III. 189)<br />
This conclusion is, of course, in accordance with the conclusion (III. 104) concerning the pure state<br />
operator (III. 100).<br />
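The partial traces and the comparison (III. 189) can be reproduced numerically from the state vector (III. 183).<br />
This Python sketch is an illustration, not part of the original text: the composite index convention (i, j) → 2i + j follows the basis order (III. 164), and the helper names are assumptions of this example.<br />

```python
import math

r = 1 / math.sqrt(2)
phi = [r, 0.0, 0.0, r]   # (III. 183) in the basis order (III. 164)

# W = |Phi><Phi| as a 4x4 matrix (the amplitudes are real, so no conjugation is needed)
W = [[phi[i] * phi[j] for j in range(4)] for i in range(4)]

def matmul(A, B):
    """Product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def partial_trace_over_II(W):
    """W_I[i][k] = sum_j W[(i, j), (k, j)], with composite index (i, j) -> 2 i + j."""
    return [[sum(W[2 * i + j][2 * k + j] for j in range(2)) for k in range(2)]
            for i in range(2)]

def partial_trace_over_I(W):
    """W_II[j][l] = sum_i W[(i, j), (i, l)]."""
    return [[sum(W[2 * i + j][2 * i + l] for i in range(2)) for l in range(2)]
            for j in range(2)]

def kron(A, B):
    """Kronecker product of 2x2 matrices in the basis order (III. 164)."""
    return [[A[i][k] * B[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]

W_I = partial_trace_over_II(W)
W_II = partial_trace_over_I(W)
product = kron(W_I, W_II)

purity_W = trace(matmul(W, W))                    # 1 for a pure state
purity_product = trace(matmul(product, product))  # smaller than 1 for a mixed state

print(W_I, W_II)
print(purity_W, purity_product)
```

Tr W² is 1 while Tr (W I ⊗ W II )² is smaller than 1, so the product of the partial traces is a mixed state and differs from W.<br />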
◃ Remark<br />
Notice that all matrices in this example are indeed Hermitian, positive and have trace 1, the requirements<br />
of Gleason’s theorem, p. 47, for operators W to be state operators. ▹<br />
EXERCISE 26.<br />
(a) In (III. 184), fill in the matrix representations of the projectors in H I and H II , and check<br />
that forming Kronecker products indeed yields (III. 185).<br />
(b) Is the state (III. 184) spherically symmetric?
IV<br />
THE COPENHAGEN INTERPRETATION<br />
It is wrong to think that the task of physics is to find out how nature is. Physics concerns<br />
what we can say about nature.<br />
— Niels Bohr<br />
The Heisenberg-Bohr tranquilizing philosophy - or religion? - is so delicately contrived<br />
that, for the time being, it provides a gentle pillow for the true believer from which he<br />
cannot very easily be aroused. So let him lie there.<br />
— Albert Einstein<br />
I know it is not the fault of N. B. that he did not study philosophy. But I deeply regret<br />
that by his authority the brains of two or three generations will be upset and hindered to<br />
think about the problems ‘He’ pretends to have solved.<br />
— Erwin Schrödinger<br />
Bohr’s famous institute being located in Copenhagen, the standard interpretation of quantum<br />
mechanics as explained in most of the textbooks is generally referred to as the Copenhagen Interpretation.<br />
It is however worth mentioning that the conceptions of the many supporters of the<br />
Copenhagen Interpretation, Niels Bohr, Werner Heisenberg, Wolfgang Pauli, Rudolf Peierls,<br />
Léon Rosenfeld and John Wheeler, to name some of them, mutually differ on numerous points,<br />
and that some of them, including Bohr himself, modified their conceptions in the course of time,<br />
so that the name ‘Copenhagen Interpretation’ is more a collective noun than the name of one<br />
clearly outlined vision. Moreover, important contributions to the standard interpretation of the<br />
theory have been made by Born and Von Neumann, working independently of the Copenhagen<br />
school. In this chapter we will evaluate the conceptions of Heisenberg and Bohr as the main<br />
representatives of the Copenhagen Interpretation, and consider more closely the debate between<br />
Einstein and Bohr. Finally, we will discuss the exact expression of the uncertainty principle.<br />
IV. 1<br />
HEISENBERG AND THE UNCERTAINTY PRINCIPLE<br />
The history of modern quantum mechanics starts in 1925, when Heisenberg publishes his famous<br />
transitional article ‘Über quantentheoretische Umdeutung kinematischer und mechanischer<br />
Beziehungen’ (‘Quantum - theoretical re - interpretation of kinematic and mechanical relations’). His<br />
summary reads<br />
The present paper seeks to establish a basis for theoretical quantum mechanics founded<br />
exclusively upon relationships between quantities which in principle are observable.
Obviously the theory was only allowed to speak about observable quantities; every attempt to<br />
visualize the inside of an atom had to be avoided. In particular, one could not speak of the orbit<br />
of an electron. Only the transitions between stationary states were ‘observable’ and therefore the<br />
transition quantities could be characterized by two discrete indices. These ideas were developed<br />
by Heisenberg, Born and Jordan into matrix mechanics. They represented all physical quantities<br />
by infinite complex Hermitian matrices. The ‘quantum condition’, the fundamental equation of this<br />
theory, is the commutation relation<br />
$P\,Q - Q\,P = -\,i\hbar\,\mathbb{1}$ (IV. 1)<br />
between the matrices P and Q, which were meant to be the ‘quantum counterparts’ of the canonical<br />
dynamical quantities, momentum and position, of classical mechanics à la Hamilton.<br />
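Why the matrices have to be infinite can be made plausible with a small numerical experiment.<br />
The Python sketch below is an illustration under assumptions that are not in the text: it takes the harmonic - oscillator form of Q and P (units with ħ = m = ω = 1), truncated to N × N matrices, and evaluates P Q − Q P. For finite matrices the trace of a commutator vanishes, so (IV. 1) cannot hold exactly; the truncation reproduces −iħ on the diagonal except in the last entry.<br />

```python
import math

HBAR = 1.0   # assumption: units with hbar = 1
N = 6        # assumption: truncation size, chosen arbitrarily

# Lowering operator in the oscillator basis: a[m][n] = sqrt(n) if m == n - 1, else 0
a = [[math.sqrt(n) if m == n - 1 else 0.0 for n in range(N)] for m in range(N)]
a_dag = [[a[n][m] for n in range(N)] for m in range(N)]   # transpose (a is real)

scale = math.sqrt(HBAR / 2)
Q = [[scale * (a[m][n] + a_dag[m][n]) for n in range(N)] for m in range(N)]
P = [[1j * scale * (a_dag[m][n] - a[m][n]) for n in range(N)] for m in range(N)]

def matmul(A, B):
    """Product of two N x N matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)] for i in range(N)]

PQ = matmul(P, Q)
QP = matmul(Q, P)
commutator = [[PQ[m][n] - QP[m][n] for n in range(N)] for m in range(N)]

diagonal = [commutator[k][k] for k in range(N)]
print(diagonal)                # -i hbar everywhere except the last entry
print(sum(diagonal))           # the trace vanishes, as it must for finite matrices
```

Increasing N pushes the deviating entry further down the diagonal, which is one way to see that the quantum condition can only be satisfied by infinite matrices.<br />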
In 1926 matrix mechanics received unexpected competition from wave mechanics, established by<br />
Erwin Schrödinger. He interpreted the electron as a vibrating charge cloud, continuously moving<br />
in space. In his conception the stationary states could be understood as resonances, comparable to<br />
the vibrations of the string of a violin. According to Schrödinger, wave mechanics was to be preferred<br />
over matrix mechanics because wave mechanics offers a graphic picture of what takes place in<br />
microphysical reality. This interpretation foundered on three insoluble problems:<br />
(i) waves of physical systems consisting of more than one particle were defined in the configuration<br />
space $\mathbb{R}^{3N}$ instead of in the three - dimensional space $\mathbb{R}^3$ surrounding us,<br />
(ii) wave packets of free particles eventually fall apart and therefore, the electron cannot remain a<br />
localized entity,<br />
(iii) the wave function can carry complex values.<br />
Nevertheless, wave mechanics eventually turned out to be empirically just as strong<br />
as matrix mechanics.<br />
The fact that an approach with such radically different starting points also turned out to be possible<br />
impelled Heisenberg to further clarify his own starting points. The result of this effort is his ‘uncertainty<br />
principle’, formulated for the first time in his 1927 article ‘Über den anschaulichen Inhalt der<br />
quantentheoretischen Kinematik und Dynamik’, which was translated as ‘The physical content of<br />
quantum kinematics and mechanics’.<br />
In this article Heisenberg wonders how the ‘orbit’ of an electron must be understood in quantum<br />
mechanics. On the one hand, the basic equation (IV. 1) prevents granting numerical values to position<br />
and momentum simultaneously, on the other hand, the path of a particle in, for example, a Wilson<br />
chamber, seems to be directly perceptible. To find a way out of this dilemma, he was inspired by a<br />
statement of Einstein (H.J. Folse 1985, p. 91),<br />
[. . . ] it is the theory finally which decides what can be observed and what can not [. . . ]<br />
Could it be that, if a path cannot be defined in quantum mechanics, it cannot in fact be observed either?<br />
This idea led him to analyze what the theory has to say about observations.
He starts (1927, Eng. tr. p. 64) by linking measurement to operational definition,<br />
When one wants to be clear about what is to be understood by the words “position of the<br />
object”, for example of the electron, relative to a given frame of reference, then one must<br />
specify definite experiments with whose help one plans to measure the “position of the<br />
electron”, otherwise this word has no meaning.<br />
We will call this the measuring = defining principle.<br />
One could, for example, determine the position of an electron by examining it under a microscope.<br />
According to classical optics a microscope has a limited resolution. The Abbe criterion gives the<br />
smallest distinguishable details as<br />
$\delta q \sim \dfrac{\lambda}{\sin\varepsilon}$, (IV. 2)<br />
where λ is the wavelength of light and ε is the aperture, the opening angle of the lens. For a precise<br />
measurement we must therefore use a very short wavelength, i.e. gamma radiation. But in that case<br />
the Compton effect cannot be neglected. The radiation behaves as a flow of particles, with momentum<br />
$p_0 = h/\lambda$, which collides with the electron and causes it to recoil.<br />
Figure IV. 1: Heisenberg’s γ - microscope<br />
To allow for an observation at least one photon has to collide with the electron, which will bring<br />
about a change of momentum. But as we do not know anything more about the direction of the<br />
photon after the collision than that it has gone through the lens, we cannot indicate the size of the<br />
recoil exactly. As can be seen in figure IV. 1, the transfer of momentum remains unknown to an<br />
amount<br />
$\delta p \sim p_0 \sin\varepsilon = \dfrac{h}{\lambda}\,\sin\varepsilon$ (IV. 3)<br />
and therefore<br />
$\delta q\,\delta p \sim h$. (IV. 4)<br />
The more precisely the position is determined (δq small), the less accurately the momentum is known<br />
afterwards (δp large).<br />
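To get a feeling for the orders of magnitude, the estimates (IV. 2) and (IV. 3) can be evaluated for a few wavelengths.<br />
The Python sketch below is purely illustrative: the aperture of 0.5 rad and the sample wavelengths are hypothetical values chosen for this example, and the relations are order - of - magnitude estimates, not exact equalities.<br />

```python
import math

PLANCK_H = 6.62607015e-34   # Planck's constant in J s (exact SI value)

def microscope_estimates(wavelength, aperture):
    """Order-of-magnitude estimates (IV. 2) and (IV. 3); wavelength in m, aperture in rad."""
    delta_q = wavelength / math.sin(aperture)
    delta_p = PLANCK_H / wavelength * math.sin(aperture)
    return delta_q, delta_p

aperture = 0.5   # hypothetical opening angle in radians
samples = [("visible light", 500e-9), ("X-rays", 1e-10), ("gamma rays", 1e-12)]

results = []
for label, wavelength in samples:
    delta_q, delta_p = microscope_estimates(wavelength, aperture)
    results.append((delta_q, delta_p))
    print(f"{label}: dq ~ {delta_q:.2e} m, dp ~ {delta_p:.2e} kg m/s, "
          f"dq*dp/h = {delta_q * delta_p / PLANCK_H:.3f}")
```

Shortening the wavelength reduces δq and increases δp by the same factor, so that their product stays of the order of h, independently of the wavelength and of the aperture.<br />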
Quoting Heisenberg again (loc. cit.)<br />
At the instant when position is determined - therefore, at the moment when the photon is<br />
scattered by the electron - the electron undergoes a discontinuous change in momentum.<br />
This change is the greater the smaller the wavelength of the light employed - that is, the<br />
more exact the determination of the position. At the instant at which the position of the<br />
electron is known, its momentum therefore can be known up to magnitudes which correspond<br />
to that discontinuous change. Thus, the more precisely the position is determined,<br />
the less precisely the momentum is known, and conversely.<br />
This conclusion is the first formulation of the uncertainty principle. According to Heisenberg’s<br />
own measuring = defining principle this conclusion can, however, not yet be drawn because it also<br />
has to be specified what, in this context, must be understood by the term ‘momentum of the electron’.<br />
In a later discussion (Heisenberg 1930), Heisenberg specifies the reasoning by also discussing the<br />
definition of the momentum of the electron.<br />
This reasoning goes as follows. Suppose that the momentum of the electron has been measured<br />
in advance with an inaccuracy δ p 1 . Next, the position is measured with an inaccuracy δ q, then the<br />
momentum is measured again, with inaccuracy δp 2 . We can assume that δp 1 ≪ p 1 and δp 2 ≪ p 2 ,<br />
so that the momentum is very accurately known before and after the position measurement. Now it<br />
makes sense to speak of the momentum p 1 of the electron shortly before the position measurement.<br />
If now the position is measured very precisely, the position and momentum of the electron in the past<br />
are arbitrarily well defined. Heisenberg (1930, p. 20):<br />
[. . . ] if the velocity of the electron is at first known and the position then exactly measured,<br />
the position for times previous to the measurement may be calculated. Then for<br />
these past times δp δq is smaller than the usual limiting value [. . . ]<br />
Apparently, the uncertainty relation does not apply to the past. In the example the uncertainty concerns<br />
the unpredictability of the value of p 2 after the position measurement, not the inaccuracy δp 2<br />
with which p 2 can be measured. This unpredictability can be determined by accurately measuring<br />
the momentum before and after the determination of position, and the unpredictability is larger if<br />
the determination of position was more precise. Although it is true that one can speak in a logically<br />
consistent manner of the position and momentum of the electron in the past (loc. cit.),<br />
[. . . ] but this knowledge of the past is of a purely speculative character, since it can never<br />
(because of the unknown change in momentum caused by the position measurement) be<br />
used as an initial condition in any calculation of the future progress of the electron and<br />
thus cannot be subjected to experimental verification. It is a matter of personal belief<br />
whether such a calculation concerning the past history of the electron can be ascribed<br />
any physical reality or not.<br />
For Heisenberg, such a calculation does not describe reality. But then, what is reality to him?<br />
Heisenberg says, (1927, Eng. tr. p. 73),<br />
The “orbit” comes into being only when we observe it.
Apparently, the measurement creates reality, instead of revealing it. This is what we call the measuring<br />
= creating principle.<br />
This leads to the following representation. First, we measure the momentum of the electron<br />
precisely. Not only is the term “the momentum of the electron” hereby defined, now we also can<br />
say, according to the measuring = creating principle, that the value of the momentum, which was<br />
determined in this measurement, is physically real. Next, we measure the position precisely. At<br />
this measurement the electron obtains an exact position. After this measurement the momentum of<br />
the electron has however changed in an unpredictable manner. This can be verified with a second<br />
precise momentum measurement. This unpredictability turns out to be all the larger as the position<br />
measurement is more precise.<br />
Now the question is whether the electron already had this changed momentum before the second momentum<br />
measurement, i.e., whether this value is also physically real before this measurement. According<br />
to Heisenberg this is not the case, because we can only predict the momentum to the order of the<br />
size of the change. Before the second momentum measurement the electron has only a blurred, fuzzy<br />
momentum. Only when the measurement of momentum has been carried out does the electron regain a<br />
sharply defined momentum. ‘Fuzzy’ is meant in the ontological sense, as the sharpness of a property<br />
the electron possesses. As one quantity is measured more precisely, the conjugate quantity becomes<br />
more fuzzy.<br />
◃ Remark<br />
Directly after the measurement of momentum it is meaningful to say that the electron has this momentum,<br />
because in that case the outcome of a next measurement of momentum can, within the accuracy<br />
of measurement, be predicted with certainty. ▹<br />
In later work Heisenberg uses the Aristotelian term potential. A related term by K.R. Popper<br />
is propensity. The electron has a propensity to produce, at measurement, a certain outcome. This<br />
propensity can be understood as a real property of the electron, even if we are not performing a<br />
measurement. The potential and propensity interpretations are therefore ‘realistic’ interpretations, or<br />
at least not in conflict with scientific realism which is, roughly speaking, the thesis that a scientific<br />
theory tells us how (a part of) reality is made up.<br />
IV. 1. 1<br />
REMARKS<br />
(a) Heisenberg derives the uncertainty relation (IV. 4) for the electron from a quantum mechanical<br />
treatment of the photon. What he in fact hereby proves is the consistency of the uncertainty<br />
principle.<br />
(b) Although it is frequently written that the uncertainty relation restricts simultaneous measurements,<br />
simultaneous measurements of position and momentum do not appear in this discussion.<br />
(c) Creation of the sharp value of a quantity upon measurement can, in the terminology of the<br />
projection postulate, p. 42, be described as follows. Upon measurement of p the state transforms<br />
into the proper eigenstate of p. In that state q is unpredictable. If next q is measured, the<br />
state transforms into the proper eigenstate of q and p becomes unpredictable. The uncertainty<br />
principle says that that unpredictability is larger if the preceding measurement of q was more<br />
precise.
(d) Heisenberg (1930) describes the path of an electron in a Wilson chamber as follows. Suppose<br />
that the incoming electron can be described by a wave packet with fairly sharply defined position<br />
and momentum. Upon free development this packet spreads out in the course of time so<br />
that the position becomes less sharp. When the electron ionizes a molecule in the Wilson chamber<br />
a macroscopic droplet is formed, which can be understood as a position measurement. As<br />
a result the wave packet reduces to a packet which is rather sharply located, with a dimension<br />
in the order of a molecule, which again spreads out until a next ionization takes place.<br />
It can be shown that the successive spreading and contraction in position and momentum is,<br />
according to the uncertainty principle, in agreement with the observation of a macroscopic<br />
path. We cannot speak however of the path of an electron in an atom, not even approximately.<br />
An observation of the position of the electron with an accuracy larger than the dimension of the<br />
atom requires such a large recoil that the electron is generally pushed out of the atom entirely.<br />
Therefore, of such an ‘orbit’ no more than one point is observable. Notice that observation plays<br />
a vital role; the path in the Wilson chamber only comes into existence because we observe it.<br />
(e) As a result of Heisenberg’s discussion of the uncertainty principle the term measurement disturbance<br />
was introduced in quantum mechanics. Initially the inclination existed to consider this<br />
as a more or less classical physical process; the momentum of the electron is disturbed by the<br />
collision with a photon. This is also indicated by Heisenberg’s use of the word ‘error’ for δq.<br />
From the beginning, Bohr resisted this explanation of Heisenberg, and he put the emphasis on<br />
the necessity to combine mutually exclusive terms from a wave and a particle picture in one description.<br />
Especially because of EPR it later became clear that the ‘measurement disturbance’<br />
cannot be an ordinary error.<br />
IV. 2<br />
BOHR AND COMPLEMENTARITY<br />
The core of the Copenhagen interpretation lies, of course, in Bohr’s work. His articles are characterized<br />
by a style entirely his own. Remarkably, Bohr hardly uses the formalism of the theory; he<br />
generally gives a qualitative argument instead. His difficult, and sometimes obscurely formulated,<br />
long sentences are notorious, full of subordinate clauses and conditional definitions which do not<br />
always clarify his intentions. A careful reconstruction and interpretation of Bohr’s point of view,<br />
and its development in the course of time, has been given by E. Scheibe (1973, chapter 1), another<br />
interpretation is the monograph of H.J. Folse (1985).<br />
Central to Bohr’s considerations is the language we use to do physics. Bohr emphasizes that,<br />
regardless of how abstract and refined the terms of modern physics may be, in essence they are only<br />
an extension of everyday language, and they are nothing but means of communication we use to<br />
communicate observational results to other people. Such an observational result, the outcome of a<br />
measurement on a physical system in certain experimental circumstances, is therefore the basic element<br />
of consideration. For this, Bohr uses the term phenomenon. Every phenomenon is the resultant<br />
of a physical system S, a preparation apparatus P , a measuring apparatus M and their mutual interaction<br />
in a concrete experimental situation.<br />
The description of a phenomenon must always be made in unambiguous terms because of the<br />
requirement of communicability. A statement like, for example, “the object is in a superposition
of two different states” is therefore not suitable. In classical physics a sufficient arsenal of terms has<br />
been developed for this purpose.<br />
According to Bohr, characteristic of classical physics is in the first place that the interaction between<br />
object and measuring apparatus can be assumed to be negligibly small. This implies that in<br />
describing a phenomenon the measuring apparatus can be left out of consideration. Instead of the<br />
statement: “Thermal interaction between a thermometer and a glass of water has, in certain circumstances,<br />
yielded as a result that the mercury column has been found to have a certain length”, we<br />
can also say: “The temperature of water has a certain value”. In this case we can, without objection,<br />
transfer the description of the phenomenon onto the object itself, and speak in terms of its properties.<br />
The essential difference between classical physics and quantum physics is, according to Bohr, that<br />
in quantum physics the interaction is quantized. The interaction between an object and a measuring<br />
apparatus can only consist of the exchange of one or more quanta, and cannot be made arbitrarily small.<br />
Bohr calls this starting point the quantum postulate (Bohr 1928, p. 580).<br />
QUANTUM POSTULATE:<br />
[The] essence [of the quantum theory] may be expressed in the so-called quantum postulate,<br />
which attributes to any atomic process an essential discontinuity, or rather individuality,<br />
completely foreign to the classical theories and symbolized by Planck’s quantum<br />
of action.<br />
In a phenomenon the object, the measuring apparatuses, and their interaction form an indivisible<br />
whole, and the interaction always amounts to at least one quantum h. This postulate undermines the<br />
procedure of converting the description of a phenomenon into a description of the object itself.<br />
There is, however, a second element in Bohr’s point of view, which tempers this pessimistic conclusion.<br />
Scheibe called it the buffer postulate (1973, p. 24) because “the function of the postulate is<br />
to use classical physics as a buffer against the quantum-mechanical treatment of a phenomenon”.<br />
BUFFER POSTULATE:<br />
The description of the apparatus and of the results of observation, which forms part of<br />
the description of a quantum phenomenon, must be expressed in the concepts of classical<br />
physics (including those of “everyday life”), eliminating consistently the Planck quantum<br />
of action.<br />
The motivation for this requirement is again the need to communicate our experimental findings to other<br />
people. The reasoning is as follows (Bohr 1947, p. 59):<br />
[. . . ] by an experiment we simply understand an event about which we are able in an<br />
unambiguous way to state the conditions necessary for the reproduction of the phenomena.<br />
In the account of these conditions, there can, therefore, be no question of departing<br />
from the Newtonian way of description and, in particular, it may be stressed that by the<br />
[. . . measuring apparatus . . . ], we simply understand some piece of machinery as regards<br />
the working of which classical mechanics can be entirely relied upon and where,<br />
consequently, all quantum effects have to be disregarded.
Bohr assumes that only the language and terms of classical physics are suitable for the description<br />
of observational results. He writes (Bohr 1931, p. 692)<br />
[. . . ] the unambiguous interpretation of any measurement must be essentially framed in<br />
terms of the classical physical theories, and we may say that in this sense the language<br />
of Newton and Maxwell will remain the language of physicists for all time.<br />
This is a particularly radical point of view, and we will return to its motivation later.<br />
The combination of both postulates now leads to the following reasoning. In all phenomena an<br />
interaction exists between the system and the measuring apparatus which has a minimal order of magnitude<br />
h > 0; after all, even the most minute measurements always rely on a quantum phenomenon. But<br />
in our description of the phenomenon we are forced to use classical concepts, in which this interaction, h,<br />
cannot occur. The consequence is that in our description the interaction is not analyzable.<br />
At the same time the classical character of the description makes it possible to speak again in<br />
terms of properties of the object itself. Therefore, instead of the statement “the interaction between a<br />
particle and a photographic plate resulted in a little black dot in a certain area of the plate”, we can<br />
also say “the particle has been found at a position in that area”, in which the measuring apparatus<br />
is no longer referred to.<br />
But the major difference with the classical situation is that, by disregarding the interaction, we<br />
in a certain way make a mistake which remains without consequences within this phenomenon, but<br />
prevents the description from being combined with the information obtained under different experimental<br />
conditions. If the object is coupled to another measuring apparatus there will be another interaction,<br />
which will again not be analyzable. Descriptions of the object that have been obtained under different<br />
measurement arrangements cannot be combined into one all-encompassing picture. We will illustrate<br />
this in a more concrete case.<br />
IV. 2. 1<br />
COMPLEMENTARY PHENOMENA<br />
The most important examples of phenomena which give additional, but mutually exclusive, information<br />
on an object are measurements of position and momentum. Bohr (1939, p. 22) writes<br />
[. . . ] any phenomenon in which we are concerned with tracing a displacement of some<br />
atomic object in space and time necessitates the establishment of several coincidences<br />
between the object and the rigidly connected bodies and movable devices which, in serving<br />
as scales and clocks respectively, define the space-time frame of reference to which<br />
the phenomenon in question is referred.<br />
In this case, therefore, the object interacts with an apparatus which is firmly bolted down<br />
or anchored, so that its position remains fixed. But the consequence is that a possible exchange<br />
of momentum between object and apparatus cannot be analyzed. Such a transfer of momentum<br />
will be absorbed by the fixed parts of the apparatus without leaving behind any traces. Within this<br />
experimental setup we are therefore prohibited from saying anything about the momentum of the object.<br />
The opposite applies to the measurement of momentum (Bohr in Schilpp 1949, p. 219):<br />
In the study of phenomena in the account of which we are dealing with detailed momentum<br />
balance, certain parts of the whole device must naturally be given the freedom to<br />
move independently of others.<br />
Bohr assumes that a measurement of momentum is made by registering the recoil after a collision,<br />
for example, with a test particle. In this way we can, using the conservation laws, retrieve the<br />
momentum of the object. However, the condition that the test particle can move freely means that we<br />
cannot guarantee that it preserves a definite position. It is therefore excluded from being used as part<br />
of a spatial coordinate system, and now we cannot say anything about the position of the object.<br />
In order to perform a position measurement we must therefore put the object in contact with a<br />
part of the measuring apparatus which has been bolted down firmly, whereas to perform a momentum<br />
measurement we must observe the recoil of a freely movable part of the measuring apparatus, and<br />
apply the momentum conservation law. Position and momentum measurements therefore exclude<br />
each other, because a measuring apparatus cannot at the same time be bolted down and freely movable.<br />
In the description of the object we must choose between assigning it a position or a momentum. As worded<br />
by Philipp Frank (1949, p. 163):<br />
Quantum mechanics speaks neither of particles the positions and velocities of which<br />
exist but cannot be accurately observed, nor of particles with indefinite positions and<br />
velocities. Rather, it speaks of experimental arrangements in the description of which the<br />
expressions “position of a particle” and “velocity of a particle” can never be employed<br />
simultaneously.<br />
Bohr calls this characteristic property of quantum mechanics, where two quantities exclude each<br />
other whereas both are necessary to describe all phenomena in which the object can participate, complementarity.<br />
Position and momentum are examples of complementary quantities. Similar considerations<br />
apply to time and energy, such that a general complementarity exists between, on the one hand,<br />
a space-time description of phenomena, and, on the other hand, a dynamical description, frequently<br />
referred to by Bohr as ‘causal’, in which the conservation laws for energy and momentum are applicable.<br />
◃ Remark<br />
The complementarity between quantities like position and momentum, or between descriptions using<br />
space-time coordination and dynamical laws, differs from, and replaces, the contrast which was central<br />
to Bohr’s earlier work, namely that between ‘wave’ and ‘particle’, because a classical particle has both position<br />
and momentum, whereas a classical wave has neither. ▹<br />
The role of the uncertainty relations in Bohr’s views can now be described as follows: he considers them<br />
in the first place as symbolic expressions of the impossibility of defining position and momentum at<br />
the same time when describing an object. In a phenomenon in which the position is determined<br />
sharply, δq = 0, the momentum must be undetermined, δp = ∞, and vice versa. But the relation<br />
δq δp ∼ h is, of course, more general. Bohr (1934, pp. 60, 61) interprets this as follows:<br />
At the same time, however, the general character of this relation makes it possible to<br />
a certain extent to reconcile the conservation laws with the space-time co-ordination<br />
of observations, the idea of a coincidence of well-defined events in a space-time point<br />
being replaced by that of unsharply defined individuals within finite space-time regions.<br />
The meaning Bohr attaches to the uncertainty relations can be summarized this way: the more sharply<br />
we can, in a phenomenon, define the position of the object, the fuzzier the definition of the momentum must be,<br />
and vice versa. The quantities δq and δp in the relation δq δp ∼ h therefore represent the fuzziness in<br />
the definition. Bohr emphasizes the epistemological role of these quantities more strongly than an ontological<br />
role.<br />
IV. 2. 2<br />
REMARKS AND PROBLEMS<br />
Bohr’s supposition that classical language is a definitive means of expression for physical observations,<br />
which cannot be improved upon, is radical and at first sight even fairly unacceptable. Language<br />
develops, and history teaches us that from time to time new concepts are necessary. Aristotle had,<br />
for example, no concept of momentum, Newton knew nothing of energy, Coulomb had no theory of<br />
fields, etc. Is it not self-evident that quantum mechanics also asks for new concepts? Bohr,<br />
however, says (ibid., p. 16)<br />
[. . . ] it would be a misconception to believe that the difficulties of the atomic theory may<br />
be evaded by eventually replacing the concepts of classical physics by new conceptual<br />
forms.<br />
Bohr emphasizes that with this point of view he does not reject the introduction of new entities,<br />
e.g. quarks, superstrings or black holes. The aspects of classical language which are the reason<br />
that it cannot be improved upon are, according to him, descriptions in terms of space and time and<br />
descriptions in terms of cause and effect. These are the only categories with which we can describe<br />
observational results.<br />
Another problem with the idea that the classical concepts cannot be improved upon is Bohr’s<br />
immediate conclusion that the quantum of action cannot occur in the description of a phenomenon,<br />
because a statement such as ‘h = 6.6 · 10⁻³⁴ J s’ is also an unambiguous summary of experimental<br />
evidence, although not of one phenomenon. The idea that h cannot appear in the language of observations<br />
is a weak, and in fact untenable point in his argumentation. The prohibition of the use of h<br />
in the language of observations also led Bohr to the conclusion that the spin of an electron, ½ℏ,<br />
would be fundamentally unobservable. This conclusion has been proven to be incorrect.<br />
In some articles Bohr gives a more abstract explanation of the quantum postulate and emphasizes<br />
the ‘symbolic’ role of h. It does not so much represent the inevitable interaction, or measurement<br />
disturbance, between object and measuring apparatus, as the fundamental impossibility to make a<br />
sharp distinction between object and observation apparatus. It is, in any case, clear that Bohr does not<br />
regard the formalism of quantum mechanics, with its wave functions and operators, as an extension<br />
or improvement of classical language. He emphasizes that this formalism is purely symbolic and<br />
cannot be taken as a description, as the quantum state of a system is given without reference to the<br />
experimental setup.<br />
It should be noted that Bohr, in emphasizing the applicability of concepts, has more in mind than<br />
the ‘logical’ question of ‘definiteness’. For Bohr a term like ‘position of a particle’ is applicable if we<br />
can in fact control and secure this position, using firmly bolted apparatuses. Bohr’s use of the term<br />
‘determination’ refers both to a measurement and to a state preparation.<br />
Speaking of ‘partially defined positions and momenta’, Bohr regards the uncertainty relation<br />
between position and momentum as offering the possibility of a compromise with the complementarity<br />
between position and momentum. Here we can think of a context of measurement in which the<br />
object interacts with a part of the apparatus which is linked with the rest of the apparatus by means<br />
of a spring with a finite spring constant, an intermediate form between ‘freely movable’ and ‘firmly<br />
bolted’. He has, however, not developed this compromise. This point of view does not in fact fit the<br />
usual mathematical derivation of the uncertainty relations for position and momentum. These make,<br />
for two given (sharp) quantities p and q, a statement about the spread in quantum states, not about the<br />
well-definedness of the quantities. Attempts have been made to underpin this compromise mathematically<br />
by the introduction of ‘blurred quantities’, e.g. by Busch, Grabowski and Lahti (1995).<br />
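For comparison, the usual relation referred to here can be written in its standard textbook form for the<br />
standard deviations of position and momentum in a state ψ. The notation Δ_ψ is chosen here for illustration<br />
and may differ from that used elsewhere in these notes; ℏ denotes h/2π.<br />

```latex
\[
  \Delta_\psi q \,\Delta_\psi p \;\geq\; \frac{\hbar}{2},
  \qquad
  (\Delta_\psi A)^2 \;=\; \langle\psi|A^2|\psi\rangle - \langle\psi|A|\psi\rangle^2 .
\]
```

In this form the quantities q and p themselves are sharply defined operators; only the state ψ carries the spread,<br />
which is the contrast with Bohr’s reading of δq and δp as fuzziness in the definition of the quantities.<br />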
Of fundamental importance in Bohr’s point of view is that in a phenomenon both an object and an experimental<br />
setup are involved. The setup determines which frame of concepts applies to the object. In<br />
many cases the contrast between object and measuring apparatus coincides with that of the microscopic<br />
and macroscopic system, respectively. But that is not necessarily so. A macroscopic system<br />
can also be considered as an object while a microscopic system can serve as a measuring apparatus.<br />
We can consider, for example, a macroscopic measuring apparatus to be the object of another measurement.<br />
As soon as we do this the macroscopic system can, according to Bohr, no longer execute<br />
its role as a measuring device. It becomes an object itself, to which the quantum formalism must be<br />
applied. This functional contrast between object and measuring apparatus is therefore more essential<br />
than that between microscopic and macroscopic systems.<br />
For a good understanding of Bohr’s position, and Heisenberg’s for that matter, it is important to<br />
note that measurements do not require the presence of consciousness. Decisive for the applicability of<br />
classical concepts is the presence of a measurement context. Therefore, subjectivity does not play<br />
a role in any form: for the applicability of a concept such as ‘momentum’ it does not matter whether a conscious<br />
observer, a computer or another measuring apparatus carries out the momentum measurement.<br />
Also, from Bohr’s refusal to assign a realistic meaning to the quantum mechanical description, the<br />
conclusion cannot be drawn that he supports an anti-realistic or ‘instrumentalist’ view of physics.<br />
Instrumentalism is roughly the thesis that a scientific theory is only an instrument to carry out<br />
calculations of which we compare the outcomes with the indications of measuring apparatuses and,<br />
in particular, that a theory is no ‘knowledge of the world’, that it does not provide a faithful picture of<br />
what reality is. An object such as an electron has, besides its quantum mechanical state, more than<br />
enough permanent properties, such as the super-selected quantities mass and charge, which are not<br />
subject to complementarity, to be conceived of as a real, existing object.<br />
IV. 2. 3<br />
AGREEMENT AND DIFFERENCE BETWEEN HEISENBERG AND BOHR<br />
Both Heisenberg and Bohr emphasize that quantum mechanics is a complete theory which cannot<br />
be extended into a more detailed description with hidden variables. Bohr says (Schilpp 1949, p. 235)<br />
[. . . ] in quantum mechanics, we are not dealing with an arbitrary renunciation of a more<br />
detailed analysis of atomic phenomena, but with a recognition that such an analysis is in<br />
principle excluded.
Heisenberg (1927, p. 83) also expresses himself in this sense. He summarizes the uncertainty relations<br />
as<br />
Even in principle, we cannot know the present in all detail.<br />
He rejects the conception that behind the statistical description of quantum mechanics there still is a<br />
‘real world’ as a “fruitless and senseless speculation” (loc. cit.).<br />
According to both Bohr and Heisenberg, the quantum mechanical description cannot be applied<br />
to the whole world, because a classically described context of measurement is always necessary. The<br />
border between the classical and quantum mechanical description can be moved at will, but cannot<br />
be removed. Therefore, quantum mechanics is not a universal theory in the sense that there exists<br />
something like a ‘wave function of the universe’.<br />
Further agreement between Heisenberg and Bohr is found in the significance they attach to measurement.<br />
The difference is that according to Heisenberg something changes in the object during<br />
measurement; some properties are created, others disappear or become fuzzy. According to Bohr<br />
nothing has to happen in the object. The experimental setup only enables some description of the system<br />
which would not be allowed in another experimental setup. According to Bohr, the uncertainty<br />
relation is a symbolic, as opposed to a descriptive, expression of the impossibility of defining position and<br />
momentum in one phenomenon.<br />
Another difference is that Heisenberg tends, more than Bohr, to a realistic interpretation of the<br />
mathematical quantum formalism. In an interview at the end of his life, Heisenberg admitted that he<br />
never really understood the idea of complementarity.<br />
IV. 3<br />
DEBATE BETWEEN EINSTEIN AND BOHR<br />
IV. 3. 1<br />
INTRODUCTION<br />
Einstein, who contributed to the development of the quantum theory until 1922, never wanted<br />
to accept the Copenhagen interpretation. In his memoirs, Heisenberg mentions how he, during a visit to<br />
Berlin, explained his starting point that the theory may speak exclusively about observable quantities,<br />
and how, to his surprise, Einstein would have none of it: “the theory decides what can be<br />
observed”. The main source for the course of the debate between Einstein and Bohr, which we will<br />
review here, is Bohr’s own report ‘Discussion with Einstein on Epistemological Problems in Atomic<br />
Physics’ (Bohr 1949).<br />
The very first time Einstein made his objections public was at the 5th Solvay conference in<br />
Brussels in 1927, where he suggested there were two conceivable conceptions concerning the quantum<br />
mechanical wave function.<br />
(i) The state ψ gives a description of the individual system which is as complete as possible.<br />
(ii) The state ψ does not characterize an individual system but an ensemble of identically prepared<br />
systems. Therefore, as a description of the individual system ψ is incomplete, ψ is a ‘statistical<br />
quantity’.
Conception (i) was defended by Heisenberg and Bohr. Einstein raised the following objection to this<br />
conception: when a particle travels through a narrow slit, the wave function will, by diffraction, extend<br />
itself over a large part of space. If this is a complete description of the particle, we have to conclude<br />
that it is potentially present everywhere in this area. But after detection of the particle on a photographic<br />
plate it is out of the question that it can still be found elsewhere. Therefore, the wave function<br />
must disappear suddenly there, which would imply a peculiar ‘action at a distance’. This objection<br />
does not apply to conception (ii), because there the detection simply corresponds to the choice of an<br />
element from the ensemble.<br />
In his answer, Bohr emphasized that the diffraction of the wave function by a slit in a firmly bolted<br />
screen originates in the possibility for the particle to exchange momentum with the screen. But<br />
this exchange of momentum is not analyzable within this setup, i.e., without detaching the screen.<br />
The question whether a more detailed description of the individual case is possible found its<br />
temporary culmination in the analysis of the thought experiment with the double slit, which is depicted<br />
in figure IV. 2. When a monochromatic wave travels through a screen with two narrow slits, an interference<br />
pattern is visible on a photographic plate. This is typical for wave behavior, where the waves<br />
from both slits cooperate. An individual particle, however, can only travel through one slit, and the<br />
wave function does not tell us through which slit it travels.<br />
Figure IV. 2: The double slit interference experiment (Bohr 1949)<br />
Einstein now suggested that it was nevertheless possible to obtain information about which<br />
slit the particle travels through, for example by measuring the transfer of momentum to the first screen.<br />
If this screen received a thrust downwards, the particle has chosen the upper slit, and vice versa.<br />
Bohr answered that if we want to measure the momentum transfer to the screen with an accuracy<br />
which is sufficient to distinguish the recoils belonging to the paths through the two slits, the momentum<br />
of the screen itself must be known very precisely. If d represents the distance between the slits, and l<br />
represents the distance between the screens, the angle between the two paths is of the order<br />
\[ \alpha \simeq \sin\alpha = \frac{d}{l}. \qquad \text{(IV. 5)} \]<br />
The recoil is of the order<br />
\[ p_0 \sin\alpha \simeq \frac{h}{\lambda}\,\frac{d}{l}, \qquad \text{(IV. 6)} \]<br />
and therefore we have to know the momentum of the screen with an uncertainty<br />
\[ \delta p \lesssim \frac{h}{\lambda}\,\frac{d}{l}. \qquad \text{(IV. 7)} \]<br />
Attaining such accuracy is, however, only possible if the screen is movable. But in that case it can<br />
no longer fulfil its function as a screen which determines an exact position for the slit. It<br />
is therefore no longer part of the original measuring context, as can be seen in figure IV. 3.<br />
Figure IV. 3: Contexts of measurement in which the interference of the particles is visible, and those<br />
in which the recoil of the screen is visible, exclude each other. (Bohr 1949)<br />
Actually, because now we will perform a measurement on the screen, the screen itself has to be<br />
considered an object. This means that quantum mechanics applies to it, and the screen is, therefore,<br />
also subject to an uncertainty relation<br />
\[ \delta q \gtrsim \frac{\lambda\, l}{d}. \qquad \text{(IV. 8)} \]<br />
But this is an indefiniteness of the same order of magnitude as the distance between the interference<br />
bands. Bohr concludes that under these circumstances interference can no longer be seen.<br />
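The chain of estimates (IV. 5) to (IV. 8) can be checked numerically. The following Python sketch is only an<br />
illustration: the wavelength, slit separation and screen distance are assumed values that do not come from the<br />
text, the function name is hypothetical, and the de Broglie relation p₀ = h/λ is assumed for the particle momentum.<br />

```python
# Order-of-magnitude sketch of Bohr's double-slit estimate, eqs. (IV. 5)-(IV. 8).
# All numerical inputs are illustrative assumptions, not values from the text.

H = 6.626e-34  # Planck's constant in J s


def bohr_double_slit(wavelength, d, l):
    """Return (alpha, dp_max, dq_min) for slit separation d and screen distance l.

    Hypothetical helper: it only strings together the estimates of the text.
    """
    alpha = d / l          # (IV. 5): small angle between the two paths
    p0 = H / wavelength    # assumed de Broglie momentum of the particle
    dp_max = p0 * alpha    # (IV. 6), (IV. 7): recoil that must be resolved
    dq_min = H / dp_max    # (IV. 8): position uncertainty of the screen
    return alpha, dp_max, dq_min


wavelength = 5.0e-7  # assumed: 500 nm
d = 1.0e-4           # assumed: slits 0.1 mm apart
l = 1.0              # assumed: screens 1 m apart

alpha, dp_max, dq_min = bohr_double_slit(wavelength, d, l)
print(f"angle between paths : {alpha:.3e} rad")
print(f"required delta p    : {dp_max:.3e} kg m/s")
print(f"resulting delta q   : {dq_min:.3e} m")
print(f"lambda * l / d      : {wavelength * l / d:.3e} m")
```

Whatever inputs are chosen, Planck’s constant cancels and the last two numbers coincide: the position uncertainty<br />
forced on the screen is λ l / d, the scale which the text identifies with the distance between the interference bands.<br />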
With this reasoning he was able to transform Einstein’s objection into a confirmation of his idea<br />
of complementarity: as soon as we try to carry out a closer analysis of the phenomenon, we have to<br />
modify the experimental setup in such a way that the phenomenon changes beyond recognition. Nowadays<br />
a variant of this thought experiment can actually be carried out in a laboratory, as we will<br />
discuss in section IV. 4.<br />
IV. 3. 2<br />
THE PHOTON BOX<br />
At the 6th Solvay Conference in 1930 in Brussels, Einstein gave another example, which is known<br />
under the name ‘the photon box’. It concerns an isolated box filled with radiation and equipped with<br />
a clock mechanism which opens a shutter during a very short interval. It is assumed that in advance<br />
the box is weighed meticulously.
Upon closure of the shutter we have, according to Einstein, a choice: either we weigh the box<br />
again and determine how much mass has vanished so that we can, using the relation E = mc²,<br />
retrieve the energy of the escaped photon, or we open the box and read off the clock mechanism to<br />
determine when the shutter has been opened, which enables us to predict the time of exit of the photon<br />
and therefore its time of arrival at a remote detector. We can choose between both options long after<br />
the photon has left.<br />
Bohr’s answer is not entirely clear. It may be assumed that he did not understand Einstein’s<br />
intentions correctly.¹ He interprets Einstein’s objection as an attempt to refute the uncertainty relation<br />
between energy and time; he shows that both determinations cannot possibly be made at the same<br />
time.<br />
Bohr reasons as follows. Assume that the box hangs in equilibrium from a spring in a gravitational<br />
field. When in a time interval T a mass δm escapes, the box receives an upward impulse F ∆t of magnitude<br />
\[ g\,\delta m\,T. \qquad \text{(IV. 9)} \]<br />
We can keep T finite by, at some moment, hanging a small weight on the box to compensate for the<br />
loss of mass. If we want to determine the mass of the photon by measuring this momentum<br />
transfer, then, again, the momentum of the box at the start of the experiment must be known precisely,<br />
\[ \delta p \lesssim g\,\delta m\,T. \qquad \text{(IV. 10)} \]<br />
But now the same argument applies as used in the double slit experiment. This precise determination<br />
of momentum is only possible if the fixation of the position of the box is given up. The box itself<br />
must be considered a quantum mechanical object, and therefore the uncertainty relation δp δq ≳ h<br />
applies to it. The position of the box is unknown with an uncertainty of magnitude<br />
\[ \delta q \gtrsim \frac{h}{g\,\delta m\,T}, \qquad \text{(IV. 11)} \]<br />
from which it follows that the gravitational potential φ_g to which the clock is exposed is also uncertain,<br />
\[ \delta\varphi_g \simeq g\,\delta q \gtrsim \frac{h}{\delta m\,T}. \qquad \text{(IV. 12)} \]<br />
But according to the red shift formula from the general theory of relativity (!) the rate of a clock is<br />
influenced by the gravitational potential,<br />
\[ \frac{\Delta T}{T} = \frac{\delta\varphi_g}{c^2}, \qquad \text{(IV. 13)} \]<br />
therefore, the rate of the clock is also uncertain, and consequently the time of opening of the shutter is<br />
unknown. Under the circumstances in which we can determine the energy of the photon, we cannot<br />
retrieve its exit time exactly.<br />
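Combining (IV. 12) and (IV. 13) with E = mc² makes the resulting trade-off explicit. The following is a sketch of<br />
that step, using only the order-of-magnitude relations above; the symbol δE = δm c² for the uncertainty in the<br />
energy of the escaped photon is introduced here for illustration.<br />

```latex
\[
  \Delta T \;=\; T\,\frac{\delta\varphi_g}{c^2}
  \;\gtrsim\; \frac{T}{c^2}\,\frac{h}{\delta m\,T}
  \;=\; \frac{h}{\delta m\,c^2}
  \;=\; \frac{h}{\delta E},
  \qquad\text{so that}\qquad
  \Delta T\,\delta E \;\gtrsim\; h .
\]
```

The weighing interval T drops out, so the bound does not depend on how long the weighing takes.<br />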
Although Bohr seems to refute Einstein with his own theory, Bohr’s answer raises, among other<br />
things, the question whether it is appropriate that the correctness of quantum mechanics relies on<br />
the correctness of the general theory of relativity, which is a classical theory, and is, strictly speaking,<br />
in contradiction with quantum mechanics.<br />
¹ That Einstein indeed had the intention to point out the freedom of choice is apparent from a letter to Bohr from Paul<br />
Ehrenfest, who heard the argument from Einstein earlier.<br />
EXERCISE 27. Try, using the uncertainty relation for time and energy, δt δE ≳ h, to refute<br />
Einstein’s argumentation without appealing to other physical theories.<br />
IV. 3. 3<br />
EINSTEIN, PODOLSKY AND ROSEN<br />
The thought experiment of Einstein, Podolsky and Rosen, which we discussed in section I. 2,<br />
forms the highlight of the debate. Here Einstein’s objections emerge in their purest form.<br />
Given two systems which interacted with each other at some time, but are separated now, consider<br />
two non-commuting quantities A and A′ of one of the particles, and B and B′ of the other<br />
particle. Measurement of A allows us to make a prediction with certainty concerning B of the other particle;<br />
measurement of A′ allows us, analogously, to make a prediction with certainty concerning B′ of the other<br />
particle.<br />
Einstein admits that these two measurements cannot be carried out simultaneously. But we can<br />
choose which measurement we perform while the other particle is very far away. It is not reasonable,<br />
EPR argue, that this other particle will be influenced by this choice. This means that although<br />
only one of the two predictions concerning the other particle can be made with certainty, both predictions<br />
are, at the same time, true, corresponding to properties of the other particle, i.e., to ‘elements of<br />
physical reality’.<br />
IV. 3. 4<br />
HEISENBERG, BOHR AND EINSTEIN, PODOLSKY AND ROSEN<br />
According to Heisenberg, measurement has an essential influence. Some properties of the particle<br />
become sharp, others fuzzy. If this consequence of measurement were understood to be a physical<br />
interaction, this would suggest the following ‘natural’ requirement of locality (M.L.G. Redhead 1987, p. 77):<br />
An unsharp value for an observable cannot be changed into a sharp value by measurements<br />
performed at a distance.<br />
But the analysis of EPR shows that, the particles being far removed from each other, this requirement<br />
has not been met, making Heisenberg’s interpretation much less physically pictorial than it seemed to<br />
be initially. The natural requirement of locality in Bohr’s interpretation reads (loc. cit.)<br />
A previously undefined value for an observable cannot be defined by measurements performed<br />
‘at a distance’.<br />
This requirement has also not been fulfilled.<br />
Bohr’s answer to EPR, and his rejection of the incompleteness claim, amounts to the notion that<br />
the aforementioned requirement of locality can be violated without implying the existence of superluminal<br />
physical effects. The ‘defining’ functioning of measuring apparatuses is not a process that<br />
propagates in space and time and by means of some interaction disturbs particles that are not measured,<br />
or creates values for properties in those particles. It concerns an epistemological role of the<br />
measuring apparatuses. The measuring apparatuses measuring one of a pair of correlated particles<br />
define which classical terms apply to both particles.
If the position of one of the particles is measured, we are dealing with a phenomenon in which<br />
the term position is applicable. Thus, on the basis of the correlation between these particles the term<br />
‘position’ is also applicable to the other particle. If the position of one of the particles is measured,<br />
a ‘position perspective’ is opened, so to speak, to the world. Likewise, measurement of momentum<br />
on one of the particles makes the other particle accessible to a description with the term ‘momentum’.<br />
Even though there is no physical intervention on this particle, it is still not permitted to speak about<br />
the particles having these properties outside the context of a phenomenon. Therefore, Bohr rejects<br />
Einstein’s reasoning that the other particle, not being disturbed by the measurement, consequently<br />
also possesses the properties ‘position’ and ‘momentum’ independent of measurement.<br />
In fact, this same reasoning can be applied to the double slit experiment, as Bohr showed<br />
in his answer to Einstein. In this experiment we also have a choice to do either a measurement of<br />
momentum on the screen and this way determine which path the particle has taken, thereby losing the<br />
interference pattern, or to measure its position, thereby retrieving the interference pattern again. But<br />
Bohr writes<br />
As repeatedly stressed, the principal point is here that such measurements demand mutually<br />
exclusive experimental arrangements.<br />
IV. 4<br />
NEUTRON INTERFEROMETRY<br />
Nowadays, a variant version of the thought experiment with the double slit can be carried out in<br />
the laboratory using a neutron interferometer. A neutron interferometer consists of a massive perfect<br />
silicon crystal, usually with dimensions of approximately 10 × 10 × 50 cm³. After cutting large<br />
notches in the crystal, a basis with upstanding teeth remains, see figure IV. 4.<br />
Figure IV. 4: Several perfect crystal neutron interferometers (Rauch and Werner 2000 )
Using an interferometer with three upstanding teeth, a monochromatic beam of neutrons with<br />
a de Broglie wavelength of approximately 1 Å now hits the first tooth of this crystal. The crystal<br />
lattice acts like a grid and lets the beam pass in very sharply defined directions. Under suitable<br />
conditions there are exactly two emanating beams, one transmitted (T) and one reflected (R), as<br />
shown in figure IV. 5 a.<br />
At the second tooth this process is repeated, and both beams are again split up. Two of them are<br />
now outside the interferometer where they are screened, no longer participating. The remaining two<br />
beams are bent towards each other and meet at the third tooth. Here, both beams are split up again,<br />
and now the straightforward going beam of one path is superimposed on the reflected beam of the<br />
other path. Neutron detectors are placed in both emanating beams.<br />
Figure IV. 5: a) A sketch of the setup, with paths 1 and 2, transmitted (T) and reflected (R) beams, and detectors A and B.<br />
b) The experimental results (Rauch and Werner 2000). The interference pattern in the neutron interferometer is acquired by measuring the<br />
intensity in the detectors at a variable optical path length difference.<br />
If the incoming beam comes from below, and the beams are not manipulated, all neutrons turn out<br />
to end up in the upper beam at detector A, undergoing constructive interference, while the neutrons<br />
in the lower beam extinguish each other. For this phenomenon it is essential that the interferometer<br />
consists of only one crystal, for in that case the waves remain coherent even though, along the way,<br />
the beams have been separated by ‘macroscopic distances’, approximately 5 cm or ≃ 10⁹ λ. When a<br />
neutron arrives in a detector, it may have traveled along either of the two paths.<br />
Upon introducing a phase difference between the two paths by sliding a small piece of aluminium<br />
of variable thickness in one of the paths, the intensity shifts from the upper to the lower detector. This<br />
intensity is a periodic function of the thickness of the piece of aluminium, see figure IV. 5 b. This is<br />
the interference pattern.<br />
Now the question is if we can, in some way, uncover along which path the particle has traveled.<br />
Following Bohr’s line of thought this should be possible by sawing off one of the teeth and measuring<br />
the recoil it receives from the neutron. Such an experiment cannot, however, be carried out with the<br />
required experimental exactitude.<br />
Another option is to make use of the fact that the neutron is a spin 1/2 particle and therefore has<br />
an internal degree of freedom. We can carry out such an experiment with a polarized beam, where
all neutrons have, at entry in the interferometer, spin up in the z - direction. We place the complete<br />
setup in a homogeneous magnetic field which ensures that spin up and spin down differ in<br />
energy by $\hbar\omega_0$. In one of the paths we place a ‘spin flipper’, a small coil through which an alternating<br />
current runs having exactly the resonance frequency $\omega_0$. At a suitable choice of the length of the<br />
coil the spin of every neutron which travels through it will be flipped over. Subsequently, we place<br />
spin analyzers in front of the detectors, so that we can not only observe in which emanating beam the<br />
neutron is located but also its spin in the z - direction.<br />
In this setup we can therefore uncover exactly along which path the particle has traveled; spin up<br />
means the path without the spin flipper has been chosen, spin down means the neutron traveled along<br />
the path with the spin flipper. But in this setup no more interference is seen! The intensity is equal in<br />
both detectors and independent of the phase difference.<br />
We can describe this as follows. The wavepath function $|\phi_0\rangle \in L^2(\mathbb{R}^2)$ of an emanating neutron<br />
consists of four terms,<br />
$$ |\phi_0\rangle = \tfrac{1}{2}\left( |\phi_{1A}\rangle + |\phi_{1B}\rangle + e^{i\chi}|\phi_{2A}\rangle + e^{i\chi}|\phi_{2B}\rangle \right). $$ (IV. 14)<br />
Here $\phi_{iA}$ and $\phi_{iB}$ represent the wave functions ending up in the detectors A and B, respectively, 1<br />
and 2 refer to the two possible paths through the interferometer, as can be seen in figure IV. 5 a. The<br />
factor $e^{i\chi}$ corresponds to the phase shift by the aluminium. If $\chi = 0$, there is maximum constructive<br />
interference in A and total destructive interference in B, from which it follows that<br />
$$ |\phi_{1A}\rangle = |\phi_{2A}\rangle \quad\text{and}\quad |\phi_{1B}\rangle = -|\phi_{2B}\rangle. $$ (IV. 15)<br />
The intensity in detector A is given by the expectation value of a projection $P_A$, where<br />
$P_A|\phi_{iA}\rangle = |\phi_{iA}\rangle$ and $P_A|\phi_{iB}\rangle = 0$, analogously for $P_B$. Therefore, we find for the intensity $I_A$<br />
of the neutron beam that encounters detector A, quantum mechanically expressed as the probability<br />
to find a neutron in detector A,<br />
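$$ I_A = \langle\phi_0|P_A|\phi_0\rangle = \tfrac{1}{4}\left(\langle\phi_{1A}| + \langle\phi_{2A}|e^{-i\chi}\right)\left(|\phi_{1A}\rangle + e^{i\chi}|\phi_{2A}\rangle\right) = \tfrac{1}{2}(1 + \cos\chi), $$ (IV. 16)<br />
and likewise for $I_B$,<br />
$$ I_B = \langle\phi_0|P_B|\phi_0\rangle = \tfrac{1}{4}\left(\langle\phi_{1B}| + \langle\phi_{2B}|e^{-i\chi}\right)\left(|\phi_{1B}\rangle + e^{i\chi}|\phi_{2B}\rangle\right) = \tfrac{1}{2}(1 - \cos\chi). $$ (IV. 17)<br />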
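The intensities (IV. 16) and (IV. 17) can be checked numerically with a minimal sketch. It assumes, as in (IV. 15),<br />
that the path components arriving at each detector are equal up to a sign and normalized, so that only their complex<br />
amplitudes need to be tracked. The function name `intensities` is introduced here for illustration only.<br />

```python
import cmath
import math


def intensities(chi):
    """Return (I_A, I_B) for a phase shift chi, following (IV. 14)-(IV. 15).

    Assumption of this sketch: |phi_1A> = |phi_2A> and |phi_1B> = -|phi_2B>
    are unit vectors, so only the amplitudes in front of them matter.
    """
    phase = cmath.exp(1j * chi)
    amp_a = 0.5 * (1 + phase)   # detector A: the two path amplitudes add
    amp_b = 0.5 * (1 - phase)   # detector B: the two path amplitudes subtract
    return abs(amp_a) ** 2, abs(amp_b) ** 2


for chi in (0.0, math.pi / 2, math.pi):
    i_a, i_b = intensities(chi)
    print(f"chi = {chi:.3f}   I_A = {i_a:.3f}   I_B = {i_b:.3f}")
```

For every χ the two values add up to 1, and they follow ½(1 ± cos χ), the periodic pattern of figure IV. 5 b.<br />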
In this experiment the neutrons are polarized, therefore we can add the spin state to the wavepath<br />
function and thus get a Pauli spinor,<br />
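$$ |\phi_{i,\mathrm{tot}}\rangle = |\phi_0\rangle \otimes |z{\uparrow}\rangle = \phi(\vec q)\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \phi(\vec q) \\ 0 \end{pmatrix} \in L^2(\mathbb{R}^3) \otimes \mathbb{C}^2. $$ (IV. 18)<br />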
The functioning of the spin flipper, which we assume to be completely ideal, can now be described as<br />
follows. The component of the state traveling along path 1 does not meet a spin flipper, which means
that it remains unaltered, and we have, leaving out the cartwheels ⊗,<br />
|ϕ 1A ⟩ |z ↑⟩ → |ϕ 1A ⟩ |z ↑⟩ and |ϕ 1B ⟩ |z ↑⟩ → |ϕ 1B ⟩ |z ↑⟩, (IV. 19)<br />
whereas for the components traveling along path 2 the spin direction reverses,<br />
|ϕ 2A ⟩ |z ↑⟩ → |ϕ 2A ⟩ |z ↓⟩ and |ϕ 2B ⟩ |z ↑⟩ → |ϕ 2B ⟩ |z ↓⟩. (IV. 20)<br />
Therefore, the total final state is<br />
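$$ |\phi_{f,\mathrm{tot}}\rangle = \tfrac{1}{2}\left( |\phi_{1A}\rangle|z{\uparrow}\rangle + |\phi_{1B}\rangle|z{\uparrow}\rangle + e^{i\chi}|\phi_{2A}\rangle|z{\downarrow}\rangle + e^{i\chi}|\phi_{2B}\rangle|z{\downarrow}\rangle \right), $$ (IV. 21)<br />
which means that for the intensity we have<br />
$$ I_A = \langle\phi_{f,\mathrm{tot}}|\,P_A \otimes \mathbb{1}\,|\phi_{f,\mathrm{tot}}\rangle = \tfrac{1}{4}\left(\langle\phi_{1A}|\langle z{\uparrow}| + e^{-i\chi}\langle\phi_{2A}|\langle z{\downarrow}|\right)\left(|\phi_{1A}\rangle|z{\uparrow}\rangle + e^{i\chi}|\phi_{2A}\rangle|z{\downarrow}\rangle\right) = \tfrac{1}{2}, $$ (IV. 22)<br />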
and likewise for I B . We see that, because of the orthogonality of the spin states |z ↑⟩ and |z ↓⟩, the<br />
interference term disappears.<br />
With the neutron interferometer we can also illustrate the fact that there is always freedom of<br />
choice because we can, instead of analyzers for spin in the z - direction, place analyzers for spin in<br />
the x - direction.<br />
The eigenvectors for spin in the x - direction are superpositions of those in the z - direction, see<br />
section III. 6, equations (III. 145) and (III. 146),<br />
$$ |x{\uparrow}\rangle = \tfrac{1}{2}\sqrt{2}\,\left(|z{\uparrow}\rangle + |z{\downarrow}\rangle\right) \quad\text{and}\quad |x{\downarrow}\rangle = \tfrac{1}{2}\sqrt{2}\,\left(|z{\downarrow}\rangle - |z{\uparrow}\rangle\right). $$ (IV. 23)<br />
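We can calculate the probability to find, e.g., a neutron with spin in the negative x - direction in detector<br />
A, as the expectation value of the projector $P_A \otimes |x{\downarrow}\rangle\langle x{\downarrow}|$ in the state $|\phi_{f,\mathrm{tot}}\rangle$, (IV. 21),<br />
$$ \begin{aligned} \langle\phi_{f,\mathrm{tot}}|\left(P_A \otimes |x{\downarrow}\rangle\langle x{\downarrow}|\right)|\phi_{f,\mathrm{tot}}\rangle = \tfrac{1}{4}\Big( & \langle\phi_{1A}|P_A|\phi_{1A}\rangle \langle z{\uparrow}|x{\downarrow}\rangle\langle x{\downarrow}|z{\uparrow}\rangle + e^{i\chi}\langle\phi_{1A}|P_A|\phi_{2A}\rangle \langle z{\uparrow}|x{\downarrow}\rangle\langle x{\downarrow}|z{\downarrow}\rangle \\ & + e^{-i\chi}\langle\phi_{2A}|P_A|\phi_{1A}\rangle \langle z{\downarrow}|x{\downarrow}\rangle\langle x{\downarrow}|z{\uparrow}\rangle + \langle\phi_{2A}|P_A|\phi_{2A}\rangle \langle z{\downarrow}|x{\downarrow}\rangle\langle x{\downarrow}|z{\downarrow}\rangle \Big) = \tfrac{1}{4}(1 - \cos\chi), \end{aligned} $$ (IV. 24)<br />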
and we see interference again.<br />
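The loss and recovery of interference can be illustrated with a small numerical sketch. It assumes, as in (IV. 15), that<br />
$|\phi_{1A}\rangle = |\phi_{2A}\rangle$ is normalized, so that the part of (IV. 21) arriving at detector A reduces to a two - component<br />
spinor in the (z↑, z↓) basis. All function names are hypothetical and chosen for this illustration.<br />

```python
import cmath
import math

S = 1 / math.sqrt(2)
X_UP = (S, S)       # |x up>   = (|z up> + |z down>) / sqrt(2), cf. (IV. 23)
X_DOWN = (-S, S)    # |x down> = (|z down> - |z up>) / sqrt(2)


def spinor_at_a(chi):
    """Spin components (z up, z down) of the final state at detector A.

    Path 1 arrives with spin up, path 2 with flipped spin and phase chi,
    each with amplitude 1/2 as in (IV. 21).
    """
    return (0.5 + 0j, 0.5 * cmath.exp(1j * chi))


def total_intensity(chi):
    """I_A without spin analysis, cf. (IV. 22)."""
    return sum(abs(c) ** 2 for c in spinor_at_a(chi))


def analysed_intensity(chi, analyser):
    """Probability that detector A fires behind the given spin analyser."""
    overlap = sum(complex(a).conjugate() * c
                  for a, c in zip(analyser, spinor_at_a(chi)))
    return abs(overlap) ** 2


for chi in (0.0, math.pi / 2, math.pi):
    print(chi, total_intensity(chi),
          analysed_intensity(chi, X_DOWN), analysed_intensity(chi, X_UP))
```

The unanalysed intensity does not depend on χ, whereas the x - analysed intensities oscillate with χ and add up to it.<br />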
EXERCISE 28. Verify the calculations (IV. 22) and (IV. 24).<br />
In this case we also can choose whether we measure spin in the x - direction or in the z - direction<br />
long after the neutron has left the interferometer, which means that the neutron seems to make the<br />
choice whether to take one of the paths through the interferometer, or to show interference between
both paths, after it has left the interferometer. J.A. Wheeler (1978) called such experiments delayed -<br />
choice experiments. Outcomes of measurements in the future seem to determine what has happened<br />
in the past!<br />
Actual confirmation of this freedom of choice was not obtained until 2007, when a group in<br />
Cachan, France, succeeded in carrying out such an experiment using linearly polarized single photons,<br />
a 48 m interferometer and two beamsplitters. In their article (Jaques 2007) they conclude that<br />
Our realization of Wheeler’s delayed - choice gedanken experiment demonstrates that<br />
the behavior of the photon in the interferometer depends on the choice of the observable<br />
that is measured, even when that choice is made at a position and a time such that it is<br />
separated from the entrance of the photon into the interferometer by a space - like interval.<br />
EXERCISE 29. Give, concisely, Bohr’s view on such experiments.<br />
IV. 5<br />
THE UNCERTAINTY RELATIONS<br />
IV. 5. 1<br />
INTRODUCTION<br />
Heisenberg’s original reasonings concerning the uncertainty principle resulted in ‘approximate<br />
inequalities’ for position q and momentum p, and for energy E and time t, of the form<br />
δq δp ∼ h and δE δt ∼ h. (IV. 25)<br />
In this section we will focus on the mathematical meaning of δ q, δ p, δ E and δ t and their interpretation.<br />
In his first article, Heisenberg (1927) gives the Gaussian wave packet as the only quantitative<br />
example. Its Fourier transform is also Gaussian and the widths of these packets are inversely proportional<br />
to each other, a general result of Fourier analysis. A suitable definition of these widths<br />
yields $q_1 p_1 = h$, where $q_1$ and $p_1$ represent the widths in question. In that same year E.H. Kennard<br />
derived the following general inequality,<br />
$$ \Delta_\psi Q \, \Delta_\psi P \geq \tfrac{1}{2}\hbar, $$ (IV. 26)<br />
where ∆ ψ Q and ∆ ψ P are standard deviations of Q and P in ψ ∈ L 2 (R). In his Chicago lectures,<br />
Heisenberg (1930) considers the Kennard inequality (IV. 26) as the mathematical expression of the<br />
uncertainty principle. We will criticize this still widespread conception shortly, and give a derivation<br />
of the ‘standard uncertainty inequalities’, which are a generalization of the Kennard inequality.<br />
◃ Remark<br />
In his discussions of the uncertainty principle, Bohr exclusively makes use of relations of the<br />
type (IV. 25). ▹
IV. 5. 2<br />
THE STANDARD UNCERTAINTY RELATIONS<br />
If ψ ∈ L 2 (R) is the normalized wave function of a physical system in the q - language,<br />
with ∥ψ∥ = 1, the wave function ˜ψ(p) in the p - language is its Fourier transform<br />
$$ \tilde\psi(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{\mathbb{R}} e^{-ipq/\hbar}\, \psi(q)\, dq, $$ (IV. 27)<br />
and its inverse Fourier transform is<br />
$$ \psi(q) = \frac{1}{\sqrt{2\pi\hbar}} \int_{\mathbb{R}} e^{ipq/\hbar}\, \tilde\psi(p)\, dp. $$ (IV. 28)<br />
The norm is invariant under Fourier transformations, therefore ∥ ˜ψ∥ = 1.<br />
The standard deviation of position in a state |ψ⟩, $\Delta_\psi Q$, is defined as<br />
$$ (\Delta_\psi Q)^2 = \langle Q^2\rangle_\psi - \langle Q\rangle_\psi^2 = \int_{\mathbb{R}} q^2 |\psi(q)|^2\, dq - \left( \int_{\mathbb{R}} q\, |\psi(q)|^2\, dq \right)^2. $$ (IV. 29)<br />
Likewise, for momentum, $\Delta_\psi P$, we have<br />
$$ \begin{aligned} (\Delta_\psi P)^2 = \langle P^2\rangle_\psi - \langle P\rangle_\psi^2 &= -\hbar^2 \int_{\mathbb{R}} \psi^*(q)\, \frac{d^2\psi(q)}{dq^2}\, dq - \left( -i\hbar \int_{\mathbb{R}} \psi^*(q)\, \frac{d\psi(q)}{dq}\, dq \right)^2 \\ &= \int_{\mathbb{R}} p^2 |\tilde\psi(p)|^2\, dp - \left( \int_{\mathbb{R}} p\, |\tilde\psi(p)|^2\, dp \right)^2. \end{aligned} $$ (IV. 30)<br />
Without loss of generality we can assume ⟨P⟩ and ⟨Q⟩ to equal 0, so that<br />
$$ (\Delta_\psi P)^2 = -\hbar^2 \int_{\mathbb{R}} \psi^*(q)\, \frac{d^2\psi(q)}{dq^2}\, dq = \int_{\mathbb{R}} p^2 |\tilde\psi(p)|^2\, dp. $$ (IV. 31)<br />
If the wave function ψ(q) is a Gaussian wave packet, the product $\Delta_\psi Q\, \Delta_\psi P$ takes on the minimum value<br />
of $\tfrac{1}{2}\hbar$. An example is the ground state of the one - dimensional harmonic oscillator having mass m,<br />
$$ \phi_0(q) = \left( \frac{m\omega_0}{\pi\hbar} \right)^{1/4} e^{-m\omega_0 q^2 / 2\hbar}, $$ (IV. 32)<br />
with energy $E_0 = \tfrac{1}{2}\hbar\omega_0$.<br />
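That the Gaussian (IV. 32) reaches the lower bound of (IV. 26) can be checked with a rough numerical sketch. It works in<br />
units where ħ = 1 and m ω₀ = 1 (an assumption made here for convenience), evaluates (IV. 29) on a grid, and uses<br />
the integrated - by - parts form ⟨P²⟩ = ħ² ∫ |dψ/dq|² dq of (IV. 31) with a finite - difference derivative.<br />

```python
import math

HBAR = 1.0      # units with hbar = 1 (assumption of this sketch)
M_OMEGA = 1.0   # m * omega_0, also set to 1


def phi0(q):
    """Ground state (IV. 32) of the harmonic oscillator."""
    return (M_OMEGA / (math.pi * HBAR)) ** 0.25 * math.exp(-M_OMEGA * q * q / (2 * HBAR))


def uncertainty_product(n=4001, q_max=10.0):
    """Approximate Delta Q * Delta P for phi0 on a uniform grid."""
    dq = 2 * q_max / (n - 1)
    qs = [-q_max + i * dq for i in range(n)]
    psi = [phi0(q) for q in qs]
    norm = sum(p * p for p in psi) * dq
    # <Q> = <P> = 0 by symmetry, so the variances are plain second moments.
    var_q = sum(q * q * p * p for q, p in zip(qs, psi)) * dq / norm
    # Central differences for dpsi/dq at the interior grid points.
    dpsi = [(psi[i + 1] - psi[i - 1]) / (2 * dq) for i in range(1, n - 1)]
    var_p = HBAR ** 2 * sum(d * d for d in dpsi) * dq / norm
    return math.sqrt(var_q * var_p)


print(uncertainty_product(), "compared with hbar / 2 =", HBAR / 2)
```

The grid size and cutoff are arbitrary choices; the result approaches ħ/2 as the grid is refined.<br />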
Before interpreting the Kennard inequality (IV. 26), we give a still more general inequality, derived<br />
by Schrödinger (1930). Consider two arbitrary self - adjoint operators A and B acting on a Hilbert<br />
space H. Define, for a pure state |ψ⟩ ∈ H, the following operators:<br />
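$$ A_\psi := A - \langle A\rangle_\psi \mathbb{1} \quad\text{and}\quad B_\psi := B - \langle B\rangle_\psi \mathbb{1}. $$ (IV. 33)<br />
The expectation values of these operators are, in the state |ψ⟩, equal to 0,<br />
$$ \langle A_\psi\rangle_\psi = \langle B_\psi\rangle_\psi = 0. $$ (IV. 34)<br />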
The Cauchy - Schwarz inequality (II. 12), p. 19, for the vectors A ψ |ψ⟩ and B ψ |ψ⟩ reads<br />
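$$ \langle A_\psi\psi \,|\, A_\psi\psi\rangle\, \langle B_\psi\psi \,|\, B_\psi\psi\rangle \geq \left| \langle A_\psi\psi \,|\, B_\psi\psi\rangle \right|^2. $$ (IV. 35)<br />
Because $A_\psi$ and $B_\psi$ are self - adjoint, we can also write this inequality as follows,<br />
$$ \langle A_\psi^2\rangle_\psi\, \langle B_\psi^2\rangle_\psi \geq \left| \langle A_\psi B_\psi\rangle_\psi \right|^2. $$ (IV. 36)<br />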
Using both the commutator [· , ·] − and the anti - commutator [· , ·] + , we find for the right - hand side<br />
of (IV. 36)<br />
$$ \left| \langle A_\psi B_\psi\rangle_\psi \right|^2 = \left| \tfrac{1}{2}\langle [A_\psi, B_\psi]_-\rangle_\psi + \tfrac{1}{2}\langle [A_\psi, B_\psi]_+\rangle_\psi \right|^2 = \tfrac{1}{4}\left| \langle [A_\psi, B_\psi]_-\rangle_\psi \right|^2 + \tfrac{1}{4}\langle [A_\psi, B_\psi]_+\rangle_\psi^2, $$ (IV. 37)<br />
where the cross - term disappears because of<br />
$$ \langle [A_\psi, B_\psi]_-\rangle_\psi^* = -\langle [A_\psi, B_\psi]_-\rangle_\psi, \qquad \langle [A_\psi, B_\psi]_+\rangle_\psi^* = +\langle [A_\psi, B_\psi]_+\rangle_\psi. $$ (IV. 38)<br />
Furthermore,<br />
$$ [A_\psi, B_\psi]_- = [A, B]_-, $$ (IV. 39)<br />
and we obtain the inequality<br />
$$ \langle A_\psi^2\rangle_\psi\, \langle B_\psi^2\rangle_\psi \geq \tfrac{1}{4}\left| \langle [A, B]_-\rangle_\psi \right|^2 + \tfrac{1}{4}\langle [A_\psi, B_\psi]_+\rangle_\psi^2. $$ (IV. 40)<br />
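The derivation of (IV. 40) can be spot - checked numerically in a finite - dimensional Hilbert space. The sketch below<br />
draws random Hermitian matrices and random normalized states (an illustrative setting chosen here, with hypothetical helper names)<br />
and compares both sides of the inequality.<br />

```python
import math
import random


def rand_hermitian(n, rng):
    """Random n x n Hermitian matrix as a list of rows."""
    m = [[0j] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = complex(rng.uniform(-1, 1), 0.0)
        for j in range(i + 1, n):
            z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            m[i][j], m[j][i] = z, z.conjugate()
    return m


def random_state(n, rng):
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]


def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def inner(u, v):
    return sum(a.conjugate() * b for a, b in zip(u, v))


def schroedinger_sides(a, b, psi):
    """Return (lhs, rhs) of (IV. 40) for Hermitian a, b and normalized psi."""
    mean_a = inner(psi, apply(a, psi)).real
    mean_b = inner(psi, apply(b, psi)).real
    a_psi = [x - mean_a * p for x, p in zip(apply(a, psi), psi)]  # A_psi |psi>
    b_psi = [x - mean_b * p for x, p in zip(apply(b, psi), psi)]  # B_psi |psi>
    lhs = inner(a_psi, a_psi).real * inner(b_psi, b_psi).real
    cross = inner(a_psi, b_psi)       # <A_psi B_psi>
    commutator = 2j * cross.imag      # <[A_psi, B_psi]_->, purely imaginary
    anticommutator = 2 * cross.real   # <[A_psi, B_psi]_+>, real
    return lhs, 0.25 * abs(commutator) ** 2 + 0.25 * anticommutator ** 2


rng = random.Random(0)
for _ in range(3):
    print(schroedinger_sides(rand_hermitian(3, rng), rand_hermitian(3, rng),
                             random_state(3, rng)))
```

In every draw the first number is at least as large as the second, up to rounding.<br />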
In view of the inequalities (IV. 26) and (IV. 40), we make a few remarks.<br />
(i) Leaving out the last term on the right - hand side of inequality (IV. 40) gives the better known<br />
but weaker inequality, derived by H.P. Robertson (1929),<br />
$$ \langle A_\psi^2\rangle_\psi\, \langle B_\psi^2\rangle_\psi \geq \tfrac{1}{4}\left| \langle [A, B]_-\rangle_\psi \right|^2. $$ (IV. 41)<br />
(ii) Notice that ⟨A 2 ψ ⟩ ψ is equal to the square of the standard deviation of the quantity A in the<br />
state |ψ⟩,<br />
$$ \langle A_\psi^2\rangle_\psi = \langle (A - \langle A\rangle_\psi)^2\rangle = (\Delta_\psi A)^2. $$ (IV. 42)<br />
(iii) For the special case A = Q and B = P , the Robertson inequality (IV. 41) transforms into the<br />
Kennard inequality (IV. 26), and the expressions (IV. 29) and (IV. 31) correspond to $\langle Q_\psi^2\rangle_\psi$ in<br />
the q - language and $\langle P_\psi^2\rangle_\psi$ in the p - language.<br />
(iv) Notice that in deriving these uncertainty relations the interpretation of the uncertainties plays<br />
no role.
(v) An objection to the Robertson inequality (IV. 41) and the Schrödinger inequality (IV. 40) is that<br />
the right - hand side depends on the state; therefore, it is not an absolute lower bound valid for all states.<br />
If |ψ⟩ is an eigenstate of A, the right - hand side of the Robertson inequality (IV. 41) is 0 and<br />
does not provide any restriction on ∆B. Therefore, even if A and B are not both at the same<br />
time sharp in any state, i.e., they do not have simultaneous eigenstates, this does not follow<br />
from the inequality (IV. 41).<br />
Only if the right - hand side of inequality (IV. 41) is nonzero for all states does the Robertson<br />
inequality represent the uncertainty principle. This is the case if the commutator is a multiple<br />
of unity, as in the case of P and Q, where $[P, Q] = -i\hbar\,\mathbb{1}$, see p. 78, (IV. 1). It can, however, be<br />
proved that this canonical commutation relation can only apply to unbounded operators<br />
having no eigenstates in the, inevitably infinite dimensional, Hilbert space in which they act.<br />
(vi) Already in 1929 E.U. Condon pointed out the following facts (Jammer 1974, p. 71). In certain<br />
states, non - commuting operators can both be sharp. Take, for example, the ground state of the<br />
H - atom, or any stationary state with total angular momentum l = 0. This is also an eigenstate<br />
of L x , L y and L z with eigenvalue 0. Therefore, ∆L x ∆L y = 0, and likewise for L x and L z ,<br />
and for L y and L z , although these operators do not mutually commute. Therefore, the fact that<br />
operators do not commute does not guarantee an uncertainty relation. Furthermore, sometimes<br />
both uncertainties are nonzero although the right - hand side of (IV. 41) vanishes. Take again a stationary state of the H - atom,<br />
with l = 1 and m = 0. In that state ⟨[L x , L y ]⟩ = 0, whereas ∆L x ≠ 0 and ∆L y ≠ 0.<br />
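The l = 1, m = 0 example in remark (vi) can be made concrete with the standard 3 × 3 matrices for $L_x$ and $L_y$ in the<br />
basis m = +1, 0, −1. The sketch below uses units with ħ = 1 (an assumption of this illustration) and hypothetical helper names.<br />

```python
import math

S = 1 / math.sqrt(2)
# l = 1 angular momentum matrices in the basis m = +1, 0, -1 (hbar = 1 assumed)
L_X = [[0, S, 0], [S, 0, S], [0, S, 0]]
L_Y = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]
PSI_M0 = [0, 1, 0]   # the state with l = 1, m = 0


def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def inner(u, v):
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))


def spread(op, psi):
    """Standard deviation of the Hermitian operator op in the state psi."""
    mean = inner(psi, apply(op, psi)).real
    shifted = [x - mean * p for x, p in zip(apply(op, psi), psi)]
    return math.sqrt(inner(shifted, shifted).real)


def commutator_mean(a, b, psi):
    """<psi|[a, b]|psi> for Hermitian a and b."""
    return inner(apply(a, psi), apply(b, psi)) - inner(apply(b, psi), apply(a, psi))


print("Delta L_x    =", spread(L_X, PSI_M0))
print("Delta L_y    =", spread(L_Y, PSI_M0))
print("<[L_x, L_y]> =", commutator_mean(L_X, L_Y, PSI_M0))
```

Both spreads are nonzero while the expectation value of the commutator vanishes, so (IV. 41) gives only the trivial bound 0 here.<br />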
In conclusion, there are fundamental objections against accepting the Schrödinger inequality, and<br />
by implication against the weaker inequalities which follow from it, to be the mathematical expression<br />
of Heisenberg’s uncertainty principle.<br />
And this is not everything yet.<br />
IV. 5. 3<br />
SINGLE SLIT EXPERIMENT<br />
Relations (IV. 26) and (IV. 41) are considered to be the mathematical expression of the uncertainty<br />
principle in most textbooks on quantum mechanics. In addition to the previous criticism, we<br />
will show that this view is, remarkably enough, inconsistent with the experiments used as illustrations<br />
of this principle (Uffink and Hilgevoord 1985, 1988 and Hilgevoord and Uffink 1988, 1990).<br />
Consider the deflection of light, or of electrons, by a single slit in an absorbing screen, an example<br />
Heisenberg also gives. Take for the wave function representing the particles passing through the<br />
screen with the slit a simple square wave function, see figure IV. 6,<br />
$$ \psi_{ss}(q) = \begin{cases} \dfrac{1}{\sqrt{2a}} & \text{if } |q| \leq a \\ 0 & \text{elsewhere,} \end{cases} $$ (IV. 43)<br />
where 2 a ∈ R + is the width of the slit, and q the Cartesian coordinate parallel to the screen and<br />
perpendicular to the slit.
Figure IV. 6: The probability distribution $|\psi_{ss}(q)|^2$ in position for a slit of width 2 a<br />
The Fourier transform of ψ ss is<br />
$$ \tilde\psi_{ss}(p) = \sqrt{\frac{a}{\pi\hbar}}\; \frac{\sin(ap/\hbar)}{ap/\hbar}. $$ (IV. 44)<br />
The square of this wave function, | ˜ψ ss (p)| 2 , has the same form as the diffraction pattern for the slit<br />
which is formed on a photographic plate placed far away, see figure IV. 7.<br />
Figure IV. 7: The diffraction pattern $|\tilde\psi_{ss}(p)|^2$ for a small slit of width 2 a<br />
For the standard deviation of position and momentum in the state ψ ss we find<br />
$$ (\Delta_{\psi_{ss}} Q)^2 = \int_{\mathbb{R}} q^2 |\psi_{ss}(q)|^2\, dq = \frac{1}{2a} \int_{-a}^{+a} q^2\, dq = \tfrac{1}{3} a^2 $$ (IV. 45)<br />
and<br />
$$ (\Delta_{\psi_{ss}} P)^2 = \int_{\mathbb{R}} p^2 |\tilde\psi_{ss}(p)|^2\, dp = \frac{\hbar}{\pi a} \int_{\mathbb{R}} |\sin(ap/\hbar)|^2\, dp = \infty, $$ (IV. 46)<br />
yielding<br />
$$ \Delta_{\psi_{ss}} Q \; \Delta_{\psi_{ss}} P = \tfrac{1}{3}\sqrt{3}\, a \cdot \infty. $$ (IV. 47)<br />
This indeed satisfies the Kennard inequality (IV. 26), but in a rather uninteresting manner.<br />
Although ∆ ψss P = ∞, the function | ˜ψ ss | 2 has in fact a very pronounced central peak, of a width<br />
of the order a −1 , in which 95% of the total probability is located. It is the inverse proportionality<br />
of the width of this central peak to the width of the slit, which, according to Heisenberg, illustrates<br />
the uncertainty principle; it is impossible to make the probability densities |ψ ss (q)| 2 and | ˜ψ ss (p)| 2<br />
arbitrarily small at the same time.<br />
But this conclusion can not be inferred from the Kennard inequality (IV. 26). If a goes to infinity,<br />
| ˜ψ ss (p)| 2 goes to the delta function δ (p). The standard deviation ∆ ψss P , however, remains<br />
divergent. In other words, 95% of a probability distribution can be concentrated on an arbitrarily<br />
small interval, whereas the standard deviation of the distribution remains arbitrarily large. 2 If nothing<br />
is given concerning the distributions |ψ ss (q)| 2 and | ˜ψ ss (p)| 2 but the Kennard inequality (IV. 26),<br />
these distributions could both be very narrow, and, consequently, Heisenberg’s conclusion can not be<br />
derived from the Kennard inequality, in contrast to what is usually claimed.<br />
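The contrast between a concentrated distribution and a divergent standard deviation can be made quantitative with a<br />
short sketch. It uses the dimensionless variable u = a p / ħ (introduced here for convenience), in which $|\tilde\psi_{ss}|^2$<br />
becomes sin²u / (π u²), and compares the probability enclosed in the symmetric interval [−U, U] with the second moment truncated to it.<br />

```python
import math


def density(u):
    """|psi~_ss|^2 in the dimensionless variable u = a p / hbar; integrates to 1."""
    if u == 0.0:
        return 1.0 / math.pi
    return (math.sin(u) / u) ** 2 / math.pi


def enclosed_probability(u_max, steps=20000):
    """Midpoint-rule integral of the density over [-u_max, u_max]."""
    du = 2 * u_max / steps
    return sum(density(-u_max + (k + 0.5) * du) for k in range(steps)) * du


def truncated_second_moment(u_max):
    """Integral of u^2 * density over [-u_max, u_max], evaluated in closed form."""
    return (u_max - 0.5 * math.sin(2 * u_max)) / math.pi


for u_max in (10.0, 100.0):
    print(u_max, enclosed_probability(u_max), truncated_second_moment(u_max))
```

The enclosed probability saturates towards 1, while the truncated second moment keeps growing linearly with the cutoff U.<br />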
Nevertheless, Heisenberg’s conclusion is correct for the given example of the single slit. This<br />
raises the question if his statement is valid in general. What we are in fact interested in is a measure<br />
for the width of a probability distribution representing the width of the unweighted distribution.<br />
The most natural definition of such a measure is the smallest interval a fraction α ∈ [0, 1] of<br />
the total probability can be in, where, roughly, α = 0.95 is taken. If ρ is a probability density, the<br />
definition is<br />
$$ W_\alpha(\rho) := \min \Big\{\, |b - a| \;:\; [a, b] \subset \mathbb{R},\ \int_a^b \rho(x)\, dx = \alpha \Big\}. $$ (IV. 48)<br />
For position and momentum in quantum mechanics we define<br />
$$ W_\alpha(Q, \psi) := \min \Big\{\, |b - a| \;:\; [a, b] \subset \mathbb{R},\ \int_a^b |\psi(q)|^2\, dq = \alpha \Big\}, $$ (IV. 49)<br />
$$ W_\alpha(P, \psi) := \min \Big\{\, |b - a| \;:\; [a, b] \subset \mathbb{R},\ \int_a^b |\tilde\psi(p)|^2\, dp = \alpha \Big\}. $$ (IV. 50)<br />
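For a density sampled on a uniform grid, the width measure (IV. 48) can be approximated with a sliding window. The sketch<br />
below is one possible discretization: it looks for the shortest run of grid cells carrying at least a fraction α of the total<br />
probability, rather than exactly α, and the function name `minimal_width` is hypothetical.<br />

```python
import math


def minimal_width(xs, rho, alpha):
    """Length of the shortest grid interval holding at least `alpha` of the mass.

    xs: uniformly spaced grid, rho: density sampled on xs, 0 < alpha <= 1.
    """
    dx = xs[1] - xs[0]
    weights = [r * dx for r in rho]
    target = alpha * sum(weights)
    best = math.inf
    mass = 0.0
    left = 0
    for right, w in enumerate(weights):
        mass += w
        # Shrink the window from the left while it still holds enough mass.
        while mass - weights[left] >= target:
            mass -= weights[left]
            left += 1
        if mass >= target:
            best = min(best, (right - left + 1) * dx)
    return best


# Example: a standard normal density, chosen here only as a test case.
grid = [-8 + 0.001 * i for i in range(16001)]
gauss = [math.exp(-x * x / 2) / math.sqrt(2 * math.pi) for x in grid]
width_95 = minimal_width(grid, gauss, 0.95)
print("W_0.95 of a unit Gaussian:", width_95)
```

For a unimodal symmetric density such as this Gaussian, the shortest interval is the central one, so the result is close to 2 × 1.96 standard deviations.<br />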
The product of these measures also satisfies an uncertainty relation, as was shown for the first time by<br />
H.J. Landau and H.O. Pollak (1961), nota bene in a journal for industrial engineers of the American<br />
Bell Telephone Company,<br />
$$ W_\alpha(P, \psi)\, W_\alpha(Q, \psi) \geq c_\alpha, $$ (IV. 51)<br />
where $\alpha \in (\tfrac{1}{2}, 1]$, and $c_\alpha > 0$ is a constant which only depends on α, not on ψ.<br />
2 Responsible for this phenomenon is the mathematical fact that the standard deviation assigns a quadratically increasing<br />
weight to the tails of a distribution. In a Gaussian distribution, e.g. the Gaussian wave packet (IV. 32), these tails go to zero<br />
rapidly enough because an exponential power goes to zero more rapidly than any polynomial goes to infinity, but for many<br />
wave functions occurring in physics the standard deviation diverges.
From this inequality it follows that the probability densities of position and momentum cannot<br />
simultaneously be made arbitrarily narrow, in the sense that a fraction α is concentrated on an arbitrarily<br />
small interval. Thus, 34 years after the birth of the uncertainty principle, what everyone had assumed<br />
to follow from the standard uncertainty relations was finally proven.<br />
For the square wave function $\psi_{ss}$ (IV. 43) and its Fourier transform (IV. 44) we find<br />
$$ W_\alpha(Q, \psi_{ss}) \simeq a \quad\text{and}\quad W_\alpha(P, \psi_{ss}) \simeq \frac{\hbar}{a}, $$ (IV. 52)<br />
so that the product is of the order of magnitude of $\hbar$.<br />
IV. 5. 4<br />
TIME AND ENERGY<br />
In the same article in which Heisenberg (1927) introduces the uncertainty relation for position<br />
and momentum, he also discusses the uncertainty relation between time and energy, starting from the<br />
‘well - known’ equation Et − tE = ih. This equation has caused many problems.<br />
If t is taken to be the universal time parameter, the spectrum of the operator t must be the real axis.<br />
But then the commutation relation can only be satisfied by an energy operator of which the spectrum<br />
is the real axis also. On the other hand, we know that the energy spectrum of quantum mechanical<br />
systems is generally bounded from below and can even be totally or partially discrete. Hence, the<br />
conclusion was soon drawn that there is no time operator in quantum mechanics (Von Neumann 1932,<br />
Pauli 1933). In the light of the existence of a position operator and with the theory of relativity in<br />
mind it was felt that in quantum mechanics something strange was going on with ‘time’. This is<br />
expressed in almost all textbooks and articles concerning this subject. Nevertheless, it has to do with<br />
a conceptual confusion which has not been noticed for a remarkably long time.<br />
As it happens, the comparison between q and t is faulty if t is understood to be a universal time<br />
parameter. After all, q is a dynamic variable of a specific physical system, for example of a particle,<br />
and therefore there are a lot of q’s in a multiple particle system. There is, however, only one time<br />
parameter. This does not belong to a certain physical system but must be put on a par with the<br />
universal position coordinates x, y, z, with which it is linked in the theory of relativity. The time<br />
coordinate t is no more an operator in quantum mechanics than these position coordinates are. Only the<br />
dynamic variables of physical systems can be operators, and the problem outlined above is therefore<br />
a pseudo - problem.<br />
Nevertheless, one can wonder if dynamic variables exist which are just as ‘timelike’, literally<br />
speaking, as q is ‘positionlike’. The answer is affirmative. Such variables exist in systems we call<br />
‘clocks’, think, for example, of the position or the orientation of the hand of a clock. But also very<br />
simple, microscopic systems can have such variables. In quantum mechanics these dynamic time<br />
variables become operators. They occur in specific systems and therefore they are not universal.<br />
And, similar to other dynamic variables, generally the spectrum of such time operators in quantum<br />
mechanics is not the entire real axis (see further J. Hilgevoord 2002).
IV. 5. 5<br />
DOUBLE SLIT EXPERIMENT<br />
Even more interesting is the famous interference experiment with the double slit. The wave function<br />
corresponding to particles passing through the screen with the slits is, in analogy with (IV. 43),<br />
$$ \psi_{ds}(q) = \begin{cases} \dfrac{1}{2\sqrt{a}} & \text{if } q \in [-A - a, -A + a] \cup [A - a, A + a] \\ 0 & \text{elsewhere,} \end{cases} $$ (IV. 53)<br />
where 2a is the width of each slit, 2A is the distance between the slits, and A ≫ a, see figure IV. 8.<br />
Figure IV. 8: The probability distribution $|\psi_{ds}(q)|^2$ in position for a double slit, 2 a is the width of each slit and<br />
2 A the distance between the slits<br />
The Fourier transform of this double square wave function ψ ds is<br />
$$ \tilde\psi_{ds}(p) = \sqrt{\frac{2a}{\pi\hbar}}\; \cos\!\left(\frac{Ap}{\hbar}\right) \frac{\sin(ap/\hbar)}{ap/\hbar}. $$ (IV. 54)<br />
The function | ˜ψ ds | 2 again has the same form as the interference pattern for the slits on a photographic<br />
plate placed far away, as can be seen in figure IV. 9.<br />
Figure IV. 9: The interference pattern $|\tilde\psi_{ds}(p)|^2$ for the double slit<br />
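The closed form (IV. 54) can be compared with a direct numerical evaluation of the Fourier integral (IV. 27) for the<br />
double square wave function (IV. 53). The sketch below assumes ħ = 1 and the arbitrary illustrative values a = 0.5 and A = 5,<br />
and it takes the normalization $1/(2\sqrt{a})$, so that $\psi_{ds}$ has unit norm over the total slit width 4 a.<br />

```python
import math

HBAR = 1.0      # units with hbar = 1 (assumption of this sketch)
A_SLIT = 0.5    # half-width a of each slit (illustrative value)
A_SEP = 5.0     # half-distance A between the slit centres (illustrative value)


def psi_tilde_numeric(p, steps=2000):
    """Midpoint-rule evaluation of (IV. 27) for the double slit wave function."""
    amplitude = 1.0 / (2.0 * math.sqrt(A_SLIT))
    dq = 2 * A_SLIT / steps
    total = 0.0
    for centre in (-A_SEP, A_SEP):
        for k in range(steps):
            q = centre - A_SLIT + (k + 0.5) * dq
            # psi_ds is real and even, so only the cosine part survives.
            total += math.cos(p * q / HBAR) * dq
    return amplitude * total / math.sqrt(2 * math.pi * HBAR)


def psi_tilde_closed(p):
    """Closed form (IV. 54)."""
    x = A_SLIT * p / HBAR
    sinc = 1.0 if x == 0.0 else math.sin(x) / x
    return math.sqrt(2 * A_SLIT / (math.pi * HBAR)) * math.cos(A_SEP * p / HBAR) * sinc


for p in (0.0, 0.3, 1.0, 2.5):
    print(p, psi_tilde_numeric(p), psi_tilde_closed(p))
```

The two columns agree to within the accuracy of the quadrature; the rapid cosine factor produces the interference lines and the slow sinc factor their overall envelope.<br />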
Now there are, however, two parameters playing a role. The distance of the slits A is a measure for<br />
the total width of |ψ ds (q)| 2 , the ‘enveloping’ cosine factor in (IV. 54), while the width of the slits a is a<br />
measure for the ‘fine structure’ of this probability density. For | ˜ψ ds (p)| 2 the roles have reversed, A −1<br />
is a measure for the width of the interference lines, while a −1 is a measure for the total width of the<br />
interference pattern. This shows the well - known fact that the width of the interference lines and the<br />
distance between the slits are inversely proportional. In a moment we will see that Bohr’s discussion<br />
of the double slit experiment exactly rests on this fact.<br />
◃ Remark<br />
Consider the measures<br />
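$$ \Delta_{\psi_{ds}} Q \simeq A \quad\text{and}\quad \Delta_{\psi_{ds}} P = \infty, $$ (IV. 55)<br />
$$ W_\alpha(Q, \psi_{ds}) \simeq A \quad\text{and}\quad W_\alpha(P, \psi_{ds}) \simeq \frac{\hbar}{a}. $$ (IV. 56)<br />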
None of these measures gives the fine structure. Therefore, Bohr’s Copenhagen reasoning, treated<br />
in the next subsection, cannot be based on the Kennard inequality (IV. 26) nor on the inequality of<br />
Landau and Pollak (IV. 51). ▹<br />
EXERCISE 30. Verify the calculations (IV. 55) and (IV. 56).<br />
IV. 5. 6<br />
A NEW UNCERTAINTY MEASURE<br />
Bohr’s reasoning concerning the double slit experiment goes as follows. A way to determine<br />
through which slit the particle has gone is measuring the recoil in the q - direction that the screen<br />
experiences at the passage of this particle. To this end the screen must be able to move in the q - direction.<br />
Instead of a fixed screen we take therefore a screen that is suspended from a spring, as can be<br />
seen in figure IV. 10. The incoming momentum p is perpendicular to the screen.<br />
We assume conservation of kinetic energy, i.e. a heavy screen, which means that only the direction<br />
of the momentum changes. Consequently, a particle arriving at position q of the photographic<br />
plate, gives a recoil to the screen of, assuming r ≫ A and therefore sin θ ≈ tan θ,<br />
$$ \left( \frac{q \pm A}{r} \right) p, $$ (IV. 57)<br />
depending on which slit it has gone through. To be able to measure the difference in recoil, it must<br />
hold for the inaccuracy δP with which the momentum of the screen was known in advance, that<br />
$$ \delta P < \frac{2 A p}{r}. $$ (IV. 58)<br />
Figure IV. 10: Moving screen. The sketch shows slits 1 and 2 at a distance 2 A from each other, the distance r to the<br />
photographic plate, and the deflection angles θ₁ and θ₂, with q = r tan θ₁ + A = r tan θ₂ − A.<br />
Because of the inequality<br />
$$ \delta P\, \delta Q \gtrsim \hbar, $$ (IV. 59)<br />
the inaccuracy δQ with which the position Q of the screen was known then satisfies<br />
$$ \delta Q > \frac{\hbar r}{2 A p}. $$ (IV. 60)<br />
But the width of the interference lines on the photographic plate is<br />
$$ \frac{\lambda r}{2 A} = \frac{\hbar r}{2 A p}, $$ (IV. 61)<br />
where $\lambda = \hbar / p$ is the de Broglie wavelength of the electron. Bohr therefore concludes that the uncertainty<br />
in the position of the screen will result in the erasure of the interference pattern.<br />
◃ Remarks<br />
First, we see that Bohr applies the uncertainty principle to the screen, which means that he treats this<br />
macroscopic body quantum mechanically. Second, he uses the uncertainty principle in a qualitative<br />
manner; in particular, he does not give a definition of the uncertainties δP and δQ. Third, the relevant<br />
uncertainty in Q is of the order of magnitude of the width of the interference lines, which scales as A⁻¹. Bohr<br />
therefore has no use for the Kennard inequality (IV. 26) or the inequality of Landau and Pollak (IV. 51),<br />
which do not contain this width. Finally, Bohr does not show exactly how the erasure of the interference<br />
pattern takes place; evidently he considers it to be intuitively clear. ▹<br />
From the previous it should be clear that something is still lacking in the mathematical formulation<br />
of the uncertainty principle. One would hope that there may exist some direct relation between the
total width of a distribution in the p - language (q - language), and the fine structure of this distribution<br />
in the q - language (p - language) as exhibited by the wave function for the double slit, assuming that<br />
this relation has general validity. Indeed, such a relation has been found (Uffink and Hilgevoord 1985),<br />
$$w_\alpha(Q, \psi)\, W_\alpha(P, \psi) \geq C_\alpha \quad\text{and}\quad w_\alpha(P, \psi)\, W_\alpha(Q, \psi) \geq C_\alpha, \tag{IV. 62}$$<br />
where w α ( · , ψ) ∈ R + is a measure for the width of the fine structure of ψ, W α ( · , ψ) ∈ R + is<br />
the measure for the total width of ψ as introduced earlier, and C α > 0 is a constant depending<br />
on α ∈ (0, 1], but not on the state ψ.<br />
Illustratively, if W is taken as a measure of the size of the objective of a microscope and w as a<br />
measure of the fine structure of the image, the inequalities express the fact that the resolving power<br />
must decrease if the aperture is reduced. Likewise, the direction of incoming radiation can be determined<br />
better by using a long array of radio telescopes than by using a short one, etc. These inequalities<br />
thus express, among other things, the well-known fact in optics that the resolving power of an<br />
apparatus improves as the apparatus is larger.<br />
The inequalities (IV. 62) seem to solve the problem for Bohr. A closer consideration however<br />
tells us that W α (P, ψ) is not the suitable measure to express whether the difference in recoil can or<br />
cannot be observed. More precisely, W α (P, ψ) > 2Ap/r does not guarantee that this difference cannot be<br />
observed. W α (P, ψ) can be large in this experiment, which makes the inequality (IV. 62) ineffective.<br />
It is in fact questionable whether Bohr’s argument can be based on an uncertainty relation at all.<br />
Nevertheless, his conclusion is correct! The fact is that a direct calculation of the double slit<br />
experiment by D. Hauschildt, unpublished, shows that the intensity of the interference, in case the<br />
screen is movable, is proportional to the factor<br />
$$\left| \langle \chi |\, e^{\, i \frac{2 A p}{\hbar r} Q_{\mathrm{sc}}} \,| \chi \rangle \right|. \tag{IV. 63}$$<br />
Here |χ⟩ is the state of the screen and Q sc is the position operator of the screen. The state<br />
$$| \chi \rangle' := e^{\, i \frac{2 A p}{\hbar r} Q_{\mathrm{sc}}}\, | \chi \rangle \tag{IV. 64}$$<br />
is the state whose momentum spectrum is shifted by 2Ap/r with respect to the momentum spectrum<br />
of the state |χ⟩,<br />
$$\langle p \,|\, \chi' \rangle = \Big\langle p - \frac{2 A p}{r} \,\Big|\, \chi \Big\rangle. \tag{IV. 65}$$<br />
The factor (IV. 63) is, therefore, exactly the quantum mechanical expression describing to what extent<br />
the state of the screen after the recoil can be distinguished from the state of the screen before the<br />
recoil.<br />
If the momentum spectrum of |χ⟩ is broad with respect to 2Ap/r, the overlap (IV. 63) will be large,<br />
namely almost 1. In that case |χ⟩ and |χ⟩ ′ are difficult to distinguish and interference is large. If the<br />
momentum spectrum of |χ⟩ only contains peaks which are narrow with respect to 2Ap/r, then (IV. 63)<br />
is small. The states |χ⟩ and |χ⟩ ′ are then well distinguishable and interference is small. The essence<br />
of Bohr’s reasoning is therefore correct: to the extent that the screen can serve as a measuring apparatus<br />
to determine the slit a particle goes through, interference disappears. Whether this reasoning<br />
can be based on an uncertainty relation is unknown to this very day.
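The behaviour of the overlap factor can be made concrete with a small numerical sketch. It assumes, purely for illustration, that the screen state has a real Gaussian momentum amplitude with standard deviation `sigma_p`; the function name `overlap` and all numbers are hypothetical. In the momentum representation the overlap is the integral of χ(p) χ(p − shift) over p, where `shift` stands for the recoil difference 2Ap/r.

```python
import math


def overlap(sigma_p, shift, n=20001, span=10.0):
    """|<chi|chi'>| for a Gaussian momentum amplitude of standard deviation sigma_p,
    where chi' has its momentum spectrum shifted by `shift`."""
    half = span * sigma_p + abs(shift)          # half-width of the integration window
    dp = 2 * half / (n - 1)
    norm = (2 * math.pi * sigma_p ** 2) ** (-0.25)

    def chi(x):
        # Normalised amplitude: chi(x)**2 is a Gaussian probability density.
        return norm * math.exp(-x ** 2 / (4 * sigma_p ** 2))

    total = 0.0
    for k in range(n):
        p = -half + k * dp
        total += chi(p) * chi(p - shift)        # simple Riemann sum
    return abs(total * dp)


shift = 1.0                                     # recoil difference in arbitrary units (assumed)
broad = overlap(sigma_p=20.0, shift=shift)      # spectrum broad compared with the shift
narrow = overlap(sigma_p=0.05, shift=shift)     # spectrum narrow compared with the shift
print("broad spectrum: ", broad)
print("narrow spectrum:", narrow)
```

For this Gaussian choice the overlap is exp(−shift²/(8 sigma_p²)) in closed form, so the sketch reproduces both regimes described above: a value close to 1 for the broad spectrum and a value close to 0 for the narrow one.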
IV. 5. 7 INTERPRETATION<br />
The statistical interpretation of the uncertainty W α (A, ψ) in (IV. 62) is that it is a measure of<br />
the predictability of the outcome of a measurement, given a probability distribution; this is nothing but<br />
the usual statistical interpretation of the standard deviation. How we must understand this uncertainty<br />
physically depends directly on how we must understand quantum mechanical probabilities physically.<br />
We will discuss this at length further on.<br />
The number w α (A, ψ) is a measure for the distinguishability between the state ψ (probability<br />
distribution) and some other state (other probability distribution) when measuring quantity A corresponding<br />
to operator A. This is also nothing but the usual statistical interpretation of this measure.
V<br />
HIDDEN VARIABLES<br />
While we have thus shown that the wave function does not provide a complete description<br />
of the physical reality, we left open the question of whether or not such a description<br />
exists. We believe, however, that such a theory is possible.<br />
— Einstein, Podolsky and Rosen<br />
You may have already suspected that I still believe in the hidden variables hypothesis.<br />
[. . . ] Anyway, for me, the hidden variable hypothesis is still the best way to ease my<br />
conscience about quantum mechanics.<br />
— Gerard ’t Hooft<br />
In this chapter we get acquainted with so-called ‘hidden variable theories’ and the motivation<br />
to consider such theories. We examine whether it is possible to place such a ‘hidden variable theory’<br />
underneath quantum mechanics, in the way classical mechanics underlies classical statistical<br />
mechanics. We also treat the notorious impossibility theorems of Von Neumann and of Kochen<br />
and Specker.<br />
V. 1 HIDDEN REALITY<br />
Quantum mechanics is, roughly speaking, a theory about outcomes of measurements; about which<br />
values can be found upon measurement and about the probability of finding a specific value in such<br />
a measurement. Moreover, according to the Copenhagen perspective, this description is complete:<br />
there is nothing more to say about a physical system. As a consequence, quantum mechanics is<br />
exclusively concerned with the observable behaviour of measuring apparatuses.<br />
In the eyes of many authors, this is bizarre. In the entire history of physics we see that the aim<br />
of a theory has been to tell us something about how reality is organized, how to explain what we<br />
observe around us. Measuring is the pre-eminent scientific way to examine whether a given theory or<br />
hypothesis meets this aim, or to gather data to help us select theories. Measurement is not an aim, but<br />
a tool. The subject of physical theories, physical reality, does not occur in the quantum mechanical<br />
tale, in contrast to nearly all theories in classical physics.<br />
From this point of view, we could hope that quantum mechanics is some sort of cloak, which must<br />
be sustained by an underlying theory concerning physical reality. Because that underlying theory is<br />
hidden under the quantum mechanical cloak, we will speak of a hidden variable theory.<br />
So, let us examine the matter not from the viewpoint of quantum mechanics, but from ‘physical<br />
reality’, taking as a working hypothesis that something like a ‘physical reality’ exists. The behavior of
radioactive atomic nuclei, as discussed in the Introduction, p. 7, suggests that individual nuclei differ<br />
from each other: they show various life spans and emit α - particles with different momenta. The<br />
natural idea is that this difference in behavior has a cause, which can be found in mutually differing<br />
properties of the physical states of the individual nuclei. Quantum mechanics does not give us these<br />
differences, but perhaps a description of the state exists that goes beyond what quantum mechanics tells us.<br />
We would like such an additional description to show us how the phenomena observed at an<br />
individual nucleus follow decisively from the state of that nucleus. Such a description requires extra<br />
variables in comparison with the quantum mechanical description. It is conceivable that not all of<br />
these variables are accessible to our present, and possibly future, possibilities of observation. They<br />
are ‘hidden’ from us, but they must exist to explain the observed differences. If they exist, then<br />
quantum mechanical states correspond to probability distributions over the states described by these<br />
variables. These probability distributions would only express our ignorance concerning the exact<br />
physical states. In this respect, the situation would be entirely analogous to that in classical statistical<br />
mechanics. EPR believed that it must in principle be possible to construct such a theory.<br />
Such an attempt, interpreting quantum mechanics as a statistical theory about an underlying physical<br />
reality, is what is called a hidden variable theory, HVT for short, the support under the quantum<br />
mechanical cloak. Assuming that quantum mechanics is empirically adequate, we will examine if it<br />
is possible in principle to found this description on a HVT.<br />
An important distinction between several types of HVT’s concerns the question whether the hidden<br />
variables describing the physical state of the system can depend on which quantity of the system is<br />
measured. Theories in which this is permitted are called contextual; they will be discussed in<br />
section V. 4. For the moment, we concentrate on the simpler case where this is not permitted,<br />
the non - contextual theories, to be discussed in section V. 2.<br />
Another important division has to do with determinism. Although it is the objective of a HVT to<br />
supplement or complete the quantum mechanical description of a physical system, this does not imply<br />
that with this supplement the precise future behavior of this system can be entirely predicted; it is<br />
conceivable that the HVT too merely determines probabilities of possible events. In that case we speak<br />
of an indeterministic, or stochastic, HVT. In this chapter we will discuss only deterministic HVT’s,<br />
but we will come back to stochastic HVT’s in chapter VII.<br />
V. 2 NON - CONTEXTUAL HIDDEN VARIABLES<br />
Let us try to reconstruct quantum mechanics in analogy with classical statistical mechanics. We<br />
assume a space Λ analogous to the phase space Γ known from statistical physics, which we have<br />
already met in section III. 2. An arbitrary ‘point’ in that space Λ is indicated with λ. We do not impose<br />
any restriction in advance on the mathematical form of λ. The variable λ can represent anything,<br />
for example a single real variable, an infinite - dimensional vector field, complex functionals, etc. The<br />
possibilities are endless; the only restriction will be that a probability measure can be defined on Λ. It<br />
is possible to also incorporate the quantum mechanical state as a component in the specification of λ.<br />
Speaking about a ‘classical’ statistical model here does not mean that the HVT must look like<br />
classical mechanics, let alone that λ specifies the position and momentum of the particles, although<br />
we do not exclude that as a possibility.
In the HVT, a pure physical state corresponds to a single ‘point’ λ ∈ Λ. We assume that the<br />
system is always in one of these states λ ∈ Λ, even though we do not know in which one. A general,<br />
mixed state is a probability distribution over Λ. For any given λ every physical quantity A has an<br />
exact value, denoted by A[λ], which is revealed upon measurement of A, and therefore a physical<br />
quantity A can be represented as a real function on this space, A : Λ → R.<br />
Furthermore, every quantity represented by quantum mechanics has to have a counterpart in the<br />
HVT. If such a quantity corresponds to the function A : Λ → R, the values A [λ] can take are<br />
the eigenvalues of the self - adjoint operator A : H → H which, according to quantum mechanics,<br />
corresponds to quantity A.<br />
It is also required that every quantum mechanical state can be represented in the HVT; for every<br />
state operator W there must be a corresponding probability distribution ρ W over Λ. It is, however,<br />
not necessary that pure quantum states correspond to pure hidden variable states, the idea being that<br />
the HVT allows for a more detailed, complete description of the system. Neither is it necessary that<br />
every probability distribution on Λ corresponds to a state operator, the HVT could easily be a theory<br />
richer than quantum mechanics.<br />
The requirement that the HVT has to reproduce the empirical statements of quantum mechanics<br />
is now expressed in the requirement that the expectation values of quantity A belonging to a physical<br />
system in a physical state, corresponding in the HVT to ρ W , and in quantum mechanics to W , coincide,<br />
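$$\langle A \rangle_{\rho_W} := \int_\Lambda A[\lambda]\, \rho_W(\lambda)\, d\lambda = \operatorname{Tr} A W, \tag{V. 1}$$<br />
where ρ W : Λ → [0, ∞) is a probability density,<br />
$$\int_\Lambda \rho_W(\lambda)\, d\lambda = 1. \tag{V. 2}$$<br />
For a pure state |ψ⟩, (V. 1) reduces to<br />
$$\int_\Lambda A[\lambda]\, \rho_\psi(\lambda)\, d\lambda = \langle \psi | A | \psi \rangle. \tag{V. 3}$$<br />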
In the discrete case the integrals are replaced by summations.<br />
Summary<br />
A non - contextual HVT is any theory meeting the following requirements.<br />
(i) Every physical state of a physical system corresponds to a probability distribution ρ over Λ.<br />
This is the state postulate.<br />
(ii) Every physical quantity A corresponds to a function A : Λ → R, λ ↦→ A[λ]. This is the<br />
observables postulate.
(iii) The range of A : Λ → R coincides with the spectrum of the self - adjoint operator A which,<br />
according to quantum mechanics, corresponds to quantity A.<br />
The expectation value of A when the physical system is in the state ρ W which, according<br />
to quantum mechanics, corresponds to the state operator W , equals the quantum mechanical<br />
expression for the expectation value<br />
$$\langle A \rangle_{\rho_W} := \int_\Lambda A[\lambda]\, \rho_W(\lambda)\, d\lambda = \operatorname{Tr} A W.$$<br />
We will call this last requirement (iii) the reproduction criterion.<br />
Since all probabilities in quantum mechanics can be written as Tr PW , with P ∈ P(H), it follows<br />
that all probability distributions in quantum mechanics coincide with the corresponding probability<br />
distributions in the HVT.<br />
We can now ask whether it is possible to construct a HVT satisfying the above requirements. The<br />
answer is that it is indeed possible, even in a quite trivial way, by choosing Λ large enough. We<br />
illustrate this by means of a simple example.<br />
Suppose there are only three quantities A, B, C, with possible values {a 1 }, {b 1 , b 2 }, {c 1 , c 2 } and<br />
represented by functions A, B, C : Λ → R. The possible value combinations are<br />
(a 1 , b 1 , c 1 ), (a 1 , b 1 , c 2 ), (a 1 , b 2 , c 1 ), (a 1 , b 2 , c 2 ). (V. 4)<br />
We now construct a space Λ by identifying every value combination with a point of Λ. If we denote<br />
these points by λ 1 , λ 2 , λ 3 and λ 4 , then<br />
A[λ 1 ] = a 1 , B[λ 3 ] = b 2 , C [λ 4 ] = c 2 , etc. (V. 5)<br />
When there are more quantities, we extend Λ correspondingly.<br />
We have to introduce a probability measure<br />
$$\mu : \mathcal{F}(\Lambda) \to [0, 1] \quad\text{with}\quad \sum_j \mu(\lambda_j) = 1 \tag{V. 6}$$<br />
such that (V. 1) is satisfied. In our case Λ is discrete and consists of four points only, as a result of<br />
which the integral (V. 1) becomes a sum. For example, to quantity B it must apply that<br />
$$\operatorname{Tr} B W = \sum_{j=1}^{4} B[\lambda_j]\, \mu_W(\lambda_j) = b_1 \big( \mu_W(\lambda_1) + \mu_W(\lambda_2) \big) + b_2 \big( \mu_W(\lambda_3) + \mu_W(\lambda_4) \big). \tag{V. 7}$$<br />
This is satisfied by<br />
$$\mu_W(a_i, b_j, c_k) = (\operatorname{Tr} P_{a_i} W)\, (\operatorname{Tr} P_{b_j} W)\, (\operatorname{Tr} P_{c_k} W), \tag{V. 8}$$<br />
where P ai is the projector on the subspace corresponding to the eigenvalue a i of A, etc. Indeed,<br />
according to quantum mechanics<br />
$$B = b_1 P_{b_1} + b_2 P_{b_2}, \tag{V. 9}$$<br />
and therefore<br />
$$\operatorname{Tr} B W = b_1 \operatorname{Tr} P_{b_1} W + b_2 \operatorname{Tr} P_{b_2} W, \tag{V. 10}$$<br />
while, with<br />
$$P_{a_1} = \mathbb{1}, \quad P_{b_1} + P_{b_2} = \mathbb{1}, \quad P_{c_1} + P_{c_2} = \mathbb{1}, \tag{V. 11}$$<br />
we have<br />
$$\mu_W(\lambda_1) + \mu_W(\lambda_2) = \mu_W(a_1, b_1, c_1) + \mu_W(a_1, b_1, c_2) = (\operatorname{Tr} P_{a_1} W)\, (\operatorname{Tr} P_{b_1} W)\, (\operatorname{Tr} P_{c_1} W + \operatorname{Tr} P_{c_2} W) = \operatorname{Tr} P_{b_1} W. \tag{V. 12}$$<br />
Likewise we find<br />
$$\mu_W(\lambda_3) + \mu_W(\lambda_4) = \operatorname{Tr} P_{b_2} W. \tag{V. 13}$$<br />
Therefore, (V. 7) has been satisfied, and the same applies to the expectation values of A and C.<br />
If we have, in general, the quantities A, B, C, . . . , F , with values a i , b j , c k , . . . , f l , where<br />
i = 1, . . . , n A , j = 1, . . . , n B , etc., the measure<br />
µ W (a i , b j , c k , . . . , f l ) = Tr P ai W Tr P bj W Tr P ck W · · · Tr P fl W, (V. 14)<br />
satisfies requirement (V. 3) for all quantities. For example, the probability of finding for quantity A<br />
the value a i is<br />
$$\mathrm{Prob}_{\mu_W}(A : a_i) = \sum_{j, k, \ldots, l} \mu_W(a_i, b_j, c_k, \ldots, f_l) = \operatorname{Tr} P_{a_i} W, \tag{V. 15}$$<br />
because each of the other factors sums to 1 over its own index. Here we have the required quantum mechanical result. Kochen<br />
and Specker (1967) showed how to formulate this idea in the case of an infinite number of physical<br />
quantities.<br />
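The product construction can be tried out on a small example. The sketch below is only an illustration under stated assumptions: the system is a qubit in a real pure state with a hypothetical angle `theta`, quantity A is a multiple of the identity with the single assumed value 2.0, and B and C are taken to be σ z and σ x. The names `quantities`, `Lambda`, `mu` and `hvt_expectation` are introduced here and do not occur in the notes.

```python
import itertools
import math


def projector_prob(vec, psi):
    """Tr(P W) for P = |vec><vec| and W = |psi><psi|, with real unit vectors."""
    return sum(v * s for v, s in zip(vec, psi)) ** 2


theta = 0.3                                   # hypothetical state parameter
psi = (math.cos(theta), math.sin(theta))      # real pure qubit state
r2 = 1 / math.sqrt(2)

# Each quantity is a list of (value, eigenvector); None stands for the identity projector.
quantities = {
    "A": [(2.0, None)],                                   # A = a1 * identity (assumed value)
    "B": [(+1.0, (1.0, 0.0)), (-1.0, (0.0, 1.0))],        # sigma_z
    "C": [(+1.0, (r2, r2)), (-1.0, (r2, -r2))],           # sigma_x
}


def prob(eigvec):
    return 1.0 if eigvec is None else projector_prob(eigvec, psi)


names = list(quantities)
# One point of Lambda per value combination, as in the construction above.
Lambda = list(itertools.product(*(quantities[name] for name in names)))
# Product measure: one factor Tr(P W) per quantity.
mu = [math.prod(prob(vec) for _, vec in point) for point in Lambda]


def hvt_expectation(name):
    """Sum over Lambda of (value of the quantity at lambda) * mu(lambda)."""
    k = names.index(name)
    return sum(point[k][0] * weight for point, weight in zip(Lambda, mu))


print("points in Lambda:", len(Lambda), " total measure:", sum(mu))
print("B:", hvt_expectation("B"), " quantum value:", math.cos(2 * theta))
print("C:", hvt_expectation("C"), " quantum value:", math.sin(2 * theta))
```

For this state the quantum mechanical expectation values of σ z and σ x are cos 2θ and sin 2θ, and the product measure reproduces both, although it treats B and C as statistically independent.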
This solution of the completeness problem is, however, not very interesting physically. It can be<br />
seen from the factorizable probabilities in (V. 8) that all quantities are treated here as being statistically<br />
independent which is not in agreement with physical practice. Some quantities are functions of<br />
other quantities, e.g., kinetic energy is a function of momentum, E kin = p²/2m, while other quantities<br />
are linked to two or more other quantities, such as kinetic, potential and total energy, E = E kin + E pot .<br />
In the just outlined HVT we have ignored such links.
To illustrate this we assume that in our example C = A+B so that c 1 = a 1 +b 1 and c 2 = a 1 +b 2 .<br />
Now the possible value combinations in the HVT are<br />
(a 1 , b 1 , a 1 + b 1 ), (a 1 , b 1 , a 1 + b 2 ), (a 1 , b 2 , a 1 + b 1 ), (a 1 , b 2 , a 1 + b 2 ), (V. 16)<br />
and we see that (A + B) [λ] is not equal to A [λ] + B [λ] for every λ; for b 1 ≠ b 2 it fails, for example,<br />
at the point (a 1 , b 1 , a 1 + b 2 ). Nevertheless, the HVT succeeded in reproducing, by construction, all<br />
quantum mechanical expectation values; in other words, the HVT reproduces the relation<br />
⟨ψ | A + B | ψ⟩ = ⟨ψ | A | ψ⟩ + ⟨ψ | B | ψ⟩, (V. 17)<br />
without requiring<br />
(A + B)[λ] = A[λ] + B[λ]. (V. 18)<br />
If we required (V. 18), Λ would consist only of the points (a 1 , b 1 , a 1 + b 1 ) and (a 1 , b 2 , a 1 + b 2 ),<br />
which is, of course, a strong restriction.<br />
In the very first proof of the impossibility of a HVT, that is, of the insolubility of the completeness<br />
problem, given by Von Neumann (1932), the requirement (V. 18) was indeed imposed on the HVT.<br />
Von Neumann required (V. 18) for every hidden variable state, in particular also for pure hidden<br />
variable states, which means that (V. 18) must apply to all λ ∈ Λ. We don’t need to discuss Von<br />
Neumann’s elaborate proof of this claim in detail, since J.S. Bell (1966) has shown this impossibility<br />
by means of a very simple example.<br />
Since the values of A[λ] etc. have to be the eigenvalues of the corresponding operators, it can be<br />
seen immediately that this requirement cannot be satisfied in general. Consider for example the Pauli<br />
matrices<br />
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \quad\text{and}\quad \sigma_x + \sigma_y = \begin{pmatrix} 0 & 1 - i \\ 1 + i & 0 \end{pmatrix}. \tag{V. 19}$$<br />
The eigenvalues of σ x and σ y are ±1, but the eigenvalues of σ x + σ y are ± √ 2, and therefore (V. 18)<br />
cannot be satisfied.<br />
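Bell's counterexample is easy to check numerically. The following sketch uses the closed formula for the eigenvalues of a 2 × 2 Hermitian matrix; the helper name `hermitian_2x2_eigenvalues` is introduced here for illustration only.

```python
import math


def hermitian_2x2_eigenvalues(m):
    """Eigenvalues (ascending) of a 2x2 Hermitian matrix [[a, b], [conj(b), d]]."""
    a, b, d = complex(m[0][0]), complex(m[0][1]), complex(m[1][1])
    mean = (a.real + d.real) / 2
    radius = math.sqrt(((a.real - d.real) / 2) ** 2 + abs(b) ** 2)
    return (mean - radius, mean + radius)


sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
sigma_sum = [[sigma_x[i][j] + sigma_y[i][j] for j in range(2)] for i in range(2)]

ev_x = hermitian_2x2_eigenvalues(sigma_x)
ev_y = hermitian_2x2_eigenvalues(sigma_y)
ev_sum = hermitian_2x2_eigenvalues(sigma_sum)

# Every value that a sum of one eigenvalue of sigma_x and one of sigma_y could take.
candidate_sums = {x + y for x in ev_x for y in ev_y}

print("eigenvalues of sigma_x + sigma_y:", ev_sum)
print("possible sums of individual values:", sorted(candidate_sums))
```

The sums of the individual values can only be −2, 0 or 2, none of which is an eigenvalue of σ x + σ y.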
Bell argued that the requirement (V. 18) is physically unreasonable. For instance, measuring<br />
σ x ,σ y and σ x +σ y requires three different measurement apparatuses, for example three Stern - Gerlach<br />
magnets in three different orientations. There is absolutely no reason to assume that an algebraical<br />
link would exist between the individual outcomes of these measurements. The fact that in quantum<br />
mechanics the relation (V. 17) exists for pure states, even in case A and B do not commute, must be<br />
considered as a particular property of quantum mechanics.<br />
Since the requirement (V. 18) is unreasonably strong, one can wonder whether there are other,<br />
reasonable, requirements which can be imposed on a HVT in order to find acceptable solutions of the<br />
completeness problem. This brings us to the next section.
V. 3 KOCHEN AND SPECKER’S THEOREM<br />
As we already proved in section II. 4, p. 28, in quantum mechanics the next theorem holds: if the<br />
operators A, B, C, . . . commute, there is a maximal operator O of which they are a function,<br />
A = f (O), B = g(O), etc. (V. 20)<br />
A measuring procedure for A, B, C, . . . would be to measure O and apply the function relation to the<br />
result in order to find the values for A, B, C, . . . Kochen and Specker (1967, p. 64) call the quantities<br />
corresponding to A, B, C, . . . commeasurable.<br />
Now it seems reasonable to require, as Von Neumann did, that the HVT also has this structure, i.e.,<br />
for B, C : Λ → R, if B = f (C), it follows that B[λ] = f ( C [λ] ) , or<br />
f (C)[λ] = f ( C [λ] ) . (V. 21)<br />
This function rule, (V. 21), yields the so - called sum rule for commuting operators,<br />
[A, B] = 0 =⇒ (A + B)[λ] = A[λ] + B[λ], (V. 22)<br />
since, with O again the maximal operator of which A and B are a function, A = f (O), B = g(O),<br />
implying<br />
(A + B) = h(O) with h = f + g, (V. 23)<br />
from (V. 21) it then follows in this HVT that<br />
$$(A + B)[\lambda] = h(O)[\lambda] = h\big(O[\lambda]\big) = f\big(O[\lambda]\big) + g\big(O[\lambda]\big) = f(O)[\lambda] + g(O)[\lambda] = A[\lambda] + B[\lambda]. \tag{V. 24}$$<br />
EXERCISE 31. Prove, again using (V. 21), the product rule for commuting operators,<br />
[A, B] = 0 =⇒ (A B)[λ] = A[λ] B[λ]. (V. 25)<br />
Now we will see how the requirement, (V. 21), which at first sight is eminently reasonable, nevertheless<br />
renders a HVT of quantum mechanics impossible.<br />
THEOREM :<br />
A HVT satisfying the requirements (i) - (iii), p. 111, and the function rule (V. 21), does<br />
not exist if dim H > 2.
Proof<br />
Consider a complete collection of mutually orthogonal projectors P 1 , . . . ,P N on an N - dimensional<br />
Hilbert space. Such projectors mutually commute; [P i , P j ] = 0. An arbitrary sum of such projectors<br />
over some subset ∆ ⊂ {1, . . . , N} is again a projector,<br />
$$\sum_{i \in \Delta} P_i = P_\Delta \in \mathcal{P}(H). \tag{V. 26}$$<br />
Therefore, according to the sum rule (V. 22) it has to hold that<br />
$$\sum_{i \in \Delta} P_i[\lambda] = P_\Delta[\lambda]. \tag{V. 27}$$<br />
But the values P i [λ] are the eigenvalues of the operators P i , therefore they are 0 or 1, and likewise<br />
for P ∆ [λ]; these values also follow from (V. 21). In particular, taking ∆ = {1, . . . , N}, we find<br />
$$\sum_{i=1}^{N} P_i[\lambda] = \mathbb{1}[\lambda] = 1.$$<br />
But then the value assignment P i [λ] to the projectors satisfies the requirements for a probability<br />
measure on P (H), i.e.<br />
µ λ (P i ) := P i [λ] ∈ {0, 1} (V. 28)<br />
is a normalized, additive mapping on the subspaces of H. According to Gleason’s theorem, p. 47,<br />
this probability measure can always be written as<br />
µ λ (P i ) = Tr P i W λ , (V. 29)<br />
for a certain state operator W λ , provided that dim H > 2. There is, however, a contradiction<br />
between (V. 29) and (V. 28). The measure (V. 29) is continuous; a small change of the direction<br />
of P i induces a small change of µ(P i ). The measure (V. 28) is however necessarily discontinuous<br />
because µ(P i ) can only have the values 0 and 1.<br />
The conclusion has to be that a value assignment to quantities satisfying (V. 21), and therefore<br />
(V. 27), is impossible. As a consequence, a HVT of this type is not possible. □<br />
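The continuity argument can be illustrated numerically for one particular state operator. The sketch below assumes, as a hypothetical example, the pure state W = |e1⟩⟨e1| on a real 3 - dimensional space and rotates a unit vector from e1 to the orthogonal vector e2. The function names are introduced here for illustration.

```python
import math


def gleason_measure(v, W):
    """Tr(P_v W) = <v|W|v> for a real unit vector v and a real symmetric state operator W."""
    n = len(v)
    return sum(v[i] * W[i][j] * v[j] for i in range(n) for j in range(n))


# Hypothetical pure state W = |e1><e1| on a real 3-dimensional space.
W = [[1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]]


def largest_jump(steps):
    """Rotate v from e1 to e2 in `steps` equal steps and return
    (largest change of the measure between neighbouring directions, all values)."""
    values = []
    for k in range(steps + 1):
        t = (math.pi / 2) * k / steps
        values.append(gleason_measure((math.cos(t), math.sin(t), 0.0), W))
    jump = max(abs(b - a) for a, b in zip(values, values[1:]))
    return jump, values


for steps in (10, 100, 1000):
    jump, values = largest_jump(steps)
    print(steps, "steps: largest jump", jump, " start", values[0], " end", values[-1])
```

The measure runs from 1 at e1 to 0 at e2 while the largest jump between neighbouring directions shrinks with the step size. A value assignment restricted to 0 and 1 would have to jump by 1 somewhere along the same path, whatever the step size.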
In this proof we used Gleason’s theorem, which is difficult to prove, and his own proof is not very<br />
transparent. Direct proofs of the impossibility of this value assignment have also been given.<br />
Bell (1966) and Kochen and Specker (1967) were the first to prove this in general, i.e., for dim H > 2<br />
and for all states; see also Belinfante (1973). We will not discuss these proofs in detail but restrict<br />
ourselves to a number of observations. Before we do so, we formulate Kochen and Specker’s theorem.<br />
KOCHEN AND SPECKER’S THEOREM :<br />
It is not possible to assign values to all physical quantities of an arbitrary physical system,<br />
with a Hilbert space of dim > 2, in accordance with function rule (V. 21).
Sketch of the direct proof<br />
We can formulate the problem as follows. Consider as a particular case of (V. 26) a resolution of<br />
identity into 1 - dimensional projectors,<br />
$$P_1 + P_2 + \cdots + P_n = \mathbb{1}. \tag{V. 30}$$<br />
According to (V. 21), and hence (V. 22), the following must hold<br />
$$P_1[\lambda] + P_2[\lambda] + \cdots + P_n[\lambda] = \mathbb{1}[\lambda] = 1 \tag{V. 31}$$<br />
for every resolution of identity. Consider the 1 - dimensional projectors on H as lines in all possible<br />
directions through the origin of H. Now assign to all lines the value 0 or 1, such that the sum<br />
of the values of each complete set of orthogonal lines is 1. Alternatively, consider the points of<br />
intersection of these lines with the surface of the unit sphere in H. To each point of the sphere the<br />
value 0 or 1 is assigned, antipodal points are assigned the same value, and the sum of the values<br />
of the points of intersection of an orthogonal basis with the surface of the sphere is 1.<br />
If this problem is soluble in a complex H, it is also soluble in a real H with the same dimension.<br />
To see this, choose a basis in H and generate, by application of real orthogonal transformations,<br />
a structure which is isomorphic to a real H. Therefore, we can restrict ourselves to proving the<br />
impossibility of the requested value assignment in a real H.<br />
Furthermore, the impossibility in H N implies the impossibility in H N+1 . This can be shown<br />
by considering the N - dimensional subspace which is orthogonal to a line having value 0. Each<br />
orthogonal (N + 1) - tuple of which this line is a part then turns into an N - tuple with a correct<br />
value assignment. In other words, if it is possible in an (N +1) - dimensional H, it is also possible<br />
in an N - dimensional H and, therefore, we only have to consider a real H with a dimension as<br />
low as possible.<br />
Notice that the problem for a 2 - dimensional Hilbert space H 2 does have a solution, see for<br />
example the diagram V. 1.<br />
Figure V. 1: A solution for dim H = 2 (directions labelled with the values 0 and 1)<br />
All proofs therefore aim at the case of a real, 3 - dimensional Hilbert space H 3 . It immediately<br />
seems plausible that the requested value assignment in H 3 is not possible: to each point of<br />
the unit sphere in R 3 with value 1 belong infinitely many points having value 0, namely the equator<br />
of which that point is a pole. On the other hand, of each orthogonal triad of points only two points<br />
have the value 0. But this is, of course, not a proof.<br />
Bell (1966, pp. 450, 451) showed that points with different values cannot be arbitrarily close.<br />
This is an independent proof of the continuity of the measure, and therefore contrary to the necessary<br />
discontinuity of (V. 28).
Kochen and Specker (1967, p. 69) explicitly constructed a set of 117 spin quantities for which no<br />
consistent value assignment exists. This construction is depicted on the cover of Redhead (1987)<br />
and can be seen in figure V. 2 a. It shows that every value assignment in accordance with function<br />
rule (V. 21) leads to contradictions.<br />
Kochen and Conway only needed 31 quantities in the so - called Peres cube of 33 points (Peres 1993).<br />
This construction is depicted in figure V. 2 b. □<br />
Figure V. 2: a) Kochen - Specker diagram (Redhead 1987) b) Conway - Kochen diagram (Tkadlec 2000)<br />
Figure V. 3: M.C. Escher, Waterfall. Consider the 3 interpenetrating cubes on the top of the<br />
left pillar. Each cube has 4 lines from the mutual center to its vertices, 6 lines to the centers of<br />
its edges, and 3 lines to the centers of its faces. Three of the lines are shared by all three cubes,<br />
giving 3 · (4 + 6 + 3 ) − 6 = 33 lines. These are Peres’ vectors. (Text Meyer 2003 )<br />
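The impossibility of a value assignment on these 33 lines can be checked by an exhaustive search. The sketch below is an illustration, not a reproduction of any published proof. It assumes the common parametrisation of the 33 directions as all permutations and sign changes of (0, 0, 1), (0, 1, 1), (0, 1, √2) and (1, 1, √2), which agrees with the cube construction in the caption above. It imposes two rules: two orthogonal lines never both carry the value 1, and every orthogonal triad inside the set carries exactly one 1. The function names are hypothetical.

```python
import itertools
import math


def peres_rays():
    """Directions from (0,0,1), (0,1,1), (0,1,sqrt2), (1,1,sqrt2) by permutations
    and sign changes; v and -v count as the same line."""
    s = math.sqrt(2)
    rays = []
    for base in [(0, 0, 1), (0, 1, 1), (0, 1, s), (1, 1, s)]:
        for perm in set(itertools.permutations(base)):
            for signs in itertools.product((1, -1), repeat=3):
                v = tuple(c * sg for c, sg in zip(perm, signs))
                same_line = any(
                    all(abs(a - b) < 1e-9 for a, b in zip(v, w))
                    or all(abs(a + b) < 1e-9 for a, b in zip(v, w))
                    for w in rays
                )
                if not same_line:
                    rays.append(v)
    return rays


def find_value_assignment(rays):
    """Backtracking search for values in {0, 1}. Returns a dict index -> value
    (lines left out of the dict may be given 0), or None if no assignment exists."""
    n = len(rays)
    orth = [[abs(sum(a * b for a, b in zip(rays[i], rays[j]))) < 1e-9
             for j in range(n)] for i in range(n)]
    triads = [t for t in itertools.combinations(range(n), 3)
              if orth[t[0]][t[1]] and orth[t[0]][t[2]] and orth[t[1]][t[2]]]

    def solve(values):
        open_triads = []
        for t in triads:
            vals = [values.get(i) for i in t]
            if all(x == 0 for x in vals):
                return None                     # a triad without any 1 cannot be repaired
            if 1 not in vals:
                open_triads.append(t)
        if not open_triads:
            return values
        # Branch on the triad with the fewest undecided lines.
        t = min(open_triads, key=lambda tr: sum(values.get(i) is None for i in tr))
        for i in t:
            if values.get(i) is not None:
                continue
            trial = dict(values)
            trial[i] = 1
            consistent = True
            for j in range(n):
                if orth[i][j]:
                    if trial.get(j) == 1:
                        consistent = False
                        break
                    trial[j] = 0                # lines orthogonal to a 1 must carry 0
            if consistent:
                result = solve(trial)
                if result is not None:
                    return result
        return None

    return solve({})


rays = peres_rays()
print("number of lines:", len(rays))
print("assignment for the 33 lines:", find_value_assignment(rays))
axes = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print("assignment for a single triad:", find_value_assignment(axes))
```

A single orthogonal triad on its own admits an assignment, whereas the search over the full set comes back empty. This is the finite obstruction that the direct proofs exploit.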
It is interesting to see what the measure (V. 29), according to Von Neumann the probability measure<br />
of quantum mechanics, looks like in this case. For a pure state W = |ψ⟩ ⟨ψ|, with P i = |χ⟩ ⟨χ|<br />
the measure (V. 29) is<br />
$$\mu(P_i) = \operatorname{Tr} P_i W = \langle \psi | P_i | \psi \rangle = |\langle \chi | \psi \rangle|^2 \tag{V. 32}$$<br />
so that in a real space we have<br />
$$\mu(P_i) = |\langle \chi | \psi \rangle|^2 = \cos^2 \theta, \tag{V. 33}$$<br />
with θ the angle between |ψ⟩ and |χ⟩, see figure V. 4.<br />
Figure V. 4: µ(P i ) = cos² θ<br />
In the appendix of these lecture notes, p. 183, ff., we will prove that, if we assign to each point<br />
of the upper half of a unit sphere a non - negative real number such that 1 is assigned to the ‘north<br />
pole’, 0 is assigned to the ‘equator’ and the sum of the values of each orthogonal triad in this half<br />
sphere is 1, there is only one possible value assignment and that is the quantum mechanical one,<br />
i.e., in accordance with cos² θ.<br />
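That the quantum mechanical assignment satisfies the triad condition can be verified directly. In the sketch below ψ is the north pole, and the orthogonal triad is built from an arbitrarily chosen, hypothetical direction (1, 2, 2); the helper functions are introduced for illustration only. Since a line and its antipode carry the same value, it does not matter that one member of the triad may point into the lower half sphere.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)


psi = (0.0, 0.0, 1.0)                    # the 'north pole'
chi1 = normalize((1.0, 2.0, 2.0))        # hypothetical first direction of the triad
chi2 = normalize(cross(chi1, psi))       # orthogonal to chi1, lies on the 'equator'
chi3 = cross(chi1, chi2)                 # completes the orthogonal triad

values = [dot(chi, psi) ** 2 for chi in (chi1, chi2, chi3)]   # cos^2 of the angle with psi
print("values:", values, " sum:", sum(values))
```

The three values are 4/9, 0 and 5/9; their sum is 1, as Pythagoras' theorem guarantees for any orthogonal triad.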
◃ Remarks<br />
First, illustrations of Kochen and Specker’s theorem are easy to find for Hilbert spaces of dimension<br />
larger than 3, for example 8, in which case a handful of quantities suffices, see Mermin (1993). We<br />
will come back to that in section VII. 6. Second, when restricted to rational angles between spin<br />
vectors, no contradiction with quantum mechanics can be obtained, as D.A. Meyer (1999) proved. ▹<br />
V. 3. 1 SUMMARY<br />
According to Kochen and Specker’s theorem, a HVT satisfying the state postulate and the observables<br />
postulate, p. 111 (i) and (ii), together with the function rule (V. 21), is contradictory to the state<br />
postulate and the observables postulate of quantum mechanics if dim H > 2, although for Hilbert<br />
spaces with dim H ≤ 2 it is possible. This conclusion shows how stringent the vector space structure<br />
of quantum mechanics is; in particular, the fact that there are many different decompositions of<br />
unity forms a serious barrier for a HVT.<br />
V. 4 CONTEXTUAL HIDDEN VARIABLES<br />
Essential for Kochen and Specker’s proof is the fact that a 1 - dimensional projector can be part<br />
of several decompositions of unity. This is possible as long as the projectors are not maximal, i.e.,<br />
if dim H > 2. The existence of degenerate projectors, apart from unity, is essential for the proof of<br />
Kochen and Specker, and for this reason it does not hold in a 2-dimensional H, where all projectors<br />
except $\mathbb{1}$ are maximal. By means of degenerate projectors, non-commuting operators also become<br />
connected to each other. By the requirement (V. 21) this is transferred to the quantities of the HVT, so
that via a detour we still impose a requirement for non - commeasurable quantities on the HVT. We<br />
will consider this in detail now.<br />
Suppose that operator $A$ commutes with the maximal operators $C_1$ and $C_2$, while $[C_1, C_2] \neq 0$.<br />
Then we have<br />
$$A = f(C_1) \quad\text{and}\quad A = g(C_2), \tag{V. 34}$$<br />
which implies<br />
$$f(C_1) = g(C_2), \tag{V. 35}$$<br />
and we see that $A$ is degenerate. Function rule (V. 21) leads to the same relation between the quantities<br />
of the HVT,<br />
$$A[\lambda] = f\big(C_1[\lambda]\big) \quad\text{and}\quad A[\lambda] = g\big(C_2[\lambda]\big), \tag{V. 36}$$<br />
yielding<br />
$$f\big(C_1[\lambda]\big) = g\big(C_2[\lambda]\big). \tag{V. 37}$$<br />
Again, this is a relation between the value assignments to quantities which do not commute in quantum<br />
mechanics, but the relation is not one - to - one, the functions f and g are not bijective.<br />
It can be argued that such a requirement is unreasonable, because such quantities are not<br />
commeasurable. In other words, the structure of quantum mechanics, and particularly the proposition<br />
that an operator can be a function of two non - commuting maximal operators, leads to relations<br />
between quantities which cannot be measured in one single experiment.<br />
The following is what occurs at the different decompositions of unity. Consider two bases, {|α j ⟩}<br />
and {|β j ⟩}, in a Hilbert space H of dimension N > 2 and suppose that |α 1 ⟩ = |β 1 ⟩, while all other<br />
basis vectors are different. Then we have<br />
$$\sum_{j=1}^{N} P_{|\alpha_j\rangle} = \mathbb{1} = \sum_{j=1}^{N} P_{|\beta_j\rangle} \quad\text{and}\quad P_{|\alpha_1\rangle} = P_{|\beta_1\rangle}. \tag{V. 38}$$<br />
Define, as follows, two maximal operators with all coefficients $c_j$ and $d_j$ distinct,<br />
$$C := \sum_{j=1}^{N} c_j\, P_{|\alpha_j\rangle} \quad\text{and}\quad D := \sum_{j=1}^{N} d_j\, P_{|\beta_j\rangle}, \tag{V. 39}$$<br />
then it follows that<br />
$$P_{|\alpha_1\rangle} = f(C) = g(D). \tag{V. 40}$$<br />
This leads to a connection between the non - commuting operators C and D, and using (V. 21)<br />
this leads to a connection between the corresponding representations C[λ] and D[λ] in the HVT. It is<br />
this type of relations which the HVT cannot satisfy.
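The construction of (V. 38) to (V. 40) can be checked numerically in a small case. The sketch below takes N = 3 with real vectors, lets the second basis be the first one rotated about the shared vector, and builds f and g as Lagrange polynomials that equal 1 on the eigenvalue belonging to the shared vector and 0 on the others. The rotation angle, the coefficient values and all function names are assumptions made for the illustration; the notes do not prescribe a particular f or g.

```python
import math

def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def axpy(A, B, s=1.0):
    """Return A + s * B entrywise."""
    return [[a + s * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def projector(v):
    """Projector |v><v| on a real unit vector v."""
    return [[vi * vj for vj in v] for vi in v]

def max_abs(A):
    return max(abs(a) for row in A for a in row)

def maximal_operator(coeffs, basis):
    """sum_j coeffs[j] * P_|basis[j]>, cf. (V. 39)."""
    n = len(basis)
    M = [[0.0] * n for _ in range(n)]
    for cj, v in zip(coeffs, basis):
        M = axpy(M, projector(v), cj)
    return M

def indicator_of(M, eigenvalues, target):
    """Evaluate at M the Lagrange polynomial that is 1 at `target`
    and 0 at every other eigenvalue of M."""
    n = len(M)
    result = identity(n)
    for ev in eigenvalues:
        if ev != target:
            shifted = axpy(M, identity(n), -ev)            # M - ev * 1
            factor = [[x / (target - ev) for x in row] for row in shifted]
            result = matmul(result, factor)
    return result

t = math.pi / 5                                            # arbitrary rotation angle
alpha = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
beta = [[1.0, 0.0, 0.0],                                   # shared vector: beta_1 = alpha_1
        [0.0, math.cos(t), math.sin(t)],
        [0.0, -math.sin(t), math.cos(t)]]
c = [1.0, 2.0, 3.0]                                        # distinct eigenvalues of C
d = [4.0, 5.0, 7.0]                                        # distinct eigenvalues of D

C = maximal_operator(c, alpha)
D = maximal_operator(d, beta)
P1 = projector(alpha[0])

commutator = axpy(matmul(C, D), matmul(D, C), -1.0)
f_of_C = indicator_of(C, c, c[0])
g_of_D = indicator_of(D, d, d[0])

print("max |[C, D]|    :", max_abs(commutator))
print("max |f(C) - P1| :", max_abs(axpy(f_of_C, P1, -1.0)))
print("max |g(D) - P1| :", max_abs(axpy(g_of_D, P1, -1.0)))
```

The first number is clearly non-zero while the other two vanish up to rounding, so the same projector is a function of two operators that do not commute with each other.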
◃ Remark<br />
Notice that the occurrence of non-maximal operators $P_{|\alpha_i\rangle}$ is indeed essential: if $P_{|\alpha_i\rangle}$ were<br />
maximal, $C$ and $D$ would commute, as we saw in section II. 4 on p. 30. M.J. Maczynski (1971) has<br />
proved that if we exclusively consider maximal quantities, and therefore we would apply (V. 21) to<br />
maximal quantities only, Kochen and Specker’s theorem is no longer valid, and in that case a HVT is<br />
possible. ▹<br />
An obvious expedient is to strictly constrain requirement (V. 21) to quantities which are measurable<br />
within one context. In our example the projector P |α1 ⟩ is commeasurable with both C and D,<br />
while mutually C and D are not commeasurable. Therefore, we have to distinguish between a value<br />
assignment P |αi ⟩[λ] within the context of a measurement of C, and one within the context of a measurement<br />
of D. We can think, for example, of a measurement of C and application of the function relation<br />
P |α1 ⟩ = f(C), or of a measurement of D and application of the function relation P |α1 ⟩ = g(D).<br />
More generally, suppose<br />
A = f (C) = g(D) where [C, D] ≠ 0. (V. 41)<br />
Then we distinguish the hidden variable quantities A C [λ] and A D [λ], where the index indicates the<br />
context of measurement. If C and D do not commute there is, according to a contextual HVT, no<br />
reason to assume that for all λ ∈ Λ it holds that<br />
A C [λ] = A D [λ], (V. 42)<br />
as is the case in every HVT we have considered so far.<br />
Kochen and Specker do assume (V. 42), however, and find a contradiction with quantum mechanics.<br />
The remedy is therefore to ‘split up’ all degenerate quantities by addition of the context in which<br />
they are measured, as was first proposed by B.C. van Fraassen (1973). For the sake of convenience<br />
we here assume that a measurement of a degenerate quantity always proceeds by means of the measurement<br />
of a maximal quantity, which does not have to be split up. By definition we then have<br />
A C [λ] = f ( C [λ] ) and A D [λ] = g ( D[λ] ) . (V. 43)<br />
This yields a weaker form of (V. 21). Suppose A = f (C), B = g(C) and A = h(B) = h(g(C)),<br />
then using (V. 43) we have<br />
A C [λ] = h ( B C [λ] ) . (V. 44)<br />
This consideration leads to a new postulate for a HVT, which, in case the HVT accommodates this<br />
postulate, we call contextual.<br />
CONTEXTUAL OBSERVABLES POSTULATE:<br />
If A is a physical quantity which can be taken as a function of at least two other physical<br />
quantities, for example A = f (C) and A = g (D), then, in the HVT, to A corresponds<br />
a function A C : Λ → R iff quantity C is measured, and a function A D : Λ → R iff<br />
quantity D is measured. If A, f(C) and g(D) are the corresponding quantum mechanical<br />
operators, the following applies,<br />
∀ λ ∈ Λ : A C [λ] = A D [λ] ⇐⇒ [C, D] = 0. (V. 45)
Although splitting up quantities is a natural consequence of the idea of commeasurability, it means<br />
giving up a one - to - one relation between the quantities of quantum mechanics and those of the HVT in<br />
a very drastic manner; since the operator P |α1 ⟩ is part of infinitely many decompositions of unity, there<br />
are infinitely many contexts in which P |α1 ⟩ can be measured.<br />
The idea that the context of the measurement must be taken into consideration can already be<br />
found in Bell (1966). In this article, which was actually written earlier than his famous article with the<br />
Bell inequality, Bell makes some observations concerning the requirements which could be imposed<br />
on a contextual HVT. The hidden variables have to have a spatial meaning and enable us to interpolate a space-time<br />
picture, preferably causally, between the preparation and the measurement of states.<br />
He then considers Bohm’s theory of the quantum potential, see chapter VI, and shows that this<br />
theory is not local. He wonders if every HVT of quantum mechanics must have this non - local character<br />
(Bell 1966, p. 452),<br />
However, it must be stressed that, to the present writer’s knowledge, there is no proof that<br />
any hidden variable account of quantum mechanics must have this extraordinary character.<br />
It would therefore be interesting, perhaps, to pursue some further “impossibility<br />
proofs,” replacing the arbitrary axioms objected to above by some condition of locality,<br />
or of separability of distant systems.<br />
Meanwhile, still before the delayed publication of his article, Bell (1964) himself had found such a<br />
proof.<br />
Now we will show how the idea of locality can be brought to expression in a contextual HVT with<br />
‘split’ quantities. Consider a composite system with Hilbert space H = H I ⊗ H II and an operator of<br />
the form A ⊗ 11 where A is maximal in H I . Then the operator A ⊗ 11 is not maximal in H, and<br />
A ⊗ 11 = f (X), (V. 46)<br />
where X is some maximal operator on H. Especially consider an X of the form<br />
X = X I ⊗ X II . (V. 47)<br />
Suppose there is no interaction, or not anymore, between the systems I and II. Then we can raise the<br />
question if X II must be taken to belong to the context of A ⊗ 11.<br />
Consider a second maximal operator<br />
Y = X I ⊗ Y II (V. 48)<br />
which only differs from X in the last factor. We then have<br />
A ⊗ 11 = f (X) = g(Y ). (V. 49)<br />
A requirement of locality is now that<br />
$$(A \otimes \mathbb{1})_{X_I \otimes X_{II}}[\lambda] = (A \otimes \mathbb{1})_{X_I \otimes Y_{II}}[\lambda], \tag{V. 50}$$<br />
in other words, a change in what is measured on system II does not result in a splitting of<br />
quantities of system I. A contextual HVT satisfying (V. 50) is called local.
The key question is if a local contextual HVT is compatible with quantum mechanics. As an<br />
example we consider Bohm’s version of the thought experiment of EPR (Cooke and Hilgevoord 1979);<br />
two spin 1/2 particles being in a singlet state. Measurements of the spin of each of the particles<br />
correspond to operators of the form σ i ⊗ τ j , where σ i is the operator of the component of the spin of<br />
the first particle in the direction i and τ j is, likewise, the operator for the second particle. In contrast<br />
to the previously considered operators of the form X I ⊗ X II , the operators σ i ⊗ τ j are not maximal.<br />
Let us consider three directions, i, j ∈ {1, 2, 3}, which means there are nine such measurements.<br />
The result of a measurement of spin is either up or down, and consequently every measurement has<br />
four possible outcomes. If we introduce a quantity in the HVT for each of the nine quantities, we can,<br />
as we saw, reproduce the quantum mechanical predictions. Between the operators the relation<br />
σ i ⊗ τ j = (σ i ⊗ 11) (11 ⊗ τ j ), with i, j ∈ {1, 2, 3} (V. 51)<br />
holds. Now we also have to introduce quantities in the HVT for the six operators σ i ⊗ 11 and 11 ⊗ τ j .<br />
In an autonomous HVT the quantities must also satisfy (V. 51), because the factors on the right -<br />
hand side of (V. 51) commute. This means that there are only six independent quantities in the<br />
HVT and it can be shown that with this the experimental predictions of quantum mechanics can not<br />
be reproduced, see Wigner’s derivation in VII. 3.<br />
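The operator relations used in this argument can be checked directly. The sketch below assumes the standard Pauli matrices as the representation of both $\sigma_i$ and $\tau_j$ (a choice made here for illustration) and verifies, for all nine pairs, that (V. 51) holds, that the two factors commute, and that each $\sigma_i \otimes \tau_j$ squares to unity with vanishing trace, so that its eigenvalues $\pm 1$ are each doubly degenerate and the operator is not maximal. The helper names are hypothetical.

```python
def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
I4 = kron(I2, I2)
PAULI = {                       # assumed representation of the spin components
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

product_rule_holds = True       # sigma_i (x) tau_j == (sigma_i (x) 1)(1 (x) tau_j)
factors_commute = True          # the two factors on the right commute
joint_is_degenerate = True      # square is unity and trace is 0: eigenvalues +1,+1,-1,-1

for i in (1, 2, 3):
    for j in (1, 2, 3):
        left = kron(PAULI[i], I2)
        right = kron(I2, PAULI[j])
        joint = kron(PAULI[i], PAULI[j])
        product_rule_holds = product_rule_holds and close(matmul(left, right), joint)
        factors_commute = factors_commute and close(matmul(left, right),
                                                    matmul(right, left))
        trace = sum(joint[k][k] for k in range(4))
        squares_to_one = close(matmul(joint, joint), I4)
        joint_is_degenerate = (joint_is_degenerate and squares_to_one
                               and abs(trace) < 1e-12)

print(product_rule_holds, factors_commute, joint_is_degenerate)
```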
In a contextual HVT, however, we consider the quantities $\sigma_i \otimes \mathbb{1}$ and $\mathbb{1} \otimes \tau_j$ to be dependent on<br />
the context of the operators of which they are functions. Let $\chi(\tau_j)$ be a function which assigns the<br />
value 1 to the outcome of every spin measurement $\tau_j$,<br />
$$\chi(\tau_j) = \mathbb{1}, \quad\text{with } j \in \{1, 2, 3\}. \tag{V. 52}$$<br />
We then have<br />
$$(\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] = \big(\sigma_i \otimes \chi(\tau_j)\big)[\lambda]. \tag{V. 53}$$<br />
This quantity represents the spin of particle 1 within the context of a measurement of σ i ⊗ τ j ,<br />
which is a measurement of both spins followed by multiplication of the results. Since j ∈ {1, 2, 3},<br />
this gives a 3 - fold splitting of the quantity σ i ⊗ 11. The product rule now only applies to quantities<br />
in the same context, and the validity is trivial in this case. There are enough independent quantities in<br />
the HVT again to be able to reproduce quantum mechanics. The splitting worked out.<br />
But at the same time we see the price we have to pay; the splitting does not satisfy the weak<br />
requirement of locality (V. 50), because for j ≠ j ′ we make a distinction between the quantities<br />
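$$(\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] \quad\text{and}\quad (\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_{j'}}[\lambda]. \tag{V. 54}$$<br />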
This means that properties, quantities having values, of the one particle can no longer be specified<br />
independent of those of the other particle, even if there is no interaction between these particles and<br />
they are located in different galaxies. Redhead (1987, p. 135) speaks of an ontological contextuality.<br />
The conclusion is that a contextual HVT has to be non - local to be compatible with quantum<br />
mechanics.<br />
◃ Remark<br />
Notice that we did not speak of a measurement of the quantity σ i ⊗ 11. We have invariably seen
this as being derived from the measurement of an operator of which it is a function. In this way the<br />
maximal operators eventually acquire a special status, they are not being split up and they are the<br />
only operators which can be measured directly. This can be assumed theoretically, but the relation<br />
with the experimental practice in the laboratory, where almost exclusively degenerated quantities are<br />
measured, is less clear. ▹
VI BOHMIAN MECHANICS<br />
My suggestion is that at each state the proper order of operation of the mind requires<br />
an overall grasp of what is generally known, not only in formal, logical, mathematical<br />
terms, but also intuitively, in images, feelings, poetic usage of language, etc.<br />
— David Bohm<br />
But why then had Born not told me of this “pilot wave?” If only to point out what was<br />
wrong with it? [. . . ] Why is the pilot wave picture ignored in text books? Should it not be<br />
taught, not as the only way, but as an antidote to the prevailing complacency? To show<br />
that vagueness, subjectivity, and indeterminism, are not forced on us by experimental<br />
facts, but by deliberate theoretical choice?<br />
— John Bell<br />
We briefly describe Bohm’s hidden variables theory, which we will call Bohmian mechanics.<br />
Bohmian mechanics seems to have the same empirical strength as quantum mechanics, but succeeds<br />
in providing an image in space and time of what exactly takes place in micro-physical reality.<br />
VI. 1 INTRODUCTION<br />
The debate between Bohr and Einstein concerning the interpretation of quantum mechanics<br />
reached its peak in the 1935 EPR article. Although both authors frequently returned to the problems,<br />
neither of them afterwards introduced new elements into his point of view. For most<br />
physicists in the nineteen thirties and later it was not difficult to declare a winner of the debate:<br />
Bohr’s view was accepted nearly unanimously. The question whether a physical reality hides behind<br />
quantum mechanics, which consists of objects having properties and of which we can form a<br />
picture in space and time, was put aside. It was also thought that Von Neumann’s proof, as discussed<br />
in V. 2, p. 114, made a hidden variables reconstruction of quantum mechanics untenable.<br />
It is the merit of Bohm to have made a breach in the Copenhagen interpretation for the first time,<br />
by doing exactly what was impossible or meaningless according to the Copenhageners. In 1952<br />
he published two articles in which he presented a HVT of quantum mechanics. In the second article<br />
he describes the breach as follows (Bohm 1952 part II, p. 188)<br />
The usual interpretation of the quantum theory implies that we must renounce the possibility<br />
of describing an individual system in terms of a single precisely defined conceptual<br />
model. We have, however, proposed an alternative interpretation which does not imply
such a renunciation, but which instead leads us to regard a quantum - mechanical system<br />
as a synthesis of a precisely definable particle and a precisely definable ψ - field which<br />
exerts a force on this particle.<br />
Bohm’s theory is strongly related to ideas which Louis de Broglie already put forward at the<br />
Solvay Conference in 1927. However, criticism from the Copenhageners at the conference, especially<br />
expressed by Pauli, made de Broglie abandon his theory, which was indeed not completely and<br />
consistently developed. Bohm devised, independently of de Broglie, a fully elaborated version,<br />
which brought de Broglie back to his original ideas.<br />
We will study Bohm’s theory because it is an example of a concrete HVT, in contrast to the abstract<br />
characterization of such theories which we discussed in the previous chapter. We will see that Bohm’s<br />
theory shows remarkable aspects which differ thoroughly from classical physics.<br />
VI. 2 THE QUANTUM POTENTIAL<br />
Bohm’s theory, which we will call Bohmian mechanics, starts from wave mechanics, i.e. quantum<br />
mechanics with $L^2(\mathbb{R}^n)$ as its Hilbert space, but without the projection postulate.<sup>1</sup> This means that<br />
Bohm assumes that there is a wave function ψ(⃗q, t) which always satisfies the Schrödinger equation.<br />
First we consider the 1 - particle case, if there are more particles, ψ has more arguments.<br />
The idea is to interpret this wave function as a statistical description of a particle which always has<br />
a certain position and momentum. We will see that this particle must then be subjected to dynamics<br />
which differs from classical dynamics, by assuming that the forces acting on the particle are not<br />
exclusively the forces known from classical physics.<br />
The basic assumption is the Schrödinger equation for a particle with mass $m$ in a time-independent<br />
potential $V(\vec q)$,<br />
$$i\hbar\, \frac{\partial \psi(\vec q, t)}{\partial t} = -\frac{\hbar^2}{2m}\, \nabla^2 \psi(\vec q, t) + V(\vec q)\, \psi(\vec q, t), \tag{VI. 1}$$<br />
but we will interpret the wave function differently from its usual interpretation in quantum mechanics.<br />
To this end, we rewrite $\psi$, with the help of two real functions $R, S : \mathbb{R}^4 \to \mathbb{R}$, as<br />
$$\psi(\vec q, t) = R(\vec q, t)\, e^{i S(\vec q, t)/\hbar}. \tag{VI. 2}$$<br />
It is always possible to find such functions $R$ and $S$. Requiring $R(\vec q, t) \geq 0$, $R$ and $S$ are, at given $\psi$,<br />
uniquely defined, except where $\psi = 0$. Substitution of (VI. 2) in (VI. 1), and separating the real and<br />
imaginary parts of the resulting equation, leads to two equations,<br />
$$\frac{\partial R(\vec q, t)}{\partial t} = -\frac{1}{2m} \Big( R(\vec q, t)\, \nabla^2 S(\vec q, t) + 2\, \nabla R(\vec q, t) \cdot \nabla S(\vec q, t) \Big), \tag{VI. 3}$$<br />
$$\frac{\partial S(\vec q, t)}{\partial t} = -\frac{\big(\nabla S(\vec q, t)\big)^2}{2m} - V(\vec q) + \frac{\hbar^2}{2m}\, \frac{\nabla^2 R(\vec q, t)}{R(\vec q, t)}. \tag{VI. 4}$$<br />
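The separation into real and imaginary parts can be sketched as follows. This is a worked derivation under the assumption that the exponent in (VI. 2) is $S/\hbar$, which is the form required for (VI. 3) to be free of Planck's constant; the arguments $(\vec q, t)$ are suppressed and $\partial_t$ abbreviates $\partial/\partial t$.

```latex
\begin{align*}
i\hbar\,\partial_t \psi
  &= \bigl( i\hbar\,\partial_t R - R\,\partial_t S \bigr)\, e^{iS/\hbar}, \\
\nabla^2 \psi
  &= \Bigl( \nabla^2 R + \frac{2i}{\hbar}\,\nabla R\cdot\nabla S
      + \frac{i}{\hbar}\,R\,\nabla^2 S - \frac{1}{\hbar^2}\,R\,(\nabla S)^2 \Bigr)\, e^{iS/\hbar}.
\end{align*}
% Insert both expressions into (VI. 1) and divide by e^{iS/\hbar}:
\begin{align*}
\text{imaginary part:}\quad
  \hbar\,\partial_t R &= -\frac{\hbar}{2m}\bigl( R\,\nabla^2 S + 2\,\nabla R\cdot\nabla S \bigr), \\
\text{real part:}\quad
  -R\,\partial_t S &= -\frac{\hbar^2}{2m}\,\nabla^2 R + \frac{1}{2m}\,R\,(\nabla S)^2 + V R.
\end{align*}
```

Dividing the first line by $\hbar$ gives (VI. 3); dividing the second by $-R$, which is allowed wherever $\psi \neq 0$, gives (VI. 4).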
<sup>1</sup> In the literature, ‘Bohmian mechanics’ is usually understood to mean a ‘streamlined’ version of Bohm’s original theory, without a<br />
quantum potential.
First we consider equation (VI. 3). Using the abbreviation $\rho = R^2$ this equation becomes<br />
$$\frac{\partial \rho(\vec q, t)}{\partial t} + \nabla \cdot \left( \rho(\vec q, t)\, \frac{\nabla S(\vec q, t)}{m} \right) = 0, \tag{VI. 5}$$<br />
where $\rho = R^2$ is equal to $|\psi|^2$, the quantum mechanical probability density for finding a particle<br />
at a certain position. This leads us to interpret $\rho(\vec q, t)$ as the probability density of finding<br />
the particle at time $t$ at position $\vec q \in \mathbb{R}^3$. If we now interpret $\nabla S(\vec q, t)$ as the momentum of the<br />
particle, $\nabla S = \vec p = m \vec v$, (VI. 5) acquires a clear meaning: it is the continuity equation for a probability<br />
density $\rho$, which expresses that the total probability, given by the integral of $\rho(\vec q, t)$ over $\mathbb{R}^3$, is<br />
constant in time.<br />
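The step from (VI. 3) to (VI. 5) is short enough to write out. The following is a sketch with the arguments $(\vec q, t)$ suppressed; the final remark about the surface term assumes that $\rho\,\nabla S$ falls off sufficiently fast at infinity, which is not stated explicitly in the text.

```latex
\begin{align*}
\frac{\partial \rho}{\partial t} = 2R\,\frac{\partial R}{\partial t}
  &= -\frac{1}{m}\bigl( R^2\,\nabla^2 S + 2R\,\nabla R\cdot\nabla S \bigr) \\
  &= -\frac{1}{m}\,\nabla\cdot\bigl( R^2\,\nabla S \bigr)
   = -\nabla\cdot\Bigl( \rho\,\frac{\nabla S}{m} \Bigr), \\
\frac{d}{dt}\int_{\mathbb{R}^3} \rho \; d^3q
  &= -\int_{\mathbb{R}^3} \nabla\cdot\Bigl( \rho\,\frac{\nabla S}{m} \Bigr) d^3q = 0 .
\end{align*}
```

The last integral vanishes by Gauss's theorem, because it reduces to a surface term at infinity.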
Now consider equation (VI. 4). The last term in this equation is the only term of both (VI. 3)<br />
and (VI. 4) in which Planck’s constant appears explicitly. For this term we define the so - called<br />
quantum potential,<br />
$$U(\vec q, t) := -\frac{\hbar^2}{2m}\, \frac{\nabla^2 R(\vec q, t)}{R(\vec q, t)}. \tag{VI. 6}$$<br />
If the quantum potential $U$ were equal to 0, equation (VI. 4) would read<br />
$$\frac{\partial S(\vec q, t)}{\partial t} = -\frac{\big(\nabla S(\vec q, t)\big)^2}{2m} - V(\vec q), \tag{VI. 7}$$<br />
which is exactly the classical Hamilton - Jacobi equation for one particle. In (VI. 7), S is called the<br />
action, and ∇S is, as mentioned above, the momentum of the particle. In other words, if U = 0,<br />
we can interpret equations (VI. 3) and (VI. 4), and therefore also the equivalent Schrödinger equation<br />
(VI. 1), as the statistical description of a particle moving in a potential V in accordance with the<br />
laws of classical mechanics. We will discuss (VI. 7) more elaborately in section VI. 5, thereby also<br />
motivating the interpretation of ∇S.<br />
If the quantum potential $U$ is not equal to 0, the interpretation just discussed can<br />
still be given if we assume that, next to the classical potential $V$, the quantum potential $U$ is added<br />
as a correction to the equation of motion. The momentum is still given by $\vec p = \nabla S$, and (VI. 5)<br />
remains a continuity equation. However, (VI. 7) is replaced by (VI. 4), the Hamilton-Jacobi<br />
equation for a particle in the potential field $V + U$. We see that we have now adopted, besides the<br />
well-known $-\nabla V$, an extra force which acts on the particle,<br />
$$\vec F(\vec q, t) = \frac{d \vec p(\vec q, t)}{dt} = -\nabla \big( V(\vec q) + U(\vec q, t) \big). \tag{VI. 8}$$<br />
If the limit $\hbar \to 0$ is taken in the Schrödinger equation (VI. 1), the result is nonsense, but if $\hbar \to 0$ is<br />
taken in the definition (VI. 6) of the quantum potential $U$, we have $U(\vec q, t) = 0$, and (VI. 8) reduces<br />
to Newton’s law of motion.<br />
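To get a feeling for definition (VI. 6), the sketch below evaluates the quantum potential in one dimension for a Gaussian amplitude $R(q) = \exp(-q^2/4\sigma^2)$ with a central finite difference and compares it with the closed form obtained by differentiating by hand. The Gaussian, the value of $\sigma$, the units $\hbar = m = 1$ and the function names are assumptions made for the illustration. The last lines show the $\hbar^2$ scaling: with $R$ held fixed, $U$ shrinks to zero as $\hbar \to 0$.

```python
import math

SIGMA = 1.5   # assumed width of the Gaussian amplitude

def gaussian_amplitude(q):
    """R(q) = exp(-q^2 / (4 sigma^2)); normalisation drops out of U."""
    return math.exp(-q * q / (4.0 * SIGMA ** 2))

def quantum_potential(R, q, hbar=1.0, mass=1.0, h=1e-4):
    """U(q) = -(hbar^2 / 2m) R''(q) / R(q), with R'' from a central difference."""
    second = (R(q + h) - 2.0 * R(q) + R(q - h)) / (h * h)
    return -(hbar ** 2) / (2.0 * mass) * second / R(q)

def analytic_potential(q, hbar=1.0, mass=1.0):
    """Closed form for the Gaussian: R''/R = q^2 / (4 sigma^4) - 1 / (2 sigma^2)."""
    ratio = q * q / (4.0 * SIGMA ** 4) - 1.0 / (2.0 * SIGMA ** 2)
    return -(hbar ** 2) / (2.0 * mass) * ratio

q0 = 0.7
u_numeric = quantum_potential(gaussian_amplitude, q0)
u_exact = analytic_potential(q0)
u_small_hbar = quantum_potential(gaussian_amplitude, q0, hbar=0.1)

print("U numeric :", u_numeric)
print("U analytic:", u_exact)
print("U(hbar=0.1) / U(hbar=1):", u_small_hbar / u_numeric)
```

Because $U$ depends only on the ratio $R''/R$, multiplying $R$ by a constant leaves it unchanged.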
We will now discuss a simple example to illustrate the difference between Bohmian mechanics<br />
and quantum mechanics.
EXAMPLE<br />
A particle sits in a 1 - dimensional ‘box’ of length L, having walls which are formed by infinitely<br />
high potential barriers. Quantum mechanics gives as stationary solutions<br />
$$\psi_n(q, t) = \psi_n(q)\, e^{-i E_n t/\hbar}, \tag{VI. 9}$$<br />
with<br />
$$\psi_n(q) = \sqrt{\frac{2}{L}}\, \sin\!\left( \frac{n \pi q}{L} \right), \quad q \in [0, L], \tag{VI. 10}$$<br />
and energy values<br />
$$E_n = \frac{\hbar^2}{2m} \left( \frac{n \pi}{L} \right)^2. \tag{VI. 11}$$<br />
Therefore, in Bohmian mechanics for a stationary state we have<br />
$$R_n(q, t) = \psi_n(q) \quad\text{and}\quad S_n(q, t) = -E_n t. \tag{VI. 12}$$<br />
Now it is surprising that in this example it holds that<br />
$$p = \frac{\partial S_n}{\partial q} = \frac{\partial (-E_n t)}{\partial q} = 0, \tag{VI. 13}$$<br />
i.e., according to Bohmian mechanics the particle is motionless. This also applies to other cases of<br />
stationary states, for example to the ground state of the hydrogen atom. It is in direct contradiction<br />
to the statements of quantum mechanics. After all, in the case of the box quantum mechanics<br />
assigns, if the particle is in the state $\psi_n$, a large probability to finding the momentum $p$ having values<br />
around $\pm n \pi \hbar / L$, in which case the particle moves with speed $|p|/m > 0$, although the quantum mechanical<br />
expectation value of $p$ is zero for the particle in the box.<br />
This example shows that the statements of quantum mechanics and Bohmian mechanics do not<br />
coincide for all quantities. They only correspond concerning probability distributions for position<br />
measurements. Bohmian mechanics is, therefore, not a HVT in the sense of chapter V, where it was<br />
assumed that the statements of such a theory are similar to the statements of quantum mechanics for<br />
all quantities. Von Neumann’s impossibility proof is therefore not applicable to Bohmian mechanics.<br />
The explanation of the discrepancy between Bohmian mechanics and quantum mechanics lies, of<br />
course, in the use of the quantum potential. According to Bohm, the energy of the particle in the box<br />
has been entirely stored in the form of potential energy as a result of the quantum potential, hence,<br />
the particle has no kinetic energy.<br />
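Bohm's explanation can be made quantitative for the box: inserting (VI. 10) into the definition (VI. 6) gives a quantum potential that is constant inside the box and equal to $E_n$. The sketch below checks this numerically with a central finite difference. The units $\hbar = m = L = 1$, the sample points (chosen away from the nodes of $\psi_n$, where the ratio is undefined) and the function names are assumptions made for the illustration.

```python
import math

HBAR = 1.0     # assumed units
MASS = 1.0
LENGTH = 1.0

def psi_n(n, q):
    """Stationary wave function (VI. 10)."""
    return math.sqrt(2.0 / LENGTH) * math.sin(n * math.pi * q / LENGTH)

def energy(n):
    """Energy eigenvalue (VI. 11)."""
    return HBAR ** 2 / (2.0 * MASS) * (n * math.pi / LENGTH) ** 2

def quantum_potential(n, q, h=1e-4):
    """U = -(hbar^2 / 2m) psi'' / psi. Away from the nodes this equals the
    expression with R = |psi_n|, because the sign cancels in the ratio."""
    second = (psi_n(n, q + h) - 2.0 * psi_n(n, q) + psi_n(n, q - h)) / (h * h)
    return -(HBAR ** 2) / (2.0 * MASS) * second / psi_n(n, q)

sample_points = [0.13, 0.37, 0.61, 0.83]   # none of these is a node for n = 1, 2, 3
max_relative_error = 0.0
for n in (1, 2, 3):
    for q in sample_points:
        rel = abs(quantum_potential(n, q) - energy(n)) / energy(n)
        max_relative_error = max(max_relative_error, rel)

print("largest relative deviation of U from E_n:", max_relative_error)
```

With $\nabla S = 0$ and $V = 0$ inside the box, (VI. 4) then reads $-\partial S_n/\partial t = U = E_n$, so the whole energy is carried by the quantum potential.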
This changes however as soon as we open the box by removing one or both barriers. The quantum<br />
potential energy is again released, and the particle will start to move. The wave packet ψ(⃗q, t) then<br />
spreads out in space, in exactly the same way as prescribed by the Schrödinger equation, and there<br />
is no difference anymore between the statements of both theories concerning the movement of the<br />
particle.
The discrepancy between Bohmian mechanics and quantum mechanics has no perceptible consequences<br />
if we argue that all measurements are ultimately made by means of observation of position.<br />
Every physical quantity is eventually determined by a ‘pointer’ with a certain position, and a momentum<br />
measurement must eventually be registered by means of the displacement of some object.<br />
◃ Remark<br />
Notice that Bohm’s point of view deviates from that of Bohr, which says that position and momentum<br />
measurements exclude each other in principle but are both necessary to be able to give an exhaustive<br />
description of the system. ▹<br />
Figure VI. 1: The quantum potential for the two slit system as viewed from the screen, under the assumption<br />
of a Gaussian distribution at the slits (Bohm 1989)<br />
Finally we consider a special case. Suppose that $A, B \subset \mathbb{R}^3$ are disjoint areas in space,<br />
i.e. $A \cap B = \emptyset$, that $\psi_A$ and $\psi_B$ are wave functions which are 0 outside these areas, and that the wave<br />
function has the following form,<br />
$$\psi(\vec q) = a\, \psi_A(\vec q) + b\, \psi_B(\vec q), \tag{VI. 14}$$<br />
with $a, b \in \mathbb{R}$. Since $\psi_A$ and $\psi_B$ have no overlap, for all $\vec q \in \mathbb{R}^3$ it holds that<br />
$$\psi_A(\vec q)\, \psi_B(\vec q) = 0. \tag{VI. 15}$$<br />
Therefore, the probability density belonging to (VI. 14) is<br />
$$\rho(\vec q) = |a\, \psi_A(\vec q)|^2 + |b\, \psi_B(\vec q)|^2, \tag{VI. 16}$$<br />
without a cross-term, and we see that the ensemble of particles described by the density $|\psi(\vec q)|^2$<br />
behaves like a mixture.
With<br />
$$S(\vec q) = \begin{cases} S_A(\vec q) & \text{for } \vec q \in A, \\ S_B(\vec q) & \text{for } \vec q \in B, \\ 0 & \text{elsewhere,} \end{cases} \tag{VI. 17}$$<br />
and $\psi_A(\vec q) = R_A(\vec q)\, e^{i S_A(\vec q)/\hbar}$, etc., (VI. 14) reads<br />
$$\psi(\vec q) = \big( a\, R_A(\vec q) + b\, R_B(\vec q) \big)\, e^{i S(\vec q)/\hbar}, \tag{VI. 18}$$<br />
which means that also the quantum potential, as depicted in figure VI. 1, can now be taken as a sum<br />
of terms belonging to separate areas. The particles in area A do not perceive the wave function in<br />
area B at all.<br />
Figure VI. 2: A simulation of the double slit experiment in Bohmian mechanics. Each particle follows<br />
a certain path between the slits and the photographic plate. All particles coming from the upper slit<br />
arrive at the upper half of the photographic plate, likewise for the lower slit and lower half of the<br />
plate. The twists in the paths are caused by the quantum potential $U$ (Vigier et al. 1987).<br />
VI. 3 COMPOSITE SYSTEMS<br />
The technique used to rewrite the Schrödinger equation into equations describing particles with<br />
definite position and momentum in a non - classical potential field, can easily be generalized. For
example, for a system of two particles, represented by the wave function $\psi(\vec q_1, \vec q_2, t)$, we interpret<br />
$|\psi(\vec q_1, \vec q_2, t)|^2$ as the probability density that, simultaneously, particle 1 is located at position $\vec q_1$<br />
and particle 2 at position $\vec q_2$.<br />
We write<br />
$$\psi(\vec q_1, \vec q_2, t) = R(\vec q_1, \vec q_2, t)\, e^{i S(\vec q_1, \vec q_2, t)/\hbar}, \tag{VI. 19}$$<br />
and the quantum potential is now given by<br />
$$U(\vec q_1, \vec q_2, t) = -\frac{\hbar^2}{R(\vec q_1, \vec q_2, t)} \left( \frac{\nabla_1^2 R(\vec q_1, \vec q_2, t)}{2 m_1} + \frac{\nabla_2^2 R(\vec q_1, \vec q_2, t)}{2 m_2} \right), \tag{VI. 20}$$<br />
where $\nabla_i := \partial / \partial \vec q_i$ is the gradient with respect to the coordinates of particle $i$. In this expression the coordinates<br />
of both particles occur. Therefore, the force on particle 1, $\vec F_1 = -\nabla_1 (V + U)$, also depends,<br />
by means of the quantum potential, on the position of particle 2, and vice versa. This can be compared<br />
to the situation in Newton’s theory of gravitation, where such a dependence appears in the classical<br />
potential $V$; there is an instantaneous interaction (Latin: actio in distans) between particles, and a choice<br />
of another initial position of one particle immediately influences the dynamics of the other.<br />
Notice, however, that in Bohmian mechanics this influence does not have to decrease with the<br />
distance between the particles. Even if $R(\vec q_1, \vec q_2, t)$ goes to 0 for $\|\vec q_1 - \vec q_2\| \to \infty$, the quantum<br />
potential $U(\vec q_1, \vec q_2, t)$ does not need to do so: it depends on the second derivative of $R$ divided by $R$, which means that<br />
it depends on the strength of the oscillation of $R$, not on the amplitude.<br />
Also notice that the mutual dependence between the particles does not only appear by means of<br />
the quantum potential. The momentum of particle 1, given by ∇ 1 S(⃗q 1 , ⃗q 2 , t), cannot be chosen independently<br />
of the position of particle 2, and vice versa. This does not even happen in a classical theory<br />
with an actio in distans, and it gives Bohmian mechanics a deeply ‘holistic’ character.<br />
Only when the total wave function is a product does this mutual dependence disappear, because then<br />
$$\psi(\vec q_1, \vec q_2, t) = \psi_1(\vec q_1, t)\, \psi_2(\vec q_2, t), \tag{VI. 21}$$<br />
yielding<br />
$$R(\vec q_1, \vec q_2, t) = R_1(\vec q_1, t)\, R_2(\vec q_2, t), \qquad S(\vec q_1, \vec q_2, t) = S_1(\vec q_1, t) + S_2(\vec q_2, t) \tag{VI. 22}$$<br />
and, consequently, (VI. 20) becomes<br />
$$U(\vec q_1, \vec q_2, t) = U_1(\vec q_1, t) + U_2(\vec q_2, t). \tag{VI. 23}$$<br />
Each particle only feels its own potential field, and its momentum does not depend on the position<br />
of the other particle. If now the classical potential V is also a sum of 1 - particle potentials, this<br />
factorizability is preserved in time.<br />
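The additivity of the quantum potential for a product state follows in one line from (VI. 20) and (VI. 22). The following sketch suppresses the time argument and writes $U_i$ for the 1-particle expression (VI. 6) with mass $m_i$, which is an identification assumed here.

```latex
\begin{align*}
\frac{\nabla_1^2 (R_1 R_2)}{R_1 R_2} &= \frac{\nabla_1^2 R_1}{R_1}, \qquad
\frac{\nabla_2^2 (R_1 R_2)}{R_1 R_2} = \frac{\nabla_2^2 R_2}{R_2}, \\
U(\vec q_1, \vec q_2)
  &= -\frac{\hbar^2}{2 m_1}\,\frac{\nabla_1^2 R_1}{R_1}
     -\frac{\hbar^2}{2 m_2}\,\frac{\nabla_2^2 R_2}{R_2}
   = U_1(\vec q_1) + U_2(\vec q_2), \\
\vec p_1 &= \nabla_1 \bigl( S_1(\vec q_1) + S_2(\vec q_2) \bigr) = \nabla_1 S_1(\vec q_1).
\end{align*}
```

The factor that does not depend on $\vec q_i$ passes through $\nabla_i$ and cancels against the denominator, so neither the force nor the momentum of particle 1 contains $\vec q_2$.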
We know, however, that the wave function $\psi(\vec q_1, \vec q_2, t)$ does not in general have to be a product<br />
state, and even if it is a product state at some moment, it will generally not remain one. We<br />
must therefore conclude that the quantum potential $U$ represents a non-local connection between the<br />
particles.
◃ Remark<br />
For Bell, this observation was a reason to examine if quantum mechanical HVT’s can, in fact, be local<br />
at all. We will come back to this in chapter VII. ▹<br />
An intermediate form occurs if $A, B, C, D \subset \mathbb{R}^3$ are certain areas in space, such that $A \cap C = \emptyset$<br />
or $B \cap D = \emptyset$, $\psi_A, \psi_C, \varphi_B, \varphi_D$ are wave functions which are 0 outside these areas, and the wave<br />
function is, analogously to (VI. 14), of the form<br />
$$\psi(\vec q_1, \vec q_2) = a\, \psi_A(\vec q_1)\, \varphi_B(\vec q_2) + b\, \psi_C(\vec q_1)\, \varphi_D(\vec q_2), \tag{VI. 24}$$<br />
with $a, b \in \mathbb{R}$. Since the pair $\psi_A$ and $\psi_C$, or the pair $\varphi_B$ and $\varphi_D$, or both, have no overlap, for<br />
all $\vec q_1, \vec q_2 \in \mathbb{R}^3$ we have<br />
$$\psi_A(\vec q_1)\, \psi_C(\vec q_1) = 0 \quad\text{or}\quad \varphi_B(\vec q_2)\, \varphi_D(\vec q_2) = 0. \tag{VI. 25}$$<br />
Therefore, the probability density belonging to (VI. 24) is<br />
$$\rho(\vec q_1, \vec q_2) = R^2(\vec q_1, \vec q_2) = |a\, \psi_A(\vec q_1)\, \varphi_B(\vec q_2)|^2 + |b\, \psi_C(\vec q_1)\, \varphi_D(\vec q_2)|^2, \tag{VI. 26}$$<br />
without a cross-term, and we see that the ensemble, again analogously to (VI. 14), behaves like a<br />
mixture. In this case we call the wave function $\psi(\vec q_1, \vec q_2)$ effectively factorizable.<br />
With<br />
$$S_{\mathrm{tot}}(\vec q_1, \vec q_2) = \begin{cases} S_A(\vec q_1) + S_B(\vec q_2) & \text{for } \vec q_1 \in A,\ \vec q_2 \in B, \\ S_C(\vec q_1) + S_D(\vec q_2) & \text{for } \vec q_1 \in C,\ \vec q_2 \in D, \\ 0 & \text{elsewhere,} \end{cases} \tag{VI. 27}$$<br />
and $\psi_A(\vec q_1) = R_A(\vec q_1)\, e^{i S_A(\vec q_1)/\hbar}$, etc., because of (VI. 25) it holds that<br />
$$\begin{aligned} \psi(\vec q_1, \vec q_2) &= a\, R_A(\vec q_1)\, R_B(\vec q_2)\, e^{i (S_A(\vec q_1) + S_B(\vec q_2))/\hbar} + b\, R_C(\vec q_1)\, R_D(\vec q_2)\, e^{i (S_C(\vec q_1) + S_D(\vec q_2))/\hbar} \\ &= \big( a\, R_A(\vec q_1)\, R_B(\vec q_2) + b\, R_C(\vec q_1)\, R_D(\vec q_2) \big)\, e^{i S_{\mathrm{tot}}(\vec q_1, \vec q_2)/\hbar}. \end{aligned} \tag{VI. 28}$$<br />
Therefore, also in case of composite systems, the quantum potential can be taken as a sum of terms<br />
belonging to the separate particles, and the momentum of a particle does not depend on the other<br />
particle.<br />
Consequently, we can interpret the system as being composed of a pair of particles of which one<br />
particle is in area A and the other in B, or, likewise, in area C and D. The pair of particles is not<br />
influenced by the wave functions or the quantum potential in the other area. For this reason, these<br />
pilot waves are also called empty waves. They have no dynamic influence on the particles, but they<br />
do contain energy. If, at some time, the wave functions will have overlap again, they will of course<br />
also regain influence.
VI. 4 REMARKS AND PROBLEMS<br />
In Bohmian mechanics the wave function plays a double role. On the one hand, we see<br />
that $\rho(\vec q, t_0) = R^2 = |\psi(\vec q, t_0)|^2$ is equal to the probability density of finding a particle at time $t_0$ at<br />
a certain position, and we use this to characterize the ensemble at t 0 . On the other hand, ψ determines<br />
the value of R, and thereby, by means of formula (VI. 6) or (VI. 20), also the quantum potential which<br />
has the same status as the classical potential V . This means that ψ is also connected with the dynamic<br />
evolution of particles.<br />
This is strange if seen from a classical perspective. In classical statistical mechanics it is always<br />
possible to specify the form of the probability density at t 0 independently of the dynamics. Inversely,<br />
the force acting on a particle in a classical theory does not depend on the probabilities that the particle<br />
would be at another position than it actually is. But we saw that in Bohmian mechanics the force does<br />
depend on the probabilities. In Bohm’s interpretation we must therefore assume that if at an initial<br />
time t 0 the quantum mechanical probability density is |ψ (⃗q, t 0 )|², the particles subsequently move<br />
under the influence of forces which are also determined by ψ(⃗q, t 0 ).<br />
Nonetheless, it can be proved that if this pre - established harmony is valid at one moment in<br />
time, it remains valid at all other times. In later work, Bohm speculated that this harmony between<br />
the quantum potential and the probability density could possibly be understood as a requirement for<br />
equilibrium of an underlying ‘sub - quantum aether’. From this idea the expectation arises that if this<br />
equilibrium can be disrupted, it will be restored only after some time, so that deviations<br />
from the quantum mechanical predictions can appear in very rapid measurements. Until now such<br />
deviations have not been found.<br />
Bohmian mechanics gives, on the basis of the thesis that, eventually, all measurements are position<br />
measurements, the same empirically verifiable predictions as standard quantum mechanics does.<br />
Moreover, it provides a picture in which particles have position and momentum and it can be visualized<br />
how the particles move through space, even if there is no measurement. Also, Bohmian<br />
mechanics is deterministic; the evolution is determined by classical mechanics, extended with the<br />
quantum potential. Although these properties seem to be large advantages, Bohm’s proposal evoked<br />
no enthusiasm in the nineteen fifties.<br />
Of course, from the side of the Copenhageners little support was to be expected. The proposal was<br />
dismissed as ‘metaphysical speculation’, a return to the lost paradise of classical physics. Bohm parried<br />
this argument by calling the Copenhageners’ ‘completeness’ claim untestable and metaphysical.<br />
But Einstein also found the idea ‘too cheap’ because it leaned too much on the quantum mechanical<br />
formalism in combination with the classical idea of particles. Einstein himself thought that a<br />
completely new theory with a totally different perspective was necessary, such as his unified field<br />
theory. Probably, Einstein also had objections because of the far - reaching non - locality of Bohmian<br />
mechanics.<br />
Others stumbled at the fact that Bohmian mechanics only relies on a rewriting of the Schrödinger<br />
equation, and contains nothing new. Bohm had foreseen this criticism and tried to argue that his theory<br />
presents new ideas for experiments and that on distance and energy scales which are within range of<br />
Heisenberg’s indeterminacy principle, Bohmian mechanics will prove to be necessary. But above all,<br />
Bohm wanted to show the possibility of a HVT and to challenge the necessity of the Copenhagen<br />
interpretation.
Bohmian mechanics has not led to new verifiable statements, although ‘tunneling times’ are debated,<br />
about which quantum mechanics does not say anything, but Bohmian mechanics does. Furthermore,<br />
by the fresh look supplied by Bohmian mechanics, new extensions of the theory are suggested,<br />
such as the suggestion of an underlying sub - quantum aether, as a result of the unexpected double<br />
role of the wave function.<br />
In the nineteen nineties, a growing group of physicists considered Bohmian mechanics to be a<br />
serious alternative to the Copenhagen interpretation, see for example Holland (1993) and Cushing<br />
(1994), who suggests a sociological explanation for the fact that the physicists’ community did not<br />
replace quantum mechanics by the, according to Cushing, superior Bohmian mechanics.<br />
VI. 5<br />
THE HAMILTON - JACOBI EQUATION<br />
In classical mechanics we assume that for a system of n point particles, with canonical positions<br />
⃗q = (q 1 , . . . , q n ) ∈ R 3n and speeds ˙⃗q = ( ˙q 1 , . . . , ˙q n ) ∈ R 3n , a Lagrangian L(⃗q, ˙⃗q, t) can be<br />
found, the Lagrangian L = T − V being the difference between kinetic and potential energy. Define<br />
the following functional, called the action<br />
\[ S_\gamma(\vec q, t; \vec q_0, t_0) := \int_\gamma L(\vec q, \dot{\vec q}, t)\, dt, \tag{VI. 29} \]<br />
where the integral, for n particles in 3 dimensions, is taken over a continuous path γ in configuration<br />
space R 3n between an initial configuration ⃗q 0 at time t 0 and the configuration ⃗q at time t. In case the<br />
Lagrangian does not explicitly depend on t, we can also write S γ (⃗q, ⃗q 0 , t − t 0 ).<br />
The equations of motion are found by application of Hamilton’s principle of least action; for the<br />
path γ 0 which is actually followed, the action reaches an extremum in comparison to all possible<br />
continuous paths. This requirement,<br />
δS γ = 0, (VI. 30)<br />
provides n equations of motion of Euler and Lagrange,<br />
\[ \frac{d}{dt} \frac{\partial L}{\partial \dot q_j} - \frac{\partial L}{\partial q_j} = 0. \tag{VI. 31} \]<br />
The Hamiltonian, H = T + V , is defined as the Legendre transform of the Lagrangian,<br />
\[ H(\vec q, \vec p, t) := \sum_{j=1}^{3n} p_j\, \dot q_j - L(\vec q, \dot{\vec q}, t) \tag{VI. 32} \]<br />
where<br />
\[ p_j := \frac{\partial L}{\partial \dot q_j} \tag{VI. 33} \]<br />
is the canonical momentum.
Substitution of (VI. 32) in (VI. 29) yields<br />
\[ S_\gamma = \int_\gamma \Big( \sum_{j=1}^{3n} p_j\, \dot q_j - H(\vec q, \vec p, t) \Big)\, dt = \sum_{j=1}^{3n} \int_\gamma p_j\, dq_j - \int_\gamma H(\vec q, \vec p, t)\, dt, \tag{VI. 34} \]<br />
and variation of S γ in this form yields the 2n Hamiltonian equations of motion,<br />
\[ \dot q_j = \frac{\partial H}{\partial p_j}, \qquad \dot p_j = - \frac{\partial H}{\partial q_j}. \tag{VI. 35} \]<br />
Now consider the action S γ along a real path γ 0 , i.e., a path satisfying the equations of motion,<br />
and form its differential,<br />
\[ dS(\vec q, \vec q_0, t - t_0) = \sum_{j=1}^{3n} \big( p_j\, dq_j - p_{0j}\, dq_{0j} \big) - H(\vec q, \vec p, t)\, dt. \tag{VI. 36} \]<br />
Comparison with<br />
\[ dS(\vec q, \vec q_0, t - t_0) = \sum_{j=1}^{3n} \Big( \frac{\partial S}{\partial q_j}\, dq_j + \frac{\partial S}{\partial q_{0j}}\, dq_{0j} \Big) + \frac{\partial S}{\partial t}\, dt \tag{VI. 37} \]<br />
and using requirement (VI. 30) shows that<br />
\[ H(\vec q, \vec p, t) = - \frac{\partial S}{\partial t}, \qquad p_j = \frac{\partial S}{\partial q_j}, \qquad p_{0j} = - \frac{\partial S}{\partial q_{0j}}, \tag{VI. 38} \]<br />
and therefore<br />
\[ \frac{\partial S}{\partial t} + H\Big( \vec q, \frac{\partial S}{\partial \vec q}, t \Big) = 0. \tag{VI. 39} \]<br />
This is (VI. 7), the Hamilton - Jacobi equation, as discussed on p. 129. The technique to solve the<br />
mechanical equations of motion by means of this equation is especially due to Jacobi. Without discussing<br />
this technique in detail, we mention the following.<br />
For definite q 0 and t 0 it is possible to consider the action S as a function on configuration space. It<br />
can be shown that the paths satisfying the equations of motion are always perpendicular to the hyperplanes<br />
of constant S, hence the frequently quoted analogy with optics; paths are comparable to rays<br />
of light, and planes of constant S to wave fronts. If, for one moment in time, the values S are given<br />
over the complete configuration space, the Hamilton - Jacobi equation determines how they evolve in<br />
the course of time. The problem to find the paths of the particles is thus reduced to constructing the<br />
curves which are normal to the planes of constant S.<br />
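As an illustration of (VI. 38) and (VI. 39), the sketch below takes a single free particle in one dimension, for which the action along the true path is S = m (q − q 0 )² / (2 (t − t 0 )). This closed form, the function names and the finite-difference step are assumptions made for the example; they do not appear in these notes.<br />

```python
def action(q, t, q0=0.0, t0=0.0, m=1.0):
    """Action of a free particle along the straight-line path from (q0, t0) to (q, t)."""
    return m * (q - q0) ** 2 / (2.0 * (t - t0))


def momentum(q, t, m=1.0, h=1e-5):
    """p = dS/dq from (VI. 38), approximated by a central difference."""
    return (action(q + h, t, m=m) - action(q - h, t, m=m)) / (2 * h)


def hj_residual(q, t, m=1.0, h=1e-5):
    """dS/dt + H(q, dS/dq) with H = p^2 / 2m; (VI. 39) says this vanishes."""
    dS_dt = (action(q, t + h, m=m) - action(q, t - h, m=m)) / (2 * h)
    p = momentum(q, t, m=m, h=h)
    return dS_dt + p ** 2 / (2 * m)


print(momentum(1.5, 2.0))     # close to m * (q - q0) / (t - t0) = 0.75
print(hj_residual(1.5, 2.0))  # close to zero
```

The momentum obtained from ∂S/∂q is the constant velocity times the mass, as expected for a free particle, and the residual of the Hamilton - Jacobi equation vanishes up to discretization error.<br />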
◃ Remark<br />
Schrödinger originally based his derivation of wave mechanics on the idea that wave mechanics is to<br />
classical mechanics as wave optics is to ray optics, and with the just mentioned wave fronts and the<br />
Hamilton - Jacobi equation he came to his wave mechanics. ▹
VII<br />
BELL’S INEQUALITIES<br />
There is hardly a paper - nor was there any during the past two and a half decades -<br />
which deals with the foundations of quantum mechanics and does not refer to the work<br />
of John Stewart Bell.<br />
— Max Jammer<br />
Bell’s theorem is the most profound discovery of science.<br />
— Henry Stapp<br />
[. . . ] Bell is generally credited with having brought down a purely philosophical issue<br />
from the lofty realms of abstract speculation to the tangible reach of empirical investigation<br />
and of having thereby established what has been called ‘experimental metaphysics’.<br />
— Max Jammer<br />
The ‘Bell inequalities’ is a generic term for inequalities in terms of measurable physical quantities<br />
which are satisfied by hidden variables theories, but are violated by quantum mechanics. We will<br />
derive several Bell inequalities, belonging to different types of hidden variables theories. This<br />
also includes indeterministic, stochastic HVT’s, which fell outside the scope of chapter V.<br />
VII. 1<br />
LOCAL DETERMINISTIC HIDDEN VARIABLES<br />
VII. 1. 1<br />
DERIVATION OF THE FIRST BELL INEQUALITY<br />
Returning to the hidden variables theories, HVT’s, we focus our attention on a specific experiment.<br />
In the article ‘On the Einstein Podolsky Rosen paradox’ (1964), J.S. Bell examines the EPR experiment,<br />
discussed in section I. 2, in a version which was given by Bohm and Aharonov (Bohm 1957),<br />
also called the EPRB experiment. Bohm and Aharonov proposed an experiment in which two spin<br />
1/2 particles are prepared in the singlet state and, next, move apart in opposite directions. After they<br />
are separated, the spin of each of the particles is measured in an arbitrary direction, where the spin of<br />
particle 1 is measured in direction ⃗a and the remote particle 2 in direction ⃗ b, as in figure III. 3, p. 73.<br />
In this experiment, one can follow the same argument as EPR. Using the notation of section III. 6,<br />
if measurement of ⃗σ 1 · ⃗a yields the value +1 then, for the singlet state, measurement of ⃗σ 2 · ⃗a must<br />
yield the value −1 and vice versa.<br />
Since the result of a measurement of a spin component of the one particle can be predicted with<br />
certainty by measuring the same component of the other particle, whereas the particles are far away
from each other and do not interact, it follows, according to EPR, that the result of a measurement<br />
of any spin component is determined in advance, i.e., that it is an element of physical reality. This<br />
suggests that there should be a more complete description of the state of the particles, including<br />
hidden variables.<br />
Specify this description of the pair of particles with variables λ ∈ Λ as we did in chapter V. We<br />
write the quantities corresponding to (⃗σ 1·⃗a)⊗(⃗σ 2·⃗b) as the pair (A, B), having values a,b = ±1. In a<br />
contextual HVT, these values are dependent on the hidden variable λ and the total measuring context,<br />
which can be specified here by means of the measurement directions ⃗a and ⃗ b, leading to<br />
A = A(⃗a, ⃗ b, λ) and B = B(⃗a, ⃗ b, λ). (VII. 1)<br />
Now the essential assumption is the requirement of locality that the quantity A does not depend<br />
on the reading ⃗ b of a remote spin meter, and vice versa for B and ⃗a. These quantities therefore only<br />
depend upon the local context,<br />
A(⃗a, ⃗ b, λ) = A(⃗a, λ), a = ±1,<br />
B(⃗a, ⃗ b, λ) = B( ⃗ b, λ), b = ±1. (VII. 2)<br />
Figure VII. 1: Thought experiment of Einstein, Podolsky and Rosen on the singlet. [Diagram: a source with distribution ρ(λ) emits particle pairs towards spin meter A, with outcomes a, a ′ = ±1 given by A(⃗a, λ) and A(⃗a ′ , λ), and spin meter B, with outcomes b, b ′ = ±1 given by B( ⃗ b, λ) and B( ⃗ b ′ , λ).]<br />
The source emitting the particle pairs probably does not prepare the pairs in the same state λ each<br />
time. We assume that the source can be characterized by a probability density ρ,<br />
\[ \int_\Lambda \rho(\lambda)\, d\lambda = 1, \tag{VII. 3} \]<br />
where we also assume that this probability density does not depend on the measuring directions ⃗a<br />
and ⃗ b, which, after all, can be established long after the particles have left the source. The expectation<br />
value of the product of A and B in this HVT is therefore<br />
\[ E(\vec a, \vec b) = \int_\Lambda A(\vec a, \lambda)\, B(\vec b, \lambda)\, \rho(\lambda)\, d\lambda. \tag{VII. 4} \]<br />
Quantum mechanics gives as the expectation value, with the particle pair in the singlet state, see<br />
equation (III. 171), p. 73,<br />
\[ E_{\rm QM}(\vec a, \vec b) = \langle \vec\sigma_1 \cdot \vec a \otimes \vec\sigma_2 \cdot \vec b \rangle = - \vec a \cdot \vec b = - \cos \theta_{\vec a, \vec b}. \tag{VII. 5} \]<br />
But the expressions (VII. 4) and (VII. 5) cannot coincide for all directions ⃗a and ⃗ b. According<br />
to (VII. 2), the expectation value E(⃗a, ⃗ b) of the product of A and B cannot be less than −1. Therefore,<br />
to reach −1 at ⃗a = ⃗ b, also requiring equality between (VII. 4) and (VII. 5), it must hold for all unit<br />
vectors ⃗n that<br />
A(⃗n, λ) = − B(⃗n, λ), (VII. 6)<br />
which leads to<br />
\[ E(\vec a, \vec b) = - \int_\Lambda A(\vec a, \lambda)\, A(\vec b, \lambda)\, \rho(\lambda)\, d\lambda. \tag{VII. 7} \]<br />
Now it follows, because of ( A(⃗n, λ) ) 2 = 1, that<br />
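\[ \begin{aligned} E(\vec a, \vec b) - E(\vec a, \vec b') &= - \int_\Lambda \big( A(\vec a, \lambda)\, A(\vec b, \lambda) - A(\vec a, \lambda)\, A(\vec b', \lambda) \big)\, \rho(\lambda)\, d\lambda \\ &= \int_\Lambda A(\vec a, \lambda)\, A(\vec b, \lambda)\, \big( A(\vec b, \lambda)\, A(\vec b', \lambda) - 1 \big)\, \rho(\lambda)\, d\lambda, \end{aligned} \tag{VII. 8} \]<br />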
where ⃗ b ′ is another setting of the remote spin meter, and A( ⃗ b ′ , λ) also has values ±1. Taking the<br />
absolute value on both sides, keeping in mind that |A(⃗a, λ)A( ⃗ b, λ)| = 1, it follows that<br />
\[ |E(\vec a, \vec b) - E(\vec a, \vec b')| \leq \int_\Lambda \big( 1 - A(\vec b, \lambda)\, A(\vec b', \lambda) \big)\, \rho(\lambda)\, d\lambda, \tag{VII. 9} \]<br />
or,<br />
\[ |E(\vec a, \vec b) - E(\vec a, \vec b')| \leq 1 + E(\vec b, \vec b'). \tag{VII. 10} \]<br />
This is the original Bell inequality.<br />
VII. 1. 2<br />
THE BELL INEQUALITY OF CLAUSER, HORNE, SHIMONY AND HOLT<br />
Next, we will derive a second inequality. In (VII. 8), we replace ⃗a by ⃗a ′ and the − sign by<br />
the + sign,<br />
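\[ \begin{aligned} E(\vec a', \vec b) + E(\vec a', \vec b') &= - \int_\Lambda \big( A(\vec a', \lambda)\, A(\vec b, \lambda) + A(\vec a', \lambda)\, A(\vec b', \lambda) \big)\, \rho(\lambda)\, d\lambda \\ &= - \int_\Lambda A(\vec a', \lambda)\, A(\vec b, \lambda)\, \big( 1 + A(\vec b, \lambda)\, A(\vec b', \lambda) \big)\, \rho(\lambda)\, d\lambda. \end{aligned} \tag{VII. 11} \]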
Now, in the same way as we derived (VII. 10), we obtain<br />
\[ |E(\vec a', \vec b) + E(\vec a', \vec b')| \leq 1 - E(\vec b, \vec b'). \tag{VII. 12} \]<br />
Combination of (VII. 10) and (VII. 12) leads to<br />
\[ |E(\vec a, \vec b) - E(\vec a, \vec b')| + |E(\vec a', \vec b) + E(\vec a', \vec b')| \leq 2. \tag{VII. 13} \]<br />
This version of the Bell inequality was first derived, although under weaker assumptions than<br />
those used here, by Clauser, Horne, Shimony and Holt (Clauser 1969), for which reason it is also called the<br />
CHSH inequality. We will return to these assumptions in section VII. 2.<br />
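The bound 2 in (VII. 13) can also be seen pointwise: for one fixed value of λ, each of the four outcomes is ±1, so only sixteen combinations need to be checked. The sketch below enumerates them. It is an illustration with assumed names (`chsh_value`, `values`), not the derivation given above, and it does not use the perfect anti - correlation (VII. 6).<br />

```python
from itertools import product


def chsh_value(A_a, A_ap, B_b, B_bp):
    """CHSH combination for one hidden-variable value; every argument is -1 or +1."""
    return abs(A_a * B_b - A_a * B_bp) + abs(A_ap * B_b + A_ap * B_bp)


# All 16 deterministic assignments of outcomes to the settings a, a', b, b'.
values = [chsh_value(*outcomes) for outcomes in product((-1, 1), repeat=4)]
print(set(values))
```

Every assignment gives the value 2, because one of B( ⃗ b) − B( ⃗ b ′ ) and B( ⃗ b) + B( ⃗ b ′ ) always vanishes while the other equals ±2. Averaging over λ with a probability density ρ can therefore never exceed 2.<br />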
VII. 1. 3<br />
VIOLATION OF THE BELL INEQUALITIES BY QUANTUM MECHANICS<br />
We will now prove the following theorem.<br />
BELL’S FIRST THEOREM:<br />
A local deterministic HVT is empirically contradictory to quantum mechanics.<br />
Proof<br />
With the expression empirically contradictory we mean that the two theories make contradictory<br />
statements in terms of measurable physical quantities. We will show that, quantum mechanically,<br />
there are spin quantities which violate the Bell inequalities.<br />
Consider the configuration below, where all vectors lie in the same plane.<br />
Figure VII. 2: A configuration in which the spin quantities violate the Bell inequality. [Diagram: coplanar directions ⃗a, ⃗a ′ = ⃗ b and ⃗ b ′ , with an angle ϕ between ⃗a and ⃗ b and an angle ϕ between ⃗ b and ⃗ b ′ .]<br />
Using (VII. 5) for this configuration and substituting the quantum mechanical expression into (VII. 13), we obtain the requirement<br />
\[ F(\phi) := | - \cos \phi + \cos 2\phi | + | - \cos \phi - 1 | \leq 2. \tag{VII. 14} \]<br />
This function is plotted in figure VII. 3.
Figure VII. 3: The Bell inequality violated for every acute angle ϕ. [Plot of F (ϕ) against ϕ from 0 to π, with the bound 2 marked on the vertical axis.]<br />
We see that (VII. 14) is violated for every ϕ ∈ (0, π/2). The maximum violation is F (60°) = 5/2,<br />
as can be seen in the figure.<br />
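The behaviour of F (ϕ) can be checked numerically. The following sketch evaluates the left-hand side of (VII. 14) on a grid of acute angles; the one-degree grid and the names `F`, `angles` and `best` are choices made for this example.<br />

```python
import math


def F(phi):
    """Left-hand side of (VII. 14) for the singlet correlation E = -cos(theta)."""
    return abs(-math.cos(phi) + math.cos(2 * phi)) + abs(-math.cos(phi) - 1)


# Acute angles from 1 to 89 degrees, converted to radians.
angles = [math.radians(d) for d in range(1, 90)]
best = max(angles, key=F)

print(all(F(phi) > 2 for phi in angles))
print(round(math.degrees(best)), F(best))
```

For acute angles F (ϕ) = 2 + 2 cos ϕ (1 − cos ϕ), which exceeds 2 and is largest at cos ϕ = 1/2, in agreement with the grid search.<br />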
Even larger violations are possible in other configurations. The largest<br />
violation is obtained in the configuration of figure VII. 4 (with all vectors in a single plane),<br />
leading to<br />
\[ \begin{aligned} E_{\rm QM}(\vec a, \vec b) &= - \cos 45^\circ = - \tfrac{1}{2} \sqrt{2}, \\ E_{\rm QM}(\vec a, \vec b') &= - \cos 135^\circ = \tfrac{1}{2} \sqrt{2}, \\ E_{\rm QM}(\vec a', \vec b) &= - \cos 135^\circ = \tfrac{1}{2} \sqrt{2}, \\ E_{\rm QM}(\vec a', \vec b') &= - \cos 135^\circ = \tfrac{1}{2} \sqrt{2}, \end{aligned} \]<br />
\[ |E_{\rm QM}(\vec a, \vec b) - E_{\rm QM}(\vec a, \vec b')| + |E_{\rm QM}(\vec a', \vec b) + E_{\rm QM}(\vec a', \vec b')| = 2 \sqrt{2}. \tag{VII. 15} \]<br />
This is a violation of 41%. □<br />
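The value 2 √ 2 can be reproduced from explicit unit vectors. In the sketch below the directions are placed in a plane at 0°, 45°, 135° and −90°; this particular placement is an assumption, chosen so that the enclosed angles are 45° for the pair ⃗a, ⃗ b and 135° for the three other pairs.<br />

```python
import math


def unit(deg):
    """Unit vector in the plane at the given polar angle (degrees)."""
    rad = math.radians(deg)
    return (math.cos(rad), math.sin(rad))


def E_qm(u, v):
    """Singlet correlation (VII. 5): minus the inner product of the two directions."""
    return -(u[0] * v[0] + u[1] * v[1])


a, b, b_prime, a_prime = unit(0), unit(45), unit(135), unit(-90)

chsh = abs(E_qm(a, b) - E_qm(a, b_prime)) + abs(E_qm(a_prime, b) + E_qm(a_prime, b_prime))
print(chsh, chsh / 2 - 1)  # CHSH value and relative excess over the bound 2
```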
Figure VII. 4: The configuration giving the largest violation of the Bell inequality (all vectors in the same plane). [Diagram: directions ⃗a, ⃗ b, ⃗a ′ and ⃗ b ′ , with an angle of 45° marked.]
VII. 1. 4<br />
THE BELL INEQUALITY IN A NON-CONTEXTUAL, LOCAL DETERMINISTIC HVT<br />
To show that the Bell inequality, derived for a local deterministic contextual HVT, also holds for<br />
a local deterministic autonomous HVT, we consider a local deterministic autonomous model for the<br />
singlet.<br />
Assume that both particles are characterized by a ‘classical’ spin vector, ⃗ J and − ⃗ J, about a<br />
common axis. This is the hidden variable. In this HVT, we further assume that the outcome of a<br />
measurement of spin in the direction ⃗n is determined by the sign of the component of the spin vector<br />
in the direction ⃗n. Now let the particles fly away from each other. If the spin of the first particle in the<br />
direction ⃗a is measured we find the outcome<br />
\[ \frac{\vec J \cdot \vec a}{| \vec J \cdot \vec a |} \in \{-1, 1\}, \tag{VII. 16} \]<br />
for the spin of the second particle in direction ⃗ b we find<br />
\[ - \frac{\vec J \cdot \vec b}{| \vec J \cdot \vec b |} \in \{-1, 1\}. \tag{VII. 17} \]<br />
The result of the measurement of the first particle is independent of the direction ⃗ b and vice versa;<br />
therefore, the model is local.<br />
Now consider an ensemble of such two particle systems where ⃗ J is distributed isotropically. If a n<br />
is the sign of ⃗ J · ⃗a in the n th pair, and likewise, b n the sign of − ⃗ J · ⃗b, then if ⃗ J pierces through the<br />
shaded area of the unit sphere on the right side in figure VII. 5, a n b n = +1. Otherwise, a n b n = −1.<br />
Figure VII. 5: Unit spheres for a n , b n and a n b n . In the shaded areas of the larger sphere a n b n is<br />
positive, in the unshaded areas a n b n is negative.
The surface of the shaded area is 4θ, with θ the angle between ⃗a and ⃗ b, and that of the remaining part is 4(π − θ). For an isotropic<br />
distribution, averaging over the surface of the unit sphere, we therefore find<br />
\[ \langle a_n b_n \rangle = \frac{1}{4\pi} \big( 4\, \theta_{\vec a, \vec b} - 4\, (\pi - \theta_{\vec a, \vec b}) \big) = - 1 + \frac{2}{\pi}\, \theta_{\vec a, \vec b}, \tag{VII. 18} \]<br />
which is an increasing line through (0, −1) having slope 2/π. This runs from perfect anti - correlation<br />
for θ = 0 to perfect correlation for θ = π.<br />
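The linear correlation (VII. 18) can be checked with a small Monte Carlo simulation of the model. The sketch below draws isotropic spin vectors ⃗ J, applies the outcome rules (VII. 16) and (VII. 17), and averages the product of the outcomes. The sample size, the seed, the sampling method (normalised Gaussian triples) and all function names are assumptions of this example.<br />

```python
import math
import random


def random_direction(rng):
    """Isotropic unit vector, obtained by normalising three Gaussian samples."""
    x, y, z = (rng.gauss(0.0, 1.0) for _ in range(3))
    norm = math.sqrt(x * x + y * y + z * z)
    return (x / norm, y / norm, z / norm)


def sign(x):
    return 1 if x >= 0 else -1


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def simulate(theta, pairs=100_000, seed=1):
    """Estimate <a_n b_n> for settings a and b enclosing the angle theta."""
    rng = random.Random(seed)
    a = (0.0, 0.0, 1.0)
    b = (math.sin(theta), 0.0, math.cos(theta))
    total = 0
    for _ in range(pairs):
        J = random_direction(rng)
        a_n = sign(dot(J, a))    # outcome for particle 1, rule (VII. 16)
        b_n = sign(-dot(J, b))   # outcome for particle 2, rule (VII. 17)
        total += a_n * b_n
    return total / pairs


def predicted(theta):
    """Right-hand side of (VII. 18)."""
    return -1 + 2 * theta / math.pi


for theta in (0.0, math.pi / 4, math.pi / 2, math.pi):
    print(round(theta, 3), simulate(theta, pairs=20_000), predicted(theta))
```

The estimates agree with −1 + 2θ/π up to statistical fluctuations of the order of the inverse square root of the number of pairs.<br />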
Figure VII. 6: Comparison of the quantum mechanical expectation values and those for the local<br />
deterministic HVT. [Plot of − cos θ and ⟨a n b n ⟩ against θ from 0 to π, both running from −1 to 1.]<br />
In this HVT, equation (VII. 18) must satisfy the Bell inequality (VII. 13) for E (⃗a, ⃗ b) = ⟨a n b n ⟩.<br />
Choosing the angles as in the example on p. 142, figure VII. 2, if (VII. 18) is substituted in (VII. 13)<br />
it yields exactly 2 for any θ ≤ π, where the quantum mechanical expectation values violated the<br />
inequality for every θ ∈ (0, π/2).<br />
In the configuration giving the largest violation of the inequality (VII. 13), see figure VII. 4, we<br />
have<br />
\[ \theta_{\vec a, \vec b} = \tfrac{1}{4} \pi \quad \text{and} \quad \theta_{\vec a, \vec b'} = \theta_{\vec a', \vec b} = \theta_{\vec a', \vec b'} = \tfrac{3}{4} \pi, \tag{VII. 19} \]<br />
and therefore, (VII. 18) substituted in (VII. 13) yields<br />
\[ \big| \big( - 1 + \tfrac{1}{2} \big) - \big( - 1 + \tfrac{3}{2} \big) \big| + \big| \big( - 1 + \tfrac{3}{2} \big) + \big( - 1 + \tfrac{3}{2} \big) \big| = 1 + 1 = 2, \tag{VII. 20} \]<br />
where quantum mechanically, on p. 143 we found 2 √ 2.<br />
We see that where quantum mechanics violated the inequality (VII. 13), this local deterministic<br />
autonomous HVT satisfies it, thereby confirming Bell’s first theorem.<br />
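The two evaluations of (VII. 13) can be put side by side in a few lines. The sketch below uses the correlation (VII. 18) of the model and the quantum mechanical correlation (VII. 5), both as functions of the enclosed angle. The helper names are assumptions of this example, and in the configuration of figure VII. 2 the angle ϕ is kept at or below π/2 so that the enclosed angle 2ϕ stays within the range 0 to π of (VII. 18).<br />

```python
import math


def E_hvt(theta):
    """Correlation (VII. 18) of the classical spin-vector model, 0 <= theta <= pi."""
    return -1 + 2 * theta / math.pi


def E_qm(theta):
    """Singlet correlation (VII. 5) as a function of the enclosed angle."""
    return -math.cos(theta)


def chsh(E, t_ab, t_abp, t_apb, t_apbp):
    """Left-hand side of (VII. 13); arguments are the angles for (a,b), (a,b'), (a',b), (a',b')."""
    return abs(E(t_ab) - E(t_abp)) + abs(E(t_apb) + E(t_apbp))


quarter = math.pi / 4

# Configuration of figure VII. 4: angles pi/4, 3pi/4, 3pi/4, 3pi/4.
print(chsh(E_hvt, quarter, 3 * quarter, 3 * quarter, 3 * quarter))
print(chsh(E_qm, quarter, 3 * quarter, 3 * quarter, 3 * quarter))

# Configuration of figure VII. 2: angles phi, 2 phi, 0, phi.
for phi in (0.1, 0.5, 1.0, 1.5):
    print(phi, chsh(E_hvt, phi, 2 * phi, 0.0, phi), chsh(E_qm, phi, 2 * phi, 0.0, phi))
```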
VII. 2<br />
LOCAL DETERMINISTIC CONTEXTUAL HIDDEN VARIABLES<br />
We have seen that a considerable difference exists between the empirically verifiable statements<br />
of quantum mechanics and those of a local deterministic, autonomous HVT for a singlet state and
suitably chosen spin directions. This enables an experimental test of these statements, and therefore<br />
of the correctness of the philosophical bases of both theories. A. Shimony (1989) spoke, concerning<br />
the experimental testing of the Bell inequalities, of ‘experimental metaphysics’.<br />
However, the question of experimental testing puts the derivation of the Bell inequalities in another<br />
perspective. We no longer want to compare a HVT with quantum mechanics, but with experimental<br />
results. In this respect (VII. 6), implying perfect anti - correlation when ⃗a = ⃗ b, is overly<br />
idealized. In a real experiment the particle detectors are not perfectly efficient, in the sense that not<br />
all particles are registered. Imagine a detector which, even if A(⃗a, λ) = 1, sometimes gives 0, i.e. not<br />
measured, or even −1, i.e. wrongly measured. Moreover, in a contextual HVT the outcomes could also<br />
depend on the measuring context, i.e. on (possibly hidden) variables of the detectors. But also in<br />
this generalized situation it is possible to derive the inequality (VII. 13) from a locality assumption.<br />
We will show this by proving the next theorem.<br />
BELL’S SECOND THEOREM:<br />
A local deterministic contextual HVT is empirically inconsistent with quantum mechanics.<br />
Proof<br />
Assume that the quantities A and B are functions of three arguments,<br />
A = A(⃗a, λ, µ), B = B( ⃗ b, λ, ν) where A, B ∈ {− 1, 1}. (VII. 21)<br />
Here the local deterministic character of the HVT is expressed; the outcome of the measurement<br />
at the measuring apparatus measuring ⃗a · ⃗σ is determined by λ ∈ Λ, describing the source, by the<br />
local hidden variables of that measuring device, expressed symbolically by µ ∈ Λ a , and by the<br />
position ⃗a of the meter pointer. Therefore, the requirement of locality is that A does not depend<br />
on ⃗ b and ν, and B does not depend on ⃗a and µ. We also assume that the hidden variables of the<br />
apparatuses are independent of each other and of λ,<br />
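\[ \rho(\lambda, \mu, \nu) = \rho(\lambda)\, \rho_1(\mu)\, \rho_2(\nu). \tag{VII. 22} \]<br />
Defining<br />
\[ \langle A(\vec a, \lambda) \rangle := \int_{\Lambda_a} A(\vec a, \lambda, \mu)\, \rho_1(\mu)\, d\mu \tag{VII. 23} \]<br />
and<br />
\[ \langle B(\vec b, \lambda) \rangle := \int_{\Lambda_b} B(\vec b, \lambda, \nu)\, \rho_2(\nu)\, d\nu, \tag{VII. 24} \]<br />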
we have, instead of assumption (VII. 2), the much weaker requirements<br />
\[ |\langle A(\vec a, \lambda) \rangle| \leq 1 \quad \text{and} \quad |\langle B(\vec b, \lambda) \rangle| \leq 1, \tag{VII. 25} \]<br />
and we will show now that from this it is again possible to derive the Bell inequality (VII. 13).
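The expectation value in this HVT is<br />
\[ \begin{aligned} E(\vec a, \vec b) &= \int_\Lambda d\lambda \int_{\Lambda_a} d\mu \int_{\Lambda_b} d\nu\; A(\vec a, \lambda, \mu)\, B(\vec b, \lambda, \nu)\, \rho(\lambda, \mu, \nu) \\ &= \int_\Lambda \langle A(\vec a, \lambda) \rangle\, \langle B(\vec b, \lambda) \rangle\, \rho(\lambda)\, d\lambda, \end{aligned} \tag{VII. 26} \]<br />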
which is an ‘averaged’ version of (VII. 4). With (VII. 25) we see that<br />
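\[ \begin{aligned} |E(\vec a, \vec b) - E(\vec a, \vec b')| &\leq \int_\Lambda \big| \langle A(\vec a, \lambda) \rangle\, \big( \langle B(\vec b, \lambda) \rangle - \langle B(\vec b', \lambda) \rangle \big) \big|\, \rho(\lambda)\, d\lambda \\ &\leq \int_\Lambda \big| \langle B(\vec b, \lambda) \rangle - \langle B(\vec b', \lambda) \rangle \big|\, \rho(\lambda)\, d\lambda. \end{aligned} \tag{VII. 27} \]<br />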
Likewise we have<br />
\[ |E(\vec a', \vec b) + E(\vec a', \vec b')| \leq \int_\Lambda \big| \langle B(\vec b, \lambda) \rangle + \langle B(\vec b', \lambda) \rangle \big|\, \rho(\lambda)\, d\lambda, \tag{VII. 28} \]<br />
and therefore<br />
\[ |E(\vec a, \vec b) - E(\vec a, \vec b')| + |E(\vec a', \vec b) + E(\vec a', \vec b')| \leq 2, \tag{VII. 29} \]<br />
since |x + y| + |x − y| ≤ 2 if |x| ≤ 1 and |y| ≤ 1. We see that (VII. 29) is, indeed, the Bell<br />
inequality (VII. 13).<br />
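The elementary bound used in this last step can be written out as follows. The shorthand x and y for the two averaged quantities is introduced here only for the purpose of this worked step.<br />

```latex
% For real x, y the numbers |x+y| and |x-y| are, in some order, |x|+|y| and | |x|-|y| |, hence
\[ |x + y| + |x - y| = 2 \max(|x|, |y|) \leq 2 \qquad \text{if } |x| \leq 1 \text{ and } |y| \leq 1. \]
% Apply this for each \lambda with x = <B(b,\lambda)>, y = <B(b',\lambda)>, add (VII. 27) and (VII. 28),
% and use the normalisation (VII. 3):
\[ \int_\Lambda \Big( \big| \langle B(\vec b, \lambda) \rangle - \langle B(\vec b', \lambda) \rangle \big|
   + \big| \langle B(\vec b, \lambda) \rangle + \langle B(\vec b', \lambda) \rangle \big| \Big)\, \rho(\lambda)\, d\lambda
   \leq 2 \int_\Lambda \rho(\lambda)\, d\lambda = 2. \]
```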
For ⃗a ′ = ⃗ b ′ and the assumption of perfect anti - correlation E ( ⃗ b ′ , ⃗ b ′ ) = −1, from inequality<br />
(VII. 13) follows the original Bell inequality (VII. 10). But, as we showed, (VII. 13) remains<br />
valid under the weaker conditions (VII. 25). □<br />
◃ Remark<br />
It is not necessary to assume mutual independence for µ and λ or for ν and λ as in (VII. 22); the<br />
result (VII. 25) also follows when we make the weaker assumption that the conditional probability<br />
distributions of the apparatuses factorize the conjoint probability distribution ρ,<br />
\[ \rho(\lambda, \mu, \nu) = \rho(\lambda)\, \rho_1(\mu \mid \lambda)\, \rho_2(\nu \mid \lambda). \tag{VII. 30} \] ▹<br />
VII. 3<br />
WIGNER’S DERIVATION<br />
E.P. Wigner (1970) was the first to give an elegant derivation of a Bell inequality in terms<br />
of probabilities. We again consider the EPRB experiment from section VII. 1. Using three directions,<br />
⃗n 1 , ⃗n 2 , ⃗n 3 ∈ R 3 , define<br />
σ i := ⃗n i · ⃗σ and τ i := ⃗n i · ⃗τ with i ∈ {1, 2, 3}. (VII. 31)
Here ⃗σ and ⃗τ are the spin operators of particle 1 and particle 2, respectively. We assume the quantities<br />
of particle 1 to be independent of those of particle 2 and therefore<br />
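\[ (\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_j}[\lambda] = (\sigma_i \otimes \mathbb{1})_{\sigma_i \otimes \tau_{j'}}[\lambda], \tag{VII. 32} \]<br />
\[ (\mathbb{1} \otimes \tau_j)_{\sigma_i \otimes \tau_j}[\lambda] = (\mathbb{1} \otimes \tau_j)_{\sigma_{i'} \otimes \tau_j}[\lambda], \tag{VII. 33} \]<br />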
for i ′ ≠ i and j ′ ≠ j. This is the requirement of locality. Without this requirement we would have<br />
nine quantities in the HVT, namely the pairs (σ i ,τ j ), that is, as many quantities as measuring contexts.<br />
Now we have only six: σ 1 , σ 2 , σ 3 , τ 1 , τ 2 , τ 3 .<br />
The outcome of measurement of every spin quantity is ±1 in units of 1/2. A HVT must grant a<br />
probability to every combination of outcomes,<br />
\[ 0 \leq p(\sigma_1, \sigma_2, \sigma_3, \tau_1, \tau_2, \tau_3) \leq 1, \tag{VII. 34} \]<br />
with the usual marginal distributions, for instance<br />
\[ p(\sigma_1, \tau_1) = \sum_{\sigma_2 = \pm 1} \sum_{\sigma_3 = \pm 1} \sum_{\tau_2 = \pm 1} \sum_{\tau_3 = \pm 1} p(\sigma_1, \sigma_2, \sigma_3, \tau_1, \tau_2, \tau_3), \tag{VII. 35} \]<br />
and so on.<br />
◃ Remark<br />
Quantum mechanics does not have such joint probability distributions because these six quantities<br />
do not all commute pairwise. The spin quantities are not jointly measurable, but in<br />
the HVT their values are all fixed. ▹<br />
Calling the angles between ⃗n 1 , ⃗n 2 , ⃗n 3 : θ 12 , θ 23 , θ 31 , then in the singlet state we have, see chapter<br />
III, (III. 176) and (III. 177),<br />
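\[ {\rm Prob}(\sigma_i = 1 \wedge \tau_j = 1) = \tfrac{1}{2} \sin^2 \tfrac{1}{2} \theta_{ij}, \tag{VII. 36} \]<br />
\[ {\rm Prob}(\sigma_i = 1 \wedge \tau_j = -1) = \tfrac{1}{2} \cos^2 \tfrac{1}{2} \theta_{ij}. \tag{VII. 37} \]<br />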
These are the quantum mechanical probabilities and we will see that the HVT, satisfying requirement<br />
(VII. 34), cannot reproduce this. From (VII. 36) and (VII. 37) follows the requirement<br />
p (σ 1 , σ 2 , σ 3 , τ 1 , τ 2 , τ 3 ) = 0 unless σ 1 = − τ 1 , σ 2 = − τ 2 , σ 3 = − τ 3 , (VII. 38)<br />
because the hidden variables cannot assume values giving a positive spin of both particles in the same<br />
direction.<br />
The probability for σ 1 and τ 3 to be both +1 is, using (VII. 36),<br />
\[ \begin{aligned} \sum_{\sigma_2, \sigma_3} \sum_{\tau_1, \tau_2} p(+, \sigma_2, \sigma_3, \tau_1, \tau_2, +) &= \tfrac{1}{2} \sin^2 \tfrac{1}{2} \theta_{13} \\ &= p(+, +, -, -, -, +) + p(+, -, -, -, +, +). \end{aligned} \tag{VII. 39} \]
Likewise we calculate the following probabilities<br />
\[ \begin{aligned} \sum_{\sigma_1, \sigma_3} \sum_{\tau_1, \tau_2} p(\sigma_1, +, \sigma_3, \tau_1, \tau_2, +) &= \tfrac{1}{2} \sin^2 \tfrac{1}{2} \theta_{23} \\ &= p(+, +, -, -, -, +) + p(-, +, -, +, -, +) \end{aligned} \tag{VII. 40} \]<br />
and<br />
\[ \begin{aligned} \sum_{\sigma_2, \sigma_3} \sum_{\tau_1, \tau_3} p(+, \sigma_2, \sigma_3, \tau_1, +, \tau_3) &= \tfrac{1}{2} \sin^2 \tfrac{1}{2} \theta_{12} \\ &= p(+, -, +, -, +, -) + p(+, -, -, -, +, +). \end{aligned} \tag{VII. 41} \]<br />
From (VII. 40) and (VII. 41) it follows that<br />
\[ p(+, +, -, -, -, +) \leq \tfrac{1}{2} \sin^2 \tfrac{1}{2} \theta_{23} \quad \text{and} \tag{VII. 42} \]<br />
\[ p(+, -, -, -, +, +) \leq \tfrac{1}{2} \sin^2 \tfrac{1}{2} \theta_{12}, \tag{VII. 43} \]<br />
respectively. Consequently, we have for (VII. 39), the probability for σ 1 and τ 3 to be both +1,<br />
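\[ \tfrac{1}{2} \sin^2 \tfrac{1}{2} \theta_{23} + \tfrac{1}{2} \sin^2 \tfrac{1}{2} \theta_{12} \geq \tfrac{1}{2} \sin^2 \tfrac{1}{2} \theta_{13}, \tag{VII. 44} \]<br />
which, using sin² ½θ = ½ (1 − cos θ), is equivalent to<br />
\[ (1 - \cos \theta_{23}) + (1 - \cos \theta_{12}) \geq (1 - \cos \theta_{13}). \tag{VII. 45} \]<br />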
This is, in essence, the same as inequality (VII. 10); rewriting (VII. 45), realizing that 1 − cos θ ≥ 0,<br />
and comparing E(⃗a, ⃗ b) to − cos θ 12 etc. yields<br />
\[ 1 - \cos \theta_{23} \geq | - \cos \theta_{12} + \cos \theta_{13} |. \tag{VII. 46} \]<br />
Figure VII. 7: Violation of the Bell inequality again. [Diagram: coplanar directions ⃗n 1 , ⃗n 2 and ⃗n 3 , with ⃗n 2 between ⃗n 1 and ⃗n 3 at an angle ϕ to each.]<br />
With θ 23 = θ 12 = ½ θ 13 = ϕ as in diagram VII. 7, (VII. 45) becomes<br />
\[ 1 - 2 \cos \phi + \cos 2\phi \geq 0, \tag{VII. 47} \]<br />
and using cos 2ϕ = 2 cos² ϕ − 1 we see that<br />
\[ \cos \phi\, (1 - \cos \phi) \leq 0. \tag{VII. 48} \]<br />
Since 1 − cos ϕ ≥ 0 for every ϕ, this inequality is violated for every acute angle.
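Wigner's inequality (VII. 44) is easy to evaluate numerically with the singlet probabilities (VII. 36). The sketch below computes the left-hand side minus the right-hand side for three coplanar directions with ⃗n 2 halfway between ⃗n 1 and ⃗n 3 ; a negative value means the inequality is violated. The function name `wigner_gap` and the sample angles are choices made for this example.<br />

```python
import math


def wigner_gap(theta_12, theta_23, theta_13):
    """Left side minus right side of (VII. 44), using the singlet probabilities (VII. 36)."""
    def p_plus_plus(theta):
        return 0.5 * math.sin(theta / 2) ** 2
    return p_plus_plus(theta_23) + p_plus_plus(theta_12) - p_plus_plus(theta_13)


# theta_12 = theta_23 = phi and theta_13 = 2 phi, as in figure VII. 7.
for degrees in (30, 60, 89, 120):
    phi = math.radians(degrees)
    print(degrees, wigner_gap(phi, phi, 2 * phi))
```

In this configuration the gap equals −½ cos ϕ (1 − cos ϕ), so it is negative for acute ϕ and positive for obtuse ϕ, in line with (VII. 48).<br />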
EXERCISE 32. What type of HVT is excluded by Wigner’s reasoning?<br />
◃ Remark<br />
Wigner (1970) makes the observation that the HVT would have been possible if the terms in (VII. 44)<br />
had been sin ½θ instead of sin² ½θ. Apparently, our world depends on such ‘minimal’ mathematical<br />
differences. ▹<br />
VII. 4<br />
THE DERIVATION OF EBERHARD AND STAPP<br />
In the previous derivations of the Bell inequalities hidden variables were assumed, which represent<br />
properties of the pair of particles and determine the outcomes of measurements of all physical<br />
quantities. As a consequence, in this HVT a joint probability is defined for the values of non -<br />
commuting quantities also, as we saw in Wigner’s derivation. This follows from the fact that at<br />
given λ both A(⃗a, λ) and A(⃗a ′ , λ) are fixed, for example<br />
\[ p\big( A(\vec a) = 1 \wedge A(\vec a') = 1 \big) = \int_\Delta \rho(\lambda)\, d\lambda, \tag{VII. 49} \]<br />
where ∆ ⊂ Λ is the area in which both A(⃗a, λ) = 1 and A(⃗a ′ , λ) = 1. Since quantum mechanics<br />
does not acknowledge such ‘simultaneous probabilities’ for non - commuting quantities, the quantities<br />
not being simultaneously measurable, it could be suspected that this property of the HVT is the main<br />
reason for the deviation from quantum mechanics, instead of locality or determinism.<br />
In the next derivation of the Bell inequality, given by P. Eberhard and H. Stapp (1977), the existence<br />
of hidden variables is not assumed. They claim that the Bell inequality follows from an assumption<br />
of locality only. However, what will be shown to be necessary in this derivation, is the<br />
assumption that we can speak reasonably about the outcomes of measurements which have not actually<br />
been carried out.<br />
THE EBERHARD - STAPP THEOREM:<br />
Quantum mechanics is a non - local theory.<br />
Proof<br />
Consider again the EPRB experiment. Let $\vec{a}$ and $\vec{a}'$ be two settings of the spin meter at A, and $\vec{b}$<br />
and $\vec{b}'$ likewise at B. We can carry out four experiments:<br />
\[ \text{I}: \vec{a}, \vec{b} \qquad \text{II}: \vec{a}, \vec{b}' \qquad \text{III}: \vec{a}', \vec{b} \qquad \text{IV}: \vec{a}', \vec{b}'. \tag{VII. 50} \]<br />
Define, for the $n$th pair of particles, $a_n(\text{I})$ as the outcome of a spin measurement on the particle<br />
traveling to A while the meter at A points in the direction $\vec{a}$ and while, on the other<br />
particle, which travels to B, spin in the direction $\vec{b}$ is measured; this gives $a_n(\text{I}) = \pm 1$ for<br />
experiment I, and likewise for $a_n(\text{II})$, $a'_n(\text{III})$, $a'_n(\text{IV})$, $b_n(\text{I})$, $b'_n(\text{II})$, $b_n(\text{III})$ and $b'_n(\text{IV})$.<br />
These values represent outcomes of actual or possible measurements, not actual<br />
properties of the particles which also exist if they are not measured.
The assumption of locality is that the outcome of a spin measurement on particle 1, in direction $\vec{a}$,<br />
does not depend on which spin direction, $\vec{b}$ or $\vec{b}'$, is measured on the other, remote particle 2. This<br />
is the supposition of locality in the Eberhard-Stapp theorem, leading to what we will call the<br />
matching condition,<br />
\[ a_n(\text{I}) = a_n(\text{II}), \quad a'_n(\text{III}) = a'_n(\text{IV}), \quad b_n(\text{I}) = b_n(\text{III}), \quad b'_n(\text{II}) = b'_n(\text{IV}), \tag{VII. 51} \]<br />
for all $N$ particle pairs in the singlet state $|\Psi_0\rangle$.<br />
Now we can define the following mathematical expression<br />
\[ \gamma_n := a_n(\text{I})\, b_n(\text{I}) + a_n(\text{II})\, b'_n(\text{II}) + a'_n(\text{III})\, b_n(\text{III}) - a'_n(\text{IV})\, b'_n(\text{IV}), \tag{VII. 52} \]<br />
where the first term corresponds to experiment I, the second to experiment II, etc. Because of the<br />
value assignment $\pm 1$, $\gamma_n$ is an even integer. Since, under the matching condition, the fourth term is the product of the first three<br />
terms, subtracting the fourth term means that $\gamma_n$ takes only two values, as we will see. Moreover,<br />
the subtraction allows for an inequality similar to Bell’s inequality (VII. 13).<br />
In (VII. 52) we can omit the labels referring to the numbers of the experiments because<br />
of the matching condition (VII. 51); $a_n := a_n(\text{I}) = a_n(\text{II})$, etc. Rewriting (VII. 52) as<br />
\[ \gamma_n = a_n (b_n + b'_n) + a'_n (b_n - b'_n), \tag{VII. 53} \]<br />
we immediately see that, because of the value assignment $\pm 1$, either the first or the second term<br />
equals 0, yielding for all $n$<br />
\[ \gamma_n = \pm 2. \tag{VII. 54} \]<br />
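The claim that the combination in (VII. 53) only takes the values ±2 can be checked by brute force. The following is a minimal sketch, not part of the original derivation; the function name `gamma` and its argument names are illustrative choices.<br />

```python
from itertools import product

def gamma(a, a_prime, b, b_prime):
    """Combination (VII. 53) for one particle pair, all arguments being +1 or -1."""
    return a * (b + b_prime) + a_prime * (b - b_prime)

# Enumerate all 16 assignments of +1/-1 to (a_n, a'_n, b_n, b'_n).
values = {gamma(*outcome) for outcome in product((+1, -1), repeat=4)}
print(sorted(values))  # [-2, 2]
```

Every one of the 16 assignments gives either +2 or −2, so the average over any number of pairs lies between −2 and +2.<br />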
Averaging over $N$ recurrences of the experiment we have<br />
\[ \Bigl| \frac{1}{N} \sum_{n=1}^{N} \gamma_n \Bigr| = \frac{1}{N} \Bigl| \sum_{n=1}^{N} a_n b_n + \sum_{n=1}^{N} a_n b'_n + \sum_{n=1}^{N} a'_n b_n - \sum_{n=1}^{N} a'_n b'_n \Bigr| \leq 2. \tag{VII. 55} \]<br />
Defining the correlation coefficients<br />
\[ c_N(\vec{a}, \vec{b}) := \frac{1}{N} \sum_{n=1}^{N} a_n b_n \quad \text{etc.,} \tag{VII. 56} \]<br />
we conclude<br />
\[ \bigl| c_N(\vec{a}, \vec{b}) + c_N(\vec{a}, \vec{b}') + c_N(\vec{a}', \vec{b}) - c_N(\vec{a}', \vec{b}') \bigr| \leq 2. \tag{VII. 57} \]<br />
This is indeed a Bell inequality again, equivalent to inequality (VII. 13) in the limit $N \to \infty$.<br />
The expectation value of $c(\vec{a}, \vec{b}) = \langle a_n b_n \rangle$ in quantum mechanics is given by (VII. 5) and the<br />
contradiction with (VII. 57) follows as in section VII. 2. □
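<br />
The size of the contradiction can be illustrated numerically. The sketch below assumes that (VII. 5) has the standard singlet form $E(\vec{a}, \vec{b}) = -\vec{a} \cdot \vec{b}$, restricts all settings to one plane, and uses hypothetical angles chosen to maximise the violation; none of these choices is taken from the text.<br />

```python
import math

def singlet_correlation(angle_a, angle_b):
    """Assumed form of (VII. 5): E = -cos(theta), theta being the angle between the settings."""
    return -math.cos(angle_a - angle_b)

# Hypothetical coplanar settings in radians.
a, a_prime = 0.0, math.pi / 2
b, b_prime = math.pi / 4, -math.pi / 4

chsh = (singlet_correlation(a, b) + singlet_correlation(a, b_prime)
        + singlet_correlation(a_prime, b) - singlet_correlation(a_prime, b_prime))
print(abs(chsh))  # about 2.83, which exceeds the bound 2 of (VII. 57)
```

For these settings the absolute value equals $2\sqrt{2}$, which no set of outcomes obeying the matching condition can reproduce.<br />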
◃ Remark<br />
The derivation of (VII. 57) comes directly from expression (VII. 52) and, as a result, the existence of<br />
hidden variables does not have to be presumed; only locality was required. Sensationally, we seem to<br />
have proved that quantum mechanics is empirically inconsistent with the requirement of locality. ▹<br />
The experimental violation of the Bell inequalities thus leads us to the conclusion that physical<br />
reality is not local. What we, however, have presupposed in the matching condition (VII. 51) is that<br />
we can simultaneously assign values to $a_n$ and $a'_n$, although they cannot be simultaneously measured<br />
because the spin measuring device cannot be in both positions $\vec{a}$ and $\vec{a}' \neq \vec{a}$ at the same time. In fact,<br />
of the set of four terms in (VII. 52), at the most one of them is experimentally realizable. Still, we<br />
spoke of outcomes of measurements that have not actually been carried out. Of course, the derivation<br />
of the Bell inequality (VII. 57) from the matching condition (VII. 51) is mathematically flawless. The<br />
question is whether the matching condition (VII. 51) follows from the requirement of locality. We<br />
will now explore this question further.<br />
VII. 4. 1<br />
COUNTERFACTUAL CONDITIONAL STATEMENTS AND INDETERMINISM<br />
Let $a_n$ be the outcome of experiment I. With their matching condition, Eberhard and Stapp claim<br />
that this value of $a_n$ would be unaltered if we had carried out experiment II instead of experiment I,<br />
because these experiments only differ in the setting of the B-meter, which is far away. Therefore,<br />
$a_n$ is the outcome which the spin meter A would have given for the $n$th pair of particles in both<br />
experiment I and experiment II. Redhead (1987, p. 92) formulates this requirement as follows:<br />
PRINCIPLE OF LOCAL COUNTERFACTUAL DEFINITENESS (PLCD):<br />
The result of an experiment which could be performed on a microscopic system has a<br />
definite value which does not depend on the setting of a remote piece of apparatus.<br />
This means that if this setting had been different, the outcome of the experiment would<br />
not have been different. Using the same mathematics as before it follows that<br />
PLCD → Bell inequality. (VII. 58)<br />
Since PLCD is an assumption of locality concerning outcomes of measurements, (VII. 58) seems<br />
to be independent of the existence of hidden variables. But appearances are deceptive. In fact, PLCD<br />
is only reasonable in a deterministic context, and not in the case of indeterminism.<br />
Consider the following example given by Redhead (ibid.). Suppose that, at $t_1$, just before the<br />
clock strikes twelve, I raise my hand. Now I ask whether the clock would also have struck if<br />
I had not raised my hand at $t_1$. Intuitively, the right answer is ‘Yes’, in agreement with PLCD. Now<br />
replace the clock by a radioactive atom which decays at $t_2$. Suppose I raised my hand at $t_1 < t_2$;<br />
would the atom also have decayed if I had not done this? Now the answer is far from clear. If the<br />
decay is purely indeterministic, a recurrence of the experiment, even if it is just a thought experiment,<br />
does not have to have the same outcome. The supposition that the atom would not have decayed if I<br />
had not raised my hand does not contradict locality.<br />
The assumptions that outcomes of measurements retain the same values even if they are<br />
not measured, or that measurements which are not carried out have certain outcomes in advance, are<br />
only reasonable in a deterministic context. But in a deterministic context these assumptions do not<br />
differ from each other, and an outcome of measurement is decisively linked to the value the quantity<br />
had just beforehand, and therefore to a hidden variable.<br />
The conclusion is that the assumption of Eberhard and Stapp, PLCD, is no more general than the<br />
assumption that the value $a_n$ is a property of the particles which is determined in advance, and which<br />
is independent of the setting of the meter at B. This means that the derivation is no more general<br />
than the derivation for a local deterministic HVT.<br />
VII. 5<br />
STOCHASTIC HIDDEN VARIABLES<br />
In this section we will no longer require determinism in the HVT; the λ only determine the probability<br />
that a quantity has a certain value, which is revealed by the measuring apparatus in the way a<br />
balance reveals our weight. A stochastic HVT is linked more closely to quantum mechanics, enabling<br />
a better-defined comparison between the assumptions leading to the Bell inequalities on the one<br />
hand, and quantum mechanics on the other.<br />
In our stochastic HVT we assume the existence of a probability distribution at given directions<br />
$\vec{a}, \vec{b} \in \mathbb{R}^3$ of the spin meters in the EPRB experiment<br />
\[ p_{\vec{a}, \vec{b}}(a, b, \lambda), \tag{VII. 59} \]<br />
which is the probability to find for the quantities $A = \vec{\sigma}_1 \cdot \vec{a}$ and $B = \vec{\sigma}_2 \cdot \vec{b}$ the values $a$ and $b$,<br />
respectively, where $a, b = \pm 1$. Again, $\lambda \in \Lambda$ is the hidden variable describing the source.<br />
Such a probability distribution can always be written in terms of conditional probabilities,<br />
\[ p_{\vec{a}, \vec{b}}(a, b, \lambda) = p_{\vec{a}, \vec{b}}(a \mid b \wedge \lambda)\; p_{\vec{a}, \vec{b}}(b \mid \lambda)\; \rho_{\vec{a}, \vec{b}}(\lambda). \tag{VII. 60} \]<br />
To be able to derive the Bell inequalities we make the following three suppositions.
1. Outcome independence<br />
The probability to find a value $a$ for $\vec{a} \cdot \vec{\sigma}$ is ‘completely’ determined by the settings of the spin<br />
meters and by λ; in particular, it is not necessary to also give the outcome $b$, and likewise for finding a<br />
value $b$,<br />
\[ p_{\vec{a}, \vec{b}}(a \mid b \wedge \lambda) = p_{\vec{a}, \vec{b}}(a \mid \lambda) \quad \text{and} \quad p_{\vec{a}, \vec{b}}(b \mid a \wedge \lambda) = p_{\vec{a}, \vec{b}}(b \mid \lambda). \tag{VII. 61} \]<br />
2. Parameter independence<br />
The probability to find the outcome of measurement $a$ or $b$ is independent of the setting of the<br />
remote spin meter,<br />
\[ p_{\vec{a}, \vec{b}}(a \mid \lambda) = p_{\vec{a}}(a \mid \lambda) \quad \text{and} \quad p_{\vec{a}, \vec{b}}(b \mid \lambda) = p_{\vec{b}}(b \mid \lambda). \tag{VII. 62} \]<br />
3. Source independence<br />
The distribution of λ in the source does not depend on the settings of the spin meters,<br />
\[ \rho_{\vec{a}, \vec{b}}(\lambda) = \rho(\lambda). \tag{VII. 63} \]<br />
In principle we can adjust the spin meters ‘at the last moment’, long after the particles have left<br />
the source. It is reasonable to assume that the source is not influenced by what happens to the<br />
measuring devices in the future.<br />
Now we will prove the next theorem.<br />
BELL’S THIRD THEOREM:<br />
A stochastic HVT which is in agreement with outcome, parameter and source independence<br />
is empirically inconsistent with quantum mechanics.<br />
Proof<br />
As a consequence of the aforementioned properties, in every local stochastic HVT, (VII. 60) becomes<br />
\[ p_{\vec{a}, \vec{b}}(a, b, \lambda) = p_{\vec{a}}(a \mid \lambda)\; p_{\vec{b}}(b \mid \lambda)\; \rho(\lambda), \tag{VII. 64} \]<br />
or<br />
\[ p_{\vec{a}, \vec{b}}(a, b \mid \lambda) = p_{\vec{a}}(a \mid \lambda)\; p_{\vec{b}}(b \mid \lambda), \tag{VII. 65} \]<br />
which means that the quantities A and B are statistically independent of each other for given λ.<br />
This statement is often called factorizability or conditional independence.
Using (VII. 64), another Bell inequality can be derived for $E(\vec{a}, \vec{b})$ by means of the relation<br />
\[ \begin{aligned} E(\vec{a}, \vec{b}) &= \int_{\Lambda} \bigl( p_{\vec{a}, \vec{b}}(1, 1, \lambda) - p_{\vec{a}, \vec{b}}(1, -1, \lambda) - p_{\vec{a}, \vec{b}}(-1, 1, \lambda) + p_{\vec{a}, \vec{b}}(-1, -1, \lambda) \bigr)\, d\lambda \\ &= \int_{\Lambda} \bigl( p_{\vec{a}}(1 \mid \lambda) - p_{\vec{a}}(-1 \mid \lambda) \bigr) \bigl( p_{\vec{b}}(1 \mid \lambda) - p_{\vec{b}}(-1 \mid \lambda) \bigr)\, \rho(\lambda)\, d\lambda. \end{aligned} \tag{VII. 66} \]<br />
Defining<br />
\[ f(\vec{a}, \lambda) := p_{\vec{a}}(1 \mid \lambda) - p_{\vec{a}}(-1 \mid \lambda) \tag{VII. 67} \]<br />
and<br />
\[ g(\vec{b}, \lambda) := p_{\vec{b}}(1 \mid \lambda) - p_{\vec{b}}(-1 \mid \lambda), \tag{VII. 68} \]<br />
we see that<br />
\[ |f(\vec{a}, \lambda)| \leq 1 \quad \text{and} \quad |g(\vec{b}, \lambda)| \leq 1, \tag{VII. 69} \]<br />
which brings us back to (VII. 25) and the subsequent equations so that again we obtain the Bell<br />
inequality (VII. 13). Violation of this Bell inequality means that (VII. 64) cannot apply and<br />
therefore no HVT with source independence (VII. 63) can guarantee both outcome independence (VII. 61) and parameter independence<br />
(VII. 62). □<br />
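The role of factorizability can be illustrated with randomly generated models. The sketch below is only an illustration under stated assumptions: λ takes finitely many values, and the arrays `rho`, `f` and `g` are hypothetical stand-ins for $\rho(\lambda)$ and the functions of (VII. 67) and (VII. 68). It evaluates the combination of correlations that appears in (VII. 57) and records the largest absolute value found.<br />

```python
import random

def chsh_for_random_local_model(rng, n_lambda=5):
    """CHSH combination for one randomly drawn factorizable model with discrete lambda."""
    weights = [rng.random() for _ in range(n_lambda)]
    total = sum(weights)
    rho = [w / total for w in weights]  # rho(lambda), normalised to 1
    # f[s][k] stands for f(setting s, lambda_k) and g[s][k] for g(setting s, lambda_k),
    # both bounded by 1 in absolute value as in (VII. 69).
    f = {s: [rng.uniform(-1, 1) for _ in range(n_lambda)] for s in ("a", "a'")}
    g = {s: [rng.uniform(-1, 1) for _ in range(n_lambda)] for s in ("b", "b'")}

    def E(s, t):
        # Discrete analogue of the second line of (VII. 66).
        return sum(r * fs * gt for r, fs, gt in zip(rho, f[s], g[t]))

    return E("a", "b") + E("a", "b'") + E("a'", "b") - E("a'", "b'")

rng = random.Random(0)
largest = max(abs(chsh_for_random_local_model(rng)) for _ in range(10_000))
print(largest)
```

However the model is drawn, the value stays within the bound 2, whereas the singlet correlations reach $2\sqrt{2}$ for suitable settings.<br />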
VII. 5. 1<br />
OUTCOME, PARAMETER AND SOURCE INDEPENDENCE<br />
The importance of the distinction between outcome and parameter independence was first brought<br />
to attention by J. Jarrett (1984).<br />
1. Outcome independence, (VII. 61), means that the probability of outcome b, for given λ, does<br />
not depend on the outcome a. This is motivated by the idea that λ gives a complete description of<br />
the state of the pair of particles; the variable λ contains an exhaustive specification of all factors<br />
which are relevant for the outcomes of measurement. Therefore, specifying the extra information that<br />
outcome a has occurred can, if λ is already known, not lead to new information on b.<br />
The purpose of the requirement can be illustrated by the following example, in which it is not<br />
satisfied. Suppose that two people, without looking, each draw a little ball out of a box containing two<br />
little balls, one black and one white. Afterwards they separate; one travels to New York, the other to<br />
Tokyo. Now consider a ‘stochastic hidden variable’ with probability $\tfrac{1}{2}$ for the little balls to be black<br />
or white. On arrival in Tokyo the traveler opens his hand and sees that his little ball is black, which<br />
instantaneously enables him to predict the color of the little ball in New York: it has to be white. Here<br />
the outcome of a measurement of the one little ball does provide relevant information on the outcome<br />
of a measurement of the other little ball.
The idea behind the requirement of outcome independence is that such a situation could only<br />
occur because the HVT was incomplete; in a complete specification of the state of the pair of particles<br />
which existed at the beginning of the trip also the color of the little balls should have been included,<br />
even though the travelers did not know the color of their little ball. Then it automatically follows,<br />
at given λ, that the little ball in New York is white and the observation in Tokyo provides no new<br />
information.<br />
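The example of the two little balls can be made explicit with a small calculation. This is a minimal sketch; the labels `tokyo_has_black` and `tokyo_has_white` for the complete state λ, and the helper names, are illustrative assumptions.<br />

```python
from fractions import Fraction

# Joint distribution over (lambda, colour seen in Tokyo, colour seen in New York).
# Here lambda is the complete state: it records who drew which ball.
joint = {
    ("tokyo_has_black", "black", "white"): Fraction(1, 2),
    ("tokyo_has_white", "white", "black"): Fraction(1, 2),
}

def prob(predicate):
    return sum(p for outcome, p in joint.items() if predicate(*outcome))

def conditional(event, given):
    return prob(lambda *o: event(*o) and given(*o)) / prob(given)

ny_white = lambda lam, tokyo, ny: ny == "white"
tokyo_black = lambda lam, tokyo, ny: tokyo == "black"
lam_known = lambda lam, tokyo, ny: lam == "tokyo_has_black"
lam_and_tokyo = lambda *o: lam_known(*o) and tokyo_black(*o)

# Without lambda, the Tokyo outcome changes the prediction for New York.
print(prob(ny_white), conditional(ny_white, tokyo_black))  # 1/2 1
# Given the complete lambda, the Tokyo outcome adds nothing.
print(conditional(ny_white, lam_known), conditional(ny_white, lam_and_tokyo))  # 1 1
```

With the incomplete description the probabilities differ, so outcome independence fails; once λ includes the colors, the two conditional probabilities coincide.<br />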
2. Parameter independence, (VII. 62), means that the probability distribution of the outcomes<br />
at A is independent of external changes at B, e.g. the orientation of the spin meter. The argument leading<br />
to the assumption of parameter independence is generally associated with the possibility of signaling.<br />
Suppose that, for example, settings $\vec{b}$ and $\vec{b}'$ existed such that<br />
\[ p_{\vec{a}, \vec{b}}(a \mid \lambda) \neq p_{\vec{a}, \vec{b}'}(a \mid \lambda), \tag{VII. 70} \]<br />
then, in principle, it is possible to instantaneously exchange signals between experimenters located<br />
at A and B. Since the experimenter located at B can choose whether he points his spin meter in the<br />
direction $\vec{b}$ or $\vec{b}'$, an experimenter located at A is able, if the source emits particle pairs in a pure<br />
hidden-variables state λ, to register the relative frequency of outcomes at A and thereby infer<br />
which setting has been chosen by the experimenter at B. Violation of parameter independence<br />
therefore means that the HVT enables the instantaneous exchange of signals over arbitrarily large<br />
distances.<br />
3. Source independence, (VII. 63), means that the probability distribution over the hidden variable<br />
describing the particle pair cannot depend on the measuring directions chosen by the experimenters.<br />
The argumentation leading to the assumption of source independence is often described<br />
in terms of the ‘free will’ of the experimenters. The experimenters are considered to be completely<br />
‘free’ in their decision how to point their spin meters, and even to make their choice just at the last<br />
moment, when the particles have long left the source. Therefore, the probability distribution ρ(λ),<br />
which characterizes the source of the particle pairs, cannot depend on that.<br />
Of course, here too violation of the requirement is logically conceivable. It is<br />
possible that this freedom does not exist, and that when the particles are emitted, the directions in which<br />
the experimenters will measure have already been determined. It is also conceivable that by some<br />
other cause a correlation exists between λ and the directions $\vec{a}$ and $\vec{b}$, influencing both. The first case,<br />
in which all relevant factors of the EPR experiment are determined in advance and the experimenters<br />
have no free will, is called super-determinism. In a super-deterministic HVT the Bell<br />
inequalities can therefore also be violated.<br />
VII. 5. 2<br />
QUANTUM MECHANICS AS A STOCHASTIC HVT<br />
Since it exclusively gives probability statements concerning outcomes of measurements, a stochastic<br />
HVT conceptually differs less from quantum mechanics than other HVTs. In fact we can, without<br />
objection, take quantum mechanics itself as an example of a stochastic HVT by identifying λ with the<br />
quantum mechanical state and Λ with the relevant Hilbert space. Since quantum mechanics does not<br />
satisfy the Bell inequalities, it is interesting to examine which of the aforementioned requirements is<br />
violated inevitably by quantum mechanics.
3. Source independence. We already discussed the possibility of violation of the Bell inequalities<br />
by a super-deterministic theory without source independence. Whether we can somehow establish<br />
if we have free will or not is a philosophical question; violation of source independence is therefore a possibility, but not an inevitability.<br />
This leaves outcome and parameter independence.<br />
2. Parameter independence. Describing the pairs of particles in the singlet state $|\Psi_0\rangle$, (III. 165),<br />
by a pure hidden-variables state, the probability distribution is a delta distribution,<br />
\[ \rho_{\Psi_0}(\lambda) = \delta_{\lambda_0}(\lambda) := \delta(\lambda - \lambda_0), \tag{VII. 71} \]<br />
which leads to<br />
\[ \int_{\Lambda} p_{\vec{a}, \vec{b}, \lambda_0}(a, b, \lambda)\, \rho_{\Psi_0}(\lambda)\, d\lambda = p_{\vec{a}, \vec{b}, \lambda_0}(a, b). \tag{VII. 72} \]<br />
The probabilities for the outcomes of measurement are given by (III. 176),<br />
\[ \begin{aligned} p_{\vec{a}, \vec{b}, \lambda_0}(a = 1 \wedge b = 1) &= \tfrac{1}{2} \sin^2 \tfrac{1}{2} \theta_{\vec{a}, \vec{b}}, \\ p_{\vec{a}, \vec{b}, \lambda_0}(a = 1 \wedge b = -1) &= \tfrac{1}{2} \cos^2 \tfrac{1}{2} \theta_{\vec{a}, \vec{b}}. \end{aligned} \tag{VII. 73} \]<br />
EXERCISE 33. Also calculate the other two joint probabilities, that is, for $a = -1 \wedge b = -1$<br />
and $a = -1 \wedge b = 1$.<br />
The marginal probabilities are, using (VII. 73),<br />
\[ \begin{aligned} p_{\vec{a}, \vec{b}}(a = 1 \mid \lambda_0) &= p_{\vec{a}, \vec{b}, \lambda_0}(a = 1 \wedge b = 1) + p_{\vec{a}, \vec{b}, \lambda_0}(a = 1 \wedge b = -1) = \tfrac{1}{2}, \\ p_{\vec{a}, \vec{b}}(b = 1 \mid \lambda_0) &= p_{\vec{a}, \vec{b}, \lambda_0}(a = 1 \wedge b = 1) + p_{\vec{a}, \vec{b}, \lambda_0}(a = -1 \wedge b = 1) = \tfrac{1}{2}, \end{aligned} \tag{VII. 74} \]<br />
which means that, both being equal to $\tfrac{1}{2}$, they do not depend on the setting of a remote measuring<br />
device. Consequently, even the quantum mechanical correlations in the singlet cannot be used for<br />
signaling; there is no actio in distans. This leads to the following theorem.<br />
NO-SIGNALING THEOREM:<br />
Quantum mechanics satisfies parameter independence, i.e., if subsystems of a composite<br />
physical system no longer interact, the probability of finding certain outcomes of measurement<br />
for an arbitrary quantity of subsystem 1 is independent of which quantity of<br />
subsystem 2 is measured, and vice versa.<br />
EXERCISE 34. Prove that the EPRB experiment is an example of the no-signaling theorem.<br />
Optional: prove, in general, the no-signaling theorem using state operators. Whoever cannot<br />
solve this problem is advised to consult Ghirardi, Rimini and Weber (1980).
1. Outcome independence. In quantum mechanics it is indeed the requirement of outcome independence<br />
that is not satisfied. The conditional probabilities, i.e., the probabilities for the spin of<br />
particle 1 to be found in the direction ⃗a, given that the spin of particle 2 was found in the direction ⃗ b<br />
and vice versa, which were defined in (III. 172), p. 74, are clearly not independent,<br />
\[ p_{\vec{a}, \vec{b}}(a = 1 \mid \lambda_0 \wedge b = 1) = \sin^2 \tfrac{1}{2} \theta_{\vec{a}, \vec{b}}, \tag{VII. 75} \]<br />
\[ p_{\vec{a}, \vec{b}}(a = -1 \mid \lambda_0 \wedge b = 1) = \cos^2 \tfrac{1}{2} \theta_{\vec{a}, \vec{b}}. \tag{VII. 76} \]<br />
According to quantum mechanics, physical systems are inseparable. However, this interdependence<br />
of outcomes cannot be used to exchange signals since we do not control the outcomes of spin measurements<br />
and therefore we are unable to actively influence the probability distribution over the outcomes<br />
of measurements from a distance. Shimony (1984, p. 227) called it passion at a distance. The experimenter<br />
at B can, on the basis of his observation, indeed make a better prediction concerning an outcome<br />
at A than is possible on the basis of knowledge of the singlet state alone, but he cannot warn the<br />
observer at A; he can only watch passively.<br />
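The contrast between parameter independence and outcome dependence can be checked numerically for the singlet. The sketch below is an illustration under stated assumptions: both settings lie in the x-z plane and are described by a single angle, a state is stored as a dictionary from spin labels (0 for up, 1 for down along z) to real amplitudes, and all function names are hypothetical.<br />

```python
import math

def apply_spin(state, particle, angle):
    """Apply cos(angle)*sigma_z + sin(angle)*sigma_x to one particle of a two-spin state."""
    out = {}
    for bits, amp in state.items():
        sign = 1 if bits[particle] == 0 else -1   # sigma_z eigenvalue (0 = up, 1 = down)
        flipped = list(bits)
        flipped[particle] ^= 1                    # sigma_x flips the spin
        for new_bits, coeff in ((bits, math.cos(angle) * sign),
                                (tuple(flipped), math.sin(angle))):
            out[new_bits] = out.get(new_bits, 0.0) + coeff * amp
    return out

def inner(phi, psi):
    """Inner product of two states with real amplitudes."""
    return sum(phi.get(bits, 0.0) * amp for bits, amp in psi.items())

SINGLET = {(0, 1): 1 / math.sqrt(2), (1, 0): -1 / math.sqrt(2)}

def joint(angle_a, angle_b, a, b):
    """p(a, b) = <(1 + a*sigma_A)(1 + b*sigma_B)> / 4 in the singlet state."""
    mean_a = inner(SINGLET, apply_spin(SINGLET, 0, angle_a))
    mean_b = inner(SINGLET, apply_spin(SINGLET, 1, angle_b))
    mean_ab = inner(SINGLET, apply_spin(apply_spin(SINGLET, 1, angle_b), 0, angle_a))
    return (1 + a * mean_a + b * mean_b + a * b * mean_ab) / 4

def marginal_a(angle_a, angle_b, a):
    return joint(angle_a, angle_b, a, +1) + joint(angle_a, angle_b, a, -1)

def conditional_a(angle_a, angle_b, a, b):
    p_b = joint(angle_a, angle_b, +1, b) + joint(angle_a, angle_b, -1, b)
    return joint(angle_a, angle_b, a, b) / p_b

angle_a = 0.3
for angle_b in (1.0, 2.5):  # two hypothetical remote settings
    print(marginal_a(angle_a, angle_b, +1), conditional_a(angle_a, angle_b, +1, +1))
```

For both remote settings the marginal probability at A is 1/2, in line with (VII. 74), while the conditional probability follows $\sin^2 \tfrac{1}{2}\theta$ as in (VII. 75) and so does depend on the outcome at B.<br />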
The singlet $|\Psi_0\rangle \in \mathbb{C}^4$ does violate the Bell inequalities for suitably chosen spin quantities.<br />
The singlet is not factorizable, i.e., it cannot be written as a direct product of two states in $\mathbb{C}^2$; it is<br />
entangled. We can raise the question whether types of quantum mechanical states exist which do not violate<br />
a Bell inequality for any choice of four spin quantities.<br />
Capasso, Fortunato and Selleri (1973) proved that the CHSH inequality, (VII. 13), is upheld for<br />
every choice of four spin quantities by all factorizable states and by all mixtures thereof. Violations<br />
are therefore only possible for entangled states. Conversely, Home and Selleri (1991, pp. 22-26)<br />
proved that for every entangled pure state, that is, a state which cannot be written as a direct product,<br />
it is always possible to choose spin quantities in such a way that the CHSH inequality is violated.<br />
These results can be summarized in the statement that, for pure states, entanglement and violation of Bell inequalities<br />
are equivalent. This confirms Schrödinger’s insight from 1935 (Schrödinger 1935a) that the<br />
existence of entangled states marks the cardinal difference between classical and quantum mechanics.<br />
VII. 6<br />
AN ALGEBRAIC PROOF WITHOUT INEQUALITIES<br />
The contradiction between a local deterministic or a local stochastic HVT, both either autonomous<br />
or contextual, on the one hand, and quantum mechanics on the other hand, is statistical in nature, because<br />
it concerns inequalities in terms of expectation values or probabilities, like all Bell’s theorems.<br />
But Kochen and Specker’s theorem, which we discussed in V. 3, does not contain any inequalities. In<br />
this case it is customary to speak of an algebraic proof.<br />
This raises the question whether an algebraic proof of Bell’s theorems is also possible, that is,<br />
without appealing to the measurement postulate. The answer is affirmative. Using a spin state of a<br />
composite system of four particles, D.M. Greenberger, M.A. Horne and A. Zeilinger (1989) showed<br />
that it is mathematically impossible to locally and separably assign values to all spin quantities. Here<br />
we will show a simplified version given by N.D. Mermin (1993), where, in using |GHZ⟩, we refer to<br />
the aforementioned authors.
Consider a composite system of three spin-1/2 fermions with pure states in the direct product<br />
Hilbert space $\mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2 = \mathbb{C}^8$. We look at 10 physical quantities which correspond to the<br />
spin operators represented in the Mermin pentagon, figure VII. 8. In this diagram $\sigma_y^1$ is shorthand<br />
for $\sigma_y(1) \otimes \mathbf{1}(2) \otimes \mathbf{1}(3)$, and $\sigma_y^1 \sigma_y^2 \sigma_x^3$ likewise for $\sigma_y(1) \otimes \sigma_y(2) \otimes \sigma_x(3)$, etc. On every<br />
straight line through the Mermin pentagon we find four commuting operators. These operators are<br />
products of commuting operators with eigenvalues ±1 and therefore have eigenvalues ±1 also.<br />
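[Figure VII. 8: The Mermin pentagon, showing the operators $\sigma_x^1$, $\sigma_y^1$, $\sigma_x^2$, $\sigma_y^2$, $\sigma_x^3$, $\sigma_y^3$, $\sigma_x^1 \sigma_x^2 \sigma_x^3$, $\sigma_y^1 \sigma_y^2 \sigma_x^3$, $\sigma_y^1 \sigma_x^2 \sigma_y^3$ and $\sigma_x^1 \sigma_y^2 \sigma_y^3$]<br />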
Using the properties of the Pauli matrices (III. 122), p. 66, it can be shown that<br />
\[ \bigl( \sigma_x(1) \otimes \sigma_y(2) \otimes \sigma_y(3) \bigr) \bigl( \sigma_y(1) \otimes \sigma_x(2) \otimes \sigma_y(3) \bigr) \bigl( \sigma_y(1) \otimes \sigma_y(2) \otimes \sigma_x(3) \bigr) = - \sigma_x(1) \otimes \sigma_x(2) \otimes \sigma_x(3), \tag{VII. 77} \]<br />
where we note that the four operators acting in $\mathbb{C}^8$ commute. Consequently, they have a simultaneous<br />
eigenstate in $\mathbb{C}^8$, having eigenvalue +1 for the three operators on the left-hand side of the equation,<br />
and eigenvalue −1 for the operator $\sigma_x(1) \otimes \sigma_x(2) \otimes \sigma_x(3)$ on the right-hand side. The entangled state in $\mathbb{C}^8$,<br />
\[ |\text{GHZ}\rangle := \tfrac{1}{2} \sqrt{2}\, \bigl( |z\uparrow\rangle \otimes |z\uparrow\rangle \otimes |z\uparrow\rangle - |z\downarrow\rangle \otimes |z\downarrow\rangle \otimes |z\downarrow\rangle \bigr), \tag{VII. 78} \]<br />
is such a state.<br />
We assume that the three particles are already far away from each other and are moving still further<br />
apart, and the composite system is, as far as spin is concerned, in the state |GHZ⟩. A measurement on<br />
two particles, of which we assume that it does not influence the third particle in any way, determines<br />
the outcome for the third particle because, according to quantum mechanics, the product of the outcomes<br />
of measurement is determined.
According to a HVT, at the moment a measurement is made the values of the spin quantities<br />
are revealed. If we call these values $w_x(1)$ for the spin in the x-direction of particle 1, etc., then, because<br />
|GHZ⟩, (VII. 78), is a simultaneous eigenstate of the four operators in (VII. 77), it must<br />
hold that<br />
\[ w_x(1)\, w_y(2)\, w_y(3) = w_y(1)\, w_x(2)\, w_y(3) = w_y(1)\, w_y(2)\, w_x(3) = +1, \tag{VII. 79} \]<br />
and<br />
\[ w_x(1)\, w_x(2)\, w_x(3) = -1. \tag{VII. 80} \]<br />
The product of these four factors is<br />
\[ \bigl( w_x(1)\, w_y(2)\, w_y(3) \bigr) \bigl( w_y(1)\, w_x(2)\, w_y(3) \bigr) \bigl( w_y(1)\, w_y(2)\, w_x(3) \bigr) \bigl( w_x(1)\, w_x(2)\, w_x(3) \bigr) = (+1)(+1)(+1)(-1) = -1. \tag{VII. 81} \]<br />
But if we consider the product as a product of the 12 values of these spin quantities we find<br />
\[ \begin{aligned} & w_x(1)\, w_y(2)\, w_y(3)\, w_y(1)\, w_x(2)\, w_y(3)\, w_y(1)\, w_y(2)\, w_x(3)\, w_x(1)\, w_x(2)\, w_x(3) \\ &\quad = w_x^2(1)\, w_y^2(1)\, w_x^2(2)\, w_y^2(2)\, w_x^2(3)\, w_y^2(3) = 1^6 = +1. \end{aligned} \tag{VII. 82} \]<br />
This leads to +1 = −1, which is, of course, an algebraic absurdity. And indeed, this is an algebraic<br />
proof since it contains no probabilities or inequalities.<br />
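Both halves of the argument can be verified mechanically. The sketch below is an illustration, not the method of the text: it stores a state as a dictionary from spin labels (0 for $|z\uparrow\rangle$, 1 for $|z\downarrow\rangle$) to amplitudes, uses the standard action of the Pauli matrices on these basis states, and then searches all assignments of ±1 to the six values $w_x(k)$, $w_y(k)$. The function names are hypothetical.<br />

```python
import math
from itertools import product

def apply_pauli_string(paulis, state):
    """Apply a tensor product such as 'xyy' to a state {bits: amplitude}."""
    out = {}
    for bits, amp in state.items():
        new_bits = list(bits)
        for k, p in enumerate(paulis):
            if p == "x":      # sigma_x flips the spin
                new_bits[k] ^= 1
            elif p == "y":    # sigma_y flips it with phase +i (from up) or -i (from down)
                amp *= 1j if bits[k] == 0 else -1j
                new_bits[k] ^= 1
        key = tuple(new_bits)
        out[key] = out.get(key, 0) + amp
    return out

GHZ = {(0, 0, 0): 1 / math.sqrt(2), (1, 1, 1): -1 / math.sqrt(2)}  # state (VII. 78)

def eigenvalue(paulis):
    """Return +1 or -1 if GHZ is an eigenstate of the operator, otherwise None."""
    image = apply_pauli_string(paulis, GHZ)
    for candidate in (+1, -1):
        same = all(abs(image.get(bits, 0) - candidate * amp) < 1e-12
                   for bits, amp in GHZ.items())
        rest = all(abs(amp) < 1e-12 for bits, amp in image.items() if bits not in GHZ)
        if same and rest:
            return candidate
    return None

for op in ("xyy", "yxy", "yyx", "xxx"):
    print(op, eigenvalue(op))

# Search for values w = (wx1, wy1, wx2, wy2, wx3, wy3) obeying (VII. 79) and (VII. 80).
solutions = [w for w in product((+1, -1), repeat=6)
             if w[0] * w[3] * w[5] == 1 and w[1] * w[2] * w[5] == 1
             and w[1] * w[3] * w[4] == 1 and w[0] * w[2] * w[4] == -1]
print(len(solutions))  # 0
```

The first loop reproduces the eigenvalues +1, +1, +1 and −1 used in (VII. 79) and (VII. 80); the search confirms that none of the 64 value assignments satisfies all four conditions at once.<br />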
EXERCISE 35. What kind of HVT is excluded by the foregoing reasoning? Which postulates of<br />
quantum mechanics are necessary to obtain the contradiction?<br />
VII. 7<br />
MISCELLANEA<br />
The literature concerning the Bell inequalities has grown to an extraordinary extent since the<br />
1970s, although its growth has slowed in recent years. To conclude this<br />
chapter we will briefly discuss some of the main topics.<br />
VII. 7. 1<br />
LOCALITY AND RELATIVITY<br />
Although in these lecture notes we have restricted ourselves to non-relativistic quantum mechanics,<br />
so that the speed of light did not play a role in our considerations, it is, of course, above all the<br />
special theory of relativity which provides the inspiration to study the (im)possibility of signaling.
Therefore, it is interesting to consider the EPRB experiment schematically in a Minkowski diagram,<br />
figure VII. 9.<br />
[Figure VII. 9: Minkowski diagram of the EPRB experiment (axes $ct$ and $x$), where λ is in the past light cones of both<br />
A and B]<br />
A natural requirement of locality for a relativistic stochastic HVT is that the probability of an<br />
outcome A depends exclusively on the variables which specify the state in the past light cone of<br />
the measuring event at A, and likewise for B. Bell has called it local causality. We have seen that<br />
quantum mechanics is not a local causal theory. Indeed, the probability of an outcome at A cannot be<br />
influenced by the choice of the direction of measurement $\vec{b}$ at B, but with the outcome at B, which<br />
can be registered there, an observer at B can make a prediction concerning the particle at A<br />
which an observer at A cannot make, even if he has complete knowledge of the state in the past light<br />
cone of A.<br />
VII. 7. 2<br />
LOCALITY VERSUS CONDITIONAL INDEPENDENCE<br />
A problem that is brought up in some publications, e.g. Fine (1982), De Muynck (1986, 1996),<br />
is to what extent locality is necessary to derive the Bell inequalities. The authors argue that<br />
‘requirements of locality’ only express a special form of statistical independence. The distance<br />
between the measuring apparatuses is in no way manifest in the requirement. Although<br />
‘locality’ is a term which seems to presuppose a space-time, space-times are conspicuous by<br />
their absence in the relevant locality assumptions; they are all probability statements without reference to<br />
space or time.<br />
Indeed, strictly speaking one cannot say that these assumptions express a requirement of locality.<br />
One could expect an analogous independence for a hypothetical pair of particles, for<br />
example a photon and a gluon, which absolutely cannot interact with each other, but are located very<br />
close to each other. The essence is that in a local theory the large distance between the particles can<br />
be taken to be a sufficient, but not necessary condition for the absence of interactions.<br />
The requirement of outcome independence in the HVT is not a representation of the requirement<br />
of locality, it has only been motivated by it. The conclusion that is sometimes drawn from this, that<br />
apparently locality itself is irrelevant for the Bell inequality, is, however, incorrect. Factual violation<br />
of the Bell inequality means that every stochastic HVT satisfying the factorizability as formulated in<br />
section VII. 5 is excluded, and therefore, also the local versions are excluded.
VII. 7. 3<br />
DETERMINISM<br />
Another widespread view is that the derivation of the Bell inequalities always relies on a supposition<br />
of determinism in the HVT, so that giving up determinism would be a possible way out of<br />
the Bell inequalities.<br />
Bell himself has emphasized the inadequacy of this view. Determinism, which is the possibility<br />
to make predictions concerning a remote object with certainty before making measurements, indeed<br />
plays an important role in the original version. But this is a consequence of the perfect correlation<br />
in the quantum mechanical expression (VII. 5), i.e., this determinism follows from the singlet state<br />
itself, and is not a specific supposition of the HVT, see for instance Suppes and Zanotti (1976), and<br />
Dieks (1983).<br />
We saw that in a stochastic, or indeterministic, HVT the Bell inequalities are also derivable, so<br />
that giving up determinism does not help. On the contrary, it is super-determinism,<br />
the supposition that the choice of the direction of measurement by the experimenter is also<br />
determined in advance, that offers a way out of the Bell inequalities.
VIII<br />
THE MEASUREMENT PROBLEM<br />
[. . . ] if one has to stick to these darn quantum jumps then I regret that I ever have taken<br />
part in the whole thing.<br />
— Erwin Schrödinger<br />
In this final chapter we will elaborate on the most important interpretation problem, the measurement<br />
problem, which has been the subject of an ever-continuing series of publications. We will give<br />
an introduction to Von Neumann’s quantum mechanical measurement theory and formulate the<br />
measurement problem, we will go through a number of attempts to solve it, and finally we will<br />
discuss some criticism of the theory.<br />
VIII. 1<br />
INTRODUCTION<br />
The term ‘measurement’ plays a very special role in quantum mechanics, and we suggest a short<br />
rereading of the first paragraphs of chapter V. It is remarkable that the term arises in the Von Neumann<br />
postulates as described in chapter III, p. 41, ff. Both in the measurement postulate, specifying the<br />
possible outcomes of measurement and giving a physical meaning to the probability measure which<br />
is determined by the state vector, or the state operator, in terms of outcomes of measurement, and<br />
in the projection postulate, establishing the evolution in time of the state at measurement, the term<br />
‘measurement’ comes forward.<br />
That special role also becomes apparent in the debates concerning the interpretation of the theory,<br />
where it is frequently remarked that measurement ‘creates’ the value for a quantity, or that it causes a<br />
sudden state change, as expressed by Dirac (1958, p. 36),<br />
In this way we see that a measurement always causes the system to jump into an eigenstate<br />
of the dynamical variable that is being measured, the eigenvalue this eigenstate<br />
belongs to being equal to the result of the measurement.<br />
From the perspective of classical physics, this is extremely unusual. In Newton’s theory of gravitation,<br />
or the electrodynamics of Faraday and Maxwell, measurements are sometimes mentioned, as<br />
suppliers of experimental facts, but never as specific types of operation on physical systems, needing<br />
a separate treatment in the theory.<br />
The point here is not only that measurements in classical physics, as is frequently stated, always<br />
bring about a negligible or compensable disturbance of the system and therefore can remain outside<br />
consideration; much more important is that in classical physics there is no distinction in principle
between processes which serve as measurements and processes that do not. Every physical process<br />
or every mutual influence of physical systems can, under suitable circumstances, be considered as<br />
a measurement. Since it is the physical theory that indicates which physical processes in nature are<br />
possible, the theory itself also provides the criterion for the kinds of measurements which are possible.<br />
According to Von Neumann’s postulates, in quantum mechanics this is exactly the other way<br />
around. First we must, according to the aforementioned postulates, have a criterion to know when a<br />
process is a measurement, before we can indicate what the theory has to say concerning the process,<br />
before we can apply the postulates. That the term measurement in this way gets a more fundamental<br />
status than the physical theory, is also expressed by the words of Pauli as quoted in chapter I, p. 9,<br />
that a measurement creating values is “outside the laws of nature”.<br />
Intuition tells us that measurements are just an ‘ordinary kind’ of physical interactions, and this<br />
intuition cannot easily be wiped out, as the following illustration shows. Consider a photon which<br />
has gone through a slit and is on its way to a photographic plate. If we presume the interaction with<br />
this photographic plate to be a measurement, the wave function of the photon must, according to the<br />
projection postulate, collapse on arrival at the plate. But we also know that the photographic plate has<br />
a microscopic structure. It contains silver atoms in an emulsion which can be excited by the photon<br />
and start a chemical process in such a way that we can see something when the plate is developed.<br />
Would it not be plausible that quantum mechanics could describe such a process using a Schrödinger<br />
equation?<br />
In every way this event looks like a physical interaction which falls completely within the well-known<br />
laws of nature, instead of outside them. And if this is denied, how shall we decide at all when<br />
a microscopic interaction between a photon and an atom can and when it cannot be labeled as a<br />
measurement? Asking an experimental physicist how her measurement setup works, one will be<br />
given an answer in which physical interactions, generally of electromagnetic nature, are of uppermost<br />
importance. It seems absurd to claim that events take place in the laboratory that are “outside the laws<br />
of nature”.<br />
The clash between the conception that measurements do not differ from other physical interactions<br />
on the one hand, and the fact that measurements in quantum mechanics acquired a special status<br />
because they are not classified to be physical interactions on the other hand, is called the quantum<br />
mechanical measurement problem in the broad sense.<br />
VIII. 2<br />
MEASUREMENT ACCORDING TO CLASSICAL PHYSICS<br />
Although usually no special attention is given to measurements in classical physics, it is no problem<br />
to give a general, schematic description of how a measurement is treated classically.<br />
A measurement brings about a correlation between a quantity A of a physical system S which<br />
is, within the context of a measurement, frequently called an object system, and a quantity R, where<br />
the R comes from reading, which is characteristic for the measuring apparatus M, the apparatus<br />
being a physical system also. In classical physics we assume that A has a certain value $a \in \mathbb{R}$,<br />
where $a$ is an element of a set of possible values, for instance $\{a_1, \dots, a_n\} \subset \mathbb{R}$, and that after the<br />
measurement process R has a value $r_j = m(a_j)$, where $m$ is a bijection from the possible values of A<br />
before the measurement to the possible values of R after the measurement.
Take, for example, S to be yourself and M to be a balance, A is your weight and R is the reading<br />
of the pointer of the balance. Now you have an unknown weight value, a, which is revealed by<br />
the balance indicating r = m(a) = 63 kg. The role of a measurement is pragmatic; the value of<br />
a physical quantity of the object system which is not directly or not easily observable, for example<br />
mass, is correlated to a quantity that is directly observable, in this case the position of a pointer. For a<br />
correlation to occur between A and R there must be an interaction between S and M. This interaction<br />
can, potentially, influence the value of A in such a way that the value before the measurement can<br />
change to another value after measurement. Measurement is a process looking towards the past and<br />
its aim is to reveal the value of A before the interaction with M.<br />
If it is possible to predict, from the value $a$ and the interaction between S and M, the value $a'$<br />
which A has after measurement, then the measurement also looks to the future and acts like an apparatus<br />
which prepares a state of S in which A has the value $a'$. Think, for example, of an ammeter<br />
in an electric circuit with an energy source of $V$ volt; if the current through a resistor $R$ is $I = V/R$<br />
without the ammeter, then, after the ammeter has been connected in series with the resistor, the current<br />
equals $I' = V/(R + R_s)$, where $R_s$ is the internal resistance of the ammeter. In case $a'$ equals $a$, the<br />
measurement is called non-disturbing or ideal. The measurement process thus has two aspects: what<br />
happens to the measuring apparatus M, and what happens to the physical system S, i.e. measurement<br />
and state preparation.<br />
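The ammeter example can be put into numbers with a minimal sketch. The function names and the values of the voltage and the two resistances below are hypothetical, chosen only for illustration; the formulas are the ones stated in the text.<br />

```python
def current_without_ammeter(voltage, resistance):
    """Current I = V / R through the resistor alone (the value a)."""
    return voltage / resistance


def current_with_ammeter(voltage, resistance, internal_resistance):
    """Current I' = V / (R + R_s) with the ammeter in series (the value a')."""
    return voltage / (resistance + internal_resistance)


# Hypothetical numbers, for illustration only.
V, R, R_s = 10.0, 100.0, 1.0

I = current_without_ammeter(V, R)
I_prime = current_with_ammeter(V, R, R_s)

# Relative disturbance caused by the measurement; algebraically R_s / (R + R_s).
relative_disturbance = (I - I_prime) / I
```

The disturbance vanishes as the internal resistance goes to zero, which is the classical sense in which the measurement can be made ideal.<br />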
In classical physics the measurement interaction can be taken to be arbitrarily small, in which<br />
case the value of A is not disturbed. Therefore, the transition in such an ideal measurement process<br />
is<br />
$$(a_j, r_0) \longrightarrow (a_j, r_j) = \big( a_j, m(a_j) \big). \tag{VIII. 1}$$<br />
Notice that the characteristics of the measurement are left out of the consideration. The method<br />
of measuring does not have anything to do with the phenomenon one wants to get information about.<br />
The motion of the planets in the gravitational field of the sun is studied by looking at them, i.e., by<br />
using the fact that the planets reflect sunlight. The optical instruments that are used have nothing to<br />
do with the gravitational motion under examination.<br />
Also notice that in this consideration the question how to measure A is only transformed into the<br />
question how to find the value of R. If we also would have to measure the value of R, this could<br />
lead to an infinite chain of measuring apparatuses. This is avoided by assuming that the quantity R is<br />
directly observable, hence the term pointer reading for R, where we have to take the term ‘pointer’<br />
very generally, for instance, screens showing results of measurements or results printed on paper are<br />
included in the term.<br />
We appeal in our description to a distinction between two different types of quantities; the directly<br />
observable quantities, that is, observable to the naked eye, versus the not directly observable or unobservable<br />
quantities. But this is not a distinction which corresponds to a fundamental distinction between<br />
these quantities; in classical physics all quantities are treated as properties of objects. The fact that we<br />
stop at a directly observable quantity R is a decision based on purely contingent factors, particularly<br />
human physiology and the physics of the human senses.
VIII. 3<br />
MEASUREMENT ACCORDING TO QUANTUM MECHANICS<br />
The following schematic representation of the measurement process in quantum mechanics is<br />
given by Von Neumann (1932).<br />
Suppose that A is a physical quantity of the object system S, represented quantum mechanically<br />
by the maximal operator $A$ on Hilbert space $\mathcal{H}_S$, having a discrete spectrum $a_1, \dots, a_N$. Now<br />
let S interact with a measuring apparatus M, where M is described quantum mechanically also.<br />
For M to be able to function as a measuring apparatus, it has to have a<br />
pointer quantity R, represented by the operator $R$ on Hilbert space $\mathcal{H}_M$, having orthonormal eigenstates<br />
$|r_0\rangle, \dots, |r_N\rangle$. These eigenstates have to be orthonormal since they correspond to pointer<br />
readings which can be distinguished by the human eye. Let $|r_0\rangle$ be the eigenstate in which the pointer<br />
shows no deflection. The Hilbert space of the composite system SM is $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_M$ with<br />
$\dim \mathcal{H}_M = \dim \mathcal{H}_S + 1$, since the eigenbasis of $R$ includes $|r_0\rangle$, whereas that of $A$ contains no corresponding state $|a_0\rangle$.<br />
Prior to the measurement, the measuring apparatus M is in the eigenstate $|r_0\rangle$. We want this state<br />
to change, as a result of the measurement interaction, into the eigenstate $|r_j\rangle$ which is indicative of the<br />
value $a_j$ of A; thus, let S initially be in the eigenstate $|a_j\rangle$ of $A$. Moreover, we want the measurement<br />
to be ideal, so that the state $|a_j\rangle$ of S does not change.<br />
Von Neumann showed that this transition can indeed be brought about by a unitary transformation,<br />
which means we have to find for the composite system SM a unitary evolution operator U, inducing<br />
the transition<br />
$$U \big( |a_j\rangle \otimes |r_0\rangle \big) = |a_j\rangle \otimes |r_j\rangle, \tag{VIII. 2}$$<br />
where U describes the measurement interaction lasting some unspecified time interval.<br />
EXERCISE 36. Show that the operator<br />
$$U = \sum_{l=1}^{N} \sum_{m=0}^{N} |a_l\rangle \otimes |r_{[l+m]}\rangle \, \langle a_l| \otimes \langle r_m| \tag{VIII. 3}$$<br />
(a) is unitary, and (b) induces the desired transition (VIII. 2). Here, $[l + m]$ means $l + m$ modulo<br />
$N + 1$, i.e., $[N + 1] = 0$, $[N + 2] = 1$, etc.<br />
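Before attempting the proof, the claim can be checked numerically for a small case. The following is a minimal sketch, not a solution of the exercise: it builds the matrix of (VIII. 3) in the product basis for one assumed value of N and tests both properties. The helper names `build_U`, `index` and `is_unitary` are hypothetical.<br />

```python
def build_U(N):
    """Matrix of U from (VIII. 3) in the product basis |a_l> (x) |r_m>.

    l runs over 1..N and m over 0..N, so the dimension is N * (N + 1).
    """
    dim = N * (N + 1)

    def index(l, m):
        # Position of |a_l> (x) |r_m> in the product basis.
        return (l - 1) * (N + 1) + m

    U = [[0] * dim for _ in range(dim)]
    for l in range(1, N + 1):
        for m in range(N + 1):
            # U maps |a_l> (x) |r_m> to |a_l> (x) |r_[l+m]>.
            U[index(l, (l + m) % (N + 1))][index(l, m)] = 1
    return U, index


def is_unitary(U):
    """Check U^dagger U = identity (U is real here, so dagger = transpose)."""
    dim = len(U)
    for i in range(dim):
        for j in range(dim):
            entry = sum(U[k][i] * U[k][j] for k in range(dim))
            if entry != (1 if i == j else 0):
                return False
    return True


N = 3  # assumed small value, for illustration only
U, index = build_U(N)
unitary = is_unitary(U)

# The column index(j, 0) is the image of |a_j> (x) |r_0>;
# it should be the basis vector |a_j> (x) |r_j>.
transition_ok = all(U[index(j, j)][index(j, 0)] == 1 for j in range(1, N + 1))
```

The check works because, for fixed l, the map from m to [l + m] is a permutation of the pointer labels, so U merely permutes the basis vectors. A numerical test for one N is of course no substitute for the general argument asked for in the exercise.<br />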
The formula (VIII. 2) strongly resembles the transition (VIII. 1). Apparently, everything we desired<br />
concerning the ideal measurement process in quantum mechanics, including the requirement<br />
that the value of A must not be disturbed, can be achieved using a unitary operator. At first sight,<br />
there does not seem to be any problem with a completely quantum mechanical treatment of the measurement<br />
interaction, taken as an ordinary physical process obeying Schrödinger’s equation. As in the<br />
classical case, the method of measuring is not discussed. We also did not appeal to the measurement<br />
or the projection postulate.
However, at a second look, the transition (VIII. 2) turns out to have peculiar consequences. The<br />
formula (VIII. 2) assumed that the object system S was, before the measurement, in an eigenstate of<br />
$A$. But what if S is in an arbitrary state $|\psi\rangle \in \mathcal{H}_S$?<br />
We can decompose this arbitrary state $|\psi\rangle$ into the orthonormal eigenstates $|a_j\rangle$ of $A$ with coefficients<br />
$c_j = \langle a_j | \psi \rangle$. Therefore, using $|\psi\rangle = \sum_j c_j |a_j\rangle$ and the linearity of the evolution operator, it<br />
follows that<br />
$$\begin{aligned}
U \big( |\psi\rangle \otimes |r_0\rangle \big) &= U \sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_0\rangle = \sum_{j=1}^{N_S} c_j \, U \big( |a_j\rangle \otimes |r_0\rangle \big) \\
&= \sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_j\rangle =: |\Phi\rangle.
\end{aligned} \tag{VIII. 4}$$<br />
We see that the state $|\Phi\rangle$ of the composite system of object S and measuring apparatus M after the<br />
measurement is, whenever more than one $c_j$ is nonzero, no longer a product state; rather, it is entangled. This implies that we cannot describe<br />
S, nor M, with a pure state; the partial traces over M and S yield mixed states for S and M, respectively, see section III. 4.<br />
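As a worked illustration of this last statement (a sketch; the symbol $W_S$ for the reduced state of S is notation introduced here, not taken from the text), the partial trace of $|\Phi\rangle \langle \Phi|$ over $\mathcal{H}_M$ can be evaluated using the orthonormality of the $|r_k\rangle$:<br />

```latex
\begin{aligned}
W_S &= \mathrm{Tr}_M \, |\Phi\rangle \langle \Phi|
     = \sum_{k=0}^{N} \big( \mathbb{1} \otimes \langle r_k| \big) \, |\Phi\rangle \langle \Phi| \, \big( \mathbb{1} \otimes |r_k\rangle \big) \\
    &= \sum_{k} \sum_{j, j'} c_j \, c_{j'}^{*} \, \langle r_k | r_j \rangle \langle r_{j'} | r_k \rangle \, |a_j\rangle \langle a_{j'}|
     = \sum_{j} |c_j|^2 \, |a_j\rangle \langle a_j| , \\
\mathrm{Tr} \, W_S^2 &= \sum_{j} |c_j|^4 \le 1 .
\end{aligned}
```

Equality holds only if a single $c_j$ is nonzero, so for a genuine superposition $W_S$ is a mixed state. The same computation with the roles of S and M exchanged gives $W_M = \sum_j |c_j|^2 \, |r_j\rangle \langle r_j|$.<br />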
This aspect has no classical analogue. We will come back to this, but first we consider the question<br />
whether this quantum mechanical description of the measurement process is compatible with the<br />
measurement postulate. Or, more precisely, whether application of the measurement postulate to M<br />
leads to the same result as its direct application to S. And we ask whether the desired correlation<br />
between the values of A and R is achieved. We will show now that this is indeed the case.<br />
The quantity R of the measuring apparatus M is represented on the Hilbert space $\mathcal{H}_S \otimes \mathcal{H}_M$ of<br />
the composite system SM as $\mathbb{1} \otimes R$. The probability to find for this quantity the value $r_k$ is, according<br />
to the measurement postulate,<br />
$$\mathrm{Prob}_{|\Phi\rangle}(R : r_k) = \langle \Phi | \big( \mathbb{1} \otimes |r_k\rangle \langle r_k| \big) | \Phi \rangle. \tag{VIII. 5}$$<br />
With (VIII. 4) this yields<br />
$$\mathrm{Prob}_{|\Phi\rangle}(R : r_k) = |c_k|^2, \tag{VIII. 6}$$<br />
where we have used the orthonormality of the $|r_k\rangle \in \mathcal{H}_M$. This is the same result as yielded by<br />
direct application of the measurement postulate to the arbitrary state $|\psi\rangle$ from (VIII. 4). Apparently, the<br />
probability to find an outcome $r_k$ when measuring R of M is always equal to the probability to find<br />
the outcome $a_k$ of A on S. The former measurement can therefore be regarded as a substitute for the<br />
latter.<br />
The validity of (VIII. 6) itself does not show that a correlation between the values of A and R has<br />
been established. To show that such a correlation exists, we have to know the probability of a certain<br />
pair of outcomes $(a_i, r_k)$ for $A \otimes R$, in the state $|\Phi\rangle$ of (VIII. 4). The joint probability to find this pair<br />
of outcomes is<br />
$$\begin{aligned}
\mathrm{Prob}_{|\Phi\rangle}(A : a_i \wedge R : r_k) &= \langle \Phi | \big( |a_i\rangle \langle a_i| \otimes |r_k\rangle \langle r_k| \big) | \Phi \rangle \\
&= \big| \big( \langle a_i| \otimes \langle r_k| \big) |\Phi\rangle \big|^2 = |c_i|^2 \delta_{ik}.
\end{aligned} \tag{VIII. 7}$$
The conditional probability to find for A the value $a_i$, given that for R the value $r_k$ has been found,<br />
is therefore<br />
$$\mathrm{Prob}_{|\Phi\rangle}(A : a_i \mid R : r_k) = \frac{\mathrm{Prob}(A : a_i \wedge R : r_k)}{\mathrm{Prob}(R : r_k)} = \frac{|c_i|^2 \delta_{ik}}{|c_k|^2} = \delta_{ik}. \tag{VIII. 8}$$<br />
In other words, in the state $|\Phi\rangle$ a strict correlation exists between the quantities A and R, represented<br />
quantum mechanically by the operators $A$ and $R$.<br />
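The probabilities (VIII. 5) to (VIII. 8) can be reproduced numerically from the components of the entangled state. The following is a minimal sketch; the function names and the particular coefficients $c_j$ are assumptions chosen only for illustration (all $c_j$ are taken nonzero so that the conditional probabilities are defined).<br />

```python
import math


def entangled_state(c):
    """Components of |Phi> = sum_j c_j |a_j> (x) |r_j> in the product basis.

    phi[(i, k)] is the amplitude (<a_i| (x) <r_k|) |Phi>, i = 1..N, k = 0..N.
    """
    N = len(c)
    phi = {(i, k): 0j for i in range(1, N + 1) for k in range(N + 1)}
    for j in range(1, N + 1):
        phi[(j, j)] = complex(c[j - 1])
    return phi


def prob_joint(phi, i, k):
    """Prob(A: a_i and R: r_k) = |<a_i, r_k | Phi>|^2, as in (VIII. 7)."""
    return abs(phi[(i, k)]) ** 2


def prob_pointer(phi, k):
    """Prob(R: r_k), as in (VIII. 5): sum of the joint probabilities over i."""
    return sum(abs(amp) ** 2 for (_, kk), amp in phi.items() if kk == k)


def prob_conditional(phi, i, k):
    """Prob(A: a_i | R: r_k), as in (VIII. 8); requires Prob(R: r_k) > 0."""
    return prob_joint(phi, i, k) / prob_pointer(phi, k)


# Hypothetical normalized coefficients c_1, c_2, c_3.
c = [0.5, 0.5j, math.sqrt(0.5)]
phi = entangled_state(c)
```

With these coefficients the pointer probabilities are the $|c_k|^2$, the undeflected reading $r_0$ has probability zero, and the conditional probabilities reduce to $\delta_{ik}$, in agreement with (VIII. 6) and (VIII. 8).<br />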
The schematic representation of the ideal measurement process is, as we have seen, consistent<br />
with the measurement postulate in the sense that a measurement on M can be a substitute for a measurement<br />
on S. Notice that, to answer this question, we did appeal to the measurement postulate.<br />
This is unavoidable, since the final state after the measurement process, (VIII. 4), is an entangled<br />
quantum state. We can only specify its empirical consequences by appealing to the meaning quantum<br />
mechanics attributes to such quantum states, and in Von Neumann’s postulates that meaning is established<br />
by means of the measurement postulate. Unfortunately, this postulate forces us to consider a<br />
measurement again, namely, a measurement on the measuring apparatus M itself, by reading off the<br />
position of the pointer. Now we have to ask if this second measurement can also be represented as a<br />
normal interaction.<br />
Suppose that we introduce a second measuring apparatus M′ which we use to read off the result<br />
of M using a new pointer quantity R′, represented by the operator $R'$ on $\mathcal{H}_{M'}$. As an example, we<br />
can think of a quantum mechanical description of our eye. Schematically, we then have the process<br />
$$|r_j\rangle \otimes |r'_0\rangle \longrightarrow |r_j\rangle \otimes |r'_j\rangle, \tag{VIII. 9}$$<br />
where the $|r'_j\rangle$ are the eigenstates of $R'$ of M′. Let $U'$ be the unitary operator describing the measurement<br />
by M′ on M, lasting again some unspecified amount of time. Now we have, for the composite<br />
system SMM′ in the Hilbert space $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_M \otimes \mathcal{H}_{M'}$,<br />
$$|a_j\rangle \otimes |r_0\rangle \otimes |r'_0\rangle \xrightarrow{\;U\;} |a_j\rangle \otimes |r_j\rangle \otimes |r'_0\rangle \xrightarrow{\;U'\;} |a_j\rangle \otimes |r_j\rangle \otimes |r'_j\rangle, \tag{VIII. 10}$$<br />
and therefore, if we start from a general initial state $|\psi\rangle \otimes |r_0\rangle \otimes |r'_0\rangle$, the final state will be<br />
$$|\Phi'\rangle = U' U \big( |\psi\rangle \otimes |r_0\rangle \otimes |r'_0\rangle \big) = \sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_j\rangle \otimes |r'_j\rangle. \tag{VIII. 11}$$<br />
Again, one can argue that all this is consistent with the measurement postulate. That is, upon measurement<br />
of R′, the probability of finding the value $r'_k$ is equal to $|c_k|^2$, etc.<br />
We can extend this type of reasoning ad nauseam, by incorporating more and more systems in<br />
the chain of measurement apparatuses, even including a photon scattered by the pointer and entering<br />
the eye of the observer, his retina, the nerve fibres of his brain, etc. All this is consistent with the<br />
measurement postulate, and you can, if you want to, be satisfied with this.<br />
However, the argument does not show that we can take measurements to be on an entirely equal<br />
footing with other physical interactions. No matter how far we extend the chain of apparatuses, the<br />
final state will always be a superposition of the form (VIII. 4) or (VIII. 11), the meaning of which<br />
can only be specified by saying what we will find at yet another measurement. The transition to the
conclusion that a certain state has been actually found, sometimes called the ‘Heisenberg cut’ (e.g.<br />
Primas 1993), cannot be made within the formalism. Rudolf Haag has expressed this situation as<br />
follows (Haag 1990, p. 246),<br />
Indeed the problem faced in the development in quantum theory has [. . . ] been [. . . ] the<br />
inability of devising any coherent realistic picture conforming with the observed phenomena.<br />
We can shift the place where we want to make the Heisenberg cut at will, by incorporating more<br />
and more systems in the quantum mechanical description. But the transition itself, exchanging the<br />
quantum mechanical description for a description in terms of observed facts, must come from outside<br />
quantum mechanics.<br />
One can of course, in analogy to the classical measurement scheme, simply postulate that this<br />
quantum mechanical description of the measurement process ends as soon as we can couple the system<br />
S, perhaps by means of many intermediate steps, to some measuring apparatus M whose pointer<br />
quantity R is directly observable. But here we are dealing with the fundamental issue in the theory<br />
and therefore we cannot be satisfied with a pragmatical point of view. Also, we would be faced<br />
with the question which quantities deserve to have the special status of being “directly observable”.<br />
Furthermore, there is the problem that the final states of (VIII. 4) or (VIII. 11) are entangled superpositions<br />
of states with different pointer positions. As we mentioned above, this has no classical analogue.<br />
Without further analysis it is hard to imagine what a direct observation on such states would look like.<br />
An example in which these issues emerge sharply is Schrödinger’s famous cat paradox (1935b),<br />
which we discussed already in the introduction. Schrödinger imagined that a living cat is locked up in<br />
a hermetically closed box, together with a radioactive substance of which perhaps one atom decays<br />
in the course of one hour. The box is provided with a Geiger counter which can register the decay of<br />
the atom and which, upon decay, activates a device that releases a deadly gas.<br />
Assume that initially the quantum mechanical state of this total system is a product state with a<br />
very large number of factors. In the course of the hour we agreed to wait, the state of the radioactive atom evolves<br />
into a superposition of the atom before and after decay. The evolution of the total<br />
state then takes on the same form as the state $|\Phi'\rangle$ in (VIII. 11), i.e., the state evolves into something<br />
like<br />
$$\begin{aligned}
& c_1(t) \, |A : 1\rangle \otimes |\nu : 0\rangle \otimes \cdots \otimes |\text{cat} : \smile\rangle \\
& \quad + c_2(t) \, |A : 0\rangle \otimes |\nu : 1\rangle \otimes \cdots \otimes |\text{cat} : \dagger\rangle,
\end{aligned} \tag{VIII. 12}$$<br />
where $t$ is the time during which we wait and the system evolves, $|A : 1\rangle$ and $|A : 0\rangle$ are the states of the radioactive<br />
atom before and after decay, $|\nu : 1\rangle$ and $|\nu : 0\rangle$ are the states of the electromagnetic field with<br />
and without a photon, etc.<br />
The composite system is therefore in a gigantic superposition of states in which the cat is living<br />
and in which it is dead. If we want to hold on to the orthodox interpretation of quantum mechanics<br />
to the bitter end, we have to say that in this state the cat is neither living nor dead, and that only upon<br />
measurement, which is perhaps lifting the lid of the box after one hour, there is a certain probability,<br />
namely $|c_2|^2$ versus $|c_1|^2$, to find the cat dead or alive. It is the observer, the opener of the box, who<br />
determines the fate of the cat.
Figure VIII. 1: Schrödinger’s cat paradox (DeWitt 1970)<br />
VIII. 4<br />
THE MEASUREMENT PROBLEM IN THE NARROW SENSE<br />
In the previous section we have seen how the measurement process, (VIII. 4), brings the composite<br />
system into a superposition of macroscopically different states, e.g. pointer positions. The development of<br />
such superpositions is a consequence of the linearity of the evolution operator. An example is given in<br />
the discussion between Einstein and Pauli, described in the introduction, p. 9, concerning the center of<br />
mass of a macroscopic body. The strangeness of a superposition comes from our tacit presupposition<br />
that the macroscopic pointer positions not only act as possible outcomes of a measurement, but can<br />
also be taken as properties of the pointer. We think that pointers of a measuring apparatus indicate<br />
something, even if we are not in the act of reading them off.<br />
Assuming, for the sake of convenience, that observing something is sufficient to decide that there<br />
is an element of physical reality which is responsible for the observation, we expect that if the quantum<br />
state presents a complete description of the system, i.e., if every element of physical reality has a<br />
counterpart in quantum mechanics, then those macroscopic properties should be represented by it.<br />
That is, however, not the case in the state (VIII. 4).<br />
The idea in the background here is the following postulate, often called the ‘eigenstate-eigenvalue<br />
link’. It was explicitly supported both by Dirac (1958, p. 46) and Von Neumann (1955, p. 253).<br />
EIGENSTATE-EIGENVALUE LINK, PURE CASE:<br />
A physical system S has the property that quantity A has a definite value iff its state is an<br />
eigenstate of the operator A which, according to the observables postulate, corresponds<br />
to A.<br />
It is also conceivable that a system possesses a definite but unknown value for a quantity. If we use<br />
the ‘ignorance interpretation of mixtures’, as discussed in chapter III, p. 52, we obtain the variation<br />
EIGENSTATE-EIGENVALUE LINK, MIXED CASES:<br />
A physical system S has the property that quantity A has a definite but unknown value
iff its state is a mixture of eigenstates of the operator A which, according to the observables<br />
postulate, corresponds to A.<br />
These postulates speak about the existence of properties, about physical quantities having values,<br />
independent of a measurement or a measuring context.<br />
EXERCISE 37. Discuss the link between the property postulates and the sufficient condition of<br />
reality of Einstein, Podolsky and Rosen (EPR), section I. 2, p. 12, ff.<br />
From this point of view it would be good to have a quantum mechanical description of the measurement<br />
process in which, in any case, the measuring apparatus has a certain property after completion<br />
of the measurement. This means that, instead of the superposition (VIII. 4), we require, as a final<br />
state, the mixture<br />
$$W' = \sum_{j=1}^{N} |c_j|^2 \, |a_j\rangle \otimes |r_j\rangle \, \langle a_j| \otimes \langle r_j|. \tag{VIII. 13}$$<br />
Some authors, e.g. Landau and Lifshitz (1958, pp. 21-24), go still further and require as a final<br />
state an eigenstate $|r_k\rangle$ of the pointer quantity R, corresponding to the pointer position found after<br />
measurement. According to them the measuring interaction finishes with an indeterministic jump,<br />
with probability $|c_j|^2$, to one of the states $|a_j\rangle \otimes |r_j\rangle$.<br />
Summarizing, we have the following options for the description of the measurement process. For<br />
the initial state there is no controversy,<br />
$$|\psi\rangle \otimes |r_0\rangle = \sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_0\rangle. \tag{VIII. 14}$$<br />
For the final state there are three possibilities,<br />
1. $\sum_{j=1}^{N_S} c_j \, |a_j\rangle \otimes |r_j\rangle$, (VIII. 15)<br />
2. $W' = \sum_j |c_j|^2 \, |a_j\rangle \otimes |r_j\rangle \, \langle a_j| \otimes \langle r_j|$, (VIII. 16)<br />
3. $|a_j\rangle \otimes |r_j\rangle$ with probability $|c_j|^2$. (VIII. 17)<br />
According to the foregoing line of reasoning we require that, at the end of a measuring interaction,<br />
the pointer of the measuring apparatus, which is of course macroscopic, designates something.<br />
The state (VIII. 15) does not satisfy this requirement. On the contrary, the quantum mechanical superposition<br />
$|\psi\rangle$ of eigenstates $|a_j\rangle$ of the measured quantity, which prohibited us from ascribing,<br />
prior to the measurement, a definite value of A to the object system S, proves to be contagious;<br />
after the interaction the pointer quantity of the measuring apparatus also has no definite value anymore,<br />
and if the composite system SM is coupled to another measuring apparatus M′, this also<br />
becomes infected with ‘property loss’. This is why (VIII. 16) and (VIII. 17) are preferred as final<br />
states over (VIII. 15).<br />
The problem of giving a treatment of the measurement process which produces one of these two<br />
final states, and which therefore ‘creates’ the definite values by means of the measuring interaction,<br />
is the measurement problem in the narrow sense. Notice that (VIII. 16) and (VIII. 17) cannot be<br />
obtained from the initial state by means of a unitary transformation. Therefore, we have to adjust or<br />
extend the first five Von Neumann postulates. We will discuss some proposals for a solution.<br />
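The reason can be made explicit with a short purity argument (a sketch; $W_0$ denotes the initial state (VIII. 14) written as a state operator, a symbol introduced here for convenience). A unitary transformation leaves $\mathrm{Tr} \, W^2$ unchanged, whereas the mixture (VIII. 16) has a smaller value:<br />

```latex
\begin{aligned}
W_0 &= |\psi\rangle \langle \psi| \otimes |r_0\rangle \langle r_0| , \qquad \mathrm{Tr} \, W_0^2 = 1 , \\
\mathrm{Tr} \, \big( U W_0 U^{\dagger} \big)^2 &= \mathrm{Tr} \, \big( U W_0^2 U^{\dagger} \big) = \mathrm{Tr} \, W_0^2 = 1 , \\
\mathrm{Tr} \, W'^2 &= \sum_{j} |c_j|^4 < 1 \quad \text{if more than one } c_j \neq 0 .
\end{aligned}
```

Hence no unitary $U$ maps $W_0$ to $W'$. For (VIII. 17) the obstacle is that a unitary evolution assigns a single final state to a given initial state, while (VIII. 17) prescribes different final states with probabilities $|c_j|^2$.<br />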
VIII. 4. 1<br />
THE PROJECTION POSTULATE AND CONSCIOUSNESS<br />
By adding the projection postulate to the first five postulates, p. 41, Von Neumann gave the standard<br />
solution to the measurement problem in the narrow sense. He distinguished two ways in which<br />
a state can change in time,<br />
Process 1. The discontinuous, non-unitary, indeterministic projection occurring at a<br />
measurement; the projection postulate.<br />
Process 2. The continuous, unitary, deterministic evolution which is consistent with the<br />
Schrödinger equation or its generalization to mixed states, as long as no measurement is<br />
made on the system; the Schrödinger postulate.<br />
At measurement the state undergoes a transition into the eigenstate belonging to the outcome of<br />
measurement. Therefore, this brings about the final state (VIII. 17) and gives, in accordance with the<br />
eigenstate-eigenvalue link, p. 170, definite properties to both the object system and the pointer of the<br />
measuring apparatus.<br />
Although the measurement problem in the narrow sense is solved with these two types of evolution,<br />
the measurement problem in the broad sense, p. 164, comes into prominence more than ever.<br />
We would now like to have an explanation for the particular nature of a measurement, or at least a<br />
criterion with which it can be distinguished from other processes.<br />
Such a criterion is provided by Von Neumann and, for instance, Wigner, W. Heitler (1970, p. 42),<br />
and F. London and E. Bauer (1939), in terms of the consciousness of an observer. London and Bauer<br />
reason as follows.<br />
Consider an object system S, a measuring apparatus M and a conscious observer B. The state of<br />
the composite system after measurement is, according to (VIII. 11),<br />
$$|\Phi\rangle = \sum_j c_j \, |a_j\rangle \otimes |r_j\rangle \otimes |b_j\rangle. \tag{VIII. 18}$$<br />
According to London and Bauer, this is the description of the state for us. But for the conscious<br />
observer B it is not the same, because B has the characteristic capacity of introspection. By introspection<br />
he knows in which eigenstate he is, he perceives one certain pointer position. This breaks the
quantum mechanical chain. If he knows that he is in the state $|b_k\rangle$ and sees the meter indicating something<br />
which corresponds to the pointer state $|r_k\rangle$, then from that moment on the state has immediately<br />
become $|a_k\rangle \otimes |r_k\rangle \otimes |b_k\rangle$. Conscious introspection of the observer therefore causes the collapse<br />
of the wave packet. This strange situation is expressed in the thought experiment called ‘Wigner’s<br />
friend’, in which the measuring device is replaced by a friend who communicates the outcome of<br />
measurement to Wigner.<br />
The aforementioned authors emphasize the role of consciousness in the interpretation of quantum<br />
mechanics. It need hardly be emphasized that for the majority of physicists something like this is<br />
unacceptable. They are of the opinion that a measurement is finished as soon as the result is registered<br />
somewhere in the equipment. It is not necessary that it subsequently comes to the attention of a conscious<br />
being. But of course, then the question remains again which criterion can be given for a permanent<br />
registration.<br />
VIII. 4. 2<br />
BOHMIAN MECHANICS<br />
An important advantage of the theory of chapter VI is its avoidance of the projection postulate.<br />
This has consequences for the treatment of measurements. ‘Measuring’ is not a primitive concept in<br />
Bohmian mechanics; measurements are treated on an equal footing with all other physical interactions.<br />
The measuring apparatus is treated in the same manner as the measured object system, namely<br />
with the Bohmian equations, which are derived from the Schrödinger equation. As a consequence,<br />
the interaction between an object system and a measuring apparatus can be given according to the<br />
measurement scheme (VIII. 4).<br />
If, for the sake of simplicity, we limit ourselves to two terms, the interaction is of the<br />
form (VI. 24), p. 134, where $\varphi_B$ and $\varphi_D$ are the eigenfunctions of the pointer quantity, corresponding<br />
to the various pointer positions. It is plausible to assume that $\varphi_B$ and $\varphi_D$ have no overlap.<br />
Consequently, the wave function of the object system and the measuring apparatus is effectively factorizable<br />
and we can regard the superposition as a mixture. There is no measurement problem in<br />
Bohmian mechanics.<br />
◃ Remark<br />
The requirement that $\varphi_B$ and $\varphi_D$ in (VI. 24) have no overlap is stronger than what is required in<br />
Von Neumann’s model. There it suffices that the wave functions are orthogonal, i.e., $\langle\varphi_B | \varphi_D\rangle = 0$<br />
instead of $\varphi_B(\vec q)\,\varphi_D(\vec q) = 0$ for all $\vec q \in \mathbb{R}^3$. ▹<br />
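A one-dimensional illustration may make the distinction concrete. The two wave functions below are chosen purely for illustration (an assumption, not the pointer functions of chapter VI): they are orthogonal, yet they overlap everywhere except at $q = 0$.<br />

```latex
% Illustrative normalized wave functions on the real line (assumed, not from (VI. 24)):
\varphi_B(q) = \pi^{-1/4}\, e^{-q^2/2}, \qquad
\varphi_D(q) = \sqrt{2}\,\pi^{-1/4}\, q\, e^{-q^2/2}.

% Orthogonal, because the integrand is odd in q:
\langle \varphi_B | \varphi_D \rangle
  = \sqrt{2}\,\pi^{-1/2} \int_{-\infty}^{\infty} q\, e^{-q^2}\, \mathrm{d}q = 0,

% but not free of overlap:
\varphi_B(q)\,\varphi_D(q) = \sqrt{2}\,\pi^{-1/2}\, q\, e^{-q^2} \neq 0 \quad \text{for } q \neq 0.
```

Such a pair would satisfy Von Neumann’s orthogonality requirement but not the stronger no-overlap requirement.<br />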
VIII. 4. 3<br />
SPONTANEOUS COLLAPSE<br />
The next option has been developed by G.C. Ghirardi, A. Rimini, and T. Weber (1986); a related<br />
proposal comes from F.A. Bopp (1947). In this view the evolution from the Schrödinger postulate has<br />
to be replaced by an indeterministic evolution. A stochastic term is added, making the Schrödinger<br />
equation non-linear. As a consequence, every physical system from time to time spontaneously<br />
makes a small jump, so that the wave function collapses to, almost, a position eigenstate.<br />
The new constant of nature characterizing the relevant time scale is such that the probability of a<br />
spontaneous collapse of the wave function for a single elementary particle is extremely small, in the
order of once every $10^{10}$ years, leaving the continuous Schrödinger equation an excellent approximation<br />
for such a physical system.<br />
In this theory it can be shown that in case of composite systems a collapse of the state of a partial<br />
system brings about a collapse of the state of the entire composite system. This has as a consequence<br />
that the average frequency of these spontaneous jumps per unit of time increases with the number of<br />
degrees of freedom, and for a macroscopic system with approximately $10^{25}$ particles the average time<br />
between two jumps, and therefore two collapses, will only be $10^{-5}$ milliseconds. Hence, to a good<br />
approximation, macroscopic systems always have a definite position whereas microscopic systems do<br />
not.<br />
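The orders of magnitude quoted above can be checked with a few lines of arithmetic. The sketch below uses only the round numbers from the text (one collapse per $10^{10}$ years per particle, $10^{25}$ particles) and assumes that the single-particle collapse rates simply add; it is a consistency check, not a statement of the precise parameters of the theory.<br />

```python
# Back-of-the-envelope check of the collapse time scales quoted in the text.
# All inputs are the round numbers from the text, not precise model parameters.

SECONDS_PER_YEAR = 365.25 * 24 * 3600       # Julian year in seconds

single_particle_interval_years = 1e10        # one collapse per ~10^10 years
n_particles = 1e25                           # size of a macroscopic system

# Collapse rate of a single particle, in collapses per second.
single_rate = 1.0 / (single_particle_interval_years * SECONDS_PER_YEAR)

# A collapse of any constituent collapses the whole system, so the rates add
# (assumption: independent, identical constituents).
macroscopic_rate = n_particles * single_rate

# Mean time between two collapses of the macroscopic system, in milliseconds.
mean_interval_ms = 1e3 / macroscopic_rate

print(f"single-particle rate : {single_rate:.2e} per second")
print(f"macroscopic rate     : {macroscopic_rate:.2e} per second")
print(f"mean interval        : {mean_interval_ms:.2e} ms")
```

The mean interval comes out at a few times $10^{-5}$ milliseconds, in agreement with the figure given above.<br />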
This approach differs from that of Von Neumann in the first place in that there<br />
is no fundamental difference between measurements and other interactions; consciousness plays no<br />
role. Moreover, by adapting the evolution equation, this theory leads to predictions which differ from<br />
those of quantum mechanics, making it empirically testable. By means of experiments it is possible to obtain upper<br />
and lower limits for the collapse frequency. Ghirardi, Rimini and Weber are of the opinion that the<br />
experimental data we have at present are still compatible with a finite interval for their new constant<br />
of nature.<br />
VIII. 4. 4<br />
MANY WORLDS<br />
Another option is the many-worlds interpretation of H. Everett (1957), J.A. Wheeler (1957) and,<br />
especially, B.S. DeWitt (1970, 1971). In this view it is posited that the quantum mechanics of the<br />
first five postulates gives a universally valid description of reality. Therefore, in principle the wave<br />
function of the universe can be written down. There is no part of the world, including the context of<br />
measurement, which is described classically. Moreover, there is no projection postulate. The wave<br />
function develops according to a unitary evolution, which means that it remains a pure state for all<br />
time.<br />
Everett models a measurement process by assuming that a certain system has a complete set<br />
of orthonormal eigenstates, which are interpreted to signify that certain outcomes of measurement<br />
have occurred and are permanently registered in a memory. They are analogous to the previously<br />
mentioned pointer positions $|r_j\rangle$. The state $|\Psi\rangle$ of the composite system of object system $S$ and<br />
measuring apparatus $M$ remains in the superposition form (VIII. 15) for all time. To every state $|\varphi_i\rangle$<br />
of the object system corresponds a relative state of the measuring apparatus,<br />
$$|\psi\rangle^{\mathrm{rel}}_{\Psi, \varphi_i} := N_i \sum_j c_{ij}\, |r_j\rangle \quad \text{with} \quad c_{ij} = \bigl( \langle\varphi_i| \otimes \langle r_j| \bigr) |\Psi\rangle, \tag{VIII. 19}$$<br />
where $N_i$ is a normalization constant and $\{|\varphi_i\rangle\}$ and $\{|r_j\rangle\}$ are arbitrary orthonormal bases of the<br />
Hilbert spaces $\mathcal{H}_S$ and $\mathcal{H}_M$ of the object system and measuring apparatus, respectively. It is easily<br />
shown that this definition is independent of the choice of the basis $\{|r_j\rangle\}$, so that the relative state is<br />
uniquely defined by $|\Psi\rangle$ and $|\varphi_i\rangle$.<br />
In case of an ideal measurement we have<br />
$$|\psi\rangle^{\mathrm{rel}}_{\Psi, \varphi_i} = |r_i\rangle. \tag{VIII. 20}$$
This relative state yields the usual conditional probability distribution for the possible outcomes of<br />
measurement of a quantity in case the object system is found in the state $|\varphi_i\rangle$. This is substantiated by<br />
Everett by showing that, if we set the right conditions for the state $|\varphi_i\rangle$, all predictions for quantities<br />
which only refer to the object system $S$ can be determined using the relative state. Therefore, we can<br />
act as if a projection to that state has taken place. In reality, however, the superposition (VIII. 15)<br />
remains.<br />
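The definition of the relative state can be made concrete with a small numerical sketch. It assumes a two-level object system and a two-level apparatus, stores the coefficients $c_{ij}$ of $|\Psi\rangle$ in the product basis as a nested list, and uses hypothetical numbers; the function name `relative_state` is introduced here for illustration only.<br />

```python
import math

def relative_state(coeffs, i):
    """Relative state of the apparatus M for the object state phi_i, cf. (VIII. 19).

    coeffs[i][j] holds c_ij = (<phi_i| (x) <r_j|) |Psi>, i.e. the expansion
    coefficients of |Psi> in the product basis. The return value lists the
    components of the relative state along the pointer states |r_j>.
    """
    row = coeffs[i]
    norm = math.sqrt(sum(abs(c) ** 2 for c in row))
    if norm == 0:
        raise ValueError("phi_i does not occur in Psi; relative state undefined")
    return [c / norm for c in row]   # N_i = 1 / norm

# Ideal measurement (hypothetical numbers):
# |Psi> = 0.6 |a_0>|r_0> + 0.8 |a_1>|r_1>, so c_ij = c_j * delta_ij.
ideal = [[0.6, 0.0],
         [0.0, 0.8]]
rel0 = relative_state(ideal, 0)
rel1 = relative_state(ideal, 1)

# Non-ideal case: the object state phi_0 is correlated with both pointer states.
smeared = [[0.48, 0.36],
           [0.0, 0.8]]
rel_smeared = relative_state(smeared, 0)

print(rel0, rel1, rel_smeared)
```

In the ideal case the relative state of $|a_i\rangle$ is the single pointer state $|r_i\rangle$, as in (VIII. 20); in the non-ideal case it is a normalized superposition of pointer states.<br />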
Now the question is, of course, how this superposition must be interpreted. DeWitt in particular<br />
has advocated a radical view: all terms in this superposition represent real, existing worlds. The<br />
transition during the measurement process is a division of the world into uncountably many copies,<br />
where a different result is registered in each of them. All these worlds exist and develop further next<br />
to one another, without being able to have mutual contact. The problem how to choose one really<br />
realized term from the superposition, as we do using the projection postulate, is avoided because all<br />
terms are realized.<br />
Postulating the existence of such a multiplicity of worlds, with which, moreover, we absolutely<br />
cannot make contact, is acceptable only to a small number of people. But probably worse is the<br />
idea that any decay process in a star in a remote part of the universe can split up our local world into<br />
millions of copies of itself.<br />
Moreover, a difficult point in this theory is how the ‘splitting’ must be understood exactly. It<br />
seems that DeWitt intends a special kind of physical process which emerges at registration. This<br />
would look like adopting a second type of process besides the Schrödinger evolution, in contrast to<br />
the objective of the interpretation; the measurement problem in the broad sense would not be solved.<br />
There is also the problem which process we have to suggest for the reversed evolution; a ‘melting’ of<br />
worlds? In Everett’s original work the idea of a physical splitting of the universe does not occur. He<br />
only regards this as a ‘bookkeeping’ transition to a relative state.<br />
Finally, there is the supposition that a set of states $|r_j\rangle$ of the measuring apparatus can be interpreted<br />
as signifying that an outcome of measurement is permanently registered. This<br />
supposition cannot easily be reconciled with quantum mechanics because it<br />
still concerns superpositions.<br />
VIII. 4. 5<br />
SUPERSELECTION RULES<br />
Yet another option is to introduce superselection rules. Certain superpositions of microscopic<br />
states do not seem to occur in nature, for example, superpositions of states with unequal charge, e.g.<br />
electric or baryonic, or superpositions of states with integer and half-integer spin. Therefore, it could be<br />
assumed that superpositions of macroscopically different states do not occur either, and the dynamics<br />
of quantum mechanics must then be adapted to account for this.<br />
In such a setup of quantum mechanics, i.e., one in which the superposition principle is not valid<br />
in general, it is possible to have $W'$, (VIII. 16), as the final state of the measurement process, see<br />
Beltrametti and Cassinelli (1981, p. 57). More precisely, in the presence of superselection rules the<br />
mixture (VIII. 16) and the pure state (VIII. 15) become equivalent; the superselection rules provide<br />
the same expectation values for all physical quantities allowed by the superselection operators.<br />
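The equivalence of the pure state and the mixture for the allowed quantities can be illustrated numerically. The sketch below assumes a two-dimensional toy model with a hypothetical superselection operator $Q = \mathrm{diag}(+1, -1)$; a quantity is ‘allowed’ if it commutes with $Q$, i.e. if its matrix is diagonal in this basis. All numbers and names are illustrative assumptions.<br />

```python
def expectation_pure(psi, op):
    """<psi| op |psi> for a state vector psi and a square matrix op."""
    n = len(psi)
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(n) for j in range(n))

def expectation_mixture(weights, op):
    """Tr(W op) for the diagonal mixture W = sum_j weights[j] |j><j|."""
    return sum(w * op[j][j] for j, w in enumerate(weights))

# Superposition of two states from different superselection sectors.
psi = [0.6, 0.8]
# The corresponding mixture with weights |c_j|^2.
weights = [abs(c) ** 2 for c in psi]

allowed = [[2.0, 0.0], [0.0, -1.0]]     # commutes with Q = diag(+1, -1)
forbidden = [[0.0, 1.0], [1.0, 0.0]]    # connects the two sectors

pure_allowed = expectation_pure(psi, allowed)
mix_allowed = expectation_mixture(weights, allowed)
pure_forbidden = expectation_pure(psi, forbidden)
mix_forbidden = expectation_mixture(weights, forbidden)

print(pure_allowed, mix_allowed)        # agree for the allowed quantity
print(pure_forbidden, mix_forbidden)    # differ only for the forbidden one
```

Only a quantity with matrix elements between the sectors could distinguish the two states, and such quantities are excluded by the superselection rule.<br />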
An example of this approach is the suggestion of R. Penrose (1996) that in a future unified theory<br />
of quantum gravity a superselection rule would apply to the space-time metric. Because
the gravitational field is taken into account in the metric, the positions of massive bodies such<br />
as pointers of measuring apparatuses are also superselected, since the field depends on the positions of<br />
massive macroscopic bodies.<br />
VIII. 4. 6<br />
IRREVERSIBILITY OF MEASUREMENT<br />
The next option is to appeal to the special characteristic properties of measuring apparatuses,<br />
and to the theory of irreversible processes, as is done in the work of A. Daneri, A. Loinger and<br />
G.M. Prosperi (1962). According to these authors it is characteristic for measuring apparatuses that<br />
they are in a metastable state. An interaction with a microscopic system then causes, by means of a<br />
chain reaction, an irreversible response of the measuring apparatus.<br />
The description of such an irreversible process within quantum mechanics is not straightforward,<br />
because the unitary evolution is always reversible. It is necessary to make special assumptions concerning<br />
the structure of the macroscopic measuring apparatus and its observable quantities; all matrices<br />
corresponding to these quantities have to be almost diagonal in the energy representation. Then<br />
it can be shown that, as regards the empirical statements for these observable quantities, the final<br />
state (VIII. 15) can be replaced by that of (VIII. 16).<br />
The elegance of this approach is that the details and construction of the measuring apparatus are<br />
discussed. The presence of a metastable state indeed seems to be an essential aspect, as in, for example,<br />
the Geiger counter, or the bubble chamber using superheated liquids. But the introduction of irreversible<br />
processes calls for a modification of the unitary evolution and therefore of the Schrödinger<br />
postulate. Just as in the quantum theory of Ghirardi, Rimini and Weber, this is a fundamental modification<br />
of quantum mechanics.<br />
VIII. 4. 7<br />
MODAL INTERPRETATION<br />
This option to solve the measurement problem is provided by the so-called modal interpretation,<br />
introduced by B.C. van Fraassen (1979) and developed by S. Kochen (1985), D. Dieks (1989) and<br />
R. Healey (1989). Overviews are given by Vermaas (1999), and Dieks and Vermaas (1998).<br />
In the modal interpretation the projection postulate is removed together with a part of the property<br />
postulate, while the measurement postulate is replaced by a postulate saying that every vector of the<br />
form<br />
$$|\psi\rangle = \sum_j c_j\, |a_j\rangle \otimes |r_j\rangle \tag{VIII. 21}$$<br />
describes the situation in which system 1 has, as a property, the value $a_j$ for the quantity A corresponding<br />
to the operator which is determined by the basis $\{|a_j(t)\rangle\}$ and in which, similarly, system 2<br />
has the value $r_j$. Each of these states has a probability $|c_j|^2$ to be realized. This is not different from<br />
the usual ‘ignorance interpretation’ of probabilities. Finally, the Schrödinger postulate is declared to<br />
be valid universally; it is, therefore, also effective during the measurement process.<br />
An important theorem by E. Schmidt, the so-called (biorthogonal) decomposition theorem, says<br />
that for every composite system the expansion of a state $|\psi\rangle$ in the form (VIII. 21) is unique as long
as $|c_j| \neq |c_k|$ for $j \neq k$. Therefore it is possible for every state $|\psi\rangle$ for which this holds to indicate<br />
exactly the potential corresponding properties. A generalization to mixed states can be achieved by<br />
taking the spectral decomposition of $W$ of the composite system as the preferred decomposition; the<br />
Schmidt decomposition (VIII. 21) is then found for the special case of pure states.<br />
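The role of the condition on the coefficients can be seen in the simplest case where it fails. The worked example below assumes a state with two equal coefficients; the rotated vectors $|a_\pm\rangle$ and $|r_\pm\rangle$ are introduced here for illustration only.<br />

```latex
% Equal coefficients, |c_1| = |c_2| = 1/\sqrt{2}:
|\psi\rangle = \tfrac{1}{\sqrt{2}} \bigl( |a_1\rangle \otimes |r_1\rangle + |a_2\rangle \otimes |r_2\rangle \bigr).

% Rotated orthonormal bases (illustrative definitions):
|a_\pm\rangle := \tfrac{1}{\sqrt{2}} \bigl( |a_1\rangle \pm |a_2\rangle \bigr), \qquad
|r_\pm\rangle := \tfrac{1}{\sqrt{2}} \bigl( |r_1\rangle \pm |r_2\rangle \bigr).

% Expanding the products, the cross terms cancel:
|a_+\rangle \otimes |r_+\rangle + |a_-\rangle \otimes |r_-\rangle
  = |a_1\rangle \otimes |r_1\rangle + |a_2\rangle \otimes |r_2\rangle,

% so the same state has a second biorthogonal expansion:
|\psi\rangle = \tfrac{1}{\sqrt{2}} \bigl( |a_+\rangle \otimes |r_+\rangle + |a_-\rangle \otimes |r_-\rangle \bigr).
```

With degenerate coefficients the expansion therefore does not single out one set of properties, which is why the uniqueness condition is needed.<br />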
The idea that the meaning of the state vector can be exclusively formulated in terms of measurements<br />
is rejected; the state vector describes factual properties. The description by the wave function<br />
is, however, incomplete: $|\psi\rangle$ determines the possibilities and the probabilities of the possibilities, but<br />
the real physical situation is not determined. Quantum mechanics is fundamentally indeterministic<br />
because sometimes one possibility occurs, at other times another.<br />
Moreover, in this interpretation the ‘only if’ part of the property postulate is rejected: if a system<br />
is in an eigenstate it has indeed the corresponding eigenvalue, but not ‘only if’; a system which is<br />
in a superposition of eigenstates, (VIII. 21), nevertheless has one of the properties. In the first case<br />
a composite physical system necessarily has the property, in the second case contingently. In logic<br />
the words ‘necessarily’ and ‘contingently’ are called ‘modalities’, hence the name modal interpretation. The projection<br />
postulate is now superfluous.<br />
If, however, the singlet state, being a state of a composite system also, is considered in the modal<br />
interpretation, this interpretation tells us less than quantum mechanics with the property postulate<br />
does.<br />
◃ Remarks<br />
In this interpretation, the metastability or possibly permanent nature of the quantities of system 2 plays<br />
no role in attributing properties. Another point in this interpretation is that, besides the Schrödinger<br />
dynamics for the state, there seems to be a need for a dynamics describing how properties change in<br />
time. Several attempts have been made to that end. ▹<br />
EXERCISE 38. What does quantum mechanics with the property postulate say about the EPRB<br />
experiment, p. 139, that the modal interpretation does not say, and why? Does it help to couple a<br />
measuring apparatus to the composite system of the two spin particles?<br />
VIII. 4. 8<br />
DECOHERENCE<br />
Finally we will discuss the option which is possibly supported by the majority of physicists, see<br />
H.J. Groenewold (1946), K. Gottfried (1989), N.G. van Kampen (1988), W.H. Zurek (1981 and 1982).<br />
Bell (1990) named this option the For All Practical Purposes solution, briefly FAPP. The idea is to<br />
show that the difference between the pure state (VIII. 15) and the mixed state (VIII. 16) is hardly<br />
perceptible in practice.<br />
A measuring apparatus is a macroscopic system which is in continuous interaction with its surroundings.<br />
A more realistic representation of the measurement process will therefore be of the
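form (VIII. 11), but with a very large number of terms and factors, e.g.<br />
$$|\psi\rangle \otimes |r_0\rangle \otimes |s_0\rangle \otimes \cdots \otimes |t_0\rangle \;\longrightarrow\; \sum_j c_j\, |a_j\rangle \otimes |r_j\rangle \otimes |s_j\rangle \otimes \cdots \otimes |t_j\rangle. \tag{VIII. 22}$$<br />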
In practice, the coherence between the various terms of the superposition will rapidly be lost because<br />
this coherence can only be revealed if the expectation values of the quantities contain cross<br />
terms. To see this, consider a quantity which is a product of quantities of the various partial systems<br />
$S, M, M', \ldots, M''$, for instance, of the form $\tilde A \otimes \tilde R \otimes \tilde S \otimes \cdots \otimes \tilde T$, or a summation thereof,<br />
which contains non-zero off-diagonal matrix elements. This means that we assume that<br />
$$\bigl( \langle a_{i'}| \otimes \langle r_{j'}| \otimes \cdots \otimes \langle t_{k'}| \bigr)\, \tilde A \otimes \tilde R \otimes \tilde S \otimes \cdots \otimes \tilde T\, \bigl( |a_i\rangle \otimes |r_j\rangle \otimes \cdots \otimes |t_k\rangle \bigr) = \langle a_{i'} | \tilde A | a_i\rangle\, \langle r_{j'} | \tilde R | r_j\rangle \cdots \langle t_{k'} | \tilde T | t_k\rangle \tag{VIII. 23}$$<br />
does not exclusively contain diagonal terms. However, in practice such quantities cannot be measured;<br />
as soon as we do not measure one of the partial systems the coherence is already broken.<br />
For example, because of the orthogonality of the states $|s_j\rangle$, the expectation value of the quantity<br />
$\tilde Q \otimes \tilde R \otimes \mathbb{1} \otimes \cdots \otimes \tilde T$ in the state (VIII. 22) is equal to that in the mixed state<br />
$$W'' = \sum_j |c_j|^2\, \bigl( |a_j\rangle \otimes |r_j\rangle \otimes |s_j\rangle \otimes \cdots \otimes |t_j\rangle \bigr) \bigl( \langle a_j| \otimes \langle r_j| \otimes \langle s_j| \otimes \cdots \otimes \langle t_j| \bigr). \tag{VIII. 24}$$
The step from the pure state (VIII. 22) to the mixture (VIII. 24) is therefore justified by limiting<br />
ourselves to practically realizable states.<br />
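The loss of coherence described above can be checked in a small model. The sketch below assumes three two-level systems in the state $\sum_j c_j |j\rangle \otimes |j\rangle \otimes |j\rangle$ and uses the factorization (VIII. 23) of the matrix elements of product quantities; the matrices `Q` and `IDENTITY`, the coefficients and the function names are illustrative assumptions.<br />

```python
def expectation_pure(c, ops):
    """<Psi| O_1 (x) ... (x) O_n |Psi> for |Psi> = sum_j c[j] |j>|j>...|j>.

    Each matrix element factorizes as in (VIII. 23), so the tensor products
    never have to be built explicitly.
    """
    total = 0.0
    n = len(c)
    for j in range(n):
        for k in range(n):
            term = c[j].conjugate() * c[k]
            for op in ops:
                term *= op[j][k]
            total += term
    return total

def expectation_mixture(c, ops):
    """Tr(W'' O_1 (x) ... (x) O_n) for the mixture of the form (VIII. 24)."""
    total = 0.0
    for j in range(len(c)):
        term = abs(c[j]) ** 2
        for op in ops:
            term *= op[j][j]
        total += term
    return total

c = [0.6, 0.8]                           # hypothetical real coefficients
Q = [[1.0, 0.5], [0.5, -1.0]]            # quantity with off-diagonal elements
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]      # partial system that is not measured

# One partial system is left unmeasured: pure state and mixture agree.
traced_pure = expectation_pure(c, [Q, Q, IDENTITY])
traced_mixed = expectation_mixture(c, [Q, Q, IDENTITY])

# Every partial system is measured: the cross terms become visible.
full_pure = expectation_pure(c, [Q, Q, Q])
full_mixed = expectation_mixture(c, [Q, Q, Q])

print(traced_pure, traced_mixed)
print(full_pure, full_mixed)
```

As soon as a single factor is the identity, the orthogonality of the corresponding states removes all cross terms; only a quantity acting non-trivially on every partial system reveals the difference.<br />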
At first sight, this reasoning is in every way reasonable. Of course, the reasoning only refers to<br />
a particular class of quantities; a physical quantity for a composite system is certainly not always a<br />
direct product or a summation thereof. But it can be maintained that quantities which are not direct<br />
products are even harder to measure in practice. It is, however, beyond doubt that experimentally<br />
distinguishing the pure state (VIII. 22) from the mixed state (VIII. 24) using macroscopic quantities<br />
will be extremely difficult.<br />
Bell considers this FAPP solution a pitfall; he speaks of the FAPP trap. He emphasizes that the<br />
measurement problem is not a practical but a fundamental problem. The core of the problem is whether,<br />
after the measurement process, certain properties are present in the measuring apparatus. The FAPP<br />
reasoning shows that, generally, in practice the system behaves as if it had those properties, but it<br />
leaves untouched the fact that ‘in reality’ the system does not have those properties, and that, if our<br />
experimental possibilities were more ample, this could also be demonstrated experimentally.<br />
EXERCISE 39. Show that, using the physical quantity corresponding to the operator $|\Psi\rangle\langle\Psi|$, in<br />
which $|\Psi\rangle$ is the right-hand side of (VIII. 22), experimental distinction can be made between the<br />
pure state (VIII. 22) and the mixed state (VIII. 24).<br />
VIII. 5<br />
INCOMPATIBLE QUANTITIES<br />
So far we considered measuring a single physical quantity or two compatible, or commensurable,<br />
physical quantities of the object system, where compatible quantities are quantities corresponding<br />
to commuting operators. The simple measurement theory (VIII. 2), however, enables us to discuss<br />
also the measurement of incompatible quantities.<br />
Let A and B be two arbitrary, incompatible quantities of the object system S corresponding<br />
to the maximal operators A and B. Measuring apparatus $M_1$ measures A and apparatus $M_2$ measures<br />
B. The pointer observables of the apparatuses are R and T, corresponding to the operators<br />
R and T; the eigenstates are $|a_j\rangle$, $|b_j\rangle$, $|r_j\rangle$, $|t_j\rangle$, respectively. The initial state is $|\psi\rangle \otimes |r_0\rangle \otimes |t_0\rangle$<br />
in $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_1 \otimes \mathcal{H}_2$, and with $\dim \mathcal{H}_S = N_S$,<br />
$$|\psi\rangle = \sum_{j=1}^{N_S} \langle a_j | \psi\rangle\, |a_j\rangle = \sum_{k=1}^{N_S} \langle b_k | \psi\rangle\, |b_k\rangle. \tag{VIII. 25}$$
Now we first measure A and next B. The measurement scheme (VIII. 4) gives<br />
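$$|\psi\rangle \otimes |r_0\rangle \otimes |t_0\rangle \;\xrightarrow{\;A\;}\; \sum_{j=1}^{N_S} \langle a_j | \psi\rangle\, |a_j\rangle \otimes |r_j\rangle \otimes |t_0\rangle \;\xrightarrow{\;B\;}\; \sum_{j=1}^{N_S} \sum_{k=1}^{N_S} \langle a_j | \psi\rangle\, \langle b_k | a_j\rangle\, |b_k\rangle \otimes |r_j\rangle \otimes |t_k\rangle. \tag{VIII. 26}$$<br />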
If we first measure B and then A, we have<br />
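$$|\psi\rangle \otimes |r_0\rangle \otimes |t_0\rangle \;\xrightarrow{\;B\;}\; \sum_{k=1}^{N_S} \langle b_k | \psi\rangle\, |b_k\rangle \otimes |r_0\rangle \otimes |t_k\rangle \;\xrightarrow{\;A\;}\; \sum_{k=1}^{N_S} \sum_{j=1}^{N_S} \langle b_k | \psi\rangle\, \langle b_k | a_j\rangle^{*}\, |a_j\rangle \otimes |r_j\rangle \otimes |t_k\rangle. \tag{VIII. 27}$$<br />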
We see that the final states (VIII. 26) and (VIII. 27) differ from each other. For the probability to get<br />
for A the outcome $a_j$ and for B the outcome $b_k$ we find<br />
$$\mathrm{Prob}_{A,B}(R : r_j \wedge T : t_k) = |\langle a_j | \psi\rangle|^2\, |\langle b_k | a_j\rangle|^2 \tag{VIII. 28}$$<br />
and<br />
$$\mathrm{Prob}_{B,A}(T : t_k \wedge R : r_j) = |\langle b_k | \psi\rangle|^2\, |\langle a_j | b_k\rangle|^2. \tag{VIII. 29}$$<br />
The advantage is that the measurement theory enables us to make a statement about measurements<br />
of the incompatible quantities A and B which are performed one after the other, on the basis of the,<br />
possibly simultaneous, measurements of the compatible quantities R and T.<br />
EXERCISE 40. Why are R and T compatible?<br />
We see that the order in which A and B are measured is important. Here the result of the ‘measurement<br />
disturbance’ develops within the framework of the unitary time evolution of the state.<br />
For the conditional probability to find $b_k$ if we have found $a_j$, and vice versa, we find,<br />
with (VIII. 28) and (VIII. 29), $|\langle b_k | a_j\rangle|^2$ and $|\langle a_j | b_k\rangle|^2$, respectively, and we see that they are equal.<br />
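The order dependence of the joint probabilities, and the symmetry of the conditional ones, can be illustrated for a two-level system. The sketch below assumes, purely as an example, that the eigenvectors of A form the standard basis, that those of B are rotated by 45 degrees, and that the initial state is the first eigenvector of A; none of these choices comes from the text.<br />

```python
import math

def inner(u, v):
    """Inner product <u|v> of two vectors given as lists of numbers."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

s = 1 / math.sqrt(2)
a = [[1.0, 0.0], [0.0, 1.0]]     # eigenvectors |a_0>, |a_1> (assumed)
b = [[s, s], [s, -s]]            # eigenvectors |b_0>, |b_1> (assumed)
psi = [1.0, 0.0]                 # initial state, here equal to |a_0>

def prob_a_then_b(j, k):
    """Joint probability of a_j and b_k when A is measured first, cf. (VIII. 28)."""
    return abs(inner(a[j], psi)) ** 2 * abs(inner(b[k], a[j])) ** 2

def prob_b_then_a(j, k):
    """Joint probability of a_j and b_k when B is measured first, cf. (VIII. 29)."""
    return abs(inner(b[k], psi)) ** 2 * abs(inner(a[j], b[k])) ** 2

for j in range(2):
    for k in range(2):
        print(j, k, prob_a_then_b(j, k), prob_b_then_a(j, k))

# Conditional probabilities in the two orders.
cond_b_given_a = abs(inner(b[0], a[0])) ** 2
cond_a_given_b = abs(inner(a[0], b[0])) ** 2
```

In this example the outcome $a_1$ never occurs if A is measured first, but it occurs with probability one half if B is measured first, while the two conditional probabilities coincide.<br />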
This can be generalized easily. If we successively measure the discrete quantities $A, A', A'', \ldots$,<br />
having eigenvalues $a_i, a'_j, a''_k, \ldots$, the probability to find, given that measurement of $A$ yielded the<br />
outcome $a_i$, for $A'$ the outcome $a'_j$ and for $A''$ the outcome $a''_k$, etc. is equal to<br />
$$\begin{aligned} \mathrm{Prob}(\cdots A'' : a''_k \wedge A' : a'_j \mid A : a_i) &= \cdots |\langle a''_k | a'_j\rangle|^2\, |\langle a'_j | a_i\rangle|^2 = \langle a_i | a'_j\rangle \langle a'_j | a''_k\rangle \cdots \langle a''_k | a'_j\rangle \langle a'_j | a_i\rangle \\ &= \langle a_i | P'_j P''_k \cdots P''_k P'_j | a_i\rangle = \mathrm{Tr}\, P_i P'_j P''_k \cdots P''_k P'_j. \end{aligned} \tag{VIII. 30}$$<br />
This result does not apply to degenerate eigenvalues.
We can consider (VIII. 30) to be the most general statement of quantum mechanics for maximal,<br />
discrete quantities; a probability statement concerning the occurrence of correlations between the<br />
outcomes of consecutive measurements. Empirically speaking, all of physics is about such statements,<br />
including classical physics. But classical physics permits us to associate with it a picture of physical<br />
systems as scraps and pieces of matter with properties, moving through space, while in quantum<br />
mechanics such a picture is not available.<br />
◃ Remark<br />
If we measure on $S$, as in (VIII. 2), the same quantity a number of times, we will always find the<br />
same outcome. Then the projections in (VIII. 30) are orthogonal and<br />
$$\mathrm{Tr}\, P_i P_j P_k \cdots P_k P_j = \delta_{ij}\, \delta_{jk} \cdots. \tag{VIII. 31}$$ ▹<br />
VIII. 6<br />
COMMENTS ON THE THEORY OF MEASUREMENT<br />
Although the measurement scheme (VIII. 2) seems evident, it is not entirely so. To show this, we<br />
start by deriving a desired consequence from it: only physical quantities which correspond to normal<br />
operators are measurable. The pointer states of the measuring apparatus have to be macroscopically<br />
distinguishable, which means that the eigenstates $|r_j\rangle$ of operator R are orthonormal, $\langle r_j | r_k\rangle = \delta_{jk}$,<br />
since R corresponds to the observable pointer position R of the measuring apparatus. Because the<br />
measurement interaction is unitary, it holds that<br />
$$\bigl( \langle a_i| \otimes \langle r_0| \bigr) \bigl( |a_j\rangle \otimes |r_0\rangle \bigr) = \bigl( \langle a_i| \otimes \langle r_i| \bigr) \bigl( |a_j\rangle \otimes |r_j\rangle \bigr), \tag{VIII. 32}$$<br />
or<br />
$$\langle a_i | a_j\rangle\, \langle r_0 | r_0\rangle = \langle a_i | a_j\rangle\, \langle r_i | r_j\rangle = \delta_{ij}, \tag{VIII. 33}$$<br />
and therefore, $\langle a_i | a_j\rangle = 0$ if $i \neq j$, where the $|a_j\rangle$ are again the eigenvectors of the maximal operator<br />
A, introduced on p. 166. The $|a_j\rangle$ are thus orthonormal and can therefore form a basis. According<br />
to the spectral theorem of p. 26, every basis generates, by means of the projectors projecting on the<br />
elements of the basis, a normal operator. The physical quantity A indeed corresponds to the normal<br />
operator A and, representing a physical quantity, the eigenvalues of A are real. Consequently, A is,<br />
on finite-dimensional Hilbert spaces, self-adjoint.<br />
Now we will discuss some points of criticism. The measurement scheme (VIII. 2) is strongly<br />
idealized. It does not say anything about the physical nature of measurements, which are nearly<br />
always of electromagnetic nature. In case of a concrete description, the evolution operator $U(t)$ will<br />
have to represent something, i.e., a Hamiltonian $H$ is needed which generates this evolution by means<br />
of $U(t) = e^{-iHt}$. In general, $A$ and $U$ will not commute, in which case the $|a_j\rangle$ do not transform<br />
into themselves, unless the duration of the measurement is ‘sufficiently short’. But the question what,<br />
in this connection, is sufficiently short cannot be answered without discussing the characteristics of U<br />
and H.<br />
Likewise, complying with the conservation laws raises problems, as is shown in a theorem by<br />
Wigner (1952) and Araki and Yanase (1960).
THE WIGNER-ARAKI-YANASE THEOREM:<br />
The evolution U(τ), which brings about the measurement transition (VIII. 4) when measuring<br />
physical quantity A, is possible iff A commutes with all additive conserved quantities<br />
of the composite system of object system and measuring apparatus. In other words,<br />
conserved physical quantities which are not additive, additive physical quantities which<br />
are not conserved, and physical quantities which are neither conserved nor additive, cannot<br />
be measured exactly.<br />
Proof<br />
Here we will only prove the ‘only if’ part of the theorem. Let B be an additive conserved quantity of<br />
the composite system $SM$, i.e., $B$ is, by definition, of the form<br />
$$B = B_1 \otimes \mathbb{1} + \mathbb{1} \otimes B_2, \tag{VIII. 34}$$<br />
and is conserved. This means that $B$ commutes with the Hamiltonian $H$ of the composite<br />
system,<br />
$$[B, H] = 0. \tag{VIII. 35}$$<br />
Then $B$ also commutes with every function of $H$ and therefore with $U(\tau) = e^{-iH\tau}$,<br />
$$[B, U(\tau)] = 0 \;\Longrightarrow\; B = U^{\dagger}(\tau)\, B\, U(\tau). \tag{VIII. 36}$$<br />
Consider the matrix element<br />
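$$B_{jk} := \langle a_j| \otimes \langle r_0|\, B\, |a_k\rangle \otimes |r_0\rangle. \tag{VIII. 37}$$<br />
On the one hand, because of the additivity of $B$, (VIII. 34), we have<br />
$$\begin{aligned} B_{jk} &= \langle a_j| \otimes \langle r_0|\, (B_1 \otimes \mathbb{1} + \mathbb{1} \otimes B_2)\, |a_k\rangle \otimes |r_0\rangle \\ &= \langle a_j | B_1 | a_k\rangle + \delta_{jk}\, \langle r_0 | B_2 | r_0\rangle, \end{aligned} \tag{VIII. 38}$$<br />
while on the other hand, using (VIII. 36), we see that<br />
$$\begin{aligned} B_{jk} &= \langle a_j| \otimes \langle r_0|\, U^{\dagger}(\tau)\, B\, U(\tau)\, |a_k\rangle \otimes |r_0\rangle = \langle a_j| \otimes \langle r_j|\, B\, |a_k\rangle \otimes |r_k\rangle \\ &= \delta_{jk}\, \langle a_j | B_1 | a_k\rangle + \delta_{jk}\, \langle r_j | B_2 | r_k\rangle. \end{aligned} \tag{VIII. 39}$$<br />
Comparison of these two results shows that<br />
$$\langle a_j | B_1 | a_k\rangle = 0 \quad \text{for } j \neq k, \tag{VIII. 40}$$<br />
which means that in the basis $\{|a_j\rangle\}$ of $\mathcal{H}_S$, $B_1$ is in diagonal form and therefore $A$ commutes<br />
with $B_1$. □<br />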
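The last step of the proof can be spelled out as follows. This is only a restatement of (VIII. 38) and (VIII. 39) for the off-diagonal case; the symbols $\beta_j$ for the diagonal elements of $B_1$ are introduced here for illustration and do not occur in the text.<br />

```latex
\begin{align*}
% Off-diagonal elements, where \delta_{jk} = 0:
&\text{(VIII. 38):}\quad B_{jk} = \langle a_j | B_1 | a_k \rangle,
 \qquad
 \text{(VIII. 39):}\quad B_{jk} = 0
 \qquad (j \neq k), \\[4pt]
% Hence B_1 is diagonal in the eigenbasis of A:
&B_1 = \sum_j \beta_j\, |a_j\rangle\langle a_j|,
 \qquad \beta_j := \langle a_j | B_1 | a_j \rangle, \\[4pt]
% and, with the spectral form of A, the commutator vanishes:
&A = \sum_j a_j\, |a_j\rangle\langle a_j|
 \;\Longrightarrow\;
 [A, B_1] = \sum_{j} (a_j \beta_j - \beta_j a_j)\, |a_j\rangle\langle a_j| = 0 .
\end{align*}
```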
This theorem shows that the scheme (VIII. 4) can, strictly speaking, apply only to measurement of<br />
quantities which commute with all additive conserved quantities. However, the measurement scheme<br />
remains approximately valid if the value of the conserved quantity is large, which will easily be the<br />
case for macroscopic apparatuses. We therefore see that, whereas the U (t) in (VIII. 2) exists, a more<br />
concrete interpretation can run into problems. The shortcomings of the conventional formalism<br />
of quantum mechanics with regard to giving a faithful description of the measurement process have<br />
led to interesting extensions of the formalism, see, for instance, Busch, Lahti and Mittelstaedt (1991).
A<br />
GLEASON’S THEOREM<br />
Proofs really aren’t there to convince you that something is true - they’re there to show<br />
you why it is true.<br />
— Andrew Gleason<br />
Of course mathematics works in physics! It is designed to discuss exactly the situation<br />
that physics confronts; namely, that there seems to be some order out there - let’s find<br />
out what it is.<br />
— Andrew Gleason<br />
In section III. 2 we mentioned that Von Neumann suggested for a quantum mechanical probability<br />
measure the trace formula $\mathrm{Tr}\, P W$, with $P$ a projector. Gleason’s theorem shows that<br />
this probability measure in fact characterizes all probability measures on $\mathcal{P}(\mathcal{H})$, the set of all<br />
projectors on $\mathcal{H}$. Since Gleason’s original proof is very difficult, in this appendix we will give a<br />
simplified version by proving the theorem for pure states only.<br />
A. 1 INTRODUCTION<br />
Let $\mathcal{H}$ be a real or complex Hilbert space with $\dim \mathcal{H} > 2$, and $\mathcal{P}(\mathcal{H})$ the set of all projectors<br />
on $\mathcal{H}$. Let $\mu$ be a mapping $\mu : \mathcal{P}(\mathcal{H}) \to [0, 1]$. This $\mu$ is called a measure on $\mathcal{H}$ if it is additive,<br />
satisfying<br />
$$P_i \perp P_j \;\Longrightarrow\; \mu(P_i + P_j) = \mu(P_i) + \mu(P_j) \quad \forall\, P_i, P_j \in \mathcal{P}(\mathcal{H}) \tag{A. 1}$$<br />
$$\mu(0) = 0 \quad \text{and} \quad \mu(\mathbb{1}) = 1. \tag{A. 2}$$<br />
Combination of (A. 1) and the last requirement of (A. 2) implies that the values which $\mu$ attributes to the<br />
projectors of any orthogonal decomposition of unity add up to 1.<br />
In section III. 2, p. 46, we saw that pure states are represented by the extreme elements of a convex<br />
set, and by proving the theorem on p. 49 we showed that the extreme elements of the convex set $\mathcal{S}(\mathcal{H})$<br />
of state operators on $\mathcal{H}$ are the 1-dimensional projectors in $\mathcal{P}(\mathcal{H})$. Consequently, the measure $\mu$ is<br />
called extreme if there exists a 1-dimensional projector $P$ such that<br />
$$\mu(P) = 1. \tag{A. 3}$$<br />
This is also expressed by saying that $\mu$ is concentrated on $P$. We can now formulate Gleason’s<br />
theorem for pure states.
GLEASON’S THEOREM FOR PURE STATES:<br />
Under the condition that $\dim \mathcal{H} > 2$, a 1-dimensional projector $P_0 \in \mathcal{P}(\mathcal{H})$ exists on<br />
which the measure $\mu : \mathcal{P}(\mathcal{H}) \to [0, 1]$ is concentrated, such that<br />
$$\mu(P) = \mathrm{Tr}\, P_0 P \tag{A. 4}$$<br />
for all $P \in \mathcal{P}(\mathcal{H})$.<br />
The original proof by A.M. Gleason uses sophisticated mathematical methods and is rather<br />
opaque. Several authors have attempted to give a simpler proof, particularly<br />
C. Piron (1976), J. Dorling (unpublished) and R. Cooke, M. Keane and B. Moran (1985);<br />
the commentaries on the ‘elementary proof’ of Cooke, Keane and Moran by R.I.G. Hughes (1989)<br />
are clarifying.<br />
The following proof is a mixture of all this work. It consists of four steps, which, incidentally, do<br />
not coincide with the sections.<br />
A. 2 CONVERSION TO A 3-DIMENSIONAL REAL PROBLEM<br />
Before taking the first step, we discuss a number of simple observations. First, the probability<br />
measure of Gleason’s theorem has to be continuous in $P$. In section III. 1, p. 48, we showed that<br />
discontinuous probability measures exist for $\dim \mathcal{H} = 2$. Therefore, the requirement $\dim \mathcal{H} > 2$<br />
holds without further mention throughout this appendix. Second, since the trace of a projector $P$<br />
is equal to the dimension of the subspace onto which it projects, the trace of a 1-dimensional projector<br />
is 1, which yields for $\mu$ being concentrated on $P_0$<br />
$$\mu(\mathbb{1}) = \mathrm{Tr}\, P_0 \mathbb{1} = 1, \tag{A. 5}$$<br />
in accordance with (A. 2) and (A. 4). Third, every measure is entirely determined by giving its values<br />
on the 1-dimensional projectors, since every higher-dimensional projector $P$ is the sum<br />
of orthogonal 1-dimensional projectors $P_i$, so that we can, with (A. 1), determine $\mu(P)$ from the values<br />
of $\mu(P_i)$. Fourth, for every Hilbert space, (A. 4) at the same time defines an extreme measure $\mu$ on $\mathcal{H}$<br />
which is concentrated on $P_0$, and from now on we will indicate this measure by $\mu_0$,<br />
$$\mu_0(P) := \mathrm{Tr}\, P_0 P. \tag{A. 6}$$<br />
Using the idempotence of $P_0$ we have<br />
$$\mu_0(P_0) = \mathrm{Tr}\, P_0^2 = \mathrm{Tr}\, P_0 = 1, \tag{A. 7}$$<br />
from which we see that (A. 4) holds for $\mu = \mu_0$ and $P = P_0$. Since this measure, being concentrated<br />
on $P_0$, assigns the value 0 to all projectors orthogonal to $P_0$, it can also easily be verified that this<br />
measure satisfies the requirements (A. 1) and (A. 2).<br />
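The verification mentioned above can be sketched numerically for 1-dimensional projectors on a real 3-dimensional space, where $\mathrm{Tr}\, P_0 P = \langle e_0 | e\rangle^2$ for $P_0 = |e_0\rangle\langle e_0|$ and $P = |e\rangle\langle e|$. The vectors below are arbitrary illustrative choices, and the helper `mu0` is a name introduced here, restricted to real unit vectors.<br />

```python
import math

def mu0(e0, e):
    """mu_0(P) = Tr(P_0 P) = <e0|e>^2 for P_0 = |e0><e0| and P = |e><e|.

    Restricted to real unit vectors, so no complex conjugation is needed.
    """
    return sum(x * y for x, y in zip(e0, e)) ** 2

e0 = [0.0, 0.0, 1.0]     # the measure is concentrated on the projector onto e0

# An orthonormal basis of R^3 that is rotated away from e0 (illustrative).
s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
basis = [
    [1 / s3, 1 / s3, 1 / s3],
    [1 / s2, -1 / s2, 0.0],
    [1 / s6, 1 / s6, -2 / s6],
]

# Values on an orthogonal decomposition of unity; by (A. 1) and (A. 2)
# they have to add up to 1.
values = [mu0(e0, e) for e in basis]
total = sum(values)

on_p0 = mu0(e0, e0)                        # value on P_0 itself, cf. (A. 7)
on_orthogonal = mu0(e0, [1.0, 0.0, 0.0])   # value on a projector orthogonal to P_0

print(values, total, on_p0, on_orthogonal)
```

The same check works for any other orthonormal basis, since the squared components of a unit vector always add up to 1.<br />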
The foregoing observations lead to the conclusion that to prove Gleason’s theorem for pure states<br />
we have to prove that µ = µ 0 for all P ∈ P (H). Now we will take the first step.
A. 2. 1 STEP 1<br />
THEOREM 1:<br />
If Gleason’s theorem for pure states is true for any 3 - dimensional real Hilbert space, it<br />
is also true for any complex Hilbert space with dim H > 2.<br />
We will prove theorem 1 using a proof by contradiction.<br />
Proof<br />
Let H be a complex Hilbert space with dim H > 2 for which Gleason’s theorem is not true. Since<br />
all higher - dimensional projectors can be decomposed to 1 - dimensional projectors, it suffices to<br />
prove this theorem for 1 - dimensional projectors.<br />
Assume a measure µ on H exists, which is concentrated on P 0 ∈ P (H) such that µ(P 0 ) = 1,<br />
but differs from the measure µ 0 defined by (A. 6) in the sense that there is some 1 - dimensional<br />
projector P 1 for which the theorem does not hold,<br />
µ 0 (P 1 ) := Tr P 0 P 1 ≠ µ(P 1 ). (A. 8)<br />
First we will show that, if these measures differ on a higher - dimensional Hilbert space, they also<br />
differ on a 3 - dimensional Hilbert space.<br />
Using the projectors P 0 and P 1 , we can construct a set of three orthogonal 1 - dimensional projectors<br />
P 0 , P̃ 1 , P 2 in the following way. With P 0 = |e 0 ⟩ ⟨e 0 | and P 1 = |e 1 ⟩ ⟨e 1 |, construct a<br />
unit vector |ẽ 1 ⟩ in the plane spanned by |e 0 ⟩ and |e 1 ⟩ which is perpendicular to |e 0 ⟩, i.e.<br />
|ẽ 1 ⟩ ∝ (𝟙 − P 0 ) |e 1 ⟩, (A. 9)<br />
as can be seen in figure A. 1. Then the projector P̃ 1 := |ẽ 1 ⟩ ⟨ẽ 1 | is perpendicular to P 0 . 1<br />
[Figure A. 1: Construction of a 3 - dimensional subspace E; the figure shows the vectors e 0 , e 1 , ẽ 1 and e 2 .]<br />
Let P 2 be a 1 - dimensional projector which is perpendicular to both P 0 and P̃ 1 ; it is always possible<br />
to choose such a projector because dim H > 2. With P 2 = |e 2 ⟩ ⟨e 2 |, the three orthonormal<br />
vectors |e 0 ⟩, |ẽ 1 ⟩ and |e 2 ⟩ together span a 3 - dimensional Hilbert space, which is a subspace of H.<br />
We will call this space E, and, by construction, P 0 , P 1 , P̃ 1 , P 2 ∈ P (E).<br />
1 To be exact, P̃ 1 = (1 − Tr P 0 P 1 )⁻¹ (P 1 + P 0 P 1 P 0 − P 1 P 0 − P 0 P 1 ).<br />
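The construction of (A. 9) and the formula of the footnote can be checked numerically. The sketch below is an illustration only: the example vectors and all helper names are our own assumptions, and in three real dimensions we simply use the cross product to obtain a vector e 2 perpendicular to the other two.<br />

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# Assumed example vectors (not from the text): two non-orthogonal unit vectors.
e0 = [0.0, 0.0, 1.0]
e1 = normalize([1.0, 2.0, 2.0])

# (A. 9): apply (1 - P_0) to e_1, i.e. subtract the component along e_0, then normalize.
overlap = dot(e0, e1)
e1_tilde = normalize([x - overlap * y for x, y in zip(e1, e0)])

# A unit vector perpendicular to both e_0 and e1_tilde.
e2 = cross(e0, e1_tilde)

P0, P1 = outer(e0, e0), outer(e1, e1)
P0P1, P1P0 = matmul(P0, P1), matmul(P1, P0)
P0P1P0 = matmul(P0P1, P0)

# Footnote formula for the projector onto e1_tilde.
scale = 1.0 / (1.0 - trace(P0P1))
formula = [[scale * (P1[i][j] + P0P1P0[i][j] - P1P0[i][j] - P0P1[i][j])
            for j in range(3)] for i in range(3)]
direct = outer(e1_tilde, e1_tilde)
max_diff = max(abs(formula[i][j] - direct[i][j])
               for i in range(3) for j in range(3))
```

The quantity max_diff compares the footnote expression with the projector built directly from |ẽ 1 ⟩; it vanishes up to rounding because the bracket in the footnote equals (𝟙 − P 0 ) P 1 (𝟙 − P 0 ).<br />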
Now we have the following statements,<br />
(a) P (E) ⊂ P (H),<br />
(b) the restriction of µ 0 to P (E) is a measure on P (E),<br />
(c) the restriction of µ to P (E) is a measure on P (E),<br />
(d) the measures µ 0 and µ differ on P (E).<br />
Statement (a) follows immediately from E ⊂ H. The statements (b) and (c) follow from the<br />
fact that both µ 0 and µ, being concentrated on P 0 , assign the value 1 to E, thereby assigning<br />
the value 0 to all subspaces of H perpendicular to E. Statement (d) follows from our assumption<br />
(A. 8).<br />
Next, we have to show that the Hilbert space E can be real. A Hilbert space is real if scalar<br />
multiplication and linear combinations of vectors are only carried out with real coefficients and<br />
the inner products are real. Choosing the vectors |e 0 ⟩, |ẽ 1 ⟩ and |e 2 ⟩, we have the freedom to<br />
absorb an arbitrary phase factor, which means that we can also take them real. Furthermore, we<br />
can exploit that freedom to bring about that the vector |e 1 ⟩, lying in the plane spanned by |e 0 ⟩<br />
and |ẽ 1 ⟩, becomes a linear combination with real coefficients, i.e.,<br />
|e 1 ⟩ = a |e 0 ⟩ + b |ẽ 1 ⟩ with a, b ∈ R. (A. 10)<br />
All inner products of the four vectors |e 0 ⟩, |ẽ 1 ⟩, |e 2 ⟩ and |e 1 ⟩ now have a real value. The required<br />
real Hilbert space is obtained by taking all linear combinations of |e 0 ⟩, |ẽ 1 ⟩ and |e 2 ⟩ with real<br />
coefficients. Because both |e 0 ⟩ and |e 1 ⟩ are elements of this Hilbert space, (a) through (d) remain<br />
valid.<br />
We see that, if Gleason’s theorem for pure states is not true for a complex Hilbert space with<br />
dim > 2, it is also not true for a real 3 - dimensional Hilbert space. Now assume that the theorem<br />
is proven to be true for a real Hilbert space with dim = 3. Supposing at the same time that it is<br />
not true for a complex Hilbert space with dim > 2, so that it would, as we showed, also not be true for a<br />
real H with dim = 3, yields a contradiction. Therefore, theorem 1 is true. □<br />
A. 3 FORMULATION OF THE PROBLEM ON THE SURFACE OF A SPHERE<br />
While by proving theorem 1 we showed that, if Gleason’s theorem for pure states is true for a<br />
real, 3 - dimensional Hilbert space, it is also true for a complex Hilbert space with dim > 2, we did<br />
not prove that µ = µ 0 . In this section we will take the next steps towards proving that indeed µ = µ 0<br />
for all P ∈ P (H) in a real, 3 - dimensional Hilbert space.<br />
Conversion of an arbitrary complex Hilbert space to a 3 - dimensional real Hilbert space is convenient<br />
because this space is isomorphic with the usual 3 - dimensional Euclidean space R 3 . Here,<br />
the 1 - dimensional projectors correspond to lines through the origin, and we can identify them with<br />
points on the surface of a unit sphere, or actually, with half of the unit sphere because |e⟩ and −|e⟩ represent<br />
the same state. Those points will be designated by means of their spherical coordinates (θ, ϕ),<br />
or as points, or directions, on the surface of the unit sphere p, q, r, s, t, . . . , ∈ S 2 , where S 2 is the<br />
standard notation for this surface, and the index 2 refers to the fact that it is 2 - dimensional.
Letting lines through the origin represent 1 - dimensional projectors, the mapping µ is represented<br />
by a function µ which is a function of the points on S 2 or of the spherical coordinates of those points,<br />
having the following characteristics.<br />
The point p 0 , corresponding to the projector P 0 on which the measure µ is concentrated, so that<br />
µ(p 0 ) = 1, (A. 11)<br />
is called the north pole by convention; the other 1 - dimensional projectors are represented by points on<br />
the northern hemisphere. The set of all 1 - dimensional projectors perpendicular to a given direction r<br />
is called a great circle with axis r, as can be seen in figure A. 2. The great circle representing the<br />
projectors perpendicular to P 0 is called the equator. For any point s on the<br />
equator it holds, according to (A. 1), (A. 11), and the requirement 0 ≤ µ ≤ 1, that<br />
µ(s) = 0. (A. 12)<br />
The requirement (A. 1) for µ to be a measure is, if µ is taken to be a function of points on the<br />
surface of the unit sphere, that for arbitrary, mutually perpendicular axes (r, s, t) in the northern<br />
hemisphere it holds that<br />
µ(r) + µ(s) + µ(t) = 1, (A. 13)<br />
while for µ taken as a function of the spherical coordinates (θ, ϕ) of the points of intersection of the<br />
arbitrary axes (r, s, t) with the surface of the unit sphere we have<br />
µ(θ r , ϕ r ) + µ(θ s , ϕ s ) + µ(θ t , ϕ t ) = 1 (A. 14)<br />
where for any ϕ it holds that<br />
µ(0, ϕ) = 1 and µ(½π, ϕ) = 0, (A. 15)<br />
assigning the required values to the north pole and the equator.<br />
Since we are working in a real, 3 - dimensional Hilbert space, we can assign values to the special<br />
measure (A. 6) in accordance with Von Neumann’s value assignment (V. 33), p. 119. Using (III. 45),<br />
with P 0 = |e 0 ⟩ ⟨e 0 | and P s = |ψ⟩ ⟨ψ|,<br />
µ 0 (P s ) = Tr P 0 P s = |⟨ψ | e 0 ⟩|² , (A. 16)<br />
with θ s the angle between s and the north pole, the special measure can be written as<br />
µ 0 (s) = cos² θ s . (A. 17)<br />
We will come back to this value assignment in section A. 4. In the next two steps we will prove<br />
that any measure µ (s) satisfying the requirements (A. 11) to (A. 13), or (A. 14) and (A. 15), is a<br />
nonincreasing function in θ s , and does not depend on ϕ.
A. 3. 1 STEP 2<br />
THEOREM 2:<br />
If the function µ (s) or, equivalently, µ (θ s , ϕ s ), satisfies the requirements (A. 11)<br />
to (A. 15), then µ(s) is a nonincreasing function in θ s .<br />
We will prove this theorem using two lemmas.<br />
A. 3. 1. 1 LEMMA 1<br />
A LITTLE LEMMA:<br />
Let {s ∈ S 2 | s ⊥ r} be the great circle with axis r ≠ p 0 . Furthermore, let s 0 represent<br />
the most northern point of this circle. Then for all points s of this great circle it holds<br />
that<br />
µ(s 0 ) ≥ µ(s), (A. 18)<br />
i.e., if we let s travel along a great circle, µ(s) will have its maximum value in the most<br />
northern point s 0 .<br />
Proof<br />
Choose a set of three orthogonal directions r, s, t, with s ∈ S 2 an arbitrary point on the great<br />
circle around axis r. From (A. 13) we have<br />
µ(r) + µ(s) + µ(t) = 1. (A. 19)<br />
Now carry out a rotation of the orthogonal pair s and t around the axis r until s arrives at the most<br />
northern point s 0 of the great circle. Under this rotation t arrives at a point t ′ at the equator as<br />
can be seen in figure A. 2.<br />
[Figure A. 2: Rotation of s to s 0 and t to t ′ along a great circle around axis r; the figure also shows the north pole p 0 , the angle θ, the rotation angle ∆ϕ and the equator.]<br />
Since r, s 0 and t ′ are still mutually orthogonal, we have<br />
µ(r) + µ(s 0 ) + µ(t ′ ) = 1, (A. 20)<br />
and combination with (A. 19) gives<br />
µ(s) + µ(t) = µ(s 0 ) + µ(t ′ ). (A. 21)<br />
But t ′ is on the equator, where, according to (A. 12), µ(t ′ ) = 0, and with 0 ≤ µ ≤ 1 we see that<br />
µ(s) = µ(s 0 ) − µ(t) ≤ µ(s 0 ). (A. 22)<br />
Therefore, on the great circle µ(s) has its largest value in the most northern point. □<br />
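The lemma holds for every measure µ; as a plausibility check one can follow the proof numerically for the special measure µ 0 (s) = cos² θ s of (A. 17). The sketch below is such a check and nothing more: the axis r = (1, 1, 1)/√3, the north pole at (0, 0, 1) and all function names are our own assumptions.<br />

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def mu0(s):
    """Special measure (A. 17) with the north pole at (0, 0, 1): cos^2 of the polar angle."""
    return s[2] ** 2

p0 = [0.0, 0.0, 1.0]
r = normalize([1.0, 1.0, 1.0])   # assumed axis, different from p0

# Most northern point of the great circle with axis r: project p0 onto the
# plane perpendicular to r and normalize.
s0 = normalize([p - dot(p0, r) * x for p, x in zip(p0, r)])
t_prime = cross(r, s0)           # perpendicular to r and s0; lies on the equator

residuals, circle_values = [], []
for k in range(360):
    alpha = math.radians(k)
    s = [math.cos(alpha) * a + math.sin(alpha) * b for a, b in zip(s0, t_prime)]
    t = cross(r, s)              # completes the orthogonal triple (r, s, t)
    circle_values.append(mu0(s))
    # (A. 21): mu(s) + mu(t) = mu(s0) + mu(t')
    residuals.append(mu0(s) + mu0(t) - mu0(s0) - mu0(t_prime))
```

For this example the residuals of (A. 21) vanish up to rounding, t ′ has zero height above the equatorial plane, and no value in circle_values exceeds µ 0 (s 0 ), in line with (A. 22).<br />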
A. 3. 1. 2 LEMMA 2<br />
PIRON’S GEOMETRIC LEMMA :<br />
If the pair (s, t) are points on the northern hemisphere, and s lies more northwards than t,<br />
a curve from s to t can be found, consisting entirely of segments of great circles, each<br />
starting at its most northern point.<br />
The following proof of Piron’s geometric lemma using projective geometry has been given by<br />
Cooke, Keane and Moran (1985).<br />
Proof<br />
The surface of the northern hemisphere of the unit sphere can be projected bijectively from the<br />
origin onto the horizontal plane P tangent to the north pole, as can be seen in figure A. 3. Therefore,<br />
we can also formulate our problem in this plane.<br />
[Figure A. 3: Projection of points on a great circle onto a plane P through the north pole; the points s ′ , s 0 , s ′′ are mapped to Im(s ′ ), Im(s 0 ), Im(s ′′ ), with p 0 = Im(p 0 ).]<br />
All great circles, except the equator, are projected onto this plane as straight lines. The most<br />
northern point of such a great circle is projected onto the point of its corresponding line that is<br />
closest to the north pole. The line connecting the image of the north pole, Im(p 0 ), and the image<br />
of s 0 , Im(s 0 ), therefore intersects this line at a right angle.<br />
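These properties of the central (gnomonic) projection can be verified with a few lines of code. The sketch below assumes the north pole at (0, 0, 1), the tangent plane z = 1 and an arbitrarily chosen axis r; these choices and the function names are ours.<br />

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def project(s):
    """Central projection from the origin onto the plane z = 1 tangent to the north pole."""
    return (s[0] / s[2], s[1] / s[2])

p0 = [0.0, 0.0, 1.0]
r = normalize([1.0, 2.0, -2.0])   # assumed axis of a great circle (not the equator)
s0 = normalize([p - dot(p0, r) * x for p, x in zip(p0, r)])   # most northern point
t_prime = cross(r, s0)            # point of the same great circle on the equator

# Sample the part of the great circle that lies on the northern hemisphere.
images = []
for deg in range(-80, 81, 10):
    a = math.radians(deg)
    s = [math.cos(a) * u + math.sin(a) * v for u, v in zip(s0, t_prime)]
    images.append(project(s))

# Dividing r . s = 0 by the height z gives the line r_x X + r_y Y + r_z = 0.
line_residuals = [r[0] * X + r[1] * Y + r[2] for X, Y in images]
distances = [math.hypot(X, Y) for X, Y in images]
image_s0 = project(s0)
# Inner product of Im(s0) with the direction (-r_y, r_x) of the line.
right_angle = image_s0[0] * (-r[1]) + image_s0[1] * r[0]
```

All sampled images lie on one straight line, the smallest distance to the projected north pole is attained by Im(s 0 ), and the vector from Im(p 0 ) to Im(s 0 ) is perpendicular to the line.<br />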
The projection plane therefore contains circles around the projected north pole corresponding to<br />
circles of constant northern latitude, where θ is constant, lines through the projected north pole<br />
corresponding to meridians which are lines of constant ϕ, and projected great circles, where one<br />
of those great circles is depicted in figure A. 4 by the thick grey line, while the projection of its<br />
most northern point is connected with the projected north pole by the thin grey line.<br />
[Figure A. 4: Projection of meridians (ϕ = c), circles with constant latitude (θ = c), and a great circle onto the plane P]<br />
A continuous path from s to t, with s more northern than t, therefore θ s < θ t , along a series of<br />
segments of great circles while always starting at their most northern point, is represented in this<br />
way by a spiral consisting of straight line segments as shown in figure A. 5.<br />
[Figure A. 5: Spiral representing a projected path from s to t along subsequent great circles, each time starting at their most northern point; the spiral winds around the projected north pole p 0 .]<br />
By increasing the number of segments between s and t, we can let this spiral approach a circle<br />
with the north pole as its center. This means that on the northern hemisphere we can travel<br />
every desired distance in longitude by changing over to other great circles, while by changing<br />
over frequently enough we can make the decrease in northern latitude arbitrarily small, leaving θ<br />
constant or nearly constant.<br />
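The claim that the loss of latitude can be made arbitrarily small is easily quantified in the projection plane. The sketch below assumes, for simplicity, that all segments advance the longitude by equal steps; this assumption, the function name and the numerical values are ours.<br />

```python
import math

def final_polar_angle(theta_start, total_longitude, n_segments):
    """Polar angle reached after n straight segments in the tangent plane.

    Assumption of this sketch: the segments advance the longitude in equal steps.
    """
    step = total_longitude / n_segments
    if step >= math.pi / 2:
        raise ValueError("each segment must advance the longitude by less than pi/2")
    # A point at polar angle theta is projected at distance tan(theta) from the pole.
    radius = math.tan(theta_start)
    for _ in range(n_segments):
        # The segment starts at the point of its line closest to the pole, so it is
        # perpendicular to the radius there; after turning through `step` the
        # distance to the pole has grown by a factor 1 / cos(step).
        radius /= math.cos(step)
    return math.atan(radius)

theta_start = math.pi / 4
drops = [final_polar_angle(theta_start, math.pi, n) - theta_start
         for n in (4, 16, 64, 256, 1024)]
```

Traveling half way around the pole from θ = ¼π, the increase in θ stored in drops shrinks steadily as the number of segments grows, since the total stretch factor 1 / cosⁿ(π/n) tends to 1.<br />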
It is also possible to travel from a point t to a more southern point v having the same longitude,<br />
ϕ t = ϕ v . Of course, this can be done by traveling along a nearly circular path as described<br />
above while taking ϕ from 0 to 2π and changing over just often enough to descend the required<br />
distance, but we will show it can also be done taking a path along two great circles only, again<br />
starting in their most northern points.<br />
As we saw, on the plane P paths of constant latitude are represented by circles around the north<br />
pole p 0 . Taking t as the starting point, choose it to be the most northern point of a great circle and<br />
travel along a segment, projected onto P as a straight line, to arrive at u, with θ u > θ t . From u,<br />
also choosing it to be the most northern point of a great circle, travel along a segment in opposite<br />
rotational direction, to arrive at v, the projection of which can be seen in figure A. 6.<br />
[Figure A. 6: Path from t to v, having the same longitude, ϕ(t) = ϕ(v), via the intermediate point u]<br />
By traveling far enough along the great circle through t, u can always be chosen such that v can<br />
be reached from t in two steps. This means that we can always combine paths with constant latitude<br />
and constant longitude to create a path between two points s and t, where s is more northern<br />
than t, consisting entirely of segments of great circles, each starting at its most northern point,<br />
thereby satisfying Piron’s lemma. □<br />
A. 3. 1. 3 RESULT OF LEMMA 1 AND 2<br />
By proving the first lemma, we showed that µ(s 0 ), with s 0 the most northern point of the great<br />
circle through s, is always larger than, or equal to, µ(s). Consequently, µ can only remain constant or<br />
decrease along a great circle if traveling along the circle starts from its most northern point.<br />
According to lemma 2, when traveling from s to t, where s is more northern than t, it is always possible<br />
to follow a path along subsequent great circles, each time starting at their most northern points.<br />
Combining the two lemmas, we can find a sequence of<br />
points s, s ′ , s ′′ , . . . , t, with<br />
µ(s) ≥ µ(s ′ ) ≥ . . . ≥ µ(t) for θ s < θ s ′ < . . . < θ t , (A. 23)<br />
and therefore<br />
µ(s) ≥ µ(t) for θ s < θ t , (A. 24)<br />
which proves theorem 2. □<br />
A. 3. 2 STEP 3<br />
THEOREM 3:<br />
The function µ is constant at constant latitude and hence does not depend on ϕ,<br />
θ s = θ t ⇒ µ(θ s , ϕ s ) = µ(θ t , ϕ t ). (A. 25)<br />
Proof, first part<br />
Again, we will use a proof by contradiction.<br />
Suppose a latitude exists, i.e., there is a horizontal circle B on the surface of the unit sphere,<br />
B(θ 0 ) = {s ∈ S 2 | θ s = θ 0 }, (A. 26)<br />
for which µ is not constant. Here we assume that B(θ 0 ) is not the north pole or the equator, where<br />
theorem 3 is obvious. Now let<br />
M (θ 0 ) := sup{µ(s) ∈ [0, 1] | s ∈ B(θ 0 )} (A. 27)<br />
and<br />
m(θ 0 ) := inf{µ(s) ∈ [0, 1] | s ∈ B(θ 0 )}, (A. 28)<br />
where M (θ 0 ) is the least upper bound, or supremum, and m(θ 0 ) is the greatest lower bound, or<br />
infimum, of all values of µ over B(θ 0 ). If µ does not remain constant, then, for some ε > 0,<br />
M (θ 0 ) − m(θ 0 ) = ε. (A. 29)<br />
Now let C be an arbitrary continuous curve which intersects each circle of constant latitude at<br />
most once, i.e., along C the angle θ is strictly increasing or strictly decreasing.<br />
[Figure A. 7: A strictly increasing or decreasing curve C, intersecting the circle B(θ 0 ) in the point p]<br />
Let p be the point where the curve C intersects the latitude (A. 26),<br />
p = C ∩ B(θ 0 ), (A. 30)<br />
as can be seen in figure A. 7.
For every point s 1 on this curve north of p we have θ s1 < θ 0 , which means that according<br />
to (A. 24) it holds that µ(s 1 ) ≥ µ(s) for every point s ∈ B(θ 0 ). Consequently, it also holds that<br />
µ(θ s1 ) ≥ M (θ 0 ). (A. 31)<br />
Likewise, for all points s 2 of C south of B(θ 0 ) we see that<br />
µ(θ s2 ) ≤ m(θ 0 ). (A. 32)<br />
This reasoning holds no matter how close to B(θ 0 ) the points s 1 and s 2 are chosen.<br />
Because of (A. 29) we conclude that the value of µ, when traveling from north to south along<br />
the curve C, makes a discontinuous jump of at least<br />
M (θ 0 ) − m(θ 0 ) = ε (A. 33)<br />
to a lower value when passing B (θ 0 ). This conclusion applies to every continuous, strictly increasing<br />
or decreasing curve intersecting B (θ 0 ), which means we can also choose the curve C to be a<br />
meridian,<br />
C = {s ∈ S 2 | ϕ s = ϕ 0 }, (A. 34)<br />
which is a great circle through the north pole having its axis t at the equator, see figure A. 8.<br />
[Figure A. 8: Great circle C, coordinate system (p, q, t), and rotating pair (s, s ⊥ ); the figure also shows the north pole p 0 , the circle B(θ 0 ) and the angle θ q .]<br />
Let q ∈ C be orthogonal to the point of intersection p of C and B (θ 0 ), such that t, p and q<br />
are mutually orthogonal. Choose an orthogonal pair (s, s ⊥ ) ∈ C to be a rigid coordinate system.<br />
Rotating this system around axis t, we move s from north to south through point p, whereby,<br />
according to (A. 33), the value of µ jumps discontinuously with at least ε while crossing over the<br />
latitude of B(θ 0 ). The pair s and s ⊥ forming a rigid system, we know that<br />
µ(s) + µ(s ⊥ ) + µ(t) = 1, (A. 35)<br />
where µ(t) = 0 because the axis t is on the equator.
Therefore, if s moves southwards, passing through p, and simultaneously s ⊥ moves northwards,<br />
passing through q, the value of µ(s ⊥ ) also has to jump discontinuously. If µ(s) jumps with −ε,<br />
then µ(s ⊥ ) jumps with ε.<br />
Now choose another great circle C ′ with axis t ′ , which intersects B(θ 0 ) in p under a slightly tilted<br />
angle, as can be seen in figure A. 9.<br />
[Figure A. 9: Great circle C and tilted great circles C ′ and C ′′ through p, with axes t, t ′ , t ′′ and the points q, q ′ , q ′′ perpendicular to p]<br />
For this great circle we can repeat the same argument, and conclude that for s ′ ∈ C ′ , while<br />
passing the latitude of B(θ 0 ), µ(s ′ ) makes a jump of at least ε, and an equally valued but opposite<br />
jump is made by µ (s ′⊥ ) in a point q ′ ∈ C ′ which is again perpendicular to p. Notice that,<br />
because C ′ is tilted with respect to C, θ(q) ≠ θ(q ′ ).<br />
This argument can be repeated endlessly, with great circles C ′′ , C ′′′ , . . . , C n , intersecting B(θ 0 )<br />
in p, always under different angles. We therefore find a series of points q, q ′ , q ′′ , . . . , q n where,<br />
in passing through one of them while traveling along one of the great circles through p, the value<br />
of µ jumps discontinuously. □<br />
Here we briefly pause from the proof of theorem 3 to prove a simple lemma.<br />
ACCESSORY LEMMA:<br />
Let C 1 and C 2 be two continuous curves on S 2 , intersecting in q, where q is not the<br />
most northern point of either curve. For some s ∈ C 1 , with s more northern than q,<br />
suppose that, traveling south, µ(s) makes a discontinuous jump of −ε < 0 in the point<br />
of intersection q.<br />
This means that it holds for all s ∈ C 1 and some constant a,<br />
θ s < θ q ⇒ µ(s) ≥ a, (A. 36)<br />
and<br />
θ s > θ q ⇒ µ(s) ≤ a − ε, (A. 37)<br />
and consequently, for all t ∈ C 2 , µ(t) also makes a discontinuous jump in q of at least ε.
Proof<br />
For every pair of points (s 1 , s 2 ) ∈ C 1 , where θ s1 < θ q and θ s2 > θ q , we can always find a<br />
pair (t 1 , t 2 ) ∈ C 2 , such that θ s1 < θ t1 < θ q and θ s2 > θ t2 > θ q , see figure A. 10.<br />
[Figure A. 10: Two continuous curves C 1 and C 2 on S 2 , intersecting in q at latitude θ q , with points s 1 , t 1 north of q and s 2 , t 2 south of q]<br />
Using (A. 24), (A. 36) and (A. 37), we have for t ∈ C 2<br />
θ s < θ t < θ q ⇒ µ(s) ≥ µ(t) ≥ a, (A. 38)<br />
and<br />
θ s > θ t > θ q ⇒ µ(s) ≤ µ(t) ≤ a − ε. (A. 39)<br />
This holds no matter how close to q the points s and t are chosen, which proves the lemma. □<br />
Proof, second part<br />
Now we continue the proof of theorem 3. In the first part of the proof we showed for the pair<br />
(s, s ⊥ ) that if µ jumps with ε in p, it also jumps with ε in q. The same rigidity holding for any<br />
pair (s i , s i⊥ ) ∈ C i , we concluded that µ jumps in every point q, q ′ , q ′′ , . . . , q n with at least ε.<br />
With the accessory lemma, we proved that, if µ makes a jump of at least ε at some point on one<br />
curve C, it does so on any curve C i through that point.<br />
Since we chose the directions q, q ′ , q ′′ , . . . , q n perpendicular to p, see figure A. 9, they all lie<br />
on C p , a great circle with axis p. Starting in its most northern point q, upon descending along this<br />
great circle C p towards the equator, µ(s) remains constant or decreases, as we showed by proving<br />
theorem 2.<br />
But according to the first part of this proof and the accessory lemma, upon descending along<br />
this great circle C p towards the equator, in each of the points q, q ′ , q ′′ , . . . , q n , µ jumps with<br />
at least −ε while passing their various latitudes. Since we can choose n arbitrarily large, we can<br />
choose n > 1/ε, making the total jump nε > 1. This leads to µ acquiring values<br />
smaller than 0, which contradicts the requirement that 0 ≤ µ ≤ 1. We have to conclude<br />
that ε = 0, which yields M (θ 0 ) = m(θ 0 ).<br />
We proved that if on the surface of the unit sphere a horizontal circle B exists for which µ is not<br />
constant, then µ ∉ [0, 1], hence µ is constant on constant latitude and does not depend on ϕ,<br />
which proves theorem 3. □<br />
A. 4 AN ANALYTIC LEMMA<br />
We have to take one more step to prove that µ = µ 0 , but first we prove a lemma using results<br />
from previous sections.<br />
LEMMA:<br />
The special measure µ 0 can be written as<br />
µ 0 (χ s ) = χ s . (A. 40)<br />
Proof<br />
The special measure (A. 6) can, as we saw in (A. 17), be written as<br />
µ 0 (P ) = Tr P 0 P = |⟨ψ | e 0 ⟩|² = cos² θ. (A. 41)<br />
As we proved that µ is a nonincreasing function in θ, and does not depend on ϕ, we can take µ<br />
to be a function of a function of θ, and to already make a connection with the analytic lemma<br />
of step 4 which will follow shortly, we choose this function to be the continuous, nonincreasing<br />
function χ s : [0, ½π] → [0, 1], χ(θ s ) := cos² θ s , where θ s is the angle between the direction s<br />
and the north pole. In the next step we will show that this measure satisfies the requirements for µ<br />
to be a measure.<br />
For the special measure µ 0 (θ s ), with s representing an arbitrary P , (A. 41) now reads<br />
µ 0 (χ(θ s )) = cos² θ s (A. 42)<br />
which can be written as<br />
µ 0 (χ s ) = χ s . □ (A. 43)<br />
What is left for us to do is to see whether a measure exists, not equal to µ 0 , for which this does<br />
not hold for some P ∈ P (H), as was our assumption (A. 8) in A. 2. 1. This will be the final step,<br />
where, by proving the next theorem, we will see that such a measure does not exist.<br />
A. 4. 1 STEP 4<br />
THEOREM 4:<br />
The only form of µ satisfying (A. 24),<br />
θ s < θ t ⇒ µ(s) ≥ µ(t), (A. 44)<br />
is<br />
µ(χ s ) = χ s . (A. 45)<br />
To prove this theorem, we will use an analytic lemma given by Cooke, Keane and Moran (1985).<br />
Before doing so, we make some observations.<br />
First, for any triple of mutually perpendicular directions (r, s, t) and some direction q it holds in<br />
general that<br />
cos² θ r + cos² θ s + cos² θ t = 1, (A. 46)<br />
where θ r is the angle between the direction q and axis r, so that cos θ r is the component of q along r, likewise for s<br />
and t. We can easily see that (A. 46) holds in general if we express the direction q in the usual spherical<br />
coordinates (θ, ϕ) with respect to the axes (r, s, t),<br />
cos θ r = cos ϕ sin θ, cos θ s = sin ϕ sin θ, and cos θ t = cos θ, (A. 47)<br />
from which we readily see that their squares add up to 1.<br />
With χ(θ r ) = cos² θ r etc., we can write (A. 46) as<br />
χ r + χ s + χ t = 1. (A. 48)<br />
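Relation (A. 48) can also be checked for a triple of axes that is not aligned with the coordinate axes. In the sketch below the triple (r, s, t) is built from arbitrarily chosen vectors and q runs over a grid of directions; the specific vectors, the grid and the function names are assumptions made for this illustration.<br />

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    norm = math.sqrt(dot(u, u))
    return [a / norm for a in u]

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

# Assumed orthonormal triple (r, s, t), built from two arbitrary vectors.
r = normalize([1.0, 2.0, 3.0])
s = normalize(cross(r, [0.0, 0.0, 1.0]))
t = cross(r, s)

def chi_triple(theta, phi):
    """Squared direction cosines of the direction q(theta, phi) with respect to r, s, t."""
    q = [math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta)]
    return [dot(q, axis) ** 2 for axis in (r, s, t)]

# chi_r + chi_s + chi_t for a grid of directions q.
sums = [sum(chi_triple(0.1 * i, 0.37 * j)) for i in range(16) for j in range(17)]
```

Every entry of sums equals 1 up to rounding, as (A. 48) requires, because the squared components of a unit vector along any orthonormal triple add up to its squared length.<br />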
Second, for µ as a function of χ(θ s ), µ : [0, 1] → [0, 1], it holds that although µ is nonincreasing<br />
in θ, it is nondecreasing in χ s . The requirements for µ to be a measure, (A. 14) and (A. 15), can now<br />
be rewritten as<br />
µ(χ r ) + µ(χ s ) + µ(χ t ) = 1, (A. 49)<br />
µ(0) = 0 and µ(1) = 1. (A. 50)<br />
With these properties, µ equals the function f in the analytic lemma which now follows.<br />
ANALYTIC LEMMA:<br />
If f : [0, 1] → [0, 1] is a function such that<br />
(1) f (0) = 0,<br />
(2) f is nondecreasing, i.e., if a < b then f (a) ≤ f (b),<br />
(3) if a, b, c ∈ [0, 1] and a + b + c = 1, then f (a) + f (b) + f (c) = 1,<br />
then f is the identity function: f (a) = a for all a ∈ [0, 1].<br />
Proof<br />
Choosing c = 0, from (3) we have b = 1 − a, yielding<br />
f (a) = 1 − f (1 − a) (A. 51)<br />
for all values a ∈ [0, 1]. Next, choose c = 1 − (a + b),<br />
f (a) + f (b) = 1 − f (1 − (a + b)) = 1 − (1 − f (a + b)) = f (a + b) (A. 52)<br />
for all a, b, a + b ∈ [0, 1].
Iteration of (A. 52) yields, for n ∈ N + ,<br />
$n f(a) = f(na)$ for $na \le 1$. (A. 53)<br />
Taking a = 1/n, and using f (1) = 1, which follows from (1) and (3) with a = 1 and b = c = 0, we see that<br />
$f\left(\frac{1}{n}\right) = \frac{f(1)}{n} = \frac{1}{n}$, (A. 54)<br />
and iterating again, we have<br />
$f\left(\frac{m}{n}\right) = \frac{m}{n}$ for m, n ∈ N, m < n, (A. 55)<br />
or, indeed,<br />
f (a) = a ∀ a ∈ Q ∩ [0, 1]. (A. 56)<br />
From (2) and (A. 54) we see that<br />
$\lim_{a \to 0} f(a) = \inf_{a > 0} f(a) = 0$, (A. 57)<br />
and, using again (A. 52),<br />
$\lim_{a \to 0} f(a + b) = f(b) \quad \forall\; 0 \le b \le 1$. (A. 58)<br />
Therefore, f is continuous, and<br />
f (a) = a ∀ a. □ (A. 59)<br />
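To see which condition does the work in this lemma, it may help to test a candidate function numerically. The sketch below compares the identity with f (a) = sin²(½πa), a function chosen by us for illustration; the grid size and the function names are likewise our own assumptions. The candidate is nondecreasing with f (0) = 0 and f (1) = 1 and satisfies the two-term relation (A. 51), yet it violates condition (3) for triples.<br />

```python
import math
from fractions import Fraction

def max_defect(f, n=12):
    """Largest |f(a) + f(b) + f(c) - 1| over all triples a + b + c = 1 on the grid k/n."""
    worst = 0.0
    for i in range(n + 1):
        for j in range(n + 1 - i):
            a, b = Fraction(i, n), Fraction(j, n)
            c = 1 - a - b   # exact rational arithmetic keeps a + b + c = 1
            worst = max(worst, abs(f(a) + f(b) + f(c) - 1.0))
    return worst

def identity(a):
    return float(a)

def sine_squared(a):
    """Nondecreasing on [0, 1] with f(0) = 0 and f(1) = 1, but not the identity."""
    return math.sin(math.pi * float(a) / 2) ** 2

# The two-term relation (A. 51), f(a) + f(1 - a) = 1, is satisfied by the candidate.
pair_defect = max(
    abs(sine_squared(Fraction(k, 12)) + sine_squared(1 - Fraction(k, 12)) - 1.0)
    for k in range(13)
)
# Condition (3) for triples is satisfied by the identity but not by the candidate.
identity_defect = max_defect(identity)
sine_defect = max_defect(sine_squared)
```

For a = b = c = 1/3 the candidate gives 3 sin²(π/6) = 3/4 instead of 1, so sine_defect is at least 1/4, while pair_defect and identity_defect vanish up to rounding. This is consistent with the remark in section A. 2 that the case dim H = 2, where only pairs of orthogonal directions occur, has to be excluded.<br />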
A. 5 SUMMARY<br />
In this appendix we proved Gleason’s theorem for pure states, represented by extreme measures µ.<br />
In section A. 2 we proved that if Gleason’s theorem for pure states is true for any 3 - dimensional<br />
real Hilbert space, it is also true for any complex Hilbert space with dim H > 2. In A. 3. 1 we showed<br />
that µ is a nonincreasing function in θ, and in A. 3. 2 we proved that µ does not depend on ϕ.<br />
Finally, by proving the analytic lemma we showed that there can only be one form for the measure<br />
µ which satisfies these requirements for all P ∈ P (H) and that is the quantum mechanical one,<br />
i.e., in accordance with cos² θ.<br />
WORKS CONSULTED<br />
Most subjects in these lecture notes are also found in Redhead (1987), Krips (1987), Hughes (1989),<br />
D’Espagnat (1989) and Bub (1997).<br />
Dickson (1998) is an accessible monograph.<br />
Jammer (1974) is a survey of the research in foundations of quantum mechanics in historical perspective<br />
from the beginnings of quantum mechanics until 1974. However, Jammer remains indispensable<br />
for every student seriously studying foundations of quantum mechanics.<br />
Bell (1987) contains his articles on quantum mechanics.<br />
Von Neumann’s Grundlagen (1932) is a masterpiece, which is still fully worth studying.<br />
Prugovečki (2006) is a modernized and more systematic version, but it evades subjects of interpretation<br />
and is mainly a mathematical reference book.<br />
Busch, Lahti and Mittelstaedt (1996) is a monograph on quantum mechanical measurement theory.<br />
Hooker (1975) is a collection of important articles of algebraic and logical signature.<br />
Wheeler and Zurek (1983) is an extensive collection of photocopies of important articles (EPR, Bohr,<br />
Bohm, Everett, etc.).<br />
Fine (1986) is the unequalled monograph on Einstein and quantum mechanics.<br />
Contributions to the research of foundations of quantum mechanics from Utrecht University are the<br />
work of Hilgevoord and Uffink and vice versa, of Dieks and Vermaas about the modal interpretation<br />
of quantum mechanics and Uffink’s thesis (1990) about uncertainty relations.
BIBLIOGRAPHY<br />
Albers, D.J., Alexanderson, G.L., Reid, C. (1990) More Mathematical People : Contemporary Conversations<br />
Boston: Harcourt Brace Jovanovich<br />
Araki, H., Yanase, M.M. (1960) ‘Measurement of Quantum Mechanical Operators’<br />
Physical Review 120 (2) pp. 622-626<br />
Aspect, A., Dalibard, J., Roger, G. (1982) ‘Experimental Test of Bell’s Inequalities Using Time -<br />
Varying Analyzers’<br />
Physical Review Letters 49 (25) pp. 1804-1807<br />
Belinfante, F.J. (1973) A Survey of Hidden - Variables Theories<br />
Oxford: Pergamon Press<br />
Bell, J.S. (1964) ‘On the Einstein Podolsky Rosen Paradox’<br />
Physics 1 (3) pp. 195-200, repr. in Wheeler and Zurek (1983)<br />
Bell, J.S. (1966) ‘On the Problem of Hidden Variables in Quantum Mechanics’<br />
Reviews of Modern Physics 38 pp. 447-452<br />
Bell, J.S. (1971) ‘Introduction to the hidden - variables question’<br />
In d’Espagnat (1971), repr. in Bell (1987)<br />
Bell, J.S. (1975) ‘The Theory of Local Beables’<br />
Presented at the sixth GIFT Seminar, Jaca, 2 - 7 June 1975, repr. in Bell (1987)<br />
Bell, J.S. (1982) ‘On the impossible pilot wave’<br />
Foundations of Physics 12 (10) pp. 989-999<br />
Bell, J.S. (1987) Speakable and Unspeakable in Quantum Mechanics<br />
Cambridge: Cambridge University Press<br />
Bell, J.S. (1990) ‘Against measurement’<br />
Physics World (August) pp. 33-40<br />
Beltrametti, E.G., Cassinelli, G. (1981) The Logic of Quantum Mechanics<br />
Reading: Addison - Wesley Publishing Company<br />
Birkhoff, G., Von Neumann, J. (1936) ‘The Logic of Quantum Mechanics’<br />
The Annals of Mathematics, Second Series 37 (4) pp. 823-843<br />
Bohm, D.J. (1952) ‘A Suggested Interpretation of the Quantum Theory in Terms of “Hidden” Variables.<br />
I, II’<br />
Physical Review 85 (2) pp. 166-179, pp. 180-193
Bohm, D.J., Aharonov, Y. (1957) ‘Discussion of Experimental Proof for the Paradox of Einstein,<br />
Rosen, and Podolsky’<br />
Physical Review 108 (4) pp. 1070-1076<br />
Bohm, D.J. (1981) Wholeness and the implicate order<br />
London: Routledge & Kegan Paul<br />
Bohm, D.J., Peat, F.D. (1989) Science, order, and creativity<br />
London: Routledge<br />
Bohr, N.H.D. (1928) ‘The Quantum Postulate and the Recent Development of Atomic Theory’<br />
Nature 121 (3050) pp. 580-590<br />
Bohr, N.H.D. (1931) ‘Maxwell and Modern Theoretical Physics’<br />
Nature 128 (3234) pp. 691-692<br />
Bohr, N.H.D. (1934) Atomic Theory and the Description of Nature<br />
New York: The Macmillan Company<br />
Bohr, N.H.D. (1935a) ‘Quantum Mechanics and Physical Reality’<br />
Nature 136 p. 65<br />
Bohr, N.H.D. (1935b) ‘Can Quantum - Mechanical Description of Physical Reality Be Considered<br />
Complete?’<br />
Physical Review 48 (8) pp. 696-702<br />
Bohr, N.H.D. (1939) ‘The causality problem in atomic physics’<br />
In Bohr, N.H.D. (1939) New Theories in Physics<br />
Paris: International Institute of Intellectual Co-operation<br />
Bohr, N.H.D. (1947) ‘Newton’s Principles and Modern Atomic Mechanics’<br />
In The Royal Society of London (1947) Newton Tercentenary Celebrations. 15-19 July 1946<br />
Cambridge: Cambridge University Press<br />
Bohr, N.H.D. (1949) ‘Discussion with Einstein on epistemological problems in atomic physics’<br />
In Schilpp (1949), repr. in Wheeler and Zurek (1983)<br />
Bopp, F.A. (1947) ‘Quantenmechanische Statistik und Korrelationsrechnung’<br />
Zeitschrift für Naturforschung A 2 pp. 202-216<br />
Born, M., Jordan, P. (1925) ‘Zur Quantenmechanik’<br />
Zeitschrift für Physik 34 (1) pp. 858-888<br />
Eng. tr. (abridged): ‘On Quantum mechanics’<br />
In Van der Waerden (1967)<br />
Bródy, F., Vámos, T. (eds) (1995) The Neumann Compendium<br />
Singapore: World Scientific Publishing Company
Broglie, L.V.P.R. de (1928) ‘La nouvelle dynamique des quanta’<br />
In La Commission Administrative de l’Institut International de Physique Solvay (1928) Électrons et<br />
Photons: Rapports et Discussions du Cinquième Conseil de Physique tenu à Bruxelles du 24<br />
au 29 Octobre 1927 sous les Auspices de l’Institut International de Physique Solvay<br />
Paris: Gauthier-Villars<br />
Eng. tr.: ‘The new dynamics of quanta’<br />
In Bacciagaluppi, G., Valentini, A. (2009) Quantum Theory at the Crossroads : Reconsidering<br />
the 1927 Solvay Conference<br />
Cambridge: Cambridge University Press<br />
Bub, J., Clifton, R.K. (1996) ‘A Uniqueness Theorem for ‘No Collapse’ Interpretations of Quantum<br />
Mechanics’<br />
Studies in History and Philosophy of Modern Physics 27 (2) pp. 181-219<br />
Bub, J., Clifton, R.K., Goldstein, S. (2000) ‘Revised Proof of the Uniqueness Theorem for ‘No Collapse’<br />
Interpretations of Quantum Mechanics’<br />
Studies in History and Philosophy of Modern Physics 31 pp. 95-98<br />
Bub, J. (1997) Interpreting the Quantum World<br />
Cambridge: Cambridge University Press<br />
Busch, P.S., Grabowski, M.P., Lahti, P.J. (1995) Operational Quantum Physics<br />
Berlin: Springer-Verlag<br />
Busch, P., Lahti, P.J., Mittelstaedt, P. (1991) The Quantum Theory of Measurement<br />
Berlin: Springer-Verlag<br />
Capasso, V., Fortunato, D., Selleri, F. (1973) ‘Sensitive Observables of Quantum Mechanics’<br />
International Journal of Theoretical Physics 7 (5) pp. 319-326<br />
Clauser, J.F., Horne, M.A., Shimony, A., Holt, R.A. (1969) ‘Proposed Experiment to Test Local Hidden-Variable<br />
Theories’<br />
Physical Review Letters 23 (15) pp. 880-884<br />
Clifton, R.K., Butterfield, J.N., Redhead, M.L.G. (1990) ‘Nonlocal Influences and Possible Worlds –<br />
A Stapp in the Wrong Direction’<br />
British Journal for the Philosophy of Science 41 (1) pp. 5-58<br />
Condon, E.U. (1929) ‘Remarks on uncertainty principles’<br />
Science 69 pp. 573-574<br />
Cooke, R.M., Hilgevoord, J. (1979) ‘Correspondence, Equivalence and Completeness’<br />
Epistemological Letters (March) pp. 42-54<br />
Cooke, R.M., Keane, M.S., Moran, W. (1985) ‘An elementary proof of Gleason’s theorem’<br />
Mathematical Proceedings of the Cambridge Philosophical Society 98 pp. 117-128<br />
Cushing, J.T. (1994) Quantum Mechanics : Historical Contingency and the Copenhagen Hegemony<br />
Chicago: The University of Chicago Press
Daneri, A., Loinger, A., Prosperi, G.M. (1962) ‘Quantum Theory of Measurement and Ergodicity<br />
Conditions’<br />
Nuclear Physics 33 pp. 297-319<br />
De Muynck, W.M. (1986) ‘The Bell Inequalities and their Irrelevance to the Problem of Locality in<br />
Quantum Mechanics’<br />
Physics Letters A 114 (2) pp. 65-67<br />
De Muynck, W.M. (1996) ‘Can We Escape from Bell’s Conclusion that Quantum Mechanics Describes<br />
a Non-Local Reality?’<br />
Studies in History and Philosophy of Science Part B: Studies in History and Philosophy of<br />
Modern Physics 27 (3) pp. 315-330<br />
DeWitt, B.S. (1970) ‘Quantum mechanics and reality’<br />
Physics Today 23 (9) pp. 30-40<br />
DeWitt, B.S. (1971) ‘The Many-Universes Interpretation of Quantum Mechanics’<br />
In d’Espagnat (1971), repr. in DeWitt and Graham (1973)<br />
DeWitt, B.S., Graham, R.N. (eds) (1973) The Many-Worlds Interpretation of Quantum Mechanics<br />
Princeton: Princeton University Press<br />
Dickson, W.M. (1998) Quantum Chance and Non-locality : Probability and Non-locality in the<br />
Interpretations of Quantum Mechanics<br />
Cambridge: Cambridge University Press<br />
Dieks, D.G.B.J. (1983) ‘Stochastic Locality and Conservation Laws’<br />
Lettere al Nuovo Cimento 38 (13) pp. 443-447<br />
Dieks, D.G.B.J. (1989) ‘Resolution of the Measurement Problem through Decoherence of the Quantum<br />
State’<br />
Physics Letters A 142 (8,9) pp. 439-446<br />
Dieks, D.G.B.J., Vermaas, P.E. (eds) (1998) The Modal Interpretation of Quantum Mechanics<br />
Dordrecht: Kluwer Academic Publishers<br />
Dirac, P.A.M. (1925) ‘The Fundamental Equations of Quantum Mechanics’<br />
Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical<br />
and Physical Character 109 (752) pp. 642-653<br />
Dirac, P.A.M. (1958) The Principles of Quantum Mechanics<br />
Oxford: Clarendon Press<br />
Dirac, P.A.M. (1963) ‘The Evolution of the Physicist’s Picture of Nature’<br />
Scientific American 208 (5) pp. 45-53<br />
Eberhard, P.H. (1977) ‘Bell’s Theorem without Hidden Variables’<br />
Il Nuovo Cimento B 38 (1) pp. 75-80
Einstein, A. (1921) ‘Geometrie und Erfahrung’<br />
Sitzungsberichte der Preussischen Akademie der Wissenschaften pp. 123-130<br />
Eng. tr.: Bargmann, S. (transl) ‘Geometry and Experience’<br />
In Janssen, M., Schulmann, R., Illy, J., Lehner, C., Buchwald, D. (eds) (2002) The Collected<br />
Papers of Albert Einstein, Volume 7 : The Berlin Years: Writings, 1918-1921<br />
Princeton: Princeton University Press<br />
Einstein, A. (1934) Mein Weltbild<br />
Amsterdam: Querido Verlag<br />
Eng. tr.: Bargmann, S. (transl), Seelig, C. (ed) (1954) Ideas and opinions<br />
New York: Bonanza Books<br />
Einstein, A., Podolsky, B., Rosen, N. (1935) ‘Can Quantum-Mechanical Description of Physical<br />
Reality Be Considered Complete?’<br />
Physical Review 47 (10) pp. 777-780<br />
Einstein, A., Born, M., Born, H. (1971) The Born-Einstein letters : correspondence between Albert<br />
Einstein and Max and Hedwig Born from 1916 to 1955 with commentaries by Max Born<br />
London: The Macmillan Press<br />
Espagnat, B. d’ (ed) (1971) Foundations of Quantum Mechanics : Proceedings of the International<br />
School of Physics “Enrico Fermi”, held at Varenna, 29th June-11th July, 1970, Course IL<br />
New York: Academic Press<br />
Espagnat, B. d’ (1989) Conceptual Foundations Of Quantum Mechanics<br />
New York: Perseus Books<br />
Everett, H. III (1957) ‘The Theory of the Universal Wave Function’<br />
In DeWitt and Graham (1973)<br />
Everett, H. III (1957) ‘“Relative State” Formulation of Quantum Mechanics’<br />
Reviews of Modern Physics 29 (3) pp. 454-462<br />
Fine, A.I. (1982) ‘Hidden Variables, Joint Probability, and the Bell Inequalities’<br />
Physical Review Letters 48 (5) pp. 291-295<br />
Fine, A. (1986) The shaky game : Einstein, realism and the quantum theory<br />
Chicago: University of Chicago Press<br />
Folse, H.J. (1985) The philosophy of Niels Bohr : the framework of complementarity<br />
Amsterdam: North-Holland Physics Publishing<br />
Fraassen, B.C. van (1973) ‘Semantic Analysis of Quantum Logic’<br />
In Hooker, C.A. (ed) (1973) Contemporary Research in the Foundations and Philosophy of<br />
Quantum Theory<br />
Dordrecht: D. Reidel Publishing Company<br />
Fraassen, B.C. van (1979) ‘Hidden Variables and the Modal Interpretation of Quantum Theory’<br />
Synthese 42 (1) pp. 155-165
Frank, P.G. (1949) Modern Science and Its Philosophy<br />
Cambridge: Harvard University Press<br />
Freedman, S.J., Clauser, J.F. (1972) ‘Experimental Test of Local Hidden-Variable Theories’<br />
Physical Review Letters 28 (14) pp. 938-941<br />
Ghirardi, G.C., Rimini, A., Weber, T. (1980) ‘A General Argument against Superluminal Transmission<br />
through the Quantum Mechanical Measurement Process’<br />
Lettere al Nuovo Cimento 27 (10) pp. 293-298<br />
Ghirardi, G.C., Rimini, A., Weber, T. (1986) ‘Unified dynamics for microscopic and macroscopic<br />
systems’<br />
Physical Review D 34 (2) pp. 470-491<br />
Gleason, A.M. (1957) ‘Measures on the Closed Subspaces of a Hilbert Space’<br />
Journal of Mathematics and Mechanics 6 pp. 885-893<br />
Gottfried, K. (1989) ‘Does Quantum Mechanics describe the Collapse of the Wavefunction?’<br />
Unpublished contribution to the 1989 Conference, International School of History of Science,<br />
Erice, Italy, 5-14 August<br />
Greenberger, D.M., Horne, M.A., Zeilinger, A. (1989) ‘Going Beyond Bell’s Theorem’<br />
In Kafatos, M.C. (ed) (1989) Bell’s Theorem, Quantum Theory and Conceptions of the Universe<br />
Dordrecht: Kluwer Academic Publishers<br />
http://arxiv.org/abs/0712.0921<br />
Groenewold, H.J. (1946) ‘On the Principles of Elementary Quantum Mechanics’<br />
Physica 12 (7) pp. 405-460<br />
Haag, R. (1990) ‘Fundamental Irreversibility and the Concept of Events’<br />
Communications in Mathematical Physics 132 pp. 245-251<br />
Healey, R.A. (1989) The philosophy of quantum mechanics : An interactive interpretation<br />
Cambridge: Cambridge University Press<br />
Heisenberg, W. (1925) ‘Über quantentheoretische Umdeutung kinematischer und mechanischer<br />
Beziehungen’<br />
Zeitschrift für Physik 33 (1) pp. 879-893<br />
Eng. tr.: ‘Quantum-theoretical re-interpretation of kinematic and mechanical relations’<br />
In Van der Waerden, B.L. (1967)<br />
Heisenberg, W.K. (1927) ‘Über den anschaulichen Inhalt der quantentheoretischen Kinematik und<br />
Mechanik’<br />
Zeitschrift für Physik 43 (3/4) pp. 172-198<br />
Eng. tr.: ‘The physical content of quantum kinematics and mechanics’<br />
In Wheeler and Zurek (1983)
Heisenberg, W.K. (1930) Die Physikalischen Prinzipien der Quantentheorie<br />
Leipzig: Verlag von S. Hirzel<br />
Eng. tr.: Eckart, C., Hoyt, F.C. (transl) (1930) The Physical Principles of Quantum Theory<br />
New York: Dover Publications<br />
Heisenberg, W. (1963) Niels Bohr Library and Archives<br />
Interview with Werner Heisenberg by T. S. Kuhn at the Max Planck Institute, Munich, Germany,<br />
February 25. Transcript Session VIII<br />
http://www.aip.org/history/ohilist/4661_8.html<br />
Heitler, W.H. (1970) Der Mensch und die naturwissenschaftliche Erkenntnis<br />
Braunschweig: Friedrich Vieweg & Sohn Verlagsgesellschaft<br />
Hey, T., Walters, P. (2003) The New Quantum Universe<br />
Cambridge: Cambridge University Press<br />
Hilgevoord, J., Uffink, J.B.M. (1988) ‘The mathematical expression of the uncertainty principle’<br />
In Merwe, A. van der, Selleri, F., Tarozzi, G. (eds) (1988) Microphysical Reality and Quantum<br />
Formalism. Volume I<br />
Dordrecht: Kluwer Academic Publishers<br />
Hilgevoord, J., Uffink, J.B.M. (1990) ‘A new view on the uncertainty principle’<br />
In Miller, A.I. (ed) (1990) Sixty-Two Years of Uncertainty : Historical, Philosophical and<br />
Physical Inquiries into the Foundations of Quantum Mechanics<br />
New York: Plenum Press<br />
Hilgevoord, J. (2002) ‘Time in quantum mechanics’<br />
American Journal of Physics 70 (3) pp. 301-306<br />
Holevo, A.S. (1982) Probabilistic and Statistical Aspects of Quantum Theory<br />
Amsterdam: North-Holland Publishing Company<br />
Holland, P.R. (1993) The Quantum Theory of Motion : An Account of the de Broglie-Bohm Causal<br />
Interpretation of Quantum Mechanics<br />
Cambridge: Cambridge University Press<br />
Home, D., Selleri, F. (1991) ‘Bell’s Theorem and the EPR Paradox’<br />
La Rivista del Nuovo Cimento 14 (9) pp. 1-95<br />
’t Hooft, G. (1997) In search of the ultimate building blocks<br />
Cambridge: Cambridge University Press<br />
Hooker, C.A. (ed) (1975) The Logico-Algebraic Approach to Quantum Mechanics. Volume I: the<br />
Historical Evolution<br />
Dordrecht: D. Reidel Publishing Company<br />
Hughes, R.I.G. (1989) The Structure and Interpretation of Quantum Mechanics<br />
Cambridge: Harvard University Press
Isham, C.J. (1995) Lectures on Quantum Theory : Mathematical and Structural Foundations<br />
River Edge: Imperial College Press<br />
Jacques, V., Wu, E., Grosshans, F., Treussart, F., Grangier, P., Aspect, A., Roch, J.-F. (2007) ‘Experimental<br />
realization of Wheeler’s delayed-choice gedanken experiment’<br />
Science 315 (5814) pp. 966-968<br />
Jammer, M. (1974) The Philosophy of Quantum Mechanics : The Interpretations of Quantum Mechanics<br />
in Historical Perspective<br />
New York: John Wiley & Sons<br />
Jammer, M. (1990) ‘John Stewart Bell and His Work - On the Occasion of His Sixtieth Birthday’<br />
Foundations of Physics 20 (10) pp. 1139-1145<br />
Jammer, M. (1992) ‘John Stewart Bell and the Debate on the Significance of His Contributions to<br />
the Foundations of Quantum Mechanics’<br />
In Merwe, A. van der, Selleri, F., Tarozzi, G. (eds) (1992) International Conference on Bell’s<br />
Theorem and the Foundations of Modern Physics<br />
Singapore: World Scientific Publishing<br />
Jarrett, J.P. (1984) ‘On the Physical Significance of the Locality Conditions in the Bell Arguments’<br />
Noûs 18 (4) pp. 569-589<br />
Jauch, J.M. (1968) Foundations of Quantum Mechanics<br />
Reading: Addison-Wesley Educational Publishers<br />
Kalckar, J. (ed) (1996) Niels Bohr - Collected Works : Volume 7 - Foundations of Quantum Physics II<br />
(1933-1958)<br />
Amsterdam: Elsevier Science<br />
Kampen, N.G. van (1988) ‘Ten Theorems about Quantum Mechanical Measurements’<br />
Physica A 153 pp. 97-113<br />
Kennard, E.H. (1927) ‘Zur Quantenmechanik einfacher Bewegungstypen’<br />
Zeitschrift für Physik 44 (4/5) pp. 326-352<br />
Kochen, S., Specker, E.P. (1967) ‘The Problem of Hidden Variables in Quantum Mechanics’<br />
Journal of Mathematics and Mechanics 17 (1) pp. 59-87<br />
Kochen, S. (1985) ‘A New Interpretation of Quantum Mechanics’<br />
In Lahti, P.J., Mittelstaedt, P. (eds) (1985) Symposium on the foundations of modern physics<br />
1985 : 50 years of the Einstein-Podolsky-Rosen Gedankenexperiment<br />
Singapore: World Scientific Publishing Company<br />
Krips, H. (1987) The Metaphysics of Quantum Theory<br />
Oxford: Clarendon Press<br />
Landau, L.D., Lifshitz, E.M. (1958) Quantum Mechanics : Non-Relativistic Theory<br />
London: Pergamon Press
Landau, H.J., Pollak, H.O. (1961) ‘Prolate Spheroidal Wave Functions, Fourier Analysis and Uncertainty<br />
- II’<br />
The Bell System Technical Journal 40 pp. 65-84<br />
London, F., Bauer, E. (1939) La Théorie de l’Observation en Mécanique Quantique<br />
Paris: Hermann<br />
Eng. tr.: ‘The Theory of Observation in Quantum Mechanics’<br />
In Wheeler and Zurek (1983)<br />
Lüders, G., (1951) ‘Über die Zustandsänderung durch den Meßprozeß’<br />
Annalen der Physik 443 (5-8) pp. 322-328<br />
Eng. tr.: Kirkpatrick, K.A. (transl) (2006) ‘Concerning the state-change due to the measurement<br />
process’<br />
Annalen der Physik 15 (9) pp. 663-670<br />
Maczynski, M.J. (1971) ‘Boolean Properties of Observables in Axiomatic Quantum Mechanics’<br />
Reports on Mathematical Physics 2 (2) pp. 135-150<br />
Mermin, N.D. (1993) ‘Hidden variables and the two theorems of John Bell’<br />
Reviews of Modern Physics 65 (3) pp. 803-815<br />
Meyer, D.A. (1999) ‘Finite precision measurement nullifies the Kochen-Specker theorem’<br />
Physical Review Letters 83 pp. 3751-3754<br />
Meyer, D.A. (2003) ‘Coloring, quantum mechanics, and Euclid’<br />
Pdf file: math.ucsd.edu/~dmeyer/research/talks/cqmE.pdf<br />
Miller, A.I. (ed) (1990) Sixty-Two Years of Uncertainty : Historical, Philosophical and Physical<br />
Inquiries into the Foundations of Quantum Mechanics<br />
New York: Plenum Press<br />
Miller, W.A., Wheeler, J.A. (1984) ‘Delayed-Choice Experiments and Bohr’s Elementary Quantum<br />
Phenomenon’<br />
In Nakajima, S., Murayama, Y., Tonomura, A. (eds) (1996) Foundations of Quantum Mechanics<br />
in the Light of New Technology<br />
Singapore: World Scientific Publishing<br />
Muller, F.A. (1997a) ‘The Equivalence Myth of Quantum Mechanics–Part I’<br />
Studies in History and Philosophy of Modern Physics 28 (1) pp. 35-61<br />
(1997b) ‘Part II’<br />
ibid. 28 (2) pp. 219-247<br />
(1999) ‘(Addendum)’<br />
ibid. 30 (4) pp. 543-545<br />
Neumann, J. Von (1932) Mathematische Grundlagen der Quantenmechanik<br />
Berlin: Verlag von Julius Springer<br />
Eng. tr.: Beyer, R.T. (transl) (1955) The Mathematical Foundations of Quantum Mechanics<br />
Princeton: Princeton University Press
Pauli, W.E. (1933) Die allgemeinen Prinzipien der Wellenmechanik<br />
Berlin: Verlag von Julius Springer<br />
Eng. tr.: (1950) The General principles of wave mechanics<br />
Urbana-Champaign: University of Illinois Press<br />
Penrose, R. (1996) ‘On Gravity’s Role in Quantum State Reduction’<br />
General Relativity and Gravitation 28 (5) pp. 581-600<br />
Peres, A. (1993) Quantum Theory: Concepts and Methods<br />
Dordrecht: Kluwer Academic Publishers<br />
Petersen, A. (1963) ‘The Philosophy of Niels Bohr’<br />
Bulletin of the Atomic Scientists 19 (7) pp. 8-14<br />
Petersen, A. (1968) Quantum Physics and the Philosophical Tradition<br />
Cambridge: M.I.T. Press<br />
Piron, C. (1976) Foundations of Quantum Physics<br />
Reading: W.A. Benjamin<br />
Prugovečki, E. (2006) Quantum Mechanics in Hilbert Space<br />
Mineola: Dover Publications<br />
Przibram, K. (ed) (1963) Briefe zur Wellenmechanik : Schrödinger, Planck, Einstein, Lorentz<br />
Wien: Springer-Verlag<br />
Eng. tr.: Przibram, K. (ed) (1963) Letters on wave mechanics : Schrödinger, Planck, Einstein,<br />
Lorentz<br />
New York: Philosophical Library<br />
Rauch, H., Werner, S.A. (2000) Neutron Interferometry : Lessons in Experimental Quantum Mechanics<br />
Oxford: Oxford University Press<br />
Redhead, M.L.G. (1987) Incompleteness, Nonlocality and Realism : A Prolegomenon to the Philosophy<br />
of Quantum Mechanics<br />
Oxford: Clarendon Press<br />
Robertson, H.P. (1929) ‘The Uncertainty Principle’<br />
Physical Review 34 p. 163<br />
Scheibe, E., Sykes, J.B. (transl) (1973) The Logical Analysis of Quantum Mechanics<br />
Oxford: Pergamon Press<br />
Schiff, L.I. (1949) Quantum Mechanics<br />
New York: McGraw-Hill<br />
Schilpp, P.A. (ed) (1949) Albert Einstein : Philosopher-Scientist<br />
Evanston: The Library of Living Philosophers
Schmidt, E. (1907) ‘Zur Theorie der linearen und nichtlinearen Integralgleichungen. I. Teil’<br />
Mathematische Annalen 63 pp. 433-476<br />
(1907) ‘Zweite Abhandlung’<br />
ibid. 64 pp. 161-174<br />
(1907) ‘III. Teil’<br />
ibid. 65 pp. 370-399<br />
Schrödinger, E.R.J.A. (1926) ‘An Undulatory Theory of the Mechanics of Atoms and Molecules’<br />
The Physical Review 28 (6) pp. 1049-1070<br />
Schrödinger, E.R.J.A. (1930) ‘Zum Heisenbergschen Unschärfeprinzip’<br />
Sitzungsberichte der Preußischen Akademie der Wissenschaften. Physikalisch-mathematische<br />
Klasse pp. 296-303<br />
Schrödinger, E.R.J.A. (1935a) ‘Discussion of Probability Relations between Separated Systems’<br />
Mathematical Proceedings of the Cambridge Philosophical Society 31 (4) pp. 555-563<br />
Schrödinger, E.R.J.A. (1935b) ‘Die gegenwärtige Situation in der Quantenmechanik’<br />
Naturwissenschaften 23 (48) pp. 807-812, (49) pp. 823-828, (50) pp. 844-849<br />
Eng. tr.: Trimmer, J.D. (transl) (1980) ‘The Present Situation in Quantum Mechanics: A Translation<br />
of Schrödinger’s “Cat Paradox”’<br />
Proceedings of the American Philosophical Society 124 (5) pp. 323-338<br />
Repr. in Wheeler and Zurek (1983)<br />
Shimony, A. (1984) ‘Controllable and Uncontrollable Non-Locality’<br />
In Kamefuchi, S., et al. (eds) Proceedings of the International Symposium : Foundations of<br />
Quantum Mechanics in the Light of New Technology<br />
Tokyo: Physical Society of Japan<br />
Shimony, A. (1989) ‘Search for a Worldview Which Can Accommodate Our Knowledge of Microphysics’<br />
In Cushing, J.T., McMullin, E. (eds) Philosophical Consequences of Quantum Theory : Reflections<br />
on Bell’s Theorem<br />
Notre Dame: University of Notre Dame Press<br />
Shimony, A. (1995) ‘Degree of entanglement’<br />
In Greenberger, D.M., Zeilinger, A. (eds) Fundamental Problems in Quantum Theory : In<br />
Honor of Professor John A. Wheeler<br />
New York: New York Academy of Sciences<br />
Stapp, H.P. (1975) ‘Bell’s Theorem and World Process’<br />
Il Nuovo Cimento B 29 (2) pp. 270-276<br />
Stapp, H.P. (1977) ‘Are Superluminal Connections Necessary?’<br />
Il Nuovo Cimento B 40 (1) pp. 191-205<br />
Stone, M.H. (1932) ‘On One-Parameter Unitary Groups in Hilbert Space’<br />
The Annals of Mathematics, Second Series 33 (3) pp. 643-648
Suppes, P., Zanotti, M. (1976) ‘On the Determinism of Hidden Variable Theories with Strict Correlation<br />
and Conditional Statistical Independence of Observables’<br />
In Suppes, P. (ed) Logic and Probability in Quantum Mechanics<br />
Dordrecht: D. Reidel Publishing Company<br />
Svetlichny, G., Redhead, M.L.G., Brown, H.R., Butterfield, J. (1988) ‘Do the Bell Inequalities Require<br />
the Existence of Joint Probability Distributions?’<br />
Philosophy of Science 55 (3) pp. 387-401<br />
Tkadlec, J. (2000) ‘Diagrams of Kochen-Specker Type Constructions’<br />
International Journal of Theoretical Physics 39 (3) pp. 921-926<br />
Uffink, J.B.M., Hilgevoord, J. (1985) ‘Uncertainty Principle and Uncertainty Relations’<br />
Foundations of Physics 15 (9) pp. 925-944<br />
Uffink, J.B.M., Hilgevoord, J. (1988) ‘Interference and Distinguishability in Quantum Mechanics’<br />
Physica B 151 pp. 309-313<br />
Uffink, J.B.M. (1990) Measures of Uncertainty and the Uncertainty Principle<br />
Utrecht: Rijksuniversiteit te Utrecht, Dissertation<br />
Vermaas, P.E., Dieks, D.G.B.J. (1995) ‘The Modal Interpretation of Quantum Mechanics and its<br />
Generalization to Density Operators’<br />
Foundations of Physics 25 (1) pp. 145-158<br />
Vermaas, P.E. (1999) A Philosopher’s Understanding of Quantum Mechanics : Possibilities and Impossibilities<br />
of a Modal Interpretation<br />
Cambridge: Cambridge University Press<br />
Vigier, J.-P., Dewdney, C., Holland, P.R., Kyprianidis, A. (1987) ‘Causal particle trajectories and the<br />
interpretation of quantum mechanics’<br />
In Hiley, B.J., Peat, F.D. (eds) (1987) Quantum implications : essays in honour of David Bohm<br />
London: Routledge & Kegan Paul<br />
Waerden, B.L. Van der (ed) (1967) Sources of Quantum Mechanics<br />
Amsterdam: North-Holland Publishing Company<br />
Weihs, G., Jennewein, T., Simon, C., Weinfurter, H., Zeilinger, A. (1998) ‘Violation of Bell’s Inequality<br />
under Strict Einstein Locality Conditions’<br />
Physical Review Letters 81 (23) pp. 5039-5043<br />
Wheatley, M.J. (2001) Leadership and the New Science : Discovering Order in a Chaotic World<br />
San Francisco: Berrett-Koehler Publishers<br />
Wheeler, J.A. (1957) ‘Assessment of Everett’s “Relative State” Formulation of Quantum Theory’<br />
Reviews of Modern Physics 29 (3) pp. 463-465<br />
Wheeler, J.A., Zurek, W.H. (eds) (1983) Quantum Theory and Measurement<br />
Princeton: Princeton University Press
Wick, G.C., Wightman, A.S., Wigner, E.P. (1952) ‘The Intrinsic Parity of Elementary Particles’<br />
Physical Review 88 (1) pp. 101-105<br />
Wigner, E.P. (1952) ‘Die Messung quantenmechanischer Operatoren’<br />
Zeitschrift für Physik 133 pp. 101-108<br />
Wigner, E.P. ‘Remarks on the mind-body question’<br />
In Good, I.J. (1962) The scientist speculates : an anthology of partly-baked ideas<br />
London: Heinemann<br />
Repr. in Wheeler and Zurek (1983)<br />
Wigner, E.P. (1963) ‘The problem of measurement’<br />
American Journal of Physics 31 (6) pp. 6-15<br />
Wigner, E.P. (1970) ‘On Hidden Variables and Quantum Mechanical Probabilities’<br />
American Journal of Physics 38 (8) pp. 1005-1009<br />
Wigner, E.P. (1983) ‘Interpretation of Quantum Mechanics’<br />
In Wheeler and Zurek (1983)<br />
Zukav, G. (1984) The Dancing Wu Li Masters : An Overview of the New Physics<br />
New York: Bantam Books<br />
Zurek, W.H. (1981) ‘Pointer basis of quantum apparatus: Into what mixture does the wave packet<br />
collapse?’<br />
Physical Review D 24 (6) pp. 1516-1525<br />
Zurek, W.H. (1982) ‘Environment-induced superselection rules’<br />
Physical Review D 26 (8) pp. 1862-1880