2.2 Quantum States and Hilbert Vector Space

by the coefficients $a_i$. However, it is clear that I cannot identify the coefficients
themselves with the probabilities because $a_i$ are not necessarily positive or even real
(if they were, we could not describe both constructive interference and
destructive interference, just as in the case of electromagnetic waves). At the same
time, you can recall that in the case of wave interference, it is not the amplitudes
of the waves but their intensities, proportional to the squared absolute values of the
amplitudes, that determine the brightness of the interference fringes. Thus, we can
surmise that in the case of quantum superposition, the respective probabilities are given
by $|a_i|^2$:

$$p(q_i) = |a_i|^2. \qquad (2.23)$$

Multiplying Eq. 2.2 by $\langle q_1|$ or $\langle q_2|$ (the bra counterparts of the respective vectors $|q_i\rangle$)
from the left and using the orthogonality condition, Eq. 2.22, I can derive for the
coefficients $a_i$

$$\langle q_1|\alpha\rangle = \langle q_1|\,(a_1|q_1\rangle + a_2|q_2\rangle) = a_1\|q_1\|^2 \;\Rightarrow\; a_1 = \langle q_1|\alpha\rangle$$
$$\langle q_2|\alpha\rangle = \langle q_2|\,(a_1|q_1\rangle + a_2|q_2\rangle) = a_2\|q_2\|^2 \;\Rightarrow\; a_2 = \langle q_2|\alpha\rangle, \qquad (2.24)$$

where I took into account the convention that all vectors describing quantum states
are presumed to be normalized. The expressions derived in Eq. 2.24 allow us to present
Eq. 2.23 for the probability in a more generic form:

$$p(q_i) = |\langle q_i|\alpha\rangle|^2. \qquad (2.25)$$

Applying Eq. 2.25 to the case $|\alpha\rangle = |q_j\rangle$, you find $p(q_i) = \delta_{i,j}$, establishing a
formal correspondence between the notions of mutual exclusivity and orthogonality.
Computation of the norm of the state $|\alpha\rangle$ yields

$$\|\alpha\|^2 = |a_1|^2 + |a_2|^2 \equiv p_1 + p_2.$$

If $\|\alpha\| = 1$, i.e., the state $|\alpha\rangle$ is normalized as presumed, then you obtain the relation
$p_1 + p_2 = 1$, in complete agreement with what is expected of probabilities. This
result reinforces my (well, actually Max Born's) suggestion to interpret $|\langle q_i|\alpha\rangle|^2$
as the probability that the measurement of observable $q$ on a system in state $|\alpha\rangle$ will
produce $q_i$.

It is important to emphasize that any uncertainty in the result of the measurement
of the observable $q$ exists only before the measurement has taken place and that the
probability referred to in this discussion describes the relative frequency of a given outcome
in a series of such measurements. After the measurement is carried out, and one of
the values of the observable is actually observed, all uncertainty has disappeared.
We now know that the measurement yielded a particular value $q_i$, which, according
to our earlier proposition, is only possible if the system is in the respective state



$|q_i\rangle$. Thus we have to conclude that the act of measurement has destroyed the initial
state $|\alpha\rangle$ and "collapsed" it into the state $|q_i\rangle$. The most intriguing question, of course,
is what determines the state into which the system collapses. This question has
been debated during the entire 100+ year-long history of quantum mechanics and is
being debated still. The orthodox Copenhagen interpretation of quantum mechanics
essentially leaves this question without an answer, claiming that the choice of the
final (after-measurement) state is completely random.2 I propose that you accept
this interpretation as quite sufficient for most practical purposes, even though it leaves
people of a philosophical state of mind somewhat unsatisfied.

Equation 2.2 describes a superposition of only two states. It is not too difficult
to imagine that it can be extended to the case of an arbitrary number of states
$|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\rangle$ generated by mutually consistent observables with discrete
spectrum:

$$|\alpha\rangle = \sum_{k,m,\cdots,p} a_{k,m,\cdots,p}\, |q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\rangle. \qquad (2.26)$$

This sum, in principle, can contain any number of terms, including infinitely
many. In the latter case, of course, one has to start worrying about its con-
vergence, but I will leave these worries to mathematicians. The coefficients $a_{k,m,\cdots,p}$
appearing in this equation have the same meaning as the coefficients in the two-state
superposition: the probabilities are given by Eq. 2.25 with $|q_i\rangle$ replaced by the more general
state appearing in Eq. 2.26.

2At a talk given at the physics department of Queens College in New York in 2014, British
mathematician J.H. Conway (currently Professor Emeritus of Mathematics at Princeton University)
dismissed the randomness postulate of the Copenhagen interpretation as a "cop-out," also
because the use of probabilities only makes sense when one deals with a well-defined ensemble
of events or particles, which is not true in the case of a single electron or photon. At the same
time, he and S. Kochen (Canada) proved a mathematical theorem asserting that the entire structure
of quantum mechanics is inconsistent with the idea of the existence of some unknown characteristics
of quantum systems, which would, should we find them, provide a deterministic description of the
system. In this sense, they proved the completeness of the existing structure of quantum theory
and buried the idea of "hidden variables" (unknown elements of reality which could restore
determinism to the quantum world), provided that we are unwilling to throw away the entire
conceptual structure of quantum mechanics, which, so far, has given excellent quantitative explanations
of a vast amount of experimental data. The Conway and Kochen theorem is called the "free will
theorem" because it can be interpreted as an assertion that electrons, just like humans, have "free
will," which in the strict mathematical sense means that an electron's future behavior might not be a
deterministic function of its past. A description of the theorem can be found here: https://en.
wikipedia.org/wiki/Free_will_theorem.


2.3 States Characterized by Observables with Continuous
Spectrum

In the previous section, I considered only states generated by observables with
discrete spectrum. As a result, even though the number of states in Eq. 2.26 can
be infinite, they are still countable (one can enumerate them using natural numbers
$1, 2, 3, \cdots$). Some observables, however, have continuous spectrum, meaning that
they can take values from a continuous (finite or infinite) interval. One such
important observable is a particle's position, measured by its position vector $\mathbf{r}$
or a set of Cartesian coordinates $(x, y, z)$ defined in a particular coordinate system.
It is interesting to note in this regard that while in classical mechanics descriptions
using Cartesian coordinates are largely equivalent to those relying on spherical or
polar coordinates, it is not so in the quantum description, where angular coordinates
in spherical or cylindrical systems do not easily submit to quantum treatment.
This comment obviously appears somewhat cryptic here, but its meaning will be
clarified in the subsequent chapters. Another peculiarity of the position observable
is the need to carefully distinguish between coordinates as characteristics of a
particle's position and coordinates as markers of various points in space, needed
to describe the position dependence of various mathematical and physical quantities.

Other observables, such as energy or momentum, might have either continuous or
discrete spectrum depending upon the environment, in which a particle finds itself,
or might have mixed spectrum, where an interval of discretely defined values crosses
over into an interval of continuously distributed values.

Two main peculiarities of states characterized by observables with continuous
spectrum are that (1) they cannot be normalized in the regular sense of the word and
(2) the concept of probability as defined by Eq. 2.25 loses its meaning because, in
the case of continuous random variables, the probability can only be defined for an
interval (which might be infinitesimally small) of values, not for any particular
value of the variable. These two features are not independent and are related to each
other, as will be seen from the subsequent analysis.

I will illustrate the properties of states corresponding to observables with
continuous spectrum using the position of a particle as an example. Assuming that
there are no other observables mutually consistent with the position, I will present
a state in which this observable has a definite value $\mathbf{r}$ as $|\mathbf{r}\rangle$. In order to construct
a superposition state using these vectors, I have to replace the sum in Eq. 2.26
with an integral over all possible values of the position vector $\mathbf{r}$, introducing instead
of coefficients $a_k$ with discrete indexes a function $\psi(\mathbf{r})$ of a continuous variable:

$$|\alpha\rangle = \int d^3r\, \psi(\mathbf{r})\, |\mathbf{r}\rangle. \qquad (2.27)$$

Now I want to compute the norm $\|\alpha\|$ of the superposition state given by Eq. 2.27.
The Hermitian conjugation of Eq. 2.27 produces the respective bra vector

$$\langle\alpha| = \int d^3r\, \psi^*(\mathbf{r})\, \langle\mathbf{r}| \qquad (2.28)$$



so that the norm becomes

$$\|\alpha\|^2 \equiv \langle\alpha|\alpha\rangle = \iint d^3r_1\, d^3r_2\, \psi^*(\mathbf{r}_1)\, \psi(\mathbf{r}_2)\, \langle\mathbf{r}_1|\mathbf{r}_2\rangle, \qquad (2.29)$$

where I had to rename the integration variables in order to be able to replace the
product of integrals with a double integral (note that $\mathbf{r}_1$ appears in those parts of
Eq. 2.29 which originate from the bra vector of Eq. 2.28, and $\mathbf{r}_2$ appears in the ket-
related parts of the integral). States $|\mathbf{r}_1\rangle$ and $|\mathbf{r}_2\rangle$ remain mutually exclusive even in
the case of continuous spectrum as long as $\mathbf{r}_1 \ne \mathbf{r}_2$. Thus I can write, based on the
discussion in the previous sections, that $\langle\mathbf{r}_1|\mathbf{r}_2\rangle = 0$ for $\mathbf{r}_1 \ne \mathbf{r}_2$. If I now require
that $\langle\mathbf{r}_1|\mathbf{r}_2\rangle = 1$ for $\mathbf{r}_1 = \mathbf{r}_2$, which would correspond to the "regular" normalization
condition, I will end up with an integral in which the integrand is zero everywhere
with the exception of one point, where it is finite. Clearly such an integral would be
zero, in contradiction with the properties of the norm (it can only be zero for the null
vector). To save the situation, something has to give, and I have to reject one of
the assumptions made when evaluating Eq. 2.29. The mutual exclusivity of $|\mathbf{r}_1\rangle$ and
$|\mathbf{r}_2\rangle$ and the related requirement that $\langle\mathbf{r}_1|\mathbf{r}_2\rangle = 0$ for $\mathbf{r}_1 \ne \mathbf{r}_2$ are connected with
the basic ideas discussed in Sect. 2.2.3 and, therefore, appear untouchable. So, the
only choice left to me is to reject the assumption that $\langle\mathbf{r}_1|\mathbf{r}_1\rangle$ is equal to unity or
takes any other finite value. As a result, we are left with the following requirements
on $\langle\mathbf{r}_1|\mathbf{r}_2\rangle$: this expression must be zero for unequal values of its arguments while
producing a non-zero result when integrated with any "normal" function.

These requirements are satisfied by an object called the Dirac delta-function. Dirac
introduced the notation $\delta(x)$ for its simplest single-variable version and presented most
of its properties in the form useful for physicists in his influential 1930 book
The Principles of Quantum Mechanics, which since then has been reissued many
times (it is the same Paul Dirac who introduced the bra-ket notation for quantum
states). It seems a bit unfair to name this object after him because it was already
known to such mathematicians as Poisson and Fourier in the nineteenth century, but
physicists learned about it from Dirac, so we stick to our guns and call it the Dirac
delta-function. The first thing one needs to understand about the delta-function is that
it is not a function in any reasonable sense of the word. Therefore, the meaning
of such operations as integration or differentiation involving this object cannot be
defined following the standard rules of regular calculus. Nevertheless, physicists
keep working with this object as though nothing is wrong (giving nightmares to
rigor-sensitive mathematicians), with the only requirement that the results of all
performed operations must make sense (that is, from a physicist's perspective).
Mathematicians call such objects "distributions" or treat them as examples of
"functionals." Below I supply you with all the properties of the delta-function you will
need to know.

The main defining property of the delta-function of a single variable is

$$\int_{x_1}^{x_2} f(x)\,\delta(x)\,dx = \begin{cases} f(0), & 0 \in [x_1, x_2] \\ 0, & 0 \notin [x_1, x_2] \end{cases} \qquad (2.30)$$


with its immediate generalization to

$$\int_{x_1}^{x_2} f(x)\,\delta(x - x_0)\,dx = \begin{cases} f(x_0), & x_0 \in [x_1, x_2] \\ 0, & x_0 \notin [x_1, x_2]. \end{cases} \qquad (2.31)$$

These equations express the main property of the delta-function: it acts as a selector
singling out the value of the function $f(x)$ at $x = x_0$, where the argument of the delta-
function vanishes. In the particular case of $f(x) = 1$, Eq. 2.31 yields another important
characteristic of the delta-function:

$$\int_{x_1}^{x_2} \delta(x - x_0)\,dx = \begin{cases} 1, & x_0 \in [x_1, x_2] \\ 0, & x_0 \notin [x_1, x_2], \end{cases}$$

which expresses the idea that while the “width” of the delta-function is zero and
its “height” is infinite, the area covered by it is equal to unity. An example of actual
limiting procedure producing a delta-function out of a regular function based on this
idea can be found in the exercises in this chapter.
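This limiting idea is also easy to check numerically. The following Python sketch (an illustration of mine, not part of the text; the Gaussian stand-in for the delta-function and the test function are arbitrary choices) approximates $\delta(x - x_0)$ by unit-area Gaussians of shrinking width $\varepsilon$ and shows that the integral of $f(x)\delta_\varepsilon(x - x_0)$ approaches $f(x_0)$:

```python
import numpy as np

def delta_approx(y, eps):
    """Unit-area Gaussian of width eps: height ~ 1/eps, width ~ eps."""
    return np.exp(-y**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

x = np.linspace(-10, 10, 200001)
dx = x[1] - x[0]
f = np.cos(x) + x**2          # an arbitrary smooth test function
x0 = 1.5

for eps in (0.5, 0.1, 0.02):
    val = np.sum(f * delta_approx(x - x0, eps)) * dx
    print(eps, val)           # approaches f(x0) as eps -> 0

print("f(x0) =", np.cos(x0) + x0**2)
```

As $\varepsilon$ shrinks, the "sifting" integral converges to the sampled value $f(x_0)$, which is exactly the content of Eq. 2.31.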

One can also define the delta-function of a more complex argument such as
$\delta[g(x)]$, where $g(x)$ is an arbitrary function. If $g(x)$ has only one zero at $x = x_0$,
I can define $\delta[g(x)]$ by replacing $g(x)$ with the first term of its Taylor expansion
around $x_0$: $g(x) \approx b(x - x_0)$, where $b \equiv (dg/dx)_{x = x_0}$, and making the substitution of
variable $\tilde{x} = b(x - x_0)$, which yields

$$\int_{x_1}^{x_2} f(x)\,\delta[g(x)]\,dx = \frac{1}{|b|}\int_{g(x_1)}^{g(x_2)} f\!\left(\frac{\tilde{x}}{b} + x_0\right)\delta(\tilde{x})\,d\tilde{x} = \frac{1}{|b|}\,f(x_0). \qquad (2.32)$$

The expansion of g.x/ in the Taylor series is justified here because the value of the
integral is determined by the behavior of this function in the immediate vicinity
of x0.

If the function g.x/ has multiple zeroes within the interval of integration, then we
must isolate each zero and perform the procedure described above for each of them.
The result will look something like this:

$$\int_{x_1}^{x_2} f(x)\,\delta[g(x)]\,dx = \sum_i \frac{1}{|b_i|}\,f\big(x_0^{(i)}\big), \qquad (2.33)$$

where $b_i$ is the value of the derivative of $g(x)$ at the respective $i$-th zero $x_0^{(i)}$. To

illustrate this procedure, consider an example.


Example 5 (Delta-Function with Two Zeros) Consider

$$g(x) = x^2 - x_0^2.$$

In this case the method outlined above yields

$$\int_{x_1}^{x_2} f(x)\,\delta\big(x^2 - x_0^2\big)\,dx \approx \frac{1}{2x_0}\left[\int_{x_1}^{x_2} f(x)\,\delta(x - x_0)\,dx + \int_{x_1}^{x_2} f(x)\,\delta(x + x_0)\,dx\right], \qquad (2.34)$$
where I assumed that both $x_0$ and $-x_0$ belong to the interval between $x_1$ and $x_2$. I can
also define a derivative of the delta-function using integration by parts and assuming
that the integral of $df/dx$ is still equal to $f(x)$ even if $f(x) \equiv \delta(x)$. This is how it goes:

$$\int_{x_1}^{x_2} f(x)\,\delta'(x - x_0)\,dx = f(x)\,\delta(x - x_0)\Big|_{x_1}^{x_2} - \int_{x_1}^{x_2} \delta(x - x_0)\,f'(x)\,dx$$
$$= \begin{cases} -\left.\dfrac{df}{dx}\right|_{x = x_0}, & x_0 \in [x_1, x_2] \\ 0, & x_0 \notin [x_1, x_2]. \end{cases} \qquad (2.35)$$

Similarly one can define higher derivatives of the delta-function.
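The composite-argument rule of Eqs. 2.33 and 2.34 can also be verified numerically. In the sketch below (my own illustration, with an arbitrary test function), the delta-function is again replaced by a narrow Gaussian, this time evaluated on the composite argument $g(x) = x^2 - x_0^2$, and the result is compared with $[f(x_0) + f(-x_0)]/(2x_0)$:

```python
import numpy as np

def delta_approx(y, eps):
    """Unit-area Gaussian stand-in for the delta-function."""
    return np.exp(-y**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

x = np.linspace(-10, 10, 400001)
dx = x[1] - x[0]
x0 = 2.0

def f(x):
    return np.exp(-0.1 * x**2) * (x + 3)   # arbitrary smooth test function

lhs = np.sum(f(x) * delta_approx(x**2 - x0**2, 1e-2)) * dx
rhs = (f(x0) + f(-x0)) / (2 * x0)          # Eq. 2.34 with both zeros inside
print(lhs, rhs)                            # the two should agree
```

Each zero of $g$ contributes $f$ there divided by $|g'| = 2x_0$, exactly as Eq. 2.33 prescribes.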
We will also need an important representation of the delta-function as a Fourier
transform:

$$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\,dk. \qquad (2.36)$$

To demonstrate that this representation of the delta-function actually makes sense,
consider direct and inverse Fourier transforms:

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} \tilde{f}(k)\,e^{ikx}\,dk \qquad (2.37)$$

$$\tilde{f}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(x)\,e^{-ikx}\,dx. \qquad (2.38)$$

Substituting Eq. 2.38 into Eq. 2.37, I get

$$f(x) = \frac{1}{2\pi}\iint_{-\infty}^{\infty} f(x_1)\,e^{-ikx_1}e^{ikx}\,dk\,dx_1 = \frac{1}{2\pi}\int_{-\infty}^{\infty} dx_1\, f(x_1)\int_{-\infty}^{\infty} e^{ik(x - x_1)}\,dk$$


and the only way to make this into an identity for any function is to accept Eq. 2.36
for the integral over k.
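One way to make Eq. 2.36 plausible numerically (a sketch of mine, not part of the text) is to cut the $k$-integral off at $\pm K$: the truncated integral equals $\sin(Kx)/(\pi x)$, which oscillates faster and grows taller as $K$ increases while keeping unit area. Integrated against a smooth function, it picks out the value at $x = 0$, just as a delta-function would; the Gaussian test function below is an arbitrary choice:

```python
import numpy as np

def truncated_delta(x, K):
    """(1/2pi) * integral of e^{ikx} over -K < k < K = sin(Kx)/(pi x)."""
    return (K / np.pi) * np.sinc(K * x / np.pi)   # np.sinc(t) = sin(pi t)/(pi t)

x = np.linspace(-30, 30, 600001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 8)        # smooth test function with f(0) = 1

for K in (5, 20, 80):
    val = np.sum(f * truncated_delta(x, K)) * dx
    print(K, val)            # approaches f(0) = 1 as K grows
```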

Finally, you will need a generalization of the delta-function to the case of
several variables. For instance, the delta-function involving position vectors in Cartesian
coordinates can be defined as

$$\delta(\mathbf{r}_1 - \mathbf{r}_2) \equiv \delta(x_1 - x_2)\,\delta(y_1 - y_2)\,\delta(z_1 - z_2), \qquad (2.39)$$

in which case its representation in the form of a Fourier transform becomes

$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \frac{1}{(2\pi)^3}\int_{-\infty}^{\infty} e^{ik_x(x_1 - x_2)}e^{ik_y(y_1 - y_2)}e^{ik_z(z_1 - z_2)}\,dk_x\,dk_y\,dk_z = \frac{1}{(2\pi)^3}\int e^{i\mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_2)}\,d^3k. \qquad (2.40)$$

Now back to the calculation of the norm, i.e., to Eq. 2.29. To complete this
calculation, I will introduce a generalized, so-called delta-function normalization
condition for the states $|\mathbf{r}\rangle$ by requiring that

$$\langle\mathbf{r}_1|\mathbf{r}_2\rangle = \delta(\mathbf{r}_1 - \mathbf{r}_2). \qquad (2.41)$$

Substituting Eq. 2.41 into Eq. 2.29 and using the properties of the delta-function, I
finally arrive at

$$\|\alpha\|^2 = \int d^3r\, \psi^*(\mathbf{r})\,\psi(\mathbf{r}). \qquad (2.42)$$

Now, in order to ensure correct normalization of the state $|\alpha\rangle$, you only need to
require that the function $\psi(\mathbf{r})$ is chosen to be normalized such that

$$\int d^3r\, |\psi(\mathbf{r})|^2 = 1. \qquad (2.43)$$

Example 6 (Normalization of a Wave Function) Normalize the state $|\alpha\rangle$ presented
by the following function:

$$\psi(x) = e^{ikx}e^{-ax^2/2}.$$


Solution

Using the definition of the norm, Eq. 2.42, I have

$$\|\alpha\|^2 = \int_{-\infty}^{\infty} \psi^*(x)\,\psi(x)\,dx = \int_{-\infty}^{\infty} e^{-ax^2}\,dx = \sqrt{\frac{\pi}{a}},$$

where I used the substitution of variables $y = \sqrt{a}\,x$ and the well-known integral

$$\int_{-\infty}^{\infty} \exp(-y^2)\,dy = \sqrt{\pi}.$$

Thus, the normalized form of the state can be written as

$$|\alpha\rangle = \left(\frac{a}{\pi}\right)^{1/4}\int_{-\infty}^{\infty} e^{ikx}e^{-ax^2/2}\,|x\rangle\,dx.$$
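The result of Example 6 is easy to confirm numerically; in the short sketch below (an illustration of mine) the values $a = 0.7$ and $k = 1.3$ are arbitrary choices:

```python
import numpy as np

a, k = 0.7, 1.3               # arbitrary parameters of the wave function
x = np.linspace(-30, 30, 600001)
dx = x[1] - x[0]
psi = np.exp(1j * k * x) * np.exp(-a * x**2 / 2)

norm_sq = np.sum(np.abs(psi)**2) * dx   # the plane-wave factor drops out of |psi|^2
print(norm_sq, np.sqrt(np.pi / a))      # the two should agree

psi_n = (a / np.pi) ** 0.25 * psi       # normalized wave function
print(np.sum(np.abs(psi_n)**2) * dx)    # -> 1.0
```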

The function $\psi(\mathbf{r})$ in these expressions is called the wave function and is often cited
in quantum mechanics textbooks as the descriptor of a quantum state. You can see
now that this is not quite the case: the definition of the wave function involves
two different states, the actual state of the system $|\alpha\rangle$ and a state, $|\mathbf{r}\rangle$, in which a
particle would have a definite position. You can think of the wave function as a
projection of $|\alpha\rangle$ on $|\mathbf{r}\rangle$. If the position $\mathbf{r}$ could only take discrete values, we would
have interpreted $|\psi(\mathbf{r})|^2$ as a probability. In the continuous case, however, we can
only ask about the probability that the measurement of position would produce a result
within a certain (possibly infinitesimally small) volume around some central point $\mathbf{r}$.
The answer to this question is well expressed in terms of the differential probability

$$dP(\mathbf{r}) \equiv d^3r\, |\psi(\mathbf{r})|^2,$$

where $|\psi(\mathbf{r})|^2$ can be interpreted as the position probability density. The probability
that the measured position vector belongs to some finite volume $V$ is given by the
expression

$$P(V) = \iiint_V d^3r\, |\psi(\mathbf{r})|^2, \qquad (2.44)$$

while the normalization condition 2.43 simply states the fact that the measurement
of the particle’s position will produce some value within the entire volume available
to the particle with probability equal to one.


2.4 Problems

Problem 1 Consider two states:

$$|\psi_1\rangle = |\varphi_1\rangle + i|\varphi_2\rangle - 2|\varphi_3\rangle$$
$$|\psi_2\rangle = -|\varphi_1\rangle + 2|\varphi_2\rangle - i|\varphi_3\rangle,$$

where $|\varphi_{1,2,3}\rangle$ are all normalized and orthogonal to each other.

1. Normalize states $|\psi_1\rangle$ and $|\psi_2\rangle$.
2. Find the adjoint counterparts of these states, $\langle\psi_1|$ and $\langle\psi_2|$.
3. Compute the inner products $\langle\psi_1|\psi_2\rangle$ and $\langle\psi_2|\psi_1\rangle$ and verify that $\langle\psi_1|\psi_2\rangle = \langle\psi_2|\psi_1\rangle^*$.
4. Find a linear combination of states $|\psi_1\rangle$ and $|\psi_2\rangle$ that would be orthogonal to $|\psi_1\rangle$.
5. Compute $(\langle\psi_1| + \langle\psi_2|)(|\psi_1\rangle + |\psi_2\rangle)$. Do it in two ways: (a) by computing the
sums first and then taking the inner product and (b) using the distributive property
of the inner product, remove the parentheses and compute the inner products of
the resulting individual terms.
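If you want to check your answers to parts 1 and 3, the computation can be mirrored with component vectors (a verification aid of mine, not part of the problem; the components are read off the superpositions above):

```python
import numpy as np

# components of |psi_1> and |psi_2> in the orthonormal basis
psi1 = np.array([1, 1j, -2], dtype=complex)
psi2 = np.array([-1, 2, -1j], dtype=complex)

psi1_n = psi1 / np.linalg.norm(psi1)    # normalization (part 1)
psi2_n = psi2 / np.linalg.norm(psi2)

# np.vdot conjugates its first argument, matching the bra in <psi|phi> (part 3)
ip12 = np.vdot(psi1_n, psi2_n)
ip21 = np.vdot(psi2_n, psi1_n)
print(ip12, ip21)
print(np.isclose(ip12, ip21.conjugate()))   # -> True
```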

Problem 2 Determine if the following sets of vectors, defined by their components
in some basis, are linearly dependent or independent:

1. $(2, 2, 0)$, $(1, 0, 1)$, $(0, i, -1)$
2. $(0, 0, 1)$, $(i, 0, 0)$, $(0, 0, -1)$
3. $(1, i, 2)$, $(1, i, -1)$, $(i, -i, 2i)$
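A quick way to check answers to problems like this one (a sketch of mine, not part of the problem) is to stack the components into a matrix and compute its rank; a set is linearly independent exactly when the rank equals the number of vectors:

```python
import numpy as np

sets = [
    [(2, 2, 0), (1, 0, 1), (0, 1j, -1)],
    [(0, 0, 1), (1j, 0, 0), (0, 0, -1)],
    [(1, 1j, 2), (1, 1j, -1), (1j, -1j, 2j)],
]

for i, vectors in enumerate(sets, start=1):
    m = np.array(vectors, dtype=complex)
    rank = np.linalg.matrix_rank(m)
    verdict = "independent" if rank == len(vectors) else "dependent"
    print(f"set {i}: rank {rank}, {verdict}")
```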
Problem 3 Consider the set of functions

$$f_n(x) = A\sin\frac{\pi n x}{L},$$

where $L$ is a positive quantity.

1. Prove that these functions form an orthogonal system with the inner product
defined as

$$\langle f_n|f_m\rangle \equiv \int_0^L f_n(x)\,f_m(x)\,dx.$$

2. Normalize these functions.
3. Find an expression for the coefficients $c_n$ in the expansion

$$\psi(x) = \sum_{n=1}^{\infty} c_n f_n(x),$$


where $f_n(x)$ is given by the expression from the first part of the problem with the
amplitude $A$ replaced by the normalization coefficient found in Part 2.

Problem 4 Repeat Problem 3 with the set of functions

$$\varphi_n(x) = \exp\left(i\,\frac{2\pi n x}{L}\right),$$

and the inner product defined as

$$\langle \varphi_n|\varphi_m\rangle \equiv \int_0^L \varphi_n^*(x)\,\varphi_m(x)\,dx.$$
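The orthogonality claims in Problems 3 and 4 can be spot-checked by direct numerical integration. A sketch of mine for the exponential set ($L = 2$ is an arbitrary choice):

```python
import numpy as np

L = 2.0
x = np.linspace(0, L, 200001)
dx = x[1] - x[0]

def phi(n):
    return np.exp(1j * 2 * np.pi * n * x / L)

for n in range(3):
    for m in range(3):
        # inner product <phi_n|phi_m> = int_0^L phi_n^*(x) phi_m(x) dx
        ip = np.sum(np.conj(phi(n)) * phi(m)) * dx
        print(n, m, abs(ip))     # ~L for n == m, ~0 otherwise
```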

Problem 5 Consider functions $g_1(x) = x$, $g_2(x) = x^2$, and $g_3(x) = x^3$ defined on
the interval $x \in [-1, 1]$ with the inner product defined as

$$\langle g_n|g_m\rangle = \int_{-1}^{1} dx\, g_n(x)\,g_m(x).$$

1. Which of these three functions are mutually orthogonal, and which are not?
2. Consider a linear combination of the functions $g_1(x)$ and $g_3(x)$: $a g_1(x) + b g_3(x)$, and
find coefficients $a$ and $b$ which would make this function orthogonal to $g_1(x)$.
3. Find a different linear combination of the same functions, which would be
orthogonal to $g_3(x)$.
4. Are these two new functions orthogonal to each other?

Problem 6 Consider a wave function of the form

$$\psi(x) = Ae^{ikx}e^{-x^2/2}.$$

1. Normalize this function using the standard definition of the inner product for
square-integrable functions.
2. Find the probability that a measurement of the $x$-coordinate of the particle will
produce a value between $0 \le x \le \sqrt{2}$.
3. Find the probability that a measurement of the $x$-coordinate of the particle will
produce a value such that $x > 2$. Use mathematical tables or any available
computational tools to obtain the numerical values.
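As a computational aid for parts 2 and 3 (a sketch of mine, not part of the problem): since $|\psi(x)|^2 = A^2 e^{-x^2}$, normalization gives $A = \pi^{-1/4}$, and the requested probabilities reduce to the error function, $P(0 \le x \le \sqrt{2}) = \mathrm{erf}(\sqrt{2})/2$ and $P(x > 2) = (1 - \mathrm{erf}(2))/2$:

```python
import math

A = math.pi ** (-0.25)            # normalization: |psi|^2 = A^2 * exp(-x^2)

p1 = math.erf(math.sqrt(2)) / 2   # P(0 <= x <= sqrt(2))
p2 = (1 - math.erf(2)) / 2        # P(x > 2)
print(p1)   # ~0.477
print(p2)   # ~0.0023
```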

Problem 7 Consider a function of the form

$$f(x) = \begin{cases} \dfrac{1}{\Delta}, & |x| < \Delta/2 \\ 0, & |x| > \Delta/2. \end{cases}$$

Show that this function turns into a Dirac $\delta$-function in the limit $\Delta \to 0$.


Problem 8 Compute the following integrals:

1. $\displaystyle\int_0^{\infty} \big(x^3 + 5x\big)\,\delta(x - 2)\,dx$

2. $\displaystyle\int_0^{3} (\sin 2x + 2\tan 3x)\,\delta\big(x^2 - 5x + 4\big)\,dx$

3. $\displaystyle\int_{-\infty}^{\infty} x e^{-x^2}\,\delta'(x + 5)\,dx$

Problem 9 Evaluate the following expression:

$$\int_{-\infty}^{\infty} dx\, f(x)\int_{-\infty}^{\infty} dk\, k\, e^{ik(x - x_0)}.$$

Hint: Use the representation of the delta-function as a Fourier integral to figure out
the integral with respect to k.

Chapter 3
Observables and Operators

3.1 Hamiltonian Formulation of Classical Mechanics

The version of classical mechanics based on forces and Newton's laws resists any
meaningful reformation into a quantum theory because it depends critically on
concepts (trajectory, acceleration, etc.) that do not correspond to any observable
reality in the quantum world. More productive for finding links between classical
and quantum realms is an alternative formulation, where energy rather than force
takes the central role. There are two essential elements in this formulation of
classical mechanics. One is the idea of canonical coordinates in the so-called phase
space (as opposed to regular three-dimensional configuration space), and the other
is the concept of Hamiltonian.

Points in the phase space represent classical states of the system, characterized,
for instance, by its coordinates $x_i$ and components of the momentum vector $p_i$. For
a single particle moving along a straight line (one-dimensional motion), the phase
space is two-dimensional; for fully three-dimensional motion, the phase space is
six-dimensional; and for three-dimensional motion of $N$ particles, the dimension
of the phase space is $6N$. Each point in the phase space represents the most complete
information about a classical system: its coordinates and momenta. When particles
move, their coordinates and momenta change, drawing a phase trajectory of the
system in the phase space. For a single particle allowed to move only along a
straight line, this trajectory is a curve in two-dimensional space. If the motion of the
particle is conservative, i.e., its energy is a conserved quantity, each phase trajectory
is an equienergetic line: each point on the trajectory corresponds to a state of
the system with exactly the same energy (the energy does not change while particles
change their position and momentum). Using the phase space, we effectively put
the space coordinates and momenta of the particles on equal footing without imposing
any a priori relationships between them (as opposed to elementary mechanics, where
the momentum is defined via the time derivative of coordinates). You shall see that

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,
https://doi.org/10.1007/978-3-319-71550-6_3



the relationship between coordinate and momentum, called canonically conjugate
variables, arising within this framework is much closer to its quantum version than
it would have been in the Newtonian approach.

The Hamiltonian is essentially the energy of a conservative system expressed in
terms of coordinates and momenta, $H(\mathbf{p}, \mathbf{r})$, which in the case of a single particle
takes the form

$$H(\mathbf{p}, \mathbf{r}) = \frac{p^2}{2m} + V(\mathbf{r}), \qquad (3.1)$$

where $\mathbf{p}$ is the momentum vector$^1$ and $V(\mathbf{r})$ is the potential energy of the particle in
the external field. The Hamiltonian occupies a special place in classical mechanics
(as compared, for instance, to angular momentum, which can also be a conserved
quantity under certain circumstances) because it determines the system's dynamics via
the Hamiltonian equations, which can be formulated as

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial r_i} \qquad (3.2)$$

$$\frac{dr_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad (3.3)$$

where $r_i$ and $p_i$ ($i = 1, 2, 3$) are the Cartesian components of the position and
momentum vectors, $x, y, z$ and $p_x, p_y, p_z$, respectively. The Hamiltonian equations can be
rewritten in another interesting form using so-called Poisson brackets $\{f, g\}$, defined
for two arbitrary functions of the canonical variables:

$$\{f, g\} = \sum_{i=1}^{N}\left(\frac{\partial f}{\partial r_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial r_i}\right). \qquad (3.4)$$

Summation in Eq. 3.4 is over all relevant canonically conjugate pairs of coordinates.
It is easy to see that the Poisson brackets for momenta and the corresponding
coordinates are

$$\{r_i, p_j\} = \delta_{i,j}. \qquad (3.5)$$

This form of Poisson brackets is called canonical: any pair of variables possessing
Poisson brackets of this form forms a canonically conjugate pair and satisfies the
Hamiltonian equations 3.2 and 3.3.

Applying the definition of the Poisson brackets, Eq. 3.4, to the pairs of functions
$p_i$, $H$ and $r_i$, $H$, you can find (check it out!)

$^1$ $p^2$ is defined as usual as the square of the magnitude of the vector in Cartesian coordinates: $p_x^2 + p_y^2 + p_z^2$.


$$\{p_i, H\} = -\frac{\partial H}{\partial r_i},$$
$$\{r_i, H\} = \frac{\partial H}{\partial p_i},$$

so that the Hamiltonian equations 3.2 and 3.3 can be rewritten in an even more symmetric
form:

$$\frac{dp_i}{dt} = \{p_i, H\}, \qquad (3.6)$$
$$\frac{dr_i}{dt} = \{r_i, H\}. \qquad (3.7)$$

Finally, the time derivative of an arbitrary function of the canonical coordinates can
be expressed in terms of Poisson brackets involving the Hamiltonian. I illustrate this
statement for a function of only one pair of coordinates, $f(x, p, t)$:

$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \frac{\partial f}{\partial p}\frac{dp}{dt} + \frac{\partial f}{\partial x}\frac{dx}{dt} = \frac{\partial f}{\partial t} - \frac{\partial f}{\partial p}\frac{\partial H}{\partial x} + \frac{\partial f}{\partial x}\frac{\partial H}{\partial p} = \frac{\partial f}{\partial t} + \{f, H\}. \qquad (3.8)$$
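These relations are easy to verify symbolically. The sketch below (an illustration of mine, using the harmonic oscillator $H = p^2/2m + m\omega^2 x^2/2$ as an example) implements the Poisson bracket of Eq. 3.4 for a single canonical pair and reproduces Eqs. 3.5, 3.6, and 3.7:

```python
import sympy as sp

x, p, m, w = sp.symbols('x p m omega', positive=True)

def poisson(f, g):
    """Poisson bracket {f, g} for a single canonical pair (x, p), Eq. 3.4."""
    return sp.diff(f, x) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, x)

H = p**2 / (2 * m) + m * w**2 * x**2 / 2   # harmonic oscillator Hamiltonian

print(poisson(x, p))   # -> 1, the canonical bracket of Eq. 3.5
print(poisson(x, H))   # -> p/m, i.e., dx/dt of Eq. 3.7
print(poisson(p, H))   # -> -m*omega**2*x, i.e., dp/dt of Eq. 3.6
```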

3.2 Operators in Quantum Mechanics

3.2.1 General Definitions

The main task of quantum theory is to be able to predict (or explain) the results of
experiments conducted with quantum systems. All such experiments involve taking
a system in some initial state, subjecting it to external influences, which change
its environment, and observing the reaction of the system to these changes. The theo-
retician in me would say that by doing all these manipulations and measurements,
experimentalists change the quantum state of the system, but so far the formalism I
have at my disposal does not have any theoretical representation of all these turning
knobs and dials, lasers going on and off, magnets, thermostats, and all the other real
material objects in the arsenal of an experimentalist. I need additional mathematical
tools which would allow me to describe theoretically all these changes inflicted
upon an unsuspecting system by the men in lab coats. Since quantum states are
represented in the theory by vectors of a linear vector space, what I need are objects
that can change these vectors. Such objects are known to mathematicians: they
call them operators. The role of operators in quantum theory is twofold. On one
hand, they are used to describe transformations of state vectors, and on the other
hand, they provide the theoretical means to predict the outcomes of measurements
of observables.


From the mathematical standpoint, an operator is a rule prescribing how to
change one abstract vector of a linear vector space, say, $|\alpha\rangle$, into another abstract
vector, say, $|\beta\rangle$, of the same or a different vector space. Symbolically this can be
represented as

$$|\beta\rangle = \hat{T}|\alpha\rangle, \qquad (3.9)$$

where the "hat" above a capital letter (in this case $T$) signifies that $\hat{T}$ represents
such a rule, or an operator, "acting" on $|\alpha\rangle$ and converting it into $|\beta\rangle$. Note that the
symbol of the operator appears in Eq. 3.9 next to the vertical line marking the "tail"
of the ket $|\alpha\rangle$.

A special role in quantum mechanics and other applications is played by linear
operators, the class of rules satisfying the following condition:

$$\hat{T}(a_1|\alpha_1\rangle + a_2|\alpha_2\rangle) = a_1\hat{T}|\alpha_1\rangle + a_2\hat{T}|\alpha_2\rangle. \qquad (3.10)$$

Here are a few examples of linear operators:

1. The differentiation operator $d/dx$, converting a function $f(x)$ into its derivative $g(x) = (d/dx)f \equiv df/dx$ (note how the operator symbol appears on the left of the function)

2. The gradient operator $\nabla = \mathbf{e}_x\,\partial/\partial x + \mathbf{e}_y\,\partial/\partial y + \mathbf{e}_z\,\partial/\partial z$, where $\mathbf{e}_{x,y,z}$ are unit vectors
in the directions of the respective coordinate axes, converting a scalar function of
three spatial variables into a vector:

$$\nabla f(x, y, z) = \mathbf{e}_x\,\frac{\partial f}{\partial x} + \mathbf{e}_y\,\frac{\partial f}{\partial y} + \mathbf{e}_z\,\frac{\partial f}{\partial z}$$

3. The integration operator $\hat{K}$, which is defined by its kernel $K(x_1, x_2)$ and converts one
function to another as

$$|g\rangle = \hat{K}|f\rangle \iff g(x_1) = \int_{-\infty}^{\infty} K(x_1, x_2)\,f(x_2)\,dx_2$$

4. The rotation operator $\hat{R}$, which changes the orientation of a vector without changing
its length

Linearity of the first three operators is evident from the linearity of differentiation and
integration, and the proof of the linearity of rotations is a simple exercise in geometry
and is left to the readers to perform.
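In a discretized setting, the first example becomes a matrix, which makes the linearity condition of Eq. 3.10 directly checkable. The sketch below (mine; the grid size and test functions are arbitrary choices) builds a central-difference matrix for $d/dx$ and verifies both its linearity and its action:

```python
import numpy as np

n, dx = 200, 0.05
# central-difference matrix: (D f)_i ~ (f_{i+1} - f_{i-1}) / (2 dx)
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * dx)

x = np.arange(n) * dx
f, g = np.sin(x), np.cos(x)

# linearity, Eq. 3.10: D(a f + b g) = a (D f) + b (D g)
a, b = 2.0, -1.5
print(np.allclose(D @ (a * f + b * g), a * (D @ f) + b * (D @ g)))  # -> True

# away from the grid edges, D f approximates df/dx = cos(x)
err = np.max(np.abs((D @ f)[1:-1] - np.cos(x)[1:-1]))
print(err)    # small, of order dx^2
```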

Equation 3.9 defines an operator by its action on a ket vector. It is also possible
to define an operator acting on bra vectors. One can, for instance, perform formal
Hermitian conjugation of Eq. 3.9 and introduce the Hermitian conjugate operator $\hat{T}^{\dagger}$:

$$\langle\beta| = \langle\alpha|\hat{T}^{\dagger}. \qquad (3.11)$$


Notice that now the operator $\hat{T}^{\dagger}$ stands to the right of the respective bra vector but still
next to its tail, "acting" to the left. Thus, Hermitian conjugation in this case also involves
a change in the order in which the participating objects are written, as well
as in the "direction" of "action" of the operators, from right to left.

In order to help you develop intuition regarding the transition between Eqs. 3.9
and 3.11, consider a linear space of column vectors ($N \times 1$ matrices). For operators
you can take $N \times N$ matrices and define their action on a vector as regular matrix
multiplication. For this definition to make sense from the point of view of matrix
multiplication rules, the matrix must be placed to the left of the column vector. The
result of this operation is another column vector:

$$\begin{bmatrix} t_{11} & t_{12} & \cdots & t_{1N} \\ t_{21} & t_{22} & \cdots & t_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ t_{N1} & t_{N2} & \cdots & t_{NN} \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix}. \qquad (3.12)$$

The Hermitian conjugate of a column vector is a row vector ($1 \times N$ matrix) with
complex-conjugated elements. Equation 3.12 contains two column vectors, and
its Hermitian conjugated version must describe the relation between the two rows
$\begin{bmatrix} a_1^* & a_2^* & \cdots & a_N^* \end{bmatrix}$ and $\begin{bmatrix} b_1^* & b_2^* & \cdots & b_N^* \end{bmatrix}$. However, in order to be able to multiply a row
vector and a square matrix, I must place the former to the left of the latter:

$$\begin{bmatrix} a_1^* & a_2^* & \cdots & a_N^* \end{bmatrix} \begin{bmatrix} t_{11}^{\dagger} & t_{12}^{\dagger} & \cdots & t_{1N}^{\dagger} \\ t_{21}^{\dagger} & t_{22}^{\dagger} & \cdots & t_{2N}^{\dagger} \\ \vdots & \vdots & \ddots & \vdots \\ t_{N1}^{\dagger} & t_{N2}^{\dagger} & \cdots & t_{NN}^{\dagger} \end{bmatrix} = \begin{bmatrix} b_1^* & b_2^* & \cdots & b_N^* \end{bmatrix}, \qquad (3.13)$$

where $t_{ij}^\dagger$ represents the elements of the Hermitian-conjugate operator matrix $\hat T^\dagger$. If I want (and I certainly do) the relation between the elements $a_i^*$ and $b_i^*$ expressed by Eq. 3.13 to reproduce the complex-conjugated relations given by Eq. 3.12, I must require that the rows of the matrix in Eq. 3.12 coincide with the complex-conjugated columns of the matrix in Eq. 3.13: $t_{ij}^\dagger = t_{ji}^*$. This gives me an operational (not just formal) rule for performing the Hermitian conjugation of a matrix operator: it consists of regular matrix transposition combined with complex conjugation of all matrix elements. This example serves two important purposes: first, it demonstrates why the reversal of the order in which vectors and operators appear after Hermitian conjugation makes sense, and, second, it yields a practical rule for Hermitian conjugation of a matrix.
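The transposition-plus-conjugation rule is easy to check numerically. The following sketch (in Python with NumPy, added here for illustration and not part of the original text) verifies for a random complex matrix that the Hermitian conjugate of the column $T\mathbf a$ is the row $\mathbf a^\dagger T^\dagger$:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random complex 4x4 "operator" matrix and a column vector (ket)
T = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
a = rng.normal(size=(4, 1)) + 1j * rng.normal(size=(4, 1))

# Hermitian conjugation = transposition + complex conjugation
T_dag = T.conj().T

b = T @ a            # Eq. 3.12: the matrix acts on the ket from the left
b_dag = b.conj().T   # the bra (row vector) corresponding to b

# Eq. 3.13: the bra of a, with T† acting from the right
assert np.allclose(b_dag, a.conj().T @ T_dag)
```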

In the general case, Eq. 3.11 does not give us any clue about how to actually generate Hermitian-conjugate operators. In order to derive such a rule, I need to relate both operators (the initial one and its Hermitian conjugate) to a quantity which I know how to transform and which does not depend on any concrete realization of the vector space

46 3 Observables and Operators

or an operator. The only quantity of this kind that I know of is the inner product, and, in order to get to it, I will multiply Eq. 3.9 by a bra vector $\langle\beta|$ from the left. This leaves me with the expression $\langle\beta|\hat T|\alpha\rangle$, which can be understood as a product of the bra vector $\langle\beta|$ and the ket vector $\hat T|\alpha\rangle$. Complex conjugating this expression and applying Eq. 2.19, I get

$$\langle\beta|\hat T|\alpha\rangle^* = \langle\alpha|\hat T^\dagger|\beta\rangle, \tag{3.14}$$

where I also used Eq. 3.11 to convert the ket $\hat T|\alpha\rangle$ into the corresponding bra $\langle\alpha|\hat T^\dagger$. Equation 3.14 can be used to find the Hermitian conjugate of any particular operator, as illustrated by the following examples.

Example 7 (Hermitian Conjugation) Consider the differentiation operator $\hat D$ acting on differentiable square-integrable functions as

$$\hat D\,|f\rangle \equiv \frac{df}{dx}.$$

Using the inner product defined by Eq. 2.21, you can present the expression in Eq. 3.14 as

$$\langle g|\hat D|f\rangle \equiv \int_{-\infty}^{\infty} dx\, g^*(x)\frac{df}{dx}.$$

Integration by parts converts this expression into the following form:

$$\left(\langle g|\hat D|f\rangle\right)^* = \left(\int_{-\infty}^{\infty} dx\, g^*(x)\frac{df}{dx}\right)^* = g(x)f^*(x)\Big|_{-\infty}^{\infty} - \int_{-\infty}^{\infty} dx\, f^*(x)\frac{dg}{dx} = -\int_{-\infty}^{\infty} dx\, f^*(x)\frac{dg}{dx},$$

where I took into account that any square-integrable function must vanish at both positive and negative infinities. Presenting this result in the form of the right-hand side of Eq. 3.14, $\langle f|\hat D^\dagger|g\rangle$, you can identify $\hat D^\dagger$ as $\hat D^\dagger = -d/dx$.
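The relation $\hat D^\dagger = -d/dx$ can also be checked numerically. Here is a minimal sketch (Python with NumPy, my addition, with hypothetical test functions) that evaluates both sides of Eq. 3.14 on a grid for two particular square-integrable functions:

```python
import numpy as np

# Square-integrable test functions with hand-computed derivatives
x = np.linspace(-8.0, 8.0, 40001)
dx = x[1] - x[0]
f  = (1 + 2j * x) * np.exp(-x**2)
fp = (2j - 2 * x * (1 + 2j * x)) * np.exp(-x**2)   # df/dx
g  = (x + 1j) * np.exp(-x**2)
gp = (1 - 2 * x * (x + 1j)) * np.exp(-x**2)        # dg/dx

# Simple quadrature; the functions vanish at the interval ends
integrate = lambda y: np.sum(y) * dx

lhs = np.conj(integrate(np.conj(g) * fp))          # (<g| D |f>)*
rhs = integrate(np.conj(f) * (-gp))                # <f| (-d/dx) |g>
assert abs(lhs - rhs) < 1e-6
```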

If an operator and its Hermitian conjugate coincide,

$$\langle\beta|\hat T|\alpha\rangle^* = \langle\alpha|\hat T|\beta\rangle \tag{3.15}$$

or $\hat T = \hat T^\dagger$, the operator is called Hermitian or self-adjoint. Hermitian operators have a number of important properties, which will be discussed in more detail in Sect. 3.3. Here I shall note just one important property of Hermitian operators, which follows trivially from Eq. 3.15: the quantity defined as $\langle\alpha|\hat T|\alpha\rangle$ is a

3.2 Operators in Quantum Mechanics 47

real-valued number for any choice of state $|\alpha\rangle$. Expressions of this type are called expectation values of the operator in a given state. The origin of this name will become clear in Sect. 3.3. A few examples of Hermitian operators follow below.

Example 8 (Hermitian Operators) Let me prove that the operator $i\hat D$, where $\hat D$ is the differentiation operator introduced in the previous example, is Hermitian. To this end I just need to repeat the computations from Example 7:

$$\left(\langle g|\,i\hat D\,|f\rangle\right)^* = \left(i\int_{-\infty}^{\infty} dx\, g^*(x)\frac{df}{dx}\right)^* = -i\,g(x)f^*(x)\Big|_{-\infty}^{\infty} + i\int_{-\infty}^{\infty} dx\, f^*(x)\frac{dg}{dx} = \langle f|\,i\hat D\,|g\rangle.$$

Example 9 (Hermitian Operators) As a second example of a Hermitian operator, I consider a $3\times 3$ matrix $M$ acting on vectors in a three-dimensional vector space:

$$M = \begin{bmatrix} 1 & i & 2\\ -i & 1 & 4i\\ 2 & -4i & 0 \end{bmatrix}.$$

I will demonstrate that this matrix is Hermitian by recalling that Hermitian conjugation of matrices consists of transposition and complex conjugation. Carrying out these operations consecutively, you can convince yourselves that they yield the same matrix $M$:

$$\begin{bmatrix} 1 & i & 2\\ -i & 1 & 4i\\ 2 & -4i & 0 \end{bmatrix} \to \begin{bmatrix} 1 & -i & 2\\ i & 1 & -4i\\ 2 & 4i & 0 \end{bmatrix} \to \begin{bmatrix} 1 & i & 2\\ -i & 1 & 4i\\ 2 & -4i & 0 \end{bmatrix}.$$

You can also compute the expression $\mathbf a^\dagger M\mathbf a$, where $\mathbf a$ is an arbitrary column vector and $\mathbf a^\dagger$ its Hermitian conjugate:

$$\begin{bmatrix} a_1^* & a_2^* & a_3^* \end{bmatrix} \begin{bmatrix} 1 & i & 2\\ -i & 1 & 4i\\ 2 & -4i & 0 \end{bmatrix} \begin{bmatrix} a_1\\ a_2\\ a_3 \end{bmatrix} = \begin{bmatrix} a_1^* & a_2^* & a_3^* \end{bmatrix} \begin{bmatrix} a_1 + i a_2 + 2a_3\\ -i a_1 + a_2 + 4i a_3\\ 2a_1 - 4i a_2 \end{bmatrix}$$
$$= a_1^* a_1 + i a_1^* a_2 + 2a_1^* a_3 - i a_2^* a_1 + a_2^* a_2 + 4i a_2^* a_3 + 2a_1 a_3^* - 4i a_2 a_3^*$$
$$= |a_1|^2 + |a_2|^2 + 2\left(a_1^* a_3 + a_1 a_3^*\right) + i\left(a_1^* a_2 - a_2^* a_1 + 4a_2^* a_3 - 4a_2 a_3^*\right).$$

Each expression in parentheses combines a complex number with its own conjugate, so the first parenthesis is real and the second is purely imaginary; the final expression is therefore real-valued, as promised.
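If you prefer to see this checked by a machine, here is a short numerical sketch (Python with NumPy, added for illustration) that verifies both Hermiticity of $M$ and the reality of its expectation values for several random vectors:

```python
import numpy as np

M = np.array([[1,   1j, 2],
              [-1j, 1,  4j],
              [2,  -4j, 0]])

# Hermitian conjugation = transposition + complex conjugation
assert np.allclose(M, M.conj().T)

# Expectation values a† M a are real for any vector a
rng = np.random.default_rng(1)
for _ in range(5):
    a = rng.normal(size=3) + 1j * rng.normal(size=3)
    expectation = a.conj() @ M @ a
    assert abs(expectation.imag) < 1e-12
```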


3.2.2 Commutators, Functions of Operators, and Operator
Identities

In addition to Hermitian conjugation, you will need to perform other, less exotic, operations on operators, such as multiplication. The product of two operators $\hat T_1$ and $\hat T_2$ is defined as the consecutive action of the operators. If you consider the action on a ket vector, the first operator to do the work is the one on the right:

$$\left(\hat T_2\hat T_1\right)|\alpha\rangle \equiv \hat T_2\left(\hat T_1|\alpha\rangle\right).$$

In the case of operators acting on a bra vector, the order is the opposite: the first to act is the leftmost operator:

$$\langle\alpha|\left(\hat T_2\hat T_1\right) \equiv \left(\langle\alpha|\hat T_2\right)\hat T_1.$$

The most important property of operator multiplication is actually the absence of a property: multiplication of operators is not, in general, commutative²:

$$\hat T_2\hat T_1 \ne \hat T_1\hat T_2.$$

The non-commutative nature of operator multiplication is of extreme importance
in quantum mechanics, and as you will see, it is the main mathematical feature
responsible, for instance, for the uncertainty relation. For the same reason, sets
of operators that do commute with each other also play an important role in the
quantum formalism.

The non-commutativity of operator multiplication is expressed quantitatively via the notion of a commutator. The commutator of two operators, $[\hat T_1,\hat T_2]$, is defined as

$$[\hat T_1,\hat T_2] = \hat T_1\hat T_2 - \hat T_2\hat T_1. \tag{3.16}$$

The knowledge of the commutator, or, as it is sometimes called, the commutation relation between two operators, is essential and often the most important information about the operators that you can have. You will see throughout the course how the commutation relations of different operators are used in a variety of applications and calculations.

Commutators have a few important properties, the most frequently used of which
are the following:

²We are all used to dealing with commutative multiplication of numbers: the result does not depend on the order in which the multiplication is performed. The lack of commutativity of multiplication was one of the features of the Heisenberg theory that especially freaked out Schrödinger.


$$[\hat T_1,\hat T_2] = -[\hat T_2,\hat T_1] \tag{3.17}$$

$$[\hat T_1 + \hat T_2,\hat T_3] = [\hat T_1,\hat T_3] + [\hat T_2,\hat T_3] \tag{3.18}$$

$$[c_1\hat T_1, c_2\hat T_2] = c_1 c_2\,[\hat T_1,\hat T_2]. \tag{3.19}$$

The proof of all these identities is quite obvious, and I shall leave it for you as an
exercise.
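While you work out the formal proofs, the identities are easy to confirm numerically. The sketch below (Python with NumPy, added for illustration) checks Eqs. 3.17, 3.18, and 3.19 for random complex matrices, which also fail to commute, as expected:

```python
import numpy as np

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return A @ B - B @ A

rng = np.random.default_rng(2)
def rand_c(n):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

T1, T2, T3 = rand_c(4), rand_c(4), rand_c(4)
c1, c2 = 2.0 - 1j, 0.5 + 3j

# Generic matrices do not commute:
assert not np.allclose(T1 @ T2, T2 @ T1)

# Eqs. 3.17-3.19:
assert np.allclose(comm(T1, T2), -comm(T2, T1))
assert np.allclose(comm(T1 + T2, T3), comm(T1, T3) + comm(T2, T3))
assert np.allclose(comm(c1 * T1, c2 * T2), c1 * c2 * comm(T1, T2))
```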

Having defined the product of two operators, I can introduce a power function for operators: $\hat T^n$ simply means applying the same operator $n$ times. The power function is important because it allows defining other, more complex, functions of operators. In general, the expression $f(\hat T)$, where $f(x)$ is an arbitrary function with infinitely many derivatives at $x = 0$, can be expanded in an infinite Taylor series. Using this series, one can define the operator function $f(\hat T)$ by simply substituting the operator for $x$ in the series:

$$f(\hat T) = \sum_{n=0}^{\infty}\frac{1}{n!}\left.\frac{d^n f}{dx^n}\right|_{x=0}\hat T^n.$$

However, a number of important functions, which you are used to dealing with routinely, cannot be defined this way and, therefore, do not make sense for operators. Among them are $\sqrt{\hat T}$, $\ln\hat T$, and other similar functions with singularities at zero. An important exception is the function $\hat T^{-1}$, called the inverse operator, which is defined by the equation

$$\hat T\hat T^{-1} = \hat T^{-1}\hat T = \hat I, \tag{3.20}$$

where $\hat I$ is the unity operator, i.e., an operator which does not change the vector it acts upon. The meaning of the inverse operator can be illustrated by the following expressions:

$$\hat T|\alpha\rangle = |\beta\rangle \implies \hat T^{-1}|\beta\rangle = |\alpha\rangle,$$

where the second equation is obtained from the first by multiplying both of its sides by $\hat T^{-1}$. Finding inverse operators is usually a difficult task and often amounts to solving an entire problem. If an operator has the form of a matrix, its inverse can be found according to the standard rules for inverting matrices.
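For the matrix case, the defining relation Eq. 3.20 and the "undoing" property of the inverse can be sketched in a few lines (Python with NumPy, my addition, with an arbitrary invertible matrix chosen for illustration):

```python
import numpy as np

T = np.array([[2.0, 1.0],
              [1.0, 3.0]])
T_inv = np.linalg.inv(T)

# T T^{-1} = T^{-1} T = I  (Eq. 3.20)
assert np.allclose(T @ T_inv, np.eye(2))
assert np.allclose(T_inv @ T, np.eye(2))

# If T|alpha> = |beta>, then T^{-1} recovers |alpha> from |beta>
alpha = np.array([1.0, -2.0])
beta = T @ alpha
assert np.allclose(T_inv @ beta, alpha)
```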

Finding inverse operators is significantly simplified for a special class of operators called unitary operators. These operators are defined by the condition

$$\hat U^\dagger = \hat U^{-1},$$


and they play an extremely important role in quantum theory (we value them, of course, not just because their inverses are easy to find). The main property of unitary operators is that they change neither the norms of vectors nor their inner products. Indeed, consider vectors $|\alpha\rangle$ and $|\beta\rangle$, and define new vectors $|\tilde\alpha\rangle = \hat U|\alpha\rangle$ and $|\tilde\beta\rangle = \hat U|\beta\rangle$, where $\hat U$ is a unitary operator. Direct computation of $\langle\tilde\alpha|\tilde\beta\rangle$ proves this statement:

$$\langle\tilde\alpha| = \langle\alpha|\hat U^\dagger \implies \langle\tilde\alpha|\tilde\beta\rangle = \langle\alpha|\hat U^\dagger\hat U|\beta\rangle = \langle\alpha|\hat U^{-1}\hat U|\beta\rangle = \langle\alpha|\beta\rangle.$$

Unitary operators are a generalization of the rotation operators acting on regular three-dimensional vectors: rotating two vectors by the same angle changes neither their lengths nor the angle between them. As a result, the dot product of these vectors also does not change. Here is an example of a unitary operator based on the two-dimensional rotation matrix.

Example 10 (Unitary Operators) Consider the well-known matrix used to relate the coordinates of a two-dimensional vector rotated by an angle $\theta$ from its initial position:

$$R = \begin{bmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{bmatrix}.$$

Its Hermitian conjugate is

$$R^\dagger = \begin{bmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{bmatrix}.$$

A simple computation shows that the product $R^\dagger R$ is the unity matrix:

$$R^\dagger R = \begin{bmatrix} \cos\theta & \sin\theta\\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \cos\theta & -\sin\theta\\ \sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} \cos^2\theta + \sin^2\theta & -\cos\theta\sin\theta + \cos\theta\sin\theta\\ -\cos\theta\sin\theta + \cos\theta\sin\theta & \cos^2\theta + \sin^2\theta \end{bmatrix} = \begin{bmatrix} 1 & 0\\ 0 & 1 \end{bmatrix}.$$

This proves, of course, that $R^\dagger = R^{-1}$.
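A quick numerical sanity check (Python with NumPy, added for illustration, with an arbitrarily chosen angle) confirms both the unitarity of $R$ and the preservation of norms and inner products:

```python
import numpy as np

theta = 0.7  # an arbitrary rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# R† R = I, i.e., R† = R^{-1} (R is real, so R† is just the transpose)
assert np.allclose(R.T @ R, np.eye(2))

# Norms and inner products are preserved under rotation
rng = np.random.default_rng(3)
a, b = rng.normal(size=2), rng.normal(size=2)
assert np.isclose(np.linalg.norm(R @ a), np.linalg.norm(a))
assert np.isclose((R @ a) @ (R @ b), a @ b)
```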
An important example of an operator function is the exponential function defined as

$$\exp(\hat T) \equiv \sum_{n=0}^{\infty}\frac{1}{n!}\hat T^n. \tag{3.21}$$


Some of the familiar properties of this function remain valid even when its argument is an operator. For instance, the derivative of the expression $f(\lambda) = \exp(\lambda\hat T)$ with respect to the parameter $\lambda$ is calculated as though $\hat T$ were a regular number:

$$\frac{df}{d\lambda} = \hat T\exp(\lambda\hat T).$$

You should be warned, however, that a very convenient property of exponential functions,

$$\exp(x + y) = \exp(x)\exp(y), \tag{3.22}$$

does not hold for operator arguments. One way to understand the reason for this unfortunate circumstance is to notice that if two operators $\hat T_1$ and $\hat T_2$ in the argument of the exponential function $\exp(\hat T_1 + \hat T_2)$ do not commute, the expressions $\exp(\hat T_1)\exp(\hat T_2)$ and $\exp(\hat T_2)\exp(\hat T_1)$ are not equivalent, so they cannot both be equal to the exponential of the sum of these operators. The generalization of Eq. 3.22 to the case of operator arguments is, in general, very complicated and will not be considered here. There is, however, one case when such a generalization has a relatively simple form and can be derived without too much effort, although some work is still required, of course. This simplification takes place when the commutator of the operators $\hat T_1$ and $\hat T_2$ commutes with both of them. In most cases, this means that the commutator is a regular number, but it does not have to be.

So, suppose that the commutator of two operators $\hat T_1$ and $\hat T_2$ is $[\hat T_1,\hat T_2] = \hat C$, where $\hat C$ is such that $[\hat T_1,\hat C] = [\hat T_2,\hat C] = 0$. This assumption appears to be quite restrictive, but in reality, it is fulfilled for a great many pairs of operators that are important for quantum mechanics. In order to derive the promised generalization of Eq. 3.22, I have, first, to prove two intermediate identities, which, however, are useful in their own right. Let me begin by computing the following expression:

$$\left[\hat T_1, e^{\lambda\hat T_2}\right] = \sum_{n=0}^{\infty}\frac{1}{n!}\left[\hat T_1, \lambda^n\hat T_2^n\right] = \sum_{n=0}^{\infty}\frac{\lambda^n}{n!}\left[\hat T_1, \hat T_2^n\right]. \tag{3.23}$$

To proceed, I need to prove the following identity for the commutators:

$$\left[\hat T_1, \hat T_2^n\right] = n\hat C\hat T_2^{n-1}. \tag{3.24}$$

The easiest way to do it is to use the method of mathematical induction. For those who have forgotten how this method works: the first step is to prove the statement for the first nontrivial value of the index ($n = 2$ in this case). After that, you assume that the statement is correct for $n = k$ and, using this assumption, prove it for $n = k + 1$. Thus, for the first step, consider $n = 2$:


$$\left[\hat T_1, \hat T_2^2\right] = \hat T_1\hat T_2^2 - \hat T_2^2\hat T_1 = \hat T_1\hat T_2\hat T_2 - \hat T_2\hat T_1\hat T_2 + \hat T_2\hat T_1\hat T_2 - \hat T_2\hat T_2\hat T_1$$
$$= \left(\hat T_1\hat T_2 - \hat T_2\hat T_1\right)\hat T_2 + \hat T_2\left(\hat T_1\hat T_2 - \hat T_2\hat T_1\right) = 2\hat C\hat T_2.$$

(Note that this works because $\hat C$ commutes with $\hat T_2$.) Next, the $n = k$ assumption:

$$\left[\hat T_1, \hat T_2^k\right] = k\hat C\hat T_2^{k-1}.$$

The final step is the proof for $n = k + 1$:

$$\left[\hat T_1, \hat T_2^{k+1}\right] = \hat T_1\hat T_2^{k+1} - \hat T_2^{k+1}\hat T_1 = \hat T_1\hat T_2^{k+1} - \hat T_2\hat T_1\hat T_2^{k} + \hat T_2\hat T_1\hat T_2^{k} - \hat T_2^{k+1}\hat T_1$$
$$= \left[\hat T_1,\hat T_2\right]\hat T_2^{k} + \hat T_2\left[\hat T_1,\hat T_2^{k}\right] = \hat C\hat T_2^{k} + k\hat C\hat T_2^{k} = (k+1)\hat C\hat T_2^{k}.$$

Using this identity, I can transform Eq. 3.23 into

$$\left[\hat T_1, e^{\lambda\hat T_2}\right] = \hat C\sum_{n=0}^{\infty}\frac{n\lambda^n}{n!}\hat T_2^{n-1} = \hat C\lambda\sum_{n=1}^{\infty}\frac{\lambda^{n-1}}{(n-1)!}\hat T_2^{n-1} = \lambda\hat C e^{\lambda\hat T_2}. \tag{3.25}$$

This result can be used to derive another important identity. Multiply Eq. 3.25 by $e^{-\lambda\hat T_2}$ from the left:

$$e^{-\lambda\hat T_2}\left[\hat T_1, e^{\lambda\hat T_2}\right] = e^{-\lambda\hat T_2}\lambda\hat C e^{\lambda\hat T_2}.$$

The right-hand side of this expression simplifies to $\lambda\hat C$, because $e^{-\lambda\hat T_2}e^{\lambda\hat T_2} = e^{-\lambda\hat T_2 + \lambda\hat T_2} = \hat I$: Eq. 3.22 is applicable to any commuting operators, and any operator commutes with itself. Now you can expand the commutator on the left of the expression above to get

$$e^{-\lambda\hat T_2}\hat T_1 e^{\lambda\hat T_2} - \hat T_1 = \lambda\hat C$$

or

$$e^{-\lambda\hat T_2}\hat T_1 e^{\lambda\hat T_2} = \hat T_1 + \lambda\hat C. \tag{3.26}$$

Now I am ready to approach my main target and prove that

$$e^{\hat T_1 + \hat T_2} = e^{\hat T_1}e^{\hat T_2}e^{-\frac{1}{2}\left[\hat T_1,\hat T_2\right]}. \tag{3.27}$$


The proof of this identity is more involved than the two previous derivations. A direct proof (for instance, by using series expansions of the exponential functions on both sides of Eq. 3.27) results in expressions too cumbersome to allow for fruitful analysis. Therefore, I am going to use an indirect approach, which was invented by Harvard professor Roy Glauber, winner of the 2005 Nobel Prize for his contributions to quantum optics. Glauber considered the function $f(x) = e^{x\hat T_1}e^{x\hat T_2}$, for which he derived a differential equation by computing its derivative:

$$\frac{df}{dx} = \hat T_1 e^{x\hat T_1}e^{x\hat T_2} + e^{x\hat T_1}\hat T_2 e^{x\hat T_2}.$$

Note how the operators $\hat T_1$ and $\hat T_2$ are placed in this expression: $\hat T_1$ appears in front of the exponent containing $\hat T_2$ because it originates from the exponential function of $\hat T_1$ positioned to the left of $e^{x\hat T_2}$. At the same time, $\hat T_2$ appears behind $e^{x\hat T_1}$, following the respective position of $e^{x\hat T_2}$. The relative positions of $e^{x\hat T_i}$ and the respective $\hat T_i$ are not important because these operators commute (any operator commutes with any function of the same operator). Now the derivative can be rewritten in the following way:

$$\frac{df}{dx} = e^{x\hat T_1}e^{x\hat T_2}e^{-x\hat T_2}e^{-x\hat T_1}\left(\hat T_1 e^{x\hat T_1}e^{x\hat T_2} + e^{x\hat T_1}\hat T_2 e^{x\hat T_2}\right).$$

It is not too difficult to see that the expression in front of the brackets is equal to unity, so writing it there does not change anything. Continuing,

$$\frac{df}{dx} = f(x)\left(e^{-x\hat T_2}\hat T_1 e^{x\hat T_2} + \hat T_2\right) = f(x)\left(\hat T_1 + x\left[\hat T_1,\hat T_2\right] + \hat T_2\right),$$

where the identity given by Eq. 3.26 is used and $\hat C$ is replaced with $\left[\hat T_1,\hat T_2\right]$. This differential equation can now be solved for the function $f(x)$:

$$\int\frac{df}{f} = \int dx\left(\hat T_1 + \hat T_2 + x\left[\hat T_1,\hat T_2\right]\right) \implies \ln\frac{f}{f_0} = x\left(\hat T_1 + \hat T_2\right) + \frac{1}{2}x^2\left[\hat T_1,\hat T_2\right],$$

where the integration constant $f_0$ is chosen to satisfy the obvious initial condition $f(0) = 1$. With this in mind, the function $f$ can be written as

$$f = e^{x\left(\hat T_1 + \hat T_2\right) + \frac{1}{2}x^2\left[\hat T_1,\hat T_2\right]}.$$

Setting $x = 1$ in this expression and multiplying it by $e^{-\frac{1}{2}\left[\hat T_1,\hat T_2\right]}$, Eq. 3.27 is finally obtained, completing the proof.
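Equation 3.27 can be confirmed numerically for a concrete pair of matrices whose commutator commutes with both of them. The sketch below (Python with NumPy, my addition; the nilpotent matrices are chosen only for illustration) builds the exponential from its Taylor series, Eq. 3.21, and checks both Eq. 3.27 and the failure of the naive product rule:

```python
import numpy as np

def matrix_exp(A, terms=30):
    """Matrix exponential via its Taylor series, Eq. 3.21."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms):
        term = term @ A / n
        result = result + term
    return result

# Two matrices whose commutator commutes with both of them
T1 = np.array([[0., 1., 0.], [0., 0., 0.], [0., 0., 0.]])
T2 = np.array([[0., 0., 0.], [0., 0., 1.], [0., 0., 0.]])
C = T1 @ T2 - T2 @ T1
assert np.allclose(C @ T1, T1 @ C) and np.allclose(C @ T2, T2 @ C)

# Eq. 3.27: e^{T1+T2} = e^{T1} e^{T2} e^{-C/2}
lhs = matrix_exp(T1 + T2)
rhs = matrix_exp(T1) @ matrix_exp(T2) @ matrix_exp(-0.5 * C)
assert np.allclose(lhs, rhs)

# The naive product e^{T1} e^{T2} misses the commutator correction
assert not np.allclose(lhs, matrix_exp(T1) @ matrix_exp(T2))
```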


I want to finish this section with two important technical statements about Hermitian operators. The first one concerns the Hermitian conjugation of a product of two Hermitian operators. It can be shown that

$$\left(\hat T_1\hat T_2\right)^\dagger = \hat T_2\hat T_1. \tag{3.28}$$

This statement can be proven as follows. By definition,

$$\langle\alpha|\left(\hat T_1\hat T_2\right)^\dagger|\beta\rangle = \left(\langle\beta|\hat T_1\hat T_2|\alpha\rangle\right)^*.$$

Introducing $\langle\beta|\hat T_1 = \langle\tilde\beta|$ and $\hat T_2|\alpha\rangle = |\tilde\alpha\rangle$ and using Eq. 2.19, you can write the right-hand side of this expression as

$$\left(\langle\tilde\beta|\tilde\alpha\rangle\right)^* = \langle\tilde\alpha|\tilde\beta\rangle.$$

The rules for Hermitian conjugation yield $|\tilde\beta\rangle = \hat T_1^\dagger|\beta\rangle$ and $\langle\tilde\alpha| = \langle\alpha|\hat T_2^\dagger$, which allows me to proceed as follows:

$$\left(\langle\beta|\hat T_1\hat T_2|\alpha\rangle\right)^* = \left(\langle\tilde\beta|\tilde\alpha\rangle\right)^* = \langle\tilde\alpha|\tilde\beta\rangle = \langle\alpha|\hat T_2^\dagger\hat T_1^\dagger|\beta\rangle.$$

By the way, you may have noticed that I have actually proved a more general statement. Indeed, the last equation means that

$$\left(\hat T_1\hat T_2\right)^\dagger = \hat T_2^\dagger\hat T_1^\dagger,$$

which is valid for any linear, not necessarily Hermitian, operators. Equation 3.28 follows from this result if $\hat T_1$ and $\hat T_2$ are Hermitian. An immediate corollary of this result is the following:

$$\left[\hat T_1,\hat T_2\right]^\dagger = \left[\hat T_2,\hat T_1\right] = -\left[\hat T_1,\hat T_2\right]. \tag{3.29}$$

Operators which change sign upon Hermitian conjugation are called anti-Hermitian, so the commutator of two Hermitian operators is anti-Hermitian. It is now easy to demonstrate that a commutator of two Hermitian operators can be presented as

$$\left[\hat T_1,\hat T_2\right] = i\hat A, \tag{3.30}$$


where $\hat A$ is Hermitian. If the commutator is a number, Eq. 3.30 reduces to

$$\left[\hat T_1,\hat T_2\right] = ic, \tag{3.31}$$

where $c$ is real.
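Both statements are easy to test numerically. The sketch below (Python with NumPy, added for illustration) builds random Hermitian matrices and checks Eq. 3.28, the anti-Hermiticity of their commutator, and the decomposition of Eq. 3.30:

```python
import numpy as np

rng = np.random.default_rng(4)

def random_hermitian(n):
    A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    return A + A.conj().T          # A + A† is always Hermitian

H1, H2 = random_hermitian(3), random_hermitian(3)

# (T1 T2)† = T2† T1† = T2 T1 for Hermitian operators (Eq. 3.28)
assert np.allclose((H1 @ H2).conj().T, H2 @ H1)

# The commutator of two Hermitian operators is anti-Hermitian (Eq. 3.29)
C = H1 @ H2 - H2 @ H1
assert np.allclose(C.conj().T, -C)

# ... and equals i*A with A = -i*C Hermitian (Eq. 3.30)
A = -1j * C
assert np.allclose(A, A.conj().T)
assert np.allclose(C, 1j * A)
```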

3.2.3 Eigenvalues and Eigenvectors

When an operator acts on a generic vector, the result is a different vector. For instance, the differentiation operator acting on the function $e^{-x^2}$, $\hat D e^{-x^2} = -2x e^{-x^2}$, produces a different function. If, however, you apply the same operator to the function $e^{\lambda x}$, the result will be the same function multiplied by a number: $\hat D e^{\lambda x} = \lambda e^{\lambda x}$. This example illustrates a general phenomenon: among the many vectors that operators change into completely different vectors, there are some that are only multiplied by a number. This special class of vectors, called eigenvectors, plays an important role in the application of operators in quantum physics. The number that appears as a factor in front of an eigenvector is specific to each vector (or a limited subset thereof) and is called an eigenvalue. The formal definition of an eigenvector and an eigenvalue is as follows: a vector $|\alpha\rangle$ is an eigenvector of an operator $\hat T$ with the corresponding eigenvalue $\lambda_\alpha$ if

$$\hat T|\alpha\rangle = \lambda_\alpha|\alpha\rangle. \tag{3.32}$$

For each eigenvector there is one and only one corresponding eigenvalue, but the converse of this statement is not always true. If for a given eigenvalue there exists only a single eigenvector, we call this eigenvalue non-degenerate. If the opposite happens, and several eigenvectors "belong" to the same eigenvalue, the respective eigenvalue is naturally called "degenerate." In the non-degenerate case, an eigenvalue determines the respective eigenvector up to a constant factor (a vector appearing in Eq. 3.32 can be multiplied by any number without destroying the equation). If, however, we require that all eigenvectors be normalized, then the eigenvalue defines the respective eigenvector uniquely (up to an arbitrary phase factor, which cannot be fixed by normalization but which does not affect any physical results), so that I can designate it simply as $|\lambda\rangle$.

To distinguish between different eigenvectors belonging to the same eigenvalue, I need an additional index $\mu$, so that Eq. 3.32 becomes

$$\hat T|\lambda,\mu\rangle = \lambda|\lambda,\mu\rangle. \tag{3.33}$$

The physical meaning of the additional index will become clear later, but for
now, it is just a way to distinguish between different eigenvectors belonging to
the same eigenvalue. An important property of degenerate eigenvectors is that any


linear combination of these vectors is again an eigenvector belonging to the same eigenvalue. Indeed, consider a vector

$$|\alpha\rangle = a_{\mu_1}|\lambda,\mu_1\rangle + a_{\mu_2}|\lambda,\mu_2\rangle$$

and apply the operator $\hat T$ to it:

$$\hat T|\alpha\rangle = \hat T\left(a_{\mu_1}|\lambda,\mu_1\rangle + a_{\mu_2}|\lambda,\mu_2\rangle\right) = a_{\mu_1}\lambda|\lambda,\mu_1\rangle + a_{\mu_2}\lambda|\lambda,\mu_2\rangle = \lambda|\alpha\rangle,$$

where I used Eq. 3.33. Using mathematical lingo, you can say that the eigenvectors belonging to a degenerate eigenvalue form a subspace of the total linear space, because by forming any linear combination thereof you remain within the same set of vectors, in complete agreement with the definition of a vector space.

Now I shall prove an important theorem concerning eigenvectors of commuting
operators and discuss its consequences.

Theorem 1 (Eigenvectors of Commuting Operators) Consider two operators $\hat T_1$ and $\hat T_2$ such that $\hat T_1\hat T_2 = \hat T_2\hat T_1$. Also assume that $\lambda_{T_1}$ is a non-degenerate eigenvalue of $\hat T_1$ with eigenvector $|\lambda_{T_1}\rangle$. Then, this vector is also an eigenvector of the operator $\hat T_2$.

Proof Consider

$$\hat T_2\hat T_1|\lambda_{T_1}\rangle = \lambda_{T_1}\hat T_2|\lambda_{T_1}\rangle = \hat T_1\hat T_2|\lambda_{T_1}\rangle,$$

where at the last step I used the commutativity of the operators. The obtained result means that $\hat T_2|\lambda_{T_1}\rangle$ is also an eigenvector of $\hat T_1$ with the same eigenvalue $\lambda_{T_1}$. However, since it was assumed that $\lambda_{T_1}$ is non-degenerate, this new eigenvector can differ from $|\lambda_{T_1}\rangle$ only by a constant factor:

$$\hat T_2|\lambda_{T_1}\rangle = \lambda_{T_2}|\lambda_{T_1}\rangle,$$

which means that $|\lambda_{T_1}\rangle$ is an eigenvector of $\hat T_2$.
The non-degenerate nature of the eigenvalue of $\hat T_1$ is essential for this proof to work. Thus, if the eigenvalues of $\hat T_1$ are degenerate, not all eigenvectors of $\hat T_1$ will also be eigenvectors of $\hat T_2$. However, it can be proven (though the proof is much more involved and will not be reproduced here) that one can always form such a linear combination of these degenerate eigenvectors that becomes an eigenvector of $\hat T_2$, with its own eigenvalue $\lambda_{T_2}$. In this case, specifying the eigenvalues of both $\hat T_1$ and $\hat T_2$ might provide a unique characterization of a vector, which is a simultaneous eigenvector of both operators and can be notated as $|\lambda_{T_1},\lambda_{T_2}\rangle$. Comparing this notation with Eq. 3.33, one can see that the index $\mu$ in that equation can be understood as an eigenvalue of a commuting partner operator. If there exists a third operator, $\hat T_3$, commuting with both $\hat T_1$ and $\hat T_2$, one can find common eigenvectors of all three


operators, in which case a full unique characterization of such a state would require specifying three eigenvalues, $|\lambda_{T_1},\lambda_{T_2},\lambda_{T_3}\rangle$, where

$$\hat T_1|\lambda_{T_1},\lambda_{T_2},\lambda_{T_3}\rangle = \lambda_{T_1}|\lambda_{T_1},\lambda_{T_2},\lambda_{T_3}\rangle$$
$$\hat T_2|\lambda_{T_1},\lambda_{T_2},\lambda_{T_3}\rangle = \lambda_{T_2}|\lambda_{T_1},\lambda_{T_2},\lambda_{T_3}\rangle$$
$$\hat T_3|\lambda_{T_1},\lambda_{T_2},\lambda_{T_3}\rangle = \lambda_{T_3}|\lambda_{T_1},\lambda_{T_2},\lambda_{T_3}\rangle.$$

In general, in order to fully and uniquely characterize an eigenvector of an operator with degenerate eigenvalues, one needs to find the complete set of commuting operators (CSCO), i.e., all operators which commute with each other.

To help you visualize these rather abstract concepts, I will illustrate them with a simple example involving commuting matrices, but you have to be prepared for some lengthy computations. So, brace yourself! This example will also illustrate the process of finding eigenvalues and eigenvectors of operators in matrix form.

Example 11 (Eigenvectors of Commuting Matrices) Consider two $3\times 3$ matrices

$$M_1 = \begin{bmatrix} \frac{5}{4} & \frac{1}{2\sqrt 2} & \frac{1}{4}\\ \frac{1}{2\sqrt 2} & \frac{3}{2} & \frac{1}{2\sqrt 2}\\ \frac{1}{4} & \frac{1}{2\sqrt 2} & \frac{5}{4} \end{bmatrix};\qquad M_2 = \begin{bmatrix} 1 & -\frac{1}{\sqrt 2} & -1\\ -\frac{1}{\sqrt 2} & 0 & -\frac{1}{\sqrt 2}\\ -1 & -\frac{1}{\sqrt 2} & 1 \end{bmatrix}. \tag{3.34}$$

It does not take much effort to compute their products (you can use a symbolic computational platform such as Mathematica or Maple if you are too lazy to do it yourself) and to see that the matrices, indeed, commute:

$$M_1\cdot M_2 = M_2\cdot M_1 = \begin{bmatrix} \frac{3}{4} & -\frac{3}{2\sqrt 2} & -\frac{5}{4}\\ -\frac{3}{2\sqrt 2} & -\frac{1}{2} & -\frac{3}{2\sqrt 2}\\ -\frac{5}{4} & -\frac{3}{2\sqrt 2} & \frac{3}{4} \end{bmatrix}.$$

Vectors in this case are single columns with three elements:

$$|\alpha\rangle = \begin{bmatrix} u_1\\ u_2\\ u_3 \end{bmatrix},$$

and the eigenvector equation, Eq. 3.32, takes the form of a matrix equation. For $M_1$ this equation is

$$\begin{bmatrix} \frac{5}{4} & \frac{1}{2\sqrt 2} & \frac{1}{4}\\ \frac{1}{2\sqrt 2} & \frac{3}{2} & \frac{1}{2\sqrt 2}\\ \frac{1}{4} & \frac{1}{2\sqrt 2} & \frac{5}{4} \end{bmatrix} \begin{bmatrix} u_1\\ u_2\\ u_3 \end{bmatrix} = \lambda\begin{bmatrix} u_1\\ u_2\\ u_3 \end{bmatrix}.$$


It is convenient to collect all terms on one side and present this equation in the form

$$\begin{bmatrix} \frac{5}{4}-\lambda & \frac{1}{2\sqrt 2} & \frac{1}{4}\\ \frac{1}{2\sqrt 2} & \frac{3}{2}-\lambda & \frac{1}{2\sqrt 2}\\ \frac{1}{4} & \frac{1}{2\sqrt 2} & \frac{5}{4}-\lambda \end{bmatrix} \begin{bmatrix} u_1\\ u_2\\ u_3 \end{bmatrix} = 0. \tag{3.35}$$

What we have here is the matrix form of a system of three linear homogeneous equations, which always has at least one solution: $u_1 = u_2 = u_3 = 0$. This solution, however, is not what I had in mind when introducing the concept of eigenvectors. We need non-zero solutions, but they exist only if the determinant of the matrix of coefficients of this system is equal to zero. (Cramer's rule of linear algebra, anyone?) Computing the determinant and setting it to zero, I arrive at the following equation:

$$\lambda^3 - 4\lambda^2 + 5\lambda - 2 = 0,$$

which has three solutions, $\lambda_{1,2} = 1$ and $\lambda_3 = 2$, two of which coincide, signifying that the matrix does have a degenerate eigenvalue. (These solutions can be found by factoring the determinant as $(\lambda-1)^2(\lambda-2)$.)

Now, for each eigenvalue, I will find a respective eigenvector, beginning with the non-degenerate eigenvalue $\lambda_3 = 2$. Substituting this eigenvalue into Eq. 3.35, I reduce it to

$$\begin{bmatrix} -\frac{3}{4} & \frac{1}{2\sqrt 2} & \frac{1}{4}\\ \frac{1}{2\sqrt 2} & -\frac{1}{2} & \frac{1}{2\sqrt 2}\\ \frac{1}{4} & \frac{1}{2\sqrt 2} & -\frac{3}{4} \end{bmatrix} \begin{bmatrix} u_1^{(3)}\\ u_2^{(3)}\\ u_3^{(3)} \end{bmatrix} = 0,$$

where the added upper index in $u_i^{(3)}$ indicates that this eigenvector belongs to the third eigenvalue. Expanding the matrix equation into an explicit system of linear equations yields

$$-\frac{3}{4}u_1^{(3)} + \frac{1}{2\sqrt 2}u_2^{(3)} + \frac{1}{4}u_3^{(3)} = 0 \implies -3u_1^{(3)} + \sqrt 2\, u_2^{(3)} + u_3^{(3)} = 0$$
$$\frac{1}{2\sqrt 2}u_1^{(3)} - \frac{1}{2}u_2^{(3)} + \frac{1}{2\sqrt 2}u_3^{(3)} = 0 \implies u_1^{(3)} - \sqrt 2\, u_2^{(3)} + u_3^{(3)} = 0$$
$$\frac{1}{4}u_1^{(3)} + \frac{1}{2\sqrt 2}u_2^{(3)} - \frac{3}{4}u_3^{(3)} = 0 \implies u_1^{(3)} + \sqrt 2\, u_2^{(3)} - 3u_3^{(3)} = 0.$$

Combining the last two equations, I get $2u_1^{(3)} - 2u_3^{(3)} = 0 \implies u_1^{(3)} = u_3^{(3)}$. Then, the first two equations are reduced to two identical equations:


$$-2u_1^{(3)} + \sqrt 2\, u_2^{(3)} = 0$$
$$2u_1^{(3)} - \sqrt 2\, u_2^{(3)} = 0,$$

which means that the value of one of the coefficients $u_{1,2}^{(3)}$ can be chosen arbitrarily. For instance, you can express these coefficients in terms of the yet undefined $u_1^{(3)}$: $u_2^{(3)} = \sqrt 2\, u_1^{(3)}$, $u_3^{(3)} = u_1^{(3)}$. Using the notation $|2\rangle$ to designate this eigenvector (2 in this notation refers to the value of the respective eigenvalue), I can write

$$|2\rangle = u_1^{(3)}\begin{bmatrix} 1\\ \sqrt 2\\ 1 \end{bmatrix}.$$

The value of the remaining coefficient can be fixed (if the undefined coefficients make you nervous) by requiring that the vector be normalized:

$$\left|u_1^{(3)}\right|^2\begin{bmatrix} 1 & \sqrt 2 & 1 \end{bmatrix}\begin{bmatrix} 1\\ \sqrt 2\\ 1 \end{bmatrix} = 4\left|u_1^{(3)}\right|^2 = 1 \implies u_1^{(3)} = \frac{1}{2}.$$

Thus, the normalized eigenvector belonging to the eigenvalue $\lambda = 2$ is found to be

$$|2\rangle = \frac{1}{2}\begin{bmatrix} 1\\ \sqrt 2\\ 1 \end{bmatrix}. \tag{3.36}$$

Now let me deal with the degenerate eigenvalue $\lambda_{1,2} = 1$. In this case, the eigenvector equation becomes

$$\begin{bmatrix} \frac{1}{4} & \frac{1}{2\sqrt 2} & \frac{1}{4}\\ \frac{1}{2\sqrt 2} & \frac{1}{2} & \frac{1}{2\sqrt 2}\\ \frac{1}{4} & \frac{1}{2\sqrt 2} & \frac{1}{4} \end{bmatrix} \begin{bmatrix} u_1^{(1,2)}\\ u_2^{(1,2)}\\ u_3^{(1,2)} \end{bmatrix} = 0$$

or in the expanded form

$$\frac{1}{4}u_1^{(1,2)} + \frac{1}{2\sqrt 2}u_2^{(1,2)} + \frac{1}{4}u_3^{(1,2)} = 0 \implies u_1^{(1,2)} + \sqrt 2\, u_2^{(1,2)} + u_3^{(1,2)} = 0$$
$$\frac{1}{2\sqrt 2}u_1^{(1,2)} + \frac{1}{2}u_2^{(1,2)} + \frac{1}{2\sqrt 2}u_3^{(1,2)} = 0 \implies u_1^{(1,2)} + \sqrt 2\, u_2^{(1,2)} + u_3^{(1,2)} = 0$$


$$\frac{1}{4}u_1^{(1,2)} + \frac{1}{2\sqrt 2}u_2^{(1,2)} + \frac{1}{4}u_3^{(1,2)} = 0 \implies u_1^{(1,2)} + \sqrt 2\, u_2^{(1,2)} + u_3^{(1,2)} = 0.$$

In this case, all three equations coincide, meaning that I can choose two coefficients arbitrarily, e.g., $u_1^{(1,2)}$ and $u_3^{(1,2)}$, while expressing the remaining coefficient as

$$u_2^{(1,2)} = -\left(u_1^{(1,2)} + u_3^{(1,2)}\right)/\sqrt 2.$$

Choosing different values for the remaining coefficients, I can generate different eigenvectors, all belonging to the same eigenvalue. For instance, choosing $u_1^{(1)} = 0$ for one vector and $u_3^{(2)} = 0$ for the other, I generate two distinct vectors:

$$|1\rangle_1 = u_3^{(1)}\begin{bmatrix} 0\\ -\frac{1}{\sqrt 2}\\ 1 \end{bmatrix};\qquad |1\rangle_2 = u_1^{(2)}\begin{bmatrix} 1\\ -\frac{1}{\sqrt 2}\\ 0 \end{bmatrix}, \tag{3.37}$$

which can also be normalized. Any linear combination of these vectors will also be an eigenvector.

Now I turn my attention to the matrix $M_2$. Again computing the determinant,

$$\begin{vmatrix} 1-\lambda & -\frac{1}{\sqrt 2} & -1\\ -\frac{1}{\sqrt 2} & -\lambda & -\frac{1}{\sqrt 2}\\ -1 & -\frac{1}{\sqrt 2} & 1-\lambda \end{vmatrix},$$

and setting it to zero, I end up with the equation

$$\lambda^3 - 2\lambda^2 - \lambda + 2 = 0,$$

which again can be solved by factorization and yields $\lambda_1 = 2$, $\lambda_2 = -1$, $\lambda_3 = 1$. Each of these eigenvalues (which, by the way, are non-degenerate) has its own eigenvector, which can be found in the same way as above. I will leave the actual calculations as an exercise and present here only the final answers for the normalized eigenvectors:

$$|2\rangle = \frac{1}{\sqrt 2}\begin{bmatrix} -1\\ 0\\ 1 \end{bmatrix},\qquad |-1\rangle = \frac{1}{2}\begin{bmatrix} 1\\ \sqrt 2\\ 1 \end{bmatrix},\qquad |1\rangle = \frac{1}{2}\begin{bmatrix} 1\\ -\sqrt 2\\ 1 \end{bmatrix}, \tag{3.38}$$

where the eigenvectors are again labeled by their respective eigenvalues. Now, it is obvious that the eigenvector $|-1\rangle$ of matrix $M_2$ is also an eigenvector of $M_1$ (it coincides with the vector found in Eq. 3.36), so I only need to check the remaining vectors:


$$\begin{bmatrix} \frac{5}{4} & \frac{1}{2\sqrt 2} & \frac{1}{4}\\ \frac{1}{2\sqrt 2} & \frac{3}{2} & \frac{1}{2\sqrt 2}\\ \frac{1}{4} & \frac{1}{2\sqrt 2} & \frac{5}{4} \end{bmatrix} \begin{bmatrix} -1\\ 0\\ 1 \end{bmatrix} = \begin{bmatrix} -1\\ 0\\ 1 \end{bmatrix},$$

so this vector is an eigenvector of $M_1$ with eigenvalue $\lambda = 1$. Note that the elements of this vector obey the condition $u_2^{(1,2)} = -\left(u_1^{(1,2)} + u_3^{(1,2)}\right)/\sqrt 2$ derived for the degenerate eigenvectors of $M_1$, with $u_2 = 0$ and $u_1 = -u_3 = -1$. Now, for the remaining eigenvector of $M_2$, I have

$$\begin{bmatrix} \frac{5}{4} & \frac{1}{2\sqrt 2} & \frac{1}{4}\\ \frac{1}{2\sqrt 2} & \frac{3}{2} & \frac{1}{2\sqrt 2}\\ \frac{1}{4} & \frac{1}{2\sqrt 2} & \frac{5}{4} \end{bmatrix} \begin{bmatrix} 1\\ -\sqrt 2\\ 1 \end{bmatrix} = \begin{bmatrix} 1\\ -\sqrt 2\\ 1 \end{bmatrix},$$

i.e., this is also an eigenvector of $M_1$ with the same eigenvalue. For this vector I also have $u_2 = -(u_1 + u_3)/\sqrt 2$ with $u_1 = u_3 = 1$. Thus, I can present the system of common eigenvectors of these two matrices, in which the degenerate eigenvectors become uniquely defined by virtue of their belonging to different eigenvalues of a second commuting matrix. All these eigenvectors can now be designated as $|1, 2\rangle$, $|1, 1\rangle$, and $|2, -1\rangle$, where the first and second numbers refer to the eigenvalues of $M_1$ and $M_2$, respectively.
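The lengthy hand computations above can be cross-checked in a few lines. This sketch (Python with NumPy, added for illustration) confirms that $M_1$ and $M_2$ commute and that the three vectors found in Example 11 are simultaneous eigenvectors with the stated pairs of eigenvalues:

```python
import numpy as np

s = 1 / (2 * np.sqrt(2))
M1 = np.array([[5/4, s,   1/4],
               [s,   3/2, s  ],
               [1/4, s,   5/4]])
r = 1 / np.sqrt(2)
M2 = np.array([[ 1, -r, -1],
               [-r,  0, -r],
               [-1, -r,  1]])

# The matrices commute
assert np.allclose(M1 @ M2, M2 @ M1)

# Common eigenvectors |1,2>, |1,1>, |2,-1> found in Example 11
vectors = {
    (1, 2):  np.array([-1, 0, 1]) / np.sqrt(2),
    (1, 1):  np.array([1, -np.sqrt(2), 1]) / 2,
    (2, -1): np.array([1,  np.sqrt(2), 1]) / 2,
}
for (lam1, lam2), v in vectors.items():
    assert np.allclose(M1 @ v, lam1 * v)  # eigenvalue of M1
    assert np.allclose(M2 @ v, lam2 * v)  # eigenvalue of M2
```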

3.3 Operators and Observables

3.3.1 Hermitian Operators

One might notice a striking similarity between a CSCO and the concept of the complete set of mutually consistent observables discussed in Sect. 2.1. Also, the state vectors characterized by definite values of compatible observables look like common eigenvectors of commuting operators characterized by their eigenvalues. It appears reasonable, therefore, to expect that one can establish a connection between physical observables and quantum states characterized by the values of those observables on the one hand and the mathematical concepts of operators and their eigenvalues and eigenvectors on the other. This connection is indeed established by the following postulates, which lay down the foundation of the formalism of quantum mechanics.

Postulate 1 (Observables and Hermitian Operators) Every observable is
represented in quantum theory by a Hermitian operator.


Postulate 2 The eigenvalues of the operator constructed to represent an observable determine the values which a measurement of the observable might yield, and its eigenvectors define the states in which a measurement of the observable represented by the operator will with certainty produce the corresponding eigenvalue.

The first question which might pop up in someone's mind after reading the first of these postulates is: why does it single out Hermitian operators? The fact of the matter is that Hermitian operators possess a number of special properties which make them practically suitable for their intended use as representatives of physical observables. These properties can be formulated in the form of several theorems.

Theorem 2 (Theorem of the Eigenvalues) Eigenvalues of Hermitian operators
with discrete spectrum are necessarily real-valued.

Proof Let $|\lambda_n\rangle$ be an eigenvector of a Hermitian operator $\hat T$ corresponding to the eigenvalue $\lambda_n$:

$$\hat T|\lambda_n\rangle = \lambda_n|\lambda_n\rangle.$$

Premultiplying this expression by $\langle\lambda_n|$, I get

$$\langle\lambda_n|\hat T|\lambda_n\rangle = \lambda_n\langle\lambda_n|\lambda_n\rangle.$$

Performing complex conjugation of this expression and using the definition of the Hermitian-conjugate operator, Eq. 3.14, I derive

$$\left(\langle\lambda_n|\hat T|\lambda_n\rangle\right)^* = \langle\lambda_n|\hat T^\dagger|\lambda_n\rangle = \lambda_n^*\langle\lambda_n|\lambda_n\rangle,$$

where it is assumed that the norm of the vector exists and is a real-valued quantity. For Hermitian operators $\hat T^\dagger = \hat T$, in which case the left-hand sides of the last two equations coincide, yielding $\lambda_n^* = \lambda_n$, which means, of course, that $\lambda_n$ is a real number.
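The theorem is easy to illustrate numerically for the matrix case. The sketch below (Python with NumPy, my addition) feeds a random Hermitian matrix to a general-purpose eigensolver and confirms that all its eigenvalues come out real, while a simple non-Hermitian matrix has genuinely complex ones:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
H = A + A.conj().T                    # a Hermitian matrix

eigenvalues = np.linalg.eigvals(H)    # general (complex) eigensolver
assert np.max(np.abs(eigenvalues.imag)) < 1e-10

# By contrast, a non-Hermitian matrix can have complex eigenvalues:
antisym = np.array([[0., 1.], [-1., 0.]])   # anti-Hermitian
assert np.allclose(sorted(np.linalg.eigvals(antisym).imag), [-1, 1])
```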

The importance of this theorem for association between physical observables
and operators is obvious—results of any measurements are always expressed by real
numbers, and the theorem guarantees that the mathematical constructs (eigenvalues)
used to connect the formalism with the real world of experiments and observations
are consistent with this natural requirement. The assumption that the norm of
the respective eigenvectors exists, which is a critical element of the proof of the
theorem, can be rigorously validated only for Hermitian operators with discrete
spectrum.3
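This property is easy to watch at work numerically. The sketch below (my illustration, not part of the text) builds a random Hermitian matrix with NumPy and confirms that its eigenvalues carry no imaginary part beyond round-off:

```python
import numpy as np

rng = np.random.default_rng(0)
# Build a random Hermitian matrix T = A + A^dagger.
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
T = A + A.conj().T

# The theorem guarantees real eigenvalues; the imaginary parts
# of the computed spectrum vanish to numerical precision.
eigvals = np.linalg.eigvals(T)
print(np.max(np.abs(eigvals.imag)) < 1e-10)  # True
```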

Eigenvectors of operators with continuous spectrum are not normalizable in the
usual sense (see Sect. 2.3), so this theorem does not apply to them. At the same
time, we need such continuous spectrum operators as momentum or coordinate

3I borrowed this fact without proof from the branch of mathematics called functional analysis that
studies the properties of linear operators.


to describe physical reality, so we have to find a way to avoid having to deal
with unrealistic complex eigenvalues. Leaving the mathematical intricacies of this
problem to mathematicians, I solve it here by a sleight of hand. I simply postulate
that only real eigenvalues and their corresponding eigenvectors of such operators
can be used to represent quantum states and the results of measurements. It can
be shown that the eigenvectors corresponding to real eigenvalues of Hermitian
operators with continuous spectrum can be normalized in the sense of Eq. 2.41. To
illustrate the last point, consider the operator $i\,d/dx$ that I have previously proved to be Hermitian. The eigenvectors of this operator have the form $e^{-ikx}$, with $k$ being an eigenvalue:

$$i\frac{d}{dx}\left(e^{-ikx}\right) = ke^{-ikx}.$$

If I force $k$ to be a real number, I can use the properties of the delta-function to write

$$\int dx\, e^{ix(k-k_1)} = 2\pi\delta(k - k_1),$$

which is the orthonormalization requirement for the eigenvectors belonging to the continuous spectrum. You may want to notice that the integral in this expression reduces to the delta-function only for real-valued $k$.

Theorem 3 (Theorem of Eigenvectors) Eigenvectors of Hermitian operators with
discrete spectrum belonging to different eigenvalues are necessarily orthogonal.

Proof Consider two different eigenvalues $\lambda_1$ and $\lambda_2$ of a Hermitian operator $\hat{T}$ together with their eigenvectors $|\lambda_1\rangle$ and $|\lambda_2\rangle$:

$$\hat{T}|\lambda_1\rangle = \lambda_1|\lambda_1\rangle$$
$$\hat{T}|\lambda_2\rangle = \lambda_2|\lambda_2\rangle.$$

Premultiply the first of these equations by $\langle\lambda_2|$ and the second one by $\langle\lambda_1|$:

$$\langle\lambda_2|\hat{T}|\lambda_1\rangle = \lambda_1\langle\lambda_2|\lambda_1\rangle$$
$$\langle\lambda_1|\hat{T}|\lambda_2\rangle = \lambda_2\langle\lambda_1|\lambda_2\rangle.$$

Complex conjugate the second of these equations, use Eq. 3.14 (which defines the Hermitian conjugate operator), and take into account that $\hat{T}$ is Hermitian. This yields

$$\langle\lambda_2|\hat{T}|\lambda_1\rangle = \lambda_2^*\langle\lambda_1|\lambda_2\rangle^*,$$

so that the pair of equations from above can be written as

$$\langle\lambda_2|\hat{T}|\lambda_1\rangle = \lambda_1\langle\lambda_2|\lambda_1\rangle$$
$$\langle\lambda_2|\hat{T}|\lambda_1\rangle = \lambda_2^*\langle\lambda_1|\lambda_2\rangle^*.$$

Taking into account that the eigenvalues of Hermitian operators are real and that, according to the property of the inner product, $\langle\lambda_2|\lambda_1\rangle = \langle\lambda_1|\lambda_2\rangle^*$, you finally obtain

$$\lambda_1\langle\lambda_2|\lambda_1\rangle = \lambda_2\langle\lambda_2|\lambda_1\rangle.$$

If $\lambda_1 \neq \lambda_2$, you have no choice but to conclude that $\langle\lambda_2|\lambda_1\rangle = 0$.
In the case of Hermitian operators with degenerate spectrum, the situation is more complex because, as we saw in the matrix example in Sect. 3.2.3, one can generate multiple sets of linearly independent vectors belonging to the same eigenvalue, and they do not have to be orthogonal. At the same time, we also saw that one can always find a set in which the eigenvectors are orthogonal. These special sets of orthogonal vectors belonging to the degenerate eigenvalues are usually also eigenvectors of another operator from the respective CSCO. Thus, you can rest assured that for any Hermitian operator, there exists a set of mutually orthogonal eigenvectors. I already mentioned that the physical meaning of the mathematical concept of orthogonality is the mutual exclusivity of values of the observables used to characterize the states, and this comment essentially completes our identification of mutually exclusive states, characterized by a set of mutually consistent observables, with eigenvectors of operators belonging to a complete set of mutually commuting operators.
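Theorem 3 can also be checked numerically. In the sketch below (an illustration of mine, assuming NumPy), `np.linalg.eigh` returns the eigenvectors of a Hermitian matrix as columns; their mutual inner products form an identity matrix, i.e., the eigenvectors are orthonormal:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
T = A + A.conj().T  # Hermitian by construction

vals, vecs = np.linalg.eigh(T)  # columns of vecs are eigenvectors
# Gram matrix of the eigenvectors: <v_i|v_j> should be the identity.
gram = vecs.conj().T @ vecs
print(np.allclose(gram, np.eye(5)))  # True
```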

Theorem 4 (Completeness of Eigenvectors) The set of eigenvectors of a Hermitian operator is complete in the sense that any state in the respective Hilbert vector space can be represented as a linear combination of these eigenvectors.

The completeness property gives a rigorous mathematical justification to the
generalization of the superposition principle expressed by Eq. 2.26. This property
essentially states that eigenvectors of Hermitian operators with discrete spectrum
form a countable basis in the Hilbert vector space. It can also be expressed in the
form of a so-called completeness or “closure” relation, which can be presented as
a useful operator identity. To derive it, I first rewrite Eq. 2.26 in a more compact form as

$$|\alpha\rangle = \sum_n a_n |\lambda_n\rangle, \qquad (3.39)$$

where index $n$ enumerates the eigenvectors and each eigenvector $|\lambda_n\rangle$, which is assumed to be normalized, is characterized by all available eigenvalues of the respective CSCO. The expansion coefficients $a_n$ in this expression can be found as $a_n = \langle\lambda_n|\alpha\rangle$, as established in Eq. 2.24. After substitution of this expression back into Eq. 3.39, the latter becomes

$$|\alpha\rangle = \sum_n |\lambda_n\rangle\langle\lambda_n|\alpha\rangle \equiv \left(\sum_n |\lambda_n\rangle\langle\lambda_n|\right)|\alpha\rangle. \qquad (3.40)$$


In the last expression here, I split off the ket vector $|\alpha\rangle$ from the bra $\langle\lambda_n|$ and combined the latter with another ket $|\lambda_n\rangle$. The ket and bra vectors enclosed in the brackets are in unusual positions: the bra is on the left of the ket, which is opposite to their regular positions in the standard inner product. As you can guess, the expression $\hat{P}^{(n)} = |\lambda_n\rangle\langle\lambda_n|$ is not an inner product, but does it have any sensible meaning at all? In the matrix example of the vectors, this expression corresponds to the situation in which the column vector is written to the left of the row vector, the arrangement used to form the outer or tensor product mentioned in the previous section. Respectively, in the case of abstract generic ket and bra vectors, $|\lambda_n\rangle\langle\lambda_n|$ can be understood as an outer product of two vectors. Naturally, just as the outer product of rows and columns yields a matrix, the outer product of kets and bras generates an operator: indeed, if you bring the split-off ket vector back, you can construct the following expression:

$$\hat{P}^{(n)}|\alpha\rangle = |\lambda_n\rangle\langle\lambda_n|\alpha\rangle, \qquad (3.41)$$

i.e., the result of the action of $\hat{P}^{(n)}$ on $|\alpha\rangle$ is the vector $|\lambda_n\rangle$ multiplied by a number. If $|\alpha\rangle$ and $|\lambda_n\rangle$ were a regular three-dimensional vector and one of the unit vectors specifying a particular direction, correspondingly, you could say that $\hat{P}^{(n)}$ projects $|\alpha\rangle$ on $|\lambda_n\rangle$ and generates the component of $|\alpha\rangle$ in the direction specified by $|\lambda_n\rangle$. It is customary to maintain the same terminology and call the operator $\hat{P}^{(n)}$ a projection operator.

Example 12 (Projection Operators) To get accustomed to working with operators of the form $\hat{P}^{(n)} = |\lambda_n\rangle\langle\lambda_n|$, let me prove the main property of the projection operators,

$$\left[\hat{P}^{(n)}\right]^2 = \hat{P}^{(n)}.$$

Indeed,

$$\left[\hat{P}^{(n)}\right]^2 = |\lambda_n\rangle\langle\lambda_n|\lambda_n\rangle\langle\lambda_n|.$$

The expression in the middle looks like an inner product of a basis vector with itself, and as such it is equal to unity. Thus, we have

$$\left[\hat{P}^{(n)}\right]^2 = |\lambda_n\rangle\langle\lambda_n| = \hat{P}^{(n)}.$$

The expression inside the parentheses in Eq. 3.40 is a sum of projection operators, but most importantly, it is easy to see that this sum is identical to a unity operator: it acts on the vector $|\alpha\rangle$ and generates the same vector. This statement can be written as the following identity:

$$\sum_n |\lambda_n\rangle\langle\lambda_n| = \hat{I}, \qquad (3.42)$$

which is the completeness or closure relation. This is a useful operator identity, which will be frequently used in what follows.
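Both the projection property and the closure relation have direct finite-dimensional analogs, sketched below with NumPy (my illustration, not from the text): the outer product of a normalized eigenvector with itself satisfies $P^2 = P$, and the sum of such projectors over a complete set of eigenvectors reproduces the identity matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
T = A + A.conj().T                    # Hermitian matrix
_, vecs = np.linalg.eigh(T)           # orthonormal eigenvectors as columns

# Outer product |v><v| of a normalized eigenvector is a projector: P^2 = P.
v = vecs[:, [0]]                      # first eigenvector as a column
P = v @ v.conj().T
print(np.allclose(P @ P, P))          # True

# Summing the projectors over the complete set yields the identity (Eq. 3.42).
closure = sum(vecs[:, [n]] @ vecs[:, [n]].conj().T for n in range(4))
print(np.allclose(closure, np.eye(4)))  # True
```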


Not all vector spaces used in quantum mechanics can be described by a discrete
basis, and sometimes we have to use as a basis eigenvectors of operators with
continuous spectrum. I have already discussed this possibility in Sect. 2.3 using
states characterized by a definite value of the particle's position, $|r\rangle$. Now you can associate these states with eigenvectors of a position operator $\hat{r}$. In general, if $|q\rangle$ is an eigenvector of some Hermitian operator with continuous spectrum and $q$ is the respective eigenvalue, you can present an arbitrary state $|\alpha\rangle$ as an integral instead of a sum:

$$|\alpha\rangle = \int dq\, \psi(q)\,|q\rangle. \qquad (3.43)$$

Premultiplying Eq. 3.43 by the bra $\langle q_1|$ and using the orthogonality condition for the continuous spectrum, Eq. 2.41, you will obtain

$$\langle q_1|\alpha\rangle = \int dq\, \psi(q)\langle q_1|q\rangle = \int dq\, \psi(q)\,\delta(q_1 - q) = \psi(q_1). \qquad (3.44)$$

Replacing $\psi(q)$ in Eq. 3.43 with its expression derived in Eq. 3.44, you end up with

$$|\alpha\rangle = \int dq\, |q\rangle\langle q|\alpha\rangle.$$

Considering the expression $\int dq\, |q\rangle\langle q|$ as an operator, you can, similarly to the case of a discrete basis, write

$$\int dq\, |q\rangle\langle q| = \hat{I}. \qquad (3.45)$$

Equation 3.45 constitutes the completeness condition for eigenvectors of operators with continuous spectrum.

Example 13 (Expansion in Terms of a Continuous Basis) To illustrate Eq. 3.43, consider again a linear vector space of integrable functions of a single variable: $|\alpha\rangle \equiv f(x)$. The Fourier transform of this function can be defined as

$$f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dk\, \tilde{f}(k)e^{ikx},$$

where the "coefficient" function $\tilde{f}(k)$ is defined via the inverse transform

$$\tilde{f}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)e^{-ikx}.$$

The role of the continuous basis is played here by the functions

$$|k\rangle \equiv \frac{1}{\sqrt{2\pi}}e^{ikx},$$

which are eigenvectors of the Hermitian operator $-i\,d/dx$ with continuous spectrum consisting of real numbers $k$. These eigenvectors are orthogonal and delta-function normalized:

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} dx\, e^{i(k_1-k)x} = \delta(k - k_1).$$

The completeness condition, Eq. 3.45, for these functions takes the form

$$\frac{1}{2\pi}\int_{-\infty}^{\infty} dk\, e^{i(x_1-x)k} = \delta(x - x_1),$$

with the delta-function $\delta(x - x_1)$ playing the role of the identity operator $\hat{I}$ in this space:

$$\int f(x)\delta(x - x_1)\,dx = f(x_1).$$

Some operators have a mixed spectrum: discrete for one range of eigenvalues and continuous for another. The completeness relation in this case will be a combination of Eqs. 3.42 and 3.45, with a sum over all discrete eigenvectors and an integral over the continuous ones.
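Example 13 can be mimicked on a computer by discretizing the integrals. The sketch below is my own illustration (the grid sizes and the Gaussian test function are arbitrary choices): it computes the "coefficients" of $f(x)$ in the basis $e^{ikx}/\sqrt{2\pi}$ and then reconstructs the function from them, a discretized version of Eqs. 3.43 and 3.44.

```python
import numpy as np

# Grids for x and k; the Gaussian decays fast enough that truncating
# the infinite integrals at +/-10 introduces negligible error.
x = np.linspace(-10, 10, 801)
k = np.linspace(-10, 10, 801)
dx, dk = x[1] - x[0], k[1] - k[0]
f = np.exp(-x**2 / 2)

basis = np.exp(1j * np.outer(k, x)) / np.sqrt(2 * np.pi)  # <x|k> on the grid
ft = (basis.conj() * f).sum(axis=1) * dx        # coefficients f~(k) = <k|f>
f_rec = (ft[:, None] * basis).sum(axis=0) * dk  # f(x) = ∫ dk f~(k) <x|k>

print(np.max(np.abs(f_rec - f)) < 1e-6)  # True: f is recovered
```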

3.3.2 Quantization Postulate

Most physical observables can be constructed from just two elements: position vector $r$ and momentum $p$. I have already introduced states with definite values of the position vector, $|r\rangle$, which are supposed to be eigenvectors of a respective Hermitian operator $\hat{r}$. Similarly, I can introduce states with definite values of momentum, $|p\rangle$, which are supposed to be eigenvectors of the Hermitian momentum operator $\hat{p}$. The first question, of course, is what these operators do to quantum states. You could have guessed the answer for the states represented by eigenvectors of the respective operators: $\hat{r}|\tilde{r}\rangle = \tilde{r}|\tilde{r}\rangle$, $\hat{p}|\tilde{p}\rangle = \tilde{p}|\tilde{p}\rangle$, where I placed tildes over $r$ and $p$ to better distinguish the symbols of the respective operators from their eigenvalues and eigenvectors. Using these results, I can compute expressions like $\hat{r}|\alpha\rangle$ or $\hat{p}|\alpha\rangle$ by expanding the state $|\alpha\rangle$ in terms of eigenvectors of the respective operators. For instance, by presenting

$$|\alpha\rangle = \int d\tilde{r}\, \psi(\tilde{r})\,|\tilde{r}\rangle,$$


I can find

$$\hat{r}|\alpha\rangle = \int d\tilde{r}\, \psi(\tilde{r})\,\hat{r}|\tilde{r}\rangle = \int d\tilde{r}\, \psi(\tilde{r})\,\tilde{r}\,|\tilde{r}\rangle.$$

Similar treatment for the momentum operator yields

$$\hat{p}|\alpha\rangle = \int d\tilde{p}\, \varphi(\tilde{p})\,\hat{p}|\tilde{p}\rangle = \int d\tilde{p}\, \varphi(\tilde{p})\,\tilde{p}\,|\tilde{p}\rangle.$$

The problem arises when both position and momentum operators appear in the same expression, and we have to figure out how to operate, say, with $\hat{p}$ on a state expanded in terms of eigenvectors of $\hat{r}$, or vice versa. I will discuss this issue later in the book, in the section devoted to "representations" of the state vectors and operators. For now I would just like to say that the solution to this problem depends on the fundamental assumptions about commutation relations involving position and momentum operators. Essentially, the quantization procedure, i.e., the rules determining how to replace classical observables with their representation as quantum operators, consists in the postulation of these commutation relations. You will see many times in this text that the knowledge of the commutators of various operators is all you need to perform quantum mechanical calculations. So, please meet the fundamental commutation relations of quantum mechanics.

Postulate 3 (Quantization Postulate) Operators corresponding to the various Cartesian components of the position vector and momentum obey the following commutation relations:

$$\left[\hat{r}_i, \hat{r}_j\right] = 0; \qquad \left[\hat{p}_i, \hat{p}_j\right] = 0, \qquad (3.46)$$
$$\left[\hat{r}_i, \hat{p}_j\right] = i\hbar\delta_{i,j}, \qquad (3.47)$$

where the subindexes take values $1, 2, 3$ indicating the $x, y, z$ Cartesian components of the position and momentum vectors, respectively.

The first of the commutators in Eq. 3.46 indicates that the Cartesian components
of the position vectors are mutually consistent observables. In other words, it means
that if a system is in the state with a certain position, all three components of the
position vector are well-defined. The same is true for the vector of momentum as
expressed by the second of the commutators in Eq. 3.46. These commutators reflect our desire, born of empirical experience, for the position and momentum of quantum systems to be genuinely well-defined quantities, at least when measured independently of each other.

The commutators presented in Eq. 3.47 are often called canonical commutation relations, and they also express our empirical experience, namely, the fact that the same Cartesian components of the position and momentum vectors of a quantum system are not mutually consistent observables and cannot, therefore, be described by


commuting operators. The actual form of the commutator is chosen to reproduce Heisenberg's uncertainty principle, which is discussed in the next section. You will also see later that the empirical foundation for this form of the commutator can be traced to the de Broglie relation, Eq. 1.3. It is interesting to note a striking similarity between the commutators given in Eqs. 3.46 and 3.47 and the canonical Poisson brackets of classical mechanics, Eq. 3.5. This similarity lies at the foundation of the so-called canonical quantization rule: any classically conjugate quantities satisfying Eq. 3.5 are promoted in quantum theory to quantum operators obeying the canonical commutation relation 3.47. Therefore, canonically conjugate variables never belong to the same class of mutually consistent observables and are found on the opposite sides of the Bohr complementarity principle.
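The canonical commutator of Eq. 3.47 can be probed numerically by representing $\hat{p}$ as $-i\hbar\, d/dx$ on a grid. This representation is justified later in the text; here it is simply assumed for the sake of the sketch (mine, with $\hbar = 1$). Acting with $\hat{x}\hat{p} - \hat{p}\hat{x}$ on an arbitrary smooth function returns $i\hbar$ times that function, away from the grid edges:

```python
import numpy as np

hbar = 1.0
x = np.linspace(-5, 5, 2001)
psi = np.exp(-x**2)                      # arbitrary smooth test function

def p_op(f):
    # momentum operator -i*hbar*d/dx via central differences
    return -1j * hbar * np.gradient(f, x)

# Commutator [x, p] acting on psi: x(p psi) - p(x psi)
comm = x * p_op(psi) - p_op(x * psi)

# Away from the grid edges this equals i*hbar*psi
inner = slice(10, -10)
print(np.allclose(comm[inner], 1j * hbar * psi[inner], atol=1e-3))  # True
```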

3.3.3 Constructing the Observables: A Few Important
Examples

Using coordinate and momentum operators, I can construct operators for other
observables, which is done according to the standard quantization rule.

Quantization Rule To turn a classical observable into an operator, replace all coordinates and momenta appearing in its classical definition with the corresponding operators, respecting the requirements of hermiticity and the order of multiplication, when necessary.

In many situations, the issues related to hermiticity or to the multiplication order of observables are resolved automatically, but in some cases one needs to pay special attention to them. To get you started, consider a few of the simplest examples.

Kinetic Energy
The kinetic energy of a single particle with mass $m_e$ is described by the operator

$$\hat{K} = \frac{\hat{p}^2}{2m_e},$$

which is obtained from the corresponding classical expression by replacing classical
momentum with the momentum operator. The eigenvectors of this operator coincide
with the eigenvectors of the momentum operator, and its eigenvalues, which form
a continuous spectrum, provide values of kinetic energy that can be observed for a
system under study.

Potential Energy
Potential energy is obtained from the respective classical potential energy func-
tion by replacing classical coordinate argument of the function with its operator
equivalent: U.r/ ! U .Or/. It is assumed here, of course, that the potential energy
function can be presented as a series of positive and negative powers of r, in

70 3 Observables and Operators

which case the corresponding operator expression would have an easily identifiable
meaning. Examples of such transformations are one-dimensional harmonic potential
(kx2 ! kOx2) and Coulomb potential (k=r ! kOr�1), where r is the absolute value
of the position vector.4 The eigenvectors of this operator are the same as of the
position operator, and the respective eigenvalues determine the possible values of
the potential energy of the system.

Hamiltonian
The Hamiltonian, which in classical mechanics is defined as the energy of the system expressed in terms of canonically conjugated coordinates and momenta, in quantum mechanics becomes, in the single-particle case, an operator of the form

$$\hat{H} = \frac{\hat{p}^2}{2m} + U(\hat{r}). \qquad (3.48)$$

Since the position and momentum operators do not commute, the eigenvectors of the Hamiltonian are usually different from the eigenvectors of both the position and momentum operators. The eigenvalues of the Hamiltonian can belong to a discrete, continuous, or mixed spectrum and determine the values of energy which the system can have in the given environment. This is the most important operator in all of quantum physics: just like the classical Hamiltonian, its quantum counterpart controls the dynamics of quantum objects.
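As a numerical illustration of Eq. 3.48 (mine, not the author's; it assumes the coordinate representation introduced later in the text and sets $\hbar = m = 1$), one can discretize the harmonic oscillator Hamiltonian with $U(x) = x^2/2$ on a grid and diagonalize the resulting matrix; the lowest eigenvalues approach the familiar $n + 1/2$ ladder:

```python
import numpy as np

# Discretize H = p^2/2m + x^2/2 on a grid (hbar = m = omega = 1)
# and diagonalize; low eigenvalues should approach n + 1/2.
N, L = 1000, 20.0
x = np.linspace(-L / 2, L / 2, N)
h = x[1] - x[0]

# Kinetic energy via the standard three-point Laplacian.
T = (np.diag(np.full(N, 2.0)) - np.diag(np.ones(N - 1), 1)
     - np.diag(np.ones(N - 1), -1)) / (2 * h**2)
V = np.diag(x**2 / 2)

E = np.linalg.eigvalsh(T + V)
print(np.allclose(E[:4], [0.5, 1.5, 2.5, 3.5], atol=1e-2))  # True
```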

Angular Momentum
Angular momentum is a very special kind of observable. Classical angular momentum is a vector defined as the cross product of the position and momentum vectors, $L = r \times p$. The quantization rule requires that the quantum mechanical angular momentum operator be constructed by promoting the position and momentum vectors to the corresponding operators:

$$\hat{L} = \hat{r} \times \hat{p}. \qquad (3.49)$$

However, since this expression involves the product of potentially non-commuting operators, one has to be careful with the order of the multiplication. One also needs to make sure that the resulting operator is Hermitian. To address both these concerns, I will expand the angular momentum vector in its Cartesian components:

$$\hat{L}_x = \hat{y}\hat{p}_z - \hat{z}\hat{p}_y \qquad (3.50)$$

4This transformation is not as trivial as it might seem, since taking the absolute value of a vector involves the operation of a square root, which is not well defined for operators. Practically it is not a problem, however, because usually one works in the basis of the eigenvectors of the position operator, in which case $\hat{r}^{-1}$ becomes simply $1/r$. If you are not concerned with any of this, this note is not for you. I mention it here simply in order to avoid accusations of sweeping something under the rug.


$$\hat{L}_y = \hat{z}\hat{p}_x - \hat{x}\hat{p}_z \qquad (3.51)$$
$$\hat{L}_z = \hat{x}\hat{p}_y - \hat{y}\hat{p}_x. \qquad (3.52)$$

(One can use as a useful mnemonic device the representation of the vector product as a determinant:

$$r \times p \equiv \begin{vmatrix} e_x & e_y & e_z \\ x & y & z \\ p_x & p_y & p_z \end{vmatrix},$$

where the first row is formed by the unit vectors defining the corresponding axes of a Cartesian coordinate system.)

The first thing to notice in Eqs. 3.50–3.52 is that the operators that are actually being multiplied correspond to commuting components of the position and momentum vectors; thus, the order in which you place these operators is not important. Next, you need to verify that each of the components of the angular momentum operator is a Hermitian operator. Hermitian conjugation of, e.g., the x-component yields

$$\hat{L}_x^\dagger = (\hat{y}\hat{p}_z)^\dagger - (\hat{z}\hat{p}_y)^\dagger = \hat{p}_z\hat{y} - \hat{p}_y\hat{z} = \hat{L}_x,$$

proving the hermiticity of this operator. Similarly, you can demonstrate the Hermitian nature of the two other components. The most unusual property of the angular momentum, however, is that different components of the angular momentum do not commute. To illustrate this point, compute the commutator $[\hat{L}_x, \hat{L}_y]$:

$$\left[\hat{L}_x, \hat{L}_y\right] = \left(\hat{y}\hat{p}_z - \hat{z}\hat{p}_y\right)\left(\hat{z}\hat{p}_x - \hat{x}\hat{p}_z\right) - \left(\hat{z}\hat{p}_x - \hat{x}\hat{p}_z\right)\left(\hat{y}\hat{p}_z - \hat{z}\hat{p}_y\right) =$$
$$\hat{y}\hat{p}_z\hat{z}\hat{p}_x + \hat{z}\hat{p}_y\hat{x}\hat{p}_z - \hat{z}\hat{p}_y\hat{z}\hat{p}_x - \hat{y}\hat{p}_z\hat{x}\hat{p}_z - \hat{z}\hat{p}_x\hat{y}\hat{p}_z - \hat{x}\hat{p}_z\hat{z}\hat{p}_y + \hat{z}\hat{p}_x\hat{z}\hat{p}_y + \hat{x}\hat{p}_z\hat{y}\hat{p}_z =$$
$$\hat{y}\hat{p}_x\left(\hat{p}_z\hat{z} - \hat{z}\hat{p}_z\right) + \hat{p}_y\hat{x}\left(\hat{z}\hat{p}_z - \hat{p}_z\hat{z}\right) = i\hbar\left(\hat{p}_y\hat{x} - \hat{y}\hat{p}_x\right) = i\hbar\hat{L}_z, \qquad (3.53)$$

where, when transitioning from the second line to the third, I took into account that different components of the coordinate and momentum operators do commute, so that their order can be changed at will (the terms containing $\hat{z}^2$ and $\hat{p}_z^2$ cancel out). Similarly, you will find (do it!)

$$\left[\hat{L}_z, \hat{L}_x\right] = i\hbar\hat{L}_y \qquad (3.54)$$
$$\left[\hat{L}_y, \hat{L}_z\right] = i\hbar\hat{L}_x. \qquad (3.55)$$


These results indicate that the vector of the angular momentum in quantum theory
is quite different from regular classical vectors as well as from vector operators
of position and momentum: different components of this vector do not belong to
the same group of mutually commuting operators and do not represent mutually
consistent observables, meaning that this vector is not really well-defined. More
specifically, if a quantum system is in a state in which one of the Cartesian
components of the angular momentum is known with certainty, measurements of
two other components will produce statistically uncertain results. This conclusion,
in addition to making the direction of the angular momentum vector uncertain, also
raises a question about its magnitude. Indeed, the magnitude of a generic classical

3-D vector is defined as $|A| = \sqrt{A_x^2 + A_y^2 + A_z^2}$. Formal quantization of this expression is not possible because the square root of an operator, $\sqrt{\hat{A}_x^2 + \hat{A}_y^2 + \hat{A}_z^2}$, is not a well-defined object. In the case of the position and momentum operators, this problem did not arise because different components of these operators commute, so that one can always choose a coordinate system in which all but one component of the position or momentum operator are equal to zero. The possible values of the remaining non-zero component then define the magnitude of the entire vector. This approach is not possible in the case of angular momentum because of the incompatibility of its components. This problem is circumvented by choosing the operator of the square of the angular momentum, defined as

$$\hat{L}^2 = \hat{L}_x^2 + \hat{L}_y^2 + \hat{L}_z^2, \qquad (3.56)$$

to represent its magnitude. Computing the commutators $[\hat{L}^2, \hat{L}_{x,y,z}]$, you will find that all three commutators vanish. (The proof of this statement is left to you as an exercise.)
This means that the operators of the square of the angular momentum and one (any) component of the angular momentum are compatible observables, so that a quantum system can be created in a state in which one of the components and the magnitude of the angular momentum are known with certainty. Obviously such a state would be a common eigenvector of $\hat{L}^2$ and $\hat{L}_z$.
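The commutation relations 3.53–3.55 and the compatibility of $\hat{L}^2$ with $\hat{L}_z$ can be verified on explicit matrices. The 3×3 matrices below are a standard representation of the angular momentum components for the case $l = 1$ (taken as given for this sketch of mine; the eigenvalue structure behind them is derived in the next section):

```python
import numpy as np

hbar = 1.0
# Matrix representation of the angular momentum components for l = 1,
# in the basis of Lz eigenvectors with m = 1, 0, -1 (standard form).
s = hbar / np.sqrt(2)
Lx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Ly = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Lz = hbar * np.diag([1, 0, -1]).astype(complex)

comm = lambda A, B: A @ B - B @ A
print(np.allclose(comm(Lx, Ly), 1j * hbar * Lz))    # True: Eq. 3.53
print(np.allclose(comm(Lz, Lx), 1j * hbar * Ly))    # True: Eq. 3.54
print(np.allclose(comm(Ly, Lz), 1j * hbar * Lx))    # True: Eq. 3.55

L2 = Lx @ Lx + Ly @ Ly + Lz @ Lz
print(np.allclose(comm(L2, Lz), np.zeros((3, 3))))  # True: [L^2, Lz] = 0
```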
Quantization of p � r
As a last example, consider a classical expression of the form p � r, which appears in
some applications. An attempt to directly transform this expression in the quantum
form by promoting the momentum and position vectors to operators faces two
obstacles. First, the operators in this expression do not commute, and so it is
unclear what is the correct order of multiplication. Second, even if I arbitrarily
impose a particular order, say, Op � Or, the resulting operator is not Hermitian because
.Op � Or/� D Or � Op ¤ Op � Or. To carry out the quantization procedure in this case, you need
to come up with an expression, which would coincide with its original classical
version but would not depend on the order of the operators, and be Hermitian. One
way to achieve this is to introduce operator

1

2
.Op � Or C Or�Op/

3.3 Operators and Observables 73

which satisfies all these conditions. However, this quantization procedure is not
unique, and it might (and does) create problems down the road, but luckily for us
this is not the road I choose for us to travel.

3.3.4 Eigenvalues of the Angular Momentum

The operators of the angular momentum play an extraordinary role in quantum
theory, both on the fundamental level and for applications. The fundamental role
of the angular momentum is derived from its relation to the rotation operator and
rotational symmetry of quantum systems, but discussion of this topic is well above
your pay grade. Those interested in the topic are free to consult any graduate level
quantum mechanics text. From the point of view of applications, the importance
of the angular momentum stems from the fact that many fundamental interactions
in nature are described by so-called central potentials. The potential energy of
such interactions depends only on the absolute value of the distance between two
interacting particles, but not on the orientation of the vector of their relative position.
This text is mostly concerned with quantum mechanics of a single particle in an
external potential (a two-particle problem can often be presented in this form as
well). If the external potential belongs to the class of central potentials, it can be
shown that the Hamiltonian of such a system commutes with all components of the
angular momentum as well as with the operator $\hat{L}^2$. The proof of this statement requires proving it separately for the kinetic energy operator (essentially for the operator $\hat{p}^2$) and for the potential energy operator $V(\hat{r})$. I believe that the readers of this text are already equipped to prove that $[\hat{L}_{x,y,z}, \hat{p}^2] = [\hat{L}^2, \hat{p}^2] = 0$, so I leave it to you as an exercise. As far as the commutators with the potential energy operator go, this proof will have to be left till later.

Vanishing of the commutators of the angular momentum operators and the Hamiltonian means that the Hamiltonian, $\hat{L}^2$, and one of the components of the angular momentum form a system of commuting operators and that the eigenvectors of $\hat{L}^2$ and, say, $\hat{L}_z$ are also eigenvectors of the Hamiltonian. This fact can significantly simplify finding eigenvalues and eigenvectors of the Hamiltonian.

It is also remarkable that the eigenvalues of $\hat{L}^2$ and, for instance, $\hat{L}_z$ can be found using only the commutation relations given by Eqs. 3.53–3.55. The choice of the z-component here is a historical accident and does not have any physical significance. By choosing this particular component, which, you should understand, is attached to a particular choice of the coordinate system, we essentially say to the experimentalists that if the quantum system is in a state described by an eigenvector of $\hat{L}_z$ as defined by this coordinate system, then a measurement of the component of the angular momentum in the same direction will produce the result corresponding to the respective eigenvalue with certainty, while measurements of any other component of the angular momentum will have quantum uncertainty.


I begin the search for the eigenvalues by introducing abstract vectors $|\lambda_L, \lambda_z\rangle$ defined as common eigenvectors of the operators $\hat{L}^2$ and $\hat{L}_z$, characterized by some yet unknown eigenvalues $\lambda_L$ and $\lambda_z$:

$$\hat{L}^2|\lambda_L, \lambda_z\rangle = \lambda_L|\lambda_L, \lambda_z\rangle, \qquad (3.57)$$
$$\hat{L}_z|\lambda_L, \lambda_z\rangle = \lambda_z|\lambda_L, \lambda_z\rangle. \qquad (3.58)$$

It is convenient to present these eigenvalues as $\lambda_L = \hbar^2 p$ and $\lambda_z = \hbar m$. Pulling out the factors $\hbar^2$ and $\hbar$ from the eigenvalues of $\hat{L}^2$ and $\hat{L}_z$, respectively, makes the remaining quantities $p$ and $m$ dimensionless, since the dimension of the angular momentum is the same as that of Planck's constant. Apparently, I will need to invoke, somehow, the two remaining components of the angular momentum. It is not immediately obvious how to do it, but let's say that I have had a divine intervention or premonition that the following two new operators might be useful:

$$\hat{L}_+ = \hat{L}_x + i\hat{L}_y, \qquad (3.59)$$
$$\hat{L}_- = \hat{L}_x - i\hat{L}_y. \qquad (3.60)$$

The first thing I need to do with these operators is to compute their commutators with the operators $\hat{L}^2$ and $\hat{L}_z$:

$$\left[\hat{L}_+, \hat{L}_z\right] = \left[\hat{L}_x, \hat{L}_z\right] + i\left[\hat{L}_y, \hat{L}_z\right] = -i\hbar\hat{L}_y - \hbar\hat{L}_x = -\hbar\hat{L}_+, \qquad (3.61)$$
$$\left[\hat{L}_-, \hat{L}_z\right] = \left[\hat{L}_x, \hat{L}_z\right] - i\left[\hat{L}_y, \hat{L}_z\right] = -i\hbar\hat{L}_y + \hbar\hat{L}_x = \hbar\hat{L}_-. \qquad (3.62)$$

It is also easy to see that the commutators $[\hat{L}^2, \hat{L}_\pm]$ vanish. Indeed, $\hat{L}^2$ commutes with all component operators and, therefore, with $\hat{L}_\pm$, which are combinations of $\hat{L}_x$ and $\hat{L}_y$. Now, new operators for a theoretician are like new toys for a child, and I am eager to play with them and see what they can do. So, to satisfy the urge, and in hopes of learning something new, I want to apply the operators $\hat{L}_\pm$ to Eq. 3.57:

$$\hat{L}_\pm\hat{L}^2|\lambda_L, \lambda_z\rangle = \hat{L}^2\hat{L}_\pm|\lambda_L, \lambda_z\rangle = \hbar^2 p\,\hat{L}_\pm|\lambda_L, \lambda_z\rangle, \qquad (3.63)$$

where I used $\hat{L}_\pm\hat{L}^2 = \hat{L}^2\hat{L}_\pm$. OK, and what did we learn from this exercise? Well, I know now that if $|\lambda_L, \lambda_z\rangle$ is an eigenvector of $\hat{L}^2$ with eigenvalue $\hbar^2 p$, then the vector $\hat{L}_\pm|\lambda_L, \lambda_z\rangle$ is still an eigenvector of $\hat{L}^2$ with the same eigenvalue, which is not really surprising because $\hat{L}_\pm$ do commute with $\hat{L}^2$. So far, it is not much, and you would be right to say that so far the operators $\hat{L}_\pm$ have not given us any particular advantages, because we would have gotten the same result with the operators $\hat{L}_{x,y}$. But let's not jump the gun (always a bad idea) while patience and persistence are virtues. Instead, let me play another game and apply $\hat{L}_+$ to Eq. 3.58:

$$\hat{L}_+\hat{L}_z|\lambda_L, \lambda_z\rangle = \hbar m\,\hat{L}_+|\lambda_L, \lambda_z\rangle,$$


$$\left(\hat{L}_z\hat{L}_+ - \hbar\hat{L}_+\right)|\lambda_L, \lambda_z\rangle = \hbar m\,\hat{L}_+|\lambda_L, \lambda_z\rangle,$$
$$\hat{L}_z\hat{L}_+|\lambda_L, \lambda_z\rangle = \hbar(m+1)\,\hat{L}_+|\lambda_L, \lambda_z\rangle, \qquad (3.64)$$

where I used commutation relation 3.61 to make the transition from the first to the second line. Now, the last line in Eq. 3.64 tells us that $\hat{L}_+|\lambda_L, \lambda_z\rangle$ is an eigenvector of $\hat{L}_z$ with eigenvalue $\hbar m + \hbar$. This is a quite exciting result: it means that if I start with some eigenvector with a known eigenvalue, I can generate new eigenvectors with progressively increasing eigenvalues: $\hbar m + \hbar, \hbar m + 2\hbar, \hbar m + 3\hbar, \ldots$ This is already something new, which we could not have gotten without the operator $\hat{L}_+$. The secret of this operator lies in its commutator with $\hat{L}_z$, which is proportional to $\hat{L}_+$ itself. The same is true for the operator $\hat{L}_-$, so it is worth looking into what this operator can do:

$$\hat{L}_-\hat{L}_z|\lambda_L, \lambda_z\rangle = \hbar m\,\hat{L}_-|\lambda_L, \lambda_z\rangle,$$
$$\left(\hat{L}_z\hat{L}_- + \hbar\hat{L}_-\right)|\lambda_L, \lambda_z\rangle = \hbar m\,\hat{L}_-|\lambda_L, \lambda_z\rangle,$$
$$\hat{L}_z\hat{L}_-|\lambda_L, \lambda_z\rangle = \hbar(m-1)\,\hat{L}_-|\lambda_L, \lambda_z\rangle. \qquad (3.65)$$

When deriving Eq. 3.65, I again applied the commutator from Eq. 3.62 to its first line. The final result of this calculation indicates that the operator $\hat{L}_-$ also generates new eigenvectors of $\hat{L}_z$, but with progressively decreasing eigenvalues. Not surprisingly, the operators $\hat{L}_+$ and $\hat{L}_-$ are called raising and lowering (ladder) operators.
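The raising action of $\hat{L}_+$ is easy to watch on explicit 3×3 matrices representing the $l = 1$ angular momentum components (a standard representation, assumed here for my sketch rather than derived): applied to the eigenvector of $\hat{L}_z$ with $m = 0$, it produces an eigenvector with $m = 1$.

```python
import numpy as np

hbar = 1.0
# l = 1 matrices in the basis of Lz eigenvectors with m = 1, 0, -1
s = hbar / np.sqrt(2)
Lx = s * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Ly = s * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Lz = hbar * np.diag([1, 0, -1]).astype(complex)
Lp = Lx + 1j * Ly  # raising operator L+

v0 = np.array([0, 1, 0], dtype=complex)  # eigenvector of Lz with m = 0
w = Lp @ v0
# L+ maps the m = 0 eigenvector to one with m = 1 (up to normalization):
print(np.allclose(Lz @ w, hbar * 1 * w))  # True
```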

Now, the question arises: will this process of generating new eigenvectors and eigenvalues ever stop? In other words, can the operator $\hat{L}_z$ have arbitrarily large and arbitrarily small eigenvalues? Intuitively, it is clear that the answer to this question must be negative and that the possible eigenvalues of $\hat{L}_z$ must be limited both from above and from below. Indeed, these eigenvalues represent possible results of the measurement of one component of a vector, while the eigenvalues of $\hat{L}^2$ represent possible experimentally observable values of the squared magnitude of the same vector. It is difficult to imagine that a component of a vector can be larger than the magnitude of the same vector, and therefore one should expect that there must be some kind of relation between these two eigenvalues, e.g., something like $m^2 < p$. In order to see if such a relation indeed exists, consider the following expression:

$$\langle\lambda_L, \lambda_z|\hat{L}^2|\lambda_L, \lambda_z\rangle = \langle\lambda_L, \lambda_z|\hat{L}_x^2|\lambda_L, \lambda_z\rangle + \langle\lambda_L, \lambda_z|\hat{L}_y^2|\lambda_L, \lambda_z\rangle + \langle\lambda_L, \lambda_z|\hat{L}_z^2|\lambda_L, \lambda_z\rangle.$$

Taking into account Eqs. 3.57 and 3.58, this can be written as

$$\hbar^2 p = \langle\lambda_L, \lambda_z|\hat{L}_x^2|\lambda_L, \lambda_z\rangle + \langle\lambda_L, \lambda_z|\hat{L}_y^2|\lambda_L, \lambda_z\rangle + \hbar^2 m^2. \qquad (3.66)$$


Since the expectation values of the operators $\hat{L}_x^2$ and $\hat{L}_y^2$ in any state are positive quantities, Eq. 3.66 yields $p > m^2$. This means that there exists a smallest $m$, which I will designate as $\bar{l}$, and a largest $m$, for which I will use the symbol $l$. Now, assume that you are dealing with the eigenvector $|\lambda_L, \hbar\bar{l}\rangle$ and apply the operator $\hat{L}_-$ to it. Generally speaking, this operator must lower the eigenvalue, but we assumed that this eigenvalue is already the lowest. The only way to reconcile Eq. 3.65 with this assumption is to require that

$$\hat{L}_-|\lambda_L, \hbar\bar{l}\rangle = 0. \qquad (3.67)$$

In order to figure out how to use this important piece of information, I again need a bit of divine inspiration, or I can just notice that the product of operators $\hat{L}_+\hat{L}_-$ can be expressed in terms of the operators $\hat{L}^2$ and $\hat{L}_z$:

$$\hat{L}_+\hat{L}_- = \hat{L}_x^2 + \hat{L}_y^2 + i\hat{L}_y\hat{L}_x - i\hat{L}_x\hat{L}_y = \hat{L}^2 - \hat{L}_z^2 + \hbar\hat{L}_z.$$

Rewriting this expression as

OL2 D OL2z � „OLz C OLC OL�; (3.68)

and applying it to the vector $|\lambda_L, \hbar\bar{l}\rangle$ while taking into account Eq. 3.67, I obtain

$$\hat{L}^2|\lambda_L, \hbar\bar{l}\rangle = \hat{L}_z^2|\lambda_L, \hbar\bar{l}\rangle - \hbar\hat{L}_z|\lambda_L, \hbar\bar{l}\rangle + \hat{L}_+\hat{L}_-|\lambda_L, \hbar\bar{l}\rangle \Rightarrow$$
$$\hbar^2 p\,|\lambda_L, \hbar\bar{l}\rangle = \hbar^2\bar{l}^2|\lambda_L, \hbar\bar{l}\rangle - \hbar^2\bar{l}\,|\lambda_L, \hbar\bar{l}\rangle \Rightarrow$$
$$p = \bar{l}^2 - \bar{l}. \tag{3.69}$$

Now, consider the state characterized by the largest value of $m$, $|\lambda_L, \hbar l\rangle$. Attempting
to act on this vector with the operator $\hat{L}_+$ leaves you with the same conundrum
encountered when discussing the vector $|\lambda_L, \hbar\bar{l}\rangle$, but by now you know the way out:
you must require that

$$\hat{L}_+|\lambda_L, \hbar l\rangle = 0. \tag{3.70}$$

The derivation of Eq. 3.69 based on Eq. 3.67 was successful because the lowering
operator $\hat{L}_-$ appears in the product after (to the right of) the operator $\hat{L}_+$. Consequently, when the
product $\hat{L}_+\hat{L}_-$ is made to act on $|\lambda_L, \hbar\bar{l}\rangle$, the resulting expression vanishes. In order
to achieve the same effect with the state $|\lambda_L, \hbar l\rangle$ and Eq. 3.70, I need to modify Eq. 3.68
in such a way that it would contain the combination $\hat{L}_-\hat{L}_+$ instead of $\hat{L}_+\hat{L}_-$. To achieve
this, consider

$$\hat{L}_-\hat{L}_+ = \hat{L}_x^2 + \hat{L}_y^2 - i\hat{L}_y\hat{L}_x + i\hat{L}_x\hat{L}_y = \hat{L}^2 - \hat{L}_z^2 - \hbar\hat{L}_z, \tag{3.71}$$


which can be rewritten in the desired form

$$\hat{L}^2 = \hat{L}_z^2 + \hbar\hat{L}_z + \hat{L}_-\hat{L}_+. \tag{3.72}$$

Now, applying $\hat{L}^2$ to $|\lambda_L, \hbar l\rangle$ and using Eqs. 3.72 and 3.70, I get

$$p = l^2 + l. \tag{3.73}$$

Comparing Eq. 3.69 with Eq. 3.73, I infer that the smallest and largest eigenvalues of $\hat{L}_z$
are related to each other as

$$l^2 + l = \bar{l}^2 - \bar{l}.$$

It is easy to see (one can always just solve the quadratic equation for $\bar{l}$) that this
relation implies that $\bar{l} = -l$ or $\bar{l} = l + 1$. The latter solution contradicts
the assumption that $\bar{l}$ is the smallest eigenvalue and $l$ is the largest; thus the only
possibility that makes sense is $\bar{l} = -l$.

Now imagine that you have found the smallest eigenvalue $-l$ and you start
applying the operator $\hat{L}_+$ to the state $|\lambda_L, -\hbar l\rangle$. After each application of the operator, the
eigenvalue of $\hat{L}_z$ increases by $\hbar$ (i.e., $m$ increases by one), so that after applying it $N$ times, you end up with
the value $-l + N$. Eventually you must reach the largest value $l$, at which
point you will have $-l + N = l \Rightarrow 2l = N$. Since $N$ is an integer, $l$
can be either an integer, if $N$ is even, or a half-integer, if $N$ is odd.

Now, let us gather our thoughts and try to summarize what we have got:

1. The eigenvalue of the operator $\hat{L}^2$ is equal to $\hbar^2 l(l+1)$, where $l$ determines the
maximum eigenvalue of the operator $\hat{L}_z$, $\hbar l$.

2. $l$ can take either integer or half-integer values, forming two non-overlapping
series of allowed values: $0, 1, 2, 3, \dots$ or $1/2, 3/2, 5/2, \dots$.

3. Allowed values of $m$ start at $-l$ and increase in steps of one until they reach $l$.
For instance, for $l = 0$, the only possible value of $m$ is zero; for $l = 1/2$, $m$ can
be $-1/2, 1/2$; and for $l = 1$, we can have states with $m = -1, 0, 1$. In general,
for states characterized by the same eigenvalue $\hbar^2 l(l+1)$ of the operator $\hat{L}^2$, there
are $2l + 1$ possible states with different eigenvalues of $\hat{L}_z$.
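The counting in this list is easy to verify mechanically. The following short sketch (an editorial illustration; the function name `m_values` is not from the text) enumerates the allowed values of $m$ for a few integer and half-integer $l$ and checks that each multiplet contains $2l + 1$ states bounded by $\pm l$:

```python
from fractions import Fraction

def m_values(l):
    """Return the allowed quantum numbers m = -l, -l+1, ..., l."""
    l = Fraction(l)
    # m starts at -l and increases in steps of one until it reaches l
    return [-l + k for k in range(int(2 * l) + 1)]

for l in [0, Fraction(1, 2), 1, Fraction(3, 2), 2]:
    ms = m_values(l)
    assert len(ms) == 2 * l + 1          # 2l + 1 states in each multiplet
    assert ms[0] == -l and ms[-1] == l   # values bounded by -l and l
    print(l, ms)
```

Using exact `Fraction` arithmetic keeps the half-integer cases free of floating-point artifacts.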

It is interesting to note that if I were talking about a classical vector, the maximum
magnitude of its component along an axis would simply equal the length of the
vector. If we interpreted the quantity $\hbar l$ as such a length, then the squared
length of the entire vector would have been $\hbar^2 l^2$, which is different from the quantum
result $\hbar^2 l^2 + \hbar^2 l$. One can see that the "extra" contribution to the "length" comes
from fluctuations of the two other components of the angular momentum. Indeed, using
what you have learned from Eq. 3.66, you can write

$$\hbar^2 l^2 + \hbar^2 l = \hbar^2 l^2 + \langle l, l|\hat{L}_x^2|l, l\rangle + \langle l, l|\hat{L}_y^2|l, l\rangle \Rightarrow$$
$$\langle l, l|\hat{L}_x^2|l, l\rangle + \langle l, l|\hat{L}_y^2|l, l\rangle = \hbar^2 l \Rightarrow$$
$$\langle l, l|\hat{L}_x^2|l, l\rangle = \langle l, l|\hat{L}_y^2|l, l\rangle = \hbar^2 l/2.$$

In the last expression, I introduced a shortcut notation for the common eigenvectors
of the operators $\hat{L}^2$ and $\hat{L}_z$, which in general looks like $|l, m\rangle$, with the first number
indicating that this vector belongs to the eigenvalue $\hbar^2 l(l+1)$ of $\hat{L}^2$ and the second
number pointing at the eigenvalue $\hbar m$ of $\hat{L}_z$. For brevity, $l$ is often referred to as
the "angular momentum," and $m$ is often called a "magnetic" quantum number. The
origin of this name will become clear later, when we get to consider the behavior of
atoms in a magnetic field.

Finally, let me note that even though we know now that the ladder operators $\hat{L}_\pm$
generate eigenvectors of $\hat{L}_z$, there is no guarantee that the resulting eigenvectors
will be normalized even if the initial vector is. So, in order to finalize the rule for
obtaining normalized eigenvectors using ladder operators, we have to analyze their
action more carefully. First, it is easy to see that they are Hermitian conjugates of
each other:

$$\hat{L}_- = \hat{L}_+^\dagger. \tag{3.74}$$

Assuming that the vectors $|l, m\rangle$ and $|l, m+1\rangle$ are normalized and introducing a yet
unknown normalization coefficient, I can write

$$\hat{L}_+|l, m\rangle = A_{l,m}|l, m+1\rangle.$$

The Hermitian conjugation of this expression yields

$$\langle l, m|\hat{L}_- = A_{l,m}^*\langle l, m+1|.$$

Multiplying the left-hand side of this equation by the left-hand side of the previous
one and doing the same to their right-hand sides yields

$$\langle l, m|\hat{L}_-\hat{L}_+|l, m\rangle = A_{l,m}^* A_{l,m}\langle l, m+1|l, m+1\rangle.$$

Since it was assumed that all ket vectors are normalized, I now immediately have
for $|A_{l,m}|^2$:

$$|A_{l,m}|^2 = \langle l, m|\hat{L}_-\hat{L}_+|l, m\rangle.$$

Taking into account Eq. 3.71, and the fact that the kets in this expression are eigenvectors
of $\hat{L}^2$ and $\hat{L}_z$, I find

$$|A_{l,m}|^2 = \hbar^2\left[l(l+1) - m(m+1)\right],$$

which allows us to establish the final rule for the generation of new eigenvectors from
the known ones:

$$\hat{L}_+|l, m\rangle = \hbar\sqrt{l(l+1) - m(m+1)}\,|l, m+1\rangle. \tag{3.75}$$

I will leave it to you to show that

$$\hat{L}_-|l, m\rangle = \hbar\sqrt{l(l+1) - m(m-1)}\,|l, m-1\rangle. \tag{3.76}$$
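These rules are easy to check numerically. The sketch below (an editorial addition; $\hbar$ is set to 1) builds the matrices of $\hat{L}_+$, $\hat{L}_-$, and $\hat{L}_z$ in the $|l, m\rangle$ basis for $l = 1$ from Eqs. 3.74 and 3.75 and verifies the operator identities used in the derivation:

```python
import numpy as np

l = 1
ms = np.arange(-l, l + 1)           # m = -1, 0, 1  (hbar = 1)
dim = len(ms)

# Matrix elements <l, m+1| L_+ |l, m> = sqrt(l(l+1) - m(m+1)), Eq. 3.75
Lp = np.zeros((dim, dim))
for i, m in enumerate(ms[:-1]):
    Lp[i + 1, i] = np.sqrt(l * (l + 1) - m * (m + 1))

Lm = Lp.conj().T                    # Eq. 3.74: L_- is the conjugate of L_+
Lz = np.diag(ms).astype(float)
L2 = l * (l + 1) * np.eye(dim)      # L^2 = l(l+1) * identity on this multiplet

# Eq. 3.71: L_- L_+ = L^2 - L_z^2 - L_z
assert np.allclose(Lm @ Lp, L2 - Lz @ Lz - Lz)

# Ladder property: [L_z, L_+] = L_+ (raising m by one)
assert np.allclose(Lz @ Lp - Lp @ Lz, Lp)

# Raising the top state m = l must give zero, cf. Eq. 3.70
top = np.zeros(dim); top[-1] = 1.0
assert np.allclose(Lp @ top, 0)
```

The same construction works for any $l$; only the range of `ms` changes.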

To conclude this section, let me just emphasize once again that we were able to
find eigenvalues for the system of operators, as well as a rule for generating their
eigenvectors, using nothing but their commutation relations. The key to successful
completion of this task was the existence of the ladder operators with their very
special commutation relations given by Eqs. 3.61 and 3.62.

3.3.5 Statistical Interpretation

In Chap. 2 I already introduced the relation between the coefficients in
superposition states and the probabilities of various outcomes of measurements on
quantum systems. This time I will elaborate on those ideas in a more precise way by
formulating two postulates that introduce the statistical interpretation into the formalism of
quantum mechanics.

Postulate 4 (Born’s Rule) A measurement of an observable can only yield a
value from the set of the eigenvalues of the operator representing the measured
observable. If a system before the measurement is not in a state described by
one of the eigenvectors of this operator, the result of the measurement cannot
be predicted a priori. Only a probability (or probability density for observables
with continuous spectrum) of a particular outcome can be known. If the measured
eigenvalue is not degenerate, this probability is given by

pn D jh˛j �nij2 ; (3.77)

where j˛i represents a state of the system before the measurement, �n is one of
the eigenvalues, and j�ni is the corresponding eigenvector. If the eigenvalue is
degenerate, the probabilities given by Eq. 3.77 must be summed up with other
degenerate states belonging to this eigenvalue. In the case of observables with
continuous spectrum, the probability is replaced with probability density p.q/:

p.q/ D jh˛j qij2 ;

which determines a differential probability dP that the measured value of the
observable lies within interval of values Œq; q C dq� as dp D p.q/dq.


Postulate 5 Regardless of the state in which the system was before an observable
is measured, immediately after the measurement, the system will be in a state
represented by the eigenvector of the corresponding operator belonging to the
observed non-degenerate eigenvalue. If the measured eigenvalue is degenerate,
all we can state is that after the measurement the system will be in a state in the
subspace of eigenvectors belonging to this eigenvalue.

Both these postulates are essentially more accurate restatements of the propositions
already discussed in Sect. 2.2.3, where the somewhat vague notion of "the state
with definite values of an observable" is replaced with its mathematical representation
as an eigenvector of a respective operator. This more formal approach allows
carrying out a more comprehensive exploration of the statistical interpretation of
the quantum mechanical formalism.

I begin by considering an expression of the form $\langle\alpha|\hat{T}|\alpha\rangle$, where $|\alpha\rangle$ is an
arbitrary state and $\hat{T}$ is a Hermitian operator representing a certain observable. I have
already mentioned that this expression is often referred to as an "expectation value," but
now I can demonstrate what it actually means. Expanding this state into the eigenvectors
of $\hat{T}$ (Eq. 3.39), I can present $\langle\alpha|\hat{T}|\alpha\rangle$ as

$$\langle\alpha|\hat{T}|\alpha\rangle = \sum_n\sum_m a_n^* a_m\langle\lambda_n|\hat{T}|\lambda_m\rangle = \sum_n\sum_m \lambda_m a_n^* a_m\langle\lambda_n|\lambda_m\rangle = \sum_n \lambda_n|a_n|^2, \tag{3.78}$$

where I first took advantage of the fact that $|\lambda_m\rangle$ is an eigenvector of $\hat{T}$ with
eigenvalue $\lambda_m$: $\hat{T}|\lambda_m\rangle = \lambda_m|\lambda_m\rangle$, and then used the orthonormalization condition for
the eigenvectors, $\langle\lambda_n|\lambda_m\rangle = \delta_{nm}$. According to Born's rule, $|a_n|^2$ is the probability
that the measurement of the observable will produce $\lambda_n$. Then, it becomes clear that
the final result in Eq. 3.78 has the meaning of the average value of the observable,
which one would "expect" to find if the same measurement is repeated multiple
times or if an experimentalist carries out the measurement on multiple identical
copies of the same system. The simplest measure of the statistical uncertainty of
such measurements would be the standard deviation, which in regular probability
theory would be defined as

$$\Delta T = \sqrt{\overline{\lambda^2} - \bar{\lambda}^2},$$

where the bar above the letters means statistical averaging with probabilities given
by $p_n = |a_n|^2$: $\overline{\lambda^2} = \sum_n p_n\lambda_n^2$, $\bar{\lambda}^2 = \left(\sum_n p_n\lambda_n\right)^2$. In the context of quantum theory,
the measure of uncertainty of a measurement can be described as

$$\Delta T = \sqrt{\langle\alpha|\hat{T}^2|\alpha\rangle - \left(\langle\alpha|\hat{T}|\alpha\rangle\right)^2}. \tag{3.79}$$


Indeed,

$$\langle\alpha|\hat{T}^2|\alpha\rangle = \sum_n\sum_m a_n^* a_m\langle\lambda_n|\hat{T}\hat{T}|\lambda_m\rangle = \sum_n\sum_m \lambda_m a_n^* a_m\langle\lambda_n|\hat{T}|\lambda_m\rangle = \sum_n\sum_m \lambda_m^2 a_n^* a_m\langle\lambda_n|\lambda_m\rangle = \sum_n \lambda_n^2|a_n|^2.$$

This shows that the measure of uncertainty expressed by Eq. 3.79 does agree with
the probabilistic definition of the standard deviation. If the state $|\alpha\rangle$ is one of the
eigenvectors $|\lambda_{n_0}\rangle$, all coefficients $a_n$ are zeroes, with the exception of $a_{n_0} = 1$.
In this case, we have $\langle\alpha|\hat{T}^2|\alpha\rangle = \lambda_{n_0}^2 = \left(\langle\alpha|\hat{T}|\alpha\rangle\right)^2$, and the uncertainty $\Delta T$ vanishes.
This justifies calling the states represented by eigenvectors determinate states, or states
in which the observable has a definite value. If there are several mutually consistent
observables represented by commuting operators, we can have a state, which is a
common eigenvector of all the operators, in which all these observables will have definite
values.
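As a quick numerical sanity check of Eqs. 3.78 and 3.79, the editorial sketch below (the matrix and state are arbitrary choices, not taken from the text) computes the expectation value and uncertainty directly and compares them with the sums over eigenvalues weighted by the Born probabilities:

```python
import numpy as np

# A small Hermitian "observable" and a normalized state (arbitrary choices)
T = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, -1.0]])
alpha = np.array([3.0, 1.0 + 2.0j])
alpha = alpha / np.linalg.norm(alpha)

# Direct quantum-mechanical expressions
mean_qm = np.real(alpha.conj() @ T @ alpha)
mean_sq_qm = np.real(alpha.conj() @ T @ T @ alpha)
delta_qm = np.sqrt(mean_sq_qm - mean_qm**2)        # Eq. 3.79

# Spectral decomposition: Born probabilities p_n = |<lambda_n|alpha>|^2
evals, evecs = np.linalg.eigh(T)
p = np.abs(evecs.conj().T @ alpha) ** 2
assert np.isclose(p.sum(), 1.0)                    # probabilities sum to one
assert np.isclose(mean_qm, np.sum(p * evals))      # Eq. 3.78
assert np.isclose(delta_qm,
                  np.sqrt(np.sum(p * evals**2) - np.sum(p * evals)**2))
```

The two routes agree because `eigh` returns an orthonormal eigenbasis, which is exactly the expansion used in the derivation above.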

If two observables are not mutually consistent and are described by operators $\hat{T}_1$
and $\hat{T}_2$ that do not commute, one can derive the following inequality for the uncertainties
$\Delta T_1$ and $\Delta T_2$ of these operators:

$$\Delta T_1\,\Delta T_2 \ge \frac{1}{2}\left|\langle\alpha|\left[\hat{T}_1, \hat{T}_2\right]|\alpha\rangle\right|, \tag{3.80}$$

which is valid for an arbitrary state $|\alpha\rangle$. This is the so-called generalized uncertainty
principle. Using the canonical commutation relations 3.47, I can immediately reproduce
the Heisenberg inequality

$$\Delta x\,\Delta p \ge \frac{\hbar}{2}, \tag{3.81}$$

which now becomes a particular case of the more general result presented by
Eq. 3.80. It is interesting that using the Heisenberg uncertainty principle, Eq. 1.4, as
an empirical formula and combining it with Eq. 3.80, I can "derive" or justify, if you
want, the canonical commutator between the coordinate and momentum operators.
Indeed, since Eq. 3.81 is valid for an arbitrary state, in order to reconcile Eq. 3.81
with Eq. 3.80, I have to admit that the commutator of the coordinate and momentum
operators must be a regular number (only in this case does the right-hand side of Eq. 3.80
become proportional to $\langle\alpha|\alpha\rangle = 1$, so that the dependence on the state vanishes).
The absolute value of this number must obviously be equal to $\hbar$, but recalling that if
the commutator of two Hermitian operators is a number, it must be an imaginary
number (see Eq. 3.31), I can conclude that $[\hat{x}, \hat{p}_x] = i\hbar$, which is the canonical
commutation relation given in Eq. 3.47. Of course, these arguments are not sufficient
to show whether this commutator is $+i\hbar$ or $-i\hbar$, but the choice of sign is, actually, a
matter of convention, and the standard agreement is to write this commutator as
given in Eq. 3.47.
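The generalized uncertainty principle, Eq. 3.80, is also easy to probe numerically. The following editorial sketch takes two fixed non-commuting Hermitian matrices (the Pauli matrices, a standard choice) and checks the inequality for many randomly drawn states:

```python
import numpy as np

rng = np.random.default_rng(0)

def uncertainty(T, alpha):
    """Delta T of Eq. 3.79 for a normalized state alpha."""
    mean = np.real(alpha.conj() @ T @ alpha)
    mean_sq = np.real(alpha.conj() @ T @ T @ alpha)
    return np.sqrt(max(mean_sq - mean**2, 0.0))

# Two non-commuting Hermitian matrices
T1 = np.array([[0, 1], [1, 0]], dtype=complex)
T2 = np.array([[0, -1j], [1j, 0]])
comm = T1 @ T2 - T2 @ T1

for _ in range(100):
    alpha = rng.normal(size=2) + 1j * rng.normal(size=2)
    alpha /= np.linalg.norm(alpha)
    rhs = 0.5 * abs(alpha.conj() @ comm @ alpha)
    # Eq. 3.80 must hold for every state (up to rounding)
    assert uncertainty(T1, alpha) * uncertainty(T2, alpha) >= rhs - 1e-12
```

No random state violates the bound, as the general derivation guarantees.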


To illustrate all these rather abstract postulates, I will finish this section with an
example, in which, to save time, I will again use matrices M1 and M2 defined by
Eq. 3.37.

Example 14 (Probabilities of Measurements) Assume that these matrices represent
two observables of some quantum system and that you intend to measure these
observables. It is given that the system is prepared in the state $|\psi\rangle$ represented by the
column

$$|\psi\rangle = \frac{1}{\sqrt{7}}\begin{bmatrix}2i\\ 1\\ 1-i\end{bmatrix},$$

and you are asked to predict the results of different sequences of measurements
of the observables $M_1$ and $M_2$. The first step you have to take is to verify that your
initial state is normalized, which is just a good housekeeping habit. The norm of
this vector is (do not forget to do the complex conjugation when converting a ket into a
bra; for some reason even good students keep forgetting about it)

$$\|\psi\|^2 = \frac{1}{7}\begin{bmatrix}-2i & 1 & 1+i\end{bmatrix}\begin{bmatrix}2i\\ 1\\ 1-i\end{bmatrix} = \frac{1}{7}\left((-2i)(2i) + 1 + (1+i)(1-i)\right) = \frac{1}{7}(4 + 1 + 2) = 1.$$

Once normalization is verified, you are ready for the next step. Let's say you
first want to measure the observable represented by $M_2$. We found earlier that the
eigenvalues of this matrix are $\lambda_1 = 2$, $\lambda_2 = -1$, and $\lambda_3 = 1$. Thus, these are
the values that you can expect to see on the dial of your measuring device (more
or less; experimental errors are unavoidable, of course). The actual issue is to find
the corresponding probabilities. Using Born's rule, Eq. 3.77, and the corresponding
eigenvectors given in Eq. 3.38, you can find for each of the eigenvalues

$$p_{\lambda_1} = |\langle\psi|2\rangle|^2 = \left|\frac{1}{\sqrt{2}}\,\frac{1}{\sqrt{7}}\begin{bmatrix}-2i & 1 & 1+i\end{bmatrix}\begin{bmatrix}-1\\ 0\\ 1\end{bmatrix}\right|^2 = \frac{1}{14}\left|2i + 1 + i\right|^2 = \frac{5}{7},$$


$$p_{\lambda_2} = |\langle\psi|{-1}\rangle|^2 = \left|\frac{1}{2}\,\frac{1}{\sqrt{7}}\begin{bmatrix}-2i & 1 & 1+i\end{bmatrix}\begin{bmatrix}1\\ \sqrt{2}\\ 1\end{bmatrix}\right|^2 = \frac{1}{28}\left|-2i + \sqrt{2} + 1 + i\right|^2 = \frac{\left(1+\sqrt{2}\right)^2 + 1}{28} = \frac{2+\sqrt{2}}{14},$$

and

$$p_{\lambda_3} = |\langle\psi|1\rangle|^2 = \left|\frac{1}{2}\,\frac{1}{\sqrt{7}}\begin{bmatrix}-2i & 1 & 1+i\end{bmatrix}\begin{bmatrix}1\\ -\sqrt{2}\\ 1\end{bmatrix}\right|^2 = \frac{1}{28}\left|-2i - \sqrt{2} + 1 + i\right|^2 = \frac{\left(1-\sqrt{2}\right)^2 + 1}{28} = \frac{2-\sqrt{2}}{14}.$$

It is always a good idea to run a quick check:

$$p_{\lambda_1} + p_{\lambda_2} + p_{\lambda_3} = \frac{5}{7} + \frac{2+\sqrt{2}}{14} + \frac{2-\sqrt{2}}{14} = \frac{5}{7} + \frac{2}{7} = 1,$$

as it should be. So far so good. The expectation value of $M_2$ can be computed in two
different ways. First, I will use the standard probabilistic definition of the average:

$$\langle M_2\rangle = p_{\lambda_1}\lambda_1 + p_{\lambda_2}\lambda_2 + p_{\lambda_3}\lambda_3 = 2\cdot\frac{5}{7} + (-1)\cdot\frac{2+\sqrt{2}}{14} + 1\cdot\frac{2-\sqrt{2}}{14} = \frac{10-\sqrt{2}}{7}.$$

And I will also compute this quantity using the quantum-mechanical definition:

$$\langle M_2\rangle \equiv \langle\psi|M_2|\psi\rangle = \frac{1}{7}\begin{bmatrix}-2i & 1 & 1+i\end{bmatrix}\begin{bmatrix}1 & -\frac{1}{\sqrt{2}} & -1\\ -\frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}}\\ -1 & -\frac{1}{\sqrt{2}} & 1\end{bmatrix}\begin{bmatrix}2i\\ 1\\ 1-i\end{bmatrix}$$
$$= \frac{1}{7}\begin{bmatrix}-2i & 1 & 1+i\end{bmatrix}\begin{bmatrix}2i - \frac{1}{\sqrt{2}} - 1 + i\\ -\frac{2i}{\sqrt{2}} - \frac{1-i}{\sqrt{2}}\\ -2i - \frac{1}{\sqrt{2}} + 1 - i\end{bmatrix} = \frac{1}{7}\begin{bmatrix}-2i & 1 & 1+i\end{bmatrix}\begin{bmatrix}3i - \frac{1}{\sqrt{2}} - 1\\ -\frac{1+i}{\sqrt{2}}\\ -3i - \frac{1}{\sqrt{2}} + 1\end{bmatrix}$$
$$= \frac{1}{7}\left(6 + \frac{2i}{\sqrt{2}} + 2i - \frac{1+i}{\sqrt{2}} - 2i - \frac{1}{\sqrt{2}} + 4 - \frac{i}{\sqrt{2}}\right) = \frac{1}{7}\left(10 - \sqrt{2}\right),$$

again, exactly as promised. If immediately after measuring $M_2$ you attempt to
measure $M_1$ and are interested in the probabilities of various outcomes (now you are
talking about outcomes consisting of pairs of measurements, which are given by all
nine possible pairs of eigenvalues $\left(\lambda_i^{(M_2)}, \lambda_j^{(M_1)}\right)$), you have to take into account
that after the first measurement, the system is no longer in the initial state $|\psi\rangle$.
Depending on the outcome of the first measurement, it will be in a state represented
by one of the eigenvectors of $M_2$. However, since these two matrices commute, and
the eigenvectors of $M_1$ are also eigenvectors of $M_2$, the outcomes of the second
measurement are completely determined by the outcome of the first, and there are
only three possible results. For instance, if the first measurement produced for $M_2$
the value $-1$ (probability $(2+\sqrt{2})/14$), the measurement of $M_1$ will be guaranteed
to yield 2 (the state corresponding to eigenvalue $-1$ of matrix $M_2$ is described by
the same vector as the eigenvector of $M_1$ belonging to its eigenvalue 2). Thus, the
probability of getting the pair $(-1, 2)$ is still $(2+\sqrt{2})/14$.

If you measure $M_1$ first, the situation is a bit more complex since $M_1$ has degenerate
eigenvalues. So, if you want, for instance, to find the probability of getting 1 after
measuring $M_1$, you have to compute two probabilities, one for each degenerate
state, and sum them up. To do that you can use the corresponding orthogonal and
normalized vectors given in Eq. 3.38, which are common eigenvectors of both $M_1$
and $M_2$. This will yield

$$p_1 = \left|\frac{1}{\sqrt{2}}\,\frac{1}{\sqrt{7}}\begin{bmatrix}-2i & 1 & 1+i\end{bmatrix}\begin{bmatrix}-1\\ 0\\ 1\end{bmatrix}\right|^2 + \left|\frac{1}{2}\,\frac{1}{\sqrt{7}}\begin{bmatrix}-2i & 1 & 1+i\end{bmatrix}\begin{bmatrix}1\\ -\sqrt{2}\\ 1\end{bmatrix}\right|^2 = \frac{10}{14} + \frac{4-2\sqrt{2}}{28} = \frac{12-\sqrt{2}}{14}.$$

At this point a question might pop up in your head: is this result unique? Indeed,
you already know that degenerate eigenvalues can be characterized by an infinite
number of different normalized and orthogonal eigenvectors. It would be nice if the
probability did not depend on this arbitrary choice, but is it really so? I will give
you a chance to answer this question as an exercise.


Finally, let me compute the uncertainty of the observable $M_2$ in this experiment.
For this computation I first need to find $M_2^2$, which is

$$M_2^2 = \begin{bmatrix}\frac{5}{2} & 0 & -\frac{3}{2}\\ 0 & 1 & 0\\ -\frac{3}{2} & 0 & \frac{5}{2}\end{bmatrix}.$$

Now you can compute

$$\langle\psi|M_2^2|\psi\rangle = \frac{1}{7}\begin{bmatrix}-2i & 1 & 1+i\end{bmatrix}\begin{bmatrix}\frac{5}{2} & 0 & -\frac{3}{2}\\ 0 & 1 & 0\\ -\frac{3}{2} & 0 & \frac{5}{2}\end{bmatrix}\begin{bmatrix}2i\\ 1\\ 1-i\end{bmatrix} = \frac{1}{7}\begin{bmatrix}-2i & 1 & 1+i\end{bmatrix}\begin{bmatrix}-\frac{3}{2} + \frac{13}{2}i\\ 1\\ \frac{5}{2} - \frac{11}{2}i\end{bmatrix} = \frac{22}{7},$$

so that the uncertainty $\Delta^2 M_2$ is found to be

$$\Delta^2 M_2 = \langle\psi|M_2^2|\psi\rangle - \langle\psi|M_2|\psi\rangle^2 = \frac{22}{7} - \frac{1}{49}\left(10 - \sqrt{2}\right)^2 = \frac{52 + 20\sqrt{2}}{49}.$$
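All the numbers in Example 14 can be cross-checked with a few lines of code. The editorial sketch below reproduces the matrix $M_2$ and the state $|\psi\rangle$ from the example and recomputes the probabilities, the expectation value, and the variance:

```python
import numpy as np

s = 1 / np.sqrt(2)
M2 = np.array([[1, -s, -1],
               [-s, 0, -s],
               [-1, -s, 1]])
psi = np.array([2j, 1, 1 - 1j]) / np.sqrt(7)
assert np.isclose(np.linalg.norm(psi), 1.0)        # state is normalized

evals, evecs = np.linalg.eigh(M2)                  # eigenvalues -1, 1, 2
probs = {round(lam, 6): abs(evecs[:, k].conj() @ psi) ** 2
         for k, lam in enumerate(evals)}

assert np.isclose(probs[2.0], 5 / 7)
assert np.isclose(probs[-1.0], (2 + np.sqrt(2)) / 14)
assert np.isclose(probs[1.0], (2 - np.sqrt(2)) / 14)

mean = np.real(psi.conj() @ M2 @ psi)              # <M2> = (10 - sqrt(2))/7
var = np.real(psi.conj() @ M2 @ M2 @ psi) - mean ** 2
assert np.isclose(mean, (10 - np.sqrt(2)) / 7)
assert np.isclose(var, (52 + 20 * np.sqrt(2)) / 49)
```

Note in particular that $\langle\psi|M_2^2|\psi\rangle$ evaluates to $22/7$, consistent with the sum $\sum_n p_n\lambda_n^2 = 4\cdot\frac{5}{7} + \frac{2+\sqrt{2}}{14} + \frac{2-\sqrt{2}}{14}$.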

3.4 Problems

Section 3.1

Problem 10 A constant force F is acting on a particle of mass m. Derive an
expression for the potential energy associated with this force, write down the
Hamiltonian of the system, and derive Hamiltonian equations.

Problem 11 Consider a particle moving in a central potential field with Hamiltonian

$$H = \frac{p^2}{2m} + V(|\mathbf{r}|).$$

Compute the following Poisson brackets:

$$\{L_x, H\},\quad \{L_y, H\},\quad \{L_z, H\},$$

where $L_{x,y,z}$ are the Cartesian components of the angular momentum of the particle in some
arbitrarily chosen coordinate system. Interpret the results.


Section 3.2.1

Problem 12 Which of the following are linear operators?

1. The inversion operator $\hat{P}$, which acts on functions of coordinates according to the rule $\hat{P}f(\mathbf{r}) = f(-\mathbf{r})$.
2. The square operator $\hat{S}$ defined as $\hat{S}f = f^2$.
3. The determinant operator $\widehat{\operatorname{Det}}$, which, when applied to a square matrix, turns it into the matrix's determinant.
4. The exchange operator $\hat{E}$ acting on functions of two variables as $\hat{E}f(x_1, x_2) = f(x_2, x_1)$.
5. The trace operator $\widehat{\operatorname{Tr}}$, which acts on a matrix and turns it into the sum of its diagonal elements.

Problem 13 Prove the linearity of the rotation operator.

Problem 14 Find the Hermitian conjugate of the integral operator $\hat{K}$ acting on
integrable functions of a single variable and defined by the kernel $K(x_1, x_2)$:

$$\hat{K}f = \int_{-\infty}^{\infty} K(x_1, x_2)f(x_2)\,dx_2.$$

The inner product is defined in the regular way: $\langle g|f\rangle = \int_{-\infty}^{\infty} g^*(x)f(x)\,dx$. Determine
under which condition on the kernel this operator is Hermitian.

Problem 15 The expression $\hat{P} = |\alpha\rangle\langle\beta|$ can be understood as an operator acting in the
following way:

$$\hat{P}|\gamma\rangle \equiv |\alpha\rangle\langle\beta|\gamma\rangle.$$

Find its Hermitian conjugate.

Section 3.2.2

Problem 16 Specify the condition that must be obeyed by an operator so that it is
both unitary and Hermitian.
Consider the following matrices:

$$\begin{bmatrix}1 & 0\\ 0 & -1\end{bmatrix},\quad \begin{bmatrix}0 & 1\\ 1 & 0\end{bmatrix},\quad \begin{bmatrix}0 & i\\ -i & 0\end{bmatrix}.$$

Do they satisfy this condition?


Problem 17 For three operators $\hat{A}$, $\hat{B}$, and $\hat{C}$, prove the following identity (known
as the Jacobi identity):

$$\left[\left[\hat{A}, \hat{B}\right], \hat{C}\right] + \left[\left[\hat{C}, \hat{A}\right], \hat{B}\right] + \left[\left[\hat{B}, \hat{C}\right], \hat{A}\right] = 0.$$

Problem 18 Which of the following matrices are Hermitian?

$$1.\ \begin{bmatrix}3i & 5i & 7\\ -5i & 2 & 3\\ 7 & 3 & 0\end{bmatrix}\qquad 2.\ \begin{bmatrix}1 & i & 2i\\ -i & 0 & 3\\ -2i & 3 & 2\end{bmatrix}\qquad 3.\ \begin{bmatrix}\sqrt{2} & 1 & -2\\ -1 & 2 & 4\sqrt{5}\\ 7 & -4\sqrt{5} & \sqrt{3}\end{bmatrix}\qquad 4.\ \begin{bmatrix}7 & 4 & 2\\ 4 & 2 & 1\\ 2 & 1 & -4\end{bmatrix}$$

Problem 19 Prove the identity

$$\left(\hat{A}\hat{B}\right)^{-1} = \hat{B}^{-1}\hat{A}^{-1}.$$

Problem 20 Prove the following properties of the commutators:

$$\left[\hat{T}_1, \hat{T}_2\right] = -\left[\hat{T}_2, \hat{T}_1\right]$$
$$\left[\hat{T}_1 + \hat{T}_2, \hat{T}_3\right] = \left[\hat{T}_1, \hat{T}_3\right] + \left[\hat{T}_2, \hat{T}_3\right]$$
$$\left[c_1\hat{T}_1, c_2\hat{T}_2\right] = c_1 c_2\left[\hat{T}_1, \hat{T}_2\right].$$


Problem 21 If the operator $\hat{D}$ is defined as

$$\hat{D}f(x) = \frac{df}{dx},$$

what would be an inverse of this operator?

Problem 22 Find the inverses of the following matrices:

$$1.\ \begin{bmatrix}1 & i & 2i\\ -i & 0 & 3\\ -2i & 3 & 2\end{bmatrix}\qquad 2.\ \begin{bmatrix}0 & i & 2\\ -i & 0 & 1\\ -i & i & 0\end{bmatrix}$$

Problem 23 Consider an operator $\hat{\sigma}$ characterized by the following property: $\hat{\sigma}^2 = \hat{I}$, where $\hat{I}$ is the unity operator. Using a power series expansion, find the closed-form
expression (not in the form of a series) for the operator $\exp(i\hat{\sigma}t)$.

Problem 24 Prove that if the commutator of two Hermitian operators is a number,
this number is necessarily imaginary.

Problem 25 Given that $[\hat{x}, \hat{p}] = i\hbar$, compute $\left[\hat{x}^2, \hat{p}^2\right]$.

Section 3.2.3

Problem 26 Consider the matrices $\begin{bmatrix}0 & i\\ -i & 0\end{bmatrix}$ and $\begin{bmatrix}0 & 1\\ 1 & 0\end{bmatrix}$.

1. Find the eigenvalues and normalized eigenvectors of these matrices.
2. Check the orthogonality of the found vectors.

Problem 27 Consider two matrices:

$$A_1 = \begin{bmatrix}1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & -1\end{bmatrix};\qquad A_2 = \begin{bmatrix}1 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0\end{bmatrix}.$$


1. Show that these operators commute.
2. Find a set of eigenvectors common for both of them.

Problem 28 Find the eigenvalues and normalized eigenvectors of the following matrix:

$$\begin{bmatrix}1 & -\frac{1}{\sqrt{2}} & -1\\ -\frac{1}{\sqrt{2}} & 0 & -\frac{1}{\sqrt{2}}\\ -1 & -\frac{1}{\sqrt{2}} & 1\end{bmatrix}.$$

Problem 29 Consider the following matrix:

$$A = \begin{bmatrix}0 & 0 & -1\\ 0 & 1 & 0\\ -1 & 0 & 0\end{bmatrix}.$$

1. Find its eigenvalues. Are there degenerate ones?
2. Construct a system of normalized and orthogonal eigenvectors.
3. Show that

$$e^{xA} = \cosh x + A\sinh x.$$

Section 3.3.1

Problem 30 Consider an operator defined as

$$\hat{A} = |\psi_1\rangle\langle\psi_1| + |\psi_2\rangle\langle\psi_2| + |\psi_3\rangle\langle\psi_3| - i|\psi_1\rangle\langle\psi_2| - |\psi_1\rangle\langle\psi_3| + i|\psi_2\rangle\langle\psi_1| - |\psi_3\rangle\langle\psi_1|,$$

where $|\psi_1\rangle, |\psi_2\rangle$, and $|\psi_3\rangle$ form an orthonormalized basis.

1. Check if this operator is Hermitian by computing $\hat{A}^\dagger$.
2. Compute $\hat{A}^2$.
3. What are the possible values an experimentalist can observe when measuring an observable represented by this operator?
4. Find the states in which the system will be immediately after the measurement for each of the possible outcomes. Verify that the states are represented by orthogonal vectors.

Problem 31 Show that if $\hat{P}$ is a projection operator, $\hat{I} - \hat{P}$ is also a projection
operator.


Section 3.3.2

Problem 32 Derive the commutation relations

$$\left[\hat{L}_z, \hat{L}_x\right] = i\hbar\hat{L}_y,\qquad \left[\hat{L}_y, \hat{L}_z\right] = i\hbar\hat{L}_x.$$

Problem 33 Prove that the operator of the square of the angular momentum, $\hat{L}^2$, commutes with all components of the angular momentum operator, $\hat{L}_{x,y,z}$.

Problem 34 Compute the commutators

$$\left[\hat{L}_z, \hat{x}\right],\ \left[\hat{L}_z, \hat{y}\right],\ \left[\hat{L}_z, \hat{z}\right],\qquad \left[\hat{L}_z, \hat{p}_x\right],\ \left[\hat{L}_z, \hat{p}_y\right],\ \left[\hat{L}_z, \hat{p}_z\right],$$
$$\left[\hat{L}^2, \hat{x}\right],\ \left[\hat{L}^2, \hat{y}\right],\ \left[\hat{L}^2, \hat{z}\right],\qquad \left[\hat{L}^2, \hat{p}_x\right],\ \left[\hat{L}^2, \hat{p}_y\right],\ \left[\hat{L}^2, \hat{p}_z\right].$$

Problem 35 Prove that

$$\left[\hat{L}_{x,y,z}, \hat{p}^2\right] = \left[\hat{L}^2, \hat{p}^2\right] = 0.$$

Section 3.3.4

Problem 36 Prove that

$$\hat{L}_-|l, m\rangle = \hbar\sqrt{l(l+1) - m(m-1)}\,|l, m-1\rangle.$$

Problem 37 Compute the following expressions:

$$\langle l, m'|\hat{L}_-|l, m\rangle,\qquad \langle l, m'|\hat{L}_+|l, m\rangle.$$

For $l = 1$ present the results as matrices.

Problem 38 Compute

$$\langle l, m'|\hat{L}_x^2|l, m\rangle.$$

Hint: Use the representation of $\hat{L}_x$ in terms of the raising and lowering ladder operators.

Section 3.3.5

Problem 39 An observable A represented by an operator $\hat{A}$ can be found in two mutually
exclusive states represented by the eigenvectors $|a_1\rangle$ and $|a_2\rangle$ of $\hat{A}$, where $a_{1,2}$ are the
corresponding eigenvalues. A second observable B represented by an operator
$\hat{B}$ can also be found in two mutually exclusive states represented by the eigenvectors
$|b_1\rangle$ and $|b_2\rangle$ of $\hat{B}$, where $b_{1,2}$ are the corresponding eigenvalues. These eigenvectors are
related to each other as

$$|a_1\rangle = \frac{1}{5}\left(3|b_1\rangle + 4|b_2\rangle\right)$$
$$|a_2\rangle = \frac{1}{5}\left(4|b_1\rangle - 3|b_2\rangle\right).$$

1. If observable A is measured and the value $a_1$ is obtained, what is the state of the
system immediately after the measurement?

2. If now B is measured, what are the possible outcomes, and what are their
probabilities?

3. Right after B was measured, A is measured again. What is the probability of
getting $a_1$ for each possible outcome of the measurement of B?

Problem 40 A quantum system is in a state described by the vector

$$|\alpha_1\rangle = \frac{i}{\sqrt{3}}|\lambda_1\rangle + \frac{\sqrt{2}}{\sqrt{3}}|\lambda_2\rangle.$$

Find the probability that a measurement of some observable will bring the system
to a state described by the vector

$$|\alpha_2\rangle = \frac{1+i}{\sqrt{3}}|\lambda_1\rangle + \frac{1}{\sqrt{6}}|\lambda_2\rangle + \frac{1}{\sqrt{6}}|\lambda_3\rangle,$$

where $|\lambda_{1,2,3}\rangle$ form an orthonormalized basis.
Problem 41 Consider a quantum system in a state described by the column vector

$$|\psi\rangle = \frac{1}{\sqrt{5}}\begin{bmatrix}-i\\ 2\\ 0\end{bmatrix}.$$


The system is characterized by two observables $T_1$ and $T_2$ represented by the matrices

$$T_1 = \begin{bmatrix}1 & i & 1\\ -i & 0 & 0\\ 1 & 0 & 0\end{bmatrix};\qquad T_2 = \begin{bmatrix}3 & 0 & 0\\ 0 & 1 & i\\ 0 & -i & 0\end{bmatrix}.$$

1. If $T_1$ is measured first and $T_2$ immediately afterward, what is the probability of
obtaining $-1$ for $T_1$ and 3 for $T_2$?

2. What are the probabilities of getting the same values if the order of the measurements
is reversed? Discuss the result in terms of the commutation properties of the two
matrices.

Problem 42 Consider a system described by the Hamiltonian

$$H = \frac{1}{\sqrt{2}}\begin{bmatrix}0 & -i & 0\\ i & 3 & 3\\ 0 & 3 & 0\end{bmatrix}$$

placed in a quantum state described by the column vector

$$|\psi\rangle = \begin{bmatrix}4-i\\ -2+5i\\ 3+2i\end{bmatrix}.$$

1. Find the expectation value of the energy in this state.
2. Find the uncertainty of the energy in this state.
3. Find the possible values of energy measurements and their probabilities.
4. Use the results of the previous task to calculate the expectation value and
uncertainty of the energy again. Compare the results with the results of tasks 1 and 2.

Problem 43 Go back to Example 14 at the end of the chapter, and using a different
set of orthogonal and normalized eigenvectors of M1 (you will have to find it first,
of course), compute the probability of getting the degenerate eigenvalue of M1. Is
the result the same?

Problem 44 Consider a system described by a Hamiltonian

$$\hat{H} = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{2}x^2,$$

represented by an operator acting on square-integrable functions of a single variable
$x$ forming a Hilbert space with the inner product defined in Sect. 2.1. This system is
prepared in the state

$$|\psi\rangle = \frac{1}{\sqrt{3}}|\psi_1\rangle + \frac{\sqrt{2}}{\sqrt{3}}|\psi_2\rangle,$$

where the vectors $|\psi_{1,2}\rangle$ are defined as the following functions:

$$|\psi_1\rangle = \exp\left(-\frac{x^2}{2}\right);\qquad |\psi_2\rangle = \left(1 - 2x^2\right)\exp\left(-\frac{x^2}{2}\right).$$

1. Verify that these functions are eigenvectors of the Hamiltonian, determine the
respective eigenvalues, and normalize the eigenvectors.

2. Rewrite the expression for the state $|\psi\rangle$ in terms of the normalized versions of the
vectors $|\psi_{1,2}\rangle$.

3. If the energy of the system is measured, what are the possible outcomes, and
what are their probabilities?

4. Find the expectation values and uncertainties of the operators

$$\hat{p}f(x) = -i\frac{df}{dx};\qquad \hat{x}f(x) = xf(x)$$

in the state $|\psi\rangle$.

Chapter 4
Unitary Operators and Quantum
Dynamics

In the previous section, I explained how one can dig out experimentally relevant
information using the states of a quantum system and the operators representing the observables.
The remaining burning question, however, is how we can find these states
so that we can use these methods. In a typical experiment, an experimentalist
begins by "preparing" a quantum system in some state, which they believe they
know.1 After that they smash the system with a hammer, or hit it with laser light,
or subject it to an electric or magnetic field, wait for some time, and measure
new values of the selected observables. In order to predict the results of new
measurements, you must be able to describe how the quantum system changes
between the time of preparation and the time of subsequent measurement, or,
speaking more scientifically, you must know its dynamics. As it has been made clear
in the previous section, you need two objects to predict the results of a measurement:
a state of the system and the operator assigned to the measured observable. Now
you can ask an interesting question: “When the quantum system evolves in time,
what is actually changing—the state or the operator?” To make this question more
specific, consider an expectation value of an observable described by operator OT:
h˛j OT j˛i. When your system evolves, this expectation value becomes a function of
time. The question is, which element of the expression for the expectation value, OT
or j˛i, must be considered as a time-dependent quantity to describe the dynamics
of the expectation value? It turns out that time dependence can be ascribed to either
of these two elements, and depending on the choice, it will generate two different
but equivalent pictures of quantum mechanics. In the so-called Schrödinger picture,
the state vectors are treated as time-dependent quantities, while operators remain
fixed rules transforming the states. In the Heisenberg picture, the state vector is
considered as a constant, and all the dynamics of the system is ascribed to the time-
dependent operators. The origins of these two pictures can be found in the earlier

1Preparation of a quantum system in a predefined state usually consists in carrying out a
measurement, but it is not an easy task to prepare a system in a state we want.

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,
https://doi.org/10.1007/978-3-319-71550-6_4



days of quantum theory, with Heisenberg's matrix mechanics competing against
Schrödinger's matter wave theory. The first attempt to prove the equivalence of the two
pictures was undertaken by Schrödinger as early as 1926, but a rigorous mathematical
proof of the equivalence did not exist until John von Neumann published
his definitive book Mathematical Foundations of Quantum Mechanics in 1932.

Von Neumann was one of the major figures in the mathematics and mathematical
physics of the twentieth century. Born to a rich Jewish family in Hungary, which
was elevated to nobility by the Austro-Hungarian Emperor Franz Joseph (hence the prefix
von in his name), he was a child prodigy, got his Ph.D. in mathematics at the age
of 23, and became the youngest privatdocent at the University of Berlin. In 1929 he
got an offer from Princeton University and moved to the USA. He brought his entire
family to America in 1938, saving them from almost certain death. In addition to laying
the rigorous mathematical foundation of quantum theory, von Neumann is famous for
his role in the Manhattan Project and for developing the concept of digital computers
(among other things).

After this brief historical detour, I begin the presentation of quantum dynamics,
starting with the Schrödinger picture.

4.1 Schrödinger Picture

4.1.1 Time-Evolution Operator and Schrödinger Equation

The statistical interpretation of the quantum mechanical formalism makes sense only
if all vectors describing states of a quantum system remain normalized at all times. I
will begin digging deeper into this issue by computing the norm of a generic vector
$\|\alpha\|$ using Eq. 3.39. First, I need the corresponding bra vector:

$$\langle\alpha| = \sum_n a_n^*\langle\lambda_n|,$$

so that I can write for the norm

$$\|\alpha\|^2 = \langle\alpha|\alpha\rangle = \sum_m\sum_n a_m a_n^*\langle\lambda_n|\lambda_m\rangle = \sum_m\sum_n a_m a_n^*\delta_{nm} = \sum_n|a_n|^2. \tag{4.1}$$

According to Postulate 4 in Sect. 3.3.5, $|a_n|^2$ is equal to the probability $p_n$ that the
respective eigenvalue will be observed. Equation 4.1 in this case can be interpreted
as a statement that the norm of a generic vector is equal to the sum of the probabilities
of all possible measurement outcomes. The latter must obviously be equal to unity,
$\sum_n p_n = 1$, regardless of the time dependence of the state $|\alpha\rangle$. This result has quite a
profound consequence. Indeed, the time dependence of a state vector can be considered
as a transformation of a vector $|\alpha(t_0)\rangle$ defined at some initial instant of time $t_0$ into
another vector $|\alpha(t)\rangle$ at time $t$ under the action of an operator:


$$|\alpha(t)\rangle = \hat U(t, t_0)\,|\alpha(t_0)\rangle. \tag{4.2}$$

In order to keep the norm of the vector unchanged, the operator $\hat U(t, t_0)$ must be unitary, which significantly limits the class of operators that can be used to describe the dynamics of quantum states. It must also obey an obvious condition:

$$\hat U(t_0, t_0) = \hat I. \tag{4.3}$$

Now, consider an evolution of the system from state $|\alpha(t_0)\rangle$ to state $|\alpha(t_1)\rangle$ and then to state $|\alpha(t_f)\rangle$, which can be described as

$$|\alpha(t_1)\rangle = \hat U(t_1, t_0)\,|\alpha(t_0)\rangle,\qquad |\alpha(t_f)\rangle = \hat U(t_f, t_1)\,|\alpha(t_1)\rangle.$$

I can also describe a system’s dynamics from the initial state to the final one, bypassing the intermediate state:

$$|\alpha(t_f)\rangle = \hat U(t_f, t_0)\,|\alpha(t_0)\rangle.$$

Comparing this with the first two lines of the previous equation, you can infer an important property of the time-evolution operator $\hat U$:

$$\hat U(t_f, t_0) = \hat U(t_f, t_1)\,\hat U(t_1, t_0). \tag{4.4}$$

An important corollary of Eq. 4.4 is obtained by setting $t_f = t_0$, which yields

$$\hat U(t_0, t_1)\,\hat U(t_1, t_0) = \hat I \;\Rightarrow\; \hat U(t_0, t_1) = \hat U^{-1}(t_1, t_0), \tag{4.5}$$

where I also used Eq. 4.3. In other words, the reversal of time in quantum dynamics is equivalent to replacing the time-evolution operator with its inverse. This idea can also be expressed by saying that by inverting the time-evolution operator, you describe the evolution of the system from the present to the past. This property can also be described as the reversibility of quantum dynamics: taking a system from $t_0$ to $t_f$ and back brings the system to its original state, completely reversing its initial evolution.
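The unitarity, composition, and reversibility properties are easy to see in action numerically. The following sketch (Python with NumPy and SciPy; the $2\times 2$ Hamiltonian is an arbitrary illustration, not tied to any particular system) builds the time-evolution operator in the exponential form derived later in this chapter, Eq. 4.25, and checks Eqs. 4.4 and 4.5:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
# Arbitrary 2x2 Hermitian Hamiltonian, chosen purely for illustration
H = np.array([[1.0, 0.5 - 0.3j],
              [0.5 + 0.3j, -1.0]])

def U(t, t0):
    """Time-evolution operator for a time-independent Hamiltonian (Eq. 4.25)."""
    return expm(-1j * H * (t - t0) / hbar)

I2 = np.eye(2)
# Unitarity: U† U = I, so norms of state vectors are preserved
assert np.allclose(U(2.0, 0.0).conj().T @ U(2.0, 0.0), I2)
# Composition rule, Eq. 4.4: U(tf, t0) = U(tf, t1) U(t1, t0)
assert np.allclose(U(3.0, 0.0), U(3.0, 1.0) @ U(1.0, 0.0))
# Reversibility, Eq. 4.5: evolving backward undoes the forward evolution
assert np.allclose(U(0.0, 1.0), np.linalg.inv(U(1.0, 0.0)))
```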

Now, let me consider the action of $\hat U(t_1, t_0)$ over an infinitesimally small time interval, $t_1 = t_0 + dt$. Expanding this operator over the small interval $dt$ and using Eq. 4.3, I can write:

$$\hat U(t_0 + dt, t_0) = \hat I + \hat G\,dt, \tag{4.6}$$

where $\hat G \equiv d\hat U/dt\big|_{t=t_0}$ is an operator obtained by differentiating the time-evolution operator with respect to time. The inverse of the operator defined by Eq. 4.6 can be found


by expanding the function $(1+x)^{-1}$ with respect to $x$ and keeping only terms linear in $x$: $(1+x)^{-1} \simeq 1 - x$. Applying this to the operator $\bigl(\hat I + \hat G\,dt\bigr)^{-1}$, I get

$$\hat U^{-1}(t_0 + dt, t_0) = \hat I - \hat G\,dt.$$

At the same time, Hermitian conjugation of Eq. 4.6 returns

$$\hat U^\dagger(t_0 + dt, t_0) = \hat I + \hat G^\dagger\,dt.$$

Since the time-evolution operator is unitary ($\hat U^{-1} = \hat U^\dagger$), the operator $\hat G$ has to be anti-Hermitian, $\hat G^\dagger = -\hat G$, so that it can be presented as $\hat G = -i\hat H/\hbar$ (see Eq. 3.30), where $\hat H$ is a Hermitian operator and $\hbar$ is introduced to ensure that $\hat H$ has the dimension of energy. Indeed, since the time-evolution operator is dimensionless, it is clear that the operator $\hat G$ has the dimension of inverse time. The dimension of Planck’s constant is that of energy multiplied by time, so it is clear that $\hat H$ indeed has the dimension of energy. This simple analysis leads the way to the next postulate of quantum theory.

Postulate 6 The Hermitian operator $\hat H$ in the expansion of the time-evolution operator is the operator version of the Hamiltonian function of classical mechanics.

Thus, Eq. 4.6 can now be rewritten as

$$\hat U(t_0 + dt, t_0) = \hat I - \frac{i\hat H}{\hbar}\,dt. \tag{4.7}$$

Taking advantage of the composition rule, Eq. 4.4, I can write:

$$\hat U(t + dt, t_0) = \hat U(t + dt, t)\,\hat U(t, t_0) = \left(\hat I - \frac{i\hat H}{\hbar}\,dt\right)\hat U(t, t_0) = \hat U(t, t_0) - \frac{i\hat H}{\hbar}\,\hat U(t, t_0)\,dt,$$

where I also used Eq. 4.7. The main difference between this last expression and Eq. 4.7 is that $t$ here can be separated from $t_0$ by a finite interval. The last equation can be rewritten in the form of a differential equation:

$$\frac{d\hat U(t, t_0)}{dt} = -\frac{i\hat H}{\hbar}\,\hat U(t, t_0). \tag{4.8}$$

Applying Eq. 4.7 to Eq. 4.2, I can also derive:

$$|\alpha(t + dt)\rangle = \left(\hat I - \frac{i}{\hbar}\hat H\,dt\right)|\alpha(t)\rangle \;\Rightarrow\; \frac{|\alpha(t + dt)\rangle - |\alpha(t)\rangle}{dt} = -\frac{i}{\hbar}\hat H\,|\alpha(t)\rangle,$$


which can be rewritten in the standard form

$$i\hbar\,\frac{d\,|\alpha\rangle}{dt} = \hat H\,|\alpha\rangle, \tag{4.9}$$

called the Schrödinger equation. Like any differential equation, Eq. 4.9 has to be complemented by an initial condition specifying the state of the system at an arbitrarily chosen initial time. Given the role of the Hamiltonian in classical mechanics discussed in Sect. 3.1, it is not very surprising that the same quantity (in its operator reincarnation) determines the dynamics of quantum systems as well.
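As a sketch of how Eq. 4.9 behaves numerically (Python; the two-level Hamiltonian and step count are arbitrary illustrative choices), note that applying Eq. 4.7 literally as a finite-step scheme is only approximately unitary, so the norm of the state slowly drifts, while the exact exponential propagator preserves it:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
H = np.array([[0.0, 1.0], [1.0, 0.0]])  # illustrative two-level Hamiltonian
psi = np.array([1.0 + 0j, 0.0])         # initial state |alpha(0)>
dt, steps = 0.001, 5000

psi_euler = psi.copy()
for _ in range(steps):
    # Eq. 4.7 applied literally: U ≈ I - iH dt/ħ (first order, not exactly unitary)
    psi_euler = psi_euler - 1j * (H @ psi_euler) * dt / hbar

psi_exact = expm(-1j * H * dt * steps / hbar) @ psi  # exact unitary evolution

print(np.linalg.norm(psi_euler))  # slightly above 1: first-order norm drift
print(np.linalg.norm(psi_exact))  # 1 to machine precision
```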

4.1.2 Stationary States

If the Hamiltonian does not contain an explicit time dependence (such dependence might appear, for instance, if an atom interacts with a time-dependent electric field $E(t)$), Eq. 4.9 has a very simple formal solution:

$$|\alpha(t)\rangle = \exp\left(-\frac{i\hat H}{\hbar}t\right)|\alpha_0\rangle, \tag{4.10}$$

where $|\alpha_0\rangle$ is the state of the system at time $t = 0$. For practical calculations, however, this solution is not very helpful because the action of the exponent of an operator on an arbitrary vector is in general not easy to compute. The situation becomes much simpler if the initial state is represented by one of the eigenvectors of the Hamiltonian. If $|\alpha_0\rangle = |\phi_n\rangle$, where $|\phi_n\rangle$ is an eigenvector of $\hat H$ with respective eigenvalue $E_n$,

$$\hat H\,|\phi_n\rangle = E_n\,|\phi_n\rangle, \tag{4.11}$$

the right-hand side of Eq. 4.10 can be computed as follows:

$$|\alpha(t)\rangle = \exp\left(-\frac{i\hat H}{\hbar}t\right)|\phi_n\rangle = \sum_{m=0}^{\infty}\frac{1}{m!}\left(\frac{-it}{\hbar}\right)^m \hat H^m\,|\phi_n\rangle = \sum_{m=0}^{\infty}\frac{1}{m!}\left(\frac{-it}{\hbar}\right)^m E_n^m\,|\phi_n\rangle = \exp\left(-\frac{iE_n}{\hbar}t\right)|\phi_n\rangle, \tag{4.12}$$

where I used the definition of the exponential function of an operator, Eq. 3.21, and the fact that

$$\hat H^m\,|\phi_n\rangle = E_n^m\,|\phi_n\rangle,$$

which is easily proved. (You will have a chance to prove it when doing your homework.) Thus, if a system is initially in a state represented by an eigenvector of


the Hamiltonian, it remains in this state forever and ever. The time-dependent factor in this case is a complex number with absolute value equal to unity (a "pure phase," as physicists like to say) and therefore does not affect any measurable quantities. Indeed, consider, for instance, the expectation value of some generic operator $\hat T$ when a system is in the state described by Eq. 4.12:

$$\langle\alpha(t)|\,\hat T\,|\alpha(t)\rangle = \exp\left(\frac{iE_n}{\hbar}t\right)\langle\phi_n|\,\hat T\,|\phi_n\rangle\exp\left(-\frac{iE_n}{\hbar}t\right) = \langle\phi_n|\,\hat T\,|\phi_n\rangle.$$
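This "pure phase" behavior can be checked directly. In the sketch below (Python; the Hamiltonian and observable matrices are arbitrary illustrations), evolving an eigenvector of $\hat H$ reproduces Eq. 4.12, and the phase drops out of the expectation value:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
H = np.array([[2.0, 0.5], [0.5, 1.0]])   # illustrative Hermitian Hamiltonian
T = np.array([[0.0, 1.0], [1.0, 0.0]])   # some generic observable
E, phi = np.linalg.eigh(H)
phi_n, E_n = phi[:, 0], E[0]             # one eigenpair: H|phi_n> = E_n|phi_n>

t = 1.7
psi_t = expm(-1j * H * t / hbar) @ phi_n
# Eq. 4.12: the evolution multiplies the eigenvector by a pure phase
assert np.allclose(psi_t, np.exp(-1j * E_n * t / hbar) * phi_n)
# The pure phase drops out of any expectation value
assert np.isclose(psi_t.conj() @ T @ psi_t, phi_n.conj() @ T @ phi_n)
```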

The eigenvector equation 4.11 is often called the time-independent Schrödinger equation, and its solutions represent the very same stationary states that were postulated by Bohr, whose existence was proven in the Davisson–Germer experiments mentioned in the Introduction, and which became the main object of Schrödinger wave mechanics. The corresponding eigenvalues are called energy levels or simply energies. Here I will use the term stationary states to designate solutions of the time-independent Schrödinger equation with the exponential time dependence attached to the eigenvectors as given by Eq. 4.12. As you just saw, this time dependence does not affect experimentally observable quantities, which remain independent of time, justifying the name "stationary" for these states.

I need to point out a general ambiguity in the relation between quantum states and the vectors representing them: the latter are always defined only up to a phase, meaning that all vectors can be multiplied by a complex number of unit magnitude without affecting any physical results. This is obvious from Eq. 4.10, which does not change upon multiplying the state vector by any constant factor. But since it is required that the states be normalized, this constant factor is limited to have a magnitude equal to unity, i.e., to be a pure phase. However, in the case presented in Eq. 4.12, the multiplying factor is not constant and, therefore, cannot simply be dismissed, making it physically significant. This significance manifests itself, however, only when we have to deal with several stationary states. Indeed, since energy is always defined only up to an additive constant, one can always make the energy eigenvalue corresponding to any one of the stationary states vanish, thereby killing the time dependence of the corresponding stationary state. This vanishing trick, however, can be achieved only for one state, while all others will retain their exponential factors, albeit with different energy values equal to the difference between their initial values and the one you chose to be equal to zero.²

In order to demonstrate that this general property of energies retains its meaning in quantum theory as well, I will consider a state evolving from a superposition of two eigenvectors of a Hamiltonian with different eigenvalues. Thus, assume that the initial state of the system is

$$|\alpha_0\rangle = a_1\,|\phi_1\rangle + a_2\,|\phi_2\rangle.$$

²Technically this can be achieved by subtracting one of the energy eigenvalues from the potential appearing in the Hamiltonian, which is equivalent to a simple change of the zero level of the energies.


Using the linearity of the time-evolution operator and the results of Eq. 4.12, I can easily compute:

$$|\alpha(t)\rangle = a_1\exp\left(-\frac{i\hat H}{\hbar}t\right)|\phi_1\rangle + a_2\exp\left(-\frac{i\hat H}{\hbar}t\right)|\phi_2\rangle = a_1\exp\left(-\frac{iE_1}{\hbar}t\right)|\phi_1\rangle + a_2\exp\left(-\frac{iE_2}{\hbar}t\right)|\phi_2\rangle = \exp\left(-\frac{iE_1}{\hbar}t\right)\left[a_1\,|\phi_1\rangle + a_2\exp\left(-i\,\frac{E_2 - E_1}{\hbar}t\right)|\phi_2\rangle\right]. \tag{4.13}$$

Equation 4.13 shows that an initial vector in the form of a superposition of two eigenvectors of a Hamiltonian evolves by "dressing up" each of the initial states with the exponential time factor containing the energy eigenvalue corresponding to the respective eigenvector. However, the absolute values of these energy eigenvalues are again not important, as the dynamics of the state is determined by the difference between them. I emphasized this point in the last line of Eq. 4.13 by factoring out one of the time-dependent exponential factors. It is clear that the overall phase factor will again disappear from all experimentally relevant expressions, and the entire time dependence will be determined by $\exp\left(-i\,\frac{E_2 - E_1}{\hbar}t\right)$. Apparently, it would not matter for this dynamics if I factored out the other exponential factor. To illustrate this point, I will now compute the expectation value of some generic operator in the state described by Eq. 4.13:

$$\langle\alpha(t)|\,\hat T\,|\alpha(t)\rangle = \exp\left(\frac{iE_1}{\hbar}t\right)\left(a_1^*\,\langle\phi_1| + a_2^*\,\langle\phi_2|\exp\left(i\,\frac{E_2 - E_1}{\hbar}t\right)\right)\hat T\left(a_1\,|\phi_1\rangle + a_2\exp\left(-i\,\frac{E_2 - E_1}{\hbar}t\right)|\phi_2\rangle\right)\exp\left(-\frac{iE_1}{\hbar}t\right) = |a_1|^2\,T_{11} + |a_2|^2\,T_{22} + T_{12}\,a_1^* a_2\exp\left(-i\,\frac{E_2 - E_1}{\hbar}t\right) + T_{21}\,a_2^* a_1\exp\left(i\,\frac{E_2 - E_1}{\hbar}t\right),$$

where $T_{ij} = \langle\phi_i|\,\hat T\,|\phi_j\rangle$. Taking into account that for Hermitian operators the diagonal elements are real-valued and the off-diagonal elements are complex conjugates of their transposed counterparts ($T_{ij} = T_{ji}^*$), this can be written as

$$\langle\alpha(t)|\,\hat T\,|\alpha(t)\rangle = |a_1|^2\,T_{11} + |a_2|^2\,T_{22} + 2\,|T_{12}|\,|a_1|\,|a_2|\cos\left(\frac{E_2 - E_1}{\hbar}t + \delta_{T_{21}} + \delta_{a_1} - \delta_{a_2}\right), \tag{4.14}$$


where the $\delta$'s are the phases of the quantities appearing in the corresponding subscripts. This expression is explicitly real and is periodic with a frequency dependent on the difference of the energies, $(E_2 - E_1)/\hbar$. I will leave it to you as an exercise to demonstrate that this result wouldn't change if you factored out $\exp\left(i\,\frac{E_2}{\hbar}t\right)$ instead of $\exp\left(i\,\frac{E_1}{\hbar}t\right)$.
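Equation 4.14 is easy to verify numerically. In the sketch below (Python; the energies, amplitudes, and the observable are arbitrary illustrative choices), the expectation value computed directly from the evolved state of Eq. 4.13 is compared with the right-hand side of Eq. 4.14:

```python
import numpy as np

hbar = 1.0
E1, E2 = 0.3, 1.1                       # two energy eigenvalues (illustrative)
a1, a2 = 0.6, 0.8j                      # amplitudes with |a1|^2 + |a2|^2 = 1
T = np.array([[0.5, 0.2 - 0.4j],
              [0.2 + 0.4j, -0.1]])      # generic Hermitian observable

def expectation(t):
    # The state of Eq. 4.13 written in the energy eigenbasis
    alpha = np.array([a1 * np.exp(-1j * E1 * t / hbar),
                      a2 * np.exp(-1j * E2 * t / hbar)])
    return (alpha.conj() @ T @ alpha).real

def formula(t):
    # Right-hand side of Eq. 4.14
    phase = (E2 - E1) * t / hbar + np.angle(T[1, 0]) + np.angle(a1) - np.angle(a2)
    return (abs(a1)**2 * T[0, 0].real + abs(a2)**2 * T[1, 1].real
            + 2 * abs(T[0, 1]) * abs(a1) * abs(a2) * np.cos(phase))

for t in np.linspace(0.0, 10.0, 7):
    assert np.isclose(expectation(t), formula(t))
```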

In the general case, expanding an arbitrary initial state vector in the basis of the eigenvectors of the Hamiltonian, you can see that the time dependence of the vector representing the state of the system is obtained by attaching the corresponding exponential factor $\exp\left(-\frac{iE_n}{\hbar}t\right)$ in front of each $|\phi_n\rangle$ term in this expansion:

$$|\alpha(t)\rangle = \sum_n a_n\exp\left(-\frac{iE_n}{\hbar}t\right)|\phi_n\rangle. \tag{4.15}$$

The expansion coefficients $a_n$ are determined by the initial state $|\alpha_0\rangle$ with the help of Eq. 2.24. Equation 4.15 essentially solves the problem of quantum dynamics, provided one knows the eigenvalues and eigenvectors of the system's Hamiltonian. For this reason, solving the time-independent Schrödinger equation is one of the main technical problems in quantum theory. Respectively, much of this text, as well as of all other books on quantum mechanics, is devoted to devising various ways of doing so.
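The recipe of Eq. 4.15 can be checked against the formal solution, Eq. 4.10, for a finite-dimensional Hamiltonian (Python sketch; the matrices are randomly generated illustrations):

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2                 # random Hermitian Hamiltonian
E, phi = np.linalg.eigh(H)               # columns of phi are eigenvectors

alpha0 = rng.normal(size=4) + 1j * rng.normal(size=4)
alpha0 /= np.linalg.norm(alpha0)

t = 2.5
a = phi.conj().T @ alpha0                          # coefficients a_n = <phi_n|alpha0>
alpha_t = phi @ (a * np.exp(-1j * E * t / hbar))   # Eq. 4.15
assert np.allclose(alpha_t, expm(-1j * H * t / hbar) @ alpha0)
```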

If the Hamiltonian has a continuous spectrum of energy, the same idea for
generating the time-dependent state from an initial state still works. One only needs
to replace the sum over the discrete index in Eq. 4.15 by an integral over a relevant
continuous quantity k labeling states of the system to get

$$|\alpha(t)\rangle = \int dk\, a(k)\exp\left(-\frac{iE_k}{\hbar}t\right)|\phi_k\rangle. \tag{4.16}$$

The coefficients $a(k)$ are again determined by the initial state in exactly the same way as in the discrete case (you would be well advised to remember, though, that the operational definitions of the inner product can be very different in the discrete and continuous cases).

4.1.3 Ehrenfest Theorem and Correspondence Principle

I want to finish the discussion of the Schrödinger picture by deriving the so-called Ehrenfest theorem, which is concerned with the dynamics of the expectation value of a generic Hermitian operator $\hat A(t)$, which might have its own explicit time dependence. Assuming that the system is in state $|\alpha(t)\rangle$, I will derive a differential equation for the quantity $\langle\hat A(t)\rangle = \langle\alpha(t)|\,\hat A\,|\alpha(t)\rangle$, where $\langle\hat A(t)\rangle$ is a frequently used shorthand notation for the expectation value. This expression can be differentiated using the standard rules for differentiation of a product of several functions:


$$\frac{d\langle\hat A(t)\rangle}{dt} = \frac{d\langle\alpha(t)|}{dt}\,\hat A\,|\alpha(t)\rangle + \langle\alpha(t)|\,\frac{\partial\hat A}{\partial t}\,|\alpha(t)\rangle + \langle\alpha(t)|\,\hat A\,\frac{d\,|\alpha(t)\rangle}{dt} = \frac{i}{\hbar}\langle\alpha(t)|\,\hat H\hat A\,|\alpha(t)\rangle - \frac{i}{\hbar}\langle\alpha(t)|\,\hat A\hat H\,|\alpha(t)\rangle + \langle\alpha(t)|\,\frac{\partial\hat A}{\partial t}\,|\alpha(t)\rangle = -\frac{i}{\hbar}\left\langle\left[\hat A, \hat H\right]\right\rangle + \left\langle\frac{\partial\hat A}{\partial t}\right\rangle. \tag{4.17}$$

In the Schrödinger picture, the operators are devoid of their own dynamics. The time derivative in the last term of the Ehrenfest theorem takes into account the possibility of an external time dependence of an operator, which is not related to its internal dynamics. This explicit time dependence is a reflection of the changing environment of the system, such as a time-dependent electromagnetic field interacting with an atom. If the operator $\hat A$ does not have such an externally imposed time dependence, then the last term in Eq. 4.17 vanishes, and the dynamics of the expectation value of the observable represented by $\hat A$ is completely determined by its commutator with the Hamiltonian.

There is a special class of observables whose operators commute with the Hamiltonian. You already know that such observables are compatible with the Hamiltonian, i.e., they have a definite value if the system is in one of its stationary states. The Ehrenfest theorem shows that such observables have an additional property: regardless of the state of the system, their expectation values do not depend on time. In other words, the expectation values of observables whose operators commute with the Hamiltonian are conserved quantities.
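A quick numerical illustration of this conservation property (Python; the matrices are arbitrary illustrations): an observable diagonal in the energy eigenbasis commutes with $\hat H$, and its expectation value stays put, while a noncommuting one oscillates:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
H = np.diag([0.0, 1.0, 3.0])
A = np.diag([2.0, -1.0, 5.0])   # diagonal in the same basis, so [A, H] = 0
B = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]]) # does not commute with H
psi0 = np.array([1.0, 1.0, 1.0]) / np.sqrt(3)

exp_A, exp_B = [], []
for t in np.linspace(0.0, 5.0, 50):
    psi = expm(-1j * H * t / hbar) @ psi0
    exp_A.append((psi.conj() @ A @ psi).real)
    exp_B.append((psi.conj() @ B @ psi).real)

print(np.ptp(exp_A))  # essentially zero: <A> is conserved
print(np.ptp(exp_B))  # of order one: <B> oscillates in time
```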

Finally, I would like you to note a remarkable similarity between the Ehrenfest theorem and Eq. 3.8, expressing the time derivative of a classical function of coordinate and momentum in terms of its Poisson bracket with the classical Hamiltonian: the two equations become identical if one makes the substitution $\{\cdot\,,\cdot\} \to -(i/\hbar)\,[\cdot\,,\cdot]$. This similarity is physically significant, as illustrated by the following example. Let me apply the Ehrenfest theorem to a very special but extremely important case: the coordinate and momentum operators of a single particle described by a time-independent Hamiltonian, like the one given in Eq. 3.48. For simplicity I will limit myself to the one-dimensional case, so that I will only need to consider one component of the position and momentum operators and can treat the potential energy as a function of a single coordinate. The Ehrenfest theorem involves commutators of the respective operators with the Hamiltonian. In the case under consideration, I have to compute $[\hat x, \hat H]$ and $[\hat p_x, \hat H]$ for $\hat H$ given by the one-dimensional version of Eq. 3.48, which I reproduce below for your convenience:

$$\hat H = \frac{\hat p_x^2}{2m} + V(\hat x).$$


The easiest commutator to compute is $[\hat x, \hat H]$:

$$\left[\hat x, \hat H\right] = \left[\hat x, \frac{\hat p_x^2}{2m}\right] = \frac{i\hbar}{m}\,\hat p_x, \tag{4.18}$$

where I used the fact that $\hat x$ commutes with $V(\hat x)$, as well as identity 3.24 and the canonical commutation relation, Eq. 3.47. It takes a bit more labor to compute $[\hat p_x, \hat H] = [\hat p_x, V(\hat x)]$. In order to evaluate this commutator, I first present the potential energy as a power series:

$$V(\hat x) = \sum_{n=0}^{\infty}\frac{1}{n!}\,\frac{d^n V}{dx^n}\,\hat x^n,$$

so that I can write

$$[\hat p_x, V(\hat x)] = \sum_{n=0}^{\infty}\frac{1}{n!}\,\frac{d^n V}{dx^n}\,[\hat p_x, \hat x^n].$$

Again using identity 3.24 to evaluate the commutator $[\hat p_x, \hat x^n]$, I get

$$[\hat p_x, \hat x^n] = -i\hbar\,n\,\hat x^{n-1},$$

substitution of which into the previous equation yields

$$[\hat p_x, V(\hat x)] = -i\hbar\sum_{n=1}^{\infty}\frac{1}{(n-1)!}\,\frac{d^n V}{dx^n}\,\hat x^{n-1} = -i\hbar\sum_{n=0}^{\infty}\frac{1}{n!}\,\frac{d^{n+1} V}{dx^{n+1}}\,\hat x^n.$$

Here I took into account that the $n = 0$ term of the initial series is a constant and vanishes upon differentiation. Correspondingly, the summation in the middle expression above begins with $n = 1$. In the next step, I changed the dummy index of summation, $n - 1 \to n$, turning the $n = 1$ term into $n = 0$, the $n = 2$ term into $n = 1$, and so on. This naturally forces the replacement of the $n$-th derivative with the $(n+1)$-th and of $\hat x^{n-1}$ with $\hat x^n$. All that is now left is to recognize that the final power series is the expansion of the derivative of the function $V(x)$:

$$\frac{dV}{dx} = \sum_{n=0}^{\infty}\frac{1}{n!}\,\frac{d^{n+1} V}{dx^{n+1}}\,\hat x^n.$$

Thus, I can proudly present

$$[\hat p_x, V(\hat x)] = -i\hbar\,\frac{dV}{dx}. \tag{4.19}$$
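Both commutators used above can be checked symbolically by letting the operators act on a test function in the position representation $\hat p_x = -i\hbar\,d/dx$ (this representation is introduced later, in Problem 52; the cubic potential below is just a sample choice):

```python
import sympy as sp

x, hbar = sp.symbols('x hbar')
f = sp.Function('f')(x)  # arbitrary test function for the operators to act on

def p_op(g):
    """Momentum operator in the position representation: -i*hbar*d/dx."""
    return -sp.I * hbar * sp.diff(g, x)

# [p, x^n] f = -i*hbar*n*x^(n-1) f for the first few powers n
for n in range(1, 6):
    comm = p_op(x**n * f) - x**n * p_op(f)
    assert sp.simplify(comm + sp.I * hbar * n * x**(n - 1) * f) == 0

# [p, V(x)] f = -i*hbar*(dV/dx) f for a sample potential V = x^3/3
V = x**3 / 3
comm = p_op(V * f) - V * p_op(f)
assert sp.simplify(comm + sp.I * hbar * sp.diff(V, x) * f) == 0
```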


Now, the Ehrenfest theorem for these two operators becomes

$$\frac{d\langle\hat x\rangle}{dt} = \frac{\langle\hat p_x\rangle}{m},\qquad \frac{d\langle\hat p_x\rangle}{dt} = -\left\langle\frac{dV}{dx}\right\rangle.$$

Repeating the same calculations for all three components of the position and momentum vectors, you can easily obtain the three-dimensional version of these equations:

$$\frac{d\langle\hat{\mathbf r}\rangle}{dt} = \frac{\langle\hat{\mathbf p}\rangle}{m}, \tag{4.20}$$

$$\frac{d\langle\hat{\mathbf p}\rangle}{dt} = -\langle\nabla V\rangle, \tag{4.21}$$

where $\nabla V$ (in case you forgot) is the gradient of $V$, defined in Cartesian coordinates with unit vectors $\mathbf e_x$, $\mathbf e_y$, and $\mathbf e_z$ in the directions of the corresponding axes $X$, $Y$, and $Z$ as

$$\nabla V = \mathbf e_x\,\frac{\partial V}{\partial x} + \mathbf e_y\,\frac{\partial V}{\partial y} + \mathbf e_z\,\frac{\partial V}{\partial z}.$$

The obtained equations resemble the classical Hamiltonian equations, but it would actually be wrong to say (as many textbooks do) that the Ehrenfest equations make the expectation values of the position and momentum operators behave like the corresponding classical quantities. In reality, these equations do not even constitute a closed system of equations, which becomes almost obvious once you realize that, generally speaking, $\langle V(\hat x)\rangle \ne V(\langle\hat x\rangle)$. Equality here is realized only if the potential energy is either a linear or a quadratic function of the coordinates. In the former case, $-d\hat V/dx = F = \text{const}$, so that the Ehrenfest equations have a simple solution:

$$\langle\hat p\rangle = p_0 + Ft,\qquad \langle\hat x\rangle = x_0 + (p_0/m)\,t + \tfrac{1}{2}(F/m)\,t^2,$$

reproducing the classical equations for a particle moving with constant acceleration. In the case of a quadratic potential (harmonic oscillator),

$$-\frac{d\hat V}{dx} = -k\hat x,$$

so that

$$-\left\langle\frac{d\hat V}{dx}\right\rangle = -k\,\langle\hat x\rangle,$$


reducing the Ehrenfest equations to the classical equations describing a harmonic oscillator. To illustrate the difficulty arising in a more general situation, consider $V(x) = ax^3/3$. In this case the Ehrenfest equations become

$$\frac{d\langle\hat x\rangle}{dt} = \frac{\langle\hat p\rangle}{m},\qquad \frac{d\langle\hat p\rangle}{dt} = -a\,\langle\hat x^2\rangle.$$

Since $\langle\hat x^2\rangle \ne \langle\hat x\rangle^2$, the resulting system of equations is not complete, because now you need to derive a separate equation for $\langle\hat x^2\rangle$. Trying to do so (see the exercises) turns into a recurring nightmare: you will end up with new variables at each step, and this process will never end. You might wonder, of course, whether it is possible to find a state for which $\langle\hat x^n\rangle = \langle\hat x\rangle^n$, in which case the Ehrenfest equations would literally coincide with the Hamiltonian equations. It is not very difficult to prove that the only state in which this might be true is the state represented by an eigenvector of the coordinate operator. Unfortunately, even if at some time $t = 0$ you could create a system in such a state, it would lose this property as it evolves in time. To see that this is indeed the case, imagine a state $|x(t)\rangle$ such that $\hat x\,|x(t)\rangle = x(t)\,|x(t)\rangle$ and try to plug it into Eq. 4.9 describing the dynamics of quantum states. You will immediately see that since the coordinate and momentum operators do not commute, this state cannot be a solution of the time-dependent Schrödinger equation.

There is, however, another way to make the Ehrenfest equations identical to their classical Hamiltonian counterparts. All you need to do is neglect the quantum uncertainties of coordinate and momentum. Since technically these uncertainties arise from the canonical commutation relation, you can do away with them by passing to the limit $\hbar \to 0$. The emergence of the Hamiltonian equations in this so-called classical limit is a very attractive and soothing feature of the quantum formalism, indicating that the developed theory adheres to the correspondence principle formulated (again!) by Niels Bohr. This principle played an important heuristic and philosophical role in the development of quantum theory. It states that quantum theory must reproduce the results of classical physics in situations where classical physics is known to be valid. Even though the concrete mathematical expressions defining the situations when the quantum description must reduce to the classical one vary from phenomenon to phenomenon, they all involve taking the limit $\hbar \to 0$. In this limit, for instance, the quantum of energy $\hbar\omega$ introduced by Planck vanishes, the de Broglie wavelength $\lambda = h/p$ goes to zero, and the quantum uncertainties of various observables, which prevented you from replacing $\langle\hat x^n\rangle \to \langle\hat x\rangle^n$, disappear.


4.2 Heisenberg Picture

As I mentioned in the beginning of this chapter, quantum dynamics can be described by imposing a time dependence on operators rather than on the states. This approach, a version of which was designed by Heisenberg, Born, and Jordan, was historically first, is directly connected with the classical Hamiltonian equations, and is quite popular in the current research literature on quantum mechanics, especially in quantum optics. However, for some reason it rarely appears in undergraduate texts on quantum theory. Probably it is believed that the idea of time-dependent operators is too complicated for the infirm minds of undergraduate physics majors to comprehend, but I personally do not see why this must be the case. So, let's try to remove the veil of mystery from this alternative version of quantum theory, called the Heisenberg picture.

At first glance, the idea of time-dependent operators seems indeed quite strange:
if an operator is, e.g., a prescription to differentiate a function, how can this rule
change with time? The best way to answer this type of question is to first develop
a formal way to describe the time dependence of operators and, then, to illustrate it
using a few simple examples.

I begin by considering the expectation value of some arbitrary operator $\hat A$ in state $|\alpha(t)\rangle$: $\langle\alpha(t)|\,\hat A\,|\alpha(t)\rangle$. Using the time-evolution operator defined in Eq. 4.2, I can present this expectation value as

$$\langle A(t)\rangle = \langle\alpha_0|\,\hat U^\dagger(t, 0)\,\hat A(t)\,\hat U(t, 0)\,|\alpha_0\rangle. \tag{4.22}$$

Lumping together all three operators appearing between the bra and ket vectors in Eq. 4.22 into a new operator

$$\hat A_H(t) = \hat U^\dagger(t, 0)\,\hat A(t)\,\hat U(t, 0) \tag{4.23}$$

yields the time-dependent Heisenberg representation of the initial operator. This time dependence has, in general, two sources: an external time dependence of the initial Schrödinger operator, discussed in the previous section, and an internal time dependence responsible for the quantum dynamics of the system, represented by the time-evolution operators. Differentiating this equation with respect to time, I obtain

$$\frac{d\hat A_H(t)}{dt} = \frac{d\hat U^\dagger(t, 0)}{dt}\,\hat A(t)\,\hat U(t, 0) + \hat U^\dagger(t, 0)\,\hat A(t)\,\frac{d\hat U(t, 0)}{dt} + \hat U^\dagger(t, 0)\,\frac{\partial\hat A(t)}{\partial t}\,\hat U(t, 0).$$


Using Eq. 4.8, this can be rewritten as

$$\frac{d\hat A_H(t)}{dt} = \frac{i}{\hbar}\,\hat U^\dagger(t, 0)\,\hat H\hat A(t)\,\hat U(t, 0) - \frac{i}{\hbar}\,\hat U^\dagger(t, 0)\,\hat A(t)\hat H\,\hat U(t, 0) + \hat U^\dagger(t, 0)\,\frac{\partial\hat A(t)}{\partial t}\,\hat U(t, 0).$$

Taking advantage of the unitarity of the time-evolution operator, I insert the combination $\hat U\hat U^\dagger \equiv \hat I$ between the Hamiltonian and the operator $\hat A$ in the first two terms of the equation above. This procedure yields

$$\frac{d\hat A_H(t)}{dt} = \frac{i}{\hbar}\underbrace{\hat U^\dagger(t, 0)\,\hat H\,\hat U}_{\hat H_H}\,\underbrace{\hat U^\dagger\hat A(t)\,\hat U(t, 0)}_{\hat A_H} - \frac{i}{\hbar}\underbrace{\hat U^\dagger(t, 0)\,\hat A(t)\,\hat U}_{\hat A_H}\,\underbrace{\hat U^\dagger\hat H\,\hat U(t, 0)}_{\hat H_H} + \underbrace{\hat U^\dagger(t, 0)\,\frac{\partial\hat A(t)}{\partial t}\,\hat U(t, 0)}_{\partial\hat A_H/\partial t},$$

where each of the bracketed terms defines, according to Eq. 4.23, the Heisenberg representation of the corresponding operator. Therefore, the last equation can be rewritten as

$$\frac{d\hat A_H(t)}{dt} = -\frac{i}{\hbar}\left[\hat A_H, \hat H_H\right] + \frac{\partial\hat A_H(t)}{\partial t}. \tag{4.24}$$

The resulting equation is called the Heisenberg equation for time-dependent operators. It looks very much like the Ehrenfest theorem, and just like the latter, it resembles the classical Eq. 3.8. However, unlike the Ehrenfest theorem, the Heisenberg equation describes the time evolution of operators rather than of expectation values and, therefore, does not suffer from the perpetual emergence of new variables. The Ehrenfest theorem can be obtained from Eq. 4.24 by computing the expectation values of both sides of this equation with an initial state $|\alpha_0\rangle$. The initial condition for the Heisenberg equation can be easily ascertained from Eq. 4.23: setting $t = 0$ in this equation immediately shows that the initial conditions for Heisenberg operators are given by the corresponding Schrödinger operators, establishing an intimate connection between the two pictures. Now you can answer the question posed in the beginning of this section: how can a rule representing an operator change with time? The time dependence comes from combinations of various immutable rules with time-dependent coefficients. You will see an example of this a few paragraphs below.

The Hamiltonian appearing in Eq. 4.24 is the Heisenberg representation of the regular Schrödinger Hamiltonian and must be evaluated before the Heisenberg equations can be used. However, in the important special case of a time-independent Schrödinger Hamiltonian, one can easily show that $\hat H_H \equiv \hat H$. Indeed, one can easily infer from


Eq. 4.10 that the time-evolution operator for a time-independent Hamiltonian is

$$\hat U(t, 0) = \exp\left(-i\hat H t/\hbar\right). \tag{4.25}$$

This operator obviously commutes with the Hamiltonian, and as a result we have

$$\hat H_H = \hat U^\dagger\hat H\,\hat U = \hat U^\dagger\hat U\,\hat H = \hat H.$$

The same argument applies to any Schrödinger operator commuting with the Hamiltonian, so all such operators remain independent of time. This result also obviously follows from Eq. 4.24. Thus, one can say that operators commuting with the Hamiltonian represent conserved quantum observables not only at the level of expectation values, as in the Ehrenfest theorem, but at the deeper level of the operators themselves.

In the case of Hamiltonians with explicit time dependence, you can no longer claim that the Heisenberg representation of the Hamiltonian coincides with the Schrödinger one. The Heisenberg picture in this case loses its immediate appeal, and people often prefer a picture intermediate between Schrödinger's and Heisenberg's, called the interaction representation. In this representation the Hamiltonian is divided into time-independent and time-dependent parts:

$$\hat H(t) = \hat H_0 + \hat V(t).$$

The transition to new operators is now carried out using the time-evolution operator $\hat U(t, 0) = \exp\left(-i\hat H_0 t/\hbar\right)$. As a result, one ends up with both the operators and the state of the system displaying time dependence: the dynamics of the operators is defined by the operator $\hat H_0$ and the dynamics of the states by $\hat V(t)$. However, something tells me that continuing with this line of thought would take me way over the line allowed in an undergraduate course. So, consider this a teaser and preview of things to come if you decide to deepen your knowledge of quantum theory.

Now back to time-independent Hamiltonians. The same calculations as in the case of the Ehrenfest equations yield the following Heisenberg equations for the one-dimensional motion of a quantum particle:

$$\frac{d\hat x}{dt} = \frac{\hat p_x}{m}, \tag{4.26}$$

$$\frac{d\hat p_x}{dt} = -\frac{d\hat V}{dx}, \tag{4.27}$$

which coincide with the respective Hamiltonian equations of classical mechanics.
Again just like in the case of Ehrenfest theorem, these equations can be easily
generalized for the three-dimensional case:


$$\frac{d\hat{\mathbf r}}{dt} = \frac{\hat{\mathbf p}}{m}, \tag{4.28}$$

$$\frac{d\hat{\mathbf p}}{dt} = -\nabla\hat V. \tag{4.29}$$

To illustrate how the Heisenberg equations work, let me consider the case of a one-dimensional harmonic oscillator, a particle moving in a quadratic potential of the form $V = \frac{1}{2}m\omega^2 x^2$, in which case Eq. 4.27 becomes

$$\frac{d\hat p_x}{dt} = -m\omega^2\,\hat x. \tag{4.30}$$

Differentiating this equation with respect to time yields a second-order differential equation:

$$\frac{d^2\hat p_x}{dt^2} = -\omega^2\,\hat p_x,$$

where the term $d\hat x/dt$ is replaced with $\hat p_x/m$ with the help of Eq. 4.26. It is straightforward to verify that the equation for the momentum operator is solved by

$$\hat p(t) = \hat\eta_1\cos\omega t + \hat\eta_2\sin\omega t, \tag{4.31}$$

while an expression for the coordinate operator is obtained from Eq. 4.30 by simple differentiation:

$$\hat x(t) = \frac{\hat\eta_1}{m\omega}\sin\omega t - \frac{\hat\eta_2}{m\omega}\cos\omega t. \tag{4.32}$$

The unknown operators $\hat\eta_{1,2}$ in Eqs. 4.31 and 4.32 are to be determined from the initial conditions. (The general solution of any linear differential equation is a combination of particular solutions with undefined constant coefficients. Since we are dealing with operator equations, these unknown coefficients must also be operators.) Substituting $t = 0$ into the solutions found above, you can see that

$$\hat\eta_1 = \hat p_{x0},\qquad \hat\eta_2 = -m\omega\,\hat x_0,$$

so that, just as I advertised, the time-dependent momentum and coordinate operators are expressed as linear combinations of the Schrödinger operators $\hat p_{x0}$ and $\hat x_0$ with time-dependent coefficients:


$$\hat p(t) = \hat p_{x0}\cos\omega t - m\omega\,\hat x_0\sin\omega t,\qquad \hat x(t) = \frac{\hat p_{x0}}{m\omega}\sin\omega t + \hat x_0\cos\omega t. \tag{4.33}$$
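The closed-form solution above can be checked against the definition of the Heisenberg representation, Eq. 4.23, using truncated matrix representations of the oscillator operators (Python sketch; the ladder-operator matrices anticipate material covered later and are used here purely as a numerical illustration, with $\hbar = m = \omega = 1$):

```python
import numpy as np
from scipy.linalg import expm

hbar = m = w = 1.0
N = 8
a = np.diag(np.sqrt(np.arange(1, N)), k=1)          # annihilation operator (truncated)
x0 = np.sqrt(hbar / (2 * m * w)) * (a + a.T)        # Schrödinger coordinate operator
p0 = 1j * np.sqrt(m * w * hbar / 2) * (a.T - a)     # Schrödinger momentum operator
H = np.diag(hbar * w * (np.arange(N) + 0.5))        # oscillator Hamiltonian

t = 0.9
U = expm(-1j * H * t / hbar)
# Heisenberg representation, Eq. 4.23, versus the closed-form solution, Eq. 4.33
x_H = U.conj().T @ x0 @ U
p_H = U.conj().T @ p0 @ U
assert np.allclose(x_H, x0 * np.cos(w * t) + p0 / (m * w) * np.sin(w * t))
assert np.allclose(p_H, p0 * np.cos(w * t) - m * w * x0 * np.sin(w * t))
```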

The obtained solution for the Heisenberg operators looks identical to the solution of the classical harmonic oscillator problem, but do not be deceived by this similarity. For instance, in the classical case one can envision initial conditions such that either $x_0$ or $p_{x0}$ (not both, of course) is zero. In the quantum case these are operators and cannot be set to zero. At this point you do not know enough about the properties of the quantum harmonic oscillator to analyze this result any further, so I will postpone doing so till later. Still, we can have a bit of fun and, as an exercise, calculate the commutator between the operators $\hat p(t)$ and $\hat x(t)$, taken not necessarily at the same time.
Example 15 (Commutator of Heisenberg Operators for a Harmonic Oscillator)

$$[\hat x(t_1), \hat p(t_2)] = \sin\omega t_1\cos\omega t_2\left[\frac{\hat p_{x0}}{m\omega}, \hat p_{x0}\right] + \cos\omega t_1\cos\omega t_2\,[\hat x_0, \hat p_{x0}] - \sin\omega t_1\sin\omega t_2\,[\hat p_{x0}, \hat x_0] - \sin\omega t_2\cos\omega t_1\,[\hat x_0, m\omega\,\hat x_0] = i\hbar\cos\omega t_1\cos\omega t_2 + i\hbar\sin\omega t_1\sin\omega t_2 = i\hbar\cos[\omega(t_1 - t_2)].$$

It is interesting to note that this commutator depends only on the time interval $t_1 - t_2$ and not on $t_1$ and $t_2$ separately. The equal-time commutator ($t_1 = t_2$) coincides with the canonical commutator for the Schrödinger coordinate and momentum operators.
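The algebra of Example 15 can also be checked with truncated oscillator matrices. Since the calculation reduces the two-time commutator to $\cos[\omega(t_1 - t_2)]\,[\hat x_0, \hat p_{x0}]$ identically, the proportionality survives truncation even though the truncated $[\hat x_0, \hat p_{x0}]$ is not exactly $i\hbar\hat I$ in its last diagonal element (Python sketch, with $\hbar = m = \omega = 1$):

```python
import numpy as np

hbar = m = w = 1.0
N = 10
a = np.diag(np.sqrt(np.arange(1, N)), k=1)       # truncated annihilation operator
x0 = np.sqrt(hbar / (2 * m * w)) * (a + a.T)
p0 = 1j * np.sqrt(m * w * hbar / 2) * (a.T - a)

def x(t):
    """Heisenberg coordinate operator from Eq. 4.33."""
    return x0 * np.cos(w * t) + p0 / (m * w) * np.sin(w * t)

def p(t):
    """Heisenberg momentum operator from Eq. 4.33."""
    return p0 * np.cos(w * t) - m * w * x0 * np.sin(w * t)

t1, t2 = 0.4, 1.3
lhs = x(t1) @ p(t2) - p(t2) @ x(t1)
# Example 15: [x(t1), p(t2)] = cos(w (t1 - t2)) [x0, p0]
rhs = np.cos(w * (t1 - t2)) * (x0 @ p0 - p0 @ x0)
assert np.allclose(lhs, rhs)
```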

It is also fun to think about eigenvectors and eigenvalues of the Heisenberg
operators in Eq. 4.33, but I will let you play this game as an exercise.

4.3 Problems

Section 4.1.1

Problem 45 Consider a Hamiltonian presented by a $2\times 2$ matrix

$$\hat H = \hbar\omega\begin{bmatrix}\cos\theta & \sin\theta\,e^{i\varphi}\\ \sin\theta\,e^{-i\varphi} & -\cos\theta\end{bmatrix}.$$

1. Find the Hermitian conjugate and inverse matrices and convince yourself that this operator is simultaneously Hermitian and unitary.

2. Using the representation of an exponential function as a power series, evaluate the time-evolution operator for this Hamiltonian.


3. Assume that the initial state of the system is given by the vector

$$|\alpha_0\rangle = \frac{1}{\sqrt 2}\begin{bmatrix}1\\ 1\end{bmatrix}$$

and find $|\alpha(t)\rangle$ using the time-evolution operator.

Section 4.1.2

Problem 46 Prove that if $\hat H\,|\phi_n\rangle = E_n\,|\phi_n\rangle$, then $\hat H^m\,|\phi_n\rangle = E_n^m\,|\phi_n\rangle$.
Problem 47 Consider a system with Hamiltonian

$$\hat H = \frac{\hat p^2}{2m} + V(r).$$

Assume that you know its eigenvectors $|\phi_n\rangle$ and eigenvalues $E_n$. Show that if you change the potential in this Hamiltonian to $V(r) - E_0$, all the eigenvectors will stay the same, the eigenvalue $E_0$ will become equal to zero, and all other eigenvalues will become $E_n - E_0$.
Problem 48 Re-derive Eq. 4.14 factoring out $\exp\left(-\frac{iE_2}{\hbar}t\right)$ instead of $\exp\left(-\frac{iE_1}{\hbar}t\right)$, and
demonstrate that this result does not change.

Problem 49 Consider a system described by a Hamiltonian

\[
\hat{H} = E_0\begin{bmatrix} 1 & ia \\ -ia & 1 \end{bmatrix}.
\]

1. Find stationary states of this Hamiltonian.
2. Assuming that at $t = 0$ the system is in the state

\[
|\alpha_0\rangle = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 \\ i \end{bmatrix},
\]

find $|\alpha(t)\rangle$ using stationary states of the Hamiltonian.

Section 4.1.3

Problem 50 Prove that $\langle\alpha|\hat{x}^2|\alpha\rangle = (\langle\alpha|\hat{x}|\alpha\rangle)^2$ if and only if $\hat{x}|\alpha\rangle = x|\alpha\rangle$. It is
not very difficult to prove that $\hat{x}|\alpha\rangle = x|\alpha\rangle$ implies $\langle\alpha|\hat{x}^2|\alpha\rangle = (\langle\alpha|\hat{x}|\alpha\rangle)^2$, but
the proof of the opposite statement requires a bit more ingenuity. You can try to
prove it by demonstrating that if $\hat{x}|\alpha\rangle \ne x|\alpha\rangle$, then $\langle\alpha|\hat{x}^2|\alpha\rangle$ cannot be equal to
$(\langle\alpha|\hat{x}|\alpha\rangle)^2$.
Problem 51 Go back to the problem involving a one-dimensional motion of a
particle in the cubic potential $\hat{V} = a\hat{x}^3/3$ discussed in Sect. 3.3.1. It has been shown
in the text that the Ehrenfest equation for $\langle\hat{p}\rangle$ involves the expectation value $\langle x^2\rangle$.
Derive the Ehrenfest equation for this quantity. Do you see expectation values of
any new operators or a combination of operators in the equation for $\langle x^2\rangle$? Derive
Ehrenfest equations for those new quantities. Comment on the results.

Section 4.2

Problem 52 You will learn in the following section that quantum states can be
described by functions of coordinates—wave functions, in which case the Schrödinger
momentum operator becomes

\[
\hat{p}_{x0}\,\psi(x) = -i\hbar\frac{d\psi}{dx}
\]

while the coordinate operator becomes simple multiplication by the coordinate,
$\hat{x}_0\,\psi(x) = x\,\psi(x)$. Using this form of time-independent operators, find the functions
representing eigenvectors of their time-dependent Heisenberg counterparts:

\[
\hat{p}(t) = \hat{p}_{x0}\cos\omega t - m\omega\,\hat{x}_0\sin\omega t
\]
\[
\hat{x}(t) = \frac{\hat{p}_{x0}}{m\omega}\sin\omega t + \hat{x}_0\cos\omega t.
\]

Analyze the behavior of these eigenvectors as functions of time; especially, consider
the limits $\omega t = \pi n$ and $\omega t = \pi/2 + \pi n$, where $n = 0, 1, 2,\dots$. Hint: the time-dependent terms
here are just parameters, and their time dependence does not affect how you shall
solve the respective differential equations.

Problem 53 Derive Heisenberg equations for the operators $\hat{a}$, $\hat{a}^\dagger$ and $\hat{b}$, $\hat{b}^\dagger$ appearing
in the following Hamiltonian:

\[
\hat{H} = \hbar\omega\,\hat{a}^\dagger\hat{a} + \hbar\Omega\,\hat{b}^\dagger\hat{b} + \gamma\left(\hat{a}^\dagger\hat{b} + \hat{b}^\dagger\hat{a}\right).
\]

Commutation relations for these operators are as follows:

\[
\left[\hat{a},\hat{a}^\dagger\right] = 1;\quad \left[\hat{b},\hat{b}^\dagger\right] = 1;\quad \left[\hat{a},\hat{b}^\dagger\right] = 0;\quad \left[\hat{a},\hat{b}\right] = 0.
\]

Chapter 5
Representations of Vectors and Operators

5.1 Representation in Continuous Basis

We have managed to get through four chapters of this text without specifying
any concrete form of the state vectors, treating them as abstractions
defined only by the rules of the games that we could play with them. This
approach is very convenient and rewarding from a theoretical point of view as it
emphasizes the generality of the quantum approach to the world and allows us to derive a
number of important general results with relative ease. However, when it comes
to responding to experimentalists' requests to explain or predict their quantitative
experimental results, we do need to have something a bit more concrete and tangible
than the idea of an abstract vector. A similar situation actually arises in the
case of our regular three-dimensional geometric vectors. It is often convenient to
think of them as purely geometrical objects (arrows, for instance) and derive results
independent of any choice of coordinate system. However, at some point you will
eventually need to get to some "down-to-earth" computations, and to carry them out,
you will have to choose a coordinate system and replace the "arrows" with a set of
numbers—the vector components.

In the case of abstract vectors that live in an abstract linear vector space, you can
use the same idea to get a more concrete and handy representation of the quantum
states. All these representations require that we use a basis in our abstract space.
It might seem more logical to begin with representations based on discrete bases, but in
reality we are somewhat forced to start with continuous ones. The reason for this
is that the two main observables in quantum mechanics, from which almost all others
can be constructed, are the position and the momentum (see Sect. 3.3.2). The operators
corresponding to these observables have continuous spectra, and, therefore, you
will have to learn how to represent these operators using continuous bases.

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,
https://doi.org/10.1007/978-3-319-71550-6_5



5.1.1 Position and Momentum Operators in a Continuous Basis

Let me begin with some abstract operator $\hat{C}$ and a continuous basis formed by
orthonormalized vectors $|q\rangle$:

\[
\langle q|q'\rangle = \delta(q - q'),
\]

where $q$ is a continuously changing parameter. You already know from Sect. 2.3 that
an abstract ket vector $|\alpha\rangle$ can be presented as an integral:

\[
|\alpha\rangle = \int dq\,\varphi_\alpha(q)\,|q\rangle. \tag{5.1}
\]

Hermitian conjugation of this expression produces its bra counterpart:

\[
\langle\alpha| = \int dq\,\varphi_\alpha^*(q)\,\langle q|. \tag{5.2}
\]

I can now write down the inner product between vectors $|\alpha\rangle$ and $|\beta\rangle$ as

\[
\langle\alpha|\beta\rangle = \int dq\int dq'\,\varphi_\alpha^*(q)\,\varphi_\beta(q')\,\langle q|q'\rangle = \int dq\int dq'\,\varphi_\alpha^*(q)\,\varphi_\beta(q')\,\delta(q - q') = \int dq\,\varphi_\alpha^*(q)\,\varphi_\beta(q). \tag{5.3}
\]

Subindexes $\alpha$ and $\beta$ in these expressions indicate the correspondence between an
abstract vector and the respective function appearing in the superposition given by
Eq. 5.1. Applying Eq. 5.3 to the case when $|\alpha\rangle = |\beta\rangle$, I reproduce Eq. 2.42, and by
recalling that all state vectors must be normalized, I end up with the condition

\[
\int dq\,\varphi_\alpha^*(q)\,\varphi_\alpha(q) = 1 \tag{5.4}
\]

generalizing Eq. 2.43, which was originally derived only for the functions of
coordinates. It should be noted here that while I am using a single variable $q$ as
an argument of the functions $\varphi(q)$, you must understand that this is just a convenient
notation, and in reality $q$ can represent several variables. For instance, eigenvectors
of the position operator depend on three components of the position vector, but we
have been using the single symbol $r$ to designate them all.

As long as we all agree on the choice of the basis, and do not change it in the
middle of a conversation (or calculation), we have a one-to-one correspondence
between abstract vectors and the respective superposition coefficients. This function
$\varphi_\alpha(q)$ provides a complete description of the corresponding vector and can, therefore,
be considered as its faithful representation. It can be expressed in terms of the
vector $|\alpha\rangle$ and the basis vectors by premultiplying Eq. 5.1 by $\langle q'|$ and using the
orthonormality condition:

\[
\varphi_\alpha(q) = \langle q|\alpha\rangle. \tag{5.5}
\]

This essentially completes the discussion of the representation of vectors, but
this was the easy part. You also need to learn how to find a representation of operators
appropriate for the developed representation of vectors, which is the hard part. For
starters, I need to explain to you what it means to represent an operator. Consider an
expression

\[
|\beta\rangle = \hat{Q}\,|\alpha\rangle
\]

where two abstract vectors are related to each other by an abstract operator $\hat{Q}$. It
seems reasonable to define a representation of the operator as such an object that
would yield the same relation between the functions $\varphi_\alpha(q)$ and $\varphi_\beta(q)$ representing the
corresponding abstract vectors $|\alpha\rangle$ and $|\beta\rangle$. In order to figure it out, let me try to
insert the completeness condition for the continuous spectrum, Eq. 3.45, formed
with basis vectors $|q\rangle$, into three places in $|\beta\rangle = \hat{Q}|\alpha\rangle$: in front of the vector $|\beta\rangle$, in
front of the operator $\hat{Q}$, and between the operator and $|\alpha\rangle$:

\[
\int dq\,|q\rangle\langle q|\beta\rangle = \int dq\int dq'\,|q\rangle\langle q|\hat{Q}|q'\rangle\langle q'|\alpha\rangle. \tag{5.6}
\]

(This amounts to inserting unity operators in all places, so, obviously, I have not
changed anything.) Using Eq. 5.5 I get

\[
\int dq\,|q\rangle\,\varphi_\beta(q) = \int dq\int dq'\,|q\rangle\langle q|\hat{Q}|q'\rangle\,\varphi_\alpha(q'),
\]

which can be rewritten as

\[
\int dq\,|q\rangle\left[\varphi_\beta(q) - \int dq'\,\langle q|\hat{Q}|q'\rangle\,\varphi_\alpha(q')\right] = 0.
\]

Since the vectors of the basis are linearly independent, the integral in this expression
can be zero only if the integrand is zero, which yields

\[
\varphi_\beta(q) = \int dq'\,\langle q|\hat{Q}|q'\rangle\,\varphi_\alpha(q') \equiv \int dq'\,Q(q,q')\,\varphi_\alpha(q') \tag{5.7}
\]

where I introduced $Q(q,q') = \langle q|\hat{Q}|q'\rangle$. Thus, the abstract operator $\hat{Q}$ in the
continuous basis takes the form of an integral operator with kernel $\langle q|\hat{Q}|q'\rangle$. If we
know how this operator acts on the basis vectors, we can determine the kernel and
replace the abstract relation $|\beta\rangle = \hat{Q}|\alpha\rangle$ by the integral relation given by Eq. 5.7.


For instance, you can easily find the kernel if the basis is formed by eigenvectors
of the operator $\hat{Q}$ itself. Indeed, if you know that $\hat{Q}|q\rangle = q|q\rangle$, then $Q(q,q') = \langle q|\hat{Q}|q'\rangle =
q\,\delta(q - q')$ and Eq. 5.7 simplifies to

\[
\varphi_\beta(q) = \int dq'\,Q(q,q')\,\varphi_\alpha(q') = q\,\varphi_\alpha(q), \tag{5.8}
\]

i.e., the integral operator is reduced to simple multiplication by a corresponding
eigenvalue, and what can be simpler? Unfortunately, life is always more complicated,
and in most cases you will have to deal simultaneously with at least two
non-commuting operators, so that eigenvectors of one operator will not be the
eigenvectors of the other. To deal with this situation, you have to learn how to
represent an operator in a basis formed by vectors that are not its eigenvectors.
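A discrete analog (my illustration, not from the text) makes Eq. 5.8 concrete: a Hermitian matrix written in its own eigenbasis is diagonal, so acting with it on the column of expansion coefficients reduces to componentwise multiplication by the eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
Q = A + A.T                      # a random Hermitian (here real symmetric) operator
q, U = np.linalg.eigh(Q)         # eigenvalues q_i and orthonormal eigenvectors (columns of U)

alpha = rng.normal(size=4)       # an arbitrary state, in the original basis
phi_alpha = U.T @ alpha          # its coefficients in the eigenbasis of Q
phi_beta = U.T @ (Q @ alpha)     # coefficients of Q|alpha> in the same eigenbasis

# In Q's own eigenbasis the "kernel" is diagonal: phi_beta_i = q_i * phi_alpha_i.
print(np.allclose(phi_beta, q * phi_alpha))  # True
```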

Most of the bases used for practical calculations are formed by eigenvectors
of some Hermitian operator, and recognition of this fact can be quite useful. So,
in addition to the original operator $\hat{Q}$ with eigenvectors $|q\rangle$ and eigenvalues $q$, let
me introduce another operator $\hat{S}$ with eigenvectors $|s\rangle$ and eigenvalues $s$. The goal
now is to find an integral kernel representing the operator $\hat{Q}$ in the basis of vectors
$|s\rangle$. In other words, I need to rewrite the expressions $\langle s|\hat{Q}|s'\rangle$ in the basis formed by
vectors $|q\rangle$. It can be done by exploiting (twice) again the same old trick with the
completeness relation expressed in terms of these vectors:

\[
\langle s|\hat{Q}|s'\rangle = \int dq\,dq'\,\langle s|q\rangle\langle q|\hat{Q}|q'\rangle\langle q'|s'\rangle.
\]

Now, taking into account that the kernel $Q(q,q')$ in the basis of its own eigenvectors is
$Q(q,q') = q\,\delta(q - q')$, I can simplify the above expression into

\[
Q(s,s') \equiv \langle s|\hat{Q}|s'\rangle = \int dq\,q\,\langle s|q\rangle\langle q|s'\rangle \tag{5.9}
\]

which gives me exactly what I have been looking for. For this expression to be
useful, however, you would need to know the functions $\chi_q(s) = \langle s|q\rangle$ and $\chi_s(q) =
\langle q|s\rangle$. The first of them can be interpreted as the representation of the eigenvector $|q\rangle$
in the basis of vectors $|s\rangle$, and the second one is clearly a representation of $|s\rangle$ in the
basis of $|q\rangle$. These functions are related to each other by the property of the inner
product described by Eq. 2.19: $\chi_q(s) = \chi_s^*(q)$. So, the whole business of finding
the kernel $Q(s,s')$ is now reduced to finding the representation of eigenvectors of $\hat{Q}$ in
terms of those of $\hat{S}$ (or vice versa).

Finding the function $\chi_s(q)$ is impossible without bringing in some additional
information. It can be, for instance, a commutator between the operators $\hat{Q}$ and $\hat{S}$,
or an outright expression for $\chi_s(q)$ obtained empirically or heuristically, on the
grounds of some physical arguments, or just by divine insight. Whatever method you
choose, you need to specify now which operators we are dealing with. Of greatest
interest are, of course, the operators of position and momentum, so let's agree to identify
the operator $\hat{Q}$ with the $x$-component $\hat{P}_x$ of the momentum operator and $\hat{S}$ with the operator
$\hat{X}$—the $x$-component of the position operator. In this case the function $\chi_q(s)$ becomes the
coordinate representation $\chi_{p_x}(x)$ of the eigenvectors of the momentum operator (of its
$x$-component, of course).

This function represents a state of the particle with definite momentum $p_x$, which,
according to the de Broglie hypothesis, corresponds to motion of a free particle and is
described by a harmonic wave with the wave vector's $x$-component $k_x = p_x/\hbar$.
Disregarding the time-dependent portion of such a wave, I can write its coordinate-dependent
part as $\chi_{p_x}(x) = a\exp(ik_x x) = a\exp(ip_x x/\hbar)$. The choice $a = 1/\sqrt{2\pi\hbar}$
generates, according to Eq. 2.36, a delta-normalized function:

\[
\chi_{p_x}(x) = \frac{1}{\sqrt{2\pi\hbar}}\exp\left(ip_x x/\hbar\right). \tag{5.10}
\]

Indeed,

\[
\frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} e^{i(p_x - p_x')x/\hbar}\,dx = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i(p_x - p_x')\tilde{x}}\,d\tilde{x} = \delta(p_x - p_x') \tag{5.11}
\]

where $\tilde{x} = x/\hbar$. Similarly, you can also find that

\[
\frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} e^{i(x - x')p_x/\hbar}\,dp_x = \delta(x - x'). \tag{5.12}
\]

Now I can write Eq. 5.9 as

\[
P_x(x,x') = \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} dp_x\,p_x\,e^{ip_x x/\hbar}\,e^{-ip_x x'/\hbar} = \frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} dp_x\,p_x\,e^{i(x - x')p_x/\hbar}.
\]

This integral might look puzzling, because it obviously diverges. Applying a
magic trick, however, I can turn it into something that actually makes sense. The
trick is quite popular, so it is useful to have it up your sleeve. Differentiation of
Eq. 5.12 with respect to $x$ produces

\[
\frac{d\delta(x - x')}{dx} = \frac{i}{2\pi\hbar^2}\int_{-\infty}^{\infty} dp_x\,p_x\,e^{i(x - x')p_x/\hbar}.
\]

I hope that you have recognized the integral of interest on the right-hand side of this
equation, so that you can find for $P_x(x,x')$:

\[
P_x(x,x') = \frac{\hbar}{i}\frac{d\delta(x - x')}{dx}. \tag{5.13}
\]


Substituting Eq. 5.13 into Eq. 5.7 with the variable $q$ replaced with $x$ (in that equation
$q$ was just a generic variable not yet identified with eigenvalues of any particular operator),
I derive a relation between the functions $\varphi_\alpha(x)$ and $\varphi_\beta(x)$ representing the states $|\alpha\rangle$ and
$|\beta\rangle$ in the representation of the eigenvectors of the position operator (the position
representation for brevity):

\[
\varphi_\beta(x) = \frac{\hbar}{i}\int dx'\,\frac{d\delta(x - x')}{dx}\,\varphi_\alpha(x') = \frac{\hbar}{i}\frac{d}{dx}\int dx'\,\varphi_\alpha(x')\,\delta(x - x') = -i\hbar\frac{d\varphi_\alpha(x)}{dx}.
\]

Thus, we see that the $\hat{P}_x$ operator in the position (coordinate) representation is
equivalent to a differential operator:

\[
\hat{p}_x = -i\hbar\frac{d}{dx} \tag{5.14}
\]

where I used the lowercase letter for a particular representation of the operator as
opposed to the uppercase used for abstract operators. The coordinate operator in
the coordinate representation is obviously just an operator of multiplication by the
coordinate's eigenvalue.

You can turn this analysis around and identify $\hat{S}$ with a component of the
momentum operator and $\hat{Q}$ with the $x$-coordinate. Then $\chi_q(s)$ becomes the momentum
representation of the eigenvector of the coordinate:

\[
\chi_x(p_x) = \langle p|x\rangle = (\langle x|p\rangle)^* = \frac{1}{\sqrt{2\pi\hbar}}\exp\left(-ip_x x/\hbar\right). \tag{5.15}
\]

Repeating all the same manipulations as before, you will end up with the momentum
representation of the coordinate operator in the form of a differential operator:

\[
\hat{x} = i\hbar\frac{d}{dp_x}. \tag{5.16}
\]

Equations 5.14 and 5.16 are obtained from each other by interchanging $x \leftrightarrow p_x$ and
complex conjugating the result. The complex conjugation bit is, of course, in sync
with the fact that the coordinate representation of the momentum's eigenvectors and
the momentum representation of the coordinate's eigenvectors are complex conjugates
of each other. Obviously, all the same arguments can be carried out for any other
Cartesian component of the position and momentum operators, which brings us to
the following conclusion. The position representation of the momentum operator is
given by the coordinate gradient operator $\vec{\nabla}$ as

\[
\hat{\mathbf{p}} = -i\hbar\vec{\nabla}, \tag{5.17}
\]

while the momentum representation of the position operator is

\[
\hat{\mathbf{r}} = i\hbar\vec{\nabla}_p \tag{5.18}
\]


where $\vec{\nabla}_p$ is defined as

\[
\vec{\nabla}_p = \mathbf{e}_x\frac{\partial}{\partial p_x} + \mathbf{e}_y\frac{\partial}{\partial p_y} + \mathbf{e}_z\frac{\partial}{\partial p_z} \tag{5.19}
\]

and $\mathbf{e}_{x,y,z}$ are unit vectors of the Cartesian coordinate system with axes $X$, $Y$, and $Z$.
The functions representing the eigenvectors of the 3-D momentum operator in the
position representation and of the 3-D position operator in the momentum representation
are, obviously, obtained by multiplying their respective one-dimensional
counterparts:

\[
\chi_{\mathbf{r}}(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\,e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar} \tag{5.20}
\]
\[
\chi_{\mathbf{p}}(\mathbf{r}) = \frac{1}{(2\pi\hbar)^{3/2}}\,e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}. \tag{5.21}
\]

You might be wondering how we ended up with $\hat{P}$ and $\hat{R}$ being represented
by differential rather than by integral operators, as was my original intention. It
happened thanks to the singular nature of the kernels $\langle r'|\hat{P}|r\rangle$ and $\langle p'|\hat{R}|p\rangle$, which
turned out to be proportional to the derivative of the delta-function. And delta-function
derivatives are quite capable of turning integrals into derivatives, as happened
in this particular case.

Having found representations of the coordinate and momentum operators, you
can easily compute their commutator. For instance, for the $x$-components in the
coordinate representation, you will easily find

\[
\left[\hat{X},\hat{P}_x\right]f(x) = -i\hbar\,x\frac{d}{dx}f(x) + i\hbar\frac{d}{dx}\left(xf(x)\right) = i\hbar f(x) \;\Rightarrow\; \left[\hat{X},\hat{P}_x\right] = i\hbar. \tag{5.22}
\]

Since this commutator is just a number, it must not depend on the particular
representation (this is why I returned to capital letters for the operators). Indeed, the
calculation carried out in the momentum representation yields

\[
\left[\hat{X},\hat{P}_x\right]f(p) = i\hbar\frac{d}{dp}\left(pf(p)\right) - i\hbar\,p\frac{d}{dp}f(p) = i\hbar f(p) \;\Rightarrow\; \left[\hat{X},\hat{P}_x\right] = i\hbar
\]

as expected.
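Both computations are easy to reproduce symbolically (a sketch of mine, using an arbitrary smooth test function $f$):

```python
import sympy as sp

x, p, hbar = sp.symbols('x p hbar')
f = sp.Function('f')

# Coordinate representation: X is multiplication by x, P = -i*hbar d/dx.
comm_x = -sp.I * hbar * x * sp.diff(f(x), x) + sp.I * hbar * sp.diff(x * f(x), x)
# Momentum representation: X = i*hbar d/dp, P is multiplication by p.
comm_p = sp.I * hbar * sp.diff(p * f(p), p) - sp.I * hbar * p * sp.diff(f(p), p)

print(sp.simplify(comm_x), sp.simplify(comm_p))  # i*hbar*f(x), i*hbar*f(p)
```

Since the commutator acting on an arbitrary $f$ reduces to $i\hbar f$ in both representations, the operator identity $[\hat{X},\hat{P}_x] = i\hbar$ is representation independent.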
I will finish this section by deriving a relation between the functions $\varphi_\alpha(\mathbf{r})$ and
$\tilde{\varphi}_\alpha(\mathbf{p})$ representing the same state $|\alpha\rangle$ correspondingly in the coordinate and
momentum representations. To achieve this, I am again resorting to the magic of
the completeness relation, based this time upon the eigenvectors of momentum. Substitution
of this relation into Eq. 5.5 adapted for the eigenvectors of the position operator
gives

\[
\varphi_\alpha(\mathbf{r}) = \langle\mathbf{r}|\alpha\rangle = \int d\mathbf{p}\,\langle\mathbf{r}|\mathbf{p}\rangle\langle\mathbf{p}|\alpha\rangle = \int d\mathbf{p}\,\chi_{\mathbf{p}}(\mathbf{r})\,\tilde{\varphi}_\alpha(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\int d^3p\,e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}\,\tilde{\varphi}_\alpha(\mathbf{p}) \tag{5.23}
\]

where I used Eq. 5.21 for the position representation of the momentum operator's
eigenvector. One can easily invert Eq. 5.23 using the Fourier representation of the delta-function,
Eq. 2.36, to obtain

\[
\tilde{\varphi}_\alpha(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\int d^3r\,e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}\,\varphi_\alpha(\mathbf{r}). \tag{5.24}
\]
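The one-dimensional analog of this Fourier pair is easy to verify numerically. As a sketch (my example, with $\hbar = 1$), take a normalized Gaussian wave packet, compute its momentum-space wave function by direct quadrature, and compare with the known analytic transform of a Gaussian; the norm comes out equal to one in either representation, as Eq. 5.4 requires:

```python
import numpy as np

hbar, sigma = 1.0, 0.8
x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]
phi_x = (np.pi * sigma**2) ** -0.25 * np.exp(-x**2 / (2 * sigma**2))  # normalized Gaussian

p = np.linspace(-10, 10, 801)
# 1D analog of Eq. 5.24: phi(p) = (2*pi*hbar)^(-1/2) * integral of e^{-i p x/hbar} phi(x) dx
phi_p = np.array([np.sum(np.exp(-1j * pj * x / hbar) * phi_x) * dx for pj in p])
phi_p /= np.sqrt(2 * np.pi * hbar)

# Known analytic transform of the Gaussian packet.
analytic = (sigma**2 / (np.pi * hbar**2)) ** 0.25 * np.exp(-sigma**2 * p**2 / (2 * hbar**2))
print(np.allclose(phi_p, analytic, atol=1e-6))   # quadrature matches the analytic result
norm = np.sum(np.abs(phi_p) ** 2) * (p[1] - p[0])
print(np.isclose(norm, 1.0, atol=1e-6))          # normalization is preserved
```

Since the Gaussian is real and even, the check is insensitive to the sign convention chosen in the exponent.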

5.1.2 Parity Operator

In this section I want to make a slight detour and define an important operator closely
related to the eigenvectors of the position operator used to introduce the position
representation of the state vectors. This operator, called the parity operator, is often used
to classify wave functions arising in this representation, so it seems quite appropriate
to talk about it here.

The parity operator is defined by its action on the eigenvectors of the position
operator $|\mathbf{r}\rangle$ as

\[
\hat{\Pi}\,|\mathbf{r}\rangle = |-\mathbf{r}\rangle, \tag{5.25}
\]

the operation often called inversion. It is easy to see that this operator is Hermitian:

\[
\langle\mathbf{r}'|\hat{\Pi}|\mathbf{r}\rangle = \langle\mathbf{r}'|-\mathbf{r}\rangle = \delta(\mathbf{r}' + \mathbf{r})
\]
\[
\left(\langle\mathbf{r}|\hat{\Pi}|\mathbf{r}'\rangle\right)^* = \left(\langle\mathbf{r}|-\mathbf{r}'\rangle\right)^* = \delta(\mathbf{r}' + \mathbf{r})
\]

and that it is equal to its inverse: $\hat{\Pi}^2|\mathbf{r}\rangle = \hat{\Pi}|-\mathbf{r}\rangle = |\mathbf{r}\rangle \Rightarrow \hat{\Pi}^2 = \hat{I}$, where $\hat{I}$ is the
identity operator. It follows immediately from the last expression that $\hat{\Pi} = \hat{\Pi}^{-1}$. It
also means that this operator is unitary. The action of this operator on an arbitrary
state can be defined using its position representation:

\[
\langle\mathbf{r}|\hat{\Pi}|\psi\rangle = \langle-\mathbf{r}|\psi\rangle = \psi(-\mathbf{r})
\]

where $\psi(\mathbf{r}) = \langle\mathbf{r}|\psi\rangle$. It is also important to know how this operator acts on the
eigenvectors of the momentum operator, which can be found out using again the
coordinate representation:

\[
\hat{\Pi}\,|\mathbf{p}\rangle = \int d\mathbf{r}\,\langle\mathbf{r}|\mathbf{p}\rangle\,\hat{\Pi}\,|\mathbf{r}\rangle = \int d\mathbf{r}\,\langle\mathbf{r}|\mathbf{p}\rangle\,|-\mathbf{r}\rangle = \int d\mathbf{r}\,\langle-\mathbf{r}|\mathbf{p}\rangle\,|\mathbf{r}\rangle.
\]

To derive this result I used the coordinate representation of $|\mathbf{p}\rangle$ and changed the
integration variable $\mathbf{r}$ to $-\mathbf{r}$ in the last integral. Finally, using Eq. 5.21 I can write
$\langle-\mathbf{r}|\mathbf{p}\rangle = \langle\mathbf{r}|-\mathbf{p}\rangle$, which results in the following transformation rule for $|\mathbf{p}\rangle$:

\[
\hat{\Pi}\,|\mathbf{p}\rangle = |-\mathbf{p}\rangle. \tag{5.26}
\]

The parity operator has only two eigenvalues: $1$ or $-1$. Indeed, assume that $|\pi\rangle$ is an
eigenvector with eigenvalue $\pi$: $\hat{\Pi}|\pi\rangle = \pi|\pi\rangle$. Apply the parity operator to this
relation again: $\hat{\Pi}^2|\pi\rangle = \pi\hat{\Pi}|\pi\rangle \Leftrightarrow |\pi\rangle = \pi^2|\pi\rangle \Rightarrow \pi = \pm 1$. Accordingly,
eigenvectors of $\hat{\Pi}$ represent those states that either do not change upon inversion (we
can call them even states) or those that change their sign (odd states). Obviously, all
even and all odd functions are coordinate representations of the eigenvectors of the
parity operator.
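A finite-dimensional sketch (mine, using a reflection matrix on a symmetric grid as a stand-in for $\hat{\Pi}$) shows both properties at once: $\hat{\Pi}^2 = \hat{I}$, the spectrum consists of $\pm 1$ only, and even and odd sample functions are eigenfunctions with eigenvalues $+1$ and $-1$:

```python
import numpy as np

n = 101                       # symmetric grid x_i = -x_{n-1-i}, so reflection maps x -> -x
P = np.fliplr(np.eye(n))      # reflection (exchange) matrix: (P f)(x) = f(-x)

print(np.allclose(P @ P, np.eye(n)))   # Pi^2 equals the identity
evals = np.linalg.eigvalsh(P)          # P is symmetric, so its spectrum is real
print(np.allclose(np.sort(np.unique(np.round(evals, 10))), [-1.0, 1.0]))

x = np.linspace(-1, 1, n)
even, odd = np.cos(np.pi * x), np.sin(np.pi * x)   # sample even and odd functions
print(np.allclose(P @ even, even), np.allclose(P @ odd, -odd))  # eigenvalues +1 and -1
```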

The parity operator is one of the simplest symmetry operators, which means that it
can be used to determine that a Hamiltonian or another operator corresponding to
a quantum observable does not change when a system is transformed in a certain
way. In fancy language, this property is called invariance with respect to a certain
transformation. To see why such invariance can be important, consider the time-independent
Schrödinger equation

\[
\hat{H}\,|\alpha\rangle = E\,|\alpha\rangle
\]

and assume that there is an operator (usually a unitary one) which can be used
to describe a transformation of the system. The parity operator is one such example: it
generates spatial inversion of the system with respect to the origin of the coordinate
system. Rotations about an axis or a point provide other examples of
transformations described by unitary operators. In what follows I will use the notation
of the parity operator for the sake of concreteness, but most conclusions in the next
paragraph will be applicable to any symmetry operator.

So, let me apply the operator $\hat{\Pi}$ to the time-independent Schrödinger equation. In
addition, I will also insert the expression $\hat{\Pi}^{-1}\hat{\Pi}$, which is obviously equal to the identity
operator, between $\hat{H}$ and $|\alpha\rangle$:

\[
\hat{\Pi}\hat{H}\hat{\Pi}^{-1}\hat{\Pi}\,|\alpha\rangle = E\,\hat{\Pi}\,|\alpha\rangle.
\]

The Schrödinger equation preserves its form when rewritten in terms of the new vector
$|\tilde{\alpha}\rangle = \hat{\Pi}|\alpha\rangle$ and the new Hamiltonian $\hat{H}' = \hat{\Pi}\hat{H}\hat{\Pi}^{-1}$. This exercise demonstrates
that a relation between vectors and operators is preserved if a transformation of a
vector is accompanied by the corresponding transformation of the operator. I can
now give a formal definition of the invariance of a system, which I earlier loosely
described by saying that "the system does not change" upon a certain operation,
i.e., the system is invariant under a transformation if its Hamiltonian obeys the


following condition: $\hat{H}' = \hat{\Pi}\hat{H}\hat{\Pi}^{-1} = \hat{H}$. One of the immediate consequences of
this condition is that if $|\alpha\rangle$ is an eigenvector of the Hamiltonian, then $|\tilde{\alpha}\rangle = \hat{\Pi}|\alpha\rangle$
is also an eigenvector. This is an important conclusion, but I cannot dwell on it
for too long as it would bring us way outside of our comfort zone. What is more
important for us is that the condition $\hat{\Pi}\hat{H}\hat{\Pi}^{-1} = \hat{H}$ implies that the Hamiltonian and the
transformation operator commute: $\hat{\Pi}\hat{H} = \hat{H}\hat{\Pi}$. This information can be immediately
put to use because we already know what this means—the transformation operator
and the Hamiltonian have a common set of eigenvectors. Usually the eigenvectors of
the former are known, and this knowledge makes finding the eigenvectors of the
latter easier. For instance, if I prove that my Hamiltonian is invariant with respect
to the parity transformation, I can immediately conclude that all eigenvectors of the
Hamiltonian are presented by either even or odd functions, which, as you will see
in Sect. 6.2, significantly simplifies their computation.
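This can be seen in a small numerical experiment (my sketch, in units $\hbar = m = \omega = 1$): a finite-difference Hamiltonian with a symmetric potential commutes with the reflection matrix, and its low-lying eigenfunctions come out alternately even and odd:

```python
import numpy as np

n = 801
x = np.linspace(-8, 8, n)
dx = x[1] - x[0]

# Finite-difference H = -(1/2) d^2/dx^2 + x^2/2 (harmonic oscillator, hbar = m = omega = 1)
lap = (np.diag(np.full(n - 1, 1.0), -1) - 2 * np.eye(n)
       + np.diag(np.full(n - 1, 1.0), 1)) / dx**2
H = -0.5 * lap + np.diag(0.5 * x**2)
Pi = np.fliplr(np.eye(n))              # parity (reflection) matrix on the symmetric grid

print(np.allclose(Pi @ H, H @ Pi))     # [Pi, H] = 0 since the potential is even
E, psi = np.linalg.eigh(H)             # E[0] is close to the exact ground level 1/2
parities = [np.sign(psi[:, k] @ (Pi @ psi[:, k])) for k in range(4)]
print(parities)  # alternating: +1 (even), -1 (odd), +1, -1
```

Because the low-lying levels are non-degenerate, each eigenvector is automatically a parity eigenvector, which is exactly the statement in the text.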

The Hamiltonian is not the only operator whose behavior under the parity transformation
is of interest. Other operators worthy of our consideration are the position and momentum
operators. Let me begin with the position operator defined, as you well know, by
$\hat{\mathbf{r}}|\mathbf{r}\rangle = \mathbf{r}|\mathbf{r}\rangle$. Performing the same manipulation with this expression as the one to
which I just subjected the Hamiltonian, I will have

\[
\hat{\Pi}\hat{\mathbf{r}}\hat{\Pi}^{-1}\hat{\Pi}\,|\mathbf{r}\rangle = \mathbf{r}\,\hat{\Pi}\,|\mathbf{r}\rangle.
\]

Using Eq. 5.25 I transform this into

\[
\hat{\Pi}\hat{\mathbf{r}}\hat{\Pi}^{-1}\,|-\mathbf{r}\rangle = \mathbf{r}\,|-\mathbf{r}\rangle
\]

which only makes sense if

\[
\hat{\Pi}\hat{\mathbf{r}}\hat{\Pi}^{-1} = -\hat{\mathbf{r}}.
\]

This result demonstrates that the position operator changes its sign upon inversion,
which, after some reflection, appears almost obvious. Operators which have this
property are called "odd" as opposed to "even" operators, which do not change upon
the parity transformation. Obviously, the inversion-invariant operators are by definition
"even." I will leave it for you as an exercise to prove that the momentum operator is
also "odd."

5.1.3 Schrödinger Equation in the Position Representation

The position representation is the most popular in practical applications of quantum
theory. This is the representation in which the original de Broglie matter waves
were described and in which Schrödinger wrote his equation. Much of classical
physics deals with processes occurring in space and time, so it is not surprising
that the wave functions written in the position representation hold a special place
in our hearts.¹ It is also important, of course, that the potential energy operator,
which might have quite an elaborate position dependence, looks the simplest in the
position representation. The momentum operator, on the other hand, does not have
a significant multiplicity of forms, appearing mostly in the kinetic energy as the $\hat{p}^2$ term,
whose coordinate representation looks quite tolerable.

To derive the coordinate representation of the Hamiltonian, I need first to
resolve a few technical questions. In particular, I need to know how to generate
a representation of the product of two operators from representations of the individual
factors. Consider, for instance, the operator expression $\hat{Q}\hat{S}$, whose integral kernel in
some basis $|\lambda\rangle$ is $\langle\lambda'|\hat{Q}\hat{S}|\lambda\rangle$. Inserting a completeness relation (again!) between the
operators, I obtain

\[
\langle\lambda'|\hat{Q}\hat{S}|\lambda\rangle = \int d\lambda''\,\langle\lambda'|\hat{Q}|\lambda''\rangle\langle\lambda''|\hat{S}|\lambda\rangle = \int d\lambda''\,Q(\lambda',\lambda'')\,S(\lambda'',\lambda). \tag{5.27}
\]

An important example is the operator $\hat{P}_x^2$, whose position representation would be useful
to know. The integral kernel for $\hat{P}_x$ was found in the previous section as $P_x(x',x'') =
-i\hbar\,\delta'(x' - x'')$, where the prime on the delta-function signifies differentiation with respect to
the first argument. Substitution of these expressions into Eq. 5.27 yields

\[
\langle x'|\hat{P}_x^2|x\rangle = -\hbar^2\int dx''\,\frac{d\delta(x' - x'')}{dx'}\,\frac{d\delta(x'' - x)}{dx''} = \left.\hbar^2\frac{d^2\delta(x' - x'')}{dx'\,dx''}\right|_{x''=x} = -\hbar^2\frac{d^2\delta(x' - x)}{dx'^2} = -\hbar^2\frac{d^2\delta(x - x')}{dx^2}
\]

where in the last line I used the evenness of the delta-function to switch from $\delta(x' - x)$
to $\delta(x - x')$ and the chain differentiation rule to change the differentiation variable.
If you plug this result into the expression

\[
\varphi_\beta(x') = \int \langle x'|\hat{P}_x^2|x\rangle\,\varphi_\alpha(x)\,dx,
\]

you will get

\[
\varphi_\beta(x') = -\hbar^2\int \frac{d^2\delta(x - x')}{dx^2}\,\varphi_\alpha(x)\,dx = -\hbar^2\frac{d^2\varphi_\alpha(x')}{dx'^2} \tag{5.28}
\]

which means that the coordinate representation of the $\hat{P}_x^2$ operator is just the square
of the $-i\hbar\,d/dx$ operator (you could have guessed this, of course, but this derivation was

¹You might remember that the lack of a spatial-temporal picture was one of the main complaints
Schrödinger leveled against Heisenberg's "transcendental" algebraic approach.


a nice exercise, wasn't it?). Obviously, the same result can be obtained for the two
other components of the momentum, which means that the operator $\hat{P}^2$ in the position
representation is given by

\[
\hat{p}^2 = -\hbar^2\nabla^2, \tag{5.29}
\]

where $\nabla^2 = \vec{\nabla}\cdot\vec{\nabla}$ is the Laplacian operator. Using Eq. 5.19, one can easily derive

\[
\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \tag{5.30}
\]

Since the action of the position operator in the position representation amounts to
simple multiplication by the position vector $\mathbf{r}$, the position representation of the
potential energy operator $V(\hat{\mathbf{r}})$ amounts to multiplication by $V(\mathbf{r})$. Thus the action
of the entire Hamiltonian in the position representation can now be described as

\[
\hat{H}_r\Psi_\alpha(\mathbf{r}) \equiv \int d\mathbf{r}'\,\langle\mathbf{r}|\hat{H}|\mathbf{r}'\rangle\,\Psi_\alpha(\mathbf{r}') = \frac{1}{2m_e}\int d\mathbf{r}'\,\langle\mathbf{r}|\hat{p}^2|\mathbf{r}'\rangle\,\Psi_\alpha(\mathbf{r}') + \int d\mathbf{r}'\,\langle\mathbf{r}|\hat{V}|\mathbf{r}'\rangle\,\Psi_\alpha(\mathbf{r}') =
\]
\[
-\frac{\hbar^2}{2m_e}\nabla^2\Psi_\alpha(\mathbf{r}) + \int d\mathbf{r}'\,V(\mathbf{r}')\,\langle\mathbf{r}|\mathbf{r}'\rangle\,\Psi_\alpha(\mathbf{r}') = -\frac{\hbar^2}{2m_e}\nabla^2\Psi_\alpha(\mathbf{r}) + \int d\mathbf{r}'\,V(\mathbf{r}')\,\delta(\mathbf{r} - \mathbf{r}')\,\Psi_\alpha(\mathbf{r}') =
\]
\[
-\frac{\hbar^2}{2m_e}\nabla^2\Psi_\alpha(\mathbf{r}) + V(\mathbf{r})\,\Psi_\alpha(\mathbf{r}), \tag{5.31}
\]

where $\Psi_\alpha(\mathbf{r})$ stands for $\langle\mathbf{r}|\alpha\rangle$. Correspondingly, I can write down the Hamiltonian
in the position representation simply as

\[
\hat{H}_r = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r},t) \tag{5.32}
\]

which acts on the functions $\Psi(\mathbf{r},t)$ realizing the position representation of the
corresponding quantum states.

The time-dependent Schrödinger equation in the coordinate representation is
obtained from Eq. 4.9 by premultiplying it with the basis bra vector $\langle\mathbf{r}|$ and using
the completeness relation:

\[
i\hbar\frac{d\langle\mathbf{r}|\alpha\rangle}{dt} = \int d\mathbf{r}'\,\langle\mathbf{r}|\hat{H}|\mathbf{r}'\rangle\langle\mathbf{r}'|\alpha\rangle.
\]


The left-hand side of this equation contains simply $\Psi(\mathbf{r},t) \equiv \langle\mathbf{r}|\alpha\rangle$ (I will drop the
subindex $\alpha$ from now on), while the right-hand side was evaluated just a few lines above
in Eq. 5.31. Thus, I can write the position representation of Eq. 4.9 as

\[
i\hbar\frac{\partial\Psi(\mathbf{r},t)}{\partial t} = \left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r},t)\right]\Psi(\mathbf{r},t). \tag{5.33}
\]

This is what most quantum mechanics textbooks call the celebrated time-dependent
Schrödinger equation governing the quantum dynamics of a single-particle
quantum state represented by the wave function $\Psi(\mathbf{r},t)$. If the potential function in
Eq. 5.33 does not depend on time, one can separate the time and coordinate dependence
of the wave function as

\[
\Psi(\mathbf{r},t) = \exp\left(-\frac{iE}{\hbar}t\right)\psi(\mathbf{r}) \tag{5.34}
\]

where $\psi(\mathbf{r})$ obeys the equation

\[
\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\psi(\mathbf{r}) = E\,\psi(\mathbf{r}) \tag{5.35}
\]

often called the time-independent Schrödinger equation. Rewritten in the form
$\hat{H}_r\psi(\mathbf{r}) = E\,\psi(\mathbf{r})$, where the subindex $r$ points to the position representation, it
becomes reminiscent of Eq. 4.11 defining the eigenvalues and eigenvectors of the
Hamiltonian. Obviously, Eq. 5.35 produces the eigenvectors of the Hamiltonian in the
position representation.

This equation, which is a linear differential equation of the second order, has to be
complemented by boundary conditions specifying the behavior of the wave functions at
infinity. They depend on the type of spectrum (discrete or continuous) the respective
wave functions belong to. If the eigenvalue $E$ belongs to a discrete spectrum, we
know from the discussion in Sect. 2.2 that the corresponding states are square-integrable,
which means that the integral $\int |\psi(\mathbf{r})|^2\,d\mathbf{r}$ taken over the entire volume (it
defines the norm of the state vector in the coordinate representation; see Eq. 2.43
or 5.4) is finite. Only functions which tend to zero fast enough as $|\mathbf{r}| \to \infty$ will
satisfy this requirement. Thus, the boundary condition for the wave functions of the
discrete spectrum can be formulated as

\[
\lim_{|\mathbf{r}|\to\infty}|\psi(\mathbf{r})| = 0. \tag{5.36}
\]

The existence of a discrete spectrum depends on the behavior of the potential
function $V(\mathbf{r})$ and is closely related to the type of classical motion at a given energy.
Imagine, for instance, that there exists a closed surface in space separating regions
where $E > V(\mathbf{r})$ from the regions where $E < V(\mathbf{r})$. A classical particle can only
exist in the former region, because the latter would correspond to negative values of
the kinetic energy. Regions where the classical kinetic energy would be positive are called
classically allowed, while regions where the kinetic energy turns negative are called
classically forbidden. The boundary between these two regions, where $E = V(\mathbf{r})$,
forms a surface which a classical particle cannot cross. Such motion of a classical
particle is called bound motion. In the quantum mechanical case, the Schrödinger
Eq. 5.35 has solutions in both regions, which, however, have completely different
behavior. An analysis in the most generic three-dimensional case is mathematically
too involved to attempt here, so I shall illustrate this difference considering a
one-dimensional model, with the wave function and the potential depending on
a single coordinate, e.g., $x$. For a classically bound motion to take place in this
case, there must exist an interval of coordinates $x_1 < x < x_2$ where $E > V(x)$,
while everywhere else $E < V(x)$. The terminal points of this interval are the so-called
turning points, where a classical particle would momentarily stop before reversing
its velocity.

It is convenient to analyze this situation quantum mechanically by rewriting the
Schrödinger equation as

\[
\frac{d^2\psi(x)}{dx^2} = \frac{2m}{\hbar^2}\left[V(x) - E\right]\psi(x). \tag{5.37}
\]

In the classically forbidden regions, which extend to infinity in both the positive and
negative directions of the coordinate axis, the second derivative of the wave function
always has the same sign as the wave function itself. It is easier to discuss the
meaning of this result by assuming that the wave function, and hence its second
derivative, is positive in the classically forbidden region. In this case, if the first
derivative is positive (the wave function grows), it becomes even more positive, so that
the wave function bends upward, growing ever faster with increasing x. If, however,
the first derivative is negative (the wave function decreases), it becomes less and less
negative, approaching zero. The wave function in this case must also asymptotically
approach zero without ever changing its sign. Such a wave function obviously
satisfies the boundary condition given in Eq. 5.36 and, therefore, corresponds to an
eigenvalue from the discrete spectrum. If the wave function is negative, all the
same arguments work, and the wave function is either monotonically decreasing,
becoming even more negative, or increasing, approaching zero from the negative
side. In the classically allowed region, the second derivative is negative, and the
solution to the equation does not have to be monotonic. The main conclusion
following from these arguments is that the energy eigenvalues belonging to an
interval corresponding to classically bound motion form, in the quantum description,
a discrete spectrum.
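To see this dichotomy concretely, here is a minimal numerical sketch (my own illustration, not part of the text) that integrates ψ″ = κ²ψ, i.e., Eq. 5.37 in a forbidden region with a constant κ = √(2m(V − E))/ℏ, set to 1 in units ℏ = m = 1:

```python
import numpy as np
from scipy.integrate import solve_ivp

# psi'' = kappa^2 * psi in a classically forbidden region; the constant
# kappa = 1 (units hbar = m = 1) is a simplifying assumption for illustration
kappa = 1.0

def rhs(x, y):                 # y = (psi, dpsi/dx)
    return [y[1], kappa**2 * y[0]]

# positive initial slope: psi keeps bending upward and diverges ~ e^{kappa x}
grow = solve_ivp(rhs, (0, 8), [1.0, kappa], rtol=1e-10, atol=1e-12)

# slope chosen as exactly -kappa * psi(0) kills the growing exponential,
# and psi decays toward zero without ever changing sign
decay = solve_ivp(rhs, (0, 8), [1.0, -kappa], rtol=1e-10, atol=1e-12)

print(grow.y[0, -1], decay.y[0, -1])   # ~ e^8 versus ~ e^-8
```

A generic initial slope always ends up dominated by the growing exponential; only the special choice ψ′(0) = −κψ(0) yields the decaying, sign-preserving solution described above, which is why acceptable eigenvalues are so special.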

Wave functions corresponding to the continuous spectrum of energy usually
appear in situations when the potential approaches a constant finite value at
infinity. If the energy E exceeds this limiting value of the potential, then asymptotically,
for large values of x, Eq. 5.37 takes the following form:

\[
\frac{d^2\psi(x)}{dx^2} = -\frac{2m}{\hbar^2}\left[E - V(\infty)\right]\psi(x)
\]

5.1 Representation in Continuous Basis 129

which has two possible solutions, ψ(x) ∝ exp(ikx) or ψ(x) ∝ exp(−ikx), where
k = √(2m[E − V(∞)])/ℏ. Any one of these asymptotic forms can be chosen as
a boundary condition at infinity: the actual choice is determined by the physical
problem at hand. This situation often appears in so-called scattering problems, when
one is interested in the behavior of a stream of particles incident on the potential
from infinity and being registered by a detector on the opposite side of the potential.
For this reason, wave functions with asymptotic behavior of this kind are called
scattering wave functions. I will talk much more about this situation in subsequent
chapters of the book.

Finally, you need to learn about the continuity properties of the wave functions.
This issue arises only if the potential V.r/ is not everywhere continuous (if the
potential is continuous, the wave functions are automatically continuous). We
require that the wave function remains continuous regardless of the discontinuity
of the potential. The physical foundation for such a requirement can be given as
follows. A discontinuity of wave function means that its first derivative becomes
infinite at the point of discontinuity, which creates a whole bunch of problems, e.g.,
the expectation value of the momentum of the particle at this point becomes infinite.

However, the continuity of the first derivative of the wave function is not necessarily
guaranteed. In the one-dimensional case, one can show that if the discontinuities
of the potential only occur in the form of finite “jumps,” the first derivative of the
wave function remains continuous (provided that the mass of the particle remains
the same on both sides of the “step in the potential”). To see this one simply
needs to integrate Eq. 5.37 over an infinitesimal interval surrounding the point of
discontinuity of the potential, xd:

\[
\lim_{\varepsilon\to 0}\int_{x_d-\varepsilon}^{x_d+\varepsilon}\frac{d^2\psi(x)}{dx^2}\,dx =
\lim_{\varepsilon\to 0}\left(\left.\frac{d\psi(x)}{dx}\right|_{x_d+\varepsilon} - \left.\frac{d\psi(x)}{dx}\right|_{x_d-\varepsilon}\right) =
\lim_{\varepsilon\to 0}\frac{2m}{\hbar^2}\int_{x_d-\varepsilon}^{x_d+\varepsilon}\left[V(x)-E\right]\psi(x)\,dx \;\Rightarrow
\]
\[
\lim_{\varepsilon\to 0}\left(\left.\frac{d\psi(x)}{dx}\right|_{x_d+\varepsilon} - \left.\frac{d\psi(x)}{dx}\right|_{x_d-\varepsilon}\right) =
\lim_{\varepsilon\to 0}\frac{2m}{\hbar^2}\left[(V_2 - V_1)\,\psi(x_d)\,\varepsilon - E\,\psi(x_d)\,\varepsilon\right] = 0 \;\Rightarrow
\]
\[
\left.\frac{d\psi(x)}{dx}\right|_{x_d+0} = \left.\frac{d\psi(x)}{dx}\right|_{x_d-0} \qquad (5.38)
\]

where V1 D V.xd �0/ and V2 D V.xd C0/. In some semiconductor heterostructures
(alternating planar layers of different semiconductors), Eq. 5.37 is sometimes used
to describe the behavior of charged particles in the so-called effective mass
approximation. In this approximation the periodic potential of ions felt by electrons
is approximately taken into account by modifying the mass of the electrons from
their normal “free electron” value. The new “effective” masses are usually different
in different materials, and if the discontinuity of the potential occurs due to an


electron passing from one semiconductor to another, its effective mass also changes.
Repeating the previous derivation, taking into account the possibility of a discontinuity of
the mass, you can derive a generalized derivative continuity condition:

\[
\frac{1}{m_1}\left.\frac{d\psi(x)}{dx}\right|_{x_d+0} = \frac{1}{m_2}\left.\frac{d\psi(x)}{dx}\right|_{x_d-0} \qquad (5.39)
\]

where m_{1,2} are the values of the effective mass on the two sides of the potential step.
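As a hedged illustration of these matching conditions at work, the following sketch (my addition; the well depth V0 = 10 and half-width a = 1 are arbitrary values, in units ℏ = m = 1) finds the even ground state of a finite square well by enforcing continuity of ψ and dψ/dx, Eq. 5.38, at the edge of the well:

```python
import numpy as np
from scipy.optimize import brentq

# Finite square well: V(x) = -V0 for |x| < a, 0 outside; units hbar = m = 1.
# Even bound state: psi = cos(kx) inside, psi = C exp(-kappa x) outside.
# Continuity of psi and dpsi/dx at x = a gives k tan(ka) = kappa.
V0, a = 10.0, 1.0

def mismatch(E):                       # bound state requires -V0 < E < 0
    k = np.sqrt(2*(V0 + E))
    kappa = np.sqrt(-2*E)
    return k*np.tan(k*a) - kappa

E0 = brentq(mismatch, -9.99, -9.0)     # bracket chosen by inspection
k, kappa = np.sqrt(2*(V0 + E0)), np.sqrt(-2*E0)

# with C fixed by continuity of psi, the derivative is continuous as well
C = np.cos(k*a)*np.exp(kappa*a)
dpsi_in = -k*np.sin(k*a)               # derivative of the inside solution at x = a
dpsi_out = -kappa*C*np.exp(-kappa*a)   # derivative of the outside solution at x = a
print(E0, dpsi_in, dpsi_out)
```

Only at the special energy E0 do the logarithmic derivatives on the two sides of the turning point agree, which is the shooting-method restatement of the discrete-spectrum argument above.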
The position representation allows for a useful and conceptually important
generalization of the idea of probability conservation expressed by the normalization
condition 2.43. Consider the following quantity:

\[
P(t) = \int_v \left|\Psi(\mathbf{r},t)\right|^2 d^3r,
\]

which yields a probability that a measurement of the particle’s position will find it
within the integration volume v. Computing the time derivative of this quantity and
utilizing Schrödinger’s equation, Eq. 5.33, you get

\[
\frac{\partial P}{\partial t} = \int_v \left[\Psi(\mathbf{r},t)\,\frac{\partial\Psi^*(\mathbf{r},t)}{\partial t} + \Psi^*(\mathbf{r},t)\,\frac{\partial\Psi(\mathbf{r},t)}{\partial t}\right]d^3r =
\]
\[
\frac{1}{i\hbar}\int_v \left\{\Psi^*(\mathbf{r},t)\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r},t)\right]\Psi(\mathbf{r},t) - \Psi(\mathbf{r},t)\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r},t)\right]\Psi^*(\mathbf{r},t)\right\}d^3r =
\]
\[
\frac{i\hbar}{2m}\int_v \left\{\Psi^*(\mathbf{r},t)\,\nabla^2\Psi(\mathbf{r},t) - \Psi(\mathbf{r},t)\,\nabla^2\Psi^*(\mathbf{r},t)\right\}d^3r. \qquad (5.40)
\]

To proceed you will need the following vector identity:

\[
\Psi^*(\mathbf{r},t)\,\nabla^2\Psi(\mathbf{r},t) - \Psi(\mathbf{r},t)\,\nabla^2\Psi^*(\mathbf{r},t) \equiv \nabla\cdot\left[\Psi^*(\mathbf{r},t)\,\nabla\Psi(\mathbf{r},t) - \Psi(\mathbf{r},t)\,\nabla\Psi^*(\mathbf{r},t)\right]
\]

which is easily proved by working it out from the right to the left. What is important
is that the expression on the right has a form of a divergence of a vector so that
Eq. 5.40 can be written as

\[
\int_v \frac{\partial}{\partial t}\left|\Psi(\mathbf{r},t)\right|^2 d^3r + \int_v \nabla\cdot\mathbf{j}\,d^3r = 0, \qquad (5.41)
\]


where I introduced a vector called the probability current density

\[
\mathbf{j} = \frac{i\hbar}{2m_e}\left[\Psi(\mathbf{r},t)\,\nabla\Psi^*(\mathbf{r},t) - \Psi^*(\mathbf{r},t)\,\nabla\Psi(\mathbf{r},t)\right]. \qquad (5.42)
\]

One important property of this quantity is that it vanishes for wave functions
representing a stationary state if the time-independent part is real. Indeed, substituting

\[
\Psi(\mathbf{r},t) = \exp(-iEt/\hbar)\,\psi(\mathbf{r}),
\]

you can see that the product of the time-dependent factors yields unity, and, if ψ(r)
is real, the remaining two terms simply cancel each other. Equation 5.42 can be
rewritten in a more illuminating form: introducing the velocity operator

\[
\hat{\mathbf{v}} \equiv \frac{\hat{\mathbf{p}}}{m_e} = -\frac{i\hbar}{m_e}\nabla
\]

it can be presented as

\[
\mathbf{j} = \frac{1}{2}\left[\Psi^*\,\hat{\mathbf{v}}\Psi + \Psi\left(\hat{\mathbf{v}}\Psi\right)^*\right]. \qquad (5.43)
\]

If you do not see an immediate usefulness of bringing out the velocity operator in
the definition of j (besides a purely aesthetic fact that Eq. 5.43 is more pleasant to
the eye), let me point out that it highlights the connection between quantum and
classical concepts of the current density. As you may remember from an introductory
physics course, the current density for any flowing quantity in classical physics can
be written as ρv, where ρ is the density of whatever does the flowing (charge,
mass, etc.) and v is the velocity of the flow. This connection becomes even more
direct for a freely propagating particle with wave function

\[
\Psi(\mathbf{r},t) = A\exp\left(-iEt/\hbar + i\mathbf{p}\cdot\mathbf{r}/\hbar\right).
\]
Substituting this wave function into Eq. 5.42 or 5.43, you will find for the quantum j:

\[
\mathbf{j} = |A|^2\,\mathbf{p}/m,
\]

which is an exact reproduction of the classical expression if you identify |A|² with ρ.
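If you want to verify the last statement symbolically, here is a one-dimensional sympy check (my own addition, not part of the text) of Eq. 5.42 for the plane wave:

```python
import sympy as sp

# One-dimensional symbolic check of Eq. 5.42 for the free-particle
# plane wave Psi = A exp(-iEt/hbar + ipx/hbar)
x, t = sp.symbols('x t', real=True)
A, p, m, E, hbar = sp.symbols('A p m E hbar', positive=True)

Psi = A*sp.exp(-sp.I*E*t/hbar + sp.I*p*x/hbar)

# j = (i hbar / 2m) [Psi dPsi*/dx - Psi* dPsi/dx]
j = sp.I*hbar/(2*m)*(Psi*sp.diff(sp.conjugate(Psi), x)
                     - sp.conjugate(Psi)*sp.diff(Psi, x))

print(sp.simplify(j))   # reproduces the classical A**2 * p / m
```

The time-dependent factors cancel between Ψ and Ψ*, so the same result follows for any fixed t.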

Using Gauss’ theorem (google it, if you do not remember!), I can rewrite
Eq. 5.41 as

\[
\int_v \frac{\partial}{\partial t}\left|\Psi(\mathbf{r},t)\right|^2 d^3r = -\oint_\Sigma \mathbf{j}\cdot\mathbf{n}\,dS \qquad (5.44)
\]

where n is a unit vector normal to the surface Σ enclosing the volume v (directed outward).
The right-hand side of Eq. 5.44 has the meaning of a flux (just like the electric field flux


in electromagnetism) characterizing the “flow” of probability across a boundary
encompassing the volume. This equation simply states that the probability “to
locate” a particle within a given volume decreases if the probability “flows” outside
of the volume and increases if the flow of probability is reversed. In this sense, this
equation is the statement of conservation of probability, just like a similar statement
in electromagnetism would mean conservation of charge, and in hydrodynamics,
conservation of mass. An alternative expression of this statement can be obtained
if you drop the volume integration in Eq. 5.41 and introduce the probability density
ρ(r, t) ≡ |Ψ(r, t)|²:

\[
\frac{\partial}{\partial t}\rho(\mathbf{r},t) + \nabla\cdot\mathbf{j} = 0. \qquad (5.45)
\]

This equation is called the probability continuity equation, and it looks very much like
any other continuity equation: in the electrodynamic context, ρ is the charge density
and j is the current density; in hydrodynamics, ρ is the density of a fluid and j is the
mass flux; in thermodynamics, ρ is the local energy density and j is the energy flux; etc.
While in quantum mechanics this equation does not describe the flow of anything
material, such as charge or mass, it has very similar empirical significance. Proba-
bility current density, for instance, determines such experimentally observable char-
acteristics as scattering cross-sections or reflection and transmission coefficients.

5.1.4 Orbital Angular Momentum in Position Representation

5.1.4.1 Operators

When I first introduced angular momentum operators in Sect. 3.3.4, I emphasized
that the importance of the angular momentum is derived from the fact that it
commutes with the Hamiltonian of a particle in a central field. At that time I
did not have the tools to prove this fact or to study the eigenvectors of the
angular momentum operators in any particular detail. Using the position representation
for these operators, I can fill some of those gaps. This representation is
generated by substituting Eq. 5.17 for the position representation of the momentum
operator into Eqs. 3.50–3.52, with the additional understanding that the action of the
position operator is reduced to mere multiplication by r. This procedure generates
the following expressions for the Cartesian components of the angular momentum
defined with respect to some coordinate axes:

\[
\hat{L}_x = -i\hbar\, y\frac{\partial}{\partial z} + i\hbar\, z\frac{\partial}{\partial y} \qquad (5.46)
\]
\[
\hat{L}_y = -i\hbar\, z\frac{\partial}{\partial x} + i\hbar\, x\frac{\partial}{\partial z} \qquad (5.47)
\]
\[
\hat{L}_z = -i\hbar\, x\frac{\partial}{\partial y} + i\hbar\, y\frac{\partial}{\partial x}. \qquad (5.48)
\]


[Fig. 5.1: Spherical coordinate system: a point (r, θ, φ) with radial coordinate r, polar angle θ, and azimuthal angle φ relative to the x, y, z axes]

These expressions imply that operators OLx;y;z act on wave functions defined in terms
of Cartesian coordinates x; y, and z of a position vector r. However, Cartesian
coordinates are not the only way to characterize a position of a point in space.
Spherical coordinates, for instance, can do the same job, and in some instances
we might want to have operators acting on functions ψ(r, θ, φ), where r, θ, and φ
are the radial, polar, and azimuthal spherical coordinates (see Fig. 5.1). To make sure
that there is no confusion left, let me reiterate: I am using spherical coordinates to
describe the position dependence of the wave functions in the coordinate representation,
but I keep using the Cartesian coordinate system to introduce the components of the
angular momentum vector and the respective operators. It is important that the two
coordinate systems are tied to each other: the spherical angles θ and φ are defined
with respect to the same axes that are used to define the Cartesian components of the
angular momentum.

To proceed with my plan, I need to remind you of the well-known relations between
Cartesian and spherical coordinates:

\[
z = r\cos\theta \qquad (5.49)
\]
\[
x = r\sin\theta\cos\varphi \qquad (5.50)
\]
\[
y = r\sin\theta\sin\varphi \qquad (5.51)
\]

and

\[
r = \sqrt{x^2 + y^2 + z^2} \qquad (5.52)
\]
\[
\theta = \arccos\left(\frac{z}{\sqrt{x^2 + y^2 + z^2}}\right) \qquad (5.53)
\]
\[
\varphi = \arctan\left(\frac{y}{x}\right). \qquad (5.54)
\]


To make the transition from the operators defined in the space of functions f(x, y, z)
to the operators acting on functions f(r, θ, φ), I shall use the regular chain rule for
differentiation of functions of several variables, which in this case takes the
following form:

\[
\frac{\partial}{\partial x} = \frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial x}\frac{\partial}{\partial\theta} + \frac{\partial\varphi}{\partial x}\frac{\partial}{\partial\varphi}
\]
\[
\frac{\partial}{\partial y} = \frac{\partial r}{\partial y}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial y}\frac{\partial}{\partial\theta} + \frac{\partial\varphi}{\partial y}\frac{\partial}{\partial\varphi}
\]
\[
\frac{\partial}{\partial z} = \frac{\partial r}{\partial z}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial z}\frac{\partial}{\partial\theta} + \frac{\partial\varphi}{\partial z}\frac{\partial}{\partial\varphi}.
\]

I will illustrate this transition by deriving the expression for L̂_z in spherical coordinates.
According to Eq. 5.48, I need the derivative operators ∂/∂x and ∂/∂y. Using Eqs. 5.52–5.54,
as well as Eqs. 5.49–5.51, I get

\[
\frac{\partial r}{\partial x} = \frac{x}{\sqrt{x^2 + y^2 + z^2}} = \sin\theta\cos\varphi.
\]

To compute the derivative ∂θ/∂x, it is more convenient to transform Eq. 5.53 into

\[
\cos\theta = \frac{z}{\sqrt{x^2 + y^2 + z^2}}
\]

and differentiate it with respect to x:

\[
-\sin\theta\,\frac{\partial\theta}{\partial x} = -\frac{zx}{\left(x^2 + y^2 + z^2\right)^{3/2}}.
\]

This expression can now be transformed into

\[
\frac{\partial\theta}{\partial x} = \frac{r^2\cos\theta\sin\theta\cos\varphi}{r^3\sin\theta} = \frac{\cos\theta\cos\varphi}{r}.
\]

Similarly, starting with tan φ = y/x, I find

\[
\frac{1}{\cos^2\varphi}\frac{\partial\varphi}{\partial x} = -\frac{y}{x^2} = -\frac{\sin\varphi}{r\sin\theta\cos^2\varphi} \;\Rightarrow\; \frac{\partial\varphi}{\partial x} = -\frac{\sin\varphi}{r\sin\theta}.
\]

Gathering all these results together, I finally have

\[
y\frac{\partial}{\partial x} = r\sin^2\theta\cos\varphi\sin\varphi\,\frac{\partial}{\partial r} + \sin\theta\sin\varphi\cos\theta\cos\varphi\,\frac{\partial}{\partial\theta} - \sin^2\varphi\,\frac{\partial}{\partial\varphi}. \qquad (5.55)
\]


Now I need to repeat these calculations for the x ∂/∂y contribution to Eq. 5.48:

\[
\frac{\partial r}{\partial y} = \frac{y}{\sqrt{x^2 + y^2 + z^2}} = \sin\theta\sin\varphi
\]
\[
-\sin\theta\,\frac{\partial\theta}{\partial y} = -\frac{zy}{\left(x^2 + y^2 + z^2\right)^{3/2}} \;\Rightarrow\; \frac{\partial\theta}{\partial y} = \frac{\cos\theta\sin\varphi}{r}
\]
\[
\frac{1}{\cos^2\varphi}\frac{\partial\varphi}{\partial y} = \frac{1}{x} = \frac{1}{r\sin\theta\cos\varphi} \;\Rightarrow\; \frac{\partial\varphi}{\partial y} = \frac{\cos\varphi}{r\sin\theta}
\]
\[
x\frac{\partial}{\partial y} = r\sin^2\theta\cos\varphi\sin\varphi\,\frac{\partial}{\partial r} + \sin\theta\sin\varphi\cos\theta\cos\varphi\,\frac{\partial}{\partial\theta} + \cos^2\varphi\,\frac{\partial}{\partial\varphi}. \qquad (5.56)
\]

Finally, combining Eqs. 5.55 and 5.56, I am getting my reward for all this hard work,
because the derived expression for L̂_z is remarkably simple:

\[
\hat{L}_z = -i\hbar\,x\frac{\partial}{\partial y} + i\hbar\,y\frac{\partial}{\partial x} = -i\hbar\frac{\partial}{\partial\varphi}. \qquad (5.57)
\]

This result justifies all the trouble involved in transitioning to spherical
coordinates. One can also derive similar expressions for the x- and y-components of the
angular momentum, but they are not as pretty:

\[
\hat{L}_x = i\hbar\left(\sin\varphi\,\frac{\partial}{\partial\theta} + \cot\theta\cos\varphi\,\frac{\partial}{\partial\varphi}\right) \qquad (5.58)
\]
\[
\hat{L}_y = i\hbar\left(-\cos\varphi\,\frac{\partial}{\partial\theta} + \cot\theta\sin\varphi\,\frac{\partial}{\partial\varphi}\right). \qquad (5.59)
\]

The remarkable simplicity of L̂_z expressed in terms of the derivative with respect
to a spherical coordinate is the main reason why it became customary to consider
the pair L̂_z, L̂² as the set of commuting operators when dealing with angular
momentum. Derivation of the expression for the operator L̂² in terms of spherical
coordinates is quite straightforward, and while it is excruciatingly tedious, it does
lead to a really awesome answer:

\[
\hat{L}^2 = -\hbar^2\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]. \qquad (5.60)
\]

However, in order to appreciate its awesomeness, you might have to google
“Laplacian operator” unless, of course, you are also awesome and remember how


[Fig. 5.2: Breaking down a classical momentum into components p_θ (normal to the position vector r) and p_r (along r)]

it looks in spherical coordinates (in Cartesian coordinates it was defined in
Eq. 5.30). For your convenience I will present it here:

\[
\nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2}\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right] \qquad (5.61)
\]

hoping that you notice that the angular part of the Laplacian (the expression in square
brackets) is identical to −L̂²/ℏ². And this fact is not left without important
consequences. Recall that the Laplacian operator defines the coordinate representation of
the kinetic energy operator K̂ = −ℏ²∇²/2m_e, which now can be written down in
spherical coordinates as

\[
\hat{K} = -\frac{\hbar^2}{2m_e r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{\hat{L}^2}{2m_e r^2}. \qquad (5.62)
\]

This presentation of the kinetic energy makes it plainly obvious that [K̂, L̂²] = 0.
Indeed, the radial part of the kinetic energy commutes with L̂² because they contain
derivatives with respect to different coordinates, and the angular part is simply
proportional to L̂², which obviously commutes with itself. To get an even better
appreciation of Eq. 5.62, it is interesting to consider the classical kinetic energy
rewritten in terms of two mutually perpendicular components of the momentum:
p_θ, which is normal to the particle's position vector, and p_r, which is aligned with
it. Taking into account that the momentum is tangential to the particle's trajectory,
you can see (Fig. 5.2) that p_θ = p sin ϑ, where ϑ is the angle between the momentum
vector and the position vector at a given point. In terms of these two
components, the kinetic energy can be presented as

\[
K = \frac{p_\theta^2}{2m_e} + \frac{p_r^2}{2m_e} = \frac{p^2\sin^2\vartheta}{2m_e} + \frac{p_r^2}{2m_e}.
\]


Now, let me play a bit with the first of these terms, multiplying its numerator and
denominator by r²:

\[
\frac{p^2\sin^2\vartheta}{2m_e} = \frac{p^2 r^2\sin^2\vartheta}{2m_e r^2}.
\]

I am sure you now recognize that the numerator of this expression is |r × p|², the square
of the classical angular momentum L = r × p. Thus, the classical kinetic
energy can be presented as

\[
K = \frac{p_r^2}{2m_e} + \frac{L^2}{2m_e r^2}
\]

where the last term “miraculously” reproduces the similar term in the quantum mechanical
Eq. 5.62. Isn't it true that physics (and math) work in mysterious ways?
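The classical identity behind this decomposition, p² = p_r² + |r × p|²/r², is easy to confirm numerically for arbitrary vectors (an illustration I am adding, with m_e = 1):

```python
import numpy as np

# Numerical check of the classical identity p^2 = p_r^2 + |r x p|^2 / r^2
# for randomly chosen vectors (m_e = 1 for simplicity)
rng = np.random.default_rng(1)
r = rng.normal(size=3)
p = rng.normal(size=3)

p_r = np.dot(r, p)/np.linalg.norm(r)   # radial component of the momentum
L = np.cross(r, p)                     # classical angular momentum

K = np.dot(p, p)/2
K_split = p_r**2/2 + np.dot(L, L)/(2*np.dot(r, r))
print(np.isclose(K, K_split))          # prints True
```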

Now I can fulfill my promise made in Sect. 3.3.4 and prove that the operators
of the angular momentum commute with the Hamiltonian if the particle's potential
energy belongs to the class of central potentials. Actually, Eq. 5.60 makes the proof
quite trivial: the angular momentum operators in the position representation contain
only derivatives with respect to the angular variables, so that if the potential energy
depends only on the radial coordinate, V = V(r) (the definition of a central potential!),
then neither L̂_z nor L̂² affects V(r), so that L̂²[V(r)ψ(r, θ, φ)] = V(r)L̂²ψ(r, θ, φ),
and the same is obviously true for the L̂_z operator. Since I already showed that the
angular momentum commutes with the kinetic energy, this last remark completes
the required proof.

The direct consequence of the vanishing commutators [Ĥ, L̂_z] = 0 and [Ĥ, L̂²] = 0 is that
the common eigenvectors of L̂² and L̂_z are also eigenvectors of Hamiltonians with
a central potential, which makes the task of finding these eigenvectors especially
important. And this is what, without further ado, I am going to do now.

5.1.4.2 Eigenvectors

First of all, let me remind you that we are looking for the functions which represent
common eigenvectors of the operators L̂² and L̂_z. This means that these functions must
simultaneously obey both equations:

\[
\hat{L}_z\,\psi_{lm}(\theta,\varphi) = \hbar m\,\psi_{lm}(\theta,\varphi) \qquad (5.63)
\]

and

\[
\hat{L}^2\,\psi_{lm}(\theta,\varphi) = \hbar^2 l(l+1)\,\psi_{lm}(\theta,\varphi). \qquad (5.64)
\]
I begin with the operator L̂_z, whose eigenvectors in the coordinate representation are
particularly easy to find. First, let me note that this operator only contains
derivatives with respect to φ, so that the angular variable θ plays here the role of a
“silent” parameter, a constant, as far as the operator L̂_z is concerned. In formal language,
it means that the dependence on θ may appear in the function ψ_lm(θ, φ) only as a factor in
front of the “main” function dependent only on φ:

\[
\psi_{lm}(\theta,\varphi) = P_l^m(\theta)\,\Phi_m(\varphi). \qquad (5.65)
\]

Substituting this form into Eq. 5.63, you can see that P_l^m(θ) indeed behaves as a
constant and can be canceled. The resulting equation for the remaining function,

\[
-i\hbar\frac{\partial\Phi_m(\varphi)}{\partial\varphi} = \hbar m\,\Phi_m(\varphi),
\]

has an obvious solution

\[
\Phi_m(\varphi) = \frac{1}{\sqrt{2\pi}}\exp\left(im\varphi\right). \qquad (5.66)
\]

Now consider how the function Φ_m(φ) evolves when the position vector rotates around
the Z axis. After one complete rotation, which corresponds to a change of φ by
2π, the position vector returns to its initial position. It would be weird
if the wave function did not return to its initial value as well. In somewhat
more sophisticated language, this means that the function Φ_m(φ) is expected to be periodic
in φ. This can only be achieved if you allow only integer values of m: m =
0, ±1, ±2, .... This is only half of the eigenvalues of the operator L̂_z found by
algebraic methods in Sect. 3.3.4. The eigenvalues corresponding to half-integer
values of m result in solutions that change sign upon rotation by 2π and
shall be discarded. It does not mean, of course, that half-integer values of m have no
place in quantum theory; it only means that they cannot correspond to eigenvectors
permitted in the position representation. The factor 1/√(2π) in Eq. 5.66 ensures that the
wave function Φ_m(φ) is normalized with respect to the inner product defined as

\[
\langle\Phi_{m_1}|\Phi_{m_2}\rangle \equiv \int_0^{2\pi}\Phi_{m_1}^*(\varphi)\,\Phi_{m_2}(\varphi)\,d\varphi = \frac{1}{2\pi}\int_0^{2\pi}\exp\left[i\left(m_2 - m_1\right)\varphi\right]d\varphi. \qquad (5.67)
\]

It is obvious that with this definition of the inner product, the functions representing
the eigenvectors are not only normalized but also orthogonal. The integral in
Eq. 5.67 is part of a surface integral carried out over the surface of a sphere, which
in spherical coordinates has the following form:

\[
\langle\psi_1|\psi_2\rangle = \int_0^{\pi}\!\!\int_0^{2\pi} d\theta\,d\varphi\,\sin\theta\;\psi_1^*(\theta,\varphi)\,\psi_2(\theta,\varphi) \qquad (5.68)
\]


where dθ dφ sin θ is the spherical area element. The remaining integration over the polar
angle θ defines the inner product for the yet unknown functions P_l^m(θ):

\[
\left\langle P_{l_1}^{m_1}\middle| P_{l_2}^{m_2}\right\rangle = \int_0^{\pi} d\theta\,\sin\theta\,\left[P_{l_1}^{m_1}(\theta)\right]^* P_{l_2}^{m_2}(\theta). \qquad (5.69)
\]

These functions are found by substituting ψ_lm(θ, φ) = P_l^m(θ) exp(imφ) into
Eq. 5.64, which results in the following equation:

\[
-\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]P_l^m(\theta)\exp\left(im\varphi\right) = l(l+1)\,P_l^m(\theta)\exp\left(im\varphi\right).
\]

Carrying out the differentiation with respect to φ and canceling the exponential
factor results in the following equation for P_l^m(θ):

\[
\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{dP_l^m}{d\theta}\right) - \frac{m^2}{\sin^2\theta}P_l^m + l(l+1)P_l^m = 0.
\]

Do you see now why I kept both indices l and m in the notation for P_l^m? By
introducing the new variable x = cos θ, this equation can be rewritten as

\[
\frac{d}{dx}\left[\left(1 - x^2\right)\frac{dP_l^m}{dx}\right] + \left[l(l+1) - \frac{m^2}{1-x^2}\right]P_l^m = 0 \qquad (5.70)
\]

where I used the relation d/dθ = (dx/dθ) d/dx = −sin θ d/dx and replaced sin²θ
with 1 − cos²θ = 1 − x².

This equation is very well known in mathematical physics as the general Legendre
equation, whose solutions can be presented in the form of associated Legendre
functions, P_l^m(x) ≡ P_l^m(cos θ). As is clear from the relation between the variables x
and cos θ, the functions P_l^m(x) are defined on the interval x ∈ [−1, 1], where they are
orthogonal with respect to the inner product ∫_{−1}^{1} P_{l_1}^m(x) P_l^m(x) dx:

\[
\int_{-1}^{1} P_{l_1}^m(x)\,P_l^m(x)\,dx = \frac{2\,(l+m)!}{(2l+1)\,(l-m)!}\,\delta_{l,l_1}. \qquad (5.71)
\]

You may want to notice that the substitution of the integration variable x D cos �
converts this integral into the form identical to the integral in Eq. 5.69.
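Equation 5.71 can be checked numerically with scipy's conventionally normalized associated Legendre functions (scipy.special.lpmv, which includes the Condon-Shortley phase); this sketch is my own addition:

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import lpmv
from math import factorial

# Numerical check of Eq. 5.71; lpmv(m, l, x) is the conventionally
# normalized associated Legendre function P_l^m(x)
def overlap(l1, l2, m):
    return quad(lambda x: lpmv(m, l1, x)*lpmv(m, l2, x), -1, 1)[0]

l, m = 3, 1
norm = 2*factorial(l + m)/((2*l + 1)*factorial(l - m))  # 2(l+m)!/[(2l+1)(l-m)!]
print(overlap(l, l, m), norm)   # both 24/7
print(overlap(l, 5, m))         # ~ 0 for l != l1
```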

The proof of orthogonality of the Legendre functions is fairly standard for
differential equations of this kind, and you will benefit from learning how to carry
it out. First, copy Eq. 5.70 for P_{l_1}^m:

\[
\frac{d}{dx}\left[\left(1 - x^2\right)\frac{dP_{l_1}^m}{dx}\right] + \left[l_1(l_1+1) - \frac{m^2}{1-x^2}\right]P_{l_1}^m = 0. \qquad (5.72)
\]


Now, multiply Eq. 5.70 by P_{l_1}^m and Eq. 5.72 by P_l^m, and integrate the resulting
expressions from −1 to 1:

\[
\int_{-1}^{1} P_{l_1}^m\,\frac{d}{dx}\left[\left(1-x^2\right)\frac{dP_l^m}{dx}\right]dx + l(l+1)\int_{-1}^{1} P_{l_1}^m(x)\,P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)\,P_l^m(x)}{1-x^2}\,dx = 0
\]
\[
\int_{-1}^{1} P_l^m\,\frac{d}{dx}\left[\left(1-x^2\right)\frac{dP_{l_1}^m}{dx}\right]dx + l_1(l_1+1)\int_{-1}^{1} P_{l_1}^m(x)\,P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)\,P_l^m(x)}{1-x^2}\,dx = 0.
\]

Integration of the first terms in both equations by parts yields

\[
-\int_{-1}^{1}\left(1-x^2\right)\frac{dP_l^m}{dx}\frac{dP_{l_1}^m}{dx}\,dx + l(l+1)\int_{-1}^{1} P_{l_1}^m(x)\,P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)\,P_l^m(x)}{1-x^2}\,dx = 0
\]
\[
-\int_{-1}^{1}\left(1-x^2\right)\frac{dP_l^m}{dx}\frac{dP_{l_1}^m}{dx}\,dx + l_1(l_1+1)\int_{-1}^{1} P_{l_1}^m(x)\,P_l^m(x)\,dx - m^2\int_{-1}^{1}\frac{P_{l_1}^m(x)\,P_l^m(x)}{1-x^2}\,dx = 0,
\]

and by subtracting these two expressions, you get

\[
\left[l(l+1) - l_1(l_1+1)\right]\int_{-1}^{1} P_{l_1}^m(x)\,P_l^m(x)\,dx = 0.
\]

It is quite obvious now that for l ≠ l_1 this equality can only hold if

\[
\int_{-1}^{1} P_{l_1}^m(x)\,P_l^m(x)\,dx = 0.
\]


The derivation of the normalization coefficient in Eq. 5.71 requires a bit more effort,
and I shall leave it for the most curious readers to discover it for themselves (google
it!). You can also notice that in the case of functions with equal l and different m,
the same line of reasoning results in a different orthogonality condition:

\[
\int_{-1}^{1}\frac{P_l^{m_1}(x)\,P_l^m(x)}{1-x^2}\,dx =
\begin{cases}
0 & m \neq m_1\\[4pt]
\dfrac{(l+m)!}{m\,(l-m)!} & m = m_1 \neq 0\\[4pt]
\infty & m = m_1 = 0
\end{cases} \qquad (5.73)
\]

where, again, derivation of the normalization integral lies outside the scope of this
text.

The associated Legendre polynomials can be computed using the following
expression:

\[
P_l^m(x) = (-1)^m\left(1 - x^2\right)^{m/2}\frac{d^{\,l+m}}{dx^{\,l+m}}\left(x^2 - 1\right)^l \qquad (5.74)
\]

where the factor (−1)^m is known as the Condon-Shortley phase and is sometimes excluded
from the definition of P_l^m(x). Equation 5.74 makes sense and gives non-zero results
if and only if l and m are integers and 0 ≤ l + m ≤ 2l, i.e., −l ≤ m ≤ l. The
integer part of this statement is obvious: derivatives of fractional order are not
something that we can live with at this point. The second part of this statement,
which reiterates what we have already learned about the relation between these
two quantum numbers in Sect. 3.3.4, can be understood by noticing that the function
(x² − 1)^l is a polynomial of order 2l and, therefore, can be differentiated no
more than 2l times before it starts producing zeroes.

The Legendre equation 5.70 is invariant (does not change) if you replace m with −m.
This means that solutions of this equation characterized by m and −m must be
proportional to each other. Indeed, one can show that the functions defined by Eq. 5.74
satisfy the following important relation:

\[
P_l^{-m}(x) = (-1)^m\,\frac{(l-m)!}{(l+m)!}\,P_l^m(x). \qquad (5.75)
\]

Finally, combining Eqs. 5.66 and 5.74 and adding the corresponding normalization
coefficients, we end up with a set of functions ψ_lm ≡ Y_l^m(θ, φ) known as spherical
harmonics and defined as

\[
Y_l^m(\theta,\varphi) = (-1)^m\sqrt{\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}}\;P_l^m(\cos\theta)\,e^{im\varphi}. \qquad (5.76)
\]

In light of the results presented above, the spherical harmonics are obviously
orthogonal and normalized:

\[
\int_0^{\pi}\!\!\int_0^{2\pi}\left[Y_l^m(\theta,\varphi)\right]^* Y_{l_1}^{m_1}(\theta,\varphi)\,\sin\theta\,d\theta\,d\varphi = \delta_{l l_1}\,\delta_{m m_1} \qquad (5.77)
\]

providing us with the position representation of normalized common eigenvectors
of operators OL2 and OLz.
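The orthonormality condition 5.77 can be verified by direct numerical integration (a sketch I am adding; note that scipy's sph_harm, renamed sph_harm_y in recent SciPy releases, takes the azimuthal angle before the polar one):

```python
import numpy as np
from scipy.integrate import dblquad
from scipy.special import sph_harm

# Numerical check of Eq. 5.77; scipy's sph_harm(m, l, phi, theta) expects
# the azimuthal angle phi first and the polar angle theta second
def overlap(l1, m1, l2, m2):
    integrand = lambda theta, phi: (np.conj(sph_harm(m1, l1, phi, theta))
                                    * sph_harm(m2, l2, phi, theta)).real*np.sin(theta)
    return dblquad(integrand, 0, 2*np.pi, 0, np.pi)[0]

print(overlap(2, 1, 2, 1))   # ~ 1 (normalization)
print(overlap(2, 1, 3, 1))   # ~ 0 (orthogonality)
```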

I will conclude this section with a brief description of main qualitative properties
of the spherical harmonics. Numerous identities and recursion relations involving
associated Legendre functions are well documented and are easily available in
the literature and on the Internet. However, it is important to have a qualitative
understanding of how spherical harmonics behave off the top of one’s head.

The first thing to notice is the symmetry of the spherical harmonics upon
inversion of the position vector with respect to the origin of the coordinate
system: r → −r. This corresponds to the transformation of the angular
spherical coordinates θ → π − θ, φ → φ + π. Under this transformation,
exp(imφ) → (−1)^m exp(imφ), while the argument x = cos θ of the associated
Legendre function P_l^m(cos θ) transforms as cos θ → cos(π − θ) = −cos θ, i.e.,
we are dealing here with the inversion x → −x. Associated Legendre functions have
a definite parity: they are either even (do not change) or odd (change sign)
when their argument changes sign. This is quite obvious from Eq. 5.74: replacing
x → −x changes neither the function being differentiated nor the factor preceding the
differentiation, while the derivatives with respect to x change sign with each
differentiation. It is obvious, therefore, that

\[
P_l^m(-x) = (-1)^{l+m}\,P_l^m(x). \qquad (5.78)
\]

Combining this result with the transformation property of exp(imφ), we have for
the spherical harmonics

\[
Y_l^m(\pi - \theta,\varphi + \pi) = (-1)^l\,Y_l^m(\theta,\varphi) \qquad (5.79)
\]

which means that the spherical harmonics have definite parity: they are either even
with respect to inversion (for even values of l) or odd, if l is an odd number.
This behavior is consistent with the fact that the operator of the orbital angular
momentum OL D Or � Op is invariant with respect to the parity transformation, and,
therefore, its eigenvectors must also be eigenvectors of the parity operator, i.e., have
a definite parity.

It is also important to have a picture of the dependence of the spherical harmonics
on their arguments. The dependence on the azimuthal angle φ is trivial: the real and
imaginary parts of a spherical harmonic oscillate with frequency m, but these
oscillations are not really significant unless we are dealing with a superposition
state comprised of several spherical harmonics with different azimuthal numbers.
For a single spherical harmonic, the relevant properties are often described by its
absolute value squared |Y_l^m(θ, φ)|², which loses all dependence on φ. The dependence on the polar


angle, contained in P_l^m(cos θ), is a more interesting matter and is determined by the
values of both quantum numbers l and m separately, as well as by their difference
l − m. For instance, for m = l, it is easy to see that

\[
P_l^l(\cos\theta) \propto \sin^l\theta,
\]

which takes zero values at θ = 0, π (the two poles of the sphere) and has a single
maximum at the equator θ = π/2. The width of the maximum (loosely defined)
becomes smaller with increasing l (the function decreases more rapidly away from the
equator for larger l). If one likes pseudoclassical mind helpers (I would not even
call them “analogies”), one can think about a particle rotating around the equator
with its angular momentum pointing in the polar direction. The larger the angular
momentum is, the more torque would be required to turn it away from the pole,
which can be loosely interpreted as a smaller probability for the particle to
deviate from the equatorial trajectory. But, please, do not take this pseudoclassical
mumbo jumbo too seriously.

The case of m = 0 corresponds to the classical angular momentum lying in
the equatorial plane, while the respective spherical harmonics reduce to the regular
Legendre polynomials:

\[
P_l(x) = \frac{d^{\,l}}{dx^{\,l}}\left(x^2 - 1\right)^l.
\]

These are the only spherical harmonics which do not vanish at the poles of the
sphere (x = ±1, or θ = 0, π), but each has zeroes between the poles, whose number is
equal to the orbital number l. Accordingly, the number of minima and maxima
of these functions is always equal to l − 1 (the only exception is l = 0, when we
are dealing with a constant). In the case of generic values of m ≠ 0, the spherical
harmonics vanish at the poles, and the number of their nodes in the polar direction is
equal to l − m. In my opinion, mastering this information will help you not
only to develop a qualitative feeling for various expressions and phenomena involving
spherical harmonics but also to make quite an impression at a cocktail party. To help
you visualize these properties, I have plotted the associated Legendre
polynomials with l = 3 in Fig. 5.3. To make the picture prettier, I normalized all
functions in the plot to bring their maximum values closer to each other; obviously
this procedure does not change their qualitative behavior.

5.2 Representations in Discrete Basis

Now let’s talk about the representation of abstract vectors in discrete bases.
Equation 3.39, which represents vector j˛i in a basis j�ni, establishes a one-
to-one correspondence between the vector and a set of coefficients an. These
coefficients are a discrete analog of functions representing vectors in continuous


[Fig. 5.3: Graphs of associated Legendre polynomials with l = 3 and 0 ≤ m ≤ 3]

basis introduced in the previous section and can be arranged in the form of a
column vector. Thus, in this case we are representing the abstract vector space by
a space of column vectors, with all the rules of matrix addition and multiplication
defined for these objects. The Hermitian conjugation in this space was discussed in
Sect. 2.2.2 and includes transitioning to the adjoint space inhabited by row vectors
with complex-conjugated elements:

\[
\langle\alpha| = \sum_n a_n^*\,\langle e_n|.
\]

The inner product now becomes a standard matrix multiplication between a row
vector on the left and a column vector on the right:

\[
\langle\alpha|\beta\rangle =
\begin{bmatrix} a_1^* & a_2^* & \cdots & a_N^* & \cdots \end{bmatrix}
\begin{bmatrix} b_1\\ b_2\\ \vdots\\ b_N\\ \vdots \end{bmatrix}
= \sum_{i=1}^{\infty} a_i^* b_i, \qquad (5.80)
\]

where b_n are the coefficients in the expansion of the ket |β⟩ in the same basis, while the
outer or tensor product |α⟩⟨β| is represented by a matrix formed according to the
rules of the matrix tensor product, Eq. 2.16:

\[
S_{nm}^{(\alpha,\beta)} =
\begin{bmatrix} a_1\\ a_2\\ \vdots\\ a_N\\ \vdots \end{bmatrix}
\begin{bmatrix} b_1^* & b_2^* & \cdots & b_N^* & \cdots \end{bmatrix}
=
\begin{bmatrix}
a_1 b_1^* & a_1 b_2^* & \cdots & a_1 b_N^* & \cdots\\
a_2 b_1^* & \ddots & \ddots & a_2 b_N^* & \ddots\\
\vdots & \ddots & \ddots & \vdots & \ddots\\
a_N b_1^* & a_N b_2^* & \cdots & a_N b_N^* & \ddots\\
\vdots & \ddots & \ddots & \ddots & \cdots
\end{bmatrix}. \qquad (5.81)
\]


Due to the normalization requirement accepted for the state vectors, the expansion
coefficients obey the following obvious “sum” rule:

$$\sum_n |a_n|^2 = 1, \tag{5.82}$$

which is, again, a discrete analog of Eq. 5.4.
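These matrix rules are easy to check numerically. Below is a minimal sketch (not from the text; the helper names are mine) implementing the inner product of Eq. 5.80, the outer-product matrix of Eq. 5.81, and the sum rule of Eq. 5.82 for vectors truncated to a finite number of expansion coefficients:

```python
# Truncated-basis sketch of Eqs. 5.80-5.82: vectors are represented by
# finite lists of complex expansion coefficients a_n.

def inner(a, b):
    # <alpha|beta> = sum_i conj(a_i) * b_i  (Eq. 5.80, truncated)
    return sum(ai.conjugate() * bi for ai, bi in zip(a, b))

def outer(a, b):
    # matrix of |alpha><beta|: S_nm = a_n * conj(b_m)  (Eq. 5.81)
    return [[an * bm.conjugate() for bm in b] for an in a]

s = 1 / 2 ** 0.5
alpha = [s, 1j * s]          # normalized: |a_1|^2 + |a_2|^2 = 1
beta = [1.0, 0.0]

print(inner(alpha, alpha))   # sum rule, Eq. 5.82: equals 1
print(outer(alpha, beta))    # 2x2 outer-product matrix
```

In an infinite-dimensional space these sums would of course run to infinity; the truncation is purely for illustration.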
If a space of states of a given quantum system can be fully described by a discrete basis, these states can always be presented in the form of column and row vectors, reducing the problem to one of matrix algebra (remember Heisenberg's matrix mechanics: this is where it finds its roots). The main difference, of course, is that in standard linear algebra problems the dimension of the space is always finite, while spaces of quantum mechanical states normally have infinite dimensionality. This creates a number of technical problems of a mathematical nature, but we shall let mathematicians worry about them. At any rate, in most practical applications of quantum theory, you will not have to deal with the entire infinite-dimensional space of states. Usually, it is possible to find a way to restrict attention to a much smaller (sometimes just two-dimensional) subspace using certain physically meaningful assumptions about the hierarchy of interactions relevant for the problem under study.

To get you started, consider this simplest of examples, which, however, often gives students a headache.

Example 16 (A Basis Vector in Its Own Basis) This example deals with the following question: what is the representation of a vector in a basis to which this vector itself belongs? In other words, if $|\chi_{\tilde n}\rangle$ is one of the set of orthogonal normalized vectors $|\chi_n\rangle$, $n = 1, 2, \ldots$, which column vector will represent it in the basis formed by these vectors? Even though the answer to this question is almost trivial, it never fails to confuse students. Assume, for instance, that $\tilde n = 1$. In this case I have $|\chi_1\rangle = 1\,|\chi_1\rangle + 0\,|\chi_2\rangle + 0\,|\chi_3\rangle + \cdots$. Obviously, the corresponding column vector is

$$\begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \end{bmatrix}.$$

Considering $\tilde n = 2$, I will similarly find that the column representing this vector contains unity in the second position and zeros everywhere else. This pattern, of course, repeats itself for all other elements of the basis: any basis vector $|\chi_{\tilde n}\rangle$ is represented, in the basis it is an element of, by a column where all components but one are zeros, and the only nonzero component, in the $\tilde n$-th place, is unity.
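The content of Example 16 can be stated in one line of code: in its own basis, the column representing the $\tilde n$-th basis vector is a "one-hot" column. A small sketch (the function name is mine):

```python
# Column representing the basis vector |chi_ntilde> in its own basis:
# component n equals the Kronecker delta delta_{n, ntilde} (Example 16).

def basis_column(ntilde, dim):
    return [1 if n == ntilde else 0 for n in range(1, dim + 1)]

print(basis_column(2, 5))   # -> [0, 1, 0, 0, 0]
```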

Now, if column vectors can represent state vectors, it is almost obvious that operators must be represented by matrices, in which case the word "act" means matrix multiplication. Matrices multiply column vectors from the left and row vectors from the right. The main question is how to construct a matrix representing


a given operator in a chosen basis. To answer this question, I will again rely on the completeness relation (its discrete-basis reincarnation, Eq. 3.42) for the basis vectors $|\chi_n\rangle$. Insertion of this relation into $|\beta\rangle = \hat T|\alpha\rangle$ yields

$$\sum_{n=0}^{\infty}|\chi_n\rangle\langle\chi_n|\beta\rangle = \sum_{n=0}^{\infty}|\chi_n\rangle\langle\chi_n|\hat T\sum_{m=0}^{\infty}|\chi_m\rangle\langle\chi_m|\alpha\rangle = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\langle\chi_n|\hat T|\chi_m\rangle\langle\chi_m|\alpha\rangle\,|\chi_n\rangle.$$

Taking into account that the coefficients $b_n$ are given by $b_n = \langle\chi_n|\beta\rangle$ and the coefficients $a_m$ by $a_m = \langle\chi_m|\alpha\rangle$, I transform the previous equation into

$$\sum_{n=0}^{\infty} b_n|\chi_n\rangle = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\langle\chi_n|\hat T|\chi_m\rangle a_m\,|\chi_n\rangle.$$

Now, thanks to the linear independence of the basis vectors, I can simply equate the coefficients in front of each $|\chi_n\rangle$ separately:

$$b_n = \sum_{m=0}^{\infty}\langle\chi_n|\hat T|\chi_m\rangle a_m. \tag{5.83}$$

This expression can be rewritten in matrix form as

$$\mathbf{b} = \mathbf{T}\cdot\mathbf{a},$$

where I am using bold Latin letters to denote the columns and matrices representing vectors and operators in a given discrete basis. This means that the required matrix representation of the given operator is

$$T_{nm} = \langle\chi_n|\hat T|\chi_m\rangle. \tag{5.84}$$

To illustrate an application of this result, consider the matrix of the operator $\hat S = |\alpha\rangle\langle\beta|$ obtained as the outer product of two vectors. Using Eq. 5.84, you immediately find

$$S_{mn} = \langle\chi_m|\alpha\rangle\langle\beta|\chi_n\rangle = a_m b_n^*,$$

in full agreement with the result obtained using the standard matrix definition of the outer product, Eq. 5.81.

The equation for eigenvectors, $\hat T|\alpha\rangle = \lambda|\alpha\rangle$, in the matrix representation is reduced to the matrix equation

$$\sum_{m=0}^{\infty} T_{nm} a_m = \lambda a_n,$$

which can be rewritten as

$$\sum_{m=0}^{\infty}\left(T_{nm} - \lambda\delta_{nm}\right)a_m = 0. \tag{5.85}$$

This is essentially a shortcut notation for a system of homogeneous linear equations, which has nontrivial (meaning non-zero) solutions only if the determinant $\det\left\|T_{nm} - \lambda\delta_{nm}\right\|$ vanishes (Cramer's rule, which has already been mentioned earlier). If you go back to Sect. 3.2.3, you will find examples of eigenvector and eigenvalue calculations with matrices.
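For the smallest nontrivial case, the vanishing-determinant condition of Eq. 5.85 can be solved directly: for a $2\times 2$ matrix it is just a quadratic equation in $\lambda$. A sketch (my own helper, not from the text):

```python
# Eigenvalues of a 2x2 matrix from det(T - lam*I) = 0 (Eq. 5.85):
# lam^2 - (tr T) * lam + det T = 0.
import cmath

def eig2(T):
    tr = T[0][0] + T[1][1]
    det = T[0][0] * T[1][1] - T[0][1] * T[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

# a Hermitian example: the eigenvalues come out real
print(eig2([[1, 1j], [-1j, 1]]))
```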

The matrix representation of operators is practically useful only if you know how the operator acts on the vectors of the chosen basis. If the operator in question is built out of the basis vectors, the problem is resolved almost trivially. For instance, the projection operators introduced in Eq. 3.41 have a simple matrix representation in the same basis in which they are defined:

$$P^{(n)}_{km} = \langle\chi_k|\chi_n\rangle\langle\chi_n|\chi_m\rangle = \delta_{kn}\delta_{nm},$$

which is a matrix with a single non-zero element $k = m = n$ on the main diagonal. You can find other examples of matrix representations for operators of this kind in the exercises in this chapter.

In most cases the issue of finding how operators act on the basis vectors is not that trivial. Often it is resolved by using the position representation of the operators and the basis vectors. This approach works especially well for the class of operators that can be presented as combinations of position and momentum operators. This class includes many important operators, but not all of them.

5.2.1 Discrete Representation from a Continuous One

To illustrate this point, let me consider an example of a single particle of mass $m$ allowed to move freely along a linear segment of finite length $L$. The probability that the particle's position is anywhere outside of this segment is assumed to be zero. This condition is most naturally expressed in the position representation, where the Hamiltonian, which contains only the kinetic energy term, takes the form

$$\hat H = \frac{\hat p_x^2}{2m} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2}. \tag{5.86}$$


The confinement of the particle inside the specified linear segment is formally expressed by the requirement that the wave function $\psi(x)$ representing states of the system be equal to zero outside of the allowed interval. The continuity of the wave function then requires that it also vanish at the terminal points of this interval. Choosing the origin of the coordinate system at the left end of the allowed interval, and assigning coordinate $x = L$ to its right end, I can express the confinement conditions by requiring that the wave function vanish at both ends of the interval:

$$\psi(0) = \psi(L) = 0. \tag{5.87}$$

It is easy to check that the Schrödinger equation 5.37 for a free particle ($V(x) = 0$) has two linearly independent solutions $\psi_+(x) = \exp(ikx)$ and $\psi_-(x) = \exp(-ikx)$, where $k = \sqrt{2mE}/\hbar$. Now I need to construct a linear combination of these functions obeying the confinement conditions, Eq. 5.87. Beginning with a general solution

$$\psi(x) = Ae^{ikx} + Be^{-ikx},$$

I find that the requirement $\psi(0) = 0$ yields $A + B = 0$, which allows me to write the wave function as

$$\psi(x) = A\sin kx.$$

(I used Euler's formula $\sin kx = \left(\exp(ikx) - \exp(-ikx)\right)/2i$ and incorporated the constant $2i$ into the coefficient $A$.) The condition at $x = L$ yields

$$A\sin kL = 0,$$

with two possible ways to fulfill it. One is to make $A = 0$, in which case the entire wave function vanishes, and we definitely do not want this to happen. Thus, you are stuck with the only other option, namely, to require that

$$kL = \pi n, \quad n = 1, 2, \ldots.$$

This result means that the states of the system considered in this example can only be presented by a discrete set of wave functions

$$\psi_n(x) = A\sin k_n x,$$

characterized by the parameter

$$k_n = \frac{\pi n}{L} \tag{5.88}$$


with corresponding discrete energy levels

$$E_n = \frac{\hbar^2\pi^2 n^2}{2mL^2}. \tag{5.89}$$

The appearance of the discrete spectrum is not surprising here, of course, since the classical motion in this example is clearly bound. The coefficient $A$ remains unknown at this point: it cannot be fixed by the boundary conditions, which is a fairly typical situation in problems of this kind. I, however, have one additional weapon at my disposal, the normalization condition, which in this case reads (using the standard definition of the inner product for square-integrable functions)

$$\int_{-\infty}^{\infty}|\psi(x)|^2\,dx = |A|^2\int_0^L \sin^2 k_n x\,dx = 1 \;\Rightarrow\; A = \sqrt{\frac{2}{L}},$$

where at the last step I chose $A$ to be a real positive quantity. This choice, while pleasing to the eye, does not make any difference, since the normalization condition only defines $A$ up to an arbitrary phase factor of the form $\exp(i\varphi)$, in alignment with the already mentioned general principle that vectors representing quantum states are always defined only up to a phase.

The system of wave functions

$$\psi_n(x) = \sqrt{\frac{2}{L}}\sin\frac{\pi n x}{L} \tag{5.90}$$

forms a normalized orthogonal basis, which can be used to present any other wave function defined on the interval $x\in[0,L]$ (one can recognize here just a Fourier series expansion for a function defined on a finite interval). This basis can also be used to represent various operators acting on such functions. For instance, the Hamiltonian, Eq. 5.86, in this basis is represented by an infinite diagonal matrix:

$$H_{mn} = -\frac{\hbar^2}{2m}\frac{2}{L}\int_0^L\sin\frac{\pi m x}{L}\,\frac{d^2}{dx^2}\sin\frac{\pi n x}{L}\,dx = \frac{\hbar^2\pi^2 n^2}{2mL^2}\frac{2}{L}\int_0^L\sin\frac{\pi m x}{L}\sin\frac{\pi n x}{L}\,dx = \frac{\hbar^2\pi^2 n^2}{2mL^2}\delta_{mn}. \tag{5.91}$$
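It is instructive to verify Eq. 5.91 numerically. The sketch below (units with $\hbar = m = L = 1$ and a midpoint-rule integral; all choices are mine) confirms that the matrix is diagonal with $E_n = \pi^2 n^2/2$ on the diagonal:

```python
# Numerical check of Eq. 5.91 in units hbar = m = L = 1: acting with
# -1/2 d^2/dx^2 on psi_n multiplies it by E_n = (pi*n)^2 / 2, so
# H_mn reduces to E_n times the overlap integral of psi_m and psi_n.
import math

N = 20000  # midpoint-rule points on [0, 1]

def psi(n, x):
    return math.sqrt(2.0) * math.sin(math.pi * n * x)

def H_mn(m, n):
    En = (math.pi * n) ** 2 / 2
    dx = 1.0 / N
    overlap = sum(psi(m, (k + 0.5) * dx) * psi(n, (k + 0.5) * dx)
                  for k in range(N)) * dx
    return En * overlap

print(H_mn(1, 1))   # close to pi^2/2 ~ 4.9348
print(H_mn(1, 2))   # close to 0
```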

Now, assume that the particle that you follow is also subjected to an external uniform electric field (with all other conditions and limitations intact). This will add a potential energy term to the Hamiltonian of the form $V(x) = eFx$, where $e$ is the absolute value of the particle's charge, presumed to be negative, and $F$ is the magnitude of the field. Now, I want you to try to present the new Hamiltonian of the particle,

$$\hat H = \frac{\hat p_x^2}{2m} + eFx,$$

in the same basis of functions $\psi_n(x)$ defined in Eq. 5.90. The resulting matrix will have the diagonal part given in Eq. 5.91 and a part that can be written as $eFx_{mn}$, where
where

$$x_{mn} = \frac{2}{L}\int_0^L x\sin\frac{\pi n x}{L}\sin\frac{\pi m x}{L}\,dx = \frac{1}{L}\int_0^L x\left[\cos\frac{\pi(n-m)x}{L} - \cos\frac{\pi(n+m)x}{L}\right]dx = \tag{5.92}$$

$$\frac{1}{L}\frac{L}{\pi}\left[\frac{x}{n-m}\sin\frac{\pi(n-m)x}{L} - \frac{x}{n+m}\sin\frac{\pi(n+m)x}{L}\right]_0^L - \frac{1}{L}\frac{L}{\pi}\left[\frac{1}{n-m}\int_0^L\sin\frac{\pi(n-m)x}{L}\,dx - \frac{1}{n+m}\int_0^L\sin\frac{\pi(n+m)x}{L}\,dx\right] =$$

$$\frac{L}{\pi^2}\frac{1}{(n-m)^2}\cos\frac{\pi(n-m)x}{L}\bigg|_0^L - \frac{L}{\pi^2}\frac{1}{(n+m)^2}\cos\frac{\pi(n+m)x}{L}\bigg|_0^L = \frac{L}{\pi^2}\left[(-1)^{n-m}-1\right]\frac{4nm}{(n^2-m^2)^2}, \quad n\neq m. \tag{5.93}$$

The diagonal element of this matrix, which is just the expectation value of the coordinate, is easily found (from the first line of Eq. 5.93) to be $x_{nn} = L/2$, which has an obvious physical meaning. The total Hamiltonian in the representation based on the functions defined in Eq. 5.90 is now an infinite nondiagonal matrix:

$$H_{mn} = \left(\frac{\hbar^2\pi^2 n^2}{2mL^2} + \frac{eFL}{2}\right)\delta_{mn} + \frac{eFL}{\pi^2}\left[(-1)^{n-m}-1\right]\frac{4nm}{(n^2-m^2)^2}, \tag{5.94}$$

where the second term contributes only to the nondiagonal elements. The electric field-related correction to the diagonal elements of the Hamiltonian is just a constant and can be eliminated by choosing a different zero level for the energies, for instance, by writing the electric field potential as $eF(x - L/2)$.
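As a sanity check on Eq. 5.93, the closed form for $x_{mn}$ can be compared with a direct numerical integration of the defining integral. A sketch, with $L = 1$ and my own helper names:

```python
# Cross-check of Eq. 5.93 (units L = 1): closed-form matrix element
# x_mn versus the midpoint-rule value of 2 * integral over [0, 1] of
# x sin(pi n x) sin(pi m x).
import math

def x_mn_exact(n, m):
    if n == m:                      # diagonal element: <x> = L/2
        return 0.5
    return ((-1) ** (n - m) - 1) * 4 * n * m / (math.pi ** 2 * (n * n - m * m) ** 2)

def x_mn_numeric(n, m, N=20000):
    dx = 1.0 / N
    total = 0.0
    for k in range(N):
        x = (k + 0.5) * dx
        total += x * math.sin(math.pi * n * x) * math.sin(math.pi * m * x)
    return 2 * total * dx

print(x_mn_exact(1, 2), x_mn_numeric(1, 2))   # both close to -0.18
```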

This example illustrates a rather general situation: often, in order to find the representation of an operator in one basis, we have to use its known representation in a different basis. This approach works especially well with observables that can be expressed as combinations of position and momentum operators, whose representations in continuous bases were discussed above. It often leads to nondiagonal matrices, and finding the eigenvalues and eigenvectors of the operator of interest is reduced to finding the eigenvalues and eigenvectors of the resulting matrix. In many cases this cannot be done exactly because the dimensionality of the resulting matrices can be infinite, but it is often possible to truncate them and solve the problem approximately. How this is done in practice will be discussed in a separate chapter.
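A minimal illustration of the truncation strategy just described: keep only the $n, m = 1, 2$ block of the Hamiltonian of Eq. 5.94 and diagonalize the resulting $2\times 2$ real symmetric matrix exactly. All parameter choices below ($\hbar = m = L = 1$, $eF = 1$) are mine, purely for illustration:

```python
# Truncating Eq. 5.94 to its n, m = 1, 2 block and diagonalizing it:
# a 2x2 sketch of the truncation strategy, in units hbar = m = L = 1
# with field strength eF = 1 (arbitrary choices for illustration).
import math

def H(n, m):
    if n == m:
        return math.pi ** 2 * n ** 2 / 2 + 0.5   # kinetic + eFL/2
    return ((-1) ** (n - m) - 1) * 4 * n * m / (math.pi ** 2 * (n * n - m * m) ** 2)

a, b, c = H(1, 1), H(1, 2), H(2, 2)
disc = math.sqrt((a - c) ** 2 + 4 * b * b)
levels = ((a + c - disc) / 2, (a + c + disc) / 2)
print(levels)   # approximate two lowest perturbed energies
```

Keeping more basis states and repeating the diagonalization would systematically improve the approximation.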

5.2.2 Transition from One Discrete Basis to Another

Quite often you will find yourself in a situation where, having found (or been given) the matrix representation of an operator in one discrete basis, you need to find an equivalent matrix representing this operator in a different basis. Here I will show how this can be done.

So, let's assume that you have an operator $\hat T$ and a system of basis vectors $|\chi_m^{(old)}\rangle$. The representation of this operator in this basis, as we have already established, is given by a matrix:

$$T_{mn}^{(old)} = \langle\chi_m^{(old)}|\hat T|\chi_n^{(old)}\rangle.$$

However, I would like to re-derive this expression in a slightly different way. Let me multiply the operator $\hat T$ by two unity operators expressed by the completeness relation, Eq. 3.42, formed with the vectors of this basis:

$$\hat T = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}|\chi_m^{(old)}\rangle\langle\chi_m^{(old)}|\hat T|\chi_n^{(old)}\rangle\langle\chi_n^{(old)}| = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}T_{mn}^{(old)}|\chi_m^{(old)}\rangle\langle\chi_n^{(old)}|. \tag{5.95}$$

This representation of an operator in terms of a matrix and the operators $|\chi_m^{(old)}\rangle\langle\chi_n^{(old)}|$ is akin to the expansion of a vector into a linear combination of basis vectors. Now, let me assume that I have another basis $|\chi_m^{(new)}\rangle$, and I want to relate the matrix of the operator in this basis, $T_{mn}^{(new)}$, to the matrix $T_{mn}^{(old)}$. To achieve this goal, let me express the matrix $T_{kl}^{(new)}$ using Eq. 5.95:

$$T_{kl}^{(new)} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}T_{mn}^{(old)}\langle\chi_k^{(new)}|\chi_m^{(old)}\rangle\langle\chi_n^{(old)}|\chi_l^{(new)}\rangle.$$


This can be rewritten with the help of two new matrices,

$$U_{nl} = \langle\chi_n^{(old)}|\chi_l^{(new)}\rangle \tag{5.96}$$

and

$$\tilde U_{km} = \langle\chi_k^{(new)}|\chi_m^{(old)}\rangle,$$

as

$$T_{kl}^{(new)} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\tilde U_{km}T_{mn}^{(old)}U_{nl}.$$

(Note the position of the indexes in this expression, which adheres to the regular rule of matrix multiplication.) Now I need to figure out how to construct the matrices $U$ and $\tilde U$ and their relation to each other. Let me first perform complex conjugation of each element of matrix $\tilde U_{km}$ and take advantage of the main property of the inner product, Eq. 2.19:

$$\tilde U_{km}^* = \langle\chi_k^{(new)}|\chi_m^{(old)}\rangle^* = \langle\chi_m^{(old)}|\chi_k^{(new)}\rangle = U_{mk}.$$

Thus I can see that the matrix $\tilde U$ can be obtained from $U$ by complex conjugation and transposition, or, expressing this in fewer words, $\tilde U$ is the Hermitian conjugate of $U$: $\tilde U = U^\dagger$, and the matrix transformation rule can be presented as

$$T_{kl}^{(new)} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}U_{km}^\dagger T_{mn}^{(old)}U_{nl}. \tag{5.97}$$

Now, let me focus on one particular column of matrix $U$, say, column $l_0$. Then you can easily recognize that the quantities $\langle\chi_n^{(old)}|\chi_{l_0}^{(new)}\rangle$ are nothing but the coefficients of the expansion of the new basis vector $|\chi_{l_0}^{(new)}\rangle$ in the old basis:

$$|\chi_{l_0}^{(new)}\rangle = \sum_n |\chi_n^{(old)}\rangle\langle\chi_n^{(old)}|\chi_{l_0}^{(new)}\rangle,$$

which gives a simple recipe for preparing the matrix $U$: find the representation of the $n$-th vector of the new basis in the old one and use the corresponding coefficients as the $n$-th column of matrix $U$. Let me illustrate this rule with a simple example.

Example 17 (Transformation to a New Basis) Consider the Hermitian matrix

$$\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}$$

and rewrite it in the basis of its own eigenvectors.


Solution

First I need to find these eigenvectors, which are given by the equation

$$\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \lambda\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}.$$

The corresponding eigenvalues are found from

$$(1-\lambda)^2 - 1 = 0 \;\Rightarrow\; -2\lambda + \lambda^2 = 0 \;\Rightarrow\; \lambda_1 = 0,\ \lambda_2 = 2.$$

Now I can find the two eigenvectors. For $\lambda_1 = 0$, I have

$$a_1 + ia_2 = 0 \;\Rightarrow\; |0\rangle = \frac{1}{\sqrt 2}\begin{bmatrix} 1 \\ i \end{bmatrix},$$

where I used $|0\rangle$ as a notation for the normalized eigenvector belonging to $\lambda_1 = 0$. You can verify that this vector is indeed normalized. For $\lambda_2 = 2$, the eigenvector equations become

$$a_1 + ia_2 = 2a_1 \;\Rightarrow\; |2\rangle = \frac{1}{\sqrt 2}\begin{bmatrix} 1 \\ -i \end{bmatrix}.$$

What you need to realize now (quite obvious, but it always gives students a shudder) is that the numbers in these columns are the coefficients in the representation of the new basis (vectors $|0\rangle$ and $|2\rangle$) in terms of the vectors of the old basis. Thus, the transformation matrix $U$ can be generated as

$$U = \frac{1}{\sqrt 2}\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix}$$

and its Hermitian conjugate matrix as

$$U^\dagger = \frac{1}{\sqrt 2}\begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix}.$$

Plugging these matrices into the transformation rule, Eq. 5.97, I get

$$\frac{1}{2}\begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix}\begin{bmatrix} 1 & i \\ -i & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 & -i \\ 1 & i \end{bmatrix}\begin{bmatrix} 0 & 2 \\ 0 & -2i \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 2 \end{bmatrix},$$

which is exactly what you should have expected: a matrix in the basis of its own eigenvectors is diagonal, with the eigenvalues along the main diagonal.

Sometimes the transformation rule connecting representations of operators in different bases is presented in an alternative form,

$$T_{kl}^{(new)} = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\tilde U_{km}T_{mn}^{(old)}\tilde U_{nl}^\dagger, \tag{5.98}$$

with the matrix $\tilde U_{nl}$ defined as

$$\tilde U_{nl} = \langle\chi_n^{(new)}|\chi_l^{(old)}\rangle. \tag{5.99}$$

Complex conjugation of Eq. 5.99 yields

$$\tilde U_{nl}^* = \langle\chi_n^{(new)}|\chi_l^{(old)}\rangle^* = \langle\chi_l^{(old)}|\chi_n^{(new)}\rangle = U_{ln}.$$

Performing matrix transposition and recalling that complex conjugation plus transposition yields Hermitian conjugation, you can see that

$$\tilde U = U^\dagger$$

and that Eqs. 5.98 and 5.97 are equivalent to each other.
The transformation matrix in the form of Eq. 5.99 appears naturally when one is looking for the transformation between the components of the same vector written in two different bases. Indeed, consider a vector $|\alpha\rangle$ represented in two different bases as

$$|\alpha\rangle = \sum_l a_l^{(old)}|\chi_l^{(old)}\rangle = \sum_l a_l^{(new)}|\chi_l^{(new)}\rangle.$$

The simplest way to express the coefficients $a_l^{(new)}$ in terms of the coefficients $a_l^{(old)}$, which is the goal of this exercise, is to premultiply the expression above by the bra vector $\langle\chi_m^{(new)}|$ and take advantage of the orthogonality of the basis vectors. This yields

$$a_m^{(new)} = \sum_l\langle\chi_m^{(new)}|\chi_l^{(old)}\rangle a_l^{(old)} = \sum_l\tilde U_{ml}a_l^{(old)} = \sum_l \left(U^\dagger\right)_{ml}a_l^{(old)}.$$
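The same $U$ from Example 17 can be used to illustrate this component-transformation rule, $\mathbf a^{(new)} = U^\dagger\mathbf a^{(old)}$. A sketch (the test vector is my own choice):

```python
# Transforming the components of a vector to a new basis,
# a_new = U-dagger * a_old, using the eigenvector matrix U of Example 17.

def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

s = 1 / 2 ** 0.5
Udag = [[s, -1j * s], [s, 1j * s]]   # U-dagger for the U of Example 17

a_old = [1, 0]                        # the first old-basis vector
a_new = matvec(Udag, a_old)
print(a_new)                          # components in the eigenvector basis

norm = sum(abs(c) ** 2 for c in a_new)
print(norm)                           # unitarity preserves the norm: 1
```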

What is left for me to do now is to show that the matrix $U$ defined by Eq. 5.96 is unitary. To this end I need to compute the product of the two matrices $U_{nm}$ and $U_{ml}^\dagger$, using the standard matrix multiplication rule:

$$\left(UU^\dagger\right)_{nl} = \sum_m U_{nm}U_{ml}^\dagger.$$

Substituting here Eq. 5.96, I can write

$$\left(UU^\dagger\right)_{nl} = \sum_m\langle\chi_n^{(old)}|\chi_m^{(new)}\rangle\langle\chi_m^{(new)}|\chi_l^{(old)}\rangle = \langle\chi_n^{(old)}|\chi_l^{(old)}\rangle = \delta_{nl},$$

where I replaced the sum over $m$ with a unity operator, because it is again just a completeness condition, and used the orthonormality of the basis vectors to replace their inner product with Kronecker's delta symbol. This calculation reveals that $U^\dagger = U^{-1}$, which is the definition of a unitary matrix. One should not be surprised that the transformation of vector components from one basis to another is provided by a unitary matrix. Indeed, such a transformation clearly should not change the norm of the vector, and norm preservation is, indeed, one of the important properties of unitary operators.
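For the concrete $U$ of Example 17, unitarity is a one-liner to check; the sketch below forms $(UU^\dagger)_{nl} = \sum_m U_{nm}U_{lm}^*$ and compares it with the identity matrix.

```python
# Checking U * U-dagger = I for the transformation matrix of Example 17.

s = 1 / 2 ** 0.5
U = [[s, s], [1j * s, -1j * s]]

prod = [[sum(U[n][m] * U[l][m].conjugate() for m in range(2))
         for l in range(2)] for n in range(2)]
print(prod)   # the identity matrix, up to rounding
```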

5.2.3 Spin Operators

The approach to generating representations of operators outlined in the previous section would not work for operators that cannot be built out of momentum and position. If, however, you somehow know the eigenvalues of the operator in question (most likely this knowledge comes from distilling empirical facts), you can construct the matrix of this operator in the basis of its own eigenvectors. Indeed, if $|\chi_m\rangle$ is the eigenvector of $\hat T$ corresponding to eigenvalue $t_m$, i.e., $\hat T|\chi_m\rangle = t_m|\chi_m\rangle$, then Eq. 5.84 immediately gives

$$T_{nm} = \langle\chi_n|\hat T|\chi_m\rangle = t_m\langle\chi_n|\chi_m\rangle = t_m\delta_{nm}.$$

Thus, any operator in the basis of its own eigenvectors is presented by a diagonal matrix with the eigenvalues along the main diagonal. Unfortunately, we often have to deal with a set of non-commuting operators, only one of which can be presented by a diagonal matrix. The question then remains how to generate the matrix representations of the other non-commuting operators in the same basis. Fortunately, in all practical situations, this problem can be solved if one knows the commutation relations between the relevant operators. I will illustrate this approach by considering the representation of angular momentum operators in the situation when the eigenvalues of the z-component of the angular momentum can only take two values, $+\hbar/2$ and $-\hbar/2$. The quantum numbers $m$ and $l$ introduced in Sect. 3.3.4 take in this case the values $\pm 1/2$ and $1/2$, respectively. In Sect. 5.1.4, you saw that the orbital angular momentum, which is constructed of position and momentum operators, admits only integer values for these numbers. The suggested half-integer values, which are allowed by the algebraic properties of these operators, can, therefore, correspond only to a very special angular momentum of electrons not related to their orbital motion. This intrinsic angular momentum is known as spin. Leaving a more detailed discussion of this quantity till later, here let's just accept its existence and use it to illustrate a method of generating matrix representations of operators that do not have a position or momentum representation.

To distinguish between spin and orbital angular momentum, I will introduce special notations for the former, designating the respective operators as $\hat S_x$, $\hat S_y$, and $\hat S_z$, which have the same meaning as the operators $\hat L_x$, $\hat L_y$, and $\hat L_z$ of Sect. 3.3.4. Accordingly, I will replace the quantum number $l$ with $s$ and $m$ with $m_s$. It is important to realize from the outset that, while the orbital quantum number $l$ is allowed to take any integer value, the value of the respective spin number $s$ is fixed at $1/2$ and cannot be changed: it is an intrinsic property of electrons, just like their mass or charge. Thus, the only quantum number which can be used to distinguish between different spin states is $m_s$.

Since $m_s$ takes on only two distinct values, there exist only two respective states, described by the eigenvectors of the operator $\hat S_z$. Thus, the space occupied by different spin states is two-dimensional; the respective vectors are represented by $2\times 1$ column vectors, and the operators are represented by $2\times 2$ matrices. In the basis of its own eigenvectors, $\hat S_z$ is simply a diagonal matrix:

$$S_z = \begin{bmatrix} \frac{\hbar}{2} & 0 \\ 0 & -\frac{\hbar}{2} \end{bmatrix}, \tag{5.100}$$

while the states take the form of the columns

$$|1/2\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad |-1/2\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \tag{5.101}$$

where I chose to enumerate the state corresponding to the positive eigenvalue as first. (This choice determines the positions of the negative and positive elements in the matrix $S_z$ and of the ones and zeros in the corresponding columns.) An arbitrary state in the space of spin states can be written down as a linear combination of the basis vectors:

$$|\chi\rangle = a\begin{bmatrix} 1 \\ 0 \end{bmatrix} + b\begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{5.102}$$

The result expressed by Eq. 5.100 is somewhat obvious, and our main task is to find the matrices realizing the representation of the two remaining components of the spin angular momentum, $\hat S_x$ and $\hat S_y$, in this basis. (The operator $\hat S^2$ in this instance is trivial: it is diagonal with identical diagonal elements equal to $\hbar^2 s(s+1) = 3\hbar^2/4$, so it is proportional to the identity matrix.)

I begin solving this problem by focusing on the operators $\hat S_\pm = \hat S_x \pm i\hat S_y$, which are spin analogs of the ladder operators $\hat L_\pm$ introduced in Eqs. 3.64 and 3.65. Since I postulated that the spin operators obey the same commutation relations as the operators $\hat L_{x,y,z}$, I can use the results obtained for those operators, in particular Eq. 3.75 describing how the operator $\hat L_+$ acts on the eigenvectors of $\hat L_z$. Adapting this equation to the case of spin states, I can write

$$\hat S_+|s,m_s\rangle = \hbar\sqrt{\frac{3}{4} - m_s(m_s+1)}\,|s,m_s+1\rangle, \tag{5.103}$$

where I took into account that $s = 1/2$. Applying this equation to the only two existing states, $|1/2\rangle$ and $|-1/2\rangle$ (I dropped the quantum number $s$, because it never changes), I have

$$\hat S_+|1/2\rangle = 0, \qquad \hat S_+|-1/2\rangle = \hbar|1/2\rangle, \tag{5.104}$$

from which you can immediately infer that the matrix representation of $\hat S_+$ is

$$S_+ = \hbar\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}. \tag{5.105}$$

The matrix representation of the lowering operator $\hat S_-$ can be derived in a similar way using Eq. 3.76, which yields

$$\hat S_-|1/2\rangle = \hbar|-1/2\rangle, \qquad \hat S_-|-1/2\rangle = 0, \tag{5.106}$$

but it is much faster simply to recall that $\hat S_- = \hat S_+^\dagger$, so that the respective matrix is obtained by matrix transposition and complex conjugation of Eq. 5.105:

$$S_- = \hbar\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}. \tag{5.107}$$
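The ladder-action rule of Eq. 5.103 can be applied mechanically to build $S_+$ (and, by Hermitian conjugation, $S_-$) for any spin. The sketch below (in units $\hbar = 1$; the helper name is mine) does this for $s = 1/2$ and reproduces Eq. 5.105; calling it with $s = 3/2$ would be a natural starting point for Problem 76.

```python
# Building the matrix S_plus from the ladder rule of Eq. 5.103
# (hbar = 1; basis ordered with the largest m_s first, as in Eq. 5.101).
import math

def s_plus(s):
    ms_values = [s - k for k in range(int(2 * s) + 1)]   # s, s-1, ..., -s
    dim = len(ms_values)
    M = [[0.0] * dim for _ in range(dim)]
    for col, ms in enumerate(ms_values):
        c2 = s * (s + 1) - ms * (ms + 1)
        if c2 > 1e-12:            # S+|s,ms> = sqrt(c2) |s,ms+1>
            M[col - 1][col] = math.sqrt(c2)   # ms+1 sits one row above
    return M

print(s_plus(0.5))   # -> [[0.0, 1.0], [0.0, 0.0]], i.e., Eq. 5.105
```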

Now, using the definition of the ladder operators, you can write for $\hat S_x$ and $\hat S_y$:

$$\hat S_x = \frac{1}{2}\left(\hat S_+ + \hat S_-\right), \tag{5.108}$$

$$\hat S_y = \frac{1}{2i}\left(\hat S_+ - \hat S_-\right), \tag{5.109}$$

which together with Eqs. 5.105 and 5.107 generate the required matrices:

$$S_x = \frac{\hbar}{2}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \tag{5.110}$$

$$S_y = \frac{\hbar}{2}\begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}. \tag{5.111}$$

Equations 5.110 and 5.111 provide the solution to the problem of finding the matrix representation of operators which cannot be reduced to combinations of position and momentum. As you can see, the commutation relations played the crucial role in solving this problem.
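As a final consistency check (not in the text; units $\hbar = 1$): the three matrices must satisfy the angular-momentum commutation relation $[S_x, S_y] = iS_z$, which a few lines of Python confirm.

```python
# Verifying the commutation relation [S_x, S_y] = i * S_z for the spin
# matrices of Eqs. 5.100, 5.110, and 5.111 (hbar = 1).

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Sx = [[0, 0.5], [0.5, 0]]
Sy = [[0, -0.5j], [0.5j, 0]]
Sz = [[0.5, 0], [0, -0.5]]

XY, YX = matmul(Sx, Sy), matmul(Sy, Sx)
comm = [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]
print(comm)   # equals i * S_z = [[0.5j, 0], [0, -0.5j]]
```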

5.3 Problems

Section 5.1.1

Problem 54 Reproduce the calculations leading to Eq. 5.16 for the momentum representation of the coordinate.

Problem 55 Derive Eq. 5.20, generalizing the approach that led to Eq. 5.15 in Sect. 5.1.1.

Problem 56 Derive Eq. 5.24 using the same method which I used to derive Eq. 5.23 (do not attempt to simply invert the previous equation).

Problem 57 Assuming that the function $\chi_s(q)$ presenting the eigenvectors of operator $\hat S$ in the basis of operator $\hat Q$ is given by

$$\chi_s(q) = Ae^{isq - s^2q^2},$$

find the integral representation of the operator $\hat S$ in this basis.

Section 5.1.2

Problem 58

1. Prove that the momentum operator is “odd” (changes its sign upon the parity
transformation).

2. Prove that the operator of the angular momentum is invariant with respect to the
parity transformation (“even”).

Section 5.1.3

Problem 59 Which of the following can be used as wave functions describing states of the discrete spectrum:

1. $e^{x^2/2}$
2. $\left(2x^2 - x^4/3\right)e^{-x^2/2}$
3. $A\sin kx$
4. $B\exp(ikx) + C\exp(-ikx)$
5. $xe^{-|x|}$
6. $Ae^{-x^2}\cos kx$

Problem 60 It is known that the potential energy of a quantum particle exhibits a finite discontinuity at point $x = 0$. It is also known that for $x < 0$ and $x > 0$, the wave functions of the particle are given by

$$\psi(x) = \begin{cases} A\left(x^2 + 2\right)\exp\left(-\alpha_1 x^2\right), & x < 0 \\ B\exp\left(-\alpha_2 x^2\right), & x > 0. \end{cases}$$

Using the continuity of the wave function and its derivative, establish relations between the parameters $A$, $B$, $\alpha_1$, and $\alpha_2$.

Problem 61 Prove the following identity:

$$\psi^*(\mathbf r,t)\nabla^2\psi(\mathbf r,t) - \psi(\mathbf r,t)\nabla^2\psi^*(\mathbf r,t) \equiv \nabla\cdot\left[\psi^*(\mathbf r,t)\nabla\psi(\mathbf r,t) - \psi(\mathbf r,t)\nabla\psi^*(\mathbf r,t)\right].$$

Problem 62 Compute the probability current densities for a particle in the states described by the following wave functions:

1. $\psi(x) = A\exp(ikz) + B\exp(-ikz)$
2. $\psi(x) = A\cos kx$
3. $\psi(\mathbf r) = \frac{A}{r}\exp(ikr) + \frac{B}{r}\exp(-ikr)$, where $r = \sqrt{x^2 + y^2 + z^2}$
4. $\psi(\mathbf r) = A\exp(i\mathbf k\cdot\mathbf r) + C\exp(-i\mathbf k\cdot\mathbf r)$

Section 5.1.4

Problem 63 Derive the expressions for the operators $\hat L_x$ and $\hat L_y$ in spherical coordinates presented in Eqs. 5.58 and 5.59.

Problem 64 Derive the orthogonality condition for the associated Legendre functions with $l_1 = l_2$ and $m_1 \neq m_2$ (Eq. 5.73). Do not attempt to obtain the normalization coefficient.

Problem 65 Consider a function of the polar and azimuthal angles $\theta$ and $\varphi$ defined as

$$\psi(\theta,\varphi) = \sin\theta\,(1 - \cos\theta)\cos\varphi.$$

1. Normalize this function.
2. Present this function as a linear combination of spherical harmonics $Y_l^m(\theta,\varphi)$.


3. If the observables presented by the operators $\hat L^2$ and $\hat L_z$ are measured when a particle is in the state presented by this wave function, what would be the possible outcomes and their probabilities?
4. Find the expectation values and uncertainties of these observables in this state.

Problem 66 Repeat the previous problem for the following function:

$$\psi(\theta,\varphi) = \frac{3}{2}\sin 2\theta\exp(-i\varphi) + 2\sin^2\theta\sin 2\varphi,$$

but do not attempt to normalize it before rewriting it as a combination of spherical harmonics.

Problem 67 Find the coordinate representation of the lowering and raising ladder operators introduced in Sect. 3.3.4, and using the found expressions, find $Y_l^l(\theta,\varphi)$ and $Y_l^{-l}(\theta,\varphi)$.

Problem 68 Find all zeroes of the angular probability distribution for a particle in angular states described by the spherical harmonics $Y_3^0(\theta,\varphi)$, $Y_3^1(\theta,\varphi)$, $Y_3^2(\theta,\varphi)$, and $Y_3^3(\theta,\varphi)$.

Problem 69 Find the energy values for the system described by the Hamiltonian

$$H = \frac{\hat L_x^2 + \hat L_y^2}{2I_1} + \frac{\hat L_z^2}{2I_2}.$$

Section 5.2

Problem 70 Using the eigenvectors $|1\rangle$ and $|2\rangle$ of the matrix

$$\begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix}$$

as a basis, construct the matrix representation of the operators $|1\rangle\langle 1|$ and $|2\rangle\langle 2|$, and verify the closure (or completeness) condition

$$|1\rangle\langle 1| + |2\rangle\langle 2| = \hat I,$$

where $\hat I$ is the unity operator.
Problem 71 Find the matrix of the operator

$$\hat A = |\psi_1\rangle\langle\psi_1| + |\psi_2\rangle\langle\psi_2| + |\psi_3\rangle\langle\psi_3| - i|\psi_1\rangle\langle\psi_2| - |\psi_1\rangle\langle\psi_3| + i|\psi_2\rangle\langle\psi_1| - |\psi_3\rangle\langle\psi_1|$$

in the basis formed by the orthonormalized vectors $|\psi_1\rangle$, $|\psi_2\rangle$, and $|\psi_3\rangle$.


Problem 72 Write down the expressions for the spherical harmonics with orbital quantum number $l = 1$. You can consider them as a basis in the subspace of eigenvectors of the operator $\hat L^2$ belonging to this eigenvalue, in the sense that any linear combination of them will also be an eigenvector of this operator.

1. Prove that this is indeed the case, i.e., that any linear combination of spherical harmonics with $l = 1$ represents an eigenvector of $\hat L^2$ with the same eigenvalue.
2. Find the matrix representation of the operator $\hat L_x$ in this basis, and using the obtained matrix, find the representation of this operator's eigenvectors in this basis.
3. The found eigenvectors can also be considered as yet another basis. Find the representation of the operators $\hat L_x$ and $\hat L_z$ in this basis.

Problem 73 Present the operator $i\,d/dx$ in the basis of spherical harmonics with $l = 1$.
Problem 74 Consider the matrix

$$A = \begin{bmatrix} 0 & 0 & -1 \\ 0 & 1 & 0 \\ -1 & 0 & 0 \end{bmatrix}.$$

Transform this matrix to the basis of its eigenvectors. Verify that the elements along the diagonal of the resulting matrix are the eigenvalues of $A$.

Problem 75 Consider two matrices:

$$A = \begin{bmatrix} 1 & i & 1 \\ -i & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & i \\ 0 & -i & 0 \end{bmatrix}.$$

Rewrite matrix $B$ in the basis formed by the eigenvectors of matrix $A$.

Section 5.2.3

Problem 76 Using the same approach which was used in Sect. 5.2.3 for spin $1/2$, find the matrix representation of the operators for spin $s = 3/2$. Hint: What is the dimension of the space that contains the vectors representing the states of this spin?

Part II
Quantum Models

In this part of the book, we will play with some of the toys which physicists created in order to get better insight into the variety of new and unusual properties exhibited by real systems obeying the laws of quantum mechanics. These toys rarely represent real systems, and this is why we call them models. Still, in many instances they provide the necessary first experience and conceptual understanding required to deal with reality in all its complexity. Using models allows us to focus on those properties of the real world which appear to be of the most significance, at least for the class of problems we are interested in. One can think of models in quantum mechanics as of impressionist or post-impressionist paintings, where instead of painstaking attention to detail, the main focus is on capturing "the essence" of the object, whatever this might mean. Quantum mechanical models develop physical intuition about the phenomena under study and can often be used as a first iteration of an approximation scheme yielding a more accurate and quantitative description of nature.

Chapter 6
One-Dimensional Models

One-dimensional models might appear in quantum mechanics in two, in a way,
diametrically opposite situations. In one case, you can pretend that the potential
energy of a particle changes only in one direction, such as the potential energy
of a uniform electric field. Classically, this would mean motion characterized
by acceleration in one direction and constant velocity in the perpendicular directions.
By choosing an appropriate inertial coordinate system, you can always eliminate the
constant velocity component and consider this motion as rectilinear. Quantum
mechanically, this situation has to be described in the coordinate representation,
and the respective coordinate wave function can be presented in the form

$$\psi(\mathbf{r}) = e^{i(k_x x + k_y y)}\varphi(z) \qquad (6.1)$$

where the Z-axis of the coordinate system is chosen to lie along the direction
in which the potential energy changes. The behavior of the wave function in the
two perpendicular directions (X and Y) is that of a free particle with conserved
components of momentum $p_x = \hbar k_x$ and $p_y = \hbar k_y$. Substituting this
expression into Eq. 5.35, where the potential energy is taken to have the form
$V(\mathbf{r}) \equiv V(z)$, and canceling the exponential factors on both sides of the equation,
you will end up with the following one-dimensional equation:

$$-\frac{\hbar^2}{2m_e}\frac{d^2\varphi}{dz^2} + V(z)\varphi(z) = E_z\varphi(z) \qquad (6.2)$$

where

$$E_z = E - \frac{p_x^2}{2m_e} - \frac{p_y^2}{2m_e}$$

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,
https://doi.org/10.1007/978-3-319-71550-6_6



Fig. 6.1 A schematic of a semiconductor heterostructure (alternating GaAlAs and
GaAs layers), in which the motion of electrons in the direction perpendicular to the
planes of the layers can be described by the one-dimensional model

is the contribution of the motion in the z direction to the total energy E of the system.
The values of $p_x$ and $p_y$ are determined by the initial state of the system, which may
or may not be one of the eigenvectors of the Hamiltonian with given values of $p_x$
and $p_y$. In the latter case, the particular solution of the time-dependent Schrödinger
equation satisfying the initial conditions will be given by a linear combination of
functions of the form of Eq. 6.1, with $\varphi(z)$ satisfying Eq. 6.2, but for now I will focus only on the
stationary states, which correspond to initial conditions with definite $p_x$ and $p_y$.

While for a long time this type of one-dimensional model was used mostly
in classrooms to illustrate basic quantum effects to unsuspecting students, the
technological advances of the last 50 years have made this model quite relevant as
a stepping stone to understanding the properties of practically important artificial
structures made of planar layers of several different semiconductors arranged in
alternating order (see Fig. 6.1). It can be shown (way above your pay grade though)
that the motion of electrons in such structures can be approximately described by a
potential energy that changes only in the direction perpendicular to the plane of
the layers (the growth direction).

The second situation, in which the one-dimensional model can have at least some
relation to reality, is the case of potentials confining the motion of the particle in all
directions but one. One can imagine a particle moving inside a cylindrical tube
with impenetrable walls. The motion perpendicular to the axis of the cylinder is
characterized by discrete allowed values of energy (I will show this later; for now
you will have to trust me on that), and if the radius of the tube is small enough, the
distance between adjacent energy levels can be sufficiently large that for all practical
purposes only one of these energy levels needs to be taken into account. In this case, the
transverse (as in perpendicular to the axis of the cylinder) motion is completely
"frozen," and one is again left with purely one-dimensional motion. In both cases, we
are dealing with the Schrödinger equation in the form of Eq. 6.2, which is the main
object of study in this chapter.


6.1 Free Particle and the Wave Packets

Before taking on quantum states of electrons in one-dimensional piecewise potentials
such as wells or barriers, it is useful to consider the simplest quantum
mechanical model: a freely propagating, i.e., not interacting with anything, particle.
In classical physics, as we all know, such a particle would move with a constant
velocity v and can be characterized by a conserved kinetic energy $K = mv^2/2$ and
momentum $p = mv$. In quantum mechanics, the states of a free particle are the solutions
of the Schrödinger equation with zero potential:

$$i\hbar\frac{\partial\left|\Psi\right\rangle}{\partial t} = \frac{\hat{P}^2}{2m_e}\left|\Psi\right\rangle. \qquad (6.3)$$

It is quite easy to see that the stationary states of a free particle are eigenvectors of
the momentum operator:

$$\left|\Psi\right\rangle = \exp\left(-\frac{iE_p}{\hbar}t\right)\left|p\right\rangle \qquad (6.4)$$

where $\left|p\right\rangle$ is defined by $\hat{P}\left|p\right\rangle = p\left|p\right\rangle$. Substitution of Eq. 6.4 into Eq. 6.3 yields

$$E_p = \frac{p^2}{2m_e} \qquad (6.5)$$

which is the expected classical relation between the energy and momentum of a free
particle, often called the dispersion relation. Historically, the Schrödinger equation was
devised to make sure that the quantum theory respects this relation between energy
and momentum. Indeed, in the position representation, the Schrödinger equation
becomes

$$i\hbar\frac{\partial\Psi(\mathbf{r},t)}{\partial t} = -\frac{\hbar^2\nabla^2}{2m_e}\Psi(\mathbf{r},t) \qquad (6.6)$$

with stationary state solutions of the form

$$\Psi(\mathbf{r},t) = \frac{1}{(2\pi\hbar)^{3/2}}\exp\left(-\frac{iE_p}{\hbar}t + i\frac{\mathbf{p}\cdot\mathbf{r}}{\hbar}\right) \qquad (6.7)$$

where I used the $\delta$-function-normalized eigenvectors of the momentum operator given in
Eq. 5.21. Now, one can argue that the Schrödinger equation contains a first-order
time derivative because the dispersion relation, Eq. 6.5, is linear in E, while the
derivatives with respect to the coordinates must be of second order to reproduce the term $p^2$ in
Eq. 6.5. Further, one can argue that since the time derivative in the Schrödinger
equation is only of first order, the corresponding wave function must be
represented by a complex exponential function rather than by a real trigonometric
function, which, in turn, makes the factor i in front of the time derivative necessary
to compensate for the similar factor in the argument of the wave function.

The wave function of the form given in Eq. 6.7 was conceived in the early
days of quantum mechanics as a means to reconcile the particle-like and wavelike properties
of quantum objects. However, it was clear from the very beginning that, regardless of
the chosen interpretation (statistical due to Born or Schrödinger's pilot wave), there
are several problems with assigning this function to represent quantum states of real
particles. First, its absolute value is uniform in space, which can hardly represent an
actual localized particle regardless of the chosen interpretation. Also, the motion of
the wave represented by Eq. 6.7 is characterized by the phase velocity $v_{ph} = \omega/k =
E/p = p/(2m_e)$, which is half of the corresponding classical velocity $v_{cl} = p/m_e$,
making it difficult to associate it with the motion of a particle.

To get around this conundrum, it was suggested that actual states of the
particles (in either interpretation) are represented not by stationary states but by their
superpositions, which still solve the Schrödinger equation 6.6. It is quite easy to
show that by choosing an appropriate superposition, it is possible, for instance, to
localize a particle within an arbitrarily small region, solving at least one of the listed
problems. To see how this comes about, consider the wave function at time t = 0 and
form a superposition of the form

$$\psi(\mathbf{r}) = \frac{1}{(2\pi\hbar)^{3/2}}\int d^3p\, A(\mathbf{p})\exp\left(i\frac{\mathbf{p}\cdot\mathbf{r}}{\hbar}\right). \qquad (6.8)$$

In Sect. 5.1.1, it was shown that Eq. 6.8 can be inverted to yield

$$A(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\int d^3r\, \psi(\mathbf{r})\exp\left(-i\frac{\mathbf{p}\cdot\mathbf{r}}{\hbar}\right) \qquad (6.9)$$

so that by choosing an appropriate $A(\mathbf{p})$, I can "generate" an initial (t = 0) wave
function with an arbitrary degree of localization. Now all I need is to consider
the time dependence of this initial superposition to see if the other problems outlined
above can also be circumvented by the superposition states, which, in the case of
freely propagating particles, are often called wave packets.

Here I will focus on just one particular example of a wave packet, which
despite its relative simplicity will help me illustrate most of the relevant ideas.
First of all, I will simplify the consideration by limiting it to the case of one-dimensional
motion described by a wave function that depends on a single
coordinate, say, z. The integrals in Eqs. 6.8 and 6.9 are in this case reduced to the one-dimensional form

$$\psi(z) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} dp_z\, A(p_z)\exp\left(i\frac{p_z z}{\hbar}\right) \qquad (6.10)$$


$$A(p_z) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} dz\, \psi(z)\exp\left(-i\frac{p_z z}{\hbar}\right). \qquad (6.11)$$

Next, I will assume that the initial state of the particle is described by the function

$$\psi(z) = C\exp\left[-\frac{(z-\bar z)^2}{(2\Delta z_0)^2}\right]\exp\left(i\frac{\bar p_z}{\hbar}z\right) \qquad (6.12)$$

where the constant C is found from the normalization condition

$$\int_{-\infty}^{\infty}|\psi(z)|^2\, dz = |C|^2\int_{-\infty}^{\infty}\exp\left[-\frac{(z-\bar z)^2}{2(\Delta z_0)^2}\right]dz =
|C|^2\sqrt{2}\,\Delta z_0\int_{-\infty}^{\infty}\exp(-x^2)\,dx = |C|^2\Delta z_0\sqrt{2\pi} = 1 \Longrightarrow
C = \frac{1}{\sqrt{\Delta z_0\sqrt{2\pi}}}.$$

In the course of computing the normalization integral, I introduced a new
integration variable $x = (z - \bar z)/(\sqrt{2}\,\Delta z_0)$ and used the well-known integral
$\int_{-\infty}^{\infty}\exp(-x^2)\,dx = \sqrt{\pi}$. Thus, my initial state is represented by the normalized
wave function, in which the amplitude of the plane wave $\exp(i\bar p_z z/\hbar)$ is modulated by
the so-called Gaussian function:

$$\psi(z) = \frac{1}{\sqrt{\Delta z_0\sqrt{2\pi}}}\exp\left[-\frac{(z-\bar z)^2}{(2\Delta z_0)^2}\right]\exp\left(i\frac{\bar p_z}{\hbar}z\right). \qquad (6.13)$$

The probability distribution corresponding to this wave function is peaked at $z =
\bar z$ and falls off from its maximum value as z moves away from $\bar z$. The parameter $\Delta z_0$
determines how fast this decrease takes place: the larger $\Delta z_0$, the
larger the deviation from $\bar z$ required to decrease the probability density by a factor of e. By varying
$\Delta z_0$, one can control the degree of particle localization: a smaller $\Delta z_0$ corresponds
to a better localized particle (see Fig. 6.2). Formally speaking, one can define $\bar z$ and
$\Delta z_0$ as the expectation value and the uncertainty of the coordinate in the state described by
this wave function. Indeed, I can easily compute


Fig. 6.2 Normalized Gaussian wave functions with different values of the width
parameter $\Delta z_0$: with decreasing $\Delta z_0$ the function narrows, while its maximum
grows such that the total area under the curve remains equal to unity

$$\langle z\rangle = \frac{1}{\Delta z_0\sqrt{2\pi}}\int_{-\infty}^{\infty} dz\, z\exp\left[-\frac{(z-\bar z)^2}{2(\Delta z_0)^2}\right] =
\frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\left(x\sqrt{2}\,\Delta z_0 + \bar z\right)\exp(-x^2)\,dx = \bar z$$

where I took into account that the normalization integral computed earlier is equal to
unity and the fact that the integral of an odd function over a symmetric interval is
zero. The uncertainty takes a bit more work:

$$\langle z^2\rangle = \frac{1}{\Delta z_0\sqrt{2\pi}}\int_{-\infty}^{\infty} dz\, z^2\exp\left[-\frac{(z-\bar z)^2}{2(\Delta z_0)^2}\right] =
\frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\left(x\sqrt{2}\,\Delta z_0 + \bar z\right)^2\exp(-x^2)\,dx =
\bar z^2 + \frac{2(\Delta z_0)^2}{\sqrt{\pi}}\int_{-\infty}^{\infty} x^2\exp(-x^2)\,dx = \bar z^2 + (\Delta z_0)^2$$

where I used another well-known integral, $\int_{-\infty}^{\infty} x^2\exp(-x^2)\,dx = \sqrt{\pi}/2$. Subtracting
$\bar z^2$ from $\langle z^2\rangle$, you can convince yourself that $\Delta z_0$ is, indeed, the uncertainty of
the coordinate.
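These two integrals are easy to verify numerically. Here is a minimal sketch (assuming NumPy; the values of $\bar z$ and $\Delta z_0$ are arbitrary illustrative choices of mine) that integrates $|\psi(z)|^2$ from Eq. 6.13 on a grid:

```python
import numpy as np

zbar, dz0 = 1.5, 0.7   # arbitrary illustrative mean and width

z = np.linspace(zbar - 12*dz0, zbar + 12*dz0, 200001)
h = z[1] - z[0]
# |psi|^2 for the packet of Eq. 6.13; the plane-wave factor drops out.
prob = np.exp(-(z - zbar)**2 / (2*dz0**2)) / (dz0*np.sqrt(2*np.pi))

norm = prob.sum() * h                                # normalization integral
z_mean = (z*prob).sum() * h                          # <z>
z_unc = np.sqrt(((z - z_mean)**2*prob).sum() * h)    # sqrt(<z^2> - <z>^2)

assert abs(norm - 1) < 1e-9
assert abs(z_mean - zbar) < 1e-9   # <z> = zbar
assert abs(z_unc - dz0) < 1e-9     # uncertainty = dz0
```

The simple Riemann sum is accurate here because the Gaussian vanishes far before the edges of the grid.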

It should be noticed that these arguments do not contradict Schrödinger's pilot
wave interpretation, according to which the wave presented by the wave packet is a
real material object accompanying a particle and whose width defines the degree of
particle localization.

Now I can find the appropriate amplitudes $A(p_z)$ in Eq. 6.10 that
reproduce the wave function given by Eq. 6.13. Substitution of this equation into
Eq. 6.11 yields


$$A(p_z) = \frac{1}{\sqrt{2\pi\hbar}}\frac{1}{\sqrt{\Delta z_0\sqrt{2\pi}}}\int_{-\infty}^{\infty} dz\,
\exp\left[-\frac{(z-\bar z)^2}{(2\Delta z_0)^2}\right]\exp\left(-i\frac{(p_z-\bar p_z)z}{\hbar}\right) =$$

$$\frac{2\Delta z_0}{\sqrt{2\pi\hbar}\,\sqrt{\Delta z_0\sqrt{2\pi}}}\int_{-\infty}^{\infty} dx\,
\exp\left[-x^2 - i\frac{(p_z-\bar p_z)}{\hbar}\left(2x\Delta z_0 + \bar z\right)\right] =$$

$$\frac{\sqrt{2\Delta z_0}}{\sqrt{\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-i\frac{p_z-\bar p_z}{\hbar}\bar z\right)
\int_{-\infty}^{\infty} dx\,\exp\left[-x^2 - i\,2\Delta z_0\frac{p_z-\bar p_z}{\hbar}x\right] =$$

$$\frac{\sqrt{2\Delta z_0}}{\sqrt{\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-i\frac{p_z-\bar p_z}{\hbar}\bar z\right)
\int_{-\infty}^{\infty} dx\,\exp\left[-x^2 - i\,2\Delta z_0\frac{p_z-\bar p_z}{\hbar}x
- \left(i\frac{(p_z-\bar p_z)\Delta z_0}{\hbar}\right)^2 + \left(i\frac{(p_z-\bar p_z)\Delta z_0}{\hbar}\right)^2\right] =$$

$$\frac{\sqrt{2\Delta z_0}}{\sqrt{\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-i\frac{p_z-\bar p_z}{\hbar}\bar z\right)
\exp\left[-\frac{(p_z-\bar p_z)^2(\Delta z_0)^2}{\hbar^2}\right]
\int_{-\infty}^{\infty} dx\,\exp\left[-\left(x + i\frac{(p_z-\bar p_z)\Delta z_0}{\hbar}\right)^2\right] =$$

$$\frac{\sqrt{2\Delta z_0}}{\sqrt{\hbar}\,(2\pi)^{1/4}}\exp\left(-i\frac{p_z-\bar p_z}{\hbar}\bar z\right)
\exp\left[-\frac{(p_z-\bar p_z)^2(\Delta z_0)^2}{\hbar^2}\right].$$

This was a long calculation, but it is worth the effort to carefully peruse it.
Some of the tricks that I used in its course were the substitution of variable
$x = (z - \bar z)/(2\Delta z_0)$, presenting an expression of the form $a^2 + 2ba$ as a complete
square, $a^2 + 2ba + b^2 - b^2 = (a + b)^2 - b^2$, and finally the fact that the integral
$\int_{-\infty}^{\infty} dx\,\exp\left[-(x - x_0)^2\right]$ still equals $\sqrt{\pi}$ regardless of the value of $x_0$. Before
continuing, I will set $\bar z = 0$, which amounts to a choice of the zero of the
coordinate z, and introduce a new parameter $\Delta p = \hbar/(2\Delta z_0)$. Then the expression
for $A(p_z)$ becomes

$$A(p_z) = \frac{1}{\sqrt{\Delta p}\,(2\pi)^{1/4}}\exp\left[-\frac{(p_z-\bar p_z)^2}{(2\Delta p)^2}\right], \qquad (6.14)$$

and it is easy to verify (do it!) that, as expected, $\int_{-\infty}^{\infty}|A(p)|^2\,dp = 1$, while $\langle p_z\rangle = \bar p_z$,
and the parameter $\Delta p$ determines the uncertainty of the particle's momentum. Recalling
the definition of this parameter in terms of the uncertainty of the coordinate, you
can see that these two parameters obey the minimum version of the Schrödinger
uncertainty principle:


$$\Delta p\,\Delta z_0 = \frac{\hbar}{2}. \qquad (6.15)$$

This is a special property of the Gaussian distribution: for any other initial state,
the product of the uncertainties would be larger than $\hbar/2$.
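This Fourier pair can also be checked numerically. The following sketch (assuming NumPy, working in units where $\hbar = 1$, with arbitrary illustrative values of $\Delta z_0$ and $\bar p_z$) evaluates Eq. 6.11 by direct summation on a grid and compares the result with Eq. 6.14 and with the uncertainty product of Eq. 6.15:

```python
import numpy as np

hbar = 1.0                 # working in units where hbar = 1 (a choice)
dz0, pbar = 0.8, 2.0       # illustrative width and mean momentum
dp = hbar / (2*dz0)        # Eq. 6.15

z = np.linspace(-15*dz0, 15*dz0, 20001)
h = z[1] - z[0]
psi = (np.exp(-z**2/(2*dz0)**2) * np.exp(1j*pbar*z/hbar)
       / np.sqrt(dz0*np.sqrt(2*np.pi)))            # Eq. 6.13 with zbar = 0

p = np.linspace(pbar - 8*dp, pbar + 8*dp, 401)
# Eq. 6.11 evaluated by direct summation on the grid.
A = np.array([(psi*np.exp(-1j*pk*z/hbar)).sum()*h
              for pk in p]) / np.sqrt(2*np.pi*hbar)

# Eq. 6.14: a Gaussian momentum amplitude with width parameter dp.
A_exact = np.exp(-(p - pbar)**2/(2*dp)**2) / (np.sqrt(dp)*(2*np.pi)**0.25)
assert np.allclose(A.real, A_exact, atol=1e-8)

# Momentum uncertainty extracted from |A|^2 reproduces dp = hbar/(2*dz0).
hp = p[1] - p[0]
w = np.abs(A)**2
p_mean = (p*w).sum()*hp
p_unc = np.sqrt(((p - p_mean)**2*w).sum()*hp)
assert abs(p_unc*dz0 - hbar/2) < 1e-6   # Eq. 6.15
```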

Having found $A(p_z)$, I can now find the time dependence of the initial wave
function by considering the superposition of the stationary states at an arbitrary
time t:

$$\Psi(z,t) = \frac{1}{\sqrt{2\pi\hbar}\,\sqrt{\Delta p}\,(2\pi)^{1/4}}\int_{-\infty}^{\infty} dp_z
\exp\left[-\frac{(p_z-\bar p_z)^2}{(2\Delta p)^2}\right]\exp\left[-i\frac{p_z^2}{2\hbar m_e}t + i\frac{p_z}{\hbar}z\right],$$

which at t = 0 is obviously reduced to the function given in Eq. 6.12. I begin
evaluating this integral, again, by introducing a dimensionless variable

$$x = \frac{p_z - \bar p_z}{2\Delta p}$$

and transforming this integral into

$$\Psi(z,t) = \frac{2\Delta p}{\sqrt{2\pi\hbar}\,\sqrt{\Delta p}\,(2\pi)^{1/4}}\int_{-\infty}^{\infty} dx\,\exp(-x^2)
\exp\left[-it\frac{1}{2\hbar m_e}\left(\bar p_z + 2\Delta p\, x\right)^2 + i\frac{z}{\hbar}\left(\bar p_z + 2\Delta p\, x\right)\right]$$

$$= \frac{2\sqrt{\Delta p}}{\sqrt{2\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-it\frac{\bar p_z^2}{2\hbar m_e} + i\frac{z}{\hbar}\bar p_z\right)
\times\int_{-\infty}^{\infty} dx\,\exp\left[-x^2 - it\frac{2\bar p_z\Delta p}{\hbar m_e}x - it\frac{2(\Delta p)^2}{\hbar m_e}x^2 + 2i\frac{z\Delta p}{\hbar}x\right]$$

$$= \frac{2\sqrt{\Delta p}}{\sqrt{2\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-it\frac{\bar p_z^2}{2\hbar m_e} + i\frac{z}{\hbar}\bar p_z\right)
\times\int_{-\infty}^{\infty} dx\,\exp\left[-x^2\left(1 + it\frac{2(\Delta p)^2}{\hbar m_e}\right) - 2ix\frac{\Delta p}{\hbar}\left(\frac{\bar p_z}{m_e}t - z\right)\right].$$

Before continuing, let me brush up this expression a bit: first, by replacing $\bar p_z/m_e$,
which corresponds to the classical velocity of a particle with momentum $\bar p_z$, with
$v_{gr}$, and second, by introducing the notation

$$\alpha = \sqrt{1 + it\frac{2(\Delta p)^2}{\hbar m_e}} = \sqrt{1 + it\frac{\hbar}{2m_e(\Delta z_0)^2}} \qquad (6.16)$$

where in the second expression I replaced $\Delta p$ with $\Delta z_0$ using Eq. 6.15. As a result,
the expression for the wave function now takes a somewhat less cumbersome form:

$$\Psi(z,t) = \frac{2\sqrt{\Delta p}}{\sqrt{2\pi\hbar}\,(2\pi)^{1/4}}\exp\left(-it\frac{\bar p_z^2}{2\hbar m_e} + i\frac{z}{\hbar}\bar p_z\right)
\times\int_{-\infty}^{\infty} dx\,\exp\left[-x^2\alpha^2 - 2ix\frac{\Delta p}{\hbar}\left(v_{gr}t - z\right)\right].$$

Performing the integral over x (using all the same tricks as before, the substitution $\tilde x =
\alpha x$ and completion of the square), I obtain the final expression for the wave
function $\Psi(z,t)$:

$$\Psi(z,t) = \frac{\sqrt{2\Delta p}}{\sqrt{\hbar}\,(2\pi)^{1/4}\,\alpha}
\exp\left(-it\frac{\bar p_z^2}{2\hbar m_e} + i\frac{z}{\hbar}\bar p_z\right)
\exp\left[-\left(\frac{\Delta p}{\alpha\hbar}\right)^2\left(v_{gr}t - z\right)^2\right]$$

$$= \frac{1}{\alpha\sqrt{\Delta z_0}\,(2\pi)^{1/4}}
\exp\left[-\frac{\left(v_{gr}t - z\right)^2}{(2\alpha\Delta z_0)^2}\right]
\exp\left(-it\frac{\bar p_z^2}{2\hbar m_e} + i\frac{z}{\hbar}\bar p_z\right) \qquad (6.17)$$

where at the last step I used Eq. 6.15 to replace $\Delta p$ with $\Delta z_0$. Not surprisingly, at
t = 0 Eq. 6.17 reduces to $\psi(z)$ as given by Eq. 6.13.

It is quite educational to inspect the various factors in this expression separately. The
last factor is a regular plane wave with the wave number determined by the expectation
value of the momentum $\bar p_z$ and the corresponding frequency $\bar\omega = \bar p_z^2/(2\hbar m_e)$. This wave
propagates with the standard phase velocity $v_{ph} = \hbar\bar\omega/\bar p_z$, but its amplitude, defined
by the second exponential factor, is also time and coordinate dependent. For any
given instant t, there is a coordinate $z_{max} = v_{gr}t$ at which the amplitude is the largest,
and the amplitude decreases as z deviates from it in either direction. One can say that the
amplitude factor modulates the initial plane wave, turning it into a wave packet more
or less localized within a finite coordinate region. This localization region obviously
changes its position with time, as is evident from the definition of $z_{max}$, and this
motion of the localization region occurs with velocity $v_{gr} = z_{max}/t$. This velocity is
called the group velocity of the wave packet because it characterizes the motion of the
entire group of waves participating in the superposition forming the packet, while
the phase velocity describes the motion of each separate wave component of this
superposition. These two velocities are different because the phase velocity $v_{ph} =
p/(2m_e)$ depends on the momentum p and is, therefore, different for each member of
the group. To illustrate all these points, I plotted the real part of the wave function


Fig. 6.3 The real part of the wave function representing a wave packet as a function of the
coordinate for two different instants. You can see the suppression of oscillations away from the main
maximum as well as the displacement of the main maximum. The time interval is chosen to be
equal to the period of the oscillating factor so that the magnitude of the main maximum remains
the same for both instants. For other time intervals, this does not have to be the case because the
decrease of the cosine factor can damp the maximum's magnitude. Also, this plot does not account
for the fact that the parameter $\alpha$ in Eq. 6.17 is complex-valued and depends on time. See the discussion
of the role of this parameter further in the text

presented in Eq. 6.17 for two distinct instants, as shown in Fig. 6.3. It is easy
to see that the expression for the group velocity $v_{gr} = \bar p_z/m_e$ can be obtained from
the dispersion relation $E(p) = p_z^2/(2m_e)$ of the free particle as

$$v_{gr} = \left.\frac{dE}{dp_z}\right|_{p_z = \bar p_z}. \qquad (6.18)$$

Equation 6.18 can also be generalized to the case of three-dimensional propagation,
in which case the derivative is replaced by a gradient, $v_{gr} = \nabla_p E(\mathbf{p})|_{\mathbf{p}=\bar{\mathbf{p}}}$, and also
to more exotic cases of particles whose dispersion relation is different from Eq. 6.5.
If you are wondering where on earth you can find free particles with a dispersion
different from the standard quadratic form, here are two examples for you: (1) the relation
between energy and momentum of relativistic particles is $E = \sqrt{m_e^2c^4 + p^2c^2}$,
and (2) electrons in semiconductors can in many practically important cases be
approximated as "free" particles with a modified dispersion relation E(p). In general,
the picture of a wave packet propagating with the group velocity can be generalized
to non-Gaussian wave packets as long as they can be described by a momentum
wave function $A(p_z)$ with a single and relatively narrow maximum.
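As a quick numerical illustration of Eq. 6.18, here is a minimal sketch (assuming NumPy and rounded SI constants; the chosen momentum is an arbitrary illustrative value) that differentiates both dispersion relations mentioned above by a symmetric finite difference:

```python
import numpy as np

me, c = 9.109e-31, 2.998e8   # electron mass and speed of light (rounded SI)
pbar = me * 1.0e6            # momentum of an electron moving at 10^6 m/s

def vgr(E, p, h=1e-4):
    """Group velocity dE/dp at p via a symmetric finite difference (Eq. 6.18)."""
    return (E(p*(1 + h)) - E(p*(1 - h))) / (2*p*h)

# Nonrelativistic free particle, Eq. 6.5: v_gr = p/m, twice the phase velocity E/p.
E_free = lambda p: p**2 / (2*me)
assert np.isclose(vgr(E_free, pbar), pbar/me, rtol=1e-6)
assert np.isclose(E_free(pbar)/pbar, 0.5*pbar/me, rtol=1e-12)

# Relativistic dispersion E = sqrt(m^2 c^4 + p^2 c^2), for which v_gr = p c^2 / E.
E_rel = lambda p: np.sqrt((me*c**2)**2 + (p*c)**2)
assert np.isclose(vgr(E_rel, pbar), pbar*c**2/E_rel(pbar), rtol=1e-4)
```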

So far, the behavior of the wave packet appears to be consistent not only
with the traditional Copenhagen interpretation of quantum mechanics but also with
Schrödinger's pilot wave picture. However, when discussing the role of the amplitude-modulating
factor in Eq. 6.17, I have so far ignored an obvious "elephant in the
room," which makes this discussion somewhat more nuanced. The parameter $\alpha$,
which sits "quietly" in the denominator of the modulating factor (as well as in the
normalization pre-factor), is complex-valued and time dependent. The first of these
circumstances makes the expression


$$B(z,t) = \frac{1}{\alpha\sqrt{\Delta z_0}\,(2\pi)^{1/4}}
\exp\left[-\frac{\left(v_{gr}t - z\right)^2}{(2\alpha\Delta z_0)^2}\right] \qquad (6.19)$$

not quite a pure amplitude because it also has a phase attached to it. This phase,
however, is of little interest, and in order to focus on the actual amplitude part of this
expression, I will consider its squared absolute value, jB.z; t/j2, which, of course,
coincides with j‰.z; t/j2 and in the Copenhagen interpretation yields the probability
distribution P.z; t/ for the coordinates of the particle at any given time in the state
described by the wave packet ‰.z; t/:

$$P(z,t) = \frac{1}{\sqrt{2\pi}\,\Delta z_0|\alpha|^2}
\exp\left[-\frac{\left(v_{gr}t - z\right)^2}{(2\Delta z_0)^2}\left(\frac{1}{\alpha^2} + \frac{1}{(\alpha^2)^*}\right)\right]. \qquad (6.20)$$

Now I define

$$\frac{1}{(\Delta z)^2} = \frac{1}{2(\Delta z_0)^2}\left(\frac{1}{\alpha^2} + \frac{1}{(\alpha^2)^*}\right)$$

and using the definition of $\alpha$ from Eq. 6.16 calculate

$$\frac{1}{(\Delta z)^2} = \frac{1}{2(\Delta z_0)^2}
\left(\frac{1}{1 + it\frac{\hbar}{2m_e(\Delta z_0)^2}} + \frac{1}{1 - it\frac{\hbar}{2m_e(\Delta z_0)^2}}\right) =
\frac{1}{2(\Delta z_0)^2}\,\frac{2}{1 + t^2\frac{\hbar^2}{4m_e^2(\Delta z_0)^4}} =
\frac{(\Delta z_0)^2}{(\Delta z_0)^4 + \frac{\hbar^2 t^2}{4m_e^2}}. \qquad (6.21)$$

I can also find

$$\frac{1}{|\alpha|^2} =
\frac{1}{\sqrt{\left(1 + it\frac{\hbar}{2m_e(\Delta z_0)^2}\right)\left(1 - it\frac{\hbar}{2m_e(\Delta z_0)^2}\right)}} =
\frac{1}{\sqrt{1 + t^2\frac{\hbar^2}{4m_e^2(\Delta z_0)^4}}} = \frac{\Delta z_0}{\Delta z}. \qquad (6.22)$$

Substitution of Eqs. 6.21 and 6.22 into Eq. 6.20 converts the expression for the
probability density into the following nice-looking form:

$$P(z,t) = \frac{1}{\sqrt{2\pi}\,\Delta z}\exp\left[-\frac{\left(v_{gr}t - z\right)^2}{2(\Delta z)^2}\right]. \qquad (6.23)$$


Comparing this result with Eq. 6.13, it becomes clear that $z_{max} = v_{gr}t$ is the
expectation value of the coordinate in the state described by the wave packet, while
$\Delta z$ represents its uncertainty. Rewriting Eq. 6.21, I can present this uncertainty in a
more illuminating form:

$$\Delta z = \Delta z_0\sqrt{1 + \frac{\hbar^2 t^2}{4m_e^2(\Delta z_0)^4}} \qquad (6.24)$$

which shows that the localization range of the wave packet increases with time
at a rate (roughly, the long-time derivative $d(\Delta z)/dt \approx \hbar/(2m_e\Delta z_0)$) inversely proportional
to the initial uncertainty $\Delta z_0$. In other words, the tighter you try to squeeze your
particle into a small volume, the faster the localization volume of the particle
increases with time. This phenomenon of wave packet spreading is what kills
Schrödinger's pilot wave interpretation: the broadening of the wave packets would
make such a pilot wave unstable. To get an intuitive feeling for how fast this
spreading takes place, assume that an electron is initially localized in a region of
atomic dimensions with $\Delta z_0 \simeq 10^{-10}$ m. Substituting the values of the Planck
constant and the electron's mass into Eq. 6.24, you will get for $\Delta z$:

$$\Delta z = 10^{-10}\sqrt{1 + 3.4\times 10^{31}\, t^2}\ \text{m},$$

which reaches the value of $10^3$ m in just about 2 ms!
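This estimate is easy to reproduce. A minimal sketch using rounded SI values of the constants:

```python
import numpy as np

hbar = 1.0546e-34   # J*s
me = 9.109e-31      # kg
dz0 = 1.0e-10       # m: initial localization of atomic dimensions

# Coefficient of t^2 under the square root in Eq. 6.24.
coef = hbar**2 / (4*me**2*dz0**4)
print(f"coef = {coef:.2g} s^-2")   # coef = 3.4e+31 s^-2

# Time for the packet width to reach 1 km.
t = np.sqrt(((1e3/dz0)**2 - 1)/coef)
print(f"t = {t*1e3:.2f} ms")       # t = 1.73 ms
```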
This essentially completes the discussion of free particle wave packets, but
I cannot pass up an opportunity to play with the Heisenberg picture whenever it is
possible, and this is one of the simplest situations to showcase it. You can consider
it a reward for being such a good sport and wading with me through the
tedious analysis of the Gaussian wave packet.

Recalling that the Hamiltonian of a free particle is just

$$\hat H = \frac{\hat P^2}{2m_e},$$

you can easily derive the Heisenberg equations for the components of the position
and momentum operators

$$\frac{d\hat{\mathbf r}}{dt} = \frac{\hat{\mathbf P}}{m_e}, \qquad \frac{d\hat{\mathbf P}}{dt} = 0$$

with the obvious solution

$$\hat{\mathbf r} = \hat{\mathbf r}_0 + \frac{\hat{\mathbf P}_0}{m_e}t \qquad (6.25)$$


where $\hat{\mathbf r}_0$ and $\hat{\mathbf P}_0$ are, as usual, the Schrödinger picture's operators setting the initial
conditions in the Heisenberg picture. Assuming that the particle is in some arbitrary
state, which does not change with time in the Heisenberg picture, I can immediately
derive for the expectation value of the position operator

$$\langle\hat{\mathbf r}\rangle = \langle\hat{\mathbf r}_0\rangle + \frac{\langle\hat{\mathbf P}_0\rangle}{m_e}t,$$

which is, of course, a three-dimensional version of the expression for the expectation
value of the coordinate found from Eq. 6.23 with $\bar z$ set to zero. Now, squaring
Eq. 6.25, I get

$$\hat{\mathbf r}^2 = \hat{\mathbf r}_0^2 + \frac{\hat{\mathbf P}_0^2}{m_e^2}t^2 +
\frac{t}{m_e}\left(\hat{\mathbf r}_0\cdot\hat{\mathbf P}_0 + \hat{\mathbf P}_0\cdot\hat{\mathbf r}_0\right).$$

Using the position representation for the operators $\hat{\mathbf r}_0$ and $\hat{\mathbf P}_0$, and Eq. 6.13 to represent the state
of the particle, you can demonstrate by direct computation that the expectation value of
$\hat{\mathbf r}_0\cdot\hat{\mathbf P}_0 + \hat{\mathbf P}_0\cdot\hat{\mathbf r}_0$ in this state vanishes,
so that one has for the uncertainty of the position $\Delta r^2$:

$$\Delta r^2 = \Delta r_0^2 + \frac{\Delta p^2}{m_e^2}t^2, \qquad (6.26)$$

where $\Delta p^2$ is again the uncertainty of the momentum computed in the initial state
of the particle. If this initial state is Gaussian, then, limiting Eq. 6.26 to just a single
coordinate, you can use Eq. 6.15 to replace the momentum uncertainty with $\Delta z_0$,
which will yield Eq. 6.24 for the spreading of the wave packet. In addition to this,
however, Eq. 6.26 demonstrates that the phenomenon of wave packet spreading is
not limited to one-dimensional Gaussian packets and is a general feature of the
free propagation of quantum particles.

6.2 Rectangular Potential Wells and Barriers

6.2.1 Potential Wells: Systems with Mixed Spectrum

The first important model, which I am going to introduce in this section, is
characterized by the potential profile shown in Fig. 6.4, which can be described as

$$V(z) = \begin{cases} V_w, & |z| < d/2 \\ V_b, & |z| > d/2 \end{cases} \qquad (6.27)$$


Fig. 6.4 Rectangular potential well

where I assumed for concreteness that $V_b > V_w$. Such a potential profile is called a
potential well. If one chooses to count the energy from the bottom of the well, then
$V_w \to 0$ and $V_b \to V_b - V_w$. The energy levels in this potential must be separated
into two different regions, $V_w < E_z < V_b$ and $E_z > V_b$, with distinctly different types of
behavior (states with $E_z < V_w$ do not exist). In the former case, a classical particle
would be confined between the two "walls" of this potential well at $|z| =
d/2$, bouncing back and forth, while in the latter, the classical motion is unbounded (the
particle can be anywhere along the Z-axis). As was already discussed in Sect. 5.1.3,
the two different types of classical behavior translate into different quantum behaviors
as well.

Bound States: Discrete Spectrum
I will begin with the spectral region $V_w < E_z < V_b$, which corresponds to bound
classical motion, and where you should expect to see a discrete spectrum of energy
eigenvalues. The potential we are dealing with is a piecewise continuous function
with finite jumps at $z = \pm d/2$. For the range of energies under consideration, the
spatial regions defined by $|z| > d/2$ are classically forbidden. Therefore, as was
discussed in Sect. 5.1.3, Eq. 5.37 must be complemented by the boundary conditions
requiring that the wave function vanish at $z \to \pm\infty$ and by continuity conditions
at $|z| = d/2$.

However, before I start dirtying my hands and digging into the boring business
of actually writing down the wave functions and matching the boundary conditions
and all that, I want to play with the problem a little bit more and see if I can
make this task a bit less boring. The nice thing about this particular potential is
that it is symmetric with respect to inversion of the coordinate z: $V(-z) = V(z)$,
and I hope that you recognize here your old acquaintance from Sect. 5.1.2, the
parity transformation, $\hat\Pi V(z) = V(-z)$. Not only is the potential symmetric,
but the boundary conditions are also symmetric: $\varphi(z) \to 0$ when $|z| \to \infty$.
Since the kinetic energy was shown earlier to be always parity invariant (it does not
change upon the parity transformation), you can confidently conclude that the entire
Hamiltonian of this system is symmetric with respect to this transformation. And
as I have already explained in Sect. 5.1.2, this means that the Hamiltonian commutes
with the parity operator $\hat\Pi$, so that the wave functions representing eigenvectors
of this Hamiltonian also represent eigenvectors of $\hat\Pi$. Wherefore, solutions of the
Schrödinger equation 5.37 with the potential given by Eq. 6.27 can be classified into
even ($\varphi(-z) = \varphi(z)$) and odd ($\varphi(-z) = -\varphi(z)$) functions, with the immediate
consequence that you only need to deal with boundary and continuity conditions
at $z > 0$. Indeed, the definite parity of the solutions, even or odd, ensures that the
conditions at $z < 0$ are satisfied simultaneously with those at $z > 0$. Here is the
power of symmetry for you: I just cut the number of equations to be solved to
satisfy the continuity conditions in half without even breaking a sweat! Using
symmetry arguments for such a simple problem might seem a bit of an overkill: it
is not too difficult to solve it by simply using brute force. I still wanted to show
it to you so that you would be better prepared to understand the implications of
symmetry in more "sanity-threatening" situations.

Because of the discontinuity of the potential at $|z| = d/2$, solutions for the intervals
$|z| < d/2$ and $|z| > d/2$ must be found independently and stitched together afterward using
continuity conditions.

1. $|z| < d/2$. The Schrödinger equation for this interval takes the form

$$\frac{d^2\varphi}{dz^2} = -\frac{2m_e}{\hbar^2}\left(E_z - V_w\right)\varphi(z)$$

where $E_z - V_w > 0$. This equation is similar to that of a free particle with
positive energy $E_z - V_w$, and its most general solution has, therefore, the form

$$\varphi(z) = Ae^{ikz} + Be^{-ikz} = \tilde A\sin kz + \tilde B\cos kz$$

where

$$k = \sqrt{2m_e\left(E_z - V_w\right)}/\hbar \qquad (6.28)$$

is a real quantity. The choice of exponential or trigonometric functions to represent
this solution is a matter of one's taste and/or convenience: the expressions
are equivalent with $\tilde A = i(A - B)$ and $\tilde B = A + B$. However, since we know that
$\varphi(z)$ must have a definite parity, the trigonometric form is more convenient for
taking advantage of this insight. Indeed, in order to generate an even solution, I can
simply set $\tilde A = 0$, while an odd solution is obtained by choosing $\tilde B = 0$:

$$\varphi_e(z) = B\cos kz, \quad |z| < d/2 \qquad (6.29)$$
$$\varphi_o(z) = A\sin kz, \quad |z| < d/2. \qquad (6.30)$$

The notation for the remaining coefficients is irrelevant, so I dropped the tildes.


2. $|z| > d/2$. In this case, the right-hand side of the corresponding Schrödinger
equation

$$\frac{d^2\varphi}{dz^2} = \frac{2m_e}{\hbar^2}\left(V_b - E_z\right)\varphi(z)$$

is positive for the range of energies under consideration ($V_b - E_z > 0$), so that the
general solution of this equation is given by

$$\varphi(z) = Ce^{\kappa z} + De^{-\kappa z} \qquad (6.31)$$

where now

$$\kappa = \sqrt{2m_e\left(V_b - E_z\right)}/\hbar \qquad (6.32)$$

is a real quantity. As has already been mentioned, the solution of the Schrödinger
equation in the region $z > d/2$ must vanish at $z \to +\infty$ (Eq. 5.36). The function
presented by Eq. 6.31 satisfies this requirement only if the exponentially growing
term is gotten rid of, which I achieve by simply requiring that $C = 0$. Thus, the
wave function for $z > d/2$ becomes

$$\varphi(z) = De^{-\kappa z}, \quad z > d/2 \qquad (6.33)$$

for both even and odd solutions. For negative coordinates $z < -d/2$,
Eq. 6.33 would produce for the even and odd solutions, correspondingly,

$$\varphi_e(z) = De^{\kappa z} \qquad (6.34)$$
$$\varphi_o(z) = -De^{\kappa z}. \qquad (6.35)$$

Before continuing with stitching the wave functions at $z = d/2$, let me point out
that the solutions in the classically allowed region $|z| < d/2$ are presented by
oscillating functions. Upon crossing into the classically forbidden region $|z| > d/2$,
the oscillating character of the solutions turns into a monotonic decrease. This
example illustrates the generic properties discussed in Sect. 5.1.3 and can be used to
formulate a general rule of thumb applicable to any piecewise constant potential: in
classically allowed regions, the wave function is represented by a combination of
trigonometric functions, while in classically forbidden regions, the solution is given
by a combination of exponential functions with real arguments. However, I need to
warn you to pay attention to the fact that I was able to eliminate the exponentially
growing terms in Eqs. 6.33–6.35 only because the classically forbidden region
extended all the way to positive or negative infinity. If, as might happen in
certain problems, the potential had another jump and a classically forbidden
region crossed over into a classically allowed region, you would have to
keep both growing and decreasing exponential functions, because the conditions at
infinity can only be used within a region of coordinates extending, well, to infinity.

Equation 6.33 must be stitched with either Eq. 6.29 or 6.30 to generate a continuous
solution describing the wave function in the entire domain of the coordinate z. This
must be done separately for even and odd solutions. In the former case, the continuity
of the wave function and of its derivative at $z = d/2$ requires that

$$B\cos\frac{kd}{2} = De^{-\kappa d/2} \qquad (6.36)$$

$$-Bk\sin\frac{kd}{2} = -\kappa De^{-\kappa d/2}. \qquad (6.37)$$

For arbitrary values of $E_z$, which appears in these equations via the parameters $k$ and $\kappa$,
Eqs. 6.36 and 6.37 can have only the trivial solution $B = D = 0$. Obviously, this is not
what we want. However, if I insist on having non-zero solutions, I must impose a
special condition on the allowed values of $k$ and $\kappa$. One way to derive this condition
is to divide Eq. 6.37 by Eq. 6.36 (this is allowed because we require that $B, D \ne 0$),
yielding

$$k\tan\frac{kd}{2} = \kappa. \qquad (6.38)$$

Taking into account Eqs. 6.28 and 6.32, you can recognize that Eq. 6.38 is a transcendental
equation for the energy $E_z$. Solutions of this equation determine the values of
energy permitting the existence of non-zero coefficients $B$ and $D$ and, hence, of
wave functions satisfying all the boundary conditions. The solutions of Eq. 6.38
are obviously eigenvalues of the Hamiltonian, also called allowed energy values or
energy levels.

Equation 6.38 does not submit to an analytical solution, but it still can be
qualitatively analyzed to help you determine, at least, the number of solutions
it might have. To this end, it is convenient to rewrite this equation in terms of
dimensionless variables such as $kd/2$ and $\kappa d/2$. To facilitate the transition to these
variables, I first compute $k^2 + \kappa^2$ using Eqs. 6.28 and 6.32:

$$k^2 + \kappa^2 = \frac{2m_e(V_b - V_w)}{\hbar^2}.$$

Multiplying this expression by $d^2/4$ and introducing the dimensionless variable $\varepsilon$ for $kd/2$, I
have

$$\varepsilon^2 + \frac{\kappa^2 d^2}{4} = \frac{m_e(V_b - V_w)d^2}{2\hbar^2} \;\Rightarrow\; \frac{\kappa d}{2} = \sqrt{\varepsilon_0^2 - \varepsilon^2},$$

182 6 One-Dimensional Models

where I introduced another dimensionless parameter $\varepsilon_0$ defined as

$$\varepsilon_0 = \frac{d}{\hbar}\sqrt{\frac{m_e(V_b - V_w)}{2}}.$$

Multiplying both sides of Eq. 6.38 by $d/2$, I can rewrite it now as

$$\tan\varepsilon = \frac{\sqrt{\varepsilon_0^2 - \varepsilon^2}}{\varepsilon}. \tag{6.39}$$

You can see that $\varepsilon_0$ incorporates all relevant parameters of the system and solely
determines the allowed energy values. This is an excellent illustration of the power
of dimensionless variables: four different parameters have collapsed into a single
one, which rules them all. Without even solving the equation, I know now that all
rectangular potential wells with different values of $m_e$, $V_b - V_w$, and $d$ will have the
same dimensionless energy levels as long as all these parameters correspond to the
same $\varepsilon_0$.

Obviously, Eq. 6.39 only makes sense for $\varepsilon \le \varepsilon_0$, so it is important to understand
the physical meaning of this condition. Substituting all necessary definitions, you
can see that $\varepsilon = \varepsilon_0$ turns into

$$\frac{m_e(E_z - V_w)d^2}{2\hbar^2} = \frac{m_e(V_b - V_w)d^2}{2\hbar^2} \;\Rightarrow\; E_z = V_b,$$

i.e., at the point $\varepsilon = \varepsilon_0$ the energy crosses over the top of the potential barrier, where the
assumptions used to derive Eq. 6.39 lose their validity.

In order to understand the solutions of Eq. 6.39, it is useful to visualize the graphs of its
left-hand and right-hand sides. As $\varepsilon$ increases from zero, the function on the right
decreases from positive infinity to zero at $\varepsilon = \varepsilon_0$, where it terminates. The left-hand
side is a tangent, which grows from zero at $\varepsilon = 0$, diverges as $\varepsilon$ approaches
$\pi/2$, where it jumps all the way to negative infinity, starts its
climb toward the next zero at $\pi$, diverges again at $3\pi/2$, and so on. If $\varepsilon_0 < \pi$,
the two functions cross only once, because the right-hand side ends before
the left-hand side manages to get into positive territory again. Once $\varepsilon_0$, however,
crosses the $\pi$ threshold, a second crossing becomes possible, and one more when
$\varepsilon_0$ exceeds $2\pi$, and so on. An important point here is that at least one even solution
always exists no matter how small $\varepsilon_0$ becomes. Another important qualitative point
to take home is that the magnitude of $\varepsilon_0$ depends on two main parameters:
the depth of the well $V_b - V_w$ and its geometric width $d$; wider and deeper
wells can accommodate a larger number of allowed energy levels, and
in order to decrease the number of energy eigenvalues belonging to the discrete
spectrum, one can make the well either narrower or shallower. It is important to
remember, though, that the geometric width of the well affects not only the values of
the allowed energies but also the difference between adjacent energy levels, which are
closer to each other in wider wells.
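The counting argument above is easy to check numerically. The sketch below (a minimal example assuming NumPy and SciPy, everything in the dimensionless variables just introduced) brackets each branch of $\tan\varepsilon$ between its zero at $\pi n$ and its vertical asymptote and looks for a crossing with the right-hand side of Eq. 6.39:

```python
import numpy as np
from scipy.optimize import brentq

def even_levels(eps0):
    """Roots of Eq. 6.39, tan(eps) = sqrt(eps0**2 - eps**2)/eps,
    on (0, eps0); each root is an even-state energy level."""
    f = lambda e: np.tan(e) - np.sqrt(eps0**2 - e**2) / e
    roots, n = [], 0
    while n * np.pi < eps0:
        # one branch of tan, between its zero and its vertical asymptote
        a = n * np.pi + 1e-9
        b = min(n * np.pi + np.pi / 2, eps0) - 1e-9
        if a < b and f(a) * f(b) < 0:
            roots.append(brentq(f, a, b))
        n += 1
    return roots

# one even level always exists; a new one appears each time eps0 crosses n*pi
assert len(even_levels(2.0)) == 1
assert len(even_levels(4.0)) == 2      # pi < 4 < 2*pi
assert len(even_levels(10.0)) == 4     # 3*pi < 10 < 4*pi
```

The count follows the graphical rule: one crossing per completed branch of the tangent, plus the branch in which $\varepsilon_0$ itself falls.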


The stitching conditions for the odd wave functions take the form

$$A \sin\frac{kd}{2} = D e^{-\kappa d/2} \tag{6.40}$$

$$Ak \cos\frac{kd}{2} = -\kappa D e^{-\kappa d/2}, \tag{6.41}$$

resulting in a different equation for the allowed energy values:

$$\cot\varepsilon = -\frac{\sqrt{\varepsilon_0^2 - \varepsilon^2}}{\varepsilon}. \tag{6.42}$$

The right-hand side of this equation changes from negative infinity to zero at $\varepsilon = \varepsilon_0$,
while the left-hand side begins at positive infinity at $\varepsilon = 0$ and crosses into
negative territory only for $\varepsilon > \pi/2$. Thus, if $\varepsilon_0 < \pi/2$, this equation has no
solutions. In this case, the only allowed eigenvalue of energy corresponds to a single
even solution for the wave function. Increasing $\varepsilon_0$ beyond $\pi/2$ produces the first
odd solution, and this happens before the second even solution appears. Following
this line of reasoning, you can see that with increasing $\varepsilon_0$, eigenvalues corresponding
to odd and even wave functions appear in an alternating manner.

The state corresponding to the lowest energy is called the ground state, and as you
just saw, it is always represented by an even function. The next energy corresponds
to an odd solution, then you have again an energy level corresponding to the even
solution, then to an odd one again, and this pattern repeats until the last allowed
eigenvalue is reached, which can be either odd or even.

All solutions of Eqs. 6.39 and 6.42 can be enumerated as $\varepsilon_n$, where $n = 1$
corresponds to the ground state (even) solution, $n = 2$ to the lowest-in-energy odd
solution, and so on and so forth. A similar enumeration can be applied to the wave
functions:

$$\varphi_n(z) = \begin{cases} B_n \cos k_n z & |z| < d/2 \\ B_n \cos(k_n d/2)\, e^{\kappa_n d/2} e^{-\kappa_n |z|} & |z| > d/2 \end{cases}, \quad n = 1, 3, 5, \ldots \tag{6.43}$$

and

$$\varphi_n(z) = \begin{cases} A_n \sin k_n z & |z| < d/2 \\ A_n \sin(k_n d/2)\, e^{\kappa_n d/2} e^{-\kappa_n z} & z > d/2 \\ -A_n \sin(k_n d/2)\, e^{\kappa_n d/2} e^{\kappa_n z} & z < -d/2 \end{cases}, \quad n = 2, 4, 6, \ldots \tag{6.44}$$

Here, $k_n = 2\varepsilon_n/d$ and $\kappa_n = 2\sqrt{\varepsilon_0^2 - \varepsilon_n^2}/d$, while the corresponding values of the energy
$E_{z,n}$ are

$$E_{z,n} = V_w + \frac{2\hbar^2}{m_e d^2}\,\varepsilon_n^2. \tag{6.45}$$
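The alternation of even and odd levels can be verified with a simple grid-plus-bisection scan of Eqs. 6.39 and 6.42 (a sketch assuming NumPy and SciPy, in dimensionless units; by Eq. 6.45 the energies then grow monotonically with $\varepsilon_n$):

```python
import numpy as np
from scipy.optimize import brentq

def bound_levels(eps0, npts=20000):
    """All dimensionless levels eps_n in (0, eps0): even roots of Eq. 6.39,
    odd roots of Eq. 6.42. Returns a sorted list of (eps_n, parity)."""
    g = lambda e: np.sqrt(eps0**2 - e**2) / e
    eqs = {'even': lambda e: np.tan(e) - g(e),
           'odd':  lambda e: 1.0 / np.tan(e) + g(e)}
    levels = []
    for parity, f in eqs.items():
        x = np.linspace(1e-6, eps0 - 1e-9, npts)
        y = f(x)
        for i in range(npts - 1):
            # a genuine root, unlike a pole of tan or cot, has small |f| nearby
            if y[i] * y[i + 1] < 0 and abs(y[i]) < 50 and abs(y[i + 1]) < 50:
                levels.append((brentq(f, x[i], x[i + 1]), parity))
    return sorted(levels)

lv = bound_levels(10.0)
assert len(lv) == 7
assert [p for _, p in lv] == ['even', 'odd'] * 3 + ['even']
```

The parities of the sorted roots interleave, the lowest level is always even, and the last one can be of either parity, exactly as argued above.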


Fig. 6.5 Graphic solution of the eigenvalue equation for even and odd wave functions

You may notice that the wave functions in Eqs. 6.43 and 6.44 still contain undefined
coefficients $B_n$ and $A_n$, respectively, while the coefficient $D$ was eliminated using
Eqs. 6.36 and 6.40. This is, however, normal, because all eigenvectors, and the functions
representing them, are always defined only up to a constant factor (I have said that
already, and not once, right?). As usual, the values of these coefficients can be fixed by
the normalization condition

$$\int_{-\infty}^{\infty} \varphi_n^2(z)\, dz = 1.$$

Figure 6.5 illustrates the emergence of the even and odd solutions
described above. The graph on the left refers to Eq. 6.39 for the energies of the even
states, and the graph on the right corresponds to Eq. 6.42 for the energies of the odd
states. One can see that the crossing points on the graphs, signifying values of the
dimensionless energy parameter $\varepsilon$, alternate in their values between even and odd
states: the lowest energy value comes from the graph on the left, the second lowest
appears in the graph on the right, and this alternation continues throughout all ten
energy values depicted in these plots. It is also instructive to plot the wave functions
corresponding to a few of the lowest energy eigenvalues. The graphs in Fig. 6.6 present (from
left to right) the ground state and the first and second excited states. In addition to
clearly demonstrating the even and odd nature of the respective states, these graphs
reveal an important phenomenon: a transition to a higher energy level always
adds an extra zero to the corresponding wave function. This behavior is actually
a manifestation of a mathematical theorem valid for any one-dimensional problem
with a discrete spectrum: the number of zeroes of a wave function corresponding to the
$n$-th energy level ($n = 1$ corresponds to the ground state) is always equal to $n - 1$.
Since a rigorous proof of this statement is beyond our reach, I will illustrate this
point by considering the limiting case of a very deep well, such that $\varepsilon_0 \gg 1$. In the limit
$\varepsilon_0 \to \infty$, Eq. 6.39 has solutions $\varepsilon_n = \pi n/2,\; n = 1, 3, 5, \ldots$, while the solutions of
Eq. 6.42 are $\varepsilon_n = \pi n/2,\; n = 2, 4, 6, \ldots$. The corresponding energy values from
Eq. 6.45 coincide with those given in Eq. 5.89 (if one replaces $L$ with $d$) for a


Fig. 6.6 Wave functions corresponding to the first three lowest energy eigenvalues of a rectangular
potential well

particle whose motion is confined to a finite region of total length $d$. Obviously,
this confinement corresponds to the limit of a potential well with infinitely high
barriers. The wave function in this case becomes

$$\varphi_n(z) = \begin{cases} B_n \cos\dfrac{\pi n z}{d} & n = 1, 3, 5, \ldots \\ A_n \sin\dfrac{\pi n z}{d} & n = 2, 4, 6, \ldots \end{cases}$$

within the well $|z| < d/2$, and it is exactly zero outside of the well ($\kappa_n$ goes to
infinity and suppresses the exponential terms $\exp[\kappa_n(-z + d/2)]$ for all $z > d/2$
and $\exp[\kappa_n(z + d/2)]$ for all $z < -d/2$). Now, one can clearly see how increasing
$n$ by 1 transforms cos into sin, adding an extra zero to the function as $z$ changes
from $-d/2$ to $d/2$.
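This deep-well limit is also easy to verify numerically: for very large $\varepsilon_0$ the even roots of Eq. 6.39 should sit just below $\pi/2$, $3\pi/2$, $5\pi/2$, and so on (a small sketch assuming SciPy):

```python
import numpy as np
from scipy.optimize import brentq

eps0 = 1e4        # a "very deep" well, eps0 >> 1
f = lambda e: np.tan(e) - np.sqrt(eps0**2 - e**2) / e

for n in (1, 3, 5):                   # even-state quantum numbers
    m = (n - 1) // 2                  # branch of tan holding the n-th root
    root = brentq(f, m * np.pi + 1e-9, m * np.pi + np.pi / 2 - 1e-9)
    assert abs(root - n * np.pi / 2) < 1e-2   # eps_n -> pi*n/2 as eps0 -> inf
```

The residual deviation from $\pi n/2$ scales as $\varepsilon_n/\varepsilon_0$, i.e., it vanishes in the infinite-well limit.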
Unbound (Scattering) States: Continuous Spectrum
The range of energies satisfying the condition $E_z > V_b$ corresponds to unbound
classical motion, where the entire domain $-\infty < z < \infty$ becomes classically
allowed. The motion of the classical particle depends on the combination of its
initial position and initial velocity: if the particle starts at $z < -d/2$ with velocity
directed to the left, or at $z > d/2$ with positive velocity, it will keep moving with the


same velocity; the potential well would not affect its motion at all. If, however, the
initial motion of the particle is directed toward the well, it will experience an infinite
acceleration (or deceleration) during an infinitesimally short time interval when passing
the points $z = \pm d/2$, which results in a finite increase and then decrease of the
particle's speed. After passing the region of the well, the particle resumes its
rectilinear motion with the same velocity as before.

Quantum mechanical behavior of the particle is described by the solution of the
Schrödinger equation, which in the classically allowed region can be presented as
a combination of exponential functions with complex arguments. In this spectral
region, the symmetry arguments, which I used to find discrete energy levels and
the corresponding wave functions, are no longer valid because of the inherent
asymmetry in the initial conditions. You will see soon that this asymmetry, which is
evident in the classical description of the unbound motion, will manifest itself in the
quantum description as well. Therefore, it is no longer necessary to keep the origin
of the coordinate axis at the center of the well, and the analysis becomes a bit
more convenient if I move it to the left by $d/2$. In this case, the left boundary of
the well corresponds to $z = 0$, and the coordinate regions, for which different wave
functions must be written, are now defined as $z < 0$, $0 < z < d$, and $z > d$. The
most general solution for the wave function in each of these regions can be written
down as

$$\varphi(z) = \begin{cases} A_1 e^{ik_1 z} + B_1 e^{-ik_1 z} & z < 0 \\ A_2 e^{ik_2 z} + B_2 e^{-ik_2 z} & 0 < z < d \\ A_3 e^{ik_1 z} + B_3 e^{-ik_1 z} & z > d \end{cases} \tag{6.46}$$

where $k_1 = \sqrt{2m_e(E_z - V_b)}/\hbar$ and $k_2 = \sqrt{2m_e(E_z - V_w)}/\hbar$. This expression for the wave
function has to be complemented by four stitching conditions, two at each of the
points of discontinuity. Requiring continuity of the function and its derivative at
$z = 0$ and $z = d$, I get

$$A_1 + B_1 = A_2 + B_2 \tag{6.47}$$

$$k_1(A_1 - B_1) = k_2(A_2 - B_2) \tag{6.48}$$

$$A_2 e^{ik_2 d} + B_2 e^{-ik_2 d} = A_3 e^{ik_1 d} + B_3 e^{-ik_1 d} \tag{6.49}$$

$$k_2\left(A_2 e^{ik_2 d} - B_2 e^{-ik_2 d}\right) = k_1\left(A_3 e^{ik_1 d} - B_3 e^{-ik_1 d}\right) \tag{6.50}$$

where Eqs. 6.47 and 6.49 ensure the continuity of the wave function at $z = 0$ and $z = d$,
respectively, while Eqs. 6.48 and 6.50 do the same for its derivative. Simply
counting the number of unknown coefficients and comparing it with the number
of equations tells me that I have a problem here: there are only four equations for
six unknowns, which is one unknown too many. However, I have not yet specified the
desired behavior of the wave function at infinity (a boundary condition), which can
be useful in eliminating extra unknowns. Unlike the case of the discrete spectrum,


where the behavior of the wave functions at infinity is uniquely prescribed, here
I have an array of choices reflecting different physical situations for which the
problem at hand is being used.

Before digging into the issue of the boundary conditions at infinity for this
problem, it might be useful to get a better physical understanding of the terms
appearing in the expressions for $\varphi(z)$. To this end, let me dust off some results from
Sect. 5.1.3, namely, the concept of the probability current, Eq. 5.42, which in its
one-dimensional reincarnation takes the form

$$j = \frac{i\hbar}{2m_e}\left(\varphi \frac{d\varphi^*}{dz} - \varphi^* \frac{d\varphi}{dz}\right). \tag{6.51}$$

Substituting the generic form of the wave function $A_i e^{ik_i z} + B_i e^{-ik_i z}$ into Eq. 6.51,
you find

$$j = \frac{i\hbar}{2m_e}\left[-ik_i\left(A_i e^{ik_i z} + B_i e^{-ik_i z}\right)\left(A_i^* e^{-ik_i z} - B_i^* e^{ik_i z}\right) - ik_i\left(A_i^* e^{-ik_i z} + B_i^* e^{ik_i z}\right)\left(A_i e^{ik_i z} - B_i e^{-ik_i z}\right)\right]$$

$$= \frac{\hbar k_i}{2m_e}\left[|A_i|^2 - |B_i|^2 + B_i A_i^* e^{-2ik_i z} - B_i^* A_i e^{2ik_i z} + |A_i|^2 - |B_i|^2 - B_i A_i^* e^{-2ik_i z} + B_i^* A_i e^{2ik_i z}\right] = \frac{\hbar k_i}{m_e}|A_i|^2 - \frac{\hbar k_i}{m_e}|B_i|^2.$$

The first term in this expression describes a positive probability current (directed in the
positive z direction) associated with the term $A_i e^{ik_i z}$ in the wave function, while the
second, negative, term describes a probability current in the opposite, negative z
direction, associated with the term $B_i e^{-ik_i z}$. Now, imagine
a classical beam of particles of mass $m_e$ all moving with the same speed $v$ in the
positive direction of the z-axis. The current of particles in this beam (the number of
particles crossing a plane perpendicular to the flow per unit time per unit area of
the cross section of the beam) is easily found to be $Nv$, where $N$ is the number of
particles per unit volume of the beam. This expression coincides with the
quantum mechanical probability current if you replace $v$ with $p/m_e$, $p$ with $\hbar k$, and
identify $|A|^2$ with $N$. This comparison allows interpreting the terms of the wave
function containing $e^{ikz}$ as corresponding to a beam of particles propagating from
left to right, and terms with $e^{-ikz}$ as describing particles propagating in the opposite
direction.
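To see Eq. 6.51 at work, here is a small numerical check (a sketch in natural units $\hbar = m_e = 1$, with arbitrarily chosen amplitudes): the interference terms cancel, so the current of a two-wave state is constant in $z$ and equals $\hbar k(|A|^2 - |B|^2)/m_e$.

```python
import numpy as np

hbar = m_e = 1.0                      # natural units (assumption of this sketch)
k, A, B = 1.3, 0.8 + 0.2j, 0.3 - 0.5j

z = np.linspace(-5.0, 5.0, 200001)
phi = A * np.exp(1j * k * z) + B * np.exp(-1j * k * z)
dphi = np.gradient(phi, z)            # numerical d(phi)/dz

# Eq. 6.51: one-dimensional probability current
j = (1j * hbar / (2 * m_e)) * (phi * np.conj(dphi) - np.conj(phi) * dphi)

# j is real and z-independent (interior points avoid the one-sided stencil)
j_expected = hbar * k / m_e * (abs(A)**2 - abs(B)**2)
assert np.allclose(j.real[1:-1], j_expected, rtol=1e-4)
assert np.allclose(j.imag, 0.0, atol=1e-12)
```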

A typical experiment involving particles with energies in the continuous segment
of the spectrum consists in sending particles created by some source, positioned far
away from the potential well, toward the well and counting the number of particles
in the beam behind the well (transmitted particles) or the number of particles in


front of the well but propagating in the negative z direction (reflected particles). In
this case, the asymptotic behavior of the wave function at negative infinity must
contain both left- and right-propagating currents, while the wave function at
positive infinity contains only the right-propagating ones. This gives one
of the possible boundary conditions at infinity corresponding to this particular
experimental situation: $\varphi(z \to \infty) = A_3 e^{ik_1 z}$. For the wave function to have this
experimental situation: ' .z ! 1/ D A3eik1z. For the wave function to have this
form, coefficient B3 in Eq. 6.46 must be set to zero. As a result, I end up with five
unknown coefficients and the same four equations, and what is left to realize is
that the term A1eik1z describes the current of particles created by the source, which
is external to the Schrödinger equation and is determined by an experimentalist
controlling the concentration of particles in the outgoing beam. Thus, A1 shall be
treated as a free parameter, while all remaining coefficients must be expressed in its
terms.

Quantities actually measured in the experiment, i.e., the fraction of particles
reflected by the potential or the fraction of particles transmitted past the potential,
can be interpreted quantum mechanically as the probabilities of reflection $R = j_r/j_{inc}$
and transmission $T = j_{tr}/j_{inc}$, where I introduced notations for the reflected current
$j_r = \hbar k_1 |B_1|^2/m_e$, the incident current $j_{inc} = \hbar k_1 |A_1|^2/m_e$, and the transmitted current
$j_{tr} = \hbar k_3 |A_3|^2/m_e$. The wave number $k_3 = \sqrt{2m_e(E_z - V_\infty)}/\hbar$ is determined by the value
of the potential at $z \to \infty$, $V_\infty$. In the particular case I am dealing with now, the
potentials at $z < 0$ and $z > d$ are the same, so that $V_\infty = V_b$ and $k_3 = k_1$. One
should realize, however, that this is not always the case, so one has to be careful
when defining the transmission probability. The most general expressions for the
reflection and transmission probabilities are

$$R = \frac{|B_1|^2}{|A_1|^2} \tag{6.52}$$

$$T = \frac{k_3 |A_3|^2}{k_1 |A_1|^2}. \tag{6.53}$$

Now it becomes clear that in order to obtain an experimentally relevant form
of the scattering wave function, you need to solve the following system of
equations, expressing all unknown coefficients in terms of the amplitude of the incident
particles $A_1$:

$$1 + r = A_2 + B_2, \tag{6.54}$$

$$k_1(1 - r) = k_2(A_2 - B_2), \tag{6.55}$$

$$A_2 e^{ik_2 d} + B_2 e^{-ik_2 d} = t e^{ik_1 d}, \tag{6.56}$$

$$k_2\left(A_2 e^{ik_2 d} - B_2 e^{-ik_2 d}\right) = k_1 t e^{ik_1 d}. \tag{6.57}$$


Here, I introduced the amplitude reflection and transmission coefficients $r = B_1/A_1$
and $t = A_3/A_1$, respectively, and redefined the amplitudes $A_2$ and $B_2$ as $A_2/A_1 \to A_2$
and $B_2/A_1 \to B_2$. Combining the first two equations, I obtain

$$\left(1 + \frac{k_1}{k_2}\right) + \left(1 - \frac{k_1}{k_2}\right) r = 2A_2,$$

$$\left(1 - \frac{k_1}{k_2}\right) + \left(1 + \frac{k_1}{k_2}\right) r = 2B_2,$$

while the other two yield

$$\left(1 + \frac{k_1}{k_2}\right) t e^{ik_1 d} = 2A_2 e^{ik_2 d},$$

$$\left(1 - \frac{k_1}{k_2}\right) t e^{ik_1 d} = 2B_2 e^{-ik_2 d}.$$

Expressing $A_2$ and $B_2$ from the last pair of equations and substituting them into the first
pair, I get

$$\left(1 + \frac{k_1}{k_2}\right) + \left(1 - \frac{k_1}{k_2}\right) r = t\left(1 + \frac{k_1}{k_2}\right) e^{ik_1 d} e^{-ik_2 d},$$

$$\left(1 - \frac{k_1}{k_2}\right) + \left(1 + \frac{k_1}{k_2}\right) r = t\left(1 - \frac{k_1}{k_2}\right) e^{ik_1 d} e^{ik_2 d},$$

which after some brushing up yields

$$\left[1 + r\,\frac{k_2 - k_1}{k_2 + k_1}\right] e^{-ik_1 d} e^{ik_2 d} = t,$$

$$\left[1 + r\,\frac{k_2 + k_1}{k_2 - k_1}\right] e^{-ik_1 d} e^{-ik_2 d} = t.$$

Equating the left-hand sides of these two equations gives

$$e^{-ik_1 d} e^{ik_2 d}\left(1 + \frac{k_2 - k_1}{k_2 + k_1}\, r\right) = e^{-ik_1 d} e^{-ik_2 d}\left(1 + \frac{k_2 + k_1}{k_2 - k_1}\, r\right) \;\Rightarrow$$

$$r\left(\frac{k_2 + k_1}{k_2 - k_1}\, e^{-2ik_2 d} - \frac{k_2 - k_1}{k_2 + k_1}\right) = 1 - e^{-2ik_2 d}.$$

Finally, some simple algebraic manipulations, which I hope you can reproduce
yourselves, yield


$$r = \frac{\left(k_1^2 - k_2^2\right)\sin k_2 d}{\left(k_2^2 + k_1^2\right)\sin k_2 d + 2ik_2 k_1 \cos k_2 d}, \tag{6.58}$$

$$t = \frac{2ik_2 k_1}{\left(k_2^2 + k_1^2\right)\sin k_2 d + 2ik_2 k_1 \cos k_2 d}. \tag{6.59}$$

Now, you can easily find the two remaining coefficients $A_2$ and $B_2$:

$$A_2 = \frac{1}{2}\left(1 + \frac{k_1}{k_2}\right) e^{i(k_1 - k_2)d}\, \frac{2ik_2 k_1}{\left(k_2^2 + k_1^2\right)\sin k_2 d + 2ik_2 k_1 \cos k_2 d}, \tag{6.60}$$

$$B_2 = \frac{1}{2}\left(1 - \frac{k_1}{k_2}\right) e^{i(k_1 + k_2)d}\, \frac{2ik_2 k_1}{\left(k_2^2 + k_1^2\right)\sin k_2 d + 2ik_2 k_1 \cos k_2 d}. \tag{6.61}$$
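The algebra above can be cross-checked by brute force: solve the linear system, Eqs. 6.54 to 6.57, numerically and compare with the closed forms (a sketch assuming NumPy; since the overall phase of $t$ depends on the convention adopted, the moduli are compared for $t$, while $r$ should agree exactly):

```python
import numpy as np

def rt_closed(k1, k2, d):
    """Amplitude coefficients from Eqs. 6.58 and 6.59."""
    den = (k2**2 + k1**2) * np.sin(k2 * d) + 2j * k2 * k1 * np.cos(k2 * d)
    r = (k1**2 - k2**2) * np.sin(k2 * d) / den
    t = 2j * k2 * k1 / den
    return r, t

def rt_linear(k1, k2, d):
    """Solve Eqs. 6.54-6.57 for the unknowns (r, A2, B2, t) directly."""
    e2p, e2m, e1 = np.exp(1j * k2 * d), np.exp(-1j * k2 * d), np.exp(1j * k1 * d)
    M = np.array([[-1, 1, 1, 0],                        # 1 + r = A2 + B2
                  [k1, k2, -k2, 0],                     # k1(1 - r) = k2(A2 - B2)
                  [0, e2p, e2m, -e1],                   # Eq. 6.56
                  [0, k2 * e2p, -k2 * e2m, -k1 * e1]],  # Eq. 6.57
                 dtype=complex)
    rhs = np.array([1, k1, 0, 0], dtype=complex)
    r, A2, B2, t = np.linalg.solve(M, rhs)
    return r, t

k1, k2, d = 1.0, 1.7, 2.3
r_c, t_c = rt_closed(k1, k2, d)
r_l, t_l = rt_linear(k1, k2, d)
assert np.allclose(r_l, r_c)                       # r agrees including phase
assert np.isclose(abs(t_l), abs(t_c))              # |t| agrees
assert np.isclose(abs(r_l)**2 + abs(t_l)**2, 1.0)  # R + T = 1 (k3 = k1 here)
```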

Now, once the expressions for the coefficients of the wave function are found in
terms of $A_1$, you might wonder if it is possible and/or necessary to fix the value of
the latter. Generally speaking, this is again a question of normalization of the wave
function, and according to our general understanding, we must be able to normalize
this function using the delta-function. However, when the wave function is not
just a plane wave, the procedure becomes rather cumbersome and requires careful
evaluation of diverging integrals. From a practical point of view, it does not make
much sense to jump through all these hoops to achieve a normalization that would
matter only if you planned to use the resulting functions as a basis, and this almost never
happens. Thus, if you are only concerned with obtaining experimentally relevant
quantities, you will be happy to leave $A_1$ undetermined and use Eqs. 6.58 and 6.59
to find the transmission and reflection probabilities from Eqs. 6.52 and 6.53:

$$R = \frac{\left(k_1^2 - k_2^2\right)^2 \sin^2 k_2 d}{\left(k_2^2 + k_1^2\right)^2 \sin^2 k_2 d + 4k_2^2 k_1^2 \cos^2 k_2 d} \tag{6.62}$$

$$T = \frac{4k_2^2 k_1^2}{\left(k_2^2 + k_1^2\right)^2 \sin^2 k_2 d + 4k_2^2 k_1^2 \cos^2 k_2 d}. \tag{6.63}$$

The denominator of these expressions can be rewritten in the following form:

$$\left(k_2^2 + k_1^2\right)^2 \sin^2 k_2 d + 4k_2^2 k_1^2 \cos^2 k_2 d = 4k_2^2 k_1^2 + \left(k_1^2 - k_2^2\right)^2 \sin^2 k_2 d.$$

Thanks to this rearrangement, you can realize two important facts. First, you can
immediately see that

$$R + T = 1 \tag{6.64}$$


and, second, that the transmission, considered as a function of energy, oscillates
between its maximum value of unity, achieved at $k_2 d = \pi n,\; n = 1, 2, 3, \ldots$,
and its minimum value

$$T_{min} = \frac{4k_2^2 k_1^2}{\left(k_2^2 + k_1^2\right)^2},$$

which occurs at $k_2 d = \pi/2 + \pi n$. For large values of energy, $E_z \gg V_b$, when $k_1$
and $k_2$ become close to each other, the minimum value of the transmission differs little
from unity, so the transmission remains close to one for almost all energies. The
reflection probability, in this case, becomes correspondingly small for all energies as
well. This is behavior close to what you would expect of a classical particle,
so the high-energy limit signifies a transition to the classical regime. This behavior
is illustrated in Fig. 6.7.
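The resonance condition $k_2 d = \pi n$ and the conservation law $R + T = 1$ follow directly from Eqs. 6.62 and 6.63, as a short numerical sketch confirms (NumPy assumed; the well fixes $k_2^2 - k_1^2 = 2m_e(V_b - V_w)/\hbar^2$, set to 1 in these arbitrary units):

```python
import numpy as np

def R_and_T(k1, k2, d):
    """Reflection and transmission, Eqs. 6.62 and 6.63."""
    den = ((k2**2 + k1**2)**2 * np.sin(k2 * d)**2
           + 4 * k2**2 * k1**2 * np.cos(k2 * d)**2)
    R = (k1**2 - k2**2)**2 * np.sin(k2 * d)**2 / den
    T = 4 * k2**2 * k1**2 / den
    return R, T

d = 1.0
k2 = np.pi * np.arange(1, 5) / d        # resonances: k2*d = pi*n
k1 = np.sqrt(k2**2 - 1.0)               # k2^2 - k1^2 = 1 fixes the well depth
R, T = R_and_T(k1, k2, d)
assert np.allclose(T, 1.0) and np.allclose(R, 0.0, atol=1e-12)

# away from resonance, probability is still conserved: R + T = 1
k2g = np.linspace(1.1, 20.0, 1000)
Rg, Tg = R_and_T(np.sqrt(k2g**2 - 1.0), k2g, d)
assert np.allclose(Rg + Tg, 1.0)
```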

Equation 6.64 is an important expression of the conservation of probability:
it simply states that since transmission and reflection are the only two mutually
exclusive events that can occur when a particle is incident on the potential, the sum
of their probabilities must be equal to unity. Even though this relation was derived
here for the particular case of the rectangular well, it is valid for a generic potential
asymptotically approaching a constant value at $z \to \pm\infty$. The validity of Eq. 6.64
serves in practice as a test of the correctness of Eqs. 6.62 and 6.63. Using the definitions
of the transmission and reflection coefficients in terms of the probability currents, I can
rewrite Eq. 6.64 as

$$\frac{j_r}{j_{inc}} + \frac{j_{tr}}{j_{inc}} = 1 \;\Longleftrightarrow\; j_{inc} - j_r = j_{tr}. \tag{6.65}$$

This equation establishes that the total probability current on the left of the potential
well is equal to the probability current on its right. This is just a general statement
of the conservation of probability, which can also be interpreted as continuity of
the probability current across any finite discontinuity of the potential.

Fig. 6.7 Transmission probability for the rectangular potential well


Fig. 6.8 Spatial dependence of j'.z/j2 for three different values of energy: low energy, high
energy, and resonance energy where transmission goes to one and reflection to zero. The absence
of reflection in the last plot is evidenced by the absence of oscillations of the probability density
due to interference of the incident and reflected waves. The vertical lines delineate the edges of the
well

The behavior of the wave function also changes with energy. Figure 6.8 illustrates
this point by plotting the spatial dependence of the respective probability density
$|\varphi(z)|^2$ at three different energies, including the one which corresponds to zero
reflection. In the latter case, the probability distribution becomes flat both in front of
and behind the well, signaling the absence of interference between the incident and
reflected waves. One can also notice the decrease in the period of the oscillations at
higher energies, as should be expected, because higher energy means a larger wave
number and a shorter wavelength.

6.2.2 Square Potential Barrier

The square potential barrier is a potential well turned upside down: the higher
value of the potential energy $V_b$ is now limited to the finite interval $|z| < d/2$, while the
lower value $V_w$ corresponds to the semi-infinite regions $|z| > d/2$ outside of this


interval. The first principal difference between this situation and the one considered
in the previous section is that there are no energies corresponding to classically
bound motion in this potential, and, therefore, there are no states corresponding to
discrete energy levels. In both cases, $E_z < V_b$ and $E_z > V_b$, the classical motion is
unbound, and the quantum mechanical states belong to the continuous spectrum (there
are no states with $E_z < V_w$). The difference between these energy regions is that in
the former case the interval $|z| < d/2$ is classically forbidden, while in the latter the
entire domain of the z-coordinate is classically allowed. Respectively, there are two
different types of wave functions: when $V_w < E_z < V_b$,

$$\varphi(z) = \begin{cases} A_1 e^{ik_2 z} + B_1 e^{-ik_2 z} & z < -d/2 \\ A_2 e^{\kappa_1 z} + B_2 e^{-\kappa_1 z} & -d/2 < z < d/2 \\ A_3 e^{ik_2 z} & z > d/2 \end{cases} \tag{6.66}$$

where $k_2$ is defined as in the previous section, while $\kappa_1 = \sqrt{2m_e(V_b - E_z)}/\hbar$ is related

to $k_1$ as $k_1 = -i\kappa_1$. I already mentioned it once, but I would like to emphasize it
again: you cannot eliminate either of the real exponential functions in the second
line of Eq. 6.66, because the requirement for the wave function to decay at infinity
can be used only when the classically forbidden region extends to infinity. In the
case at hand, it is limited to the region $|z| < d/2$, so the exponential growth of the
wave function does not have enough "room" to become a problem. For energies
$E_z > V_b$, the wave function has the form

$$\varphi(z) = \begin{cases} A_1 e^{ik_2 z} + B_1 e^{-ik_2 z} & z < -d/2 \\ A_2 e^{ik_1 z} + B_2 e^{-ik_1 z} & -d/2 < z < d/2 \\ A_3 e^{ik_2 z} & z > d/2. \end{cases} \tag{6.67}$$

This wave function is essentially equivalent to the one considered in the case of
the potential well, so you can simply copy Eqs. 6.58 and 6.59 while exchanging $k_1$
and $k_2$:

$$B_1 = \frac{e^{-ik_2 d}\left(k_2^2 - k_1^2\right)\sin k_1 d}{\left(k_2^2 + k_1^2\right)\sin k_1 d + 2ik_2 k_1 \cos k_1 d} \tag{6.68}$$

$$A_3 = \frac{2ik_2 k_1 e^{-ik_2 d}}{\left(k_2^2 + k_1^2\right)\sin k_1 d + 2ik_2 k_1 \cos k_1 d}. \tag{6.69}$$

The transmission and reflection coefficients in this case have all the same properties as
in the case of the potential well, so I am not going to repeat them again.

Going back to the case $V_w < E_z < V_b$, it might appear that here I would have
to carry out all the calculations from scratch, because now I have to deal with real
exponential functions. But fear not: you can still use the previous results by replacing
$k_1$ with $k_1 = -i\kappa_1$. The negative sign in this expression is important; it ensures


that the coefficient A2 in Eq. 6.67 goes over to the same coefficient A2 in Eq. 6.66
(the same obviously applies to coefficients B2). In order to finish the transformation
of Eqs. 6.68 and 6.69 for the under-the-barrier case, you just need to recall that
sin.ix/ D i sinh x and cos.ix/ D cosh x. With these relations in mind, you easily
obtain

$$B_1 = \frac{-ie^{-ik_2 d}\left(k_2^2 + \kappa_1^2\right)\sinh \kappa_1 d}{-i\left(k_2^2 - \kappa_1^2\right)\sinh \kappa_1 d + 2k_2 \kappa_1 \cosh \kappa_1 d} \tag{6.70}$$

$$A_3 = \frac{2k_2 \kappa_1 e^{-ik_2 d}}{-i\left(k_2^2 - \kappa_1^2\right)\sinh \kappa_1 d + 2k_2 \kappa_1 \cosh \kappa_1 d}. \tag{6.71}$$

The respective transmission and reflection coefficients become

$$R = \frac{\left(k_2^2 + \kappa_1^2\right)^2 \sinh^2 \kappa_1 d}{\left(k_2^2 - \kappa_1^2\right)^2 \sinh^2 \kappa_1 d + 4k_2^2 \kappa_1^2 \cosh^2 \kappa_1 d} \tag{6.72}$$

$$T = \frac{4k_2^2 \kappa_1^2}{\left(k_2^2 - \kappa_1^2\right)^2 \sinh^2 \kappa_1 d + 4k_2^2 \kappa_1^2 \cosh^2 \kappa_1 d}. \tag{6.73}$$

Even though I derived Eqs. 6.70 and 6.71 by merely extending Eqs. 6.68 and 6.69
to the region of imaginary $k_1$ (for the mathematically sophisticated: this procedure is
a simple example of what is known in mathematics as analytic continuation), the
properties of the reflection and transmission coefficients given by Eq. 6.72 are very
different from those derived for the over-the-barrier transmission case $E_z > V_b$.
Gone are their periodic dependence on the energy and on $d$, as well as the special values
of energy at which the transmission turns to unity and the reflection goes to zero. What
do we have instead? Actually, quite a boring picture: the transmission decreases
exponentially with increasing width of the barrier $d$ and slowly approaches unity as
the energy climbs from $V_w$ to $V_b$, because $\kappa_1 = \sqrt{2m_e(V_b - E_z)}/\hbar$ vanishes at
$E_z = V_b$. To illustrate the exponential dependence of the transmission on $d$, I will
consider the case of a "thick" barrier, which in mathematical language means $\kappa_1 d \gg 1$.
To find the required approximate expressions for $T$ and $R$, I need to remind you of a
simple property of the hyperbolic functions $\cosh x$ and $\sinh x$: for large values of their
argument $x$, these functions can be approximated by a simple exponential, $\sinh x \simeq
\cosh x \simeq \frac{1}{2}\exp x$. Taking this into account, I can derive

$$R \simeq 1 \tag{6.74}$$

$$T \simeq \frac{16 k_2^2 \kappa_1^2}{\left(k_2^2 + \kappa_1^2\right)^2}\, e^{-2\kappa_1 d}. \tag{6.75}$$

When deriving the expression for the reflection coefficient, I “lost” the exponentially
small term, which is supposed to be subtracted from unity to ensure conservation
of probability. At the same time, this term makes the main contribution to the


transmission coefficient and, therefore, survives. A better approximation for the
reflection coefficient can be found simply by writing it down as $R = 1 - T$.
Obviously, the same results can be derived directly from Eq. 6.72 by being a bit
more careful and keeping the leading exponentially small terms.
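The quality of the thick-barrier approximation can be gauged against the exact Eq. 6.73 (a sketch assuming NumPy, parameters in arbitrary units with $\hbar = m_e = 1$):

```python
import numpy as np

def T_exact(k2, kap, d):
    """Eq. 6.73 for under-the-barrier transmission."""
    den = ((k2**2 - kap**2)**2 * np.sinh(kap * d)**2
           + 4 * k2**2 * kap**2 * np.cosh(kap * d)**2)
    return 4 * k2**2 * kap**2 / den

def T_thick(k2, kap, d):
    """Thick-barrier limit: sinh and cosh replaced by exp(kap*d)/2."""
    return 16 * k2**2 * kap**2 / (k2**2 + kap**2)**2 * np.exp(-2 * kap * d)

k2, kap = 1.0, 1.5
for d in (5.0, 8.0, 11.0):
    exact, approx = T_exact(k2, kap, d), T_thick(k2, kap, d)
    assert abs(approx / exact - 1) < 1e-5   # exponentially accurate

# sensitivity to width: T drops by a factor exp(-2*kap) per unit of extra d
assert np.isclose(T_exact(k2, kap, 6.0) / T_exact(k2, kap, 5.0),
                  np.exp(-2 * kap), rtol=1e-4)
```

The last check makes the "small change in $d$, substantial change in $T$" point quantitative: one extra unit of width suppresses the tunneling probability by $e^{-2\kappa_1}$.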

What is surprising here is, of course, not the fact that the transmission is small,
but that it is not exactly equal to zero. Because what it means is that there exists a
non-zero probability for the particle to travel across a classically forbidden region,
emerge on the other side, and keep moving as a free particle. This phenomenon,
which is a quantum mechanical version of “walking through the wall,” is called
tunneling, and you can hear physicists saying that the particle tunnels through
the barrier. The exponential nature of the dependence upon $d$ is very important,
because the exponential function is one of the fastest changing functions appearing
in the mathematical description of natural processes. It means that a small change
in $d$ results in a substantial change in transmission. This effect has vast practical
importance and is used in many applications such as tunneling diodes, tunneling
microscopy, flash memory, etc.

6.3 Delta-Functional Potential

In this section, I will present a rather peculiar model potential, which does not really
have direct analogues in the real world. I can justify spending some time on it by
making three simple points: (a) it is easily solvable, so considering it will not
take too much of our time; (b) it is useful as an illustration of a situation in which the
derivative of the wave function loses its continuity property; and (c) in the case of
shallow potential wells, which are able to hold only a single bound state, it can
provide a decent qualitative understanding of real physical situations. This utterly
unrealistic potential has the form of a delta-function:

$$V = -\alpha\,\delta(z), \tag{6.76}$$

where the negative sign signifies that the potential is attractive and that states
with negative energies are possible and must belong to the discrete spectrum.
Indeed, the entire region of $z$ except for the single point $z = 0$ is classically forbidden,
so the motion of a classical particle, if one can imagine localization at a single
point as motion, is finite. The parameter $\alpha$ in this expression represents the "strength"
of the potential, but one needs to understand that the dimension of this parameter is
energy $\times$ length, so it should not be interpreted as a "magnitude" of the potential.
This becomes obvious if one integrates Eq. 6.76: $\int V(z)\,dz = -\alpha$, so it is clear that $\alpha$ is,
up to the sign, the area under the potential. If one thinks of the delta-function as a limiting case of
a rectangular potential well of depth $V_w$ and width $d$, with $V_w \to \infty$ and $d \to 0$ in such
a way that $\alpha = V_w d$ remains constant, the meaning of this parameter becomes even
more transparent.


The main peculiarity of this model is that the discontinuity of the potential in this
case involves more than just a finite jump, so my previous arguments concerning
the continuity of the derivative of the wave function are no longer applicable.
Actually, this derivative is not continuous at all, and the first order of business
is to figure out how to "stitch" the derivatives of the wave function defined at $z < 0$
with those defined at $z > 0$. To solve this puzzle, let me start with the basics: the
Schrödinger equation

$$-\frac{\hbar^2}{2m_e}\frac{d^2\varphi}{dz^2} - \alpha\,\delta(z)\,\varphi(z) = E\varphi(z). \tag{6.77}$$

Integrating this equation over an infinitesimally small interval $(-\epsilon, \epsilon)$ and taking into
account that the integral of a continuous function over such an interval vanishes (in the
limit $\epsilon \to 0$), I get

$$-\frac{\hbar^2}{2m_e}\left(\left.\frac{d\varphi}{dz}\right|_{z=\epsilon} - \left.\frac{d\varphi}{dz}\right|_{z=-\epsilon}\right) - \alpha\,\varphi(0) = 0.$$

This yields the derivative stitching rule:

$$\left.\frac{d\varphi}{dz}\right|_{z=\epsilon} - \left.\frac{d\varphi}{dz}\right|_{z=-\epsilon} = -\frac{2m_e}{\hbar^2}\,\alpha\,\varphi(0). \tag{6.78}$$

Now, all I need is to solve the Schrödinger equation with zero potential and
negative energy separately for $z < 0$ and $z > 0$ and stitch the solutions. Since both
of these regions are classically forbidden for a particle with $E < 0$, the solutions have
the form of real-valued exponential functions:

$$\varphi(z) = \begin{cases} A_1 e^{\kappa z} & z < 0 \\ A_2 e^{-\kappa z} & z > 0 \end{cases}$$

where $\kappa = \sqrt{-2m_e E}/\hbar$, and I discarded the contributions which would grow
exponentially at positive and negative infinities in order to satisfy the boundary conditions.
Continuity of the wave function at $z = 0$ requires that $A_2 = A_1 \equiv A$, and Eq. 6.78
yields

$$-2\kappa A = -\frac{2m_e}{\hbar^2}\,\alpha A.$$

Assuming that A is non-zero (naturally) and taking into account the definition of
,
I find that this expression is reduced to the equation for allowed energy levels:

E D �me&
2

2„2 : (6.79)

6.3 Delta-Functional Potential 197

Obviously, Eq. 6.79 shows that there is only one such energy, which is why this
model can only be useful for describing shallow potential wells with a single
discrete energy level.
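The limiting procedure mentioned above can be checked numerically. The sketch below (my own illustration, not from the book) works in units where $\hbar = m_e = 1$, approximates the delta well by a deep, narrow rectangular well with $\aleph = V_w d$ held fixed, solves the even-state matching condition of the finite well by bisection, and compares the result with Eq. 6.79; all variable names and sample values are mine.

```python
import math

# Units with hbar = m_e = 1; Eq. 6.79 then predicts E = -aleph**2 / 2.
aleph = 1.0                      # area under the potential, aleph = V_w * d
V_w = 200.0                      # deep, narrow rectangular well approximating the delta well
d = aleph / V_w

def mismatch(E):
    """Even-state matching condition k*tan(k*d/2) - kappa for a finite well of depth V_w."""
    k = math.sqrt(2.0 * (V_w + E))   # wavenumber inside the well (E is negative)
    kappa = math.sqrt(-2.0 * E)      # decay constant outside
    return k * math.tan(k * d / 2.0) - kappa

# Bisection: mismatch changes sign between E just above -V_w and E just below 0.
lo, hi = -V_w + 1e-9, -1e-9
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if mismatch(lo) * mismatch(mid) <= 0.0:
        hi = mid
    else:
        lo = mid

E_numeric = 0.5 * (lo + hi)
E_delta = -aleph**2 / 2.0            # Eq. 6.79 in these units
print(E_numeric, E_delta)            # agree to about 0.2% for this width
```

Shrinking $d$ further (while keeping $\aleph = V_w d$ fixed) drives the numerical level toward the delta-well value, which is the point of the limiting construction.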

Solutions with positive energies can be constructed in the same way as was
done for the rectangular potential well or barrier:

$$\varphi(z) = \begin{cases} A_1 e^{ikz} + B_1 e^{-ikz} & z < 0 \\ A_2 e^{ikz} & z > 0 \end{cases} \quad (6.80)$$

where $k = \sqrt{2m_e E}/\hbar$, and the continuity of the wave function at $z = 0$ yields

$$A_1 + B_1 = A_2.$$

The derivative stitching condition, Eq. 6.78, generates the following equation:

$$ikA_2 - ik\left(A_1 - B_1\right) = -\frac{2m_e}{\hbar^2}\aleph A_2.$$

Solving these two equations for $B_1$ and $A_2$, one can obtain

$$\frac{A_2}{A_1} = \frac{1}{1 - i\lambda}, \qquad \frac{B_1}{A_1} = \frac{i\lambda}{1 - i\lambda},$$

where I introduced a convenient dimensionless parameter $\lambda$ defined as

$$\lambda = \frac{m_e \aleph}{k\hbar^2}.$$

The amplitude transmission and reflection coefficients $t = A_2/A_1$ and $r = B_1/A_1$
are complex numbers, which can be presented in exponential form using the Euler
formula as

$$r = \sqrt{R}\,e^{i\phi_r}; \qquad t = \sqrt{T}\,e^{i\phi_t},$$

where the reflection and transmission probabilities $R = |r|^2$ and $T = |t|^2$ and the
corresponding phases $\phi_r$ and $\phi_t$ are given by

$$R = \frac{\lambda^2}{1 + \lambda^2}; \qquad T = \frac{1}{1 + \lambda^2} \quad (6.81)$$

$$\phi_r = \pi - \arctan\frac{1}{\lambda}; \qquad \phi_t = \arctan\lambda. \quad (6.82)$$


While the phases of the amplitude reflection and transmission coefficients do not
affect the probabilities, they still play an important role and can be observed. The
reflected wave function interferes with the function describing the incident particles
and determines the spatial distribution of the relative probabilities of position
measurements. These phases also define the temporal behavior of particles in
situations involving nonstationary states, but a discussion of this topic is outside
the scope of this book.
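Equations 6.81 and 6.82 are easy to check numerically. The sketch below (my own illustration, in units $\hbar = m_e = 1$, with arbitrary sample values for $\aleph$ and $E$) builds $r$ and $t$ directly from the formulas derived above and verifies that $R + T = 1$ and that the phases match Eq. 6.82.

```python
import cmath, math

# Units hbar = m_e = 1, so lam = aleph / k; the sample values are mine.
aleph, E = 1.0, 2.0
k = math.sqrt(2.0 * E)
lam = aleph / k

t = 1.0 / (1.0 - 1j * lam)        # amplitude transmission coefficient
r = 1j * lam / (1.0 - 1j * lam)   # amplitude reflection coefficient

R, T = abs(r)**2, abs(t)**2
phi_r, phi_t = cmath.phase(r), cmath.phase(t)

print(R, T, R + T)                             # probabilities sum to 1
print(phi_t, math.atan(lam))                   # Eq. 6.82: phi_t = arctan(lam)
print(phi_r, math.pi - math.atan(1.0 / lam))   # Eq. 6.82: phi_r = pi - arctan(1/lam)
```

Note that $R$ and $T$ depend on energy only through $\lambda \propto 1/k$: slow particles are almost completely reflected, fast ones almost completely transmitted.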

6.4 Problems

Problems for Sect. 6.2.1

Problem 77 Derive Eq. 6.42.

Problem 78 Find the reflection and transmission coefficients for the potential
barrier shown in Fig. 6.9. Show that $R + T = 1$.

Problem 79 In quantum tunneling, the penetration probability is sensitive to slight
changes in the height and/or width of the barrier. Consider an electron with energy
$E = 15$ eV incident on a rectangular barrier of height $V = 7$ eV and width
$d = 1.8$ nm. By what factor does the penetration probability change if the width
is decreased to $d = 1.7$ nm?
Problem 80 Consider a step potential

$$V(z) = \begin{cases} 0 & z < 0 \\ V_0 & z > 0. \end{cases}$$

Calculate the reflection and transmission probabilities for the two cases $0 < E < V_0$
and $E > V_0$.

Fig. 6.9 Potential barrier with an asymmetric potential


Problem 81 Find an equation for the energy levels of a particle of mass $m_e$ moving
in a potential of the form

$$V(z) = \begin{cases} \infty & z < -a \\ 0 & -a < z < -b \\ V_0 & -b < z < b \\ 0 & b < z < a \\ \infty & z > a. \end{cases}$$

Consider even and odd wave functions separately. Using any graphing software, find
the approximate values of the two lowest energies if $m = 1.78 \times 10^{-27}$ kg, $a = 0.12$ nm, $b = 0.42$ nm, $V_0 = 1.5$ eV. Sketch the respective wave
functions for each of the found eigenvalues.

Problem 82 Consider a particle moving in a potential comprised of two attractive
delta-functional potentials separated by a distance $d$:

$$V(x) = -\aleph\,\delta\left(x + d/2\right) - \aleph\,\delta\left(x - d/2\right).$$

1. Derive an equation for the discrete energy levels in this potential, and solve it if
possible. How many discrete energy levels does this potential have? Analyze the
behavior of these energy levels when the distance $d$ between the wells increases.

2. Find the wave functions corresponding to the continuous segment of the spectrum,
and determine the respective transmission and reflection probabilities.

Chapter 7
Harmonic Oscillator Models

It is as difficult to overestimate the role of harmonic oscillator models in physics in
general, and in quantum mechanics in particular, as the influence of the Beatles and Led
Zeppelin on modern popular music. Harmonic oscillators are ubiquitous and appear
every time one is dealing with a system that has a state of equilibrium in the
vicinity of which it can oscillate, i.e., in the vast majority of physical systems: atoms,
molecules, solids, the electromagnetic field, etc. It also does not hurt their popularity
that the harmonic oscillator is one of the very few models which can be solved
exactly.

Consider a particle moving in a potential $V(x, y, z)$, which has a minimum at
some point $x = y = z = 0$. Mathematically speaking, this means that at this
point $\partial V/\partial x = \partial V/\partial y = \partial V/\partial z = 0$, while the matrix of the second derivatives
$L_{ij} \equiv \left.\partial^2 V/\partial r_i\,\partial r_j\right|_{x=y=z=0}$, where $r_1 \equiv x$, $r_2 \equiv y$, and $r_3 \equiv z$, is positive definite.

If you still remember the connection between the potential energy and the force in
classical mechanics, you should recognize that in this situation the point $x = y = z = 0$
corresponds to the particle being in a state of stable equilibrium. Stable in this
context means that a particle removed from the equilibrium by a small distance will
be forced to move back toward it rather than away from it. Expanding the potential
energy in a power series in the vicinity of the equilibrium and keeping only the first
nonvanishing terms, you will get

$$V(x, y, z) \approx \frac{1}{2}\sum_{i,j} L_{ij}\, r_i r_j.$$

The respective classical Hamiltonian equations, 3.2 and 3.3, yield for this potential:

$$\frac{dp_i}{dt} = -\sum_j L_{ij}\, r_j \quad (7.1)$$

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,
https://doi.org/10.1007/978-3-319-71550-6_7



$$\frac{dr_i}{dt} = \frac{p_i}{m_e} \quad (7.2)$$

($i = x, y, z$). They can be converted into Newton's equations by differentiating
Eq. 7.2 with respect to time and eliminating the resulting time derivative of the
momentum using Eq. 7.1:

$$\frac{d^2 r_i}{dt^2} = -\frac{1}{m_e}\sum_j L_{ij}\, r_j. \quad (7.3)$$

The presence in the matrix $L_{ij}$ of nondiagonal elements indicates that the particle's
motion in the direction of any of the chosen axes $X$, $Y$, or $Z$ is not independent
of its motion in the other directions. In layman's terms, it means that it is impossible
to arrange for this particle to move purely in the direction of any one of the axes.
Nevertheless, solutions of these equations can still be presented in the standard
time-harmonic form $r_i = a_i \exp(i\omega t)$ with amplitudes $a_i$ and frequency $\omega$ obeying
the equations

$$\frac{1}{m_e}\sum_j L_{ij}\, a_j = \omega^2 a_i, \quad (7.4)$$

which is an eigenvalue equation for the matrix $L_{ij}/m_e$. This matrix is obviously
symmetric ($L_{ij} = L_{ji}$) and real-valued, and, therefore, Hermitian. Thus, based
on the eigenvalue theorems discussed in Sect. 3.3.1, it is guaranteed to
have real eigenvalues and corresponding orthogonal eigenvectors. The equation for
the eigenvalues is found by requiring that Eq. 7.4 have nontrivial solutions:

$$\det\left(m_e\omega^2\delta_{ij} - L_{ij}\right) = 0,$$

and in general it has three solutions $\omega_n^2$, where $n = 1, 2, 3$. Substituting each of
these frequencies back into Eq. 7.4, you can find the amplitudes $a_x^{(n)}, a_y^{(n)}, a_z^{(n)}$, which
form the corresponding eigenvectors. These eigenvectors are regular three-dimensional
vectors defining three mutually orthogonal directions in space. Oscillations in
each of these directions, called normal modes, are characterized by their own unique
frequencies $\omega_n$ and can occur independently of each other. Indeed, these three
vectors can be used as a new basis, which in this particular case amounts to
introducing new coordinate axes along the directions of the normal modes. The
matrix $L_{ij}/m_e$ transformed to this basis becomes diagonal, and introducing the notation
$\xi_1, \xi_2$, and $\xi_3$ for the coordinates along these new directions, Eq. 7.3 takes
the form of three independent differential equations:

$$\frac{d^2\xi_n}{dt^2} = -\omega_n^2\,\xi_n. \quad (7.5)$$
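The diagonalization just described is exactly what a numerical eigensolver does. A minimal sketch (my own illustration; the matrix $L_{ij}$ below is a made-up symmetric, positive-definite example in units $m_e = 1$) solves the eigenvalue problem of Eq. 7.4 and exhibits the orthogonality of the normal-mode directions:

```python
import numpy as np

# A made-up symmetric, positive-definite matrix L_ij (units m_e = 1);
# any such matrix would do, this one is purely illustrative.
L = np.array([[2.0, 0.3, 0.1],
              [0.3, 1.5, 0.2],
              [0.1, 0.2, 1.0]])
m_e = 1.0

# Eigenvalue problem of Eq. 7.4: (1/m_e) L a = omega^2 a.
# eigh is appropriate because L/m_e is real symmetric (Hermitian).
omega2, modes = np.linalg.eigh(L / m_e)

omegas = np.sqrt(omega2)          # normal-mode frequencies omega_n
print(omegas)

# Columns of `modes` are the normal-mode directions; they are mutually orthogonal:
print(modes.T @ modes)            # identity matrix up to rounding
```

Positive definiteness of $L_{ij}$ guarantees $\omega_n^2 > 0$, i.e., genuine oscillations rather than runaway motion, which is the numerical face of the stability condition above.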


One can also show (those interested in details are welcome to read any of many
textbooks on classical mechanics, or molecular oscillations, or a combination thereof)
that the Hamiltonian written in terms of these new coordinates and the corresponding
conjugate momenta $\pi_n$ takes the form

$$H = \sum_n \left(\frac{\pi_n^2}{2m_n} + \frac{1}{2}m_n\omega_n^2\xi_n^2\right),$$

which is a sum of three independent one-dimensional Hamiltonians. The transition
to this form is not so trivial, and the mass parameter $m_n$ does not have to coincide
with the actual mass of the particle. Nevertheless, as long as $\xi_n$ and $\pi_n$ are a
canonically conjugate pair characterized by the Poisson brackets standard for coordinate and
momentum, Eq. 3.5, we can treat them as such for all practical
purposes, including quantization.

Thus, using the concept of normal modes, one can always reduce a problem
involving harmonic oscillations to a simple combination of one-dimensional
problems. This is true even in cases involving oscillations of several
particles, such as multi-atom molecules. Therefore, using a one-dimensional model
to describe harmonic oscillations is even better justified than the one-dimensional
models described in the previous chapter. And so, the one-dimensional model of the
quantum harmonic oscillator is what I am going to consider next.

7.1 One-Dimensional Harmonic Oscillator

7.1.1 Stationary States (Eigenvalues and Eigenvectors)

The classical mechanics of the one-dimensional harmonic oscillator is described by
the Hamiltonian

$$H = \frac{p^2}{2m_e} + \frac{1}{2}m_e\omega^2 x^2, \quad (7.6)$$

and the respective Hamiltonian equations for momentum $p$ and coordinate $x$ are
obtained by specializing Eqs. 7.1 and 7.2 to the one-dimensional situation:

$$\frac{dp}{dt} = -m_e\omega^2 x \quad (7.7)$$

$$\frac{dx}{dt} = \frac{p}{m_e} \quad (7.8)$$

where I replaced the corresponding diagonal element of the matrix $L_{ij}$ as $L_{xx} \equiv m_e\omega^2$.
These equations are, of course, easy to solve, and the solution is well known:


$$x = x_0\cos\omega t + \frac{p_0}{m_e\omega}\sin\omega t$$

$$p = -m_e\omega x_0\sin\omega t + p_0\cos\omega t, \quad (7.9)$$

where $x_0$ and $p_0$ are the initial values of the coordinate and momentum of the
particle. Equation 7.9 describes a familiar harmonic time dependence, which can
be presented in terms of an amplitude $A$ and initial phase $\delta$:

$$x(t) = A\sin\left(\omega t + \delta\right).$$

Both $A$ and $\delta$ are determined by the initial conditions: the amplitude by the total
energy $E$ of the oscillator, which, as you know, is a conserved quantity, and the phase
by the ratio of the initial coordinate and momentum. Recalling that at the maximum
displacement $E$ takes entirely the form of potential energy, you can write

$$\frac{1}{2}m_e\omega^2 A^2 = E = \frac{p_0^2}{2m_e} + \frac{1}{2}m_e\omega^2 x_0^2 \;\Rightarrow\; A = \sqrt{\frac{2E}{m_e\omega^2}} = \sqrt{x_0^2 + \frac{p_0^2}{m_e^2\omega^2}}. \quad (7.10)$$

The phase of the oscillator can be found by expanding $\sin(\omega t + \delta) = \sin\omega t\cos\delta + \cos\omega t\sin\delta$ and equating the resulting terms with their counterparts in Eq. 7.9. This
yields

$$A\cos\delta = \frac{p_0}{m_e\omega}, \qquad A\sin\delta = x_0,$$

and subsequently

$$\tan\delta = \frac{x_0 m_e\omega}{p_0}.$$
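The equivalence of the two forms of the solution, Eq. 7.9 and $x(t) = A\sin(\omega t + \delta)$, is easy to confirm numerically. In the sketch below (my own check; sample values of $x_0$, $p_0$, $\omega$ are arbitrary, units $m_e = 1$) the amplitude comes from Eq. 7.10 and the phase from the relations just derived:

```python
import math

# Check that Eq. 7.9 equals x(t) = A*sin(omega*t + delta) with A from Eq. 7.10
# and tan(delta) = x0*m_e*omega/p0. Sample values are arbitrary; units m_e = 1.
m_e, omega = 1.0, 3.0
x0, p0 = 0.4, 1.1

A = math.sqrt(x0**2 + p0**2 / (m_e**2 * omega**2))
delta = math.atan2(x0, p0 / (m_e * omega))   # atan2 picks the correct quadrant

for t in [0.0, 0.37, 1.0, 2.5]:
    x_79 = x0 * math.cos(omega * t) + (p0 / (m_e * omega)) * math.sin(omega * t)
    x_amp = A * math.sin(omega * t + delta)
    print(t, x_79, x_amp)   # the two expressions coincide at every t
```

Using `atan2` rather than `atan` implements the pair of conditions $A\cos\delta = p_0/m_e\omega$, $A\sin\delta = x_0$ directly, so the phase lands in the correct quadrant even when $p_0 < 0$.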

Obviously, the motion of a harmonic oscillator is bounded, with the maximum
deviation from the equilibrium position given by its amplitude $A$, Eq. 7.10. The
coordinate $x$ becomes equal to $A$ at the two turning points, where the velocity of the
oscillator and, respectively, its kinetic energy turn to zero. The relation between the
total, potential, and kinetic energies of the harmonic oscillator is illustrated
by the diagram shown in Fig. 7.1, where vertical lines show the turning points of the
classical motion.

Even though you have all known the solution to the harmonic oscillator problem
almost since elementary school, you might find it useful to play with its
Hamiltonian a bit more. Let me, for instance, factorize the Hamiltonian, taking
advantage of its $u^2 + v^2$ form, which can be presented as $(u + iv)(u - iv)$:


Fig. 7.1 Energy diagram for a classical oscillator. The horizontal line corresponds to
its total energy E, and vertical dashed lines indicate the turning points, the
coordinates corresponding to the two maximum displacements

$$H = \left(\frac{p}{\sqrt{2m_e}} + i\sqrt{\frac{m_e}{2}}\,\omega x\right)\left(\frac{p}{\sqrt{2m_e}} - i\sqrt{\frac{m_e}{2}}\,\omega x\right). \quad (7.11)$$

Now, on a whim, I am going to compute the Poisson bracket involving these factors.
Designating the first of them as $u$,

$$u = \frac{p}{\sqrt{2m_e}} + i\sqrt{\frac{m_e}{2}}\,\omega x,$$

and the second one as $u^*$,

$$u^* = \frac{p}{\sqrt{2m_e}} - i\sqrt{\frac{m_e}{2}}\,\omega x,$$

I find, using Eq. 3.4 for the Poisson bracket,

$$\{u, u^*\} = \frac{\partial u}{\partial x}\frac{\partial u^*}{\partial p} - \frac{\partial u}{\partial p}\frac{\partial u^*}{\partial x} = i\omega.$$

Using this result as a hint, I now introduce new variables:

$$b = -\frac{i}{\sqrt{\omega}}\,u = \sqrt{\frac{m_e\omega}{2}}\,x - i\frac{p}{\sqrt{2m_e\omega}} \quad (7.12)$$

$$b^* = \frac{1}{\sqrt{\omega}}\,u^* = -i\sqrt{\frac{m_e\omega}{2}}\,x + \frac{p}{\sqrt{2m_e\omega}} \quad (7.13)$$

whose Poisson bracket, by design, of course, is

$$\{b, b^*\} = 1.$$


This means that $b$ and $b^*$ constitute a canonically conjugate pair (if you have
already forgotten what I am talking about, check Sect. 3.1), with $b$ playing the role
of the coordinate and $b^*$ pretending to be the momentum. Computing $bb^*$ (do it!),
you will see that the Hamiltonian can be presented as

$$H = i\omega\, b b^*.$$

The corresponding Hamiltonian equations are

$$\frac{db}{dt} = \frac{\partial H}{\partial b^*} = i\omega b \quad (7.14)$$

$$\frac{db^*}{dt} = -\frac{\partial H}{\partial b} = -i\omega b^*. \quad (7.15)$$

The advantage of these equations as compared to the initial Eqs. 7.7 and 7.8 is that they
are independent first-order differential equations, which can be easily solved:

$$b = b_0 e^{i\omega t}; \qquad b^* = b_0^* e^{-i\omega t}. \quad (7.16)$$
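One can see Eq. 7.16 at work by constructing $b$ from the explicit classical trajectory, Eq. 7.9, and watching it rotate uniformly in the complex plane. The sketch below is my own illustration (arbitrary sample values, units $m_e = 1$); it also checks that $\omega|b|^2$ reproduces the conserved energy:

```python
import cmath, math

# Build b(t) from the trajectory of Eq. 7.9 and check that it evolves as
# b_0*exp(i*omega*t), Eq. 7.16. Sample values are arbitrary; units m_e = 1.
m_e, omega = 1.0, 2.0
x0, p0 = 0.7, -0.3

def b_of(x, p):
    """Eq. 7.12: b = sqrt(m_e*omega/2)*x - i*p/sqrt(2*m_e*omega)."""
    return math.sqrt(m_e * omega / 2.0) * x - 1j * p / math.sqrt(2.0 * m_e * omega)

b0 = b_of(x0, p0)
for t in [0.0, 0.4, 1.3]:
    x = x0 * math.cos(omega * t) + (p0 / (m_e * omega)) * math.sin(omega * t)
    p = -m_e * omega * x0 * math.sin(omega * t) + p0 * math.cos(omega * t)
    print(b_of(x, p), b0 * cmath.exp(1j * omega * t))   # the two agree

# |b|^2 is conserved, and omega*|b|^2 equals the energy p^2/2m + m*omega^2*x^2/2:
print(omega * abs(b0)**2, p0**2 / (2 * m_e) + m_e * omega**2 * x0**2 / 2)
```

The oscillatory dynamics is thus reduced to a rigid rotation of a single complex number, which is precisely why the canonical pair $b, b^*$ is so convenient.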

The initial coordinate and momentum can be expressed in terms of $b$ and $b^*$ by inverting
Eqs. 7.12 and 7.13, but I will leave this for you as an exercise.

The transition between the pairs $x, p$ and $b, b^*$ is an example of a so-called canonical
transformation of variables, and the only reason I decided to bother you with it is
that it paves the way to a better understanding of its quantum analog, which is of crucial
importance. According to the quantization rules discussed in Sect. 3.3.2, the transition
from a classical to a quantum description consists in promoting classical variables to
quantum operators, and the coordinate-momentum dyad plays a crucial role in the
process precisely because it is a canonical pair. The operators replacing classical
variables are, to a large extent, defined by their commutation relations, and in the
case of canonical pairs, the commutator is directly linked to the respective Poisson
bracket, as I have already mentioned. In the case of the coordinate-momentum
pair, the corresponding commutator is obtained from the Poisson
bracket by multiplying it by $i\hbar$. Just as any pair of classical variables characterized
by the canonical Poisson bracket plays a role similar to that of coordinate and
momentum in classical mechanics, any pair of quantum operators with the
canonical commutator $i\hbar$ will have properties similar to those of the coordinate
and momentum operators. For instance, if I know that two Hermitian operators $\hat\xi$ and $\hat\eta$
have the commutator $[\hat\xi, \hat\eta] = i\hbar$, I can claim without any doubt that the operator
$\hat\eta$ in the representation based on the eigenvectors of $\hat\xi$ is $\hat\eta_\xi = -i\hbar\,\partial/\partial\xi$, similar
to the momentum operator in the coordinate representation. I have to emphasize,
however, the requirement that the operators must be Hermitian. Therefore, if I were
to promote $b$ and $b^*$ to operators, it would not work, because they would not be
Hermitian. Nevertheless, operators similar to $b$ and $b^*$ (while not exactly like them)
do play an important role in quantum theory (and not just for harmonic oscillators).


Now I am ready to get down to our main business and start developing the quantum
theory of harmonic oscillators. The goal is to develop the theory as far as possible
without resorting to any particular representation of the momentum and coordinate
operators. Such an approach will produce the most general, representation-independent
results, offer important insights into the quantum properties of oscillators,
and create a formal framework for extending the theory beyond purely mechanical
harmonic oscillators.

I start by "factorizing" the quantum Hamiltonian in a way similar to the factorization
of the classical Hamiltonian in Eq. 7.11. However, to make sure that this factorization
works for operators, I would like to review the origin of the identity

$$u^2 + v^2 = (u + iv)(u - iv).$$

Removing the parentheses on its right-hand side, I have

$$(u + iv)(u - iv) = u^2 + v^2 + ivu - iuv.$$

If $u$ and $v$ are regular variables, the last two terms in this expression cancel, but if
they are non-commuting operators, it is quite obvious that the original factorization
rule is no longer true and must be corrected:

$$\hat u^2 + \hat v^2 = \left(\hat u + i\hat v\right)\left(\hat u - i\hat v\right) + i\left[\hat u, \hat v\right]. \quad (7.17)$$

The order of the factors on the right-hand side of this expression
can be changed, which results in an alternative form of the identity:

$$\hat u^2 + \hat v^2 = \left(\hat u - i\hat v\right)\left(\hat u + i\hat v\right) - i\left[\hat u, \hat v\right]. \quad (7.18)$$
Identifying $\hat u$ and $\hat v$ as

$$\hat u = \sqrt{\frac{m_e}{2}}\,\omega\hat x \quad (7.19)$$

$$\hat v = \frac{\hat p}{\sqrt{2m_e}}, \quad (7.20)$$

I find

$$\left[\hat u, \hat v\right] = \frac{1}{2}\omega\left[\hat x, \hat p\right] = \frac{1}{2}i\hbar\omega. \quad (7.21)$$

Operators $\hat u$ and $\hat v$ have the dimension of the square root of energy, while their
commutator, proportional to $\hbar\omega$, obviously has the dimension of energy. If I am not mistaken, I
have already remarked that it is often quite beneficial to work with dimensionless
quantities. Thus, taking a clue from Eqs. 7.17 and 7.18 and the experience gained
working with the classical Hamiltonian, I will try to generate dimensionless operators
such as


$$\hat a = \frac{1}{\sqrt{\hbar\omega}}\left(\hat u + i\hat v\right) = \sqrt{\frac{m_e\omega}{2\hbar}}\,\hat x + i\frac{\hat p}{\sqrt{2m_e\hbar\omega}} \quad (7.22)$$

$$\hat a^\dagger = \frac{1}{\sqrt{\hbar\omega}}\left(\hat u - i\hat v\right) = \sqrt{\frac{m_e\omega}{2\hbar}}\,\hat x - i\frac{\hat p}{\sqrt{2m_e\hbar\omega}}. \quad (7.23)$$

The commutator of these operators is

$$\left[\hat a, \hat a^\dagger\right] = -i\sqrt{\frac{m_e\omega}{2\hbar}}\frac{1}{\sqrt{2m_e\hbar\omega}}\left[\hat x, \hat p\right] + i\sqrt{\frac{m_e\omega}{2\hbar}}\frac{1}{\sqrt{2m_e\hbar\omega}}\left[\hat p, \hat x\right] = -\frac{i}{2\hbar}\left[\hat x, \hat p\right] + \frac{i}{2\hbar}\left[\hat p, \hat x\right] = 1.$$

Due to the special importance of this result, I will reproduce it as a separate numbered
formula:

$$\left[\hat a, \hat a^\dagger\right] = 1. \quad (7.24)$$

The operators $\hat a$ and $\hat a^\dagger$ are clearly not Hermitian: performing Hermitian conjugation
of Eqs. 7.22 and 7.23, you can immediately see that they are actually Hermitian
conjugates of each other, hence the notation $\hat a^\dagger$. It will also be useful to express the
coordinate and momentum operators in terms of $\hat a$ and $\hat a^\dagger$. Adding and subtracting
Eqs. 7.22 and 7.23, I can invert these equations to get

$$\hat x = \sqrt{\frac{\hbar}{2\omega m_e}}\left(\hat a + \hat a^\dagger\right), \quad (7.25)$$

$$\hat p = i\sqrt{\frac{\hbar\omega m_e}{2}}\left(\hat a^\dagger - \hat a\right). \quad (7.26)$$

Using the operator factorization identities 7.17 or 7.18 with $\hat u$ and $\hat v$ defined in
Eqs. 7.19 and 7.20, I can derive two alternative forms of the Hamiltonian:

$$\hat H = \hbar\omega\left(\frac{1}{2} + \hat a^\dagger\hat a\right) = \hbar\omega\left(-\frac{1}{2} + \hat a\hat a^\dagger\right).$$

These two expressions differ by the order of the operators and by the sign in
front of $1/2$. Formally they are absolutely equivalent, and one can be reduced to the
other using the commutation relation 7.24. However, from a practical point of view
(and you will have to trust me on this for now), the first of these expressions is
much more convenient to use than the other. Thus, in what follows, I will rely on
the representation of the Hamiltonian in the form


$$\hat H = \hbar\omega\left(\frac{1}{2} + \hat a^\dagger\hat a\right). \quad (7.27)$$

Our first task is to find the eigenvalues and eigenvectors of this Hamiltonian,
i.e., the stationary states of the harmonic oscillator. Since the classical motion in
the harmonic potential is bound for all values of energy, it should be expected
that the entire spectrum of the Hamiltonian is discrete, so that the yet unknown energy
eigenvalues can be labeled by a discrete index as $E_n$ and the respective eigenvectors
as $|E_n\rangle$:

$$\hat H|E_n\rangle = E_n|E_n\rangle. \quad (7.28)$$

Since I am not allowed to use any particular representation of the coordinate and
momentum operators, all I have to go on are the commutation relations.
This invites me to use the same purely algebraic technique which I successfully
used previously when searching for the eigenvalues of the angular
momentum operators in Sect. 3.3.4. However, in the role of the angular momentum ladder
operators $\hat L_\pm$, I am going to cast the operators $\hat a$ and $\hat a^\dagger$, which appear to have some
similarities with $\hat L_\pm$: they are also non-Hermitian and are Hermitian conjugates
of each other. You might remember that the operators $\hat L_\pm$, applied to an eigenvector
of the operator $\hat L_z$, generate other eigenvectors with decreased or increased eigenvalues.
Will you be surprised if it turns out that the operators $\hat a$ and $\hat a^\dagger$ do the same to the
eigenvectors of the harmonic oscillator? Probably not.

The first step is to note that the eigenvectors of the Hamiltonian coincide with those
of the operator $\hat N = \hat a^\dagger\hat a$, which is called the number operator and is obviously
Hermitian. Indeed, once you rewrite the Hamiltonian as

$$\hat H = \hbar\omega\left(\frac{1}{2} + \hat N\right),$$

this statement becomes pretty obvious. Moreover, you can immediately see that if
$\nu_n$ is the eigenvalue of $\hat N$, $\hat N|E_n\rangle = \nu_n|E_n\rangle$, then

$$E_n = \hbar\omega\left(\frac{1}{2} + \nu_n\right). \quad (7.29)$$

Therefore, I can focus my attention on finding the eigenvalues and eigenvectors of
the number operator $\hat N$. To this end, I first compute the commutator $[\hat N, \hat a]$ (if you
want to know what prompted me to do so, the only excuse I can offer is that there
isn't much else for me to do, so why not?):

$$\left[\hat N, \hat a\right] = \hat a^\dagger\hat a^2 - \hat a\hat a^\dagger\hat a = \left(\hat a^\dagger\hat a - \hat a\hat a^\dagger\right)\hat a = -\hat a, \quad (7.30)$$


where I took advantage of Eq. 7.24. Carrying out Hermitian conjugation of this
result, and remembering to change the order of the operators in their product after
Hermitian conjugation, I immediately obtain

$$\left[\hat N, \hat a^\dagger\right] = \hat a^\dagger. \quad (7.31)$$

In the next step, I consider $\hat N\hat a|E_n\rangle$ and use the commutation relation 7.30 to get

$$\hat N\hat a|E_n\rangle = -\hat a|E_n\rangle + \hat a\hat N|E_n\rangle = \nu_n\hat a|E_n\rangle - \hat a|E_n\rangle = (\nu_n - 1)\,\hat a|E_n\rangle.$$

This result shows that $\hat a|E_n\rangle$ is an eigenvector of $\hat N$ with eigenvalue $\nu_n - 1$, i.e.,
the operator $\hat a$ generates eigenvectors of $\hat N$ with eigenvalues decreasing by one with
each application of the operator. Not surprisingly, this operator is called the lowering
operator. The questions which naturally pop up at this point are how far down in
energy one can go and how one knows when the bottom is reached. The answer to
the first question is obvious: the energy eigenvalues of the harmonic oscillator
can never be negative, and thus $\nu_n > -1/2$. The second question is answered
by recycling the arguments that I have already used when discussing the angular
momentum: the only way to reconcile the ability of $\hat a$ to keep decreasing $\nu_n$ every
time it is applied with the requirement that there must exist a smallest $\nu_n$ is to impose
on the eigenvector corresponding to this minimum value the condition

$$\hat a|E_{min}\rangle = 0. \quad (7.32)$$

Another useful relation is obtained by performing Hermitian conjugation of this
equation:

$$\langle E_{min}|\hat a^\dagger = 0. \quad (7.33)$$

Now you are going to appreciate the wisdom of writing the Hamiltonian in the form
of Eq. 7.27 and of introducing the operator $\hat N$. Indeed, Eq. 7.32 used in $\hat N|E_{min}\rangle$ gives
$\hat N|E_{min}\rangle = 0$, which means that the minimum eigenvalue is $\nu_{min} = 0$, and $E_{min} = \hbar\omega/2$.
So, behold the power of the lowering operator: we found the bottom, the lowest
possible energy of a harmonic oscillator, its ground state!

Just like in other examples, the lowest energy is not zero, which is, of course,
a consequence of the uncertainty principle: zero energy would require that both
kinetic and potential energies be equal to zero, which would mean that both the
coordinate and momentum operators would have certain values of zero, which is
impossible. The ground state energy $\hbar\omega/2$ is one of the clearest examples of the
energy associated with so-called quantum fluctuations.

The contribution of these fluctuations to the energy can be quantified by
computing the expectation values of $\hat p^2$ and $\hat x^2$, which determine the quantum
uncertainties of the respective observables. Using Eqs. 7.25 and 7.26, which express
the operators $\hat p$ and $\hat x$ in terms of the operators $\hat a$ and $\hat a^\dagger$, in conjunction with Eqs. 7.32
and 7.33, you can immediately see that

$$\langle E_{min}|\hat x|E_{min}\rangle = \langle E_{min}|\hat p|E_{min}\rangle = 0,$$

so that the uncertainties $\Delta p$ and $\Delta x$ are $\Delta p = \sqrt{\langle\hat p^2\rangle}$ and $\Delta x = \sqrt{\langle\hat x^2\rangle}$. Squaring
Eqs. 7.25 and 7.26 and computing these expectation values in the state $|E_{min}\rangle$, you
will get

$$\langle E_{min}|\hat x^2|E_{min}\rangle = \frac{\hbar}{2m_e\omega}\left(\langle E_{min}|\hat a^2|E_{min}\rangle + \langle E_{min}|\hat a^{\dagger 2}|E_{min}\rangle + \langle E_{min}|\hat a^\dagger\hat a|E_{min}\rangle + \langle E_{min}|\hat a\hat a^\dagger|E_{min}\rangle\right).$$

The first three terms in this expression vanish, thanks to Eqs. 7.32 and 7.33.
The last term, however, requires some more effort, because the order of the operators
$\hat a$ and $\hat a^\dagger$ in it is "wrong" in the sense that it is not conducive to the immediate
application of Eqs. 7.32 and 7.33. The situation, however, can be quite easily
rectified by using the commutation relation 7.24 to change this order and rewrite
this term as

$$\langle E_{min}|\hat a\hat a^\dagger|E_{min}\rangle = \langle E_{min}|1 + \hat a^\dagger\hat a|E_{min}\rangle = 1,$$

where I, as usual, assumed that whatever the state vector $|E_{min}\rangle$ is, it is normalized.
Thus, finally, I find

$$\langle E_{min}|\hat x^2|E_{min}\rangle = \frac{\hbar}{2m_e\omega}. \quad (7.34)$$

Similarly,

$$\langle E_{min}|\hat p^2|E_{min}\rangle = -\frac{m_e\omega\hbar}{2}\left(\langle E_{min}|\hat a^2|E_{min}\rangle + \langle E_{min}|\hat a^{\dagger 2}|E_{min}\rangle - \langle E_{min}|\hat a^\dagger\hat a|E_{min}\rangle - \langle E_{min}|\hat a\hat a^\dagger|E_{min}\rangle\right) = \frac{m_e\omega\hbar}{2}\langle E_{min}|\hat a\hat a^\dagger|E_{min}\rangle = \frac{m_e\omega\hbar}{2}. \quad (7.35)$$

Using Eqs. 7.34 and 7.35 in the expressions for the kinetic and potential energies, $\hat p^2/2m_e$
and $m_e\omega^2\hat x^2/2$, I immediately find that the ground state expectation values $\langle\hat p^2\rangle/2m_e$
and $m_e\omega^2\langle\hat x^2\rangle/2$ are both equal to $\hbar\omega/4$. Isn't it remarkable that while the ground
state of the harmonic oscillator is characterized by a certain value of energy, $\hbar\omega/2$, it
is formed by two fluctuating quantities, the kinetic and potential energies, contributing
equal amounts? One can actually see here a certain analogy with the classical harmonic
oscillator, whose energy, while being time independent, includes contributions from
kinetic and potential energies whose time dependencies totally compensate each
other, yielding a constant sum.

OK, by finding the energy of the ground state, I took you down to the very bottom
of the energy valley. Now it is time to climb back up, and we are going to do it with
the assistance of... wait for it... of course, the operator $\hat a^\dagger$! Actually, there is not
much surprise or suspense here because this is exactly what happened with the angular
momentum operators: we used $\hat L_-$ to find the lowest eigenvalue and the operator $\hat L_+$ to
move up from there. My next step is pretty obvious now: consider $\hat N\hat a^\dagger|E_n\rangle$:

$$\hat N\hat a^\dagger|E_n\rangle = \hat a^\dagger|E_n\rangle + \hat a^\dagger\hat N|E_n\rangle = \nu_n\hat a^\dagger|E_n\rangle + \hat a^\dagger|E_n\rangle = (\nu_n + 1)\,\hat a^\dagger|E_n\rangle,$$

where this time I used the commutation relation from Eq. 7.31. So, as expected, $\hat a^\dagger|E_n\rangle$
is an eigenvector of the number operator with eigenvalue $\nu_n + 1$, i.e., the operator $\hat a^\dagger$
does generate eigenvectors with eigenvalues increasing by one with each application
of the operator. Starting with the ground state, for which $\nu_{min} = 0$, the operator $\hat a^\dagger$ will
generate eigenvectors with eigenvalues of $\hat N$ equal to $1, 2, 3, \ldots$. In other words, the
eigenvalues of the number operator are all natural numbers $n$ starting with 0, which
makes the energy levels of the quantum harmonic oscillator, according to Eq. 7.29, equal to

$$E_n = \hbar\omega\left(\frac{1}{2} + n\right), \qquad n = 0, 1, 2, \ldots \quad (7.36)$$

What is left for us now is to find the corresponding eigenvectors, for which, from
now on, I will use the simplified notation $|n\rangle$. All I know at this point is that
if $|n\rangle$ is an eigenvector corresponding to the eigenvalue $n$ of the number operator,
then $\hat a^\dagger|n\rangle$ is an eigenvector corresponding to the eigenvalue $n + 1$. But I cannot
guarantee that this new eigenvector will be normalized even if $|n\rangle$ is. Therefore,
reserving the bra and ket notation only for normalized vectors, the best I can write
for now is

$$\hat a^\dagger|n\rangle = c_n|n+1\rangle, \quad (7.37)$$

where $|n+1\rangle$ is assumed normalized and $c_n$ is a yet unknown normalization factor.
Again, I cannot help but remind you that we encountered exactly the same situation
when discussing the eigenvectors of $\hat L^2$. To find $c_n$, I first write down a Hermitian
conjugated version of Eq. 7.37:

$$\langle n|\hat a = c_n^*\langle n+1|. \quad (7.38)$$

Then, multiplying the left-hand and right-hand sides of Eqs. 7.37 and 7.38, I get

$$\langle n|\hat a\hat a^\dagger|n\rangle = |c_n|^2\langle n+1|n+1\rangle.$$


Using the commutation relation 7.24 and taking into account that all vectors are now
assumed normalized, I have

$$\langle n|\hat N + 1|n\rangle = |c_n|^2 \;\Rightarrow\; |c_n|^2 = n + 1.$$

Taking advantage of the freedom in the choice of the phase of the normalization
factor, I choose $c_n$ to be real and positive. Now I have the rule for generating new
normalized eigenvectors:

$$|n+1\rangle = \frac{1}{\sqrt{n+1}}\,\hat a^\dagger|n\rangle.$$

Applying this rule sequentially, starting with the ground state, I end up with the
following expression for an arbitrary eigenvector $|n\rangle$:

$$|n\rangle = \frac{1}{\sqrt{n!}}\left(\hat a^\dagger\right)^n|0\rangle, \quad (7.39)$$

where $|0\rangle$ stands for the eigenvector corresponding to the ground state. One can also
show that

$$\hat a|n\rangle = \sqrt{n}\,|n-1\rangle, \quad (7.40)$$

but I will leave a proof of this relation as an exercise.
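The ladder-operator algebra just derived lends itself to a quick numerical check. The sketch below (my own illustration, in units $\hbar = \omega = 1$, with an arbitrary truncation size $N$) builds $\hat a$ and $\hat a^\dagger$ as finite matrices from Eq. 7.40 and verifies the commutator, Eq. 7.24, and the spectrum of the Hamiltonian, Eqs. 7.27 and 7.36:

```python
import numpy as np

# Truncated-matrix sketch of the ladder operators in the number basis
# |0>, ..., |N-1>, built from a|n> = sqrt(n)|n-1>, Eq. 7.40.
# Units hbar = omega = 1; the truncation size N is an arbitrary choice of mine.
N = 8
a = np.diag(np.sqrt(np.arange(1, N)), k=1)   # matrix elements a[n-1, n] = sqrt(n)
adag = a.conj().T

# Canonical commutator, Eq. 7.24: equal to 1 everywhere except the last
# diagonal entry, where cutting off the infinite matrix spoils it.
comm = a @ adag - adag @ a
print(np.diag(comm))

# The Hamiltonian of Eq. 7.27 is diagonal with eigenvalues n + 1/2, Eq. 7.36:
H = 0.5 * np.eye(N) + adag @ a
print(np.diag(H))
```

The defect in the last diagonal entry of the commutator is a standard truncation artifact: the exact relation $[\hat a, \hat a^\dagger] = 1$ holds only for the full infinite-dimensional matrices.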
Equation 7.39 relates the eigenvectors describing the excited stationary states of the
oscillator to its ground state. The latter, however, might appear to you to be
undetermined, which is true if by "determining" it you mean expressing it in terms
of some known vectors or functions. However, for most purposes, all the information
that you need about the ground state is contained in Eq. 7.32, and in this sense,
this equation is the definition of the ground state. You can use it to find answers
to any specific question pertaining to this state. For instance, if you are interested
in a function representing this state in the coordinate representation, you can use
the coordinate representation of the momentum and coordinate operators to turn
Eq. 7.32 into an easy-to-solve differential equation for $\varphi_0(x) \equiv \langle x|E_{min}\rangle$:

$$\left(\sqrt{\frac{m_e\omega}{2\hbar}}\,x + \sqrt{\frac{\hbar}{2m_e\omega}}\frac{d}{dx}\right)\varphi_0(x) = 0 \;\Rightarrow\; \frac{d\varphi_0(x)}{dx} = -\frac{m_e\omega}{\hbar}x\varphi_0(x) \;\Rightarrow\; \varphi_0 = C\exp\left(-\frac{x^2}{2\sigma^2}\right). \quad (7.41)$$


The parameter $\sigma$ appearing in this equation is defined as

$$\sigma = \sqrt{\frac{\hbar}{m_e\omega}} \quad (7.42)$$

and has the dimension of length. It specifies the characteristic scale of the spatial
dependence of the wave function: for $x \ll \sigma$ the wave function is almost constant,
while for $x \gg \sigma$ its behavior crosses over to a steep descent. It is easy to see that
this parameter characterizes the transition between the classically allowed and classically
forbidden regions of coordinates for the harmonic oscillator. Indeed, substituting
the quantum ground state energy $E = \hbar\omega/2$ into Eq. 7.10 for the amplitude $A$ of the
classical oscillator yields $A = \sigma$, which means that for the ground state of the
oscillator $x < \sigma$ corresponds to the classically allowed region, while the region $x > \sigma$
is classically forbidden.

The integration constant $C$ in Eq. 7.41 is found from the normalization condition:

$$C^2\int_{-\infty}^{\infty}\exp\left(-\frac{x^2}{\sigma^2}\right)dx = C^2\sigma\int_{-\infty}^{\infty}\exp\left(-\tilde x^2\right)d\tilde x = C^2\sqrt{\pi}\,\sigma = 1,$$

where I computed the integral by introducing a dimensionless variable $\tilde x = x/\sigma$ and
using the known value of the Gaussian integral $\int_{-\infty}^{\infty}\exp\left(-y^2\right)dy = \sqrt{\pi}$. Thus, the
normalized version of the oscillator's ground state wave function becomes

$$\varphi_0 = \frac{1}{\sqrt{\sqrt{\pi}\,\sigma}}\exp\left(-\frac{x^2}{2\sigma^2}\right). \quad (7.43)$$
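Both the normalization just computed and the expectation value $\langle\hat x^2\rangle$ of Eq. 7.34 can be confirmed by integrating Eq. 7.43 numerically. The sketch below is my own check, in units where $\hbar = m_e = \omega = 1$ (so that $\sigma = 1$); the grid parameters are arbitrary choices:

```python
import numpy as np

# Numerical check of the ground state wave function, Eq. 7.43, in units where
# hbar = m_e = omega = 1, so sigma = 1. Grid parameters are my choice.
sigma = 1.0
x = np.linspace(-10 * sigma, 10 * sigma, 20001)
dx = x[1] - x[0]
phi0 = np.exp(-x**2 / (2 * sigma**2)) / np.sqrt(np.sqrt(np.pi) * sigma)

norm = float(np.sum(phi0**2) * dx)        # normalization integral, should be 1
x2 = float(np.sum(x**2 * phi0**2) * dx)   # <x^2>; Eq. 7.34 predicts sigma^2/2
print(norm, x2)
```

In these units $\hbar/2m_e\omega = \sigma^2/2$, so the second printed value directly reproduces Eq. 7.34.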

Having found the normalized ground state wave function in the coordinate
representation, I can now use the raising operator (also rewritten in the coordinate
representation) to generate the wave functions representing an arbitrary stationary state
of the Hamiltonian:

$$\varphi_n(x) = \frac{1}{\sqrt{2^n n!\,\sigma\sqrt{\pi}}}\left(\tilde x - \frac{d}{d\tilde x}\right)^n\exp\left(-\frac{\tilde x^2}{2}\right).$$

Here I used the coordinate representation of the raising operator expressed in terms
of the dimensionless variable $\tilde x$,

$$\hat a^\dagger = \sqrt{\frac{m_e\omega}{2\hbar}}\,x - \frac{\hbar}{\sqrt{2m_e\hbar\omega}}\frac{d}{dx} = \frac{x}{\sqrt{2}\,\sigma} - \frac{\sigma}{\sqrt{2}}\frac{d}{dx} = \frac{1}{\sqrt{2}}\left(\tilde x - \frac{d}{d\tilde x}\right),$$


substituted into Eq. 7.39. You can easily convince yourself that the expression

$$\left(\tilde x - \frac{d}{d\tilde x}\right)^n\exp\left(-\frac{\tilde x^2}{2}\right)$$

generates polynomials multiplied by the exponential function $\exp\left(-\tilde x^2/2\right)$. Pulling
out this exponential factor, you end up with the so-called Hermite polynomials $H_n(\tilde x)$,
defined as

$$H_n(\tilde x) = \exp\left(\frac{\tilde x^2}{2}\right)\left(\tilde x - \frac{d}{d\tilde x}\right)^n\exp\left(-\frac{\tilde x^2}{2}\right),$$

so that the oscillator's wave function takes the form

$$\varphi_n(x) = \frac{1}{\sqrt{2^n n!\,\sigma\sqrt{\pi}}}\exp\left(-\frac{\tilde x^2}{2}\right)H_n(\tilde x). \quad (7.44)$$

Hermitian polynomials are well known in mathematical physics and can be
computed from the following somewhat simpler expression:

Hn. Qx/ D .�1/n eQx2 d
n

dQxn
�

e�Qx2
�
: (7.45)

The properties of these polynomials are well documented (google it!), so I will only emphasize one point: these polynomials and, therefore, the entire wave function have a definite parity: it is even for $n = 0, 2, 4, \ldots$ and odd for $n = 1, 3, 5, \ldots$. Obviously, this fact is a result of the symmetry of the harmonic oscillator potential with respect to inversion and is in agreement with our previous discussions of the connection between this symmetry and the parity of the quantum states. Figure 7.2 presents graphs of the wave functions representing states with $n = 0, 1, 2, 3$, from which you can see that another general rule is also fulfilled here: the number of zeroes of the wave function coincides with the number of the respective energy level $n$. Note that $n$ is counted here starting from zero; therefore, the number of zeroes of the wave function is $n$ rather than $n - 1$.
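The Rodrigues-type formula of Eq. 7.45 and the parity property are easy to check symbolically; the sketch below (assuming SymPy is available) compares it against SymPy's built-in `hermite` polynomials:

```python
import sympy as sp

xt = sp.symbols('x', real=True)

def hermite_rodrigues(n):
    # Eq. 7.45: H_n(x) = (-1)^n e^{x^2} d^n/dx^n e^{-x^2}
    return sp.simplify((-1)**n * sp.exp(xt**2) * sp.diff(sp.exp(-xt**2), xt, n))

H2 = hermite_rodrigues(2)   # expect 4x^2 - 2
H3 = hermite_rodrigues(3)   # expect 8x^3 - 12x
```

The same function also exposes the definite parity: $H_3(-x) = -H_3(x)$.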

Coordinate representation is, obviously, not the only possible way to present eigenvectors of the harmonic oscillator. As a second example, I want to discuss a representation based on the eigenvectors of the Hamiltonian, $|n\rangle$. The eigenvectors themselves in this representation are presented, as all basis vectors are, by columns with a single entry, equal to unity, in the row corresponding to the number of the respective basis vector. The Hamiltonian in this basis is presented by a diagonal matrix $H_{nm} = E_n\delta_{nm}$, where $E_n$ are the energy eigenvalues given by Eq. 7.36. Less trivial is the representation of the coordinate and momentum operators, and to find it I must first compute the matrix elements of the lowering (or, just as well, the raising) operator, $a_{mn} = \langle m|\hat{a}|n\rangle$:

$$a_{mn} = \langle m|\hat{a}|n\rangle = \sqrt{n}\,\langle m|n-1\rangle = \sqrt{n}\,\delta_{m,n-1},$$


Fig. 7.2 Wave functions representing states of harmonic oscillators with $n = 0$ (upper left graph), $n = 1$ (upper right graph), $n = 2$ (lower left graph), and $n = 3$ (lower right graph)

where I used Eq. 7.40 and the fact that all eigenvectors $|n\rangle$ are orthonormal. To visualize this matrix correctly, it is important to remember that the index $n$ in Eq. 7.40 starts counting from zero, and it is convenient to keep it this way when numbering the matrix elements. In this case the first row is given by $a_{0,n}$, the second by $a_{1,n}$, and so on. Respectively, the first column is given by $a_{m,0}$. The non-zero elements of the matrix $a_{mn}$ are characterized by a column index exceeding the respective row index by one, i.e., $a_{0,1}, a_{1,2}$, etc.; they run parallel to the main diagonal, one element above it:

$$a_{mn} = \begin{bmatrix} 0 & \sqrt{1} & 0 & 0 & \cdots \\ 0 & 0 & \sqrt{2} & 0 & \cdots \\ 0 & 0 & 0 & \sqrt{3} & \cdots \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \ddots \end{bmatrix}. \quad (7.46)$$

The matrix for the raising operator,

$$a^\dagger_{mn} = \langle m|\hat{a}^\dagger|n\rangle = \sqrt{n+1}\,\langle m|n+1\rangle = \sqrt{n+1}\,\delta_{m,n+1},$$

is obtained from Eq. 7.46 by simple transposition:

$$a^\dagger_{mn} = \begin{bmatrix} 0 & 0 & 0 & 0 & \cdots \\ \sqrt{1} & 0 & 0 & 0 & \cdots \\ 0 & \sqrt{2} & 0 & 0 & \cdots \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \ddots \end{bmatrix}. \quad (7.47)$$

Obtaining matrices for the coordinate and momentum operators is now as easy as adding two matrices. Using Eqs. 7.25 and 7.26, I find

$$x_{mn} = \sqrt{\frac{\hbar}{2m_e\omega}} \begin{bmatrix} 0 & \sqrt{1} & 0 & 0 & \cdots \\ \sqrt{1} & 0 & \sqrt{2} & 0 & \cdots \\ 0 & \sqrt{2} & 0 & \sqrt{3} & \cdots \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \ddots \end{bmatrix} \quad (7.48)$$

$$p_{mn} = i\sqrt{\frac{m_e\hbar\omega}{2}} \begin{bmatrix} 0 & -\sqrt{1} & 0 & 0 & \cdots \\ \sqrt{1} & 0 & -\sqrt{2} & 0 & \cdots \\ 0 & \sqrt{2} & 0 & -\sqrt{3} & \cdots \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & 0 & \ddots \end{bmatrix}. \quad (7.49)$$

Both matrices are obviously Hermitian, but for the matrix representing the momen-
tum operator, one must remember to do complex conjugation in addition to matrix
transposition.
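These matrices are easy to build and probe numerically. The sketch below (Python with NumPy, in units $\hbar = m_e = \omega = 1$) constructs truncated versions of Eqs. 7.46 through 7.49 and verifies their Hermiticity and the canonical commutator $[\hat{x}, \hat{p}] = i\hbar$ away from the truncation edge:

```python
import numpy as np

N = 8                                    # basis truncation (an arbitrary choice)
n = np.arange(N)
a = np.diag(np.sqrt(n[1:]), k=1)         # lowering operator, Eq. 7.46
adag = a.conj().T                        # raising operator, Eq. 7.47
x = np.sqrt(0.5) * (a + adag)            # Eq. 7.48 with hbar = m_e = omega = 1
p = 1j * np.sqrt(0.5) * (adag - a)       # Eq. 7.49
comm = x @ p - p @ x                     # equals i*hbar*I except at the truncation edge
```

The last diagonal element of `comm` is spoiled by the truncation (the raising operator maps $|N-1\rangle$ out of the retained space), which is why the check below excludes the last row and column.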

7.1.2 Dynamics of Quantum Harmonic Oscillator

When talking (or thinking) about a harmonic oscillator, we are intuitively looking for a quantity that changes periodically with time, that is, oscillates. However, the stationary states, which I presented to you in the preceding section, are not very helpful in satisfying our intuitive subconscious desire to see a pendulum or at least something oscillating. Stationary states, even though they have non-zero energies associated with them, do not describe any dynamics or any physically relevant time dependence. Any expectation values computed with stationary states are time-independent, and those of the coordinate and momentum are zero, not only for the ground state but for any stationary state. This is, of course, obvious from the coordinate and momentum matrices presented in Eqs. 7.48 and 7.49, but one can


also make a symmetry-based argument explaining this result. Even though this is a detour from the main goal of this section, I will take it because symmetry-based arguments are important in many areas of quantum mechanics and also because they are cool.

The Hamiltonian of the harmonic oscillator is invariant with respect to the inversion operator $\hat{\Pi}$ (see Sect. 6.2.1), and, therefore, its eigenvectors have a definite parity, as was already mentioned. Any expectation value involves a bra and ket pair of them, and, therefore, whether they are odd or even, their overall contribution is invariant with respect to $\hat{\Pi}$ (a product of two odd functions is even). At the same time, the coordinate and momentum operators are odd with respect to the parity transformation: $\hat{\Pi}^{-1}\hat{x}\hat{\Pi} = -\hat{x}$, $\hat{\Pi}^{-1}\hat{p}\hat{\Pi} = -\hat{p}$, as was shown in Sect. 5.1.2. Thus, on one hand, the expectation values are supposed to change sign upon inversion, but on the other hand, they must not, because they represent a property of the system invariant with respect to inversion. Thus, ponder this: you have a situation where a quantity must simultaneously change its sign while remaining the same. Clearly, there is only one quantity capable of this Houdini trick, and it is the great invention of Hindu mathematicians: the zero.

Now, back to the main topic. It is clear that the only way a quantum harmonic oscillator can actually oscillate is by being in a nonstationary state. We have discussed two approaches to dealing with nonstationary phenomena: the Schrödinger picture (operators are time-independent, state vectors are time-dependent) and the Heisenberg picture (operators depend on time, and state vectors do not). I will treat the dynamics of the harmonic oscillator using both pictures, beginning with the Heisenberg approach.

Heisenberg equations 4.24 can be derived for any operator, and in Sect. 4.2 I did that for the momentum and coordinate operators. Equation 4.33 provides you with the complete solution of the respective Heisenberg equations and essentially with everything you might need to describe the dynamics of any experimentally relevant quantity. However, I would like to revisit the problem of finding the time-dependent position and momentum operators, but this time I will do it by solving the Heisenberg equations for the lowering and raising operators. The corresponding equations are

$$\frac{d\hat{a}_H}{dt} = -\frac{i}{\hbar}\left[\hat{a}_H, \hat{H}\right]$$

$$\frac{d\hat{a}^\dagger_H}{dt} = -\frac{i}{\hbar}\left[\hat{a}^\dagger_H, \hat{H}\right].$$

The expression for the Hamiltonian in terms of the Heisenberg operators $\hat{a}_H$, $\hat{a}^\dagger_H$ is the same as in terms of the Schrödinger operators:

$$e^{\frac{i}{\hbar}\hat{H}t}\hat{H}e^{-\frac{i}{\hbar}\hat{H}t} = \hat{H} = \hbar\omega\left(\frac{1}{2} + e^{\frac{i}{\hbar}\hat{H}t}\hat{a}^\dagger e^{-\frac{i}{\hbar}\hat{H}t}\,e^{\frac{i}{\hbar}\hat{H}t}\hat{a}e^{-\frac{i}{\hbar}\hat{H}t}\right) = \hbar\omega\left(\frac{1}{2} + \hat{a}^\dagger_H\hat{a}_H\right),$$


and, therefore, all the commutation relations, which we calculated for the Schrödinger operators, remain the same. In particular, using Eqs. 7.30 and 7.31, I can find $\left[\hat{a}_H, \hat{H}\right] = \hbar\omega\left[\hat{a}_H, \hat{N}\right] = \hbar\omega\hat{a}_H$ and $\left[\hat{a}^\dagger_H, \hat{H}\right] = \hbar\omega\left[\hat{a}^\dagger_H, \hat{N}\right] = -\hbar\omega\hat{a}^\dagger_H$.

Substituting these into the Heisenberg equations, I obtain the following nice-looking equations:

$$\frac{d\hat{a}_H}{dt} = -i\omega\hat{a}_H \quad (7.50)$$

$$\frac{d\hat{a}^\dagger_H}{dt} = i\omega\hat{a}^\dagger_H. \quad (7.51)$$

Unlike the equations for the momentum and coordinate operators, Eqs. 7.50 and 7.51 are not coupled, so they can be solved independently; in this they bear a striking resemblance to the classical Eqs. 7.14 and 7.13. Solutions to these equations are easy to write:

$$\hat{a}_H = \hat{a}e^{-i\omega t}; \qquad \hat{a}^\dagger_H = \hat{a}^\dagger e^{i\omega t}, \quad (7.52)$$

where $\hat{a}$ and $\hat{a}^\dagger$ are the Schrödinger operators that play the role of initial conditions for the Heisenberg equations. Equations 7.25 and 7.26 are obviously valid for the Heisenberg operators as well, so that one can obtain for the time-dependent coordinate and momentum operators:

$$\hat{x}_H = \sqrt{\frac{\hbar}{2m_e\omega}}\left(\hat{a}e^{-i\omega t} + \hat{a}^\dagger e^{i\omega t}\right) \quad (7.53)$$

$$\hat{p}_H = i\sqrt{\frac{m_e\hbar\omega}{2}}\left(\hat{a}^\dagger e^{i\omega t} - \hat{a}e^{-i\omega t}\right). \quad (7.54)$$
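Equation 7.52 can be checked directly in the truncated number basis (a Python sketch with NumPy, $\hbar = 1$, arbitrary frequency and time): since $\hat{H}$ is diagonal in this basis, conjugating $\hat{a}$ with $e^{\pm i\hat{H}t/\hbar}$ multiplies its single off-diagonal by exactly $e^{-i\omega t}$, so the truncation introduces no error here.

```python
import numpy as np

N, w, t = 6, 1.3, 0.7                     # truncation, frequency, time (hbar = 1)
n = np.arange(N)
a = np.diag(np.sqrt(n[1:]), k=1)          # lowering operator in the |n> basis
E = w * (n + 0.5)                         # oscillator energies E_n = hbar*w*(n + 1/2)
U = np.diag(np.exp(-1j * E * t))          # e^{-iHt/hbar}, diagonal in this basis
aH = U.conj().T @ a @ U                   # Heisenberg-picture lowering operator
```

The matrix `aH` should coincide, element by element, with `a * exp(-i w t)`.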

Using the Euler relation for the exponential functions, I can rewrite this result in the form previously derived in Eq. 4.33:

$$\hat{x}_H = \sqrt{\frac{\hbar}{2m_e\omega}}\left[\left(\hat{a} + \hat{a}^\dagger\right)\cos\omega t + i\left(\hat{a}^\dagger - \hat{a}\right)\sin\omega t\right] \quad (7.55)$$

$$\hat{p}_H = i\sqrt{\frac{m_e\hbar\omega}{2}}\left[\left(\hat{a}^\dagger - \hat{a}\right)\cos\omega t + i\left(\hat{a} + \hat{a}^\dagger\right)\sin\omega t\right], \quad (7.56)$$

which agrees with Eq. 4.33 after one recognizes that at $t = 0$ these equations reproduce the coordinate and momentum operators in the Schrödinger representation. Either of Eqs. 7.53–7.56 can be used, for instance, to compute the expectation values of the coordinate and momentum for an arbitrary initial state $|\psi_0\rangle$. This task can be facilitated by using the basis of the eigenvectors $|n\rangle$ to represent this state:

$$|\psi_0\rangle = \sum_{n=0}^{\infty} c_n|n\rangle. \quad (7.57)$$

It is a bit more convenient to carry out these calculations using the exponential form of the time dependence as in Eqs. 7.53 and 7.54:

$$\langle x\rangle = \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{n,m=0}^{\infty} c_m^* c_n \langle m|\hat{a}|n\rangle + e^{i\omega t}\sum_{n,m=0}^{\infty} c_m^* c_n \langle m|\hat{a}^\dagger|n\rangle\right]$$

$$= \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{n,m=0}^{\infty} c_m^* c_n \sqrt{n}\,\delta_{m,n-1} + e^{i\omega t}\sum_{n,m=0}^{\infty} c_m^* c_n \sqrt{n+1}\,\delta_{m,n+1}\right]$$

$$= \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{m=0}^{\infty} c_m^* c_{m+1}\sqrt{m+1} + e^{i\omega t}\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1}\right], \quad (7.58)$$

where I used the previously derived matrix elements of the lowering and raising operators. Now, we are getting something familiar: the expectation value of the coordinate does indeed oscillate with the frequency of the harmonic oscillator $\omega$, and, interestingly, this behavior does not depend on the actual initial state, as long as the state has contributions from at least two adjacent stationary states, so that both $c_m$ and $c_{m+1}$ are different from zero. This requirement, of course, excludes initial stationary states, which have only one nonvanishing coefficient $c_m$, as well as nonstationary states with a definite parity, which contain only coefficients $c_m$ with either odd or even $m$. With a bit of imagination, you can recognize in Eq. 7.58 the behavior typical of a classical harmonic oscillator, which can be described as

$$\langle x\rangle = A\cos(\omega t + \phi), \quad (7.59)$$

where the amplitude and phase of the oscillations are determined by the initial conditions (as they also are in the classical case):

$$A = 2\sqrt{\frac{\hbar}{2m_e\omega}}\left|\sum_{m=0}^{\infty} c_m^* c_{m+1}\sqrt{m+1}\right|; \qquad \phi = \arctan\frac{\operatorname{Im}\left(\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1}\right)}{\operatorname{Re}\left(\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1}\right)}. \quad (7.60)$$

(The factor of 2 in the amplitude arises because the two complex-conjugate terms of Eq. 7.58 combine into a single cosine.)
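A numerical cross-check of the amplitude (a Python sketch in units $\hbar = m_e = \omega = 1$; the state coefficients below are an arbitrary normalized example): the maximum of $\langle x\rangle(t)$ evaluated directly from Eq. 7.58 should reproduce the amplitude, including the factor of 2 that appears when the two complex-conjugate terms combine into a cosine.

```python
import numpy as np

c = np.array([0.5, 0.5j, np.sqrt(0.5)])  # normalized example state: c_0, c_1, c_2
m = np.arange(len(c) - 1)
S = np.sum(np.conj(c[m]) * c[m + 1] * np.sqrt(m + 1))   # the sum entering Eqs. 7.58 and 7.60
A = 2.0 * np.sqrt(0.5) * np.abs(S)       # amplitude (hbar = m_e = omega = 1)

t = np.linspace(0.0, 2*np.pi, 4001)      # one full period of oscillation
xavg = np.sqrt(0.5) * (np.exp(-1j*t)*S + np.exp(1j*t)*np.conj(S)).real   # Eq. 7.58
```

The dense time grid is only there so that the sampled maximum lands close to the true peak of the cosine.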

Similar calculations for the momentum operator produce

$$\langle p\rangle = i\sqrt{\frac{m_e\hbar\omega}{2}}\left[e^{i\omega t}\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1} - e^{-i\omega t}\sum_{m=0}^{\infty} c_m^* c_{m+1}\sqrt{m+1}\right], \quad (7.61)$$


which can be rewritten using the same amplitude and phase as

$$\langle p\rangle = -m_e\omega A\sin(\omega t + \phi) \quad (7.62)$$

in full agreement with the Ehrenfest theorem, Eq. 4.17.
Before shifting attention to the Schrödinger picture, let me consider a few more examples of the application of the Heisenberg equations.

Example 18 (Uncertainties of Coordinate and Momentum of a Harmonic Oscillator) Assume that the harmonic oscillator is initially in a state described by an equal superposition of its ground and first excited states:

$$|\alpha_0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right).$$

Compute uncertainties of the coordinate and momentum operators at an arbitrary
time t and demonstrate, using the Heisenberg picture, that the uncertainty relation is
fulfilled at all times.

Using Eqs. 7.58 and 7.61 with $c_0 = c_1 = 1/\sqrt{2}$ and $c_m = 0$ for $m > 1$, I find for the expectation values

$$\langle x\rangle = \sqrt{\frac{\hbar}{2m_e\omega}}\cos\omega t$$

$$\langle p\rangle = -\sqrt{\frac{\hbar\omega m_e}{2}}\sin\omega t.$$

To find the uncertainties, I first have to compute $\langle p^2\rangle$ and $\langle x^2\rangle$. I begin by computing

$$\hat{x}^2 = \frac{\hbar}{2m_e\omega}\left(\hat{a}e^{-i\omega t} + \hat{a}^\dagger e^{i\omega t}\right)^2 = \frac{\hbar}{2m_e\omega}\left(\hat{a}^2e^{-2i\omega t} + \hat{a}\hat{a}^\dagger + \hat{a}^\dagger\hat{a} + \left(\hat{a}^\dagger\right)^2e^{2i\omega t}\right),$$

$$\hat{p}^2 = -\frac{m_e\hbar\omega}{2}\left(\hat{a}e^{-i\omega t} - \hat{a}^\dagger e^{i\omega t}\right)^2 = -\frac{m_e\hbar\omega}{2}\left(\hat{a}^2e^{-2i\omega t} - \hat{a}\hat{a}^\dagger - \hat{a}^\dagger\hat{a} + \left(\hat{a}^\dagger\right)^2e^{2i\omega t}\right).$$

Now, remembering that $\hat{a}|0\rangle = 0$, $\hat{a}|1\rangle = |0\rangle$, $\hat{a}|2\rangle = \sqrt{2}|1\rangle$, $\hat{a}^\dagger|0\rangle = |1\rangle$, $\hat{a}^\dagger|1\rangle = \sqrt{2}|2\rangle$, $\hat{a}^\dagger|2\rangle = \sqrt{3}|3\rangle$, I get

$$\hat{x}^2|\alpha_0\rangle = \frac{\hbar}{2\sqrt{2}m_e\omega}\left(|0\rangle + \sqrt{2}e^{2i\omega t}|2\rangle + 2|1\rangle + |1\rangle + \sqrt{6}e^{2i\omega t}|3\rangle\right),$$

$$\hat{p}^2|\alpha_0\rangle = -\frac{m_e\hbar\omega}{2\sqrt{2}}\left(-|0\rangle + \sqrt{2}e^{2i\omega t}|2\rangle - 2|1\rangle - |1\rangle + \sqrt{6}e^{2i\omega t}|3\rangle\right),$$

and, finally,

$$\langle\alpha_0|\hat{x}^2|\alpha_0\rangle = \frac{\hbar}{4m_e\omega}(1 + 3) = \frac{\hbar}{m_e\omega}$$

$$\langle\alpha_0|\hat{p}^2|\alpha_0\rangle = \frac{m_e\hbar\omega}{4}(1 + 3) = m_e\hbar\omega.$$

Now I can find the uncertainties:

$$\Delta x = \sqrt{\langle\hat{x}^2\rangle - \langle\hat{x}\rangle^2} = \sqrt{\frac{\hbar}{m_e\omega}\left(1 - \frac{1}{2}\cos^2\omega t\right)}$$

$$\Delta p = \sqrt{\langle\hat{p}^2\rangle - \langle\hat{p}\rangle^2} = \sqrt{m_e\hbar\omega\left(1 - \frac{1}{2}\sin^2\omega t\right)}$$

$$\Delta x\,\Delta p = \frac{\hbar}{\sqrt{2}}\sqrt{1 + \frac{1}{8}\sin^2 2\omega t} \geq \frac{\hbar}{\sqrt{2}} \approx 0.71\hbar$$

in agreement with the uncertainty principle.
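This example is easy to cross-check numerically (a Python sketch, units $\hbar = m_e = \omega = 1$) by evolving the expansion coefficients in the number basis; in these units the computation below gives a minimum product $\Delta x\,\Delta p = 1/\sqrt{2}$ at $t = 0$ and a maximum of $3/4$ at $t = \pi/4\omega$.

```python
import numpy as np

N = 10                                    # truncation large enough for this state
n = np.arange(N)
a = np.diag(np.sqrt(n[1:]), k=1)
x = (a + a.conj().T) / np.sqrt(2.0)       # hbar = m_e = omega = 1
p = 1j * (a.conj().T - a) / np.sqrt(2.0)
c = np.zeros(N, complex)
c[0] = c[1] = 1/np.sqrt(2.0)              # the state (|0> + |1>)/sqrt(2)

def uncertainties(t):
    psi = c * np.exp(-1j * (n + 0.5) * t) # phase evolution of the coefficients
    ev = lambda A: (psi.conj() @ A @ psi).real
    dx = np.sqrt(ev(x @ x) - ev(x)**2)
    dp = np.sqrt(ev(p @ p) - ev(p)**2)
    return dx, dp
```

Because the state involves only $|0\rangle$ and $|1\rangle$, the truncation at `N = 10` is exact for all the matrix products used here.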
There also exists an alternative approach to computing time-dependent averages of various observables using the Heisenberg picture, which allows one to establish their dependence on time in a more generic way. To develop such an approach, let me first rewrite Eqs. 7.55 and 7.56 using Eqs. 7.25 and 7.26 for the Schrödinger versions of the coordinate and momentum operators, which I will designate here as $\hat{x}_0$ and $\hat{p}_0$ to emphasize the fact that they serve as initial values for the Heisenberg equations:

$$\hat{x}_H = \hat{x}_0\cos\omega t + \frac{1}{m_e\omega}\hat{p}_0\sin\omega t \quad (7.63)$$

$$\hat{p}_H = \hat{p}_0\cos\omega t - m_e\omega\hat{x}_0\sin\omega t. \quad (7.64)$$

Now, let's say I want to compute the uncertainty of the coordinate for an arbitrary state $|\alpha\rangle$. The expectation values of the coordinate and momentum in this state, using Eqs. 7.63 and 7.64, can be written as

$$\langle x\rangle = \langle\hat{x}_0\rangle\cos\omega t + \frac{1}{m_e\omega}\langle\hat{p}_0\rangle\sin\omega t$$

$$\langle p\rangle = \langle\hat{p}_0\rangle\cos\omega t - m_e\omega\langle\hat{x}_0\rangle\sin\omega t,$$

where $\langle\hat{x}_0\rangle$ and $\langle\hat{p}_0\rangle$ are time-independent "Schrödinger" expectation values that can be computed for a given state using any of the representations for the Schrödinger coordinate and momentum operators. Similarly, I can find for the expectation values of the squared operators


$$\langle\hat{x}^2\rangle = \langle\hat{x}_0^2\rangle\cos^2\omega t + \frac{1}{2m_e\omega}\left(\langle\hat{x}_0\hat{p}_0\rangle + \langle\hat{p}_0\hat{x}_0\rangle\right)\sin 2\omega t + \frac{1}{m_e^2\omega^2}\langle\hat{p}_0^2\rangle\sin^2\omega t$$

$$\langle\hat{p}^2\rangle = \langle\hat{p}_0^2\rangle\cos^2\omega t - \frac{1}{2}m_e\omega\left(\langle\hat{x}_0\hat{p}_0\rangle + \langle\hat{p}_0\hat{x}_0\rangle\right)\sin 2\omega t + m_e^2\omega^2\langle\hat{x}_0^2\rangle\sin^2\omega t,$$

which will yield the following for the uncertainties:

$$(\Delta x)^2 = (\Delta\hat{x}_0)^2\cos^2\omega t + \frac{1}{m_e^2\omega^2}(\Delta\hat{p}_0)^2\sin^2\omega t + \frac{1}{2m_e\omega}\left[\langle\hat{x}_0\hat{p}_0\rangle + \langle\hat{p}_0\hat{x}_0\rangle - 2\langle\hat{x}_0\rangle\langle\hat{p}_0\rangle\right]\sin 2\omega t$$

$$(\Delta p)^2 = (\Delta\hat{p}_0)^2\cos^2\omega t + m_e^2\omega^2(\Delta\hat{x}_0)^2\sin^2\omega t - \frac{1}{2}m_e\omega\left[\langle\hat{x}_0\hat{p}_0\rangle + \langle\hat{p}_0\hat{x}_0\rangle - 2\langle\hat{x}_0\rangle\langle\hat{p}_0\rangle\right]\sin 2\omega t.$$

I already mentioned it once, but it is worth emphasizing again: all expectation values
in this expression refer to Schrödinger operators and can be computed using any
of the representations for the latter. Let me illustrate this point by considering the
following example.

Example 19 (Harmonic Oscillator with Shifted Minimum of the Potential) Consider a harmonic oscillator with mass $m_e$ and frequency $\omega$ in the ground state. Suddenly, without disruption of the oscillator's state, the minimum of the potential shifts by $d$ along the axis of oscillations, and the stiffness of the potential changes such that it is now characterized by a new classical frequency $\Omega$. Find the expectation values and uncertainties of the coordinate and momentum of the electron in the potential with the new position of its minimum.

It is convenient to solve this problem using the coordinate representation for the initial state and for the Schrödinger operators $\hat{x}$ and $\hat{p}$. First of all, let's agree to place the origin of the X-axis at the new position of the minimum. Then, the initial wave function, which is the ground state wave function of the oscillator with the potential in the original position, is

$$\psi_0(x) = \left(\frac{m_e\omega}{\pi\hbar}\right)^{1/4}\exp\left(-\frac{m_e\omega}{2\hbar}(x + d)^2\right),$$

where $x$ is counted from the new position of the potential. The expectation values of the Schrödinger operators, $\langle\hat{x}_0\rangle$ and $\langle\hat{p}_0\rangle$, are

$$\langle\hat{x}_0\rangle = \sqrt{\frac{m_e\omega}{\pi\hbar}}\int_{-\infty}^{\infty} x\exp\left(-\frac{m_e\omega}{\hbar}(x + d)^2\right)dx = \sqrt{\frac{m_e\omega}{\pi\hbar}}\int_{-\infty}^{\infty}(x - d)\exp\left(-\frac{m_e\omega}{\hbar}x^2\right)dx = -d,$$

where I made a substitution of variables and took into account that the wave function
of the initial state is even. Similarly, I can find

$$\langle\hat{p}_0\rangle = -i\hbar\sqrt{\frac{m_e\omega}{\pi\hbar}}\int_{-\infty}^{\infty}\exp\left(-\frac{m_e\omega}{2\hbar}(x + d)^2\right)\frac{d}{dx}\left[\exp\left(-\frac{m_e\omega}{2\hbar}(x + d)^2\right)\right]dx$$

$$= i\hbar\sqrt{\frac{m_e\omega}{\pi\hbar}}\,\frac{m_e\omega}{\hbar}\int_{-\infty}^{\infty}\exp\left(-\frac{m_e\omega}{\hbar}(x + d)^2\right)(x + d)\,dx = 0.$$

Thus, I have for the time-dependent expectation values

$$\langle x\rangle = -d\cos\Omega t$$

$$\langle p\rangle = m_e\Omega d\sin\Omega t.$$
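The two Gaussian integrals behind these results can be verified symbolically (a SymPy sketch, in units $\hbar = m_e\omega = 1$, with the shift $d$ kept symbolic):

```python
import sympy as sp

x, d = sp.symbols('x d', real=True)
# Shifted ground state in units hbar = m_e*omega = 1.
psi = sp.pi**sp.Rational(-1, 4) * sp.exp(-(x + d)**2 / 2)

x0 = sp.integrate(x * psi**2, (x, -sp.oo, sp.oo))                     # <x_0>, expect -d
p0 = sp.integrate(-sp.I * psi * sp.diff(psi, x), (x, -sp.oo, sp.oo))  # <p_0>, expect 0
```

The momentum integral vanishes because the wave function is real, exactly as argued in the text.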

(Obviously, the dynamics of the raising and lowering operators is now governed by the new frequency $\Omega$.) In order to find the respective uncertainties, I need to compute $\Delta\hat{x}_0$, $\Delta\hat{p}_0$, and $\langle\hat{x}_0\hat{p}_0\rangle$. The uncertainties of the regular Schrödinger coordinate and momentum operators do not depend on the position of the potential minimum with respect to the origin of the coordinate axes, so I can simply recycle the results from Eqs. 7.34 and 7.35:

$$\Delta\hat{x}_0 = \sqrt{\frac{\hbar}{2m_e\omega}}; \qquad \Delta\hat{p}_0 = \sqrt{\frac{\hbar m_e\omega}{2}}.$$

For the last expectation value, $\langle\hat{x}_0\hat{p}_0\rangle$, I will actually have to do some work:

$$\langle\hat{x}_0\hat{p}_0\rangle = -i\hbar\left(\frac{m_e\omega}{\pi\hbar}\right)^{1/2}\int_{-\infty}^{\infty} x\exp\left(-\frac{m_e\omega}{2\hbar}(x + d)^2\right)\frac{d}{dx}\left[\exp\left(-\frac{m_e\omega}{2\hbar}(x + d)^2\right)\right]dx$$

$$= i\hbar\left(\frac{m_e\omega}{\pi\hbar}\right)^{1/2}\frac{m_e\omega}{\hbar}\int_{-\infty}^{\infty} x(x + d)\exp\left(-\frac{m_e\omega}{\hbar}(x + d)^2\right)dx$$

$$= i\hbar\left(\frac{m_e\omega}{\pi\hbar}\right)^{1/2}\frac{m_e\omega}{\hbar}\int_{-\infty}^{\infty} x(x - d)\exp\left(-\frac{m_e\omega}{\hbar}x^2\right)dx = i\hbar\,\frac{m_e\omega}{\hbar}\,\frac{\hbar}{2m_e\omega} = \frac{i\hbar}{2}.$$

In the last line of this expression, I took into account that the integral with the factor linear in $x$ vanishes because of the oddness of the integrand, while the integral containing $x^2$, together with the normalization factor of the wave function, reproduces the uncertainty of the coordinate $(\Delta\hat{x}_0)^2$. If you are spooked by the imaginary result here, you shouldn't be. The operator $\hat{x}_0\hat{p}_0$ is not Hermitian, and its expectation value does not have to be real. To complete this calculation, I would have to compute $\langle\hat{p}_0\hat{x}_0\rangle$, but I will save us some time and use the canonical commutation relation $[\hat{x}, \hat{p}] = i\hbar$ to find

$$\langle\hat{x}_0\hat{p}_0\rangle + \langle\hat{p}_0\hat{x}_0\rangle = 2\langle\hat{x}_0\hat{p}_0\rangle - i\hbar = 0.$$

Oops, so much effort to get zero in the end? Feeling disappointed and a bit cheated? Well, you should be, because we could have guessed that the answer here is zero without any calculations. Indeed, the momentum operator contains the imaginary unit, and with the wave function being completely real, this imaginary factor is not going anywhere. But, on the other hand, the result must be real because $\hat{x}_0\hat{p}_0 + \hat{p}_0\hat{x}_0$ is a Hermitian operator. So, the only conclusion a reasonable person can draw from this conundrum is that the result must be zero. Thus, we finally have for the time-dependent uncertainties:

$$(\Delta x)^2 = \frac{\hbar}{2m_e\omega}\cos^2\Omega t + \frac{1}{m_e^2\Omega^2}\,\frac{\hbar m_e\omega}{2}\sin^2\Omega t = \frac{\hbar}{2m_e\omega}\left(\cos^2\Omega t + \frac{\omega^2}{\Omega^2}\sin^2\Omega t\right)$$

$$(\Delta p)^2 = \frac{\hbar m_e\omega}{2}\cos^2\Omega t + m_e^2\Omega^2\,\frac{\hbar}{2m_e\omega}\sin^2\Omega t = \frac{\hbar m_e\omega}{2}\left(\cos^2\Omega t + \frac{\Omega^2}{\omega^2}\sin^2\Omega t\right),$$

and for their product

$$(\Delta x)^2(\Delta p)^2 = \frac{\hbar^2}{4}\left[\cos^4\Omega t + \sin^4\Omega t + \left(\frac{\omega^2}{\Omega^2} + \frac{\Omega^2}{\omega^2}\right)\cos^2\Omega t\sin^2\Omega t\right]$$

$$= \frac{\hbar^2}{4}\left[\cos^4\Omega t + \sin^4\Omega t + 2\cos^2\Omega t\sin^2\Omega t + \left(\frac{\omega^2}{\Omega^2} + \frac{\Omega^2}{\omega^2} - 2\right)\cos^2\Omega t\sin^2\Omega t\right]$$

$$= \frac{\hbar^2}{4}\left[1 + \left(\frac{\omega^2}{\Omega^2} + \frac{\Omega^2}{\omega^2} - 2\right)\cos^2\Omega t\sin^2\Omega t\right],$$

where in the second line I added and subtracted the term $2\cos^2\Omega t\sin^2\Omega t$ and in the third line used the identity $\cos^4\Omega t + \sin^4\Omega t + 2\cos^2\Omega t\sin^2\Omega t = \left(\cos^2\Omega t + \sin^2\Omega t\right)^2 = 1$. The function $y + 1/y$, which appears in the final result with $y = \Omega^2/\omega^2$, has a minimum at $y = 1$ ($\Omega = \omega$), at which point the product of the uncertainties becomes $\hbar^2/4$. For all other relations between the two frequencies, the product of the uncertainties exceeds this value, in full agreement with the uncertainty principle. It is interesting to note that as the uncertainties oscillate, their product returns to its minimum value at times $t_n = \pi n/(2\Omega)$.
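The final product formula can be probed numerically (a Python sketch, $\hbar = 1$, with arbitrarily chosen frequencies $\omega \neq \Omega$): the product never drops below $\hbar^2/4$ and returns to it at $t_n = \pi n/(2\Omega)$.

```python
import numpy as np

w, W = 1.0, 2.5                           # old frequency omega, new frequency Omega (hbar = 1)
t = np.linspace(0.0, 2*np.pi/W, 2001)     # one period of the new oscillation
prod = 0.25 * (1 + (w**2/W**2 + W**2/w**2 - 2) * np.cos(W*t)**2 * np.sin(W*t)**2)
```

On this grid, $t = \pi/(2\Omega)$ falls exactly on index 500, where the product should return to its minimum.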

In the Schrödinger picture, the dynamics of quantum systems is described by the time dependence of the vectors representing quantum states. For the initial state given by Eq. 7.57, the time-dependent state can be presented as (see Eq. 4.15)

$$|\psi(t)\rangle = \sum_{n=0}^{\infty} c_n e^{-i\omega(n+1/2)t}|n\rangle. \quad (7.65)$$

Computing the expectation value of the coordinate with this state and using again the
representation of the coordinate operator in terms of lowering and raising operators,
I have

$$\langle x\rangle = \sqrt{\frac{\hbar}{2m_e\omega}}\left[\sum_{n=0}^{\infty}\sum_{m=0}^{\infty} c_m^* c_n e^{i\omega(m-n)t}\langle m|\hat{a}|n\rangle + \sum_{n=0}^{\infty}\sum_{m=0}^{\infty} c_m^* c_n e^{i\omega(m-n)t}\langle m|\hat{a}^\dagger|n\rangle\right]$$

$$= \sqrt{\frac{\hbar}{2m_e\omega}}\left[\sum_{n=0}^{\infty}\sum_{m=0}^{\infty} c_m^* c_n e^{i\omega(m-n)t}\sqrt{n}\,\delta_{m,n-1} + \sum_{n=0}^{\infty}\sum_{m=0}^{\infty} c_m^* c_n e^{i\omega(m-n)t}\sqrt{n+1}\,\delta_{m,n+1}\right]$$

$$= \sqrt{\frac{\hbar}{2m_e\omega}}\left[e^{-i\omega t}\sum_{m=0}^{\infty} c_m^* c_{m+1}\sqrt{m+1} + e^{i\omega t}\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1}\right] \quad (7.66)$$

in full agreement with Eq. 7.58 obtained using the Heisenberg representation. What is interesting about this result is that in the beginning of the computations, we had complex exponential functions with all frequencies $\omega(m - n)$. However, after the matrix elements of the lowering and raising operators have been taken into account, only terms with the single frequency $\omega$ survived. In the Heisenberg approach, the frequencies $\omega(m - n)$ never appear because the properties of $\hat{a}$ and $\hat{a}^\dagger$ are incorporated from the very beginning at the level of the Heisenberg equations. A similar expression can be easily derived for the expectation value of the momentum operator:

$$\langle p\rangle = i\sqrt{\frac{\hbar m_e\omega}{2}}\left[e^{i\omega t}\sum_{m=0}^{\infty} c_{m+1}^* c_m\sqrt{m+1} - e^{-i\omega t}\sum_{m=0}^{\infty} c_m^* c_{m+1}\sqrt{m+1}\right],$$


while generic expressions for the uncertainties of the coordinate and momentum operators in the Schrödinger picture are much more cumbersome and more difficult to derive. Thus, I will illustrate the derivation of the uncertainties for time-dependent states in the Schrödinger picture with the same Example 18, which was previously solved in the Heisenberg picture.

Example 20 (Uncertainties of the Coordinate and Momentum of the Quantum Harmonic Oscillator in the Schrödinger Picture) Let me remind you that we are dealing with a harmonic oscillator prepared in a state

$$|\alpha_0\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right),$$

and we want to compute the uncertainties of the coordinate and momentum
operators at an arbitrary time t using the Schrödinger picture.

Comparing the expression for the initial state with Eq. 7.57, the expansion coefficients $c_n$ in Eq. 7.65 can be identified as $c_0 = c_1 = 1/\sqrt{2}$, while all other coefficients vanish. Thus, the time-dependent state vector now becomes

$$|\psi(t)\rangle = \frac{1}{\sqrt{2}}\left[\exp\left(-\frac{i}{2}\omega t\right)|0\rangle + \exp\left(-\frac{3i}{2}\omega t\right)|1\rangle\right].$$

The expectation values are immediately found from Eq. 7.66 to be, as before,

$$\langle x\rangle = \sqrt{\frac{\hbar}{2m_e\omega}}\cos\omega t;$$

$$\langle p\rangle = -\sqrt{\frac{\hbar\omega m_e}{2}}\sin\omega t.$$

To find the uncertainties, I need

$$\hat{x}^2 = \frac{\hbar}{2m_e\omega}\left(\hat{a}^2 + \left(\hat{a}^\dagger\right)^2 + \hat{a}\hat{a}^\dagger + \hat{a}^\dagger\hat{a}\right)$$

$$\hat{p}^2 = -\frac{\hbar\omega m_e}{2}\left(\hat{a}^2 + \left(\hat{a}^\dagger\right)^2 - \hat{a}\hat{a}^\dagger - \hat{a}^\dagger\hat{a}\right).$$

Using again the properties of the lowering and raising operators, I find

$$\hat{x}^2|\psi(t)\rangle = \frac{\hbar}{2\sqrt{2}m_e\omega}\left[\sqrt{2}\exp\left(-\frac{i}{2}\omega t\right)|2\rangle + \exp\left(-\frac{i}{2}\omega t\right)|0\rangle + \sqrt{6}\exp\left(-\frac{3i}{2}\omega t\right)|3\rangle + 3\exp\left(-\frac{3i}{2}\omega t\right)|1\rangle\right].$$


Now, premultiplying this result by $\langle\psi(t)|$ and using the orthogonality of the eigenvectors, I find

$$\langle\hat{x}^2\rangle = \frac{\hbar}{m_e\omega}$$

in complete agreement with the results obtained in the Heisenberg picture. I will leave the computation of the corresponding result for the momentum operator to you.

7.2 Isotropic Three-Dimensional Harmonic Oscillator

Using the concept of normal coordinates, any three-dimensional (or even multi-particle) harmonic oscillator can be reduced to a collection of one-dimensional oscillators, with the total Hamiltonian being the sum of one-dimensional Hamiltonians. The spectrum of eigenvalues in this case is obtained by simply summing up the eigenvalues of each one-dimensional component, and the respective eigenvectors are obtained as direct products of one-dimensional eigenvectors. To illustrate this point, consider a Hamiltonian of the form

$$\hat{H} = \frac{\hat{p}_x^2}{2m_{ex}} + \frac{\hat{p}_y^2}{2m_{ey}} + \frac{\hat{p}_z^2}{2m_{ez}} + \frac{1}{2}\left(m_{ex}\omega_x^2\hat{x}^2 + m_{ey}\omega_y^2\hat{y}^2 + m_{ez}\omega_z^2\hat{z}^2\right) \quad (7.67)$$

$$= \hat{H}_x + \hat{H}_y + \hat{H}_z.$$

I can define a state characterized by three quantum numbers, $|n_x, n_y, n_z\rangle$, which can be considered as a "product" of the one-dimensional eigenvectors defined in the previous section, $|n_x, n_y, n_z\rangle \equiv |n_x\rangle|n_y\rangle|n_z\rangle$, where the last notation does not presume any kind of actual "multiplication" but just serves as a reminder that the $x$-dependent part of the Hamiltonian 7.67 acts only on the $|n_x\rangle$ portion of the eigenvector, $\hat{H}_y$ acts only on $|n_y\rangle$, and so on. Thus, as a result, I have

$$\left(\hat{H}_x + \hat{H}_y + \hat{H}_z\right)|n_x\rangle|n_y\rangle|n_z\rangle = \left[\hbar\omega_x\left(n_x + \frac{1}{2}\right) + \hbar\omega_y\left(n_y + \frac{1}{2}\right) + \hbar\omega_z\left(n_z + \frac{1}{2}\right)\right]|n_x\rangle|n_y\rangle|n_z\rangle,$$

where $n_{x,y,z}$ independently take integer values starting from zero. The position representation of the eigenvectors is obtained as

$$\varphi_{n_x,n_y,n_z}(x, y, z) = \langle x, y, z|n_x, n_y, n_z\rangle \equiv \langle x|n_x\rangle\langle y|n_y\rangle\langle z|n_z\rangle = \varphi_{n_x}(x)\varphi_{n_y}(y)\varphi_{n_z}(z),$$

where each $\varphi_{n_i}(r_i)$ is given by Eq. 7.44.


In the most general case, when the parameters in $\hat{H}_x$, $\hat{H}_y$, and $\hat{H}_z$ are all different, we end up with distinct eigenvalues characterized by three independent integers. The ground state is characterized by $n_x = n_y = n_z = 0$, and its energy is given by $E_{0,0,0} = \frac{1}{2}\hbar\left(\omega_x + \omega_y + \omega_z\right)$.

If, however, all masses and all three frequencies are equal to each other, so that the Hamiltonian becomes

$$\hat{H} = \frac{\hat{p}_x^2 + \hat{p}_y^2 + \hat{p}_z^2}{2m_e} + \frac{1}{2}m_e\omega^2\left(\hat{x}^2 + \hat{y}^2 + \hat{z}^2\right), \quad (7.68)$$

a new phenomenon emerges. The energy eigenvalues are now given by

$$E_{n_x,n_y,n_z} = \hbar\omega\left(\frac{3}{2} + n_x + n_y + n_z\right),$$

and the energy takes the same value for different eigenvectors as long as the respective indexes obey the condition $n = n_x + n_y + n_z$. In other words, the eigenvalues in this case become degenerate: several distinct vectors belong to the same eigenvalue. The number of degenerate eigenvectors is relatively easy to compute: for each $n$ you can choose $n_x$ to be anything between 0 and $n$, and once $n_x$ is chosen, $n_y$ can be anything between 0 and $n - n_x$, so there are $n - n_x + 1$ choices. Once $n_x$ and $n_y$ are determined, the remaining quantum number $n_z$ becomes uniquely defined. Thus, the total number of choices of $n_x$ and $n_y$ for any given $n$ can be found as

$$\sum_{n_x=0}^{n}(n - n_x + 1) = (n + 1)(n + 1) - \frac{n(n + 1)}{2} = \frac{(n + 1)(n + 2)}{2}.$$

This degeneracy can be easily traced to the symmetry of the system, which has
emerged once I made the parameters of the oscillator independent of the direction.
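The counting argument above is easy to confirm by brute force (a short Python sketch):

```python
# Brute-force count of triples (n_x, n_y, n_z) of non-negative integers with
# n_x + n_y + n_z = n, compared against the closed form (n + 1)(n + 2)/2.
def degeneracy(n):
    return sum(1 for nx in range(n + 1)
                 for ny in range(n - nx + 1))   # n_z = n - n_x - n_y is then fixed

counts = [degeneracy(n) for n in range(8)]      # expect 1, 3, 6, 10, ...
```

The first few degeneracies, 1, 3, 6, 10, are the triangular numbers, as the closed form predicts.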

7.2.1 Isotropic Oscillator in Spherical Coordinates

Even though we already know the solution to the problem of an isotropic harmonic oscillator, it is instructive to reconsider it by working in the position representation and using the spherical coordinate system instead of the Cartesian one. The position representation of the Hamiltonian in this case becomes

$$\hat{H} = -\frac{\hbar^2}{2m_e}\nabla^2 + \frac{1}{2}m_e\omega^2 r^2 = -\frac{\hbar^2}{2m_e r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{\hat{L}^2}{2m_e r^2} + \frac{1}{2}m_e\omega^2 r^2, \quad (7.69)$$


where in the second equality I used Eq. 5.62, representing the Laplacian operator in terms of the radial coordinate $r$ and the operator $\hat{L}^2$. It is obvious that the Hamiltonian commutes with both $\hat{L}^2$ and $\hat{L}_z$, so that the eigenvectors of the Hamiltonian in the position representation can be written as

$$\psi_{n_r,l,m}(r, \theta, \varphi) = Y_l^m(\theta, \varphi)R_{n_r,l}(r). \quad (7.70)$$

Substituting Eq. 7.70 into the time-independent Schrödinger equation $\hat{H}\psi = E\psi$ with the Hamiltonian given by Eq. 7.69, you can derive for the radial function $R_{n_r,l}(r)$:

$$-\frac{\hbar^2}{2m_e r^2}\frac{d}{dr}\left(r^2\frac{dR_{n_r,l}}{dr}\right) + \frac{\hbar^2 l(l + 1)}{2m_e r^2}R_{n_r,l} + \frac{1}{2}m_e\omega^2 r^2 R_{n_r,l} = E_{l,n_r}R_{n_r,l}. \quad (7.71)$$

It is convenient to introduce an auxiliary function $u_{n_r,l}(r) = rR_{n_r,l}$, which, when inserted into the radial equation above, turns it into

$$-\frac{\hbar^2}{2m_e}\frac{d^2u_{n_r,l}}{dr^2} + \frac{\hbar^2 l(l + 1)}{2m_e r^2}u_{n_r,l} + \frac{1}{2}m_e\omega^2 r^2 u_{n_r,l} = E_{l,n_r}u_{n_r,l}. \quad (7.72)$$

Equation 7.72 looks exactly like a one-dimensional Schrödinger equation with the effective potential

$$V_{\mathrm{eff}} = \frac{\hbar^2 l(l + 1)}{2m_e r^2} + \frac{1}{2}m_e\omega^2 r^2.$$

The plot of this potential (Fig. 7.3) shows that it possesses a minimum

$$V_{\mathrm{eff}}^{\mathrm{min}} = \hbar\omega\sqrt{l(l + 1)}$$

at

$$r_{\mathrm{min}}^2 = \frac{\hbar}{m_e\omega}\sqrt{l(l + 1)}.$$

Fig. 7.3 The schematic of the effective potential for the radial Schrödinger equation for the isotropic 3-D harmonic oscillator, in arbitrary units

(Of course, you do not need to plot this function to know that it has a minimum: just compute the derivative and find its zero.) For any given $l$, the allowed values of energy, obeying the inequality $E > \hbar\omega\sqrt{l(l + 1)}$, correspond to classically bound motion; thus all energy levels in this effective potential are discrete (which, of course, surprises nobody, but is still a nice fact to know). States with $l = 0$ are described by a Schrödinger equation which is an exact replica of the equation for the one-dimensional oscillator. You, however, should not rush to pull out of the drawer old dusty solutions of the one-dimensional problem (OK, not that old and dusty, but still). A huge difference from the purely one-dimensional case is the fact that the domain of the radial coordinate $r$ is $[0, \infty)$, unlike the domain of the coordinate in the one-dimensional problem, which is $(-\infty, \infty)$. Consequently, the wave function $u_{n_r,l}$ must obey a boundary condition at $r = 0$. Given that the actual radial function $R_{n_r,l}$ must remain finite at $r = 0$, it is clear that $u_{n_r,l}(0) = 0$. Now you can go ahead, brush the dust off the solutions of the one-dimensional harmonic oscillator problem, and see which of them fit this requirement. A bit of examination reveals that we have to throw out all even solutions, with quantum numbers $0, 2, 4, \ldots$, which do not satisfy the boundary condition at the origin. At the same time, all odd solutions, characterized by quantum numbers $1, 3, 5, \ldots$, satisfy both the Schrödinger equation and the newly minted boundary condition at $r = 0$, so they (restricted to positive values of the coordinate) do represent eigenvectors of the isotropic oscillator with zero angular momentum.
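This conclusion can be checked numerically: a crude finite-difference diagonalization of the $l = 0$ radial equation (a Python sketch, units $\hbar = m_e = \omega = 1$; the grid size and box radius are arbitrary choices) indeed returns the odd one-dimensional levels $3/2, 7/2, 11/2, \ldots$:

```python
import numpy as np

# Finite-difference l = 0 radial problem: -(1/2) u'' + (1/2) r^2 u = E u, u(0) = u(R) = 0.
R, N = 10.0, 1500
r = np.linspace(0.0, R, N + 2)[1:-1]      # interior grid points
h = r[1] - r[0]
H = (np.diag(1.0/h**2 + 0.5*r**2)
     - np.diag(np.ones(N - 1), 1) / (2*h**2)
     - np.diag(np.ones(N - 1), -1) / (2*h**2))
E = np.sort(np.linalg.eigvalsh(H))[:3]    # lowest three radial levels
```

The Dirichlet condition at $r = 0$ built into the grid is exactly the boundary condition $u_{n_r,l}(0) = 0$ discussed above; it is what filters out the even one-dimensional solutions.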

Solving this problem with $l > 0$ requires a bit more work. To make it somewhat easier to follow, I will begin by introducing a dimensionless radial coordinate $\zeta = r/\sigma$, where $\sigma = \sqrt{\hbar/m_e\omega}$ is the same length scale that was used in the one-dimensional problem. The Schrödinger equation rewritten in this variable becomes

$$-\frac{\hbar\omega}{2}\frac{d^2u_{n_r,l}}{d\zeta^2} + \frac{\hbar\omega\, l(l + 1)}{2\zeta^2}u_{n_r,l} + \frac{1}{2}\hbar\omega\zeta^2 u_{n_r,l} = E_{l,n_r}u_{n_r,l}$$

$$\frac{d^2u_{n_r,l}}{d\zeta^2} - \frac{l(l + 1)}{\zeta^2}u_{n_r,l} - \zeta^2 u_{n_r,l} + \epsilon_{l,n_r}u_{n_r,l} = 0, \quad (7.73)$$

where I introduced the dimensionless energy

$$\epsilon_{l,n_r} = 2E_{l,n_r}/\hbar\omega.$$

The resulting differential equation obviously does not have simple solutions expressible in terms of elementary functions. One of the approaches to solving it is to present a solution in the form of a power series $\sum_j c_j\zeta^j$ and search for the unknown coefficients $c_j$. In principle, knowing these coefficients is equivalent to knowing the entire function. Before attempting this approach, however, it would be wise to try to extract whatever information about the solution this equation might contain. For instance, you can ask about the solution's behavior at very small and very large values of $\zeta$. When $\zeta \ll 1$, the main term in Eq. 7.73 is the angular momentum contribution to the effective potential. Neglecting all other terms, you end up with the equation

$$\frac{d^2u_{n_r,l}}{d\zeta^2} - \frac{l(l + 1)}{\zeta^2}u_{n_r,l} = 0, \quad (7.74)$$

which has a simple power-law solution

$$u_{n_r,l} = A\zeta^{l+1}. \quad (7.75)$$

You are welcome to plug it back into Eq. 7.74 and verify it by yourself. For those who think that I used magical divination to arrive at this result, I have disappointing news: Eq. 7.74 belongs to a well-known class of so-called homogeneous equations. This means that if I multiply $\zeta$ by an arbitrary constant factor $\mu$, the equation does not change (check it), with the consequence that if the function $u(\zeta)$ is a solution, so is the function $u(\mu\zeta)$. Such equations are solved by power functions $u \propto \zeta^\varrho$, where the power $\varrho$ is found by plugging this function into the equation (here $\varrho(\varrho - 1) = l(l + 1)$, and the second root, $\varrho = -l$, must be discarded because it violates the boundary condition at the origin).

In the limit of large $\zeta \gg 1$, the main contribution to Eq. 7.73 comes from the harmonic potential. We know from solving the one-dimensional problem that the respective wave functions contain an exponential term $\exp\left(-\tilde{x}^2/2\right)$ for the $x$ direction and similar terms for the two other coordinates. When multiplying all these wave functions together to obtain a three-dimensional wave function, these exponential terms turn into $\exp\left(-\zeta^2/2\right)$; thus it is natural to expect that the radial function $u_{n_r,l}$ will contain such a factor as well. To verify this assumption, I am going to substitute $\exp\left(-\zeta^2/2\right)$ into Eq. 7.73 and see if it satisfies the equation, at least in the limit $\zeta \to \infty$. Neglecting all terms small compared to $\zeta^2$, I find

$$\frac{d^2u}{d\zeta^2} = -e^{-\zeta^2/2} + \zeta^2 e^{-\zeta^2/2} \approx \zeta^2 e^{-\zeta^2/2}.$$

Substituting this result into Eq. 7.73 and neglecting all terms except the harmonic potential, I find that this function is, indeed, an asymptotically accurate solution of the equation. I want you to really appreciate this result: in order to reproduce the exponential decay of the wave function, which, by the way, almost ensures its normalizability, using a power series, we would have to keep track of all the infinite number of terms in it, which is quite difficult if not outright impossible. By pulling out this exponential term, as well as the power law for small $\zeta$, you might entertain some hope that the remaining dependence on $\zeta$ is simple enough to be dug out.

Thus, my next step is to present function $u_{n_r,l}(\xi)$ as

$$u_{n_r,l}(\xi) = A\,\xi^{\,l+1}\exp(-\xi^2/2)\,v_{n_r,l}(\xi) \qquad (7.76)$$

7.2 Isotropic Three-Dimensional Harmonic Oscillator 233

and derive a differential equation for the remaining function $v_{n_r,l}(\xi)$. To this end, I
first compute

$$\begin{aligned}
\frac{d^2 u_{n_r,l}}{d\xi^2} &= \frac{d}{d\xi}\Bigl[(l+1)\,\xi^{l}e^{-\xi^2/2}v_{n_r,l}(\xi) - \xi^{l+2}e^{-\xi^2/2}v_{n_r,l}(\xi) + \xi^{l+1}e^{-\xi^2/2}\frac{dv_{n_r,l}}{d\xi}\Bigr] \\
&= l(l+1)\,\xi^{l-1}e^{-\xi^2/2}v_{n_r,l}(\xi) - (l+1)\,\xi^{l+1}e^{-\xi^2/2}v_{n_r,l}(\xi) + (l+1)\,\xi^{l}e^{-\xi^2/2}\frac{dv_{n_r,l}}{d\xi} \\
&\quad - (l+2)\,\xi^{l+1}e^{-\xi^2/2}v_{n_r,l}(\xi) + \xi^{l+3}e^{-\xi^2/2}v_{n_r,l}(\xi) - \xi^{l+2}e^{-\xi^2/2}\frac{dv_{n_r,l}}{d\xi} \\
&\quad + (l+1)\,\xi^{l}e^{-\xi^2/2}\frac{dv_{n_r,l}}{d\xi} - \xi^{l+2}e^{-\xi^2/2}\frac{dv_{n_r,l}}{d\xi} + \xi^{l+1}e^{-\xi^2/2}\frac{d^2 v_{n_r,l}}{d\xi^2} \\
&= e^{-\xi^2/2}\,\xi^{l-1}v_{n_r,l}(\xi)\bigl[l(l+1) - \xi^2(2l+3) + \xi^4\bigr] \\
&\quad + \xi^{l}e^{-\xi^2/2}\frac{dv_{n_r,l}}{d\xi}\bigl(2l+2-2\xi^2\bigr) + \xi^{l+1}e^{-\xi^2/2}\frac{d^2 v_{n_r,l}}{d\xi^2}.
\end{aligned}$$

Frankly speaking, I did not have to torture you with these tedious calculations: such
computational platforms as Mathematica or Maple work with symbolic expressions
and can perform this computation faster and more reliably (and, yes, I did check my
result against Mathematica's). Substituting this expression into Eq. 7.73, I get (and
here you are on your own, or you can try computer algebra to reproduce this result)

$$\xi\frac{d^2 v_{n_r,l}}{d\xi^2} + 2\bigl(l+1-\xi^2\bigr)\frac{dv_{n_r,l}}{d\xi} + \xi\,v_{n_r,l}(\xi)\bigl(\lambda_{l,n_r} - 2l - 3\bigr) = 0. \qquad (7.77)$$
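If you would rather not reproduce Eq. 7.77 by hand, a computer algebra check works just as well. The SymPy sketch below is my own verification aid; it assumes Eq. 7.73 has the form $d^2u/d\xi^2 - l(l+1)u/\xi^2 - \xi^2 u + \lambda u = 0$ with $\lambda$ the dimensionless energy $\lambda_{l,n_r}$:

```python
import sympy as sp

xi = sp.symbols('xi', positive=True)
l, lam = sp.symbols('l lam', positive=True)
v = sp.Function('v')

# ansatz of Eq. 7.76 (the constant A drops out)
u = xi**(l + 1) * sp.exp(-xi**2/2) * v(xi)

# left-hand side of Eq. 7.73 (assumed form, see above)
lhs73 = sp.diff(u, xi, 2) - l*(l + 1)/xi**2*u - xi**2*u + lam*u

# divide out the common factor xi**l * exp(-xi**2/2)
reduced = sp.simplify(lhs73 * sp.exp(xi**2/2) / xi**l)

# Eq. 7.77
target = (xi*sp.diff(v(xi), xi, 2)
          + 2*(l + 1 - xi**2)*sp.diff(v(xi), xi)
          + xi*(lam - 2*l - 3)*v(xi))

print(sp.simplify(sp.expand(reduced) - sp.expand(target)))   # -> 0
```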

Now I can start solving this equation by presenting the unknown function $v_{n_r,l}(\xi)$
as a power series and trying to find the corresponding coefficients:

$$v_{n_r,l}(\xi) = \sum_{j=0}^{\infty} c_j\,\xi^{j}. \qquad (7.78)$$

The goal is to plug this expression into Eq. 7.77 and collect coefficients in front of
equal powers of $\xi$. First, I blindly substitute the series into Eq. 7.77 and separate all
the sums with different powers of $\xi$:

234 7 Harmonic Oscillator Models

$$\sum_{j=0}^{\infty} c_j\,j(j-1)\,\xi^{j-1} + 2(l+1)\sum_{j=0}^{\infty} c_j\,j\,\xi^{j-1} - 2\sum_{j=0}^{\infty} c_j\,j\,\xi^{j+1} + \bigl(\lambda_{l,n_r} - 2l - 3\bigr)\sum_{j=0}^{\infty} c_j\,\xi^{j+1} = 0.$$

Combining the first two and last two sums, I get

$$\sum_{j=0}^{\infty} j\,\bigl[\,j - 1 + 2l + 2\,\bigr]c_j\,\xi^{j-1} + \sum_{j=0}^{\infty}\bigl[\lambda_{l,n_r} - 2l - 3 - 2j\bigr]c_j\,\xi^{j+1} = 0.$$

Next I notice that in the first sum, the contribution from the term with $j = 0$ vanishes, so
that this sum starts with $j = 1$. I can reset the count of the summation index back to
zero by introducing a new index $k = j - 1$, so that this sum becomes

$$\sum_{k=0}^{\infty} c_{k+1}\,(k+1)(k+2l+2)\,\xi^{k}.$$

Renaming $k$ back to $j$ (this is a dummy index, so you can call it whatever you want,
it does not care), we rewrite the previous equation as

$$\sum_{j=0}^{\infty}(j+1)(j+2l+2)\,c_{j+1}\,\xi^{j} + \sum_{j=0}^{\infty}\bigl[\lambda_{l,n_r} - 2l - 3 - 2j\bigr]c_j\,\xi^{j+1} = 0.$$

The first sum in this expression begins with the $\xi^0$ term multiplied by coefficient $c_1$.
The second sum, however, begins with a term linear in $\xi$ and does not contain $\xi^0$ at
all. To satisfy the equation, the coefficients in front of each power of $\xi$ must vanish
independently of each other, so we have to set $c_1 = 0$. This makes the first sum
again start with $j = 1$. Utilizing the same trick as before, I replace $j$ with
$j + 1$ while restarting the count from a new $j = 0$ again. The result is as follows:

$$\sum_{j=0}^{\infty}(j+2)(j+2l+3)\,c_{j+2}\,\xi^{j+1} + \sum_{j=0}^{\infty}\bigl[\lambda_{l,n_r} - 2l - 3 - 2j\bigr]c_j\,\xi^{j+1} = 0.$$

Now I can, finally, combine the two sums and equate the resulting coefficient in
front of $\xi^{j+1}$ to zero:

$$(j+2)(j+2l+3)\,c_{j+2} = \bigl[2l+3+2j-\lambda_{l,n_r}\bigr]c_j$$


or

$$c_{j+2} = \frac{2l+3+2j-\lambda_{l,n_r}}{(j+2)(j+2l+3)}\,c_j. \qquad (7.79)$$

This is a so-called recursion relation, which allows computing all expansion
coefficients recursively starting with the first one. It is important to note that Eq. 7.79
connects only coefficients with indexes of the same parity: all coefficients with
even indexes are expressed in terms of $c_0$, and all coefficients with odd indexes
are expressed in terms of $c_1$. But wait, did we not determine a few lines back that
$c_1 = 0$? We certainly did, and now, thanks to Eq. 7.79, I can establish
that not only $c_1$ but all coefficients with odd indexes are zero. So, it looks like I
achieved the announced goal of finding all coefficients in the power series expansion
of $v_{n_r,l}$. Formally speaking, I did, indeed, but it is a bit too early to dance around
the fire and celebrate. First, I still do not know what values of the dimensionless
energy $\lambda_{l,n_r}$ correspond to the respective eigenvectors, and, second, I have to verify
that the found solution is, indeed, normalizable. The last issue is not trivial because
we are dealing with an infinite series here, so there are always questions about its
convergence and the behavior of the function it represents. As I shall demonstrate
now, these two questions are connected and will be answered together.
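The recursion of Eq. 7.79 is easy to run on a computer. The sketch below is my own (the function name is mine); it uses exact rational arithmetic and the terminating eigenvalue $\lambda_{l,n_r} = 3 + 2(l + 2n_r)$ that will emerge shortly, so you can watch the odd coefficients vanish and the series terminate:

```python
from fractions import Fraction

def radial_coeffs(l, n_r, c0=Fraction(1)):
    """Coefficients c_j of Eq. 7.78 via the recursion Eq. 7.79,
    with lam = 3 + 2*(l + 2*n_r) assumed for termination."""
    lam = 3 + 2*(l + 2*n_r)
    c = [c0, Fraction(0)]                 # c1 = 0 kills all odd terms
    for j in range(2*n_r + 2):
        num = 2*l + 3 + 2*j - lam
        den = (j + 2)*(j + 2*l + 3)
        c.append(Fraction(num, den) * c[j])
    return c

cs = radial_coeffs(l=0, n_r=1)
print(cs[:4])        # c2 = -2/(2l+3)*c0 = -2/3; odd coefficients are zero
print(cs[4])         # the series terminates: c4 = 0
```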

Whether a function is normalizable or not is determined by its behavior at large
values of its argument. I pulled out an exponentially decreasing factor from the
solution hoping that it would be sufficient to guarantee normalization, but to be
sure I need to consider the behavior of $v_{n_r,l}$ at $\xi \to \infty$. Any finite number of
terms in the expansion 7.78 cannot overcome the exponentially decreasing factor
$\exp(-\xi^2/2)$, so the anticipated danger can only come from the tail of the power
series, i.e., from coefficients $c_j$ with $j \to \infty$. In this limit the recursion relation 7.79
can be simplified to

$$c_{j+2} \approx \frac{2}{j}\,c_j, \qquad (7.80)$$

which, when applied repeatedly, yields

$$c_{2j_0+2N} = \frac{2^{N}}{2j_0(2j_0+2)\cdots(2j_0+2N-2)}\,c_{2j_0} = \frac{1}{j_0(j_0+1)\cdots(j_0+N-1)}\,c_{2j_0}.$$

When writing this expression, I explicitly took into account that only coefficients with even
indexes survive, which can be presented as $2j_0 + 2k$, with $N$ recursive steps
connecting $c_{2j_0}$ to $c_{2j_0+2N}$. Even though this expression is only valid for $j_0 \gg 1$, I can extend it to
all values of $j_0$ because, as I pointed out earlier, any finite number of terms in the
power series does not affect its asymptotic behavior. That means that the large-$\xi$
behavior of the series in question is the same as that of the series

$$\sum_{j=0}^{\infty}\frac{\xi^{2j}}{j!} = e^{\xi^2}.$$


Even after combining this result with the $\exp(-\xi^2/2)$ factor, which was pulled out
earlier, I still end up with the function $u_{n_r,l}(\xi)$ behaving as $\exp(\xi^2/2)$ at infinity. What
a bummer! It is disappointing, but not really surprising: it is easy to check that
$\exp(\xi^2/2)$ is the second possible asymptotic solution of Eq. 7.73, which I chose
to discard because of its non-normalizable nature. Well, this is how it often is: you
chase math out of the door, but it always comes back through the window to bite
you. So, the question now is if there is anything I can do to save the normalizability
of our solution. The light at the end of the tunnel will appear if you recall that
a power series with a finite number of terms cannot overpower an exponentially
decreasing function. Therefore, if I find a way to terminate the series at some finite
number of terms, our conundrum will be resolved. To see how this is possible, let's
take another look at the recursion relation, Eq. 7.79. What if at some value of $j$,
which I will call $2n_r$ to emphasize its evenness, the numerator of this relation turns
to zero? If this were to happen, then the coefficient $c_{2n_r+2}$ would vanish and vanquish
all subsequent coefficients as well, so that instead of an infinite series, I would end
up with a finite sum. This would surely guarantee the normalizability of the found
solution. The condition for the numerator to vanish reads

$$2l + 3 + 4n_r - \lambda_{l,n_r} = 0,$$

which is immediately recognizable as an equation for the dimensionless energy $\lambda_{l,n_r}$!
While resolving the normalization problem, I automatically solved the eigenvalue
problem as well. Using

$$\lambda_{l,n_r} = 3 + 2(l + 2n_r)$$

as well as the relation between $\lambda_{l,n_r}$ and the actual energy eigenvalues, I obtain

$$E_{l,n_r} = \hbar\omega\Bigl(\frac{3}{2} + l + 2n_r\Bigr).$$

Thus, for each $l$ and $n_r$, you have an energy value and a respective wave function

$$u_{l,n_r}(\xi) = \xi^{\,l+1}\exp(-\xi^2/2)\sum_{j=0}^{2n_r} c_j\,\xi^{j} \qquad (7.81)$$

where coefficients cj are given by Eq. 7.79. To get a better feeling for this result,
consider a few special examples.

1. $n_r = 0$. In this case the sum in Eq. 7.81 contains the single term $c_0$, so the
non-normalized wave function becomes

$$u_{l,0}(\xi) = c_0\,\xi^{\,l+1}\exp(-\xi^2/2)$$

with the respective energy value $E_{l,0} = \hbar\omega\bigl(\tfrac{3}{2} + l\bigr)$.


2. $n_r = 1$. Using Eq. 7.79 with $\lambda_{l,1} = 3 + 2(l + 2)$, I find for $c_2$ (substituting $j = 0$
into Eq. 7.79)

$$c_2 = \frac{2l+3-(3+2l+4)}{2(2l+3)}\,c_0 = -\frac{2}{2l+3}\,c_0,$$

so that

$$u_{l,1}(\xi) = c_0\,\xi^{\,l+1}\exp(-\xi^2/2)\Bigl(1 - \frac{2\xi^2}{2l+3}\Bigr).$$

Following this pattern you can compute the wave functions belonging to any
eigenvalue. For higher energy eigenvalues, it would take more time and effort,
of course, but you can always give this task to a computer. Before finishing this
section, I would like to note that the energy eigenvalues depend only on the sum
$l + 2n_r$ rather than on each of these quantum numbers separately. It makes sense,
therefore, to introduce a main quantum number $n = l + 2n_r$ and use it to characterize
the energy values:

$$E_n = \hbar\omega\Bigl(\frac{3}{2} + n\Bigr). \qquad (7.82)$$

Then, the radial wave functions will be labeled by indexes $l$ and $n$ with the requirement
$n - l = 2n_r \geq 0$, while the total wave function includes spherical harmonics and an
additional index $m$. In actual physical variables, it becomes

$$\psi_{n,l,m} = \frac{1}{\sigma}\Bigl(\frac{r}{\sigma}\Bigr)^{l}\exp\Bigl(-\frac{r^2}{2\sigma^2}\Bigr)\sum_{j=0}^{n-l} c_j\Bigl(\frac{r}{\sigma}\Bigr)^{j}\,Y_l^m(\theta,\varphi) \qquad (7.83)$$

where I reintroduced the radial function $R_{n_r,l} = u_{n_r,l}/r$. This function is not normalized
until the value of the coefficient $c_0$ in its radial part is defined, but I am not going to
bother you with that. Instead, I will compute the degree of degeneracy of an energy
eigenvalue characterized by the main number $n$, which is much more fun. Taking into
account that for each $l$ there are $2l + 1$ possible values of $m$, and that $l$ runs all the
way down from $n$ in increments of 2 ($n - l$ must remain an even number), the total
number of states with a given $n$ is

$$\sum_l (2l+1) = (n+1) + n(n+1)/2 = (n+1)(n+2)/2,$$

where the summation over $l$ is carried out in increments of 2. It is a nice feeling
to realize that this expression for the degeneracy agrees with the one obtained using
Cartesian coordinates.
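The degeneracy count above is easy to confirm by brute force; here is a minimal sketch (my own check, with a function name of my choosing):

```python
def degeneracy(n):
    # l runs down from n in steps of 2 (n - l must stay even),
    # and each l contributes 2l + 1 values of m
    return sum(2*l + 1 for l in range(n, -1, -2))

for n in range(6):
    assert degeneracy(n) == (n + 1)*(n + 2)//2

print([degeneracy(n) for n in range(6)])   # [1, 3, 6, 10, 15, 21]
```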


The resulting expression for the wave function given by Eq. 7.83 is an alternative
way to produce a position representation of the harmonic oscillator wave function
and is quite remarkably different from the one obtained using Cartesian coordinates.
One might wonder why it is at all possible to have such distinct ways to represent the
same eigenvector. After all, isn't a representation, once chosen, supposed to provide
a unique way to describe a quantum state? As a matter of fact, it is, indeed,
so only if the corresponding eigenvalue is non-degenerate. In the degenerate case, one
can form an infinite number of linear combinations of the eigenvectors, and any
one of them will realize the same representation of the corresponding state. In the
case of the isotropic harmonic oscillator, it means that the wave functions expressed
in spherical coordinates can be presented as linear combinations of their Cartesian
counterparts and vice versa.

7.3 Quantization of Electromagnetic Field and Harmonic
Oscillators

7.3.1 Electromagnetic Field as a Harmonic Oscillator

Even though the idea of photons, the quanta of the electromagnetic field, was one
of the first quantum ideas introduced into the consciousness of physicists by Einstein
in 1905,1 the full quantum description of the electromagnetic field turned out to
be a rather difficult problem. The first serious attempt at developing quantum
electrodynamics was undertaken by Paul Dirac in his famous 1927 paper,2 which
was just the beginning of a long and difficult path walked by too many brilliant
physicists to be mentioned in this book. Here are just a few names of those who
made critical theoretical contributions to this field: German-American Hans Bethe,
Japanese Sin-Itiro Tomonaga, and Americans Julian Schwinger, Richard Feynman,
and Freeman Dyson. Quantum electrodynamics is a difficult subject addressed in
multiple specialized books and is beyond the scope of this text. Nevertheless, I
would love to scratch a bit from the surface of this field and demonstrate how
ideas developed in the course of studying the harmonic oscillator emerge in new
and unexpected places.

1The irony is that an explanation of the photoelectric effect did not require the quantization of light,
despite what you might have read or heard. All experimental data could have been explained by
treating light classically while describing electrons in metals by the Schrödinger equation.
Fortunately, Einstein did not have the Schrödinger equation in 1905 and couldn't know that.
Science does evolve in mysterious ways: Einstein's erroneous idea about the photoelectric effect
inspired de Broglie and Schrödinger and brought about the Schrödinger equation, which could
have been used to disprove the idea. The Compton effect, on the other hand, can indeed be considered
a proof of the reality of photons.
2P.A.M. Dirac, The quantum theory of the emission and absorption of radiation. Proc. R. Soc.
Lond. 114, 243 (1927).

7.3 Quantization of Electromagnetic Field and Harmonic Oscillators 239

To this end, I propose considering a toy model of the electromagnetic field, in which
the field is described by single components of the electric and magnetic fields:

$$E_x = a\,E_0(t)\sin kz \qquad (7.84)$$

$$B_y = -\frac{1}{c}\,a\,B_0(t)\cos kz \qquad (7.85)$$

where I introduced a normalization coefficient $a$ to be defined later; the extra factor
$1/c$, where $c$ is the speed of light in vacuum, in the formula for the magnetic field
ensures that the amplitudes $E_0$ and $B_0$ have the same dimension (you might remember
from an introductory course on electromagnetism the relation $E = cB$ between the electric
and magnetic fields in a plane wave), and the negative sign is included for future
convenience. The Maxwell equations for the electromagnetic field in this simplified
case take the form

$$\frac{\partial E_x}{\partial z} = -\frac{\partial B_y}{\partial t}, \qquad \frac{\partial B_y}{\partial z} = -\frac{1}{c^2}\frac{\partial E_x}{\partial t}.$$

Plugging in the expressions for electric and magnetic fields given by Eqs. 7.84
and 7.85, you will find that the spatial dependence chosen for the fields in these
equations is indeed consistent with the Maxwell equations, which will be reduced
to the system of ordinary differential equations:

dB0
dt

D !E0.t/ (7.86)
dE0
dt

D �!B0.t/: (7.87)

The parameter $\omega$ appearing in these equations is defined as $\omega = ck$. It is easy to see that
the amplitudes of both electric and magnetic fields obey the same differential equation
as a harmonic oscillator. For instance, differentiating the first of these equations with
respect to time and using the second equation to replace the time derivative of the
electric field, you will get

$$\frac{d^2 B_0}{dt^2} + \omega^2 B_0 = 0.$$

A similar equation can be derived for $E_0$. You can also notice that Eqs. 7.86 and 7.87
bear some resemblance to the Hamiltonian equations of classical mechanics,
and this may make you wonder if they can be derived from some kind of
Hamiltonian. If you are asking why on earth I would want to re-derive these
equations from a Hamiltonian, you were not paying attention to the first 130 pages of
the book. The Hamiltonian formalism allows us to introduce canonical pairs of variables,


which we can turn into operators obeying canonical commutation relations; thus a
Hamiltonian formulation is the key to turning classical theory of electromagnetic
field into the quantum one.
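To see the classical oscillation explicitly, you can check with SymPy (my own aid, not from the text) that the pair $E_0 = \cos\omega t$, $B_0 = \sin\omega t$, one convenient solution, satisfies Eqs. 7.86 and 7.87 and hence the oscillator equation:

```python
import sympy as sp

t, w = sp.symbols('t omega', positive=True)
E0 = sp.cos(w*t)     # a convenient particular solution pair
B0 = sp.sin(w*t)

# Eq. 7.86: dB0/dt = omega*E0, and Eq. 7.87: dE0/dt = -omega*B0
assert sp.simplify(sp.diff(B0, t) - w*E0) == 0
assert sp.simplify(sp.diff(E0, t) + w*B0) == 0

# each amplitude then obeys the harmonic-oscillator equation
assert sp.simplify(sp.diff(B0, t, 2) + w**2*B0) == 0
print("amplitudes oscillate harmonically at omega = c*k")
```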

How would one go about introducing a Hamiltonian for the electromagnetic
field? Naturally, one starts by remembering that the Hamiltonian is the energy of the
system and that the energy of the electromagnetic field is given by

$$H = \int_V d^3r\,\Bigl(\frac{1}{2}\varepsilon_0 E^2 + \frac{1}{2\mu_0}B^2\Bigr), \qquad (7.88)$$

where integration is carried over the entire space occupied by the field. However, if
you attempt to directly compute this integral using Eqs. 7.84 and 7.85 for electric
and magnetic fields, you will encounter a problem: the integral is infinite. This
happens because the field occupies the entire infinite space and does not decrease
with distance. To fix the problem, I introduce a large but finite region of volume
$V = L_z S_{xy}$, where $L_z$ is the linear dimension of this region in the $z$ direction and $S_{xy}$ is
the area of the limiting plane perpendicular to it, and assume that the field vanishes
outside of this region. This trick is very popular in physics, and you will encounter
it in different circumstances later in the book. It can be justified by noting that the
notion of a field occupying the entire space is by itself quite artificial with no relation
to reality. It is also natural to assume that the properties of the field far away from
the region of actual interest should not affect any observable phenomena, so that we
can choose them to be as convenient for us as possible.

With this in mind, I can write the integral in Eq. 7.88 as

$$H = a^2 S_{xy}\Biggl[\frac{1}{2}\varepsilon_0 E_0^2\int_0^L dz\,\sin^2 kz + \frac{1}{2\mu_0 c^2}\,B_0^2\int_0^L dz\,\cos^2 kz\Biggr] = \frac{1}{4}a^2\varepsilon_0 S_{xy}L\bigl(E_0^2 + B_0^2\bigr),$$

where I assumed that $k$ satisfies the condition $kL = \pi n,\ n = 1, 2, \ldots$, making $\cos 2kz = 1$
at both the lower and upper integration limits so that the respective terms cancel
out. Also, at the last step, I made the substitution $(\mu_0\varepsilon_0)^{-1} = c^2$. You might, of
course, object to the artificial discretization of the wave number and the imposition of
arbitrary conditions on the values of the electric and magnetic fields at $z = L$.
So, what can I say in my defense? First, in the limit $L \to \infty$, which I can take after
everything is said and done, the discretization will disappear, and as you will see in
a few short minutes, I will make the dependence on the volume, which popped up in
the last expression for the Hamiltonian, disappear as well. Second, I can invoke the
same argument I just made when limiting the field to a finite volume: the behavior
of the field in any finite region of space shall not be affected by its values at an
infinitely remote plane. If you are still not convinced, I have my last line of defense:
it works!
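A quick symbolic check of the two integrals (my own aid; it builds in the discretization $kL = \pi n$) confirms that both equal $L/2$:

```python
import sympy as sp

z, L = sp.symbols('z L', positive=True)
n = sp.symbols('n', integer=True, positive=True)
k = sp.pi*n/L                    # discretization condition k*L = pi*n

I_sin = sp.integrate(sp.sin(k*z)**2, (z, 0, L))
I_cos = sp.integrate(sp.cos(k*z)**2, (z, 0, L))
print(sp.simplify(I_sin), sp.simplify(I_cos))   # both reduce to L/2
```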


Now, I am ready to fix the normalization parameter a introduced in Eqs. 7.84
and 7.85. For the reasons which will become clear later, I will choose it to be

$$a = \sqrt{2\omega/(\varepsilon_0 V)}, \qquad (7.89)$$

so that the final expression for the energy of the field becomes

$$H = \frac{\omega}{2}\bigl(E_0^2 + B_0^2\bigr). \qquad (7.90)$$

Did you notice that the dependence on the volume in the Hamiltonian is gone? This
is a fiction, of course, because I have simply hidden it inside the formulas for the fields,
but in all expressions concerned with actual physical observables, it will vanish in
all honesty.

Equation 7.90 looks very much like the Hamiltonian of a harmonic oscillator.
The first term can be interpreted as kinetic energy with $E_0$ playing the role of the
canonical momentum and the factor $1/\omega$ replacing the mass, and the second term is an
analog of the potential energy with $B_0$ as the conjugate coordinate (note that the
coefficient $m_e\omega^2/2$ in the harmonic oscillator potential energy is reduced to the $\omega/2$
factor in Eq. 7.90 if you replace $m_e$ with $1/\omega$). If you wonder why I chose the electric
field to represent the momentum and the magnetic field to be the coordinate, and not
vice versa, just compare Eqs. 7.86 and 7.87 with Hamiltonian equations 7.8 and 7.7,
paying attention to the placement of the negative sign in these equations. You can
easily see that the Hamiltonian equations reproduce Eqs. 7.86 and 7.87, justifying
this identification. But do not be fooled: identifying the magnetic field with the coordinate
and the electric field with the momentum is, of course, a matter of convention resulting
from the choice to place the negative sign in Eq. 7.85.

The Hamiltonian formulation of the classical Maxwell equations allows me now
to introduce the quantum description of the fields. This is done by promoting $E_0$ and
$B_0$ to operators with the standard canonical commutation relation:

$$\bigl[\hat{B}_0, \hat{E}_0\bigr] = i\hbar. \qquad (7.91)$$

As a result, the classical Hamiltonian, Eq. 7.90, becomes a Hamiltonian operator:

$$\hat{H} = \frac{\omega}{2}\bigl(\hat{E}_0^2 + \hat{B}_0^2\bigr). \qquad (7.92)$$

It is easy to see from Eq. 7.90 that both $E_0$ and $B_0$ have the dimension of
$\sqrt{\text{energy}\cdot\text{time}}$, so that the dimension of the commutator on the left-hand side of
Eq. 7.91 is energy $\times$ time, which coincides with the dimension of Planck's constant,
as it should. This result is not particularly surprising, of course, but it is always
useful to check your dimensions once in a while just to make sure that your theory
does not have any of the most basic problems. Using Eq. 7.91 together with Eqs. 7.84
and 7.85, I can compute the commutator of the non-zero components of the electric


and magnetic fields, which, of course, are now also operators:

$$\bigl[\hat{B}_y, \hat{E}_x\bigr] = -\frac{i\hbar\omega}{\varepsilon_0 c V}\sin 2kz. \qquad (7.93)$$

One immediate consequence of this result is the uncertainty relation for these
components:

$$\Delta B_y\,\Delta E_x \geq \frac{\hbar\omega}{2\varepsilon_0 c V}\,\bigl|\sin 2kz\bigr|, \qquad (7.94)$$

which shows that, just like the coordinate and momentum, the electric and magnetic
fields cannot both be known with certainty in the same quantum state.

The canonical commutator, Eq. 7.91, also indicates that in the representation using
eigenvectors of $\hat{B}_0$ as a basis, in which states are represented by wave functions
dependent on the magnetic field amplitude $B_0$, the electric
field amplitude operator $\hat{E}_0$ is represented as

$$\hat{E}_0 = -i\hbar\frac{\partial}{\partial B_0},$$

while the Hamiltonian takes the form

$$\hat{H} = \frac{\omega}{2}\Bigl(-\hbar^2\frac{\partial^2}{\partial B_0^2} + B_0^2\Bigr).$$

Comparing this expression with the quantum Hamiltonian of the harmonic oscillator
in the coordinate representation, you can see that they are mathematically identical
if you again replace $m_e$ with $1/\omega$. The wave functions representing eigenvectors of
this Hamiltonian can in this representation be written down as

$$\varphi_n(B_0) = \frac{1}{\sqrt{2^n n!\,\sigma_{em}\sqrt{\pi}}}\,\exp\Bigl(-\frac{B_0^2}{2\sigma_{em}^2}\Bigr)H_n\Bigl(\frac{B_0}{\sigma_{em}}\Bigr) \qquad (7.95)$$

where the characteristic scale of the quantum fluctuations of the magnetic field,
$\sigma_{em}$, is determined solely by Planck's constant: $\sigma_{em} = \sqrt{\hbar}$ (this result follows from
Eq. 7.42 after the substitution $m_e = 1/\omega$). As with any wave function, $|\varphi_n(B_0)|^2$
determines the probability density function for the magnetic field amplitude.
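As a numerical sanity check (my own, in units where $\hbar = 1$, so $\sigma_{em} = 1$; the function name is mine), the wave functions of Eq. 7.95 can be evaluated with NumPy's physicists' Hermite polynomials and shown to be normalized:

```python
import numpy as np
from numpy.polynomial.hermite import hermval
from math import factorial, pi, sqrt

hbar = 1.0                      # work in units where hbar = 1
sigma = sqrt(hbar)              # sigma_em = sqrt(hbar)

def phi(n, B0):
    """phi_n(B0) of Eq. 7.95, using physicists' Hermite H_n."""
    coeffs = [0]*n + [1]                     # selects H_n in hermval
    norm = 1.0/sqrt(2**n * factorial(n) * sigma * sqrt(pi))
    return norm * np.exp(-B0**2/(2*sigma**2)) * hermval(B0/sigma, coeffs)

B = np.linspace(-10, 10, 20001)
dB = B[1] - B[0]
for n in range(4):
    total_prob = (phi(n, B)**2).sum()*dB     # should integrate to 1
    assert abs(total_prob - 1) < 1e-4
print("phi_n(B0) normalized for n = 0, 1, 2, 3")
```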

While it is interesting to see how one can turn the coordinate representation
of the harmonic oscillator into the magnetic field representation of the quantum
electromagnetic theory, the practical value of this representation is quite limited.
Much more important, from both theoretical and practical points of view, is the
opportunity to introduce electromagnetic analogs of the lowering and raising operators.
In order to distinguish these operators from those used in the harmonic oscillator
problem, I will use the notation $\hat b$ and $\hat b^{\dagger}$ (do not confuse these operators with the variables


b used in the description of the classical oscillator), where

$$\hat b = \sqrt{\frac{1}{2\hbar}}\,\hat B_0 + \frac{i\hat E_0}{\sqrt{2\hbar}} \qquad (7.96)$$

$$\hat b^{\dagger} = \sqrt{\frac{1}{2\hbar}}\,\hat B_0 - \frac{i\hat E_0}{\sqrt{2\hbar}}. \qquad (7.97)$$

Equations 7.96 and 7.97 are obtained from Eqs. 7.22 and 7.23 by setting $m_e\omega = 1$
and replacing $\hat x$ and $\hat p$ by $\hat B_0$ and $\hat E_0$ correspondingly. Hamiltonian 7.92 expressed in
terms of these operators acquires a familiar form:

$$\hat H = \hbar\omega\bigl(\hat b^{\dagger}\hat b + 1/2\bigr).$$

All commutators, which were computed in Sect. 7.1.1, remain exactly the same, so
I can simply reproduce the results from that section: the energy eigenvalues of the
electromagnetic field are given again by

$$E_n = \hbar\omega\Bigl(n + \frac{1}{2}\Bigr), \qquad (7.98)$$

while the eigenvectors can be constructed from the ground state $|0\rangle$ as

$$|n\rangle = \frac{1}{\sqrt{n!}}\bigl(\hat b^{\dagger}\bigr)^n|0\rangle. \qquad (7.99)$$

Formally, both these results are exactly the same as in the case of the harmonic
oscillator. However, the physical interpretation of the integer n in these expressions
and, therefore, of both energy values and eigenvectors is completely different.

Indeed, in the case of a harmonic oscillator, we have a material particle, which
can be placed in states with different energies, counted by the integer n. The
electromagnetic field, on the other hand, once created, carries a certain amount of
energy, and the same field cannot be made to have “more” energy. To produce a
field with higher energy, you need to increase its amplitude, i.e., add “more” field.
The discrete nature of allowed energy levels tells us that the energy of the field can
only be increased in finite increments: to go from a state of electromagnetic field
with energy En to the state with energy EnC1, you have to add a discrete “quantum”
of field with energy „!. This discrete energy quantum is what was introduced by
Einstein in 1905 as “das Lichtquantas.” Replacing the term “quantum of light”
with the term “photon,”3 you can say that number n is the number of photons in
a given state and that going from state jni to state jn C 1i amounts to generating

3It is interesting that the term “photon” was used for the first time in an obscure paper by an
American chemist Gilbert Lewis in 1926. His paper is forgotten, but the term he coined lives on.


or creating an extra photon, while transitioning to state $|n-1\rangle$ means removing
or annihilating a photon. To emphasize this point, in the context of quantum
electromagnetic field theory the operators $\hat b^{\dagger}$ and $\hat b$ are called "creation" and "annihilation"
operators, respectively, rather than raising and lowering operators. The ground state
$|0\rangle$ in this interpretation is the state with zero photons and is called, therefore, the
vacuum state. A counterintuitive aspect of the vacuum state is that even though
it is devoid of photons, it still has non-zero energy, which in our oversimplified
model is just $\hbar\omega/2$. At first sight this might appear to be a nonsensical result: how
can zero photons have non-zero energy? I hope it will not blow your mind away
if I say that in a more complete theory, which takes into account multiple modes
(waves with different wave vectors $\mathbf{k}$) of the electromagnetic field, the "vacuum" energy
might become formally infinite. In order to wrap your mind around this weird result,
consider the following.

The photon is not just "a quantum of electromagnetic field," as you might
have read in popular books and introductory physics texts. The concept of a
"photon" has a quite specific, mathematically rigorous meaning: a single photon
is an eigenvector of the electromagnetic Hamiltonian characterized by $n = 1$.
Eigenvectors characterized by higher values of $n$ describe $n$-photon states. The states
described by eigenvectors of the Hamiltonian are not states in which the electric
or magnetic field has any definite value. Moreover, the commutation relation,
Eq. 7.93, and the uncertainty relation 7.94 that follows from it indicate that there are no
states in which the electric and magnetic fields both have definite values. Furthermore, in
the states with fixed photon numbers, the expectation values of the electric and magnetic
fields are zero, just like the expectation values of the coordinate and momentum
operators of the mechanical harmonic oscillator. At the same time, the expectation
values of the squares of the fields are not zero, and these are the quantities which
determine the energy of the fields. These are what we call vacuum fluctuations of the
electromagnetic field, where "vacuum" has, again, a very specific meaning: it is not
just emptiness or a void; it is a state with zero photons, which is not the same as a
state with zero field.

The second issue which needs to be discussed in connection with the vacuum energy
is, again, the fact that a zero level of energy is always established arbitrarily. The
vacuum energy which we found is counted from the (non-existent in quantum
theory) state in which both electric and magnetic fields are presumed to be zero.
As long as the energy of the vacuum state does not change while the phenomena
we are interested in play out, we can set the vacuum energy to zero with no
consequences for any physically significant results. To provide a counterexample to
this statement, let me briefly describe a situation in which this assumption might not
be true. If you consider the electromagnetic field between two conducting plates,
the modes of the field and, therefore, its vacuum energy depend on the distance
between the plates. This distance can be changed, in which case the vacuum energy
also changes. Because of this capacity to change, the vacuum energy becomes physically
relevant, resulting in a tiny but observable attractive force acting between the plates
known as the Casimir force. In most other situations, however, the vacuum energy is
just a constant, whose value (finite or infinite) has no physical significance.


Thus, the eigenvectors of the electromagnetic Hamiltonian representing states
with a definite number of photons, $n$, bear little resemblance to classical
electromagnetic waves, just like the stationary states of the harmonic oscillator have no
relation to the motion of a classical pendulum. At the same time, in Sect. 7.1.2,
I demonstrated that a generic nonstationary state reproduces oscillations of the
expectation values of coordinate and momentum resembling those of their classical
counterparts. While this result is true for a generic initial state, and the behavior of
the expectation values to a large extent does not depend on its details, not all initial
states are created equal. However, to notice the difference between them, we have to
go beyond the expectation values and consider the uncertainties of both coordinate
and momentum or, in the electromagnetic context, of the electric and magnetic fields.
The fact that different initial states result in different behavior of the uncertainties
has already been demonstrated in the examples presented in Sect. 7.1.2. However,
out of the multitude of various initial states, there exists one for which these
uncertainties are minimized in the sense that their product has the smallest value
allowed by the uncertainty principle. In the electromagnetic case it means that the
sign $\geq$ in Eq. 7.94 is replaced with $=$. These states are called "coherent" states, and
they are much more important in the electrodynamic than in the mechanical context,
so this is where I shall deal with them.

7.3.2 Coherent States of the Electromagnetic Field

The coherent states are defined as eigenvectors of the annihilation operator:

$$\hat b\,|\alpha\rangle = \alpha\,|\alpha\rangle. \qquad (7.100)$$

Since the annihilation operator is not Hermitian, you should not expect its
eigenvalues to be real, and we do not know yet if they are continuous or discrete.
I can, however, try to find the representation of vectors $|\alpha\rangle$ in the basis of the
eigenvectors $|n\rangle$ of the electromagnetic Hamiltonian:

$$|\alpha\rangle = \sum_{n=0}^{\infty} c_n\,|n\rangle, \qquad (7.101)$$

where $c_n = \langle n|\alpha\rangle$. Hermitian conjugation of Eq. 7.99 yields

$$\langle n| = \frac{1}{\sqrt{n!}}\,\langle 0|\,\hat b^{\,n}, \qquad (7.102)$$

so that I find for the expansion coefficients

$$c_n = \frac{1}{\sqrt{n!}}\,\langle 0|\,\hat b^{\,n}|\alpha\rangle = \frac{\alpha^n}{\sqrt{n!}}\,\langle 0|\alpha\rangle.$$

The only unknown quantity here is $c_0 = \langle 0|\alpha\rangle$, which I find by requiring that $|\alpha\rangle$
is normalized, which means that $\sum_n |c_n|^2 = 1$. Applying this last condition, I have

$$|c_0|^2\sum_{n=0}^{\infty}\frac{|\alpha|^{2n}}{n!} = |c_0|^2\exp\bigl(|\alpha|^2\bigr) = 1,$$

where I recalled that $\sum (x^n/n!)$ is the power series expansion of the exponential
function of $x$. Thus, choosing $c_0$ to be real valued, I have the following final
expression for the expansion coefficients:

$$c_n = e^{-\frac{|\alpha|^2}{2}}\,\frac{\alpha^n}{\sqrt{n!}}. \qquad (7.103)$$

Equation 7.103 together with Eq. 7.101 completely defines a coherent state with
eigenvalue $\alpha$. Since the derivation of the eigenvector did not produce any restrictions
on $\alpha$, it must be presumed to be a continuous complex-valued variable. The vector
that I found describes a state which is a superposition of states with different
numbers of photons and, respectively, with different energies. Accordingly, the
number of photons in this case is a random quantity with a probability distribution
given by

$$p_n = |c_n|^2 = e^{-|\alpha|^2}\,\frac{|\alpha|^{2n}}{n!}. \qquad (7.104)$$

Equation 7.104 describes a well-known probability distribution, called the Poisson distribution, which appears in a large number of physical and mathematical problems. This distribution gives the probability that $n$ events will happen within some fixed interval (of time or of distance) provided that the probability of each event is independent of the occurrence of the others and all events happen at a constant rate (probability per unit time, length, or volume does not depend upon time or position). It describes, for instance, the probability that $n$ atoms will undergo radioactive decay within some time interval, or the number of uniformly distributed non-interacting gas molecules that will be found occupying some volume in space. For more examples of the Poisson distribution, just google it. The entire Poisson distribution depends on a single parameter, $|\alpha|^2$, whose physical meaning can be elucidated by computing the mean (or expectation value) of the number of photons $\bar{n}_\alpha$ in the state $|\alpha\rangle$:

$$\bar{n}_\alpha = \sum_{n=0}^{\infty} n\,p_n = e^{-|\alpha|^2} \sum_{n=0}^{\infty} n\,\frac{|\alpha|^{2n}}{n!} = e^{-|\alpha|^2} \sum_{n=1}^{\infty} \frac{|\alpha|^{2n}}{(n-1)!} = e^{-|\alpha|^2} \sum_{k=0}^{\infty} \frac{|\alpha|^{2(k+1)}}{k!} = e^{-|\alpha|^2}\,|\alpha|^2 \sum_{k=0}^{\infty} \frac{|\alpha|^{2k}}{k!} = e^{-|\alpha|^2}\,|\alpha|^2\,e^{|\alpha|^2} = |\alpha|^2,$$

where in the second line, I first took into account that the $n = 0$ term in the sum is multiplied by $n = 0$ and, therefore, does not contribute. Accordingly, I started the sum with $n = 1$, after which I introduced a new index $k = n - 1$, which reset the counter back to zero. As a result, I gained an extra factor $|\alpha|^2$, while the remaining sum became just an exponential function canceling out the normalization term $e^{-|\alpha|^2}$. This calculation shows that $|\alpha|^2$ has the meaning of the average number of photons in the state with eigenvalue $\alpha$. It is also interesting to compute the uncertainty of the number of photons in this state, $\Delta n = \sqrt{\left\langle (n - \bar{n}_\alpha)^2 \right\rangle} = \sqrt{\langle n^2 \rangle_\alpha - \bar{n}_\alpha^2}$. First, I compute $\langle n^2 \rangle_\alpha$:

$$\langle n^2 \rangle_\alpha = e^{-|\alpha|^2} \sum_{n=0}^{\infty} n^2\,\frac{|\alpha|^{2n}}{n!} = e^{-|\alpha|^2} \sum_{n=1}^{\infty} n\,\frac{|\alpha|^{2n}}{(n-1)!} = e^{-|\alpha|^2} \sum_{k=0}^{\infty} \frac{(k+1)\,|\alpha|^{2(k+1)}}{k!} = e^{-|\alpha|^2}\,|\alpha|^2 \sum_{k=0}^{\infty} \frac{|\alpha|^{2k}}{k!} + e^{-|\alpha|^2}\,|\alpha|^2 \sum_{k=0}^{\infty} \frac{k\,|\alpha|^{2k}}{k!} = |\alpha|^2 + |\alpha|^4,$$

where I used the same trick with the sum as above, twice. Now I can find that $\Delta n = \sqrt{\bar{n}_\alpha}$. The relative uncertainty of the photon number, $\Delta n/\bar{n}_\alpha = 1/\sqrt{\bar{n}_\alpha}$, becomes progressively smaller as the average number of photons increases. The decrease of the quantum fluctuations signifies a transition to classical behavior, and one can suppose, therefore, that in the limit $\bar{n}_\alpha \gg 1$, the electric and magnetic fields in this state will reproduce behavior typical of a classical electromagnetic wave. To verify this assumption, I will compute the expectation values and uncertainties of the electric and magnetic fields for this state, as well as consider their time dependence.
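The statistics just derived can be confirmed directly from the Poisson law of Eq. 7.104. In this Python sketch the tested values of $|\alpha|^2$ and the truncation `nmax` are arbitrary illustrative choices; the recursion for $p_n$ avoids computing large factorials:

```python
import math

def photon_stats(alpha_sq, nmax=200):
    """Mean and standard deviation of n under the Poisson law p_n of Eq. 7.104."""
    p, probs = math.exp(-alpha_sq), []
    for n in range(nmax + 1):
        probs.append(p)
        p *= alpha_sq / (n + 1)  # p_{n+1} = p_n |alpha|^2/(n+1)
    mean = sum(n * pn for n, pn in enumerate(probs))
    var = sum(n * n * pn for n, pn in enumerate(probs)) - mean ** 2
    return mean, math.sqrt(var)

for a2 in (1.0, 4.0, 25.0):
    mean, dn = photon_stats(a2)
    # mean = |alpha|^2, dn = sqrt(mean), and dn/mean shrinks as 1/sqrt(mean)
    print(a2, round(mean, 6), round(dn, 6), round(dn / mean, 6))
```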

Reversing Eqs. 7.96 and 7.97, I find for the fields

$$\hat{B}_0 = \sqrt{\frac{\hbar}{2}}\left(\hat{b} + \hat{b}^\dagger\right) \qquad (7.105)$$

$$\hat{E}_0 = i\sqrt{\frac{\hbar}{2}}\left(\hat{b}^\dagger - \hat{b}\right). \qquad (7.106)$$

Taking squares of these expressions yields

$$\hat{B}_0^2 = \frac{\hbar}{2}\left(\hat{b}^2 + \hat{b}^{\dagger 2} + \hat{b}\hat{b}^\dagger + \hat{b}^\dagger\hat{b}\right) = \frac{\hbar}{2}\left(\hat{b}^2 + \hat{b}^{\dagger 2} + 2\hat{b}^\dagger\hat{b} + 1\right) \qquad (7.107)$$

$$\hat{E}_0^2 = -\frac{\hbar}{2}\left(\hat{b}^2 + \hat{b}^{\dagger 2} - \hat{b}\hat{b}^\dagger - \hat{b}^\dagger\hat{b}\right) = -\frac{\hbar}{2}\left(\hat{b}^2 + \hat{b}^{\dagger 2} - 2\hat{b}^\dagger\hat{b} - 1\right), \qquad (7.108)$$

where I changed the order of operators in $\hat{b}\hat{b}^\dagger$ using the commutation relation $\left[\hat{b}, \hat{b}^\dagger\right] = 1$. Now I am ready to tackle both the expectation values and the uncertainties. The computation of the expectation values $\langle \hat{B}_0 \rangle$ and $\langle \hat{E}_0 \rangle$ is almost trivial: taking into account that $\langle\alpha|\,\hat{b}\,|\alpha\rangle = \alpha$ and $\langle\alpha|\,\hat{b}^\dagger\,|\alpha\rangle = \langle\alpha|\,\hat{b}\,|\alpha\rangle^* = \alpha^*$, I have

$$\langle \hat{B}_0 \rangle = \sqrt{\frac{\hbar}{2}}\left(\alpha + \alpha^*\right) \qquad (7.109)$$

$$\langle \hat{E}_0 \rangle = i\sqrt{\frac{\hbar}{2}}\left(\alpha^* - \alpha\right). \qquad (7.110)$$

The expectation values of the squares of the fields take just a bit more work: before computing $\langle\alpha|\,\hat{b}^\dagger\hat{b}\,|\alpha\rangle$, I first need to realize that the Hermitian conjugate of the expression $\hat{b}\,|\alpha\rangle = \alpha\,|\alpha\rangle$ is $\langle\alpha|\,\hat{b}^\dagger = \alpha^*\langle\alpha|$. With this little insight, the rest of the computation is as trivial as that for the expectation values. The result is

$$\langle \hat{B}_0^2 \rangle = \frac{\hbar}{2}\left(\alpha + \alpha^*\right)^2 + \frac{\hbar}{2} \qquad (7.111)$$

$$\langle \hat{E}_0^2 \rangle = -\frac{\hbar}{2}\left(\alpha^* - \alpha\right)^2 + \frac{\hbar}{2}. \qquad (7.112)$$

Finally, the uncertainties of both fields are found to be independent of $\alpha$ and equal to

$$\Delta \hat{B}_0 = \Delta \hat{E}_0 = \sqrt{\frac{\hbar}{2}},$$

so that their product is indeed the smallest allowed by the uncertainty principle: $\Delta \hat{B}_0\,\Delta \hat{E}_0 = \hbar/2$. The relative uncertainties $\Delta \hat{B}_0 / \langle \hat{B}_0 \rangle$ diminish with the increase in $|\alpha| = \sqrt{\bar{n}_\alpha}$ and vanish in the limit $\bar{n}_\alpha \to \infty$, which obviously corresponds to the classical (no quantum fluctuations) limit. This result provides additional reinforcement to the idea that the electromagnetic field in a coherent state is as close to a classical wave as possible.
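The minimum-uncertainty product can also be checked numerically by representing $\hat{b}$, $\hat{b}^\dagger$, and the coherent state as truncated matrices in the Fock basis. This is an illustrative sketch, not a method used in the text: units with $\hbar = 1$ are assumed, and the cutoff $N$ and the value of $\alpha$ are arbitrary choices.

```python
import numpy as np
from math import factorial

N = 60  # Fock-space cutoff, ample for |alpha| of order one (assumption: hbar = 1)
b = np.diag(np.sqrt(np.arange(1, N)), k=1)  # annihilation operator: b|n> = sqrt(n)|n-1>
bd = b.conj().T                             # creation operator

alpha = 1.3 - 0.7j
n = np.arange(N)
# coherent-state amplitudes c_n of Eq. 7.103, truncated at N
psi = np.exp(-abs(alpha) ** 2 / 2) * alpha ** n / np.sqrt([float(factorial(k)) for k in n])

B = np.sqrt(0.5) * (b + bd)        # B_0 of Eq. 7.105 with hbar = 1
E = 1j * np.sqrt(0.5) * (bd - b)   # E_0 of Eq. 7.106 with hbar = 1

def expval(op):
    return (psi.conj() @ op @ psi).real

dB = np.sqrt(expval(B @ B) - expval(B) ** 2)
dE = np.sqrt(expval(E @ E) - expval(E) ** 2)
print(round(dB * dE, 6))  # -> 0.5, i.e., hbar/2: the minimum-uncertainty product
```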

Finally, I will consider how these quantities (the expectation values and uncertainties) change with time. The easiest way to do this is to use the Heisenberg picture, in which all dynamics is given by the time dependence of the annihilation operator, which, as we know from the consideration of the harmonic oscillator, is very simple: $\hat{b}_H(t) = \hat{b}\exp(-i\omega t)$, so that $\langle\alpha|\,\hat{b}_H\,|\alpha\rangle = \alpha\exp(-i\omega t)$. With this I immediately find for the field expectation values

$$\langle \hat{B}_0(t) \rangle = \sqrt{\frac{\hbar}{2}}\left(\alpha e^{-i\omega t} + \alpha^* e^{i\omega t}\right) \qquad (7.113)$$

$$\langle \hat{E}_0(t) \rangle = i\sqrt{\frac{\hbar}{2}}\left(\alpha^* e^{i\omega t} - \alpha e^{-i\omega t}\right) \qquad (7.114)$$

and for their squares

$$\langle \hat{B}_0^2(t) \rangle = \frac{\hbar}{2}\left(\alpha e^{-i\omega t} + \alpha^* e^{i\omega t}\right)^2 + \frac{\hbar}{2} \qquad (7.115)$$

$$\langle \hat{E}_0^2(t) \rangle = -\frac{\hbar}{2}\left(\alpha^* e^{i\omega t} - \alpha e^{-i\omega t}\right)^2 + \frac{\hbar}{2}. \qquad (7.116)$$

It is remarkable that the uncertainties of the fields, $\langle \hat{B}_0^2(t) \rangle - \langle \hat{B}_0(t) \rangle^2$ and $\langle \hat{E}_0^2(t) \rangle - \langle \hat{E}_0(t) \rangle^2$, remain time independent and satisfy the minimal form of the uncertainty principle at all times. While the harmonic time dependence of the expectation values is typical of almost any initial state, the uncovered behavior of the uncertainties is a peculiar property of the coherent states and is what makes them so special. It also guarantees that the shape of the coherent superposition of the stationary states does not get distorted with time, similar to what one would expect from a classical electromagnetic wave.

7.4 Problems

Problems for Sect. 7.1

Problem 83 Using Eq. 7.16 together with Eqs. 7.12 and 7.13, find the time dependence of the coordinate $x$ and momentum $p$. Comparing the found result with Eq. 7.9, find the relation between the parameters $b_0, b_0^*$ and $x_0, p_0$.

Problem 84 Verify that Eqs. 7.14 and 7.15 are equivalent to the Hamiltonian equations for the regular coordinate and momentum by computing the time derivatives of the variables $b$ and $b^*$ using Eqs. 7.12 and 7.13 together with Eqs. 7.7 and 7.8.

Problem 85 Prove that $\hat{a}\,|n\rangle = \sqrt{n}\,|n-1\rangle$.

Problem 86 Suppose that a harmonic oscillator is at $t = 0$ in the state described by the following superposition:

$$|\alpha_0\rangle = a\left(\sqrt{2}\,|0\rangle + \sqrt{3}\,|1\rangle\right).$$

1. Normalize the state.
2. Find a vector $|\alpha(t)\rangle$ representing the state of the oscillator at an arbitrary time $t$.
3. Calculate the uncertainties of the coordinate and momentum operators in this state, and check that the uncertainty relation is fulfilled at all times.

Problem 87 Using the method of mathematical induction, prove that

$$\left(y - \frac{d}{dy}\right)^n \exp\left(-\frac{y^2}{2}\right) = (-1)^n \exp\left(\frac{y^2}{2}\right)\frac{d^n \exp\left(-y^2\right)}{dy^n}$$

and derive Eq. 7.44 for the coordinate representation of an eigenvector of the Hamiltonian of the harmonic oscillator.

Problem 88 Using the matrices $a_{mn} = \langle m|\,\hat{a}\,|n\rangle$ and $a^\dagger_{mn} = \langle m|\,\hat{a}^\dagger\,|n\rangle$, demonstrate by direct matrix multiplication that

$$\left(\hat{a}^\dagger\hat{a}\right)_{mn} = \sum_k a^\dagger_{mk}\,a_{kn} = m\,\delta_{mn}.$$

Problem 89 Using the coordinate representation of the lowering operator $\hat{a}$, apply it to the coordinate representation of the $n = 3$ stationary state of the harmonic oscillator. Is the result normalized? If not, normalize it and compare the found normalization factor with Eq. 7.40.

Problem 90 Using lowering and raising operators, compute the expectation values of the kinetic, $\hat{K}$, and potential, $\hat{V}$, energies of a harmonic oscillator in an arbitrary stationary state $|n\rangle$. Check that

$$\langle \hat{K} \rangle = \langle \hat{V} \rangle.$$

This result is a particular case of the so-called virial theorem relating the expectation values of the kinetic and potential energies of a particle in a potential described by $V = kx^p$. The general form of the theorem is $2\langle \hat{K} \rangle = p\,\langle \hat{V} \rangle$, which for $p = 2$ (harmonic oscillator) reduces to the result of this problem.

Problem 91 Derive explicit expressions for Hermite polynomials with $n = 3, 4, 5$ (of course, you can always google it, but do it by yourselves—you can learn something), and demonstrate explicitly that they obey the orthogonality relation:

$$\int_{-\infty}^{\infty} \exp\left(-x^2\right) H_m(x)\,H_n(x)\,dx = 0, \quad m \neq n.$$


Problem 92

1. Find the eigenvectors $|\alpha\rangle$ of the lowering operator $\hat{a}$: $\hat{a}\,|\alpha\rangle = \alpha\,|\alpha\rangle$ in the coordinate representation. Normalize them.
2. Show that the raising operator $\hat{a}^\dagger$ does not have normalizable eigenvectors.

Problem 93 Compute the probability that a measurement of the coordinate will yield a value in the classically forbidden region for the oscillator prepared in each of the following stationary states: $|0\rangle$, $|1\rangle$, and $|3\rangle$. (Note that the boundary of the classically allowed region is different for each of these states.)

Problem 94 Consider an electron with mass $m_e$ and charge $-e$ in a harmonic potential $\hat{V} = m_e\omega^2 x^2/2$, also subjected to a uniform electric field $E$ in the positive $x$ direction.

1. Write down the Hamiltonian for this system.
2. Using operator identities from Sect. 3.2.2, prove that

$$\exp\left(\frac{i\hat{p}_x d}{\hbar}\right)\hat{x}\exp\left(-\frac{i\hat{p}_x d}{\hbar}\right) = \hat{x} + d, \qquad (7.117)$$

where $\hat{x}$ and $\hat{p}_x$ are the regular operators of the coordinate and the respective component of the momentum and $d$ is a real number.

3. In Sect. 5.1.2 I already demonstrated, using the example of a parity operator, that if two vectors are related to each other as $|\beta\rangle = \hat{T}\,|\alpha\rangle$, while vectors $|\tilde{\beta}\rangle$ and $|\tilde{\alpha}\rangle$ are defined as $|\tilde{\beta}\rangle = \hat{U}\,|\beta\rangle$, $|\tilde{\alpha}\rangle = \hat{U}\,|\alpha\rangle$, one can show that $|\tilde{\beta}\rangle = \hat{T}'\,|\tilde{\alpha}\rangle$, where $\hat{T}' = \hat{U}\hat{T}\hat{U}^{-1}$. Use this relation together with Eq. 7.117 to reduce the Hamiltonian found in Part I of this problem to that of a harmonic oscillator without the electric field, and express the eigenvectors of the Hamiltonian with the field (perturbed Hamiltonian) in terms of the eigenvectors of the Hamiltonian without the field (unperturbed).

4. Write down the coordinate wave function representing the states of the perturbed Hamiltonian in terms of the wave functions representing the states of the unperturbed Hamiltonian. Comment on the results. Explain how this can be derived by manipulating the classical Hamiltonian before its quantization.

5. If the electron is in its ground state before the electric field is turned on, find the probability that the electron will be found in the ground state of the Hamiltonian with the electric field on. (Hint: You will need to use the operator identities concerning the exponential function of the sum of operators discussed in Sect. 3.2.2 and the representation of the momentum operator in terms of raising and lowering operators. Remember: the exponential function of an operator is defined as the corresponding power series.)


Problems for Sect. 7.1.2

Problem 95 Using the Heisenberg representation, find the uncertainties of the coordinate and momentum operators at an arbitrary time $t$ for the state

$$|\alpha\rangle = \frac{1}{\sqrt{3}}\left(|1\rangle + |2\rangle + |3\rangle\right),$$

where $|n\rangle$ is the $n$th stationary state of the harmonic oscillator. Verify that the uncertainty relation is fulfilled at all times.

Problem 96 Solve the previous problem using the Schrödinger representation.

Problem 97 Consider the system described in Problem 94, but work now in the Heisenberg picture.

1. Write down the Hamiltonian of the electron in the Heisenberg picture.
2. Write down the Heisenberg equations for the lowering and raising operators and solve them.
3. Now, assume that the electric field was turned on at $t = 0$, when the electron was in the ground state $|0\rangle$ of the unperturbed Hamiltonian, and turned back off at $t = t_f$. In the Heisenberg picture, the state of the system does not change, so that all time evolution is described by the operators. Let us call the lowering and raising operators at $t = 0$ $\hat{a}_{in}$, $\hat{a}^\dagger_{in}$ (these are, obviously, the same operators that appear as initial conditions in the solutions of the Heisenberg equations found in Part I of the problem). These operators are just the lowering and raising operators in the Schrödinger picture, so that the initial state obeys the equation $\hat{a}_{in}\,|0\rangle = 0$. In the Heisenberg picture, raising and lowering operators change with time according to the expressions found in Part I. Considering these expressions at $t = t_f$, you will find $\hat{a}_f \equiv \hat{a}(t_f)$ and $\hat{a}^\dagger_f = \hat{a}^\dagger(t_f)$. Verify that these operators have the same commutation relation as their Schrödinger counterparts.
4. The time evolution of the Hamiltonian, which at all times has the form found in Part I, is completely described by the time dependence of the lowering and raising operators. Using the expressions for $\hat{a}_f$ and $\hat{a}^\dagger_f$ found in the previous part of the problem, write down the Hamiltonian of the electron at times $t > t_f$ in terms of the operators $\hat{a}_{in}$, $\hat{a}^\dagger_{in}$.

5. Using the found expression for the Hamiltonian, find the expectation value of the energy in the given initial state.

6. The Hamiltonian of the electron at $t > t_f$ has the same form in terms of the operators $\hat{a}_f$, $\hat{a}^\dagger_f$ as the Hamiltonian for $t < t_0$ has in terms of the operators $\hat{a}_{in}$, $\hat{a}^\dagger_{in}$. Also, it has been shown in Part III that $\hat{a}_f$, $\hat{a}^\dagger_f$ have the same commutation relations as $\hat{a}_{in}$, $\hat{a}^\dagger_{in}$. This means that the Hamiltonian at $t > t_f$ has the same eigenvalues, and its eigenvectors satisfy the same relations:

$$\hat{a}_f\,|0\rangle_f = 0, \qquad |n\rangle_f = \frac{1}{\sqrt{n!}}\left(\hat{a}^\dagger_f\right)^n |0\rangle_f,$$

where the first equation defines the new vacuum state $|0\rangle_f$ and the second equation defines the new eigenvectors. Since the operators $\hat{a}_f$, $\hat{a}^\dagger_f$ differ from $\hat{a}_{in}$, $\hat{a}^\dagger_{in}$, the new ground state and the new eigenvectors will be different from those of the initial Hamiltonian. Using the representation of $\hat{a}_f$ in terms of $\hat{a}_{in}$, find the probability that if the system started out in the ground state of the initial Hamiltonian, it will be found in the new ground state $|0\rangle_f$.

Problems for Sect. 7.2

Problem 98 Verify Eqs. 7.71 and 7.72.

Problem 99 Rewrite the Schrödinger equation for the stationary states of a 3-D isotropic harmonic oscillator in cylindrical coordinates $\rho, \varphi, z$. Show that the wave function can be written as $\psi_{n_1,n_2,m} = Z_{n_1}(z)\,R_{n_2}(\rho)\exp(im\varphi)$, and derive equations for the functions $Z_{n_1}(z)$ and $R_{n_2}(\rho)$. The first of these equations will coincide with the Schrödinger equation for a one-dimensional harmonic oscillator, so you can use the results of Sect. 7.1.1 to determine this function and the corresponding contribution to the energy, but the equation for $R_{n_2}(\rho)$ will have to be solved from scratch. Do it using the power series method developed in the text for the spherical coordinates.

Problem 100 You just saw that the wave functions of an isotropic oscillator can be presented using Cartesian, spherical, and cylindrical coordinates. While each of these functions, corresponding to the same degenerate energy value, has a very different form, since all of them represent eigenvectors belonging to the same eigenvalue, you should be able to present each of them as a linear combination of the others belonging to the same eigenvalue. Verify that this is indeed the case for states belonging to the energy value $E = 5\hbar\omega/2$ by explicitly expressing the wave functions written in Cartesian coordinates in terms of their spherical and cylindrical coordinate counterparts.

Problems for Sect. 7.3.2

Problem 101 Verify Eqs. 7.115 and 7.116 for time-dependent expectation values
of the squares of electric and magnetic fields.

Problem 102 The flow of the energy of the electromagnetic field is described by the Poynting vector, which in SI units is given by

$$\mathbf{S} = \frac{1}{\mu_0}\,\mathbf{E}\times\mathbf{B}.$$

In our toy model of the electromagnetic field, the Poynting vector becomes simply

$$S = \frac{1}{\mu_0}\,E_x B_y.$$

In quantum theory, the Poynting vector becomes an operator. Find the time-dependent expectation value and uncertainty of this operator in the coherent state.

Chapter 8
Hydrogen Atom

8.1 Transition to a One-Body Problem

Quantum mechanics of the hydrogen atom, understood as a system consisting of a positively charged nucleus and a single negatively charged electron, is remarkable in many respects. It is one of the very few exactly solvable three-dimensional models with a realistic interaction potential. As such, it provides the foundation for much of our qualitative as well as quantitative understanding of the optical properties of atoms, at least as a first approximation for more complicated situations. A similar model also arises in the physics of semiconductors, where bound states of negative and positive charges form entities known as excitons, as well as in situations involving a single conduction electron interacting with a charged impurity. Another curious property of this model is that the energy eigenvalues emerging from the exact solution of the Schrödinger equation coincide with the energy levels predicted by the heuristic Bohr model, based on a rather arbitrary combination of Newton's laws with a simple quantization rule for the angular momentum. While it might seem a pure coincidence of limited significance, given that by now we have harnessed the full power of quantum theory and do not really need Bohr's quantization rules, one still might wonder by how much the development of quantum physics would have been delayed if it were not for this "coincidence."

I will begin the exploration of this model with a brief reminder of how classical mechanics deals with the problem. There are two different aspects to it which need to be addressed. First, unlike all previous models considered so far, which involved a single particle, this is a two-body problem. Luckily for us, this problem only pretends to be two-body and can easily be reduced to two single-particle problems. This is how it is done in classical physics. The classical Hamiltonian of the problem has the following form:

$$H = \frac{p_1^2}{2m_p} + \frac{p_2^2}{2m_e} + V\left(|\mathbf{r}_1 - \mathbf{r}_2|\right), \qquad (8.1)$$

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,
https://doi.org/10.1007/978-3-319-71550-6_8


where $\mathbf{p}_1, \mathbf{r}_1$ and $\mathbf{p}_2, \mathbf{r}_2$ are the momenta and positions of the particles with corresponding masses $m_p$ and $m_e$, and $V\left(|\mathbf{r}_1 - \mathbf{r}_2|\right)$ is the Coulomb potential energy, which in SI units can be written as

$$V\left(|\mathbf{r}_1 - \mathbf{r}_2|\right) = -\frac{1}{4\pi\varepsilon_r\varepsilon_0}\,\frac{Ze^2}{|\mathbf{r}_1 - \mathbf{r}_2|}. \qquad (8.2)$$

Here $e$ is the elementary charge, $Z$ is the atomic number of the nucleus, introduced to allow dealing with heavier hydrogen-like atoms such as atoms of alkali metals (or a charged impurity), and $\varepsilon_r$ is the relative dielectric permittivity accounting for the possibility that the interacting particles are inside a dielectric medium. To separate this problem into two single-particle problems, I introduce new coordinates:

$$\mathbf{R} = \frac{m_p\mathbf{r}_1 + m_e\mathbf{r}_2}{m_p + m_e}, \qquad (8.3)$$

$$\mathbf{r} = \mathbf{r}_1 - \mathbf{r}_2. \qquad (8.4)$$

I hope you have recognized in $\mathbf{R}$ the coordinate of the center of mass of the two particles and in $\mathbf{r}$ their relative position vector. Now, I need to find the new momenta associated with these coordinates. For your sake I will avoid using the formalism of canonical transformations in Hamiltonian mechanics and will begin by defining the kinetic energy in terms of the respective velocities. Reversing Eqs. 8.3 and 8.4, I get

$$\mathbf{r}_1 = \mathbf{R} + \frac{m_e}{m_p + m_e}\,\mathbf{r}, \qquad \mathbf{r}_2 = \mathbf{R} - \frac{m_p}{m_p + m_e}\,\mathbf{r},$$

so that the kinetic energy can be found as

$$K = \frac{1}{2}m_p\left(\frac{d\mathbf{R}}{dt} + \frac{m_e}{m_p + m_e}\frac{d\mathbf{r}}{dt}\right)^2 + \frac{1}{2}m_e\left(\frac{d\mathbf{R}}{dt} - \frac{m_p}{m_p + m_e}\frac{d\mathbf{r}}{dt}\right)^2 =$$

$$\frac{1}{2}\left(m_p + m_e\right)\left(\frac{d\mathbf{R}}{dt}\right)^2 + \frac{1}{2}\frac{m_p m_e^2}{\left(m_p + m_e\right)^2}\left(\frac{d\mathbf{r}}{dt}\right)^2 + \frac{1}{2}\frac{m_e m_p^2}{\left(m_p + m_e\right)^2}\left(\frac{d\mathbf{r}}{dt}\right)^2 =$$

$$\frac{1}{2}\left(m_p + m_e\right)\left(\frac{d\mathbf{R}}{dt}\right)^2 + \frac{1}{2}\frac{m_p m_e}{m_p + m_e}\left(\frac{d\mathbf{r}}{dt}\right)^2.$$
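The algebra above (including the cancellation of the cross terms) can be verified symbolically; this is a quick sketch with sympy, working with a single Cartesian component of each velocity, which suffices for the check:

```python
import sympy as sp

m_p, m_e, V, v = sp.symbols('m_p m_e V v', positive=True)
# one component of each velocity: V = dR/dt, v = dr/dt
v1 = V + m_e / (m_p + m_e) * v   # from r1 = R + m_e/(m_p+m_e) r
v2 = V - m_p / (m_p + m_e) * v   # from r2 = R - m_p/(m_p+m_e) r
K = sp.Rational(1, 2) * m_p * v1**2 + sp.Rational(1, 2) * m_e * v2**2
M = m_p + m_e
mu = m_p * m_e / (m_p + m_e)
diff = sp.simplify(K - (sp.Rational(1, 2) * M * V**2 + sp.Rational(1, 2) * mu * v**2))
print(diff)  # -> 0: the cross terms cancel and K separates exactly
```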

Introducing two new masses—the total mass of the system $M = m_p + m_e$ and the reduced mass $\mu = m_p m_e/\left(m_p + m_e\right)$—I can define the momentum of the center of mass,

$$\mathbf{p}_R = M\,\frac{d\mathbf{R}}{dt},$$

and the relative momentum,

$$\mathbf{p}_r = \mu\,\frac{d\mathbf{r}}{dt},$$

so that the Hamiltonian, Eq. 8.1, can be rewritten as

$$H = \frac{p_R^2}{2M} + \frac{p_r^2}{2\mu} + V(r).$$

The corresponding Hamiltonian equations separate into a pair of equations for the position and momentum of the center of mass,

$$\frac{d\mathbf{R}}{dt} = \frac{\mathbf{p}_R}{M}, \qquad \frac{d\mathbf{p}_R}{dt} = 0,$$

and a pair for the relative motion,

$$\frac{d\mathbf{r}}{dt} = \frac{\mathbf{p}_r}{\mu}, \qquad \frac{d\mathbf{p}_r}{dt} = -\frac{dV}{d\mathbf{r}}.$$

The first pair of these equations describes the uniform motion of a free particle—the center of mass of the system—while the second pair describes the motion of a single particle in the potential $V(r)$.

I have little doubt that the variables $\mathbf{r}$ and $\mathbf{p}_r$ form a canonically conjugated pair, and so I can transition to the quantum description by promoting them to operators with the standard commutation relation $\left[r_i, p_{rj}\right] = i\hbar\,\delta_{i,j}$. However, in order to be 100% sure and convince all possible skeptics, I do need to verify this fact by computing the Poisson brackets with these variables. To this end I need to express $\mathbf{r}$ and $\mathbf{p}_r$ in terms of the initial coordinates and momenta. The expression for $\mathbf{r}$ is given by Eq. 8.4, so I only need to figure out $\mathbf{p}_r$:

$$\mathbf{p}_r = \frac{m_e m_p}{m_e + m_p}\left(\frac{d\mathbf{r}_1}{dt} - \frac{d\mathbf{r}_2}{dt}\right) = \frac{m_e}{m_e + m_p}\,\mathbf{p}_1 - \frac{m_p}{m_e + m_p}\,\mathbf{p}_2. \qquad (8.5)$$

Let me focus for concreteness on the $x$-components of the momentum and coordinate. Equation 3.4 for the Poisson bracket, where the summation must include the coordinates of both particles, yields

$$\{x, p_{rx}\} = \frac{\partial x}{\partial x_1}\frac{\partial p_{rx}}{\partial p_{1x}} + \frac{\partial x}{\partial x_2}\frac{\partial p_{rx}}{\partial p_{2x}} = \frac{m_e}{m_e + m_p} + \frac{m_p}{m_e + m_p} = 1,$$

as expected. All other Poisson brackets also predictably produce the necessary results, so you can start breathing again.
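For the skeptics with a computer at hand, the bracket can also be checked symbolically. In this sympy sketch the function `poisson` implements the classical Poisson bracket over the listed canonical pairs, and the $x$-components of Eqs. 8.4 and 8.5 are tested:

```python
import sympy as sp

x1, x2, p1x, p2x = sp.symbols('x1 x2 p1x p2x', real=True)
m_e, m_p = sp.symbols('m_e m_p', positive=True)

def poisson(f, g, qs, ps):
    # classical Poisson bracket {f, g} summed over the canonical pairs (q, p)
    return sum(sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)
               for q, p in zip(qs, ps))

x = x1 - x2                                              # Eq. 8.4, x-component
prx = m_e / (m_e + m_p) * p1x - m_p / (m_e + m_p) * p2x  # Eq. 8.5, x-component
result = sp.simplify(poisson(x, prx, (x1, x2), (p1x, p2x)))
print(result)  # -> 1
```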


8.2 Eigenvalues and Eigenvectors

It is important that the portion of the Hamiltonian describing the motion of the center of mass, $\hat{H}_R = \hat{p}_R^2/2M$, is completely independent of the part responsible for the relative motion,

$$\hat{H}_r = \frac{\hat{p}_r^2}{2\mu} + V(\hat{r}), \qquad (8.6)$$

so that the eigenvectors of the total Hamiltonian $\hat{H} = \hat{H}_R + \hat{H}_r$ can be written down as $|\psi_R\rangle\,|\psi_r\rangle$, where the first vector is an eigenvector of $\hat{H}_R$ with eigenvalue $E_R$, while the second vector is an eigenvector of $\hat{H}_r$ with its own eigenvalue $E_r$. The eigenvalue of the total Hamiltonian is easily verified to be $E_R + E_r$ (when verifying this statement, remember that $\hat{H}_R$ acts only on $|\psi_R\rangle$, while $\hat{H}_r$ only affects $|\psi_r\rangle$). I am going to ignore the center-of-mass motion and will focus on the Hamiltonian $\hat{H}_r$, Eq. 8.6, with the Coulomb potential energy, Eq. 8.2. In what follows I will omit the subindex $r$ in the Hamiltonian.

What I am dealing with here is yet another example of a particle moving in a central potential, similar to the isotropic harmonic oscillator problem considered in Sect. 7.2. Just like in the case of the harmonic oscillator, Hamiltonian 8.6 commutes with the angular momentum operators $\hat{L}^2$ and $\hat{L}_z$; thus its eigenvectors are also eigenvectors of the angular momentum. Working in the position representation and using spherical coordinates to represent the position, I can again write down for the wave function

$$\psi_{n,l,m}(r, \theta, \varphi) = Y_l^m(\theta, \varphi)\,R_{nl}(r),$$

where $Y_l^m(\theta, \varphi)$ are spherical harmonics—the coordinate representation of the eigenvectors of the angular momentum operators. The equation for the remaining radial function $R_{nl}(r)$ is derived in exactly the same way as in Sect. 7.2 and takes a form similar to Eq. 7.71:

$$-\frac{\hbar^2}{2\mu r^2}\frac{d}{dr}\left(r^2\frac{\partial R_{n_r,l}}{\partial r}\right) + \frac{\hbar^2 l(l+1)}{2\mu r^2}\,R_{n_r,l} - \frac{1}{4\pi\varepsilon_r\varepsilon_0}\frac{Ze^2}{r}\,R_{n_r,l} = E_{l,n_r}\,R_{n_r,l} \qquad (8.7)$$

with the obvious replacement $m_e \to \mu$ and of the quadratic harmonic oscillator potential with the Coulomb potential. The energy eigenvalues $E_{l,n_r}$ are found by looking for normalizable solutions of this equation. My choice of the indexes to label the eigenvalues reflects the fact that the eigenvalues of the Hamiltonian with any central potential do not depend on $m$. Indeed, the quantum number $m$ is defined with respect to a particular choice of the polar axis $Z$, but since the energy of a system with a central potential cannot depend upon an arbitrary axis choice, it should not depend on this quantum number. Here is another example of how symmetry considerations help to analyze the problem.


I will begin by reducing Eq. 8.7 to a dimensionless form, as is customary in this type of situation. What I need for this is a characteristic length scale, which in this problem, unlike the harmonic oscillator case, is not that obvious. But there is a trick which I can use to find it, and I am going to share it with you. Just by looking at the radial equation, I know that there are three main parameters in this problem: the mass $\mu$, the charge $e$, and Planck's constant $\hbar$, and I need to find their combination with the dimension of length. This is done by first writing down this combination in the most generic form as $\mu^\alpha \tilde{e}^\beta \hbar^\gamma$, where $\tilde{e} = e/\sqrt{4\pi\varepsilon_0\varepsilon_r}$ is the combination of the charge and the vacuum and relative permittivities, $\varepsilon_0$ and $\varepsilon_r$ correspondingly, appearing in the Coulomb law in SI units, while $\alpha, \beta$, and $\gamma$ are unknown powers to be determined. In the next step, I will present the dimension of each factor in this expression in terms of the basic quantities: length, time, and mass. For instance, the dimension of $\tilde{e}$ can be found from the Coulomb law as $[\tilde{e}] = [F]^{1/2}[L]$, where $[F]$ stands for the dimension of force and $[L]$ stands for the dimension of length. The dimension of force in basic quantities is $[F] = [M][L][T]^{-2}$, where $[M]$ represents the dimension of mass and $[T]$ represents the dimension of time (think of Newton's second law). So, for the effective charge, I have $[\tilde{e}] = [M]^{1/2}[L]^{3/2}[T]^{-1}$. The dimension of Planck's constant can be determined from the Einstein–de Broglie relation between energy and frequency as $[\hbar] = [E][T] = [M][L]^2[T]^{-1}$, where in the second step I expressed the dimension of energy as $[E] = [F][L]$. Combining the results for the charge and Planck's constant, I find

$$\mu^\alpha \tilde{e}^\beta \hbar^\gamma = [M]^\alpha\,[M]^{\beta/2}[L]^{3\beta/2}[T]^{-\beta}\,[M]^\gamma [L]^{2\gamma}[T]^{-\gamma} = [M]^{\alpha+\beta/2+\gamma}\,[L]^{3\beta/2+2\gamma}\,[T]^{-\beta-\gamma}.$$

If I want this expression to have the dimension of length $[L]$, I need to eliminate the excessive dimensions $[M]$ and $[T]$. Remembering that any quantity raised to the power of zero turns to unity and becomes dimensionless, I can eliminate $[M]$ and $[T]$ by requiring that their corresponding powers vanish:

$$\alpha + \beta/2 + \gamma = 0,$$
$$\beta + \gamma = 0.$$

Then all that is left to do is to make the power of $[L]$ equal to unity:

$$3\beta/2 + 2\gamma = 1.$$

The result is a system of equations for the unknown powers, solving which I find $\gamma = 2$, $\beta = -2$, and $\alpha = -1$, i.e., the characteristic length scale can be constructed from the parameters at our disposal as

$$a_B = \frac{4\pi\varepsilon_0\varepsilon_r\hbar^2}{e^2\mu}. \qquad (8.8)$$
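The small linear system for the exponents, and the numerical value of Eq. 8.8, can be checked in a few lines. This is an illustrative sketch: the constants are standard SI values, and $\mu$ is approximated by the electron mass (a $\sim 0.05\%$ error):

```python
import numpy as np

# Exponents of [M], [T], [L] in mu^a * etilde^b * hbar^c, from
# [mu] = M, [etilde] = M^(1/2) L^(3/2) T^(-1), [hbar] = M L^2 T^(-1)
A = np.array([[1.0, 0.5, 1.0],    # [M]: a + b/2 + c = 0
              [0.0, -1.0, -1.0],  # [T]: -b - c = 0
              [0.0, 1.5, 2.0]])   # [L]: 3b/2 + 2c = 1
a, b, c = np.linalg.solve(A, np.array([0.0, 0.0, 1.0]))
print(np.round([a, b, c], 10))  # -> [-1. -2.  2.], i.e., a_B = hbar^2/(mu * etilde^2)

# numerical value of Eq. 8.8 for Z = 1, eps_r = 1, with mu ~ electron mass
eps0, hbar, e, me = 8.8541878128e-12, 1.054571817e-34, 1.602176634e-19, 9.1093837015e-31
aB = 4 * np.pi * eps0 * hbar**2 / (e**2 * me)
print(aB)  # about 5.29e-11 m
```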

The found characteristic length is actually well known from Bohr's theory of atomic spectra and is called the Bohr radius. It can be used to introduce a dimensionless coordinate $\varsigma = r/a_B$ and rewrite Eq. 8.7 as

$$-\frac{1}{\varsigma^2}\frac{d}{d\varsigma}\left(\varsigma^2\frac{dR_{n_r,l}}{d\varsigma}\right) + \frac{l(l+1)}{\varsigma^2}\,R_{n_r,l} - \frac{2Z}{\varsigma}\,R_{n_r,l} = \frac{2\left(4\pi\varepsilon_0\varepsilon_r\right)^2\hbar^2}{e^4\mu}\,E_{l,n_r}\,R_{n_r,l}. \qquad (8.9)$$

You can verify (do it yourselves) that the quantity

$$\tilde{E} = \frac{e^4\mu}{32\pi^2\varepsilon_0^2\varepsilon_r^2\hbar^2} \qquad (8.10)$$

has the dimension of energy, so that I can present the right-hand side of this equation in terms of the dimensionless energy parameter

$$\epsilon_{l,n} = E_{l,n}/\tilde{E}.$$

Finally, introducing the auxiliary radial function $u_{n,l} = \varsigma R_{n,l}$ (the same as in the harmonic oscillator problem), I obtain the effective one-dimensional Schrödinger equation similar to Eq. 7.73:

$$-\frac{d^2 u_{n_r,l}}{d\varsigma^2} + \frac{l(l+1)}{\varsigma^2}\,u_{n_r,l} - \frac{2Z}{\varsigma}\,u_{n_r,l} = \epsilon_{l,n_r}\,u_{n_r,l}. \qquad (8.11)$$

The effective potential in Eq. 8.11 is positively infinite at small $\varsigma$, but as $\varsigma$ increases, it, unlike in the harmonic oscillator problem, becomes negative, reaches a minimum value of $-Z^2/\left[l(l+1)\right]$ at $\varsigma = l(l+1)/Z$, and remains negative while approaching zero for $\varsigma \to \infty$; see Fig. 8.1.

Fig. 8.1 Dimensionless effective potential as a function of the dimensionless radial coordinate

Classical behavior in such a potential is bound for negative values of energy and unbound for positive energies. In the former case, we are dealing with a particle moving along a closed elliptical orbit, while in the latter case, the situation is better described in terms of scattering of a particle by the potential. In the quantum description, as usual, we should expect states characterized by a discrete spectrum of eigenvalues for classically bound motion (negative energies) and states with a continuous spectrum for positive energies. The wave functions representing the states of the continuous spectrum are well known but rather complex mathematically and are not used too frequently, so I shall avoid dealing with them for the sake of keeping everyone sane. The range of negative energies is much more important for understanding the physical processes in atoms and is more tractable. In terms of atomic physics, the states with negative energies correspond to intact atoms, where the electron is bound to its nucleus, and the probability that it will turn up infinitely far from the nucleus is zero. The states of the continuous spectrum correspond to ionized atoms, where the energy of the electron is too large for the nucleus to be able to "catch" it, so that the electron can be found at arbitrarily large distances from the nucleus.
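The location and depth of the minimum of the effective potential are easy to confirm numerically; in this sketch the values $Z = 1$ and $l = 2$ are arbitrary illustrative choices:

```python
import numpy as np

Z, l = 1, 2
V_eff = lambda s: l * (l + 1) / s**2 - 2 * Z / s  # dimensionless potential of Eq. 8.11

s = np.linspace(0.5, 60.0, 200001)
v = V_eff(s)
i = np.argmin(v)
print(s[i], v[i])                               # numerical minimum
print(l * (l + 1) / Z, -Z**2 / (l * (l + 1)))   # predicted: 6 and -1/6
```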

The process of finding the solution of Eq. 8.11 follows the same steps as solving the similar equation for the harmonic oscillator: find the asymptotic behavior at small and large $\varsigma$, factor it out, and present the residual function as a power series. I can, however, simplify the form of the equation a bit more by replacing the variable $\varsigma$ with a new variable $\rho = \varsigma\sqrt{-\epsilon_{l,n_r}}$ (remember that $\epsilon_{l,n_r} < 0$!). Equation 8.11 now takes the following form:

$$\frac{d^2 u_{n_r,l}}{d\rho^2} - \frac{l(l+1)}{\rho^2}\,u_{n_r,l} + \frac{\lambda_{l,n_r}}{\rho}\,u_{n_r,l} - u_{n_r,l} = 0, \qquad (8.12)$$

where I introduced a new parameter

$$\lambda_{l,n_r} = \frac{2Z}{\sqrt{-\epsilon_{l,n_r}}}.$$

The asymptotic behavior at small $\rho$ is determined by the contribution from the angular momentum and is the same as for the harmonic oscillator, $u_{n_r,l} \propto \rho^{l+1}$, but the large-$\rho$ limit is now determined by the last term, $-u_{n_r,l}$. The resulting equation

$$\frac{d^2 u_{n_r,l}}{d\rho^2} = u_{n_r,l}$$

has two obvious solutions,

$$u_{n_r,l} \propto \exp(\pm\rho),$$

of which I will only keep the exponentially decreasing one in hopes of ending up with a normalizable solution. Thus, I am looking for the solution in the form

$$u_{n_r,l} = \rho^{l+1}\exp(-\rho)\,v_{n,l}(\rho), \qquad (8.13)$$


where a differential equation for the reduced function $v_{n,l}$ is derived by substituting Eq. 8.13 into Eq. 8.11. The rest of the procedure is quite similar to the one I outlined in the harmonic oscillator problem: present $v_{n,l}$ as a power series with respect to $\rho$, derive a recursion relation for the coefficients of the expansion, verify that the asymptotic behavior of the resulting power series yields a non-normalizable wave function, restore normalizability by requiring that the power series terminate after a finite number of terms, and obtain an equation for the energy values consistent with the normalizability requirement. Leaving the details of this analysis to the readers as an exercise (of course, you can always cheat by looking it up in a number of other textbooks, but you will gain so much more in terms of your technical prowess and self-respect by doing it yourselves!), I will present the result. The only way to ensure normalizability of the resulting wave functions is to require that the parameter $\lambda_{l,n_r}$ satisfy the following condition:

$$\lambda_{l,n_r} = 2\left(j_{max} + l + 1\right), \qquad (8.14)$$

where $j_{max}$ is the number of the largest non-zero coefficient in the power series expansion of the function

$$v_{n,l}(\rho) = \sum_j c_j\,\rho^j$$

and takes arbitrary integer values starting from 0. However, since $j_{\max}$ appears in
Eq. 8.14 only in combination with $l$, the actual allowed values of $\lambda_{l,n_r}$ depend on
a single parameter, called the principal quantum number $n$, which takes any integer
value starting from $n = 1$. The independence of the energy eigenvalues of the hydrogen
Hamiltonian of the angular momentum number $l$ is a peculiarity of the Coulomb
potential and reflects an additional symmetry present in this problem. In classical
mechanics this symmetry manifests itself via the existence of a supplemental
(to energy and angular momentum) conserved quantity called the Laplace–Runge–
Lenz vector

$$\mathbf{A} = \mathbf{p} \times \mathbf{L} - \frac{\mu Z e^2}{4\pi\varepsilon_r\varepsilon_0}\,\mathbf{e}_r$$

where $\mathbf{e}_r$ is a unit vector in the radial direction. In quantum theory this vector can be
promoted to a Hermitian operator, but this procedure is not trivial because operators
$\hat{\mathbf{p}}$ and $\hat{\mathbf{L}}$ do not commute. However, I am afraid that if I continue talking about the
quantum version of the Laplace–Runge–Lenz vector, I might open myself to a lawsuit
for inflicting cruel and unusual punishment on the readers, so I will restrain myself.
Those who are not afraid may look it up, but the quantum treatment of the Laplace–
Runge–Lenz vector is not very common even in the wild prairies of the Internet.

Anyway, I can now drop the double-index notation for $\lambda$ and the dimensionless energy
$\varepsilon$ and classify the latter with a single index, the principal quantum number $n$. Taking
into account Eq. 8.14 and introducing

$$n = j_{\max} + l + 1, \tag{8.15}$$


I find for the allowed energy values

$$E_n = \tilde{E}\,\varepsilon_n = -\frac{Z^2}{n^2}\,\tilde{E} = -\frac{Z^2 e^4 \mu}{32\pi^2 \varepsilon_r^2 \varepsilon_0^2 \hbar^2}\,\frac{1}{n^2} = -\frac{E_g}{n^2} \tag{8.16}$$

where I introduced a separate notation $E_g$ for the ground state energy. It is also
useful sometimes to have this expression written in terms of the Bohr radius $a_B$
defined by Eq. 8.8:

$$E_n = -\frac{Z^2 e^2}{8\pi\varepsilon_r\varepsilon_0 a_B}\,\frac{1}{n^2}. \tag{8.17}$$

For a pure hydrogen atom in vacuum $Z = 1$, $\varepsilon_r = 1$, and taking into account that
the ratio of the mass of a proton (the nucleus of a hydrogen atom is a single proton)
to the mass of the electron is approximately $m_p/m_e \approx 1.8\times 10^{3}$, I can replace
the reduced mass $\mu$ with the electron mass. In this case, the numerical coefficient
in front of $1/n^2$ contains only universal constants and can be computed once and
for all. It defines the so-called Rydberg unit of energy, $1\,\mathrm{Ry}$, which in electron volts
is approximately equal to $13.6\ \mathrm{eV}$. This is one of those numbers which is actually
worth remembering, just like the first few digits of the number $\pi$. The physical meaning of
this number has several interpretations. First of all, it is the ground state energy of the
hydrogen atom, but taking into account that the transition from the discrete energy levels
to the continuous spectrum (ionization of the atom) amounts to raising the energy
above zero, you can also interpret this value as the binding energy or ionization
energy of a hydrogen atom: the work required to change the electron's energy from
the ground state to zero. Also, this number fixes the scale of atomic energies in
general. Transition to atoms heavier than hydrogen, which are characterized by
larger atomic numbers, makes the ground state energy more negative, increasing
the binding energy of the atom, which of course totally makes sense (atoms with
larger charge attract electrons more strongly).
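For readers who like checking numbers, the coefficient in Eq. 8.16 is easy to evaluate directly; the following is a minimal sketch in Python, using standard CODATA values for the constants and replacing the reduced mass $\mu$ with the electron mass:

```python
import math

# Standard CODATA values (SI units); the reduced mass is replaced by m_e
m_e = 9.1093837015e-31      # electron mass, kg
e = 1.602176634e-19         # elementary charge, C
eps0 = 8.8541878128e-12     # vacuum permittivity, F/m
hbar = 1.054571817e-34      # reduced Planck constant, J s

# Ground-state energy magnitude E_g = m e^4 / (32 pi^2 eps0^2 hbar^2), Eq. 8.16
E_g = m_e * e**4 / (32 * math.pi**2 * eps0**2 * hbar**2)
rydberg_eV = E_g / e        # convert from joules to electron volts

print(f"1 Ry = {rydberg_eV:.4f} eV")                      # close to 13.6 eV
print([round(-rydberg_eV / n**2, 3) for n in (1, 2, 3)])  # E_n = -E_g / n^2
```

The same few lines also reproduce the whole spectrum $E_n = -E_g/n^2$ of Eq. 8.16.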

If you apply Eq. 8.16 to excitons in semiconductors, it will yield a very different
energy scale. This happens for several reasons. First, the masses of the interacting positive
and negative charges forming an exciton are comparable in magnitude, so one does
need to actually compute the reduced mass. Second, these masses are often an order of
magnitude smaller than the mass of the free electron, which results in a significant
decrease of the binding energy. This decrease is further enhanced by the relatively
large dielectric constant of semiconductors $\varepsilon_r$. All these factors taken together
result in a much larger ground state energy of excitons (remember, the energy is
negative!) with a much smaller ionization or binding energy, which varies across
different semiconductors and can take values of the order of $10^{-3}$ to $10^{-2}\ \mathrm{eV}$.
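As a rough illustration (the effective-mass ratio and dielectric constant below are assumed values, loosely appropriate for a semiconductor like GaAs, and are not taken from the text), the rescaled Rydberg and Bohr radius can be estimated as:

```python
# Illustrative sketch of the exciton rescaling of Eqs. 8.8 and 8.16.
# The effective mass and dielectric constant are assumptions, roughly
# appropriate for GaAs; they are not values given in the text.
RY_EV = 13.606          # hydrogen Rydberg, eV
A_B = 0.529e-10         # hydrogen Bohr radius, m

mu_ratio = 0.06         # exciton reduced mass in units of m_e (assumption)
eps_r = 13.0            # semiconductor dielectric constant (assumption)

binding_eV = RY_EV * mu_ratio / eps_r**2   # rescaled Eq. 8.16
radius_m = A_B * eps_r / mu_ratio          # rescaled Eq. 8.8

print(f"exciton binding energy ~ {binding_eV*1e3:.1f} meV")  # a few meV
print(f"exciton Bohr radius   ~ {radius_m*1e9:.1f} nm")      # ~1e-8 m
```

With these assumed parameters the binding energy indeed lands in the $10^{-3}$–$10^{-2}\ \mathrm{eV}$ window quoted above.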

Finally, I would like to point out that the discrete energy levels of
hydrogen-like atoms occupy the finite spectral region between the ground state and
zero. Despite this fact, the number of these levels is infinite, unlike, for instance, in
the case of the one-dimensional square potential well. This means that with increasing
principal quantum number $n$, the separation between adjacent levels becomes
smaller and smaller, and at some point the discreteness of the energy becomes
unrecognizable, even though the probability to observe the electron infinitely far
from the nucleus is still zero. One can think about this phenomenon as approaching the
classical limit, in which the electron's motion is finite, it is still bound to the nucleus,
but quantum effects become negligibly small.

For each value of $n$, there are several combinations of $j_{\max}$ and $l$ satisfying
Eq. 8.15, which means that all energy levels, except for the ground state, are
degenerate, with several wave functions belonging to the same energy eigenvalue
that differ from each other by the values of $l$ and $m$. The total degree of degeneracy is
easy to compute taking into account that for each $n$, there are $n$ allowed values of $l$
(which obeys the obvious inequality $l < n$, i.e., $l = 0, 1, \dots, n-1$), and for each $l$, there
are $2l + 1$ possible values of $m$. The total number of wave functions corresponding
to the same value of energy is, therefore, given by

$$\sum_{l=0}^{n-1}(2l+1) = 2\,\frac{(n-1)\,n}{2} + n = n^2. \tag{8.18}$$

As expected, this formula yields a single state for $n = 1$, for which $j_{\max} = 0$ and
$l = m = 0$. For the energy level with $n = 2$, Eq. 8.18 predicts the existence of four
states, which we easily recognize as one state with $l = m = 0$ (in which case $j_{\max} = 1$)
and three more characterized by $l = 1,\ m = -1$; $l = 1,\ m = 0$; and $l = 1,\ m = 1$.
For all of them, the maximum power of the polynomial function in the solution is
$j_{\max} = 0$. For some murky and not very important historical reasons, $l = 0$ states
are called s-states, $l = 1$ states are called p-states, $l = 2$ states are d-states, and, finally,
the letter f is reserved for $l = 3$ states. This nomenclature originates from the names
of the types of optical spectral series (sharp, principal, diffuse, and fundamental)
associated with each of these states, but I am not going into this issue any further.
Those who are interested are welcome to google it.
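The counting argument behind Eq. 8.18 can also be verified mechanically; here is a throwaway sketch that enumerates the allowed $(l, m)$ pairs for each $n$:

```python
# Enumerate the degenerate (l, m) states for each n and verify Eq. 8.18
for n in range(1, 6):
    states = [(l, m) for l in range(n) for m in range(-l, l + 1)]
    assert len(states) == n**2      # total degeneracy is n^2
    print(n, len(states))
```

For $n = 2$ the list contains exactly the four states discussed above.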

Replacing all dimensionless variables with the physical radial coordinate $r =
\rho n a_B/Z$, I find that the radial wave function $R_{n,l} = u_{n,l}/r$ with fixed values of $n$ and
$l$ is a product of $(Zr/na_B)^l$, the exponential function $\exp(-Zr/na_B)$, and a polynomial
$v_{n,l}(Zr/na_B)$ of order $n - l - 1$. The polynomials which emerge in this problem
are well known in mathematical physics as associated Laguerre polynomials, defined
by two indexes as $L^p_{q-p}(x)$. The definition of these polynomials can be found in many
textbooks as well as online, but for your convenience, I will provide it here as well.
The definition is somewhat cumbersome and involves an additional polynomial
called simply a Laguerre polynomial (no associate here):

$$L_q(x) = e^x\,\frac{d^q}{dx^q}\left(e^{-x}x^q\right). \tag{8.19}$$

To define the associated Laguerre polynomial, one needs to carry out some
additional differentiation of the simple Laguerre polynomial:

$$L^p_{q-p}(x) = (-1)^p\,\frac{d^p}{dx^p}\,L_q(x). \tag{8.20}$$

It is quite obvious that index $q$ in Eq. 8.19 specifies the degree of the respective
polynomial (the exponential functions obviously cancel each other after the differentiation
is performed). At the same time, index $q - p$ in Eq. 8.20 specifies the ultimate
degree of the associated polynomial (differentiating a polynomial of degree $q$
$p$ times reduces its degree by exactly this amount). So, in terms of these functions, the
polynomial appearing in the hydrogen model can be written as $v_{n,l}(Zr/na_B) =
L^{2l+1}_{n-l-1}(2Zr/na_B)$. The total normalized radial wave function $R_{n,l}(r)$ can be shown
to be

$$R_{n,l}(r) = \sqrt{\left(\frac{2Z}{na_B}\right)^3 \frac{(n-l-1)!}{2n\,(n+l)!}}\;\exp\left(-\frac{Zr}{na_B}\right)\left(\frac{2Zr}{na_B}\right)^l L^{2l+1}_{n-l-1}\left(\frac{2Zr}{na_B}\right). \tag{8.21}$$

I surely hope you are impressed by the complexity of this expression and can
appreciate the amount of labor that went into finding the normalization coefficient
here, which you are given as a gift. I also have to warn you that different authors may
use different definitions of the Laguerre polynomials, which affect the appearance
of Eq. 8.21. More specifically, one might include an extra factor $1/(n+l)!$ either
in the definition of the polynomial or in the normalization factor. Equation 8.21
is written according to the former convention, while if the latter is accepted, the
term $(n+l)!$ must be replaced with $[(n+l)!]^3$. You might find both versions of
the hydrogen wave function on the Internet or in the literature, and my choice was
completely determined by the convention adopted by the popular computational
platform MATHEMATICA©, which I use a lot to perform computations
needed for this book. The total hydrogen wave function, which in the abstract
notation can be presented as $|n,l,m\rangle$, is obtained by multiplying the radial function
and the spherical harmonics $Y_{l,m}(\theta,\varphi)$:

$$\psi_{n,l,m}(r,\theta,\varphi) = R_{n,l}(r)\,Y_{l,m}(\theta,\varphi). \tag{8.22}$$
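A quick numerical check of the normalization of Eq. 8.21 is a good way to make sure you are using the intended Laguerre convention. The sketch below is illustrative only: it evaluates $L^{(\alpha)}_k$ from its standard finite series and integrates on a crude uniform grid, which is enough for a few-digit check:

```python
import math

def assoc_laguerre(k, alpha, x):
    """Associated Laguerre polynomial L_k^(alpha)(x), standard convention
    (the one assumed in Eq. 8.21), evaluated from its finite series."""
    return sum((-1)**i * math.comb(k + alpha, k - i) * x**i / math.factorial(i)
               for i in range(k + 1))

def R(n, l, r, Z=1.0, aB=1.0):
    """Normalized radial function of Eq. 8.21 (r measured in units of aB)."""
    x = 2 * Z * r / (n * aB)
    norm = math.sqrt((2 * Z / (n * aB))**3
                     * math.factorial(n - l - 1) / (2 * n * math.factorial(n + l)))
    return norm * math.exp(-x / 2) * x**l * assoc_laguerre(n - l - 1, 2 * l + 1, x)

# Check that  integral of R^2 r^2 dr  equals 1, midpoint rule on [0, 60]
dr, rmax = 1e-3, 60.0
for n, l in [(1, 0), (2, 0), (2, 1), (3, 1)]:
    r_values = [dr * (i + 0.5) for i in range(int(rmax / dr))]
    integral = sum(R(n, l, r)**2 * r**2 * dr for r in r_values)
    print(n, l, round(integral, 6))   # should all be close to 1
```

If your Laguerre routine uses the convention with the extra $1/(n+l)!$ factor mentioned above, the same check will fail loudly, which is exactly the point of running it.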

Different factors in Eqs. 8.22 and 8.21 are responsible for different physical
effects; however, before giving them any useful interpretation, I have to remind you
that the respective probability distribution density is given by

$$P(r,\theta) = |\psi_{n,l,m}(r,\theta,\varphi)|^2\, r^2\sin\theta = \left(\frac{2Z}{na_B}\right)^3 \frac{(n-l-1)!}{2n\,(n+l)!}\,\exp\left(-\frac{2Zr}{na_B}\right)\left(\frac{2Zr}{na_B}\right)^{2l} r^2 \left[L^{2l+1}_{n-l-1}\left(\frac{2Zr}{na_B}\right)\right]^2 \left[P^m_l(\cos\theta)\right]^2 \sin\theta \tag{8.23}$$


where I replaced the spherical harmonics by the product of the associated Legendre
functions $P^m_l(\cos\theta)$ and $\exp(im\varphi)$ and took into account that the latter disappears
after multiplication by the respective complex-conjugated expression. The additional
factors $r^2$ and $\sin\theta$ are due to the spherical volume element, which has the form
$dV = r^2\sin\theta\, d\theta\, d\varphi\, dr$.

The exponential factor describes how fast the wave function decreases at infinity.
The respective characteristic scale

$$r_{at} = \frac{na_B}{Z} \tag{8.24}$$

can be interpreted (quite loosely, though) as the size of the atom, because the
probability to find the electron at a distance $r \gg r_{at}$ becomes exponentially small.
In the case of the atom in the ground state ($n = 1$), it is easy to show (i.e.,
if you remember that a maximum of a function is given by a zero of its first
derivative) that the distance $r = r_{at}$ corresponds to the maximum of the probability
$P(r) = \int_0^\pi P(r,\theta)\,d\theta$. In the case of a hydrogen atom ($Z = 1$), $r_{at} = a_B$, and if this
atom is in vacuum ($\varepsilon_r = 1$), the Bohr radius is determined by fundamental constants
only. Replacing the reduced mass with the mass of the electron, you can find that
the size of the hydrogen atom in the ground state is $a_B \approx 0.5\times 10^{-10}\ \mathrm{m}$. This
number sets the atomic spatial scale just as $13.6\ \mathrm{eV}$ sets the typical energy
scale. In the case of excitons in semiconductors, the characteristic scale becomes
much larger for the same reasons why the energy scale becomes smaller: the large
dielectric constant and smaller masses yield a larger $a_B$; see Eq. 8.8. As a result, the
typical size of an exciton can be as large as $10^{-8}\ \mathrm{m}$, which is extremely important
for semiconductor physics, as it allows significant simplification of the quantum
description of excitons.
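The statement about the maximum of $P(r)$ is easy to confirm either by setting $dP/dr = 0$ for $P(r) \propto r^2 e^{-2r/a_B}$ or numerically; here is a minimal sketch, in units where $a_B = 1$:

```python
import math

aB = 1.0    # work in units of the Bohr radius

def P(r):
    """Unnormalized ground-state radial density, r^2 exp(-2r/aB) (n=1, l=0)."""
    return r**2 * math.exp(-2 * r / aB)

# Locate the maximum of P(r) on a fine grid
grid = [i * 1e-4 for i in range(1, 50_000)]
r_peak = max(grid, key=P)
print(round(r_peak, 3))   # the maximum sits at r = aB
```

The grid search lands on $r = a_B$, in agreement with the derivative argument above.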

The radial distribution for higher-lying states can have several maxima, so such
a direct interpretation of $r_{at}$ becomes impossible, but it still can be thought of as
a cutoff distance, starting from which the probability for the electron to wander
off dramatically decreases. It is interesting that this parameter increases with $n$, so
excited atoms not only have more energy, but they are also larger in size. Figure 8.2
presents a number of radial functions for your perusal, which illustrate most of the
properties discussed here.

The factors containing the power of the radial coordinate are responsible for
electrons not falling onto the nucleus: the probability that $r = 0$ is strictly zero.
This probability for $r < r_{at}$ decreases with increasing angular momentum number
$l$, which can be interpreted as a manifestation of the "centrifugal" force keeping
rotating particles away from the center of rotation. The Laguerre polynomial
factor is essentially responsible for the behavior of the radial wave function at
intermediate distances between zero and $r_{at}$: the degree of the respective polynomial
determines how many zeroes the radial wave function has. Finally, the Legendre
function $P^m_l(\cos\theta)$ is responsible for the directionality of the probability distribution
with respect to the $Z$-axis. States with zero angular momentum are described by a
completely isotropic wave function, which does not depend on direction at all. For
states with non-zero $l$, an important parameter is $l - m$, which yields the number of
zeroes of the Legendre function and can also be used to determine the number of the
respective maxima. The properties of the Legendre functions have already been
discussed in Sect. 5.1.4, and the plots illustrating them were presented in Fig. 5.3,
which you might want to consult to refresh your memory.

Fig. 8.2 A few radial functions with different values of the principal and orbital numbers $n$ and $l$.
All graphs in the left panel correspond to $l = 0$ and increasing $n$ between 1 and 3. The number
of zeroes of the functions is equal to $n - l - 1$. The graphs in the right panel correspond to
$n = 3, l = 1$; $n = 3, l = 2$; and $n = 4, l = 1$ (which one is which you can figure out yourselves
by counting zeroes). The functions are not normalized for convenience of display

8.3 Virial and Feynman–Hellmann Theorems and Expectation Values of the Radial Coordinate in a Hydrogen Atom

I will finish the chapter by discussing one apparently very special and technical,
but at the same time practically very important, problem: calculating the expectation
values $\langle r^p\rangle$ of various powers of the radial coordinate, where $p$ can be any negative
or positive integer, in the stationary states of a hydrogen atom. Formally, calculation
of these expectation values involves evaluation of the integrals

$$\langle r^p\rangle = \int_0^\infty r^{p+2}\,[R_{nl}(r)]^2\,dr \tag{8.25}$$

where $R_{nl}(r)$ has been defined in Eq. 8.21 and the extra 2 in $r^{p+2}$ comes from the
term $r^2$ in the probability distribution generated by the hydrogen-like wave function,
Eq. 8.23. Direct calculation of the integral in Eq. 8.25 is a hopeless task given the
complexity of the radial function, but it is possible to circumvent the problem by
relying on the radial equation, Eq. 8.7, itself, rather than on the explicit form of its
solution, Eq. 8.21.


But first, let me derive a remarkable relation between the expectation values
of the kinetic and potential energies of a quantum particle, known as the virial theorem.
Consider the expectation value of the operator $\hat{\mathbf{r}}\cdot\hat{\mathbf{p}}$ in an arbitrary quantum state and
compute its time derivative using the Heisenberg picture of quantum mechanics
(the expectation values do not really depend on which picture is used, but working
with time-dependent Heisenberg operators and time-independent states is more
convenient than using the Ehrenfest theorem, Eq. 4.17, for Schrödinger operators):

$$\frac{d}{dt}\langle\hat{\mathbf{r}}\cdot\hat{\mathbf{p}}\rangle = \left\langle\frac{d\hat{\mathbf{r}}}{dt}\cdot\hat{\mathbf{p}}\right\rangle + \left\langle\hat{\mathbf{r}}\cdot\frac{d\hat{\mathbf{p}}}{dt}\right\rangle.$$

Applying the Heisenberg equations for the position and momentum operators, Eqs. 4.28
and 4.29, to this expression, I obtain

$$\frac{d}{dt}\langle\hat{\mathbf{r}}\cdot\hat{\mathbf{p}}\rangle = \left\langle\frac{\hat{\mathbf{p}}}{m}\cdot\hat{\mathbf{p}}\right\rangle - \left\langle\hat{\mathbf{r}}\cdot\nabla\hat{V}\right\rangle = 2\left\langle\hat{K}\right\rangle - \left\langle\hat{\mathbf{r}}\cdot\nabla\hat{V}\right\rangle. \tag{8.26}$$

The left-hand side of Eq. 8.26 must vanish if the state used to compute the
expectation value is an eigenvector of the Hamiltonian (a stationary state in the
Schrödinger picture) because the expectation value of any operator in a stationary
state is time-independent. This allows me to conclude that in the stationary states,
the expectation values of the kinetic and potential energies satisfy the relation

$$2\left\langle\hat{K}\right\rangle = \left\langle\hat{\mathbf{r}}\cdot\nabla\hat{V}\right\rangle \tag{8.27}$$

known as the virial theorem. In the case of the Coulomb potential of the hydrogen atom
Hamiltonian, this theorem yields

$$2\left\langle\hat{K}\right\rangle = \frac{Ze^2}{4\pi\varepsilon_r\varepsilon_0}\left\langle\frac{1}{r}\right\rangle. \tag{8.28}$$

Since the expectation value of the Hamiltonian in its own stationary state is simply
equal to the respective eigenvalue, I can write for the hydrogen-like Hamiltonian:

$$\left\langle\hat{H}\right\rangle = \left\langle\hat{K}\right\rangle - \frac{Ze^2}{4\pi\varepsilon_r\varepsilon_0}\left\langle\frac{1}{r}\right\rangle \;\Rightarrow\; E_n = -\frac{Ze^2}{8\pi\varepsilon_r\varepsilon_0}\left\langle\frac{1}{r}\right\rangle$$

where I replaced the expectation value of the Hamiltonian with its eigenvalue for
the $n$-th stationary state and used Eq. 8.28 to eliminate the expectation value of the
kinetic energy. Finally, using Eq. 8.16 for $E_n$, I have


$$\frac{Z^2 e^4\mu}{32\pi^2\varepsilon_r^2\varepsilon_0^2\hbar^2}\,\frac{1}{n^2} = \frac{Ze^2}{8\pi\varepsilon_r\varepsilon_0}\left\langle\frac{1}{r}\right\rangle \;\Rightarrow\; \left\langle\frac{1}{r}\right\rangle = \frac{Ze^2\mu}{4\pi\varepsilon_r\varepsilon_0\hbar^2}\,\frac{1}{n^2} = \frac{Z}{a_B n^2} \tag{8.29}$$

where in the last step I used Eq. 8.8 for the Bohr radius $a_B$. The expectation values $\langle r^p\rangle$
for almost all other values of $p$ can be derived using the so-called Kramers' recursion
relation, which I provide here without proof:

$$\frac{p+1}{n^2}\langle r^p\rangle - (2p+1)\,\frac{a_B}{Z}\,\langle r^{p-1}\rangle + \frac{p\,a_B^2}{4Z^2}\left[(2l+1)^2 - p^2\right]\langle r^{p-2}\rangle = 0. \tag{8.30}$$

It is easy to see that I can indeed use Eqs. 8.29 and 8.30 to find $\langle r^p\rangle$ for any positive
$p$, but Kramers' relation fails to yield $\langle r^{-2}\rangle$: this term could arise if you set
$p = 0$, but, unfortunately, the corresponding term vanishes because of the factor
$p$ in it. Therefore, I have to find an independent way of computing $\langle r^{-2}\rangle$. Luckily,
there exists a cool theorem, which Richard Feynman derived while working on his
undergraduate thesis, called the Feynman–Hellmann theorem.¹ The derivation of this
theorem is based on an obvious identity, which is valid for an arbitrary Hamiltonian
and which I have already mentioned when deriving Eq. 8.29. To reiterate, the
identity states that

$$E_n = \langle\psi_n|\hat{H}|\psi_n\rangle$$

if $|\psi_n\rangle$ are the eigenvectors of $\hat{H}$. Now assume that the Hamiltonian $\hat{H}$ depends on
some parameter $\lambda$. It can be, for instance, the mass of a particle, or its charge, or
something else. It is obvious then that the eigenvalues and the eigenvectors also
depend on the same parameter. Differentiating this identity with respect to this
parameter, you get

$$\frac{\partial E_n}{\partial\lambda} = \left\langle\frac{\partial\psi_n}{\partial\lambda}\right|\hat{H}\left|\psi_n\right\rangle + \left\langle\psi_n\right|\hat{H}\left|\frac{\partial\psi_n}{\partial\lambda}\right\rangle + \left\langle\psi_n\right|\frac{\partial\hat{H}}{\partial\lambda}\left|\psi_n\right\rangle.$$

The first two terms in this expression can be transformed as

$$\left\langle\frac{\partial\psi_n}{\partial\lambda}\right|\hat{H}\left|\psi_n\right\rangle + \left\langle\psi_n\right|\hat{H}\left|\frac{\partial\psi_n}{\partial\lambda}\right\rangle = E_n\left(\left\langle\frac{\partial\psi_n}{\partial\lambda}\Big|\psi_n\right\rangle + \left\langle\psi_n\Big|\frac{\partial\psi_n}{\partial\lambda}\right\rangle\right) = E_n\,\frac{\partial\langle\psi_n|\psi_n\rangle}{\partial\lambda} = 0$$

1Hellmann derived this theorem 4 years before Feynman but published it in an obscure Russian
journal, so it remained unknown until Feynman rediscovered it.


where I used the fact that all eigenvectors are normalized to unity, so that their norm,
appearing in the last line of the above derivation, is just a constant. Thus, here is the
statement of the Feynman–Hellmann theorem:

$$\frac{\partial E_n}{\partial\lambda} = \left\langle\psi_n\right|\frac{\partial\hat{H}}{\partial\lambda}\left|\psi_n\right\rangle. \tag{8.31}$$

This is a very simple, almost trivial result, and it is quite amazing that it can be used
to solve rather complicated problems, such as finding the expectation value $\langle r^{-2}\rangle$
in the hydrogen atom problem. So, let's see how this is achieved. Going back to
Eq. 8.7, you can recognize that this equation can be seen as an eigenvalue equation
for the Hamiltonian

$$\hat{H}_r = -\frac{\hbar^2}{2\mu r^2}\frac{d}{dr}\left(r^2\frac{d}{dr}\right) + \frac{\hbar^2 l(l+1)}{2\mu r^2} - \frac{1}{4\pi\varepsilon_r\varepsilon_0}\,\frac{Ze^2}{r} \tag{8.32}$$

and that the hydrogen energies are eigenvalues of this Hamiltonian. Therefore, I can
apply the Feynman–Hellmann theorem to this Hamiltonian, choosing, for instance,
the orbital quantum number $l$ as the parameter $\lambda$. Differentiation of Eq. 8.32 with
respect to $l$ yields

$$\frac{\partial\hat{H}_r}{\partial l} = \frac{\hbar^2(2l+1)}{2\mu r^2}.$$

In order to find the derivative $\partial E_n/\partial l$, one needs to recall that the principal quantum
number is related to the orbital number as $n = l + n_r + 1$, so that

$$\frac{\partial E_n}{\partial l} = \frac{\partial E_n}{\partial n} = \frac{Z^2 e^2}{4\pi\varepsilon_r\varepsilon_0 a_B}\,\frac{1}{n^3}.$$

Now, applying the Feynman–Hellmann theorem, I can write

$$\frac{\hbar^2(2l+1)}{2\mu}\left\langle\frac{1}{r^2}\right\rangle = \frac{Z^2 e^2}{4\pi\varepsilon_r\varepsilon_0 a_B}\,\frac{1}{n^3}$$

where I used Eq. 8.17 for the energy. Rearranging this result and applying Eq. 8.8
for the Bohr radius, I obtain the final expression for $\langle r^{-2}\rangle$:

$$\left\langle\frac{1}{r^2}\right\rangle = \frac{Z^2 e^2\mu}{2\pi\hbar^2\varepsilon_r\varepsilon_0 a_B}\,\frac{1}{(2l+1)\,n^3} = \frac{2Z^2}{a_B^2}\,\frac{1}{(2l+1)\,n^3}. \tag{8.33}$$

Now, boys and girls, if what you have just witnessed is not a piece of pure magic,
with the Feynman–Hellmann theorem working as a magic wand, I do not know what
else you would call it. And if you are not able to appreciate the awesomeness of this
derivation, you probably shouldn't be studying quantum mechanics, or physics at
all for that matter. This result is also a key to finding, with the help of Kramers'
relation, Eq. 8.30, the expectation values $\langle r^p\rangle$ for any $p$. For instance, to find
$\langle r^{-3}\rangle$, you just need to use Eq. 8.30 with $p = -1$:

$$\frac{a_B}{Z}\langle r^{-2}\rangle - \frac{a_B^2}{4Z^2}\left[(2l+1)^2 - 1\right]\langle r^{-3}\rangle = 0 \;\Rightarrow$$

$$\left\langle\frac{1}{r^3}\right\rangle = \frac{4Z}{a_B}\,\frac{1}{(2l+1)^2 - 1}\left\langle\frac{1}{r^2}\right\rangle = \left(\frac{Z}{a_B}\right)^3 \frac{2}{l(l+1)(2l+1)\,n^3}. \tag{8.34}$$

If the sheer wonder at our ability to compute hrpi without using the unseemly
Laguerre polynomials is not a sufficient justification for you to vindicate spending
some time doing these calculations, you will have to wait till Chap. 14, where I
will put this result to actual use in understanding the fine structure of the spectra of
hydrogen-like atoms.
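The whole machinery of Eqs. 8.29, 8.30, 8.33, and 8.34 can be packaged into a few lines of code. The sketch below works in units $a_B = Z = 1$ and climbs Kramers' relation upward from $\langle r^0\rangle = 1$ and $\langle r^{-1}\rangle = 1/n^2$; the closed form $\langle r\rangle = [3n^2 - l(l+1)]/2$ that it reproduces is a standard result:

```python
def kramers_chain(n, l, p_max):
    """Expectation values <r^p> for a hydrogen state |n,l>, in units aB = Z = 1.
    Starts from <r^0> = 1 and <r^-1> = 1/n^2 (Eq. 8.29) and iterates Eq. 8.30:
    (p+1)/n^2 <r^p> - (2p+1) <r^{p-1}> + (p/4)[(2l+1)^2 - p^2] <r^{p-2}> = 0."""
    avg = {0: 1.0, -1: 1.0 / n**2}
    for p in range(1, p_max + 1):
        avg[p] = (n**2 / (p + 1)) * (
            (2 * p + 1) * avg[p - 1]
            - (p / 4) * ((2 * l + 1)**2 - p**2) * avg[p - 2]
        )
    return avg

n, l = 2, 1
avg = kramers_chain(n, l, p_max=2)
print(avg[1])   # <r> = (3n^2 - l(l+1))/2 = 5 for n=2, l=1

# Downward direction: Eq. 8.33 for <r^-2>, then Eq. 8.34 via p = -1
inv_r2 = 2.0 / ((2 * l + 1) * n**3)          # Eq. 8.33
inv_r3 = 4 * inv_r2 / ((2 * l + 1)**2 - 1)   # Eq. 8.30 with p = -1
print(inv_r2, inv_r3)   # for n=2, l=1: 1/12 and 1/24
```

Note that the recursion never touches a Laguerre polynomial, which is precisely the point being celebrated above.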

8.4 Problems

Problem 103 Using Eqs. 8.4 and 8.5 together with the canonical commutation
relations for single-particle coordinates and momenta, derive the commutator
between the relative position vector $\mathbf{r}$ and the corresponding momentum $\mathbf{p}_r$ to convince
yourself that these variables, indeed, obey the canonical commutation relations.

Problem 104 Verify that Eq. 8.10 defines a quantity of the dimension of energy.

Problem 105

1. Derive Eq. 8.14 by applying the power series method to Eq. 8.12 and carrying out
the procedure outlined in the text.

2. Find all radial functions with $n = 1$ and $n = 2$. Normalize them.

Problem 106 Using the definition of the associated Laguerre polynomials provided in
the text, find explicit expressions for the radial functions corresponding to the states
considered in the previous problem. Normalize them and make sure that the results
are identical to those obtained previously.

Problem 107 An operator of the dipole moment is defined as $\hat{\mathbf{d}} = e\hat{\mathbf{r}}$, where $e$ is
the elementary charge and $\hat{\mathbf{r}}$ is the position operator of the electron in the hydrogen
atom. A dipole moment of a transition is defined as a matrix element of this operator
between the initial and final states of a system: $\mathbf{d}_{nlm;n'l'm'} \equiv \langle nlm|\hat{\mathbf{d}}|n'l'm'\rangle$. Evaluate
this dipole moment for the transitions between the ground state of the atom and all
degenerate states characterized by $n = 2$.
Problem 108 Find the expectation values $\langle r\rangle$, $\langle 1/r\rangle$, and $\langle r^2\rangle$ for a hydrogen atom
in the $|2,1,m\rangle$ state.


Problem 109 Using the results of the previous problem and the full 3-D Schrödinger
equation with non-separated variables, find $\langle\hat{p}^2\rangle$. Find a relation between the
expectation values of the potential and kinetic energies.

Problem 110 A hydrogen atom is prepared in an initial state

$$\psi(\mathbf{r},0) = \frac{1}{\sqrt{2}}\left(\psi_{2,1,1}(r,\theta,\varphi) + \psi_{1,0,0}(r,\theta,\varphi)\right).$$

Find the expectation value of the potential energy as a function of time.

Problem 111 Consider a hydrogen atom in a state described by the following wave
function:

$$\psi(\mathbf{r}) = R_{1,0}(r) + a\,\frac{z - \sqrt{2}\,x}{r}\,R_{2,1}(r)$$

where

$$R_{n,l}(r) = \left(\frac{r}{na_B}\right)^{l+1}\exp\left(-\frac{r}{na_B}\right) L^{2l+1}_{n-l-1}\left(\frac{2r}{na_B}\right).$$

1. Rewrite this function in terms of normalized hydrogen wave functions.
2. Find the values of coefficient $a$ that would make the entire function normalized.
3. If you measure $\hat{L}^2$ and $\hat{L}_z$, what values can you get and with what probabilities?
4. If you measure energy, which values are possible and what are their probabilities?
5. Find the probability that the measurement of the particle's position will find it in
the direction specified by the polar angle $\theta$: $44^\circ < \theta < 46^\circ$.
6. Find the probability that the measurement of the particle's position will find the
particle at a distance $0.5a_B < r < a_B$ from the nucleus.

Chapter 9
Spin 1/2

9.1 Introduction: Why Spin?

The model of a pure spin 1/2, detached from all other degrees of freedom of a
particle, is one of the simplest in quantum mechanics. Yet, it defies our intuition and
resists developing that pleasant sensation of being able to relate a new concept to
something that we think we already know (or at least are used to thinking about).
We call this feeling "intuitive understanding," and it does play an important albeit
mysterious role in our ability to use new concepts. The reason for this difficulty, of
course, lies in the fact that spin is a purely quantum phenomenon with no reasonable
way to model it on something that we know from classical physics. While the
only bulletproof remedy known to me for this predicament is practice, I will try
to somewhat ease your pain by taking the time to develop the concept of spin and by
providing empirical and theoretical arguments for its inevitability.

Experimentally, spin manifests itself most directly via the interaction between electrons
and a magnetic field and can be defined as an inherent property of electrons
responsible for this interaction. This definition is akin to the definition of electric
charge as a property responsible for the electron's interaction with the electric field, or
of mass as a characteristic determining the electron's acceleration under the action of a
force. The substantial difference, of course, is that charge and mass are immutable
scalar quantities, our views of which do not change when we transition from classical
to quantum theories of nature. The concept of spin, on the other hand, is purely
quantum and embodies two distinct types of entities. First is a Hermitian vector
operator, characterized by two distinct eigenvalues and corresponding eigenvectors,
which specify the possible experimental outcomes when one attempts to measure
spin. Second are the spinors: a particular type of vector subjected to the action of
the spin operator and representing various spin states; they control the probability
of one or another outcome of the measurement.

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,
https://doi.org/10.1007/978-3-319-71550-6_9

273


To untangle the connections between spin, angular momentum, and magnetic
interactions, let me begin with a simple example of a classical electron moving
along a circular orbit of radius $R$ with period $T$. Taken literally, this example does
not make much sense, but it does produce surprisingly reasonable results, so it can
be considered a convenient and meaningful metaphor. So, imagine an observer
placed at some point on the orbit and counting the number of times the electron
passes by during some time $t \gg T$. The number of "sightings" of the electron, $n$,
is related to the duration of the experiment $t$ and the period $T$ as $t = nT$. The total
amount of charge that passes by the observer is obviously $q = ne = et/T$, where $e$
is the elementary charge. The amount of charge passing across per unit time is what
we call the electric current, which can be found as $I = q/t = et/(Tt) = e/T$. This
crude trick replaced a circulating electron with a stationary electric current, which,
of course, only makes sense if I spread the entire charge of the electron along its
orbit by some kind of averaging procedure. But as I said, I am treating this model
only as a metaphor. Accepting this metaphor, I can follow up by remembering that
the interaction between a steady loop of current and a uniform magnetic field is
described by the loop's magnetic dipole moment $\boldsymbol{\mu}$, defined as $\boldsymbol{\mu} = IA\mathbf{n}$, where $A$
is the area of the loop and $\mathbf{n}$ is the unit vector normal to the plane of the loop with
direction determined by the right-hand rule (do you remember the right-hand rule?).
In the case of the orbiting electron, the loop area is $A = \pi R^2$, so I have

$$\boldsymbol{\mu}_L = \frac{e}{T}\,\pi R^2\,\mathbf{n} = \frac{ev}{2\pi R}\,\pi R^2\,\mathbf{n} = \frac{e\,m_e vR}{2m_e}\,\mathbf{n} = -\frac{e}{2m_e}\,\mathbf{L} \tag{9.1}$$

where I (a) expressed the period $T$ in terms of the circumference $2\pi R$ and the orbital
velocity $v$: $T = 2\pi R/v$, (b) multiplied the numerator and the denominator of
the resulting expression by the electron's mass $m_e$, and (c) recognized that $m_e vR\,\mathbf{n}$
is a vector, which is equal in magnitude and opposite in direction to the orbital
momentum of the electron $\mathbf{L}$. To figure out the "opposite" part of the last statement,
recall that the magnetic moment is defined by the direction of the current (the motion
of the positive charges), while the charge of our orbiting electron is negative, and,
therefore, it rotates in the direction opposite to the current. Equation 9.1 establishes
the connection between the magnetic dipole moment of the electron and its orbital
angular momentum.

The interaction between a classical magnetic dipole and a uniform magnetic field
$\mathbf{B}$ can be described by a potential energy:

$$U_B = -\boldsymbol{\mu}_L\cdot\mathbf{B} = \frac{e}{2m_e}\,\mathbf{L}\cdot\mathbf{B}. \tag{9.2}$$

According to this expression, the potential energy has a minimum when the
magnetic dipole is oriented along the magnetic field and a maximum when they
are antiparallel to each other. For both of these orientations, the torque on the
dipole $\boldsymbol{\tau} = \boldsymbol{\mu}_L\times\mathbf{B}$ is zero, so these are two equilibrium positions, but while the
former is a stable equilibrium, the latter is unstable. Equation 9.2 also establishes
the connection between the potential energy $U_B$ and the electron's angular momentum
$\mathbf{L}$, which is quite useful for transitioning to the quantum description. Quantization in
this case consists merely in promoting the components of the angular momentum to
the status of operators. This newly born operator $\hat{U}_B$ can now be added to the
Hamiltonian $\hat{H}_0$ describing the electron in the absence of the magnetic field to yield

$$\hat{H} = \hat{H}_0 + \frac{e}{2m_e}\,\mathbf{B}\cdot\hat{\mathbf{L}}. \tag{9.3}$$

$\hat{H}_0$, for instance, can describe an electron moving in some central potential $V(r)$ (the
Coulomb potential would be a good example), and I will assume that its eigenvalues
$E_{n,l}$ and eigenvectors $|n,l,m\rangle$ are known. The choice of notation here reflects the
fact that the eigenvectors of a Hamiltonian with a central potential must also be the
eigenvectors of the angular momentum operators $\hat{L}^2$ and $\hat{L}_z$ and that its eigenvalues do
not depend on the magnetic quantum number $m$.

It is quite easy to verify that if I choose the polar ($Z$)-axis of the coordinate system
in the direction of the uniform magnetic field $\mathbf{B}$, the eigenvectors $|n,l,m\rangle$ of $\hat{H}_0$ remain
eigenvectors of the total Hamiltonian given by Eq. 9.3. The corresponding
eigenvalues are found as

$$\left(\hat{H}_0 + \frac{eB}{2m_e}\hat{L}_z\right)|n,l,m\rangle = E_{n,l}|n,l,m\rangle + \frac{eB}{2m_e}\hbar m\,|n,l,m\rangle = \left(E_{n,l} + \hbar\,\frac{eB}{2m_e}\,m\right)|n,l,m\rangle. \tag{9.4}$$

The combination of fundamental constants $e\hbar/2m_e$ has the dimension of a magnetic
dipole moment and is prominent enough to warrant giving it its own name. The Bohr
magneton $\mu_B$ is defined as

$$\mu_B = \frac{e\hbar}{2m_e} \tag{9.5}$$

so that the expression for the energy eigenvalues can be written down as

$$E^Z_{n,l,m} = E_{n,l} + m\mu_B B. \tag{9.6}$$

The term $m\mu_B B$ can be interpreted as the energy of interaction between the uniform
magnetic field and a quantized magnetic moment with values that are multiples
of $\mu_B$. In this sense, the Bohr magneton can be thought of as a quantum of
magnetic dipole moment. The most remarkable prediction of this simple computation
is the $m$-dependence of the resulting energy levels, which is responsible for
lifting the original $(2l+1)$-fold degeneracy of the energy eigenvectors. Since the magnetic
field is the primary reason for this, it seems quite natural to give the quantum number $m$
the name of "magnetic" number.


Experimentally, this degeneracy lifting is observed via the Zeeman effect—
splitting of the absorption or emission spectral lines in the presence of the magnetic
field. I will discuss the relation between the absorption/emission of light and atomic
energy levels in more detail in Part III of the book, but at this point, it is sufficient
to recall old Bohr’s postulates, one of which relates frequencies of the absorbed or
emitted light to atomic energy levels:

!˛;ˇ D E˛ � Eˇ„ ;

where $\alpha, \beta$ are composite indexes replacing the groups $n, l, m$ for the sake of notational simplicity. So, if you observe, say, light emission due to the transition from the first excited state of the hydrogen atom with $n = 2$ to the ground state, then in the absence of a magnetic field you would see just one emission line, formed by transitions from the states $|2,0,0\rangle$, $|2,1,-1\rangle$, $|2,1,0\rangle$, and $|2,1,1\rangle$, all of which have the same energy, $E_2 = -\tilde{E}/4$, where $\tilde{E}$ was defined in Eq. 8.10. When the magnetic field is turned on, two of these states, $|2,1,-1\rangle$ and $|2,1,1\rangle$, acquire magnetic-field-related corrections:

E_{2,-1} = -\tilde{E}/4 - \mu_B B
E_{2,1} = -\tilde{E}/4 + \mu_B B,

making their energies different from each other and from $E_{2,0}$. As a result, instead of a single emission line with frequency $\omega = (E_2 - E_1)/\hbar = 3\tilde{E}/4\hbar$, an experimentalist would observe three lines at frequencies

\omega_{-1} = \frac{3\tilde{E}}{4\hbar} - \frac{\mu_B}{\hbar}B

\omega_0 = \frac{3\tilde{E}}{4\hbar}

\omega_1 = \frac{3\tilde{E}}{4\hbar} + \frac{\mu_B}{\hbar}B.
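The numbers behind this splitting are easy to appreciate with a short computation. Here is a minimal Python sketch; the field value $B = 1\,$T is purely illustrative, and I assume the energy scale $\tilde{E}$ of Eq. 8.10 equals the hydrogen ground-state (Rydberg) energy of about 13.6 eV:

```python
import numpy as np

# Standard CODATA values (SI units)
e = 1.602176634e-19      # elementary charge, C
hbar = 1.054571817e-34   # reduced Planck constant, J s
m_e = 9.1093837015e-31   # electron mass, kg

mu_B = e * hbar / (2 * m_e)       # Bohr magneton, Eq. 9.5

B = 1.0                           # magnetic field, T (illustrative value)
E_tilde = 13.605693 * e           # assumed: E-tilde of Eq. 8.10 = Rydberg energy, J

omega_0 = 3 * E_tilde / (4 * hbar)   # unperturbed line frequency, rad/s
delta = mu_B * B / hbar              # Zeeman shift of the m = +/-1 lines, rad/s

for m, omega in [(-1, omega_0 - delta), (0, omega_0), (1, omega_0 + delta)]:
    print(f"m = {m:+d}: omega = {omega:.6e} rad/s")

# The fractional splitting is tiny at laboratory fields:
print(f"relative splitting: {delta / omega_0:.2e}")
```

Even at a strong laboratory field of 1 T, the shift is a few parts per million of the line frequency, which is why resolving the Zeeman splitting requires good spectrometers.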

You should not think, though, that by deriving Eq. 9.4 I completely solved the Zeeman effect. The actual problem is much more complicated and involves the addition of orbital and spin magnetic moments, as well as multi-electron effects, relativistic corrections, the magnetic moment of the nucleus, etc. What I did was just an illustration designed to make a particular point: the magnetic-field lifting of the $2l+1$-fold degeneracy of atomic levels gives rise to an odd number of closely positioned spectral lines. While for some atoms an odd number of lines is indeed observed, a large number of other observations manifest splitting into an even number of lines. This phenomenon, called the anomalous Zeeman effect, cannot be explained by interaction with the orbital magnetic moment, because an even number of lines implies half-integer values of $l$. To explain this effect, we have to admit that in addition to the "normal" orbital angular momentum, electrons also have another magnetic moment, which

9.1 Introduction: Why Spin? 277

cannot be constructed from the coordinate and momentum operators and has to be, therefore, an intrinsic property of the electron, unrelated to the other, regular (spatial-temporal) observables. The lowest number of observed split lines was two. Equating $2l+1$ to 2, you find that this splitting corresponds to $l = 1/2$. If you also recall that $l$ is the maximum value of the magnetic number $m$, you might realize that $m$ in this case can take only two values, $m = \pm 1/2$.

A meticulous and mischievous reader might, of course, ask whether it is absolutely necessary to derive a magnetic dipole moment from an angular momentum. Can a magnetic moment exist just by itself, with no angular momentum attached to it? The answer to the first question is yes, and to the second one is, obviously, no. To justify these answers, however, is not so easy, and the path toward realizing that electrons do possess an intrinsic angular momentum, which can be in one of two possible states, was a long one. Physicists such as Wolfgang Pauli (Austria-Switzerland-USA) and Arnold Sommerfeld (Germany) recognized very early that the purely orbital states of electrons proposed in Bohr's model of atoms could not explain all the experimental data, which consistently indicated that the actual number of states is double what Bohr's model predicted. Pauli was writing about the "two-valuedness" of electrons in early 1925, as he needed it to explain the structure of atoms and to formulate his famous exclusion principle. Later in 1925, two graduate students of Paul Ehrenfest from Leiden, the Netherlands, Goudsmit and Uhlenbeck, published a paper in which they proposed that the required additional states come from an intrinsic angular momentum of electrons due to their "spinning" on their own axis. They postulated that this new angular momentum of the electron, $\mathbf{S}$, is related to its magnetic moment $\boldsymbol{\mu}_s$ in a way similar to the relation between orbital momentum and orbital magnetic moment, but in order to fit the experimental data, they had to multiply the Bohr magneton by 2:

\boldsymbol{\mu}_s = -2\frac{e}{2m_e}\mathbf{S} = -\frac{2\mu_B}{\hbar}\mathbf{S}.   (9.7)

The idea appeared so ridiculous to many serious physicists (such as Lorentz) that the students almost withdrew their paper, but luckily for them (and for physics), it was too late, and the paper was published. Eventually, it was recognized that while it was indeed wrong to think that a point-like particle such as an electron can actually spin about its axis (estimates of the required spinning speed would put it well above the speed of light), so this classical mechanistic interpretation had to go, the idea of an intrinsic angular momentum, which "just is" as one of the attributes of electrons, survived, committing the names of Goudsmit and Uhlenbeck to the history of physics. Ironically, this was the highest achievement of their lives: they both made decent careers in physics, moving to the USA and securing respectable professorial positions, but they never did anything as significant as their almost withdrawn student paper on spin.

There are other, purely theoretical arguments for understanding spin as a different kind of angular momentum, but this discussion is for a different time and a different book. At this point, let me just mention that if we want to be able to

add orbital angular momentum and spin angular momentum, which is absolutely necessary to explain a host of effects in atomic spectra, we must require that they both be described by objects of the same mathematical nature. This means that if the orbital momentum is described in quantum mechanics by three operator components $\hat{L}_x$, $\hat{L}_y$, and $\hat{L}_z$ of the angular momentum vector, with commutation relations given by Eqs. 3.53–3.55, then the spin angular momentum must also be described by three operator components $\hat{S}_x$, $\hat{S}_y$, and $\hat{S}_z$ with the same commutation relations. Our calculations in Sect. 3.3.4 demonstrated that these commutation relations ensure that one of the operator components (usually chosen to be the $z$-component) and the operator of the square of the angular momentum, $\hat{L}^2$ (or $\hat{S}^2$), can have a common system of eigenvectors characterized by a pair of eigenvalues: $\hbar m_l$ for the $z$-component and $\hbar^2 l(l+1)$ for the square operator, where $m_l$ can take the values $m_l = -l, -l+1, \cdots, l-1, l$ and can be either integer or half-integer. The results of Sect. 5.1.4 indicated that the orbital angular momentum can only be characterized by integer eigenvalues, but, as you can see, half-integer values are needed to deal with the spin angular momentum. It is amusing to think that nature tends to find a use for everything that appears in abstract mathematical theories! To distinguish between spin and orbital moments, I will replace the notation $l$ for the maximum eigenvalue of the operator $\hat{L}_z$ with the notation $s$ for the maximum eigenvalue of $\hat{S}_z$. The lowest value that $s$ can take is $1/2$, which means that there are only two possible eigenvalues of this operator, $-\hbar/2$ and $\hbar/2$. The eigenvalue of the operator $\hat{S}^2$ is $\hbar^2 s(s+1) = 3\hbar^2/4$, but it is the value of $s$ that we have in mind when we talk about the electron having spin 1/2. Thus, Pauli's two-valuedness of the electron comes here in the form of two eigenvectors and two eigenvalues of the $z$-component of the spin operator. The idea that spin is an intrinsic and immutable property of electrons means that the $1/2$ value of the quantum number $s$ (or the $3\hbar^2/4$ eigenvalue of the operator $\hat{S}^2$) is as unchangeable as the electron's mass or charge, but at the same time, the electron can be in various distinct spin states described by eigenvectors of $\hat{S}_z$ or their arbitrary superpositions.

9.2 Spin 1/2 Operators and Spinors

While spin-1/2 operators are characterized by the same commutation relations as the operators of the orbital angular momentum, they act on vectors that live in a two-dimensional space of spin states, or spinors. There is no reason to panic at the sound of the unfamiliar word. The term spinor is used to describe a specific class of abstract vectors that have all the same properties as any other vectors belonging to a Hilbert space, only much simpler, because the dimensionality of the spinor space is just 2. One can introduce a ket spinor $|\chi\rangle$, its adjoint bra spinor $\langle\chi|$, and an inner product of spinors $\langle\chi|\chi'\rangle$, which has the same properties as any other inner product: $\langle\chi|\chi'\rangle = (\langle\chi'|\chi\rangle)^*$. A basis in this space can be formed by the two eigenvectors of the operator $\hat{S}_z$, for which physicists use several different-looking but otherwise equivalent notations. Two of the popular ways to designate these

eigenvectors are $|1/2\rangle$ for the state belonging to the eigenvalue $\hbar/2$ and $|-1/2\rangle$ for its counterpart accompanying the eigenvalue $-\hbar/2$. Alternatively, states with the positive eigenvalue are often called spin-up states, with corresponding notation $|\uparrow\rangle$, while states with the negative eigenvalue are called spin-down states and are notated as $|\downarrow\rangle$. The main difference between spinors and vectors representing other states of quantum systems is that spinors do not have a coordinate representation. They exist separately from the vector spaces formed by the eigenvectors of the position or momentum operators or any other observables related to them. Spinors describe intrinsic properties of electrons, while vectors from other spaces represent their extrinsic spatial-temporal states.

This basis of eigenvectors of the operator $\hat{S}_z$ can be used to construct a particular representation of spinors and spin operators, as I demonstrated about 100 pages ago in Sect. 5.2.3. Generic spinors in this basis are represented by $2\times 1$ column vectors:

|\chi\rangle = \begin{bmatrix} a \\ b \end{bmatrix} = a \begin{bmatrix} 1 \\ 0 \end{bmatrix} + b \begin{bmatrix} 0 \\ 1 \end{bmatrix},   (9.8)

where $|\uparrow\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ represents the spin-up or $m = 1/2$ eigenvector, while $|\downarrow\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ represents the spin-down or $m = -1/2$ eigenvector. The representation of the respective bra vector is given by

\langle\chi| = \begin{bmatrix} a^* & b^* \end{bmatrix} = a^* \begin{bmatrix} 1 & 0 \end{bmatrix} + b^* \begin{bmatrix} 0 & 1 \end{bmatrix},   (9.9)

and the norm is

\langle\chi|\chi\rangle = a^* a + b^* b.   (9.10)

Normalized spinors obviously obey the condition

|a|^2 + |b|^2 = 1.   (9.11)
The spin operators $\hat{S}_x$, $\hat{S}_y$, and $\hat{S}_z$, defined with respect to a particular Cartesian coordinate system, are represented in the basis of the eigenvectors of $\hat{S}_z$ by the two-by-two matrices derived in Sect. 5.2.3:

\hat{S}_x = \frac{\hbar}{2} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}   (9.12)

\hat{S}_y = \frac{\hbar}{2} \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}   (9.13)

\hat{S}_z = \frac{\hbar}{2} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.   (9.14)

Equations 9.12–9.14 are just a recapitulation of Eqs. 5.110 and 5.111 from Sect. 5.2.3, which I placed here for your convenience. Spin operators are often expressed in terms of the so-called Pauli matrices $\hat\sigma_x$, $\hat\sigma_y$, and $\hat\sigma_z$, defined as

\hat\sigma_x = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}   (9.15)

\hat\sigma_y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}   (9.16)

\hat\sigma_z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}.   (9.17)

These matrices have a number of important properties, such as

\hat\sigma_x^2 = \hat\sigma_y^2 = \hat\sigma_z^2 = \hat{I},   (9.18)

which means that they are simultaneously Hermitian and unitary, and

\hat\sigma_x\hat\sigma_y + \hat\sigma_y\hat\sigma_x = 0
\hat\sigma_x\hat\sigma_z + \hat\sigma_z\hat\sigma_x = 0   (9.19)
\hat\sigma_z\hat\sigma_y + \hat\sigma_y\hat\sigma_z = 0,

which is often expressed as an anticommutativity property. Pauli matrices are used quite often in quantum mechanics, so it makes sense to acquaint yourself with their properties. For instance, one can prove that the property expressed by Eq. 9.18 is valid for any matrix of the form $\sigma_n = \hat{\boldsymbol\sigma}\cdot\mathbf{n}$, where $\mathbf{n}$ is an arbitrary unit vector and $\hat{\boldsymbol\sigma}$ is a vector with components given by the Pauli matrices. Using the representation of the unit vector in spherical coordinates

n_x = \sin\theta\cos\varphi
n_y = \sin\theta\sin\varphi   (9.20)
n_z = \cos\theta,

where $\theta$ and $\varphi$ are the polar and azimuthal angles defining the direction of $\mathbf{n}$ with respect to a particular system of Cartesian coordinate axes (see Fig. 9.1), you can derive for the matrix $\sigma_n = \sin\theta\cos\varphi\,\sigma_x + \sin\theta\sin\varphi\,\sigma_y + \cos\theta\,\sigma_z$:

\sigma_n = \begin{bmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{bmatrix}.

Fig. 9.1 Unit vector $\mathbf{n}$, with polar angle $\theta$ and azimuthal angle $\varphi$, in the Cartesian coordinate system (axes $X$, $Y$, $Z$)

Squaring it will get you

\sigma_n^2 = \begin{bmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{bmatrix} \begin{bmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{bmatrix}
= \begin{bmatrix} \cos^2\theta + \sin^2\theta & \cos\theta\sin\theta\, e^{-i\varphi} - \cos\theta\sin\theta\, e^{-i\varphi} \\ \cos\theta\sin\theta\, e^{i\varphi} - \cos\theta\sin\theta\, e^{i\varphi} & \cos^2\theta + \sin^2\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.

This property makes the evaluation of various functions with Pauli matrices as arguments relatively easy. One popular example is the exponential function $\exp(i\hat{\boldsymbol\sigma}\cdot\mathbf{n})$, which you will enjoy computing when you get to the problem section of this chapter.
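All of these properties, including $(\hat{\boldsymbol\sigma}\cdot\mathbf{n})^2 = \hat{I}$, can also be confirmed numerically; here is a minimal NumPy sketch:

```python
import numpy as np

# Pauli matrices, Eqs. 9.15-9.17
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2)

# Eq. 9.18: each Pauli matrix squares to the identity (Hermitian and unitary)
for s in (sx, sy, sz):
    assert np.allclose(s @ s, I2)
    assert np.allclose(s, s.conj().T)

# Eq. 9.19: distinct Pauli matrices anticommute
for a, b in [(sx, sy), (sx, sz), (sz, sy)]:
    assert np.allclose(a @ b + b @ a, np.zeros((2, 2)))

# (sigma . n)^2 = I for arbitrary unit vectors n(theta, phi) of Eq. 9.20
rng = np.random.default_rng(1)
for _ in range(100):
    theta, phi = rng.uniform(0, np.pi), rng.uniform(0, 2 * np.pi)
    n = np.array([np.sin(theta) * np.cos(phi),
                  np.sin(theta) * np.sin(phi),
                  np.cos(theta)])
    sn = n[0] * sx + n[1] * sy + n[2] * sz
    assert np.allclose(sn @ sn, I2)
print("Eqs. 9.18-9.19 and (sigma . n)^2 = I verified")
```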

To help you become more comfortable with spin operators, I will now consider
a few examples.

Example 21 (Measurement of the y-Component of the Spin) Assume that a single immovable electron is placed in the state described by the spin-up eigenvector of the operator $\hat{S}_z$. Using a magnetic field directed along the $Y$-axis of the coordinate system, you are probing possible values of the $y$-component of the spin. What are these possible values and what are their probabilities?

Solution

As with any observable, the possible results of its measurement are given by the eigenvalues of the respective operator. In this case the operator is $\hat{S}_y$, and you need to determine its eigenvalues. The answer is, of course, obvious ($+\hbar/2$ and $-\hbar/2$), but let's play the game and compute it. Besides, along the way you will determine the eigenvectors, which you need in order to answer the probability question. So, the eigenvector equation is

\frac{\hbar}{2} \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \frac{\hbar}{2}\lambda \begin{bmatrix} a \\ b \end{bmatrix},

which produces a set of two linear equations:

-ib = \lambda a
ia = \lambda b.   (9.21)

The condition for the existence of nontrivial solutions, given by the vanishing of the determinant

\begin{vmatrix} -\lambda & -i \\ i & -\lambda \end{vmatrix},

becomes $\lambda^2 - 1 = 0$, yielding $\lambda_{1,2} = \pm 1$. Thus, recalling the factor $\hbar/2$ that I prudently pulled out, you can conclude that the eigenvalues are, indeed, as predicted, $\pm\hbar/2$.
The first eigenvector is found by substituting $\lambda = 1$ into Eq. 9.21. This gives $a = -ib$, so that the respective eigenvector can be written as

|\hbar/2_y\rangle = b \begin{bmatrix} -i \\ 1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} -i \\ 1 \end{bmatrix},   (9.22)

where at the last step I normalized it, requiring that $2|b|^2 = 1$. Repeating this procedure with $\lambda = -1$, I find

|-\hbar/2_y\rangle = \frac{1}{\sqrt{2}} \begin{bmatrix} i \\ 1 \end{bmatrix}.   (9.23)

Taking into account that the initial state was $|\hbar/2_z\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, I find that the probabilities of the corresponding eigenvalues are

p_{\pm\hbar/2} = \left| \langle \pm\hbar/2_y | \hbar/2_z \rangle \right|^2 = \frac{1}{2} \left| \begin{bmatrix} \pm i & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right|^2 = \frac{1}{2}.

Not a huge surprise, really.
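This example is also a convenient first test for doing spin algebra on a computer. A NumPy cross-check (note that `numpy.linalg.eigh` returns eigenvalues in ascending order, which is an implementation detail, not physics):

```python
import numpy as np

hbar = 1.0
Sy = (hbar / 2) * np.array([[0, -1j], [1j, 0]])

# Eigenvalues and eigenvectors of S_y
vals, vecs = np.linalg.eigh(Sy)
assert np.allclose(vals, [-hbar / 2, hbar / 2])

up_z = np.array([1, 0], dtype=complex)   # initial spin-up eigenstate of S_z

# Born rule: p_i = |<eigvec_i | up_z>|^2 (columns of vecs are the eigenvectors)
probs = np.abs(vecs.conj().T @ up_z) ** 2
assert np.allclose(probs, [0.5, 0.5])
print("p(-hbar/2) = p(+hbar/2) = 1/2")
```

The probabilities come out 1/2 regardless of the phase convention `eigh` happens to choose for the eigenvectors, exactly as the analytic calculation shows.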

Example 22 (Measurement of an Arbitrarily Directed Planar Spin Component) What if we want to measure a component of the spin along a direction not necessarily aligned with one of the coordinate axes? Let me consider an example in which the measured component of the spin lies in the $Y$–$Z$ plane at an angle $\theta$ with the $Z$-axis, and find the possible outcomes and their probabilities, assuming the same initial state as before.

Solution

I can define the specified direction by a unit vector with $y$-component $\sin\theta$ and $z$-component $\cos\theta$. Introducing unit vectors $\mathbf{e}_y$ and $\mathbf{e}_z$ along the respective axes, this vector can be conveniently presented as $\mathbf{n} = \mathbf{e}_y\sin\theta + \mathbf{e}_z\cos\theta$. The component of the spin in the direction of $\mathbf{n}$ is given by the dot product $\hat{S}_n = \hat{\mathbf{S}}\cdot\mathbf{n} = \hat{S}_y\sin\theta + \hat{S}_z\cos\theta$. Using the matrix representation of the spin operators in the basis of the eigenvectors of $\hat{S}_z$, Eqs. 9.12–9.14, I find for $\hat{S}_n$

\hat{S}_n = \frac{\hbar}{2}\sin\theta \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} + \frac{\hbar}{2}\cos\theta \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = \frac{\hbar}{2} \begin{bmatrix} \cos\theta & -i\sin\theta \\ i\sin\theta & -\cos\theta \end{bmatrix}.

The respective eigenvector equation becomes

\begin{bmatrix} \cos\theta & -i\sin\theta \\ i\sin\theta & -\cos\theta \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \lambda \begin{bmatrix} a \\ b \end{bmatrix},

and the equation for the eigenvalues takes the form

\begin{vmatrix} \cos\theta - \lambda & -i\sin\theta \\ i\sin\theta & -\cos\theta - \lambda \end{vmatrix} = -(\cos\theta - \lambda)(\cos\theta + \lambda) - \sin^2\theta = \lambda^2 - 1 = 0.

I am not going to pretend that I am surprised that the eigenvalues are again $\pm\hbar/2$; what else could they be?

The equations for the eigenvectors can be written as

1. $\lambda = 1$:

a\cos\theta - ib\sin\theta = a
-ib\sin\theta = a(1 - \cos\theta)
-2ib\sin\frac{\theta}{2}\cos\frac{\theta}{2} = 2a\sin^2\frac{\theta}{2}
-ib\cos\frac{\theta}{2} = a\sin\frac{\theta}{2}.

There are, of course, multiple choices of the coefficients in this equation, but I want to make the final form of the eigenvector as symmetric as possible, so I will choose $a = A\cos\frac{\theta}{2}$ and $b = iA\sin\frac{\theta}{2}$, which obviously satisfy the equation with an arbitrary $A$. The latter can be found from the normalization condition $|a|^2 + |b|^2 = 1$, which obviously gives $A = 1$. Now, I can write the first eigenvector as

|\hbar/2_n\rangle = \begin{bmatrix} \cos\frac{\theta}{2} \\ i\sin\frac{\theta}{2} \end{bmatrix}.   (9.24)

2. $\lambda = -1$:

a\cos\theta - ib\sin\theta = -a
ib\sin\theta = a(1 + \cos\theta)
2ib\sin\frac{\theta}{2}\cos\frac{\theta}{2} = 2a\cos^2\frac{\theta}{2}
ib\sin\frac{\theta}{2} = a\cos\frac{\theta}{2}.

Using the same trick as previously, I find this eigenvector to be

|-\hbar/2_n\rangle = \begin{bmatrix} \sin\frac{\theta}{2} \\ -i\cos\frac{\theta}{2} \end{bmatrix}.   (9.25)

The direction described by $\theta = \pi/2$ corresponds to the unit vector $\mathbf{n}$ pointing along the $Y$-axis, reducing this example to the previous one. Naturally, you would expect the eigenvectors found here to reduce to the respective eigenvectors from the previous example. However, by substituting $\theta = \pi/2$ into Eqs. 9.24 and 9.25, you find that the resulting vectors do not coincide with Eqs. 9.22 and 9.23. Did I do something wrong here? Not really, because it is easy to notice that the difference between the two results is a mere factor of $i$, and we know that multiplication of an eigenvector by $i$, or by any other complex number of the form $\exp(i\varphi)$, where $\varphi$ is an arbitrary real number, does not change the quantum state and has no observable consequences. Finally, the probabilities that measurements of the spin will produce one of the found eigenvalues are

p_{\hbar/2} = \left| \begin{bmatrix} \cos\frac{\theta}{2} & -i\sin\frac{\theta}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right|^2 = \cos^2\frac{\theta}{2}

p_{-\hbar/2} = \left| \begin{bmatrix} \sin\frac{\theta}{2} & i\cos\frac{\theta}{2} \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right|^2 = \sin^2\frac{\theta}{2}.

I can also use this result to find the expectation value of the operator $\hat{S}_n$ in the state $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$. The probabilistic definition of the mean, $\sum_i x_i p_i$, where $x_i$ is a value of the variable and $p_i$ is its probability, yields

\left\langle \hat{S}_n \right\rangle = (\hbar/2)\cos^2\frac{\theta}{2} - (\hbar/2)\sin^2\frac{\theta}{2} = (\hbar/2)\cos\theta,

which is exactly the value you should have expected from a classical vector oriented along the $Z$-axis when computing its component in the direction of $\mathbf{n}$. The same result is obtained by computing the expectation value using the operator definition:

\left\langle \hat{S}_n \right\rangle = \langle \uparrow_z | \hat{S}_n | \uparrow_z \rangle = \frac{\hbar}{2} \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} \cos\theta & -i\sin\theta \\ i\sin\theta & -\cos\theta \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \frac{\hbar}{2} \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} \cos\theta \\ i\sin\theta \end{bmatrix} = \frac{\hbar}{2}\cos\theta.
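Every step of this example can be cross-checked numerically; a short NumPy sketch, with an arbitrary illustrative angle:

```python
import numpy as np

hbar = 1.0
Sy = (hbar / 2) * np.array([[0, -1j], [1j, 0]])
Sz = (hbar / 2) * np.array([[1, 0], [0, -1]], dtype=complex)

theta = 0.7                                    # arbitrary angle in the Y-Z plane
Sn = np.sin(theta) * Sy + np.cos(theta) * Sz   # S . n for n = (0, sin(theta), cos(theta))

# Eigenvectors of Eqs. 9.24 and 9.25
plus = np.array([np.cos(theta / 2), 1j * np.sin(theta / 2)])
minus = np.array([np.sin(theta / 2), -1j * np.cos(theta / 2)])
assert np.allclose(Sn @ plus, (hbar / 2) * plus)
assert np.allclose(Sn @ minus, (-hbar / 2) * minus)

# Probabilities and expectation value for the initial state (1, 0)
up_z = np.array([1, 0], dtype=complex)
p_plus = np.abs(plus.conj() @ up_z) ** 2     # cos^2(theta/2)
p_minus = np.abs(minus.conj() @ up_z) ** 2   # sin^2(theta/2)
assert np.isclose(p_plus, np.cos(theta / 2) ** 2)
assert np.isclose(p_minus, np.sin(theta / 2) ** 2)

mean = (hbar / 2) * p_plus - (hbar / 2) * p_minus
assert np.isclose(mean, (hbar / 2) * np.cos(theta))   # <S_n> = (hbar/2) cos(theta)
print("eigenvectors, probabilities, and <S_n> all agree with the analytic results")
```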

Example 23 (Measuring the z-Component in an Arbitrary Spinor State) You can also ask the question: what if the spin was prepared in a state presented by one of the eigenvectors of $\hat{S}_n$, say $|\hbar/2_n\rangle$, and we were measuring the $z$-component of the spin? What would be the probabilities of obtaining $\hbar/2$ or $-\hbar/2$, and the expectation value of $\hat{S}_z$, in this situation?
Solution

The corresponding probabilities are given by the following expressions:

\left| \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} \cos\frac{\theta}{2} \\ i\sin\frac{\theta}{2} \end{bmatrix} \right|^2 = \cos^2\frac{\theta}{2}

\left| \begin{bmatrix} 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\frac{\theta}{2} \\ i\sin\frac{\theta}{2} \end{bmatrix} \right|^2 = \sin^2\frac{\theta}{2},

yielding exactly the same results as before. Obviously, the expectation value will also be the same.
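As a quick sanity check of this example, again with an arbitrary illustrative angle:

```python
import numpy as np

theta = 0.7
# State prepared as the S_n eigenvector of Eq. 9.24 (n in the Y-Z plane)
chi = np.array([np.cos(theta / 2), 1j * np.sin(theta / 2)])

up_z = np.array([1, 0], dtype=complex)
down_z = np.array([0, 1], dtype=complex)

p_up = np.abs(up_z.conj() @ chi) ** 2      # probability of measuring +hbar/2
p_down = np.abs(down_z.conj() @ chi) ** 2  # probability of measuring -hbar/2

assert np.isclose(p_up, np.cos(theta / 2) ** 2)
assert np.isclose(p_down, np.sin(theta / 2) ** 2)
assert np.isclose(p_up + p_down, 1)
print(f"p(+hbar/2) = {p_up:.4f}, p(-hbar/2) = {p_down:.4f}")
```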

These examples were designed to prepare you to answer an important but rather confusing question. The concept of spin is supposed to represent a vector quantity existing in our regular physical three-dimensional space. At the same time, the quantum objects used to describe spin (operators and spinors) have little relation to this space. While spin operators do have three components, they are not regular vectors, and the question about the "direction" of a vector operator does not make much sense. Spinors, representing spin states, are objects existing in an abstract two-dimensional space. Thus, the question is how these objects are connected with the physical space in which all our measurement apparatuses live. One might attempt to deflect this question by saying that after taking the expectation values of the spin operators for a given spin state, we end up with a regular vector, which provides us with information about the spin and its direction. I can counter this by saying that this information is very limited. Indeed, I can also compute the uncertainty of each spin component, which will also give me a regular vector. The problem is that in the most generic situation, the vector obtained from the expectation values and the vector obtained from the uncertainties do not have to point in the same direction, making it difficult to come up with a reasonable interpretation of these results. One way to avoid this ambiguity is to focus on eigenvectors, in which case expectation values provide a complete description of the situation. You only need to figure out the connection between the spatial direction, the spin operators, and the corresponding eigenvectors.

One way to answer this question is to do what we just did in the previous example: introduce a component of the spin operator in the direction of interest, find its eigenvectors, and analyze their connection to this direction. But I want to add a bit more intrigue to the issue and will use a different approach. Let me ask you this: what is the best way to write down a generic spinor? Equation 9.8, which does it by introducing two complex parameters, $a$ and $b$, is too general and does not contain all the information available about even the most generic spin states. Indeed, two complex numbers contain four independent real parameters, which can be brought out explicitly by writing $a$ and $b$ in exponential form: $a = |a|\exp(i\varphi_a)$ and $b = |b|\exp(i\varphi_b)$. I can do better than that and reduce the number of parameters to just two without making the spinor any less generic.

First, I am going to use the freedom of choice of the overall phase of the spinor. To this end, I will multiply both $a$ and $b$ by $\exp[-i(\varphi_a + \varphi_b)/2]$, bringing the spinor into the following form:

|\chi\rangle = \begin{bmatrix} |a|\exp(-i\varphi/2) \\ |b|\exp(i\varphi/2) \end{bmatrix},

where $\varphi = \varphi_b - \varphi_a$, and there are only three parameters left to worry about. Obviously, this is not the only way to eliminate one of the phases, but this one presents the spinor in a rather symmetric form, and, like all physicists, I have a soft spot for symmetry. Besides, frankly speaking, I know where I want to go and am just taking you along for the ride. Normalization imposes an additional condition on these parameters, telling me that I can use it to eliminate another one of them, reducing the total number to just two. After a few seconds of staring at Eq. 9.11, it may dawn on you that this equation looks similar to the fundamental trigonometric identity $\cos^2 x + \sin^2 x = 1$ and that you can automatically satisfy the normalization condition by choosing $|a| = \cos(\theta/2)$ and $|b| = \sin(\theta/2)$, expressing both $|a|$ and $|b|$ in terms of a single parameter $\theta/2$. If you are asking why $\theta/2$ and not just $\theta$, you will have the answer in a few minutes; just keep reading. Now, as promised, I have the expression for the generic normalized spinor:

|\chi_1\rangle = \begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2) \\ \sin(\theta/2)\exp(i\varphi/2) \end{bmatrix}   (9.26)

with only two parameters, $\theta$ and $\varphi$. The choice I made for $|a|$ and $|b|$ is not unique, and I can generate another spinor by assigning $|a| = \sin(\theta/2)$ and $|b| = -\cos(\theta/2)$:

|\chi_2\rangle = \begin{bmatrix} \sin(\theta/2)\exp(-i\varphi/2) \\ -\cos(\theta/2)\exp(i\varphi/2) \end{bmatrix}.   (9.27)

It is easy to verify, by computing $\langle\chi_1|\chi_2\rangle$, that these spinors are orthogonal (of course, I designed them with this particular goal in mind), and by generating the matrices

|\chi_1\rangle\langle\chi_1| = \begin{bmatrix} \cos^2(\theta/2) & \cos(\theta/2)\sin(\theta/2)\exp(-i\varphi) \\ \cos(\theta/2)\sin(\theta/2)\exp(i\varphi) & \sin^2(\theta/2) \end{bmatrix}

and

|\chi_2\rangle\langle\chi_2| = \begin{bmatrix} \sin^2(\theta/2) & -\cos(\theta/2)\sin(\theta/2)\exp(-i\varphi) \\ -\cos(\theta/2)\sin(\theta/2)\exp(i\varphi) & \cos^2(\theta/2) \end{bmatrix},

you can also check that

|\chi_1\rangle\langle\chi_1| + |\chi_2\rangle\langle\chi_2| = \hat{I},

indicating that these two spinors form a complete set. (When trying to reproduce these calculations, do not forget complex conjugation when converting kets into the respective bra vectors.)
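The orthonormality and completeness claims can be checked numerically for any point on the sphere; a minimal NumPy sketch:

```python
import numpy as np

def chi_pair(theta, phi):
    """Orthogonal spinors of Eqs. 9.26 and 9.27."""
    chi1 = np.array([np.cos(theta / 2) * np.exp(-1j * phi / 2),
                     np.sin(theta / 2) * np.exp(1j * phi / 2)])
    chi2 = np.array([np.sin(theta / 2) * np.exp(-1j * phi / 2),
                     -np.cos(theta / 2) * np.exp(1j * phi / 2)])
    return chi1, chi2

theta, phi = 1.1, 2.3   # arbitrary point on the sphere
chi1, chi2 = chi_pair(theta, phi)

# Orthonormality: <chi1|chi2> = 0, both spinors normalized
assert np.isclose(chi1.conj() @ chi2, 0)
assert np.isclose(chi1.conj() @ chi1, 1) and np.isclose(chi2.conj() @ chi2, 1)

# Completeness: |chi1><chi1| + |chi2><chi2| = I
P = np.outer(chi1, chi1.conj()) + np.outer(chi2, chi2.conj())
assert np.allclose(P, np.eye(2))
print("spinors are orthonormal and complete")
```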

Thus, with little effort, I have constructed a complete set of two generic mutually orthogonal spinors characterized by parameters that can be interpreted as angles, and this must mean something. The found representation of spinors establishes a one-to-one relationship between the two-dimensional space of spin states and the points on the surface of a regular three-dimensional sphere of unit radius (see Fig. 9.2). The points at the north and south poles of the sphere, characterized by $\theta = 0$ and $\theta = \pi$, describe the eigenvectors of the $\hat{S}_z$ operator, $|\uparrow\rangle$ and $|\downarrow\rangle$, respectively (the angle $\varphi$ is not defined at these points, but this is not a problem, because the respective factors $\exp(\pm i\varphi/2)$ become in these cases merely insignificant phase factors). It is also easy to notice that antipodal points lying on the opposite ends of an arbitrarily oriented diameter of the sphere correspond to two mutually orthogonal spin states. Indeed, the spherical coordinates of antipodal points are related to each other as $\theta_2 = \pi - \theta_1$, $\varphi_2 = \varphi_1 + \pi$. Substituting these expressions into Eq. 9.26, you will immediately obtain the spinor presented in Eq. 9.27 (up to an overall phase factor). While performing this operation, you can appreciate the wisdom of using the half-angles $\theta/2$ and $\varphi/2$ in these expressions.

In order to further figure out the physical meaning of the mapping between spinors and directions in regular 3-D space, consider the same operator, $\hat{S}_n = \hat{\mathbf{S}}\cdot\mathbf{n}$, which I discussed in the preceding example, but with the unit vector $\mathbf{n}$ defining a generic direction characterized by the same angles $\theta, \varphi$ as in Fig. 9.2. This is the same vector that I introduced in connection with the Pauli matrices, Eq. 9.20, so that the operator $\hat{S}_n$ becomes

\hat{S}_n = \frac{\hbar}{2}\hat\sigma_n = \frac{\hbar}{2} \begin{bmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{bmatrix}.

Fig. 9.2 The Bloch sphere: each point on the surface, characterized by spherical coordinates $\theta, \varphi$, corresponds to a particular spin state

Now, let me apply this operator to the spinor presented in Eq. 9.26:

\frac{\hbar}{2} \begin{bmatrix} \cos\theta & \sin\theta\, e^{-i\varphi} \\ \sin\theta\, e^{i\varphi} & -\cos\theta \end{bmatrix} \begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2) \\ \sin(\theta/2)\exp(i\varphi/2) \end{bmatrix}
= \frac{\hbar}{2} \begin{bmatrix} \left[\cos\theta\cos(\theta/2) + \sin\theta\sin(\theta/2)\right]\exp(-i\varphi/2) \\ \left[\sin\theta\cos(\theta/2) - \cos\theta\sin(\theta/2)\right]\exp(i\varphi/2) \end{bmatrix}
= \frac{\hbar}{2} \begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2)\left[\cos\theta + 2\sin^2(\theta/2)\right] \\ \sin(\theta/2)\exp(i\varphi/2)\left[2\cos^2(\theta/2) - \cos\theta\right] \end{bmatrix}
= \frac{\hbar}{2} \begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2)\left[2\cos^2(\theta/2) - 1 + 2\sin^2(\theta/2)\right] \\ \sin(\theta/2)\exp(i\varphi/2)\left[2\cos^2(\theta/2) - 2\cos^2(\theta/2) + 1\right] \end{bmatrix}
= \frac{\hbar}{2} \begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2) \\ \sin(\theta/2)\exp(i\varphi/2) \end{bmatrix}.

Isn't that nice? A generic spinor with arbitrarily introduced parameters $\theta$ and $\varphi$ turned out to be an eigenvector of the operator representing the component of the spin in the direction defined by these parameters. It probably will not come as a particularly great surprise now that the second spinor I conjured up, Eq. 9.27, is also an eigenvector of the same operator, but corresponding to the second eigenvalue, namely $-\hbar/2$. (Check it as an exercise. And by the way, did you notice that in the course of this computation I used a couple of trigonometric identities, $\cos x = 2\cos^2(x/2) - 1$ and $\sin x = 2\sin(x/2)\cos(x/2)$?) This exercise allows us to give more substance to the already established connection between spinors and directions in physical space: each spinor parametrized as in Eq. 9.26 or 9.27 is an eigenvector of the component of the spin in the direction specified by the parameters $\theta$ and $\varphi$, interpreted as spherical coordinates of the corresponding unit vector lying on the surface of the Bloch sphere. The measurement of the spin in this direction will yield a definite result corresponding to the respective eigenvalue, so this direction can be interpreted as the direction of the spin for this particular spin state. It also makes
sense that antipodal points on the Bloch sphere represent eigenvectors belonging to opposite eigenvalues of $\hat{S}_n$. Finally, by now, I hope you have the answer to the question of why I used half-angles in the definition of the spinors.
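The whole correspondence is easy to test numerically: for an arbitrarily chosen direction, the spinors of Eqs. 9.26 and 9.27 should be eigenvectors of $\hat{S}_n$ with eigenvalues $\pm\hbar/2$. A sketch:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

hbar = 1.0
theta, phi = 0.9, 1.7   # arbitrary direction on the Bloch sphere

n = np.array([np.sin(theta) * np.cos(phi),
              np.sin(theta) * np.sin(phi),
              np.cos(theta)])
Sn = (hbar / 2) * (n[0] * sx + n[1] * sy + n[2] * sz)

chi1 = np.array([np.cos(theta / 2) * np.exp(-1j * phi / 2),
                 np.sin(theta / 2) * np.exp(1j * phi / 2)])
chi2 = np.array([np.sin(theta / 2) * np.exp(-1j * phi / 2),
                 -np.cos(theta / 2) * np.exp(1j * phi / 2)])

# chi1 and chi2 are eigenvectors of S_n with eigenvalues +hbar/2 and -hbar/2
assert np.allclose(Sn @ chi1, (hbar / 2) * chi1)
assert np.allclose(Sn @ chi2, (-hbar / 2) * chi2)
print("Bloch-sphere spinors are eigenvectors of S . n")
```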

9.3 Dynamics of Spin in a Uniform Magnetic Field

A bound (for instance, by attraction to a nucleus) electron in a uniform magnetic field, the system used earlier to introduce the Zeeman effect, is also the simplest somewhat realistic physical model allowing one to study the quantum dynamics of a pure spin. Assuming that the interaction between the spin and the magnetic field does not affect the orbital state of the electron, one can ignore the energy associated with the latter and omit the atomic part of the Hamiltonian (remember, energy only matters when it changes, and if it does not, we can always make it equal to zero). The Hamiltonian of this system is obtained by dropping the $\hat{H}_0$ term from Eq. 9.3 and replacing the orbital angular momentum $\hat{\mathbf{L}}$ with $2\hat{\mathbf{S}}$, where the factor 2 takes into account the empirically established modification of the connection between the spin angular and magnetic momenta, Eq. 9.7. The resulting Hamiltonian takes the form

\hat{H} = \frac{2\mu_B}{\hbar}\hat{\mathbf{S}}\cdot\mathbf{B}.   (9.28)

Note that the magnetic field $\mathbf{B}$ is not an operator, because it describes a classical magnetic field created by a source whose physics is outside of our consideration. Since the field is uniform, it makes sense to use its direction as one of the axes of the coordinate system, which I have to specify in order to be able to carry out subsequent calculations. It is customary to choose the $Z$-axis to be codirected with the magnetic field, in which case Hamiltonian 9.28 simplifies significantly:

\hat{H} = \frac{2\mu_B B}{\hbar}\hat{S}_z.   (9.29)

In this section I will discuss the dynamics of spin described by this Hamiltonian
using both Schrödinger and Heisenberg pictures of quantum mechanics.

9.3.1 Schrödinger Picture

In the Schrödinger picture, we always begin by establishing the eigenvalues and eigenvectors of the Hamiltonian. It is obvious that the eigenvectors of the Hamiltonian given by Eq. 9.29 coincide with those of the operator $\hat{S}_z$, which I will denote here as $|\uparrow\rangle$, with eigenvalue $\hbar/2$ (spin-up), and $|\downarrow\rangle$, with eigenvalue $-\hbar/2$ (spin-down). The respective eigenvalues of the Hamiltonian are quite obvious:

E_\uparrow = \mu_B B
E_\downarrow = -\mu_B B.   (9.30)

A solution of the time-dependent Schrödinger equation for an arbitrary time-dependent spinor $|\chi(t)\rangle$,

i\hbar\frac{\partial|\chi(t)\rangle}{\partial t} = \frac{2\mu_B B}{\hbar}\hat{S}_z|\chi(t)\rangle,

can be presented as a linear combination of the stationary states of the Hamiltonian:

|\chi(t)\rangle = a\exp\left(-i\frac{\mu_B B}{\hbar}t\right)|\uparrow\rangle + b\exp\left(i\frac{\mu_B B}{\hbar}t\right)|\downarrow\rangle,   (9.31)

where the coefficients $a$ and $b$ are determined by the initial state of the spin:

|\chi(0)\rangle = a|\uparrow\rangle + b|\downarrow\rangle.   (9.32)

Equation 9.31 essentially solves the problem of the dynamics of a single spin in a uniform magnetic field. It does little, however, to develop our intuition about the physical phenomena that this solution describes. In a typical experimental situation, one is rarely dealing with a single spin. Most frequently, an experimentalist would measure a signal from an ensemble of many spins, and if we can neglect any kind of interaction between them, as well as assume that all spins are in the same initial state,¹ the experimental results can be described by finding the expectation values of the spin operators. So, let me compute these expectation values for the state described by Eq. 9.31.

To this end, I will use the representation of a generic spinor in the form of Eq. 9.26 and rewrite the coefficients $a$ and $b$ as

a = \cos(\theta/2)\exp(-i\varphi/2)
b = \sin(\theta/2)\exp(i\varphi/2).   (9.33)

Substituting these expressions for $a$ and $b$ into Eq. 9.31 and using the regular representation of the basis spinors $|\uparrow\rangle$ and $|\downarrow\rangle$, you can find

¹The assumption about the same initial state is the most difficult to realize experimentally and can be justified only at zero temperature.

|\chi(t)\rangle = \begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2)\exp\left(-i\frac{\mu_B B}{\hbar}t\right) \\ \sin(\theta/2)\exp(i\varphi/2)\exp\left(i\frac{\mu_B B}{\hbar}t\right) \end{bmatrix}.   (9.34)

It is easiest to compute the expectation value of $\hat{S}_z$:

\left\langle \hat{S}_z \right\rangle = \frac{\hbar}{2}\left(|a|^2 - |b|^2\right) = \frac{\hbar}{2}\cos\theta.   (9.35)

I derived this expression taking advantage of the fact that $|\uparrow\rangle$ and $|\downarrow\rangle$ in Eq. 9.31 are eigenvectors of $\hat{S}_z$, and, therefore, the coefficients in front of them (their absolute values squared, of course) determine the probabilities of the respective eigenvalues. To find the expectation values of the two other components, I will have to do a little bit more work, computing

\left\langle \hat{S}_{x,y} \right\rangle = \langle\chi(t)|\hat{S}_{x,y}|\chi(t)\rangle.

I begin with the $x$-component and first compute the right half of this expression, $\hat{S}_x|\chi(t)\rangle$:

\hat{S}_x|\chi(t)\rangle = \frac{\hbar}{2} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \cos(\theta/2)\exp(-i\varphi/2)\exp\left(-i\frac{\mu_B B}{\hbar}t\right) \\ \sin(\theta/2)\exp(i\varphi/2)\exp\left(i\frac{\mu_B B}{\hbar}t\right) \end{bmatrix} = \frac{\hbar}{2} \begin{bmatrix} \sin(\theta/2)\exp(i\varphi/2)\exp\left(i\frac{\mu_B B}{\hbar}t\right) \\ \cos(\theta/2)\exp(-i\varphi/2)\exp\left(-i\frac{\mu_B B}{\hbar}t\right) \end{bmatrix}.

By the way, have you noticed how operator $\hat{S}_x$ flips the components of the spinor? Anyway, to complete this computation, I find the inner product of this ket with the bra version of the spinor:

$$\left\langle\hat{S}_x\right\rangle = \frac{\hbar}{2}\left[\cos(\theta/2)\sin(\theta/2)\,e^{\,i\varphi}\,e^{-2i\mu_B B t/\hbar} + \cos(\theta/2)\sin(\theta/2)\,e^{-i\varphi}\,e^{\,2i\mu_B B t/\hbar}\right] = \frac{\hbar}{2}\sin\theta\,\cos\!\left(\frac{2\mu_B B}{\hbar}t - \varphi\right).$$

Similar calculations with the y-component operator yield
$$\left\langle\hat{S}_y\right\rangle = \frac{\hbar}{2}\sin\theta\,\sin\!\left(\frac{2\mu_B B}{\hbar}t - \varphi\right).$$

292 9 Spin 1/2

Let's collect all these results together to get a better picture:
$$\left\langle\hat{S}_z\right\rangle = \frac{\hbar}{2}\cos\theta$$
$$\left\langle\hat{S}_x\right\rangle = \frac{\hbar}{2}\sin\theta\,\cos\!\left(\frac{2\mu_B B}{\hbar}t - \varphi\right)$$
$$\left\langle\hat{S}_y\right\rangle = \frac{\hbar}{2}\sin\theta\,\sin\!\left(\frac{2\mu_B B}{\hbar}t - \varphi\right). \tag{9.36}$$

Here is what we have: a vector of length $\hbar/2$ remains at all times at angle $\theta$ with respect to the magnetic field, while its projection on the X–Y plane of the coordinate system (which is perpendicular to the magnetic field) rotates with frequency $\omega_L = 2\mu_B B/\hbar = eB/m_e$, where I substituted Eq. 9.5 for the Bohr magneton. A remarkable fact about this result is the disappearance of Planck's constant from the final expression for the frequency, which signals that this phenomenon must exist in classical physics as well, and it, indeed, does. Equation 9.36 describes a very well-known effect, Larmor precession, which is observed whenever a magnetic moment (of any nature) interacts with a uniform magnetic field. However, the frequency of the precession might be different for different magnetic moments because of its dependence on the so-called gyromagnetic ratio, defined as the coefficient of proportionality between the magnetic dipole moment and the angular momentum. For the orbital angular momentum, this ratio is $-e/(2m_e)$ as given by Eq. 9.1, while for the spin it is two times larger, resulting in a precession frequency that is twice as large.
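This result is easy to check numerically. The sketch below (not from the book; it assumes $\hbar = \mu_B = B = 1$, so $\omega_L = 2$) builds the spinor of Eq. 9.34 together with the spin matrices and verifies that $\langle\hat{S}_z\rangle$ stays frozen while the transverse spin component, of fixed length $(\hbar/2)\sin\theta$, rotates at the Larmor frequency:

```python
import numpy as np

hbar, muB, B = 1.0, 1.0, 1.0          # units chosen for convenience
wL = 2 * muB * B / hbar               # Larmor frequency
theta, phi = 0.7, 0.3                 # arbitrary initial spin direction

Sx = hbar/2 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = hbar/2 * np.array([[0, -1j], [1j, 0]])
Sz = hbar/2 * np.array([[1, 0], [0, -1]], dtype=complex)

def chi(t):
    """Spinor of Eq. 9.34 for a spin in a uniform field along z."""
    return np.array([np.cos(theta/2)*np.exp(-1j*phi/2)*np.exp(1j*muB*B*t/hbar),
                     np.sin(theta/2)*np.exp(1j*phi/2)*np.exp(-1j*muB*B*t/hbar)])

for t in (0.0, 0.5, 1.7):
    c = chi(t)
    ex, ey, ez = (np.vdot(c, S @ c).real for S in (Sx, Sy, Sz))
    # the z-projection is frozen at (hbar/2) cos(theta)
    assert np.isclose(ez, hbar/2 * np.cos(theta))
    # the transverse projection has the constant length (hbar/2) sin(theta)
    assert np.isclose(np.hypot(ex, ey), hbar/2 * np.sin(theta))
    # and its x-component oscillates at the Larmor frequency
    assert np.isclose(ex, hbar/2 * np.sin(theta) * np.cos(wL*t - phi))
print("Larmor precession checked")
```

The check uses only the constancy of $\langle\hat{S}_z\rangle$ and of the transverse length, which are insensitive to the overall sense of rotation.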

9.3.2 Heisenberg Picture

To describe spin precession in the Heisenberg picture, I have to solve the Heisenberg equations 4.24 for the spin operators. To simplify notation I will omit the subindex H, which I used to distinguish Schrödinger from Heisenberg operators. However, it is important to note that the angular momentum commutation relations, Eqs. 3.53–3.55, remain the same in both pictures, provided that we take the Heisenberg operators at the same time. If you do not see how to verify this statement, imagine sandwiching both sides of a commutation relation between the operators $\exp(i\hat{H}t/\hbar)$ and $\exp(-i\hat{H}t/\hbar)$ and also inserting the product of these operators (which is equal to the identity, by the way) between the two operators in each product appearing in the commutator. Thus, using the necessary commutation relations, I obtain the following equations:

$$\frac{d\hat{S}_z}{dt} = -\frac{i}{\hbar}\,\omega_L\left[\hat{S}_z,\hat{S}_z\right] = 0 \tag{9.37}$$
$$\frac{d\hat{S}_x}{dt} = -\frac{i}{\hbar}\,\omega_L\left[\hat{S}_x,\hat{S}_z\right] = -\omega_L\hat{S}_y \tag{9.38}$$
$$\frac{d\hat{S}_y}{dt} = -\frac{i}{\hbar}\,\omega_L\left[\hat{S}_y,\hat{S}_z\right] = \omega_L\hat{S}_x, \tag{9.39}$$

where I used the Larmor frequency defined in the previous section to express the Hamiltonian as $\hat{H} = \omega_L\hat{S}_z$. Differentiating Eqs. 9.38 and 9.39 with respect to time, I can separate them into two independent second-order differential equations:

$$\frac{d^2\hat{S}_x}{dt^2} = -\omega_L^2\,\hat{S}_x, \qquad \frac{d^2\hat{S}_y}{dt^2} = -\omega_L^2\,\hat{S}_y,$$
with obvious solutions
$$\hat{S}_x(t) = \hat{A}_x\sin\omega_L t + \hat{B}_x\cos\omega_L t, \qquad \hat{S}_y(t) = \hat{A}_y\sin\omega_L t + \hat{B}_y\cos\omega_L t.$$

The unknown constant operators $\hat{A}_{x,y}$ and $\hat{B}_{x,y}$ are determined by the initial conditions for the spin operators and their derivatives:
$$\hat{B}_x = \hat{S}_x(0); \qquad \hat{A}_x = \frac{1}{\omega_L}\left.\frac{d\hat{S}_x}{dt}\right|_{t=0} = -\hat{S}_y(0)$$
$$\hat{B}_y = \hat{S}_y(0); \qquad \hat{A}_y = \frac{1}{\omega_L}\left.\frac{d\hat{S}_y}{dt}\right|_{t=0} = \hat{S}_x(0),$$

where $\hat{S}_{x,y}(0)$ coincide with the Schrödinger spin operators. Thus, I have
$$\hat{S}_x(t) = -\hat{S}_y(0)\sin\omega_L t + \hat{S}_x(0)\cos\omega_L t \tag{9.40}$$
$$\hat{S}_y(t) = \hat{S}_x(0)\sin\omega_L t + \hat{S}_y(0)\cos\omega_L t. \tag{9.41}$$

All that is left to do is to compute the expectation values of the Schrödinger spin operators in the initial state given by Eqs. 9.32 and 9.33. However, I do not have to repeat these calculations, as we can just read them off Eq. 9.36 at t = 0. This yields

$$\left\langle\hat{S}_x(t)\right\rangle = \frac{\hbar}{2}\sin\theta\cos\varphi\,\cos\omega_L t + \frac{\hbar}{2}\sin\theta\sin\varphi\,\sin\omega_L t = \frac{\hbar}{2}\sin\theta\,\cos(\omega_L t - \varphi)$$
$$\left\langle\hat{S}_y(t)\right\rangle = \frac{\hbar}{2}\sin\theta\cos\varphi\,\sin\omega_L t - \frac{\hbar}{2}\sin\theta\sin\varphi\,\cos\omega_L t = \frac{\hbar}{2}\sin\theta\,\sin(\omega_L t - \varphi),$$

in complete agreement with the results obtained from the Schrödinger picture.
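Both the commutators behind Eqs. 9.37–9.39 and the operator solutions 9.40 and 9.41 can be verified directly with 2 × 2 matrices. The following sketch (assuming $\hbar = 1$ and the Hamiltonian $\hat{H} = \omega_L\hat{S}_z$ implied by the equations of motion) conjugates the Schrödinger operators with the evolution operator $\exp(-i\hat{H}t/\hbar)$:

```python
import numpy as np

hbar, wL, t = 1.0, 2.0, 0.8
Sx = hbar/2 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = hbar/2 * np.array([[0, -1j], [1j, 0]])
Sz = hbar/2 * np.array([[1, 0], [0, -1]], dtype=complex)

comm = lambda A, B: A @ B - B @ A
# the commutators used in Eqs. 9.37-9.39
assert np.allclose(comm(Sx, Sz), -1j*hbar*Sy)
assert np.allclose(comm(Sy, Sz),  1j*hbar*Sx)

# U = exp(-iHt/hbar) with H = wL*Sz is diagonal, so exponentiate the entries
U = np.diag(np.exp(-1j * wL * t / 2 * np.array([1, -1])))

Sx_t = U.conj().T @ Sx @ U   # Heisenberg-picture operators
Sy_t = U.conj().T @ Sy @ U

# Eqs. 9.40 and 9.41
assert np.allclose(Sx_t, -Sy*np.sin(wL*t) + Sx*np.cos(wL*t))
assert np.allclose(Sy_t,  Sx*np.sin(wL*t) + Sy*np.cos(wL*t))
print("Heisenberg solutions checked")
```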


9.4 Spin of a Two-Electron System

9.4.1 Space of Two-Particle States

I will complete the discussion of spin by considering a system of two electrons. The goal of this exercise is to figure out if it makes sense to talk about a total spin of this system, understood as some kind of sum of the two individual spins $\hat{S}^{(1)} + \hat{S}^{(2)}$. In classical physics that would have been a trivial question: of course, we can define the total angular momentum of several particles; we just add them up, remembering that they are vectors. You can even derive a total angular momentum conservation law valid in the absence of external torques, just like you can derive a total linear momentum conservation law if the system of particles is not exposed to external forces. In quantum mechanics, where the individual spins are represented by operators acting in different spaces containing the spin states of each particle, the answer to this question is more complex. It is still affirmative: yes, it is possible to define the total spin of a system of two (or more) particles by introducing a new operator, which can be formally defined as
$$\hat{S}^{(tp)} = \hat{S}^{(1)} + \hat{S}^{(2)}, \tag{9.42}$$
where the upper index is an abbreviation of "two-particle." However, so far Eq. 9.42 is a purely formal expression, in which even the meaning of the sign "+" is not clear. What I need to do now is to figure out the properties of $\hat{S}^{(tp)}$ and their relation to the properties of $\hat{S}^{(1)}$ and $\hat{S}^{(2)}$, which is not a trivial task.

Operators are defined by their action on vectors, and, since vectors live in a certain vector space, the first step in defining an operator is to understand the space inhabited by the vectors on which the operator acts. Operators $\hat{S}^{(1)}$ and $\hat{S}^{(2)}$ operate on vectors that live in different and unrelated spaces: one acts on spin states of one particle and the other on the states of a completely different particle. I can, however, combine these spaces to form a new, extended space that includes the spin states of both particles. To define such a space, all I need is to define a basis in it, and then any other vector can be presented as a linear combination of the basis vectors. The space containing the spin states of each individual particle is defined by two basis vectors per particle. These states are eigenvectors of the operators $\hat{S}_z^{(1)}$ and $\hat{S}_z^{(2)}$ (obviously defined in the same coordinate system) and can be depicted symbolically in a few equivalent ways discussed in previous sections. Here I will use the spin-up and spin-down notation indicated by the vertical arrows $|\uparrow_{1,2}\rangle$ or $|\downarrow_{1,2}\rangle$, where subindexes 1 and 2 simply indicate the particle whose states these kets represent. In a system of two particles, there exist four different combinations of their spin states: both spins up, both spins down, the first spin up with the second spin down, and vice versa. You can create a notation for these states either by putting two state signifiers inside a single ket, like this: $|\uparrow_1, \downarrow_2\rangle$, or by sticking together two kets, like this: $|\uparrow_1\rangle|\downarrow_2\rangle$. The difference between the two notations is purely cosmetic, and either

one can be used freely, although the second notation with two separate kets is slightly more convenient when one needs to write down matrix elements of operators acting on different particles. Thus, I will present the four basis vectors of the new four-dimensional space containing the states of a two-spin system as
$$|1\rangle \equiv |\uparrow_1\rangle|\uparrow_2\rangle, \quad |2\rangle \equiv |\uparrow_1\rangle|\downarrow_2\rangle, \quad |3\rangle \equiv |\downarrow_1\rangle|\uparrow_2\rangle, \quad |4\rangle \equiv |\downarrow_1\rangle|\downarrow_2\rangle. \tag{9.43}$$
Conversion from kets to bras follows all the standard rules of Hermitian conjugation applied to the states of both particles.

A larger space formed from two smaller spaces in the described manner is
called in mathematics a tensor product of spaces. It has all the standard algebraic
properties of a linear vector space discussed in Sect. 2.2, and I only need to add the
distributive properties involving vectors belonging to different components of the
tensor product:

$$\left(|e_1\rangle + |e_2\rangle\right)|v_1\rangle \equiv |e_1\rangle|v_1\rangle + |e_2\rangle|v_1\rangle$$
$$|e_1\rangle\left(|v_1\rangle + |v_2\rangle\right) \equiv |e_1\rangle|v_1\rangle + |e_1\rangle|v_2\rangle. \tag{9.44}$$

The inner product in the tensor product space is defined as
$$\left(\langle e_1|\langle v_1|\right)\left(|e_2\rangle|v_2\rangle\right) = \langle e_1|e_2\rangle\langle v_1|v_2\rangle, \tag{9.45}$$

and it is quite obvious that this definition preserves the main property of the inner product, namely, that $\langle\beta|\alpha\rangle = \langle\alpha|\beta\rangle^*$. In the case of the two-spin system you can, for instance, find, using the notation from Eq. 9.43,
$$\langle 1|1\rangle = \langle\uparrow_1|\uparrow_1\rangle\langle\uparrow_2|\uparrow_2\rangle = 1,$$
where it is presumed that the vectors $|\uparrow_{1,2}\rangle$ are normalized. You can also find that inner products involving different basis vectors vanish, such as
$$\langle 1|2\rangle = \langle\uparrow_1|\uparrow_1\rangle\langle\uparrow_2|\downarrow_2\rangle = 0$$
$$\langle 1|3\rangle = \langle\uparrow_1|\downarrow_1\rangle\langle\uparrow_2|\uparrow_2\rangle = 0.$$
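With $|\uparrow\rangle$ and $|\downarrow\rangle$ represented as the standard two-component columns, the tensor product becomes NumPy's Kronecker product, and the inner products above can be checked mechanically; a small sketch:

```python
import numpy as np

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# the four basis vectors of Eq. 9.43 as Kronecker products
basis = [np.kron(up, up),      # |1> = |up_1>|up_2>
         np.kron(up, down),    # |2> = |up_1>|down_2>
         np.kron(down, up),    # |3> = |down_1>|up_2>
         np.kron(down, down)]  # |4> = |down_1>|down_2>

# Eq. 9.45 in action: the Gram matrix <i|j> is the 4x4 identity
gram = np.array([[bi @ bj for bj in basis] for bi in basis])
assert np.allclose(gram, np.eye(4))
print("basis of Eq. 9.43 is orthonormal")
```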

In reality you have already encountered the tensor product of spaces earlier in
this book, even though I never used the name. One example was the construction
of states of a three-dimensional harmonic oscillator from the states of the one-
dimensional oscillators.

To illustrate calculations of the inner product between vectors belonging to such
a tensor product space, consider the following example.

Example 24 (Working with Vectors from a Tensor Product Space) Compute the norms of the following vectors as well as the inner product $\langle\beta|\alpha\rangle$:
$$|\alpha\rangle = \left(3i|\uparrow_1\rangle + 4|\downarrow_1\rangle\right)\left(2|\uparrow_2\rangle - i|\downarrow_2\rangle\right)$$
$$|\beta\rangle = \left(2|\uparrow_1\rangle - i|\downarrow_1\rangle\right)\left(2|\uparrow_2\rangle - 3|\downarrow_2\rangle\right).$$

Solution

Since all the vectors in adjacent parenthetical expressions are kets and belong to different spaces, it is clear that I am dealing here with the tensor product of two spaces. The distributive properties expressed by Eq. 9.44 allow me to convert these expressions into
$$|\alpha\rangle = 6i|\uparrow_1\rangle|\uparrow_2\rangle - 4i|\downarrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle|\downarrow_2\rangle + 8|\downarrow_1\rangle|\uparrow_2\rangle$$
$$|\beta\rangle = 4|\uparrow_1\rangle|\uparrow_2\rangle + 3i|\downarrow_1\rangle|\downarrow_2\rangle - 6|\uparrow_1\rangle|\downarrow_2\rangle - 2i|\downarrow_1\rangle|\uparrow_2\rangle.$$
Note that the order in which vectors belonging to different spaces are stacked together is completely irrelevant. Using the normalized and orthogonal basis introduced in Eq. 9.43, I can rewrite these expressions as
$$|\alpha\rangle = 6i|1\rangle + 3|2\rangle + 8|3\rangle - 4i|4\rangle$$
$$|\beta\rangle = 4|1\rangle - 6|2\rangle - 2i|3\rangle + 3i|4\rangle.$$
Now I can compute the norms and the inner product following the standard procedure, which yields
$$\|\alpha\| = \sqrt{36 + 16 + 9 + 64} = \sqrt{125}, \qquad \|\beta\| = \sqrt{16 + 9 + 36 + 4} = \sqrt{65},$$
$$\langle\beta|\alpha\rangle = 4\cdot 6i - 6\cdot 3 + 2i\cdot 8 + (-3i)\cdot(-4i) = -30 + 40i.$$
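The same bookkeeping can be delegated to NumPy: `np.kron` expands the products of kets, and `np.vdot`, which conjugates its first argument, computes $\langle\beta|\alpha\rangle$. A sketch reproducing the numbers above:

```python
import numpy as np

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])

# |alpha> = (3i|up> + 4|down>)(2|up> - i|down>)
alpha = np.kron(3j*up + 4*down, 2*up - 1j*down)
# |beta> = (2|up> - i|down>)(2|up> - 3|down>)
beta = np.kron(2*up - 1j*down, 2*up - 3*down)

assert np.allclose(alpha, [6j, 3, 8, -4j])   # components in the basis of Eq. 9.43
assert np.isclose(np.linalg.norm(alpha)**2, 125)
assert np.isclose(np.linalg.norm(beta)**2, 65)
assert np.isclose(np.vdot(beta, alpha), -30 + 40j)
print("Example 24 checked")
```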

Finally, I need to introduce the rule describing how spin operators act on vectors in the tensor product space. The rule is actually very simple: each operator affects only the states of its own particle. To illustrate this rule, consider the following example.

Example 25 (Operator Action in Tensor Spaces) For the state $|\alpha\rangle$ from the previous example, compute

(a) $\left(\hat{S}_z^{(1)} + \hat{S}_z^{(2)}\right)|\alpha\rangle$

(b) $\left(\hat{S}_+^{(1)} + \hat{S}_+^{(2)}\right)|\alpha\rangle$

Solution

(a)
$$\left(\hat{S}_z^{(1)} + \hat{S}_z^{(2)}\right)\left(6i|\uparrow_1\rangle|\uparrow_2\rangle - 4i|\downarrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle|\downarrow_2\rangle + 8|\downarrow_1\rangle|\uparrow_2\rangle\right) =$$
$$6i\,\hat{S}_z^{(1)}|\uparrow_1\rangle|\uparrow_2\rangle + 6i\,|\uparrow_1\rangle\hat{S}_z^{(2)}|\uparrow_2\rangle - 4i\,\hat{S}_z^{(1)}|\downarrow_1\rangle|\downarrow_2\rangle - 4i\,|\downarrow_1\rangle\hat{S}_z^{(2)}|\downarrow_2\rangle + 3\,\hat{S}_z^{(1)}|\uparrow_1\rangle|\downarrow_2\rangle + 3\,|\uparrow_1\rangle\hat{S}_z^{(2)}|\downarrow_2\rangle + 8\,\hat{S}_z^{(1)}|\downarrow_1\rangle|\uparrow_2\rangle + 8\,|\downarrow_1\rangle\hat{S}_z^{(2)}|\uparrow_2\rangle =$$
$$3i\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 3i\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 2i\hbar|\downarrow_1\rangle|\downarrow_2\rangle + 2i\hbar|\downarrow_1\rangle|\downarrow_2\rangle + \frac{3\hbar}{2}|\uparrow_1\rangle|\downarrow_2\rangle - \frac{3\hbar}{2}|\uparrow_1\rangle|\downarrow_2\rangle - 4\hbar|\downarrow_1\rangle|\uparrow_2\rangle + 4\hbar|\downarrow_1\rangle|\uparrow_2\rangle = 6i\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 4i\hbar|\downarrow_1\rangle|\downarrow_2\rangle$$

(b)
$$\left(\hat{S}_+^{(1)} + \hat{S}_+^{(2)}\right)\left(6i|\uparrow_1\rangle|\uparrow_2\rangle - 4i|\downarrow_1\rangle|\downarrow_2\rangle + 3|\uparrow_1\rangle|\downarrow_2\rangle + 8|\downarrow_1\rangle|\uparrow_2\rangle\right) =$$
$$6i\,\hat{S}_+^{(1)}|\uparrow_1\rangle|\uparrow_2\rangle + 6i\,|\uparrow_1\rangle\hat{S}_+^{(2)}|\uparrow_2\rangle - 4i\,\hat{S}_+^{(1)}|\downarrow_1\rangle|\downarrow_2\rangle - 4i\,|\downarrow_1\rangle\hat{S}_+^{(2)}|\downarrow_2\rangle + 3\,\hat{S}_+^{(1)}|\uparrow_1\rangle|\downarrow_2\rangle + 3\,|\uparrow_1\rangle\hat{S}_+^{(2)}|\downarrow_2\rangle + 8\,\hat{S}_+^{(1)}|\downarrow_1\rangle|\uparrow_2\rangle + 8\,|\downarrow_1\rangle\hat{S}_+^{(2)}|\uparrow_2\rangle =$$
$$-4i\hbar|\uparrow_1\rangle|\downarrow_2\rangle - 4i\hbar|\downarrow_1\rangle|\uparrow_2\rangle + 3\hbar|\uparrow_1\rangle|\uparrow_2\rangle + 8\hbar|\uparrow_1\rangle|\uparrow_2\rangle = 11\hbar|\uparrow_1\rangle|\uparrow_2\rangle - 4i\hbar\left(|\uparrow_1\rangle|\downarrow_2\rangle + |\downarrow_1\rangle|\uparrow_2\rangle\right)$$
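In matrix form the rule reads: $\hat{S}^{(1)}$ becomes $S \otimes I$ and $\hat{S}^{(2)}$ becomes $I \otimes S$, where $I$ is the 2 × 2 identity. A sketch ($\hbar = 1$) reproducing both parts of the example:

```python
import numpy as np

hbar = 1.0
I2 = np.eye(2)
Sz = hbar/2 * np.diag([1.0, -1.0])
Sp = hbar * np.array([[0.0, 1.0], [0.0, 0.0]])   # raising operator: S+|down> = hbar|up>

up, down = np.array([1.0, 0.0]), np.array([0.0, 1.0])
alpha = np.kron(3j*up + 4*down, 2*up - 1j*down)  # the state from Example 24

# single-particle operators promoted to the tensor-product space
S1z, S2z = np.kron(Sz, I2), np.kron(I2, Sz)
S1p, S2p = np.kron(Sp, I2), np.kron(I2, Sp)

# (a): only the |up,up> and |down,down> components survive
res_a = (S1z + S2z) @ alpha
assert np.allclose(res_a, hbar*(6j*np.kron(up, up) + 4j*np.kron(down, down)))

# (b): 11*hbar|up,up> - 4i*hbar(|up,down> + |down,up>)
res_b = (S1p + S2p) @ alpha
assert np.allclose(res_b, hbar*(11*np.kron(up, up)
                                - 4j*(np.kron(up, down) + np.kron(down, up))))
print("Example 25 checked")
```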

9.4.2 Operator of the Total Spin

The concept of the tensor product gives an exact mathematical meaning to Eq. 9.42 and the "plus" sign in it, as illustrated by the previous example. Indeed, if each operator appearing on the right-hand side of this equation is defined to act in the common space of two-particle states, then the plus sign generates the regular operator sum as defined in earlier chapters of this book.

Now I can tackle the main question: what are the eigenvalues and eigenvectors of the components of the total spin operator defined by Eq. 9.42, and of its square $\left(\hat{S}^{(tp)}\right)^2 = \left(\hat{S}^{(1)} + \hat{S}^{(2)}\right)^2$? When discussing any system of operators, the first question you must be concerned with is the commutation relations between these operators. The first commutator that needs to be dealt with is between operators $\hat{S}^{(1)}$ and $\hat{S}^{(2)}$, and it is quite obvious that any two components of these operators commute, i.e.,
$$\left[\hat{S}_i^{(1)}, \hat{S}_j^{(2)}\right] = 0$$


for all i, j taking values x, y, and z. Indeed, since $\hat{S}_i^{(1)}$ affects only the states of particle 1, and $\hat{S}_j^{(2)}$ acts only on the states of particle 2, the order in which these operators are applied is irrelevant. Now it is quite easy to establish that all commutation relations for the components of operator $\hat{S}^{(tp)}$ and its square are exactly the same as for any angular momentum operator. This justifies the claim that there exists a system of vectors $|S,M\rangle$ which are common eigenvectors of one of the components of $\hat{S}^{(tp)}$, usually chosen to be $\hat{S}_z^{(tp)}$, and of the operator $\left(\hat{S}^{(tp)}\right)^2$, characterized by two numbers M and S such that
$$\hat{S}_z^{(tp)}|S,M\rangle = \hbar M|S,M\rangle$$
$$\left(\hat{S}^{(tp)}\right)^2|S,M\rangle = \hbar^2 S(S+1)|S,M\rangle.$$

It can also be claimed that $|M| \le S$ and that these numbers take integer or half-integer values. What is missing at this point is information about the actual values that S can take and their relation to the eigenvalues of the spin operators of the individual particles. One would also like to know about the connection between the eigenvectors of $\hat{S}_z^{(tp)}$ and $\left(\hat{S}^{(tp)}\right)^2$ and their single-particle counterparts. To answer these questions, I am going to generate matrix representations of the operators $\hat{S}_z^{(tp)}$ and $\left(\hat{S}^{(tp)}\right)^2$, using the basis vectors defined in Eq. 9.43. I will start with operator $\hat{S}_z^{(tp)}$. The application of this operator to the basis vectors yields (see the examples above)

$$\hat{S}_z^{(tp)}|\uparrow_1\rangle|\uparrow_2\rangle = \hbar\,|\uparrow_1\rangle|\uparrow_2\rangle \tag{9.46}$$
$$\hat{S}_z^{(tp)}|\uparrow_1\rangle|\downarrow_2\rangle = 0 \tag{9.47}$$
$$\hat{S}_z^{(tp)}|\downarrow_1\rangle|\uparrow_2\rangle = 0 \tag{9.48}$$
$$\hat{S}_z^{(tp)}|\downarrow_1\rangle|\downarrow_2\rangle = -\hbar\,|\downarrow_1\rangle|\downarrow_2\rangle. \tag{9.49}$$
These results indicate that the basis vectors defined in Eq. 9.43 are also eigenvectors of the operator $\hat{S}_z^{(tp)}$ with eigenvalues $\pm\hbar$ and a doubly degenerate eigenvalue 0. Thus, the matrix of this operator in this basis is a diagonal 4 × 4 matrix:
$$\hat{S}_z^{(tp)} = \hbar\begin{bmatrix}1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & 0 & -1\end{bmatrix},$$

where I have positioned the matrix elements in accord with the numbering of eigenvectors introduced in Eq. 9.43. For instance, the right-hand side of Eq. 9.46, where operator $\hat{S}_z^{(tp)}$ acts on the first of the basis vectors, represents the first column of the matrix, which contains a single non-zero element; the right-hand side of Eq. 9.47 yields the second column, where all elements are zeroes; and so on and so forth.

The operator $\left(\hat{S}^{(tp)}\right)^2$ requires more work. First, let me rewrite it in terms of the particles' operators $\hat{S}^{(1)}$ and $\hat{S}^{(2)}$:
$$\left(\hat{S}^{(tp)}\right)^2 = \left(\hat{S}^{(1)} + \hat{S}^{(2)}\right)^2 = \left(\hat{S}^{(1)}\right)^2 + \left(\hat{S}^{(2)}\right)^2 + 2\,\hat{S}^{(1)}\cdot\hat{S}^{(2)} =$$
$$\left(\hat{S}^{(1)}\right)^2 + \left(\hat{S}^{(2)}\right)^2 + 2\left(\hat{S}_x^{(1)}\hat{S}_x^{(2)} + \hat{S}_y^{(1)}\hat{S}_y^{(2)} + \hat{S}_z^{(1)}\hat{S}_z^{(2)}\right) =$$
$$\left(\hat{S}^{(1)}\right)^2 + \left(\hat{S}^{(2)}\right)^2 + 2\,\hat{S}_z^{(1)}\hat{S}_z^{(2)} + 2\left[\frac{\hat{S}_+^{(1)} + \hat{S}_-^{(1)}}{2}\,\frac{\hat{S}_+^{(2)} + \hat{S}_-^{(2)}}{2} + \frac{\hat{S}_+^{(1)} - \hat{S}_-^{(1)}}{2i}\,\frac{\hat{S}_+^{(2)} - \hat{S}_-^{(2)}}{2i}\right] =$$
$$\left(\hat{S}^{(1)}\right)^2 + \left(\hat{S}^{(2)}\right)^2 + 2\,\hat{S}_z^{(1)}\hat{S}_z^{(2)} + \hat{S}_+^{(1)}\hat{S}_-^{(2)} + \hat{S}_-^{(1)}\hat{S}_+^{(2)},$$
where I replaced the x- and y-components of the spin operators by the ladder operators defined in Eqs. 3.59 and 3.60, adapted for spin operators. The last expression is perfectly suited for generating the matrix of $\left(\hat{S}^{(tp)}\right)^2$. Applying this operator to each of the basis vectors, I can again simply read out the columns of this matrix:

$$\left(\hat{S}^{(tp)}\right)^2|1\rangle = \left[\left(\hat{S}^{(1)}\right)^2 + \left(\hat{S}^{(2)}\right)^2 + 2\,\hat{S}_z^{(1)}\hat{S}_z^{(2)} + \hat{S}_+^{(1)}\hat{S}_-^{(2)} + \hat{S}_-^{(1)}\hat{S}_+^{(2)}\right]|\uparrow_1\rangle|\uparrow_2\rangle = \tag{9.50}$$
$$\frac{3}{4}\hbar^2|\uparrow_1\rangle|\uparrow_2\rangle + \frac{3}{4}\hbar^2|\uparrow_1\rangle|\uparrow_2\rangle + \frac{1}{2}\hbar^2|\uparrow_1\rangle|\uparrow_2\rangle = 2\hbar^2|\uparrow_1\rangle|\uparrow_2\rangle \equiv 2\hbar^2|1\rangle.$$

The ladder operators do not contribute to the final result because the raising operator applied to a spin-up vector yields zero. All other terms in this expression follow from the standard properties of the spin operators. Continuing,

$$\left(\hat{S}^{(tp)}\right)^2|2\rangle = \left[\left(\hat{S}^{(1)}\right)^2 + \left(\hat{S}^{(2)}\right)^2 + 2\,\hat{S}_z^{(1)}\hat{S}_z^{(2)} + \hat{S}_+^{(1)}\hat{S}_-^{(2)} + \hat{S}_-^{(1)}\hat{S}_+^{(2)}\right]|\uparrow_1\rangle|\downarrow_2\rangle =$$
$$\frac{3}{4}\hbar^2|\uparrow_1\rangle|\downarrow_2\rangle + \frac{3}{4}\hbar^2|\uparrow_1\rangle|\downarrow_2\rangle - \frac{1}{2}\hbar^2|\uparrow_1\rangle|\downarrow_2\rangle + \hbar^2|\downarrow_1\rangle|\uparrow_2\rangle = \hbar^2|2\rangle + \hbar^2|3\rangle, \tag{9.51}$$

where the ladder operators in the term $\hat{S}_-^{(1)}\hat{S}_+^{(2)}$ are responsible for the non-zero contribution in which the spin of each particle is turned upside down. And again,

$$\left(\hat{S}^{(tp)}\right)^2|3\rangle = \left[\left(\hat{S}^{(1)}\right)^2 + \left(\hat{S}^{(2)}\right)^2 + 2\,\hat{S}_z^{(1)}\hat{S}_z^{(2)} + \hat{S}_+^{(1)}\hat{S}_-^{(2)} + \hat{S}_-^{(1)}\hat{S}_+^{(2)}\right]|\downarrow_1\rangle|\uparrow_2\rangle =$$
$$\frac{3}{4}\hbar^2|\downarrow_1\rangle|\uparrow_2\rangle + \frac{3}{4}\hbar^2|\downarrow_1\rangle|\uparrow_2\rangle - \frac{1}{2}\hbar^2|\downarrow_1\rangle|\uparrow_2\rangle + \hbar^2|\uparrow_1\rangle|\downarrow_2\rangle = \hbar^2|3\rangle + \hbar^2|2\rangle, \tag{9.52}$$

where the inversion of the spins in the last term is due to the operators $\hat{S}_+^{(1)}\hat{S}_-^{(2)}$. Finally,
$$\left(\hat{S}^{(tp)}\right)^2|4\rangle = \left[\left(\hat{S}^{(1)}\right)^2 + \left(\hat{S}^{(2)}\right)^2 + 2\,\hat{S}_z^{(1)}\hat{S}_z^{(2)} + \hat{S}_+^{(1)}\hat{S}_-^{(2)} + \hat{S}_-^{(1)}\hat{S}_+^{(2)}\right]|\downarrow_1\rangle|\downarrow_2\rangle = \tag{9.53}$$
$$\frac{3}{4}\hbar^2|\downarrow_1\rangle|\downarrow_2\rangle + \frac{3}{4}\hbar^2|\downarrow_1\rangle|\downarrow_2\rangle + \frac{1}{2}\hbar^2|\downarrow_1\rangle|\downarrow_2\rangle = 2\hbar^2|\downarrow_1\rangle|\downarrow_2\rangle \equiv 2\hbar^2|4\rangle.$$

Reading out columns 1 through 4 from Eqs. 9.50–9.53, I generate the desired matrix:
$$\left[\left(\hat{S}^{(tp)}\right)^2\right]_{ij} = \hbar^2\begin{bmatrix}2 & 0 & 0 & 0\\ 0 & 1 & 1 & 0\\ 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 2\end{bmatrix}.$$

What is left now is to find its eigenvalues and eigenvectors, i.e., to solve the eigenvalue problem
$$\hbar^2\begin{bmatrix}2 & 0 & 0 & 0\\ 0 & 1 & 1 & 0\\ 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 2\end{bmatrix}\begin{bmatrix}a_1\\ a_2\\ a_3\\ a_4\end{bmatrix} = \lambda\hbar^2\begin{bmatrix}a_1\\ a_2\\ a_3\\ a_4\end{bmatrix}.$$

Two of the eigenvalues can be found just by looking at Eqs. 9.50 and 9.53, which indicate that vectors $|1\rangle$ and $|4\rangle$ are eigenvectors of this matrix with eigenvalue $2\hbar^2$ (i.e., $\lambda = 2$). This circumstance is reflected in the structure of the matrix, where the first and fourth rows, as well as the first and fourth columns, contain a single non-zero element each. Such matrices are known as block-diagonal, and what makes them special is that each block can be considered independently of the others and treated accordingly. For instance, the equations for $a_1$ and $a_4$ do not contain any other coefficients, while the equations for the elements $a_2$ and $a_3$ contain only these two elements. Since I already know that the solutions with $a_1 = 1$, $a_{2,3,4} = 0$ and $a_4 = 1$, $a_{1,2,3} = 0$ are eigenvectors corresponding to $\lambda = 2$, I only need to deal with the remaining two coefficients $a_2$ and $a_3$, which satisfy
$$a_2 + a_3 = \lambda a_2$$
$$a_2 + a_3 = \lambda a_3.$$

It immediately follows from this system that either $a_2 = a_3$ or $\lambda = 0$. In the former case, I have $\lambda = 2$, while the latter one gives me $a_2 = -a_3$.

Thus, I end up once again with eigenvalue $2\hbar^2$, but now it belongs to the eigenvector
$$\frac{1}{\sqrt{2}}\left(|2\rangle + |3\rangle\right) = \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle + |\downarrow_1\rangle|\uparrow_2\rangle\right),$$
where I set $a_2 = a_3 = 1/\sqrt{2}$ to make this vector normalized. I also got a new eigenvalue equal to zero, with eigenvector
$$\frac{1}{\sqrt{2}}\left(|2\rangle - |3\rangle\right) = \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle - |\downarrow_1\rangle|\uparrow_2\rangle\right).$$

Recalling that the eigenvalues of $\left(\hat{S}^{(tp)}\right)^2$ must have the form $\hbar^2 S(S+1)$, I can immediately deduce that eigenvalue $2\hbar^2$ corresponds to $S = 1$, while eigenvalue zero, obviously, corresponds to $S = 0$.
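The entire construction can be cross-checked by assembling $\left(\hat{S}^{(tp)}\right)^2$ from Kronecker products and diagonalizing the resulting 4 × 4 matrix; a sketch with $\hbar = 1$:

```python
import numpy as np

hbar = 1.0
I2 = np.eye(2)
Sx = hbar/2 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = hbar/2 * np.array([[0, -1j], [1j, 0]])
Sz = hbar/2 * np.array([[1, 0], [0, -1]], dtype=complex)

# components of the total spin in the tensor-product space
Stot = [np.kron(S, I2) + np.kron(I2, S) for S in (Sx, Sy, Sz)]
Stot2 = sum(S @ S for S in Stot)

# the 4x4 matrix derived in the text
assert np.allclose(Stot2, hbar**2 * np.array([[2, 0, 0, 0],
                                              [0, 1, 1, 0],
                                              [0, 1, 1, 0],
                                              [0, 0, 0, 2]]))

# spectrum: 0 once (singlet, S = 0) and 2*hbar^2 three times (triplet, S = 1)
vals = np.linalg.eigvalsh(Stot2)
assert np.allclose(vals, [0, 2, 2, 2])

# the singlet (|2> - |3>)/sqrt(2) is annihilated by the total spin squared
singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)
assert np.allclose(Stot2 @ singlet, 0)
print("triplet/singlet structure checked")
```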

It is time to put all these results together. Here is what I have: a triply degenerate eigenvalue characterized by spin $S = 1$ and three eigenvectors
$$|1,1\rangle = |\uparrow_1\rangle|\uparrow_2\rangle$$
$$|1,0\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle + |\downarrow_1\rangle|\uparrow_2\rangle\right) \tag{9.54}$$
$$|1,-1\rangle = |\downarrow_1\rangle|\downarrow_2\rangle$$
and a single non-degenerate eigenvalue corresponding to $S = 0$ with eigenvector
$$|0,0\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow_1\rangle|\downarrow_2\rangle - |\downarrow_1\rangle|\uparrow_2\rangle\right) \tag{9.55}$$


attached to it. The notation used for these eigenvectors follows the traditional scheme $|S,M\rangle$ and reflects the fact that all three eigenvectors in Eq. 9.54 are simultaneously eigenvectors of operator $\hat{S}_z^{(tp)}$ with corresponding quantum numbers $M = 1$, $M = 0$, and $M = -1$, while the single eigenvector in Eq. 9.55 is also an eigenvector of $\hat{S}_z^{(tp)}$ corresponding to $M = 0$. You might want to pay attention to the fact that both superposition eigenvectors $|1,0\rangle$ and $|0,0\rangle$ are linear combinations of the eigenvectors of $\hat{S}_z^{(tp)}$ established in Eqs. 9.47 and 9.48, which belong to the doubly degenerate eigenvalue 0 of $\hat{S}_z^{(tp)}$; this reflects the general notion that linear combinations of degenerate eigenvectors are also eigenvectors belonging to the same eigenvalue. The particular combinations appearing in Eqs. 9.54 and 9.55 ensure that these vectors are simultaneously eigenvectors of the operator $\left(\hat{S}^{(tp)}\right)^2$. The results presented in these equations also reflect a general property of angular momentum operators: the value of quantum number S determines the maximum and minimum allowed values of the second quantum number M and, respectively, the total number $2S + 1$ of eigenvectors belonging to the given eigenvalue of $\left(\hat{S}^{(tp)}\right)^2$. Indeed, for $S = 1$, we have three vectors with M ranging from $-1$ to 1, while for $S = 0$, there exists a single vector with $M = 0$. This situation is often described by saying that a system of two spin-1/2 particles can be in two states characterized by the total spin equal to one or zero. The former is called a triplet state, reflecting the existence of three distinct states with the same S and different magnetic numbers M, and the latter is called a singlet for obvious enough reasons. People also often say that in the triplet state the spins of the particles are parallel to each other, while in the singlet state they are antiparallel, but this is highly misleading. Even leaving aside the obvious quantum mechanical fact that the direction of spin in quantum mechanics is not defined, because only one component of the vector can have a definite value in a given state, "parallel" or "antiparallel" can refer only to the sign of the z-component of the spin determined by the value of M. As we have just seen, this number can be equal to zero, reflecting the "antiparallel" orientation of the particles' spins, when the particles are in either the $S = 1$ or the $S = 0$ state. Therefore, a more accurate verbal description of the situation (if you really need one) may sound like this: in the triplet spin states, the particles' spins can be either parallel or antiparallel, while in the singlet state, they can only be antiparallel.

To complete this discussion, let me direct your attention to another interesting difference between the triplet and singlet states. The former are symmetric with respect to the exchange of the particles, while the latter is antisymmetric. What this means is that if you swap particle indexes 1 and 2 in Eqs. 9.54 and 9.55 (exchange particles one and two), the states described by the former equation do not change, while the singlet state described by the latter changes its sign. The particle-exchange operation reflects the classical idea that you can somehow distinguish between the particles, marking them as one and two, and then swap them by placing particle one in the state of particle two and vice versa. In quantum mechanics two electrons are not really distinguishable, and, therefore, the swapping operation shouldn't change the properties of the system. This topic will be discussed in much more detail in Chap. 11, devoted to the quantum mechanics of many identical particles. Here I just want to mention, giving you a brief preview of what is coming, that the symmetry and antisymmetry of the spin states of the two-particle system are a reflection of the quantum indistinguishability of electrons.

9.5 Operator of Total Angular Momentum

9.5.1 Combining Orbital and Spin Degrees of Freedom

When discussing the model of spin 1/2 or the addition of two such spins, I intentionally ignored the fact that the spin is "attached" to a particle, which can be involved in all kinds of crazy things such as being part of an atom or rushing through a piece of metal delivering an electric current. At the same time, such phenomena as resonant tunneling or the hydrogen atom were treated in the previous chapters in utter ignorance of the fact that in addition to "regular" observables, such as position or momentum, electrons also carry around their spin, which is as unalienable as their mass or charge. Now the time has come to design a formalism allowing us to treat the spin and orbital properties2 of electrons (and other particles with spin) together.

First of all, one needs to recognize that spinors and orbital vectors are completely different animals inhabiting different habitats. For instance, while you can represent eigenvectors of momentum and angular momentum in the same, say, position representation or express them in terms of each other, it is impossible to construct a position representation for the eigenvectors of the spin operators or to present momentum eigenvectors as a linear combination of spinors. Accordingly, operators acting on orbital vectors do not affect spinors, and spin operators are indifferent to vectors representing orbital states. One of the trivial consequences of this is, of course, that orbital and spin operators always commute. Giving these statements a bit of thought, you can notice a certain similarity with the just-discussed two-spin problem, where we also had to deal with vectors belonging to two unrelated spaces, each acted upon only by its "native" operators. That situation was handled by combining spinors representing spin states of different particles into a common space formed as a tensor product of the spaces of each individual spin. Similarly, the spin and orbital spaces of a single particle can also be combined into a tensor product space by stacking together all combinations of the basis vectors from both spaces. Assuming that the orbital space is described by some discrete basis $\left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle$ based on a set of mutually consistent observables, a typical basis vector in the compound tensor product space can be made to look something like this:
$$\left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle|m_s\rangle, \tag{9.56}$$

2By orbital properties I understand all those properties of the particle that can be described using
quantum states related to position or momentum operators or a combination thereof. In what
follows I will call these states and vectors representing them orbital states or orbital vectors.


where $|m_s\rangle$ is a basis spinor ($m_s = \pm 1/2$). Since there are only two of those, the dimension of the combined space is twice the dimension of the orbital space. Indeed, attaching a spin state to each orbital basis vector $\left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle$, I am generating two new basis vectors:
$$\left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle|1/2\rangle \qquad\text{and}\qquad \left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle|-1/2\rangle,$$
or, if you prefer,
$$\left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle|\uparrow\rangle \qquad\text{and}\qquad \left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle|\downarrow\rangle.$$

Sometimes the indicator of the spin state is put inside a single ket or bra vector together with the signifiers of all the other observables:
$$\left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle|m_s\rangle \equiv \left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}, m_s\right\rangle, \tag{9.57}$$

but this notation hides the critical difference between the spin and orbital observ-
ables and makes some calculations less intuitive, so I would prefer using the notation
of Eq. 9.56 most of the time. Nevertheless, sometimes it might be appropriate to
use the simplified notation of Eq. 9.57, and if you notice me doing it, do not
start throwing stones—this is just a notation, chosen based on convenience and a
moment’s expedience.

An arbitrary vector $|\xi\rangle$ residing in the tensor product space can be presented as
$$|\xi\rangle = \sum_{km\cdots p} a_{km\cdots p;\uparrow}\left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle|\uparrow\rangle + \sum_{km\cdots p} a_{km\cdots p;\downarrow}\left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle|\downarrow\rangle. \tag{9.58}$$

The expansion coefficients $a_{km\cdots p;\uparrow}$ now define the probability $\left|a_{km\cdots p;\uparrow}\right|^2$ that a measurement of the mutually consistent observables, including a component of the spin, will yield the values $q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}$ for the regular observables and $\hbar/2$ for the spin component. The set of coefficients $a_{km\cdots p;\downarrow}$ defines the probability $\left|a_{km\cdots p;\downarrow}\right|^2$ that the observation will produce the same values of all the orbital observables and the value $-\hbar/2$ for the spin. The sum of probabilities
$$p_{km\cdots p} = \left|a_{km\cdots p;\uparrow}\right|^2 + \left|a_{km\cdots p;\downarrow}\right|^2$$
yields the probability of observing the given values of the observables provided that the spin is not measured, while the sums

p" D
X

km;���p

ˇ̌
akm���pI"

ˇ̌2

or

p# D
X

km;���p

ˇ̌
akm���pI#

ˇ̌2

generate probabilities of getting values of the spin component „=2 or �„=2,
respectively, regardless of the values of other observables. Finally, the normalization
condition for the expansion coefficients must now include the summation over all
available variables:

X
km;���p

hˇ̌
akm���pI"

ˇ̌2 C ˇ̌akm���pI#
ˇ̌2i D 1: (9.59)
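As a quick illustration of this bookkeeping (with hypothetical numbers not tied to any particular system), one can store the coefficients in an array `a[k, s]`, with k labeling the orbital configuration and s the spin state:

```python
import numpy as np

# hypothetical normalized coefficients for three orbital configurations;
# column 0 holds a_{k,up}, column 1 holds a_{k,down}
a = np.array([[0.5, 0.5j],
              [0.5, 0.0],
              [0.0, 0.5]])
assert np.isclose(np.sum(np.abs(a)**2), 1.0)   # normalization, Eq. 9.59

p_orbital = np.sum(np.abs(a)**2, axis=1)       # spin not measured
p_up, p_down = np.sum(np.abs(a)**2, axis=0)    # orbital values not recorded

assert np.isclose(p_orbital.sum(), 1.0)
assert np.isclose(p_up + p_down, 1.0)
print(p_up, p_down)   # 0.5 0.5
```

Summing the squared amplitudes along one index marginalizes over the corresponding set of observables, exactly as in the formulas above.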

Equations 9.56–9.58 are written under the assumption that the basis in the orbital space is discrete. However, they can be easily adapted to representations in a continuous basis by replacing all the sums with integrals and the probabilities with corresponding probability densities. For instance, in the basis of the position eigenvectors $|\mathbf{r}\rangle$, Eqs. 9.56 and 9.58 become $|\mathbf{r}\rangle|m_s\rangle$ and
$$|\xi\rangle = \int d^3r\, \psi_\uparrow(\mathbf{r})\,|\mathbf{r}\rangle|\uparrow\rangle + \int d^3r\, \psi_\downarrow(\mathbf{r})\,|\mathbf{r}\rangle|\downarrow\rangle. \tag{9.60}$$
Now $|\psi_{m_s}(\mathbf{r})|^2$ gives the position probability density for the corresponding spin state $|m_s\rangle$, $|\psi_\uparrow(\mathbf{r})|^2 + |\psi_\downarrow(\mathbf{r})|^2$ yields the same when the spin state is not important, and $\int d^3r\, |\psi_{m_s}(\mathbf{r})|^2$ generates the probability of finding the particle in the spin state $|m_s\rangle$. The normalization condition, Eq. 9.59, now becomes
$$\int d^3r \left[|\psi_\uparrow(\mathbf{r})|^2 + |\psi_\downarrow(\mathbf{r})|^2\right] = 1. \tag{9.61}$$

One can generate particular representations of generic vectors by choosing specific bases for the orbital and spinor components of the states. One of the most popular choices is to use the position representation for the orbital vectors and the eigenvectors of operator $\hat{S}_z$ for the spinor component. The respective representation is generated by premultiplying $\left|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle$ by $\langle\mathbf{r}|$, which yields
$$\psi_{q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}}(\mathbf{r}) = \left\langle\mathbf{r}\middle|q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}\right\rangle,$$
and by replacing $|m_s\rangle$ with the corresponding two-component column: $\begin{bmatrix}1\\ 0\end{bmatrix}$ for the spin-up (or $+1/2$) state and $\begin{bmatrix}0\\ 1\end{bmatrix}$ for the spin-down (or $-1/2$) state. Since the

coordinate representation for the orbital states is almost always used in conjunction
with the representation of spinors in the basis of the eigenvectors of OSz operator, I
will call this form the coordinate–spinor representation. Then the combined spin–
orbital state takes the form

q
.1/
k ;q

.2/
m ;���q.Nmax/p .r/

1

0

�

or

q
.1/
k ;q

.2/
m ;���q.Nmax/p .r/

0

1

�
;

depending on the chosen spin state. The generic state vector represented by Eq. 9.58 becomes in this representation (I will keep the same notation for the abstract vector and its coordinate–spinor representation to avoid introducing new symbols when it is not really necessary; this should not cause any confusion)
$$|\xi\rangle = \sum_{km\cdots p} a_{km\cdots p;\uparrow}\,\psi_{q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}}(\mathbf{r})\begin{bmatrix}1\\ 0\end{bmatrix} + \sum_{km\cdots p} a_{km\cdots p;\downarrow}\,\psi_{q_k^{(1)}, q_m^{(2)}, \cdots, q_p^{(N_{max})}}(\mathbf{r})\begin{bmatrix}0\\ 1\end{bmatrix} =$$
$$\Psi_\uparrow(\mathbf{r},t)\begin{bmatrix}1\\ 0\end{bmatrix} + \Psi_\downarrow(\mathbf{r},t)\begin{bmatrix}0\\ 1\end{bmatrix} = \begin{bmatrix}\Psi_\uparrow(\mathbf{r},t)\\ \Psi_\downarrow(\mathbf{r},t)\end{bmatrix}, \tag{9.62}$$
where

‰".r; t/ D
X

km;���p
akm���pI".t/ q.1/k ;q.2/m ;���q.Nmax/p .r/

‰#.r; t/ D
X

km;���p
akm���pI#.t/ q.1/k ;q.2/m ;���q.Nmax/p .r/ (9.63)


are the orbital wave functions corresponding to the spin-up and spin-down states, respectively. These functions appear in Eq. 9.63 as linear combinations of the initial basis vectors transformed into their position representations. Obviously, $\Psi_\uparrow(\mathbf{r},t)$ and $\Psi_\downarrow(\mathbf{r},t)$ in these expressions are the same functions that appear in Eq. 9.60, presenting the expansion of an abstract vector in the basis of position eigenvectors.

Any combination of orbital and spin operators acts on the vectors defined by Eq. 9.58 or 9.62 following a simple rule: orbital operators act on the orbital component of the vector, and spin operators affect only its spin component. To illustrate this point, consider the following example.

Example 26 (Using Operators of Orbital and Spin Angular Momentum) Consider the following vector representing a state of an electron in a hydrogen atom:
$$|\alpha\rangle = \frac{2}{3}|2,1,-1\rangle|\uparrow\rangle + \frac{1}{3}|1,0,0\rangle|\downarrow\rangle - \frac{1}{3}|2,0,0\rangle|\uparrow\rangle + \frac{1}{\sqrt{3}}|2,1,1\rangle|\downarrow\rangle,$$

where the orbital portion of the state follows the standard notation jn; l;mi. Compute
the following expressions:

1. $\langle\alpha|\hat{H}|\alpha\rangle$, where $\hat{H}$ is the Hamiltonian of a hydrogen atom, Eq. 8.6.
2. $\left(\hat{L}_+\hat{S}_- + \hat{L}_-\hat{S}_+\right)|\alpha\rangle$.
3. $\left(\hat{L}_z + \hat{S}_z\right)|\alpha\rangle$.
4. Write down the vector $|\alpha\rangle$ in the coordinate–spinor representation.
Solution

1. I begin by computing

$$\hat{H}|\alpha\rangle = -\frac{2}{3}\frac{E_1}{4}\,|2,1,-1\rangle|\uparrow\rangle - \frac{1}{3}E_1\,|1,0,0\rangle|\downarrow\rangle + \frac{1}{3}\frac{E_1}{4}\,|2,0,0\rangle|\uparrow\rangle - \frac{1}{\sqrt{3}}\frac{E_1}{4}\,|2,1,1\rangle|\downarrow\rangle,$$

where $-E_1$ is the hydrogen ground state energy. Now I find

$$\langle\alpha|\hat{H}|\alpha\rangle = -\frac{E_1}{9} - \frac{E_1}{9} - \frac{E_1}{36} - \frac{E_1}{12} = -\frac{E_1}{3},$$

where I took into account that all terms in the expression above remain mutually orthogonal, so that all cross terms in the inner product vanish. The spin components of the state are not affected by the Hamiltonian because it does not contain any spin operators.

308 9 Spin 1/2

2.

$$\left(\hat{L}_+\hat{S}_- + \hat{L}_-\hat{S}_+\right)|\alpha\rangle = \frac{2}{3}\sqrt{2}\,\hbar^2\,|2,1,0\rangle|\downarrow\rangle + \frac{1}{\sqrt{3}}\sqrt{2}\,\hbar^2\,|2,1,0\rangle|\uparrow\rangle = \sqrt{\frac{2}{3}}\,\hbar^2\,|2,1,0\rangle\left(\frac{2}{\sqrt{3}}\,|\downarrow\rangle + |\uparrow\rangle\right),$$

where I applied the orbital and spin ladder operators separately to the corresponding orbital and spin portions of the vectors, using Eqs. 3.75, 3.76, 5.104, and 5.106, respectively. In particular, I found that

$$\hat{L}_+\hat{S}_-\,|2,1,1\rangle|\downarrow\rangle = \hat{L}_+|2,1,1\rangle\,\hat{S}_-|\downarrow\rangle = 0$$

as well as

$$\hat{L}_-\hat{S}_+\,|2,1,-1\rangle|\uparrow\rangle = \hat{L}_-|2,1,-1\rangle\,\hat{S}_+|\uparrow\rangle = 0.$$

3.

$$\left(\hat{L}_z + \hat{S}_z\right)|\alpha\rangle = -\hbar\,\frac{2}{3}\,|2,1,-1\rangle|\uparrow\rangle + \frac{2}{3}\frac{\hbar}{2}\,|2,1,-1\rangle|\uparrow\rangle - \frac{1}{3}\frac{\hbar}{2}\,|1,0,0\rangle|\downarrow\rangle - \frac{1}{3}\frac{\hbar}{2}\,|2,0,0\rangle|\uparrow\rangle + \frac{1}{\sqrt{3}}\hbar\,|2,1,1\rangle|\downarrow\rangle - \frac{1}{\sqrt{3}}\frac{\hbar}{2}\,|2,1,1\rangle|\downarrow\rangle = \frac{\hbar}{2}\left(-\frac{2}{3}\,|2,1,-1\rangle|\uparrow\rangle - \frac{1}{3}\,|1,0,0\rangle|\downarrow\rangle - \frac{1}{3}\,|2,0,0\rangle|\uparrow\rangle + \frac{1}{\sqrt{3}}\,|2,1,1\rangle|\downarrow\rangle\right).$$

4. The coordinate–spinor representation of the vector $|\alpha\rangle$ looks like this:

$$\begin{bmatrix} \frac{2}{3}R_{21}(r)Y_1^{-1}(\theta,\varphi) - \frac{1}{3\sqrt{4\pi}}R_{20}(r) \\[1mm] \frac{1}{3\sqrt{4\pi}}R_{10}(r) + \frac{1}{\sqrt{3}}R_{21}(r)Y_1^{1}(\theta,\varphi) \end{bmatrix}.$$

If $\Psi_\uparrow(\mathbf{r},t)$ and $\Psi_\downarrow(\mathbf{r},t)$ can be written as

$$\Psi_\uparrow(\mathbf{r},t) = a_1(t)\,\psi(\mathbf{r},t);\qquad \Psi_\downarrow(\mathbf{r},t) = a_2(t)\,\psi(\mathbf{r},t), \tag{9.64}$$

Eq. 9.62 becomes

$$|\xi\rangle = \psi(\mathbf{r},t)\begin{pmatrix} a_1(t) \\ a_2(t) \end{pmatrix}, \tag{9.65}$$

resulting in the separation of the spin and orbital components of the state. The spin and orbital properties of the particle in such a state are completely independent of each other, and changing one of them wouldn't affect the other. In a more generic case, when $\Psi_\uparrow(\mathbf{r})$ and $\Psi_\downarrow(\mathbf{r})$ are two different functions, the orbital state of the particle depends on its spin state and vice versa. This interdependence is called "spin–orbit coupling" and is responsible for many important phenomena. Some of them have been known for a century, while others were discovered only recently. For instance, spin–orbit interaction is responsible for the fine structure of atomic spectra (an old phenomenon known from the earliest days of quantum mechanics), but it also gave birth to an entirely new "hot" research area in contemporary semiconductor physics
known as spintronics. Researchers working in this field seek to control the flow of
electrons using their spin as a steering wheel and also to control the orientation of
an electron’s spin by affecting its electric current. I will talk more about spin–orbit
coupling and its effect on atomic spectra in Chap. 14, but for the spintronics effects,
you will have to consult a more specialized book.

While the abstract form of the Schrödinger equation

$$i\hbar\frac{\partial\,|\xi\rangle}{\partial t} = \hat{H}\,|\xi\rangle$$

stays the same even when the spin and orbital degrees of freedom are combined,
its position representation, which is frequently used for practical calculations,
needs to be modified. Indeed, in the representation described by Eq. 9.62, the state of a particle is specified by two wave functions corresponding to two different spin states. Accordingly, a single Schrödinger equation becomes a system of two equations, whose form depends on the interactions included in the Hamiltonian.
To find the explicit form of these equations, you will need to convert the operator $\hat{H}$ into the combined position–spinor representation. This can be done independently for the orbital and spin portions of the Hamiltonian, with the result presented in the form

$$\hat{H} \to \hat{H}_{m_s,m_s'}(\mathbf{r}) \equiv \langle m_s|\,\hat{H}(\mathbf{r})\,|m_s'\rangle,$$

where $m_s, m_s'$ take values 1 or 2 corresponding, respectively, to $m_s = 1/2$ and $m_s = -1/2$. Thus, the Hamiltonian in the presence of the spin becomes a $2\times2$ matrix, and its action on the state presented in the form of Eq. 9.62 involves (in addition to what it normally does to orbital vectors) the multiplication of a matrix and a spinor. In the most trivial case, when the Hamiltonian does not contain any spin operators and, therefore, does not act on spin states, this matrix becomes

$$\hat{H}_{m_s,m_s'}(\mathbf{r}) \equiv \langle m_s|\,\hat{H}(\mathbf{r})\,|m_s'\rangle = \hat{H}(\mathbf{r})\,\langle m_s|m_s'\rangle = \hat{H}(\mathbf{r})\,\delta_{m_s,m_s'},$$

so that the Schrödinger equations for both wave function components $\Psi_\uparrow(\mathbf{r})$ and $\Psi_\downarrow(\mathbf{r})$ are identical. In this case the total state of the system is described by a vector of the form given by Eq. 9.65, in which the coefficients $a_1$ and $a_2$ of the spinor component can be chosen arbitrarily. Physically this means that in the absence of the


spin-related terms in the Hamiltonian, the spin state of the particle does not change
with time and is determined by the initial conditions.

Now let me consider a less trivial case, when the Hamiltonian includes a stand-alone spin operator, something like what we dealt with in Sect. 9.3:

$$\hat{H} = \hat{H}_{orb} + \frac{2\mu_B B}{\hbar}\hat{S}_z. \tag{9.66}$$

Here $\hat{H}_{orb}$ is the spin-independent portion of the Hamiltonian, and the second term, as you know, describes the interaction of the spin with a uniform magnetic field $B$ directed along the Z-axis. In the matrix form, this Hamiltonian becomes

$$\hat{H}_{m_s,m_s'} = \hat{H}_{orb}\,\delta_{m_s,m_s'} + \mu_B B\,(\hat{\sigma}_z)_{m_s,m_s'}, \tag{9.67}$$

where I used the representation of the spin operators in terms of the corresponding
Pauli matrices introduced in Eqs. 9.15–9.17. The explicit matrix form of the
stationary Schrödinger equation becomes

$$\begin{pmatrix} \hat{H}_{orb} & 0 \\ 0 & \hat{H}_{orb} \end{pmatrix}\begin{pmatrix} \Psi_\uparrow(\mathbf{r}) \\ \Psi_\downarrow(\mathbf{r}) \end{pmatrix} + \mu_B B\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\begin{pmatrix} \Psi_\uparrow(\mathbf{r}) \\ \Psi_\downarrow(\mathbf{r}) \end{pmatrix} = E\begin{pmatrix} \Psi_\uparrow(\mathbf{r}) \\ \Psi_\downarrow(\mathbf{r}) \end{pmatrix}$$

and translates into two independent equations:

$$\hat{H}_{orb}\Psi_\uparrow(\mathbf{r}) + \mu_B B\,\Psi_\uparrow(\mathbf{r}) = E\,\Psi_\uparrow(\mathbf{r}) \tag{9.68}$$

$$\hat{H}_{orb}\Psi_\downarrow(\mathbf{r}) - \mu_B B\,\Psi_\downarrow(\mathbf{r}) = E\,\Psi_\downarrow(\mathbf{r}). \tag{9.69}$$

This independence signifies the absence of any spin–orbit coupling in this system: the functions $\Psi_\uparrow(\mathbf{r})$ and $\Psi_\downarrow(\mathbf{r})$ can be chosen in the form of Eq. 9.64, where $\psi(\mathbf{r})$ is a solution of the orbital equation $\hat{H}_{orb}\,\psi(\mathbf{r}) = E_{orb}\,\psi(\mathbf{r})$. With this, Eqs. 9.68 and 9.69 can be converted into the equations

$$a_1\left(E - E_{orb} - \mu_B B\right) = 0$$

$$a_2\left(E - E_{orb} + \mu_B B\right) = 0,$$

yielding two eigenvalues, $E^{(1)} = E_{orb} + \mu_B B$ and $E^{(2)} = E_{orb} - \mu_B B$, with two respective eigenvectors, $a_1^{(1)} = 1,\ a_2^{(1)} = 0$ and $a_1^{(2)} = 0,\ a_2^{(2)} = 1$. Choosing the zero level of energy at $E_{orb}$ and disregarding the orbital part of the resulting spinors

$$\left|\chi^{(1)}\right\rangle = \psi(\mathbf{r})\begin{pmatrix} 1 \\ 0 \end{pmatrix},\qquad \left|\chi^{(2)}\right\rangle = \psi(\mathbf{r})\begin{pmatrix} 0 \\ 1 \end{pmatrix},$$


which does not affect any of the phenomena associated with the action of the magnetic field on the electron's spin, you end up with the eigenvalues

$$E^{(1,2)} = \pm\mu_B B$$

and eigenvectors

$$\left|\chi^{(1)}\right\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix};\qquad \left|\chi^{(2)}\right\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$

identical to those found for a single spin in the magnetic field in Sect. 9.3.
This example demonstrates that the “pure” spin approach, which ignores orbital
components of the total state of a particle, is justified as long as the presence of the
spin does not change its orbital state, i.e., in the absence of the spin–orbit interaction.
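As a quick numerical illustration, the two Zeeman levels can be recovered by diagonalizing the $2\times2$ matrix of Eq. 9.67 with $\hat{H}_{orb}$ replaced by its eigenvalue $E_{orb}$; the numbers below are arbitrary illustrative values, not taken from the text:

```python
from math import sqrt

def eig2(a, b, c, d):
    """Eigenvalues of the 2x2 matrix [[a, b], [c, d]] via the
    characteristic polynomial E^2 - (a + d) E + (ad - bc) = 0."""
    tr, det = a + d, a * d - b * c
    disc = sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

E_orb, muB_B = 2.0, 0.5  # illustrative numbers (arbitrary units)
# Matrix of Eq. 9.67 with H_orb replaced by its eigenvalue E_orb
E_up, E_dn = eig2(E_orb + muB_B, 0.0, 0.0, E_orb - muB_B)
print(E_up, E_dn)  # E_orb + muB*B and E_orb - muB*B
```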

9.5.2 Total Angular Momentum: Eigenvalues and Eigenvectors

In Example 26 in the preceding section, you learned that working in the tensor
product of spin and orbital spaces, you can operate with expressions combining
orbital and spin operators such as $\hat{L}_z + \hat{S}_z$. The latter is the z-component of the vector operator

$$\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}, \tag{9.70}$$

called the operator of total angular momentum, which plays an important role in the general structure of quantum mechanics as well as in a variety of its applications. For instance, this operator is crucial for understanding the energy levels of the hydrogen atom in the presence of spin–orbit coupling and magnetic field; I will introduce you
to these topics in Chap. 14. Here my objective is to elucidate the general properties
of this operator, which appears to be a logical conclusion to the discussion started
in the previous section.

I begin by stating that the components of the vector $\hat{\mathbf{J}}$ obey the same commutation relations as those of its constituent vectors $\hat{\mathbf{L}}$ and $\hat{\mathbf{S}}$. This statement is easy to verify, taking into account that orbital and spin operators commute. For instance, you can check that

$$\hat{J}_x\hat{J}_y - \hat{J}_y\hat{J}_x = \hat{L}_x\hat{L}_y - \hat{L}_y\hat{L}_x + \hat{S}_x\hat{S}_y - \hat{S}_y\hat{S}_x = i\hbar\hat{L}_z + i\hbar\hat{S}_z = i\hbar\hat{J}_z, \tag{9.71}$$

where I canceled terms like $\hat{L}_x\hat{S}_y - \hat{S}_y\hat{L}_x = 0$. Once the commutation relations for the components of $\hat{\mathbf{J}}$ are established, one can immediately claim that all components of $\hat{\mathbf{J}}$ commute with the operator $\hat{J}^2$, which can be written as

$$\hat{J}^2 = \hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}. \tag{9.72}$$


Indeed, the proof of the similar statement for the orbital angular momentum carried out in Sect. 3.3.2 was based exclusively on the inter-component commutation relations and is, therefore, automatically extended to all operators with the same commutation relations. If you go back to Sect. 3.3.4, you will recall that the derivation of the eigenvalues of the orbital angular momentum operators carried out there also relied exclusively on the commutation relations. Therefore, you can immediately claim, without fear of retribution or embarrassment, that the operators $\hat{J}^2$ and $\hat{J}_z$ possess a common system of eigenvectors, characterized by two numbers $j$ and $m_J$ satisfying the inequality $-j \le m_J \le j$ and taking either integer or half-integer values, and that these eigenvectors generate the eigenvalues of these operators according to

$$\hat{J}^2\,|j,m_J\rangle = \hbar^2 j(j+1)\,|j,m_J\rangle \tag{9.73}$$

$$\hat{J}_z\,|j,m_J\rangle = \hbar m_J\,|j,m_J\rangle. \tag{9.74}$$

However, it would be wrong for you to think that Eqs. 9.73 and 9.74 are the end of the story. While these equations do give you some information about the eigenvalues and eigenvectors of $\hat{J}^2$ and $\hat{J}_z$, this information is quite limited and does not allow you, for instance, to generate representations of these vectors in any basis except their own, or to evaluate the results of applying various combinations of orbital and spin angular momentum operators to these states. To be able to do all this, you need to answer rather tough questions such as (a) what is the relation between the numbers $j$, $m_J$ on the one hand and the numbers $l$, $s$, $m$, and $m_s$ on the other, and (b) how are the vectors $|j,m_J\rangle$ connected with the vectors $|l,m\rangle$ and $|m_s\rangle$? Finding answers to these questions requires substantial additional effort, so Eqs. 9.73 and 9.74 are not the end but just the beginning of the journey.
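The commutation relations claimed above are easy to check numerically. The following pure-Python sketch (with $\hbar = 1$) builds $\hat{J}_i = \hat{L}_i\otimes\mathbb{1}_2 + \mathbb{1}_3\otimes\hat{S}_i$ for $l = 1$ and spin 1/2 and verifies both $[\hat{J}_x,\hat{J}_y] = i\hat{J}_z$ and $[\hat{J}^2,\hat{J}_z] = 0$; the helper functions are illustrative scaffolding, not part of the text:

```python
from math import sqrt

def matmul(A, B):
    # naive complex matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def kron(A, B):
    # Kronecker (tensor) product of two matrices
    return [[A[i][j] * B[k][m] for j in range(len(A[0])) for m in range(len(B[0]))]
            for i in range(len(A)) for k in range(len(B))]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scal(c, A):
    return [[c * x for x in row] for row in A]

s2 = sqrt(2)
# l = 1 orbital operators in the basis m = 1, 0, -1 (hbar = 1)
Lp = [[0, s2, 0], [0, 0, s2], [0, 0, 0]]   # L_plus
Lm = [[0, 0, 0], [s2, 0, 0], [0, s2, 0]]   # L_minus
Lz = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]
Lx = scal(0.5, add(Lp, Lm))
Ly = scal(-0.5j, sub(Lp, Lm))

# spin-1/2 operators: half the Pauli matrices
Sx = [[0, 0.5], [0.5, 0]]
Sy = [[0, -0.5j], [0.5j, 0]]
Sz = [[0.5, 0], [0, -0.5]]

I2 = [[1, 0], [0, 1]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# J_i = L_i (x) 1 + 1 (x) S_i in the 6-dimensional tensor product space
Jx = add(kron(Lx, I2), kron(I3, Sx))
Jy = add(kron(Ly, I2), kron(I3, Sy))
Jz = add(kron(Lz, I2), kron(I3, Sz))

comm_xy = sub(matmul(Jx, Jy), matmul(Jy, Jx))                  # should equal i*Jz
J2 = add(add(matmul(Jx, Jx), matmul(Jy, Jy)), matmul(Jz, Jz))
comm_J2z = sub(matmul(J2, Jz), matmul(Jz, J2))                 # should vanish

err1 = max(abs(comm_xy[i][j] - 1j * Jz[i][j]) for i in range(6) for j in range(6))
err2 = max(abs(comm_J2z[i][j]) for i in range(6) for j in range(6))
print(err1, err2)  # both are zero up to rounding
```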

And as a first step, I would note an additional property of the operators $\hat{J}^2$ and $\hat{J}_z$, which they possess by virtue of being sums of orbital and spin operators: they both commute with the operators $\hat{L}^2$ and $\hat{S}^2$. The proof of this statement is quite straightforward and is based on Eq. 9.72 as well as on the fact that both $\hat{L}^2$ and $\hat{S}^2$ commute with all their components (well, $\hat{S}^2$ for spin 1/2 is proportional to the unity matrix and, therefore, commutes with everything). This means that the operators $\hat{J}^2$, $\hat{J}_z$, $\hat{L}^2$, and $\hat{S}^2$ have a common set of eigenvectors, so that the numbers $j$ and $m_J$ do not provide a full description of these vectors. To have these vectors fully characterized, one needs to throw the number $l$ into the mix, replacing $|j,m_J\rangle$ with $|j,l,m_J\rangle$ and adding the equation

$$\hat{L}^2\,|j,l,m_J\rangle = \hbar^2 l(l+1)\,|j,l,m_J\rangle \tag{9.75}$$

to Eqs. 9.73 and 9.74. Strictly speaking, I would need to include the spin number $s$ here as well, but since I am going to limit this discussion to spin-1/2 particles only, this number never changes, so its inclusion would just superfluously clutter the notation.


A relation between the vectors $|j,l,m_J\rangle$ and the individual eigenvectors of the orbital and spin operators can be established by using the latter as a basis in the combined spin–orbital space defined in Sect. 9.5.1 as a tensor product of the orbital and spinor spaces. Specializing the generic Eq. 9.58 to the particular case when the basis in the orbital space is presented by the vectors $|l,m\rangle$, I can write for an arbitrary member $|\xi\rangle$ of the tensor product space:

$$|\xi\rangle = \sum_{l',m,m_s}C^{l'}_{m,m_s}\,|l',m\rangle|m_s\rangle. \tag{9.76}$$

However, when applying this expansion to the particular case of the vectors $|j,l,m_J\rangle$, I need to take into account that these vectors are eigenvectors of $\hat{L}^2$, i.e., that they must obey Eq. 9.75:

$$\hat{L}^2\sum_{l',m,m_s}C^{l'}_{m,m_s}\,|l',m\rangle|m_s\rangle = \sum_{l',m,m_s}C^{l'}_{m,m_s}\,\hat{L}^2\,|l',m\rangle|m_s\rangle = \hbar^2\sum_{l',m,m_s}C^{l'}_{m,m_s}\,l'(l'+1)\,|l',m\rangle|m_s\rangle = \hbar^2 l(l+1)\sum_{l',m,m_s}C^{l'}_{m,m_s}\,|l',m\rangle|m_s\rangle.$$

Because of the orthogonality of the basis vectors $|l,m\rangle|m_s\rangle$, the only way to satisfy the equality in the last line is to make sure that $l' = l$ is the only value contributing to the sum. This is achieved by setting $C^{l'}_{m,m_s} = C^{l}_{m,m_s}\,\delta_{l,l'}$, thereby vanquishing the summation over $l'$. In a less formal way, you can argue that for the vector defined by Eq. 9.76 to be an eigenvector of $\hat{L}^2$, it cannot be a combination of vectors with different values of $l$. Thus, I can conclude that the representation of $|j,l,m_J\rangle$ in the basis $|l,m\rangle|m_s\rangle$ must have the following form:

$$|j,l,m_J\rangle = \sum_{m,m_s}C^{l,j}_{m,m_s,m_J}\,|l,m\rangle|m_s\rangle, \tag{9.77}$$

where I also added the upper index $j$ and the lower index $m_J$ to the expansion coefficients to make it explicit that the expansion is for eigenvectors of the operators $\hat{J}^2$ and $\hat{J}_z$ characterized by the quantum numbers $j$ and $m_J$.

The task now is to find the coefficients $C^{l,j}_{m,m_s,m_J}$, which are a particular case of so-called Clebsch–Gordan coefficients.³ To this end, I will first apply the operator $\hat{J}_z$ to the left-hand side of Eq. 9.77 and the operator $\hat{L}_z + \hat{S}_z$ to its right-hand side. Using Eq. 9.74 on the left-hand side and the similar properties of the orbital and spin angular momentum operators on the right-hand side, I obtain

³Clebsch–Gordan coefficients allow one to present the eigenvectors of an operator $\hat{\mathbf{J}}_1 + \hat{\mathbf{J}}_2$ in terms of the eigenvectors of generic angular momentum operators $\hat{\mathbf{J}}_1$ and $\hat{\mathbf{J}}_2$.


$$m_J\,|j,l,m_J\rangle = \sum_{m,m_s}C^{l,j}_{m,m_s,m_J}\,(m+m_s)\,|l,m\rangle|m_s\rangle \;\Rightarrow$$

$$m_J\sum_{m,m_s}C^{l,j}_{m,m_s,m_J}\,|l,m\rangle|m_s\rangle = \sum_{m,m_s}C^{l,j}_{m,m_s,m_J}\,(m+m_s)\,|l,m\rangle|m_s\rangle \;\Rightarrow$$

$$\sum_{m,m_s}C^{l,j}_{m,m_s,m_J}\,(m_J-m-m_s)\,|l,m\rangle|m_s\rangle = 0.$$

For the equation in the last line to be true, one of two things must happen in each term: either $m_J = m + m_s$ or $C^{l,j}_{m,m_s,m_J} = 0$. This means that the Clebsch–Gordan coefficients vanish unless $m = m_J - m_s$, so that they can be presented as

$$C^{l,j}_{m,m_s,m_J} = C^{l,j}_{m_s,m_J}\,\delta_{m,m_J-m_s}.$$

Substituting this result into Eq. 9.77, I can eliminate the summation over $m$ and obtain a simplified form of this expansion:

$$|j,l,m_J\rangle = \sum_{m_s}C^{l,j}_{m_s,m_J}\,|l,m_J-m_s\rangle|m_s\rangle = C^{l,j}_{1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + C^{l,j}_{-1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle, \tag{9.78}$$

where the last line explicitly accounts for the fact that the spin number $m_s$ takes only two values, $1/2$ and $-1/2$. Equation 9.78 contains all the information about the Clebsch–Gordan coefficients that I could extract from the operator $\hat{J}_z$ (which is not that much), but hopefully I can learn more from the operator $\hat{J}^2$.

The idea is the same: apply $\hat{J}^2$ to the left-hand side of Eq. 9.78 and its reincarnation in the form $\hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}$ to this equation's right-hand side, and find the conditions under which the two sides agree. The first step is just a recapitulation of Eq. 9.73:

$$\hat{J}^2\,|j,l,m_J\rangle = \hbar^2 j(j+1)\,|j,l,m_J\rangle = \hbar^2 j(j+1)\left(C^{l,j}_{1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + C^{l,j}_{-1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle\right), \tag{9.79}$$

but the second one results in rather long expressions, which couldn't even fit on a single page. Therefore, I will deal with the different terms in $\hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}$ separately. First I will do $\hat{L}^2 + \hat{S}^2$, which is the easiest to handle:


$$\left(\hat{L}^2 + \hat{S}^2\right)\left(C^{l,j}_{1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + C^{l,j}_{-1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle\right) = \hbar^2 l(l+1)\left(C^{l,j}_{1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + C^{l,j}_{-1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle\right) + \frac34\hbar^2\left(C^{l,j}_{1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + C^{l,j}_{-1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle\right) = \hbar^2\left[l(l+1)+\frac34\right]\left(C^{l,j}_{1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + C^{l,j}_{-1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle\right). \tag{9.80}$$

To evaluate the remaining $\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}$ term, I first give it a makeover using the ladder operators $\hat{L}_\pm$ and $\hat{S}_\pm$:

$$\hat{\mathbf{L}}\cdot\hat{\mathbf{S}} = \hat{L}_x\hat{S}_x + \hat{L}_y\hat{S}_y + \hat{L}_z\hat{S}_z = \hat{L}_z\hat{S}_z + \frac12\left(\hat{L}_+ + \hat{L}_-\right)\frac12\left(\hat{S}_+ + \hat{S}_-\right) + \frac{1}{2i}\left(\hat{L}_+ - \hat{L}_-\right)\frac{1}{2i}\left(\hat{S}_+ - \hat{S}_-\right) = \hat{L}_z\hat{S}_z + \frac12\left(\hat{L}_-\hat{S}_+ + \hat{L}_+\hat{S}_-\right), \tag{9.81}$$

where I used Eqs. 3.59 and 3.60 for the orbital and Eqs. 5.109 and 5.108 for the spin operators. Using the fact that $\left|l,m_J-\frac12\right\rangle|\uparrow\rangle$ and $\left|l,m_J+\frac12\right\rangle|\downarrow\rangle$ are eigenvectors of $\hat{L}_z$ and $\hat{S}_z$ with eigenvalues $\hbar\left(m_J-1/2\right)$, $\hbar/2$ and $\hbar\left(m_J+1/2\right)$, $-\hbar/2$, respectively, I get for the first term in the last line of Eq. 9.81:

$$\hat{L}_z\hat{S}_z\left(C^{l,j}_{1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + C^{l,j}_{-1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle\right) = \frac{\hbar^2}{2}\left(m_J-\frac12\right)C^{l,j}_{1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle - \frac{\hbar^2}{2}\left(m_J+\frac12\right)C^{l,j}_{-1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle. \tag{9.82}$$

To compute the contributions from $\hat{L}_-\hat{S}_+$ and $\hat{L}_+\hat{S}_-$, you need to recall that $\hat{S}_+|\uparrow\rangle = 0$, $\hat{S}_-|\downarrow\rangle = 0$, $\hat{S}_+|\downarrow\rangle = \hbar|\uparrow\rangle$, and $\hat{S}_-|\uparrow\rangle = \hbar|\downarrow\rangle$ (these formulas originally appeared in Sect. 5.2.3, Eqs. 5.104 and 5.106, but I am reproducing them here for your convenience). You will also need to go back to Eqs. 3.75 and 3.76 to figure out the part related to the operators $\hat{L}_\pm$. Having refreshed your memory of the ladder operators this way, you can get


$$\hat{L}_-\hat{S}_+\left(C^{l,j}_{1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + C^{l,j}_{-1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle\right) = \hbar^2\sqrt{l(l+1)-\left(m_J+\frac12\right)\left(m_J-\frac12\right)}\;C^{l,j}_{-1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle \tag{9.83}$$

and

$$\hat{L}_+\hat{S}_-\left(C^{l,j}_{1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + C^{l,j}_{-1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle\right) = \hbar^2\sqrt{l(l+1)-\left(m_J+\frac12\right)\left(m_J-\frac12\right)}\;C^{l,j}_{1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle. \tag{9.84}$$

Finally, you just need to bring together Eqs. 9.80–9.84 and apply some simple algebra (just group the like terms) to cross the goal line:

$$\left(\hat{L}^2 + \hat{S}^2 + 2\hat{\mathbf{L}}\cdot\hat{\mathbf{S}}\right)\left(C^{l,j}_{1/2,m_J}\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + C^{l,j}_{-1/2,m_J}\left|l,m_J+\frac12\right\rangle|\downarrow\rangle\right) = \hbar^2\left[C^{l,j}_{1/2,m_J}\left(l(l+1)+m_J+\frac14\right) + \sqrt{l(l+1)-\left(m_J+\frac12\right)\left(m_J-\frac12\right)}\;C^{l,j}_{-1/2,m_J}\right]\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + \hbar^2\left[C^{l,j}_{-1/2,m_J}\left(l(l+1)-m_J+\frac14\right) + \sqrt{l(l+1)-\left(m_J+\frac12\right)\left(m_J-\frac12\right)}\;C^{l,j}_{1/2,m_J}\right]\left|l,m_J+\frac12\right\rangle|\downarrow\rangle.$$

Comparing this against Eq. 9.79 and equating the coefficients in front of each of the basis vectors, you will end up with the following system of equations for the coefficients $C^{l,j}_{1/2,m_J}$ and $C^{l,j}_{-1/2,m_J}$:

$$\left[l(l+1) - j(j+1) + m_J + \frac14\right]C^{l,j}_{1/2,m_J} + \sqrt{l(l+1)-\left(m_J+\frac12\right)\left(m_J-\frac12\right)}\;C^{l,j}_{-1/2,m_J} = 0 \tag{9.85}$$

$$\sqrt{l(l+1)-\left(m_J+\frac12\right)\left(m_J-\frac12\right)}\;C^{l,j}_{1/2,m_J} + \left[l(l+1) - j(j+1) + \frac14 - m_J\right]C^{l,j}_{-1/2,m_J} = 0. \tag{9.86}$$

And once again you are looking for non-zero solutions of a homogeneous system of linear equations, and once again you need to find the zeroes of the determinant formed by its coefficients:

$$\begin{vmatrix} l(l+1)-j(j+1)+m_J+\frac14 & \sqrt{l(l+1)-\left(m_J+\frac12\right)\left(m_J-\frac12\right)} \\[2mm] \sqrt{l(l+1)-\left(m_J+\frac12\right)\left(m_J-\frac12\right)} & l(l+1)-j(j+1)+\frac14-m_J \end{vmatrix} = 0.$$

Evaluation of the determinant yields

$$\left[l(l+1)-j(j+1)+\frac14+m_J\right]\left[l(l+1)-j(j+1)+\frac14-m_J\right] - \left[l(l+1)-\left(m_J+\frac12\right)\left(m_J-\frac12\right)\right] = \left[l(l+1)-j(j+1)+\frac14\right]^2 - l(l+1) - \frac14 = \left[\left(l+\frac12\right)^2 - j(j+1)\right]^2 - \left(l+\frac12\right)^2,$$

where I used the easily verified identity

$$l(l+1)+\frac14 = \left(l+\frac12\right)^2. \tag{9.87}$$

Now it is quite easy to find that the equation

$$\left[\left(l+\frac12\right)^2 - j(j+1)\right]^2 - \left(l+\frac12\right)^2 = 0$$

is satisfied for

$$j(j+1) = \left(l+\frac12\right)\left(l+\frac32\right)\qquad\text{or}\qquad j(j+1) = \left(l+\frac12\right)\left(l-\frac12\right).$$


The only physically meaningful solutions of these equations are

$$j_1 = l + \frac12 \tag{9.88}$$

and

$$j_2 = l - \frac12. \tag{9.89}$$

(The two other solutions, $-l-3/2$ and $-l-1/2$, are negative and must be discarded.) The obtained result means that for any value of the orbital quantum number $l$, the operator $\hat{J}^2$ has two possible eigenvalues, $\hbar^2 j_1(j_1+1)$ and $\hbar^2 j_2(j_2+1)$, with $j_1$ and $j_2$ defined above. For each value of $j$, there are $2j+1$ values of $m_J$: $m_J = -j, -j+1, \ldots, j-1, j$, so that the total number of states $|j,l,m_J\rangle$ (for a given $l$) is $\left[2\left(l+\frac12\right)+1\right] + \left[2\left(l-\frac12\right)+1\right] = 2(2l+1)$, which is exactly the same as the number of states $|l,m\rangle|m_s\rangle$. One important conclusion from this arithmetic is that the orthogonal and linearly independent states $|j,l,m_J\rangle$ on the one hand and the equally orthogonal and independent states $|l,m\rangle|m_s\rangle$ on the other represent two alternative bases in the same vector space: vectors of the former basis describe states in which a measurement of the total angular momentum and its z-component would yield determinate results, while vectors of the latter basis correspond to states in which the orbital and spin momenta separately have definite values.
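Both roots and the state count are easy to sanity-check with a few lines of arithmetic; the sketch below substitutes $j_1 = l+1/2$ and $j_2 = l-1/2$ back into the determinant condition derived above and compares the dimensions of the two bases:

```python
def det_condition(l, j):
    # [(l + 1/2)^2 - j(j + 1)]^2 - (l + 1/2)^2 from the secular determinant
    a = (l + 0.5) ** 2 - j * (j + 1)
    return a * a - (l + 0.5) ** 2

for l in (1, 2, 3):
    j1, j2 = l + 0.5, l - 0.5
    assert abs(det_condition(l, j1)) < 1e-9   # j1 is a root
    assert abs(det_condition(l, j2)) < 1e-9   # j2 is a root
    # number of |j,l,mJ> states versus number of |l,m>|ms> states
    n_jbasis = int(2 * j1 + 1) + int(2 * j2 + 1)
    assert n_jbasis == 2 * (2 * l + 1)
print("roots and state count confirmed")
```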

Now I can go back to Eqs. 9.85 and 9.86 and find the Clebsch–Gordan coefficients that establish the connection between the vectors $|j,l,m_J\rangle$ and the vectors $|l,m\rangle|m_s\rangle$, signaling that the end of this journey is close. Substituting the found values of $j_1$ and $j_2$ into Eqs. 9.85 and 9.86, I find the two sets of coefficients:

$$C^{l,j_1}_{-1/2,m_J} = \frac{l+\frac12-m_J}{\sqrt{l(l+1)-m_J^2+\frac14}}\;C^{l,j_1}_{1/2,m_J} = \sqrt{\frac{l+\frac12-m_J}{l+\frac12+m_J}}\;C^{l,j_1}_{1/2,m_J} \tag{9.90}$$

$$C^{l,j_2}_{1/2,m_J} = -\frac{l+\frac12-m_J}{\sqrt{l(l+1)-m_J^2+\frac14}}\;C^{l,j_2}_{-1/2,m_J} = -\sqrt{\frac{l+\frac12-m_J}{l+\frac12+m_J}}\;C^{l,j_2}_{-1/2,m_J}, \tag{9.91}$$

where I again used Eq. 9.87. As usual, Eqs. 9.85 and 9.86 yield only the ratio of the coefficients; in order to find the coefficients themselves, the normalization requirement, complemented by the convention that the Clebsch–Gordan coefficients remain real, needs to be invoked. Substituting Eqs. 9.90 and 9.91 into the normalization condition

$$\left|C^{l,j}_{-1/2,m_J}\right|^2 + \left|C^{l,j}_{1/2,m_J}\right|^2 = 1,$$


I find after some trivial algebra

$$C^{l,j_1}_{1/2,m_J} = \sqrt{\frac{l+\frac12+m_J}{2l+1}};\qquad C^{l,j_1}_{-1/2,m_J} = \sqrt{\frac{l+\frac12-m_J}{2l+1}}$$

$$C^{l,j_2}_{1/2,m_J} = \sqrt{\frac{l+\frac12-m_J}{2l+1}};\qquad C^{l,j_2}_{-1/2,m_J} = -\sqrt{\frac{l+\frac12+m_J}{2l+1}}. \tag{9.92}$$
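A few lines of arithmetic confirm that each coefficient pair of Eq. 9.92 is normalized and that the $j_1$ and $j_2$ pairs are orthogonal to each other; the helper `cg` below is an illustrative transcription of Eq. 9.92, not the book's notation:

```python
from math import sqrt

def cg(l, mJ):
    """Clebsch-Gordan coefficients of Eq. 9.92 as two pairs
    (C_up, C_down) for j1 = l + 1/2 and j2 = l - 1/2 (real convention)."""
    up1 = sqrt((l + 0.5 + mJ) / (2 * l + 1))
    dn1 = sqrt((l + 0.5 - mJ) / (2 * l + 1))
    up2 = sqrt((l + 0.5 - mJ) / (2 * l + 1))
    dn2 = -sqrt((l + 0.5 + mJ) / (2 * l + 1))
    return (up1, dn1), (up2, dn2)

for l in (1, 2):
    mJ = 0.5
    while mJ <= l - 0.5:
        (a1, b1), (a2, b2) = cg(l, mJ)
        assert abs(a1 * a1 + b1 * b1 - 1) < 1e-12   # each state is normalized
        assert abs(a2 * a2 + b2 * b2 - 1) < 1e-12
        assert abs(a1 * a2 + b1 * b2) < 1e-12        # the two states are orthogonal
        mJ += 1
print("normalization and orthogonality hold")
```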

Now you just plug Eq. 9.92 into Eq. 9.78 to derive the final expressions for the two eigenvectors of the operator $\hat{J}^2$ characterized by quantum numbers $j_1$ and $j_2$ in terms of linear combinations of the orbital and spin angular momentum eigenvectors:

$$|l+1/2,l,m_J\rangle = \frac{1}{\sqrt{2l+1}}\left(\sqrt{l+m_J+\tfrac12}\,\left|l,m_J-\frac12\right\rangle|\uparrow\rangle + \sqrt{l-m_J+\tfrac12}\,\left|l,m_J+\frac12\right\rangle|\downarrow\rangle\right) \tag{9.93}$$

$$|l-1/2,l,m_J\rangle = \frac{1}{\sqrt{2l+1}}\left(\sqrt{l-m_J+\tfrac12}\,\left|l,m_J-\frac12\right\rangle|\uparrow\rangle - \sqrt{l+m_J+\tfrac12}\,\left|l,m_J+\frac12\right\rangle|\downarrow\rangle\right). \tag{9.94}$$

It is quite easy to verify that the vectors $|l+1/2,l,m_J\rangle$ and $|l-1/2,l,m_J\rangle$ are normalized and orthogonal, as they must be. One can interpret this result by saying that if an electron is prepared in a state with determinate values of the total angular momentum $\hbar^2 j(j+1)$, one of its components $\hbar m_J$, and the total orbital momentum $\hbar^2 l(l+1)$, the values of the corresponding components of its orbital momentum, $\hbar m$, and spin, $\hbar m_s$, remain uncertain. An attempt to measure them will produce the combination $m = m_J - 1/2$, $m_s = 1/2$ with probabilities

$$p_{m_J-1/2,\,1/2} = \begin{cases} \dfrac{l+m_J+1/2}{2l+1}, & j = l+1/2 \\[2mm] \dfrac{l-m_J+1/2}{2l+1}, & j = l-1/2 \end{cases} \tag{9.95}$$

or the combination $m = m_J + 1/2$, $m_s = -1/2$ with probabilities

$$p_{m_J+1/2,\,-1/2} = \begin{cases} \dfrac{l-m_J+1/2}{2l+1}, & j = l+1/2 \\[2mm] \dfrac{l+m_J+1/2}{2l+1}, & j = l-1/2. \end{cases} \tag{9.96}$$

To help you feel better about these results, let me illustrate the application of
Eqs. 9.95 and 9.96 by a few examples.


Example 27 (Measuring Spin and Orbital Angular Momenta in a State with a Definite Value of the Total Angular Momentum) Assume that an electron is in a state with a given orbital momentum $l$, total angular momentum $j = l - 1/2$, and its z-component $m_J = l - 3/2$, and that you have a magic instrument allowing you to measure the z-components of the electron's orbital momentum and its spin. What are the possible outcomes of such a measurement and their probabilities?

Solution

The value $m_J = l - 3/2$ can be obtained in two different ways: with $m_s = 1/2$ and $m = l-2$, or with $m_s = -1/2$ and $m = l-1$. The probability of the first outcome is (second line in Eq. 9.95)

$$p_{l-2,1/2} = \frac{l-(l-3/2)+1/2}{2l+1} = \frac{2}{2l+1},$$

and the probability of the second outcome is (second line in Eq. 9.96)

$$p_{l-1,-1/2} = \frac{l+(l-3/2)+1/2}{2l+1} = \frac{2l-1}{2l+1}.$$

Obviously the sum of the two probabilities is equal to one, and for large values of $l$, the second outcome is significantly more probable.
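The arithmetic of this example generalizes; the following sketch transcribes Eqs. 9.95 and 9.96 into two small functions (illustrative names, with half-integer quantum numbers passed as floats) and confirms the probabilities $2/(2l+1)$ and $(2l-1)/(2l+1)$ for several values of $l$:

```python
def p_up(l, j, mJ):
    # probability of finding m = mJ - 1/2, ms = +1/2 (Eq. 9.95)
    return (l + mJ + 0.5) / (2 * l + 1) if j == l + 0.5 else (l - mJ + 0.5) / (2 * l + 1)

def p_down(l, j, mJ):
    # probability of finding m = mJ + 1/2, ms = -1/2 (Eq. 9.96)
    return (l - mJ + 0.5) / (2 * l + 1) if j == l + 0.5 else (l + mJ + 0.5) / (2 * l + 1)

for l in (2, 3, 10):
    j, mJ = l - 0.5, l - 1.5          # the state of Example 27
    p1, p2 = p_up(l, j, mJ), p_down(l, j, mJ)
    assert abs(p1 - 2 / (2 * l + 1)) < 1e-12
    assert abs(p2 - (2 * l - 1) / (2 * l + 1)) < 1e-12
    assert abs(p1 + p2 - 1) < 1e-12   # the two outcomes exhaust all possibilities
print("Example 27 probabilities confirmed")
```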

Example 28 (More on the Measurement of Spin and Orbital Momenta) Let me modify the previous example by assuming that the value of the total angular momentum is not known, but it is known that the electron can be in either state of the total angular momentum with equal probability. How will the answer to the previous example change in this case?

Solution

Now you have to take into account that both possible outcomes discussed in the previous example can come either from the state with $j = l+1/2$ or from the state with $j = l-1/2$. Accordingly, the total probabilities of the outcomes become

$$p_{l-2,1/2} = \frac12\,\frac{l-(l-3/2)+1/2}{2l+1} + \frac12\,\frac{l+(l-3/2)+1/2}{2l+1} = \frac{l+1/2}{2l+1} = \frac12$$

$$p_{l-1,-1/2} = \frac12\,\frac{l+(l-3/2)+1/2}{2l+1} + \frac12\,\frac{l-(l-3/2)+1/2}{2l+1} = \frac12.$$

Even though, generally speaking, $m_J$ and the pair $m$, $m_s$ cannot all be known with certainty in the same state, there exist two states in which all three of these quantum numbers have definite values. These are the states with the largest, $m_J = l+1/2$, and the smallest, $m_J = -l-1/2$, values of $m_J$, for which one of the Clebsch–Gordan coefficients vanishes while the other one turns into unity, reducing Eq. 9.93 to

$$|l+1/2,l,l+1/2\rangle = |l,l\rangle|\uparrow\rangle;\qquad |l+1/2,l,-l-1/2\rangle = |l,-l\rangle|\downarrow\rangle.$$

You can easily understand this fact by noting that $m_J = l+1/2$ or $m_J = -l-1/2$ can be obtained from only a single combination of $m$ and $m_s$: $m_J = l+1/2$ corresponds to the choice $m = l$ and $m_s = 1/2$, while $m_J = -l-1/2$ can only be generated by $m = -l$, $m_s = -1/2$.
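You can also check this degeneration of Eq. 9.93 numerically; in the sketch below, the $j_1 = l+1/2$ coefficients of Eq. 9.92 are evaluated at the extreme values $m_J = \pm(l+1/2)$ (the helper function is an illustrative transcription, with half-integers as floats):

```python
from math import sqrt

def cg_j1(l, mJ):
    # coefficients of Eq. 9.93 for j = l + 1/2, from Eq. 9.92
    return (sqrt((l + 0.5 + mJ) / (2 * l + 1)),
            sqrt((l + 0.5 - mJ) / (2 * l + 1)))

for l in (1, 2, 5):
    up, dn = cg_j1(l, l + 0.5)       # largest mJ
    assert (up, dn) == (1.0, 0.0)    # |l+1/2,l,l+1/2> = |l,l>|up>
    up, dn = cg_j1(l, -l - 0.5)      # smallest mJ
    assert (up, dn) == (0.0, 1.0)    # |l+1/2,l,-l-1/2> = |l,-l>|down>
print("edge states are pure |l,+-l>|+-1/2> combinations")
```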

Equations 9.88 and 9.89 together with Eqs. 9.93 and 9.94 provide answers to all the questions posed at the beginning of this subsection: you now know the relation between the total, orbital, and spin angular momentum quantum numbers as well as between the corresponding eigenvectors. In particular, Eqs. 9.93 and 9.94 allow generating any representation of $|j,l,m_J\rangle$ using the corresponding representations of the vectors $|l,m\rangle$ and $|m_s\rangle$, as well as defining the action of any combination of orbital and spin operators on these vectors. To illustrate this point, I will write down the coordinate–spinor representation of $|l+1/2,l,m_J\rangle$ using Eq. 9.93 and the corresponding representations of $|l,m\rangle$ and $|m_s\rangle$:

$$|l+1/2,l,m_J\rangle = \frac{1}{\sqrt{2l+1}}\begin{bmatrix} \sqrt{l+m_J+1/2}\;Y_l^{m_J-1/2}(\theta,\varphi) \\[1mm] \sqrt{l-m_J+1/2}\;Y_l^{m_J+1/2}(\theta,\varphi) \end{bmatrix}.$$

I can now use this to compute, e.g., $\hat{L}_+\hat{S}_-\,|l+1/2,l,m_J\rangle$. Taking the matrix representation of $\hat{S}_-$ from Eq. 5.107 and recalling that any orbital operator in the spinor representation is multiplied by a unity matrix, I can rewrite this expression as

$$\hat{L}_+\hat{S}_-\,|l+1/2,l,m_J\rangle = \frac{\hbar}{\sqrt{2l+1}}\begin{bmatrix} \hat{L}_+ & 0 \\ 0 & \hat{L}_+ \end{bmatrix}\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} \sqrt{l+m_J+1/2}\;Y_l^{m_J-1/2}(\theta,\varphi) \\ \sqrt{l-m_J+1/2}\;Y_l^{m_J+1/2}(\theta,\varphi) \end{bmatrix} = \frac{\hbar\sqrt{l+m_J+1/2}}{\sqrt{2l+1}}\begin{bmatrix} \hat{L}_+ & 0 \\ 0 & \hat{L}_+ \end{bmatrix}\begin{bmatrix} 0 \\ Y_l^{m_J-1/2}(\theta,\varphi) \end{bmatrix} = \frac{\hbar\sqrt{l+m_J+1/2}}{\sqrt{2l+1}}\begin{bmatrix} 0 \\ \hat{L}_+ Y_l^{m_J-1/2}(\theta,\varphi) \end{bmatrix} = \frac{\hbar^2\sqrt{l+m_J+1/2}}{\sqrt{2l+1}}\sqrt{l(l+1)-\left(m_J-\frac12\right)\left(m_J+\frac12\right)}\begin{bmatrix} 0 \\ Y_l^{m_J+1/2}(\theta,\varphi) \end{bmatrix} = \hbar^2\left(l+m_J+\frac12\right)\sqrt{\frac{l-m_J+1/2}{2l+1}}\begin{bmatrix} 0 \\ Y_l^{m_J+1/2}(\theta,\varphi) \end{bmatrix},$$

where the second factor of $\hbar$ comes from $\hat{L}_+$ acting on the spherical harmonic.

9.6 Problems

Section 9.2

Problem 112 Write down a spinor corresponding to the point on the Bloch sphere with coordinates $\theta = \pi/4$, $\varphi = 3\pi/2$.


Problem 113 The impossibility of half-integer values of the angular momentum
for orbital angular momentum operators expressed in terms of coordinate and
momentum operators can be demonstrated by considering the following example.
Imagine that there exists a state of the orbital angular momentum with $l = 1/2$. Then in the coordinate representation, these states would be represented by two functions $f_{1/2}(\theta,\varphi)$ and $f_{-1/2}(\theta,\varphi)$ corresponding to the values of the magnetic quantum number $m = 1/2$ and $m = -1/2$, respectively. These functions must obey the following set of equations:

$$\hat{L}_+ f_{1/2}(\theta,\varphi) = 0;\qquad \hat{L}_- f_{-1/2}(\theta,\varphi) = 0$$

$$\hat{L}_+ f_{-1/2}(\theta,\varphi) = f_{1/2}(\theta,\varphi);\qquad \hat{L}_- f_{1/2}(\theta,\varphi) = f_{-1/2}(\theta,\varphi).$$

Using the coordinate representation of the ladder operators, show that these equations are mutually inconsistent.

Problem 114 An electron is in a spin state described by the (non-normalized) spinor:

$$|\chi\rangle = \begin{pmatrix} 2i-3 \\ 4 \end{pmatrix}.$$

1. Normalize this spinor.
2. If you measure the z-component of the spin, what are the probabilities of the various outcomes?
3. What is the expectation value of the z-component of the spin in this state?
4. Answer the same questions for the x- and y-components.

Problem 115

1. Consider a spin in the state $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$. You measure the component of the spin in the direction of the unit vector $\mathbf{n}$ characterized by the angles $\theta, \varphi$ of the spherical coordinate system. What is the probability of obtaining the value $-\hbar/2$ as an outcome of this measurement?

2. Imagine that you conduct two measurements in quick succession: first you carry out the measurement described in the previous part of the problem, and right after that, you measure the y-component of the spin. Find the probability of getting $\hbar/2$ as the outcome of the last measurement. (Hint: Do not forget to consider all possible paths that could lead to this outcome.)

Problem 116 Consider a particle with spin 1/2 in a state in which the component of the spin in a specified direction is equal to $\hbar/2$. Choose a coordinate system with the Z-axis along this direction and some arbitrary positions of the X- and Y-axes in the perpendicular plane. Now imagine that you measure the component of the spin in a direction making an angle of $30°$ with the Z-axis and lying in the XZ plane. Find the probabilities of the various outcomes of this measurement.

Section 9.3

Problem 117 Derive the expression for the expectation value of the y-component
of the spin in the state specified by Eq. 9.34.

Problem 118 Consider a spin in the initial state characterized by the angles $\theta = \pi/6$ and $\varphi = \pi/3$ of the Bloch sphere. At time $t = 0$, a magnetic field $\mathbf{B}$ directed along the polar axis of the spherical coordinate system is turned on and remains on for $t = \pi/(2\omega_L)$ seconds. After the field is turned off, an experimentalist measures the z-component of the spin. What is the probability that the measurement yields $\hbar/2$? $-\hbar/2$? Answer the same questions if it is the x-component of the spin that is being measured.

Problem 119 In the last problem of Chap. 5, you found the matrices $\hat{S}_x$, $\hat{S}_y$, and $\hat{S}_z$ for a particle with spin 3/2. Assume that the interaction of this particle with its surroundings is described by the Hamiltonian

$$\hat{H} = \frac{\varepsilon_0}{\hbar^2}\left(\hat{S}_x^2 - \hat{S}_y^2\right) - \frac{\varepsilon_1}{\hbar^2}\,\hat{S}_z^2.$$

1. Find the stationary states of this Hamiltonian.
2. Assuming that the initial state of the particle is given by the spinor

$$|\chi_0\rangle = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix},$$

find the spin state of the particle at time $t$.
3. Calculate the time-dependent expectation values of all three components of the spin operator.

Problem 120 Consider a spin-1/2 particle in a time-dependent magnetic field, which rotates with angular velocity $\Omega$ in the X–Y plane:

$$\mathbf{B} = \mathbf{i}B_0\cos\Omega t + \mathbf{j}B_0\sin\Omega t,$$

where $\mathbf{i}$ and $\mathbf{j}$ are unit vectors in the directions of the X and Y coordinate axes, respectively. Derive the Heisenberg equations for the spin operators and solve them. Note that since the Hamiltonian of this system is time-dependent, you cannot claim the same form for the Hamiltonian in the Schrödinger and Heisenberg pictures based upon the notion that the time-evolution operator $\hat{U}$ commutes with the Hamiltonian (it does not, because $\hat{U}$ does not have the form $\exp\left(-i\hat{H}t/\hbar\right)$, which is only valid for time-independent Hamiltonians). Nevertheless, since the time-dependent factor in the Hamiltonian does not contain operators, you can still show that the Hamiltonian, which in the Schrödinger picture has the form

$$\hat{H} = \frac{2\mu_B}{\hbar}\hat{\mathbf{S}}\cdot\mathbf{B},$$

has exactly the same form in the Heisenberg picture if the Schrödinger spin operator is replaced with its time-dependent Heisenberg counterpart.

1. Convince yourself that this is, indeed, the case.
2. Derive the Heisenberg equations for all three components of the spin operator.
3. Solve these equations and find the time dependence of the spin operators. (Hint: You might want to introduce new time-dependent operators defined as

$$\hat{P} = \hat{S}_x\cos\Omega t + \hat{S}_y\sin\Omega t, \qquad \hat{Q} = \hat{S}_y\cos\Omega t - \hat{S}_x\sin\Omega t,$$

and derive equations for them.)
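Whatever analytic solution you obtain here can be sanity-checked by integrating the Schrödinger equation for this Hamiltonian directly. The sketch below is my own check, not part of the problem: it sets $\hbar = 1$, uses $\omega_0$ as a shorthand for $2\mu_B B_0/\hbar$, propagates a spin-up state with a fourth-order Runge–Kutta step, and compares $\langle S_z\rangle$ with the closed-form result one obtains in the frame rotating with the field (an effective static field $\omega_0 S_x - \Omega S_z$, precession frequency $W = \sqrt{\omega_0^2 + \Omega^2}$):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2

omega0, Omega = 1.0, 0.7    # hypothetical values: omega0 = 2 mu_B B0 / hbar

def H(t):
    """H(t) = omega0 * (Sx cos(Omega t) + Sy sin(Omega t))."""
    return omega0 * (np.cos(Omega * t) * sx + np.sin(Omega * t) * sy)

def f(t, p):
    return -1j * (H(t) @ p)   # Schrodinger equation: i dpsi/dt = H(t) psi

psi = np.array([1.0, 0.0], dtype=complex)   # spin up along z at t = 0
t, tf, n = 0.0, 5.0, 5000
h = tf / n
for _ in range(n):                          # classic fourth-order Runge-Kutta
    k1 = f(t, psi)
    k2 = f(t + h / 2, psi + h / 2 * k1)
    k3 = f(t + h / 2, psi + h / 2 * k2)
    k4 = f(t + h, psi + h * k3)
    psi = psi + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    t += h

# Rotating-frame prediction: <Sz>(t) = [Omega^2 + omega0^2 cos(W t)] / (2 W^2)
W = np.sqrt(omega0 ** 2 + Omega ** 2)
expected = 0.5 * (Omega ** 2 + omega0 ** 2 * np.cos(W * tf)) / W ** 2
print(abs(np.vdot(psi, sz @ psi).real - expected) < 1e-6)   # True
```

Note that $\langle S_z\rangle$ is the same in the lab and rotating frames, since $S_z$ commutes with the rotation about $z$.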

Section 9.4

Problem 121 Normalize the following vector belonging to the tensor product of two spaces:

$$|\psi\rangle = 2i\left|e_1^{(1)}\right\rangle \otimes \left(\left|e_1^{(2)}\right\rangle - 3i\left|e_2^{(2)}\right\rangle\right) + \left(2\left|e_1^{(1)}\right\rangle - 3\left|e_2^{(1)}\right\rangle\right) \otimes \left|e_2^{(2)}\right\rangle,$$

assuming that the vectors $\left|e_{1,2}^{(1)}\right\rangle$ and $\left|e_{1,2}^{(2)}\right\rangle$ are normalized and mutually orthogonal.
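A quick numerical cross-check of this kind of computation (a sketch assuming numpy, with the basis vectors represented as columns $e_1 = (1,0)^T$, $e_2 = (0,1)^T$ and the tensor product realized as a Kronecker product):

```python
import numpy as np

e1 = np.array([1.0, 0.0])   # |e_1> in either space
e2 = np.array([0.0, 1.0])   # |e_2>

# |psi> = 2i |e1>(|e1> - 3i|e2>) + (2|e1> - 3|e2>)|e2>
psi = 2j * np.kron(e1, e1 - 3j * e2) + np.kron(2 * e1 - 3 * e2, e2)
norm_sq = np.vdot(psi, psi).real
print(norm_sq)                                  # 77.0
psi_n = psi / np.sqrt(norm_sq)                  # the normalized vector
print(abs(np.vdot(psi_n, psi_n) - 1) < 1e-12)   # True
```

Expanding the vector by hand gives components $(2i,\, 8,\, 0,\, -3)$ in the product basis, consistent with the squared norm $4 + 64 + 9 = 77$.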

Problem 122 Compute the commutators $\left[\hat{S}_i^{(tp)}, \hat{S}_j^{(tp)}\right]$ for all $i \neq j$ and $\left[\hat{S}_i^{(tp)}, \left(\hat{\mathbf{S}}^{(tp)}\right)^2\right]$, where $i, j$ take values $x$, $y$, and $z$.

Problem 123 Assuming that the vectors $\left|e_{1,2}^{(1)}\right\rangle$ and $\left|e_{1,2}^{(2)}\right\rangle$ in Problem 121 correspond to spin-up and spin-down states of two particles as defined by operators $\hat{S}_z^{(1,2)}$ correspondingly, compute

$$\langle\psi|\,\hat{\mathbf{S}}^{(1)}\cdot\hat{\mathbf{S}}^{(2)}\,|\psi\rangle,$$

where the vector $|\psi\rangle$ is also defined in Problem 121.


Problem 124 Derive Eqs. 9.46 through 9.49.

Problem 125 Consider a system of two interacting spins described by the Hamiltonian

$$\hat{H} = \frac{2\mu_B}{\hbar}\hat{\mathbf{S}}^{(1)}\cdot\mathbf{B} + \frac{2\mu_B}{\hbar}\hat{\mathbf{S}}^{(2)}\cdot\mathbf{B} + J\,\hat{\mathbf{S}}^{(1)}\cdot\hat{\mathbf{S}}^{(2)}.$$

Find the eigenvalues and eigenvectors of this Hamiltonian. Do it in two different ways: first, use eigenvectors of the individual $\hat{S}_z^{(1,2)}$ operators as a basis, and second, use eigenvectors of the operators of the total spin. Find the ground state of the system for different relations between the magnetic field and the parameter $J$. Consider the cases $J > 0$ and $J < 0$.

For Sect. 9.5.1

Problem 126 Using the approach presented in Sect. 9.4, consider addition of the operators of the orbital angular momentum and spin, limiting your consideration to the orbital states with $l = 1$.

1. Construct the matrix of the operator $\hat{J}^2$, where $\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}$, in the basis of eigenvectors of operators $\hat{L}^2$, $\hat{L}_z$, and $\hat{S}_z$, taking into account only those eigenvectors which belong to the orbital quantum number $l = 1$. (Hint: Your basis will consist of 6 vectors, so you are looking for a $6\times 6$ matrix.)
2. Diagonalize the matrix and confirm that eigenvectors of $\hat{J}^2$ are characterized by quantum numbers $j = 1/2$ and $j = 3/2$.
3. Find the eigenvectors of $\hat{J}^2$ in this basis.
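For orientation, the diagonalization requested here can be sketched numerically ($\hbar = 1$, numpy assumed; the `angmom` helper below is my own, not from the book). The six eigenvalues of $\hat{J}^2$ come out as $j(j+1)$ with $j = 1/2$ (twice) and $j = 3/2$ (four times), exactly as part 2 asks you to confirm:

```python
import numpy as np

def angmom(j):
    """(Jx, Jy, Jz) matrices for angular momentum j, hbar = 1."""
    m = np.arange(j, -j - 1, -1)
    jp = np.zeros((len(m), len(m)))
    for k in range(1, len(m)):
        jp[k - 1, k] = np.sqrt(j * (j + 1) - m[k] * (m[k] + 1))
    return 0.5 * (jp + jp.T), -0.5j * (jp - jp.T), np.diag(m)

L, S = angmom(1.0), angmom(0.5)
I3, I2 = np.eye(3), np.eye(2)

# J_i = L_i (x) 1 + 1 (x) S_i on the six-dimensional product space
J = [np.kron(L[i], I2) + np.kron(I3, S[i]) for i in range(3)]
J2 = sum(Ji @ Ji for Ji in J)

evals = np.sort(np.linalg.eigvalsh(J2))
print(np.round(evals, 6))   # j(j+1): 0.75 twice (j = 1/2), 3.75 four times (j = 3/2)
```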
Problem 127

1. Write down an expression for a spinor describing the equal superposition of states in which an electron in the ground state of an infinite one-dimensional potential well is also in a spin-up state, while an electron in the first excited state of this potential is also in the spin-down state. The potential confines the electron's motion in the $x$ direction, while spin-up and spin-down states correspond to the $z$-component of the spin.
2. Imagine that you have measured a component of the spin in the $x$ direction and obtained the value $\hbar/2$. Find the probability distribution of the electron's coordinate right after this measurement.

Problem 128 A one-dimensional harmonic oscillator is placed in the state

$$|\alpha\rangle = \frac{1}{\sqrt{2}}\left[|0\rangle|{\uparrow}\rangle + |1\rangle|{\downarrow}\rangle\right],$$

where spin-up and spin-down states are defined with respect to the $z$-component of the spin operator, and the kets $|0\rangle$ and $|1\rangle$ correspond to the ground state and the first excited state of the harmonic oscillator. At time $t = 0$ an experimentalist turns on a uniform magnetic field in the $z$ direction. Find the state of the system at a later time $t$, and compute the expectation values of the oscillator's coordinate and momentum. (Hint: You can use Eqs. 9.68 and 9.69 with the orbital part of the Hamiltonian taken to be that of a harmonic oscillator.)

For Sect. 9.5.2

Problem 129 Compute the expectation values of all components of the operator $\hat{\mathbf{J}} = \hat{\mathbf{L}} + \hat{\mathbf{S}}$, as well as of the operator $\hat{J}^2$, in the state

$$|\chi\rangle = \frac{1}{\sqrt{14}}\left[Y_l^{l-2}(\theta,\varphi)\left|\tfrac{1}{2}\right\rangle - 2Y_l^{l}(\theta,\varphi)\left|{-\tfrac{1}{2}}\right\rangle + 3iY_l^{2}(\theta,\varphi)\left|\tfrac{1}{2}\right\rangle\right].$$

Problem 130 Derive Eq. 9.92.

Problem 131 Consider an electron in a state with $l = 2$, $j = 3/2$, and $m_J = 0$. If one measures the $z$-components of the electron's orbital momentum and spin, what are the possible values and their probabilities?

Problem 132 Let me reverse the previous problem: assume that the electron is in the state with $l = 2$, $m = 1$, and $m_s = -1/2$. What are the possible values of $j$ and their probabilities?

Problem 133 Consider an electron in the following state (in the coordinate representation):

$$|\alpha\rangle = \frac{2}{\sqrt{10}}Y_1^1(\theta,\varphi)\left|\tfrac{1}{2}\right\rangle + \frac{1}{\sqrt{10}}Y_2^0(\theta,\varphi)\left|{-\tfrac{1}{2}}\right\rangle + \frac{1}{\sqrt{10}}Y_1^{-1}(\theta,\varphi)\left|\tfrac{1}{2}\right\rangle + \frac{2}{\sqrt{10}}Y_2^1(\theta,\varphi)\left|{-\tfrac{1}{2}}\right\rangle.$$

1. If one measures $\hat{J}^2$ and $\hat{J}_z$, what values can one expect to observe and what are their probabilities?
2. Present this vector as a linear combination of appropriate vectors $|j, l, m_J\rangle$.


Section 9.5.2

Problem 134 Compute the commutators $\left[\hat{J}_y, \hat{J}_z\right]$ and $\left[\hat{J}_x, \hat{J}_z\right]$, and demonstrate that they have the standard form for angular momentum operators.

Problem 135 Write down the position–spinor representation of the vector $|l - 1/2, l, m_J\rangle$, and compute $\hat{L}_-\hat{S}_+\,|l - 1/2, l, m_J\rangle$ using this representation.

Chapter 10
Two-Level System in a Periodic External Field

I have already mentioned somewhere in the beginning of this book that while vectors representing states of realistic physical systems generally belong to an infinite-dimensional vector space, we can always (well, almost always) justify limiting our consideration to a subspace of states with a reasonably small dimension. The smallest nontrivial subspace containing states that can be assumed to be isolated from the rest of the space is two-dimensional. One relatively clean example of such a subspace is formed by two-dimensional spinors in situations when one can neglect interactions between spins of different particles as well as the spin–orbit interaction. An approximately isolated two-dimensional subspace can also be found in systems described by Hamiltonians with a discrete spectrum, if this spectrum is strongly non-equidistant, i.e., the energy intervals between adjacent energy levels $\Delta_i = E_{i+1} - E_i$ are different for different pairs of levels. Two-level models are very popular in various areas of physics because, on one hand, they are remarkably simple, while on the other hand, they capture essential properties of many real physical systems ranging from atoms to semiconductors.

The most popular (and useful) version of this model involves an interaction of a two-level system with a periodic time-dependent external "potential." This can be an electric dipole potential describing the interaction of an atomic electron with an electric field, or a magnetic "potential" describing the interaction of an electron spin with a time-dependent magnetic field. Since I am not going to go into concrete details of a physical system which this model is supposed to represent, I will introduce it by assuming that its Hamiltonian is a sum of a time-independent "unperturbed" part $\hat{H}_0$ and a time-dependent "perturbation" $\hat{V}(t)$. I will also assume that $\hat{H}_0$ has only two linearly independent and orthogonal eigenvectors, which I will designate as $|1\rangle$ and $|2\rangle$, and two corresponding eigenvalues $E_1^{(0)}$ and $E_2^{(0)}$, which may be degenerate.

It is easy to see now that $\hat{H}_0$ can be written as

$$\hat{H}_0 = E_1^{(0)}|1\rangle\langle 1| + E_2^{(0)}|2\rangle\langle 2|. \tag{10.1}$$

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,
https://doi.org/10.1007/978-3-319-71550-6_10



Indeed, taking into account the orthogonality and normalization of $|1\rangle$ and $|2\rangle$, you can find

$$\hat{H}_0|1\rangle = E_1^{(0)}|1\rangle\langle 1|1\rangle + E_2^{(0)}|2\rangle\langle 2|1\rangle = E_1^{(0)}|1\rangle,$$

and

$$\hat{H}_0|2\rangle = E_1^{(0)}|1\rangle\langle 1|2\rangle + E_2^{(0)}|2\rangle\langle 2|2\rangle = E_2^{(0)}|2\rangle,$$

confirming that the Hamiltonian given by Eq. 10.1 does, indeed, have the properties prescribed to it. It is obvious that in the basis of these eigenvectors, $\hat{H}_0$ is represented by a diagonal matrix with the eigenvalues along the main diagonal. In the most general form, the interaction term can be written down as

$$\hat{V} = V_{11}|1\rangle\langle 1| + V_{22}|2\rangle\langle 2| + V_{12}|1\rangle\langle 2| + V_{21}|2\rangle\langle 1|.$$

The diagonal elements in this expression, $V_{ii}(t) = \langle i|\hat{V}|i\rangle$, often vanish thanks to the symmetry of the system. Indeed, if the initial Hamiltonian is symmetric with respect to inversion, its eigenvectors have definite parity: they are either odd or even. If, in addition, the interaction Hamiltonian is odd (which is quite common; for instance, the electric dipole interaction is proportional to $\hat{\mathbf{r}}\cdot\boldsymbol{\mathcal{E}}$, where $\boldsymbol{\mathcal{E}}$ is the electric field, and the position operator changes sign upon inversion), the diagonal elements of the interaction term must vanish (details of the argument can be found in Sect. 7.1). Also, the requirement that the operator must be Hermitian demands that $V_{21} = V_{12}^*$.

10.1 Two-Level System with a Time-Independent Interaction: Avoided Level Crossing

I begin by considering the properties of the two-level model with a time-independent interaction term, so that the complete Hamiltonian of the system becomes

$$\hat{H} = E_1^{(0)}|1\rangle\langle 1| + E_2^{(0)}|2\rangle\langle 2| + V_{12}|1\rangle\langle 2| + V_{12}^*|2\rangle\langle 1|, \tag{10.2}$$

where $V_{ij}$ are in general complex constant parameters. Since this is a time-independent Hamiltonian, it makes sense to explore its eigenvectors and eigenvalues using vectors $|1\rangle$ and $|2\rangle$ as a basis. The Hamiltonian in this representation becomes a $2\times 2$ matrix, so that the eigenvector equation can be written in the matrix form

$$\begin{bmatrix} E_1^{(0)} & V_{12} \\ V_{12}^* & E_2^{(0)} \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = E\begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \tag{10.3}$$


and the corresponding equation for the eigenvalues becomes

$$\begin{vmatrix} E_1^{(0)} - E & V_{12} \\ V_{12}^* & E_2^{(0)} - E \end{vmatrix} = 0.$$

Evaluation of the determinant turns it into a simple quadratic equation:

$$E^2 - E\left(E_1^{(0)} + E_2^{(0)}\right) + E_1^{(0)}E_2^{(0)} - |V_{12}|^2 = 0$$

with two solutions (I provided a lot of detailed derivations in this book, but I am not going to show how to solve quadratic equations!)

$$E_1 = \frac{1}{2}\left(E_1^{(0)} + E_2^{(0)}\right) + \frac{1}{2}\sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2} \tag{10.4}$$

$$E_2 = \frac{1}{2}\left(E_1^{(0)} + E_2^{(0)}\right) - \frac{1}{2}\sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2}. \tag{10.5}$$

Substituting the first of these solutions into

$$\left(E_1^{(0)} - E\right)a_1 + V_{12}a_2 = 0$$

(the first of the equations encoded in the matrix form in Eq. 10.3), I find the ratio of the coefficients representing the first eigenvector of the Hamiltonian:

$$\frac{a_1^{(1)}}{a_2^{(1)}} = -\frac{2V_{12}}{E_1^{(0)} - E_2^{(0)} - \sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2}}. \tag{10.6}$$

Repeating this calculation with the second eigenvalue, I find the ratio of the coefficients for the second eigenvector:

$$\frac{a_1^{(2)}}{a_2^{(2)}} = -\frac{2V_{12}}{E_1^{(0)} - E_2^{(0)} + \sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2}}. \tag{10.7}$$

The normalization coefficients for these eigenvectors are too cumbersome and not too informative, so I will leave the eigenvectors non-normalized. Both of them can be written as a superposition of vectors $|1\rangle$ and $|2\rangle$ with the coefficients $a_{1,2}^{(1,2)}$ defined by Eqs. 10.6 and 10.7:

$$|E_{1,2}\rangle = a_1^{(1,2)}|1\rangle + a_2^{(1,2)}|2\rangle, \tag{10.8}$$

where I used the eigenvalues to label the corresponding eigenvectors.
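Equations 10.4 through 10.7 are easy to verify numerically. A minimal sketch (numpy assumed; the parameter values are arbitrary samples, not from the text) checks the closed-form eigenvalues and the eigenvector ratio against a direct diagonalization:

```python
import numpy as np

# Sample (hypothetical) parameters: bare levels E1, E2 and a complex V12
E1, E2, V12 = 1.0, 0.4, 0.3 + 0.2j
H = np.array([[E1, V12], [np.conj(V12), E2]])

root = np.sqrt((E1 - E2) ** 2 + 4 * abs(V12) ** 2)
Ep = 0.5 * (E1 + E2) + 0.5 * root    # Eq. 10.4
Em = 0.5 * (E1 + E2) - 0.5 * root    # Eq. 10.5

evals = np.linalg.eigvalsh(H)         # returned in ascending order
print(np.allclose(evals, [Em, Ep]))   # True

# Eq. 10.6: ratio a1/a2 for the eigenvector belonging to the upper level
ratio = -2 * V12 / (E1 - E2 - root)
vec = np.array([ratio, 1.0]) / np.sqrt(abs(ratio) ** 2 + 1)
print(np.allclose(H @ vec, Ep * vec))  # True: it is indeed an eigenvector
```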
The ratios of the coefficients in this superposition determine the relative contributions of each of the original states to $|E_{1,2}\rangle$. These ratios depend on the relation between the inter-level spectral distance $\left|E_1^{(0)} - E_2^{(0)}\right|$ and the interaction matrix element $|V_{12}|$. If the former is much larger than the latter, I can expand the radical in the denominators of Eqs. 10.6 and 10.7 as

$$\sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2} \approx E_1^{(0)} - E_2^{(0)} + \frac{2|V_{12}|^2}{E_1^{(0)} - E_2^{(0)}},$$

where it is assumed for concreteness that $E_1^{(0)} > E_2^{(0)}$. Then Eqs. 10.6 and 10.7 yield

$$\frac{a_1^{(1)}}{a_2^{(1)}} \approx \frac{V_{12}\left(E_1^{(0)} - E_2^{(0)}\right)}{|V_{12}|^2} \gg 1$$

$$\frac{a_1^{(2)}}{a_2^{(2)}} \approx -\frac{V_{12}}{E_1^{(0)} - E_2^{(0)}} \ll 1.$$

Thus, the contributions of the state presented by vector $|2\rangle$ to the eigenvector $|E_1\rangle$ and of state $|1\rangle$ to the eigenvector $|E_2\rangle$ are very small. Not surprisingly, the energy $E_1$ in this limit is close to $E_1^{(0)}$, and $E_2$ is close to $E_2^{(0)}$ (check it out, please). These results justify the assumption lying at the foundation of the two-level model: contributions from energetically remote states can, indeed, be neglected. They also provide a quantitative condition for the validity of this approximation: $\left|E_n^{(0)} - E_m^{(0)}\right| \gg |V_{nm}|$, where $n, m$ are the labels for the energy levels and the corresponding states.

It is easy to verify that if I reversed the inequality $E_1^{(0)} > E_2^{(0)}$ and assumed instead that $E_1^{(0)} < E_2^{(0)}$, the roles of vectors $|1\rangle$ and $|2\rangle$ would be interchanged: the main contribution to state $|E_1\rangle$ would come from the initial vector $|2\rangle$, and state $|E_2\rangle$ would be mostly determined by $|1\rangle$. This flipping between the initial vectors is due to a trivial but often overlooked property of the square root, $\sqrt{x^2} = |x|$, which is $x$ when $x$ is positive and $-x$ when it is negative. In one of the exercises, you are asked to verify this flipping phenomenon.

In the opposite limit, $\left|E_1^{(0)} - E_2^{(0)}\right| \ll |V_{12}|$, the radical in Eqs. 10.6 and 10.7 can be approximated as

$$\sqrt{\left(E_1^{(0)} - E_2^{(0)}\right)^2 + 4|V_{12}|^2} \approx 2|V_{12}|, \tag{10.9}$$

which is valid with accuracy up to terms of the order of $\left(E_1^{(0)} - E_2^{(0)}\right)^2/|V_{12}|^2 \ll 1$.

The ratios of the coefficients in this case become

$$\frac{a_1^{(1)}}{a_2^{(1)}} = -\frac{2V_{12}}{E_1^{(0)} - E_2^{(0)} - 2|V_{12}|} \approx e^{i\delta_V}\left(1 + \frac{E_1^{(0)} - E_2^{(0)}}{2|V_{12}|}\right)$$

$$\frac{a_1^{(2)}}{a_2^{(2)}} = -\frac{2V_{12}}{E_1^{(0)} - E_2^{(0)} + 2|V_{12}|} \approx -e^{i\delta_V}\left(1 - \frac{E_1^{(0)} - E_2^{(0)}}{2|V_{12}|}\right),$$

where I introduced the phase of the matrix element, $V_{12} = |V_{12}|\exp(i\delta_V)$, and used the approximation $(1 + x)^{-1} \approx 1 - x$. Note that the correction to the main terms ($\pm\exp[i\delta_V]$) in both expressions is linear in $\left(E_1^{(0)} - E_2^{(0)}\right)/|V_{12}|$, which justifies the approximation for the radical used in Eq. 10.9 (the neglected quadratic terms are smaller than the linear ones kept in the expressions for the coefficients). The contributions of the initial eigenvectors in this limit are almost equal to each other in magnitude while differing in phase by $\pi$ (do I need to remind you that $-1 = \exp(i\pi)$?). The approximate expressions for the energy eigenvalues, Eqs. 10.4 and 10.5, in this limit become (again neglecting terms quadratic in $\left(E_1^{(0)} - E_2^{(0)}\right)/|V_{12}|$)

$$E_1 = \frac{1}{2}\left(E_1^{(0)} + E_2^{(0)}\right) + |V_{12}| \tag{10.10}$$

$$E_2 = \frac{1}{2}\left(E_1^{(0)} + E_2^{(0)}\right) - |V_{12}|. \tag{10.11}$$

What is significant about this result is that even when the difference between the initial energy levels is very small compared to the matrix element of the interaction, the difference between the actual eigenvalues is $2|V_{12}|$ and is not small at all.

Experimentalists love two-level models because they are simple (all you need to know is how to solve quadratic equations), and they are tempted to use them as often as they can in disparate fields of physics. Theoreticians, of course, hate this model with as much fervor, because if all of the physics could have been explained by a two-level model, all theoreticians would have lost their jobs. Luckily, this is not the case.

The physics described by this model becomes particularly interesting (and important) if the initial Hamiltonian $\hat{H}_0$ depends on some parameters, which can be controlled experimentally in such a way that the sign of the difference $E_1^{(0)} - E_2^{(0)}$ can be continuously altered. In this case, at a certain value of this parameter, the two initial energy levels become degenerate, and if one plots the dependence of $E_1^{(0)}$ and $E_2^{(0)}$ as functions of this parameter, the corresponding curves would cross at some point. This is an example of an accidental degeneracy, which is not related to any symmetry and occurs only at particular values of a system's parameters. Still, it happens in a number of physical systems and is of great interest because it affects how the system reacts to various stimuli. If, however, one plots the dependence of the actual eigenvalues as functions of the same parameter, the curves would not cross each other, as is obvious from Eqs. 10.10 and 10.11. The curves representing this dependence will now look like the ones shown in Fig. 10.1. You can see that the curves do not cross each other anymore, giving this phenomenon the name of avoided level crossing.

Fig. 10.1 An example of avoided crossing (energy versus an external parameter)
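The avoided crossing is easy to reproduce numerically from Eqs. 10.4 and 10.5. In this sketch (numpy assumed), a hypothetical control parameter $g$ sweeps the bare levels $E_1^{(0)} = g$ and $E_2^{(0)} = -g$ through degeneracy at a fixed coupling, and the minimum splitting of the two branches comes out as $2|V_{12}|$:

```python
import numpy as np

# Hypothetical control parameter g tunes the bare levels through degeneracy:
# E1(0) = +g, E2(0) = -g, with a fixed real coupling V12 = V.
V = 0.1
g = np.linspace(-1.0, 1.0, 2001)
root = np.sqrt((2 * g) ** 2 + 4 * V ** 2)
E_up = 0.5 * root      # Eq. 10.4 with E1(0) + E2(0) = 0
E_dn = -0.5 * root     # Eq. 10.5

gap = E_up - E_dn
print(np.isclose(gap.min(), 2 * V))    # True: minimum splitting is 2|V12|
print(abs(g[np.argmin(gap)]) < 1e-3)   # True: it occurs at the degeneracy point
```

Plotting `E_up` and `E_dn` against `g` reproduces the two repelling branches of Fig. 10.1.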

This is a remarkable phenomenon, which is not easily appreciated. Let me try to help you understand what is so special about these two curves not crossing each other. Let's begin far to the left of the point of degeneracy, where $E_1^{(0)} > E_2^{(0)}$. We ascertained that in this case the lower curve describes the energy of a state which is mostly $|2\rangle$, while the state whose energy belongs to the upper curve is mostly $|1\rangle$. At the point of avoided crossing, the eigenvectors describing the state of the system consist of both $|1\rangle$ and $|2\rangle$ in equal proportions. Now let's keep moving along the lower curve, which means that we are turning the dial and experimentally, gradually, changing our control parameter. After we have passed the point of avoided crossing, the relation between the initial energy levels has changed: now we have $E_1^{(0)} < E_2^{(0)}$. Now the main contribution to the superposition represented by the points on the lower curve comes from the state $|1\rangle$,¹ and if I move the system far enough from the avoided crossing point, I will have a state mostly consisting of the state $|1\rangle$. Now think about it: we started with the state of the system being predominantly $|2\rangle$, and by continuously changing our parameter, we transformed this state into one which is now predominantly state $|1\rangle$. This is better than any Hogwarts-style transformation wizardry, simply because it is not magic and not an illusion: just honest-to-goodness quantum mechanics!

¹Recall the comment I made at the end of the discussion of the limit $\left|E_1^{(0)} - E_2^{(0)}\right| \gg |V_{12}|$.


10.2 Two-Level System in a Harmonic Electric Field: Rabi Oscillations

Now let me switch gears and allow the perturbation operator $\hat{V}$ to become a function of time. More specifically, I will assume that the perturbation matrix elements $V_{12}$ and $V_{21}$ have the following form:

$$V_{21}(t) = V_{12}(t) = \mathcal{E}\cos\Omega t,$$

where $\mathcal{E}$ is real. This form of the perturbation describes, for instance, a dipole interaction between a two-level system and a harmonic electric field, and appears in many realistic situations. The Hamiltonian of the system in this case reads

$$\hat{H} = E_1^{(0)}|1\rangle\langle 1| + E_2^{(0)}|2\rangle\langle 2| + \mathcal{E}\cos\Omega t\left(|1\rangle\langle 2| + |2\rangle\langle 1|\right). \tag{10.12}$$

This is the first time you are dealing with an explicitly time-dependent Hamiltonian in the Schrödinger picture, and this requires certain adjustments in the way of thinking about the problem. First of all, you have to accept the fact that you cannot present solutions in the form $\exp(-iEt/\hbar)|\psi\rangle$, with $|\psi\rangle$ being an eigenvector of the Hamiltonian. The equation $\hat{H}|\psi\rangle = E|\psi\rangle$ with a time-dependent Hamiltonian and time-independent $|\psi\rangle$ does not make sense anymore. In other words, stationary states do not exist in the case of time-dependent Hamiltonians, and we need, therefore, a new way of solving the time-dependent Schrödinger equation. No one can forbid you, however, to use eigenvectors of any time-independent Hamiltonian as a basis, because a basis is a basis regardless of the properties of the Hamiltonian. The choice of the basis is determined solely by reasons of convenience, and it is especially convenient in this case to use the eigenvectors of $\hat{H}_0$ presented by vectors $|1\rangle$ and $|2\rangle$. Thus, let me present the unknown time-dependent state vector $|\psi(t)\rangle$ as a linear combination of these vectors:

$$|\psi(t)\rangle = a_1(t)\exp\left(-\frac{iE_1^{(0)}t}{\hbar}\right)|1\rangle + a_2(t)\exp\left(-\frac{iE_2^{(0)}t}{\hbar}\right)|2\rangle \tag{10.13}$$

with some unknown coefficients $a_{1,2}$. This expression is very reminiscent of Eq. 4.15 for a general solution with a time-independent Hamiltonian, but with two significant differences: first, the basis used in Eq. 10.13 is not formed by eigenvectors of the total Hamiltonian $\hat{H}$, which does not have eigenvectors (at least not in the regular sense of the word), and second, the expansion coefficients are now unknown functions of time, while their counterparts in Eq. 4.15 were constants. You might wonder at this point if I am allowed to separate out the exponential factors characteristic of the time dependence of the stationary states. A simple answer is: "Why not?" As long as I allow for the yet undetermined time dependence of the residual coefficients, I can factor out any time-dependent function I want. It will affect the equations which these coefficients obey, but not the final result. The most meticulous of you might also ask why, even if it is allowed to pull out these factors, I bother doing it. This is a more valid question, which deserves a more detailed answer. Let me begin by saying that I did not have to do it: the earth would not stop in its tracks if I did not, and we would still solve the problem. However, doing so reflects a somewhat deeper understanding of the two distinct sources of the time dependence of the state vectors. One is the trivial dependence given by these exponential factors, which would have existed even if the Hamiltonian did not depend on time. These exponential factors have nothing to do with the time dependence of the Hamiltonian. By factoring them out right away, I ensure that the remaining time dependence of the coefficients reflects only the genuine nontrivial dynamics. As an extra bonus, I hope that by doing so, I will arrive at equations that are easier to analyze.

Substitution of Eq. 10.13 into the left-hand side of the Schrödinger equation, $i\hbar\, d|\psi\rangle/dt$, yields

$$i\hbar\frac{d|\psi\rangle}{dt} = E_1^{(0)}a_1(t)\exp\left(-\frac{iE_1^{(0)}t}{\hbar}\right)|1\rangle + i\hbar\frac{da_1(t)}{dt}\exp\left(-\frac{iE_1^{(0)}t}{\hbar}\right)|1\rangle +$$
$$E_2^{(0)}a_2(t)\exp\left(-\frac{iE_2^{(0)}t}{\hbar}\right)|2\rangle + i\hbar\frac{da_2(t)}{dt}\exp\left(-\frac{iE_2^{(0)}t}{\hbar}\right)|2\rangle. \tag{10.14}$$

The right-hand side of this equation, $\hat{H}|\psi\rangle$, with $\hat{H}$ defined by Eq. 10.12 and $|\psi\rangle$ by Eq. 10.13, becomes

$$\hat{H}|\psi\rangle = E_1^{(0)}a_1(t)\exp\left(-\frac{iE_1^{(0)}t}{\hbar}\right)|1\rangle + E_2^{(0)}a_2(t)\exp\left(-\frac{iE_2^{(0)}t}{\hbar}\right)|2\rangle +$$
$$\mathcal{E}a_2(t)\cos\Omega t\,\exp\left(-\frac{iE_2^{(0)}t}{\hbar}\right)|1\rangle + \mathcal{E}a_1(t)\cos\Omega t\,\exp\left(-\frac{iE_1^{(0)}t}{\hbar}\right)|2\rangle, \tag{10.15}$$

where I took into account the orthogonality of the basis states. Equating the coefficients in front of vectors $|1\rangle$ and $|2\rangle$ on the left- and right-hand sides of the Schrödinger equation (Eqs. 10.14 and 10.15 correspondingly) results in differential equations for the time-dependent coefficients $a_{1,2}(t)$:

$$i\hbar\frac{da_1(t)}{dt} = \mathcal{E}a_2(t)\cos\Omega t\,\exp\left(\frac{i\left[E_1^{(0)} - E_2^{(0)}\right]t}{\hbar}\right) \tag{10.16}$$

$$i\hbar\frac{da_2(t)}{dt} = \mathcal{E}a_1(t)\cos\Omega t\,\exp\left(-\frac{i\left[E_1^{(0)} - E_2^{(0)}\right]t}{\hbar}\right). \tag{10.17}$$


The factors $\exp\left(\pm i\left[E_1^{(0)} - E_2^{(0)}\right]t/\hbar\right)$ on the right-hand sides of these equations appeared as a result of eliminating the corresponding exponential factors $\exp\left(-iE_{1,2}^{(0)}t/\hbar\right)$ from their left-hand sides. Note that the energy eigenvalues appear in these equations only in the form of their difference, which is just another manifestation of the already mentioned fact that the absolute values of the energy levels are irrelevant. To simplify the notation, let me introduce a so-called transition frequency:

$$\omega_{12} = \frac{E_1^{(0)} - E_2^{(0)}}{\hbar}, \tag{10.18}$$

where I again for concreteness assumed that $E_1^{(0)} - E_2^{(0)} > 0$. Introducing this notation and replacing $\cos\Omega t$ by the sum of the respective exponential functions, I can rewrite Eqs. 10.16 and 10.17 in the following form:

$$i\hbar\frac{da_1(t)}{dt} = \frac{1}{2}\mathcal{E}a_2(t)\left(\exp\left[i(\omega_{12} - \Omega)t\right] + \exp\left[i(\omega_{12} + \Omega)t\right]\right) \tag{10.19}$$

$$i\hbar\frac{da_2(t)}{dt} = \frac{1}{2}\mathcal{E}a_1(t)\left(\exp\left[-i(\omega_{12} - \Omega)t\right] + \exp\left[-i(\omega_{12} + \Omega)t\right]\right). \tag{10.20}$$

Equations 10.19 and 10.20 cannot be solved analytically. However, the most interesting phenomena described by these equations occur when $\omega_{12} - \Omega \ll \omega_{12} + \Omega$, in which case I can introduce an effective approximation capturing the most important properties of the model (obviously, something will be left out, and there might be situations when this something becomes important, but I am going to pretend that such situations do not concern me at all). In order to formulate this approximation, it is convenient to introduce the parameter $\Delta = \omega_{12} - \Omega$, called the frequency detuning. In the case of small detuning, the two exponential terms in Eqs. 10.19 and 10.20 change with time on significantly different time scales. The terms containing $\omega_{12} - \Omega$ oscillate with a much larger period (much more slowly) than the terms containing $\omega_{12} + \Omega$, which exhibit comparatively fast oscillations.

In order to understand why fast oscillations are not effective in influencing the behavior of the system, imagine a regular pendulum acted upon by a force which changes its direction faster than the pendulum manages to react to it (this is called inertia, in case you forgot, and it takes some time for any quantity to change by any appreciable amount). What will happen to the pendulum in this case? Before it has any chance to move in the initial direction of the force, the force will have already changed and pushes the pendulum in the opposite direction. This is a very frustrating situation, so the pendulum will just stay where it is. In scientific jargon this effect is called self-averaging: the force changes so much faster than the reaction time of the pendulum that it effectively averages itself out to zero. Taking advantage of this self-averaging effect, I will drop the fast-changing terms in Eqs. 10.19 and 10.20, turning them into


$$i\hbar\frac{da_1(t)}{dt} = \frac{1}{2}\mathcal{E}a_2(t)\exp(i\Delta t) \tag{10.21}$$

$$i\hbar\frac{da_2(t)}{dt} = \frac{1}{2}\mathcal{E}a_1(t)\exp(-i\Delta t). \tag{10.22}$$

Differentiating the first of these equations with respect to time, I get

$$i\hbar\frac{d^2a_1(t)}{dt^2} = \frac{1}{2}\mathcal{E}\frac{da_2(t)}{dt}\exp(i\Delta t) + \frac{1}{2}i\Delta\mathcal{E}a_2(t)\exp(i\Delta t).$$

Now, taking $da_2/dt$ from Eq. 10.22 while expressing $a_2(t)$ in terms of $da_1/dt$ using Eq. 10.21, I get rid of the coefficient $a_2$ and derive an equation containing only $a_1$:

$$\frac{d^2a_1(t)}{dt^2} - i\Delta\frac{da_1(t)}{dt} + \frac{1}{4\hbar^2}\mathcal{E}^2a_1(t) = 0. \tag{10.23}$$

Did you notice how the time-dependent exponents in Eq. 10.23 magically disappeared, turning it into a regular linear differential equation of the second order with constant coefficients? You might notice that this is the same equation which describes (among other things) the motion of a damped harmonic oscillator, with damping represented by the term with the first time derivative. This might appear a bit troublesome, because the motion of a damped harmonic oscillator is characterized by exponential decay of the respective quantities with time, and this is not the behavior which we would like our quantum state to have. However, before going into panic mode, look at the equation a bit more carefully, and you might notice that "the damping" coefficient (whatever appears in front of $da_1/dt$) is purely imaginary, so no real damping takes place, and you can breathe easier.

Damping or no damping, I know that equations of the type of Eq. 10.23 are solved by an exponential function, which I choose in the form $\exp(i\omega t)$. Substitution of this function into Eq. 10.23 yields an equation for the yet unknown parameter $\omega$:

$$\omega^2 - \Delta\omega - \frac{1}{4}\Omega_R^2 = 0,$$

where I introduced a new quantity with the dimension of frequency,

$$\Omega_R = \frac{\mathcal{E}}{\hbar}, \tag{10.24}$$

which plays an important role in the phenomena we are about to uncover. The quadratic equation for $\omega$ has two solutions:

$$\omega_\pm = \frac{1}{2}\Delta \pm \frac{1}{2}\sqrt{\Delta^2 + \Omega_R^2} \tag{10.25}$$

(both of which are, by the way, real), so that the general solution to Eq. 10.23 takes the form

$$a_1 = A\exp(i\omega_+ t) + B\exp(i\omega_- t). \tag{10.26}$$
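A two-line numerical check (numpy assumed; the values of $\Delta$ and $\Omega_R$ are arbitrary samples) confirms that the roots in Eq. 10.25 satisfy the quadratic and that $\exp(i\omega_\pm t)$ indeed solves Eq. 10.23:

```python
import numpy as np

# Hypothetical detuning and Rabi frequency (hbar = 1)
Delta, Omega_R = 0.3, 1.2
root = np.sqrt(Delta ** 2 + Omega_R ** 2)
w_p = 0.5 * Delta + 0.5 * root   # omega_+ of Eq. 10.25
w_m = 0.5 * Delta - 0.5 * root   # omega_-

for w in (w_p, w_m):
    # Each root must satisfy w^2 - Delta*w - Omega_R^2/4 = 0
    print(abs(w ** 2 - Delta * w - Omega_R ** 2 / 4) < 1e-12)   # True
    # exp(i w t) then solves Eq. 10.23: a'' - i*Delta*a' + (Omega_R^2/4) a = 0
    t = 0.7
    a = np.exp(1j * w * t)
    residual = -w ** 2 * a - 1j * Delta * (1j * w) * a + 0.25 * Omega_R ** 2 * a
    print(abs(residual) < 1e-12)   # True
```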

The expression for the second coefficient, $a_2$, is found using Eq. 10.21:

$$a_2 = \frac{2i\hbar}{\mathcal{E}}\exp(-i\Delta t)\frac{da_1(t)}{dt} = -\frac{2}{\Omega_R}\exp(-i\Delta t)\left[A\omega_+\exp(i\omega_+ t) + B\omega_-\exp(i\omega_- t)\right].$$

Combining the exponential functions in this equation, you might notice the emergence of two frequencies, $\omega_+ - \Delta$ and $\omega_- - \Delta$, which can be evaluated as

$$\omega_+ - \Delta = -\frac{1}{2}\Delta + \frac{1}{2}\sqrt{\Delta^2 + \Omega_R^2} = -\omega_-$$

$$\omega_- - \Delta = -\frac{1}{2}\Delta - \frac{1}{2}\sqrt{\Delta^2 + \Omega_R^2} = -\omega_+,$$

allowing you to write the expression for $a_2$ as

$$a_2 = -\frac{2}{\Omega_R}\left[A\omega_+\exp(-i\omega_- t) + B\omega_-\exp(-i\omega_+ t)\right]. \tag{10.27}$$

The amplitudes $A$ and $B$ in Eqs. 10.26 and 10.27 are yet undetermined; to find them, I have to specify initial conditions for Eqs. 10.21 and 10.22, an issue which I have not even mentioned yet. At the same time, you are perfectly aware that any problem involving time evolution is not complete without initial conditions, which in quantum mechanics means a state of the system at some instant of time defined as $t = 0$.

It is usually assumed in this type of problem that one can "turn on" the time-dependent interaction at some instant determined by the will of the experimentalist, and in many cases this does make sense. For instance, the time-dependent term in Hamiltonian 10.12 can represent a laser beam, which you can, indeed, turn on and off at will. In this case one can prepare the system in a specific state before the laser is turned on and study how this state will evolve due to the interaction with the laser radiation. It is simplest to prepare the system in the lowest-energy stationary state, and so this is what I will choose as the initial condition:

$$|\psi(0)\rangle = |2\rangle.$$

Taking into account Eq. 10.13, I can translate this into the following initial conditions for the dynamic variables $a_1$ and $a_2$:

$$a_1(0) = 0 \tag{10.28}$$
$$a_2(0) = 1. \tag{10.29}$$

Substituting $t = 0$ into Eqs. 10.26 and 10.27 and using Eqs. 10.28 and 10.29, I derive the following equations for the amplitudes $A$ and $B$:

$$A + B = 0;$$

$$-\frac{2}{\Omega_R}\left[A\omega_+ + B\omega_-\right] = 1,$$

which are easily solved to yield

$$A = -B = -\frac{\Omega_R}{2(\omega_+ - \omega_-)}.$$

It is easy to see using Eq. 10.25 that

$$\omega_+ - \omega_- = \sqrt{\Delta^2 + \Omega_R^2},$$

so that the amplitudes take on the value

$$A = -B = -\frac{\Omega_R}{2\sqrt{\Delta^2 + \Omega_R^2}}.$$

Having found $A$ and $B$, I can write down the final solutions for the time-dependent coefficients $a_{1,2}(t)$:

$$a_1 = \frac{\Omega_R}{2\sqrt{\Delta^2 + \Omega_R^2}}\left[\exp(i\omega_- t) - \exp(i\omega_+ t)\right] \tag{10.30}$$

$$a_2 = \frac{1}{\sqrt{\Delta^2 + \Omega_R^2}}\left[\omega_+\exp(-i\omega_- t) - \omega_-\exp(-i\omega_+ t)\right]. \tag{10.31}$$

These equations formally solve the problem I set out for you to solve: you now know the time-dependent state of the two-level system described by Hamiltonian 10.12 at any instant of time. But I wouldn't blame you if you still have this annoying, gnawing feeling of not being quite satisfied, probably because you are not quite sure what to do with this solution and what kind of useful physical information you can dig out from it. Indeed, the standard interpretation of the coefficients in expressions similar to Eq. 10.13 as probability amplitudes, whose squared absolute values yield the probability of obtaining a corresponding value of an observable whose eigenvectors are used as a basis, wouldn't work here. The problem is that we are using the basis provided by the eigenvectors of a Hamiltonian of the system which does not exist anymore, so this traditional interpretation does not make much sense.

One way to make sense out of Eqs. 10.26 and 10.27 is to recognize that in a typical experiment, the time-dependent interaction does not last forever: it starts at some instant, which you can designate as $t = 0$, and it usually ends at some time $t = t_f$ (for instance, when a graduate student running the experiment gets tired, turns the laser off, and goes on a date). So, after the time-dependent part of the Hamiltonian vanishes, you are back to the standard situation, but the system is now in a superposition state defined by the values of the coefficients $a_{1,2}$ at the time when the laser got switched off. Now, you can quickly take a measurement of the energy and interpret the results in terms of the probabilities of getting one of two values: $E_1^{(0)}$ or $E_2^{(0)}$. The probability $p\left(E_1^{(0)}\right)$ that the measurement would yield $E_1^{(0)}$ is given as usual by $|a_1|^2$, which according to Eq. 10.30 is

$$p\left(E_1^{(0)}\right) = \frac{\Omega_R^2}{4\left(\Delta^2 + \Omega_R^2\right)}\left[2 - \exp\left(i(\omega_+ - \omega_-)t_f\right) - \exp\left(-i(\omega_+ - \omega_-)t_f\right)\right] =$$
$$\frac{\Omega_R^2}{2\left(\Delta^2 + \Omega_R^2\right)}\left[1 - \cos\left((\omega_+ - \omega_-)t_f\right)\right] = \frac{\Omega_R^2}{\Delta^2 + \Omega_R^2}\sin^2\frac{(\omega_+ - \omega_-)t_f}{2} = \frac{\Omega_R^2}{\Delta^2 + \Omega_R^2}\sin^2\frac{\sqrt{\Delta^2 + \Omega_R^2}}{2}t_f. \tag{10.32}$$

The probability that this measurement would yield the value E_2^{(0)} could have been
computed in exactly the same manner, and I will give you a chance to do it,
as an exercise, but here I will be smart and take advantage of the fact that
p(E_1^{(0)}) + p(E_2^{(0)}) = 1, so that without much ado, I can present you with

\[
p\bigl(E_2^{(0)}\bigr)=1-\frac{\Omega_R^2}{\Delta^2+\Omega_R^2}\,\sin^2\frac{\sqrt{\Delta^2+\Omega_R^2}}{2}\,t_f
=\frac{\Delta^2}{\Delta^2+\Omega_R^2}\,\sin^2\frac{\sqrt{\Delta^2+\Omega_R^2}}{2}\,t_f
+\cos^2\frac{\sqrt{\Delta^2+\Omega_R^2}}{2}\,t_f. \quad (10.33)
\]

Equations 10.32 and 10.33 create a clear physical picture of what is happening with
our system. The first thing to note is the periodic oscillation of the probabilities in
time with frequency Ω_GR = √(Δ² + Ω_R²), called the generalized Rabi frequency (note
that the factor 1/2 in the arguments of the cos and sin functions in these equations
is the result of the transition from cos x to functions of x/2 and is, therefore,
not included in the definition of the frequency). There exist special times t_f^{(n)} =
2πn/Ω_GR, where n is an integer, when the probability that the system will be found
in the higher-energy state is zero, and there are times t_f^{(n)} = 2π(n + 1/2)/Ω_GR when this
probability acquires its maximum value Ω_R²/(Δ² + Ω_R²). For the probability p(E_2^{(0)}),
the situation is reversed: the probability reaches unity at the times
t_f^{(n)} = 2πn/Ω_GR, but its minimum value, occurring at t_f^{(n)} = 2π(n + 1/2)/Ω_GR, is not zero
but is equal to Δ²/(Δ² + Ω_R²).

Fig. 10.2 Oscillations of p(E_1^{(0)}) for three values of the detuning: Δ = 0, Δ/Ω_R = 0.5, and Δ/Ω_R = 1.5

Figure 10.2 depicts these oscillations of probability
known as Rabi oscillations. The period of these oscillations, as well as the maximum
and minimum values of the corresponding probabilities, depends on the detuning
parameter Δ controlled by the experimentalists. For large detuning Δ ≫ Ω_R, the
frequency of the oscillations is determined mostly by Δ, but their swing (the difference
between the largest and smallest values) diminishes. For instance, the largest value of
p(E_1^{(0)}) becomes of the order of Ω_R²/Δ² ≪ 1, while the smallest value of p(E_2^{(0)})
is in this case very close to unity: 1 − Ω_R²/Δ². For both probabilities there is not
much oscillation to speak of. A more interesting situation arises in the case of
small detuning, with the special case of zero detuning being of most interest. The
frequency of the Rabi oscillations in this case becomes smallest and is equal to Ω_R,
which is called the Rabi frequency, and the probabilities swing between exactly zero and
exactly unity, becoming most pronounced.
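Equations 10.32 and 10.33 are straightforward to evaluate numerically. Here is a minimal sketch (Python; the function and variable names are mine, not from the text) that computes both probabilities for a given switch-off time and detuning and checks that they add up to one:

```python
import math

def rabi_probabilities(t_f, omega_R, delta):
    """Probabilities p(E1), p(E2) from Eqs. 10.32 and 10.33 for
    switch-off time t_f, Rabi frequency omega_R, and detuning delta."""
    omega_GR = math.hypot(delta, omega_R)      # generalized Rabi frequency
    p1 = (omega_R / omega_GR) ** 2 * math.sin(omega_GR * t_f / 2.0) ** 2
    return p1, 1.0 - p1

# zero detuning: the system is certain to be in the upper state at t_f = pi/omega_R
p1, p2 = rabi_probabilities(math.pi, 1.0, 0.0)
```

At Δ = 0 the swing covers the full interval between 0 and 1; at Δ/Ω_R = 1.5 (one of the curves in Fig. 10.2) the maximum of p(E_1^{(0)}) is only Ω_R²/(Δ² + Ω_R²) ≈ 0.31.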

If you are interested in how one can observe Rabi oscillations, here is an example
of how it can be done. Imagine that you subject an ensemble of two-level systems
to a strong time-periodic electric field with small detuning and turn it off at different
times. The timing of the switching-off will determine the probability of a two-level
system to be in the higher energy state. The fraction of the systems in the ensemble
in this state is proportional to the corresponding probability. The systems will
eventually undergo transition to the lower energy level and emit light. The intensity
of the emitted light will be proportional to the number of systems in the upper state
and will change periodically with the switching-off time. In real experiments there is
no actual need to turn the electric field on and off all the time because spontaneous
transitions of the system from upper to lower energy states happen even with the
electric field on, and when this transition happens, the system is kicked out of its normal
dynamics so hard that it forgets everything about what was happening to it before
that, so that the whole process starts anew. These kicks serve effectively as switches
for the electric field. Oscillations in this case can be observed as functions of Rabi
frequency controlled by the strength of the applied electric field. It is important that
Rabi oscillations can be observed only if their period is shorter than the time interval
between the “kicks.” To fulfill this condition, the applied electric field must be strong
enough to yield oscillations with a sufficiently high frequency.

10.3 Problems

Problems for Sect. 10.1

Problem 136 Find the approximate expression for the energy levels of a two-level
system with a time-independent perturbation in the limit |V₁₂| ≪ |E₁ − E₂| for
two cases: E₁ > E₂ and E₁ < E₂.

Problem 137 Assume that the perturbation part of the Hamiltonian is given by
V̂ = ẑℰ, where ℰ is the electric field and ẑ is the respective coordinate operator.
Assume also that the wave functions of the states included in the Hamiltonian are
described (in the coordinate representation) by wave functions defined on the one-dimensional
interval −∞ < z < ∞:

\[
\langle z|E_1\rangle=\frac{1}{\sqrt{a_B}}\exp(-|z|/a_B),\qquad
\langle z|E_2\rangle=\sqrt{\frac{2}{a_B^3}}\,z\exp(-|z|/a_B),
\]

where a_B is the Bohr radius for an electron in the hydrogen atom, and that the electric
field is given in terms of the binding energy of the hydrogen atom W_b and the electron's
charge e as ℰ = W_b/(ea_B). The unperturbed energy levels are given as

\[
E_1=W_b(1+u),\qquad E_2=W_b(1-u),
\]

where u is a dimensionless parameter that can be changed between the values −2 and 2.

1. Find the perturbation matrix V_{ij}.
2. Find the eigenvalues of the full Hamiltonian, and plot them as functions of u.
3. Also find the eigenvectors of the full Hamiltonian, and plot the relative weights
   of the initial vectors |E_{1,2}⟩ in both of the found eigenvectors as functions of u.
4. Also, consider the phases of the ratio of the coefficients c₁/c₂ for both eigenvectors,
   and plot their dependence on the parameter u.


5. In all plots pay special attention to the region around u = 0, and describe how
   the behavior of the eigenvalues of the perturbed Hamiltonian differs from the
   corresponding behavior of the unperturbed energies.
6. Describe the behavior of the absolute values and the phases of c₁/c₂ in the
   vicinity of u = 0.

Problems for Sect. 10.2

Problem 138 Find the probability that the measurement of the energy will yield
the value E₂ directly from the coefficient a₂ in Eq. 10.31, and verify that the expression
for this probability given in the text is correct.

Problem 139 Find the time dependence of the probabilities p(E_{1,2}) assuming that
at time t = 0 the system was in the state |1⟩.

Chapter 11
Non-interacting Many-Particle Systems

11.1 Identical Particles in the Quantum World: Bosons
and Fermions

Quantum mechanical properties of a single particle are an important starting point
for studying quantum mechanics, but in real experimental and practical situations,
you will rarely deal with just a single particle. Most frequently you encounter
systems consisting of many (from two to infinity) interacting particles. The main dif-
ficulty in dealing with many-particle systems comes from a significantly increased
dimensionality of space, where all possible states of such systems reside. In Sect. 9.4
you saw that the states of the system of two spins belong to a four-dimensional
spinor space. It is not too difficult to see that the states of a system consisting of N
spins would need a 2^N-dimensional space to fit them all. Indeed, adding each new
spin-1/2 particle with its two new spin states, you double the number of basis vectors
in the respective tensor product, and even a system of as few as ten particles
inhabits a space requiring 2^10 = 1024 basis vectors. More generally, imagine that you
have a particle which can be in one of M mutually exclusive states, represented
obviously by M mutually orthogonal vectors (I will call them single-particle states),
which can be used as a basis in this single-particle M-dimensional space. You can
generate a tensor product of single-particle spaces by stacking together M basis
vectors from each single-particle space. Naively you might think that the dimension
of the resulting space will be M^N, but it is not always so. The reality is more
interesting, and to get the dimensionality of many-particle states correct, you need
to dig deeper into the concept of the identity of quantum particles.
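The doubling argument above is easy to check mechanically. A trivial sketch (Python; the function name is mine, purely illustrative) of the naive tensor-product count M^N:

```python
def naive_dimension(M, N):
    """Dimension of the tensor product of N single-particle spaces with
    M basis states each, before any symmetry requirements are imposed."""
    return M ** N

# each added spin-1/2 particle doubles the basis; ten spins need 1024 vectors
assert naive_dimension(2, 10) == 1024
```

As the rest of the chapter shows, this count is the right one only for distinguishable particles.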

In the classical world, we know that all electrons are the same—the same charge
and the same mass—but if necessary we can still distinguish between them saying
that this is an electron with such and such initial coordinates and initial velocity, and
therefore, it follows this particular trajectory. A second electron, which is exactly the
same as the first one, but starting out with different initial conditions, follows its own
trajectory. And even if these two electrons interact, scatter off each other, we can still

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,
https://doi.org/10.1007/978-3-319-71550-6_11

345

346 11 Non-interacting Many-Particle Systems

Fig. 11.1 Two distinguishable classical electrons interact with each other and follow their own distinguishable trajectories. We can easily say which electron follows which trajectory

Fig. 11.2 Propagating clouds
of probabilities representing
the particles. In the
interaction region, the clouds
overlap, and the individuality
of the particles is lost
(Warning: it is dangerous to
take this cartoon too
seriously!)

say which electron is which by following their trajectories (see Fig. 11.1). Thus, we
say that classical electrons, even though they are identical, are still distinguishable.

The situation changes when you are talking about quantum particles. In essen-
tially the same setup—two particles approach each other from opposite directions,
interact, and move each in its new direction—the situation becomes completely
different. Now instead of two well-localized particles with perfectly defined tra-
jectories, you are dealing with moving amorphous clouds of probabilities, and when
they approach each other and overlap, all you can measure is the probability to find
one or two particles within a certain region of space, but you have no means to
tell which of the observed particles is which (Fig. 11.2). In quantum mechanics the
individuality of particles is completely lost—they are not just identical, but they are
indistinguishable.

Now the questions arise: how to formally describe this indistinguishability, and
what are the observable consequences of this property? To begin, let me formally
assign numbers 1 and 2 to the two particles and assume that particle 1 is in the
state described by vector |α^{(1)}⟩, where α indicates a particular quantum state and 1
assigns this state to the first particle, and the second particle is in the state |β^{(2)}⟩.

The space of the two-particle states can be generated by the tensor product of the
single-particle states with a two-vector basis:


\[
|\psi_1^{(tp)}\rangle = |\alpha^{(1)}\rangle\,|\beta^{(2)}\rangle \quad (11.1)
\]
\[
|\psi_2^{(tp)}\rangle = |\alpha^{(2)}\rangle\,|\beta^{(1)}\rangle, \quad (11.2)
\]

where the second vector is obtained from the first by replacing particle 1 with
particle 2. This operation can be formally described by a special "exchange"
operator P̂(1, 2) whose job is to interchange the indexes of the particles assigned to
each state:

\[
|\alpha^{(2)}\rangle\,|\beta^{(1)}\rangle = \hat{P}(1,2)\,|\alpha^{(1)}\rangle\,|\beta^{(2)}\rangle.
\]

When applied twice, this operator obviously leaves the initial vector intact (two
exchanges 1 → 2, 2 → 1 are equivalent to no exchange at all), meaning that
P̂²(1, 2) = Î. An immediate consequence of this identity is that the eigenvalues of
this operator are either 1 or −1.
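To make the exchange operator concrete, here is a small sketch (Python, standard library only; the helper names are mine) that builds the matrix of P̂(1, 2) on the tensor-product basis of two two-state particles and verifies P̂² = Î along with the ±1 eigenvectors:

```python
from itertools import product

def exchange_matrix(M):
    """Matrix of the exchange operator P(1,2) on the tensor-product
    basis |i>|j> of two M-state particles: P |i>|j> = |j>|i>."""
    dim = M * M
    P = [[0] * dim for _ in range(dim)]
    for i, j in product(range(M), repeat=2):
        P[j * M + i][i * M + j] = 1   # column |i>|j> maps to row |j>|i>
    return P

def matmul(A, B):
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

def apply(A, v):
    return [sum(A[r][c] * v[c] for c in range(len(v))) for r in range(len(A))]

P = exchange_matrix(2)
identity = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
assert matmul(P, P) == identity          # P^2 = I

# symmetric and antisymmetric combinations are eigenvectors with +1 and -1
sym  = [0, 1, 1, 0]                      # |0>|1> + |1>|0>
anti = [0, 1, -1, 0]                     # |0>|1> - |1>|0>
assert apply(P, sym) == sym
assert apply(P, anti) == [-x for x in anti]
```

The two assertions at the end are exactly Eqs. 11.4 and 11.5 for this four-dimensional example.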
Using the exchange operator, the concept of indistinguishability can be formulated
in a precise and formal way. Consider an arbitrary state of two particles
represented by vector |ψ(1, 2)⟩. If particles 1 and 2 are truly indistinguishable, then
the vector |ψ(2, 1)⟩ = P̂(1, 2)|ψ(1, 2)⟩ and the initial vector |ψ(1, 2)⟩ must represent the
same state, which means that they can differ from each other only by a constant
factor. The formal representation of the last statement looks like this:

\[
\hat{P}(1,2)\,|\psi(1,2)\rangle = |\psi(2,1)\rangle = \lambda\,|\psi(1,2)\rangle, \quad (11.3)
\]

which makes it clear that if |ψ(1, 2)⟩ represents a state of indistinguishable particles,
it must be an eigenvector of P̂(1, 2). The remarkable thing about this conclusion
is that there are only two types of such eigenvectors—those corresponding to
eigenvalue 1 and those that belong to eigenvalue −1, i.e., any vector describing a
state of indistinguishable particles must belong to one of two classes: symmetric
(even) with respect to the exchange of the particles, when

\[
\hat{P}(1,2)\,|\psi(1,2)\rangle = |\psi(1,2)\rangle \quad (11.4)
\]

or antisymmetric (odd), if

\[
\hat{P}(1,2)\,|\psi(1,2)\rangle = -|\psi(1,2)\rangle. \quad (11.5)
\]

Moreover, Hamiltonians of indistinguishable particles obviously do not change
when the particles are exchanged (otherwise they wouldn’t be indistinguishable),
which means that the exchange operator and the Hamiltonian commute:

\[
\bigl[\hat{H},\hat{P}(1,2)\bigr]=0 \quad (11.6)
\]


(if it is not clear where it comes from, check the discussion around the parity
operator in Sect. 5.1, where similar issues were raised). In the context of the
exchange operator, Eq. 11.6 signifies two things. The first is that the Hamiltonian and the
exchange operator are compatible and share a common system of eigenvectors. In
other words, eigenvectors of a Hamiltonian of two indistinguishable particles must
be either symmetric or antisymmetric. Second, if you treat the exchange operator
as a representative of a specific observable that takes only two values depending
on the symmetry of the state, Eq. 11.6 indicates that the expectation value of this
observable does not change with time (see Sect. 4.1.3 and the discussion around
Eq. 4.17 there). Accordingly, Eq. 11.6 ensures that if a two-particle system starts out
in a symmetric or antisymmetric state, it will remain in this state forever.

While it is useful to know that all states of indistinguishable particles must belong
to one of the two symmetry classes and that a system put in a symmetry class at some
instant of time will stay in this class forever, we still do not know how to relate the
symmetry of the states of a particular system of particles to their other properties:
does it depend on the particles' charges, masses, and the potential they are moving in,
or can it be somehow created deliberately through a clever measurement process?
I personally find the answer to all these questions, which I am about to reveal to
you, quite amazing: the symmetry of any state of indistinguishable particles cannot
be “chosen” or changed; the particles are born with predestined fate to be only
in states with one or another symmetry predetermined by their spin. It turns out
that particles with half-integer spin can exist only in antisymmetric states, while
particles with integer spins can be only in symmetric states. This statement is called
a spin-statistics theorem and is, in my view, one of the most amazing fundamental
results, which follows purely mathematically from the requirement that quantum
mechanics agrees with the relativity theory. Just stop to think about it: quantum
mechanics deals with phenomena occurring at very small spatial, temporal, mass,
and energy scales, while relativity theory explains the behavior of nature at very
large velocities. Apparently, quantum mechanics and relativity overlap when the
interaction between light and matter is involved and, at high energies, when particle-
antiparticle phenomena become important. However, the theoretical requirements of
the self-consistency of the theory, one of which is the spin-statistics theorem, are felt
well outside of these overlap areas and penetrate all of quantum mechanics from
atomic energy structure to electric, magnetic, and optical properties of solids and
fluids. Two better known phenomena made possible by the spin-statistics connection
are superfluidity and superconductivity. The proof of this theorem relies heavily on
quantum field theory, and you will sleep better at night by just accepting it as one of
the axioms of quantum mechanics.

The spin-statistics theorem was proved by Wolfgang Pauli in 1939 but published
only in 1940. The footnote to Pauli’s paper in Physical Review states that the paper
is a part of the report prepared for the Solvay Congress 1939, which did not take
place because of the war in Europe. By the time of that publication, Pauli had
moved from Zurich to Princeton, because Switzerland rejected his request for Swiss
citizenship on the ground of him becoming a German citizen after Hitler annexed
his native Austria. Anyway, to finish with this theorem, I only need to mention

11.1 Identical Particles in the Quantum World: Bosons and Fermions 349

that the particles with half-integer spins are called fermions (in honor of Enrico
Fermi, an Italian physicist, who had to leave Italy after Mussolini came to power;
he moved to the USA, where he created the world’s first nuclear reactor and played
a crucial role in the Manhattan Project), while particles with whole spins are called
bosons after Indian physicist Satyendra Nath Bose, who worked on the system of
indistinguishable photons as early as in 1924. It is interesting that after an initial
attempt to publish his paper on this topic failed, Bose sent it to Einstein asking
Einstein’s opinion and assistance with publication. Einstein translated the paper into
German and published in the leading German physics journal of the time Zeitschrift
für Physik (under Bose’s name, of course).

Before getting back to the business of doing practical quantum mechanics with
many-particle systems, a few additional words about fermions and bosons might be
useful. Among elementary particles constituting regular matter, fermions are most
abundant: electrons, protons, and neutrons—the main building blocks of atoms,
molecules, and the rest of the material world are all spin 1=2 fermions. The only
elementary boson you would encounter in regular setting would be a photon—
a quantum of the electromagnetic field—and not a regular material particle. This
can be taken as a general rule—as long as we are talking about elementary
particles, matter is represented by fermions, while the interaction fields and
other objects that would classically be presented as waves become bosons in
quantum mechanics. Other examples of bosons you can find are quantized elastic waves
(phonons) or quantized magnetic waves (magnons).

However, the concept of fermions and bosons can be extended to composite
particles as long as the processes they are taking part in do not change their
internal structure. The most famous examples of such composite particles are
electron Cooper pairs (Cooperons, named after American physicist Leon Cooper
who discovered them in 1956), responsible for superconductivity phenomenon, and
He4 nuclei, which in addition to two mandatory protons contain two neutrons,
making the total number of particles in the nucleus equal to 4. In both these
examples, we are dealing with composite bosons. Indeed, a pair of electrons, as
you already know, can be in the state with total spin either 0 or 1, i.e., the spin
of the pair is in either case integer. In the case of He4 nucleus, there are four spin
1=2 particles, and by diving them into two pairs, you can also see that the total
spin of this system can again only be an integer. Another interesting example of
composite bosons is an exciton in semiconductors, which consists of an electron
and a hole,1 both with spin 1=2: The extent to which the inner structure in all
these examples can be neglected and the particles can be treated as bosons depends
on the amount of energy required to disintegrate them into their constituent parts.
For Cooperons this energy is quite small—of the order of 10�3 eV—which explains

1 Energy levels in a semiconductor are organized in bands separated by large gaps. The band
all of whose energy levels are filled with electrons is called the valence band, and the closest
empty band above it is the conduction band. When an electron gets excited from the valence band, the
conduction band acquires an electron and the valence band loses an electron, which leaves in its
stead a positively charged hole. Here you have an electron-hole pair behaving as real positively and
negatively charged spin-1/2 particles.

350 11 Non-interacting Many-Particle Systems

why they can only survive at very low (below 10 K) temperatures; exciton binding
energies vary over a rather large range between several millielectronvolts and several
hundred millielectronvolts, depending upon the material, and they, therefore, survive
at temperatures between 10 K and room temperature (300 K). The He4 nucleus, of
course, is the most stable of all the composite particles discussed: it takes a whole
28.3 mega-electronvolts to take it apart.

After this short and hopefully entertaining detour, it is time to get back to the business
of figuring out how to implement the requirements of the spin-statistics theorem
in practical calculations. A generic vector representing a two-particle state and
expressed as a linear combination of the basis vectors 11.1 and 11.2,

\[
|\psi(1,2)\rangle = a_1\,|\alpha^{(1)}\rangle|\beta^{(2)}\rangle + a_2\,|\alpha^{(2)}\rangle|\beta^{(1)}\rangle,
\]

with arbitrary coefficients a_{1,2} does not obey the required symmetry condition.
However, after a few minutes of contemplation and silent staring at this expression,
you will probably see that you can satisfy the symmetry requirements of Eq. 11.4
by choosing a1 D a2, while Eq. 11.5 can be made happy with the choice a1 D �a2.
(If you are not that big on contemplation, just switch the particles in the expression
for j .1; 2/i, and write down the conditions of Eq. 11.4 or 11.5 explicitly.) If in
addition to symmetry you want your two-particle states to be also normalized, you
can choose for fermions

\[
|\psi_f(1,2)\rangle = \frac{1}{\sqrt{2}}\left(|\alpha^{(1)}\rangle|\beta^{(2)}\rangle - |\alpha^{(2)}\rangle|\beta^{(1)}\rangle\right) \quad (11.7)
\]

and for bosons

\[
|\psi_b^{(1)}(1,2)\rangle = \frac{1}{\sqrt{2}}\left(|\alpha^{(1)}\rangle|\beta^{(2)}\rangle + |\alpha^{(2)}\rangle|\beta^{(1)}\rangle\right). \quad (11.8)
\]

While Eq. 11.7 exhausts all possible two-particle states for fermions, in the case of
bosons, two more states, in which different particles occupy the same single-particle
state, can be constructed:

\[
|\psi_b^{(2)}(1,2)\rangle = |\alpha^{(1)}\rangle|\alpha^{(2)}\rangle \quad (11.9)
\]
\[
|\psi_b^{(3)}(1,2)\rangle = |\beta^{(1)}\rangle|\beta^{(2)}\rangle. \quad (11.10)
\]

An attempt to arrange a similar state for fermions fails because you cannot have
an antisymmetric expression with two identical states—they simply cancel each
other giving you a zero. In other words, it is impossible to have a two-particle
state of fermions, in which each fermion is in the same single-particle state. This
is essentially an expression of famous Pauli’s exclusion principle, which Pauli
formulated in 1925 trying to explain why atoms with even number of electrons are
more chemically stable than atoms with odd electron numbers. He realized that this

11.2 Constructing a Basis in a Many-Fermion Space 351

can be explained requiring that there can only be one electron per single-electron
state. If one takes into account only orbital quantum numbers such as principal
number n, orbital number l < n, and magnetic number |m| ≤ l (see Chap. 8), the
total number of available states is equal to n², which does not have to be even. So,
Pauli postulated the existence of yet another quantum quantity, which can take only
two different values, making the total number of quantum numbers characterizing the
state of an electron in an atom equal to 4 and the total number of available states 2n².
The initial formulation of this principle was concerned only with electrons and was
stated approximately like this: no two electrons in a many-electron atom can have
the same values of four quantum numbers. Despite the success of this principle in
explaining the periodic table, Pauli remained unsatisfied for two principal reasons:
(a) he had no idea which physical quantity the fourth quantum number represents,
and (b) he was not able to derive his principle from more fundamental postulates
of quantum mechanics. The first of his concerns was resolved with the emergence
of the idea of spin (see Sect. 9.1), but it took him 14 long years to finally prove the
spin-statistics theorem, of which his exclusion principle is a simple corollary.
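The cancellation that enforces the exclusion principle can be checked in a few lines. A sketch (Python; the names are mine, purely illustrative): antisymmetrize a product of two orbitals as in Eq. 11.7 and watch the state vanish identically when both fermions are assigned the same orbital:

```python
def tensor(u, v):
    """Kronecker product of two single-particle state vectors."""
    return [a * b for a in u for b in v]

def fermion_state(alpha, beta):
    """Unnormalized antisymmetric combination |alpha>|beta> - |beta>|alpha>,
    i.e., the structure of Eq. 11.7."""
    return [x - y for x, y in zip(tensor(alpha, beta), tensor(beta, alpha))]

alpha, beta = [1, 0], [0, 1]
assert any(fermion_state(alpha, beta))       # distinct orbitals: a genuine state
assert not any(fermion_state(alpha, alpha))  # identical orbitals: everything cancels
```

The second assertion is Pauli's principle in miniature: the would-be two-fermion state with both particles in the same orbital is the zero vector.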

Before continuing I would like to clear up one terminological problem. When
dealing with many-particle systems, the word “state” might have different meanings
when used in different contexts. On one hand, I will talk about states characterizing
the actual many-particle system; Eqs. 11.7 through 11.10 give examples of such
states for the two-particle system. On the other hand, I use single-particle states,

such as |α^{(1)}⟩ or |β^{(2)}⟩, to construct the many-particle states |ψ_f(1, 2)⟩ or |ψ_b^{(i)}(1, 2)⟩.

So, in order to avoid misunderstandings and misconceptions, let's agree that the term
"state" from now on will always refer to an actual state of a many-particle system,
while single-particle states from this point forward will be called single-particle
orbitals. Understood literally, orbitals usually describe single-electron
states of atomic electrons, but I will take the liberty of expanding this term to any
single-electron state. With this out of the way, I now want to direct your attention
to the following fact. In the system of two fermions with only two available orbitals,
we ended up with just a single two-particle state. At the same time, in the case of
the same number of bosons and the same number of orbitals, there are three linearly
independent orthogonal two-particle states, and if we were to forget about symmetry
requirements (as we would if dealing with distinguishable particles), we would have
ended up with a four-dimensional space of two-particle states, just like in the two-spin
problem from Sect. 9.4. You can see now that the dimensionality of the space
containing many-particle states depends strongly on the symmetry requirements,
and the naive prediction M^N for this dimension turns out to be correct only for
distinguishable particles.


11.2 Constructing a Basis in a Many-Fermion Space

While identical bosons are responsible for some fascinating phenomena such as
superfluidity and superconductivity, the systems of many fermions are much more
ubiquitous in the practical applications of quantum theory, and, therefore, I will
mostly focus on them from now on. As always, the first thing to understand is the
structure of the space in which vectors representing the states of interest live. This
includes finding its dimension and constructing a basis. The problem of finding
the dimension of a many-particle space is an exercise in combinatorics—the science
of counting the number of different combinations of various objects. In the case
of fermions, the problem is formulated quite simply: given N objects (particles)
and M boxes (orbitals), you need to compute in how many different ways you can
fill the boxes assuming that each box can hold only one particle, and an order in
which the particles are distributed among the boxes is not important. Once you
find one distribution of the particles among the boxes, it becomes a seed for one
many-particle state. The state itself is found by permuting the particles among the
boxes, adding a sign change for each pairwise exchange, and summing up the results.
To understand the situation better, begin with the simplest case: M = N. When the
number of particles is equal to the number of orbitals, you do not have much of a
choice: you just have to put one particle in each box and then do the permutations—
you end up with a single antisymmetric state. As an example, consider three

particles that can be in one of three available orbitals |α_i^{(s)}⟩, where the lower index
enumerates the orbitals and the upper index refers to the particles. Assume that
you put the first particle in the first box, the second particle in the second, and
the third one in the third, generating the following combination of the orbitals:
|α_1^{(1)}⟩|α_2^{(2)}⟩|α_3^{(3)}⟩. Now, let me switch particles 1 and 2, generating the combination
−|α_1^{(2)}⟩|α_2^{(1)}⟩|α_3^{(3)}⟩. If I switch the particles again, say, particles 1 and 3, I will get
the new combination |α_1^{(2)}⟩|α_2^{(3)}⟩|α_3^{(1)}⟩. Note that the negative sign has disappeared
because each new exchange brings about a change of sign. Making all 6 (3!)
permutations, you will end up with a single three-particle state:

\[
\begin{aligned}
&|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle
-|\alpha_1^{(2)}\rangle|\alpha_2^{(1)}\rangle|\alpha_3^{(3)}\rangle
+|\alpha_1^{(3)}\rangle|\alpha_2^{(1)}\rangle|\alpha_3^{(2)}\rangle\\
&-|\alpha_1^{(3)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(1)}\rangle
+|\alpha_1^{(2)}\rangle|\alpha_2^{(3)}\rangle|\alpha_3^{(1)}\rangle
-|\alpha_1^{(1)}\rangle|\alpha_2^{(3)}\rangle|\alpha_3^{(2)}\rangle. \quad (11.11)
\end{aligned}
\]

In agreement with the permutation rules described above, all terms in Eq. 11.11 with
negative signs in front of them can be obtained from the first term by exchanging
just one pair of particles, while the terms with the positive sign are obtained by
two pairwise exchanges. It makes sense, of course, because, as I said before,
an exchange of any two fermions is accompanied by a change of sign, in which
case an exchange of two pairs of fermions is equivalent to changing the sign twice:
+ → − → +, which is, of course, the same argument as I made when deriving this
expression. Finally, if you are wondering how to choose the first, seeding, term, the
answer is simple: it does not matter, and you can start with any of them. The only
difference you might notice is that all negative terms could become positive
and vice versa, which amounts to a simple overall negative sign in front of the whole
expression, and this makes no physical difference whatsoever.
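The permute-and-alternate-signs procedure is easy to mechanize. A sketch (Python, standard library only; the names are mine): each term of Eq. 11.11 is a permutation of the particle labels over the orbitals, with a sign given by the parity of that permutation:

```python
from itertools import permutations

def signed_terms(n):
    """Yield (sign, assignment) pairs, where assignment[i] is the particle
    placed in orbital i and the sign alternates with permutation parity."""
    for perm in permutations(range(1, n + 1)):
        # count inversions to get the parity of the permutation
        inv = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        yield (-1) ** inv, perm

terms = list(signed_terms(3))
assert len(terms) == 6                       # 3! terms, as in Eq. 11.11
assert sum(s for s, _ in terms) == 0         # three + signs and three - signs
assert terms[0] == (1, (1, 2, 3))            # the seed term comes with a + sign
```

Starting from a different seed permutes the list and may flip every sign at once, which, as noted above, makes no physical difference.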

Now, since for every selection of a number of boxes equal to the number of
particles you end up with just a single many-particle state, the total number of states
is simply equal to the number of ways you can select N boxes out of M. This is a
classical combinatorial problem with a well-known solution given by the number of
combinations of N objects chosen out of M. Thus, the number of distinct linearly
independent and orthogonal N-fermion states based on M available single-fermion
orbitals (the dimensionality D(N, M) of the corresponding space) is

\[
D(N,M)=\binom{M}{N}=\frac{M!}{N!\,(M-N)!}. \quad (11.12)
\]

You can verify this general result with a few simple examples. Let's say that you now
want to build a space of three-fermion states using five available single-fermion
orbitals. According to Eq. 11.12 this space possesses 5!/(3!2!) = 10 basis vectors.
Using the same notation |α_i^{(s)}⟩ as before, but now allowing the index i to run from 1 to 5,
you can generate the following ten seed vectors, in which each particle is assigned
to a different orbital:

\[
\begin{gathered}
|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle,\quad
|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_4^{(3)}\rangle,\quad
|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_5^{(3)}\rangle,\\
|\alpha_1^{(1)}\rangle|\alpha_3^{(2)}\rangle|\alpha_4^{(3)}\rangle,\quad
|\alpha_1^{(1)}\rangle|\alpha_3^{(2)}\rangle|\alpha_5^{(3)}\rangle,\quad
|\alpha_1^{(1)}\rangle|\alpha_4^{(2)}\rangle|\alpha_5^{(3)}\rangle,\\
|\alpha_2^{(1)}\rangle|\alpha_3^{(2)}\rangle|\alpha_4^{(3)}\rangle,\quad
|\alpha_2^{(1)}\rangle|\alpha_3^{(2)}\rangle|\alpha_5^{(3)}\rangle,\quad
|\alpha_2^{(1)}\rangle|\alpha_4^{(2)}\rangle|\alpha_5^{(3)}\rangle,\\
|\alpha_3^{(1)}\rangle|\alpha_4^{(2)}\rangle|\alpha_5^{(3)}\rangle.
\end{gathered}
\]

Each of these seeds yields a single antisymmetric state in exactly the same way
as in the previous example.
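Both the count in Eq. 11.12 and the seed list above can be generated mechanically. A short sketch (Python, standard library only; the function name is mine):

```python
from itertools import combinations
from math import comb

def fermion_space_dimension(N, M):
    """Number of linearly independent N-fermion states built
    from M single-fermion orbitals, Eq. 11.12."""
    return comb(M, N)

# three fermions in five orbitals: ten basis states
assert fermion_space_dimension(3, 5) == 10

# the seeds themselves: which orbitals are occupied, listed in increasing order
seeds = list(combinations(range(1, 6), 3))
assert len(seeds) == 10
assert seeds[0] == (1, 2, 3) and seeds[-1] == (3, 4, 5)
```

Each tuple of occupied orbitals corresponds to one seed vector, and hence to one antisymmetric basis state.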

A bit of gazing at Eq. 11.11 might reveal you an ultimate truth about the structure
of this expression: a sum of products of various distinct combinations of nine

elements
ˇ̌
ˇ˛. j/i

E
where each index takes three different values is grouped in three with

alternating positive and negative signs. Some digging in your associative memory
will bring to the surface that this is nothing but a determinant of a matrix whose rows
are the three participating orbitals with different particles assigned to each row:

$$|\alpha_1,\alpha_2,\alpha_3\rangle = \begin{vmatrix} |\alpha_1^{(1)}\rangle & |\alpha_1^{(2)}\rangle & |\alpha_1^{(3)}\rangle \\ |\alpha_2^{(1)}\rangle & |\alpha_2^{(2)}\rangle & |\alpha_2^{(3)}\rangle \\ |\alpha_3^{(1)}\rangle & |\alpha_3^{(2)}\rangle & |\alpha_3^{(3)}\rangle \end{vmatrix} \qquad (11.13)$$

354 11 Non-interacting Many-Particle Systems

where on the left of this equation I introduced the notation |α₁, α₂, α₃⟩, which contains all the information you need to know about the state presented on the right, namely, that this three-fermion state is formed by distributing three particles among orbitals |α₁⟩, |α₂⟩, and |α₃⟩. The right-hand side of this expression gives you a good mnemonic rule for how to combine these three orbitals into an antisymmetric three-fermion state. Arranging the orbitals into a determinant makes the antisymmetry of the corresponding state obvious: the exchange of particles becomes mathematically equivalent to the interchange of the columns of the determinant, and this operation is well known to reverse its sign.
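The sign flip under particle exchange can also be checked numerically. Here is a small sketch using three toy infinite-well orbitals on the unit interval (the specific functions and coordinates are arbitrary choices for illustration); it expands the determinant as a signed sum over permutations:

```python
from itertools import permutations
from math import sin, pi

# Three toy single-particle orbitals on (0, 1); any three distinct
# functions would do for this antisymmetry check.
orbitals = [lambda z, n=n: sin(n * pi * z) for n in (1, 2, 3)]

def parity(perm):
    """Sign of a permutation, computed by counting inversions."""
    inv = sum(perm[i] > perm[j]
              for i in range(len(perm)) for j in range(i + 1, len(perm)))
    return -1 if inv % 2 else 1

def slater(zs):
    """Unnormalized Slater-determinant wave function: the signed sum over
    assignments of orbitals to particles (Eq. 11.13 expanded)."""
    return sum(parity(p) * orbitals[p[0]](zs[0])
               * orbitals[p[1]](zs[1]) * orbitals[p[2]](zs[2])
               for p in permutations(range(3)))

z = (0.21, 0.47, 0.83)
swapped = (z[1], z[0], z[2])  # exchange particles 1 and 2
print(abs(slater(z) + slater(swapped)) < 1e-12)  # True: Psi changes sign
print(abs(slater((0.3, 0.3, 0.7))) < 1e-12)      # True: vanishes for coinciding particles
```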

The idea of arranging orbitals into determinants in order to construct automatically antisymmetric many-fermion states was first used independently by Heisenberg and Dirac in their 1926 papers and expressed in a more formal way by John C. Slater, an American physicist, in 1929, and for this reason these determinants bear his name. That was a time when American physicists had to travel to Europe for postdoctoral positions, and not the other way around, so after getting his Ph.D. from Harvard, Slater moved to Cambridge and then to Copenhagen before coming back to the USA and joining the Physics Department at Harvard as a faculty member.

A word of caution: the fermion states in the form of the Slater determinant are not necessarily the eigenvectors of a many-particle Hamiltonian, which, in general, can be presented in the form

$$\hat H^{(N)} = \sum_{i=1}^{N} \hat H_i + \frac{1}{2}\sum_{i=1}^{N}\sum_{j\ne i}^{N} \hat V_{i,j}, \qquad (11.14)$$

where the first term is the sum of the single-particle Hamiltonians for each particle, which includes operators of the particle's kinetic energy and might include a term describing the interaction of each particle with some external object, e.g., an electric field, while the second term describes the interaction between the particles, most frequently the Coulomb repulsion between negatively charged electrons. The factor 1/2 in front of the second term takes into account that the double summation over i and j counts the interaction between each pair of particles twice: once as $\hat V_{i,j}$ and the second time as $\hat V_{j,i}$. The principal difference between these two terms is that while each $\hat H_i$ acts only on the orbitals of "its own" particle, the interaction term acts on the orbitals of two particles. As a result, any simple tensor product of single-particle orbitals is an eigenvector of the first term of the many-particle Hamiltonian, but not of the entire Hamiltonian. Consider, for instance, the three-particle state from the previous example. Picking up just one term from Eq. 11.11, I can write

$$\left(\hat H_1 + \hat H_2 + \hat H_3\right)|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle = |\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle\,\hat H_1|\alpha_1^{(1)}\rangle + |\alpha_1^{(1)}\rangle|\alpha_3^{(3)}\rangle\,\hat H_2|\alpha_2^{(2)}\rangle + |\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle\,\hat H_3|\alpha_3^{(3)}\rangle = (E_1 + E_2 + E_3)\,|\alpha_1^{(1)}\rangle|\alpha_2^{(2)}\rangle|\alpha_3^{(3)}\rangle. \qquad (11.15)$$


Since all other terms in Eq. 11.11 feature the same three orbitals, it is obvious that all of them are eigenvectors of this Hamiltonian with the same eigenvalue, so that the entire antisymmetric three-particle state given by the Slater determinant, Eq. 11.13, is also its eigenvector. It is also clear that for any Slater determinant state, the eigenvalue of the non-interacting Hamiltonian is always the sum of the single-particle energies of the orbitals used to construct the determinant. If, however, one adds the interaction term to the picture, the situation changes, as none of the single-particle orbitals can be eigenvectors of $\hat V_{i,j}$, which acts on the states of two particles, so that the Slater determinants are no longer stationary states of a many-fermion system. This does not mean, of course, that they are useless: they form a convenient basis in the space of many-particle states, which ensures that all states represented in this basis are antisymmetric. This brings me back to Eq. 11.12, defining the dimension of this space and highlighting the main difficulty of dealing with interacting many-particle systems: the space containing the corresponding states is just too large.

Consider, for instance, an atom of carbon, with its six electrons. You can start building the basis for the six-electron space starting with the lowest energy orbitals and continuing until you have enough basis vectors. The two lowest energy orbitals correspond to principal quantum number n = 1, orbital and magnetic numbers equal to zero, and two spin numbers ±1/2: |1, 0, 0, 1/2⟩ and |1, 0, 0, −1/2⟩. This is definitely not enough for six electrons, so you need to go to orbitals with n = 2, of which there are 8: |2, 0, 0, 1/2⟩, |2, 0, 0, −1/2⟩, |2, 1, −1, 1/2⟩, |2, 1, −1, −1/2⟩, |2, 1, 0, 1/2⟩, |2, 1, 0, −1/2⟩, |2, 1, 1, 1/2⟩, and |2, 1, 1, −1/2⟩, where the notation follows the regular scheme |n, l, m, m_s⟩ (I combined the spin number m_s with the orbital quantum numbers for the sake of simplifying the notation). If I limit the space to just these ten orbitals (and it is by no means obvious that orbitals with n = 3 should not be included), the total number of basis vectors in this space will be 10!/(6!4!) = 210. It means that using the Slater determinants as a basis in this space, I will end up with the Hamiltonian of the system represented by a 210 × 210 matrix. Allowing the electrons to occupy the additional n = 3 orbitals, all 18 of them, will bring the dimensionality of the six-electron space to 376,740. I hope these examples give you a clear picture of how difficult problems with many interacting particles can be and explain why people were busy inventing a great variety of different approximate ways of dealing with them. Very often, the idea behind these methods is to replace the Hamiltonian in Eq. 11.14 by an effective Hamiltonian without an interaction term. The effects of the interaction in such approaches are always hidden in "new" single-particle Hamiltonians retaining some information about the interaction with other particles. A more detailed exposition of this issue is way beyond the scope of this book and can be found in many texts on atomic physics and quantum chemistry.
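The two dimensionalities quoted above follow directly from Eq. 11.12; a one-line check:

```python
from math import comb

# Antisymmetric basis states for 6 electrons (Eq. 11.12):
# the n = 1 and n = 2 shells supply 2 + 8 = 10 spin-orbitals;
# adding the 18 spin-orbitals with n = 3 gives 28 in total.
print(comb(10, 6))  # 210
print(comb(28, 6))  # 376740
```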

Before continuing to the next section, let me consider a few examples involving non-interacting indistinguishable particles so that you can get a better feel for quantum mechanical indistinguishability.

Example 29 (Non-interacting Particles in a Potential Well) Consider a system of three non-interacting particles in an infinite one-dimensional potential well. Assuming that the particles are (a) distinguishable spinless atoms of equal mass m_a, (b) electrons, and (c) indistinguishable spinless bosons, find the three lowest energy eigenvalues of this system, and write down the corresponding wave functions (spinors when necessary).

Solution

(a) In the case of three distinguishable atoms, no symmetry requirements can be imposed on the three-particle wave function, so the ground state energy corresponds to a state in which all three atoms are in the same single-particle ground state orbital:

$$\psi^{(3)}_{1}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\,\sin\frac{\pi z_2}{L}\,\sin\frac{\pi z_3}{L} \qquad (11.16)$$

with corresponding energy

$$E_{1,1,1} = \frac{3\hbar^{2}\pi^{2}}{2 m_a L^{2}}. \qquad (11.17)$$

The second energy level corresponds to moving one of the atoms to the second single-particle orbital, so that I have for the three degenerate three-particle states

$$\psi^{(3)}_{2,1}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{2\pi z_1}{L}\,\sin\frac{\pi z_2}{L}\,\sin\frac{\pi z_3}{L}$$
$$\psi^{(3)}_{2,2}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\,\sin\frac{2\pi z_2}{L}\,\sin\frac{\pi z_3}{L} \qquad (11.18)$$
$$\psi^{(3)}_{2,3}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\,\sin\frac{\pi z_2}{L}\,\sin\frac{2\pi z_3}{L}$$

with the corresponding energy

$$E_{2,1,1} = \frac{6\hbar^{2}\pi^{2}}{2 m_a L^{2}}. \qquad (11.19)$$

Finally, the next lowest energy corresponds to two particles moved to the second single-particle level, with the wave functions and triply degenerate energy level given by

$$\psi^{(3)}_{3,1}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{2\pi z_1}{L}\,\sin\frac{2\pi z_2}{L}\,\sin\frac{\pi z_3}{L}$$
$$\psi^{(3)}_{3,2}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{\pi z_1}{L}\,\sin\frac{2\pi z_2}{L}\,\sin\frac{2\pi z_3}{L} \qquad (11.20)$$
$$\psi^{(3)}_{3,3}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\sin\frac{2\pi z_1}{L}\,\sin\frac{\pi z_2}{L}\,\sin\frac{2\pi z_3}{L}$$

$$E_{2,2,1} = \frac{9\hbar^{2}\pi^{2}}{2 m_a L^{2}}. \qquad (11.21)$$
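In units of $\hbar^2\pi^2/(2m_aL^2)$ the single-particle energies are simply $n^2$, so the three lowest levels and their degeneracies can be recovered by brute-force enumeration (a sketch; the cutoff range(1, 5) is an arbitrary but sufficient choice):

```python
from itertools import product
from collections import Counter

# Energies of three distinguishable particles in an infinite well,
# in units of hbar^2 pi^2 / (2 m_a L^2); each particle contributes n^2.
levels = Counter(n1**2 + n2**2 + n3**2
                 for n1, n2, n3 in product(range(1, 5), repeat=3))
lowest = sorted(levels.items())[:3]
print(lowest)  # [(3, 1), (6, 3), (9, 3)] -> Eqs. 11.17, 11.19, 11.21
```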

(b) Electrons are indistinguishable fermions, so their many-particle states must be antisymmetric. The single-particle orbitals are spinors, formed as a tensor product of the eigenvectors of the infinite potential well and of the spin operator $\hat S_z$. For convenience, I will begin by writing down the single-particle orbitals in the symbolic form |n, m_s⟩, where n corresponds to an energy level in the infinite well and m_s is the spin magnetic number. To construct the vector representing the ground state of the three-electron system, I need to include three different orbitals with the lowest single-particle energies. Obviously these are |1, ↑⟩, |1, ↓⟩, |2, m_s⟩. The choice of the spin state in the third orbital is arbitrary, so there are two different ground states with the same energy. The respective Slater determinant becomes

$$|1,1,2\rangle = \begin{vmatrix} |1,\uparrow\rangle_1 & |1,\uparrow\rangle_2 & |1,\uparrow\rangle_3 \\ |1,\downarrow\rangle_1 & |1,\downarrow\rangle_2 & |1,\downarrow\rangle_3 \\ |2,\uparrow\rangle_1 & |2,\uparrow\rangle_2 & |2,\uparrow\rangle_3 \end{vmatrix}$$

where the lower subindex enumerates the electrons, and I chose for concreteness the spin-up state for the spin portion of the third orbital. The notation |1, 1, 2⟩ for the three-electron state was chosen in a form that reflects the eigenvectors of the infinite potential well "occupied"² by electrons in this state. Expanding the determinant and pulling out the spin number into a separate ket, I have

$$|1,1,2\rangle = |1\rangle_1|\uparrow\rangle_1\,|1\rangle_2|\downarrow\rangle_2\,|2\rangle_3|\uparrow\rangle_3 + |1\rangle_1|\downarrow\rangle_1\,|2\rangle_2|\uparrow\rangle_2\,|1\rangle_3|\uparrow\rangle_3 + |2\rangle_1|\uparrow\rangle_1\,|1\rangle_2|\uparrow\rangle_2\,|1\rangle_3|\downarrow\rangle_3 - |2\rangle_1|\uparrow\rangle_1\,|1\rangle_2|\downarrow\rangle_2\,|1\rangle_3|\uparrow\rangle_3 - |1\rangle_1|\downarrow\rangle_1\,|1\rangle_2|\uparrow\rangle_2\,|2\rangle_3|\uparrow\rangle_3 - |1\rangle_1|\uparrow\rangle_1\,|2\rangle_2|\uparrow\rangle_2\,|1\rangle_3|\downarrow\rangle_3.$$

2“Occupied” in this context means that a given orbital participates in the formation of a given
many-particle state.


Bringing back the position representation of the eigenvectors of the well, the last result can be written down as

$$|1,1,2\rangle^{(1)} = \left(\sqrt{\frac{2}{L}}\right)^{3}\Bigg[\begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix} + \begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} + \begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix} - \begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} - \begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix} - \begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix}\Bigg]. \qquad (11.22)$$

To get a bit more comfortable with this expression, let's apply the operator

$$\hat H = \hat H^{(1)} + \hat H^{(2)} + \hat H^{(3)},$$

where $\hat H^{(i)}$ is a single-electron infinite-potential-well Hamiltonian, which in the spinor representation is proportional to the unit matrix:

$$\hat H\,|1,1,2\rangle^{(1)} = \left(\sqrt{\frac{2}{L}}\right)^{3}\Big\{\big[\hat H^{(1)}\chi_1^{\uparrow}(z_1)\big]\chi_1^{\downarrow}(z_2)\chi_2^{\uparrow}(z_3) + \chi_1^{\uparrow}(z_1)\big[\hat H^{(2)}\chi_1^{\downarrow}(z_2)\big]\chi_2^{\uparrow}(z_3) + \chi_1^{\uparrow}(z_1)\chi_1^{\downarrow}(z_2)\big[\hat H^{(3)}\chi_2^{\uparrow}(z_3)\big]$$
$$+ \big[\hat H^{(1)}\chi_1^{\downarrow}(z_1)\big]\chi_2^{\uparrow}(z_2)\chi_1^{\uparrow}(z_3) + \chi_1^{\downarrow}(z_1)\big[\hat H^{(2)}\chi_2^{\uparrow}(z_2)\big]\chi_1^{\uparrow}(z_3) + \chi_1^{\downarrow}(z_1)\chi_2^{\uparrow}(z_2)\big[\hat H^{(3)}\chi_1^{\uparrow}(z_3)\big]$$
$$+ \big[\hat H^{(1)}\chi_2^{\uparrow}(z_1)\big]\chi_1^{\uparrow}(z_2)\chi_1^{\downarrow}(z_3) + \chi_2^{\uparrow}(z_1)\big[\hat H^{(2)}\chi_1^{\uparrow}(z_2)\big]\chi_1^{\downarrow}(z_3) + \chi_2^{\uparrow}(z_1)\chi_1^{\uparrow}(z_2)\big[\hat H^{(3)}\chi_1^{\downarrow}(z_3)\big]$$
$$- \big[\hat H^{(1)}\chi_2^{\uparrow}(z_1)\big]\chi_1^{\downarrow}(z_2)\chi_1^{\uparrow}(z_3) - \chi_2^{\uparrow}(z_1)\big[\hat H^{(2)}\chi_1^{\downarrow}(z_2)\big]\chi_1^{\uparrow}(z_3) - \chi_2^{\uparrow}(z_1)\chi_1^{\downarrow}(z_2)\big[\hat H^{(3)}\chi_1^{\uparrow}(z_3)\big]$$
$$- \big[\hat H^{(1)}\chi_1^{\downarrow}(z_1)\big]\chi_1^{\uparrow}(z_2)\chi_2^{\uparrow}(z_3) - \chi_1^{\downarrow}(z_1)\big[\hat H^{(2)}\chi_1^{\uparrow}(z_2)\big]\chi_2^{\uparrow}(z_3) - \chi_1^{\downarrow}(z_1)\chi_1^{\uparrow}(z_2)\big[\hat H^{(3)}\chi_2^{\uparrow}(z_3)\big]$$
$$- \big[\hat H^{(1)}\chi_1^{\uparrow}(z_1)\big]\chi_2^{\uparrow}(z_2)\chi_1^{\downarrow}(z_3) - \chi_1^{\uparrow}(z_1)\big[\hat H^{(2)}\chi_2^{\uparrow}(z_2)\big]\chi_1^{\downarrow}(z_3) - \chi_1^{\uparrow}(z_1)\chi_2^{\uparrow}(z_2)\big[\hat H^{(3)}\chi_1^{\downarrow}(z_3)\big]\Big\},$$

where, to keep the expression readable, I have abbreviated the spin-up and spin-down spinor orbitals as $\chi_n^{\uparrow}(z) \equiv \begin{pmatrix}\sin\frac{n\pi z}{L}\\ 0\end{pmatrix}$ and $\chi_n^{\downarrow}(z) \equiv \begin{pmatrix}0\\ \sin\frac{n\pi z}{L}\end{pmatrix}$.

I understand that this expression looks awfully intimidating (or just awful), but I still want you to gather your wits and go through it line by line, and may the force be with you. The first thing you should notice is that every single-particle Hamiltonian affects only those orbitals that contain its own particle. Now, remembering that each of the orbitals is an eigenvector of the corresponding Hamiltonian, you can rewrite the above expression as

$$\hat H\,|1,1,2\rangle^{(1)} = \left(\sqrt{\frac{2}{L}}\right)^{3}\Big[(E_1+E_1+E_2)\,\chi_1^{\uparrow}(z_1)\chi_1^{\downarrow}(z_2)\chi_2^{\uparrow}(z_3) + (E_1+E_2+E_1)\,\chi_1^{\downarrow}(z_1)\chi_2^{\uparrow}(z_2)\chi_1^{\uparrow}(z_3) + (E_2+E_1+E_1)\,\chi_2^{\uparrow}(z_1)\chi_1^{\uparrow}(z_2)\chi_1^{\downarrow}(z_3) - (E_2+E_1+E_1)\,\chi_2^{\uparrow}(z_1)\chi_1^{\downarrow}(z_2)\chi_1^{\uparrow}(z_3) - (E_1+E_1+E_2)\,\chi_1^{\downarrow}(z_1)\chi_1^{\uparrow}(z_2)\chi_2^{\uparrow}(z_3) - (E_1+E_2+E_1)\,\chi_1^{\uparrow}(z_1)\chi_2^{\uparrow}(z_2)\chi_1^{\downarrow}(z_3)\Big],$$

where $\chi_n^{\uparrow}(z) \equiv \begin{pmatrix}\sin\frac{n\pi z}{L}\\ 0\end{pmatrix}$, $\chi_n^{\downarrow}(z) \equiv \begin{pmatrix}0\\ \sin\frac{n\pi z}{L}\end{pmatrix}$, and each single-particle Hamiltonian has been replaced by the energy eigenvalue of the orbital of its own particle,

with $E_{1,2}$ the energy eigenvalues corresponding to the eigenvectors |1⟩ and |2⟩ of the infinite potential well. Combining the like terms (terms with the same combination of single-particle orbitals), you will find

$$\hat H\,|1,1,2\rangle = (2E_1 + E_2)\,|1,1,2\rangle.$$

The second eigenvector belonging to this eigenvalue can be generated by changing the spin state paired with the orbital state |2⟩ from spin-up to spin-down, which yields

$$|1,1,2\rangle^{(2)} = \left(\sqrt{\frac{2}{L}}\right)^{3}\Bigg[\begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_3}{L}\end{pmatrix} + \begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} + \begin{pmatrix}0\\ \sin\frac{2\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix} - \begin{pmatrix}0\\ \sin\frac{2\pi z_1}{L}\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} - \begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_3}{L}\end{pmatrix} - \begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_2}{L}\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix}\Bigg].$$

To get the next energy level and the corresponding eigenvector, I just need to move one of the particles to the orbital |2⟩|m_s⟩, which means that the Slater determinant is now formed by the orbitals |1, m_s⟩, |2, ↓⟩, |2, ↑⟩ with an arbitrary value of the spin state in the single-particle ground state orbital. Using for concreteness the spin-up value in |1, m_s⟩, I can write

$$|1,2,2\rangle^{(1)} = \left(\sqrt{\frac{2}{L}}\right)^{3}\Bigg[\begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_3}{L}\end{pmatrix} + \begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} + \begin{pmatrix}0\\ \sin\frac{2\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix} - \begin{pmatrix}0\\ \sin\frac{2\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} - \begin{pmatrix}\sin\frac{2\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_3}{L}\end{pmatrix} - \begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{2\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{2\pi z_3}{L}\\ 0\end{pmatrix}\Bigg].$$

The energy corresponding to this state is E_{2,2,1} = E_1 + 2E_2 and coincides with Eq. 11.21 for the energy of the second excited state in the system of distinguishable particles. Finally, to generate the next lowest energy level, note that the cheapest remaining option is to keep the two ground state orbitals |1, ↑⟩ and |1, ↓⟩ and promote the third electron to the n = 3 level of the well (the configuration built on |2, ↑⟩, |2, ↓⟩, |3, m_s⟩ costs 2E_2 + E_3, which is higher than 2E_1 + E_3). The Slater determinant is then formed by the orbitals |1, ↑⟩, |1, ↓⟩, |3, m_s⟩, and the arbitrary choice of the spin state in the third orbital again results in two degenerate eigenvectors, one of which is shown below:


$$|1,1,3\rangle^{(1)} = \left(\sqrt{\frac{2}{L}}\right)^{3}\Bigg[\begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{3\pi z_3}{L}\\ 0\end{pmatrix} + \begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{3\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} + \begin{pmatrix}\sin\frac{3\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix} - \begin{pmatrix}\sin\frac{3\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_2}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_3}{L}\\ 0\end{pmatrix} - \begin{pmatrix}0\\ \sin\frac{\pi z_1}{L}\end{pmatrix}\begin{pmatrix}\sin\frac{\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{3\pi z_3}{L}\\ 0\end{pmatrix} - \begin{pmatrix}\sin\frac{\pi z_1}{L}\\ 0\end{pmatrix}\begin{pmatrix}\sin\frac{3\pi z_2}{L}\\ 0\end{pmatrix}\begin{pmatrix}0\\ \sin\frac{\pi z_3}{L}\end{pmatrix}\Bigg].$$

(I derived this expression by simply replacing $\sin\frac{2\pi z_i}{L}$ everywhere in $|1,1,2\rangle^{(1)}$ with $\sin\frac{3\pi z_i}{L}$.) The respective energy value is given by

$$E_{1,1,3} = 2E_1 + E_3 = \frac{11\hbar^{2}\pi^{2}}{2 m_e L^{2}}.$$

(c) Now, let me deal with the system of three identical spinless bosons. The symmetry requirement for the three-boson system allows using all identical orbitals (the resulting state is automatically symmetric); thus, the ground state can be built of a single orbital |1⟩ and turns out to be the same as in the case of distinguishable particles, with the same energy value (Eq. 11.17). A difference from distinguishable particles arises when transitioning to excited states. Now, to satisfy the symmetry requirements, I have to turn the three degenerate states of Eqs. 11.18 and 11.20, with energies given by Eqs. 11.19 and 11.21, into single non-degenerate states:

$$\psi^{(3)}_{2,1,1}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\left[\sin\frac{2\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{\pi z_3}{L} + \sin\frac{\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{\pi z_3}{L} + \sin\frac{\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{2\pi z_3}{L}\right]$$

$$\psi^{(3)}_{2,2,1}(z_1,z_2,z_3) = \left(\sqrt{\frac{2}{L}}\right)^{3}\left[\sin\frac{2\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{\pi z_3}{L} + \sin\frac{\pi z_1}{L}\sin\frac{2\pi z_2}{L}\sin\frac{2\pi z_3}{L} + \sin\frac{2\pi z_1}{L}\sin\frac{\pi z_2}{L}\sin\frac{2\pi z_3}{L}\right].$$
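The effect of statistics on the spectrum in this example can be summarized by enumerating the allowed occupations for each of the three cases, again in units of $\hbar^2\pi^2/(2mL^2)$ (a sketch with an arbitrary level cutoff):

```python
from itertools import product, combinations, combinations_with_replacement

E = lambda n: n**2  # single-particle energy in units of hbar^2 pi^2/(2 m L^2)
ns = range(1, 5)

# (a) distinguishable spinless atoms: any assignment of levels
dist = sorted({sum(map(E, occ)) for occ in product(ns, repeat=3)})
# (b) electrons: three *distinct* spin-orbitals (n, spin), per the Pauli principle
spin_orbitals = [(n, s) for n in ns for s in ('up', 'down')]
ferm = sorted({sum(E(n) for n, _ in occ)
               for occ in combinations(spin_orbitals, 3)})
# (c) spinless bosons: any multiset of levels (states are symmetrized)
bose = sorted({sum(map(E, occ)) for occ in combinations_with_replacement(ns, 3)})

print(dist[:3])  # [3, 6, 9]
print(ferm[:3])  # [6, 9, 11] -> 2E1+E2, E1+2E2, 2E1+E3
print(bose[:3])  # [3, 6, 9]
```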


11.3 Pauli Principle and Periodic Table of Elements: Electronic Structure of Atoms

While we are not equipped to deal with systems of large numbers of interacting particles, you can still appreciate how Pauli's idea of the exclusion principle helped to explain the periodicity in the properties of the atoms. In order to follow the arguments, you need to keep in mind two important points. First, when discussing the chemical properties of atoms, people are interested foremost in the many-particle ground state, i.e., the state of many electrons which has the lowest possible energy. Second, since the Pauli principle forbids states in which two electrons occupy the same orbital, you have to build many-particle states using at least as many orbitals as there are particles in your system, starting with the ground state orbitals and adding new orbitals in a way that minimizes the unavoidable increase of the sum of the single-particle energies of all involved electrons. This last point implicitly assumes that the lowest energy of non-interacting particles would remain the lowest energy even when the interaction is taken into account. This assumption is not always true, but the discussion of this issue is beyond the scope of this book. Anyway, having these two points in mind, let's consider what happens with the states of electrons as we move along the periodic table. Helium occupies the second place in the first row and is known as an inert gas, meaning that it is very stable and is not eager to participate in chemical reactions or form chemical bonds. It has two electrons, and therefore you need only two orbitals, both of which can have the same value of the principal number n = 1, to construct a two-electron state:

$$|1,0,0,1/2\rangle_1\,|1,0,0,-1/2\rangle_2 - |1,0,0,1/2\rangle_2\,|1,0,0,-1/2\rangle_1.$$

These two orbitals exhaust all available states with the same principal number. In chemical language, we can say the electrons in the helium atom belong to a complete, or closed, shell. Going to the next atom, lithium (Li), you will notice that it has very different chemical properties: lithium is an active alkali metal, which readily participates in a variety of chemical reactions and forms a number of different compounds, gladly offering one of its electrons for chemical bonding. Three lithium electrons need more than two orbitals to form a three-electron state, so you must start dealing with orbitals characterized by the principal number n = 2. There are eight of them, but only one is actually required to form the lowest energy three-electron state, and as a result seven of those orbitals remain, in physicist's jargon, "unoccupied." As you go along the second row of the periodic table, the number of electrons increases to four in the case of beryllium, five for boron, six for carbon, seven for nitrogen, eight for oxygen, nine for fluorine, and finally ten for neon. With an increasing number of electrons, you must add additional orbitals to be able to create the corresponding many-electron states, so that the number of "unoccupied" orbitals decreases. As the number of available, unused orbitals gets smaller, the chemical activity of the corresponding substances diminishes, until you reach another inert gas, neon. To construct a many-electron state for neon


Table 11.1 Elements of the second row of the periodic table and electronic configurations of their ground states in terms of single-electron orbitals and the term symbols

Element | Configuration | Term symbol
Li (Z = 3) | $1s^2 2s^1$ | $^2S_{1/2}$
Be (Z = 4) | $1s^2 2s^2$ | $^1S_0$
B (Z = 5) | $1s^2 2s^2 2p^1$ | $^2P_{1/2}$
C (Z = 6) | $1s^2 2s^2 2p^2$ | $^3P_0$
N (Z = 7) | $1s^2 2s^2 2p^3$ | $^4S_{3/2}$
O (Z = 8) | $1s^2 2s^2 2p^4$ | $^3P_2$
F (Z = 9) | $1s^2 2s^2 2p^5$ | $^2P_{3/2}$
Ne (Z = 10) | $1s^2 2s^2 2p^6$ | $^1S_0$

with ten electrons, you have to use all ten available orbitals with n = 1 and n = 2. Consequently, the electron structure of neon is again characterized as a closed shell configuration. A popular way to visualize this process of filling up the available orbitals consists in assigning numbers 1, 2, ... to the principal quantum number n, and letters s, p, d, and f to orbitals with orbital angular momentum number l equal to 0, 1, 2, and 3, respectively. The configuration of helium in this notation, primarily used in atomic physics and quantum chemistry, would be 1s², where the first number stands for the principal number, and the upper index indicates the number of electrons assigned to orbitals with l = 0. The electronic structure of the elements in the second row of the periodic table discussed above is shown in Table 11.1.

You can see from this table that the l = 0 orbitals are added first to the list of available single-electron states, and only after that are the additional six orbitals with l = 1 and different values of m and m_s thrown in. The supposition here is that single-electron states with l = 0 contribute less energy than the l = 1 states³; therefore, these orbitals must be incorporated into the basis first. The assumption that orbitals with larger n and larger l contribute more energy, and, therefore, must be added only after the orbitals with lower values of these numbers are filled, is not always correct: for some elements, orbitals with lower l and higher n contribute less energy than orbitals with higher l and lower n. This happens, for instance, with the orbital 4s, which contributes less energy than the orbital 3d, but there are no simple hand-waving arguments that could explain or predict this behavior. Anyway, going now to the third row of the periodic table, you again start with a new set of orbitals characterized by n = 3, plenty of which are available for the 11 electrons of the first element, another alkali metal, sodium. I think you get the gist of how this works, but on the other hand, you should be aware that this line of argument is still a gross oversimplification: the periodic table of elements is not that periodic in some instances, and there are lots of elements that do not fit this simple model of closed shells.

Single-electron orbitals |n, l, m, m_s⟩ based on eigenvectors of the operators of orbital and spin angular momenta are not the only way to characterize the ground states of atoms. An alternative approach is based on using eigenvectors of the total orbital angular momentum $\hat L^{(tot)} = \sum_i \hat L^{(i)}$ (the sum of the orbital momenta of all electrons), the total spin of all electrons $\hat S^{(tot)} = \sum_i \hat S^{(i)}$, and the grand total momentum $\hat J = \hat L^{(tot)} + \hat S^{(tot)}$.

³I have to remind you that while the hydrogen energy levels are degenerate with respect to l, for other atoms this is not true because of the interaction with other electrons.

Properties of the sum of two arbitrary angular momentum operators, $\hat J^{(1)}$ and $\hat J^{(2)}$, can be figured out by generalizing the results for the sum of two spins, or of spin 1/2 and the orbital angular momentum, presented in Chap. 9. The eigenvectors of the operator $\big(\hat J^{(1)} + \hat J^{(2)}\big)^2$ are characterized by the quantum number j, which can take the values

$$|j_1 - j_2| \le j \le j_1 + j_2, \qquad (11.23)$$

where j₁ and j₂ refer to the eigenvalues of $\big(\hat J^{(1)}\big)^2$ and $\big(\hat J^{(2)}\big)^2$, respectively. For each j, the eigenvalues of $\hat J_z^{(1)} + \hat J_z^{(2)}$ are characterized by magnetic numbers $M_j$ obeying the usual inequality $|M_j| \le j$ and related to the individual magnetic numbers $m_{j_1}$ and $m_{j_2}$ of $\hat J_z^{(1)}$ and $\hat J_z^{(2)}$, respectively, as

$$M_j = m_{j_1} + m_{j_2}. \qquad (11.24)$$

While Eq. 11.24 can be easily derived, proving Eq. 11.23 is a bit more than you can chew at this stage, but you may at least verify that it agrees with the cases considered in Chap. 9: for two 1/2 spins, Eq. 11.23 gives two values for j, namely j = 1, 0, in agreement with Eqs. 9.54 and 9.55, and for the sum of the orbital momentum and the 1/2 spin, Eq. 11.23 yields j = l ± 1/2, again in agreement with Sect. 9.5.2.
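A quick consistency check of Eq. 11.23 is that the multiplets it predicts must exactly exhaust the $(2j_1+1)(2j_2+1)$-dimensional product space of the two momenta; a sketch for integer momenta:

```python
# Dimensions of the j-multiplets predicted by Eq. 11.23 for integer j1, j2.
def multiplet_dims(j1, j2):
    return [2*j + 1 for j in range(abs(j1 - j2), j1 + j2 + 1)]

for j1, j2 in [(1, 1), (2, 1), (3, 2)]:
    dims = multiplet_dims(j1, j2)
    # The multiplet dimensions must add up to the product-space dimension.
    print(j1, j2, dims, sum(dims) == (2*j1 + 1) * (2*j2 + 1))
```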

The transition from the description of many-fermion states in terms of single-particle orbitals to the basis formed by eigenvectors of the total orbital momentum, total spin, and grand total angular momentum raises an important issue: the separate symmetry properties of the many-particle orbital and spin states. Consider again for simplicity two fermions that can individually be in orbital states |ψ₁⟩ and |ψ₂⟩ and spin states |↑⟩ and |↓⟩. In the description where spin and orbital states are lumped together into one single-particle orbital (this is what I did writing equations such as Eq. 11.11 or 11.13), I would have introduced four single-electron orbitals |αᵢ⟩:

$$|\alpha_1\rangle \equiv |\psi_1,\uparrow\rangle; \quad |\alpha_2\rangle \equiv |\psi_2,\uparrow\rangle; \quad |\alpha_3\rangle \equiv |\psi_1,\downarrow\rangle; \quad |\alpha_4\rangle \equiv |\psi_2,\downarrow\rangle$$

and used them as a basis in a 4!/(2!2!) = 6-dimensional two-fermion space. If, however, I preferred to use eigenvectors of the total spin of the two particles as a basis in the spin sector of the total spin-orbital two-particle space, separating thereby the orbital and spin states, I would have to make sure that both the former and the latter components separately possess a definite parity. The four eigenvectors of the total spin of two spin-1/2 particles, indeed, contain a symmetric triplet |1, M_S⟩ of states with total S^(tot) = 1 (see Eq. 9.54) and one antisymmetric singlet state (Eq. 9.55) with total S^(tot) = 0. Thus, if I take these states as the spin components


of the total basis of two-particle fermion states, then the symmetry of the spin component will dictate the symmetry of the orbital portion. Indeed, to make the entire two-fermion state antisymmetric, the orbital component paired with any of the symmetric two-spin states |1, M_S⟩ must itself be antisymmetric. The two available orbital states can only yield a single antisymmetric combination, resulting in three basis vectors characterized by the value of total spin S^(tot) = 1:

$$\frac{1}{\sqrt 2}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle - |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|1,-1\rangle$$
$$\frac{1}{\sqrt 2}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle - |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|1,0\rangle \qquad (11.25)$$
$$\frac{1}{\sqrt 2}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle - |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|1,1\rangle,$$

where the $1/\sqrt{2}$ factor ensures the normalization of the vector representing the orbital portion of the state. The remaining total spin eigenvector, corresponding to S = 0, is the antisymmetric singlet |0, 0⟩. Consequently, the corresponding orbital part of the two-particle state must be symmetric, resulting in three additional possible states:

$$|\psi_1^{(1)}\rangle|\psi_1^{(2)}\rangle\,|0,0\rangle$$
$$|\psi_2^{(1)}\rangle|\psi_2^{(2)}\rangle\,|0,0\rangle \qquad (11.26)$$
$$\frac{1}{\sqrt 2}\left[|\psi_1^{(1)}\rangle|\psi_2^{(2)}\rangle + |\psi_2^{(1)}\rangle|\psi_1^{(2)}\rangle\right]|0,0\rangle.$$

You may notice that the first two of these states are formed by identical orbitals. This is not forbidden by the Pauli principle because the spin state of the two electrons in this case is antisymmetric. This situation is often described by saying that the two electrons in the same orbital state have "opposite" spins, which is not exactly accurate. Indeed, "opposite" can refer only to the possible values of the z-component of spin, but those have opposite values in the singlet state as well as in the triplet state with M_s = 0. Thus, it is more accurate to describe this situation as a total spin zero, or singlet, state. Combining the three spin-antisymmetric states, Eq. 11.26, with the three orbital-antisymmetric states, Eq. 11.25, you find that the total number of basis vectors in this representation is the same (six) as in the single-particle orbital basis, confirming that this is just an alternative basis in the same vector space.

A more realistic example of a basis based on the separation of many-fermion spin and orbital states would include two particles and at least three single-particle orbital states corresponding to l = 1, m = −1, 0, 1. The total orbital angular momentum of two electrons in this case can take three values, L = 0, 1, 2, with the total number of corresponding states being 1 + 3 + 5 = 9 with various values of the magnetic number M. To figure out the symmetry of these states, you would need to present them as linear combinations of single-particle states using the Clebsch-Gordan coefficients, similar to what I did in Sect. 9.5.2:

$$|L, l_1, l_2, M\rangle = \sum_{m_1,m_2} C^{L,l_1,l_2}_{M,m_1,m_2}\,|l_1,m_1\rangle|l_2,m_2\rangle\,\delta_{m_2,M-m_1}, \qquad (11.27)$$

where Kronecker's delta makes sure that Eq. 11.24 is respected. The particle-exchange symmetry of the states presented by |L, l₁, l₂, M⟩ is determined by the transformation rule of the Clebsch-Gordan coefficients with respect to the transposition of the indexes l₁, m₁ and l₂, m₂, which you will have to accept without proof:

$$C^{L,l_1,l_2}_{M,m_1,m_2} = (-1)^{L-l_1-l_2}\, C^{L,l_2,l_1}_{M,m_2,m_1}. \qquad (11.28)$$

Indeed, applying the exchange operator $\hat P(1,2)$ to Eq. 11.27, you will see that its action on the right-hand side of the equation consists in the interchange of the indexes l₁ and l₂ in the Clebsch-Gordan coefficients:

$$\hat P(1,2)\,|L,l_1,l_2,M\rangle = \sum_{m_1,m_2} C^{L,l_2,l_1}_{M,m_2,m_1}\,|l_1,m_1\rangle|l_2,m_2\rangle\,\delta_{m_1,M-m_2} = (-1)^{L-l_1-l_2}\sum_{m_1,m_2} C^{L,l_1,l_2}_{M,m_1,m_2}\,|l_1,m_1\rangle|l_2,m_2\rangle\,\delta_{m_2,M-m_1} = (-1)^{L-l_1-l_2}\,|L,l_1,l_2,M\rangle.$$

In the second line of this expression, I used the transposition property of $C^{L,l_1,l_2}_{M,m_1,m_2}$, Eq. 11.28. With this it becomes quite evident that the state |L, l₁, l₂, M⟩ is symmetric with respect to the exchange of particles if L − l₁ − l₂ is even and antisymmetric if L − l₁ − l₂ is odd. In the example with l₁ = l₂ = 1, which I am trying to figure out now, this rule yields that the states with L = 2 and L = 0 are symmetric, while the state with L = 1 is antisymmetric. Correspondingly, the latter must be paired with a triplet spin state, while the former two must go together with the zero-spin state. Since the total number of single-electron orbitals in this case is 6, the expected number of two-particle antisymmetric basis vectors is 6!/(4!2!) = 15, and if you insist I can list all of them below (I will use a simplified notation |L, M⟩ omitting l₁ and l₂):

$$|2,M\rangle\,|0,0\rangle$$
$$|1,M\rangle\,|1,M_s\rangle \qquad (11.29)$$
$$|0,0\rangle\,|0,0\rangle.$$

The first line in this expression contains five vectors with −2 ≤ M ≤ 2, the second line represents 3 × 3 = 9 vectors with both M and M_s taking three values each, and, finally, the last line supplies the 15th and last vector of the basis.
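The bookkeeping here can be double-checked in two independent ways, from the orbital count of Eq. 11.12 and from the (L, S) classification:

```python
from math import comb

# Dimension of the antisymmetric two-electron space built on six
# spin-orbitals (three l = 1 values of m, two spin projections):
print(comb(6, 2))  # 15

# The same count from the (L, S) classification: the symmetric L = 0, 2
# states pair with the spin singlet; the antisymmetric L = 1 states
# pair with the spin triplet.
count = (2*2 + 1) * 1 + (2*1 + 1) * 3 + (2*0 + 1) * 1
print(count)  # 15
```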


Finally, to complete the picture, I can rewrite these vectors in terms of the grand total momentum $\hat J$. The first five vectors from the expression above obviously correspond to j = 2, so one can easily replace this line with vectors |2, M_J⟩, where the first number now corresponds to the value of j. The nine vectors from the second line correspond to three values of j: j = 2, 1, 0. While this situation appears terribly similar to the case of the l₁ = 1 and l₂ = 1 states considered previously, the significant difference is that the vectors |1, M⟩|1, M_s⟩ are no longer associated with just one or another particle, so Eq. 11.28 has no relation to the symmetry properties of the resulting states |j, 1, 1, M_J⟩ with respect to the exchange of the particles. All these states remain as antisymmetric under the operator $\hat P(1,2)$ as the states |1, M⟩|1, M_s⟩. The last line in Eq. 11.29 obviously corresponds to a single state with zero grand total angular momentum, which simply coincides with |0, 0⟩|0, 0⟩. In summary, the antisymmetric basis in terms of eigenvectors of the operators $\hat J^2$, $\big(\hat L^{(tot)}\big)^2$, $\big(\hat S^{(tot)}\big)^2$, and $\hat J_z$ is formed by the vectors |j, L, S, M_J⟩:

$$|2,2,0,M_J\rangle,\ |2,1,1,M_J\rangle,\ |1,1,1,M_J\rangle,\ |0,1,1,0\rangle,\ |0,0,0,0\rangle. \qquad (11.30)$$

It is easy to check that this basis also consists of $5+5+3+1+1 = 15$ vectors. They can be expressed as linear combinations of eigenvectors of the total orbital and total spin momenta (Eq. 11.29) with the help of Eq. 11.27 and the same Clebsch–Gordan coefficients, which can always be found on the Internet. Just to illustrate this point, let me do it for the grand total eigenvector $|2,1,1,0\rangle$, using one of the tables of Clebsch–Gordan coefficients that Google dug out for me in the depths of the World Wide Web:

$$|2,1,1,0\rangle = \sqrt{\frac{1}{6}}\,|1,1\rangle\,|1,-1\rangle + \sqrt{\frac{1}{6}}\,|1,-1\rangle\,|1,1\rangle + \sqrt{\frac{2}{3}}\,|1,0\rangle\,|1,0\rangle.$$
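You do not have to trust the table: applying the total $\hat J^2$ to this combination and checking that it is an eigenstate with eigenvalue $j(j+1) = 6$ is a short exercise in code. Below is a self-contained sketch (units of $\hbar = 1$; the state is stored as a dictionary mapping $(m_1, m_2)$ to amplitudes):

```python
import math

J = 1  # both factors carry angular momentum j = 1

def ladder(m, up):
    """Return (coefficient, m') for J±|1, m⟩ = c |1, m'⟩, with ħ = 1."""
    mp = m + 1 if up else m - 1
    if abs(mp) > J:
        return 0.0, mp
    return math.sqrt(J*(J + 1) - m*mp), mp

def apply_J2(state):
    """Apply J² = J1² + J2² + 2 J1z J2z + J1+J2- + J1-J2+ to {(m1, m2): amp}."""
    out = {}
    def add(key, amp):
        out[key] = out.get(key, 0.0) + amp
    for (m1, m2), a in state.items():
        add((m1, m2), a * (2*J*(J + 1) + 2*m1*m2))      # diagonal part
        c1, m1u = ladder(m1, True); c2, m2d = ladder(m2, False)
        add((m1u, m2d), a * c1 * c2)                     # J1+ J2-
        c1, m1d = ladder(m1, False); c2, m2u = ladder(m2, True)
        add((m1d, m2u), a * c1 * c2)                     # J1- J2+
    return {k: v for k, v in out.items() if abs(v) > 1e-12}

# The Clebsch–Gordan combination quoted above for |2,1,1,0⟩:
psi = {(1, -1): math.sqrt(1/6), (-1, 1): math.sqrt(1/6), (0, 0): math.sqrt(2/3)}
J2psi = apply_J2(psi)
ratios = [J2psi[k] / v for k, v in psi.items()]
print(ratios)  # each ratio equals j(j+1) = 6 for j = 2
```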

Values of the total orbital, spin, and grand total momenta are often used to designate the electronic structure of atoms, instead of single-electron orbitals, in the form of the so-called term symbol:

$${}^{2S+1}L_J. \tag{11.31}$$

Here the central symbol designates the value of the total orbital momentum using the same correspondence between numerical values and letters as in the single-electron case ($S, P, D, F$ for $0, 1, 2, 3$, respectively), but with capital rather than lowercase letters. The right subscript shows the value of the grand total momentum, and the left superscript shows the multiplicity $2S+1$ of the respective energy configuration with respect to the total spin magnetic number $M_s$. For instance, in this notation the states $|1,1,1,M_J\rangle$ can be described as ${}^3P_1$, while the states $|2,2,0,M_J\rangle$ become ${}^1D_2$.
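The mapping between $(S, L, J)$ and the term symbol of Eq. 11.31 is mechanical enough to sketch in a few lines (half-integer values are handled with `Fraction`; the letter sequence skips J by spectroscopic convention):

```python
from fractions import Fraction

L_LETTERS = "SPDFGHIK"  # letters for L = 0, 1, 2, 3, ... (the letter J is skipped)

def term_symbol(S, L, J):
    """Spectroscopic term symbol (2S+1)L_J of Eq. 11.31."""
    return f"{2*S + 1}{L_LETTERS[L]}{J}"

print(term_symbol(1, 1, 1))                            # → 3P1
print(term_symbol(0, 2, 2))                            # → 1D2
print(term_symbol(Fraction(1, 2), 0, Fraction(1, 2)))  # → 2S1/2 (hydrogen, lithium)
```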

368 11 Non-interacting Many-Particle Systems

The example of two electrons and three available single-particle orbital states is more realistic than the one with only two such states, but it is still a far cry from what people have to deal with when analyzing real atoms. A system of only two electrons corresponds to the helium atom, and one needs only one orbital state with $l_{1,2} = 0$ to construct an antisymmetric two-electron ground state. In terms of eigenvectors of the total angular momentum and total spin, this state corresponds to $L = 0$, $S = 0$: $|0,0\rangle\,|0,0\rangle$, where the orbital component is symmetric (both electrons are in the same orbital state) and the spin component is antisymmetric (the spins are in the antisymmetric singlet state). The term symbol for this state is obviously ${}^1S_0$. Going from helium to lithium, you already have to deal with three electrons, with the corresponding structure in terms of single-electron orbitals shown in the first line of Table 11.1. To figure out the values of the total orbital, spin, and grand total momenta for this element, you can start with those established for the helium atom and add an additional electron, assuming that it does not disturb the existing configuration of the two electrons in the closed shell. Since we know that this electron goes to an orbital with $l = 0$, the total orbital momentum remains zero, and the total spin becomes $1/2$ (you add a single spin to a state with $S = 0$, so what else can you get?), so the grand total momentum becomes $J = 0 + 1/2 = 1/2$, and the term symbol for Li becomes the same as for hydrogen, ${}^2S_{1/2}$, emphasizing the periodicity of the electronic properties of the elements. For the same reason, the term symbol for the next element, beryllium, is exactly the same as the one we derived for helium (see Table 11.1). To figure out the term symbol for boron, ignore the two electrons in the first closed shell, which do not contribute anything to the total orbital or spin momenta, and focus on the three electrons in the second shell. For these three electrons, you have available two orbitals with the same orbital state, $l_1 = l_2 = 0$, and opposite spin states, plus an extra orbital with $l_3 = 1$ and $s_3 = 1/2$. The total orbital and spin momenta in this case can only be equal to $L = 1$ and $S = 1/2$, while the grand total momentum can be either $J_1 = 1/2$ or $J_2 = 3/2$. Thus, boron can be in one of two configurations, ${}^2P_{1/2}$ or ${}^2P_{3/2}$, but so far we have no means of figuring out which of these two configurations has the lower energy. To answer this question, we can ask for help from the German physicist Friedrich Hermann Hund, who formulated a set of empirical rules determining which term symbol describes the electron configuration with the lowest energy in an atom. These rules can be formulated as follows:

1. For a given configuration, the term with the largest total spin has the lowest energy.
2. Among the terms with the same multiplicity, the term with the largest total orbital momentum has the lowest energy.
3. For the terms with the same total spin and total orbital momentum, the value of the grand total momentum corresponding to the lowest energy is determined by the filling of the outermost shell. If the outermost shell is half-filled or less than half-filled, the term with the lowest value of the grand total momentum has the lowest energy; if the outermost shell is more than half-filled, the term with the largest value of the grand total momentum has the lowest energy.
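The three rules are a straightforward filter-and-select procedure, which the following sketch implements for a given candidate list (the candidate terms themselves are assumed known; boron's and carbon's are worked out in the surrounding text):

```python
from fractions import Fraction as F

def hund_ground_term(terms, electrons_in_shell, shell_capacity):
    """Pick the lowest-energy term among candidates [(S, L, J), ...] using
    Hund's three rules as stated above (a sketch; values may be Fractions)."""
    best_S = max(S for S, L, J in terms)                  # rule 1: largest S
    terms = [t for t in terms if t[0] == best_S]
    best_L = max(L for S, L, J in terms)                  # rule 2: largest L
    terms = [t for t in terms if t[1] == best_L]
    if electrons_in_shell <= shell_capacity / 2:          # rule 3: filling decides J
        return min(terms, key=lambda t: t[2])
    return max(terms, key=lambda t: t[2])

# Carbon: candidates 1D2, 3P2, 3P1, 3P0, 1S0; four electrons in the n = 2 shell of 8.
carbon = [(0, 2, 2), (1, 1, 2), (1, 1, 1), (1, 1, 0), (0, 0, 0)]
print(hund_ground_term(carbon, 4, 8))  # → (1, 1, 0), i.e. the 3P0 term

# Boron: candidates 2P_{1/2} and 2P_{3/2}; three electrons in the same shell.
boron = [(F(1, 2), 1, F(1, 2)), (F(1, 2), 1, F(3, 2))]
print(hund_ground_term(boron, 3, 8))   # picks the 2P_{1/2} term
```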

In the case of boron, you have to go straight to the third of Hund's rules because the first two do not disambiguate between the corresponding terms. Checking Table 11.1, you can see that the outermost shell for boron is the one characterized by the principal number $n = 2$, and the total number of single-particle orbitals in this shell is 8. Since boron has three electrons in this shell, the shell is less than half-filled, so the third Hund's rule tells you that the ground state configuration of boron is ${}^2P_{1/2}$.

The case of carbon is even more interesting. Again ignoring the two electrons with $L = 0$ and $S = 0$, I focus on the two p-electrons with $l_1 = l_2 = 1$. Speaking of the total orbital momentum and total spin, you can identify the following possible values for $L$ and $S$: $L = 0, 1, 2$ and $S = 0, 1$. However, one needs to remember that the overall state, including its spin and orbital components, must be antisymmetric, so that not all combinations of $L$ and $S$ are possible. For instance, you already know that the $L = 2$ orbitals are all symmetric; therefore, they can only coexist with the spin singlet $S = 0$. The corresponding grand total momentum is $J = 2$, so that the respective term is ${}^1D_2$. The state with total orbital momentum $L = 1$ is antisymmetric and, therefore, demands the symmetric triplet spin state $S = 1$. This combination of orbital and spin momenta can generate grand total momenta $J = 2, 1, 0$, so that we have the following terms: ${}^3P_2$, ${}^3P_1$, ${}^3P_0$. Finally, the symmetric $L = 0$ state must be coupled with the spin singlet, giving rise to the term ${}^1S_0$. In summary, I identified five possible terms consistent with the antisymmetry requirement: ${}^1D_2$, ${}^3P_2$, ${}^3P_1$, ${}^3P_0$, and ${}^1S_0$. Using the first two Hund's rules, you can limit the choice of the ground state configuration to the P states, and since the outer shell of the carbon atom contains only 4 electrons, it is exactly half-filled, and the third Hund's rule yields that the ground state configuration for carbon is ${}^3P_0$. Figuring out term symbols for elements with more than two electrons in an incomplete subshell (orbitals with the same value of the single-particle orbital momentum), such as nitrogen (three electrons in the p-subshell), is more complex, so I give you the term symbols for the rest of the elements in the second row of the periodic table in Table 11.1 without proof, for you to contemplate.
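The list of allowed terms for two equivalent p-electrons can also be derived by brute force: enumerate the 15 Pauli-allowed microstates and peel off $(L, S)$ blocks from the $(M_L, M_S)$ tally. A minimal sketch:

```python
from itertools import combinations
from collections import Counter

# All 6 p spin-orbitals (m_l, m_s); Pauli allows C(6,2) = 15 two-electron microstates.
orbitals = [(ml, ms) for ml in (-1, 0, 1) for ms in (-0.5, 0.5)]
tally = Counter()
for (ml1, ms1), (ml2, ms2) in combinations(orbitals, 2):
    tally[(ml1 + ml2, ms1 + ms2)] += 1

# Repeatedly take the microstate with the largest (M_L, M_S): it sits at the top
# of a term with L = M_L, S = M_S, whose (2L+1)(2S+1) block is then removed.
terms = []
while tally:
    ML, MS = max(tally, key=lambda k: (k[0], k[1]))
    L, S = ML, MS
    terms.append((int(2*S + 1), "SPDFG"[int(L)]))
    for mL in range(-int(L), int(L) + 1):
        for i in range(int(2*S) + 1):
            mS = S - i
            tally[(mL, mS)] -= 1
            if tally[(mL, mS)] == 0:
                del tally[(mL, mS)]
print(sorted(terms))  # → [(1, 'D'), (1, 'S'), (3, 'P')]
```

The output reproduces exactly the terms identified above: ${}^1D$, ${}^3P$, and ${}^1S$.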

11.4 Exchange Energy and Other Exchange Effects

11.4.1 Exchange Interaction

Some of the examples discussed in the previous section have already demonstrated a weird interconnectedness between the spin and orbital components of many-particle states, which has nothing to do with any kind of real spin–orbital interaction. Recall, for instance, Eqs. 11.25 and 11.26 for two-fermion states: the triplet spin state in Eq. 11.25 requires an antisymmetric orbital state, while the singlet spin state in Eq. 11.26 asks for the orbital state to be symmetric. In the absence of interaction between the electrons, all three $S = 1$ states are degenerate and belong to the energy eigenvalue $E_1 + E_2$, where $E_{1,2}$ are eigenvalues of the single-particle Hamiltonian corresponding to the "occupied" orbital states. At the same time, the $S = 0$ states correspond to three different energies, $2E_1$, $2E_2$, and $E_1 + E_2$, depending upon the orbital components used in their construction. The $E_1 + E_2$ energy level is, therefore, fourfold degenerate, with the corresponding eigenvectors formed by symmetric and antisymmetric combinations of the same two orbital functions $|\psi_1\rangle$ and $|\psi_2\rangle$. It is important to emphasize that three of these degenerate states correspond to the total spin $S = 1$, and the fourth one possesses total spin $S = 0$. An interaction between the electrons, however, might lift the degeneracy, making the energy of a two-electron system dependent on its spin state even in the absence of any actual spin-dependent interactions. This is yet another fascinating piece of evidence of the weirdness of the quantum world.

I will demonstrate this phenomenon using the simple spin-independent Coulomb interaction potential

$$\hat V(1,2) = \frac{e^2}{4\pi\varepsilon_0\left|\hat{\mathbf r}_1-\hat{\mathbf r}_2\right|},$$

which describes the repulsion between the two electrons in the helium atom and is added to the attractive potential responsible for the interaction between the electrons and the nucleus. While a mathematically rigorous solution of the quantum three-body problem is too complicated for us to handle, what I can do is compute the expectation value of the potential $\hat V(1,2)$ using the eigenvectors of the non-interacting electrons. As you will find out later in Chap. 13, such an expectation value gives you an approximation for the interaction-induced correction to the eigenvalues of the Hamiltonian.

Let me begin with the two-fermion state described by the vector presented in Eq. 11.25, which is characterized by an antisymmetric orbital component. The interaction potential does not contain any spin-related operators, allowing me to ignore the spin component of this state (it will simply yield $\langle 1,M_s|1,M_s\rangle = 1$) and write the expectation value as follows:

$$\begin{aligned}
\left\langle\hat V(1,2)\right\rangle &= \frac{1}{2}\left[\left\langle\psi_2^{(2)}\right|\left\langle\psi_1^{(1)}\right| - \left\langle\psi_1^{(2)}\right|\left\langle\psi_2^{(1)}\right|\right]\hat V(1,2)\left[\left|\psi_1^{(1)}\right\rangle\left|\psi_2^{(2)}\right\rangle - \left|\psi_2^{(1)}\right\rangle\left|\psi_1^{(2)}\right\rangle\right]\\
&= \frac{1}{2}\left[\left\langle\psi_2^{(2)}\right|\left\langle\psi_1^{(1)}\right|\hat V(1,2)\left|\psi_1^{(1)}\right\rangle\left|\psi_2^{(2)}\right\rangle + \left\langle\psi_1^{(2)}\right|\left\langle\psi_2^{(1)}\right|\hat V(1,2)\left|\psi_2^{(1)}\right\rangle\left|\psi_1^{(2)}\right\rangle\right] \qquad (11.32)\\
&\quad - \frac{1}{2}\left[\left\langle\psi_1^{(2)}\right|\left\langle\psi_2^{(1)}\right|\hat V(1,2)\left|\psi_1^{(1)}\right\rangle\left|\psi_2^{(2)}\right\rangle + \left\langle\psi_2^{(2)}\right|\left\langle\psi_1^{(1)}\right|\hat V(1,2)\left|\psi_2^{(1)}\right\rangle\left|\psi_1^{(2)}\right\rangle\right]. \qquad (11.33)
\end{aligned}$$

If you carefully compare the terms in the third and fourth lines of the expression above, you will notice a striking difference between them. In both terms of the third line (Eq. 11.32), the ket and bra vectors describing each of the two particles represent the same state ($|\psi_1^{(1)}\rangle$ and $\langle\psi_1^{(1)}|$, $|\psi_2^{(2)}\rangle$ and $\langle\psi_2^{(2)}|$), while the ket and bra vectors of the same particle in the fourth line (Eq. 11.33) correspond to different states ($|\psi_1^{(1)}\rangle$ and $\langle\psi_2^{(1)}|$, $|\psi_2^{(2)}\rangle$ and $\langle\psi_1^{(2)}|$). In other words, the terms in the line labeled Eq. 11.32 look like regular single-particle expectation values, while the terms in the next line look like off-diagonal matrix elements computed between different states of each particle. You can also notice that the two terms in Eq. 11.32 are transformed into each other by the exchange operator $\hat P(1,2)$. Since the particles are identical, no matrix element can change as a result of this transposition, which means that these terms are equal to each other. If, however, you apply the exchange operator to the terms in Eq. 11.33, you will generate expressions in which the ket and bra vectors are reversed, meaning that these terms are complex conjugates of each other. Finally, you can easily see that the expression in Eq. 11.32 would have exactly the same form even if the particles in question were distinguishable, while Eq. 11.33 results from the antisymmetrization requirement imposed on the two-electron state.

Taking all this into account, the interaction expectation value can be presented as

$$\left\langle\hat V(1,2)\right\rangle = V_C + V_{exc}, \tag{11.34}$$

where $V_C$ is defined as

$$V_C = \left\langle\psi_2^{(2)}\right|\left\langle\psi_1^{(1)}\right|\hat V(1,2)\left|\psi_1^{(1)}\right\rangle\left|\psi_2^{(2)}\right\rangle$$

and $V_{exc}$ as

$$V_{exc} = -\operatorname{Re}\left[\left\langle\psi_1^{(2)}\right|\left\langle\psi_2^{(1)}\right|\hat V(1,2)\left|\psi_1^{(1)}\right\rangle\left|\psi_2^{(2)}\right\rangle\right].$$
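The decomposition in Eq. 11.34 is easy to check numerically on a toy model: discretize the orbitals on a small grid, build the antisymmetric two-particle amplitude, and compare the expectation value of an arbitrary symmetric kernel with the direct-plus-exchange sum (a self-contained sketch with real orbitals, so the Re sign is immaterial):

```python
import itertools, math, random

# Toy check of Eq. 11.34 on a 5-point grid: two orthonormal real "orbitals",
# a symmetric two-particle kernel V(x1, x2), and the antisymmetric orbital state.
random.seed(1)
N = 5
psi1 = [random.random() for _ in range(N)]
norm = math.sqrt(sum(a*a for a in psi1))
psi1 = [a/norm for a in psi1]
psi2 = [random.random() for _ in range(N)]
overlap = sum(a*b for a, b in zip(psi1, psi2))
psi2 = [b - overlap*a for a, b in zip(psi1, psi2)]     # Gram–Schmidt step
norm = math.sqrt(sum(b*b for b in psi2))
psi2 = [b/norm for b in psi2]

V = [[1.0/(1 + abs(i - j)) for j in range(N)] for i in range(N)]  # symmetric kernel

# Antisymmetric orbital amplitude Psi(x1, x2) = (psi1⊗psi2 - psi2⊗psi1)/√2
Psi = [[(psi1[i]*psi2[j] - psi2[i]*psi1[j])/math.sqrt(2) for j in range(N)]
       for i in range(N)]
pairs = list(itertools.product(range(N), repeat=2))
expect = sum(Psi[i][j]**2 * V[i][j] for i, j in pairs)
V_C = sum(psi1[i]**2 * psi2[j]**2 * V[i][j] for i, j in pairs)       # direct term
V_exc = -sum(psi1[i]*psi2[i]*V[i][j]*psi1[j]*psi2[j] for i, j in pairs)  # exchange
print(abs(expect - (V_C + V_exc)) < 1e-12)  # → True
```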

Using the position representation for the orbital states, the expression for $V_C$ can be written in the explicit form

$$V_C = \frac{e^2}{4\pi\varepsilon_0}\int d^3r_1\int d^3r_2\,\frac{\left|\psi_1(\mathbf r_1)\right|^2\left|\psi_2(\mathbf r_2)\right|^2}{\left|\mathbf r_1-\mathbf r_2\right|}, \tag{11.35}$$

which makes all the statements made about $V_C$ rather obvious. If you agree to identify $e\left|\psi(\mathbf r)\right|^2$ with a charge density, you can interpret Eq. 11.35 as the classical energy of the Coulomb interaction between two continuously distributed charges with densities $e\left|\psi_1(\mathbf r)\right|^2$ and $e\left|\psi_2(\mathbf r)\right|^2$.

The expression for $V_{exc}$ in the position representation takes the form

$$V_{exc} = -\frac{e^2}{4\pi\varepsilon_0}\operatorname{Re}\left[\int d^3r_1\int d^3r_2\,\frac{\psi_1^*(\mathbf r_2)\,\psi_2^*(\mathbf r_1)\,\psi_1(\mathbf r_1)\,\psi_2(\mathbf r_2)}{\left|\mathbf r_1-\mathbf r_2\right|}\right], \tag{11.36}$$

which does not have any classical interpretation. This contribution to the energy is called the exchange energy, and its origin can be traced directly to the antisymmetrization requirement. The expectation value computed with the symmetric orbital state would have the same form as in Eq. 11.34, with one important difference: the opposite sign in front of the exchange energy term. Thus, the previously degenerate states are now split by the interaction by an amount equal to $2V_{exc}$ on the basis of their spin states. Just think about it: in the absence of any special spin–orbit interaction term in the Hamiltonian, the energies of two-electron states composed of the same single-particle orbitals depend on their spin state! This is a purely quantum effect, one of the manifestations of the oddity of quantum mechanics, which has profound experimental and technological implications. First, however, I want you to get some feeling for the actual magnitude of this effect; for this reason, I am going to compute the Coulomb and exchange energies for a simple example of a two-electron state of the helium atom.

For concreteness (and to simplify the calculations), I will presume that the orbitals participating in the construction of the two-electron state are $|1,0,0\rangle$ and $|2,0,0\rangle$, in the notation for hydrogen-like states from Chap. 8. In the position representation, the corresponding wave functions are $\psi_1(r_1) = R_{10}(r_1)/\sqrt{4\pi}$ and $\psi_2(r_2) = R_{20}(r_2)/\sqrt{4\pi}$, where $R_{10}$ and $R_{20}$ are hydrogen-like radial wave functions, and the factor $1/\sqrt{4\pi}$ is what is left of the spherical harmonic with zero orbital momentum. When integrating Eq. 11.35 with respect to $\mathbf r_1$, I can choose the Z-axis of the spherical coordinate system in the direction of $\mathbf r_2$, in which case the denominator in this equation can be written as

$$\left|\mathbf r_1-\mathbf r_2\right| = \sqrt{r_1^2 + r_2^2 - 2r_1r_2\cos\theta_1}.$$

The integral over $\mathbf r_1$ now becomes

$$I(r_2) = \frac{32}{4\pi a_B^3}\int_0^\infty dr_1\int_0^\pi d\theta_1\int_0^{2\pi}d\varphi_1\,r_1^2\sin\theta_1\,\frac{e^{-4r_1/a_B}}{\sqrt{r_1^2+r_2^2-2r_1r_2\cos\theta_1}} = \frac{16}{a_B^3}\int_0^\infty dr_1\,r_1^2 e^{-4r_1/a_B}\int_{-1}^{1}\frac{dx}{\sqrt{r_1^2+r_2^2-2r_1r_2x}},$$

where I substituted

$$R_{10} = 2\left(\frac{2}{a_B}\right)^{3/2}\exp\left(-\frac{2r}{a_B}\right)$$

(remember that $Z = 2$ for He). The integral over $x$ yields

$$\int_{-1}^{1}\frac{dx}{\sqrt{r_1^2+r_2^2-2r_1r_2x}} = \frac{1}{2r_1r_2}\int_{-2r_1r_2}^{2r_1r_2}\frac{dz}{\sqrt{r_1^2+r_2^2+z}} = \frac{1}{r_1r_2}\left(\sqrt{r_1^2+r_2^2+2r_1r_2}-\sqrt{r_1^2+r_2^2-2r_1r_2}\right) = \frac{r_1+r_2-\left|r_1-r_2\right|}{r_1r_2}. \tag{11.37}$$

Evaluating this expression separately for $r_1 > r_2$ and $r_1 < r_2$, I find for $I(r_2)$

$$I(r_2) = \frac{32}{a_B^3 r_2}\int_0^{r_2}dr_1\,r_1^2 e^{-4r_1/a_B} + \frac{32}{a_B^3}\int_{r_2}^{\infty}dr_1\,r_1 e^{-4r_1/a_B} = \frac{1}{r_2}\left[1-\left(1+\frac{2r_2}{a_B}\right)e^{-4r_2/a_B}\right]. \tag{11.38}$$

Now, using

$$R_{20} = 2\left(\frac{1}{a_B}\right)^{3/2}\left(1-\frac{r}{a_B}\right)\exp\left(-\frac{r}{a_B}\right),$$

I get for $V_C$

$$V_C = \frac{e^2}{4\pi\varepsilon_0}\,\frac{4}{4\pi a_B^3}\int_0^\infty dr_2\int_0^\pi d\theta_2\int_0^{2\pi}d\varphi_2\,\sin\theta_2\,r_2^2\,I(r_2)\left(1-\frac{r_2}{a_B}\right)^2\exp\left(-\frac{2r_2}{a_B}\right) = \frac{e^2}{4\pi\varepsilon_0}\,\frac{4}{a_B^3}\int_0^\infty dr_2\,r_2\left[1-\left(1+\frac{2r_2}{a_B}\right)e^{-4r_2/a_B}\right]\left(1-\frac{r_2}{a_B}\right)^2\exp\left(-\frac{2r_2}{a_B}\right) = \frac{34}{81}\,\frac{e^2}{4\pi\varepsilon_0 a_B}\approx 0.84\,\mathrm{Ry}\approx 11.4\ \mathrm{eV},$$

where I used Eq. 8.17 with $Z$ set to unity and the notation $\mathrm{Ry} = 13.6\ \mathrm{eV}$ for hydrogen's ground state energy (in vacuum); note that $e^2/(4\pi\varepsilon_0 a_B) = 2\,\mathrm{Ry}$.
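The angular result of Eq. 11.37 is worth a quick numerical spot-check; a midpoint-rule sketch in plain Python (the two sample radii are arbitrary test values):

```python
def lhs(r1, r2, n=100000):
    """Midpoint-rule value of the integral over x in Eq. 11.37."""
    h = 2.0 / n
    return sum(h / (r1*r1 + r2*r2 - 2*r1*r2*(-1 + (k + 0.5)*h))**0.5
               for k in range(n))

def rhs(r1, r2):
    """Closed form (r1 + r2 - |r1 - r2|)/(r1 r2) from Eq. 11.37."""
    return (r1 + r2 - abs(r1 - r2)) / (r1 * r2)

for r1, r2 in [(0.3, 1.0), (2.0, 0.7)]:
    print(round(lhs(r1, r2), 6), rhs(r1, r2))  # the two columns agree
```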

Now I will compute the exchange energy correction. Denoting the inner integral with respect to $\mathbf r_1$ by $I_2(r_2)$, I can present it, using the expressions for the radial functions provided above, as

$$I_2(r_2) = \frac{8\sqrt2}{4\pi a_B^3}\int_0^\infty dr_1\int_0^\pi d\theta_1\int_0^{2\pi}d\varphi_1\,r_1^2\sin\theta_1\,\frac{\left(1-\frac{r_1}{a_B}\right)\exp\left(-\frac{3r_1}{a_B}\right)}{\sqrt{r_1^2+r_2^2-2r_1r_2\cos\theta_1}} = \frac{4\sqrt2}{a_B^3}\int_0^\infty dr_1\,r_1^2\exp\left(-\frac{3r_1}{a_B}\right)\left(1-\frac{r_1}{a_B}\right)\int_{-1}^{1}\frac{dx}{\sqrt{r_1^2+r_2^2-2r_1r_2x}}.$$

Equation 11.37 for the angular integral and Mathematica© for the remaining radial integrals yield

$$I_2(r_2) = \frac{8\sqrt2}{a_B^3 r_2}\int_0^{r_2}dr_1\,r_1^2\exp\left(-\frac{3r_1}{a_B}\right)\left(1-\frac{r_1}{a_B}\right) + \frac{8\sqrt2}{a_B^3}\int_{r_2}^{\infty}dr_1\,r_1\exp\left(-\frac{3r_1}{a_B}\right)\left(1-\frac{r_1}{a_B}\right) = \frac{8\sqrt2}{27a_B}\left(1+\frac{3r_2}{a_B}\right)e^{-3r_2/a_B}.$$

Plugging this into Eq. 11.36 and dropping the real-part sign because all the functions in the integral are real, I have

$$V_{exc} = -\frac{e^2}{4\pi\varepsilon_0}\,\frac{8\sqrt2}{27a_B}\cdot 4\left(\frac{2}{a_B}\right)^{3/2}\left(\frac{1}{a_B}\right)^{3/2}\int_0^\infty dr_2\,r_2^2\exp\left(-\frac{2r_2}{a_B}\right)\left(1-\frac{r_2}{a_B}\right)\exp\left(-\frac{r_2}{a_B}\right)\left(1+\frac{3r_2}{a_B}\right)\exp\left(-\frac{3r_2}{a_B}\right) = -\frac{128}{27}\,\frac{e^2}{4\pi\varepsilon_0 a_B^4}\int_0^\infty dr_2\,r_2^2\exp\left(-\frac{6r_2}{a_B}\right)\left(1-\frac{r_2}{a_B}\right)\left(1+\frac{3r_2}{a_B}\right) = -\frac{32}{729}\,\frac{e^2}{4\pi\varepsilon_0 a_B}\approx -1.2\ \mathrm{eV}.$$
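The final radial integral can be spot-checked numerically; in units of $u = r_2/a_B$ it equals $1/108$, which, with the conversion $e^2/(4\pi\varepsilon_0 a_B) \approx 27.2$ eV (an assumption of this sketch, not a value quoted in the chapter), reproduces the exchange energy of about $-1.2$ eV:

```python
import math

# V_exc = -(128/27) (e²/4πε₀a_B) ∫₀^∞ du u²(1-u)(1+3u) e^{-6u};  the integral is 1/108.
def integrand(u):
    return u*u * (1 - u) * (1 + 3*u) * math.exp(-6*u)

n, umax = 100000, 20.0
h = umax / n
integral = sum(integrand((k + 0.5)*h) for k in range(n)) * h   # midpoint rule
V_exc_au = -(128/27) * integral          # in units of e²/(4πε₀ a_B)
print(round(V_exc_au * 27.2, 2), "eV")   # about -1.19 eV
```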

Thus, this calculation shows that the state with $S = 1$ has a lower energy than the state with $S = 0$ by $2\left|V_{exc}\right| \approx 2.4$ eV. However, the sign of the integral in the exchange term depends on the fine details of the wave functions representing the single-electron orbitals and is not predetermined. If the single-particle orbitals of the electrons were different, describing, for instance, electrons with non-zero orbital momentum in the outer shells of heavier elements, or electrons in metals, the situation might be reversed, and the singlet spin state could have a lower energy than the triplet.

This difference between the energies of symmetric and antisymmetric spin states gives rise to what is known as the exchange interaction between spins and plays an extremely important role in the magnetic properties of materials. In particular, this "interaction," which is simply a result of the fermionic nature of electrons, is responsible for the formation of the ordered spin arrangements behind such phenomena as ferromagnetism and antiferromagnetism.

Ferromagnets, materials with permanent magnetization, have been known since the earliest days of human civilization, but the origin of their magnetic properties remained a mystery for a very long time. André-Marie Ampère (a French physicist who lived between 1775 and 1836 and made seminal contributions to electromagnetism) proposed that magnetization is the result of the alignment of the dipole magnetic moments, formed by the circular electron currents of each atom, in the same direction. This alignment, he believed, was due to the magnetostatic interaction between the dipoles, which makes the energy of the system lowest when all dipoles point in the same direction. Unfortunately, calculations showed that the magnetostatic interaction is so weak that thermal fluctuations would destroy the ferromagnetic order even at temperatures as low as a few kelvin. The energy of the spin exchange interaction is much bigger (if you think that 2 eV is a small energy, you will be delighted to know that it corresponds to a temperature of more than 20,000 K). The temperature at which iron loses its ferromagnetic properties due to thermal agitation is about 1043 K, which corresponds to an exchange energy of only about 80 meV; this exchange energy is the real culprit behind the ordering of spin magnetic moments. In the classical picture, ordering means that all magnetic moments are aligned in the same direction; describing this phenomenon quantum mechanically, we would say that $N$ spins $S$ are aligned if they are in the symmetric state with total spin equal to $NS$. For such a state to be the ground state of the system of $N$ spins, the exchange energy must favor (be lower for) symmetric states over antisymmetric ones. An antisymmetric state could classically be described as an array of magnetic moments in which each pair is aligned in opposite directions, while quantum mechanically we would say that each pair of spins in the array is in the spin-zero state. Materials with such an arrangement of magnetic moments are known as antiferromagnets, and for the antiferromagnetic state to be the ground state, the exchange energy must change its sign compared to the ferromagnetic case. The complete theory of magnetic order in solids is rather complicated, so you should not think that this brief glimpse into the area gives you any kind of even remotely complete picture, but beneath all these complexities lies one main underlying physical mechanism: the exchange energy.

11.4.2 Exchange Correlations

The symmetry requirements on the many-particle states of indistinguishable particles affect not only their interaction energy but also their spatial positions. To illustrate this point, I will compute the expectation value of the squared distance between two electrons, defined as

$$\left\langle\left(\mathbf r_1-\mathbf r_2\right)^2\right\rangle = \left\langle r_1^2\right\rangle + \left\langle r_2^2\right\rangle - 2\left\langle\mathbf r_1\cdot\mathbf r_2\right\rangle. \tag{11.39}$$

This time around, I will assume that the two electrons belong to two different hydrogen-like atoms separated by a distance small enough for the wave functions describing the states of each electron to have significant spatial overlap. I will also assume that the electrons are in the same atomic orbital $|n,l,m\rangle$, but since each of these orbitals belongs to a different atom, they represent different states, even if they are described by the same set of quantum numbers. To distinguish between these orbitals, I will add another parameter, the position of the nucleus, to the set of quantum numbers: $|n,l,m,\mathbf R\rangle$. If the atoms are separated by a large distance, you can quite clearly ascribe each electron to the atom it belongs to. However, when the distance between the nuclei becomes comparable to the characteristic size of the electron's wave function, this identification is no longer possible, and you have to introduce two orbitals for each electron, $|n,l,m,\mathbf R_1\rangle_i$ and $|n,l,m,\mathbf R_2\rangle_i$, where the lower index $i$ outside the ket symbol takes values 1 or 2 signifying one or the other electron. This two-electron system can again be in a singlet or triplet spin state demanding a symmetric or antisymmetric two-electron orbital state:

$$|\pm\rangle = \frac{1}{\sqrt2}\left[|n,l,m,\mathbf R_1\rangle_1\,|n,l,m,\mathbf R_2\rangle_2 \pm |n,l,m,\mathbf R_1\rangle_2\,|n,l,m,\mathbf R_2\rangle_1\right]. \tag{11.40}$$

The first two terms in Eq. 11.39 are determined by single-particle orbitals:

$$\langle\pm|\,r_1^2\,|\pm\rangle = \frac12\Bigl[\,{}_1\langle n,l,m,\mathbf R_1|\,r_1^2\,|n,l,m,\mathbf R_1\rangle_1 + {}_1\langle n,l,m,\mathbf R_2|\,r_1^2\,|n,l,m,\mathbf R_2\rangle_1 \pm {}_1\langle n,l,m,\mathbf R_1|\,r_1^2\,|n,l,m,\mathbf R_2\rangle_1\times{}_2\langle n,l,m,\mathbf R_1|n,l,m,\mathbf R_2\rangle_2 \pm {}_1\langle n,l,m,\mathbf R_2|\,r_1^2\,|n,l,m,\mathbf R_1\rangle_1\times{}_2\langle n,l,m,\mathbf R_2|n,l,m,\mathbf R_1\rangle_2\,\Bigr].$$

When writing this expression, I took into account that orbitals belonging to the same atom are normalized, ${}_2\langle n,l,m,\mathbf R_2|n,l,m,\mathbf R_2\rangle_2 = 1$, but orbitals belonging to different atoms are not necessarily orthogonal: ${}_2\langle n,l,m,\mathbf R_1|n,l,m,\mathbf R_2\rangle_2 \neq 0$. The similar expression for $\langle\pm|\,r_2^2\,|\pm\rangle$ is

$$\langle\pm|\,r_2^2\,|\pm\rangle = \frac12\Bigl[\,{}_2\langle n,l,m,\mathbf R_1|\,r_2^2\,|n,l,m,\mathbf R_1\rangle_2 + {}_2\langle n,l,m,\mathbf R_2|\,r_2^2\,|n,l,m,\mathbf R_2\rangle_2 \pm {}_2\langle n,l,m,\mathbf R_1|\,r_2^2\,|n,l,m,\mathbf R_2\rangle_2\times{}_1\langle n,l,m,\mathbf R_1|n,l,m,\mathbf R_2\rangle_1 \pm {}_2\langle n,l,m,\mathbf R_2|\,r_2^2\,|n,l,m,\mathbf R_1\rangle_2\times{}_1\langle n,l,m,\mathbf R_2|n,l,m,\mathbf R_1\rangle_1\,\Bigr].$$

Since both atoms are assumed to be identical, the following must be true:

$$\begin{aligned}
{}_1\langle n,l,m,\mathbf R_1|\,r_1^2\,|n,l,m,\mathbf R_1\rangle_1 &= {}_2\langle n,l,m,\mathbf R_2|\,r_2^2\,|n,l,m,\mathbf R_2\rangle_2 \equiv a^2\\
{}_1\langle n,l,m,\mathbf R_2|\,r_1^2\,|n,l,m,\mathbf R_2\rangle_1 &= {}_2\langle n,l,m,\mathbf R_1|\,r_2^2\,|n,l,m,\mathbf R_1\rangle_2 \equiv b^2\\
{}_1\langle n,l,m,\mathbf R_1|\,r_1^2\,|n,l,m,\mathbf R_2\rangle_1 &= {}_2\langle n,l,m,\mathbf R_2|\,r_2^2\,|n,l,m,\mathbf R_1\rangle_2 \equiv u\\
{}_2\langle n,l,m,\mathbf R_1|n,l,m,\mathbf R_2\rangle_2 &= {}_1\langle n,l,m,\mathbf R_2|n,l,m,\mathbf R_1\rangle_1 \equiv v.
\end{aligned}$$

All these relations can be obtained by noticing that the system remains unchanged if you replace $\mathbf R_1 \leftrightarrow \mathbf R_2$ and simultaneously exchange the electron indexes 1 and 2. Taking these relations and the corresponding simplified notations into account, I can write for $\langle\pm|\,r_{1,2}^2\,|\pm\rangle$:

$$\langle\pm|\,r_1^2\,|\pm\rangle = \langle\pm|\,r_2^2\,|\pm\rangle = \frac12\left[a^2+b^2\pm\left(uv+u^*v^*\right)\right].$$


The next step is to evaluate $\langle\mathbf r_1\cdot\mathbf r_2\rangle$:

$$\begin{aligned}
\langle\pm|\,\mathbf r_1\cdot\mathbf r_2\,|\pm\rangle = \frac12\bigl[\,&{}_1\langle n,l,m,\mathbf R_1|\,\mathbf r_1\,|n,l,m,\mathbf R_1\rangle_1\cdot{}_2\langle n,l,m,\mathbf R_2|\,\mathbf r_2\,|n,l,m,\mathbf R_2\rangle_2\; + && (11.41)\\
&{}_1\langle n,l,m,\mathbf R_2|\,\mathbf r_1\,|n,l,m,\mathbf R_2\rangle_1\cdot{}_2\langle n,l,m,\mathbf R_1|\,\mathbf r_2\,|n,l,m,\mathbf R_1\rangle_2\; \pm\\
&{}_1\langle n,l,m,\mathbf R_1|\,\mathbf r_1\,|n,l,m,\mathbf R_2\rangle_1\cdot{}_2\langle n,l,m,\mathbf R_1|\,\mathbf r_2\,|n,l,m,\mathbf R_2\rangle_2\; \pm\\
&{}_1\langle n,l,m,\mathbf R_2|\,\mathbf r_1\,|n,l,m,\mathbf R_1\rangle_1\cdot{}_2\langle n,l,m,\mathbf R_2|\,\mathbf r_2\,|n,l,m,\mathbf R_1\rangle_2\,\bigr]. && (11.42)
\end{aligned}$$

The evaluation of these expressions requires a more explicit determination of the point with respect to which the electron position vectors are defined. Assuming for concreteness that the origin of the coordinate system is at the nucleus of atom 1, I can immediately note that the symmetry with respect to inversion kills the first two terms in Eq. 11.42, since ${}_1\langle n,l,m,\mathbf R_1|\,\mathbf r_1\,|n,l,m,\mathbf R_1\rangle_1 = {}_2\langle n,l,m,\mathbf R_1|\,\mathbf r_2\,|n,l,m,\mathbf R_1\rangle_2 = 0$. The remaining two terms survive and can be written as

$$\langle\pm|\,\mathbf r_1\cdot\mathbf r_2\,|\pm\rangle = \pm\left|\mathbf d\right|^2,$$

where I introduced the vector $\mathbf d$ defined as follows:

$${}_1\langle n,l,m,\mathbf R_1|\,\mathbf r_1\,|n,l,m,\mathbf R_2\rangle_1 = {}_2\langle n,l,m,\mathbf R_2|\,\mathbf r_2\,|n,l,m,\mathbf R_1\rangle_2 \equiv \mathbf d.$$

Finally, combining all the obtained results, I can write

$$\left\langle\left(\mathbf r_1-\mathbf r_2\right)^2\right\rangle = a^2+b^2\pm\left(uv+u^*v^*-2\left|\mathbf d\right|^2\right). \tag{11.43}$$

While the actual computation of the matrix elements appearing in Eq. 11.43 is rather difficult and will not be attempted here, you can still learn something from this exercise. Its main lesson is that the spin state of the electrons affects how close the electrons of the two atoms can be. Assuming for concreteness that the expression in the parentheses in Eq. 11.43 is negative, which is favored by the term $|\mathbf d|^2$ (the actual sign depends on the single-electron orbitals), one can conclude that the antisymmetric spin state, which promotes the symmetric orbital state (the $+$ sign in $\pm$), results in the electrons being closer together than in the case of the symmetric spin state. This is an interesting quantum mechanical effect: the electrons appear to be "pushed" closer toward each other or further away from each other depending on their spin state, even though there is no actual physical force doing the "pushing." This phenomenon plays an important role in the chemical bonding between atoms, because electrons, when "pushed" toward each other, pull their nuclei along, making the formation of a stable diatomic molecule more likely.
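The effect is easy to see in a toy model: replace the atomic orbitals with two 1D Gaussians a distance $R$ apart (illustrative parameters, not derived from this chapter) and compute the mean-square separation for both orbital symmetries by direct summation over a grid:

```python
import math

# Toy 1D illustration of Eq. 11.43: Gaussians centered at 0 and R stand in for
# the orbitals |n,l,m,R1> and |n,l,m,R2>; parameters are assumed, not fitted.
R, sigma = 1.0, 1.0

def phi(x, c):
    """Normalized Gaussian orbital centered at c."""
    return math.exp(-(x - c)**2 / (2*sigma**2)) / (math.pi*sigma**2)**0.25

xs = [-8 + 0.02*k for k in range(801)]
pa = [phi(x, 0.0) for x in xs]   # orbital on atom 1
pb = [phi(x, R) for x in xs]     # orbital on atom 2

results = {}
for sign, label in [(+1, "symmetric orbital (spin singlet)"),
                    (-1, "antisymmetric orbital (spin triplet)")]:
    num = den = 0.0
    for i, x1 in enumerate(xs):
        for j, x2 in enumerate(xs):
            amp = pa[i]*pb[j] + sign*pb[i]*pa[j]
            w = amp*amp
            num += w*(x1 - x2)**2
            den += w
    results[label] = num/den
    print(label, round(num/den, 2))
# The singlet (symmetric orbital) gives the smaller mean-square separation,
# the triplet the larger one, with no force doing any "pushing."
```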


11.5 Fermi Energy

The behavior of systems consisting of many identical particles (and by many here I mean really huge numbers, something like Avogadro's number) is studied by a special field of physics called quantum statistics. Even a sketchy review of this field would take us well outside the scope of this book, but there is one problem involving a very large number of fermions that we can handle. The issue in question is the structure of the ground state and its energy for a system of $N \gg 1$ non-interacting free electrons (an ideal electron gas) confined within a box of volume $V$. Each electron is a free particle characterized by a momentum $\mathbf p$, the corresponding single-particle energy $E_p = p^2/2m_e$, and a single-particle wave function (in the position representation) $\psi_{\mathbf p}(\mathbf r) = A_p\exp\left(i\mathbf p\cdot\mathbf r/\hbar\right)$, where $A_p$ is a normalization parameter, which was chosen in Sect. 5.1.1 to be $1/\sqrt{2\pi\hbar}$ to generate a delta-function-normalized wave function. Here it is more convenient to choose an alternative normalization, which explicitly includes the volume $V$ occupied by the electrons. To achieve this, I will impose so-called periodic boundary conditions:

$$\psi_{\mathbf p}(\mathbf r+\mathbf L) = \psi_{\mathbf p}(\mathbf r), \tag{11.44}$$

where $\mathbf L$ is a vector with components $L_x, L_y, L_z$ such that $L_xL_yL_z = V$. This boundary condition is the most popular choice in solid-state physics, and if you are wondering about its physical meaning and any kind of relation to reality, it does not really have any. The logic of using it is based upon two ideas. First, it is more convenient than, say, the particle-in-the-box boundary condition $\psi_{\mathbf p}(\mathbf L) = 0$, implying that the electrons are confined in an infinite potential well, because it keeps the wave functions in the complex exponential form rather than forcing them to become much less convenient real-valued sine functions. Second, it is believed that as long as we are not interested in specific surface-related phenomena, the behavior of the wave functions at the boundary of a solid should not have any impact on its bulk properties. I used a similar idea when computing the total energy of the electromagnetic field in Sect. 7.3.1.

This boundary condition imposes restrictions on the allowed values of the electron's momentum:

$$\exp\left(i\mathbf p\cdot(\mathbf r+\mathbf L)/\hbar\right) = \exp\left(i\mathbf p\cdot\mathbf r/\hbar\right) \;\Rightarrow\; \exp\left(i\mathbf p\cdot\mathbf L/\hbar\right) = 1 \;\Rightarrow\; \frac{p_iL_i}{\hbar} = 2\pi n_i, \tag{11.45}$$

where $i = x, y, z$ and $n_i = 0, \pm1, \pm2, \pm3, \dots$. In addition to making the spectrum of the momentum operator discrete, the periodic boundary condition also allows an alternative normalization of the wave function:

$$\left|A_p\right|^2\int_{-L_x/2}^{L_x/2}\int_{-L_y/2}^{L_y/2}\int_{-L_z/2}^{L_z/2} e^{-i\mathbf p\cdot\mathbf r/\hbar}\,e^{i\mathbf p\cdot\mathbf r/\hbar}\,dx\,dy\,dz = 1,$$

which yields $A_p = 1/\sqrt V$. The system of normalized and orthogonal single-electron wave functions takes the form

$$\psi_{n_1,n_2,n_3}(\mathbf r) = \frac{1}{\sqrt V}\exp\left[i\left(\frac{2\pi}{L_x}n_1x + \frac{2\pi}{L_y}n_2y + \frac{2\pi}{L_z}n_3z\right)\right],$$

while the single-electron energies form a discrete spectrum defined by

$$\epsilon_{n_1n_2n_3} = \frac{(2\pi\hbar)^2}{2m_e}\left(\frac{n_1^2}{L_x^2}+\frac{n_2^2}{L_y^2}+\frac{n_3^2}{L_z^2}\right). \tag{11.46}$$

Each such wave function generates two single-electron orbitals characterized by the two different values of the spin magnetic number $m_s$, which is perfectly suitable for generating many-particle states of the $N$-electron system. The ground state of this system is given by the Slater determinant formed by the $N$ single-electron orbitals with the lowest possible single-particle energies, ranging from the lowest value up to some maximum value $\epsilon_F$ corresponding to the last orbital making it into the determinant. Thus, all single-particle orbitals of the electrons are divided into two groups: those that are included (occupied) in the Slater determinant for the ground state and those that are not (empty or vacant). The occupied and empty orbitals are separated by the energy $\epsilon_F$ known as the Fermi energy. The Fermi energy is an important characteristic of an electron gas, which obviously depends on the number of electrons $N$ and determines much of the gas's ground state properties. Thus, let's spend some time trying to figure it out.

In principle, finding $\epsilon_F$ is quite straightforward: one needs to find the total number of orbitals $M(\epsilon)$ with energies less than $\epsilon$. Then the Fermi energy is found from the equation

$$M(\epsilon_F) = N. \tag{11.47}$$

However, counting the orbitals and finding $M(\epsilon)$ is not quite trivial, because the energy values defined by Eq. 11.46 are highly degenerate, and, what is even worse, there is no known analytical formula for the degree of degeneracy as a function of energy. The problem, however, can be solved in the limit when $N\to\infty$ and $V\to\infty$ in such a way that the concentration of electrons $N/V$ remains constant. In this limit the discrete, granular structure of the energy spectrum becomes negligible (the spectrum in this case is called quasi-continuous), and the function $M(\epsilon)$ can be determined.

You might think that I am nuts because I first introduce a finite $V$ to make the spectrum discrete and then go to the limit $V\to\infty$ to make it continuous again. The thing is that if I had begun with an infinite volume and a continuous spectrum, the only information I would have had about the number of states is that it is infinite (the number of states of a continuous spectrum is infinite for any finite interval of energies), which does not help me at all. What I am getting out of this roundabout approach is the knowledge of how the number of states turns infinite when the volume goes to infinity, and as you will see, this is exactly what we need to find the Fermi energy.

In order to find $M(\epsilon)$, it is convenient to visualize the states that need to be counted. This can be done by presenting each single-electron orbital graphically as a point with coordinates $n_1, n_2, n_3$ in a three-dimensional space defined by a regular Cartesian coordinate system with axes $X$, $Y$, and $Z$. Each point here represents two orbitals with different values of the spin magnetic number. Surrounding each point with a little cube of unit side, I can cover the entire three-dimensional region containing the electrons' orbitals. Since each cube has unit volume, the total volume covered by the cubes is equal to the number of points within the covered region. Since each point represents two orbitals with opposite spins, the number of orbitals in this region is twice the number of points.

For simplicity let me assume that L_x = L_y = L_z ≡ L, which allows me to
rewrite Eq. 11.46 in the form

n^2(\epsilon) = n_1^2 + n_2^2 + n_3^2, \qquad (11.48)

where I introduced

n^2(\epsilon) = \frac{2m_e\epsilon L^2}{(2\pi\hbar)^2}. \qquad (11.49)

Equation 11.48 defines a sphere in the space of electron orbitals with radius
n ∝ L√ε. If you allow non-integer values for the numbers n_{1,2,3}, you could say that each
point on the surface of the sphere corresponds to states with the same energy ε (such
a surface is called isoenergetic). All points in the interior of the surface correspond to
states with energies less than ε, while all points in the exterior represent states with
energies larger than ε. Now, the number of orbitals encompassed by the surface is, as
I just explained, simply equal to the volume of the corresponding region multiplied
by two to account for the two values of spin. Thus, I can write for the number of states
with energies less than ε:⁴

M(\epsilon) = 2\,\frac{4\pi}{3}\left(\frac{L}{2\pi\hbar}\right)^{3}(2m_e\epsilon)^{3/2} = \frac{V\,(2m_e\epsilon)^{3/2}}{3\pi^{2}\hbar^{3}}. \qquad (11.50)

⁴If instead of periodic boundary conditions you were to use the particle-in-the-box boundary
conditions, requiring that the wave function vanish at the boundary of the region L_x × L_y × L_z,
you would have ended up with p_i = πn_i/L_i, where n_i now can take only positive values because
wave functions sin(πn₁x/L_x) sin(πn₂y/L_y) sin(πn₃z/L_z) with positive and negative values of n_i
represent the same function, while the function exp[i(2πn₁x/L_x + 2πn₂y/L_y + 2πn₃z/L_z)] with positive and
negative indexes represents two linearly independent states. As a result, Eq. 11.50 when used in
this case would have an extra factor 1/8, reflecting the fact that only 1/8 of the sphere corresponds to
points with all positive coordinates.

11.5 Fermi Energy 381

Fig. 11.3 Two-dimensional
version of the state counting
procedure described in the
text: squares replace cubes, a
circle represents a sphere, and
the points are still the states,
now specified by two instead of
three integer numbers. The
2-D version is easier to
process visually but
illustrates all the important
points

The problem with this calculation is, of course, that the points on the surface do
not necessarily correspond to integer values of n₁, n₂, and n₃, so that this surface
cuts through the little cubes in its immediate vicinity (see Fig. 11.3, representing
a two-dimensional version of the described construction). As a result, some states
with energies in the thin layer surrounding the spherical surface cannot be counted
accurately. The number of such states is obviously proportional to the area of the
enclosing sphere, which is ∝ L², while the number of states counted correctly is ∝
L³, so that the relative error of the outlined procedure behaves as 1/L and approaches
zero as L goes to infinity. Thus, Eq. 11.50 can be considered to be asymptotically
correct in the limit L → ∞. Now you can see the value of this procedure with the
initial discretization followed by passing to the quasi-continuous limit: it allowed
me to establish the exact dependence of M on the volume V expressed
by Eq. 11.50. Now I can easily find the Fermi energy by substituting Eq. 11.50 into
Eq. 11.47:

\frac{V(2m_e\epsilon_F)^{3/2}}{3\pi^2\hbar^3} = N \;\Rightarrow\; \epsilon_F = \frac{\hbar^2}{2m_e}\left(\frac{3\pi^2 N}{V}\right)^{2/3}. \qquad (11.51)

The most important feature of Eq. 11.51 is that the number of electrons N and the
volume V they occupy, the two quantities which are supposed to go to infinity,
appear in this equation only in the form of the ratio N/V, which we shall obviously
keep constant when passing to the limit V → ∞, N → ∞. The ratio N/V
specifies the number of electrons per unit volume and is also known as the electron
concentration. It is one of the most important characteristics of the electron gas.

It is important to understand that the Fermi energy is the single-electron energy of
the last “occupied” single-particle orbital and is not the energy of the many-electron
ground state. To find the latter I need to add energies of all occupied single-electron
orbitals:


E = 2\sum_{n_1,n_2,n_3}^{N_{max}} \epsilon_{n_1,n_2,n_3}, \qquad (11.52)

where N_max denotes the collection of indexes n₁, n₂, n₃ corresponding to the last occupied
state and the factor of 2 accounts for the spin variable. I will compute this sum again
in the limit V → ∞, N → ∞ (which, by the way, is called the thermodynamic limit),
and while doing so I will show you a trick for converting discrete sums into integrals.
This operation is, again, possible because in the thermodynamic limit the discrete
spectrum becomes quasi-continuous, and the arguments I am going to employ here
are essentially the same as the ones used to compute the Fermi energy, but with a
slightly different flavor.

So I begin. When an orbital index n_i changes by one, the change of the respective
component of the momentum p_i can be presented as

\Delta p_i = \frac{2\pi\hbar}{L_i}\,\Delta n_i,

where Δn_i = 1. With this relation in mind, I can rewrite Eq. 11.52 as

E = 2\sum_{n_1,n_2,n_3}^{N_{max}} \epsilon_{n_1,n_2,n_3}\,\Delta n_1\Delta n_2\Delta n_3 = \frac{2L^3}{(2\pi\hbar)^3}\sum_{p_x,p_y,p_z}^{N_{max}} \epsilon_{p_x,p_y,p_z}\,\Delta p_x\Delta p_y\Delta p_z,

where I again set L_x = L_y = L_z ≡ L (remember that Δn_i = 1, so by including the
factors Δn₁Δn₂Δn₃ into the original sum, I did not really change anything). Now,
when L → ∞, Δn_i remains equal to unity, but Δp_i → 0, so that the corresponding
sum in the preceding expression turns into an integral:

E = \frac{2V}{(2\pi\hbar)^3}\int_{-p_{Fx}}^{p_{Fx}}\int_{-p_{Fy}}^{p_{Fy}}\int_{-p_{Fz}}^{p_{Fz}} dp_x\,dp_y\,dp_z\,\epsilon\!\left(p_x,p_y,p_z\right) \qquad (11.53)

where I changed the notation for the energy to emphasize that momentum is now
not a discrete index but a continuous variable. Since the single-particle energy
ε(p_x, p_y, p_z) depends only upon p², it makes sense to compute the integral in
Eq. 11.53 using the representation of the momentum vector in spherical coordinates.
Replacing the Cartesian volume element dp_x dp_y dp_z with its spherical counterpart
p² sin θ dθ dφ dp, where θ and φ are the polar and azimuthal angles characterizing the
direction of vector p, I can rewrite Eq. 11.53 as

E = \frac{2V}{(2\pi\hbar)^3}\int_0^{p_F}\int_0^{\pi}\int_0^{2\pi} \epsilon(p)\,p^2\sin\theta\,d\theta\,d\varphi\,dp,


where p_F is the magnitude of the momentum corresponding to the Fermi energy ε_F.
I proceed by replacing the integration variable p with another variable ε according to
the relation p = √(2m_e ε):

E = \frac{2V(2m_e)^{3/2}}{2(2\pi\hbar)^3}\,4\pi\int_0^{\epsilon_F}\epsilon\sqrt{\epsilon}\,d\epsilon. \qquad (11.54)

Before computing this integral, let me point out that it can be rewritten in the
following form:

E = \int_0^{\epsilon_F}\epsilon\,g(\epsilon)\,d\epsilon \qquad (11.55)

where I introduced the quantity

g(\epsilon) = \frac{V(2m_e)^{3/2}\sqrt{\epsilon}}{2\pi^2\hbar^3} \qquad (11.56)

called the density of states. This quantity, which gives the number of states per unit
energy interval, is an important characteristic of any many-particle system. Actually,
it is so important that it gives me an incentive to deviate from the original goal of
computing the integral in Eq. 11.54 and spend more time talking about it.

To convince you that g(ε)dε can, indeed, be interpreted as the number of states
with energies within the interval [ε, ε + dε], I will simply compute this quantity
directly using the same state counting technique which I used to find the Fermi
energy. However, this time around I am interested in the number of states within a
spherical shell with inner radius n(ε) and outer radius n(ε + dε):

n(\epsilon + d\epsilon) = n(\epsilon) + \frac{dn}{d\epsilon}\,d\epsilon = \frac{L}{2\pi\hbar}\sqrt{2m_e\epsilon} + \frac{L}{4\pi\hbar}\sqrt{\frac{2m_e}{\epsilon}}\,d\epsilon,

where I used Eq. 11.49 for n(ε). The volume occupied by this shell is

\Delta V = 4\pi n^2\,dn = 4\pi n^2\frac{dn}{d\epsilon}\,d\epsilon = 4\pi\,\frac{L^2\,2m_e\epsilon}{4\pi^2\hbar^2}\,\frac{L}{4\pi\hbar}\sqrt{\frac{2m_e}{\epsilon}}\,d\epsilon = \frac{V(2m_e)^{3/2}\sqrt{\epsilon}}{4\pi^2\hbar^3}\,d\epsilon.

Using again the fact that the volume allocated to a single point in Fig. 11.3 is equal
to one and that there are two single-electron orbitals per point, the total number of
states within this spherical layer is


\frac{V(2m_e)^{3/2}\sqrt{\epsilon}}{2\pi^2\hbar^3}\,d\epsilon,

which according to Eq. 11.56 is exactly g(ε)dε.
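If you like, the shell count can also be checked numerically. The sketch below (Python; the shell radius and thickness are arbitrary illustrative choices of mine) counts lattice points in the shell between n(ε) and n(ε + dε), doubles for spin, and compares with the quasi-continuous estimate 2·4πn²dn used above:

```python
import numpy as np

def shell_count(n0, dn):
    """Count orbitals (2 per lattice point) in the spherical shell
    n0 < |n| <= n0 + dn of the integer lattice."""
    n_max = int(np.ceil(n0 + dn))
    n = np.arange(-n_max, n_max + 1)
    n1, n2, n3 = np.meshgrid(n, n, n, indexing="ij")
    r2 = n1**2 + n2**2 + n3**2
    in_shell = (r2 > n0**2) & (r2 <= (n0 + dn)**2)
    return 2 * int(np.count_nonzero(in_shell))

def shell_estimate(n0, dn):
    """Quasi-continuous estimate 2 * 4*pi*n0^2*dn, the analogue of g(eps)*deps."""
    return 2 * 4.0 * np.pi * n0**2 * dn

n0, dn = 30.0, 1.0
print(shell_count(n0, dn), shell_estimate(n0, dn))
```

Note that for a shell of finite thickness dn, the linearized estimate 4πn²dn is accurate only to relative order dn/n, so the two printed numbers agree only at the few-percent level here; the agreement improves as the shell becomes relatively thinner.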
Now I can go back to Eq. 11.55 and complete the computation of the integral,

which is quite straightforward and yields

E = \frac{V(2m_e)^{3/2}}{2\pi^2\hbar^3}\int_0^{\epsilon_F}\epsilon\sqrt{\epsilon}\,d\epsilon = \frac{V(2m_e)^{3/2}}{5\pi^2\hbar^3}\,\epsilon_F^{5/2} = \frac{3}{5}N\left(\frac{V(2m_e)^{3/2}}{3\pi^2 N\hbar^3}\right)\epsilon_F^{5/2}.

In the last line, I rearranged the expression for the energy to make it clear (with the
help of Eq. 11.51) that the expression in the parentheses is ε_F^{−3/2} and that the ground
state energy of the non-interacting free electron gas can be written down as

E = \frac{3}{5}N\epsilon_F. \qquad (11.57)
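Equation 11.57 can be verified by direct summation over the occupied orbitals. In the dimensionless units of Eq. 11.49, where ε is simply n₁² + n₂² + n₃², the following Python sketch (with an illustrative Fermi radius of my choosing) fills all orbitals up to n_F, takes the largest occupied ε as ε_F, and checks that the energy per electron approaches (3/5)ε_F:

```python
import numpy as np

def ground_state_ratio(n_F):
    """Fill every orbital with n1^2 + n2^2 + n3^2 <= n_F^2 (2 electrons per
    lattice point) and return E / (N * eps_F), using eps = n^2 as the
    dimensionless single-particle energy."""
    n = np.arange(-int(n_F), int(n_F) + 1)
    n1, n2, n3 = np.meshgrid(n, n, n, indexing="ij")
    eps = (n1**2 + n2**2 + n3**2).ravel()
    occupied = eps[eps <= n_F**2]
    N = 2 * occupied.size        # total number of electrons
    E = 2 * occupied.sum()       # total ground-state energy, Eq. 11.52
    eps_F = occupied.max()       # energy of the last occupied orbital
    return E / (N * eps_F)

print(ground_state_ratio(25))  # approaches 3/5 as n_F grows
```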

This expression can also be rewritten in another, no less illuminating, form. Substituting
Eq. 11.51 into Eq. 11.57, I can present the energy E of the gas as a function of
volume:

E = \frac{3\hbar^2\left(3\pi^2\right)^{2/3}}{10\,m_e}\,\frac{N^{5/3}}{V^{2/3}}.

The fact that this energy depends on the volume draws out an important point: if
you try to expand (or contract) the volume occupied by the gas, its energy changes,
which means that someone has to do some work to effect this change. Recalling a
simple formula from an introductory thermodynamics class, dW = PdV, where W is
work and P is the pressure exerted by the gas on the walls of the containing vessel, and
taking into account that for a fixed number of particles the energy E depends only on
the volume V (no temperature), you can relate dW to −dE and determine the pressure
exerted by the non-interacting electrons on the walls of the container as

P = -\frac{dE}{dV} = \frac{\hbar^2\left(3\pi^2\right)^{2/3}}{5m_e}\,\frac{N^{5/3}}{V^{5/3}}.

Thus, even in the ground state (which, by the way, from the thermodynamic point
of view corresponds to zero temperature), an electron gas exerts a pressure on the
surrounding medium which depends on the concentration of the electrons. The
coolest thing about this result is that unlike the case of a classical ideal gas, this
pressure has nothing to do with the thermal motion of the electrons because they
are in the ground state, which is equivalent to their temperature being equal to zero.
This pressure is a purely quantum effect solely due to the indistinguishability of the
electrons and their fermion nature.
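These thermodynamic relations are simple enough to check numerically. The sketch below (Python; I set ℏ = m_e = 1 and pick arbitrary values of N and V, since only the functional dependence on V matters) differentiates E(V) by central finite differences and compares the result with the closed-form pressure, which for E ∝ V^{−2/3} also equals (2/3)E/V:

```python
import numpy as np

def energy(V, N=1.0):
    """Ground-state energy E = c * N^{5/3} / V^{2/3} of the free electron gas,
    with hbar = m_e = 1 so that c = 3*(3*pi^2)^{2/3}/10."""
    c = 3.0 * (3.0 * np.pi**2) ** (2.0 / 3.0) / 10.0
    return c * N ** (5.0 / 3.0) / V ** (2.0 / 3.0)

def pressure(V, N=1.0):
    """Closed form P = (3*pi^2)^{2/3}/5 * N^{5/3}/V^{5/3} (same units)."""
    return (3.0 * np.pi**2) ** (2.0 / 3.0) / 5.0 * N ** (5.0 / 3.0) / V ** (5.0 / 3.0)

V = 2.0
h = 1e-6
P_num = -(energy(V + h) - energy(V - h)) / (2 * h)  # central difference for -dE/dV
print(P_num, pressure(V), 2 * energy(V) / (3 * V))
```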

11.6 Problems

Problems for Sect. 11.1

Problem 140 Consider the following configuration of single-particle orbitals for a
system of four identical fermions:

\left|\alpha_1^{(1)}\right\rangle\left|\alpha_2^{(2)}\right\rangle\left|\alpha_3^{(3)}\right\rangle\left|\alpha_4^{(4)}\right\rangle.

Applying the exchange operator P̂(i, j) to all pairs of particles in this configuration,
generate all possible transpositions of the particles and determine the correct signs
in front of them. Write down the correct antisymmetric four-fermion state involving
these single-particle orbitals.

Problem 141 Consider the system of two bosons that can be in one of four single-
particle orbitals. List all possible two-boson states adhering to the symmetrization
requirements.

Problem 142 Consider two non-interacting electrons in a one-dimensional har-
monic oscillator potential characterized by classical frequency !.

1. Consider single-electron orbitals |α_{n,m_s}⟩ = |n⟩|m_s⟩, where |n⟩ is an eigenvector
of the harmonic oscillator and |m_s⟩ is a spinor describing one of the two possible
eigenvectors of the operator Ŝ_z. Using the orbitals |α_{n,m_s}⟩, write down the Slater
determinant for the two-electron ground state, and find the corresponding ground state
energy.

2. Do the same for the first excited state(s) of this system.
3. Write the two-electron states found in Parts I and II in the position-spinor
representation.
4. Now use the eigenvectors of the total spin of the two particles to construct the
two-particle ground and first excited states. Find the relations between the two-particle
states found here and those found in Parts I and II.

5. Compute the expectation value ⟨(z₁ − z₂)²⟩, where z_{1,2} are the coordinates of the two
electrons in the states determined above.

Problem 143 Repeat Problem 142 for two non-interacting bosons.


Problems for Sect. 11.3

Problem 144 Consider an atom of nitrogen, which has three electrons in l D 1
states.

1. Using single-particle orbitals with l D 1 and different values of orbital and
spin magnetic numbers, construct all possible Slater determinants representing
possible three-electron states.

2. Applying the operator Ŝ_z^{(1)} + Ŝ_z^{(2)} + Ŝ_z^{(3)} to all found three-particle states, figure out
the possible values of the total spin in these states.

Problems for Sect. 11.4

Problem 145 Consider two identical non-interacting particles both in the ground
states of their respective harmonic oscillator potentials. Particle 1 is in the potential
V₁ = ½mω²x₁², while particle 2 is in the potential V₂ = ½mω²(x₂ − d)².
1. Assuming that particles are spin 1=2 fermions in a singlet spin state, write down

the orbital portion of the two-particle state and compute the expectation value of
the two-particle Hamiltonian

\hat{H} = \frac{\hat{p}_1^2}{2m_e} + \frac{\hat{p}_2^2}{2m_e} + \frac{1}{2}m\omega^2 x_1^2 + \frac{1}{2}m\omega^2\left(x_2 - d\right)^2
in this state.
2. Repeat the calculations assuming that the particles are in the state with total spin
S = 1.
3. The energy you found in Parts I and II depends upon the distance d between
the equilibrium points of the potentials. Classically such a dependence would
mean that there is a force associated with this energy, describing a repulsive or
attractive interaction between the two particles. In the case under consideration,
there is no real interaction, and what you have found is a purely quantum effect
due to the symmetry requirements on the two-particle states. Still, you can describe
the result in terms of an effective "force" of interaction between the particles.
Find this force for both singlet and triplet spin states, and specify its character
(attractive or repulsive).

Problem 146 Consider two electrons confined in a one-dimensional infinite
potential well of width d and interacting with each other via the potential V_int =
−E₀(z₁ − z₂)², where E₀ is a real positive constant and z_{1,2} are the coordinates of the
electrons.

1. Construct the ground state two-electron wave function assuming that electrons
are (a) in a singlet spin state and (b) in a triplet spin state.


2. Compute the expectation value of the interaction potential in each of these states.
3. With the interaction term included, which spin configuration would have the smaller
energy?

Problem for Sect. 11.5

Problem 147 Consider an ideal gas of N electrons confined in a three-dimensional
harmonic oscillator potential:

\hat{V} = \frac{1}{2}m_e\omega^2\left(x^2 + y^2 + z^2\right) \equiv \frac{1}{2}m_e\omega^2 r^2.

Find the Fermi energy of this system and the total energy of the many-electron
ground state. Hint: The degeneracy degrees of the single-particle energy levels
in this case can be easily found analytically, so no transition to quasi-continuous
spectrum and from summation to integration is necessary.

Part III
Quantum Phenomena and Methods

In this part of the book, I will introduce you to the wonderful world of actual
experimentally observable quantum mechanical phenomena. Theoretical description
of each of these phenomena will require developing special technical methods,
which I will present as we go along. So, let the journey begin.

Chapter 12
Resonant Tunneling

12.1 Transfer-Matrix Approach in One-Dimensional
Quantum Mechanics

12.1.1 Transfer Matrix: General Formulation

In Sects. 6.2 and 6.3 of Chap. 6, I introduced one-dimensional quantum mechanical
models in which the potential energy of a particle was described by the simplest
piecewise constant function (or its extreme case, a delta function), defining a
single potential well or barrier. A natural extension of this model is a potential
energy profile corresponding to several wells and/or barriers (or several delta-
functions). In principle, one can approach the multi-barrier problem in the same
way as a single well/barrier situation: divide the entire range of the coordinate into
regions of constant potential energy, and use the continuity conditions for the wave
function and its derivative to “stitch” the solutions from different regions. However,
it is easier said than done. Each new discontinuity point adds two new unknown
coefficients and correspondingly two equations. If in the case of a single barrier you
had to deal with the system of four equations, a dual-barrier problem would require
solving the system of eight equations, and soon even writing those equations down
becomes a serious burden, and I do not even want to think about having to solve
them.

Luckily, there is a better way of dealing with the ever-increasing number of
the boundary conditions in problems with multiple jumps of the potential energy.
In this section I will show you a convenient method of arranging the unknown
amplitudes of the wave functions and relating them to each other across the point of
the discontinuity.

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,
https://doi.org/10.1007/978-3-319-71550-6_12



Let’s move forward by going back to the simplest problem of a step potential
with a single discontinuity:

V(z) = \begin{cases} V_0 & z < 0 \\ V_1 & z > 0, \end{cases} \qquad (12.1)

where I assigned the coordinate z = 0 (the origin of the coordinate axes) to the
point where the potential makes its jump. If I were to ask a good diligent student
of quantum mechanics to write down the wave function of a particle with energy
E exceeding both V₀ and V₁, I would most likely have been presented with the
following expression:

\psi(z) = \begin{cases} A_0\exp(ik_0 z) + B_0\exp(-ik_0 z) & z < 0 \\ A_1\exp(ik_1 z) & z > 0, \end{cases} \qquad (12.2)

where

k_0 = \frac{\sqrt{2m_e(E - V_0)}}{\hbar},\qquad k_1 = \frac{\sqrt{2m_e(E - V_1)}}{\hbar}.

This wave function would have been perfectly fine if all I were after was
just the single step-potential problem. In this section, however, I have further-reaching
goals, so I need to generalize this expression, allowing for the possibility
to have a wave function component corresponding to the particles propagating in
the negative z direction for z > 0 as well as for z < 0. If you wonder where these
backward propagating particles could come from, just imagine that there might be
another discontinuity in the potential somewhere down the line, at a positive value
of z, which would create a flux of reflected particles propagating in the negative z
direction at z > 0. To take this possibility into account, I will replace Eq. 12.2 with
a wave function of a more general form

\psi(z) = \begin{cases} A_0\exp(ik_0 z) + B_0\exp(-ik_0 z) & z < 0 \\ A_1\exp(ik_1 z) + B_1\exp(-ik_1 z) & z > 0. \end{cases} \qquad (12.3)

The continuity of the wave function and its derivative at z = 0 then yields

A_0 + B_0 = A_1 + B_1 \qquad (12.4)
k_0\left(A_0 - B_0\right) = k_1\left(A_1 - B_1\right). \qquad (12.5)


Quite similarly to what I already did in Sect. 6.2.1, I can rewrite these equations as

A_1 = \frac{1}{2}\left(1 + \frac{k_0}{k_1}\right)A_0 + \frac{1}{2}\left(1 - \frac{k_0}{k_1}\right)B_0 \qquad (12.6)

B_1 = \frac{1}{2}\left(1 - \frac{k_0}{k_1}\right)A_0 + \frac{1}{2}\left(1 + \frac{k_0}{k_1}\right)B_0. \qquad (12.7)

However, for the next step, I have prepared for you something new. After spending some
time staring at these two equations, you might divine that they can be presented in a
matrix form if the amplitudes A_{1,0} and B_{1,0} are arranged into two-dimensional column
vectors, while the coefficients in front of A₀ and B₀ are arranged into a 2 × 2 matrix:

\begin{bmatrix} A_1 \\ B_1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\left(1 + \frac{k_0}{k_1}\right) & \frac{1}{2}\left(1 - \frac{k_0}{k_1}\right) \\ \frac{1}{2}\left(1 - \frac{k_0}{k_1}\right) & \frac{1}{2}\left(1 + \frac{k_0}{k_1}\right) \end{bmatrix}\begin{bmatrix} A_0 \\ B_0 \end{bmatrix}. \qquad (12.8)

Go ahead, perform the matrix multiplication in Eq. 12.8, and convince yourself that the
result is, indeed, the system of Eqs. 12.6 and 12.7. If we agree to always use the
amplitude of the forward propagating component of the wave function (whatever
appears in front of exp(ik_i z)) as the first element of the two-dimensional column
and the amplitude of the backward propagating component (the one appearing in
front of exp(−ik_i z)) as the second element, I can introduce the notation v_{0,1} for the
respective columns and D^{(1,0)} for the matrix

D^{(1,0)} = \begin{bmatrix} \frac{k_1 + k_0}{2k_1} & \frac{k_1 - k_0}{2k_1} \\ \frac{k_1 - k_0}{2k_1} & \frac{k_1 + k_0}{2k_1} \end{bmatrix}, \qquad (12.9)

and rewrite Eq. 12.8 as a compact matrix equation:

v_1 = D^{(1,0)} v_0. \qquad (12.10)

Upper indexes in the notation for this matrix are supposed to be read from right to
left and symbolize a transition across a boundary between potentials V0 and V1.
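If you would like to see Eq. 12.10 at work numerically, here is a minimal Python sketch (the wave numbers and amplitudes are arbitrary illustrative values of my choosing) that builds the matrix of Eq. 12.9 and verifies that the amplitudes it produces satisfy the continuity conditions, Eqs. 12.4 and 12.5:

```python
import numpy as np

def interface_matrix(k_from, k_to):
    """Interface matrix D of Eq. 12.9: relates (A, B) on the left of a potential
    step to (A, B) on the right; complex dtype so imaginary k works too."""
    return np.array([[k_to + k_from, k_to - k_from],
                     [k_to - k_from, k_to + k_from]], dtype=complex) / (2 * k_to)

k0, k1 = 1.3, 0.7          # arbitrary wave numbers on the two sides of the step
v0 = np.array([1.0, 0.4])  # arbitrary amplitudes (A0, B0)
A1, B1 = interface_matrix(k0, k1) @ v0

# Continuity of psi and psi' at z = 0 (Eqs. 12.4 and 12.5):
print(abs((v0[0] + v0[1]) - (A1 + B1)))
print(abs(k0 * (v0[0] - v0[1]) - k1 * (A1 - B1)))
```

Both printed residuals vanish to machine precision, confirming that the matrix encodes exactly the stitching conditions.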

I will not be surprised if at this point you feel a bit disappointed, thinking:
"so what, dude? This is just a fancy way of presenting what we already know."
But be patient: patience is a virtue and is usually rewarded. The real utility of
the matrix notation becomes apparent only when you have to deal with potentials
featuring multiple discontinuities. So, let's get to it and assume that at some point
with coordinate z = z₁, the potential experiences another jump, changing abruptly
from V₁ to V₂. If asked to write the expression for the wave function in the region
between z = 0 and z = z₁ and for z > z₁, you would probably have written

\psi(z) = \begin{cases} A_1\exp(ik_1 z) + B_1\exp(-ik_1 z) & 0 < z < z_1 \\ A_2\exp(ik_2 z) + B_2\exp(-ik_2 z) & z > z_1, \end{cases} \qquad (12.11)


which is, of course, a perfectly reasonable and correct expression. However, if you
tried to write down the continuity equations at z = z₁ using this wave function and
present them in a matrix form, you would have ended up with a matrix containing
exponential factors like exp(±ik_{1,2}z₁), which would not look at all like the simple
matrix D^{(1,0)} from Eq. 12.9. I can try to make the situation more attractive by
rewriting the expression for the wave function in a form in which the arguments of
the exponential functions vanish at z = z₁:

\psi(z) = \begin{cases} A_1^{(L)}\exp\left[ik_1(z - z_1)\right] + B_1^{(L)}\exp\left[-ik_1(z - z_1)\right] & 0 < z < z_1 \\ A_1^{(R)}\exp\left[ik_2(z - z_1)\right] + B_1^{(R)}\exp\left[-ik_2(z - z_1)\right] & z > z_1. \end{cases} \qquad (12.12)

This amounts to redefining the amplitudes appearing in front of the respective exponents,
as you will see for yourselves when doing Problem 2 in the exercise section for
this chapter. Please note the change in notation: instead of distinguishing the
amplitudes by their lower indexes (1 or 2), I introduced upper indexes L and R,
indicating that these coefficients describe the wave function immediately to the left
or to the right of the discontinuity point, correspondingly. At the same time, the
lower indexes of all coefficients are now set to 1, implying that we are dealing
with the discontinuity at the point z = z₁. In terms of these new coefficients, the
stitching conditions take the form

A_1^{(R)} = \frac{1}{2}\left(1 + \frac{k_1}{k_2}\right)A_1^{(L)} + \frac{1}{2}\left(1 - \frac{k_1}{k_2}\right)B_1^{(L)} \qquad (12.13)

B_1^{(R)} = \frac{1}{2}\left(1 - \frac{k_1}{k_2}\right)A_1^{(L)} + \frac{1}{2}\left(1 + \frac{k_1}{k_2}\right)B_1^{(L)}, \qquad (12.14)

which, with the obvious substitutions k₀ → k₁ and k₁ → k₂, become identical to
Eqs. 12.6 and 12.7. These equations can again be written in the matrix form

v_1^{(R)} = D^{(2,1)} v_1^{(L)}, \qquad (12.15)

where v_1^{(R)} is formed by the coefficients A_1^{(R)} and B_1^{(R)}, v_1^{(L)} by the coefficients A_1^{(L)} and
B_1^{(L)}, while the matrix D^{(2,1)} is defined as

D^{(2,1)} = \begin{bmatrix} \frac{k_2 + k_1}{2k_2} & \frac{k_2 - k_1}{2k_2} \\ \frac{k_2 - k_1}{2k_2} & \frac{k_2 + k_1}{2k_2} \end{bmatrix}. \qquad (12.16)

You might have noticed by now the common features of the matrices D^{(1,0)} and
D^{(2,1)}: (a) they both describe the transition across a boundary between two values
of the potential (V₀ to V₁ for the former and V₁ to V₂ for the latter); (b) they both
connect the pairs of coefficients characterizing the wave function immediately on
the left of the potential jump with those specifying the wave function immediately
on the right of the jump; and, finally, (c) they have a similar structure, recognizing
which enables you to write down a matrix connecting the wave function amplitudes
across a generic discontinuity as

D^{(i+1,i)} = \begin{bmatrix} \frac{k_{i+1} + k_i}{2k_{i+1}} & \frac{k_{i+1} - k_i}{2k_{i+1}} \\ \frac{k_{i+1} - k_i}{2k_{i+1}} & \frac{k_{i+1} + k_i}{2k_{i+1}} \end{bmatrix}, \qquad (12.17)

where

k_i = \frac{\sqrt{2m_e(E - V_i)}}{\hbar}

is determined by the potential to the left of the discontinuity and

k_{i+1} = \frac{\sqrt{2m_e(E - V_{i+1})}}{\hbar}

corresponds to the value of the potential to the right of it. It is also not too difficult
to rewrite Eqs. 12.3 and 12.12 for a situation in which a jump of the potential occurs at an
arbitrary point z = z_i:

\psi(z) = \begin{cases} A_i^{(L)}\exp\left[ik_i(z - z_i)\right] + B_i^{(L)}\exp\left[-ik_i(z - z_i)\right] & z_{i-1} < z < z_i \\ A_i^{(R)}\exp\left[ik_{i+1}(z - z_i)\right] + B_i^{(R)}\exp\left[-ik_{i+1}(z - z_i)\right] & z > z_i. \end{cases} \qquad (12.18)

Correspondingly, Eq. 12.15 becomes

v_i^{(R)} = D^{(i+1,i)} v_i^{(L)}, \qquad (12.19)

where v_i^{(R)} contains A_i^{(R)} and B_i^{(R)}, while v_i^{(L)} contains A_i^{(L)} and B_i^{(L)}.

I hope that by now I have managed to convince you that using the suggested
matrix notations does have its benefits, but I also suspect that some of you might
become somewhat skeptical about the generality of this approach. You might
be thinking that all these formulas that I so confidently presented here can only
be valid for energies exceeding all potentials Vi and that this fact strongly limits the
utility of the method. If this did occur to you, accept my commendation for paying
attention, but reality is not as bad as it appears. Let’s see what happens if one of
Vi turns out to be larger than E. Obviously, in this case the respective ki becomes
imaginary and can be written as

k_i = \frac{\sqrt{2m_e(E - V_i)}}{\hbar} = i\,\frac{\sqrt{2m_e(V_i - E)}}{\hbar} \equiv i\kappa_i, \qquad (12.20)

where I introduced a new real-valued parameter

\kappa_i = \frac{\sqrt{2m_e(V_i - E)}}{\hbar}. \qquad (12.21)

The corresponding wave function at z < z_i becomes

\psi(z) = A_i^{(L)}\exp\left[-\kappa_i(z - z_i)\right] + B_i^{(L)}\exp\left[\kappa_i(z - z_i)\right].

The continuity condition for the wave function at z = z_i remains the same as Eq. 12.4:

A_i^{(L)} + B_i^{(L)} = A_i^{(R)} + B_i^{(R)},

while the continuity of the derivative of the wave function yields, instead of
Eq. 12.5,

-\kappa_i\left(A_i^{(L)} - B_i^{(L)}\right) = ik_{i+1}\left(A_i^{(R)} - B_i^{(R)}\right),

where I assumed for the sake of argument that E > V_{i+1}. Combining these two
equations, I get, instead of Eqs. 12.6 and 12.7,

A_i^{(R)} = \frac{1}{2}A_i^{(L)}\left(1 - \frac{\kappa_i}{ik_{i+1}}\right) + \frac{1}{2}B_i^{(L)}\left(1 + \frac{\kappa_i}{ik_{i+1}}\right)

B_i^{(R)} = \frac{1}{2}A_i^{(L)}\left(1 + \frac{\kappa_i}{ik_{i+1}}\right) + \frac{1}{2}B_i^{(L)}\left(1 - \frac{\kappa_i}{ik_{i+1}}\right),

which can again be written in the form of Eq. 12.19 with a new matrix

\tilde{D}^{(i+1,i)} = \begin{bmatrix} \frac{ik_{i+1} - \kappa_i}{2ik_{i+1}} & \frac{ik_{i+1} + \kappa_i}{2ik_{i+1}} \\ \frac{ik_{i+1} + \kappa_i}{2ik_{i+1}} & \frac{ik_{i+1} - \kappa_i}{2ik_{i+1}} \end{bmatrix} = \begin{bmatrix} \frac{k_{i+1} + i\kappa_i}{2k_{i+1}} & \frac{k_{i+1} - i\kappa_i}{2k_{i+1}} \\ \frac{k_{i+1} - i\kappa_i}{2k_{i+1}} & \frac{k_{i+1} + i\kappa_i}{2k_{i+1}} \end{bmatrix}.

Comparing D̃^{(i+1,i)} with D^{(i+1,i)} in Eq. 12.17, you can immediately see that the latter
can be obtained from the former with the simple substitution defined in Eq. 12.20.
Therefore, you do not really have to worry about the relation between the energy and
the respective value of the potential: Eq. 12.17 works in all cases, and if k_i turns out
to be imaginary, you just need to replace it with iκ_i as prescribed by Eq. 12.20 (or
just let the computer do it for you). Consequently, the matrix D̃^{(i+1,i)} turns out to be
perfectly unnecessary and will not be used any more, but there is one circumstance
to which you must pay close attention. The special significance of Eq. 12.20 is that
when k_i turns imaginary, the equation forces it to have a positive imaginary part (the square root
obviously allows for either a positive or a negative sign). As a result, the exponent
multiplying the amplitude designated as A_i becomes real and negative, −κ_i(z − z_i), while
the exponent multiplying the amplitude B_i becomes real and positive, κ_i(z − z_i). You
need to take this into account when designating the corresponding amplitudes as A or B
and placing them in the first or the second row of your column vector. (Obviously,
it is not the actual symbols used to designate the amplitudes that are important, but
their places in the column vector.)
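This claim is a one-liner to verify numerically. The sketch below (Python, with arbitrary illustrative values of κ_i and k_{i+1}) substitutes k_i = iκ_i into the generic interface matrix of Eq. 12.17 and compares the result entry by entry with the directly derived matrix D̃^{(i+1,i)}:

```python
import numpy as np

def interface_matrix(k_from, k_to):
    """Generic interface matrix of Eq. 12.17 (complex k allowed)."""
    return np.array([[k_to + k_from, k_to - k_from],
                     [k_to - k_from, k_to + k_from]], dtype=complex) / (2 * k_to)

kappa_i, k_next = 0.9, 1.4  # arbitrary decay constant and wave number

# Eq. 12.17 with the substitution k_i -> i*kappa_i of Eq. 12.20:
D_sub = interface_matrix(1j * kappa_i, k_next)

# Matrix derived directly from the continuity conditions under the barrier:
D_tilde = np.array([[k_next + 1j * kappa_i, k_next - 1j * kappa_i],
                    [k_next - 1j * kappa_i, k_next + 1j * kappa_i]],
                   dtype=complex) / (2 * k_next)

print(np.max(np.abs(D_sub - D_tilde)))  # the two matrices coincide
```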

I hope that your head is not spinning yet, but as a prophylactic measure, let me
summarize what we have achieved so far. We are considering a particle moving in
a piecewise constant potential which has interruptions of continuity at a number
of points with coordinates z = z_i (the first discontinuity occurs at z₀ = 0).
When crossing z_i, the potential jumps from V_i to V_{i+1}. In the vicinity of each
discontinuity point, the wave function is presented by Eq. 12.18, organized in such
a way that the coefficients with upper index L determine the amplitudes of the right- and
left-propagating components of the wave function on the left of the discontinuity
and the coefficients with upper index R determine the same amplitudes on the right of
z_i. The connection between these pairs of coefficients is described by the matrix
equation Eq. 12.19.

To help you get a better feeling for why this matrix representation is useful, let
me put together the matrix equations for a few successive discontinuity points:

v_2^{(R)} = D^{(3,2)} v_2^{(L)};\quad v_1^{(R)} = D^{(2,1)} v_1^{(L)};\quad v_0^{(R)} = D^{(1,0)} v_0^{(L)}. \qquad (12.22)

The structure of these equations indicates that it might be possible to relate the column
vector v_2^{(R)} to v_0^{(L)} by consecutive matrix multiplication if we had matrices relating
v_2^{(L)} to v_1^{(R)}, v_1^{(L)} to v_0^{(R)}, and, in general, v_i^{(L)} to v_{i-1}^{(R)}. To find these matrices, I have
to take you back to Eq. 12.18, where you shall notice that the pairs of coefficients
A_{i-1}^{(R)}, B_{i-1}^{(R)} and A_i^{(L)}, B_i^{(L)} describe the wave function defined on the same interval
z_{i-1} < z < z_i. Accordingly, the following must be true:

A_i^{(L)}\exp\left[ik_i(z - z_i)\right] + B_i^{(L)}\exp\left[-ik_i(z - z_i)\right] =
A_{i-1}^{(R)}\exp\left[ik_i(z - z_{i-1})\right] + B_{i-1}^{(R)}\exp\left[-ik_i(z - z_{i-1})\right],

which is satisfied if

A_i^{(L)}\exp\left[ik_i(z - z_i)\right] = A_{i-1}^{(R)}\exp\left[ik_i(z - z_{i-1})\right]

and

B_i^{(L)}\exp\left[-ik_i(z - z_i)\right] = B_{i-1}^{(R)}\exp\left[-ik_i(z - z_{i-1})\right].

Canceling the common factor \exp(\pm ik_i z), you find

A_i^{(L)} = \exp\left[ik_i(z_i - z_{i-1})\right]A_{i-1}^{(R)} \qquad (12.23)
B_i^{(L)} = \exp\left[-ik_i(z_i - z_{i-1})\right]B_{i-1}^{(R)}, \qquad (12.24)


which can be presented in the matrix form as

\begin{bmatrix} A_i^{(L)} \\ B_i^{(L)} \end{bmatrix} = \begin{bmatrix} \exp\left[ik_i(z_i - z_{i-1})\right] & 0 \\ 0 & \exp\left[-ik_i(z_i - z_{i-1})\right] \end{bmatrix}\begin{bmatrix} A_{i-1}^{(R)} \\ B_{i-1}^{(R)} \end{bmatrix}. \qquad (12.25)
: (12.25)

Introducing the diagonal matrix

M^{(i)} = \begin{bmatrix} \exp\left[ik_i(z_i - z_{i-1})\right] & 0 \\ 0 & \exp\left[-ik_i(z_i - z_{i-1})\right] \end{bmatrix}, \qquad (12.26)

I can give Eq. 12.25 the form

v_i^{(L)} = M^{(i)} v_{i-1}^{(R)}, \qquad (12.27)

which you can recognize as the missing relation between v_i^{(L)} and v_{i-1}^{(R)}. Note that
the upper index in M^{(i)} signifies that it corresponds to the region of coordinates
z_{i-1} < z < z_i, where the potential is equal to V_i. It is important to note that Eq. 12.26
can be used even if k_i turns out to be imaginary. All you will need to do in this case
is to replace k_i with iκ_i according to Eqs. 12.20 and 12.21. Now, complementing
Eq. 12.22 with the missing links, you get

v_2^{(R)} = D^{(3,2)} v_2^{(L)};\quad v_2^{(L)} = M^{(2)} v_1^{(R)};\quad v_1^{(R)} = D^{(2,1)} v_1^{(L)}; \qquad (12.28)
v_1^{(L)} = M^{(1)} v_0^{(R)};\quad v_0^{(R)} = D^{(1,0)} v_0^{(L)},

which, after combining all successive matrix relations, yields

v_2^{(R)} = D^{(3,2)} M^{(2)} D^{(2,1)} M^{(1)} D^{(1,0)}\,v_0^{(L)}. \qquad (12.29)

This result illuminates the power of the method presented here: the amplitudes
of the wave function after the particle has encountered three discontinuity
points of the potential are expressed in terms of the amplitudes specifying the wave
function in the region before the first discontinuity via a simple matrix relation,
v_2^{(R)} = T^{(3)} v_0^{(L)}, where the matrix T^{(3)}, called the transfer matrix, is the product of five
matrices of two different kinds:

T^{(3)} = D^{(3,2)} M^{(2)} D^{(2,1)} M^{(1)} D^{(1,0)}.

Matrices D^{(i+1,i)} can be called interface matrices, as they describe the transformation of
the wave function amplitudes due to the crossing of an interface between two distinct
values of the potential, and you can use the name "free propagation matrices"
for M^{(i)}, because they describe the evolution of the wave function due to free
propagation of the particle between two discontinuities. Equation 12.29 has a
simple physical interpretation if you read it from right to left: a particle begins


Fig. 12.1 A potential profile
corresponding to Eq. 12.29

with a wave function characterized by the column vector v₀. It encounters the first
discontinuity at z = z₀ = 0, and upon crossing it the wave function coefficients undergo
the transformation prescribed by the matrix D^{(1,0)}. After that the wave function evolves
as it would for a free particle in the potential V₁; this evolution is described by the
propagation matrix M^{(1)}. The crossing of the boundary between the V₁ and V₂ regions is
represented by the interface matrix D^{(2,1)}, and so on and so forth. One of the possible
potential profiles that could be described by Eq. 12.29 is shown in Fig. 12.1.

Equation 12.29 is trivially generalized to the case of an arbitrary number N of
discontinuities, located at points z_i, i = 0, 1, 2, \ldots, N-1, with z_0 = 0:

v_{N-1}^{(R)} = T^{(N)} v_0^{(L)}   (12.30)

with the corresponding transfer matrix defined as

T^{(N)} = D^{(N,N-1)} M^{(N-1)} \cdots D^{(2,1)} M^{(1)} D^{(1,0)}.   (12.31)

Once the transfer matrix is known, you can use it to obtain all the information about
wave functions (and corresponding energy eigenvalues when appropriate) of the
particle in the corresponding potential both in the continuous and discrete segments
of the energy spectrum. The next section in this chapter discusses how this can be
done.
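The chain of matrix multiplications in Eq. 12.31 is straightforward to put into code. Below is a minimal numerical sketch of my own (not from the book), in units where \hbar = m = 1 and with the interface matrices taken in the form that appears in the first line of Eq. 12.43; cmath.sqrt automatically produces an imaginary wave number in classically forbidden regions:

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def wavenumber(E, V):
    """k = sqrt(2m(E - V))/hbar with hbar = m = 1; imaginary (k = i*kappa) if E < V."""
    return cmath.sqrt(2.0 * (E - V))

def interface(k_next, k_prev):
    """Interface matrix D^{(i+1,i)} in the form used in Eq. 12.43."""
    return [[(k_next + k_prev) / (2 * k_next), (k_next - k_prev) / (2 * k_next)],
            [(k_next - k_prev) / (2 * k_next), (k_next + k_prev) / (2 * k_next)]]

def propagation(k, width):
    """Free-propagation matrix M^{(i)} = diag(exp(ik w), exp(-ik w))."""
    return [[cmath.exp(1j * k * width), 0.0], [0.0, cmath.exp(-1j * k * width)]]

def transfer_matrix(E, potentials, z):
    """T^{(N)} = D^{(N,N-1)} M^{(N-1)} ... M^{(1)} D^{(1,0)} (Eq. 12.31).

    potentials = [V_0, ..., V_N] for the N+1 regions; z = [z_0, ..., z_{N-1}]."""
    k = [wavenumber(E, V) for V in potentials]
    T = interface(k[1], k[0])                              # D^{(1,0)}
    for i in range(1, len(z)):
        T = matmul(propagation(k[i], z[i] - z[i - 1]), T)  # M^{(i)}
        T = matmul(interface(k[i + 1], k[i]), T)           # D^{(i+1,i)}
    return T
```

As a sanity check, the determinant of the assembled product should reduce to k_0/k_N, in agreement with Eq. 12.38 derived below.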

12.1.2 Application of Transfer-Matrix Formalism to Generic
Scattering and Bound State Problems

12.1.2.1 Generic Scattering Problem via the Transfer Matrix

Having defined a generic transfer matrix T^{(N)}, I can now solve a typical scattering
problem similar to the one discussed in Sect. 6.2.1. Setting it up amounts to
specifying the wave function of the particle at z < 0 (before the particle encounters


the first break of the continuity) and at z > z_{N-1} (after the particle passes through
the last discontinuity point). The scattering wave function introduced in Sect. 6.2.1,

\psi(z) = \begin{cases} \exp(ik_0 z) + r\exp(-ik_0 z), & z < 0 \\ t\exp(ik_N z), & z > z_{N-1}, \end{cases}   (12.32)

is in the transfer-matrix formalism described by the column vectors v_0 and v_{N-1}:

v_0^{(L)} = \begin{bmatrix} 1 \\ r \end{bmatrix};\quad v_{N-1}^{(R)} = \begin{bmatrix} t \\ 0 \end{bmatrix}.   (12.33)

Presenting the generic T-matrix by its (presumably known) elements

T^{(N)} = \begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix},

I can rewrite Eq. 12.30 in the expanded form as

\begin{bmatrix} t \\ 0 \end{bmatrix} = \begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix} \begin{bmatrix} 1 \\ r \end{bmatrix}.

This translates into the system of linear equations:

t = t_{11} + r t_{12}
0 = t_{21} + r t_{22}.

From the second of these equations, I immediately have

r = -\frac{t_{21}}{t_{22}},   (12.34)

and substituting this result into the first one, I find

t = t_{11} - \frac{t_{12} t_{21}}{t_{22}} = \frac{\det\left(T^{(N)}\right)}{t_{22}}.   (12.35)

Here \det\left(T^{(N)}\right) \equiv t_{11} t_{22} - t_{12} t_{21} is the determinant of the T-matrix T^{(N)}, which,
believe it or not, can actually be quite easily computed for the most general transfer
matrix defined in Eq. 12.31.
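In code, Eqs. 12.34 and 12.35 amount to two lines. A small sketch of my own, with the transfer matrix supplied as a nested list of its (presumably known) elements:

```python
def scattering_amplitudes(T):
    """Reflection and transmission amplitudes from a 2x2 transfer matrix.

    r = -t21/t22 (Eq. 12.34) and t = det(T)/t22 (Eq. 12.35)."""
    (t11, t12), (t21, t22) = T
    det = t11 * t22 - t12 * t21
    return -t21 / t22, det / t22
```

For a region with no potential steps at all, T is the identity matrix and the function returns r = 0 and t = 1, as it must.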

To do so you must, first, recall that the determinant of the product of the matrices
is equal to the product of the determinants of the individual factors:


\det\left(T^{(N)}\right) = \det\left(D^{(N,N-1)}\right)\det\left(M^{(N-1)}\right) \cdots \det\left(D^{(2,1)}\right)\det\left(M^{(1)}\right)\det\left(D^{(1,0)}\right).   (12.36)

It is easy to see that \det\left(M^{(i)}\right) = 1 for any i, so all these factors can be omitted
from Eq. 12.36, yielding

\det\left(T^{(N)}\right) = \det\left(D^{(N,N-1)}\right)\det\left(D^{(N-1,N-2)}\right) \cdots \det\left(D^{(2,1)}\right)\det\left(D^{(1,0)}\right).   (12.37)

Now all I need is to compute the determinant of the generic matrix D^{(i+1,i)}. Using
Eq. 12.17, I find

\det\left(D^{(i+1,i)}\right) = \left(\frac{k_{i+1} + k_i}{2k_{i+1}}\right)^2 - \left(\frac{k_{i+1} - k_i}{2k_{i+1}}\right)^2 = \frac{k_i}{k_{i+1}},

which leads to the following expression for \det\left(T^{(N)}\right):

\det\left(T^{(N)}\right) = \frac{k_{N-1}}{k_N}\,\frac{k_{N-2}}{k_{N-1}} \cdots \frac{k_1}{k_2}\,\frac{k_0}{k_1} = \frac{k_0}{k_N}.   (12.38)

Isn’t it amazing how all ki in the intermediate regions got canceled, so that the
determinant depends only upon the wave numbers (real or imaginary) in the first
and the last region of the constant potential. Using this result in Eq. 12.35, I can find
a simplified expression for the transmission amplitude

t D k0
kN

1

t22
(12.39)

which becomes even simpler if the potential for z < 0 and for z > zN�1 is the
same. In this case the determinant of the transfer matrix becomes equal to unity
and t D 1=t22. Having found r and t, I can restore the wave function in the entire
range of the coordinate by consequently applying interface and propagation matrices
constituting the total transfer matrix T.N/.

With the help of Eq. 6.53 from Sect. 6.2.1, I can also find the corresponding reflection
and transmission probabilities:

R = |r|^2 = \left|\frac{t_{21}}{t_{22}}\right|^2

T = \frac{k_N^2}{k_0^2}\,|t|^2 = \left|\frac{1}{t_{22}}\right|^2,

where I used Eq. 12.39 for t. Since the reflection and transmission probabilities must
obey the condition R + T = 1, it imposes the following general condition on the
elements of the transfer matrix:

|t_{22}|^2 - |t_{12}|^2 = 1.

Fig. 12.2 An example of a potential with discrete spectrum

12.1.2.2 Finding Bound States with the Transfer Matrix

Now let me show how the transfer-matrix method can be used to find the energies of the
bound states, if they are allowed by the potential. Consider, for instance, the potential
shown in Fig. 12.2. After eyeballing this figure for a few moments and recalling
that discrete energy levels correspond to classically bound motion, you should be
able to conclude that states with energies in the interval V_3 < E < V_4 must
belong to the discrete spectrum. An important general point to make here is that
the discrete spectrum in such a Hamiltonian exists at energies which are smaller
than the limiting values of the potential at z \to \pm\infty and larger than the potential's
smallest value, provided that these conditions are not self-contradictory. For such
values of energy, the solutions of the Schrödinger equation for z < 0 and z > z_{N-1}
(classically forbidden regions) take the form of real exponential functions, so that
instead of Eq. 12.32, I have

\psi(z) = \begin{cases} B_0 \exp(\kappa_0 z), & z < 0 \\ A_N \exp(-\kappa_N z), & z > z_{N-1}, \end{cases}   (12.40)

where

\kappa_0 = \frac{\sqrt{2m(V_0 - E)}}{\hbar},\qquad \kappa_N = \frac{\sqrt{2m(V_N - E)}}{\hbar}.


Before continuing I have to reiterate a point that I already made earlier in this
section. Equation 12.40 was obtained by making the transition from parameters k_0 and
k_N, which become imaginary for the chosen values of energy, to real parameters
\kappa_0 and \kappa_N with the help of Eq. 12.20. This procedure turns the exponential functions
\exp(\pm ikz) into \exp(\mp\kappa z). Accordingly, in order to preserve the structure of my
transfer matrices, I have to designate the amplitude coefficients in front of \exp(\kappa_i z) as
B_i and the coefficients in front of \exp(-\kappa_i z) as A_i. Finally, I feel obliged to remind
you that I discarded exponentially growing terms in Eq. 12.40 in order to preserve
normalizability of the wave function. Thus, now, the initial vectors, instead
of Eq. 12.33, take the form

v_0^{(L)} = \begin{bmatrix} 0 \\ B_0 \end{bmatrix};\quad v_{N-1}^{(R)} = \begin{bmatrix} A_N \\ 0 \end{bmatrix}.

The resulting transfer-matrix equation in this case becomes

\begin{bmatrix} A_N \\ 0 \end{bmatrix} = \begin{bmatrix} t_{11} & t_{12} \\ t_{21} & t_{22} \end{bmatrix} \begin{bmatrix} 0 \\ B_0 \end{bmatrix},

which yields

A_N = t_{12} B_0
0 = t_{22} B_0.

The last of these equations produces an equation for the allowed energy values, since
it can only be fulfilled for nonvanishing B_0 if

t_{22}(E) = 0.   (12.41)

The first of these equations expresses A_N in terms of the remaining undetermined
coefficient B_0, which can be fixed by the normalization requirement.
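Equation 12.41 turns the bound-state problem into one-dimensional root finding. The sketch below is my own illustration (not from the book): for a square well of illustrative depth 10 in units \hbar = m = 1, it builds t_{22}(E) from the product D^{(2,1)} M^{(1)} D^{(1,0)} and bisects on its sign:

```python
import cmath

def t22(E, V_well=-10.0, d=1.0):
    """Element t22 of T^{(2)} = D^{(2,1)} M^{(1)} D^{(1,0)} for a square well
    (V = V_well for 0 < z < d, V = 0 outside); hbar = m = 1."""
    k0 = cmath.sqrt(2 * E)            # k0 = i*kappa0 for E < 0
    k1 = cmath.sqrt(2 * (E - V_well))
    D = lambda kn, kp: [[(kn + kp) / (2 * kn), (kn - kp) / (2 * kn)],
                        [(kn - kp) / (2 * kn), (kn + kp) / (2 * kn)]]
    M = [[cmath.exp(1j * k1 * d), 0], [0, cmath.exp(-1j * k1 * d)]]
    mm = lambda a, b: [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
                       for i in range(2)]
    T = mm(D(k0, k1), mm(M, D(k1, k0)))
    return T[1][1].real               # t22 is real (up to rounding) for V_well < E < 0

def bound_state(lo, hi, tol=1e-12):
    """Bisection for t22(E) = 0; lo and hi must bracket a sign change."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (t22(mid) > 0) == (t22(lo) > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For these illustrative parameters the ground state comes out near E ≈ -7.7, which you can cross-check against the even-state eigenvalue equation of Sect. 6.2.1.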

12.1.3 Application of the Transfer Matrix to a Symmetrical
Potential Well

To illustrate the transfer-matrix method, I will now apply it to a problem which
we have already solved in Sect. 6.2.1: the states of a particle in a symmetric
rectangular potential well. To facilitate application of the transfer-matrix approach,
I will describe this potential by the function


V(z) = \begin{cases} V_b, & z < 0 \\ V_w, & 0 < z < d \\ V_b, & z > d, \end{cases}   (12.42)

which differs from the one used in Sect. 6.2.1 by the choice of the origin of the
coordinate axis for z. This potential has two discontinuity points: it changes from
V_b to V_w at z_0 = 0 and then, again, from V_w to V_b at z_1 = d, where it is assumed
that V_b > V_w. Correspondingly, I need to introduce two interface matrices: D^{(1,0)} as
defined in Eq. 12.9 with k_0 = \sqrt{2m_e(E - V_b)}/\hbar and k_1 = \sqrt{2m_e(E - V_w)}/\hbar, and D^{(2,1)}
defined in Eq. 12.16 with k_2 = k_0.

12.1.3.1 Scattering States

Scattering states (continuous spectrum) of this potential correspond to energies
E > V_b, in which case parameters k_0 and k_1 are regular real-valued wave numbers.
Inserting the free propagation matrix M^{(1)} from Eq. 12.26 between D^{(2,1)} and D^{(1,0)}
according to Eq. 12.31 and taking into account that z_0 = 0 and z_1 = d, I obtain the
total T-matrix

T^{(2)} = D^{(2,1)} M^{(1)} D^{(1,0)} =

\begin{bmatrix} \frac{k_0+k_1}{2k_0} & \frac{k_0-k_1}{2k_0} \\ \frac{k_0-k_1}{2k_0} & \frac{k_0+k_1}{2k_0} \end{bmatrix}
\begin{bmatrix} \exp(ik_1 d) & 0 \\ 0 & \exp(-ik_1 d) \end{bmatrix}
\begin{bmatrix} \frac{k_1+k_0}{2k_1} & \frac{k_1-k_0}{2k_1} \\ \frac{k_1-k_0}{2k_1} & \frac{k_1+k_0}{2k_1} \end{bmatrix} =

\begin{bmatrix} \frac{k_0+k_1}{2k_0}\exp(ik_1 d) & \frac{k_0-k_1}{2k_0}\exp(-ik_1 d) \\ \frac{k_0-k_1}{2k_0}\exp(ik_1 d) & \frac{k_0+k_1}{2k_0}\exp(-ik_1 d) \end{bmatrix}
\begin{bmatrix} \frac{k_1+k_0}{2k_1} & \frac{k_1-k_0}{2k_1} \\ \frac{k_1-k_0}{2k_1} & \frac{k_1+k_0}{2k_1} \end{bmatrix} =

\begin{bmatrix} \frac{(k_0+k_1)^2\exp(ik_1 d) - (k_0-k_1)^2\exp(-ik_1 d)}{4k_0 k_1} & \frac{\left(k_1^2-k_0^2\right)\left[\exp(ik_1 d)-\exp(-ik_1 d)\right]}{4k_0 k_1} \\ -\frac{\left(k_1^2-k_0^2\right)\left[\exp(ik_1 d)-\exp(-ik_1 d)\right]}{4k_0 k_1} & \frac{(k_0+k_1)^2\exp(-ik_1 d) - (k_0-k_1)^2\exp(ik_1 d)}{4k_0 k_1} \end{bmatrix} =

\frac{1}{2k_0 k_1}\begin{bmatrix} i\left(k_0^2+k_1^2\right)\sin k_1 d + 2k_0 k_1\cos k_1 d & i\left(k_1^2-k_0^2\right)\sin k_1 d \\ -i\left(k_1^2-k_0^2\right)\sin k_1 d & -i\left(k_0^2+k_1^2\right)\sin k_1 d + 2k_0 k_1\cos k_1 d \end{bmatrix}.   (12.43)
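Multi-step matrix algebra of this kind is easy to get wrong, so it is worth checking numerically. The sketch below (my own, with arbitrary sample values of k_0, k_1, and d) compares the explicit product D^{(2,1)} M^{(1)} D^{(1,0)} with the closed form in the last line of Eq. 12.43:

```python
import cmath

def T2_product(k0, k1, d):
    """Explicit product D^{(2,1)} M^{(1)} D^{(1,0)} from the first line of Eq. 12.43."""
    D10 = [[(k1 + k0) / (2 * k1), (k1 - k0) / (2 * k1)],
           [(k1 - k0) / (2 * k1), (k1 + k0) / (2 * k1)]]
    D21 = [[(k0 + k1) / (2 * k0), (k0 - k1) / (2 * k0)],
           [(k0 - k1) / (2 * k0), (k0 + k1) / (2 * k0)]]
    M1 = [[cmath.exp(1j * k1 * d), 0], [0, cmath.exp(-1j * k1 * d)]]
    mm = lambda a, b: [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
                       for i in range(2)]
    return mm(D21, mm(M1, D10))

def T2_closed(k0, k1, d):
    """Closed form from the last line of Eq. 12.43."""
    s, c = cmath.sin(k1 * d), cmath.cos(k1 * d)
    p = 1 / (2 * k0 * k1)
    return [[p * (1j * (k0**2 + k1**2) * s + 2 * k0 * k1 * c),
             p * 1j * (k1**2 - k0**2) * s],
            [p * (-1j) * (k1**2 - k0**2) * s,
             p * (-1j * (k0**2 + k1**2) * s + 2 * k0 * k1 * c)]]
```

The two matrices agree element by element to machine precision.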

Substitution of the corresponding elements of the T-matrix from the last expression
into Eqs. 12.34 and 12.39 yields the amplitude reflection and transmission coeffi-
cients:


r = \frac{i\left(k_1^2 - k_0^2\right)\sin k_1 d}{-i\left(k_0^2 + k_1^2\right)\sin k_1 d + 2k_0 k_1\cos k_1 d} = \frac{\left(k_0^2 - k_1^2\right)\sin k_1 d}{\left(k_0^2 + k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d},   (12.44)

t = \frac{2k_0 k_1}{-i\left(k_0^2 + k_1^2\right)\sin k_1 d + 2k_0 k_1\cos k_1 d} = \frac{2ik_0 k_1}{\left(k_0^2 + k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d},   (12.45)

where at the last step the numerators and denominators of the expressions for r and
t were multiplied by i. The resulting expressions coincide with Eq. 6.58 and the
corresponding transmission amplitude of Sect. 6.2, which, of course, is not surprising. Having found the reflection and
transmission amplitudes, I can easily restore the entire wave function. Indeed,
substitution of Eqs. 12.45 and 12.44 into Eq. 12.32 yields the wave function for
z < 0 and z > d. Next, using Eq. 12.10 with v_0 in the form

v_0 = \begin{bmatrix} 1 \\ r \end{bmatrix},

I find the coefficients A_0^{(R)} and B_0^{(R)}:

\begin{bmatrix} A_0^{(R)} \\ B_0^{(R)} \end{bmatrix} = \begin{bmatrix} \frac{k_1+k_0}{2k_1} & \frac{k_1-k_0}{2k_1} \\ \frac{k_1-k_0}{2k_1} & \frac{k_1+k_0}{2k_1} \end{bmatrix} \begin{bmatrix} 1 \\ r \end{bmatrix} \;\Rightarrow\;

A_0^{(R)} = \frac{k_1 + k_0 + r(k_1 - k_0)}{2k_1}   (12.46)

B_0^{(R)} = \frac{k_1 - k_0 + r(k_1 + k_0)}{2k_1},   (12.47)

which generate the wave function in the region 0 < z < d:

\psi(z) = \frac{k_1(1+r) + k_0(1-r)}{2k_1}\,e^{ik_1 z} + \frac{k_1(1+r) - k_0(1-r)}{2k_1}\,e^{-ik_1 z}.
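A quick numerical sanity check of my own (sample k_0, k_1, and d): with r and t taken from Eqs. 12.44 and 12.45 and the inside amplitudes from Eqs. 12.46 and 12.47, the wave function and its derivative must be continuous at z = 0, and the inside wave evaluated at z = d reproduces the transmitted amplitude (in this convention the transmitted wave comes out referenced to the interface at z = d):

```python
import cmath

k0, k1, d = 1.2, 2.0, 1.5                      # sample wave numbers and width
s, c = cmath.sin(k1 * d), cmath.cos(k1 * d)
den = (k0**2 + k1**2) * s + 2j * k0 * k1 * c
r = (k0**2 - k1**2) * s / den                  # Eq. 12.44
t = 2j * k0 * k1 / den                         # Eq. 12.45

A = (k1 + k0 + r * (k1 - k0)) / (2 * k1)       # Eq. 12.46
B = (k1 - k0 + r * (k1 + k0)) / (2 * k1)       # Eq. 12.47
```

The three matching conditions below hold to machine precision, confirming that the interface matrix D^{(1,0)} is nothing but the pair of continuity conditions in matrix form.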

I will leave it as an exercise to demonstrate that the coefficients A_0^{(R)} and B_0^{(R)} in
Eqs. 12.46 and 12.47 coincide with the coefficients A_2 and B_2 in Eqs. 6.60 and 6.61
in Sect. 6.2. Rewriting the expression for the wave function as

\psi(z) = \frac{k_1(1+r) + k_0(1-r)}{2k_1}\,e^{ik_1 d} e^{ik_1(z-d)} + \frac{k_1(1+r) - k_0(1-r)}{2k_1}\,e^{-ik_1 d} e^{-ik_1(z-d)},

where I simply multiplied each term by \exp(ik_1 d)\exp(-ik_1 d) \equiv 1, you can identify
the coefficients A_1^{(L)} and B_1^{(L)} as

A_1^{(L)} = \frac{k_1 + k_0 + r(k_1 - k_0)}{2k_1}\,e^{ik_1 d}

B_1^{(L)} = \frac{k_1 - k_0 + r(k_1 + k_0)}{2k_1}\,e^{-ik_1 d}.

The same expressions for A_1^{(L)} and B_1^{(L)} can obviously be found by multiplying
the diagonal matrix M^{(1)} by the vector v_0^{(R)} formed by the coefficients A_0^{(R)} and B_0^{(R)}. Finally, in order
to convince the skeptics that the outlined procedure is self-consistent, you can try to
apply the interface matrix D^{(2,1)} to A_1^{(L)} and B_1^{(L)}:

"
A.R/2
B.R/2

#
D
"

k0Ck1
2k0

k0�k1
2k0

k0�k1
2k0

k0Ck1
2k0

#"
k1Ck0Cr.k1�k0/

2k1
eik1d

k1�k0Cr.k1Ck0/
2k1

e�ik1d

#
(12.48)

yielding for A_2^{(R)}

A_2^{(R)} = \frac{(k_0+k_1)^2}{4k_0 k_1}\,e^{ik_1 d} + r\frac{k_1^2 - k_0^2}{4k_0 k_1}\,e^{ik_1 d} - \frac{(k_0-k_1)^2}{4k_0 k_1}\,e^{-ik_1 d} - r\frac{k_1^2 - k_0^2}{4k_0 k_1}\,e^{-ik_1 d} =

e^{ik_1 d}\left[\frac{k_0(1-r)}{4k_1} + \frac{k_1(1+r)}{4k_0} + \frac{1}{2}\right] - e^{-ik_1 d}\left[\frac{k_0(1-r)}{4k_1} + \frac{k_1(1+r)}{4k_0} - \frac{1}{2}\right].

To continue I have to use the reflection coefficient r given by Eq. 12.44. Evaluating
parts of the expression for A_2^{(R)} separately, I find

\frac{k_0(1-r)}{4k_1} = \frac{k_0}{4k_1}\left[1 - \frac{\left(k_0^2-k_1^2\right)\sin k_1 d}{\left(k_0^2+k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d}\right] = \frac{k_0}{2}\,\frac{k_1\sin k_1 d + ik_0\cos k_1 d}{\left(k_0^2+k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d}

\frac{k_1(1+r)}{4k_0} = \frac{k_1}{4k_0}\left[1 + \frac{\left(k_0^2-k_1^2\right)\sin k_1 d}{\left(k_0^2+k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d}\right] = \frac{k_1}{2}\,\frac{k_0\sin k_1 d + ik_1\cos k_1 d}{\left(k_0^2+k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d}.


Lastly,

\frac{k_0(1-r)}{4k_1} + \frac{k_1(1+r)}{4k_0} + \frac{1}{2} = \frac{1}{2}\left[\frac{2k_0 k_1\sin k_1 d + i\left(k_0^2+k_1^2\right)\cos k_1 d}{\left(k_0^2+k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d} + 1\right] = \frac{i}{2}\,\frac{(k_0+k_1)^2 e^{-ik_1 d}}{\left(k_0^2+k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d},

where at the last step I replaced \sin k_1 d + i\cos k_1 d with i\exp(-ik_1 d). Similarly,

\frac{k_0(1-r)}{4k_1} + \frac{k_1(1+r)}{4k_0} - \frac{1}{2} = \frac{1}{2}\left[\frac{2k_0 k_1\sin k_1 d + i\left(k_0^2+k_1^2\right)\cos k_1 d}{\left(k_0^2+k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d} - 1\right] = \frac{i}{2}\,\frac{(k_0-k_1)^2 e^{ik_1 d}}{\left(k_0^2+k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d}.

Combining all these results, I finally get A_2^{(R)}:

A_2^{(R)} = \frac{i}{2}\,\frac{(k_0+k_1)^2 - (k_0-k_1)^2}{\left(k_0^2+k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d} = \frac{2ik_0 k_1}{\left(k_0^2+k_1^2\right)\sin k_1 d + 2ik_0 k_1\cos k_1 d}.   (12.49)

Catching my breath after this marathon calculation (OK, a half marathon), I am
eager to compare Eq. 12.49 with Eq. 12.45 for the transmission amplitude. With a
sigh of relief, I find that they do, indeed, coincide. I will leave it as an exercise to
demonstrate that B_2^{(R)} vanishes, as it should.

12.1.3.2 Bound States

Now I will illustrate the application of the transfer-matrix approach to bound states of
the square potential well described by the same Eq. 12.42. The discrete spectrum of
this potential is expected to exist in the interval of energies defined by V_w < E < V_b.
The transfer matrix given in Eq. 12.43 can be adapted to this case by replacing the wave
number k_0 with i\kappa_0, where \kappa_0 in this context is defined as

\kappa_0 = \frac{\sqrt{2m(V_b - E)}}{\hbar}.

This procedure yields

T = \frac{1}{2\kappa_0 k_1}\begin{bmatrix} \left(k_1^2 - \kappa_0^2\right)\sin k_1 d + 2\kappa_0 k_1\cos k_1 d & \left(k_1^2 + \kappa_0^2\right)\sin k_1 d \\ -\left(k_1^2 + \kappa_0^2\right)\sin k_1 d & -\left(k_1^2 - \kappa_0^2\right)\sin k_1 d + 2\kappa_0 k_1\cos k_1 d \end{bmatrix},

and Eq. 12.41 for the bound state energies takes the following form:

2\kappa_0 k_1\cos k_1 d = \left(k_1^2 - \kappa_0^2\right)\sin k_1 d,

or

\tan(k_1 d) = \frac{2\kappa_0 k_1}{k_1^2 - \kappa_0^2}.   (12.50)

At first glance, this result does not agree with the one I derived in Sect. 6.2.1,
where states were segregated according to their parity, with different equations for
the energy levels of the even and odd states. Equation 12.50, on the other hand,
is a single equation, and the parity of the states has not even been mentioned.
If, however, you pause to think about it, you will see that the differences between
the results obtained here and in Sect. 6.2.1 are purely superficial.

First of all, you need to notice that the coordinates used here and in Sect. 6.2.1
have different origins. Placing the origin of the coordinate at the center of the well
made the inversion symmetry of the potential with respect to its center explicit in its
coordinate dependence. Consequently, we were able to classify states by their parity.
This immediate benefit of the symmetry is lost once the origin of the coordinate is
displaced from the center of the well. This, of course, does not change the underlying
symmetry of the potential (it has nothing to do with such artificial things as our
choice of the coordinate system), but it masks it. The wave functions written in
the coordinate system centered at the edge of the potential well do not have a definite
parity with respect to the point z = 0, and it is not surprising that my derivation of the
eigenvalue equation naturally yielded a single equation for all energy eigenvalues.
However, it is not too difficult to demonstrate that our single Equation 12.50 is in
reality equivalent to the two equations of Sect. 6.2.1, but it does take some extra effort.

First, you should notice that the trigonometric functions in Eqs. 6.39 and 6.42 are
expressed in terms of kd/2, while Eq. 12.50 contains \tan(k_1 d). Thus, it makes sense
to try to express \tan(k_1 d) in terms of k_1 d/2 using the well-known identity

\tan(k_1 d) = \frac{2\tan(k_1 d/2)}{1 - \tan^2(k_1 d/2)},

which yields

\frac{\tan(k_1 d/2)}{1 - \tan^2(k_1 d/2)} = \frac{\kappa_0 k_1}{k_1^2 - \kappa_0^2}.

To simplify the algebra, it is useful to temporarily introduce the notations x = \tan(k_1 d/2) and
\xi = \left(k_1^2 - \kappa_0^2\right)/(\kappa_0 k_1), and rewrite the preceding equation as a quadratic equation
for x:

x^2 + \xi x - 1 = 0.

This equation has two solutions:

x_{1,2} = -\frac{\xi}{2} \pm \frac{1}{2}\sqrt{\xi^2 + 4}.

Computing \xi^2 + 4, you will easily find that

\xi^2 + 4 = \frac{k_1^4 + \kappa_0^4 - 2k_1^2\kappa_0^2}{k_1^2\kappa_0^2} + 4 = \frac{\left(\kappa_0^2 + k_1^2\right)^2}{k_1^2\kappa_0^2},

which yields the following for x_1 and x_2:

x_1 = -\frac{k_1^2 - \kappa_0^2}{2\kappa_0 k_1} - \frac{\kappa_0^2 + k_1^2}{2k_1\kappa_0} = -\frac{k_1}{\kappa_0}

x_2 = -\frac{k_1^2 - \kappa_0^2}{2\kappa_0 k_1} + \frac{\kappa_0^2 + k_1^2}{2k_1\kappa_0} = \frac{\kappa_0}{k_1}.

Recalling what x stands for, you can see that the single equation 12.50 is now replaced by
two equations:

\tan(k_1 d/2) = -\frac{k_1}{\kappa_0}   (12.51)

\tan(k_1 d/2) = \frac{\kappa_0}{k_1},   (12.52)

which are exactly the eigenvalue equations for the odd and even wave functions derived
in Sect. 6.2.1. Isn't it beautiful, really?
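The equivalence can also be confirmed numerically. In the sketch below (my own; \hbar = m = 1, and the well depth and width are illustrative), a root of the even-state condition, Eq. 12.52, is found by bisection and then checked against the single condition, Eq. 12.50:

```python
import math

Vb, Vw, d = 0.0, -10.0, 1.0        # illustrative well parameters

def kappa0(E): return math.sqrt(2 * (Vb - E))
def k1(E):     return math.sqrt(2 * (E - Vw))

def even_condition(E):
    """tan(k1 d/2) - kappa0/k1; zero at the even bound states (Eq. 12.52)."""
    return math.tan(k1(E) * d / 2) - kappa0(E) / k1(E)

def bisect(f, lo, hi, tol=1e-12):
    """Bisection; lo and hi must bracket a sign change of f."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (f(lo) > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

At the root, tan(k_1 d) computed directly agrees with 2\kappa_0 k_1/(k_1^2 - \kappa_0^2), exactly as the half-angle identity promises.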

Having figured out the situation with the eigenvalues, I can take care of the
eigenvectors. The ratio of the wave function amplitudes A_2/B_0 is given by

\frac{A_2}{B_0} = t_{12} = \frac{\left(k_1^2 + \kappa_0^2\right)\sin k_1 d}{2\kappa_0 k_1},   (12.53)


while the amplitudes of the wave function in the region 0 < z < d are found from

\begin{bmatrix} A_1 \\ B_1 \end{bmatrix} = D^{(1,0)} \begin{bmatrix} 0 \\ B_0 \end{bmatrix}.

Matrix D^{(1,0)} is adapted to the case under consideration by the same substitution
k_0 \to i\kappa_0 as before:

D^{(1,0)} = \begin{bmatrix} \frac{k_1 + i\kappa_0}{2k_1} & \frac{k_1 - i\kappa_0}{2k_1} \\ \frac{k_1 - i\kappa_0}{2k_1} & \frac{k_1 + i\kappa_0}{2k_1} \end{bmatrix}.

Using this matrix, you easily find

A_1 = \frac{k_1 - i\kappa_0}{2k_1}\,B_0,\qquad B_1 = \frac{k_1 + i\kappa_0}{2k_1}\,B_0,

which yields the following expression for the wave function inside the well:

\psi(z) = B_0\left[\frac{k_1 - i\kappa_0}{2k_1}\exp(ik_1 z) + \frac{k_1 + i\kappa_0}{2k_1}\exp(-ik_1 z)\right] = B_0\left[\cos k_1 z + \frac{\kappa_0}{k_1}\sin k_1 z\right].

You can replace the ratio \kappa_0/k_1 in this expression with \tan(k_1 d/2) or with
-\cot(k_1 d/2) according to Eqs. 12.52 and 12.51 and obtain the following expressions
for the wave function representing the two different types of states:

\psi(z) = \begin{cases} \frac{B_0}{\cos(k_1 d/2)}\cos\left[k_1(z - d/2)\right], & \kappa_0/k_1 = \tan(k_1 d/2) \\ -\frac{B_0}{\sin(k_1 d/2)}\sin\left[k_1(z - d/2)\right], & \kappa_0/k_1 = -\cot(k_1 d/2). \end{cases}

It is quite obvious now that the wave functions found are even and odd with respect
to the variable \tilde{z} = z - d/2, which is merely a coordinate defined in the coordinate
system with the origin at the center of the well, just like in Sect. 6.2.1. One can also
show that Eq. 12.53 reduces to A_2 = \pm B_0 for the two different types of wave
function, again in agreement with the results of Sect. 6.2.1. This proof I will leave
to you as an exercise.


12.2 Resonant Tunneling

In this section I will apply the transfer-matrix method to describe an interesting and
practically important phenomenon: resonant tunneling. This phenomenon arises
when one considers quantum states of a particle in a potential which consists of
two (or more) potential barriers separated by a potential well. An example of such
a potential is shown in Fig. 12.3. I am interested here in the states corresponding to
under-barrier values of energy E: 0 < E < V. It was established in Sect. 6.2.1
that in the case of a single barrier whose width d satisfies the inequality \kappa d \gg 1,
where \kappa = \sqrt{2m_e(V - E)}/\hbar, such states are characterized by an exponentially
small transmission probability T \propto \exp(-2\kappa d), which is responsible for the effect
of quantum tunneling: a particle incident on the barrier has a non-zero probability
to "tunnel" through it and continue its free propagation on the other side of the
barrier. You might wonder if adding a second barrier will result in any new and
interesting effects. Common sense based on "classical" probability theory suggests
that in the presence of the second barrier, the total transmission probability will
simply be a product of the transmission coefficients for each of the barriers, T \propto
T_1 T_2 \propto \exp(-2\kappa_1 d_1 - 2\kappa_2 d_2), further reducing the probability that the particle tunnels
through the barriers. However, as it often happens, the reality is more complex
(and sometimes more intriguing) than our initial intuited insight. So, let's see if
our intuition leads us astray in this case.

To simplify the algebra, I will assume that both barriers have the same width d and
height V and that they are separated by a region of zero potential of length w. This
potential profile is characterized by four discontinuity points with coordinates

x_0 = 0;\quad x_1 = d;\quad x_2 = d + w;\quad x_3 = 2d + w.   (12.54)

Accordingly, the propagation of a particle through this potential is described by
four interface matrices, D^{(1,0)}, D^{(2,1)}, D^{(3,2)}, and D^{(4,3)}, and three free propagation
matrices M^{(1)}, M^{(2)}, and M^{(3)}. Matrices D^{(1,0)} and D^{(2,1)} are obviously identical
to matrices D^{(3,2)} and D^{(4,3)}, correspondingly, and can be obtained from those
appearing in the first line of Eq. 12.43 by replacing k_0 \to k = \sqrt{2m_e E}/\hbar and
k_1 \to i\kappa = i\sqrt{2m_e(V - E)}/\hbar:

D^{(1,0)} = D^{(3,2)} = \begin{bmatrix} \frac{i\kappa + k}{2i\kappa} & \frac{i\kappa - k}{2i\kappa} \\ \frac{i\kappa - k}{2i\kappa} & \frac{i\kappa + k}{2i\kappa} \end{bmatrix};   (12.55)

Fig. 12.3 Double-barrier
potential


D^{(2,1)} = D^{(4,3)} = \begin{bmatrix} \frac{k + i\kappa}{2k} & \frac{k - i\kappa}{2k} \\ \frac{k - i\kappa}{2k} & \frac{k + i\kappa}{2k} \end{bmatrix}.   (12.56)

For matrices M^{(1)}, M^{(2)}, and M^{(3)}, I can write, using the general definition, Eq. 12.26,
and the expressions for the corresponding coordinates given in Eq. 12.54:

M^{(1)} = M^{(3)} = \begin{bmatrix} \exp(-\kappa d) & 0 \\ 0 & \exp(\kappa d) \end{bmatrix}   (12.57)

M^{(2)} = \begin{bmatrix} \exp(ikw) & 0 \\ 0 & \exp(-ikw) \end{bmatrix}.   (12.58)

The total transfer matrix T then becomes

T^{(4)} = D^{(4,3)} M^{(3)} D^{(3,2)} M^{(2)} D^{(2,1)} M^{(1)} D^{(1,0)} =
D^{(2,1)} M^{(1)} D^{(1,0)} M^{(2)} D^{(2,1)} M^{(1)} D^{(1,0)} \equiv T^{(2)} M^{(2)} T^{(2)},   (12.59)

where T^{(2)} is the transfer matrix describing the single barrier. I do not have to
calculate this matrix from scratch. Instead, I can again replace k_0 with k and k_1
with i\kappa in Eq. 12.43:

T^{(2)} = \frac{1}{2k_0 k_1}\begin{bmatrix} i\left(k_0^2 + k_1^2\right)\sin k_1 d + 2k_0 k_1\cos k_1 d & i\left(k_1^2 - k_0^2\right)\sin k_1 d \\ -i\left(k_1^2 - k_0^2\right)\sin k_1 d & -i\left(k_0^2 + k_1^2\right)\sin k_1 d + 2k_0 k_1\cos k_1 d \end{bmatrix} \to

\frac{1}{2ik\kappa}\begin{bmatrix} i\left(k^2 - \kappa^2\right)\sin(i\kappa d) + 2ik\kappa\cos(i\kappa d) & -i\left(\kappa^2 + k^2\right)\sin(i\kappa d) \\ i\left(\kappa^2 + k^2\right)\sin(i\kappa d) & -i\left(k^2 - \kappa^2\right)\sin(i\kappa d) + 2ik\kappa\cos(i\kappa d) \end{bmatrix} =

\frac{1}{2ik\kappa}\begin{bmatrix} -\left(k^2 - \kappa^2\right)\sinh(\kappa d) + 2ik\kappa\cosh(\kappa d) & \left(\kappa^2 + k^2\right)\sinh(\kappa d) \\ -\left(\kappa^2 + k^2\right)\sinh(\kappa d) & \left(k^2 - \kappa^2\right)\sinh(\kappa d) + 2ik\kappa\cosh(\kappa d) \end{bmatrix}.   (12.60)

At the last step of this derivation, I used the identities connecting trigonometric and
hyperbolic functions: \sin(iz) = i\sinh z and \cos(iz) = \cosh z. The elements of this
matrix determine the amplitude reflection and transmission coefficients for a single-
barrier potential, r_1 and t_1 correspondingly, as established by Eqs. 12.34 and 12.39:

t_1 = \frac{2ik\kappa}{\left(k^2 - \kappa^2\right)\sinh(\kappa d) + 2ik\kappa\cosh(\kappa d)}   (12.61)

r_1 = \frac{\left(\kappa^2 + k^2\right)\sinh(\kappa d)}{\left(k^2 - \kappa^2\right)\sinh(\kappa d) + 2ik\kappa\cosh(\kappa d)}.   (12.62)
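The single-barrier amplitudes translate directly into code. A small sketch of my own (\hbar = m = 1, sample numbers), which also verifies the probability-conservation relation |r_1|^2 + |t_1|^2 = 1 used below:

```python
import math

def single_barrier(E, V, d):
    """Amplitudes r1 and t1 (Eqs. 12.62 and 12.61) for a barrier of height V > E;
    hbar = m = 1."""
    k = math.sqrt(2 * E)
    kap = math.sqrt(2 * (V - E))
    den = (k**2 - kap**2) * math.sinh(kap * d) + 2j * k * kap * math.cosh(kap * d)
    t1 = 2j * k * kap / den
    r1 = (kap**2 + k**2) * math.sinh(kap * d) / den
    return r1, t1
```

For any under-barrier energy the two probabilities add up to one to machine precision, and |t_1| stays below unity, as expected for tunneling.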


Equations 12.61 and 12.62, obviously, can also be derived from Eqs. 12.44 and 12.45 for
the single-well problem with the same replacements of k_0 and k_1 used to obtain the
T-matrix itself.

In order to simplify further computations and also to provide an easier way to
relate the properties of the double-barrier structure to those of its single-barrier
components, I am going to use Eqs. 12.34 and 12.39 to rewrite the transfer matrix
in terms of the amplitude reflection and transmission coefficients r_1 and t_1:

T_{22}^{(2)} = \frac{1}{t_1};\qquad T_{21}^{(2)} = -\frac{r_1}{t_1}.

Using the explicit form of the matrix T^{(2)}, Eq. 12.60, you can determine that T_{11}^{(2)} = \left(T_{22}^{(2)}\right)^* and T_{12}^{(2)} = \left(T_{21}^{(2)}\right)^*, so that the entire T^{(2)} can be written down as

T^{(2)} = \begin{bmatrix} 1/t_1^* & -r_1^*/t_1^* \\ -r_1/t_1 & 1/t_1 \end{bmatrix}.

Multiplying this by M^{(2)} from Eq. 12.58, I get

T^{(2)} M^{(2)} = \begin{bmatrix} 1/t_1^* & -r_1^*/t_1^* \\ -r_1/t_1 & 1/t_1 \end{bmatrix} \begin{bmatrix} \exp(ikw) & 0 \\ 0 & \exp(-ikw) \end{bmatrix} = \begin{bmatrix} \exp(ikw)/t_1^* & -\exp(-ikw)\,r_1^*/t_1^* \\ -\exp(ikw)\,r_1/t_1 & \exp(-ikw)/t_1 \end{bmatrix},

and, finally, multiplying this matrix by T^{(2)}, I find the total double-
barrier T-matrix T^{(4)}:

T^{(4)} = \begin{bmatrix} \frac{\exp(ikw)}{\left(t_1^*\right)^2} + \frac{\exp(-ikw)|r_1|^2}{|t_1|^2} & \frac{\exp(ikw)\,r_1^*}{\left(t_1^*\right)^2} - \frac{\exp(-ikw)\,r_1^*}{|t_1|^2} \\ -\frac{\exp(ikw)\,r_1}{|t_1|^2} - \frac{\exp(-ikw)\,r_1}{t_1^2} & -\frac{\exp(ikw)|r_1|^2}{|t_1|^2} + \frac{\exp(-ikw)}{t_1^2} \end{bmatrix}

= \frac{1}{|t_1|^2}\begin{bmatrix} \frac{t_1\exp(ikw)}{t_1^*} + |r_1|^2\exp(-ikw) & \frac{t_1\exp(ikw)\,r_1^*}{t_1^*} - \exp(-ikw)\,r_1^* \\ -r_1\exp(ikw) - \frac{t_1^*\exp(-ikw)\,r_1}{t_1} & -|r_1|^2\exp(ikw) + \frac{t_1^*\exp(-ikw)}{t_1} \end{bmatrix}.

This expression can be simplified by introducing

t_1 = |t_1|\exp(i\varphi_t)
r_1 = |r_1|\exp(i\varphi_r),

which yields

T^{(4)} = \frac{1}{|t_1|^2}\begin{bmatrix} e^{i(kw+2\varphi_t)} + |r_1|^2 e^{-ikw} & r_1^*\left(e^{i(kw+2\varphi_t)} - e^{-ikw}\right) \\ -r_1\left(e^{-i(kw+2\varphi_t)} + e^{ikw}\right) & -|r_1|^2 e^{ikw} + e^{-i(kw+2\varphi_t)} \end{bmatrix} =

\frac{1}{|t_1|^2}\begin{bmatrix} e^{i\varphi_t}\left[e^{i(kw+\varphi_t)} + |r_1|^2 e^{-i(kw+\varphi_t)}\right] & 2i r_1^* e^{i\varphi_t}\sin(kw+\varphi_t) \\ -2 r_1 e^{-i\varphi_t}\cos(kw+\varphi_t) & e^{-i\varphi_t}\left[-|r_1|^2 e^{i(kw+\varphi_t)} + e^{-i(kw+\varphi_t)}\right] \end{bmatrix}.   (12.63)

At the last step I factored out \exp(i\varphi_t) to make the residual expressions more
symmetrical with respect to the phases of the remaining exponential functions
and used Euler's identities \cos x = \left[\exp(ix) + \exp(-ix)\right]/2 and \sin x =
\left[\exp(ix) - \exp(-ix)\right]/2i. Now you can simply read out the expressions for the
total amplitude reflection and transmission coefficients:

t_{db} = \frac{|t_1|^2\exp(i\varphi_t)}{-|r_1|^2\exp(ikw + i\varphi_t) + \exp(-ikw - i\varphi_t)}   (12.64)

r_{db} = \frac{r_1^*\exp(2i\varphi_t)\left[\exp(ikw + i\varphi_t) - \exp(-ikw - i\varphi_t)\right]}{-|r_1|^2\exp(ikw + i\varphi_t) + \exp(-ikw - i\varphi_t)},   (12.65)

where the subindex db stands for the double barrier.
I will begin the analysis of the obtained expressions with the transmission
probability T_{db} = |t_{db}|^2:

T_{db} = \frac{|t_1|^4}{\left|\left(1 - |r_1|^2\right)\cos(kw+\varphi_t) - i\left(1 + |r_1|^2\right)\sin(kw+\varphi_t)\right|^2}.

At this point it is useful to recall that the transmission and reflection probabilities obey
the probability conservation condition |t_1|^2 + |r_1|^2 = 1, which allows rewriting the
expression for T_{db} in the simplified form

T_{db} = \frac{|t_1|^4}{|t_1|^4\cos^2(kw+\varphi_t) + \left(1 + |r_1|^2\right)^2\sin^2(kw+\varphi_t)}.   (12.66)

The corresponding expression for the reflection probability becomes

R_{db} = \frac{4|r_1|^2\sin^2(kw+\varphi_t)}{|t_1|^4\cos^2(kw+\varphi_t) + \left(1 + |r_1|^2\right)^2\sin^2(kw+\varphi_t)}.   (12.67)


Before going any further, it is always useful to check that the results obtained obey
the probability conservation condition R_{db} + T_{db} = 1. To prove that this is indeed
true, you just need to demonstrate that

|t_1|^4 + 4|r_1|^2\sin^2(kw+\varphi_t) = |t_1|^4\cos^2(kw+\varphi_t) + \left(1 + |r_1|^2\right)^2\sin^2(kw+\varphi_t).

You might probably find an easier way to prove this identity, but this is how I did it:

|t_1|^4 + 4|r_1|^2\sin^2(kw+\varphi_t) =
|t_1|^4\left[\cos^2(kw+\varphi_t) + \sin^2(kw+\varphi_t)\right] + 4|r_1|^2\sin^2(kw+\varphi_t) =
|t_1|^4\cos^2(kw+\varphi_t) + \left[4|r_1|^2 + |t_1|^4\right]\sin^2(kw+\varphi_t) =
|t_1|^4\cos^2(kw+\varphi_t) + \left[4|r_1|^2 + \left(1 - |r_1|^2\right)^2\right]\sin^2(kw+\varphi_t) =
|t_1|^4\cos^2(kw+\varphi_t) + \left(1 + |r_1|^2\right)^2\sin^2(kw+\varphi_t).   (12.68)

Having verified that my calculations are not obviously wrong, I can proceed with
their analysis. Remember that the naive expectation, which I described at the
beginning of this section, was that adding a second barrier would result in a total
transmission being just a product of the transmission probabilities through each
barrier, which in our case of identical barriers would mean T_{db} = |t_1|^4. Looking
at Eq. 12.66, you can indeed notice the factor |t_1|^4 in its numerator, but you will
also see that this factor is accompanied by a denominator, which is responsible
for breaking our naive expectations. What this denominator does is select special
energies, namely, the ones obeying the condition

\zeta(E) = k(E)w + \varphi_t(E) = \pi n,\quad n = 1, 2, 3, \ldots,   (12.69)

which turns \sin(kw + \varphi_t) in Eqs. 12.66 and 12.67 to zero. For energies satisfying
Eq. 12.69, the reflection coefficient vanishes and the transmission coefficient turns
to unity. So much for the second barrier suppressing the transmission probability!
In reality, the presence of the second barrier somehow magically helps the quantum
particle to penetrate both barriers without any reflection, albeit only at special
energies. This phenomenon is called resonant tunneling, and it is a wonderful
manifestation of the importance of quantum superposition of states or, as one could
say, of the wave nature of quantum particles. Energy values at which the resonant
tunneling takes place are called tunneling resonances.

To analyze this effect in more detail, it is useful to rearrange the terms in the
denominator of Eq. 12.66. Using the identity in Eq. 12.68, I can rewrite the expression
for the transmission probability T_{db} as

T_{db} = \frac{|t_1|^4}{|t_1|^4 + 4|r_1|^2\sin^2(kw+\varphi_t)} = \frac{1}{1 + \frac{4|r_1|^2}{|t_1|^4}\sin^2(kw+\varphi_t)}.   (12.70)
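Equation 12.70 can be scanned over energy to exhibit the resonances numerically. A sketch of my own (\hbar = m = 1; the barrier height V, width d, and separation w are illustrative), with the single-barrier amplitudes taken from Eqs. 12.61 and 12.62:

```python
import cmath, math

def T_db(E, V=1.0, d=1.0, w=3.0):
    """Double-barrier transmission, Eq. 12.70, for 0 < E < V; hbar = m = 1."""
    k = math.sqrt(2 * E)
    kap = math.sqrt(2 * (V - E))
    den = (k**2 - kap**2) * math.sinh(kap * d) + 2j * k * kap * math.cosh(kap * d)
    t1 = 2j * k * kap / den                           # Eq. 12.61
    r1 = (kap**2 + k**2) * math.sinh(kap * d) / den   # Eq. 12.62
    phi_t = cmath.phase(t1)                           # phase of t1
    return 1.0 / (1.0 + 4 * abs(r1)**2 / abs(t1)**4
                  * math.sin(k * w + phi_t)**2)
```

Scanning E between 0 and V reveals a sharp peak where the phase in Eq. 12.70 crosses its resonant value, with the transmission reaching unity even though each barrier alone is nearly opaque at that energy.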

This expression makes it even more obvious that every time the energy of the
particle obeys the resonance condition, Eq. 12.69, the transmission turns to unity,
but it also reveals the role of the parameter

\gamma = \frac{|t_1|^2}{|r_1|}.   (12.71)

Indeed, let me find the values of the energy for which the transmission drops to
half of its maximum value, i.e., becomes equal to 1/2. Quite obviously, this happens
whenever

\frac{4}{\gamma^2}\sin^2\zeta = 1 \iff |\sin\zeta| = \gamma/2.   (12.72)

In the case of thick individual barriers, when the effect of the resonant
transmission is most drastic, the single-barrier transmission |t_1| is small, while the
reflection |r_1| is almost unity. In this case Eq. 12.71 can be approximated as follows:

\gamma = \frac{|t_1|^2}{\sqrt{1 - |t_1|^2}} \approx \frac{|t_1|^2}{1 - |t_1|^2/2} \approx |t_1|^2\left(1 + |t_1|^2/2\right) \approx |t_1|^2,   (12.73)

where I neglected terms smaller than |t_1|^2. This approximation shows that \gamma is as
small as |t_1|^2, meaning that according to Eq. 12.72, the value of the phase \zeta(E) at the
energy values corresponding to T_{db} = 1/2 only weakly deviates from the resonant
value E_n, with \zeta(E_n) = \pi n. Accordingly, \zeta(E) can be presented as \zeta(E) = \pi n +
\delta\zeta_n, where \delta\zeta_n \ll 1, allowing to simplify Eq. 12.72 as

|\sin(\delta\zeta_n)| \approx |\delta\zeta_n| = \gamma/2.   (12.74)

Thus, the parameter \gamma/2 determines the magnitude of the deviation of the phase \zeta(E)
from its resonant value required to bring the transmission coefficient down by half.
The smaller the \gamma, the smaller is such a deviation, which means, in other words, that
a smaller \gamma results in a steeper decrease of transmission when the particle's energy shifts
away from the resonance. The deviation of the phase can be translated into the respective
deviation of energy by presenting

\zeta(E) \approx \zeta(E_n) + \frac{d\zeta(E)}{dE}\,\delta E.


Fig. 12.4 Double-barrier transmission of an electron as a function of energy for the structure with barrier height 1 eV, the distance between the barriers w = 1.2 nm, and three barrier widths: the blue line corresponds to d = 0.8 nm, red to d = 0.4 nm, and black to d = 0.2 nm. Energy is given in dimensionless units of 2m_e E w^2/\hbar^2

The deviation of the phase equal to \gamma/2 corresponds to the deviation of energy
equal to

\frac{\Delta E}{2} = \left(\frac{d\zeta(E)}{dE}\right)^{-1}\frac{\gamma}{2}.   (12.75)

If one plots transmission as a function of particle energy, the resonant values will
appear as peaks of the transmission, while the parameter \Delta E will determine the width
of these peaks. More accurately, \Delta E/2 is called the half-width at half-maximum
(HWHM). The origin of "half-maximum" in this term is obvious, and "half-width"
refers to the fact that Eq. 12.74 has two solutions \pm\gamma/2, and the total width of
the resonance at half-maximum is (E_n + \Delta E/2) - (E_n - \Delta E/2) = \Delta E. Widening
of the barriers results in decreasing \gamma, which can be qualitatively described as
narrowing of the resonances. You can observe this phenomenon in Fig. 12.4,
presenting transmission as a function of energy for several barrier widths. You can
also see that the resonances broaden with increasing energy. This is the result of
the energy dependence of the elements of the single-barrier transfer matrix and,
correspondingly, of the parameter \gamma and the derivative of the phase d\zeta/dE. The
explicit expression for this derivative can be found from Eq. 12.61 for the amplitude
transmission coefficient, but the result is rather cumbersome and can be left out.

This figure reveals that the parameter \gamma also determines how small the transmission
becomes between the maxima and, therefore, how prominent the resonances are.
In order to see where this effect comes from, it is useful to rewrite Eq. 12.70 for the
transmission as

T_{db} = \frac{(\gamma/2)^2}{(\gamma/2)^2 + \sin^2(kw + \varphi_t)}.   (12.76)

One can see now that the minimum of the transmission, which occurs whenever
\sin(kw + \varphi_t) reaches its largest value of unity, is

T_{db}^{(min)} = \frac{(\gamma/2)^2}{(\gamma/2)^2 + 1} \approx \left(\frac{\gamma}{2}\right)^2,

where I assumed at the last step that \gamma \ll 1; the minimum transmission thus increases with increasing \gamma. You
may also notice that the position of the resonances is different for different barrier
thicknesses. This result seems to be contrary to Eq. 12.69, which shows explicitly
only the dependence of the resonant energies on the distance between the barriers,
w. The observed dependence of the resonances on d emphasizes the
role of the phase \varphi_t, which does depend on the thickness of the barriers but
is often overlooked.

In the vicinity of the resonance θ_n = πn, one can expand sin(θ) as

sin(θ) = sin(θ − θ_n + πn) = (−1)^n sin(θ − θ_n) ≈ (−1)^n (θ − θ_n) ≈ (−1)^n (dθ/dE)(E − E_n).

With this approximation, Eq. 12.76 for the transmission in the vicinity of the
resonance becomes

T_db = (ΔE/2)² / [(ΔE/2)² + (E − E_n)²]    (12.77)

Resonance behavior of this type occurs frequently in various areas of physics and is
called a Breit–Wigner resonance, while Eq. 12.77 bears the name of the Breit–Wigner
formula.¹
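A quick numerical check of the Lorentzian lineshape of Eq. 12.77 (the values of E_n and ΔE below are arbitrary placeholders): the transmission equals one at E = E_n and one half at E = E_n ± ΔE/2, so the full width at half-maximum is indeed ΔE.

```python
def breit_wigner(E, En, dE):
    """Breit-Wigner transmission of Eq. 12.77: resonance at En, FWHM dE."""
    return (dE / 2) ** 2 / ((dE / 2) ** 2 + (E - En) ** 2)

En, dE = 2.0, 0.1                            # hypothetical resonance parameters
T_peak = breit_wigner(En, En, dE)            # unity exactly at resonance
T_half = breit_wigner(En + dE / 2, En, dE)   # one half at the HWHM detuning
```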

The treatment of resonant tunneling that I have developed is remarkably
independent of the details of the shapes of the barriers constituting the double-
barrier structure. As long as the boundaries of the barriers are clearly defined, so
that I can write down a single-barrier transfer matrix T^(2) and the distance between
the barriers, w, I can use the results of this section. Do not get me wrong: the
parameters of T^(2), of course, depend on the details of the barrier's shape, but what
I want to say is that T^(2) can be computed independently of the double-barrier
problem once and for all, numerically if needed, and then used in the analysis of
the resonant tunneling.

So, I hope you are convinced by now that the resonant tunneling is a remarkable
phenomenon, which can be relatively simply described in terms of the reflection and

¹Gregory Breit was an American physicist known for his work in high-energy physics and his
involvement in the early stages of the Manhattan Project. Eugene Wigner was a Hungarian-
American theoretical physicist, winner of half of the 1963 Nobel Prize "for his contributions to the
theory of the atomic nucleus and the elementary particles, particularly through the discovery and
application of fundamental symmetry principles." In 1939 he participated in a fateful Einstein–
Szilard meeting that resulted in a letter to President Roosevelt prompting him to initiate work
on the development of atomic bombs. You might find this comment of his particularly intriguing:
"It was not possible to formulate the laws of quantum mechanics in a fully consistent way
without reference to consciousness," which he made in one of his essays published in the collection
"Symmetries and Reflections – Scientific Essays" (1995).

12.2 Resonant Tunneling 419

transmission coefficients of a single barrier. Still, you might feel a certain dissatisfac-
tion, because all these calculations do not really explain how passing through two
thick barriers instead of one can improve the probability of transmission, let alone
make it equal to one. They also do not clarify the role of the quantum superposition,
which, I claimed, plays a crucial role in this phenomenon. There are several distinct
ways to develop a more qualitative, intuitive understanding of this situation. The first is
naturally based on thinking about the quantum mechanical properties of the particle in
terms of waves, their superposition, and interference. To see how these ideas play out,
consider the expression for the amplitude reflection coefficient r_db, which determines
the relative contribution of the backward-propagating wave in the wave function
representing the state of the particle in the region z < 0:

ψ(z) = exp(ikz) + r_db exp(−ikz).

A careful look at Eq. 12.65 reveals that this expression describes a superposition
of two waves, both propagating backward, but with different phases. These
contributions originate from the multiple reflections of the waves representing the particle's
state between the boundaries of both barriers (this is why the second barrier is
crucial for this effect to occur). The only terms contributing to the phase difference
between them are exp(ikw + iφ_t) and exp(−ikw − iφ_t + iπ), where the extra iπ
in the argument of the exponent takes care of the negative sign appearing in front
of this expression in Eq. 12.65. The phase difference between these contributions
to the reflected (backward-propagating) component of the wave function is Δ =
2kw + 2φ_t + π, and if we want to suppress reflection by destructive interference,
we must require that Δ = π + 2πn, which results in exactly the condition for the
transmission resonance, kw + φ_t = πn.

It is also instructive to take a look at the spatial dependence of the probability
density |ψ(z)|² for resonant and off-resonant values of energy. The analytical
expression for this quantity is quite cumbersome, especially off resonance, so
I will spare you from having to suffer through its derivation, presenting instead
only the corresponding graphs obtained for the same values of the parameters as
in Fig. 12.4 for off- and on-resonance values of the particle's energy. The first two
graphs in Fig. 12.5 correspond to values of energy smaller and larger than the
energy of the first tunneling resonance. In both cases you can observe oscillations of
the probability in the region z < 0 due to interference between incident and reflected
waves. You should also notice that the relative probability of finding the particle
in front of the barrier at the maxima of the interference pattern significantly
exceeds the probability of finding the particle between the barriers or behind them (the
right boundary of the second barrier can be clearly identified in the graphs by
the absence of any interference pattern in the transmitted wave) for energies both
below and above the resonance. The situation, however, changes completely at the
resonance (the last graph in the figure). The most remarkable feature of this graph
is a pronounced increase of the likelihood that the particle is located between the
barriers. If we are dealing with a beam of many electrons incident on the structure,
this effect will result in an accumulation of electrons between the barriers, making


Fig. 12.5 Spatial dependence of the probability density |ψ(z)|² for energies below, above, and
equal to the energy of the first tunneling resonance. Parameters of the double-barrier structure are
the same as in Fig. 12.4, with the barrier width d = 0.4 nm

this region strongly negatively charged. The electric field associated with this strong
charge will repel incoming electrons, making it more difficult for additional electrons
to penetrate the barriers. This effect, called the Coulomb blockade, manifests itself as an
increase in the number of reflected electrons as the density of electrons
in the beam increases. For a very small distance between the barriers, the Coulomb
blockade can be so strong that even a single electron is capable of preventing
other electrons from entering the structure. Thanks to this phenomenon, physicists
and engineers gained the ability to count individual electrons and to develop single-electron
devices.

It is important to notice that the resonance probability distribution featured in
Fig. 12.5 corresponds to the smallest of the resonance energies, which satisfies
Eq. 12.69 with n = 1. Now I want you to take a look at the probability distributions
corresponding to resonance energies satisfying Eq. 12.69 with n = 2 and n = 3,
presented in Fig. 12.6.

Ignore for a second that the functions depicted in Figs. 12.5 and 12.6 do not
vanish at infinity, and compare them to those shown in Fig. 6.6, which presents the


Fig. 12.6 Spatial dependence of the probability density |ψ(z)|² at the resonance energies of the
second and third order (n = 2, 3)

wave functions corresponding to the first three bound energy levels in a rectangular
potential. Taking into consideration the obvious difference stemming from the
fact that the graphs in Fig. 6.6 are those of the real-valued wave functions, while
Figs. 12.5 and 12.6 depict |ψ(z)|², you cannot help noticing the eerie resemblance
between the two sets of graphs. You might also notice that the resonance condition,
Eq. 12.69, resembles an equation for the energy eigenvalues of the bound states.
Actually, in the limit d → ∞, this equation must exactly reproduce Eq. 12.50
with the obvious replacements d → w and V_w → 0, and it would be interesting to
demonstrate that. Will you dare to try? Do not be deceived by the term kw in
Eq. 12.69, which might make you think about the bound states of an infinite potential
well. The finite nature of the potential barriers arising in the limit d → ∞ is hidden
in the phase term φ_t, which plays the main role when recasting Eq. 12.69 into
the form of Eq. 12.50.

Anyway, this similarity between the resonance wave functions and those of the
bound states offers an alternative interpretation of the phenomenon of resonant
tunneling. Imagine that you start with a potential in which the barriers are infinitely
thick, so that you can place a particle in one of the stationary states of the respective
potential well. Then, using a magic wand, you reduce the thickness of the barriers
to a finite value. What will happen to the particle in this situation? Using what we
learned about the tunneling effect, you can intuit that the particle will "tunnel out"
of the potential well and escape to infinity. In formal mathematical language,
this can be rephrased by saying that the boundary condition for the corresponding
Schrödinger equation at z → ±∞ must take the form of a wave propagating to the
right (exp(ikz)) for z → ∞ and a wave propagating to the left (exp(−ikz)) for
z → −∞. These boundary conditions differ from the ones we used when deriving
the transmission and reflection coefficients by the absence of the wave exp(ikz)
incident on the potential from negative infinity. Correspondingly, the wave function
at z < 0 and z > 2d + w is now represented by the column vectors


v_0 = (0, r)^T;   v_4 = (t, 0)^T

similar to the bound state problem. Also similar to the bound state problem, you
will have to conclude that the transfer-matrix equation T^(4) v_0 = v_4 in this case has
non-zero solutions only if the element in the second row and second column of T^(4)
vanishes. Equation 12.63 then yields

−|r_1|² exp(ikw + iφ_t) + exp(−ikw − iφ_t) = 0

This equation can be transformed into a form more convenient for further
discussion:

exp(2ikw + 2iφ_t) ≡ exp(2iθ) = 1/|r_1|²    (12.78)

where I brought back the notation θ = kw + φ_t for the phase, used when
discussing tunneling resonances. Trying to solve this equation, for instance, by
graphing its left-hand and right-hand sides, you will immediately realize that it
does not have real-valued solutions (its left-hand side is complex-valued,
while the right-hand side is always real). More accurately, I should say that this
equation might have real solutions only if |r_1| is equal to unity for all energies,
which is equivalent to the requirement that the thickness of the barriers d becomes
infinite. If this is the case, Eq. 12.78 can be satisfied if 2θ = 2πn, which is, of
course, just Eq. 12.69, and I hope that by now you have already demonstrated that
this equation is equivalent to Eq. 12.50 in the limit of infinitely thick barriers.
We, however, are interested in the situation when the barriers are thick but finite,
so that |r_1|² is less than one, but not by much. Using the probability conservation
equation, |t_1|² + |r_1|² = 1, I can rewrite Eq. 12.78 as

exp(2iθ) = 1/(1 − |t_1|²)    (12.79)

and, using the condition |t_1|² ≪ 1, approximate it as

exp(2iθ) ≈ 1 + |t_1|²    (12.80)

where I used the well-known approximation

(1 + x)^α ≈ 1 + αx,

which is just the first two terms in the power series expansion of the function (1 + x)^α,
here with x = −|t_1|² and α = −1. Expecting that the solution of this equation deviates only slightly
from θ_n = πn, I will write θ as θ = πn + ε, where |ε| ≪ 1. The exponential on the
left-hand side of Eq. 12.80 in this case becomes


exp(2iπn + 2iε) = e^{2iε} ≈ 1 + 2iε,

and substituting it into Eq. 12.80, I find that ε is a purely imaginary quantity equal to

ε = −(i/2)|t_1|²

Thus, the wave number satisfying Eq. 12.80 acquires an imaginary part defined by
the equation

kw + φ_t = πn − (i/2)|t_1|² ≡ πn − (i/2)γ    (12.81)

where I used Eq. 12.73 to replace |t_1|² with γ.
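The small-γ expansion leading to Eq. 12.81 can be verified directly: Eq. 12.78 with |r_1|² = 1 − γ has the exact solution θ = πn + (i/2)ln(1 − γ), which reduces to πn − (i/2)γ for γ ≪ 1. A minimal sketch (the value of γ is an arbitrary small example):

```python
import cmath

gamma = 0.01                       # gamma = |t1|^2, an arbitrary small value
n = 1
# Exact solution of exp(2i*theta) = 1/(1 - gamma) on the branch near pi*n:
theta_exact = cmath.pi * n + 0.5j * cmath.log(1 - gamma)
theta_approx = cmath.pi * n - 0.5j * gamma                 # Eq. 12.81
residual = cmath.exp(2j * theta_exact) * (1 - gamma) - 1   # should vanish
```

The difference between the exact and approximate phases is of order γ², confirming that Eq. 12.81 keeps the leading term correctly.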
Gazing for some time at Eq. 12.81, you will realize that something not quite
kosher is happening here. When starting the calculations, we postulated that the
wave function at infinity is described by propagating waves with real wave numbers.
Well, it turns out that it is not possible to keep this assumption and satisfy all the other
boundary conditions. If you now substitute Eq. 12.81 into exp(±ikz), it will turn
into exp[±i(πn − φ_t)z/w] exp(±γz/(2w)), which explodes exponentially for both
z < 0 and z > 2d + w. Quite obviously, this is not an acceptable wave function,
as it cannot be normalized in either the regular or the δ-function sense. So, does it
mean that all our efforts for the last hour, hour and a half (I guess this is how long it
would take you to get through this segment of the book, but, believe me, it took me
much, much longer to write it), were in vain? Well, not quite, of course; why would
I bother you with this if they were? What I want to do now to save face is to take
the phase θ(E) in Eq. 12.81 and expand it as a function of energy around the point
E_n, where E_n is a resonant energy obeying the equation θ(E_n) = πn. This gives me

θ(E) ≈ πn + (dθ/dE)(E − E_n),

which I will substitute into Eq. 12.81:

πn + (dθ/dE)(E − E_n) = πn − (i/2)γ  ⟹

E = E_n − (i/2)γ (dθ/dE)^(−1) = E_n − (i/2)ΔE    (12.82)

where I used Eq. 12.75 to introduce the energy HWHM parameter ΔE. Quite clearly,
Eq. 12.81 cannot be satisfied with real values of energy, so the solutions found
cannot be eigenvalues of a Hermitian operator (which must be real), and, of course,
they are not. The problem we have been trying to solve lost its Hermitian
nature once the wave function was allowed not to vanish at infinity. So, if the
solutions found are not "true" energy eigenvalues, what are they? Can we ascribe
at least some physical meaning to them? Well, just by looking at Eq. 12.82, you can


notice that its real part coincides with the energy of the tunneling resonances, while
its imaginary part is equal to the HWHM parameter of those resonances. Plugging
this result into the time-dependent portion of the wave function (which has been
ignored so far), exp(−iEt/ℏ), will get you

ψ(t) ∝ e^{−iE_n t/ℏ} e^{−ΔE t/(2ℏ)}    (12.83)

The respective probability distribution, which normally would not depend on time, is
now exponentially decreasing:

P = |ψ|² ∝ exp(−ΔE t/ℏ)    (12.84)

with a characteristic time scale τ_E = ℏ/ΔE. Equation 12.84 actually admits quite
a natural physical interpretation. To see this, you need to recall the initial
assumption I made when I started discussing this approach to resonant tunneling.
The question I posed at that time was: What would happen to a particle placed in a
stationary state of a potential well with infinitely thick potential barriers if the width
of the barriers became large but finite? Physical intuition told us that a particle
in this case would be able to tunnel out of the well through the barriers, which means
that the probability of locating the particle inside the well would diminish with time.
We can understand Eq. 12.84 as a formal description of this decay of probability
due to tunneling. Thus, even though the wave functions I calculated do not
appear to have much physical or mathematical meaning, the complex eigenvalues
given by Eq. 12.82 contain important, physically relevant information: their real part
yields the energy of the tunneling resonances, while their imaginary part describes
both the resonance width, ΔE, and the time of the decay of the probability due to
tunneling, τ_E.

These complex eigenvalues are called quasi-energies, while the respective states are
known as "quasi-stationary states," "quasi-modes," or "resonance states." The term
quasi-stationary implies that a particle placed in such a state would not stay there
forever but would tunnel out after some time; the time τ_E can be understood as the
average lifetime of such a state. It is remarkable that the product τ_E ΔE is simply equal
to the Planck constant ℏ, making the relationship between the width of the resonance and
the lifetime of the respective quasi-stationary state similar to the uncertainty relation
between, say, the coordinate and momentum operators.
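To attach concrete numbers to the lifetime (a hypothetical example, not taken from the book): for a resonance width ΔE of 1 meV, the survival probability of Eq. 12.84 falls to 1/e after τ_E = ℏ/ΔE, a fraction of a picosecond, and the product τ_E ΔE equals ℏ by construction.

```python
import math

hbar = 1.054571817e-34          # J*s
dE = 1e-3 * 1.602176634e-19     # hypothetical 1 meV resonance width, in joules

tau = hbar / dE                 # lifetime of the quasi-stationary state

def survival(t):
    """Decay of the probability, Eq. 12.84."""
    return math.exp(-dE * t / hbar)
```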

A more accurate treatment of such time-dependent tunneling requires solving the
time-dependent Schrödinger equation with an appropriate initial condition, but even
this less-than-rigorous yet intuitively appealing approach can be used to infer
important information about the behavior of a particle in potentials similar to the
double-barrier potential considered here. This approach, together with the concept
of quasi-stationary states, was first introduced by George Gamow, an influential
Russian-American physicist, born in Odessa (Russian Empire, presently Ukraine),
educated in the Soviet Union, who defected to the West in 1933 as Stalin's purges began
to intensify. (One of his closest university friends, Matvei Bronstein, was executed


[Figure labels: attractive nuclear potential well; repulsive electric potential; energy of the alpha-particle; alpha-radioactive nucleus]

Fig. 12.7 A schematic of a potential barrier experienced by an alpha-particle inside of a nucleus

by the Soviet authorities in 1938 on trumped-up treason charges.) Gamow introduced
these states (sometimes called Gamow states) while developing the theory of alpha-
particle radioactivity. His idea was that the alpha-particles contained inside the
nucleus of a radioactive atom experience a potential in the form of a well followed
by a thick, but finite, barrier (see Fig. 12.7).

Radioactive decay in Gamow's theory was understood as a slow tunneling of α-
particles out of the nucleus. Gamow's approach can actually be modified to give
it more mathematical rigor and to turn the wave functions representing the quasi-
stationary states into physically and mathematically meaningful objects. However,
the modern variations of this concept are a topic lying far outside the scope of
this book, so let me just finish this chapter here.

12.3 Problems

Problem 148 Verify that the matrix equation, Eq. 12.8, is, indeed, equivalent to the
system of equations, Eqs. 12.6 and 12.7.

Problem 149

1. Write down boundary conditions for the wave function and its derivative
presented in Eq. 12.11.


Fig. 12.8 Potential described
in Problem 154

2. Find relations between the amplitudes A_1^(L,R), B_1^(L,R) and A_{1,2}, B_{1,2} that
convert the boundary conditions found in Part 1 of the problem into Eqs. 12.13
and 12.14.

Problem 150 Demonstrate that the coefficients A_0^(R) and B_0^(R) in Eqs. 12.46 and 12.47
coincide with the coefficients A_2 and B_2 in Eqs. 6.60 and 6.61 in Sect. 6.2.

Problem 151 Show that the coefficient B_2^(R) in Eq. 12.48 vanishes.

Problem 152 Prove that Eq. 12.53 is, indeed, reduced to A_2 = ±B_0 for energies
satisfying the dispersion equations 12.51 and 12.52.

Problem 153 Use the transfer-matrix method to find the equation for the bound
state in the asymmetric potential well

V(z) = { V_1,  z < 0
       { V_w,  0 < z < d
       { V_2,  z > d

where V_2 > V_1.

Problem 154 Consider an electron moving in a potential described as (see
Fig. 12.8)

V(z) = { ∞,    z < 0
       { 0,    0 < z < w
       { V_b,  w < z < w + d
       { 0,    z > w + d

You are interested in the properties of the electron in this potential for energies
0 < E < Vb. It is clear that for the particle incident on this potential from the left,
which is the only direction it can be incident from, the probability of reflection is


always equal to one, simply because the wave function at z < 0 vanishes and there
can be no transmitted particles. So, it appears that the effects of tunneling resonance
discussed in Sect. 12.2 have no place in this potential. At the same time, in the
limit d → ∞, this potential allows for at least one bound state localized mostly
within the region of the potential well. When the barrier width d becomes finite, this
bound stationary state begins leaking outside due to the tunneling effect, just as we
discussed in the section on resonant tunneling. Accordingly, we must expect this
potential to possess quasi-stationary states, but it is not clear how they are related to
the reflective properties of the potential. I suggest that you try to figure it out.

1. First, find the amplitude reflection coefficient assuming that to the right of the
potential there are both incident and reflected waves, so that the column vector
representing the wave function for z > w + d would look like

v_R = (r, 1)^T

(As always, the first element is occupied by the amplitude in front of the wave
propagating to the right, but in the case under consideration, this wave represents
reflected particles.) The wave function for 0 < z < w must vanish at z = 0,
which is achieved by a function of the form

ψ(z) = A e^{ikz} − A e^{−ikz} = A e^{ikw} e^{ik(z−w)} − A e^{−ikw} e^{−ik(z−w)}

so that the wave function immediately to the left of the potential discontinuity at z = w
is represented by the column

v_1 = (A e^{ikw}, −A e^{−ikw})^T

The wave function for z > w can be found using the standard combination of the
interface and free-propagation matrices (you will need two of the former and
one of the latter). Complete the transfer-matrix calculations and find the parameter r.
Look for any traces of possible resonant behavior, paying special attention to the
phase of r.

2. Now repeat these calculations assuming that there are no incident particles, so
that the wave function for z > d + w is represented by the column

v_R = (r, 0)^T

Derive an equation for the complex quasi-energies of the respective quasi-stationary
states. Assuming that the imaginary part of the quasi-energies is small, separate
this equation into equations for the real and imaginary parts, just as we did in
Sect. 12.2. Compare the results with those of the preceding calculations.

Chapter 13
Perturbation Theory for Stationary
States: Stark Effect and Polarizability of
Atoms

Only a few models in quantum mechanics allow for an exact analytical solution.
Most of the problems that are relevant to real-world situations and are
important for understanding the fundamental nature of things or for applications
can only be solved using one or another type of approximation. In this chapter I
will introduce a method designed for finding approximate solutions for the eigenvalues
and corresponding eigenvectors of a time-independent Hamiltonian with a discrete
spectrum. This method works for Hamiltonians that can be written as a sum of
two parts: the main or unperturbed Hamiltonian Ĥ_0, whose eigenvalues, E_s^(0), and
eigenvectors, |s⟩, are presumed to be known, and a perturbation λV̂. The parameter λ
that I pulled out of V̂ has the formal meaning of the strength of the perturbation, but
this can be understood literally only in the sense that the perturbation vanishes when
λ = 0. The actual parameter determining the strength of the perturbation emerges
only post factum, after the problem is solved. I will mainly use λ as a technical
bookkeeping device (you will know what that means when you see it) and set it equal
to unity at the end. The index s appearing in the notation for the eigenvalues and
the eigenvectors can be a composite index consisting of several subindexes. For
instance, if Ĥ_0 is the Hamiltonian of a hydrogen-like atom, then s contains the principal,
orbital, and magnetic numbers n, l, m. It is also presumed that the perturbation V̂
can be considered small in some yet undefined sense, so that the eigenvalues and
eigenvectors of the total Hamiltonian

Ĥ = Ĥ_0 + λV̂    (13.1)

do not deviate too much from E_s^(0) and |s⟩ correspondingly. Consequently,
one might hope that they can be found approximately using the eigenvalues and
eigenvectors of Ĥ_0 as a starting point.

© Springer International Publishing AG, part of Springer Nature 2018
L.I. Deych, Advanced Undergraduate Quantum Mechanics,
https://doi.org/10.1007/978-3-319-71550-6_13

The development of the method is quite different for non-degenerate unperturbed
eigenvalues and for degenerate ones. You can see where the difference comes
from by pondering the following. Whatever approximate expressions I derive
for the eigenvalues and eigenvectors, they must reduce to E_s^(0) and |s^(0)⟩ as I
set λ = 0. If E_s^(0) is a non-degenerate eigenvalue, so that |s^(0)⟩ is the only eigenvector
belonging to it, this process does not raise any issues. If, however, E_s^(0) is degenerate,
meaning that there are several unperturbed orthonormal eigenvectors |s_i^(0)⟩ belonging
to it, together with infinitely many linear combinations thereof, the outcome of the
transition λ → 0 becomes a mystery. You will learn eventually that, as
improbable as it might sound, it is the perturbation operator V̂ that "decides" the
outcome of this transition even as its own "strength" goes to zero. You can consider
these somewhat vague remarks as a teaser designed to spur your curiosity. All this
will become (hopefully) much clearer once we get down to it. At this point, my
goal is simply to justify the importance of separate consideration of degenerate and
non-degenerate unperturbed eigenvalues.

13.1 Non-degenerate Perturbation Theory

The non-degenerate case is more straightforward, so this is what I am going to
begin with. The idea is to present the unknown eigenvalues E_s and eigenvectors |s⟩
as power series of the form

E_s = E_s^(0) + λE_s^(1) + λ²E_s^(2) + λ³E_s^(3) + ⋯    (13.2)

|s⟩ = |s^(0)⟩ + λ|s^(1)⟩ + λ²|s^(2)⟩ + λ³|s^(3)⟩ + ⋯    (13.3)

and plug them into the stationary Schrödinger equation:

(Ĥ_0 + λV̂)|s⟩ = E_s|s⟩.    (13.4)

This procedure yields

Ĥ_0|s^(0)⟩ + λĤ_0|s^(1)⟩ + λ²Ĥ_0|s^(2)⟩ + ⋯ + λV̂|s^(0)⟩ + λ²V̂|s^(1)⟩ + λ³V̂|s^(2)⟩ + ⋯ =
E_s^(0)|s^(0)⟩ + λE_s^(0)|s^(1)⟩ + λE_s^(1)|s^(0)⟩ + λ²E_s^(0)|s^(2)⟩ + λ²E_s^(2)|s^(0)⟩ + λ²E_s^(1)|s^(1)⟩ + ⋯    (13.5)

For this expression to be true for an arbitrary value of the perturbation parameter λ,
it is necessary that the terms with the same power of λ on the left-hand side and on
the right-hand side of this equation be individually equal to each other:


λ⁰:  Ĥ_0|s^(0)⟩ = E_s^(0)|s^(0)⟩,    (13.6)

λ¹:  Ĥ_0|s^(1)⟩ + V̂|s^(0)⟩ = E_s^(0)|s^(1)⟩ + E_s^(1)|s^(0)⟩,    (13.7)

λ²:  Ĥ_0|s^(2)⟩ + V̂|s^(1)⟩ = E_s^(0)|s^(2)⟩ + E_s^(2)|s^(0)⟩ + E_s^(1)|s^(1)⟩.    (13.8)

Now you can see what I meant in saying that λ will only be used for bookkeeping
purposes: I use it to identify the different orders of approximation, and once that is
done, it can be set to unity.

Equation 13.6 is just the eigenvalue equation for the unperturbed Hamiltonian,
which, as I presumed, is fulfilled by E_s^(0) and |s^(0)⟩. Corrections to the eigenvalue and
the eigenvector proportional to λ (I will call them the first-order corrections) should
supposedly be found from Eq. 13.7. You might wonder if it is possible to find both
these unknown quantities from a single equation. Well, let's see.

First, I multiply Eq. 13.7 by ⟨s^(0)| from the left:

⟨s^(0)|Ĥ_0|s^(1)⟩ + ⟨s^(0)|V̂|s^(0)⟩ = E_s^(0)⟨s^(0)|s^(1)⟩ + E_s^(1)⟨s^(0)|s^(0)⟩.

Taking into account that ⟨s^(0)|s^(0)⟩ = 1 (normalization) and that ⟨s^(0)|Ĥ_0 =
E_s^(0)⟨s^(0)| (the Hermitian property of the Hamiltonian), I transform this expression into

E_s^(0)⟨s^(0)|s^(1)⟩ + ⟨s^(0)|V̂|s^(0)⟩ = E_s^(0)⟨s^(0)|s^(1)⟩ + E_s^(1),

in which the identical terms E_s^(0)⟨s^(0)|s^(1)⟩ on the two sides cancel. This gives me the
first-order correction to the energy eigenvalue:

E_s^(1) = ⟨s^(0)|V̂|s^(0)⟩.    (13.9)

So far so good: I got the correction to the energy, but what about the correction to
the eigenvector, |s^(1)⟩, which got canceled out? But fret not: the cancelation of
|s^(1)⟩ is not a bug but a feature, which allowed me to isolate and determine the
energy correction. In order to obtain |s^(1)⟩, I need to do something else, something
that would eliminate the E_s^(1) term from Eq. 13.7. One way to achieve this is
to premultiply the equation by an eigenvector of the unperturbed Hamiltonian
different from ⟨s^(0)|, say ⟨q^(0)|, where q = 1, 2, 3, …, s − 1, s + 1, … . Indeed, in this case
the term with E_s^(1) vanishes because of the orthogonality condition ⟨q^(0)|s^(0)⟩ = 0
for q ≠ s. The remaining expression in this case becomes
⟨q^(0)|Ĥ_0|s^(1)⟩ + ⟨q^(0)|V̂|s^(0)⟩ = E_s^(0)⟨q^(0)|s^(1)⟩  ⟹

E_q^(0)⟨q^(0)|s^(1)⟩ + V_qs = E_s^(0)⟨q^(0)|s^(1)⟩,

where I again used ⟨q^(0)|Ĥ_0 = E_q^(0)⟨q^(0)| and introduced the matrix element V_qs =
⟨q^(0)|V̂|s^(0)⟩ (note that the order of the indexes in V_qs follows their order in ⟨q^(0)|V̂|s^(0)⟩
from left to right). The result is an equation for ⟨q^(0)|s^(1)⟩, which yields


⟨q^(0)|s^(1)⟩ = V_qs / (E_s^(0) − E_q^(0)).    (13.10)

This quantity is the component of the unknown vector |s^(1)⟩ in the direction of the
vector |q^(0)⟩ and can be used to find the entire vector |s^(1)⟩ as follows. Since the |q^(0)⟩
are eigenvectors of a Hermitian operator and, therefore, form a basis, I can expand
|s^(1)⟩ in this basis as

|s^(1)⟩ = Σ_q |q^(0)⟩⟨q^(0)|s^(1)⟩ = Σ_{q≠s} |q^(0)⟩⟨q^(0)|s^(1)⟩ + ⟨s^(0)|s^(1)⟩|s^(0)⟩,

where in the sum I separated out the term with q = s. With ⟨q^(0)|s^(1)⟩
found, I am just one step away from finding the entire vector |s^(1)⟩: all I
need is the value of ⟨s^(0)|s^(1)⟩, which so far remains unknown. Help comes from
a familiar place, the normalization condition. Consider the found eigenvector with
accuracy up to the first order in λ:

|s⟩ = |s^(0)⟩ + λ⟨s^(0)|s^(1)⟩|s^(0)⟩ + λ Σ_{q≠s} [V_qs / (E_s^(0) − E_q^(0))] |q^(0)⟩,

and compute its norm ⟨s|s⟩:

⟨s|s⟩ = ⟨s^(0)|s^(0)⟩ + λ⟨s^(0)|s^(1)⟩⟨s^(0)|s^(0)⟩ +
λ Σ_{q≠s} [V_qs / (E_s^(0) − E_q^(0))] ⟨s^(0)|q^(0)⟩ + λ Σ_{q≠s} [V*_qs / (E_s^(0) − E_q^(0))] ⟨q^(0)|s^(0)⟩ + O(λ²).

I cannot include in this expression terms of second or higher order in λ,
because other terms of the same order were omitted from the initial expression
for |s⟩. The two sums in this expression are actually equal to zero because of the
orthogonality condition ⟨s^(0)|q^(0)⟩ = ⟨q^(0)|s^(0)⟩ = 0, so all that is left is

⟨s|s⟩ = 1 + λ⟨s^(0)|s^(1)⟩,

where I took into account that ⟨s^(0)|s^(0)⟩ = 1. Thus, if I want the norm of |s⟩ to be
equal to unity, I have to set ⟨s^(0)|s^(1)⟩ = 0. This is the last piece I needed to find the
first-order correction to the eigenvector, which I can now write down as

|s⟩ = |s^(0)⟩ + λ Σ_{q≠s} [V_qs / (E_s^(0) − E_q^(0))] |q^(0)⟩.    (13.11)


So, I am done with the first-order corrections to the eigenvalues and eigenvectors,
and now I can turn to finding the corrections proportional to λ², which are often
called second-order corrections. Going back to Eq. 13.8 and performing the same
magic trick of premultiplying it by ⟨s^(0)|, I find

⟨s^(0)|Ĥ_0|s^(2)⟩ + ⟨s^(0)|V̂|s^(1)⟩ = E_s^(0)⟨s^(0)|s^(2)⟩ + E_s^(2)⟨s^(0)|s^(0)⟩ + E_s^(1)⟨s^(0)|s^(1)⟩.

The first term on the left evaluates to E_s^(0)⟨s^(0)|s^(2)⟩ and cancels the first term on
the right. Remembering that ⟨s^(0)|s^(0)⟩ = 1 and ⟨s^(0)|s^(1)⟩ = 0, the remaining
expression becomes

E_s^(2) = ⟨s^(0)|V̂|s^(1)⟩ = Σ_{q≠s} V_sq V_qs / (E_s^(0) − E_q^(0)) = Σ_{q≠s} |V_sq|² / (E_s^(0) − E_q^(0)),    (13.12)

where I again introduced a matrix element, V_sq = ⟨s^(0)|V̂|q^(0)⟩, and took into account
that V_qs = V*_sq.

Premultiplying Eq. 13.8 by ⟨q^(0)|, I obtain

⟨q^(0)|Ĥ_0|s^(2)⟩ + ⟨q^(0)|V̂|s^(1)⟩ = E_s^(0)⟨q^(0)|s^(2)⟩ + E_s^(2)⟨q^(0)|s^(0)⟩ + E_s^(1)⟨q^(0)|s^(1)⟩.

Using again the Hermitian property of the Hamiltonian to compute the first term in the first line and the orthogonality of the zero-order eigenvectors to eliminate the middle term in the second line, I turn this expression into
$$
E_q^{(0)}\langle q^{(0)}|s^{(2)}\rangle + \langle q^{(0)}|\hat V|s^{(1)}\rangle = E_s^{(0)}\langle q^{(0)}|s^{(2)}\rangle + E_s^{(1)}\langle q^{(0)}|s^{(1)}\rangle.
$$

Now, using Eq. 13.9 as well as Eqs. 13.10 and 13.11, I can convert it into the following:
$$
\left(E_q^{(0)} - E_s^{(0)}\right)\langle q^{(0)}|s^{(2)}\rangle = \langle s^{(0)}|\hat V|s^{(0)}\rangle\,\frac{V_{qs}}{E_s^{(0)}-E_q^{(0)}} - \sum_{p\neq s}\frac{V_{qp}V_{ps}}{E_s^{(0)}-E_p^{(0)}} \;\Rightarrow\;
$$
$$
\langle q^{(0)}|s^{(2)}\rangle = -\frac{V_{ss}V_{qs}}{\left(E_s^{(0)}-E_q^{(0)}\right)^2} + \sum_{p\neq s}\frac{V_{qp}V_{ps}}{\left(E_s^{(0)}-E_p^{(0)}\right)\left(E_s^{(0)}-E_q^{(0)}\right)}.
$$

434 13 Perturbation Theory for Stationary States: Stark Effect and Polarizability of Atoms

Respectively, the second-order correction to the eigenvector becomes
$$
|s^{(2)}\rangle = -\sum_{q\neq s}\frac{V_{ss}V_{qs}}{\left(E_s^{(0)}-E_q^{(0)}\right)^2}\,|q^{(0)}\rangle + \sum_{q\neq s}\sum_{p\neq s}\frac{V_{qp}V_{ps}}{\left(E_s^{(0)}-E_p^{(0)}\right)\left(E_s^{(0)}-E_q^{(0)}\right)}\,|q^{(0)}\rangle, \qquad (13.13)
$$

where I again set $\langle s^{(0)}|s^{(2)}\rangle$ to zero as a normalization condition. Equations 13.12 and 13.13 complete my program of deriving the lowest-order corrections to the non-degenerate eigenvalues and eigenvectors of the Hamiltonian $\hat H$. Finding the third- and higher-order corrections becomes progressively more cumbersome and is rarely necessary.
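The second-order energy formula, Eq. 13.12, can be sanity-checked the same way as the first-order result. In this sketch (my own illustration; the matrices are hypothetical) $\lambda$ is taken large enough that the $\lambda^2$ term visibly matters:

```python
import math

# Same 2x2 toy model: H = H0 + lam*V with diagonal H0
E0_s, E0_q = 1.0, 3.0
V = [[0.5, 0.2], [0.2, -0.1]]
lam = 0.1

a = E0_s + lam * V[0][0]
d = E0_q + lam * V[1][1]
b = lam * V[0][1]

# Exact lower eigenvalue of the real symmetric 2x2 matrix
E_exact = (a + d) / 2 - math.sqrt(((a - d) / 2) ** 2 + b ** 2)

# Perturbation series through second order (Eq. 13.12 for the two-level case):
# E_s ~ E0_s + lam*V_ss + lam^2 * |V_qs|^2 / (E0_s - E0_q)
E_pt1 = E0_s + lam * V[0][0]
E_pt2 = E_pt1 + lam ** 2 * V[1][0] ** 2 / (E0_s - E0_q)

print(E_exact, E_pt1, E_pt2)  # E_pt2 agrees with E_exact to O(lam^3)
```

Note that the second-order term lowers the ground level, in line with the sign of the denominator $E_s^{(0)}-E_q^{(0)}$ for the lowest state.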

Ideally, from the point of view of minimizing one's efforts, it would be preferable to get all the answers just from the first-order terms. Often, however, as we already discussed in Chap. 10 on the two-level model and in Sect. 7.1.2 on the quantum harmonic oscillator, the diagonal elements of the perturbation part of the Hamiltonian, those that determine the first-order corrections to the eigenvalues, vanish. This happens, for instance, when the unperturbed Hamiltonian $\hat H_0$ is invariant with respect to the parity operator, so that its eigenvectors can be classified as even or odd. If, in addition, the perturbation operator $\hat V$ is odd (changes sign upon application of the inversion operator), then $\langle s^{(0)}|\hat V|s^{(0)}\rangle$ vanishes and takes the first-order correction to the energy along with it. If the first-order correction to the energy turns out to be zero, you have no choice but to rely on the second-order correction. If the second-order terms are not sufficient either, it usually means (there are exceptions, of course) that the perturbation approach is not suitable for the problem at hand.

13.1.1 Quadratic Stark Effect

Perturbation theory plays a crucial role in understanding the response of a quantum system to external influences such as electric or magnetic fields. Leaving the effects of a magnetic field for a separate chapter, in this section I will focus on the interaction between an atom and an electric field, $\mathcal E$, which, in many instances, can be assumed to be spatially uniform. I will also assume that this electric field is static, i.e., does not depend on time. Then the first order of business is to find the field-induced corrections to the energy spectrum and the corresponding eigenvectors of the unperturbed Hamiltonian. Once this is done, I can move on to analyzing how these changes manifest themselves in observable phenomena.

As a practical example, I will consider the effects of a static electric field on the ground state of a hydrogen atom (Chap. 8), which is the only non-degenerate energy level of hydrogen and can, therefore, be studied using the method just developed. So, assuming that $\hat H^{(0)}$ in Eq. 13.1 describes a hydrogen-like system considered in Chap. 8, I can replace the abstract zero-order eigenvectors $|s^{(0)}\rangle$ of the previous section with their more concrete version $|nlm\rangle$ and the energy eigenvalues with $E_n^{(0)} = -E_g/n^2$. The ground state is described by eigenvector $|100\rangle$ and energy $-E_g$, the expression for which can be found in Chap. 8. Indexes $s$ and $q$ are now replaced by three indexes, $n$, $l$, and $m$, and summation over $q$ involves summation over all three indexes subject to the regular restrictions on their values established in Chap. 8. The operator of perturbation $\hat V$ for a uniform electric field has a simple form, which you have already encountered in this book several times, for instance, in Chap. 10:
$$
\hat V = -e\mathcal E\,\hat z. \qquad (13.14)
$$

Here I chose the $Z$-axis of the coordinate system used to define the components of the angular momentum operators along the direction of the electric field. The first-order correction to the ground state vanishes, as was explained above, because the perturbation potential is odd with respect to inversion. Those who are skeptical about symmetry-based arguments can verify this statement directly, working, for instance, in the position representation:
$$
\langle 100|\hat V|100\rangle = -e\mathcal E\,\frac{1}{4\pi}\int_0^{\pi}\!\!\int_0^{2\pi}\!\!\int_0^{\infty} d\theta\,d\varphi\,dr\,\sin\theta\,r^2\left[R_{10}(r)\right]^2 r\cos\theta,
$$
where $R_{10}(r)$ is the radial component of the wave function representing the ground state of the hydrogen-like system, and I converted $z$ into spherical coordinates. Now you only need to compute the integral $\int_0^{\pi} d\theta\,\sin\theta\cos\theta$ over the polar angle $\theta$ to convince yourself (which shouldn't be too difficult) that it vanishes.
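For the truly skeptical, the polar integral can also be checked numerically; this little sketch (my own, using a composite Simpson rule from the standard library only) confirms that $\int_0^{\pi}\sin\theta\cos\theta\,d\theta = 0$:

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

# sin(t)*cos(t) = sin(2t)/2 is antisymmetric about t = pi/2,
# so the polar-angle factor of <100|V|100> integrates to zero.
I = simpson(lambda t: math.sin(t) * math.cos(t), 0.0, math.pi)
print(I)  # ~ 0
```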

With this issue clarified, let's take on the more difficult problem of finding the second-order correction to the ground state energy using Eq. 13.12. The most important thing to realize when using this equation is that the sum over $q$ is now three sums: over $n$, $l$, and $m$. The sum over $n$ starts with $n = 2$, because the term $n = 1$ is excluded from the summation by the condition $q \neq s$, which in our case translates to $n \neq 1$; the sum over $l$ runs from $0$ to $n-1$, and the sum over $m$ covers values from $-l$ to $l$:
$$
E_1^{(2)} = \sum_{n=2}^{\infty}\sum_{l=0}^{n-1}\sum_{m=-l}^{l}\frac{|V_{100,nlm}|^2}{E_1^{(0)}-E_n^{(0)}} = e^2\mathcal E^2\sum_{n=2}^{\infty}\sum_{l=0}^{n-1}\sum_{m=-l}^{l}\frac{|z_{100,nlm}|^2}{E_1^{(0)}-E_n^{(0)}}. \qquad (13.15)
$$

It should be noted that Eq. 13.15 does not tell the entire story, since the spectrum of a hydrogen atom also contains a continuous segment, which, strictly speaking, needs to be included. Moreover, since the potential due to a constant external field grows progressively more negative with growing coordinate $z$, at some point the total potential felt by the electron drops below any given negative energy of a bound state, opening a possibility for the electron to tunnel out of the nucleus's potential (at which point the atom becomes ionized). This turns the stationary states of the electron into quasi-stationary ones, similar to the situation considered in Sect. 12.2. All these complications, however, can be ignored for a weak enough field because (a) for states from the continuous spectrum, the energy denominators in Eq. 13.12 become so large that the contribution of the corresponding terms can be safely neglected, and (b) even though the bound states become formally quasi-stationary, if the field is not too strong, their lifetime is long enough to treat them as normal stationary states.

Now my task is to compute the matrix elements $z_{100,nlm}$, which, using the hydrogen wave functions from Chap. 8, can be written down as
$$
z_{100,nlm} = \frac{1}{\sqrt{4\pi}}\int_0^{\pi}\!\!\int_0^{2\pi}\!\!\int_0^{\infty} d\theta\,d\varphi\,dr\,\sin\theta\,r^3\cos\theta\,Y_l^m(\theta,\varphi)\,R_{n,l}(r)R_{1,0}(r). \qquad (13.16)
$$

To evaluate the angular portion of the integral
$$
\int_0^{\pi}\!\!\int_0^{2\pi} d\theta\,d\varphi\,\sin\theta\cos\theta\,Y_l^m(\theta,\varphi),
$$
let me first notice that this integral vanishes for all values of the magnetic number $m$, with the exception of $m = 0$. To see this, just recall that the spherical harmonics $Y_l^m(\theta,\varphi)$ contain the factor $\exp(im\varphi)$, which in this integral is the only factor containing the azimuthal angle $\varphi$. Integration of $\exp(im\varphi)$ over the entire range of $\varphi$ between $0$ and $2\pi$ yields zero unless $m = 0$, when the value of the integral becomes $2\pi$. Having disposed of the integration with respect to $\varphi$, I am left with the integral over the polar angle:

$$
\int_0^{\pi}\!\!\int_0^{2\pi} d\theta\,d\varphi\,\sin\theta\cos\theta\,Y_l^m(\theta,\varphi) = 2\pi\sqrt{\frac{2l+1}{4\pi}}\,\delta_{m,0}\int_0^{\pi} d\theta\,\sin\theta\cos\theta\,P_l(\cos\theta) = \sqrt{\pi(2l+1)}\,\delta_{m,0}\int_{-1}^{1} x P_l(x)\,dx,
$$
where I replaced the spherical harmonic $Y_l^0$ with the regular Legendre polynomial and made the substitution of variables $x = \cos\theta$. To evaluate the remaining integral, I recall that $x = P_1(x)$, so that the last expression can be rewritten as
$$
\sqrt{\pi(2l+1)}\,\delta_{m,0}\int_{-1}^{1} P_1(x)P_l(x)\,dx.
$$


All that is left now is to invoke the orthogonality of the Legendre polynomials and use Eq. 5.71 from Sect. 5.1.4 adapted to the case $m = 0$:
$$
\int_{-1}^{1} P_{l_1}(x)P_l(x)\,dx = \frac{2}{2l+1}\,\delta_{l l_1}.
$$

Applying this formula to the case under consideration ($l_1 = 1$), I obtain the final result for the angular part of the matrix element:
$$
\int_0^{\pi}\!\!\int_0^{2\pi} d\theta\,d\varphi\,\sin\theta\cos\theta\,Y_l^m(\theta,\varphi) = \sqrt{\pi(2l+1)}\,\frac{2}{2l+1}\,\delta_{m,0}\delta_{l,1} = \frac{2\sqrt{\pi}}{\sqrt{3}}\,\delta_{m,0}\delta_{l,1}.
$$
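The Legendre-polynomial step is easy to cross-check with the standard library alone. In this sketch (my own), $P_l$ is generated from the Bonnet recurrence, $\int_{-1}^{1}xP_l(x)\,dx$ is evaluated with a composite Simpson rule, and the two closed forms of the angular factor are compared:

```python
import math

def legendre(l, x):
    """P_l(x) via the Bonnet recurrence (k+1)P_{k+1} = (2k+1)x P_k - k P_{k-1}."""
    p0, p1 = 1.0, x
    if l == 0:
        return p0
    for k in range(1, l):
        p0, p1 = p1, ((2 * k + 1) * x * p1 - k * p0) / (k + 1)
    return p1

def simpson(f, a, b, n=2000):
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

# int_{-1}^{1} x P_l(x) dx equals 2/3 for l = 1 and zero otherwise
vals = {l: simpson(lambda x: x * legendre(l, x), -1.0, 1.0) for l in range(4)}
print({l: round(v, 6) for l, v in vals.items()})

# The two forms of the l = 1, m = 0 angular factor coincide: ~2.0467
print(math.sqrt(3 * math.pi) * 2 / 3, 2 * math.sqrt(math.pi) / math.sqrt(3))
```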

Substituting this result into Eq. 13.16, I get
$$
z_{100,nlm} = \frac{1}{\sqrt{3}}\,\delta_{m,0}\delta_{l,1}\int_0^{\infty} dr\,r^3 R_{n,1}(r)R_{1,0}(r).
$$

Replacing the integration variable $r$ with its dimensionless counterpart $x = Zr/a_B$, where all notations are taken from Chap. 8, the expression for the matrix element can be recast as
$$
z_{100,nlm} = \frac{1}{\sqrt{3}}\,\delta_{m,0}\delta_{l,1}\left(\frac{a_B}{Z}\right)^4\int_0^{\infty} dx\,x^3 R_{n,1}(x)R_{1,0}(x).
$$

The radial wave functions can be read off Eq. 8.21 for the hydrogen wave functions. In terms of the dimensionless variable $x$, the ground state ($n = 1$, $l = 0$) function takes the form
$$
R_{1,0}(x) = 2\sqrt{\left(\frac{Z}{a_B}\right)^3}\exp(-x)
$$
(recall that the Laguerre polynomial $L_0^1(2x) \equiv 1$), while $R_{n,1}(x)$ becomes
$$
R_{n,1}(x) = \sqrt{\left(\frac{2Z}{na_B}\right)^3\frac{(n-2)!}{2n(n+1)!}}\,\frac{2x}{n}\exp\left(-\frac{x}{n}\right)L_{n-2}^{3}\!\left(\frac{2x}{n}\right).
$$
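Radial functions like these are easy to misread off a formula sheet, so here is a self-contained check (my own sketch; the dimensional prefactor $(Z/a_B)^{3/2}$ is divided out, so normalization reads $\int_0^\infty x^2 R_{n,1}^2(x)\,dx = 1$). The associated Laguerre polynomial is built from its explicit sum:

```python
import math

def genlaguerre(m, alpha, y):
    """Associated Laguerre polynomial L^alpha_m(y) from its explicit sum."""
    return sum((-1) ** k * math.comb(m + alpha, m - k) * y ** k / math.factorial(k)
               for k in range(m + 1))

def R_n1(n, x):
    """R_{n,1} as written above, in units of (Z/a_B)^{3/2}, with x = Z r / a_B."""
    norm = math.sqrt((2.0 / n) ** 3 * math.factorial(n - 2)
                     / (2 * n * math.factorial(n + 1)))
    y = 2.0 * x / n
    return norm * y * math.exp(-x / n) * genlaguerre(n - 2, 3, y)

def simpson(f, a, b, n=6000):
    h = (b - a) / n
    s = f(a) + f(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * f(a + k * h)
    return s * h / 3

# Each normalization integral should come out equal to 1
norm10 = simpson(lambda x: x ** 2 * (2 * math.exp(-x)) ** 2, 0.0, 150.0)
norms = {n: simpson(lambda x: x ** 2 * R_n1(n, x) ** 2, 0.0, 150.0) for n in range(2, 6)}
print(round(norm10, 6), {n: round(v, 6) for n, v in norms.items()})
```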


Substituting these expressions into the formula for the matrix element, I end up with
$$
z_{100,nlm} = \frac{1}{\sqrt{3}}\,\delta_{m,0}\delta_{l,1}\left(\frac{a_B}{Z}\right)^4 2\sqrt{\left(\frac{Z}{a_B}\right)^3}\sqrt{\left(\frac{2Z}{na_B}\right)^3\frac{(n-2)!}{2n(n+1)!}}\,\frac{2}{n}\int_0^{\infty} dx\,x^3\,x\,e^{-x}e^{-x/n}L_{n-2}^{3}\!\left(\frac{2x}{n}\right) =
$$
$$
\delta_{m,0}\delta_{l,1}\,\frac{8}{\sqrt{3}}\,\frac{a_B}{Zn^3}\sqrt{\frac{(n-2)!}{(n+1)!}}\int_0^{\infty} dx\,x^4 e^{-x(1+1/n)}L_{n-2}^{3}\!\left(\frac{2x}{n}\right) = \delta_{m,0}\delta_{l,1}\,\frac{a_B}{Z}\,f(n), \qquad (13.17)
$$

where I introduced the function $f(n)$, which depends only on the principal quantum number $n$:
$$
f(n) = \frac{8}{\sqrt{3}}\,\frac{1}{n^3}\sqrt{\frac{(n-2)!}{(n+1)!}}\int_0^{\infty} dx\,x^4 e^{-x(1+1/n)}L_{n-2}^{3}\!\left(\frac{2x}{n}\right). \qquad (13.18)
$$

Now the second-order correction to the ground state energy can be written down as
$$
E_1^{(2)} = -\frac{8\pi\varepsilon_r\varepsilon_0}{Z^4}\,\mathcal E^2 a_B^3\sum_{n=2}^{\infty}\frac{n^2}{n^2-1}\,f^2(n), \qquad (13.19)
$$

where I replaced $E_n^{(0)}$ with their actual values $-E_g/n^2$ and used the explicit expression for $E_g$ in terms of the Bohr radius from Eq. 8.17. In principle, the function $f(n)$ can be found analytically for arbitrary $n$, but the result is not worth the effort, since we would end up with a nasty-looking sum over $n$, which we wouldn't be able to evaluate exactly anyway. Thus, instead, I will evaluate $f(n)$ only for $n = 2, 3, 4, 5$ and use the results to compute $E_1^{(2)}$ approximately, including only these terms in the sum. To help you digest and reproduce these computations, I am providing some intermediate results in Table 13.1.

Table 13.1 Data for calculating the quadratic Stark effect

  $n$   $L_{n-2}^3(2x)$                       $f(n)$
  2     $1$                                   0.7449
  3     $4 - 2x$                              0.2983
  4     $2(5 - 5x + x^2)$                     0.1759
  5     $20 - 30x + 12x^2 - \frac{4}{3}x^3$   0.1205
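The entries of Table 13.1 can be reproduced with a short script (my own check, not from the text). The Laguerre polynomial is expanded from its explicit sum, so the integral in Eq. 13.18 can be done analytically term by term using $\int_0^\infty x^{4+k}e^{-bx}\,dx = (4+k)!/b^{5+k}$; the script also evaluates the four-term sum $\sum n^2 f^2(n)/(n^2-1)$ that feeds Eq. 13.19:

```python
import math

def f(n):
    """f(n) of Eq. 13.18. L^3_{n-2}(2x/n) = sum_k c_k x^k is expanded explicitly,
    and each term integrates to c_k * (4+k)! / b^(5+k) with b = 1 + 1/n."""
    b = 1.0 + 1.0 / n
    integral = sum((-1) ** k * math.comb(n + 1, n - 2 - k) * (2.0 / n) ** k
                   / math.factorial(k) * math.factorial(4 + k) / b ** (5 + k)
                   for k in range(n - 1))
    return (8.0 / math.sqrt(3.0) / n ** 3
            * math.sqrt(math.factorial(n - 2) / math.factorial(n + 1)) * integral)

for n in range(2, 6):
    print(n, round(f(n), 4))   # 0.7449, 0.2983, 0.1759, 0.1205

# Four-term partial sum entering Eq. 13.19: ~0.888
S = sum(n ** 2 / (n ** 2 - 1) * f(n) ** 2 for n in range(2, 6))
print(round(S, 4))
```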


Using these data, you can now easily evaluate Eq. 13.19 to yield
$$
E_1^{(2)} = -\frac{8\pi\varepsilon_r\varepsilon_0}{Z^4}\,\mathcal E^2 a_B^3\left(0.7399 + 0.1001 + 0.0330 + 0.0151 + \cdots\right) \approx -1.78\,\frac{4\pi\varepsilon_r\varepsilon_0}{Z^4}\,\mathcal E^2 a_B^3. \qquad (13.20)
$$

Adding more terms to the sum does not affect the numerical coefficient much: going from four terms to 200 changes this factor by about 2%. It is interesting to note that the ground state energy of hydrogen in an electric field can be found exactly, without resorting to perturbation theory. The exact theory is too complicated to discuss in this book, but no one can forbid me to use its result for comparison. The exact solution produces the factor 2.25 instead of 1.78, a difference of about 20%. I would say that this is a pretty decent approximation, given the difference in the amount of effort required to derive the approximate and exact results.

Equation 13.20 can also be recast in another illuminating form. Multiplying the numerator and denominator of this equation by $e^2$, you can notice that the resulting expression contains the ground state energy $E_g$ in its denominator. Making this fact explicit, you can rewrite Eq. 13.20 in the form
$$
E_1^{(2)} = -\frac{0.89}{Z^2}\,\frac{e^2 a_B^2\mathcal E^2}{E_g}, \qquad (13.21)
$$

where the numerator has a clear physical meaning: $ea_B\mathcal E$ is the change of the potential energy of the electron in the field $\mathcal E$ over a distance equal to the "size" of the atom as expressed by the Bohr radius $a_B$. This expression also makes it much easier to get a feeling for the numerical magnitude of the Stark effect. Knowing that the Bohr radius (assuming that we are dealing with an actual hydrogen atom in vacuum) is $a_B = 5.29\times 10^{-11}$ m, and taking $\mathcal E = 10^6$ V/m as a typical value for the electric field, I find $ea_B\mathcal E \approx 5\times 10^{-5}$ eV. Recalling that the ground state energy of hydrogen in vacuum is $13.6$ eV, I can estimate the quadratic Stark correction to the energy (ignoring numerical coefficients of the order of unity) as being of the order of $10^{-10}$ eV. This change in energy levels is observed by measuring the electric field-induced shift of the absorption or emission lines in the hydrogen spectrum. An energy shift of the order of $10^{-10}$ eV translates into a frequency shift of about $10^5$ Hz. This shift of spectral lines is what is known as the Stark effect, and because in the case considered in this section the shift is quadratic in the field, it is qualified as the quadratic Stark effect. The effect was discovered in 1913 by the German physicist Johannes Stark, who was awarded the 1919 Nobel Prize in Physics for this discovery. Stark was an active supporter of the Nazi regime and was closely involved in the Deutsche Physik movement, whose goal was to cleanse German science of foreign, mostly Jewish, influence. It was he who described Heisenberg as a "White Jew" after Heisenberg publicly defended Einstein's relativity theory. Stark was probably the only famous physicist who, after the war, was sentenced to a prison term for collaboration with Hitler's regime.
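The back-of-the-envelope numbers above are easy to reproduce; this is my own arithmetic sketch for hydrogen in vacuum ($Z = 1$), with constants in eV-friendly units so the electron charge cancels:

```python
# Order-of-magnitude estimate of the quadratic Stark shift (Eq. 13.21, Z = 1)
a_B = 5.29e-11        # Bohr radius, m
E_field = 1e6         # a typical laboratory field, V/m
E_g = 13.6            # hydrogen ground-state energy, eV
h = 4.136e-15         # Planck constant, eV*s

e_aB_E = a_B * E_field          # e*a_B*E expressed in eV
dE = 0.89 * e_aB_E ** 2 / E_g   # quadratic Stark shift, eV

print(e_aB_E)   # ~5e-5 eV
print(dE)       # ~2e-10 eV
print(dE / h)   # corresponding frequency shift, of order 1e4-1e5 Hz
```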


13.1.2 Atom’s Polarizability

As you just saw, the modification of the energy eigenvalues in the presence of the
electric field m