Statistics for Experimental Physics

1. Measurement Uncertainty
2. Standard Deviation and Uncertainty
3. Significant Figures
4. Error Propagation
5. Agreement and Difference Significance

1. Measurement Uncertainty

Unfortunately, no measurement is perfect. Every instrument you ever use will have some margin
of error. For example, say you have a ruler with millimeter markings and you are measuring the
diameter of a cylindrical object that is between 5 and 6 cm. You may see that the measurement
lies between the second and third mm markings past 5 cm, but you cannot precisely measure
how far it is between marks. Your measurement could be 5.25 cm, 5.28 cm, 5.23 cm ... or
anything between 5.2 cm and 5.3 cm—you just know that it’s somewhere between the two! In
this case, you could report your measurement as 5.25 ± 0.05 cm, or 5.25 cm ± 0.95% (note that
0.05/5.25 = 0.0095). In general, a measurement is reported as r ± δr, though sometimes relative
uncertainties (given as percentages) are more useful: $r \pm \frac{\delta r}{r} \cdot 100\%$.

Here we use a lowercase delta, δ, to indicate measurement uncertainty. δr is one single number,
and would be read as “delta r,” or “the uncertainty in r.” When you take a measurement, you’ll
look at the instrument you use and estimate how much you may be off. We’ll also use the same
symbol to indicate the total error on a calculated quantity, for which the uncertainty had to be
found through error propagation (covered below).

Part of being an honest experimenter is knowing and reporting the experimental uncertainties in
your equipment. For analog instruments, a good rule of thumb is to take the smallest increment
marked on the instrument and divide by two to find the uncertainty. This is what we did with the
ruler in the example above.
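
The half-division rule and the conversion to a relative uncertainty can be sketched in a few lines of Python. This is only an illustration; the function names are made up for this example and are not part of any lab software.

```python
def analog_uncertainty(smallest_division):
    """Rule of thumb for analog instruments: half the smallest marked increment."""
    return smallest_division / 2


def relative_uncertainty_percent(value, uncertainty):
    """Uncertainty expressed as a percentage of the measured value."""
    return abs(uncertainty / value) * 100


# Ruler example from the text: millimeter markings (0.1 cm), reading of 5.25 cm.
delta_r = analog_uncertainty(0.1)
percent = relative_uncertainty_percent(5.25, delta_r)
print(f"5.25 ± {delta_r:.2f} cm  (± {percent:.2f}%)")
```

For the ruler this reproduces the 0.05 cm and 0.95% quoted above.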

For digital instruments, the uncertainty (sometimes called the “tolerance”) should be indicated on
the instrument or in its user manual. Whoever builds the instrument must do careful calibration
to measure how uncertain its readings are. Use this uncertainty when available. In this lab course,
if you come across an instrument without a stated uncertainty, you may assume the digital
readout was properly chosen and the uncertainty is 1 in the last digit shown. For example, if a
bathroom scale reads out to the kilogram, the uncertainty would be ±1 kg.

Before continuing, we need to define some terms that are associated with error and error
propagation:

In everyday English “error” means “mistake,” but in statistics and science we use it a different
way. We’ll use “uncertainty” and “error” interchangeably as terms for the limitation on how well
we have measured a value. The smaller the uncertainty/error, the better we have determined the
value. One important thing to understand about error is that it cannot be completely eliminated
from experiments and analysis of data. Error can be minimized to acceptable levels but cannot
be ignored. Since uncertainty and error cannot be completely ignored, we must have a process in
place to deal with error and uncertainty. Over the duration of this lab we will use multiple
methods of addressing and stating error. These methods begin with an understanding of how to
define error in terms of its nature.

There are two main categories of error. Our labs will mostly focus on the first, random error, but
it’s important to understand systematic error as well.

Random error is uncertainty caused by unknown and unpredictable (random) changes in the
experiment, including the physical setup, the instruments used, and the fundamental nature of
what’s being measured. For example, if you were trying to determine the top speed of a certain
model of racecar by taking many measurements of how fast you could make it go, the random
error would have contributions from the physical setup (like the density of air and temperature of
the air when you did the test), the instruments used (the measurement limitations of the
speedometer or your radar speed gun), and the nature of what’s being measured (you’d use
several cars of the same model, but they would all have very slightly different properties from
variations in the manufacturing process). This means you would get many different numbers for
the top speed, and the variation of these numbers would tell you how well you determined the
true top speed. Remember, none of this variation is about making mistakes, like using a different
model car or taking a reading before maximum speed was achieved.

Systematic error is uncertainty resulting from a bias in the measurement or theory,
consistently leading to values that are off in one particular direction.

• Systematic errors can arise from instruments that are miscalibrated. For example, if you
have two thermometers that read different temperatures when measuring the exact
same substance, you know that at least one of them is miscalibrated and will likely be
off every time you measure a temperature with it. (Unfortunately it’s not always obvious
which thermometer is wrong, or by how much!)

• Systematic errors can also arise from an experimental design that doesn’t take all relevant
factors into account. For example, if you’re measuring the acceleration due to gravity and
you're not working in a vacuum, but you haven’t taken air resistance into account, your
measurements for g will be systematically wrong (by an unknown amount).

• Systematic errors also show up when the theory used does not represent the physical
system being investigated. An example of this is when there is an error in the formula
being used to investigate data. Let’s say a vacuum chamber is used in conjunction with
an electronically controlled drop time mechanism to determine height and drop time of an
object falling in the chamber. Let’s say that the scientists use the following equation to
calculate the acceleration of the object falling in the chamber:

$$a = \frac{\text{drop height}}{2 \cdot (\text{drop time})^2}$$

In this case, no matter how hard the experimenter tries to take better data, the calculated
acceleration will always be about one fourth the physically realistic gravitational
acceleration value. This is because the theoretical equation used for the calculation of
acceleration does not accurately represent the relationship between gravitational
acceleration, distance and time.
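
A short numerical check makes the factor of one fourth concrete. The sketch below assumes idealized, noise-free drop times generated from the constant-acceleration relation h = (1/2)·g·t², and an assumed value g = 9.81 m/s²; the function names and drop heights are invented for illustration.

```python
import math

G_TRUE = 9.81  # m/s^2, assumed "true" value for this illustration


def drop_time(height, g=G_TRUE):
    """Ideal free-fall time from rest through `height`, from h = g t^2 / 2."""
    return math.sqrt(2 * height / g)


def accel_flawed(height, time):
    """The flawed formula from the text: a = h / (2 t^2)."""
    return height / (2 * time**2)


def accel_correct(height, time):
    """Constant acceleration from rest: a = 2 h / t^2."""
    return 2 * height / time**2


for h in (0.5, 1.0, 2.0):  # drop heights in meters (illustrative)
    t = drop_time(h)
    print(f"h = {h} m: flawed/true = {accel_flawed(h, t) / G_TRUE:.3f}, "
          f"correct/true = {accel_correct(h, t) / G_TRUE:.3f}")
```

The ratio for the flawed formula is the same at every height, which is the signature of a systematic error: taking more or better data does not move it.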

2. Standard Deviation and Uncertainty

Often in science we’ll try to measure the same thing many times (like the mass of an electron, or
the speed of light). The key here is that the value itself should be a single, constant number, so
any variation in our measurements is related to measurement uncertainty. For example, you can
ask “what is my weight right now?”, but not “what is my weight (over my entire lifetime)?” The
second number has no single, well-defined value, since it’s changed dramatically from infancy to
now.

When we report the value after many measurements, we want to report our best estimate for the
true value, along with the uncertainty to show how precise our measurement was. Our best
estimate is the mean (average) of our measurements.

If we’ve measured the same thing many times, we can use as our uncertainty the standard
deviation (represented with the symbol σ, lowercase sigma). This measures how big the scatter
in our measurements is. We calculate this with the following formula, where N is the number of
trials you ran, and $\bar{x}$ is your average value of all measured quantities (x1, x2, x3, ... up to xN),

$$\sigma = \sqrt{\frac{1}{N-1}\left[(\bar{x} - x_1)^2 + (\bar{x} - x_2)^2 + \cdots + (\bar{x} - x_N)^2\right]}$$

We can report a single measurement as, for example, x3 ± 𝜎. The standard deviation is the
appropriate uncertainty on a single one of our measurements, but the uncertainty on the average
of our measurements is smaller (the point of averaging is to reduce the uncertainty in our final
number). So if we average N measurements, the uncertainty on that average value is given by the
standard error of the mean,

$$\sigma_{\text{SEM}} = \frac{\sigma}{\sqrt{N}}$$

We would report our average as $\bar{x} \pm \sigma_{\text{SEM}}$, with a fairly high probability (roughly 68% if the
scatter is normally distributed) of the true value lying between $\bar{x} - \sigma_{\text{SEM}}$ and
$\bar{x} + \sigma_{\text{SEM}}$. Notice that as we collect more data and N gets larger, our standard error
of the mean will get smaller (although our standard deviation will remain about the same). It can
be difficult and time-consuming to collect data, so there’s a balance between completing the
experiment in a reasonable time and having an acceptably small error.
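
To get a feel for this trade-off, the sketch below shows how the standard error of the mean shrinks with N, and how many measurements a target uncertainty would require. The scatter of 2.0 (arbitrary units) and the function names are hypothetical, chosen only for illustration.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean for n measurements with scatter sigma."""
    return sigma / math.sqrt(n)


def measurements_needed(sigma, target_sem):
    """Smallest N for which sigma / sqrt(N) is at or below target_sem."""
    return math.ceil((sigma / target_sem) ** 2)


sigma = 2.0  # hypothetical scatter of a single measurement
for n in (4, 16, 64):
    print(f"N = {n:2d}: SEM = {standard_error(sigma, n)}")

print("N needed for SEM of 0.5:", measurements_needed(sigma, 0.5))
```

Because of the square root, halving the uncertainty on the average takes four times as many measurements.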
Example:

Let’s say we have a jar of jelly beans and we have N = 5 people taking a guess on how many
jelly beans are in the jar. The first person’s guess will be represented using the variable x1, the
second person’s guess will be represented as x2, and so on.

x1 = 235, x2 = 202, x3 = 215, x4 = 190, and x5 = 185.
The average of the guesses is:

$$\bar{x} = \frac{1}{N}(x_1 + x_2 + x_3 + x_4 + x_5)$$

$$\bar{x} = \frac{1}{5}(235 + 202 + 215 + 190 + 185) = 205.4$$

The standard deviation for the set of guesses is:

$$\sigma = \sqrt{\frac{1}{N-1}\left[(\bar{x} - x_1)^2 + (\bar{x} - x_2)^2 + (\bar{x} - x_3)^2 + (\bar{x} - x_4)^2 + (\bar{x} - x_5)^2\right]}$$

$$\sigma = \sqrt{\frac{1}{5-1}\left[(205.4 - 235)^2 + (205.4 - 202)^2 + (205.4 - 215)^2 + (205.4 - 190)^2 + (205.4 - 185)^2\right]}$$

$$\sigma = 20.2064346$$

Lastly, the uncertainty on the average is:

$$\sigma_{\text{SEM}} = \frac{\sigma}{\sqrt{N}} = \frac{20.2064346}{\sqrt{5}} = 9.03659228$$

At this point there are no more calculations to be done so we round to the nearest jelly bean and
the result of our best guess and the uncertainty on that guess for the number of beans in the jar is:

205 ± 9 jelly beans.
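
The same calculation can be checked with Python's standard library. This is a minimal sketch: `statistics.stdev` uses the N − 1 denominator, matching the formula for σ above, and the variable names are our own.

```python
import math
import statistics

guesses = [235, 202, 215, 190, 185]

mean = statistics.mean(guesses)          # best estimate
sigma = statistics.stdev(guesses)        # sample standard deviation (N - 1)
sem = sigma / math.sqrt(len(guesses))    # standard error of the mean

print(f"mean  = {mean}")
print(f"sigma = {sigma:.7f}")
print(f"SEM   = {sem:.8f}")
print(f"result: {round(mean)} ± {round(sem)} jelly beans")
```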

3. Significant Figures
Notice what we did at the end of the example above, rounding our best estimate and the
uncertainty on it. The number of significant figures we use on our estimate is determined by the
uncertainty.
Our rules for significant figures:

1) Round your uncertainty to one significant figure, unless the uncertainty begins with
the digit 1. In that case, round to two significant figures.

2) Round your result to agree in decimal place with your uncertainty.
The basic idea behind both of these rules is that you want to state your results to the precision
you actually know them, and no farther. In your lecture class you will probably be given some
shorthand rule for significant figures. Use that rule in your lecture class, but keep in mind that
it’s just a nod to this real principle from experimental science about the limitations of
measurements. In lab class, we’ll be explicitly calculating the uncertainties, and can therefore
treat significant figures properly.
Consider the following examples and think about whether they are correctly or incorrectly stated.
Ex. 1: Brianna’s height is 165.0 ± 0.5 cm
Ex. 2: Brianna’s height is 165 ± 0.5 cm
Ex. 3: Jose’s height is 173 ± 1 cm
Ex. 4: Jose’s height is 173.3 ± 1.0 cm
Ex. 5: the area of San Francisco is 600. ± 2 km²
Ex. 6: the area of San Francisco is 6.00×10² ± 2 km²
Ex. 7: The mass of an electron is 9.109×10⁻³¹ ± 1.560×10⁻³³ kg
Ex. 8: The mass of an electron is 9.109×10⁻³¹ ± 1.6×10⁻³³ kg
Ex. 9: The mass of an electron is (9.109 ± 0.016)×10⁻³¹ kg
Answers to examples:
Ex. 1: correct
Ex. 2: incorrect (decimal place disagreement between value and uncertainty)
Ex. 3: incorrect (the uncertainty is rounded to one sig fig, but since it starts with “1” it needs two sig figs)
Ex. 4: correct (trailing zeros are significant)
Ex. 5: correct (without the decimal point after the 600, it would be ambiguous whether it has one or three significant figures)
Ex. 6: correct (writing it in scientific notation makes the significant figures clear)
Ex. 7: incorrect (decimal place disagreement between value and uncertainty, which is hard to see because of the scientific notation, and incorrect number of sig figs on uncertainty)
Ex. 8: correct
Ex. 9: correct, and preferable to what’s in example 8 because it’s easier to read
In intermediate calculations, carry some extra digits (as in the example above), at least two
or three digits more than you think you’ll eventually round to. That avoids introducing rounding
errors.
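
The two rounding rules can be applied mechanically. The sketch below is one possible way to do it with Python's `decimal` module; the function name `round_result` is invented here, and rounding halves upward is an assumption (the rules above do not say how to break ties).

```python
from decimal import Decimal, ROUND_HALF_UP


def round_result(value, uncertainty):
    """Round (value, uncertainty) following the two rules above.

    Inputs pass through str() so that 0.15 means exactly 0.15.
    Returns a pair of Decimal objects. Ties round upward (an assumption).
    """
    v = Decimal(str(value))
    u = Decimal(str(uncertainty))
    if u <= 0:
        raise ValueError("uncertainty must be positive")

    # Rule 1: one significant figure, or two if the leading digit is 1.
    leading_digit = u.as_tuple().digits[0]
    sig_figs = 2 if leading_digit == 1 else 1

    # Power of ten of the last digit we keep in the uncertainty.
    last_place = u.adjusted() - (sig_figs - 1)
    quantum = Decimal(1).scaleb(last_place)

    # Rule 2: round the value to the same decimal place as the uncertainty.
    u_rounded = u.quantize(quantum, rounding=ROUND_HALF_UP)
    v_rounded = v.quantize(quantum, rounding=ROUND_HALF_UP)
    return v_rounded, u_rounded


for value, unc in [(205.4, 9.03659228), (173.3, 1.0), (165.0, 0.5)]:
    v, u = round_result(value, unc)
    print(f"{v} ± {u}")
```

Working with `Decimal` rather than floats keeps trailing zeros such as the one in 1.0, which the rules treat as significant.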

4. Error Propagation
If we have values and uncertainties for some quantities, and then use arithmetic to calculate
another quantity, we can find the uncertainty on the result by propagating our uncertainties on
the input.

Addition/subtraction:

If you calculate a quantity c by either adding or subtracting,

𝑐 = 𝑥 + 𝑦 or 𝑐 = 𝑥 − 𝑦

then the uncertainty on c can be found by adding the uncertainties on x and y “in quadrature”
(squaring them, adding, then taking the square root):

$$\delta c = \sqrt{(\delta x)^2 + (\delta y)^2}$$

If you are adding or subtracting more than two numbers, just add more squared uncertainties
under the radical.

Example:

Two people each have a piece of copper wire. The first person measures the length of their wire
to be L1 = 2.40 cm and their ruler has tick marks spaced apart by two millimeters. The second
person measures their length of wire to be L2 = 10.0 cm and is using a less sophisticated ruler
that only has tick marks every half centimeter.

What is the uncertainty on the total length of both wires combined? In other words, what is the
error on Ltot, if Ltot = L1 + L2?
These are analog devices, so the measurement uncertainty is one half of the smallest unit of
measure. The measurement uncertainty for the first person is 0.1 cm and 0.25 cm for the second
person. Using these values and the uncertainty formula for added or subtracted values the error
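on Ltot is:

$$\delta L_{\text{tot}} = \sqrt{(\delta L_1)^2 + (\delta L_2)^2}$$

$$\delta L_{\text{tot}} = \sqrt{(0.1\ \text{cm})^2 + (0.25\ \text{cm})^2}$$

$$\delta L_{\text{tot}} = 0.27\ \text{cm}$$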

The result for the calculated total length would be 12.4 ± 0.3 cm, following our significant figure
rules.
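
As a quick check of the wire example, here is a minimal Python sketch of adding uncertainties in quadrature. The helper name `add_in_quadrature` is our own; it accepts any number of uncertainties, as the rule for more than two numbers suggests.

```python
import math


def add_in_quadrature(*uncertainties):
    """Square each uncertainty, add them, and take the square root."""
    return math.sqrt(sum(u**2 for u in uncertainties))


L1, dL1 = 2.40, 0.2 / 2   # cm; tick marks every 2 mm -> 0.1 cm
L2, dL2 = 10.0, 0.5 / 2   # cm; tick marks every 0.5 cm -> 0.25 cm

L_tot = L1 + L2
dL_tot = add_in_quadrature(dL1, dL2)
print(f"dL_tot before rounding: {dL_tot:.4f} cm")
print(f"L_tot = {L_tot:.1f} ± {dL_tot:.1f} cm")
```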

Multiplication/division:

If you calculate a quantity d by either multiplying or dividing,

$d = \frac{x}{y}$ or $d = \frac{y}{x}$ or $d = x \cdot y$

then the uncertainty on d can be found by adding the relative uncertainties on x and y (each
uncertainty divided by its measured value) in quadrature, and multiplying the result by the
calculated value of d.

$$\delta d = d \cdot \sqrt{\left(\frac{\delta x}{x}\right)^2 + \left(\frac{\delta y}{y}\right)^2}$$

If you are multiplying or dividing more than two numbers, just add more squared uncertainty
ratios under the radical.

Example:

A ball is rolled along a straight track, measured with a tape measure to be 240 cm long. The tape
measure is marked in one-centimeter increments. An electronic timer measures the number of
seconds it takes for the ball to roll the entire length of the track. The time interval for the ball to
travel this distance is 2 seconds, which has a tolerance of 0.25 seconds.

The velocity of the ball is v = (length of straight path) / (time to travel the path):

$$v = \frac{L}{t} = \frac{2.40\ \text{m}}{2\ \text{s}} = 1.20\ \text{m/s}$$

What is the uncertainty on the calculated value of velocity?

$$\delta v = v \cdot \sqrt{\left(\frac{\delta L}{L}\right)^2 + \left(\frac{\delta t}{t}\right)^2}$$

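The measurement uncertainty for the length is 0.005 meters and for the time is 0.25 seconds.

$$\delta v = 1.2\ \text{m/s} \cdot \sqrt{\left(\frac{0.005\ \text{m}}{2.4\ \text{m}}\right)^2 + \left(\frac{0.25\ \text{s}}{2.0\ \text{s}}\right)^2}$$

$$\delta v = 0.1500208\ \text{m/s}$$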

The result for the calculated value for velocity would be 1.20 ± 0.15 m/s, following our
significant figure rules.
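
The rolling-ball numbers can be reproduced with a short Python sketch of the multiplication/division rule. The helper name `product_quotient_uncertainty` and its (value, uncertainty) pair interface are illustrative choices, not a standard API.

```python
import math


def product_quotient_uncertainty(result, *pairs):
    """Uncertainty on a product or quotient.

    `result` is the calculated quantity; each entry of `pairs` is a
    (measured value, uncertainty) tuple for one input.
    """
    relative = math.sqrt(sum((dx / x) ** 2 for x, dx in pairs))
    return abs(result) * relative


L, dL = 2.40, 0.005   # m; half of the 1 cm tape-measure increment
t, dt = 2.0, 0.25     # s; stated timer tolerance

v = L / t
dv = product_quotient_uncertainty(v, (L, dL), (t, dt))
print(f"dv before rounding: {dv:.7f} m/s")
print(f"v = {v:.2f} ± {dv:.2f} m/s")
```

Note how the timing term (0.25/2.0 = 12.5%) completely dominates the length term (about 0.2%); improving the tape measure would not noticeably reduce the uncertainty on v.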

5. Agreement and Difference Significance
How do you know if two measurements are different? There are situations when experiments
will be performed on two or more different systems. It is often useful to compare results between
systems to see if they behave similarly or not. Since there is uncertainty on each result, there
needs to be a mathematical procedure to determine the degree of disagreement between any two
experimental results.

Example:

A paleontologist measures a certain marsupial fossil to be 37.4 million years (Myr) old. Across
the continent, another fossil of this species is found and dated to 36.8 million years old. Are these
fossils the same age?

Your first instinct is probably to say “no,” since the numbers 36.8 and 37.4 are different. But this
neglects the key point that all measurements have uncertainties. So let’s consider four fossils.
Specimen A is dated (37.4 ± 0.1) Myr, specimen B (36.8 ± 0.2) Myr, specimen C (37.4 ± 0.9)
Myr, and specimen D (36.8 ± 0.8) Myr. The ages are plotted below with error bars that show the
uncertainty above and below each value.

If you compared the ages of A and B, the best estimates are 0.6 Myr apart. So are C and D. But
these are very different situations. You can see the error bars of the ages of A and B don’t
overlap, whereas the error bars for the ages of C and D overlap substantially. Intuitively, it seems
unlikely that the ages of A and B are really the same, but it seems quite likely that the ages of C
and D could be the same.

[Figure: age (Myr) of fossil specimens A, B, C, and D, plotted with error bars.]

We’ll formalize this with a parameter that we’ll call difference significance (DS). The idea is
that we want to calculate the difference between our quantities in terms of uncertainties¹. If the
difference is many times larger than the uncertainty, that should be a real difference. If the
difference is smaller than the uncertainty, that’s probably not a real difference. This is a
shorthand for a more rigorous statistical treatment using hypothesis tests, but it will suffice as a
good introduction for this lab course.

¹ For those with a statistics background, you may recognize that this is similar to t from a Student’s t-test.

We calculate the difference significance by

$$DS = \frac{\text{difference}}{\text{uncertainty in difference}} = \frac{|A - B|}{\sigma_{AB}}$$

The uncertainty on a difference between two numbers is calculated from the individual
uncertainties on the numbers added in quadrature, as explained in the Error Propagation section
above.

In this course, we’ll categorize difference significance into four possibilities:

0 ≤ DS ≤ 1    Our measurements show no difference between the two numbers, and we
              are very confident in that result.

1 < DS ≤ 2    Our measurements show no difference between the two numbers, but we
              are not very confident in that result.

2 < DS ≤ 3    Our measurements do show a difference between the two numbers, but we
              are not very confident in that result.

3 < DS        Our measurements do show a difference between the two numbers, and we
              are very confident in that result.

For our examples above, the difference between the ages of A and B has a total uncertainty of

$$\sqrt{(0.1\ \text{Myr})^2 + (0.2\ \text{Myr})^2} = 0.2236\ \text{Myr}$$

The difference significance is therefore

$$DS = \frac{37.4\ \text{Myr} - 36.8\ \text{Myr}}{0.2236\ \text{Myr}} = 2.7$$

2.7 is large enough that this is probably a real difference between the ages, but it’s not large
enough to be very sure about that. (You can report your difference significance to two or three
significant figures.)

Doing the same calculation for specimens C and D yields an uncertainty on the difference of
1.204 Myr, and therefore a difference significance of 0.50. We definitely can’t see a difference
between the ages of these two specimens.
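
Both fossil comparisons can be reproduced with a small Python sketch. The function names and the short category labels are our own paraphrase of the four categories used in this course.

```python
import math


def difference_significance(a, da, b, db):
    """|A - B| divided by the quadrature sum of the two uncertainties."""
    return abs(a - b) / math.sqrt(da**2 + db**2)


def interpret(ds):
    """Map a DS value onto the four categories used in this course."""
    if ds <= 1:
        return "no difference, very confident"
    if ds <= 2:
        return "no difference, not very confident"
    if ds <= 3:
        return "difference, not very confident"
    return "difference, very confident"


# Ages in Myr: (value, uncertainty) for each specimen.
ds_ab = difference_significance(37.4, 0.1, 36.8, 0.2)
ds_cd = difference_significance(37.4, 0.9, 36.8, 0.8)

print(f"A vs B: DS = {ds_ab:.2f} ({interpret(ds_ab)})")
print(f"C vs D: DS = {ds_cd:.2f} ({interpret(ds_cd)})")
```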