Error Analysis


In physics courses, we spend most of our time getting “the answer” - solving equations we know for a final result (numerical or symbolic). This is great on paper, but when it comes to experiments, we always have sources of error. Instrumentation can only provide results with a certain precision (roughly, how close together measurements are) and accuracy (roughly, how close the result is to reality), and as such, any time we run an experiment in the lab (or even when running simulations), we need to report the error in our measurement.

Often, we will run an experiment many times given the same setup (even if we are trying several variants, we’ll run each several times), then report the mean of our measurements


along with the standard deviation


The \(N-1\) is not for the true (population) standard deviation, but is a statistical correction based on taking a sample from a population (otherwise we would use \(N\)). For more, peruse chapter 4 of Dawson Baker’s “Physics Skill Manual” available here.

\(\sigma\) is also known as the root-mean-square (take the square root of the squared distance from the mean), and can alternatively be defined by:

\[\begin{split}\sigma=\sqrt{\langle(x-\langle{x}\rangle)^2\rangle}&= \sqrt{\langle{x^2}-2x\langle{x}\rangle+\langle{x}\rangle^2\rangle}\\ &=\sqrt{\langle{x^2}\rangle-2\langle{x}\rangle\langle{x}\rangle+\langle{x}\rangle^2}\\ &=\sqrt{\langle{x^2}\rangle-\langle{x}\rangle^2}\end{split}\]

This is well and good for quantities we measure directly. But what about if we then compute other quantities from those measurements? We need to be able to report our uncertainty in those values as well (for an example, see below). But how do we do that?

Single Variable Functions

Let’s first look at the simple case: a function of one variable, say \(f(x)\). If we have all the values of \(x\), we can compute \(\langle f\rangle\) simply:

\[\langle f \rangle=\sum_{i=1}^Nf(x_i)\]

We can even compute \(\sigma_f\):

\[\begin{split}\langle f^2\rangle=\sum_{i=1}^Nf^2(x_i)\\ \sigma_f^2=\langle f^2\rangle-\langle f\rangle^2\end{split}\]

But, if we don’t have the particular values of \(x\) and instead only have \(\mu_x,~\sigma_x\), we will need to make some approximations to find \(\mu_f,~\sigma_f\). Let’s do a Taylor approximation of \(f(x)\) around \(\mu_x=\mu\):

\[\begin{split}f(x)&\approx f(\mu)+\left.\frac{df}{dx}\right|_{\mu}(x-\mu)+\frac{1}{2!}\left.\frac{d^2f}{dx^2}\right|_{\mu}(x-\mu)^2\\ \langle f \rangle=\mu_f&\approx f(\mu)+\left.\frac{df}{dx}\right|_{\mu}\langle x-\mu\rangle+ \frac{1}{2!}\left.\frac{d^2f}{dx^2}\right|_{\mu}\langle(x-\mu)^2\rangle\\ &=f(\mu)+\frac{1}{2!}\left.\frac{d^2f}{dx^2}\right|_{\mu}\langle(x-\mu)^2\rangle\\ &=f(\mu)+\frac{1}{2!}\left.\frac{d^2f}{dx^2}\right|_{\mu}\sigma_x^2\end{split}\]

since \(\langle x-\mu\rangle=\langle x\rangle-\mu=0\).

Keeping the second order approximation in \(x-\mu\), we get:

\[\begin{split}f^2(x)&\approx f^2(\mu)+2f(\mu)\left.\frac{df}{dx}\right|_{\mu}(x-\mu)+2f(\mu) \left.\frac{d^2f}{dx^2}\right|_\mu\frac{(x-\mu)^2}{2!}+ \left(\frac{df}{dx}\middle|_\mu\right)^2(x-\mu)^2\\ \langle f^2\rangle&\approx f^2(\mu)+2f(\mu) \left.\frac{d^2f}{dx^2}\right|_\mu\frac{\langle(x-\mu)^2\rangle}{2!}+ \left(\frac{df}{dx}\middle|_\mu\right)^2\langle(x-\mu)^2\rangle\\ \sigma_f^2&=\langle f^2\rangle-\langle f\rangle^2=\left(\frac{df}{dx}\middle|_\mu\right)^2\sigma_x^2\end{split}\]

Thus, we have the following description of the distribution of values for \(f\):

\[\begin{split}\mu_f&=f(\mu_x)+\frac{1}{2!}\left.\frac{d^2f}{dx^2}\right|_{\mu_x}\sigma_x^2\\ \sigma_f^2&=\left(\frac{df}{dx}\middle|_{\mu_x}\right)^2\sigma_x^2\end{split}\]

Multivariate Functions

Let’s say that we have some function \(f\) over N independent variables \(x_1,~x_2,~...,~x_N\) that is the quantity we want to compute. To compute the mean and standard deviation, we can use the same techniques above to arrive at:

\[\begin{split}\langle{f}\rangle&=f(\overline{\mu})+ \sum_{i=1}^N\frac{1}{2!}\left. \frac{d^2f}{dx_i^2}\right|_{\overline{\mu}}\sigma_i^2\end{split}\]

Where \(\overline{\mu}\) represents the average of each variable applied as an averaged vector (just shorthand). But what about the uncertainty? For that, if we can assume that the variables are independent of one another, we can use the following equation:

\[\sigma_f^2=\sum_{i=1}^N \sigma_{x_i}^2\left(\frac{\partial f}{\partial i}\middle|_{ \overline{\mu}}\right)^2\]

This says that we add the variances (square of the standard deviation) of each variable, multiplied by the square of the partial derivative of the function with respect to that variable, applying the average values of all variables as necessary.

If the variables are not independent, then we have to include the covariance \(\sigma_{x,y}=\langle xy\rangle-\langle x\rangle\langle y\rangle\) (so the variance is \(\sigma_{x,x}=\sigma_x^2\)). But that topic is outside the scope of this course.

Example: Watermelon Drop

Here at the Physics Department of the University of Texas at Austin, we often drop watermelons from the physics, math, and astronomy building (Robert Lee Moore Hall), as it extends 61.5 meters above the ground. When we drop the watermelons, we often time the descent to approximate \(g\), the acceleration due to gravity. To do this, we first assume that the only massive sobjects are the watermelon and the Earth, so that we can just have the single force of Earth’s gravity acting on the watermelon (in practice, other sources are indeed negligible). Since the watermelon will not be relativistic, we can just use Newtonian gravity:

\[\overrightarrow{F}=-G\frac{M_E M_w}{r^2}\hat{r}=M_w \overrightarrow{a_w}\]



where \(r\) changes by 61.5m relative to Earth’s radius of ~6371km. With such a small change, we can take a first-order approximation, centered at 6371km:


And if we take the difference in \(|a_w|\) between \(r=6371061.5m\) and \(r=6371000m\), we get a difference of \(-0.00019\frac{m}{s^2}\), which is totally negligible relative to the first term, so we can assume that the acceleration is constant over the course of the fall (note: we cheated here a bit by plugging in the known values for the mass and radius of earth, but only to show that the acceleration due to gravity can be considered constant, making the standard approximation in nearly every introductory physics course).

With our acceleration being constant, we can easily integrate over time twice to get that


where away from Earth’s surface is considered positive and \(g\) is the acceleration due to gravity. We further will state that the initial velocity of the watermelon is 0, and the height is 61.5m. So, now we have that \(y(t)=61.5-.5gt^2\).

So if we have some set of times for when \(y=0\) (i.e., the watermelon explodes on impact with the ground), we can easily rearrange this to solve for \(g\)


So, our data is just time but we want to get acceleration due to gravity. Let’s say we had the times here.

The average of these values is \(\langle{t}\rangle=3.48s\), and the unbiased estimator for standard deviation (computed with the \(N-1\) correction, as above) is \(\sigma_t=0.058s\). We have that

\[\begin{split}g=\frac{123m}{(3.48s)^2}=10.16\frac{m}{s^2}\\ \frac{\partial g}{\partial t}=-\frac{246m}{t^3}\end{split}\]

So we have

\[\left.\frac{\partial g}{\partial t}\middle|_{\langle{t}\rangle}\right.=-5.78\frac{m}{s^3}\]


\[\begin{split}\sigma_g&=\sqrt{\sigma_t^2\left(\frac{\partial g}{\partial t} \middle|_{\langle{t}\rangle}\right)^2}\\ &=\sqrt{(0.058s)^2\left(-5.78\frac{m}{s^3}\right)^2}=0.11\frac{m}{s^2}\end{split}\]

So, we can report our result for \(g\) in the compact form: \(10.16\pm0.11 \frac{m}{s^2}\), which says that the mean is 10.16 m/s^2 and the standard deviation is 0.11 m/s^2. This seems pretty close to the correct answer of \(g=9.8 \frac{m}{s^2}\), but we are ~3.3 standard deviations from it! For most of science, 3 standard deviations is considered to be significantly different ( 99.73% of results should be within 3 standard deviations) [discovery of new particles such as the Higgs boson usually requires 5 standard deviations!]. But why should our result be different? Have we found that we’ve been measuring gravity incorrectly lo these many years? Or is something else at work?

Thus brings up the discussion of sources of error. One is that we neglected air resistance in the calculation above. It’s reasonable that the watermelon was nearing terminal velocity by the time it hit the ground, which means that the descent should have taken a little longer (we assumed constant acceleration, but near the end, it begins traveling at constant velocity). But if we corrected for this, since it took longer than our model predicts, \(g_{obs}\) should have been smaller than \(g\)! So although that is a source of error, it’s not one that explains the error we observed. What does explain the error is much less interesting - that we used humans to time the descent, and even with a countdown to the drop, there will always be a slighly delayed reaction in timing the drop. As such, with even a tenth of a second delay (giving \(\langle{t}\rangle=3.38s\)), \(g_{obs}\) is now less than \(g\). If we could quantify our delay, we could correct for it, then continue to determine where we went wrong. If the reaction time puts us under, then what happens if we take the more complicated model where we have constant force of gravity, but an air resistance dependent on velocity (or velocity squared)? Can we make further corrections to our model to attain a better and better result? Often, the answer is yes, but we may not always have the data to do so. For example, we would need to characterize the air drag and quantify the reaction delay, which is data we just don’t have (not difficult to get, but not collected).