Uncertainty Demo

Harys Dalvi

September 2021


This is an uncertainty demo. The “uncertainty” part is based on statistics, and I am using a simulation of experimental physics for the “demo” part. This page aims to test propagation of uncertainty, a topic useful to science and engineering in general, using physics as an example.

Simulation

Number of trials:
Scenario:

Help

Select a scenario, then select true values for the variables being measured. Check the scenario list below to see the equation for this scenario and the predicted error. After all measurements, an analysis of the graphs will be produced below, including calculations of how this relates to the measurement in a physics context. Every measured quantity is drawn from a uniform distribution within ±5% of its true value, which corresponds to a standard deviation of about 2.9%.

On the first graph, the \(x\)-axis represents the first uncertain quantity, and the \(y\)-axis represents the second uncertain quantity. (For a pendulum, for example, these are \(T\) and \(\ell\) respectively.) Note that the graph is centered on the expected values for \(x\) and \(y\), and extends 5% in both directions. Red dots represent individual measurements. The blue dot represents the mean of these measurements. The light blue circle represents the standard deviation of these measurements in the \(x\)-direction, and the dark blue circle represents the standard deviation of these measurements in the \(y\)-direction. The orange dot and circle represent what the blue dot and circles should approach in theory with many trials. See the extra statistics notes for more about this.

On the second graph, red dots represent calculated values based on individual measurements. The central blue line represents the mean of all calculated values up to that point. The top and bottom light blue lines represent standard deviation error bars for the central blue line. The dark blue lines represent the standard deviation calculated from propagation of error using the measurements up to that point. (This is what you would get as an experimentalist doing propagation of error with the data available to you.) The orange lines represent the theoretical predicted long-term standard deviation from propagation of error using the true values and distributions of random variables. (This is what the lines are expected to approach.) The black line represents the true value for the measurement.

Large sample sizes can take a while. I slow down the 20 and 100 sample sizes so you can see what's happening, but I run the 1000 and 2000 sample sizes at the maximum speed your computer can handle. I think the graphing runs in \(O(x^2)\) time, where \(x\) is the number of trials.

Introduction

This web app explores propagation of uncertainty and tests its accuracy in physics scenarios. It simulates experimental physics measurements with uncertainty and then runs statistical analysis related to propagation of uncertainty. First, I'll talk about what propagation of uncertainty is, and the math behind it. Then I'll talk about the physics scenarios used here as examples. Finally I'll conclude by asking how well this model with propagation of uncertainty and random error works judging from this simulation.

All physics equations are assumed to be perfectly accurate to reality in this simulation. Factors such as air resistance and intermolecular forces that may affect results are not taken into account. However, there is some random variation in measurement similar to an experimental setting.

What is Propagation of Uncertainty?

To illustrate propagation of uncertainty, I'll use the example of a pendulum from this site. You have a pendulum, and you are trying to use it to estimate the acceleration due to gravity on Earth's surface, \(g\). To do so, you will use the period \(T\) of the pendulum and its length \(\ell\). Then you use the equation $$g = \frac{4 \pi^2 \ell}{T^2}$$ Unfortunately, your measurements are a little off, and vary from the true values of \(T\) and \(\ell\) by up to \(\pm 5 \%\). Due to this, the value you calculate for \(g\) will be a little off too. The question is, will it be 5% off, or more, or less? The way to figure out exactly how far off \(g\) will be is through propagation of uncertainty.

Underlying Equations

Let's say you are trying to determine a quantity \(f\) based on the measured variable \(x\). Then the uncertainty in \(f\) can be estimated from the uncertainty in \(x\) using calculus. $$\sigma_f = \sigma_x \left| \frac{df}{dx} \right|$$ If \(f\) instead depends on more than one variable, say \(x\) and \(y\), you must use partial derivatives. Let's assume that the uncertainties in \(x\) and \(y\) are independent. This is usually reasonable; for example, having a ruler with few markings won't affect the accuracy of a thermometer that you also happen to have. If we make this assumption, we find $$\sigma_f = \sqrt{\sigma_x^2 \bigg( \frac{\partial f}{\partial x} \bigg)^2 + \sigma_y^2 \bigg( \frac{\partial f}{\partial y} \bigg)^2}$$ With more than two variables, the pattern is the same: add one such term under the square root for each variable.
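
As a rough illustration of the two-variable formula, here is a minimal Python sketch. It is not the code behind this page: the function name `propagate` and the use of central finite differences in place of hand-derived partial derivatives are assumptions made for the example.

```python
import math

def propagate(f, x, y, sigma_x, sigma_y, rel_step=1e-6):
    """Estimate sigma_f for f(x, y), assuming independent uncertainties.

    Partial derivatives are approximated with central finite differences
    (an assumption of this sketch; analytic derivatives work equally well).
    """
    hx = abs(x) * rel_step or rel_step  # fall back to rel_step when x == 0
    hy = abs(y) * rel_step or rel_step
    df_dx = (f(x + hx, y) - f(x - hx, y)) / (2 * hx)
    df_dy = (f(x, y + hy) - f(x, y - hy)) / (2 * hy)
    return math.sqrt((sigma_x * df_dx) ** 2 + (sigma_y * df_dy) ** 2)

# Hypothetical example: f = x * y at x = 3, y = 4
sigma = propagate(lambda x, y: x * y, 3.0, 4.0, 0.1, 0.2)
print(sigma)
```

For this product, the partial derivatives are 4 and 3, so the result should agree with \(\sqrt{0.4^2 + 0.6^2} = \sqrt{0.52}\).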

Note that this equation isn't perfect, as it assumes a constant derivative of \(f\). This is not too bad of an assumption for a function \(f\) that doesn't change too much within the bounds of the uncertainty. Part of the idea of this page is to test how well this equation works.

We can find the uncertainty \(\sigma_x\) if we know our random variable \(x\) has an uncertainty of 5%. On this site, every measurement \(x\) is drawn from a uniform distribution within 5% of its true value \(\langle x \rangle\). The standard deviation for a uniform distribution is $$\sigma = \sqrt{\frac{(b-a)^2}{12}}$$ where \((b-a)\) is the range of the distribution. In this case, the range is 10% of our expected value for \(x\), since the 5% can go in either direction. So we find $$\sigma_x = \sqrt{\frac{(0.1 \langle x \rangle)^2}{12}} = \frac{0.1 \langle x \rangle}{\sqrt{12}} \approx 0.0289 \langle x \rangle$$
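
The same calculation can be written as a small helper. This is only a sketch: the name `uniform_sigma` and its `fraction` parameter are made up for illustration.

```python
import math

def uniform_sigma(true_value, fraction=0.05):
    """Standard deviation of a uniform distribution spanning
    true_value * (1 - fraction) to true_value * (1 + fraction)."""
    width = 2 * fraction * abs(true_value)  # (b - a) in the formula above
    return width / math.sqrt(12)

# For a true value of 2 (for instance, a period of 2 s), sigma is about 0.0577
print(uniform_sigma(2.0))
```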

Scenario List

Pendulum

You are trying to determine the strength of gravity \(g\). You measure the length \(\ell\) and the period \(T\) of a pendulum. You know $$T = 2 \pi \sqrt{\frac{\ell}{g}}$$ Therefore, to find the strength of gravity, you can use $$g = \frac{4 \pi^2 \ell}{T^2}$$ If we take partial derivatives, and plug them into the underlying equations, we can determine an uncertainty for \(g\). $$\frac{\partial g}{\partial T} = -\frac{8 \pi^2 \ell}{T^3}$$ $$\frac{\partial g}{\partial \ell} = \frac{4 \pi^2}{T^2}$$ You'll have to pardon me a bit on this one; I picked integer values for \(T\) and \(\ell\), but in exchange we get a value for \(g\) of \(9.87 \ \text{m}/\text{s}^2\). You can't adjust \(T\) and \(\ell\) freely like in some other scenarios, because if you could, you would be able to change gravity.
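
To make the pendulum prediction concrete, the sketch below plugs the true values used on this page (\(\ell = 1\) m, \(T = 2\) s) and the ±5% uniform error into the formulas above. The variable names are chosen for this example only.

```python
import math

ell, T = 1.0, 2.0  # true values: metres, seconds

# Standard deviation of a uniform distribution within +/-5% of the true value
sigma_ell = 0.1 * ell / math.sqrt(12)
sigma_T = 0.1 * T / math.sqrt(12)

g = 4 * math.pi**2 * ell / T**2
dg_dT = -8 * math.pi**2 * ell / T**3
dg_dell = 4 * math.pi**2 / T**2

# Two-variable propagation of uncertainty, assuming independent errors
sigma_g = math.sqrt((sigma_T * dg_dT) ** 2 + (sigma_ell * dg_dell) ** 2)
print(f"g = {g:.2f} m/s^2, sigma_g = {sigma_g:.3f} m/s^2")
```

Under these assumptions the predicted standard deviation comes out near \(0.64 \ \text{m}/\text{s}^2\), or about 6.5% of \(g\). That is larger than the spread of either measurement alone, mostly because \(T\) enters squared.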

Ohm's Law

You are trying to determine the unknown resistance \(R\) of a resistor. You measure the voltage \(V\) across the resistor and the current \(I\) through the resistor. You know $$V=IR \implies R = \frac{V}{I}$$ The partial derivatives are $$\frac{\partial R}{\partial V} = \frac{1}{I}$$ $$\frac{\partial R}{\partial I} = -\frac{V}{I^2}$$
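
As a worked example under the ±5% uniform-error assumption used on this page, substituting these partial derivatives into the two-variable formula gives $$\sigma_R = \sqrt{\frac{\sigma_V^2}{I^2} + \frac{\sigma_I^2 V^2}{I^4}} = R \sqrt{\left(\frac{\sigma_V}{V}\right)^2 + \left(\frac{\sigma_I}{I}\right)^2}$$ With \(\sigma_V / V = \sigma_I / I = 0.1/\sqrt{12}\), this works out to $$\frac{\sigma_R}{R} = \frac{0.1 \sqrt{2}}{\sqrt{12}} \approx 0.041$$ so the linear estimate puts the uncertainty in \(R\) at roughly 4.1%, regardless of the true values chosen.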

Ideal Gas Law

You are trying to determine the number of moles \(n\) of a gas in a container of known volume \(V\). You know the value of the ideal gas constant \(R\). You measure the pressure \(P\) and temperature \(T\) of the gas. The ideal gas law states $$PV=nRT \implies n = \frac{PV}{RT}$$ The partial derivatives are $$\frac{\partial n}{\partial P} = \frac{V}{RT}$$ $$\frac{\partial n}{\partial T} = - \frac{PV}{RT^2}$$

RC Circuit

You have a circuit with a resistor of resistance \(R\) and a capacitor of capacitance \(C\) in series. You apply a voltage \(V_0\) to charge the capacitor, and measure the voltage across the capacitor over time. You start a stopwatch to measure time \(t\), and at the end of that time, you measure the voltage \(V\) across the capacitor. From this data, you try to determine the time constant \(\tau\) of the RC circuit. The formula for the voltage \(V\) across a charging capacitor in an RC circuit over time \(t\) is $$V = V_0(1-e^{-t/\tau}) \implies \tau = \frac{t}{\ln \left(\frac{V_0}{V_0-V} \right)}$$ The partial derivatives are $$\frac{\partial \tau}{\partial V} = \frac{t}{(V-V_0) \ln^2 \left(\frac{V_0}{V_0-V} \right)}$$ $$\frac{\partial \tau}{\partial t} = \frac{1}{\ln \left(\frac{V_0}{V_0-V} \right)}$$
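
Because the derivative with respect to \(V\) is easy to get wrong by a sign, it can be reassuring to check both partial derivatives numerically. The sketch below does this with central finite differences. The values of \(V_0\), \(V\), and \(t\) are hypothetical and not taken from the simulation.

```python
import math

def tau(V, t, V0):
    """Time constant inferred from one (V, t) reading of a charging capacitor."""
    return t / math.log(V0 / (V0 - V))

def dtau_dV(V, t, V0):
    """Analytic partial derivative of tau with respect to V."""
    return t / ((V - V0) * math.log(V0 / (V0 - V)) ** 2)

def dtau_dt(V, t, V0):
    """Analytic partial derivative of tau with respect to t."""
    return 1 / math.log(V0 / (V0 - V))

V0, V, t = 10.0, 6.0, 2.0  # hypothetical values: volts, volts, seconds
h = 1e-6                   # finite-difference step

numeric_dV = (tau(V + h, t, V0) - tau(V - h, t, V0)) / (2 * h)
numeric_dt = (tau(V, t + h, V0) - tau(V, t - h, V0)) / (2 * h)

print(numeric_dV, dtau_dV(V, t, V0))
print(numeric_dt, dtau_dt(V, t, V0))
```

Each printed pair should agree to many decimal places, which supports the signs and factors in the formulas above.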

Projectile Motion

You are launching a projectile and want to determine its range. It is launched at a speed \(v\) and at an angle \(\theta\). You measure both, but your measurements may be off by up to 5%. The range of the projectile is $$R = \frac{v^2 \sin (2 \theta)}{g}$$ You know \(g = 9.8 \ \text{m}/\text{s}^2 \). The partial derivatives are $$\frac{\partial R}{\partial v} = \frac{2v \sin (2 \theta)}{g}$$ $$\frac{\partial R}{\partial \theta} = \frac{2v^2 \cos (2 \theta)}{g}$$
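
One case is worth working through, as an illustration rather than a setting taken from the simulation. At \(\theta = 45^\circ\) (\(\pi/4\) radians), \(\cos(2\theta) = 0\), so $$\frac{\partial R}{\partial \theta} = 0 \implies \sigma_R = \sigma_v \left| \frac{\partial R}{\partial v} \right| = \frac{2 v \sigma_v}{g}$$ The linear formula then attributes no uncertainty at all to \(\theta\), even though any error in \(\theta\) away from 45° shortens the range. This is the kind of situation where the constant-derivative assumption is most strained. Note also that \(\sigma_\theta\) must be expressed in radians for these derivatives to apply.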

Conclusion

I tried to choose physics scenarios with a wide variety of equations: simple algebraic, logarithmic, and trigonometric. If you run the simulations with a high number of trials, I think you'll find that propagation of uncertainty serves as an accurate prediction in all of these cases.

There might be slight deviations, but these are difficult to distinguish from random chance. One way to distinguish deviations in the mean from random chance is by introducing random error into the physics equations. Let's take the equation for \(g\) from the pendulum as an example. $$g = \frac{4 \pi^2 \ell}{T^2}$$ We see that $$g \propto \ell$$ $$g \propto \frac{1}{T^2}$$ Now let's see how random variation in \(\ell\) affects \(g\) by letting \(\ell\) be \(\ell_0 + \delta\), where \(\ell_0\) is the true value of the length and \(\delta\) is the random deviation from it, which could be positive or negative. $$g \propto \ell_0 + \delta$$ We can find the average value by integration. Let \(\delta\) vary uniformly from \(- a\) to \(a\), representing the 5% error in the length measurement. $$\frac{1}{2 a}\int_{-a}^{a} (\ell_0 + \delta) \, d \delta = \ell_0$$ Since the result does not depend on \(a\), the variation in \(\ell\) has no long-term effect on \(g\).

Let's compare this to the same method for \(T\), writing \(T = T_0 + \delta\). $$\frac{1}{2 a} \int_{-a}^{a} \left( \frac{1}{(T_0 + \delta)^2} \right) \, d \delta = \frac{1}{T_0^2-a^2}$$ If the variation in \(T\) had no long-term effect on \(g\), we would expect this integral to equal \(1/T_0^2\). Instead, $$\frac{1}{T_0^2} \lt \frac{1}{T_0^2-a^2}$$ This means that the variation in \(T\), even though it is equal in both directions, will on average lead to \(g\) being slightly higher than if \(T\) were known exactly. If you look carefully at the simulation with 2000 trials, you might be able to notice that the observed \(g\) tends to be slightly higher than the true value.

We can even predict how much higher it will be on average. Since \(a\) represents the 5% error in \(T\), we can say \(a = 0.05 T_0\). Then we find $$\frac{1}{T_0^2-(0.05 T_0)^2} \bigg/ \frac{1}{T_0^2} = \frac{400}{399}$$ This means that the \(g\) with error will on average be 1/399 (0.25%) higher than the true \(g\), even as the number of samples goes to infinity.

A similar technique can be used for the other scenarios to determine how the error in measured quantities affects the calculated quantity for large samples. I'll leave this as an “exercise for the reader” as they say.
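
The predicted 400/399 bias for the pendulum can be checked with a quick Monte Carlo run. The sketch below is a stand-alone approximation of the idea, not the simulation code from this page. The function name `mean_g`, the fixed seed, and the trial count are assumptions chosen so that the small bias stands out from the noise.

```python
import math
import random

def mean_g(n_trials, ell0=1.0, T0=2.0, fraction=0.05, seed=0):
    """Average g over n_trials, with ell and T each drawn uniformly
    within +/-fraction of their true values."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(n_trials):
        ell = rng.uniform(ell0 * (1 - fraction), ell0 * (1 + fraction))
        T = rng.uniform(T0 * (1 - fraction), T0 * (1 + fraction))
        total += 4 * math.pi**2 * ell / T**2
    return total / n_trials

true_g = 4 * math.pi**2 * 1.0 / 2.0**2
ratio = mean_g(200_000) / true_g
print(ratio, 400 / 399)
```

With this many trials, the ratio of the simulated mean to the true \(g\) should land close to \(400/399 \approx 1.0025\). With only 2000 trials, the random scatter in the mean is of a similar size to the bias, which is why the effect is hard to see in the simulation.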

Another way to see why \(T\) has a long-term effect but \(\ell\) does not is to allow a 50% error for illustration. The true value for \(\ell\) is 1 m, and the true value for \(T\) is 2 s. With a 50% error, \(\ell\) will go from 0.5 m to 1.5 m, and \(T\) will go from 1 s to 3 s. Since \(g\) is proportional to \(\ell\), we take the average of 0.5 and 1.5, and find that it is 1, the true value. This means that over time, the effects of the variation in \(\ell\) cancel out. With \(T\), we find \(1/1^2=1\) and \(1/3^2=1/9\). The average of these is 5/9, or about 0.56. But if we take the true value of \(T\), we get \(1/2^2=1/4=0.25\). The 50% variation in \(T\) led to a much larger value for \(g\) on average, even though neither higher nor lower values of \(T\) were favored.

Extra Statistics Notes

The orange dot on the first graph represents the theoretical mean of the measurements. It makes sense for this dot to be at the middle: over time, the mean of the measurements should approach the true value.

The reason for the orange circle is less clear. It is intuitive to think that with more and more measurements, we will eventually eliminate all variation in the data. But if the measurement is inherently variable, there will always be a standard deviation in the data, no matter how many measurements are taken. However, the standard error of the mean \(\sigma/\sqrt{n}\) will go to 0 and can help determine a confidence interval for the mean value of the measurement.

As for the actual radius of the circle, this comes from the standard deviation for a uniform distribution. This uniform distribution ranges within 5% of the true variable value, so we have $$\sqrt{\frac{[(0.05)-(-0.05)]^2}{12}} = 0.028868$$ Since this is about half of 5%, the radius of the circle is about half of the distance from the center of the graph to an edge.

Standard deviations reported in the graphs and the analysis are sample standard deviations, not population. The exception is the theoretically calculated standard deviations that are not taken from a data set, such as those represented by orange lines.
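
For readers who want to see the difference between the sample and population versions, here is a short sketch using Python's standard `statistics` module. The five readings are made-up numbers for illustration, not output from the simulation.

```python
import math
import statistics

# Hypothetical period readings in seconds
measurements = [1.96, 2.03, 2.05, 1.98, 2.01]
n = len(measurements)

sample_sd = statistics.stdev(measurements)       # divides by n - 1
population_sd = statistics.pstdev(measurements)  # divides by n

# Standard error of the mean, which shrinks as n grows
sem = sample_sd / math.sqrt(n)

print(sample_sd, population_sd, sem)
```

The sample standard deviation is always slightly larger than the population version for the same data, and the gap closes as the number of measurements grows.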