Suppose we want to estimate a quantity Q that is a function of the variables x, y, z, ... We write,
\(\small{Q = f(x,y,z,...)}\)
In the previous section, we derived an expression for the uncertainty dQ of a function \(\small{f(x,y,z,...)}\) in terms of the errors (dx, dy, dz, ...) in the variables:
\(\small{ dQ~=~ \dfrac{\partial f}{\partial x} dx~+~\dfrac{\partial f}{\partial y} dy~+~\dfrac{\partial f}{\partial z} dz ~+~....}\)
where \(\small{dx, dy, dz, ... }\) are the actual errors on the quantities.
In general, we cannot obtain the actual errors on measured quantities, since their true values are not known. For each variable, we generally have a mean value and a standard deviation computed from a certain number of samples. We can use the standard deviations of the independent variables x, y, z, ... as a measure of their uncertainty.
The question is: given the standard deviations of the variables x, y, z, ..., can we estimate the standard deviation in Q as a measure of the uncertainty on its computed value? We will derive a method that uses the standard deviations \( \sigma_x, \sigma_y, \sigma_z, ....\) of the underlying distributions.
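As a small illustration of what "a mean value and its standard deviation" looks like in practice, the sketch below summarizes a handful of repeated measurements. The numbers in `x_samples` are made up for illustration, and dividing by N (rather than N - 1) is chosen here only to match the 1/N variance definition used in the derivation in this section.

```python
import statistics

# Hypothetical repeated measurements of one variable (arbitrary units).
x_samples = [9.8, 10.1, 10.0, 9.9, 10.2]

# Mean value of the measurements.
x_mean = statistics.mean(x_samples)

# pstdev divides by N, matching the 1/N variance definition of this section;
# statistics.stdev would divide by N - 1 instead.
x_sigma = statistics.pstdev(x_samples)

print(x_mean, x_sigma)
```

For a small number of samples, the N - 1 form (`statistics.stdev`) is often preferred; the difference vanishes as N grows.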
If we want to skip the derivation, we can jump to the summary formula box that follows to see the result and proceed from that point.
Derivation of a generalized error propagation formula:
We start with the function,
\(\small{Q = f(x,y,z,...)}\qquad\qquad(1) \)
We already know that
\(\small{ dQ~=~ \dfrac{\partial f}{\partial x} dx~+~\dfrac{\partial f}{\partial y} dy~+~\dfrac{\partial f}{\partial z} dz ~+~....}\qquad\qquad(2) \)
Let \(\small{x_i, y_i, z_i, ... }\) denote the individual samples from their respective populations, and let \(\small{\overline{x}, \overline{y}, \overline{z},... }\) denote the mean values over the entire populations. That is, the sample size N covers the entire population, or \(\small{N \to \infty}\) when sampling from a distribution. This holds for the sample size N throughout this derivation.
For a single set of data points \(\small{x_i, y_i, z_i, ... }\), we have
\(\small{Q_i = f(x_i, y_i, z_i, ...) }\)
We also assume that the best estimate of Q is obtained when the variables take their mean values (this may not always be true):
\(\small{\overline{Q} = f(\overline{x}, \overline{y}, \overline{z}, ...) }\)
For small deviations from the means, we can write
\(\small{dQ = Q_i - \overline{Q},~~~~dx = x_i -\overline{x},~~~~dy=y_i-\overline{y},~~~~dz = z_i-\overline{z} }\)
Substituting these expressions in equation (2), with the partial derivatives evaluated at the mean values, we get
\(\small{dQ~=~Q_i-\overline{Q}~=~(x_i-\overline{x})\dfrac{\partial f}{\partial x}~+~(y_i-\overline{y})\dfrac{\partial f}{\partial y}~+~(z_i-\overline{z})\dfrac{\partial f}{\partial z}~+~....}\qquad\qquad(3) \)
Using this, we write down the expression for the variance \(\small{\sigma^2_Q }\) in Q:
\(\sigma_Q^2~=~\small{\dfrac{1}{N}\sum\limits_{i=1}^N (Q_i - \overline{Q})^2 }\)
\(~~~~~~=~\small{\dfrac{1}{N} \sum\limits_{i=1}^N \left( (x_i - \overline{x})\dfrac{\partial f }{\partial x } + (y_i - \overline{y})\dfrac{\partial f }{\partial y } + (z_i - \overline{z})\dfrac{\partial f }{\partial z } + .... \right )^2 }\)
\(~~~~~~=~\small{ \dfrac{1}{N} \sum\limits_{i=1}^N \Big[ (x_i - \overline{x})^2\left(\dfrac{\partial f }{\partial x }\right)^2 + (y_i - \overline{y})^2\left(\dfrac{\partial f }{\partial y }\right)^2 + (z_i - \overline{z})^2\left(\dfrac{\partial f }{\partial z }\right)^2 }\)
\(~~~~~~~~~~~~\small{ +~2(x_i - \overline{x})(y_i - \overline{y})\left(\dfrac{\partial f}{\partial x}\right) \left(\dfrac{\partial f}{\partial y}\right) + 2(y_i - \overline{y})(z_i - \overline{z})\left(\dfrac{\partial f}{\partial y}\right) \left(\dfrac{\partial f}{\partial z}\right) }\)
\(~~~~~~~~~~~~\small{ +~2(z_i - \overline{z})(x_i - \overline{x})\left(\dfrac{\partial f}{\partial z}\right) \left(\dfrac{\partial f}{\partial x}\right)~+~.... \Big] }\qquad\qquad(4) \)
We know that,
\( \sigma_x^2 = \small{ \dfrac{1}{N} \sum\limits_{i=1}^{N}(x_i - \overline{x})^2 }\) is the variance in x,
\( \sigma_y^2 = \small{ \dfrac{1}{N} \sum\limits_{i=1}^{N}(y_i - \overline{y})^2 }\) is the variance in y,
\( \sigma_z^2 = \small{ \dfrac{1}{N} \sum\limits_{i=1}^{N}(z_i - \overline{z})^2 }\) is the variance in z,
\( \sigma_{xy}^2 = \small{ \dfrac{1}{N} \sum\limits_{i=1}^N (x_i - \overline{x})(y_i - \overline{y}) }\) is the covariance between x and y,
\( \sigma_{yz}^2 = \small{ \dfrac{1}{N} \sum\limits_{i=1}^N (y_i - \overline{y})(z_i - \overline{z}) }\) is the covariance between y and z,
\( \sigma_{zx}^2 = \small{ \dfrac{1}{N} \sum\limits_{i=1}^N (z_i - \overline{z})(x_i - \overline{x}) }\) is the covariance between z and x,
and similarly for the other variables.
Substituting the above expressions into equation (4), we get,
\(\small{ \sigma_Q^2~=~\sigma_x^2 \left( \dfrac{\partial f}{\partial x} \right)^2 + \sigma_y^2 \left( \dfrac{\partial f}{\partial y} \right)^2 + \sigma_z^2 \left( \dfrac{\partial f}{\partial z} \right)^2 + 2\sigma_{xy}^2 \left(\dfrac{\partial f}{\partial x} \right) \left(\dfrac{\partial f}{\partial y} \right) + 2\sigma_{yz}^2 \left(\dfrac{\partial f}{\partial y} \right) \left(\dfrac{\partial f}{\partial z} \right) + 2\sigma_{zx}^2 \left(\dfrac{\partial f}{\partial z} \right) \left(\dfrac{\partial f}{\partial x} \right) +... }\qquad\qquad(5) \)
If the variables x, y, z, etc. are independent of each other, their pairwise covariance terms \(\small{\sigma_{xy}^2, \sigma_{yz}^2, \sigma_{zx}^2 }\) etc. will approach zero, since positive and negative products of the deviations are equally probable and cancel in the sum. In this case, the above expression for the variance in Q reduces to,
\(\small{ \sigma_Q^2~\approx~\sigma_x^2 \left( \dfrac{\partial f}{\partial x} \right)^2 + \sigma_y^2 \left( \dfrac{\partial f}{\partial y} \right)^2 + \sigma_z^2 \left( \dfrac{\partial f}{\partial z} \right)^2 + .... }\qquad\qquad(6) \)
Thus, in a formula \(\small{Q = f(x,y,z,...) }\), the variances of x, y, z, ..., each weighted by the square of the corresponding partial derivative, add to give the variance in the computed quantity Q. In other words, the uncertainties add in quadrature.
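Equation (6) can be applied numerically even when the partial derivatives are tedious to work out by hand. The following is a minimal sketch, not a definitive implementation: the helper names `partial_derivative` and `propagate_sigma`, the central-difference step size, and the example function and numbers are all assumptions made for illustration. It assumes independent variables, so the covariance terms of equation (5) are ignored.

```python
import math

def partial_derivative(f, point, index, rel_step=1e-6):
    """Central-difference estimate of the partial derivative of f
    with respect to the variable at position `index`, evaluated at `point`."""
    h = rel_step * max(abs(point[index]), 1.0)
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)

def propagate_sigma(f, means, sigmas):
    """Standard deviation of Q = f(x, y, ...) from equation (6),
    assuming the variables are independent (no covariance terms)."""
    variance = 0.0
    for i, sigma in enumerate(sigmas):
        variance += (sigma * partial_derivative(f, means, i)) ** 2
    return math.sqrt(variance)

# Hypothetical example: Q = x * y / z with made-up means and standard deviations.
means = [2.0, 3.0, 4.0]
sigmas = [0.1, 0.2, 0.1]
sigma_q = propagate_sigma(lambda x, y, z: x * y / z, means, sigmas)
print(sigma_q)
```

The derivatives are evaluated at the mean values, as in the derivation. For strongly nonlinear functions or large standard deviations, this first-order estimate may be poor.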
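We summarize the important result from the above derivation here: if \(\small{Q = f(x,y,z,...) }\) and the variables x, y, z, ... are independent, then \(\small{ \sigma_Q^2~\approx~\sigma_x^2 \left( \dfrac{\partial f}{\partial x} \right)^2 + \sigma_y^2 \left( \dfrac{\partial f}{\partial y} \right)^2 + \sigma_z^2 \left( \dfrac{\partial f}{\partial z} \right)^2 + .... }\)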
Let \(~~\small{Q~=~ax \pm by }~~\), where a and b are constants
Then, \(~~~\small{\left(\dfrac{\partial Q}{\partial x}\right)~=~a,~~~~~~ \left(\dfrac{\partial Q}{\partial y}\right)~=~\pm b }\)
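The propagation formula \(~~\small{\sigma_Q^2 = \sigma_x^2 \left(\dfrac{\partial Q}{\partial x}\right)^2 + \sigma_y^2 \left(\dfrac{\partial Q}{\partial y}\right)^2 }~~\) becomes, \(~~\small{\sigma_Q^2 = a^2 \sigma_x^2 + b^2 \sigma_y^2 }~~\) The result is the same for the sum and the difference, since the sign of b disappears on squaring.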
(i) \(~~\) Let \(\small{Q~=~a x y,~~ }\) where a is a constant.
We write,\(~~~\small{\left(\dfrac{\partial Q}{\partial x}\right)=ay, }~~~\)\(~~~\small{\left(\dfrac{\partial Q}{\partial y}\right)=ax }\)
The propagation formula \(~~\small{\sigma_Q^2 = \sigma_x^2 \left(\dfrac{\partial Q}{\partial x}\right)^2 + \sigma_y^2 \left(\dfrac{\partial Q}{\partial y}\right)^2 }~~\) becomes,
\(\small{ \sigma_Q^2~=~\sigma_x^2 a^2 y^2 + \sigma_y^2 a^2 x^2 }\)
Dividing throughout by \(\small{Q^2 = a^2 x^2 y^2}\), we get
\(\small{ \dfrac{\sigma_Q^2}{Q^2}~=~\dfrac{\sigma_x^2}{x^2} + \dfrac{\sigma_y^2}{y^2} }\)
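The product formula \(\small{ \sigma_Q^2~=~\sigma_x^2 a^2 y^2 + \sigma_y^2 a^2 x^2 }\) can be checked against a simple simulation. The sketch below is an illustration under stated assumptions: x and y are drawn as independent Gaussian variables, and the constant `a`, the means, and the standard deviations are made-up values.

```python
import math
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical constant, mean values and standard deviations.
a = 2.0
x_mean, x_sigma = 10.0, 0.2
y_mean, y_sigma = 5.0, 0.1

# Propagation formula for Q = a*x*y with independent x and y,
# with the partial derivatives evaluated at the mean values.
sigma_formula = math.sqrt((x_sigma * a * y_mean) ** 2 + (y_sigma * a * x_mean) ** 2)

# Simulated measurements: draw x and y independently and compute Q each time.
q_samples = [
    a * rng.gauss(x_mean, x_sigma) * rng.gauss(y_mean, y_sigma)
    for _ in range(100_000)
]
sigma_simulated = statistics.pstdev(q_samples)

print(sigma_formula, sigma_simulated)
```

With relative uncertainties of a few percent, as here, the two numbers should agree closely. The agreement is expected to degrade as the relative uncertainties grow, because the formula keeps only first-order terms.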
Consider the exponential relation \(~~\small{Q = ae^{\pm bx}} \)
\(\small{\dfrac{\partial Q}{\partial x} = \pm ab e^{\pm bx} = \pm bQ }\)
\(\small{\sigma_Q^2 = \sigma_x^2 \left(\dfrac{\partial Q}{\partial x}\right)^2 = \sigma_x^2 (\pm bQ)^2 = \sigma_x^2 b^2Q^2 }\)
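The exponential result says that the relative uncertainty in Q is simply \(\small{b\,\sigma_x}\), independent of a. A minimal sketch follows; the function name `exponential_sigma`, its `sign` parameter (selecting the + or - exponent), and the numerical values are assumptions for illustration.

```python
import math

def exponential_sigma(a, b, x, sigma_x, sign=1):
    """Return (Q, sigma_Q) for Q = a * exp(sign * b * x),
    using sigma_Q = |b * Q| * sigma_x from the propagation formula."""
    q = a * math.exp(sign * b * x)
    return q, abs(b * q) * sigma_x

# Hypothetical decay-like example: Q = 100 * exp(-0.5 * x) at x = 2.0 +/- 0.05.
q, sigma_q = exponential_sigma(a=100.0, b=0.5, x=2.0, sigma_x=0.05, sign=-1)

# The relative uncertainty depends only on b and sigma_x.
relative = sigma_q / q
print(q, sigma_q, relative)
```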
When the base of the power is a constant other than e, we can estimate the error propagation by the following manipulation:
Let \(~~\small{Q~=~a^{\pm bx} }\)
Writing \(\small{a}\) as \(\small{e^{log(a)} }\), where log denotes the natural logarithm, we get
\(~~~\small{Q~=~(e^{log(a)})^{\pm bx}~=~e^{\pm(b~ log(a)) x} }\)
This is an exponential form with \(\small{b~log(a)}\) in place of b. Using the previously derived formula for propagation of errors in an exponential function, we can write, \(~~\small{\sigma_Q^2 = \sigma_x^2 (b~log(a))^2 Q^2 }\)
Let \(~~\small{Q~=~a~log(x)}\)
We get \(~~\small{ \dfrac{\partial Q }{\partial x} = \dfrac{a}{x} }\)
With this,\(~~\small{\sigma_Q^2 = \sigma_x^2 \left (\dfrac{\partial Q }{\partial x}\right )^2~=~ a^2 \dfrac{\sigma_x^2}{x^2} }\)
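Since \(\small{ \dfrac{\partial Q }{\partial x} = \dfrac{a}{x} }\), the propagated standard deviation for the logarithm is \(\small{ \sigma_Q = |a|\,\sigma_x / x }\), so the absolute uncertainty in Q is set by the relative uncertainty in x. The sketch below evaluates this and compares it with half the spread of Q between \(\small{x - \sigma_x}\) and \(\small{x + \sigma_x}\). The function name `log_sigma` and the numbers are made up for illustration, and the natural logarithm is assumed.

```python
import math

def log_sigma(a, x, sigma_x):
    """Standard deviation of Q = a * ln(x): sigma_Q = |a| * sigma_x / x."""
    return abs(a) * sigma_x / x

# Hypothetical values: Q = 2 * ln(x) at x = 10.0 +/- 0.5.
a, x, sigma_x = 2.0, 10.0, 0.5
sigma_q = log_sigma(a, x, sigma_x)

# Rough cross-check: half the change in Q when x moves from x - sigma_x to x + sigma_x.
half_spread = abs(a * math.log(x + sigma_x) - a * math.log(x - sigma_x)) / 2.0

print(sigma_q, half_spread)
```

The two values are close when \(\small{\sigma_x}\) is small compared with x, which is the regime where the first-order propagation formula is expected to hold.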