Delta method
In statistics, the delta method is a result concerning the approximate probability distribution for a function of an asymptotically normal statistical estimator from knowledge of the limiting variance of that estimator.
Univariate delta method
While the delta method generalizes easily to a multivariate setting, careful motivation of the technique is more easily demonstrated in univariate terms. Roughly, if there is a sequence of random variables Xn satisfying

$$\sqrt{n}\,[X_n-\theta]\,\xrightarrow{D}\,\mathcal{N}(0,\sigma^2),$$

where θ and σ² are finite-valued constants and $\xrightarrow{D}$ denotes convergence in distribution, then

$$\sqrt{n}\,[g(X_n)-g(\theta)]\,\xrightarrow{D}\,\mathcal{N}(0,\sigma^2[g'(\theta)]^2)$$

for any function g satisfying the property that g′(θ) exists, is non-zero valued, and is polynomially bounded with the random variable.[1]
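As an informal numerical illustration (the distribution, sample size, and function below are arbitrary choices made for this sketch, not part of the result), the following Python snippet takes Xn to be the mean of n Exponential(1) draws, so that θ = σ² = 1, applies g(x) = x², and compares the simulated variance of √n[g(Xn) − g(θ)] with the limiting variance σ²[g′(θ)]² = 4 predicted above.

```python
# Illustrative sketch (not from the article): check the univariate delta method
# by simulation. Assumptions: X_n is the mean of n Exponential(1) draws, so
# theta = 1 and sigma^2 = 1, and we take g(x) = x^2 with g'(theta) = 2.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 1000, 5000
theta, sigma2 = 1.0, 1.0
g = lambda x: x ** 2
g_prime_theta = 2.0 * theta

# Simulate sqrt(n) * [g(X_n) - g(theta)] many times.
X_n = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
scaled = np.sqrt(n) * (g(X_n) - g(theta))

# The delta method predicts variance sigma^2 * g'(theta)^2 = 4 in this setup.
print("simulated variance:   ", scaled.var())
print("delta-method variance:", sigma2 * g_prime_theta ** 2)
```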
Proof in the univariate case
Demonstration of this result is fairly straightforward under the assumption that g′ is continuous. To begin, we use the mean value theorem:

$$g(X_n) = g(\theta) + g'(\tilde{\theta})\,(X_n-\theta),$$

where $\tilde{\theta}$ lies between Xn and θ. Note that since $X_n\,\xrightarrow{P}\,\theta$ and $|\tilde{\theta}-\theta| < |X_n-\theta|$, it must be that $\tilde{\theta}\,\xrightarrow{P}\,\theta$, and since g′ is continuous, applying the continuous mapping theorem yields

$$g'(\tilde{\theta})\,\xrightarrow{P}\,g'(\theta),$$

where $\xrightarrow{P}$ denotes convergence in probability.

Rearranging the terms and multiplying by $\sqrt{n}$ gives

$$\sqrt{n}\,[g(X_n)-g(\theta)] = g'(\tilde{\theta})\,\sqrt{n}\,[X_n-\theta].$$

Since

$$\sqrt{n}\,[X_n-\theta]\,\xrightarrow{D}\,\mathcal{N}(0,\sigma^2)$$

by assumption, it follows immediately from an appeal to Slutsky's theorem that

$$\sqrt{n}\,[g(X_n)-g(\theta)]\,\xrightarrow{D}\,\mathcal{N}(0,\sigma^2[g'(\theta)]^2).$$

This concludes the proof.
Proof with an explicit order of approximation
Alternatively, one can add one more step at the end, to obtain the order of approximation:

$$\begin{align}
\sqrt{n}\,[g(X_n)-g(\theta)] &= g'(\tilde{\theta})\,\sqrt{n}\,[X_n-\theta]
 = \sqrt{n}\,[X_n-\theta]\left[ g'(\tilde{\theta}) + g'(\theta) - g'(\theta)\right]\\
&= \sqrt{n}\,[X_n-\theta]\left[g'(\theta)\right] + \sqrt{n}\,[X_n-\theta]\left[ g'(\tilde{\theta}) - g'(\theta)\right]\\
&= \sqrt{n}\,[X_n-\theta]\left[g'(\theta)\right] + O_p(1)\cdot o_p(1)\\
&= \sqrt{n}\,[X_n-\theta]\left[g'(\theta)\right] + o_p(1).
\end{align}$$

This suggests that the error in the approximation converges to 0 in probability.
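To make the o_p(1) remainder concrete, the following sketch (an added illustration reusing the same arbitrary exponential-mean setup as above) computes the remainder √n[g(Xn) − g(θ)] − g′(θ)√n[Xn − θ] and shows that its typical magnitude shrinks as n grows.

```python
# Illustrative sketch: the remainder sqrt(n)[g(X_n)-g(theta)] - g'(theta) sqrt(n)[X_n-theta]
# should be o_p(1), i.e. shrink toward 0 as n grows. Same assumed setup as before:
# X_n is the mean of n Exponential(1) draws, theta = 1, g(x) = x^2, g'(theta) = 2.
import numpy as np

rng = np.random.default_rng(1)
reps, theta = 2000, 1.0
for n in (100, 1000, 10000):
    X_n = rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)
    remainder = np.sqrt(n) * (X_n ** 2 - theta ** 2) - 2.0 * theta * np.sqrt(n) * (X_n - theta)
    # The typical size of the remainder decreases roughly like 1/sqrt(n).
    print(n, np.median(np.abs(remainder)))
```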
Multivariate delta method
By definition, a consistent estimator B converges in probability to its true value β, and often a central limit theorem can be applied to obtain asymptotic normality:

$$\sqrt{n}\,(B-\beta)\,\xrightarrow{D}\,\mathcal{N}(0,\Sigma),$$

where n is the number of observations and Σ is a (symmetric positive semi-definite) covariance matrix. Suppose we want to estimate the variance of a function h of the estimator B. Keeping only the first two terms of the Taylor series, and using vector notation for the gradient, we can estimate h(B) as

$$h(B) \approx h(\beta) + \nabla h(\beta)^\top \cdot (B-\beta),$$

which implies the variance of h(B) is approximately

$$\operatorname{Var}\bigl(h(B)\bigr) \approx \nabla h(\beta)^\top \cdot \operatorname{Var}(B) \cdot \nabla h(\beta)
 = \nabla h(\beta)^\top \cdot \frac{\Sigma}{n} \cdot \nabla h(\beta).$$

One can use the mean value theorem (for real-valued functions of many variables) to see that this does not rely on taking a first-order approximation.

The delta method therefore implies that

$$\sqrt{n}\,\bigl(h(B)-h(\beta)\bigr)\,\xrightarrow{D}\,\mathcal{N}\bigl(0,\,\nabla h(\beta)^\top \cdot \Sigma \cdot \nabla h(\beta)\bigr),$$

or in univariate terms,

$$\sqrt{n}\,\bigl(h(B)-h(\beta)\bigr)\,\xrightarrow{D}\,\mathcal{N}\bigl(0,\,\sigma^2\,[h'(\beta)]^2\bigr).$$
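As a hedged numerical illustration of the multivariate formula (the estimate, covariance matrix, sample size, and the choice h(β) = β₁/β₂ below are all invented for this sketch), the variance approximation ∇h(β)ᵀ (Σ/n) ∇h(β) can be computed directly:

```python
# Illustrative sketch: multivariate delta method for h(B) = B1 / B2.
# The estimate B, covariance Sigma, and sample size n are made-up numbers.
import numpy as np

B = np.array([2.0, 5.0])                  # estimated beta
Sigma = np.array([[1.0, 0.3],             # asymptotic covariance of sqrt(n)(B - beta)
                  [0.3, 2.0]])
n = 500

def h(b):
    return b[0] / b[1]

def grad_h(b):
    # Gradient of b1/b2: (1/b2, -b1/b2^2).
    return np.array([1.0 / b[1], -b[0] / b[1] ** 2])

g = grad_h(B)
var_h = g @ (Sigma / n) @ g               # delta-method variance of h(B)
print("h(B) =", h(B), " approx. std. error =", np.sqrt(var_h))
```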
Example
Suppose Xn is Binomial with parameters $p \in (0,1]$ and n. Since

$$\sqrt{n}\left[\frac{X_n}{n} - p\right]\,\xrightarrow{D}\,\mathcal{N}\bigl(0,\,p(1-p)\bigr),$$

we can apply the delta method with g(θ) = log(θ) to see

$$\sqrt{n}\left[\log\left(\frac{X_n}{n}\right) - \log(p)\right]\,\xrightarrow{D}\,\mathcal{N}\bigl(0,\,p(1-p)[1/p]^2\bigr).$$

Hence, the variance of $\log\left(\frac{X_n}{n}\right)$ is approximately

$$\frac{1-p}{p\,n}.$$

Note that since p > 0, $\Pr\!\left(\frac{X_n}{n} > 0\right) \to 1$ as $n \to \infty$, so with probability approaching one, $\log\left(\frac{X_n}{n}\right)$ is finite for large n.
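A small simulation sketch (added for illustration; the values of p and n are arbitrary, and draws with Xn = 0 are discarded, which is essentially harmless here because they are vanishingly rare for these values) compares the empirical variance of log(Xn/n) with the approximation (1 − p)/(pn):

```python
# Illustrative sketch: compare the simulated variance of log(X_n / n), for
# X_n ~ Binomial(n, p), with the delta-method approximation (1 - p) / (p n).
# The values of p and n are arbitrary; draws with X_n = 0 are dropped
# (they are vanishingly rare for this p and n).
import numpy as np

rng = np.random.default_rng(2)
p, n, reps = 0.3, 1000, 100000
X = rng.binomial(n, p, size=reps)
log_ratio = np.log(X[X > 0] / n)

print("simulated variance: ", log_ratio.var())
print("delta-method approx:", (1 - p) / (p * n))
```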
Moreover, if $\hat{p}$ and $\hat{q}$ are estimates of different group rates from independent samples of sizes n and m respectively, then the logarithm of the estimated relative risk $\frac{\hat{p}}{\hat{q}}$ is approximately normally distributed with variance that can be estimated by

$$\frac{1-\hat{p}}{\hat{p}\,n} + \frac{1-\hat{q}}{\hat{q}\,m}.$$
This is useful to construct a hypothesis test or to make a confidence interval for the relative risk.
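For instance, a minimal sketch of such a confidence interval (the event counts and sample sizes below are invented for illustration) builds the interval on the log scale and then exponentiates:

```python
# Illustrative sketch: 95% confidence interval for a relative risk using the
# delta-method variance of its logarithm. The event counts and sample sizes
# are invented for illustration.
import math

x1, n1 = 30, 200    # events / sample size in group 1 -> p_hat
x2, n2 = 20, 250    # events / sample size in group 2 -> q_hat
p_hat, q_hat = x1 / n1, x2 / n2

log_rr = math.log(p_hat / q_hat)
# Estimated variance of log(p_hat / q_hat) from the formula above.
var_log_rr = (1 - p_hat) / (p_hat * n1) + (1 - q_hat) / (q_hat * n2)
z = 1.96  # approximate 97.5th percentile of the standard normal

lo = math.exp(log_rr - z * math.sqrt(var_log_rr))
hi = math.exp(log_rr + z * math.sqrt(var_log_rr))
print("relative risk:", p_hat / q_hat, " 95% CI:", (lo, hi))
```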
Note
The delta method is often used in a form that is essentially identical to that above, but without the assumption that Xn or B is asymptotically normal. Often the only context is that the variance is "small". The results then just give approximations to the means and covariances of the transformed quantities. For example, the formulae presented in Klein (1953, p. 258) are:

$$\begin{align}
\operatorname{E}\bigl(h_r\bigr) &= h_r(B),\\
\operatorname{Cov}\bigl(h_r, h_s\bigr) &= \sum_i \sum_j \frac{\partial h_r}{\partial B_i}\,\frac{\partial h_s}{\partial B_j}\,\operatorname{Cov}\bigl(B_i, B_j\bigr),
\end{align}$$

where hr is the rth element of h(B) and Bi is the ith element of B. The only difference is that Klein stated these as identities, whereas they are actually approximations.
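As an illustration of this "small variance" usage (the numbers and the choice of h below are invented), a covariance matrix for B can be propagated through h using its Jacobian, which is the matrix form of the covariance formula above:

```python
# Illustrative sketch of the "small variance" use of the delta method:
# propagate an (invented) covariance matrix of B through h(B) = (B1*B2, B1/B2)
# using the Jacobian, giving approximate means and covariances of h(B).
import numpy as np

B = np.array([2.0, 5.0])
cov_B = np.array([[0.04, 0.01],
                  [0.01, 0.09]])

def h(b):
    return np.array([b[0] * b[1], b[0] / b[1]])

def jacobian_h(b):
    return np.array([[b[1],        b[0]],
                     [1.0 / b[1], -b[0] / b[1] ** 2]])

J = jacobian_h(B)
mean_h = h(B)            # approximate E[h(B)]  (first formula)
cov_h = J @ cov_B @ J.T  # approximate Cov[h(B)] (second formula)
print("approx mean:", mean_h)
print("approx covariance:\n", cov_h)
```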
See also
- Taylor expansions for the moments of functions of random variables
- Variance-stabilizing transformation
References
- ^ Oehlert, G. W. (1992). A note on the delta method. The American Statistician, 46(1), 27-29.
- Casella, G. and Berger, R. L. (2002), Statistical Inference, 2nd ed.
- Cramér, H. (1946), Mathematical Methods of Statistics, p. 353.
- Davison, A. C. (2003), Statistical Models, pp. 33–35.
- Greene, W. H. (2003), Econometric Analysis, 5th ed., pp. 913f.
- Klein, L. R. (1953), A Textbook of Econometrics, p. 258.
- Oehlert, G. W. (1992), A Note on the Delta Method, The American Statistician, Vol. 46, No. 1, pp. 27–29. http://www.jstor.org/stable/2684406

