Biostatistics with R

Wilcoxon signed rank test for one variable

The Wilcoxon signed rank test for one variable is a nonparamteric eqivalent to the one sample t test. The t test is used for a continuous variable which is either assumed to follow a Gaussian distribution or near Gaussian distributions, which can be taken care with the help of central limit theorem. Wicoxon test is used when we do not know the shape of the underlying distribution.

Since the median is very robust against the outlier when compared to the mean, Wilcoxon signed rank test tests for the median value of the population.

The basic assumptions for the test are:

$~~~~~~~~(i)~~$ The data is randomly drawn from a continuous distribution

$~~~~~~~~(ii)~~$ The measurements can be ordered.

Let our random variable X be continuous with a parent distribution whose median is $m$.

In the Wilcoxon signed rank test, we test the null hypothesis $H_0:~m=m_0$ against the three alternate hypothesis $H_A: m \neq m_0~~$, $H_A: m \gt m_0~$ and $~H_A: m \lt m_0$.

For testing the hypothesis on popultion median, we do not compute the median of sample data as a statistic. Hence the name "nonparametric". Instead, we find the absolute deviations $\left|{x_i - m_0}\right|$ of each data point $x_i$ from the median value $m_0$ tested and rank these difference values to get a test statistic.



$~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~$ Procedure for computing test statistic

We have n random observations $x_1, x_2, x_3, .....,x_n$ of the variable X.

1. Compute the deviations $(x_i - m_0)$ of each data point from the test median value $m_0$ to get $~~~~~~~~~~~(x_1 - m_0), (x_2 - m_0),............,(x_n - m_0)$
These differences can be positive ($x_i \gt m_0$), negative ($x_i \lt m_0)$ or sometimes zero $(x_i=m_0)$. Ignore the zero differences in the calculations ahead.

2. Find the absolute values of these devitions:
\(\small{~~~~~~~~~~~\left|{x_1 - m_0}\right|, \left|{x_2 - m_0}\right|,............,\left|{x_n - m_0}\right|}\)

3. Rank these absolute values in ascending order. Let \(\small{R_i}\) be the rank of absolute value \(\small{\left|{x_i - m_0}\right|}\). If the absolute difference from two or more observations are equal, each difference is assigned the average of the corresponding ranks.
Thus the sequence $R_1, R_2, R_3, ....., R_n$ is a permutation of positive integers 1,2,3,....,n.
For example, $R_1=4, R_2=7, R_3=9,.......,R_n=1$.

4. Now, with each rank $R_i$, associate the sign of the difference $x_i - m_0$.
For example, if \(\small{(x_1-m_0) \gt 0}\), then $R_1$ is positive.
If \(\small{(x_3-m_0) \lt 0 }\), then $R_3$ is negative.
Thus \(\small{R_1, R_2, ...., R_n}\) have become the signed ranks of the deviations from median value tested.

5. Compute the Wilcoxon test statistics W by adding the positive and negative signed ranks separately:

\(~~~~~~~~~~~~~~~~~~\small{W+~=~Sum~of~positive~ranks }~~~\)

\(~~~~~~~~~~~~~~~~~~\small{W-~=~Sum~of~negative~ranks }~~~\)

We take the minimum among W+ and W- as the test statistic.\(\small{~~W~=~min(W+,W-) }\)

(Chossing W+ or W- for test statistic will lead to same result).

The value of W varies from a minimum of zero to a maximum of \(\small{\dfrac{n(n+1)}{2}}\).

Also, \(~~~\small{\left|W+\right|~+~\left|W-\right|~=~\dfrac{n(n+1)}{2}},~~~ \) which is the sum of n integers 0,1,2,3,...,n.

6. Knowing the probability distribution of W, we can determine a p-vlue of w under the null hypothesis.




$~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~$ Hypothesis testing

If the null hypothesis \(\small{H_0:~m = m_0 }\) is true, we have drawn data points from a distribution whose median is \(\small{m_0}\). In this case, about half of the oberved differences \(\small{x_i-m_0 }\) would be positive and about half of them negative. Consequently, about half of the ranks \(\small{R_i}\) values will be positive and about half of them negative.

The value of W varies from a minimum of zero (no W+ values) to \(\small{\dfrac{n(n+1)}{2}}\), which is the sum of n integers 0,1,2,,,,,n when all of them are positive.

If in reality $m \gt m_0$, we will be randomly drawing data points from a distribution whose median is greater tham $m_0$, making most of the deviations \(\small{x_i-m_0 }\) and hence the ranks \(\small{R_i}\) positive, resulting in a W+ that is too large compared to W-. Thus, for an alternate hypothesis \(\small{H_A: m \gt m_0}\), we reject the null \(\small{H_0}\) if \(\small{W- \leq w_c}\) where \(\small{w_c}\) is the critical value of the test.

If $m \lt m_0$, we will be randomly drawing data points from a distribution whose median is less tham $m_0$, making most of the deviations \(\small{x_i-m_0 }\) and hence the ranks \(\small{R_i}\) negative, resulting in a W+ that is too small compared to W-. Thus, for an alternate hypothesis \(\small{H_A: m \lt m_0}\), we reject the null \(\small{H_0}\) if \(\small{W+ \leq w_c}\) where \(\small{w_c}\) is the critical value of the test.

For a two sided alternate hypothesis \(\small{H_A: m \neq m_0}\), we reject the null \(\small{H_0}\) if \(\small{W \leq w_c}\).

In order to compute the critical values like $w_c$, we need the probability distribution W. This can be computed by two methods:


Method-1 : Direct computation :

For a sample size n, the signed rank sum W is the summation of integers $1,2,3,...n$ after randomly assigning a minus of plus sign with equal probability to each one of them. We can compute all the possible values of W for a given n,thus getting the probability distribution of W. From this, we can directly compute the probability of gettting W above or below a critical value.

For example, let $n=2$. For this, W is a sum of two numbers (1,2) each with two possible signs (+,-) assigned randomly.

This gives 4 possible values of W namely, {(-1-2),(1-2),(-1+2),(1+2)} or {-3,-1,1,3}. Since their probability should sum to 1, their probability distribution is $P(-3)=\dfrac{1}{4},~~P(-1)=\dfrac{1}{4},~~P(1)=\dfrac{1}{4},~~P(3)=\dfrac{1}{4}$.

Similarly, for a given n, the P(W) can be computed for all possible values of W using algorithms implemented in computer programs. Some algorithms comput the same distribution P(W) using statistical methods like moment generating functions. Finally a critical value table for Wilcoxon signed rank tests is created, like the one found here. For a sample size n used in the test, the table gives the smallest rank total \(\small{W_c}\) for which the probablity level is equal to or less than a statistical significance $\alpha$. Our observed statistic W is statistically significant if it is equal to or less than this rank total \(\small{W_c }\). In other words, we reject the null if \(\small{W \leq W_c}\)

Summary of decisionon rules for the critical value table:

Compute \(\small{W+,~W-}\) values from the data.

Let \(\small{W~=~min(W+,W-)}\) be the test statistic.

For a given sample size n and \(\small{\alpha}\) value, compute the critical value \(\small{w_c}\) from the table mentioned above.

For an alternative hypothesis \(\small{H_A:~m \neq m_0},~~\) reject \(\small{H_0:~m=m_0~~ }\) when \(~~\small{W \leq w_c }\).

For an alternative hypothesis \(\small{H_A:~m \gt m_0},~~\) reject \(\small{H_0:~m=m_0 }~~\) when \(~~\small{W- \leq w_c }\).

For alternative hypothesis \(\small{H_A:~m \lt m_0},~~\) reject \(\small{H_0:~m=m_0 ~~}\) when \(~~\small{W+ \leq w_c }\).


Method-2 : Large sample approximation :

When the number of samples n is large, a normal approximation can be obtained for W using central limit theorem. It is given by,

\(\small{~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~Z~=~\dfrac{W~-~\dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}}~\approx N(0,1)},~~~~\) a unit normal distibution.

The above approximation is good when $n \sim 10$ or above. When the value of n is lower, it is a poor approximation.


Correction for the ties :

In the case of ties (same numbers repeared two or more times), we replaced the ranks of repeated numbers by their mean values. The ties thus affect the standard deviations of the statistic W.

A correction due to ties is applied to reduce the standard deviation term in the denominator of Z statistics described above. The corrected Z statistic is given by,

\(\small{~~~~~~~~~~~~~Z~=~\dfrac{W-\dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}~-~\displaystyle\sum_t\dfrac{t^3 - t}{48}}} }\)

The correction term is given by \(\small{\displaystyle\sum_t\dfrac{t^3 - t}{48} }\), where t takes the value of number of repeats in each instance.

For example, consider the following data set which is arranged in the ascending order:

X = {7.9,8.3,8.9,8.9,10.5,10.8,11.6,11.6,12.9,13.5,14.9,16.2,16.2,16.2,17.9,23.2,25.6,25.9,30.2 }

In this data, 8.9 repeats twice, 11.6 repeats twice and 16.2 repeats thrice. The correction term therefore is,

\(\small{ \displaystyle\sum_t\dfrac{2^3 - 2}{48}~+~\displaystyle\sum_t\dfrac{2^3 - 2}{48}~+~\displaystyle\sum_t\dfrac{3^3 - 3}{48}~=~ 0.75 }\)

Since the above data set has 19 data points,

\(\small{\dfrac{n(n+1)(2n+1)}{24}~=~\dfrac{19(20)(39)}{24}~= 617.5 }\)

We see that the correction is negligible when compared to the value of variance without correction when n is large and there are few instances of smaller repeats. This can be included if necessary.




Example-1 :

The amount of oil extracted from certain quantity of a ground nut is assumed to follow a continuous distribution. In an experiment, 16 random samples of seeds were drawn and the oil was extracted. The amount of oil (in liters) are given below:

\(\small{~~~~~\{~176.9,~158.3,~152.1,~158.8,~172.4,~169.8,~159.7,~162.7,~156.6,~174.5,~184.4,~165.2,~147.8,}\)
\(\small{~~~~~~~177.8,~160.1,~161.5~\} }\)

Test the sample with the null hypothesis for median value \(\small{H_0 : m=160}\) against an alternate hypothesis \(\small{H_A: m \gt 160}\) using \(\small{\alpha = 0.05}\) for a significant level in the test.


Let us compute the Wilcoxon test statistic for the signed ranks. We will create the following table:

___________________________________________________________________________________

$~~~~~~~~x_i~~~~~~~~~~~x_i-160~~~~~~~~~~~\left|x_i-160\right|~~~~~~~~~~~~Rank~~~~~~~~Signed~Rank$
___________________________________________________________________________________

$~~~~~~176.9~~~~~~~~~~~16.9~~~~~~~~~~~~~~~~~16.9~~~~~~~~~~~~~~~~~~~14~~~~~~~~~~~~~~~~~~14~~~~~~~~$
$~~~~~~158.3~~~~~~~~~-1.7~~~~~~~~~~~~~~~~~~~1.7~~~~~~~~~~~~~~~~~~~~5~~~~~~~~~~~~~~~~-5~~~~$
$~~~~~~152.1~~~~~~~~~-7.9~~~~~~~~~~~~~~~~~~~7.9~~~~~~~~~~~~~~~~~~~~9~~~~~~~~~~~~~~~~-9~~~~~~~~$
$~~~~~~155.8~~~~~~~~~-1.2~~~~~~~~~~~~~~~~~~~1.2~~~~~~~~~~~~~~~~~~~~3~~~~~~~~~~~~~~~~-3~~~~~~~~$
$~~~~~~172.4~~~~~~~~~~~12.4~~~~~~~~~~~~~~~~~12.4~~~~~~~~~~~~~~~~~~~12~~~~~~~~~~~~~~~~~~~12~~~~~~~~~$
$~~~~~~169.8~~~~~~~~~~~~~9.8~~~~~~~~~~~~~~~~~~~9.8~~~~~~~~~~~~~~~~~~~10~~~~~~~~~~~~~~~~~~10~~~~~~~~~~~$
$~~~~~~159.7~~~~~~~~~-0.3~~~~~~~~~~~~~~~~~~~0.3~~~~~~~~~~~~~~~~~~~~2~~~~~~~~~~~~~~~~-2~~~~$
$~~~~~~162.7~~~~~~~~~~~~~2.7~~~~~~~~~~~~~~~~~~~2.7~~~~~~~~~~~~~~~~~~~~6~~~~~~~~~~~~~~~~~~~~~6~~~~~~~~~~~~~~~~$
$~~~~~~156.6~~~~~~~~~-3.4~~~~~~~~~~~~~~~~~~~3.4~~~~~~~~~~~~~~~~~~~~7~~~~~~~~~~~~~~~~-7~~~~~~~~$
$~~~~~~174.5~~~~~~~~~~~14.5~~~~~~~~~~~~~~~~~14.5~~~~~~~~~~~~~~~~~~~13~~~~~~~~~~~~~~~~~~~13~~~~~~~$
$~~~~~~184.4~~~~~~~~~~~24.4~~~~~~~~~~~~~~~~~24.4~~~~~~~~~~~~~~~~~~~16~~~~~~~~~~~~~~~~~~~16~~~~~~~$
$~~~~~~165.2~~~~~~~~~~~~~5.2~~~~~~~~~~~~~~~~~~~5.2~~~~~~~~~~~~~~~~~~~~8~~~~~~~~~~~~~~~~~~~~~8~~~~~~~~$
$~~~~~~147.8~~~~~~~~~-12.2~~~~~~~~~~~~~~~12.2~~~~~~~~~~~~~~~~~~~11~~~~~~~~~~~~~~~~-11~~~$
$~~~~~~177.8~~~~~~~~~~~17.8~~~~~~~~~~~~~~~~~17.8~~~~~~~~~~~~~~~~~~~15~~~~~~~~~~~~~~~~~~~~15~~~~~~~~~$
$~~~~~~160.1~~~~~~~~~~~~~0.1~~~~~~~~~~~~~~~~~~~0.1~~~~~~~~~~~~~~~~~~~~~1~~~~~~~~~~~~~~~~~~~~1~~~~~~~$
$~~~~~~161.5~~~~~~~~~~~~~1.5~~~~~~~~~~~~~~~~~~~1.5~~~~~~~~~~~~~~~~~~~~~4~~~~~~~~~~~~~~~~~~~~4~~~~$
______________________________________________________________________________________

We compute the signed ranks, positive and negative:

\(\small{~~~~W+~=~14+12+10+6+13+16+8+15+1+4~=~99}\)

\(\small{~~~~W-~=~5+9+3+2+7+11~=~37}\)

As expected, \(\small{\left|W+\right|+\left|W-\right|~=~99+37~=136~=~\dfrac{16(16+1)}{2} }\)

We get the test statistic, which is the minimum among the two signed ranks:\(~~~\small{~~W~=~min(W+,W-)~=min(99,37)~=~37}\).

Since we have n=16 data points, we can compute the Z statistics under large sample approximation for w=99 to get,

\(\small{~~~~~~~~~~~~~Z~=~\dfrac{W-\dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}}~=~\dfrac{37-\dfrac{16(17+1)}{4}}{\sqrt{\dfrac{16(17)(33)}{24}}} =~-1.603}\)

( Note:~If we use \(\small{W-~=~99}\) in the above expression, we get a symmetric value of \(\small{Z~=~1.603}\) and the final conclusions will be identical. )

The critical value for the test at significance of $\alpha=0.05$ can be obtained from unit Gaussian table as,

\(\small{Z_c~=~Z_{\alpha}~=~Z_{0.05}~=~-1.645 }\)

Since the Z statistic value of -1.603 is inside the critical region defined by $Z_c=1.645$ (though marginally), we fail to reject the null hypothesis to conclude that for the given sample size, the data supports the null hypothesis $m=160$.

We also compute the p-value for Z=1.603 using the R function call "pnorm(-1.603)" to be 0.0544, which (marginally) fails to reject the null hypothesis at a significance level of $\alpha=0.05$.


For this test, we can also get the critical value using the above mentioned Wiloxon table. From this table, for a sample size $n=16$, we get a vlue of 35 as the smallest rank total for which the probability is less than or equal to \(\small{\alpha=0.05}\).

Since \(\small{W-~=~37 }\) is more than the critical value of 35, we fail to reject the null hypothesis against the alternate hypothesis \(\small{H_A: m \gt 160}\) at a significant level of 0.05.

R-scripts


The Wilcoxon family of tests can be performed in R using the wilcox.test() function.

Important functon parameters are:

wilcox.test(x, y = NULL, alternative = c("two.sided", "less", "greater"),
                 mu = 0, paired = FALSE, exact = NULL, correct = TRUE,
		 conf.int = FALSE, conf.level = 0.95, ...) 

 x, y  -------> two data vectors. y=NULL makes it a single vriable test.

 alternative  ------>  type of alterntive hypothesis. 

 mu  ------> population mean to be compare. zero by default

paired   -----> In the case of two sample test, are the samples paired or not?

exact  ------>  a logical indicating whether an exact p-value should be computed.
                                          By default, exact=NULL

 correct  ---- a logical indicating whether to apply continuity correction
          in the normal approximation for the p-value.

 conf.int  ---- a logical indicating whether confidence intervl should be computed.

 conf.level  ---- confidence level for the test.





## Wilcoxon signed rank test for one variable x = c(176.9, 158.3, 152.1, 158.8, 172.4, 169.8, 159.7, 162.7, 156.6, 174.5, 184.4, 165.2, 147.8, 177.8, 160.1, 161.5) res = wilcox.test(x, mu=160, conf.level=0.95, alternative=c("greater"), conf.interval=TRUE ) print(res)

Executing the above script in R prints the following output.
Wilcoxon signed rank test data: x V = 99, p-value = 0.05833 alternative hypothesis: true location is greater than 160