Measures of Dispersion



Overview of dispersion measures in statistics.

Measures of dispersion describe how spread out a data set or distribution is. They include the range, the interquartile range, the mean absolute deviation, the variance, the standard deviation and the coefficient of variation.


Range


The range is the difference between the highest and the lowest values in a data set, so only the minimum and the maximum are needed to compute it.

For example, given a list of elements: $17,10,9,21,14,13,18,12,8,20,15$

To make the process easier to follow, let's first sort the list: $8,9,10,12,13,14,15,17,18,20,21$

The next step is to find the minimum and the maximum values, which are $8$ and $21$, respectively.

Finally, subtract the minimum from the maximum.

$$ \large max - min = 21 - 8 = 13 $$
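The same calculation can be sketched in Python. This is only an illustration using built-in functions; the variable names are chosen here and are not part of the original example.

```python
# Values from the example above.
values = [17, 10, 9, 21, 14, 13, 18, 12, 8, 20, 15]

# Only the two extremes are needed; sorting is optional.
lowest = min(values)
highest = max(values)
value_range = highest - lowest

print(lowest, highest, value_range)  # 8 21 13
```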

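Another well-known range measure is the interquartile range, which replaces the minimum and the maximum with the first and third quartiles, $Q_1$ and $Q_3$, respectively. Using the same list as in the previous example, and taking $Q_1$ and $Q_3$ as the medians of the lower and upper halves of the sorted list (leaving out the overall median, $14$), we have: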

$$ \large Q_3 - Q_1 = 18 - 10 = 8 $$
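A minimal sketch of the interquartile range follows, assuming the quartiles are taken as the medians of the lower and upper halves of the sorted list, with the overall median left out when the number of values is odd. That convention reproduces $Q_1 = 10$ and $Q_3 = 18$ here; other quartile methods exist and may give different values. The helper name `quartiles_by_halves` is hypothetical.

```python
from statistics import median

values = [17, 10, 9, 21, 14, 13, 18, 12, 8, 20, 15]

def quartiles_by_halves(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    For an odd number of values the overall median is left out of both
    halves. This is one convention among several.
    """
    ordered = sorted(data)
    n = len(ordered)
    lower_half = ordered[: n // 2]
    upper_half = ordered[(n + 1) // 2 :]
    return median(lower_half), median(upper_half)

q1, q3 = quartiles_by_halves(values)
iqr = q3 - q1
print(q1, q3, iqr)  # 10 18 8
```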

Variance and Deviation


The mean absolute deviation, the variance and the standard deviation all measure how far the values of a data set lie from its arithmetic mean ($\large \mu$ for a population, $\large \overline{x}$ for a sample). It is important to distinguish the sample measure from the population measure:

| | Sample | Population |
|---|---|---|
| Mean Absolute Deviation | $$\large MAD=\frac{\sum \mid x_i - \overline{x} \mid}{n}$$ | $$\large MAD=\frac{\sum \mid x_i - \mu \mid}{n}$$ |
| Variance | $$\large s^2=\frac{\sum (x_i - \overline{x})^2}{n-1}$$ | $$\large \sigma^2=\frac{\sum (x_i - \mu)^2}{n}$$ |
| Standard Deviation | $$\large s=\sqrt{\frac{\sum (x_i - \overline{x})^2}{n-1}}$$ | $$\large \sigma=\sqrt{\frac{\sum (x_i - \mu)^2}{n}}$$ |

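where $x_i$ is each value in the data set, $\overline{x}$ is the sample mean, $\mu$ is the population mean and $n$ is the number of values.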

For example, given a list of ages of randomly selected voters: $19, 21, 34, 20, 55, 43, 22, 26$

First, let's calculate the arithmetic mean:

$$ \large \overline{x} = \frac{\sum x_i}{n} = \frac{19+21+34+20+55+43+22+26}{8} = \frac{240}{8} = 30 $$

Now we are able to calculate the Mean Absolute Deviation:

$$ \begin{align} MAD&=\frac{\sum \mid x_i - \overline{x}\mid }{n} \\ MAD&=\frac{\mid 19 - 30 \mid + \mid 21 - 30 \mid + \mid 34 - 30 \mid + \mid 20 - 30 \mid + \mid 55 - 30 \mid + \mid 43 - 30 \mid + \mid 22 - 30 \mid + \mid 26 - 30 \mid }{8} \\ MAD&=\frac{84}{8} \\ MAD& = 10.5 \end{align} $$

Considering our data as a sample, the variance value would be:

$$ \begin{align} s^2&=\frac{\sum (x_i - \overline{x})^2}{n-1} \\ s^2&=\frac{(19 - 30)^2 + (21 - 30)^2 + (34 - 30)^2 + (20 - 30)^2 + (55 - 30)^2 + (43 - 30)^2 + (22 - 30)^2 + (26 - 30)^2}{8 - 1} \\ s^2&=\frac{1192}{7} \\ s^2& \approx 170.29 \end{align} $$

Having that, the standard deviation is calculated as the square root of the variance.

$$ \large s = \sqrt{s^2} = \sqrt{170.29} \approx 13.05 $$
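The sample calculations above can be reproduced with a short Python sketch that follows the formulas directly. The function names are hypothetical, and the `ages` list is typed in by hand so that it sums to $240$, matching the worked mean of $30$.

```python
from math import sqrt

# Ages typed in by hand; they sum to 240, so the mean is 30.
ages = [19, 21, 34, 20, 55, 43, 22, 26]

def mean_absolute_deviation(data):
    """Average absolute distance from the arithmetic mean (divides by n)."""
    mean = sum(data) / len(data)
    return sum(abs(x - mean) for x in data) / len(data)

def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data) / (len(data) - 1)

mad = mean_absolute_deviation(ages)
s2 = sample_variance(ages)
s = sqrt(s2)  # sample standard deviation

print(mad)           # 10.5
print(round(s2, 2))  # 170.29
print(round(s, 2))   # 13.05
```

Dividing by $n - 1$ instead of $n$ is what distinguishes the sample variance from the population variance in the table above.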

For another example, let's take the grades of all 8 students of a class: $7.5, 8, 7, 9.5, 9, 8.5, 7.5, 7$

Now, let's calculate the arithmetic mean:

$$ \large \mu = \frac{\sum x_i}{n} = \frac{7.5+8+7+9.5+9+8.5+7.5+7}{8} = \frac{64}{8} = 8 $$

The mean absolute deviation would be:

$$ \begin{align} MAD&=\frac{\sum \mid x_i - \mu \mid }{n} \\ MAD&=\frac{\mid 7.5 - 8 \mid + \mid 8 - 8 \mid + \mid 7 - 8 \mid + \mid 9.5 - 8 \mid + \mid 9 - 8 \mid + \mid 8.5 - 8 \mid + \mid 7.5 - 8 \mid + \mid 7 - 8 \mid }{8} \\ MAD&=\frac{6}{8} \\ MAD& = 0.75 \end{align} $$

Considering our data represents the whole population, the variance value would be:

$$ \begin{align} \sigma^2&=\frac{\sum (x_i - \mu)^2}{n} \\ \sigma^2&=\frac{(7.5 - 8)^2 + (8 - 8)^2 + (7 - 8)^2 + (9.5 - 8)^2 + (9 - 8)^2 + (8.5 - 8)^2 + (7.5 - 8)^2 + (7 - 8)^2}{8} \\ \sigma^2&=\frac{6}{8} \\ \sigma^2& = 0.75 \end{align} $$

Having that, the standard deviation is calculated as the square root of the variance.

$$ \large \sigma = \sqrt{\sigma^2} = \sqrt{0.75} \approx 0.87 $$
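For the population case, Python's standard `statistics` module already provides `pvariance` and `pstdev` (dividing by $n$) alongside `variance` and `stdev` (dividing by $n - 1$). The sketch below checks the grades example with them; the sample variance is computed only to show how the two denominators differ on the same data.

```python
from statistics import mean, pstdev, pvariance, variance

grades = [7.5, 8, 7, 9.5, 9, 8.5, 7.5, 7]

mu = mean(grades)
sigma2 = pvariance(grades)  # population variance, divides by n
sigma = pstdev(grades)      # square root of the population variance
s2 = variance(grades)       # sample variance (n - 1), for comparison only

print(mu)               # 8.0
print(sigma2)           # 0.75
print(round(sigma, 2))  # 0.87
print(round(s2, 2))     # 0.86
```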

Coefficient of Variation


The coefficient of variation (or relative standard deviation) measures the extent of variability in relation to the mean (strictly, its absolute value $\large |\mu|$). In other words, it expresses the standard deviation as a fraction of the mean. It is defined as the ratio between the standard deviation and the mean.

$$ \large c_v = \frac{\sigma}{\mu} $$

Given the same grades of all 8 students of the class from the previous example: $7.5, 8, 7, 9.5, 9, 8.5, 7.5, 7$

We already know that the standard deviation $\large \sigma$ and the mean $\large \mu$ are $0.87$ and $8$, respectively, so:

$$ \large c_v = \frac{\sigma}{\mu} = \frac{0.87}{8} \approx 0.11 $$
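As a rough sketch, the ratio can be computed in Python with the standard `statistics` module. The function name `coefficient_of_variation` is hypothetical, and the zero-mean check is an added assumption, since the ratio is undefined when the mean is zero.

```python
from statistics import mean, pstdev

grades = [7.5, 8, 7, 9.5, 9, 8.5, 7.5, 7]

def coefficient_of_variation(data):
    """Population standard deviation divided by the mean.

    Raises ValueError for a zero mean, where the ratio is undefined.
    """
    mu = mean(data)
    if mu == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(data) / mu

cv = coefficient_of_variation(grades)
print(round(cv, 2))  # 0.11
```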