Measures of Central Location



Overview of location measures and central tendency of data.

Measures of location describe the central tendency of a data set or distribution. The most common are the mean, the median and the mode, and each of them is sometimes loosely called an "average" (more formally, a measure of central tendency).


Arithmetic Mean


The arithmetic mean (or simply mean or average) can be described as the sum of all measurements divided by the number of observations in the data set.

$$ \large \displaystyle \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{x_1+x_2+\cdots+x_n}{n} $$

For example, given the list of 5 numbers: $5, 87, 45, 32, 1$

The arithmetic mean of these observations would be:

$$ \large \displaystyle \frac{5 + 87 + 45 + 32 + 1}{5} = \frac{170}{5} = 34 $$
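The calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `arithmetic_mean` is an assumption made for this example, and the standard library's `statistics.mean` is shown only as a cross-check.

```python
from statistics import mean


def arithmetic_mean(values):
    """Return the sum of all values divided by the number of values."""
    if not values:
        raise ValueError("arithmetic mean requires at least one value")
    return sum(values) / len(values)


data = [5, 87, 45, 32, 1]

print(arithmetic_mean(data))  # 34.0
# Cross-check against the standard library.
print(mean(data) == arithmetic_mean(data))  # True
```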

Geometric Mean


The geometric mean can be described as the $n$th root of the product of all observations in the data set.

$$ \large \displaystyle \left(\prod_{i=1}^{n} x_i\right)^{\frac{1}{n}} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} $$

This location measure is valid only for data that are measured on a strictly positive scale (values greater than zero).

$$ \large \displaystyle \mathbb{R}_{>0} := \{x \in \mathbb{R}:x > 0\} $$

For example, given the same list of 5 numbers: $5, 87, 45, 32, 1$

The geometric mean of these observations would be:

$$ \large \displaystyle \sqrt[\leftroot{-2}\uproot{2}5]{5 \cdot 87 \cdot 45 \cdot 32 \cdot 1} = \sqrt[\leftroot{-2}\uproot{2}5]{626400} \approx 14.4 $$
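As a hedged sketch, the same result can be reproduced in Python. The function name `geometric_mean` is chosen for this example. Averaging logarithms instead of multiplying all values first is a design choice made here (it gives the same $n$th root while avoiding very large intermediate products), not something the text above prescribes.

```python
import math


def geometric_mean(values):
    """Return the nth root of the product of n strictly positive values."""
    if not values:
        raise ValueError("geometric mean requires at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    # exp(mean(log(x))) equals the nth root of the product of the x values.
    return math.exp(sum(math.log(v) for v in values) / len(values))


data = [5, 87, 45, 32, 1]

print(round(geometric_mean(data), 1))  # 14.4
```

Python 3.8 and later also ship `statistics.geometric_mean`, which can be used instead of a hand-written function.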

Harmonic Mean


The harmonic mean can be described as the reciprocal of the arithmetic mean of the reciprocals of the data values. Like the geometric mean, this location measure is valid only for data that are measured on a strictly positive scale (values greater than zero).

$$ \large \displaystyle \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}} = \frac{n}{\frac{1}{x_1}+\frac{1}{x_2}+\cdots+\frac{1}{x_n}} $$

For example, given the same list of 5 numbers: $5, 87, 45, 32, 1$

The harmonic mean of these observations would be:

$$ \large \displaystyle \frac{5}{\frac{1}{5}+\frac{1}{87}+\frac{1}{45}+\frac{1}{32}+\frac{1}{1}} = \frac{5}{\frac{8352+480+928+1305+41760}{41760}} = \frac{5}{\frac{52825}{41760}} = \frac{208800}{52825} \approx 3.95 $$
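The fraction arithmetic above can be checked with a short Python sketch. It uses `fractions.Fraction` from the standard library so that the intermediate sum stays exact. The function name `harmonic_mean` is an assumption made for this example.

```python
from fractions import Fraction


def harmonic_mean(values):
    """Return n divided by the sum of the reciprocals of n positive values."""
    if not values:
        raise ValueError("harmonic mean requires at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("harmonic mean requires strictly positive values")
    # Exact rational arithmetic mirrors the worked fraction above.
    reciprocal_sum = sum(1 / Fraction(v) for v in values)
    return len(values) / reciprocal_sum


data = [5, 87, 45, 32, 1]
result = harmonic_mean(data)

print(result == Fraction(208800, 52825))  # True
print(round(float(result), 2))            # 3.95
```

For floating-point input, `statistics.harmonic_mean` from the standard library is a simpler choice.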

Power Mean


The power mean is a generalized mean that includes the quadratic, arithmetic, geometric and harmonic means as special cases.

$$ \large \displaystyle \left(\frac{1}{n} \sum_{i=1}^{n} x_i^p\right)^{\frac{1}{p}} = \sqrt[p]{\frac{{x_1^p+x_2^p+\cdots+x_n^p}}{n}} $$

The exponent $\large p$ is the parameter that controls its behavior. By choosing different values for $\large p$, the following types of means are obtained:

$$ \large \displaystyle \begin{align} p &\rightarrow - \infty & \text{minimum value} \\ p &= -1 & \text{harmonic mean} \\ p &\rightarrow 0 & \text{geometric mean} \\ p &= +1 & \text{arithmetic mean} \\ p &= +2 & \text{quadratic mean} \\ p &\rightarrow + \infty & \text{maximum value} \end{align} $$
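The effect of the exponent can be explored with a small Python sketch. The function name `power_mean` is an assumption for this example. The limiting cases ($p \rightarrow 0$ and $p \rightarrow \pm\infty$) cannot be evaluated with the formula directly, so the sketch handles them explicitly by returning the geometric mean, the maximum and the minimum.

```python
import math


def power_mean(values, p):
    """Return the power mean with exponent p of strictly positive values."""
    if not values:
        raise ValueError("power mean requires at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("power mean requires strictly positive values")
    n = len(values)
    if p == 0:
        # Limit p -> 0: the geometric mean.
        return math.exp(sum(math.log(v) for v in values) / n)
    if p == math.inf:
        # Limit p -> +infinity: the maximum value.
        return max(values)
    if p == -math.inf:
        # Limit p -> -infinity: the minimum value.
        return min(values)
    return (sum(v ** p for v in values) / n) ** (1 / p)


data = [5, 87, 45, 32, 1]

# The result grows from the minimum to the maximum as p increases.
for p in (-math.inf, -1, 0, 1, 2, math.inf):
    print(p, power_mean(data, p))
```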

Median


The median is the middle point of a data set: it divides the sorted observations into two halves. Finding it takes two steps. First, arrange the values in ascending order (descending order works equally well). Then take the middle value. If the data set has an odd number of elements, the median is the middle element (the $\frac{n+1}{2}$th element). If it has an even number of elements, the median is the mean of the two central elements (the $\frac{n}{2}$th and the $\left[\frac{n}{2} + 1\right]$th).


For example, given the list of 5 numbers: $5, 87, 45, 32, 1$

The first step is to sort all the elements: $1, 5, 32, 45, 87$

Finally, get the middle element, which in this case (odd number of elements) is the value $32$.

For another example, let's take a list of 6 numbers: $5, 87, 45, 32, 1, 38$

The first step is to sort all the elements: $1, 5, 32, 38, 45, 87$

Finally, get the two central elements, which in this case (even number of elements) are $32$ (the $\frac{n}{2}$th element) and $38$ (the $\left[\frac{n}{2} + 1\right]$th element). Thus, the median is $\frac{32 + 38}{2} = 35$.
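Both cases can be sketched in Python as follows. The function name `median_of` is an assumption for this example. Python lists are zero-indexed, so for an odd length the middle element sits at index `n // 2`, and for an even length the two central elements sit at `n // 2 - 1` and `n // 2`.

```python
def median_of(values):
    """Return the middle value of the sorted data."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)  # step 1: sort the elements
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of elements: the single middle element.
        return ordered[mid]
    # Even number of elements: the mean of the two central elements.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([5, 87, 45, 32, 1]))      # 32
print(median_of([5, 87, 45, 32, 1, 38]))  # 35.0
```

The standard library's `statistics.median` follows the same rule.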

Mode


The mode is the most frequent value in a data set. A data set can have one or more modes: it is called bimodal when it has 2 modes and multimodal when it has more than 2. The mode is the only measure of central tendency that can be used with nominal data, whose values are purely qualitative categories.

For example, given a list of elements: $4, 6, 4, 6, 8, 7, 9, 10, 6$

To make the process more visually intuitive, let's first sort this list: $4, 4, 6, 6, 6, 7, 8, 9, 10$

Next, let's build a frequency table:

| value | number of occurrences |
|-------|-----------------------|
| 4     | 2                     |
| 6     | 3                     |
| 7     | 1                     |
| 8     | 1                     |
| 9     | 1                     |
| 10    | 1                     |

Given that, the mode value is $6$, which has the highest number of occurrences (3).
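The frequency-table approach can be sketched in Python with `collections.Counter`. The function name `modes` is an assumption for this example. It returns a list so that data sets with several modes are handled as well.

```python
from collections import Counter


def modes(values):
    """Return every value that reaches the highest number of occurrences."""
    counts = Counter(values)  # the frequency table
    if not counts:
        return []
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


data = [4, 6, 4, 6, 8, 7, 9, 10, 6]

print(sorted(Counter(data).items()))
# [(4, 2), (6, 3), (7, 1), (8, 1), (9, 1), (10, 1)]
print(modes(data))  # [6]
```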

For another example, let's take a list of nominal elements: Brazil, Argentina, Brazil, Argentina, Chile, Argentina, Chile, Peru, Brazil, Argentina, Brazil

Next, let's build a frequency table:

| value     | number of occurrences |
|-----------|-----------------------|
| Brazil    | 4                     |
| Argentina | 4                     |
| Chile     | 2                     |
| Peru      | 1                     |

Given that, the modes are Argentina and Brazil, since they share the highest number of occurrences (4). In other words, this data set is bimodal.
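Because the mode only counts occurrences, it works on strings just as well as on numbers. As a brief sketch, assuming Python 3.8 or later, the standard library's `statistics.multimode` returns all the most frequent values in the order they first appear. The variable name `countries` is chosen for this example.

```python
from statistics import multimode

countries = [
    "Brazil", "Argentina", "Brazil", "Argentina", "Chile", "Argentina",
    "Chile", "Peru", "Brazil", "Argentina", "Brazil",
]

found = multimode(countries)

print(found)       # ['Brazil', 'Argentina']
print(len(found))  # 2, so the data set is bimodal
```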