Measures of Central Tendency
Measures of central tendency are ways of describing the central position of a frequency distribution for a group of data. You can describe this central position using the mean, median, or mode. Which one you use depends on how much data you collected and what type of data it is.
Type of Data for Responding Variable | Best Measure of Central Tendency |
Qualitative | Mode |
Qualitative | Median |
Quantitative | Mean |
Quantitative | Median |
The arithmetic mean is the most commonly used measure of central tendency. The mean is essentially a model of a data set for normally distributed (non-skewed) data. The mean has one main disadvantage: it is particularly susceptible to the influence of outliers. These are values that are unusual compared to the rest of the data set by being especially small or large in numerical value. That’s why we determined the skew.
- DO NOT calculate a mean for skewed data.
- DO NOT calculate a mean from values that are already averages.
- DO NOT calculate a mean when the measurement scale is not linear (e.g. pH units are measured on a logarithmic scale, not a linear one).
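To see why a non-linear scale causes trouble, recall that pH is the negative base-10 logarithm of the hydrogen ion concentration. The sketch below compares the naive mean of two pH values with the pH of the mean concentration. The two readings and the function name `ph_of_mean_concentration` are made up for illustration.

```python
import math

def ph_of_mean_concentration(ph_values):
    """Average on the linear concentration scale, then convert back to pH."""
    # pH = -log10([H+]), so [H+] = 10 ** (-pH)
    concentrations = [10 ** (-ph) for ph in ph_values]
    mean_concentration = sum(concentrations) / len(concentrations)
    return -math.log10(mean_concentration)

readings = [3.0, 5.0]  # hypothetical pH readings, not real data

naive_mean = sum(readings) / len(readings)
concentration_based = ph_of_mean_concentration(readings)

print(f"Naive mean of the pH values: {naive_mean:.2f}")
print(f"pH of the mean concentration: {concentration_based:.2f}")
```

For these two readings the naive mean is 4.00, while the pH of the mean concentration is about 3.30, so averaging the pH numbers directly misrepresents the sample.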
The mean is equal to the sum of all the values in the data set divided by the number of values in the data set. So, if we have n values in a data set and they have values x1, x2, … , xn, the sample mean, usually denoted by x̄ (pronounced "x bar"), is:
$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$
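The definition of the mean can be checked with a short Python sketch. It reuses the eleven scores from the median example later in this section. The extra value of 1000 is a made-up outlier, added only to show how strongly the mean reacts compared with the median.

```python
from statistics import mean, median

scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]

# Mean straight from the definition: sum of the values divided by their count.
manual_mean = sum(scores) / len(scores)

# The standard library function gives the same result.
library_mean = mean(scores)

# Add one hypothetical extreme value to see the effect of an outlier.
with_outlier = scores + [1000]
mean_with_outlier = mean(with_outlier)
median_with_outlier = median(with_outlier)

print(manual_mean, library_mean)
print(mean_with_outlier, median_with_outlier)
```

With these numbers the mean of the original scores is 59. Adding the single outlier raises the mean above 137, while the median stays at 56.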
Calculating the Mean in Sheets
Calculating the Mean in Excel
Open Excel and enter your data in columns. You can label the columns if you prefer. To calculate the mean:
Median
The median is the middle score for a set of data that has been arranged in order of magnitude. The median is less affected by outliers and skewed data. In order to calculate the median, suppose we have the data below:
65 | 55 | 89 | 56 | 35 | 14 | 56 | 55 | 87 | 45 | 92 |
We first need to rearrange that data into order of magnitude (smallest first):
14 | 35 | 45 | 55 | 55 | 56 | 56 | 65 | 87 | 89 | 92 |
Our median mark is the middle mark, in this case 56, the 6th value in the ordered list. It is the middle mark because there are 5 scores before it and 5 scores after it. This works fine when you have an odd number of scores, but what happens when you have an even number of scores? What if you had only 10 scores? Well, you simply take the middle two scores and average them. So, if we look at the example below:
65 | 55 | 89 | 56 | 35 | 14 | 56 | 55 | 87 | 45 |
We again rearrange that data into order of magnitude (smallest first):
14 | 35 | 45 | 55 | 55 | 56 | 56 | 65 | 87 | 89 |
This time we take the 5th and 6th scores in our ordered data set (55 and 56) and average them to get a median of 55.5.
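Both cases can be reproduced with a small Python sketch. The helper `median_by_hand` is a hypothetical name used here to mirror the steps described above; it assumes a non-empty list. The standard library's `statistics.median` is shown alongside as a cross-check.

```python
from statistics import median

def median_by_hand(values):
    """Sort the values, then take the middle one (or average the middle two)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[middle]
    # Even count: average the two values on either side of the midpoint.
    return (ordered[middle - 1] + ordered[middle]) / 2

eleven_scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
ten_scores = eleven_scores[:-1]  # the same data without the final score (92)

print(median_by_hand(eleven_scores), median(eleven_scores))
print(median_by_hand(ten_scores), median(ten_scores))
```

The eleven scores give a median of 56 and the ten scores give 55.5, matching the worked examples.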
Mode
The mode is the most frequent score in our data set. On a bar chart or histogram, it corresponds to the highest bar. You can, therefore, sometimes consider the mode as being the most popular option. An example of a mode is presented below:
Normally, the mode is used for categorical data where we wish to know which is the most common category, as illustrated below:
We can see above that the most common form of transport, in this particular data set, is the bus. However, one problem with the mode is that it is not always unique, which causes difficulty when two or more values share the highest frequency, such as below:
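The same ideas can be sketched in Python. The transport responses below are made up for illustration and are not the data behind the chart. The tie case reuses the eleven scores from the median example, where two values share the highest frequency. `statistics.multimode` requires Python 3.8 or later.

```python
from collections import Counter
from statistics import multimode

# Hypothetical survey responses; the counts are invented for illustration.
transport = ["bus", "car", "bus", "train", "walk", "bus", "car"]
counts = Counter(transport)
most_common_category, frequency = counts.most_common(1)[0]
print(most_common_category, frequency)

# The scores from the median example: 55 and 56 each appear twice.
scores = [65, 55, 89, 56, 35, 14, 56, 55, 87, 45, 92]
modes = multimode(scores)  # returns every value tied for the highest frequency
print(modes)
```

Returning all tied values, rather than picking one, makes it visible when a data set has no single mode.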