Which measure of central tendency is used with data that is in categories, such as student major, gender, or country of birth?

When data is normally distributed, the mean, median and mode should be identical, and are all effective in showing the most typical value of a data set.

It's important to look at the dispersion of a data set when interpreting the measures of central tendency.

Mean

The mean of a data set is also known as the average value. It is calculated by dividing the sum of all values in a data set by the number of values.

So in a data set of 1, 2, 3, 4, 5, we would calculate the mean by adding the values (1+2+3+4+5) and dividing by the total number of values (5). Our mean then is 15/5, which equals 3. 
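The arithmetic above can be checked with a short Python sketch. The variable names are illustrative only; `statistics.mean` is part of the Python standard library.

```python
from statistics import mean

values = [1, 2, 3, 4, 5]

# Mean from the definition: sum of the values divided by the number of values.
manual_mean = sum(values) / len(values)

# The standard library's statistics.mean should agree with the manual result.
library_mean = mean(values)

print(manual_mean, library_mean)
```

Both approaches follow the same definition, so either can be used; the library function is simply more convenient.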

The disadvantages of the mean as a measure of central tendency are that it is highly susceptible to outliers (observations that are markedly distant from the bulk of observations in a data set), and that it is not appropriate when the data is skewed rather than normally distributed.

Median

The median of a data set is the value that is at the middle of a data set arranged from smallest to largest.

In the data set 1, 2, 3, 4, 5, the median is 3. 

In a data set with an even number of observations, the median is calculated by dividing the sum of the two middle values by two. So in: 1, 2, 3, 4, 5, 6, the median is (3+4)/2, which equals 3.5.
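As a minimal sketch, the rule for odd and even numbers of observations could be written in Python as follows. The helper name `middle_value` is hypothetical, chosen for this illustration; the standard library's `statistics.median` is used only to cross-check it.

```python
from statistics import median

def middle_value(data):
    """Return the median of a non-empty list of numbers (illustrative helper)."""
    ordered = sorted(data)          # arrange from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of observations: take the single middle value.
        return ordered[mid]
    # Even number of observations: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_result = middle_value([1, 2, 3, 4, 5])
even_result = middle_value([1, 2, 3, 4, 5, 6])

print(odd_result, even_result)
print(median([1, 2, 3, 4, 5]), median([1, 2, 3, 4, 5, 6]))
```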

The median is appropriate to use with ordinal variables, and with interval variables with a skewed distribution.

Mode

The mode is the most common observation of a data set, or the value in the data set that occurs most frequently.

The mode has disadvantages. One is that a single data set can have more than one mode (e.g. in 1, 2, 2, 3, 4, 5, 5, both 2 and 5 are modes), so it does not always identify one typical value.
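To illustrate a data set with two modes, here is a small Python sketch. It assumes Python 3.8 or later, where the standard library provides `statistics.multimode`; the variable names are illustrative.

```python
from statistics import multimode

values = [1, 2, 2, 3, 4, 5, 5]

# multimode returns every value that shares the highest frequency,
# in the order each first appears in the data.
modes = multimode(values)

print(modes)
```

Reporting all modes, rather than a single one, avoids hiding the fact that the data has two equally common values.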

The mode is an appropriate measure to use with categorical data.

Central tendency is commonly known as the ‘average’. In more technical terms, it is the ‘most central or representative number in a data set’.

There are various measures of central tendency in psychology that are used in descriptive statistics.

Imagine that you are a first-year university student, and a friend asks you about the ages of people in your psychology course. You might say: ‘Well, most people are 18, there are a few in their 20s and two or three over 40.’ In doing so, you have informally described the central tendency of the ages in your course.

In descriptive statistics, there are three ways to measure central tendency: the mean, the median, and the mode.

The measures of central tendency

Let's take a look at the measures of central tendency (the mean, median, and mode) with examples.

Mean

The mean in everyday terms is ‘average’. It is what you get if you add up all the values in a data set, and then divide by the total number of values.

A data set has the values 2, 4, 6, 8, 10. The mean would be (2+4+6+8+10) ÷ 5 = 6.

Advantages and disadvantages of the mean

Advantages

The mean is a powerful statistic because it can be used to estimate population parameters.

Population parameter: When we conduct psychological studies, we use a limited number of participants as it would be impossible to test a whole population. The measures from these participants are measures of a sample (sample statistics) and we use these sample statistics as an estimate and reflection of the general population (population parameter).

The estimates of population parameters that we derive from the sample mean can be used in inferential statistics.

The mean is the most sensitive and precise of the three measures of central tendency. This is because it is used on interval data (data measured in fixed units with equal distances between each point on the scale, e.g., temperature measured in degrees or IQ test scores). The mean takes into account the exact distances between values in a data set.

Disadvantages

As the mean is so sensitive, it can easily be distorted by unrepresentative values (outliers).

A sports coach is measuring how long it takes for pupils to swim 100m. There are 10 pupils, and all of them take around 2 minutes except for one, who takes 5 minutes. Because of this 5-minute outlier, the mean is pulled upwards and is unrepresentative of the group.
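The effect of the outlier can be sketched with some illustrative numbers. The text only says the pupils take "around 2 minutes", so the values below (nine pupils at exactly 120 seconds and one at 300 seconds) are an assumption made for this example.

```python
from statistics import mean, median

# Assumed swim times in seconds: nine pupils at 120 s, one outlier at 300 s.
times = [120] * 9 + [300]

mean_time = mean(times)      # pulled upwards by the single slow swimmer
median_time = median(times)  # stays at the typical time

print(mean_time, median_time)
```

Under these assumed values the mean is higher than the time nine of the ten pupils actually swam, while the median still matches it.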

Additionally, as the mean is very precise, the value calculated sometimes does not make sense in real-world terms.

A headteacher would like to calculate the average number of siblings that children at their school have. After adding up the number of siblings for every pupil and dividing by the number of pupils, it turns out that the mean number of siblings is 2.4, a value that no child can actually have.
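A small sketch shows how a mean of 2.4 can arise from whole-number data. The sibling counts below are hypothetical, invented so that the mean matches the figure in the example; they are not taken from any real school.

```python
from statistics import mean, median, mode

# Hypothetical sibling counts for five pupils (illustrative data only).
siblings = [1, 2, 3, 2, 4]

mean_siblings = mean(siblings)      # 12 / 5, which is not a whole number
median_siblings = median(siblings)  # middle value of 1, 2, 2, 3, 4
mode_siblings = mode(siblings)      # most frequent count

print(mean_siblings, median_siblings, mode_siblings)
print(mean_siblings in siblings)    # the mean is not a value any pupil has
```

For count data like this, the median or mode gives a value that at least one pupil actually has, which is often easier to interpret.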

Median

The median is the central number in a data set when it is ordered from lowest to highest.

Out of the numbers 2, 3, 6, 11, 14, the median is 6.

If there is an even number of values in a data set, the median lies halfway between the two central values.

Out of the numbers 2, 3, 6, 11, 14, 61, the median is between 6 and 11. We calculate the mean of these two numbers, (6+11) ÷ 2, which is 8.5; thus the median of this data set is 8.5.

Advantages and disadvantages of the median

Advantages

  • The median is unaffected by extreme values that are unrepresentative of the data set.

  • The median is easier to calculate than the mean.

Disadvantages

  • The median does not take into account the exact distances between values, as the mean does.

  • The median is less suited than the mean to estimating population parameters in inferential statistics.

Mode

The mode is the value or category with the highest frequency count.

For a data set of 3, 4, 5, 6, 6, 6, 7, 8, 8, the mode is 6.

What is the best measure of central tendency for categorical data?

The mode is the only measure of central tendency that can be used with categorical data, while the median works best with ordinal data.

Which central tendency is best for age?

The median is often the statistic of choice for ages, because age data is frequently skewed by a small number of much older or much younger individuals.

Which measure of central tendency is used with categorical nominal data?

Three measures of central tendency are the mode, the median and the mean. The mode is used almost exclusively with nominal-level data, as it is the only measure of central tendency available for such variables.
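As an illustration of finding the mode of nominal data, the sketch below counts category frequencies in Python. The list of student majors is hypothetical, made up for this example; `collections.Counter` and `statistics.mode` are standard-library tools.

```python
from collections import Counter
from statistics import mode

# Hypothetical nominal data: the major of each student in a small class.
majors = ["psychology", "biology", "psychology",
          "history", "biology", "psychology"]

# Count how often each category occurs.
counts = Counter(majors)

# The mode is the category with the highest frequency count.
top_major, top_count = counts.most_common(1)[0]

print(top_major, top_count)
print(mode(majors))  # statistics.mode also accepts non-numeric data
```

The categories have no numeric value or natural order, so there is nothing to sum or sort; counting frequencies is the only operation that makes sense here.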

What is the best measure of central tendency to represent the number of children in a household?

The mean is the most frequently used measure of central tendency and is generally considered the best measure of it. However, there are some situations where the median or the mode is preferred. For example, the median is the preferred measure of central tendency when there are a few extreme scores in the distribution of the data.