Understanding the difference between Mean, Median, and Mode


There are three main measures of central tendency: the median, the mean, and the mode. Each has its own strengths and weaknesses, which is why data scientists use all three depending on the dataset they are examining. In this article, we will explore what each measure is and how it can be used to gain insights into data.

What is the median?

The median is the value that falls in the middle of a dataset when it is sorted from smallest to largest. To calculate the median, simply sort the data and find the value in the middle. If there are an even number of values, the median is calculated as the average of the two middle values.

The median is largely unaffected by outliers, because it depends only on the order of the values and not on how extreme they are. This makes it a good choice for datasets with extreme values. It is also easy to calculate by hand, which makes it a good choice for small datasets.

How to calculate the median

The median can be calculated by hand for small datasets, or using a spreadsheet program or statistical software for larger datasets.

To calculate the median by hand, simply sort the data from smallest to largest and find the value in the middle. If there are an even number of values, the median is calculated as the average of the two middle values.
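For example, if the dataset is {12, 13, 14, 15, 16}, there are five values, so the median is the third (middle) value: 14. If the dataset were {12, 13, 14, 15} instead, there would be two middle values, and the median would be their average: (13 + 14) / 2 = 13.5.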
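The by-hand procedure can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `median_by_hand` is made up for this example, and the four-value list is added here only to exercise the even-length case. The standard library's `statistics.median` is used as a cross-check.

```python
import statistics


def median_by_hand(values):
    """Sort the data, then take the middle value (or average the two middle values)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty dataset is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: a single middle value exists.
        return ordered[mid]
    # Even number of values: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [12, 13, 14, 15, 16]   # dataset from the example above
even_data = [12, 13, 14, 15]      # hypothetical even-length variant

print(median_by_hand(odd_data))   # 14
print(median_by_hand(even_data))  # 13.5

# Cross-check against the standard library.
print(statistics.median(odd_data) == median_by_hand(odd_data))    # True
print(statistics.median(even_data) == median_by_hand(even_data))  # True
```

For real work, calling `statistics.median` directly is usually the simpler option; the hand-written version is only there to make the two cases visible.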


The median can also be easily calculated using a spreadsheet program or statistical software. Simply enter the data into the spreadsheet and use the built-in median function to find the value.

The median is a robust statistic, meaning that a few extreme values in the data have little or no effect on it. This makes it a good choice when working with datasets that may contain outliers.

What is the mean?

The mean is calculated by adding up all of the values in a dataset and then dividing by the number of values. The mean is sensitive to outliers, so a few extreme values can pull it away from where most of the data sits. However, because the mean uses the size of every value, it also reflects changes in magnitude that the median ignores.
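To make the effect of an outlier concrete, here is a small sketch comparing the mean and the median before and after one extreme value is added. The numbers are hypothetical and chosen only for illustration.

```python
import statistics

# Hypothetical values, e.g. salaries in thousands.
salaries = [30, 32, 35, 38, 40]
with_outlier = salaries + [400]  # one extreme value added

print(statistics.mean(salaries))      # 35
print(statistics.median(salaries))    # 35

# The mean jumps to roughly 95.8, while the median barely moves.
print(round(statistics.mean(with_outlier), 1))  # 95.8
print(statistics.median(with_outlier))          # 36.5
```

A single extreme value nearly triples the mean here, while the median shifts by only 1.5.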

How to calculate the mean

Calculating the mean is a simple process that can be done by hand or with a spreadsheet program. To calculate the mean by hand, simply add up all of the values in a dataset and then divide by the number of values.

For example, if you have the following dataset:

  • 12
  • 18
  • 24
  • 30

The mean would be calculated as follows:

(12 + 18 + 24 + 30) / 4 = 84 / 4 = 21.

So, the mean of this data set is 21.
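The same calculation can be sketched in Python. For the four values listed above, the sum is 84 and there are 4 values, so the result is 84 / 4 = 21. The function name `mean_by_hand` is made up for this example; `statistics.mean` from the standard library serves as a cross-check.

```python
import statistics


def mean_by_hand(values):
    """Add up all of the values, then divide by how many there are."""
    if not values:
        raise ValueError("mean of an empty dataset is undefined")
    return sum(values) / len(values)


data = [12, 18, 24, 30]  # dataset from the example above

print(sum(data), len(data))   # 84 4
print(mean_by_hand(data))     # 21.0

# Cross-check against the standard library.
print(statistics.mean(data) == mean_by_hand(data))  # True
```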

If you’re using a spreadsheet program like Microsoft Excel, you can use the AVERAGE function to calculate the mean. You can find AVERAGE among the Statistical functions on the Formulas tab, or type a formula directly into a cell, for example =AVERAGE(A1:A4) if your data happens to be in cells A1 through A4.

What is the mode?

The mode is the most common value in a dataset. To calculate the mode, simply sort the data and find the value that appears most often. The mode is not affected by outliers and can be used to get an idea of the general shape of a dataset.

However, it is important to note that a dataset can have more than one mode. This happens when two or more values are tied for the highest frequency. If every value appears only once, there is no meaningful mode at all.

This means that the mode is not always a good measure of central tendency.

How to calculate the mode

The mode can be calculated by hand or using a spreadsheet. To calculate the mode by hand, simply sort the data and find the value that appears most often.
For example, let’s say we have the following dataset:

  • 14
  • 20
  • 32
  • 20
  • 16
  • 20

To calculate the mode, we would sort the data and find that the value “20” appears most often. Therefore, the mode of this dataset is 20.
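Counting occurrences is easy to sketch in Python with `collections.Counter`. This is one possible approach rather than the only one; `statistics.mode` from the standard library is shown as a cross-check.

```python
import statistics
from collections import Counter

data = [14, 20, 32, 20, 16, 20]  # dataset from the example above

# Count how many times each value appears.
counts = Counter(data)
print(counts[20], counts[14])  # 3 1

# most_common(1) returns a list with the single most frequent (value, count) pair.
value, frequency = counts.most_common(1)[0]
print(value, frequency)  # 20 3

# Cross-check against the standard library.
print(statistics.mode(data))  # 20
```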

However, it is important to note that a dataset can have more than one mode. This means that the mode is not always a good measure of central tendency.
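When several values tie for the highest frequency, Python's `statistics.multimode` (available from Python 3.8) returns all of them. The datasets below are hypothetical and exist only to show the tie cases.

```python
import statistics

# Hypothetical dataset in which 20 and 32 both appear twice.
bimodal = [14, 20, 20, 16, 32, 32]

# multimode returns every value tied for the highest count,
# in the order they are first encountered.
print(statistics.multimode(bimodal))  # [20, 32]

# If every value appears exactly once, all of them are returned,
# which signals that the mode is not informative here.
all_unique = [1, 2, 3]
print(statistics.multimode(all_unique))  # [1, 2, 3]
```

Checking the length of the returned list is a simple way to detect a dataset with more than one mode before reporting a single "most common" value.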

To calculate the mode using a spreadsheet, select all of the cells that contain data, click on the Formulas tab, and then select MODE from the Statistical functions drop-down menu.

What is the range?

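Related to the mean, median, and mode is the range, which is simply the difference between the largest and smallest values. Unlike the other three, the range describes how spread out the data is rather than where its center lies.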

How to calculate the range

Spreadsheet programs such as Excel do not provide a built-in RANGE function. To calculate the range in a spreadsheet, subtract the result of the MIN function from the result of the MAX function, for example =MAX(A1:A6)-MIN(A1:A6) if your data happens to be in cells A1 through A6.

To calculate it by hand, simply subtract the smallest value from the largest value.
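In Python the subtraction can be sketched with the built-in `max` and `min` functions. The helper name `value_range` is made up for this example, and the data reuses the six values from the mode example earlier in the article.

```python
def value_range(values):
    """Return the largest value minus the smallest value."""
    if not values:
        raise ValueError("range of an empty dataset is undefined")
    return max(values) - min(values)


data = [14, 20, 32, 20, 16, 20]

print(max(data), min(data))  # 32 14
print(value_range(data))     # 18
```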

When are the mean, median and mode used?

All three of these measures are important in data science and computing. When choosing which measure to use, it is important to consider what insights you are hoping to gain from your data.

If you are looking for a general overview of your data, the median or mean might be a good choice. The mean works well when the data is fairly symmetric and has no extreme values, because it takes every value into account. The median is the better measure of central tendency when there are outliers in the data set, because it is not affected by them as much as the mean.

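The mode is a good measure to use when you are interested in finding the most common value in the data set. It is also the only one of the three that works for categorical data, such as colors or product names, where values cannot be added or sorted numerically.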

I hope this article has helped you to understand the difference between median, mean, and mode. As always, if you have any questions or comments, please feel free to reach out to us on our website or on social media. We would be happy to chat with you about your data!

Pierian Training