Understanding the difference between Mean, Median, and Mode

There are three main ways of measuring central tendency: the median, the mean, and the mode. Each has its own strengths and weaknesses, which is why data scientists choose among them depending on the dataset they are examining. In this article, we will explore what each measure is and how it can be used to gain insights into data.


What is the median?

The median is the value that falls in the middle of a dataset when it is sorted from smallest to largest. To calculate the median, sort the data and find the value in the middle. If there is an even number of values, the median is the average of the two middle values.

The median is not affected by outliers, which makes it a good choice for datasets with extreme values. It is also easy to calculate by hand, which makes it a good choice for small datasets.

How to calculate the median

The median can be calculated by hand for small datasets, or using a spreadsheet program or statistical software for larger datasets.

To calculate the median by hand, sort the data from smallest to largest and find the value in the middle. If there is an even number of values, the median is the average of the two middle values.
For example, if the dataset is {12, 13, 14, 15, 16}, there are five values, so the median is the third value, 14. If the dataset were {12, 13, 14, 15}, the median would be the average of the two middle values: (13+14)/2 = 13.5.
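
The sort-and-pick-the-middle procedure can be sketched in Python. This is a minimal illustration rather than a definitive implementation; the function name median_by_hand and the four-value dataset are assumptions made for the example, while statistics.median is the standard library's built-in equivalent.

```python
from statistics import median


def median_by_hand(values):
    """Return the middle value of the sorted data (hypothetical helper)."""
    if not values:
        raise ValueError("median of an empty dataset is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: the single middle value is the median.
        return ordered[mid]
    # Even number of values: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_data = [12, 13, 14, 15, 16]
even_data = [12, 13, 14, 15]  # assumed dataset to show the even case

print(median_by_hand(odd_data))   # 14
print(median_by_hand(even_data))  # 13.5
print(median(odd_data))           # the standard library gives the same result
```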

The median can also be easily calculated using a spreadsheet program or statistical software. Simply enter the data into the spreadsheet and use the built-in median function to find the value.

The median is a robust statistic, meaning that it is not affected by outliers in the data. This makes it a good choice when working with datasets that may contain outliers.
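
To see this robustness in practice, the short sketch below compares the median and the mean before and after one value is replaced by an extreme one. The second dataset is hypothetical and chosen only to make the effect visible.

```python
from statistics import mean, median

typical = [12, 13, 14, 15, 16]
# Hypothetical variant: the largest value is replaced by an outlier.
with_outlier = [12, 13, 14, 15, 160]

# The median is 14 in both cases, because the middle value has not moved.
print(median(typical), median(with_outlier))

# The mean moves from 14 to 42.8, pulled upward by the single extreme value.
print(mean(typical), mean(with_outlier))
```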

What is the mean?

The mean is calculated by adding up all of the values in a dataset and then dividing by the number of values. The mean is sensitive to outliers, which can pull it away from the typical value if there are extreme values in a dataset. However, this also means that the mean reflects every value in the dataset, so it can reveal changes that the median would miss.

How to calculate the mean

Calculating the mean is a simple process that can be done by hand or with a spreadsheet program. To calculate the mean by hand, simply add up all of the values in a dataset and then divide by the number of values.

For example, if you have the following dataset:

  • 12
  • 18
  • 24
  • 30

The mean would be calculated as follows:

(12 + 18 + 24 + 30) / 4 = 84 / 4 = 21.

So, the mean of this dataset is 21.
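
The same add-and-divide calculation can be sketched in Python. For the four values 12, 18, 24, and 30 the sum is 84, and dividing by 4 gives 21. The helper name mean_by_hand is an assumption for illustration; statistics.mean is the standard library function that does the same job.

```python
from statistics import mean


def mean_by_hand(values):
    """Sum the values and divide by how many there are (hypothetical helper)."""
    if not values:
        raise ValueError("mean of an empty dataset is undefined")
    return sum(values) / len(values)


data = [12, 18, 24, 30]

print(sum(data), len(data))  # 84 4
print(mean_by_hand(data))    # 21.0
print(mean(data))            # the standard library agrees
```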

If you’re using a spreadsheet program like Microsoft Excel, you can use the AVERAGE function to calculate the mean. Simply select all of the cells that contain data, click on the Formulas tab, and then select AVERAGE from the Statistical functions drop-down menu.

What is the mode?

The mode is the most common value in a dataset. To calculate the mode, simply sort the data and find the value that appears most often. The mode is not affected by outliers and can be used to get an idea of the general shape of a dataset.

However, it is important to note that a dataset can have more than one mode. This happens when two or more values are tied for the highest number of appearances.

This means that the mode is not always a good measure of central tendency.

How to calculate the mode

The mode can be calculated by hand or using a spreadsheet. To calculate the mode by hand, simply sort the data and find the value that appears most often.
For example, let’s say we have the following dataset:

  • 14
  • 20
  • 32
  • 20
  • 16
  • 20

To calculate the mode, we would sort the data and find that the value “20” appears most often. Therefore, the mode of this dataset is 20.
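
Counting how often each value appears can be sketched in Python with collections.Counter. This is one possible approach, not the only one; statistics.mode from the standard library returns the same answer directly.

```python
from collections import Counter
from statistics import mode

data = [14, 20, 32, 20, 16, 20]

# Tally how many times each value appears in the dataset.
counts = Counter(data)

# most_common(1) returns a list with the single most frequent (value, count) pair.
value, frequency = counts.most_common(1)[0]

print(value, frequency)  # 20 3
print(mode(data))        # 20
```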

If two values had tied for the most appearances, both would be modes. This is one reason the mode is not always a good measure of central tendency.
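
A dataset with more than one mode can be handled in Python with statistics.multimode (available in Python 3.8 and later), which returns every value tied for the highest count. The dataset below is hypothetical and was constructed so that two values tie.

```python
from statistics import multimode

# Hypothetical dataset in which 20 and 32 each appear twice.
two_modes = [14, 20, 20, 16, 32, 32]

# multimode returns all tied values, in the order they first appear.
print(multimode(two_modes))  # [20, 32]

# With a single most common value, the list has one entry.
print(multimode([14, 20, 32, 20, 16, 20]))  # [20]
```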

To calculate the mode using a spreadsheet, select all of the cells that contain data, click on the Formulas tab, and then select MODE from the Statistical functions drop-down menu.

What is the range?

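Related to the mean, median, and mode is the range, which is simply the difference between the largest and smallest values. Unlike the other three, it describes how spread out the data is rather than where its center lies.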

How to calculate the range

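Spreadsheet programs such as Microsoft Excel do not have a built-in RANGE function. To calculate the range using a spreadsheet, subtract the result of the MIN function from the result of the MAX function, for example =MAX(A1:A6)-MIN(A1:A6), where A1:A6 stands for the cells that contain your data.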

To calculate it by hand, simply subtract the smallest value from the largest value.
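
The subtraction can be sketched in Python with the built-in max and min functions. The helper name value_range is an assumption for illustration, and the dataset is reused from the mode example.

```python
def value_range(values):
    """Return the largest value minus the smallest (hypothetical helper)."""
    if not values:
        raise ValueError("range of an empty dataset is undefined")
    return max(values) - min(values)


data = [14, 20, 32, 20, 16, 20]

print(max(data), min(data))  # 32 14
print(value_range(data))     # 18
```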

When are the mean, median and mode used?

All three of these measures are important in data science and computing. When choosing which measure to use, it is important to consider what insights you are hoping to gain from your data.

If you are looking for a general overview of your data, the median or mean might be a good choice. The mean takes every value into account, which makes it a natural summary when the data has no extreme values. The median is a better measure of central tendency when there are outliers in the dataset because it is not affected by them as much as the mean.

The mode is a good measure to use when you are interested in finding the most common value in the dataset. Like the median, it is not affected by outliers.
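
One case where finding the most common value is especially useful is data that is not numeric, since such values cannot be added up or averaged. The sketch below uses a hypothetical list of shirt sizes; statistics.mode accepts strings as well as numbers.

```python
from statistics import mode

# Hypothetical non-numeric data: shirt sizes from six orders.
sizes = ["M", "L", "M", "S", "M", "L"]

# "M" appears three times, more often than any other size.
print(mode(sizes))  # M
```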

I hope this article has helped you to understand the difference between median, mean, and mode. As always, if you have any questions or comments, please feel free to reach out to us on our website or on social media. We would be happy to chat with you about your data!

Pierian Training