Measures of central tendency summarize an entire data set with a single value. The most common are the mean, median, and mode, which form the foundation of numerical analysis in statistics. This guide explains how to compute each of them in R.
Mean
The mean is the average of a dataset: the sum of all the values divided by the number of observations. There are two ways to calculate the mean in R.
Method 1: Using the mean() function
mean() is a function built into R.
Syntax
mean(x)
Argument
- x = numerical vector
Example: Calculating the mean of a vector using the mean() function
#creating a vector
x <- c(21,21,22.8,21.4,18.7,18.1,14.3,24.4)
#calculating mean
mean(x)
Output
[1] 20.2125
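Real data often contains missing values, and mean() returns NA if any element of the vector is NA. A minimal sketch, assuming a hypothetical vector x_na that is not part of the original example, shows how the na.rm argument of mean() drops missing values before averaging:

```r
# x_na is hypothetical illustration data with one missing value
x_na <- c(21, 21, 22.8, NA, 18.7)

# without na.rm, the missing value propagates to the result
mean_with_na <- mean(x_na)

# na.rm = TRUE removes NA values before the mean is computed
mean_without_na <- mean(x_na, na.rm = TRUE)

mean_with_na
mean_without_na
```

The first result is NA, while the second is the mean of the four non-missing values.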
Method 2: Using the sum() and length() functions
Mathematically, the mean is the sum of all the numerical values in the dataset divided by the number of values.
Syntax
sum(x)/length(x)
Components
- x = numerical vector
- sum() = built-in function that computes the sum of all the values in a vector
- length() = built-in function that returns the number of observations in a vector
Example: Calculating the mean of a vector using the sum() and length() functions
#creating a vector
x <- c(21,21,22.8,21.4,18.7,18.1,14.3,24.4)
#calculating mean
x_mean <- sum(x)/length(x)
x_mean
Output
[1] 20.2125
Median
The median is the middle value of a dataset arranged in ascending order. When the dataset has an even number of values, the median is the average of the two middle values. To calculate the median in R, use the median() function.
Syntax
median(x)
Argument
- x = numerical vector
Example: Calculating the median of a vector in R
#creating a vector
x <- c(12,15,23,25,27,13,14,15,15)
#calculating median
median(x)
Output
[1] 15
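To see what median() does behind the scenes, the definition can be sketched by hand: sort the values, then take the middle one (or average the two middle ones). The helper name manual_median() and the vector x_even are hypothetical, introduced only for illustration; in practice the built-in median() is the function to use.

```r
# manual_median() is a hypothetical helper written for illustration
manual_median <- function(v) {
  sorted <- sort(v)                      # arrange values in ascending order
  n <- length(sorted)
  if (n %% 2 == 1) {
    sorted[(n + 1) / 2]                  # odd count: the single middle value
  } else {
    mean(sorted[c(n / 2, n / 2 + 1)])    # even count: average the two middle values
  }
}

x <- c(12, 15, 23, 25, 27, 13, 14, 15, 15)   # 9 values (odd count)
x_even <- c(12, 15, 23, 25)                  # 4 values (even count)

manual_median(x)        # same result as median(x)
manual_median(x_even)   # same result as median(x_even)
```

For x_even the two middle values are 15 and 23, so both the helper and median() return their average.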
Mode
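The mode is the most frequent value in a data set. It is especially useful for nominal data, such as Yes or No responses. Base R has no built-in function for the statistical mode (the base mode() function returns an object's storage mode instead), so this guide uses the mlv() function.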
Installation
The mlv() function is provided by the modeest package. To install and load the package, run the following code:
install.packages("modeest")
library("modeest")
Syntax
mlv(x, method = 'mfv')
Arguments
- x = a vector
- method = the estimator to use; 'mfv' returns the most frequent value in the vector
Example: Calculating the mode of a vector in R
#creating a numerical vector
x <- c(12,15,23,25,27,13,14,15,15)
#creating a string vector
y <- c('ryan','mary','edwards','mary','erica','mary')
#calculating mode
mlv(x, method = 'mfv') #mode of vector x
mlv(y, method = 'mfv') #mode of vector y
Output
[1] 15
[1] "mary"
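If installing a package is not an option, the same most-frequent-value idea can be sketched in base R. The helper name get_modes() below is hypothetical and not part of R or modeest; it counts how often each distinct value occurs and returns every value tied for the highest count.

```r
# get_modes() is a hypothetical helper written for illustration
get_modes <- function(v) {
  uniq <- unique(v)                       # distinct values, in order of first appearance
  counts <- tabulate(match(v, uniq))      # how many times each distinct value occurs
  uniq[counts == max(counts)]             # keep every value with the highest count
}

x <- c(12, 15, 23, 25, 27, 13, 14, 15, 15)
y <- c('ryan', 'mary', 'edwards', 'mary', 'erica', 'mary')

get_modes(x)                  # 15 occurs three times
get_modes(y)                  # "mary" occurs three times
get_modes(c(1, 1, 2, 2, 3))   # a tie: both 1 and 2 are returned
```

Returning all tied values is a design choice in this sketch; a dataset can have more than one mode, and silently picking the first would hide that.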