Mean of Columns in Data Frame in R

The mean() and colMeans() functions of base R are used to get the mean of columns in a data frame. mean() takes a numeric vector, such as a single column extracted from the data frame, while colMeans() takes the whole data frame and returns the mean of every column.

The mean of a column is calculated by summing its values and dividing by the number of values in it.
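As a quick check of that definition, the sketch below computes a mean by hand and compares it with mean(). The vector x and its values are made up for illustration and are not part of the tutorial's data.

```r
# Hypothetical values, used only to illustrate the definition
x <- c(2, 4, 6, 8)

# Mean by hand: sum of the values divided by how many there are
manual_mean <- sum(x) / length(x)

# Mean with the built-in function
builtin_mean <- mean(x)

# Compare the two results (all.equal tolerates floating-point rounding)
all.equal(manual_mean, builtin_mean)
```

Both approaches should agree, which is why mean() can be read as shorthand for sum(x) / length(x) when the vector has no missing values.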

In this tutorial, we will discuss how to get the mean of columns in a data frame in R using the mean() and colMeans() functions of base R and the summarise() function of the dplyr package.

Get Mean of Columns in R using mean()

Use the mean() function of base R to find the mean of a column of a data frame. The column is extracted from the data frame by name with the $ operator and passed to mean() as an argument.

Let’s practice with an example to calculate the mean of columns in R.

Create a data frame using the data.frame() function in R.

# Create a data frame
emp_info <- data.frame(
  id = c(1,2,3,4,5,6),
  name = c("Henry","Gary","Sam","Julie","Kim","Chris"),
  age = c(28,24,29,25,23,24),
  salary = c(4500,4000,5500,3650,3400,3800)
)
# Print data frame
emp_info

The above R code creates a data frame and prints it as below:

  id  name age salary
1  1 Henry  28   4500
2  2  Gary  24   4000
3  3   Sam  29   5500
4  4 Julie  25   3650
5  5   Kim  23   3400
6  6 Chris  24   3800

To get the mean of a column by its name, extract the column with the $ operator and pass it to the mean() function.

# Calculate mean using mean function, $ operator to access the column
mean(emp_info$salary)

The output of the above R code is:

[1] 4141.667
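The example data has no missing values, but real columns often do. The sketch below shows how mean() behaves when a column contains NA and how the na.rm argument changes that. The vector salary_with_na is a hypothetical example, not part of the tutorial's data frame.

```r
# Hypothetical salary values with one missing entry (NA)
salary_with_na <- c(4500, 4000, NA, 3650)

# Without na.rm, a single NA makes the whole result NA
mean_default <- mean(salary_with_na)
mean_default

# With na.rm = TRUE, missing values are dropped before averaging
mean_no_na <- mean(salary_with_na, na.rm = TRUE)
mean_no_na
```

With na.rm = TRUE the divisor is the number of non-missing values, not the number of rows.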

Calculate the Mean of a Column in a Data Frame using Index

Passing a column selected by its index to the mean() function returns the mean of that column.

Let’s use the above data frame for illustration purposes.

The following code selects the third column of the data frame, age, by its index and passes it to mean().

# Calculate mean using column index
mean(emp_info[,3])

The output of the above R code to find the mean of column age in the data frame is:

[1] 25.5
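The comma in emp_info[,3] matters. The sketch below, which rebuilds the tutorial's emp_info data frame so that it runs on its own, compares the two indexing forms. The variable names by_comma and by_single are introduced here only for illustration.

```r
# Rebuild the tutorial's data frame so this example is self-contained
emp_info <- data.frame(
  id = c(1, 2, 3, 4, 5, 6),
  name = c("Henry", "Gary", "Sam", "Julie", "Kim", "Chris"),
  age = c(28, 24, 29, 25, 23, 24),
  salary = c(4500, 4000, 5500, 3650, 3400, 3800)
)

# With a comma, a single column of a data frame comes back as a plain vector
by_comma <- emp_info[, 3]
is.numeric(by_comma)

# Without a comma, the result is still a one-column data frame
by_single <- emp_info[3]
is.data.frame(by_single)

# mean() works on the vector form
age_mean <- mean(by_comma)
age_mean

# mean(by_single) would return NA with a warning, because mean() expects
# a numeric vector rather than a data frame
```

If the one-column data frame form is what you have, colMeans(by_single) is one way to get the mean without extracting the vector first.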

How to Find the Mean of a Column in R using Double Square Brackets

Use double square brackets to extract a specific column from the data frame and pass it as an argument to the mean() function.

Let’s use the above emp_info data frame as an example to find the mean of the column in R.

# use double square bracket to access the salary column to find the mean of column
mean(emp_info[["salary"]])
[1] 4141.667

Calculate Column Mean in R using summarise() function

Using the summarise() function of the dplyr package, we can get the column means in R.

Let’s use the above data frame to find column means in R.

To use the summarise() function in R, install the dplyr package if it is not already installed.

Load the dplyr package with library() and use the summarise() function, which accepts a data frame and an expression naming the column whose mean we want to calculate.

# Calculate mean using the summarise function of dplyr package
library(dplyr)
summarise(emp_info, salary_mean = mean(salary))
  salary_mean
1    4141.667
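summarise() can also compute the mean of several columns in one call. The sketch below uses across(), which is assumed to be available (it was added in dplyr 1.0.0), on a copy of the tutorial's data without the name column. The variable col_means is introduced here for illustration.

```r
library(dplyr)

# Numeric columns of the tutorial's data frame, rebuilt so the example runs alone
emp_info <- data.frame(
  id = c(1, 2, 3, 4, 5, 6),
  age = c(28, 24, 29, 25, 23, 24),
  salary = c(4500, 4000, 5500, 3650, 3400, 3800)
)

# across() applies mean() to each of the listed columns
col_means <- summarise(emp_info, across(c(age, salary), mean))
col_means
```

The result is a one-row data frame with one column per summarised variable, which keeps it easy to use in a longer dplyr pipeline.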

Find the Mean of Multiple Columns in R using colMeans()

The colMeans() function of base R is used to calculate the mean of multiple columns at once. The data frame is passed as an argument.

The data frame must contain only numeric data; otherwise colMeans() will not calculate the means and displays an error message such as: Error in colMeans(emp_info) : 'x' must be numeric

Let’s create a data frame that is entirely numeric. The R code below creates a data frame using the data.frame() function with age, physics_marks, and math_marks as columns.

The student_info data frame contains only numeric data.

# Create a data frame
student_info <- data.frame(
    age = c(20,21,19,20,21,22),
    physics_marks = c(72,77,65,80,85,87),
    math_marks = c(45,67,85,70,65,77)
)
# Print data frame
student_info

Use the colMeans() function to find the mean of multiple columns of a data frame in R.

# Use colMeans() function and data frame passed as an argument
colMeans(student_info)

The above R code outputs the mean of each column:

          age physics_marks    math_marks
     20.50000      77.66667      68.16667
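When a data frame mixes numeric and non-numeric columns, as emp_info does, one possible workaround is to keep only the numeric columns before calling colMeans(). The sketch below rebuilds emp_info so that it runs on its own. The names numeric_cols and numeric_means are introduced here for illustration.

```r
# Rebuild the tutorial's mixed-type data frame
emp_info <- data.frame(
  id = c(1, 2, 3, 4, 5, 6),
  name = c("Henry", "Gary", "Sam", "Julie", "Kim", "Chris"),
  age = c(28, 24, 29, 25, 23, 24),
  salary = c(4500, 4000, 5500, 3650, 3400, 3800)
)

# Logical vector marking which columns are numeric (name is not)
numeric_cols <- sapply(emp_info, is.numeric)

# Keep only the numeric columns; drop = FALSE keeps a data frame
# even if only one column qualifies
numeric_means <- colMeans(emp_info[, numeric_cols, drop = FALSE])
numeric_means
```

Note that id is numeric too, so its mean is included. Drop identifier columns by name first if their mean is not meaningful.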

Conclusion

I hope the above article on finding the mean of a column in R using the mean(), summarise(), and colMeans() functions is helpful to you.

Use the colMeans() function to find the mean of multiple columns of a data frame in R. The data frame must contain only numeric columns to be used with the colMeans() function.