How to sum data.frame column values?

I have a data frame with several columns; some numeric and some character. How do I compute the sum of a specific column? I've googled for this and I see numerous functions (sum, cumsum, rowsum, rowSums, colSums, aggregate, apply) but I can't make sense of it all.

For example, suppose I have a data frame people with the following columns:

people <- read.table(
  text = 
    "Name Height Weight
    Mary 65     110
    John 70     200
    Jane 64     115", 
  header = TRUE
)
…

How do I get the sum of all the weights?

You can just use sum(people$Weight).

sum sums up a vector, and people$Weight retrieves the Weight column from your data frame.

Note: you can get built-in help by using ?sum, ?colSums, etc. (By the way, colSums will give you the sum for each column.)
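
As a small sketch of the difference between the two: the data frame below is rebuilt by hand from the question's example, and the numeric_cols helper is only an illustrative name. colSums expects numeric input, so the character column is dropped first.

```r
# Rebuild the example data frame from the question.
people <- data.frame(
  Name   = c("Mary", "John", "Jane"),
  Height = c(65, 70, 64),
  Weight = c(110, 200, 115)
)

# Sum of a single column.
weight_total <- sum(people$Weight)

# colSums() needs numeric input, so keep only the numeric columns.
numeric_cols <- sapply(people, is.numeric)
col_totals   <- colSums(people[, numeric_cols])

print(weight_total)  # 425
print(col_totals)    # Height 199, Weight 425
```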

To sum the values in a data.frame column, you first need to extract them as a vector.

There are several ways to do it:

# $ operator
x <- people$Weight
x
# [1] 110 200 115

Or use [, ] indexing, as with a matrix:

x <- people[, 'Weight']
x
# [1] 110 200 115

Once you have the vector, you can use any vector-to-scalar function to aggregate it:

sum(people[, 'Weight'])
# [1] 425
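
For illustration, the same extracted vector can be passed to other summary functions as well. This is only a sketch; the data frame is rebuilt by hand from the question, and the variable names are arbitrary.

```r
# Rebuild the example data frame from the question.
people <- data.frame(
  Name   = c("Mary", "John", "Jane"),
  Height = c(65, 70, 64),
  Weight = c(110, 200, 115)
)

# [[ ]] is another way to pull a column out as a plain vector.
weights <- people[["Weight"]]

total   <- sum(weights)   # 425
largest <- max(weights)   # 200
average <- mean(weights)  # total divided by the number of rows
```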

If you have NA values in your data, specify the na.rm argument; otherwise sum returns NA:

sum(people[, 'Weight'], na.rm = TRUE)
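
A quick sketch of what na.rm changes, using a made-up vector with one missing value (not data from the question):

```r
# Hypothetical weights with one missing value.
weights <- c(110, NA, 115)

with_na    <- sum(weights)                # NA: a missing value propagates
without_na <- sum(weights, na.rm = TRUE)  # 225: the NA is dropped before summing
```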

You can also use the tidyverse package, which I find more readable:

library(tidyverse)
people %>% summarise(sum(Weight))

If the column contains NA values, then:

sum(as.numeric(JuneData1$Account.Balance), na.rm = TRUE)

To rank the columns by their sums:

order(colSums(people[, -1]), decreasing = TRUE)  ## drop the non-numeric Name column first

If you have more than 20 columns and want to leave out the first 4:

order(colSums(people[, 5:25]), decreasing = TRUE)  ## columns 1 to 4 are not summed
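
Note that order returns positions rather than the sums themselves. A minimal sketch, assuming the three-column people data frame from the question (rebuilt by hand here):

```r
# Rebuild the example data frame from the question.
people <- data.frame(
  Name   = c("Mary", "John", "Jane"),
  Height = c(65, 70, 64),
  Weight = c(110, 200, 115)
)

# Column totals for the numeric columns only.
totals <- colSums(people[, c("Height", "Weight")])

# order() gives the positions of the totals from largest to smallest.
idx <- order(totals, decreasing = TRUE)

# Index with those positions to get the totals themselves in ranked order.
ranked <- totals[idx]
print(ranked)  # Weight 425, Height 199
```

sort(totals, decreasing = TRUE) gives the same ranked vector in one step when the positions are not needed.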
