Average of values in columns in dataframe?

I want to find the mean across a dataframe of values. For example, if I have the following data:

ID Value Status
1   10     A
2   15     B
3   20     A

I want to find the mean of all values that have status A. How would I do so?

Here is my attempt:

dataframe$balance.mean(dataframe$status == 'A')

But I keep getting the error "Error: attempt to apply non-function". Can anyone help me out? Thanks!

If I understood your requirement correctly, the following should do what you want:

# Recreate the example data
id <- c(1, 2, 3)
val <- c(10, 15, 20)
sta <- c("A", "B", "A")

df <- data.frame(id, val, sta)

# Mean of val over the rows where sta is "A"
mean(df$val[df$sta == "A"])

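Remember that () is used for function calls and [] is used for subsetting. In your attempt, dataframe$balance.mean(...) tries to call dataframe$balance.mean as a function, but the data frame has no such element (the $ lookup returns NULL), which gives the error message you see.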
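To make the distinction concrete, here is a minimal base R sketch using the example data from the question. The column names Value and Status are taken from the sample table; the helper names is_a and mean_a are made up for illustration.

```r
# Example data from the question
dataframe <- data.frame(
  ID = c(1, 2, 3),
  Value = c(10, 15, 20),
  Status = c("A", "B", "A")
)

# The comparison yields a logical vector with one entry per row
is_a <- dataframe$Status == "A"   # TRUE FALSE TRUE

# [] subsets Value with that logical vector; () then calls mean() on the result
mean_a <- mean(dataframe$Value[is_a])
mean_a   # 15
```

The same result can be written in one line as mean(dataframe$Value[dataframe$Status == "A"]).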

In a more general sense, for these kinds of things I like to use plyr, although data.table is another awesome option.

library(plyr)
ddply(dataframe, .(Status), summarize, mean_value = mean(Value))

This will yield a new data.frame with the mean of Value for each unique value of Status.
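If installing plyr is not an option, a similar per-group summary can be sketched in base R. This is an alternative to the ddply call, not part of the original answer; it assumes the column names Value and Status from the question, and group_means is an illustrative name.

```r
# Example data from the question
dataframe <- data.frame(
  ID = c(1, 2, 3),
  Value = c(10, 15, 20),
  Status = c("A", "B", "A")
)

# One row per Status, with the mean of Value in each group
group_means <- aggregate(Value ~ Status, data = dataframe, FUN = mean)
group_means

# tapply returns the same means as a named vector instead of a data.frame
tapply(dataframe$Value, dataframe$Status, mean)
```

With this sample data both groups happen to have a mean of 15 (A: (10 + 20) / 2, B: 15).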

As @PaulHiemstra alluded to, there is a clean data.table solution which would be:

library(data.table)
DT[Status == "A", mean(Value)]

where DT <- as.data.table(your_data_frame)


or you can set the key for quicker results:

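setkey(DT, Status)
# In data.table versions before 1.9.4 this produced a data.table, not a single number;
# from 1.9.4 on it returns a single number unless by = .EACHI is added
DT["A", mean(Value)]
# This produces a single number in either case
DT["A"][, mean(Value)]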
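For reference, the data.table variants can be put together into one self-contained sketch. It assumes the data.table package is installed and uses the column names Value and Status from the question's sample table; filtered_mean, keyed_mean and by_status are illustrative names.

```r
library(data.table)

# Example data from the question, converted to a data.table
dataframe <- data.frame(
  ID = c(1, 2, 3),
  Value = c(10, 15, 20),
  Status = c("A", "B", "A")
)
DT <- as.data.table(dataframe)

# Filter rows in i, compute the mean in j
filtered_mean <- DT[Status == "A", mean(Value)]

# Keyed lookup: subset by key first, then compute the mean
setkey(DT, Status)
keyed_mean <- DT["A"][, mean(Value)]

# Mean of Value for every Status at once
by_status <- DT[, .(mean_value = mean(Value)), by = Status]
```

Both filtered_mean and keyed_mean are 15 for this data. The keyed form mainly pays off on large tables that are queried repeatedly.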
