
R Statistical Function Failure with Data Frames

When I use functions such as mean() or sd() on a data frame, I get an 'argument is not numeric or logical' warning.

I created a simple data frame from two vectors to test this (i.e., to use a statistical function with a data frame).

str() gives the following:

'data.frame':   195 obs. of  2 variables:
 $ Births  : num  10.2 35.3 46 12.9 11 ...
 $ Internet: num  78.9 5.9 19.1 57.2 88 ...

Using the mean() function:

mean(frame2, na.rm=TRUE)

Gives:

Warning message:
In mean.default(frame2, na.rm = TRUE) :
  argument is not numeric or logical: returning NA

I've seen previous advice not to use mean() with a data frame, which is fine, but not the point.

I'm going through the O'Reilly R Cookbook, and it claims you should be able to use mean() and sd() with a data frame.

However, I can't make it work.

About your problem:

I don't have access to your book or other learning resources, but the best learning tool is R's built-in help. To see which argument types a function accepts, run ?mean, which says:

mean(x, trim = 0, na.rm = FALSE, ...)
Arguments

x   An R object. Currently there are methods for numeric/logical vectors and date, date-time and time interval objects. Complex vectors are allowed for trim = 0, only. 

As the help page explains, mean() is designed for vectors. Also, based on this question, I think your book is a little old. Check your R version and compare it with the version the book uses.
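As a minimal sketch of both points, the snippet below prints the running R version and reproduces the behaviour from the question. Here frame2 is only a stand-in rebuilt from the first five values that str() shows in the question, not the full 195-row data, and the behaviour described is what I would expect from a current R release.

```r
# Report the running R version so it can be compared with the book's.
print(R.version.string)

# Stand-in for the question's frame2 (first five values from its str() output).
frame2 <- data.frame(
  Births   = c(10.2, 35.3, 46, 12.9, 11),
  Internet = c(78.9, 5.9, 19.1, 57.2, 88)
)

# A data frame is a list of columns, not a numeric vector, so the call
# falls through to mean.default(), which warns and returns NA.
whole <- suppressWarnings(mean(frame2, na.rm = TRUE))
print(whole)

# Taking the mean of a single numeric column works as documented.
births_mean <- mean(frame2$Births, na.rm = TRUE)
print(births_mean)
```

If the book was written against a much older R release, a difference in how mean() treats data frames would be a plausible explanation for the mismatch.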


It works well for me in this example:

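# Build a 50-row data frame of random integers drawn from 1 to 100
dt <- data.frame(Births   = sample(1:100, 50),
                 Internet = sample(1:100, 50))

str(dt)          # both columns are int
mean(dt$Births)  # mean of a single column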

It still works if I convert the columns to num:

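# Same data frame, with the columns converted to numeric (num)
dt <- data.frame(Births   = as.numeric(sample(1:100, 50)),
                 Internet = as.numeric(sample(1:100, 50)))

str(dt)          # both columns are num
mean(dt$Births)  # mean of a single column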
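The examples above take the mean of one column at a time. To get a statistic for every column, one possible sketch in base R is to apply the function column by column with sapply(), or to use colMeans() for means specifically. The set.seed() call is my addition so the random sample is reproducible; it is not part of the original example.

```r
set.seed(42)  # assumption: fixed seed added only for reproducibility

dt <- data.frame(Births   = sample(1:100, 50),
                 Internet = sample(1:100, 50))

# Apply mean() and sd() to each column; the result is a named numeric vector.
col_means <- sapply(dt, mean, na.rm = TRUE)
col_sds   <- sapply(dt, sd, na.rm = TRUE)

# colMeans() gives the same means for an all-numeric data frame.
col_means2 <- colMeans(dt, na.rm = TRUE)

print(col_means)
print(col_sds)
```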

If you want to pass the whole data frame and get an overview in one go, you can use the summary() function:

summary(iris)

Two options: the first works only if all columns are numeric; the second summarizes just the numeric columns:

library(dplyr)  # provides the %>% pipe used below

dt %>% dplyr::summarise_all(mean)
dt %>% dplyr::summarise_if(is.numeric, mean)


  Births Internet
1  47.86    47.52
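For comparison, the "numeric columns only" idea can also be sketched in base R without dplyr. The data frame mixed and its Country column are hypothetical, made up here purely to show a non-numeric column being skipped.

```r
# Hypothetical data frame with one non-numeric column.
mixed <- data.frame(
  Country  = c("A", "B", "C"),
  Births   = c(10.2, 35.3, 46),
  Internet = c(78.9, 5.9, 19.1)
)

# Flag the numeric columns, then average only those.
numeric_cols  <- vapply(mixed, is.numeric, logical(1))
numeric_means <- colMeans(mixed[numeric_cols], na.rm = TRUE)

print(numeric_means)
```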
