Error in group_by function in dplyr

I've looked through the related dplyr questions, the R documentation, and attempted to sort through what I believe is a syntax misunderstanding.

Here is sample data that reflects the structure of my data.

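# id has length 20, so data.frame() recycles it to fill all 100 rows
id <- 1:20
xvar <- seq(from = 2.0, to = 6.0, length.out = 100)
yvar <- 1:100
binary <- sample(x = c(0, 1), size = 100, replace = TRUE)

# Bin yvar into ten intervals: (0,11], (11,21], ..., (91,100]
breaks <- c(0, 11, 21, 31, 41, 51, 61, 71, 81, 91, 100)
df <- data.frame(id, xvar, yvar, binary)
df <- transform(df, bin = cut(yvar, breaks))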

  id     xvar yvar binary    bin
1  1 2.000000    1      1 (0,11]
2  2 2.040404    2      0 (0,11]
3  3 2.080808    3      0 (0,11]
4  4 2.121212    4      0 (0,11]
5  5 2.161616    5      1 (0,11]
6  6 2.202020    6      0 (0,11]

I'd like to run the following to test, within each bin group, whether the mean of xvar differs significantly between the two levels of the binary variable.

library(dplyr)
pval <- df %>% group_by(bin) %>% summarise(p.value = t.test(xvar ~ factor(binary))$p.value)

However, I continue to get the error: "grouping factor must have exactly 2 levels"

I saw a similar post, but there the problem was how t.test was being run. I've run this same code using a different group_by variable and it worked just fine. The data type was a factor and everything.

Any thoughts? I also would appreciate critiques on how to improve the manner in which this question was posed.

You don't want to use dplyr for this. You want to fit a linear model.

mod <- lm(xvar ~ binary*bin, data=df)
anova(mod)

For further discussion of what the coefficients, P-values and sums of squares mean, consider asking on stats.SE.

I think I've resolved the issue.

The error "grouping factor must have exactly 2 levels" is raised by t.test when a group does not contain both levels of the grouping variable, which happens when a bin has too little data. I had assumed my original data set, which is large, would have enough data in every bin to avoid this.

When I made the sample data more robust, the error disappeared.
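
For anyone who hits the same error, here is a minimal sketch of how one might check which bins lack a level and skip them rather than abort. It rebuilds the sample data from the question, then deliberately sets binary to 0 throughout the first bin so the problem case is guaranteed to appear. The seed and the helper name safe_p are assumptions made for illustration, not part of the original code.

```r
library(dplyr)

set.seed(1)  # assumed seed, only so the sketch is reproducible
df <- data.frame(
  xvar   = seq(from = 2.0, to = 6.0, length.out = 100),
  yvar   = 1:100,
  binary = sample(x = c(0, 1), size = 100, replace = TRUE)
)
df$bin <- cut(df$yvar, breaks = c(0, 11, 21, 31, 41, 51, 61, 71, 81, 91, 100))

# Illustration only: give the first bin a single level of `binary`,
# which is the situation that triggers the error
df$binary[df$bin == "(0,11]"] <- 0

# Diagnostic: observations per bin and level; a zero marks a problem bin
counts <- table(df$bin, df$binary)
print(counts)

# Hypothetical helper: return NA unless both levels are present
# with at least two observations each, otherwise run the t-test
safe_p <- function(x, g) {
  tab <- table(g)
  if (length(tab) != 2 || any(tab < 2)) return(NA_real_)
  t.test(x ~ factor(g))$p.value
}

pval <- df %>%
  group_by(bin) %>%
  summarise(p.value = safe_p(xvar, binary))
print(pval)
```

Bins that cannot be tested come back with p.value set to NA instead of stopping the whole pipeline, so they are easy to spot and filter out afterwards.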

Sorry for the wasted time, and thank you for your help!
