In R, compute relative frequency of binomial values, grouped by multiple columns, and create a new dataset with this 'summary'
Unable to create a grouped summary dataset in R
I'm having trouble creating grouped summary statistics.
Below is the code I'm using to create this summary dataset:
library(dplyr)
#sample dataset
D            A          B                 C          VAL  PD
Agriculture  Services   Bought with Cash  01OCT2014  10   0.4435714
Agriculture  Grain      Bought with Cash  01OCT2014  10   0.7266667
Agriculture  Livestock  Bought with Cash  01OCT2014  10   1.1372414
Agriculture  Fr, ve     Bought with Cash  01OCT2014  10   1.5170370
Agriculture  Livestock  Financed          01OCT2014  76   1.1372414
Agriculture  Fr, ve     Financed          01OCT2014  76   1.5170370
Agriculture  Grain      Financed          01OCT2014  76   0.7266667
Agriculture  Services   Financed          01OCT2014  76   0.4435714
Agriculture  Services   Insurance         01OCT2014  10   0.4435714
Agriculture  Livestock  Insurance         01OCT2014  10   1.1372414
groupDF<-select.other %>%
group_by(.dots=c("A","B","C")) %>%
summarize(PD=mean(PD),VAL=mean(VAL))
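I expected a dataset with the mean PD and mean VAL, grouped by A, B and C:
A         B                 C          PD  VAL
Services  Bought with Cash  01OCT2017  1   10
Instead, I'm getting a single row:
PD         VAL
0.8574816  6059877
Any help or guidance would be much appreciated.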
If the column names are given as strings, we can use group_by_at:
library(dplyr)
select.other %>%
group_by_at(vars(c("A","B","C"))) %>%
summarize(PD=mean(PD),VAL=mean(VAL))
# A tibble: 10 x 5
# Groups:   A, B [10]
#    A         B                C            PD   VAL
#    <chr>     <chr>            <chr>     <dbl> <dbl>
#  1 Fr, ve    Bought with Cash 01OCT2014 1.52     10
#  2 Fr, ve    Financed         01OCT2014 1.52     76
#  3 Grain     Bought with Cash 01OCT2014 0.727    10
#  4 Grain     Financed         01OCT2014 0.727    76
#  5 Livestock Bought with Cash 01OCT2014 1.14     10
#  6 Livestock Financed         01OCT2014 1.14     76
#  7 Livestock Insurance        01OCT2014 1.14     10
#  8 Services  Bought with Cash 01OCT2014 0.444    10
#  9 Services  Financed         01OCT2014 0.444    76
# 10 Services  Insurance        01OCT2014 0.444    10
Another option is to convert the strings to symbols with rlang::syms and then unquote-splice them with !!!:
select.other %>%
group_by(!!! rlang::syms(c("A","B","C"))) %>%
summarize(PD=mean(PD),VAL=mean(VAL))
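In dplyr 1.0.0 and later, the scoped `_at` verbs are superseded, and grouping by a character vector can also be written with across() and all_of(). The following is a minimal sketch under that assumption; `df` is a small stand-in for select.other built from a few of the sample rows, and `group_cols` is a hypothetical name, neither of which appears in the original post.

```r
library(dplyr)

# Stand-in for select.other: four of the sample rows from the question
df <- data.frame(
  A   = c("Services", "Grain", "Services", "Grain"),
  B   = c("Bought with Cash", "Bought with Cash", "Financed", "Financed"),
  C   = "01OCT2014",
  VAL = c(10L, 10L, 76L, 76L),
  PD  = c(0.4435714, 0.7266667, 0.4435714, 0.7266667),
  stringsAsFactors = FALSE
)

# Hypothetical name for the character vector of grouping columns
group_cols <- c("A", "B", "C")

res <- df %>%
  # all_of() selects exactly the columns named in group_cols
  group_by(across(all_of(group_cols))) %>%
  # .groups = "drop" returns an ungrouped result
  summarise(PD = mean(PD), VAL = mean(VAL), .groups = "drop")

print(res)
```

Because all_of() raises an error when a listed column is missing, a typo in `group_cols` fails loudly instead of silently changing the grouping.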
Data used in the examples above:
select.other <- structure(list(D = c("Agriculture", "Agriculture", "Agriculture",
"Agriculture", "Agriculture", "Agriculture", "Agriculture", "Agriculture",
"Agriculture", "Agriculture"), A = c("Services", "Grain", "Livestock",
"Fr, ve", "Livestock", "Fr, ve", "Grain", "Services", "Services",
"Livestock"), B = c("Bought with Cash", "Bought with Cash", "Bought with Cash",
"Bought with Cash", "Financed", "Financed", "Financed", "Financed",
"Insurance", "Insurance"), C = c("01OCT2014", "01OCT2014", "01OCT2014",
"01OCT2014", "01OCT2014", "01OCT2014", "01OCT2014", "01OCT2014",
"01OCT2014", "01OCT2014"), VAL = c(10L, 10L, 10L, 10L, 76L, 76L,
76L, 76L, 10L, 10L), PD = c(0.4435714, 0.7266667, 1.1372414,
1.517037, 1.1372414, 1.517037, 0.7266667, 0.4435714, 0.4435714,
1.1372414)), class = "data.frame", row.names = c(NA, -10L))
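A one-row result like the one in the question is also what you get when a grouped data frame is passed to plyr's summarize rather than dplyr's, which can happen if plyr is attached after dplyr. The question does not say whether plyr was loaded, so this is only an assumption about one possible cause. The sketch below checks which package the visible summarise comes from and calls the dplyr verbs with an explicit namespace; `df` is a small stand-in dataset, not the asker's data.

```r
library(dplyr)

# Stand-in data: four of the sample rows from the question
df <- data.frame(
  A   = c("Services", "Grain", "Services", "Grain"),
  B   = c("Bought with Cash", "Bought with Cash", "Financed", "Financed"),
  VAL = c(10L, 10L, 76L, 76L),
  PD  = c(0.4435714, 0.7266667, 0.4435714, 0.7266667),
  stringsAsFactors = FALSE
)

# Which package provides the summarise() that R currently finds?
# "dplyr" is expected here; "plyr" would indicate masking.
summarise_source <- environmentName(environment(summarise))
print(summarise_source)

# Namespace-qualified calls are not affected by masking
by_group <- df %>%
  dplyr::group_by(A, B) %>%
  dplyr::summarise(PD = mean(PD), VAL = mean(VAL), .groups = "drop")

print(by_group)
```

If masking does turn out to be the cause, qualifying the calls as above or attaching dplyr after plyr should restore one row per group.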