Is there a way to divide answers of group_by summary statistics in R?
I'm trying to subset data from three vectors and then apply arithmetic to the summary statistics, but I'm having issues with count(). Below is the summary using summarise() from dplyr, but I want it returned as a percentage of the unfiltered X_age65yr count.
For example, the filtered count for Alabama is 1667 and the total count is 2411. I'd like Alabama, and every other state, to return the filtered count divided by the total: 1667/2411 = 0.6914, or 69.14%.
cthigh <- brfss2013 %>%
  filter(bphigh4 == "Yes", !is.na(X_age65yr), X_age65yr == "Age 65 or older") %>%
  group_by(X_state) %>%
  summarise(count = n())
cthigh
# A tibble: 53 x 2
   X_state              count
   <fct>                <int>
 1 Alabama               1667
 2 Alaska                 507
 3 Arizona                930
 4 Arkansas              1352
 5 California            1817
 6 Colorado              2302
 7 Connecticut           1488
 8 Delaware              1123
 9 District of Columbia  1032
10 Florida               8924
# ... with 43 more rows
ctall <- brfss2013 %>%
  filter(!is.na(X_age65yr), X_age65yr == "Age 65 or older") %>%
  group_by(X_state) %>%
  summarise(count = n())
ctall
# A tibble: 53 x 2
   X_state              count
   <fct>                <int>
 1 Alabama               2411
 2 Alaska                 864
 3 Arizona               1578
 4 Arkansas              2069
 5 California            3111
 6 Colorado              4067
 7 Connecticut           2362
 8 Delaware              1786
 9 District of Columbia  1683
10 Florida              14245
# ... with 43 more rows
You can count the number of rows where bphigh4 == "Yes" and divide it by the number of rows in each X_state to get the ratio.
library(dplyr)
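brfss2013 %>%
  # keep respondents aged 65+ with a non-missing bphigh4
  filter(!is.na(X_age65yr) & !is.na(bphigh4), X_age65yr == "Age 65 or older") %>%
  group_by(X_state) %>%
  # percentage of "Yes" among the remaining rows in each state
  summarise(count = sum(bphigh4 == "Yes") / n() * 100)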
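To check the logic without the full BRFSS data, here is a minimal sketch on a made-up data frame. The `toy` data and the names `toy65`, `one_pass`, `high_counts`, `all_counts`, `two_tables`, `pct_all_rows`, `pct_answered`, and `pct` are assumptions for illustration and do not appear in the original post. It compares two possible denominators: filtering out missing bphigh4 values first divides by the answered rows only, whereas the question's ctall counts every 65+ row, so the two percentages can differ when bphigh4 contains NA. It also shows one way the two separate count tables from the question could be joined and divided.

```r
library(dplyr)

# Hypothetical toy data standing in for brfss2013 (all values are invented)
toy <- data.frame(
  X_state   = c("A", "A", "A", "A", "B", "B", "B", "B"),
  X_age65yr = c("Age 65 or older", "Age 65 or older", "Age 65 or older",
                "Age 18 to 64", "Age 65 or older", "Age 65 or older",
                "Age 65 or older", NA),
  bphigh4   = c("Yes", "No", NA, "Yes", "Yes", "Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

# Keep only the 65+ rows, as in the question
toy65 <- toy %>%
  filter(!is.na(X_age65yr), X_age65yr == "Age 65 or older")

# Approach 1: a single summarise() with two possible denominators
one_pass <- toy65 %>%
  group_by(X_state) %>%
  summarise(
    # denominator = every 65+ row, including missing bphigh4 (like ctall)
    pct_all_rows = sum(bphigh4 == "Yes", na.rm = TRUE) / n() * 100,
    # denominator = only rows with a non-missing bphigh4
    pct_answered = mean(bphigh4 == "Yes", na.rm = TRUE) * 100
  )

# Approach 2: build the two count tables, join them by state, then divide
high_counts <- toy65 %>%
  filter(bphigh4 == "Yes") %>%
  count(X_state, name = "n_high")
all_counts <- toy65 %>%
  count(X_state, name = "n_all")
two_tables <- left_join(all_counts, high_counts, by = "X_state") %>%
  mutate(pct = n_high / n_all * 100)

print(one_pass)
print(two_tables)
```

In this sketch, state "A" has one "Yes" out of three 65+ rows, one of which is NA, so the two denominators give different percentages. With the join approach, a state that has no "Yes" rows at all would get NA for `n_high` after the `left_join()`, so that case would need separate handling.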