
Is there a way to divide answers of group_by summary statistics in R?

I'm trying to subset data on three variables and then apply arithmetic to the summary statistics, but I'm having trouble with count(). Below are the summaries, built with dplyr's summarise(). I want the result returned as a percentage of the unfiltered X_age65yr count.
For example, the filtered count for Alabama is 1667 and the total count is 2411. I'd like Alabama, and every subsequent state, to return the filtered count divided by the total: 1667/2411 = 0.6914, or 69.14%.

cthigh <- brfss2013 %>% filter(bphigh4 == "Yes", !is.na(X_age65yr),X_age65yr == "Age 65 or older") %>%
   group_by(X_state) %>% summarise(count = n())

cthigh
# A tibble: 53 x 2
   X_state              count
   <fct>                <int>
 1 Alabama               1667
 2 Alaska                 507
 3 Arizona                930
 4 Arkansas              1352
 5 California            1817
 6 Colorado              2302
 7 Connecticut           1488
 8 Delaware              1123
 9 District of Columbia  1032
10 Florida               8924
# ... with 43 more rows

ctall <- brfss2013 %>% filter(!is.na(X_age65yr),X_age65yr == "Age 65 or older") %>% 
    group_by(X_state) %>% summarise(count= n())

ctall
# A tibble: 53 x 2
   X_state              count
   <fct>                <int>
 1 Alabama               2411
 2 Alaska                 864
 3 Arizona               1578
 4 Arkansas              2069
 5 California            3111
 6 Colorado              4067
 7 Connecticut           2362
 8 Delaware              1786
 9 District of Columbia  1683
10 Florida              14245
# ... with 43 more rows

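You can count the rows where bphigh4 == "Yes" and divide by the number of rows in each X_state to get the percentage. Note that the filter also drops rows where bphigh4 is missing, so the denominator can be smaller than the counts in ctall if bphigh4 contains NA values.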

library(dplyr)

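brfss2013 %>% 
  # keep respondents aged 65 or older with a non-missing bphigh4 answer
  filter(!is.na(X_age65yr) & !is.na(bphigh4), X_age65yr == "Age 65 or older") %>%
  group_by(X_state) %>% 
  # percentage of "Yes" answers among the remaining rows in each state
  summarise(count = sum(bphigh4 == "Yes")/n() * 100)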
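If the denominator should be the full ctall count (2411 for Alabama, as in the question), a minimal sketch follows. It uses a small made-up data frame, `toy`, standing in for brfss2013; the names `toy`, `pct_one_pass`, `cthigh_toy`, `ctall_toy`, and `pct_joined` are hypothetical, and the values are invented for illustration. It shows two routes that should agree: a single summarise() with `na.rm = TRUE`, and a join of the two per-state counts.

```r
library(dplyr)

# Hypothetical toy data standing in for brfss2013 (values are made up)
toy <- tibble(
  X_state   = c("A", "A", "A", "A", "B", "B", "B"),
  X_age65yr = c("Age 65 or older", "Age 65 or older", "Age 65 or older",
                "Age 18 to 64", "Age 65 or older", "Age 65 or older", NA),
  bphigh4   = c("Yes", "No", NA, "Yes", "Yes", "Yes", "No")
)

# Option 1: one pass; rows with missing bphigh4 stay in the denominator
pct_one_pass <- toy %>%
  filter(X_age65yr == "Age 65 or older") %>%  # filter() also drops NA ages
  group_by(X_state) %>%
  summarise(
    high  = sum(bphigh4 == "Yes", na.rm = TRUE),  # filtered count
    total = n(),                                  # unfiltered count
    pct   = high / total * 100
  )

# Option 2: build the two counts separately (as in the question), then join
cthigh_toy <- toy %>%
  filter(bphigh4 == "Yes", X_age65yr == "Age 65 or older") %>%
  count(X_state, name = "high")

ctall_toy <- toy %>%
  filter(X_age65yr == "Age 65 or older") %>%
  count(X_state, name = "total")

pct_joined <- ctall_toy %>%
  left_join(cthigh_toy, by = "X_state") %>%
  mutate(high = coalesce(high, 0L),   # states with no "Yes" rows get 0
         pct  = high / total * 100)
```

In this toy data, state A has one "Yes" out of three rows aged 65 or older (one of them with a missing bphigh4), so both options give 33.3%; state B gives 100%. The join route is mainly useful when the two summaries already exist as separate tables.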
