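R: pass a vector of grouping variables to purrr::map
Here is code that reads a remote dataset and prepares four summary tables, showing the count in each category of the demographic variables gender, education, race/ethnicity, and geographic region: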
suppressMessages(suppressWarnings(library(tidyverse)))

urlRemote_path <- "https://raw.githubusercontent.com/"
github_path <- "DSHerzberg/WEIGHTING-DATA/master/INPUT-FILES/"
fileName_path <- "data-input-sim.csv"

census_match_input <- suppressMessages(read_csv(url(
  str_c(urlRemote_path, github_path, fileName_path)
)))

var_order_census_match <- c("gender", "educ", "ethnic", "region")

census_match_cat_count_gender <- census_match_input %>%
  group_by(gender) %>%
  summarize(n_census = n()) %>%
  rename(demo_cat = gender) %>%
  mutate(demo_var = "gender") %>%
  relocate(demo_var, .before = demo_cat)

census_match_cat_count_educ <- census_match_input %>%
  group_by(educ) %>%
  summarize(n_census = n()) %>%
  rename(demo_cat = educ) %>%
  mutate(demo_var = "educ") %>%
  relocate(demo_var, .before = demo_cat)

census_match_cat_count_ethnic <- census_match_input %>%
  group_by(ethnic) %>%
  summarize(n_census = n()) %>%
  rename(demo_cat = ethnic) %>%
  mutate(demo_var = "ethnic") %>%
  relocate(demo_var, .before = demo_cat)

census_match_cat_count_region <- census_match_input %>%
  group_by(region) %>%
  summarize(n_census = n()) %>%
  rename(demo_cat = region) %>%
  mutate(demo_var = "region") %>%
  relocate(demo_var, .before = demo_cat)
I would like to use purrr::map() to consolidate this code. My idea is to iterate over the vector of variable names, like this:
census_match_cat_count <- var_order_census_match %>%
  map(~
    census_match_input %>%
      group_by(!!.x) %>%
      summarize(n_census = n()))
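This does not return the desired output; instead, it returns tables that lack the separate rows and counts for the categories under each demographic variable.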
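A likely reason for the collapsed tables, sketched on a small made-up tibble (`toy`, its columns, and its values are invented for illustration and are not taken from the remote file): `!!.x` inlines the character string itself, so `group_by()` sees a constant rather than a column and forms a single group. Converting the string to a symbol first gives one group per category.

```r
library(dplyr)

# Hypothetical stand-in for census_match_input
toy <- tibble(
  gender = c("female", "male", "female", "male", "female"),
  educ   = c("BA_plus", "HS_grad", "HS_grad", "no_HS", "BA_plus")
)

var_name <- "gender"

# The string is unquoted as a constant value: every row lands in one group
as_constant <- toy %>%
  group_by(!!var_name) %>%
  summarize(n_census = n())

# The string is first converted to a symbol, so it refers to the column
as_column <- toy %>%
  group_by(!!rlang::sym(var_name)) %>%
  summarize(n_census = n())

nrow(as_constant)  # a single row holding the total row count
nrow(as_column)    # one row per gender category
```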
In addition, when I try to extend the mapped function to include the rest of the code, like this:
census_match_cat_count <- var_order_census_match %>%
  map(
    ~
      census_match_input %>%
        group_by(!!.x) %>%
        summarize(n_census = n()) %>%
        rename(demo_cat = !!.x) %>%
        mutate(demo_var = .x) %>%
        relocate(demo_var, .before = demo_cat)
  )
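I get an error suggesting that I am not using the correct tidyeval procedure.
There are related threads on Stack Overflow, but none of them seems to address my specific problem, which is how to pass the variable names used by dplyr::group_by() inside purrr::map().
Thanks in advance for any help.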
You were close, but you need to convert the variable name to a symbol for use with group_by(). Note that in the code below count() is a shortcut for group_by() + summarise(n = n()).
library(dplyr)
library(purrr)

vars <- c("gender", "educ", "ethnic", "region")

vars %>%
  map(~ census_match_input %>%
        count(!!sym(.x)) %>%
        rename(demo_cat = !!.x) %>%
        mutate(demo_var = .x) %>%
        relocate(demo_var))
[[1]]
# A tibble: 2 x 3
demo_var demo_cat n
<chr> <chr> <int>
1 gender female 524
2 gender male 476
[[2]]
# A tibble: 4 x 3
demo_var demo_cat n
<chr> <chr> <int>
1 educ BA_plus 311
2 educ HS_grad 247
3 educ no_HS 133
4 educ some_college 309
[[3]]
# A tibble: 5 x 3
demo_var demo_cat n
<chr> <chr> <int>
1 ethnic asian 48
2 ethnic black 146
3 ethnic hispanic 252
4 ethnic other 64
5 ethnic white 490
[[4]]
# A tibble: 4 x 3
demo_var demo_cat n
<chr> <chr> <int>
1 region midwest 218
2 region northeast 173
3 region south 367
4 region west 242
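As a variation on the same idea, the `.data` pronoun can stand in for `!!sym(.x)`, and `count()` can name the category and count columns directly, which removes the separate `rename()` step. The sketch below runs on a small invented tibble (`toy` and its values are assumptions, not the real data) and also stacks the per-variable tables with `bind_rows()`, in case a single table is more convenient than a list.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for census_match_input
toy <- tibble(
  gender = c("female", "male", "female", "male", "female"),
  educ   = c("BA_plus", "HS_grad", "HS_grad", "no_HS", "BA_plus"),
  region = c("west", "south", "south", "west", "midwest")
)
vars <- c("gender", "educ", "region")

# One table per variable; .data[[.x]] looks the column up by its string name
counts_by_var <- vars %>%
  map(~ toy %>%
        count(demo_cat = .data[[.x]], name = "n_census") %>%
        mutate(demo_var = .x, .before = demo_cat))

# Optional: stack the list into one long table
counts_all <- bind_rows(counts_by_var)
counts_all
```

Keeping the list matches the four separate tables in the question; the stacked version has the same three columns and is easier to join or filter.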
You can reshape the dataset with pivot_longer and then count:
library(tidyverse)

census_match_input %>%
  # names_to is passed by name; newer tidyr versions do not accept it positionally
  pivot_longer(all_of(var_order_census_match), names_to = "demo_var", values_to = "demo_cat") %>%
  count(demo_var, demo_cat)
# A tibble: 15 x 3
demo_var demo_cat n
<chr> <chr> <int>
1 educ BA_plus 311
2 educ HS_grad 247
3 educ no_HS 133
4 educ some_college 309
5 ethnic asian 48
6 ethnic black 146
7 ethnic hispanic 252
8 ethnic other 64
9 ethnic white 490
10 gender female 524
11 gender male 476
12 region midwest 218
13 region northeast 173
14 region south 367
15 region west 242
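The output above is sorted alphabetically by demo_var, which differs from the order in var_order_census_match. If that order matters, one option is to turn demo_var into a factor whose levels follow the vector before counting. This is a minimal sketch on an invented tibble (`toy` and its values are assumptions), not a claim about what the original analysis requires.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for census_match_input
toy <- tibble(
  gender = c("female", "male", "female", "male", "female"),
  educ   = c("BA_plus", "HS_grad", "HS_grad", "no_HS", "BA_plus"),
  region = c("west", "south", "south", "west", "midwest")
)
vars <- c("gender", "educ", "region")

counts_ordered <- toy %>%
  pivot_longer(all_of(vars), names_to = "demo_var", values_to = "demo_cat") %>%
  # Factor levels follow `vars`, so count() sorts rows in that order
  mutate(demo_var = factor(demo_var, levels = vars)) %>%
  count(demo_var, demo_cat, name = "n_census")

counts_ordered
```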
You can also do this without non-standard evaluation by keeping the column names as character strings.
library(dplyr)

var_order_census_match <- c("gender", "educ", "ethnic", "region")

purrr::map(var_order_census_match,
           ~census_match_input %>%
             group_by_at(.x) %>%
             summarise(n = n()) %>%
             # all_of() avoids the tidyselect warning about selecting with an external vector
             rename(demo_cat = all_of(.x)) %>%
             mutate(demo_var = .x) %>%
             relocate(demo_var))
#[[1]]
# A tibble: 2 x 3
# demo_var demo_cat n
# <chr> <chr> <int>
#1 gender female 524
#2 gender male 476
#[[2]]
# A tibble: 4 x 3
# demo_var demo_cat n
# <chr> <chr> <int>
#1 educ BA_plus 311
#2 educ HS_grad 247
#3 educ no_HS 133
#4 educ some_college 309
#....
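Since dplyr 1.0.0 the scoped verbs such as group_by_at() are superseded by across(). A rough equivalent of the code above using across(all_of(.x)) might look like the following; `toy` and its values are invented for illustration.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for census_match_input
toy <- tibble(
  gender = c("female", "male", "female", "male", "female"),
  educ   = c("BA_plus", "HS_grad", "HS_grad", "no_HS", "BA_plus")
)
vars <- c("gender", "educ")

counts_by_var <- map(
  vars,
  ~ toy %>%
    # all_of() selects the column named by the string in .x
    group_by(across(all_of(.x))) %>%
    summarise(n = n(), .groups = "drop") %>%
    rename(demo_cat = all_of(.x)) %>%
    mutate(demo_var = .x) %>%
    relocate(demo_var)
)

counts_by_var[[1]]
```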