R pass vector of grouping vars to purrr::map

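Here is code that reads a remote data set and prepares four summary tables, each showing the count of every category within one demographic variable (gender, education, race/ethnicity, and geographic region):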

suppressMessages(suppressWarnings(library(tidyverse)))

urlRemote_path  <- "https://raw.githubusercontent.com/"
github_path <- "DSHerzberg/WEIGHTING-DATA/master/INPUT-FILES/"
fileName_path   <- "data-input-sim.csv"

census_match_input <- suppressMessages(read_csv(url(
  str_c(urlRemote_path, github_path, fileName_path)
)))

var_order_census_match  <- c("gender", "educ", "ethnic", "region")

census_match_cat_count_gender <- census_match_input %>%
  group_by(gender) %>%
  summarize(n_census = n()) %>%
  rename(demo_cat = gender) %>%
  mutate(demo_var = "gender") %>%
  relocate(demo_var, .before = demo_cat)

census_match_cat_count_educ <- census_match_input %>%
  group_by(educ) %>%
  summarize(n_census = n()) %>%
  rename(demo_cat = educ) %>%
  mutate(demo_var = "educ") %>%
  relocate(demo_var, .before = demo_cat)

census_match_cat_count_ethnic <- census_match_input %>%
  group_by(ethnic) %>%
  summarize(n_census = n()) %>%
  rename(demo_cat = ethnic) %>%
  mutate(demo_var = "ethnic") %>%
  relocate(demo_var, .before = demo_cat)

census_match_cat_count_region <- census_match_input %>%
  group_by(region) %>%
  summarize(n_census = n()) %>%
  rename(demo_cat = region) %>%
  mutate(demo_var = "region") %>%
  relocate(demo_var, .before = demo_cat)

I would like to consolidate this code using purrr::map(). My idea is to iterate over the vector of variable names, like this:

census_match_cat_count <- var_order_census_match %>% 
  map(~
        census_match_input %>%
        group_by(!!.x) %>%
        summarize(n_census = n()))

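This does not return the desired output; instead, it returns tables that lack the separate rows and counts for the categories under each demographic variable.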

Moreover, when I try to extend the mapped function to include the rest of the code, like this:

census_match_cat_count <- var_order_census_match %>%
  map(
    ~
      census_match_input %>%
      group_by(!!.x) %>%
      summarize(n_census = n()) %>%
      rename(demo_cat = !!.x) %>%
      mutate(demo_var = .x) %>%
      relocate(demo_var, .before = demo_cat)
  )

I get an error indicating that I am not using the correct tidyeval procedure.

There are related threads on Stack Overflow, but none seems to address my specific problem: how to pass the variable names used by dplyr::group_by() inside purrr::map().

Thanks in advance for your help.

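You're almost there, but you need to convert the variable names to symbols before using them with group_by(). Note that in the code below, count() is a shortcut for group_by() + summarise(n = n()).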

library(dplyr)
library(purrr)

vars <- c("gender", "educ", "ethnic", "region")

vars %>%
  map(~ census_match_input %>%
         count(!!sym(.x)) %>%
         rename(demo_cat = !!.x) %>%
         mutate(demo_var = .x) %>%
         relocate(demo_var))

[[1]]
# A tibble: 2 x 3
  demo_var demo_cat     n
  <chr>    <chr>    <int>
1 gender   female     524
2 gender   male       476

[[2]]
# A tibble: 4 x 3
  demo_var demo_cat         n
  <chr>    <chr>        <int>
1 educ     BA_plus        311
2 educ     HS_grad        247
3 educ     no_HS          133
4 educ     some_college   309

[[3]]
# A tibble: 5 x 3
  demo_var demo_cat     n
  <chr>    <chr>    <int>
1 ethnic   asian       48
2 ethnic   black      146
3 ethnic   hispanic   252
4 ethnic   other       64
5 ethnic   white      490

[[4]]
# A tibble: 4 x 3
  demo_var demo_cat      n
  <chr>    <chr>     <int>
1 region   midwest     218
2 region   northeast   173
3 region   south       367
4 region   west        242
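
To see why the symbol conversion matters, here is a minimal sketch on a small made-up data frame (`toy` and its values are illustrative assumptions, not the poster's data). Injecting the bare string with `!!` makes group_by() group by a constant, whereas `!!sym()` refers to the column that the string names.

```r
library(dplyr)
library(rlang)

# Hypothetical toy data standing in for census_match_input
toy <- tibble(
  gender = c("female", "male", "female"),
  region = c("west", "south", "west")
)

var <- "gender"

# Injecting the string itself groups by the constant "gender",
# so every row falls into a single group
by_string <- toy %>%
  group_by(!!var) %>%
  summarize(n_census = n())

# Converting the string to a symbol first groups by the gender column
by_symbol <- toy %>%
  group_by(!!sym(var)) %>%
  summarize(n_census = n())

nrow(by_string)  # one row holding the total count
nrow(by_symbol)  # one row per gender category
```

This would also explain the error in the longer version from the question: after grouping by a constant, the summarized table has no column called gender for rename() to find.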

You can use pivot_longer to reshape the data set and then count:

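library(tidyverse)
census_match_input %>% 
    # naming names_to avoids relying on argument position, which differs across tidyr versions
    pivot_longer(all_of(var_order_census_match), names_to = "demo_var", values_to = "demo_cat") %>%
    count(demo_var, demo_cat)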

    # A tibble: 15 x 3
       demo_var demo_cat         n
       <chr>    <chr>        <int>
     1 educ     BA_plus        311
     2 educ     HS_grad        247
     3 educ     no_HS          133
     4 educ     some_college   309
     5 ethnic   asian           48
     6 ethnic   black          146
     7 ethnic   hispanic       252
     8 ethnic   other           64
     9 ethnic   white          490
    10 gender   female         524
    11 gender   male           476
    12 region   midwest        218
    13 region   northeast      173
    14 region   south          367
    15 region   west           242

You can also do this without non-standard evaluation by keeping the column names as character strings.

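library(dplyr)

var_order_census_match  <- c("gender", "educ", "ethnic", "region")

purrr::map(var_order_census_match, 
         ~census_match_input %>%
              # across(all_of(.x)) is the current replacement for the superseded group_by_at(.x)
              group_by(across(all_of(.x))) %>%
              summarise(n = n()) %>%
              # all_of() makes explicit that .x holds a column name
              rename(demo_cat = all_of(.x)) %>%
              mutate(demo_var = .x) %>%
              relocate(demo_var))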


#[[1]]
# A tibble: 2 x 3
#  demo_var demo_cat     n
#  <chr>    <chr>    <int>
#1 gender   female     524
#2 gender   male       476

#[[2]]
# A tibble: 4 x 3
#  demo_var demo_cat         n
#  <chr>    <chr>        <int>
#1 educ     BA_plus        311
#2 educ     HS_grad        247
#3 educ     no_HS          133
#4 educ     some_college   309
#....
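
The map()-based answers return a list of tibbles. If a single table is preferred, the pieces can be stacked with bind_rows(). The following is a minimal sketch under stated assumptions: `toy` and `vars` are made-up stand-ins for census_match_input and var_order_census_match, and the `.data[[v]]` pronoun is used here as one more way to look up a column from a string. It is not taken from any of the answers above.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data standing in for census_match_input
toy <- tibble(
  gender = c("female", "male", "female", "male"),
  region = c("west", "south", "west", "west")
)
vars <- c("gender", "region")

combined <- vars %>%
  map(function(v) {
    toy %>%
      # .data[[v]] looks up the column whose name is stored in the string v
      count(demo_cat = .data[[v]], name = "n_census") %>%
      # record which variable the categories belong to, as the first column
      mutate(demo_var = v, .before = demo_cat)
  }) %>%
  # stack the per-variable tibbles into one table
  bind_rows()

combined
```

Unlike the pivot_longer() approach, which sorts demo_var alphabetically, this keeps the variables in the order given by the input vector.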
