Create a variable that lists the grouped unique values of several other variables sharing a name pattern

Question

My problem is an extension of this one:

Create a list of all values of a variable grouped by another variable in R

Let's say we have a data frame with restaurants and the meals they offer by type and course:

    food <- data.frame(course = c("starter", "starter", "starter", "main", "main", "main", "main", "main"),
                      food_type = c("salad", "salad", "salad", "fish", "fish", "pasta", "pasta", "pasta"),
                      restaurant = c("dining_palace", "delicious_kitchen", "food_cube", "dining_palace", "food_cube", "dining_palace", "delicious_kitchen", "food_cube"),
                      meal1 = c("cesar_salad", "green_salad", "green_salad", "codfish", "trout", "spaghetti", "farfalle", "macaroni"),
                      meal2 = c("coleslaw", "tomato_salad", NA, "salmon", "codfish", "tagliatelle", "penne", "farfalle"),
                      meal3 = c(NA, "coleslaw", NA, "tuna", NA, NA, "spaghetti", "ravioli"), stringsAsFactors = FALSE)

    food

       course food_type        restaurant       meal1        meal2     meal3
    1 starter     salad     dining_palace cesar_salad     coleslaw      <NA>
    2 starter     salad delicious_kitchen green_salad tomato_salad  coleslaw
    3 starter     salad         food_cube green_salad         <NA>      <NA>
    4    main      fish     dining_palace     codfish       salmon      tuna
    5    main      fish         food_cube       trout      codfish      <NA>
    6    main     pasta     dining_palace   spaghetti  tagliatelle      <NA>
    7    main     pasta delicious_kitchen    farfalle        penne spaghetti
    8    main     pasta         food_cube    macaroni     farfalle   ravioli

My aim is to generate a variable that contains a list of all meals by course and food type, independent of the restaurant offering them. Using the code from the link above with c(meal1, meal2, meal3) gives exactly the desired outcome:

    library(dplyr)

    selection_per_type <- food %>%
      group_by(course, food_type) %>%
      summarise(meals = paste(sort(unique(c(meal1, meal2, meal3))), collapse = ",")) %>%
      ungroup()

    selection_per_type

      course  food_type meals
      <chr>   <chr>     <chr>
    1 main    fish      codfish,salmon,trout,tuna
    2 main    pasta     farfalle,macaroni,penne,ravioli,spaghetti,tagliatelle
    3 starter salad     cesar_salad,coleslaw,green_salad,tomato_salad

However, I'm looking for a solution that scales to a larger number of meal variables, where listing them manually in c() is not practical. Since the first n letters of all target variable names are identical, I've tried some approaches with "pattern", "grepl" and "regexec", but nothing has worked so far. Any ideas on how to get this done?

Answer 1

If there are more columns, we can use pivot_longer to reshape the data to long format, selecting the meal columns by their common name prefix, and then group and summarise:

    library(dplyr)
    library(tidyr)
    library(stringr)

    food %>%
      # stack every column whose name starts with "meal" into one 'meal' column
      pivot_longer(cols = starts_with("meal"), values_to = "meal") %>%
      group_by(course, food_type) %>%
      # drop NAs, sort, de-duplicate, then collapse into one comma-separated string
      summarise(meals = str_c(unique(sort(na.omit(meal))), collapse = ","),
                .groups = "drop")

Output:

    # A tibble: 3 × 3
      course  food_type meals
      <chr>   <chr>     <chr>
    1 main    fish      codfish,salmon,trout,tuna
    2 main    pasta     farfalle,macaroni,penne,ravioli,spaghetti,tagliatelle
    3 starter salad     cesar_salad,coleslaw,green_salad,tomato_salad
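As a possible variation on the approach above, the reshaping step can be skipped by selecting the columns inside summarise. The following is only a sketch: it assumes dplyr 1.1.0 or later (for pick()) and that every column of interest starts with "meal". It rebuilds the food data frame from the question so that it runs on its own; the name selection_per_type is reused from the question for illustration.

```r
library(dplyr)

# Data frame from the question, repeated so the sketch is self-contained
food <- data.frame(
  course = c("starter", "starter", "starter", "main", "main", "main", "main", "main"),
  food_type = c("salad", "salad", "salad", "fish", "fish", "pasta", "pasta", "pasta"),
  restaurant = c("dining_palace", "delicious_kitchen", "food_cube", "dining_palace",
                 "food_cube", "dining_palace", "delicious_kitchen", "food_cube"),
  meal1 = c("cesar_salad", "green_salad", "green_salad", "codfish", "trout",
            "spaghetti", "farfalle", "macaroni"),
  meal2 = c("coleslaw", "tomato_salad", NA, "salmon", "codfish", "tagliatelle",
            "penne", "farfalle"),
  meal3 = c(NA, "coleslaw", NA, "tuna", NA, NA, "spaghetti", "ravioli"),
  stringsAsFactors = FALSE
)

selection_per_type <- food %>%
  group_by(course, food_type) %>%
  summarise(
    # pick() returns the selected columns for the current group as a tibble;
    # unlist() flattens them into one character vector.
    # sort() drops NA values by default, so no explicit na.omit() is needed.
    meals = paste(
      sort(unique(unlist(pick(starts_with("meal")), use.names = FALSE))),
      collapse = ","
    ),
    .groups = "drop"
  )

selection_per_type
```

This keeps the data in wide format; the pivot_longer version is likely easier to extend if the long-format data is needed for other steps as well.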

Answer 1: 0, accepted, 2022-07-21 15:20:47