Pivot R df 使用列名作为值（在某些情况下连接）和列值作为新的列名

Question

I have a df structured as below我有一个结构如下的df

month=c('jan','feb','jan','feb','feb','feb'),
therapyA=c(NA,'in person',NA,'teleheatlh',NA,'in person'),
therapyB=c('in person','in person',NA,'teleheatlh',NA,'in person'),
therapyC=c(NA,'in person','telehealth','teleheatlh',NA,'in person'),
therapyD=c(NA,'in person',NA,'teleheatlh','telehealth','in person'))

organization month   therapyA   therapyB   therapyC   therapyD
1            A   jan       <NA>  in person       <NA>       <NA>
2            A   feb  in person  in person  in person  in person
3            B   jan       <NA>       <NA> telehealth       <NA>
4            B   feb teleheatlh teleheatlh teleheatlh teleheatlh
5            C   feb       <NA>       <NA>       <NA> telehealth
6            D   feb  in person  in person  in person  in person

I would like to pivot the df in such as way as to get the results: meaning I'd like to use the current values as column names and concatenate all current column names that correalte.我想 pivot df 以获取结果：这意味着我想使用当前值作为列名并连接所有相关的当前列名。 The results also need to stay grouped organization and month.结果还需要保持分组组织和月份。

1            A   jan                            TherapyB                                <NA>
2            A   feb TherapyA,TherapyB,TherapyC,TherapyD                                <NA>
3            B   jan                                <NA>                            TherapyC
4            B   feb                                <NA> TherapyA,TherapyB,TherapyC,TherapyD
5            C   feb                                <NA>                            TherapyD
6            D   feb TherapyA,TherapyB,TherapyC,TherapyD                                <NA>

I have tried using dplyr::pivot_longer unsuccessfully.我尝试使用 dplyr::pivot_longer 失败。 I have also tried using various base R operations, but was unable to get far without manually rewriting the whole df.我也尝试过使用各种基本的 R 操作，但如果不手动重写整个 df 就无法走得更远。 Thank you for any input.感谢您的任何意见。

Answer 1

We reshape first to 'long' format with pivot_longer , then group by the columns, paste ( toString ) the column names and reshape back to 'wide' format ( pivot_wider )我们首先使用pivot_longer将其整形为“long”格式，然后按列分组， paste （ toString ）列名并重新整形为“wide”格式（ pivot_wider ）

library(dplyr)
library(tidyr)
library(snakecase)
df %>% 
 mutate(rn = row_number()) %>% 
 pivot_longer(cols = starts_with('therapy'), values_drop_na = TRUE) %>% 
 group_by(rn, organization, month, value) %>%
 summarise(name = toString(to_upper_camel_case(name)), .groups = 'drop') %>% 
 pivot_wider(names_from = value, values_from = name) %>% 
 select(-rn)

-output -输出

# A tibble: 6 × 4
  organization month `in person`                            telehealth                            
  <chr>        <chr> <chr>                                  <chr>                                 
1 A            jan   TherapyB                               <NA>                                  
2 A            feb   TherapyA, TherapyB, TherapyC, TherapyD <NA>                                  
3 B            jan   <NA>                                   TherapyC                              
4 B            feb   <NA>                                   TherapyA, TherapyB, TherapyC, TherapyD
5 C            feb   <NA>                                   TherapyD                              
6 D            feb   TherapyA, TherapyB, TherapyC, TherapyD <NA>

data数据

df <- structure(list(organization = c("A", "A", "B", "B", "C", "D"), 
    month = c("jan", "feb", "jan", "feb", "feb", "feb"), therapyA = c(NA, 
    "in person", NA, "telehealth", NA, "in person"), therapyB = c("in person", 
    "in person", NA, "telehealth", NA, "in person"), therapyC = c(NA, 
    "in person", "telehealth", "telehealth", NA, "in person"), 
    therapyD = c(NA, "in person", NA, "telehealth", "telehealth", 
    "in person")), row.names = c(NA, -6L), class = "data.frame")

Pivot R df 使用列名作为值（在某些情况下连接）和列值作为新的列名

问题描述

1 个解决方案

解决方案1
0 已采纳 2022-08-09 18:35:40

data数据

Pivot R df 使用列名作为值（在某些情况下连接）和列值作为新的列名

问题描述

1 个解决方案

解决方案1 0 已采纳 2022-08-09 18:35:40

data数据

解决方案1
0 已采纳 2022-08-09 18:35:40