Pivot R df using colnames as values (concatenated in some cases) and column values as new colnames
I have a df structured as below:
df <- data.frame(organization=c('A','A','B','B','C','D'),
month=c('jan','feb','jan','feb','feb','feb'),
therapyA=c(NA,'in person',NA,'telehealth',NA,'in person'),
therapyB=c('in person','in person',NA,'telehealth',NA,'in person'),
therapyC=c(NA,'in person','telehealth','telehealth',NA,'in person'),
therapyD=c(NA,'in person',NA,'telehealth','telehealth','in person'))
organization month therapyA therapyB therapyC therapyD
1 A jan <NA> in person <NA> <NA>
2 A feb in person in person in person in person
3 B jan <NA> <NA> telehealth <NA>
4 B feb telehealth telehealth telehealth telehealth
5 C feb <NA> <NA> <NA> telehealth
6 D feb in person in person in person in person
I would like to pivot the df so that the current cell values ('in person', 'telehealth') become the column names, and each new cell holds the concatenated names of the original columns that had that value. The results also need to stay grouped by organization and month. The desired result is:
1 A jan TherapyB <NA>
2 A feb TherapyA,TherapyB,TherapyC,TherapyD <NA>
3 B jan <NA> TherapyC
4 B feb <NA> TherapyA,TherapyB,TherapyC,TherapyD
5 C feb <NA> TherapyD
6 D feb TherapyA,TherapyB,TherapyC,TherapyD <NA>
I have tried using tidyr::pivot_longer without success. I have also tried various base R operations, but could not get far without manually rewriting the whole df. Thank you for any input.
We first reshape to 'long' format with pivot_longer, then group by the row identifier, organization, month and value, paste (toString) the original column names together, and reshape back to 'wide' format (pivot_wider).
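library(dplyr)
library(tidyr)
library(snakecase)
df %>%
# keep a row id so that each original row stays a separate output row
mutate(rn = row_number()) %>%
# long format: one row per non-NA therapy cell (columns 'name' and 'value')
pivot_longer(cols = starts_with('therapy'), values_drop_na = TRUE) %>%
group_by(rn, organization, month, value) %>%
# capitalise the column names (therapyA -> TherapyA) and join them with ', '
summarise(name = toString(to_upper_camel_case(name)), .groups = 'drop') %>%
# wide format: one column per distinct value
pivot_wider(names_from = value, values_from = name) %>%
select(-rn)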
-output
# A tibble: 6 × 4
organization month `in person` telehealth
<chr> <chr> <chr> <chr>
1 A jan TherapyB <NA>
2 A feb TherapyA, TherapyB, TherapyC, TherapyD <NA>
3 B jan <NA> TherapyC
4 B feb <NA> TherapyA, TherapyB, TherapyC, TherapyD
5 C feb <NA> TherapyD
6 D feb TherapyA, TherapyB, TherapyC, TherapyD <NA>
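The output above separates the names with ", " because toString always collapses with a comma followed by a space, whereas the result requested in the question has no space after the comma. The following is a minimal sketch of one possible variant, not part of the original answer: it swaps toString for paste(..., collapse = ","), and it replaces the snakecase dependency with cap_first, a hypothetical helper defined here purely for illustration. It assumes R 4.0 or later, so that data.frame keeps the strings as character vectors.

```r
library(dplyr)
library(tidyr)

# Same example data as in the answer
df <- data.frame(
  organization = c("A", "A", "B", "B", "C", "D"),
  month        = c("jan", "feb", "jan", "feb", "feb", "feb"),
  therapyA     = c(NA, "in person", NA, "telehealth", NA, "in person"),
  therapyB     = c("in person", "in person", NA, "telehealth", NA, "in person"),
  therapyC     = c(NA, "in person", "telehealth", "telehealth", NA, "in person"),
  therapyD     = c(NA, "in person", NA, "telehealth", "telehealth", "in person")
)

# Hypothetical helper (not from any package): upper-case the first letter only
cap_first <- function(x) paste0(toupper(substr(x, 1, 1)), substring(x, 2))

result <- df %>%
  mutate(rn = row_number()) %>%
  pivot_longer(cols = starts_with("therapy"), values_drop_na = TRUE) %>%
  group_by(rn, organization, month, value) %>%
  # collapse with "," (no space) instead of toString()'s ", "
  summarise(name = paste(cap_first(name), collapse = ","), .groups = "drop") %>%
  pivot_wider(names_from = value, values_from = name) %>%
  select(-rn)

print(result)
```

With this variant the `in person` cell for organization A in feb reads TherapyA,TherapyB,TherapyC,TherapyD, which matches the format asked for in the question. Whether to drop snakecase is a matter of taste; to_upper_camel_case handles more naming styles than the simple helper does.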
df <- structure(list(organization = c("A", "A", "B", "B", "C", "D"),
month = c("jan", "feb", "jan", "feb", "feb", "feb"), therapyA = c(NA,
"in person", NA, "telehealth", NA, "in person"), therapyB = c("in person",
"in person", NA, "telehealth", NA, "in person"), therapyC = c(NA,
"in person", "telehealth", "telehealth", NA, "in person"),
therapyD = c(NA, "in person", NA, "telehealth", "telehealth",
"in person")), row.names = c(NA, -6L), class = "data.frame")