R: Gather/Spread/Reshape 21 Columns Based on 21 Other Columns
I would like to create columns named after the values in some fields and populated by the values in other fields. For example, column1_time has the value "1030" and column1_status has the value "booked". I would like to pivot those into a new field, time1030, with the value "booked". There are 21 columns holding times (each time appears only once per row, so the times are unique across the 21 columns), and there are 21 columns holding statuses that map back to the time columns. These 42 time and status columns should be rearranged into one column per unique time, populated by that time's corresponding status.
I have data that looks like this:
I would like to use R's gather/spread or reshape2 (legacy) functionality to transpose this data to look like this:
I tinkered with gather and spread for a few hours but couldn't figure it out. I thought setting the key to ends_with('_time') and the value to ends_with('_status') might work, but it did not in my attempts.
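One possible reason the attempt above did not work (an assumption, since the exact call is not shown): in gather(), the key and value arguments are the names of the two new output columns, not selectors for input columns. The columns to stack are chosen in the remaining arguments, and all of them land in the same key/value pair. A minimal sketch on a hypothetical one-row data frame named toy:

```r
library(tidyr)

# Hypothetical one-row stand-in for the real data
toy <- data.frame(
  appointment1_time   = "1030",
  appointment1_status = "booked",
  stringsAsFactors = FALSE
)

# "column" and "value" are only names for the new columns;
# with no further selection, gather() stacks every column of toy.
long <- gather(toy, key = "column", value = "value")
long
```

Here the time and the status end up as two rows of the same value column, so a further step is needed to split the old column names and pair each time with its status.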
For a reproducible example of the data:
dat <- structure(list(appointment1_time = c("1030", "1030"), appointment2_time = c("1100",
"1100"), appointment3_time = c("1130", "1130"), appointment4_time = c("1200",
"1200"), appointment5_time = c("1230", "1230"), appointment6_time = c("0100",
"0100"), appointment7_time = c("0130", "0130"), appointment8_time = c("0200",
"0200"), appointment9_time = c("0230", "0230"), appointment10_time = c("0300",
"0300"), appointment11_time = c("0330", "0330"), appointment12_time = c("0400",
"0400"), appointment13_time = c("0430", "0430"), appointment14_time = c("0500",
"0500"), appointment15_time = c("0530", "0530"), appointment16_time = c("0600",
""), appointment17_time = c("0630", ""), appointment18_time = c("0700",
""), appointment19_time = c("0730", ""), appointment20_time = c(NA_character_,
NA_character_), appointment21_time = c(NA_character_, NA_character_
), appointment1_status = c("booked", "available"), appointment2_status = c("booked",
"available"), appointment3_status = c("booked", "available"),
appointment4_status = c("booked", "available"), appointment5_status = c("booked",
"available"), appointment6_status = c("booked", "available"
), appointment7_status = c("booked", "available"), appointment8_status = c("booked",
"available"), appointment9_status = c("booked", "available"
), appointment10_status = c("booked", "available"), appointment11_status = c("booked",
"available"), appointment12_status = c("available", "available"
), appointment13_status = c("available", "available"), appointment14_status = c("available",
"available"), appointment15_status = c("booked", "available"
), appointment16_status = c("available", ""), appointment17_status = c("available",
""), appointment18_status = c("available", ""), appointment19_status = c("available",
""), appointment20_status = c(NA_character_, NA_character_
), appointment21_status = c(NA_character_, NA_character_)), row.names = 1:2, class = "data.frame")
A solution using tidyverse.
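library(tidyverse)

# `dat` is the data frame built by the structure(...) call in the question

# Get the time order from the first row
ord <- dat %>% select(ends_with("time")) %>% slice(1) %>% unlist()
# Remove NA
ord <- ord[!is.na(ord)]

dat2 <- dat %>%
  # Keep track of the original rows
  rowid_to_column() %>%
  # Stack all 42 columns into Column/Value pairs
  gather(Column, Value, -rowid) %>%
  # Split e.g. "appointment1_time" into "appointment1" and "time"
  separate(Column, into = c("Apt", "time/status"), sep = "_") %>%
  # One row per appointment, with a time column and a status column
  spread(`time/status`, Value) %>%
  # Remove NA or "" in the status column
  filter(!is.na(status) & !status %in% "") %>%
  mutate(Apt = str_c("apt_slot", time, sep = "_")) %>%
  select(-time) %>%
  # One column per time slot, filled with its status
  spread(Apt, status) %>%
  select(-rowid) %>%
  # Reorder the columns
  select(str_c("apt_slot", ord, sep = "_"))

dat2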
#   apt_slot_1030 apt_slot_1100 apt_slot_1130 apt_slot_1200 apt_slot_1230 apt_slot_0100 apt_slot_0130
# 1        booked        booked        booked        booked        booked        booked        booked
# 2     available     available     available     available     available     available     available
#   apt_slot_0200 apt_slot_0230 apt_slot_0300 apt_slot_0330 apt_slot_0400 apt_slot_0430 apt_slot_0500
# 1        booked        booked        booked        booked     available     available     available
# 2     available     available     available     available     available     available     available
#   apt_slot_0530 apt_slot_0600 apt_slot_0630 apt_slot_0700 apt_slot_0730
# 1        booked     available     available     available     available
# 2     available          <NA>          <NA>          <NA>          <NA>
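As an alternative, the same reshaping can probably be written with pivot_longer() and pivot_wider(), which supersede gather() and spread() in tidyr 1.0.0 and later. This is a sketch rather than the answer's method. It runs on a small hypothetical data frame named toy with two appointment slots, and it assumes the column names follow the appointmentN_time / appointmentN_status pattern from the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical two-slot stand-in for the question's 42-column data
toy <- data.frame(
  appointment1_time   = c("1030", "1030"),
  appointment2_time   = c("1100", ""),
  appointment1_status = c("booked", "available"),
  appointment2_status = c("available", ""),
  stringsAsFactors = FALSE
)

wide <- toy %>%
  # Keep track of the original rows
  mutate(rowid = row_number()) %>%
  # ".value" turns the part after "_" into the columns `time` and `status`
  pivot_longer(
    cols = -rowid,
    names_to = c("apt", ".value"),
    names_sep = "_"
  ) %>%
  # Drop slots with a missing or empty status
  filter(!is.na(status), status != "") %>%
  select(-apt) %>%
  # One column per time, filled with that time's status
  pivot_wider(
    names_from = time,
    values_from = status,
    names_prefix = "apt_slot_"
  ) %>%
  select(-rowid)

wide
```

With this approach the output columns appear in the order in which the times are first encountered, so a separate reordering step may not be needed when the first row already lists the times in the desired order.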