R reshape wide to long: multiple variables, observations with multiple indices
I have some data containing observations with multiple indices $y_{ibc}$, stored in a messy wide format. I have been fiddling around with tidyr and reshape2 but could not figure it out (reshaping really is my nemesis).
Here is an example:
df <- structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9), a1b1c1 = c(5,
2, 1, 4, 3, 1, 0, 1, 3), a2b1c1 = c(3, 4, 1, 1, 3, 2, 1, 4, 4
), a3b1c1 = c(4, 0, 0, 1, 1, 1, 0, 0, 1), a1b2c1 = c(1, 0, 4,
2, 4, 1, 0, 4, 2), a2b2c1 = c(2, 0, 1, 0, 1, 0, 3, 2, 0), a3b2c1 = c(2,
4, 3, 0, 2, 3, 3, 3, 4), yc1 = c(1, 2, 2, 1, 2, 2, 2, 1, 1), a1b1c2 = c(4,
2, 3, 0, 4, 4, 2, 1, 4), a2b1c2 = c(3, 0, 3, 3, 4, 4, 3, 2, 2
), a3b1c2 = c(3, 1, 0, 1, 4, 0, 2, 2, 3), a1b2c2 = c(2, 2, 0,
3, 2, 1, 4, 1, 0), a2b2c2 = c(3, 0, 2, 3, 4, 4, 4, 0, 4), a3b2c2 = c(0,
0, 0, 2, 0, 0, 1, 4, 3), yc2 = c(2, 2, 2, 1, 2, 2, 2, 1, 1), X = c(5,
6, 3, 7, 4, 3, 2, 3, 2)), row.names = c(NA, -9L), class = c("tbl_df",
"tbl", "data.frame"))
This is what I want (excerpt):
  id b  c  y a1 a2 a3 X
1  1 b1 c1 1  5  3  4 5
2  1 b2 c1 1  1  2  2 5
3  1 b1 c2 2  4  3  3 5
4  1 b2 c2 2  2  3  0 5
Using tidyr and dplyr:
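library(tidyverse)

df %>%
  # stack the twelve a*b*c* columns into name/value pairs
  pivot_longer(cols = matches("a.b.c."), names_to = "name", values_to = "value") %>%
  # split e.g. "a1b2c1" after characters 2 and 4 into "a1", "b2", "c1"
  separate(name, into = c("a", "b", "c"), sep = c(2, 4)) %>%
  # pick the y value that belongs to this row's c
  mutate(y = case_when(c == "c1" ~ yc1,
                       c == "c2" ~ yc2)) %>%
  # spread a1, a2, a3 back into their own columns
  pivot_wider(names_from = a, values_from = value) %>%
  select(id, b, c, y, a1, a2, a3, X)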
First, convert all the a/b/c columns to long format and split each column name into its three parts (a, b and c), one per column. Then combine the two y columns into one, depending on the value of c, using mutate and case_when (if_else would also work for two options, but case_when extends more easily to more values). Finally, pivot the a columns back to wide format and use select to put the columns in the right order and drop yc1 and yc2.
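As a possible alternative to the long-then-wide round trip, here is a minimal sketch that lets pivot_longer do the split in one step by using the special ".value" entry in names_to together with names_pattern. This is not the method from the answer above, only a variant under stated assumptions: tidyr 1.0.0 or later, column names that always follow the a<digits>b<digits>c<digits> pattern, and exactly two y columns (so if_else is enough). The name df_small is hypothetical; it holds the first two rows of the example data so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)

# First two rows of the example data (ids 1 and 2), rebuilt by hand.
# df_small is a hypothetical name used only for this sketch.
df_small <- data.frame(
  id = c(1, 2),
  a1b1c1 = c(5, 2), a2b1c1 = c(3, 4), a3b1c1 = c(4, 0),
  a1b2c1 = c(1, 0), a2b2c1 = c(2, 0), a3b2c1 = c(2, 4),
  yc1 = c(1, 2),
  a1b1c2 = c(4, 2), a2b1c2 = c(3, 0), a3b1c2 = c(3, 1),
  a1b2c2 = c(2, 2), a2b2c2 = c(3, 0), a3b2c2 = c(0, 0),
  yc2 = c(2, 2),
  X = c(5, 6)
)

long <- df_small %>%
  pivot_longer(
    cols = matches("^a[0-9]+b[0-9]+c[0-9]+$"),
    # ".value": the first captured group (a1, a2, a3) becomes the column name;
    # the second and third groups become the values of b and c
    names_to = c(".value", "b", "c"),
    names_pattern = "^(a[0-9]+)(b[0-9]+)(c[0-9]+)$"
  ) %>%
  # two options only, so if_else suffices; case_when scales to more
  mutate(y = if_else(c == "c1", yc1, yc2)) %>%
  select(id, b, c, y, a1, a2, a3, X)

print(long)
```

For id 1 this should give the same four rows as the excerpt in the question. The trade-off is that the regular expression carries the whole naming convention, so it needs adjusting if the column names change shape.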