
R reshape wide to long: multiple variables, observations with multiple indices

I have some data containing observations with multiple indices, $y_{ibc}$, stored in a messy wide format. I have been fiddling around with tidyr and reshape2 but could not figure it out (reshaping really is my nemesis).

Here is an example:

df <- structure(list(id = c(1, 2, 3, 4, 5, 6, 7, 8, 9), a1b1c1 = c(5, 
2, 1, 4, 3, 1, 0, 1, 3), a2b1c1 = c(3, 4, 1, 1, 3, 2, 1, 4, 4
), a3b1c1 = c(4, 0, 0, 1, 1, 1, 0, 0, 1), a1b2c1 = c(1, 0, 4, 
2, 4, 1, 0, 4, 2), a2b2c1 = c(2, 0, 1, 0, 1, 0, 3, 2, 0), a3b2c1 = c(2, 
4, 3, 0, 2, 3, 3, 3, 4), yc1 = c(1, 2, 2, 1, 2, 2, 2, 1, 1), a1b1c2 = c(4, 
2, 3, 0, 4, 4, 2, 1, 4), a2b1c2 = c(3, 0, 3, 3, 4, 4, 3, 2, 2
), a3b1c2 = c(3, 1, 0, 1, 4, 0, 2, 2, 3), a1b2c2 = c(2, 2, 0, 
3, 2, 1, 4, 1, 0), a2b2c2 = c(3, 0, 2, 3, 4, 4, 4, 0, 4), a3b2c2 = c(0, 
0, 0, 2, 0, 0, 1, 4, 3), yc2 = c(2, 2, 2, 1, 2, 2, 2, 1, 1), X = c(5, 
6, 3, 7, 4, 3, 2, 3, 2)), row.names = c(NA, -9L), class = c("tbl_df", 
"tbl", "data.frame"))

This is what I want (excerpt):

     id b     c         y    a1    a2    a3     X
1     1 b1    c1        1     5     3     4     5
2     1 b2    c1        1     1     2     2     5
3     1 b1    c2        2     4     3     3     5
4     1 b2    c2        2     2     3     0     5

Using tidyr and dplyr:

library(tidyverse)

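df %>% 
  # stack every a?b?c? column into name/value pairs
  pivot_longer(cols = matches("a.b.c."), names_to = "name", values_to = "value") %>% 
  # split e.g. "a1b2c1" by position into "a1", "b2", "c1"
  separate(name, into = c("a", "b", "c"), sep = c(2,4)) %>% 
  # pick the y column that belongs to each value of c
  mutate(y = case_when(c == "c1" ~ yc1,
                       c == "c2" ~ yc2)) %>% 
  # spread a1, a2, a3 back out into their own columns
  pivot_wider(names_from = a, values_from = value) %>% 
  # reorder the columns and drop yc1 and yc2
  select(id, b, c, y, a1, a2, a3, X)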

First, convert all the a/b/c columns to long format and split each column name into its three parts (a, b and c) as separate columns. Then combine the yc1 and yc2 columns into a single y column depending on the value of c, using mutate and case_when (if_else would also work for two options, but case_when extends more easily to more values). Finally, pivot the a values back to wide format and use select to put the columns in the right order and drop yc1 and yc2.
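As a possible variation on the steps above (not part of the original answer), the long-then-wide round trip can be skipped by using the ".value" sentinel in pivot_longer together with names_pattern, so that a1, a2 and a3 stay as columns while b and c become keys. This is only a sketch: the small data frame toy is made up from the first two rows of the example data, and the regular expression assumes every column is named a<digits>b<digits>c<digits>.

```r
library(dplyr)
library(tidyr)

# Toy data (assumption): the first two rows of the question's df
toy <- tibble(
  id = c(1, 2),
  a1b1c1 = c(5, 2), a2b1c1 = c(3, 4), a3b1c1 = c(4, 0),
  a1b2c1 = c(1, 0), a2b2c1 = c(2, 0), a3b2c1 = c(2, 4),
  yc1 = c(1, 2),
  a1b1c2 = c(4, 2), a2b1c2 = c(3, 0), a3b1c2 = c(3, 1),
  a1b2c2 = c(2, 2), a2b2c2 = c(3, 0), a3b2c2 = c(0, 0),
  yc2 = c(2, 2),
  X = c(5, 6)
)

long <- toy %>%
  pivot_longer(
    cols = matches("^a[0-9]+b[0-9]+c[0-9]+$"),
    # ".value" keeps the first captured group (a1, a2, a3) as column names;
    # the second and third groups become the key columns b and c
    names_to = c(".value", "b", "c"),
    names_pattern = "(a[0-9]+)(b[0-9]+)(c[0-9]+)"
  ) %>%
  # same y lookup as in the answer above
  mutate(y = case_when(c == "c1" ~ yc1,
                       c == "c2" ~ yc2)) %>%
  select(id, b, c, y, a1, a2, a3, X)

print(long)
```

With the full df from the question in place of toy, this should give one row per combination of id, b and c, the same shape as the desired output.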
