简体   繁体   English

使用 2 个时间变量将 Wide 重塑为 Long

[英]Reshape Wide to Long with 2 time variables

Though there is an abundance of 'wide to long' threads for R, I haven't found an answer that will help me with my issue.尽管 R 有大量“从宽到长”的线程,但我还没有找到可以帮助我解决问题的答案。 Any assistance is greatly appreciated!非常感谢任何帮助!

Example of my dataframe (in wide format):我的 dataframe 示例(宽格式):

CODE  NAME  M_2010_1  M_2011_1  M_2012_1  M_2010_3  M_2011_3  M_2012_3
  1     A      10        11        10        9         10       13
  12    B      11        13        15        15        14       11
  8     C       9         2         4        2         8         8

Desired dataframe (in long):所需 dataframe(长):

CODE  NAME  YEAR  M1  M3
  1    A    2010  10  9
  1    A    2011  11  10
  1    A    2012  10  13
 12    B    2010  11  15
 12    B    2011  13  14
 12    B    2012  15  11
  8    C    2010   9   2
  8    C    2011   2   8
  8    C    2012   4   8

Thanks in advance!提前致谢!

Data数据

df<-
structure(list(CODE = c(1L, 12L, 8L), NAME = c("A", "B", "C"), 
    M_2010_1 = c(10L, 11L, 9L), M_2011_1 = c(11L, 13L, 2L), M_2012_1 = c(10L, 
    15L, 4L), M_2010_3 = c(9L, 15L, 2L), M_2011_3 = c(10L, 14L, 
    8L), M_2012_3 = c(13L, 11L, 8L)), class = "data.frame", row.names = c(NA, 
-3L))

Code代码

library(dplyr)
library(tidyr)

df %>% 
  pivot_longer(cols = -c(CODE,NAME),names_sep = "_",names_to = c("aux1","YEAR","aux2")) %>% 
  unite(aux,aux1,aux2,sep = "") %>% 
  pivot_wider(names_from = aux,values_from = value)

Output Output

# A tibble: 9 x 5
   CODE NAME  YEAR     M1    M3
  <int> <chr> <chr> <int> <int>
1     1 A     2010     10     9
2     1 A     2011     11    10
3     1 A     2012     10    13
4    12 B     2010     11    15
5    12 B     2011     13    14
6    12 B     2012     15    11
7     8 C     2010      9     2
8     8 C     2011      2     8
9     8 C     2012      4     8

A one liner using reshape which allows to define all in one.一个使用reshape的单衬里,允许将所有内容合而为一。

reshape(dat, idv=1:2, var=list(3:5, 6:8), dir='long', timev='YEAR', times=2010:2012, v.n=c('M1', 'M2'))
#           CODE NAME YEAR M1 M2
# 1.A.2010     1    A 2010 10  9
# 12.B.2010   12    B 2010 11 15
# 8.C.2010     8    C 2010  9  2
# 1.A.2011     1    A 2011 11 10
# 12.B.2011   12    B 2011 13 14
# 8.C.2011     8    C 2011  2  8
# 1.A.2012     1    A 2012 10 13
# 12.B.2012   12    B 2012 15 11
# 8.C.2012     8    C 2012  4  8

Data:数据:

dat <- structure(list(CODE = c(1L, 12L, 8L), NAME = c("A", "B", "C"), 
    M_2010_1 = c(10L, 11L, 9L), M_2011_1 = c(11L, 13L, 2L), M_2012_1 = c(10L, 
    15L, 4L), M_2010_3 = c(9L, 15L, 2L), M_2011_3 = c(10L, 14L, 
    8L), M_2012_3 = c(13L, 11L, 8L)), class = "data.frame", row.names = c(NA, 
-3L))

We could do this in pivot_longer after we rearrange the substring in the column names在我们重新排列列名中的pivot_longer之后,我们可以在 pivot_longer 中执行此操作

library(dplyr)
library(stringr)
library(tidyr)
df1 %>% 
  rename_with(~ str_replace(.x, "_(\\d+)_(\\d+)", "\\2_\\1"), 
     starts_with("M_")) %>% 
  pivot_longer(cols = starts_with("M"), 
     names_to = c(".value", "year"), names_sep = "_")

-output -输出

# A tibble: 9 × 5
   CODE NAME  year     M1    M3
  <int> <chr> <chr> <int> <int>
1     1 A     2010     10     9
2     1 A     2011     11    10
3     1 A     2012     10    13
4    12 B     2010     11    15
5    12 B     2011     13    14
6    12 B     2012     15    11
7     8 C     2010      9     2
8     8 C     2011      2     8
9     8 C     2012      4     8

data数据

df1 <- structure(list(CODE = c(1L, 12L, 8L), NAME = c("A", "B", "C"), 
    M_2010_1 = c(10L, 11L, 9L), M_2011_1 = c(11L, 13L, 2L), M_2012_1 = c(10L, 
    15L, 4L), M_2010_3 = c(9L, 15L, 2L), M_2011_3 = c(10L, 14L, 
    8L), M_2012_3 = c(13L, 11L, 8L)), class = "data.frame", row.names = c(NA, 
-3L))

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM