
R: Reshaping data as multiple columns into rows

I have a df with multiple columns; a template is shown below. I would like to reshape the columns into rows in R. I am sure this is possible with the tidyr::gather() function, but I cannot get it to work. I would be glad if someone could help!

Best wishes

# Df I have
              A1 A2 A3 A4  B1 B2 B3 B4  C1 C2 C3 C4  D1 D2 D3 D4
X1 X2 X3 X4   a  b  c  d   e  f  g  h   i  j  k  l
Y1 Y2 Y3 Y4   m  n  o  p
Z1 Z2 Z3 Z4   r  s  t  u   w  v  y  z


# Df I would like to reshape

            Col1 Col2 Col3 Col4
X1 X2 X3 X4   a   b    c   d
X1 X2 X3 X4   e   f    g   h
X1 X2 X3 X4   i   j    k   l
Y1 Y2 Y3 Y4   m   n    o   p
Z1 Z2 Z3 Z4   r   s    t   u
Z1 Z2 Z3 Z4   w   v    y   z

We could also do this with a single pivot_longer call:

library(dplyr)
library(tidyr)
library(stringr)
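# Split each name between its letter and its digit: the letter goes to "grp",
# the digit becomes the output column name via the ".value" sentinel
df %>% 
      pivot_longer(cols = -id,  names_to = c("grp", ".value"), 
            names_sep = "(?<=[A-Z])(?=[0-9])", values_drop_na = TRUE) %>% 
      select(-grp) %>%
      # prefix every column except id with "Col"
      rename_at(-1, ~ str_c('Col', .))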
# A tibble: 7 x 5
#     id Col1  Col2  Col3  Col4 
#  <int> <chr> <chr> <chr> <chr>
#1     1 a     b     c     d    
#2     1 e     f     g     h    
#3     1 i     j     k     l    
#4     2 m     n     o     p    
#5     2 q     <NA>  <NA>  <NA> 
#6     3 r     s     t     u    
#7     3 w     v     y     z    

data

df <- structure(list(id = 1:3, A1 = c("a", "m", "r"), A2 = c("b", "n", 
"s"), A3 = c("c", "o", "t"), A4 = c("d", "p", "u"), B1 = c("e", 
"q", "w"), B2 = c("f", NA, "v"), B3 = c("g", NA, "y"), B4 = c("h", 
NA, "z"), C1 = c("i", NA, NA), C2 = c("j", NA, NA), C3 = c("k", 
NA, NA), C4 = c("l", NA, NA), D1 = c(NA, NA, NA), D2 = c(NA, 
NA, NA), D3 = c(NA, NA, NA), D4 = c(NA, NA, NA)), class = "data.frame",
row.names = c("1", 
"2", "3"))
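
To make the role of the ".value" sentinel easier to see, here is a minimal sketch on a smaller, made-up frame (the name toy and its contents are assumptions for illustration, not the asker's data). It uses names_pattern with two capture groups instead of the look-around names_sep; the two should split the names the same way here, but that equivalence is my reading rather than something stated in the answer.

```r
library(tidyr)

# Hypothetical toy data: two letter groups (A, B) with two positions each
toy <- data.frame(
  id = 1:2,
  A1 = c("a", "m"), A2 = c("b", "n"),
  B1 = c("e", NA),  B2 = c("f", NA)
)

long <- pivot_longer(
  toy,
  cols = -id,
  # first capture group (the letter) fills the "grp" column;
  # second capture group (the digit) names the output columns via ".value"
  names_to = c("grp", ".value"),
  names_pattern = "([A-Z])([0-9])",
  # drop rows where every value column is NA (here: id 2, group B)
  values_drop_na = TRUE
)

long
```

Each (id, letter) pair becomes one row, and the digits become the column names `1` and `2`, which is why the answer then only needs to drop grp and add the "Col" prefix.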

I bet there are more elegant solutions, but this one uses tidyr and dplyr:

Suppose your data looks like

> df
# A tibble: 3 x 17
     id A1    A2    A3    A4    B1    B2    B3    B4    C1    C2    C3    C4    D1    D2    D3    D4   
  <dbl> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1     1 a     b     c     d     e     f     g     h     i     j     k     l     NA    NA    NA    NA   
2     2 m     n     o     p     q     NA    NA    NA    NA    NA    NA    NA    NA    NA    NA    NA   
3     3 r     s     t     u     w     v     y     z     NA    NA    NA    NA    NA    NA    NA    NA

I replaced your X1 X2 X3 X4, ... with an indexing column, and I added a q in column B1.

Using

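library(dplyr)
library(tidyr)

df %>%
  # keep only the trailing digit of each column name as "set"
  pivot_longer(cols=matches("\\d$"), 
               names_to = c("set"),
               names_pattern = ".(.)") %>%
  # one list-column per digit, holding the values of every letter group
  pivot_wider(names_from="set", 
              names_prefix="Col",
              values_fn = list) %>%
  # expand the list-columns into one row per letter group
  unnest(matches("\\d$")) %>%
  rowwise() %>%
  # drop rows in which every Col* value is NA
  filter(sum(is.na(c_across(matches("\\d$")))) != ncol(.) - 1)  # -1 because of the indexing column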

returns

# A tibble: 7 x 5
# Rowwise: 
     id Col1  Col2  Col3  Col4 
  <dbl> <chr> <chr> <chr> <chr>
1     1 a     b     c     d    
2     1 e     f     g     h    
3     1 i     j     k     l    
4     2 m     n     o     p    
5     2 q     NA    NA    NA   
6     3 r     s     t     u    
7     3 w     v     y     z 
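
The final rowwise() filter above only removes rows in which every Col* value is missing. As a hedged alternative, the same idea can probably be written without rowwise() by using dplyr::if_all() (available in dplyr 1.0.4 and later). The frame wide below is a made-up stand-in for the unnested intermediate result, not output from the answer.

```r
library(dplyr)

# Hypothetical stand-in for the result after unnest()
wide <- tibble(
  id   = c(1, 1, 2, 2),
  Col1 = c("a", NA, "m", "q"),
  Col2 = c("b", NA, "n", NA)
)

# Keep a row unless all of its Col* values are NA
kept <- wide %>%
  filter(!if_all(starts_with("Col"), is.na))

kept
```

A partially filled row such as q, NA is kept, which matches the behaviour of the rowwise() version, and the result is an ordinary (not rowwise) tibble.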

