Combine multiple columns from a single dataframe in R into a new dataframe as one column in long form

My dataframe looks like this:

ID  FMR  FML  KPR  KPL
----------------------
a    1   NA   NA   1
b    0   NA    1   0
c    NA   1   NA   NA
d    1   NA    0   NA
e    NA   1   NA   NA

I need to make a new dataframe that looks like this:

ID   FM   KP
-------------
a    1    NA
b    0     1
c    NA   NA
d    1     0
e    NA   NA
a    NA    1
b    NA    0
c    1    NA
d    NA   NA
e    1    NA

What would be the best way to do this?

The tidyverse (dplyr's coalesce() specifically) is your friend here...


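library(dplyr)

# Take the first non-missing value of FMR/FML for each ID
fm <- df %>%
  transmute(ID,
            FM = coalesce(FMR, FML))

# Take the first non-missing value of KPR/KPL for each ID
kp <- df %>%
  transmute(ID,
            KP = coalesce(KPR, KPL))

# Stack the two results; a column missing from one side is filled with NA
fm %>%
  bind_rows(kp)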

# A tibble: 10 x 3
   ID       FM    KP
   <chr> <dbl> <dbl>
 1 a         1    NA
 2 b         0    NA
 3 c         1    NA
 4 d         1    NA
 5 e         1    NA
 6 a        NA     1
 7 b        NA     1
 8 c        NA    NA
 9 d        NA     0
10 e        NA    NA

But are you sure you want that output? You would be losing information on whether KP is coming from KPR or KPL, for example.
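If the goal is the stacked layout shown in the question rather than one coalesced value per ID, a minimal sketch along the same lines could stack the R columns on top of the L columns. It assumes the column names from the question; side is a hypothetical helper column, added here only to record which source column each row came from.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  ID  = c("a", "b", "c", "d", "e"),
  FMR = c(1L, 0L, NA, 1L, NA),
  FML = c(NA, NA, 1L, NA, 1L),
  KPR = c(NA, 1L, NA, 0L, NA),
  KPL = c(1L, 0L, NA, NA, NA)
)

# Build one block per side instead of coalescing the two sides.
# `side` is a hypothetical column name, not part of the original data.
right <- df %>% transmute(ID, side = "R", FM = FMR, KP = KPR)
left  <- df %>% transmute(ID, side = "L", FM = FML, KP = KPL)

# R rows first, then L rows, as in the requested layout
stacked <- bind_rows(right, left)
stacked
```

Dropping side afterwards with select(-side) leaves only the ID, FM and KP columns from the question.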

From the example shared, it can be seen that the first two characters of each column name identify which output column a particular value goes to. We can use that in tidyr's pivot_longer.

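# '.value' makes the captured part of each name (its first two characters,
# matched by '(..)') the name of an output column
tidyr::pivot_longer(df, cols = -ID,
                    names_to = '.value',
                    names_pattern = '(..)')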


#   ID       FM    KP
#   <chr> <int> <int>
# 1 a         1    NA
# 2 a        NA     1
# 3 b         0     1
# 4 b        NA     0
# 5 c        NA    NA
# 6 c         1    NA
# 7 d         1     0
# 8 d        NA    NA
# 9 e        NA    NA
#10 e         1    NA

This will work for any number of columns as long as the column name pattern holds.
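A possible variation, sketched under the assumption that the third character of each column name (R or L) marks the side: capturing it as a second group keeps that information in its own column. The column name side is hypothetical, and the final arrange() is only there to reproduce the R-then-L row order from the question.

```r
library(tidyr)
library(dplyr)

# Same data as in the question
df <- data.frame(
  ID  = c("a", "b", "c", "d", "e"),
  FMR = c(1L, 0L, NA, 1L, NA),
  FML = c(NA, NA, 1L, NA, 1L),
  KPR = c(NA, 1L, NA, 0L, NA),
  KPL = c(1L, 0L, NA, NA, NA)
)

long <- df %>%
  pivot_longer(cols = -ID,
               # first group -> output column name, second group -> `side`
               names_to = c(".value", "side"),
               names_pattern = "(..)(.)") %>%
  # optional: all R rows first, then all L rows, each ordered by ID
  arrange(desc(side), ID)
long
```

Without the arrange() step the rows stay grouped by ID, as in the output above.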

data

df <- structure(list(ID = c("a", "b", "c", "d", "e"), FMR = c(1L, 0L, 
NA, 1L, NA), FML = c(NA, NA, 1L, NA, 1L), KPR = c(NA, 1L, NA, 
0L, NA), KPL = c(1L, 0L, NA, NA, NA)), class = "data.frame", 
row.names = c(NA, -5L))
