R: Combine several character columns into one by replacing NA rows

Question

I have a data frame consisting of character variables which looks like this:

   V1             V2   V3   V4   V5
1  ID           Date pic1 pic2 pic3
2   1 15.06.16 11:50  abc <NA>  def
3   1 16.06.16 11:19 <NA>  hij <NA>
4   1 17.06.16 11:41 <NA> <NA>  nop
5   2 28.05.16 11:40  tuv <NA> <NA>
6   2 29.05.16 11:39 <NA>  zab <NA>
7   2 30.05.16 09:07 <NA> <NA>  wxy
8   3 03.06.16 07:31  lmn <NA> <NA>
9   3 04.06.16 11:01 <NA>  rst <NA>
10  3 05.06.16 13:57 <NA> <NA>  opq

So on each day one of the pic variables contains a value and the rest are NA. Now I want to combine all pic values into one variable by replacing the NAs. Sorry if this is a duplicate; I've already tried a lot of suggested solutions, but nothing has worked so far. Thanks!

Answer 1

We can try with data.table. Convert the 'data.frame' to a 'data.table' with setDT(df1). Then, grouped by 'ID' and 'Date', we unlist the Subset of Data.table (.SD) and omit the NA elements (na.omit):

library(data.table)
setDT(df1)[, .(pic = na.omit(unlist(.SD))), by = .(ID, Date)]
#    ID           Date pic
# 1:  1 15.06.16 11:50 abc
# 2:  1 15.06.16 11:50 def
# 3:  1 16.06.16 11:19 hij
# 4:  1 17.06.16 11:41 nop
# 5:  2 28.05.16 11:40 tuv
# 6:  2 29.05.16 11:39 zab
# 7:  2 30.05.16 09:07 wxy
# 8:  3 03.06.16 07:31 lmn
# 9:  3 04.06.16 11:01 rst
#10:  3 05.06.16 13:57 opq
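
Note that this returns one row per non-NA value, so the 15.06.16 11:50 row, which holds two values in the sample data, appears twice. For readers without data.table, roughly the same reshaping can be sketched in base R. The two-row data frame df and the names pics, n, vals, and long below are illustrative assumptions, not part of the original answer:

```r
# Illustrative two-row subset mirroring the first rows of df1
df <- data.frame(ID = c(1L, 1L),
                 Date = c("15.06.16 11:50", "16.06.16 11:19"),
                 pic1 = c("abc", NA), pic2 = c(NA, "hij"), pic3 = c("def", NA),
                 stringsAsFactors = FALSE)
pics <- c("pic1", "pic2", "pic3")

# Number of non-NA pic values in each row
n <- rowSums(!is.na(df[pics]))

# Transpose so the values are read row by row, then drop the NAs
vals <- as.vector(t(as.matrix(df[pics])))

# Repeat each ID/Date pair once per non-NA value in that row
long <- data.frame(df[rep(seq_len(nrow(df)), n), c("ID", "Date")],
                   pic = vals[!is.na(vals)],
                   row.names = NULL, stringsAsFactors = FALSE)
print(long)
```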

Another option is pmax, if there is only a single non-NA value per row:

setDT(df1)[, pic := do.call(pmax, c(.SD, na.rm = TRUE)),
         .SDcols = pic1:pic3][, paste0("pic", 1:3) := NULL][]

Or using dplyr:

library(dplyr)
df1 %>%
     mutate(pic = pmax(pic1, pic2, pic3, na.rm = TRUE)) %>%
     select(-(pic1:pic3))
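
One caveat worth checking before relying on pmax: on character vectors it compares strings, so when a row holds more than one non-NA value it keeps the one that sorts last, not the first one. The sample data has such a row (15.06.16 11:50 contains both "abc" and "def"). A minimal base R sketch, using made-up vectors that mirror the first two rows:

```r
# Vectors mirroring the first two rows of df1 (illustrative only)
pic1 <- c("abc", NA)
pic2 <- c(NA, "hij")
pic3 <- c("def", NA)

# Row 1 has two non-NA values; pmax keeps the one that sorts last
res <- pmax(pic1, pic2, pic3, na.rm = TRUE)
print(res)
```

If every row really has exactly one non-NA value, this distinction does not matter.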

Data

df1 <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), Date = c("15.06.16 11:50", 
"16.06.16 11:19", "17.06.16 11:41", "28.05.16 11:40", "29.05.16 11:39", 
"30.05.16 09:07", "03.06.16 07:31", "04.06.16 11:01", "05.06.16 13:57"
), pic1 = c("abc", NA, NA, "tuv", NA, NA, "lmn", NA, NA), pic2 = c(NA, 
"hij", NA, NA, "zab", NA, NA, "rst", NA), pic3 = c("def", NA, 
"nop", NA, NA, "wxy", NA, NA, "opq")), .Names = c("ID", "Date", 
"pic1", "pic2", "pic3"), row.names = c(NA, -9L), class = "data.frame")

Answer 2

Assuming that "on each day one of the pic-variables contains a value, the rest is NA", you can use coalesce from dplyr to get what you want:

library(dplyr)
result <- df1 %>% mutate(pic = coalesce(pic1, pic2, pic3)) %>% 
                  select(-(pic1:pic3))

With the data supplied by akrun:

print(result)
##  ID           Date pic
##1  1 15.06.16 11:50 abc
##2  1 16.06.16 11:19 hij
##3  1 17.06.16 11:41 nop
##4  2 28.05.16 11:40 tuv
##5  2 29.05.16 11:39 zab
##6  2 30.05.16 09:07 wxy
##7  3 03.06.16 07:31 lmn
##8  3 04.06.16 11:01 rst
##9  3 05.06.16 13:57 opq
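
coalesce takes the first non-NA value across its arguments, which is why the first row shows "abc" even though pic3 also holds "def" there. As a rough sketch of that behavior in base R (not dplyr's actual implementation), assuming a hypothetical helper first_non_na and made-up vectors mirroring the first three rows:

```r
# Vectors mirroring the first three rows of df1 (illustrative only)
pic1 <- c("abc", NA, NA)
pic2 <- c(NA, "hij", NA)
pic3 <- c("def", NA, "nop")

# Hypothetical helper: keep x where it is present, otherwise fall back to y
first_non_na <- function(x, y) ifelse(is.na(x), y, x)

# Fold the helper over the columns from left to right
pic <- Reduce(first_non_na, list(pic1, pic2, pic3))
print(pic)
```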

Answer 1
1 2016-09-13 13:36:16

Answer 2
0 2016-09-13 23:40:00
