Fill gaps in a column based on matching rows in two columns in R

Question

In df2 I would like to fill the gaps in column d based on records that match on columns b and c between the two data frames. What would be a quick and elegant way to do that? Importantly, it should also work when the matching rows sit at different positions in the two data frames.

df1 <- data.frame(a = c(1,1,1,1,1,2,2,2,2,2), b = rep(seq(41,45,1), each=2), c = c(101:105,101:105), d = LETTERS[seq(from = 1, to = 10)])
df2 <- data.frame(a = c(1,1,1,1,1,2,2,2,2,2), b = rep(seq(41,45,1), each=2), c = c(101:105,101:105), d = c(LETTERS[seq(from = 1, to = 6)], rep(NA,4)))

> df1
   a  b   c d
1  1 41 101 A
2  1 41 102 B
3  1 42 103 C
4  1 42 104 D
5  1 43 105 E
6  2 43 101 F
7  2 44 102 G
8  2 44 103 H
9  2 45 104 I
10 2 45 105 J
> df2
   a  b   c    d
1  1 41 101    A
2  1 41 102    B
3  1 42 103    C
4  1 42 104    D
5  1 43 105    E
6  2 43 101    F
7  2 44 102 <NA>
8  2 44 103 <NA>
9  2 45 104 <NA>
10 2 45 105 <NA>

The result should be the following:

   a  b   c d
1  1 41 101 A
2  1 41 102 B
3  1 42 103 C
4  1 42 104 D
5  1 43 105 E
6  2 43 101 F
7  2 44 102 G
8  2 44 103 H
9  2 45 104 I
10 2 45 105 J

Answer 1

While you can do lookups with match and perhaps %in%, I think another (robust) way to do it is with a merge/join:

df2mod <- merge(df2, df1[,c('b','c','d')], by = c("b", "c"), all=TRUE)
df2mod
#     b   c a  d.x d.y
# 1  41 101 1    A   A
# 2  41 102 1    B   B
# 3  42 103 1    C   C
# 4  42 104 1    D   D
# 5  43 101 2    F   F
# 6  43 105 1    E   E
# 7  44 102 2 <NA>   G
# 8  44 103 2 <NA>   H
# 9  45 104 2 <NA>   I
# 10 45 105 2 <NA>   J
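
A side note on the join type: all=TRUE requests a full outer join, so a (b, c) pair that exists only in df1 would come back as an extra row, and merge also sorts the result by the key columns. If only the rows of df2 should survive, in their original order, a left join is one possible variant. The sketch below is not part of the original answer; row_id and df1_extra are hypothetical names introduced for illustration, and stringsAsFactors = FALSE is an assumption made to keep d as character.

```r
# Example data from the question, with d kept as character (assumption)
df1 <- data.frame(a = c(1,1,1,1,1,2,2,2,2,2), b = rep(seq(41,45,1), each = 2),
                  c = c(101:105, 101:105), d = LETTERS[1:10],
                  stringsAsFactors = FALSE)
df2 <- df1
df2$d[7:10] <- NA

# Hypothetical extra record that exists only in df1
df1_extra <- rbind(df1, data.frame(a = 3, b = 46, c = 106, d = "K",
                                   stringsAsFactors = FALSE))

# Hypothetical helper column that remembers the original row order of df2
df2$row_id <- seq_len(nrow(df2))

# Left join: keep every row of df2, add d from df1 where (b, c) matches
left <- merge(df2, df1_extra[, c("b", "c", "d")], by = c("b", "c"), all.x = TRUE)
left <- left[order(left$row_id), ]   # undo the sorting done by merge

# Full outer join for comparison: also keeps the df1-only record
full <- merge(df2, df1_extra[, c("b", "c", "d")], by = c("b", "c"), all = TRUE)

nrow(left)   # same number of rows as df2
nrow(full)   # one more row, from the df1-only record
```

Either way the result carries both d.x and d.y, so the fill-in step is unchanged.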

In this case, d.x holds the original df2$d and d.y holds the value looked up from df1. Because your data are factors, some extra steps are necessary (as.character, then re-creating the factor).

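df2mod$d <- with(df2mod, ifelse(is.na(d.x), as.character(d.y), as.character(d.x)))
# re-create the factor only if d was one (data.frame() gives character columns by default since R 4.0.0)
if (is.factor(df1$d)) df2mod$d <- factor(df2mod$d, levels = levels(df1$d))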
df2mod
#     b   c a  d.x d.y d
# 1  41 101 1    A   A A
# 2  41 102 1    B   B B
# 3  42 103 1    C   C C
# 4  42 104 1    D   D D
# 5  43 101 2    F   F F
# 6  43 105 1    E   E E
# 7  44 102 2 <NA>   G G
# 8  44 103 2 <NA>   H H
# 9  45 104 2 <NA>   I I
# 10 45 105 2 <NA>   J J
df2mod[,c("d.x", "d.y")] <- NULL # cleanup unnecessary columns
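
For comparison, the match-based lookup mentioned at the top could look roughly like the sketch below. It is only an illustration under a few assumptions: d is kept as character (stringsAsFactors = FALSE), each (b, c) pair occurs at most once in df1, and the names key1, key2, gaps and df1_shuffled are made up here. Pasting the two columns into a single key is one simple way to match on more than one column.

```r
# Example data from the question, with d kept as character (assumption)
df1 <- data.frame(a = c(1,1,1,1,1,2,2,2,2,2), b = rep(seq(41,45,1), each = 2),
                  c = c(101:105, 101:105), d = LETTERS[1:10],
                  stringsAsFactors = FALSE)
df2 <- df1
df2$d[7:10] <- NA

# Shuffle df1 to show that row positions do not matter
set.seed(1)
df1_shuffled <- df1[sample(nrow(df1)), ]

# One key per row, built from the two matching columns
key1 <- paste(df1_shuffled$b, df1_shuffled$c, sep = "_")
key2 <- paste(df2$b, df2$c, sep = "_")

# For each gap in df2$d, find the row of df1 with the same key and copy its d
gaps <- is.na(df2$d)
df2$d[gaps] <- df1_shuffled$d[match(key2[gaps], key1)]

df2
```

A key with no partner in df1 makes match return NA, so that gap simply stays NA. Unlike merge, this keeps the rows and columns of df2 in their original order and needs no cleanup afterwards.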

Answer 1 posted 2019-07-31 22:44:43
