How can I check whether one data frame is fully contained, as a subset, in another data frame in R?
I have two data frames, df1 and df2, given below.
df1 is:
c1 c2 c3 c4
B 2.34000 1.00 I
A 14.43000 2.10 J
D 3.45515 1.00 K
B 2.50000 2.09
A 2.44000 1.10 K
K 5.00000 1.09 L
df2 is:
c1 c2 c3
B 2.34 1.00
A 14.43 2.10
D 3.43 1.00
B 2.50 2.09
E 5.00 1.09
A 2.44 1.10
The requirement is to compare these two data frames. If df2 is completely found in df1 (that is, every row of df2 matches a row of df1, irrespective of row order, whether df1 matches df2 exactly or only a subset of df1 matches df2), the output should be true. If it is not matched, the output should be false.
I tried the following methods:
1. left_join(df2,df1)
2. merge(df2,df1)
3. inner_join(df2,df1)
4. dd1[dd1$c1 %in% dd$c1,]
All of the above methods return the data that is common to both data frames, but they do not give the true/false result I need.
Please suggest a solution.
You can use match and interaction, like this:
df1 <- read.table(text="c1 c2 c3 c4
B 2.34000 1.00 I
A 14.43000 2.10 J
D 3.45515 1.00 K
B 2.50000 2.09 NA
A 2.44000 1.10 K
K 5.00000 1.09 L", header=T)
df2 <- read.table(text="c1 c2 c3
B 2.34 1.00
A 14.43 2.10
D 3.43 1.00
B 2.50 2.09
E 5.00 1.09
A 2.44 1.10", header=T)
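# interaction() collapses each row into a single factor value; match() then
# looks up every row of df2 among the rows of df1 (restricted to df2's columns).
!any(is.na(match(interaction(df2), interaction(df1[names(df2)]))))
#[1] FALSE
# The same test packed into an infix operator:
"%completelyFoundIn%" <- function(x, y) {!any(is.na(match(interaction(x), interaction(y[names(x)]))))}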
df2 %completelyFoundIn% df1
#[1] FALSE
df2[c(1,2,4,6),] %completelyFoundIn% df1
#[1] TRUE
df2[-5,c(1,3)] %completelyFoundIn% df1
#[1] TRUE
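As a variation on the same idea, the row comparison can also be done by building one text key per row with paste() instead of interaction(). This is only a sketch: the function name completely_found_in is made up for illustration, and, like the version above, it ignores how many times a row occurs. Because paste() converts values to text, 2.34 and 2.34000 yield the same key, and NA is compared as the string "NA".

```r
df1 <- read.table(text="c1 c2 c3 c4
B 2.34000 1.00 I
A 14.43000 2.10 J
D 3.45515 1.00 K
B 2.50000 2.09 NA
A 2.44000 1.10 K
K 5.00000 1.09 L", header=TRUE)
df2 <- read.table(text="c1 c2 c3
B 2.34 1.00
A 14.43 2.10
D 3.43 1.00
B 2.50 2.09
E 5.00 1.09
A 2.44 1.10", header=TRUE)

# Hypothetical helper (not from the original answer): TRUE when every row
# of x appears somewhere in y, compared only on the columns that x has.
completely_found_in <- function(x, y) {
  # All columns of x must exist in y, otherwise the comparison is undefined.
  stopifnot(all(names(x) %in% names(y)))
  # One character key per row; "\r" is an unlikely separator inside values.
  key_x <- do.call(paste, c(unname(as.list(x)), sep = "\r"))
  key_y <- do.call(paste, c(unname(as.list(y[names(x)])), sep = "\r"))
  # Row order is irrelevant: each key of x only has to occur in y.
  all(key_x %in% key_y)
}

completely_found_in(df2, df1)
#[1] FALSE
completely_found_in(df2[c(1,2,4,6),], df1)
#[1] TRUE
completely_found_in(df2[-5,c(1,3)], df1)
#[1] TRUE
```

The first call is FALSE because the rows D 3.43 1.00 and E 5.00 1.09 of df2 have no counterpart in df1; dropping those rows, or dropping the columns in which they differ, makes the check succeed.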