
Match columns from two dataframes and filter on the value of another column

I have two dataframes with the same number of columns but different numbers of rows:

colA  colB  colC  colD
xxx   303   200   A
yyy   111   20    B
zzz   24    188   C

I need to match colA from df1 to colA from df2 and select only the rows where df1$colB - df2$colC <= 2000.

I tried a for loop, but it didn't work:

for (i in nrow(df1)) {
    for (j in nrow(df2)) {
        df3 <- subset(merge(df2[j,], df1[i,], by="row.names", all=T), df2$colA[j] == df1$colA[i] && (df1$colB[i] - df2$colC[j]) <= abs(2000))
    }
}

What am I doing wrong? It doesn't give me any error, but the new dataframe is empty.
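One likely contributor to the empty result: `for (i in nrow(df1))` iterates over the single number `nrow(df1)`, not over `1, 2, ..., nrow(df1)`, so each loop body runs only once, on the last row. `df3` is also reassigned on every pass, so at most the last comparison could survive. A minimal sketch of the difference, using made-up sample data (only the column names come from the question):

```r
# Hypothetical sample data; only the column names come from the question.
df1 <- data.frame(colA = c("xxx", "yyy", "zzz"),
                  colB = c(303, 111, 24),
                  colC = c(200, 20, 188),
                  colD = c("A", "B", "C"))

# nrow(df1) is one number, so this loop body runs a single time (i = nrow(df1)).
visited <- c()
for (i in nrow(df1)) visited <- c(visited, i)
print(visited)

# seq_len(nrow(df1)) is the full index sequence, so this loop visits every row.
visited_all <- c()
for (i in seq_len(nrow(df1))) visited_all <- c(visited_all, i)
print(visited_all)
```

Even with `seq_len()`, a nested loop like this is slow and awkward in R; a join followed by a filter is usually the simpler route.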

If dplyr is an option, try this:

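library(dplyr)

# Join on colA, then keep rows where df1's colB and df2's colC differ by at most 2000
df1 %>%
    inner_join(df2, by = "colA") %>%
    filter(abs(colB.x - colC.y) <= 2000)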

This gives you a data frame with columns colA, colB.x, colC.x, colD.x, colB.y, colC.y, colD.y, where the .x columns come from df1 and the .y columns come from df2. Also note that b - a <= abs(2000) was likely meant to be abs(b - a) <= 2000.
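If dplyr is not available, roughly the same result can be had in base R with `merge()`, which does an inner join by default and uses the same `.x`/`.y` suffixes. This is only a sketch: the values in `df1` and `df2` below are invented for illustration, and `merged` and `result` are placeholder names.

```r
# Hypothetical sample data; only the column names come from the question.
df1 <- data.frame(colA = c("xxx", "yyy", "zzz"),
                  colB = c(303, 111, 5000),
                  colC = c(200, 20, 188),
                  colD = c("A", "B", "C"))
df2 <- data.frame(colA = c("xxx", "zzz", "www", "xxx"),
                  colB = c(10, 20, 30, 40),
                  colC = c(100, 50, 70, 9000),
                  colD = c("D", "E", "F", "G"))

# Inner join on colA: only keys present in both frames are kept,
# and non-key columns get .x (df1) and .y (df2) suffixes.
merged <- merge(df1, df2, by = "colA", suffixes = c(".x", ".y"))

# Keep rows where df1's colB and df2's colC differ by at most 2000.
result <- merged[abs(merged$colB.x - merged$colC.y) <= 2000, ]
print(result)
```

With this sample data, dropping `abs()` would also keep the pair where colB.x is 303 and colC.y is 9000, because the difference is a large negative number and so passes `<= 2000`. Whether that is wanted depends on what the original condition was meant to express.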

