Matching values of two data frames based on multiple conditions in R

I have two datasets:

cycle <- c(160, 150, 158, 180)
split1 <- c(2, 4, 6, 8)
split2 <- c(10, 12, 14, 16)
df1 <- data.frame(cycle, split1, split2)
df1
  cycle split1 split2
1   160      2     10
2   150      4     12
3   158      6     14
4   180      8     16

cycle <- c(160,150,190,180,161,150,140,179)
split1 <- c(2,4,12,8,2,4,32,8)
split2 <- c(10, 12, 18, 16, 10, 12, 21, 16)
df2 <- data.frame(cycle, split1, split2)
df2
  cycle split1 split2
1   160      2     10
2   150      4     12
3   190     12     18
4   180      8     16
5   161      2     10
6   150      4     12
7   140     32     21
8   179      8     16

I want to match and label the values of df1 and df2 based on two conditions:

1- If the values of all three columns (cycle, split1, and split2) are exactly the same, label the row "Same"; otherwise label it "Different".

2- If only the cycle value differs between df1 and df2, by +1 or -1, and the rest of the row values are the same, label the row "Same"; otherwise label it "Different".

The output should look like this:

  cycle split1 split2      Type
1   160      2     10      Same
2   150      4     12      Same
3   190     12     18 Different
4   180      8     16      Same
5   161      2     10      Same
6   150      4     12      Same
7   140     32     21 Different
8   179      8     16      Same

I was able to implement the first condition as follows:

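library(dplyr)

# Build a composite key; sep = "_" keeps the three values from running together
df1 <- df1 %>% mutate(key = paste(cycle, split1, split2, sep = "_"))
df2 <- df2 %>% mutate(key = paste(cycle, split1, split2, sep = "_"))
df2 %>%
  mutate(Type = ifelse(key %in% df1$key, "same", "different")) %>%
  select(-key)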

  cycle split1 split2      Type
1   160      2     10      same
2   150      4     12      same
3   190     12     18 different
4   180      8     16      same
5   161      2     10 different
6   150      4     12      same
7   140     32     21 different
8   179      8     16 different

but I am having trouble implementing the second one.

Any idea how to do this efficiently?

Thank you in advance.

You could use

library(dplyr)


df2 %>%
  left_join(df1, by = c("split1", "split2"), suffix = c("", ".y")) %>%
  mutate(type = coalesce(ifelse(abs(cycle - cycle.y) <= 1, "same", "different"), "different")) %>%
  select(-cycle.y)

This returns

  cycle split1 split2      type
1   160      2     10      same
2   150      4     12      same
3   190     12     18 different
4   180      8     16      same
5   161      2     10      same
6   150      4     12      same
7   140     32     21 different
8   179      8     16      same
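
One caveat: if df1 ever contained several rows sharing the same split1 and split2, the left join would return one row per match, so the result could have more rows than df2. A minimal base R sketch that sidesteps this is below. It checks each row of df2 against all of df1 and keeps exactly one label per row. The helper name `has_match` and its `tol` argument are illustrative assumptions, not part of any package, and the sketch uses the capitalized labels and `Type` column from the question.

```r
# Data from the question
df1 <- data.frame(cycle  = c(160, 150, 158, 180),
                  split1 = c(2, 4, 6, 8),
                  split2 = c(10, 12, 14, 16))
df2 <- data.frame(cycle  = c(160, 150, 190, 180, 161, 150, 140, 179),
                  split1 = c(2, 4, 12, 8, 2, 4, 32, 8),
                  split2 = c(10, 12, 18, 16, 10, 12, 21, 16))

# Hypothetical helper: TRUE if any row of `ref` has identical split1 and
# split2 and a cycle value within `tol` of the given cycle.
has_match <- function(cycle, split1, split2, ref, tol = 1) {
  any(ref$split1 == split1 &
        ref$split2 == split2 &
        abs(ref$cycle - cycle) <= tol)
}

# Apply the helper row by row; df1 is passed unchanged to every call
is_same <- mapply(has_match, df2$cycle, df2$split1, df2$split2,
                  MoreArgs = list(ref = df1))

df2$Type <- ifelse(is_same, "Same", "Different")
df2
```

Because `abs(ref$cycle - cycle) <= tol` also covers a difference of 0, both conditions from the question are handled by the same test. The row-by-row comparison is simple but scales with nrow(df1) * nrow(df2), so for large inputs the join above is likely the better choice.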
