How can I match the values of a column according to another data frame in R and print a message using dplyr?
I have a dataset that looks like this:
id1 | var1 |
---|---|
A | chair |
B | table |
C | glass |
D | phone |
E | pistol |
and a second one that contains the license held by each id (it may contain ids that do not appear in the first dataset):
id2 | var2 |
---|---|
G | mobile |
H | pistol |
I | pistol |
E | phone |
D | phone |
I want to check whether the ids in the first data frame are licensed to have what they declare. For example, id D is licensed to have a phone, but E is not licensed to have a pistol because it is only licensed to have a phone. So there are three conditions here, and ideally the final data frame should look like this:
id1 | var1 | license |
---|---|---|
A | chair | not_needed |
B | table | not_needed |
C | glass | not_needed |
D | phone | ok_checked |
E | pistol | danger |
How can I perform this cross-check and assign these messages according to the logical conditions in R using dplyr?
library(tidyverse)
# Declared items per id
id1 = c("A", "B", "C", "D", "E")
var1 = c("chair", "table", "glass", "phone", "pistol")
data1 = tibble(id1, var1); data1
# Licensed items per id
id2 = c("G", "H", "I", "E", "D")
var2 = c("mobile", "pistol", "pistol", "phone", "phone")
data2 = tibble(id2, var2); data2
You can first left_join the two datasets, then use a case_when statement to assign a label to each condition.
library(tidyverse)
left_join(data1, data2, by = c("id1" = "id2")) %>%
  mutate(var2 = case_when(
    is.na(var2)  ~ "not_needed",  # id has no entry in data2
    var1 == var2 ~ "ok_checked",  # declared item matches the license
    var1 != var2 ~ "danger",      # declared item differs from the license
    TRUE         ~ NA_character_
  )) %>%
  rename("license" = "var2")
# A tibble: 5 × 3
id1 var1 license
<chr> <chr> <chr>
1 A chair not_needed
2 B table not_needed
3 C glass not_needed
4 D phone ok_checked
5 E pistol danger
library(dplyr)
data1 |>
  full_join(data2, by = c("id1" = "id2")) |>
  rename(declared = var1, actual = var2) |>
  mutate(license = ifelse(is.na(declared), "Not declared",
                   ifelse(declared %in% c("chair", "table", "glass"), "Not needed",
                   ifelse(declared == actual, "OK", "Danger"))))
I have used full_join. You can use left_join if you don't need all the ids. I have also assumed that chair, table and glass do not need a license. You can add or remove items as needed. Finally, you can remove the columns that you don't need. It is also possible to use case_when() instead of nested ifelse() statements to achieve the same result.
id1 declared actual license
<chr> <chr> <chr> <chr>
1 A chair NA Not needed
2 B table NA Not needed
3 C glass NA Not needed
4 D phone phone OK
5 E pistol phone Danger
6 G NA mobile Not declared
7 H NA pistol Not declared
8 I NA pistol Not declared
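The second answer mentions that case_when() can replace the nested ifelse() calls but does not show it. A minimal sketch of that variant follows, assuming the same data as in the question and the same assumption that chair, table and glass need no license. The names no_license_needed and result are introduced here for illustration only.

```r
library(dplyr)

# Data from the question
data1 <- tibble(
  id1  = c("A", "B", "C", "D", "E"),
  var1 = c("chair", "table", "glass", "phone", "pistol")
)
data2 <- tibble(
  id2  = c("G", "H", "I", "E", "D"),
  var2 = c("mobile", "pistol", "pistol", "phone", "phone")
)

# Assumption carried over from the answer: these items need no license
no_license_needed <- c("chair", "table", "glass")

result <- data1 |>
  full_join(data2, by = c("id1" = "id2")) |>
  rename(declared = var1, actual = var2) |>
  mutate(license = case_when(
    is.na(declared)                 ~ "Not declared",  # id only appears in data2
    declared %in% no_license_needed ~ "Not needed",
    declared == actual              ~ "OK",
    TRUE                            ~ "Danger"         # mismatch, or no license on record
  ))

result
```

Conditions are evaluated in order, so the first one that holds decides the label. One difference from the nested ifelse() version: if an id declares an item that needs a license but has no row in data2, declared == actual is NA, and case_when() falls through to "Danger", whereas ifelse() would return NA. That case does not occur in the example data.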