Finding intersected rows in different columns and labeling them
I am looking for a way to correctly label intersecting rows in a data frame.
Something like this:
df = data.frame(id=c("good","bad","ugly","dirty","clean","frenzy"),di=c("good",2,"good","dirty",4,"ugly"))
> df
id di
1 good good
2 bad 2
3 ugly good
4 dirty dirty
5 clean 4
6 frenzy ugly
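I want to create a column named match. If the id and di values in a row intersect, the row should be labeled intersected; otherwise it should be labeled no.intersect.
I tried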
df%>%
mutate(match=ifelse(isTRUE(intersect(id,di)),"intersected","no.intersect"))
which outputs
id di match
1 good good no.intersect
2 bad 2 no.intersect
3 ugly good no.intersect
4 dirty dirty no.intersect
5 clean 4 no.intersect
6 frenzy ugly no.intersect
even though rows 1 and 4 do intersect.
library(dplyr)
# example dataframe
# (non factor variables)
df = data.frame(id=c("good","bad","ugly","dirty","clean","frenzy"),
di=c("good",2,"good","dirty",4,"ugly"),
stringsAsFactors = FALSE)
# check equality of values at each row
df %>% mutate(match = ifelse(id == di, "intersected", "no.intersect"))
# id di match
# 1 good good intersected
# 2 bad 2 no.intersect
# 3 ugly good no.intersect
# 4 dirty dirty intersected
# 5 clean 4 no.intersect
# 6 frenzy ugly no.intersect
Alternatively, if you want to use intersect, you can use it like this:
df %>%
rowwise() %>%
mutate(match = ifelse(length(intersect(id,di)) > 0, "intersected", "no.intersect")) %>%
ungroup()
# # A tibble: 6 x 3
# id di match
# <chr> <chr> <chr>
# 1 good good intersected
# 2 bad 2 no.intersect
# 3 ugly good no.intersect
# 4 dirty dirty intersected
# 5 clean 4 no.intersect
# 6 frenzy ugly no.intersect
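This is needed because intersect is not vectorized (so you need rowwise), and it does not return TRUE or FALSE (so you cannot use isTRUE). If there is a match it returns the actual matching values; otherwise it returns a zero-length vector. In the original attempt, isTRUE(intersect(id, di)) is evaluated once on the whole columns and yields a single FALSE, which ifelse turns into "no.intersect" for every row.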
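The behaviour described above can be checked directly in base R. The following is a minimal sketch that uses the same values as the question, stored as plain character vectors; the variable names common and none are only illustrative.

```r
id <- c("good", "bad", "ugly", "dirty", "clean", "frenzy")
di <- c("good", "2", "good", "dirty", "4", "ugly")

# Called on whole columns, intersect() returns the set of shared values,
# not one logical value per row.
common <- intersect(id, di)
print(common)           # the shared values: "good", "ugly", "dirty"

# isTRUE() is TRUE only for a single logical TRUE,
# so a character vector always gives FALSE.
print(isTRUE(common))   # FALSE

# With no shared values the result is a zero-length vector,
# which is why the rowwise version tests length(...) > 0.
none <- intersect("bad", "2")
print(length(none))     # 0
```

Because the condition is a single FALSE, mutate recycles the one resulting "no.intersect" across all six rows, which matches the output in the question.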
If you want to compare id and di for equality, try:
df %>% mutate(match = ifelse(id == di, "intersected", "no.intersect"))
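For comparison, here is a rough base R sketch of the same row-wise equality check without dplyr. It also shows how it differs from a column-wide membership test with %in%, which answers a different question (does this id occur anywhere in di?). The names same_row and anywhere are hypothetical, and the example assumes both columns are character, not factor.

```r
df <- data.frame(
  id = c("good", "bad", "ugly", "dirty", "clean", "frenzy"),
  di = c("good", "2", "good", "dirty", "4", "ugly"),
  stringsAsFactors = FALSE
)

# Row-wise equality: TRUE only where id and di hold the same value in the same row.
same_row <- df$id == df$di

# Column-wide membership: TRUE where the id value occurs anywhere in di.
anywhere <- df$id %in% df$di

df$match <- ifelse(same_row, "intersected", "no.intersect")
print(df)
print(which(same_row))  # 1 4
print(which(anywhere))  # 1 3 4
```

Row 3 (ugly) is where the two tests disagree: "ugly" appears in di, but in row 6, not row 3. Since the question expects only rows 1 and 4 to be labeled, the row-wise comparison appears to be the intended one.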