
RStudio: Error while trying to use %in% operator in a dataframe

I want to find out whether a certain cell of a dataframe contains a word that exists in a cell of another dataframe. Specifically:

geoplaces$name[1] has the value: Athens

adresses$Comments[1] has the value: Is he still in Athens?

But when I execute the following script:

if (geoplaces$name[1] %in% adresses$Comments[1]) {
  print("hello")
} else {
  print("error")
}

I'm getting "error" as a result.

Any suggestions on what's wrong with this?

The %in% operator looks for an exact match: it tests whether the left-hand value equals one of the elements of the right-hand vector. In your case, "Athens" is not an element of the vector whose single element is "Is he still in Athens?", so the test returns FALSE.
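To make the exact-match behaviour concrete, here is a minimal sketch. The variables place and comment are hypothetical stand-ins for geoplaces$name[1] and adresses$Comments[1], and the vector of city names is made up for illustration.

```r
# Hypothetical stand-ins for the two cells in the question
place   <- "Athens"
comment <- "Is he still in Athens?"

# %in% asks whether the left value equals some element of the right vector
in_comment <- place %in% comment                       # no element equals "Athens"
in_cities  <- place %in% c("Rome", "Athens", "Paris")  # one element equals "Athens"

print(in_comment)  # FALSE
print(in_cities)   # TRUE
```

So %in% is the right tool for membership in a set of values, but not for finding a word inside a longer string.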

You are probably interested in substring matching. There are many ways to do this; one is the grepl function:

if(grepl(geoplaces$name[1], adresses$Comments[1])) {
  print("hello")
} else {
  print("error")
} 
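One thing to keep in mind: grepl treats its first argument as a regular expression by default, so a place name containing characters such as "." or "(" could match more (or less) than intended. The sketch below, using made-up comments, shows two base R options that may help: fixed = TRUE for a literal match and ignore.case = TRUE for a case-insensitive one. It also shows that grepl is vectorised over its second argument, so a whole column can be tested at once.

```r
# Hypothetical example data, not taken from the question's dataframes
place    <- "Athens"
comments <- c("Is he still in Athens?", "He moved to Rome.", "athens in spring")

# fixed = TRUE treats the pattern as a literal string, not a regular expression
literal_hits <- grepl(place, comments, fixed = TRUE)

# ignore.case = TRUE gives a case-insensitive regex match
# (it is not honoured together with fixed = TRUE)
nocase_hits <- grepl(place, comments, ignore.case = TRUE)

print(literal_hits)  # TRUE FALSE FALSE
print(nocase_hits)   # TRUE FALSE TRUE
```

Which variant fits depends on how clean the place names and comments are; the literal match is the safer default when the names are not meant to be patterns.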

Here you are checking whether the first element of one vector (geoplaces$name[1]) is identical to the first element of another vector (adresses$Comments[1]). %in% compares whole strings, and these two are not identical. If you just want a logical value telling you whether "Athens" appears inside the comment, try regular expressions. This should work: grepl(geoplaces$name[1], adresses$Comments[1])
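If the goal is eventually to check every place name against every comment rather than just the first cells, the same grepl call can be applied over the whole name column. This is only a sketch: the two small dataframes below are invented to mirror the column names in the question, and hits and mentions_place are hypothetical names.

```r
# Invented data that reuses the question's dataframe and column names
geoplaces <- data.frame(name = c("Athens", "Rome"), stringsAsFactors = FALSE)
adresses  <- data.frame(
  Comments = c("Is he still in Athens?", "Flying to Rome next week", "No city mentioned"),
  stringsAsFactors = FALSE
)

# Logical matrix: one row per comment, one column per place name
hits <- sapply(geoplaces$name, function(p) grepl(p, adresses$Comments, fixed = TRUE))

# TRUE for every comment that mentions at least one known place
mentions_place <- rowSums(hits) > 0

print(mentions_place)  # TRUE TRUE FALSE
```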
