Identifying text strings in a data frame column in R

Question

One column of my data frame contains words and phrases. I am trying to create a dummy variable that flags the entries in this column containing a specific string of text anywhere within them.

For example:

kite
cars
box kites
model cars
i like kites that fly
cars of the world

  myvector<-c("kite","cars","box kites","model cars","i like kites that fly", "cars of the world")

I want to identify all the entries that contain the string "kite".

I've tried a few things, such as any(), which() and %in%, but nothing has worked so far.

Any help greatly appreciated.

Answer 1

You didn't provide a reproducible example, but the answer is grepl.

grepl("kite", df$words)

It returns a logical vector with one element per row: TRUE where the word appears anywhere in that row's string, FALSE otherwise.
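Applied to the vector from the question, this might look like the sketch below. The names `df` and `words` follow the snippet above; the `has_kite` column name is made up for illustration and is not from the original post.

```r
# Rebuild the example data from the question; `df` and `words` are assumed names.
myvector <- c("kite", "cars", "box kites", "model cars",
              "i like kites that fly", "cars of the world")
df <- data.frame(words = myvector, stringsAsFactors = FALSE)

# grepl() gives TRUE wherever the pattern occurs anywhere in the string.
has_kite_lgl <- grepl("kite", df$words)

# Turn the logical vector into a 0/1 dummy variable (hypothetical column name).
df$has_kite <- as.integer(has_kite_lgl)

print(df)
```

This also suggests why %in% did not help: it tests whether a whole element equals "kite", so it cannot find "kite" inside "box kites".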

If you want to match multiple words, use the logical OR operator | inside the pattern string:

grepl("kite|cars|box kites", df$words)
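When the list of search terms lives in a vector, the pattern can be assembled with paste() rather than typed by hand. The following is only a sketch: the `targets` vector, the extra "toy boats" entry (added so that one element matches nothing), and the variable names are all assumptions for illustration.

```r
# Example strings from the question, plus one non-matching entry added for illustration.
words <- c("kite", "cars", "box kites", "model cars",
           "i like kites that fly", "cars of the world", "toy boats")

# Hypothetical list of search terms.
targets <- c("kite", "cars")

# Join the terms with "|" so the regex matches any one of them: "kite|cars".
pattern <- paste(targets, collapse = "|")
matches_any <- grepl(pattern, words)

# Alternatively, build one 0/1 dummy column per term.
# fixed = TRUE treats each term as a literal string rather than a regex.
dummies <- sapply(targets, function(term) {
  as.integer(grepl(term, words, fixed = TRUE))
})

print(matches_any)
print(dummies)
```

Note that fixed = TRUE would not work with the combined pattern, because the | would then be matched literally instead of acting as OR.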

Answer 1: 22, accepted, 2012-09-13 15:11:10
