Exact match with grepl in R

Question

I'm trying to extract certain records from a data frame with grepl.

This is based on a comparison between two columns, Results and Names. The values are built like "WordNumber", but for the same word I have multiple numbers (more than 30). So when I use a grepl expression to get, for instance, Word1, I also get results that I would like to avoid, such as Word12.

Any ideas on how to fix this?

# Names is a data frame with a single column called name
Names <- data.frame(name = c("Word1"))
Results <- c("Word1", "Word11", "Word12", "Word15")
Records <- c("ThisIsTheResultIWant", "notThis", "notThis", "notThis") 
Relationships <- data.frame(Results, Records)

Relationships <- subset(Relationships, grepl(paste(Names$name, collapse = "|"), Relationships$Results))

This doesn't work. If I use fixed = TRUE, then it doesn't return any result at all (which is weird). I have also tried concatenating the name part with other numbers like this, but with no success:

Relationships <- subset(Relationships, grepl(paste(paste(Names$name, '3', sep = ""), collapse = "|"), Relationships$Results))

Since I'm concatenating, I'm not really sure how to use \\b to enforce a full match.

Any suggestions?

Answer 1

In addition to @Richard's solution, there are multiple ways to enforce a full match.

\\b

"\\b" is an anchor that matches a word boundary. Placing it before and after the pattern keeps the pattern from matching inside a longer word.

> grepl("\\bWord1\\b",c("Word1","Word2","Word12"))
[1]  TRUE FALSE FALSE
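Since the question builds its pattern by concatenation, here is a minimal sketch of how the word boundaries could be added to each name before joining them with "|". The vectors wanted and results are made-up illustration data, not objects from the question.

```r
# Hypothetical data: the names to look for and the values to search in.
wanted  <- c("Word1", "Word3")
results <- c("Word1", "Word11", "Word12", "Word3", "Word30")

# Wrap every name in \\b ... \\b first, then join the alternatives with "|".
pattern <- paste0("\\b", wanted, "\\b", collapse = "|")

hits <- grepl(pattern, results)
results[hits]
# [1] "Word1" "Word3"
```

Wrapping each name before collapsing means every alternative carries its own boundaries, so none of them can match a longer value such as Word11 or Word30.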

\\< & \\>

"\\<" matches the beginning of a word, and "\\>" matches the end of a word.

> grepl("\\<Word1\\>",c("Word1","Word2","Word12"))
[1]  TRUE FALSE FALSE
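Word boundaries are not the same as matching the whole string. The sketch below, using a made-up vector x, compares a word-boundary pattern with a pattern anchored by ^ and $, in case the values can contain more than a single word.

```r
# Hypothetical values: some contain Word1 as one word inside a longer string.
x <- c("Word1", "Word1-old", "see Word1 here", "Word12")

# Word boundaries: Word1 must be a whole word, but other text may surround it.
boundary_hits <- grepl("\\<Word1\\>", x)
boundary_hits
# [1]  TRUE  TRUE  TRUE FALSE

# String anchors: the entire value must be exactly Word1.
anchored_hits <- grepl("^Word1$", x)
anchored_hits
# [1]  TRUE FALSE FALSE FALSE
```

For values like those in the question, which consist of a single word each, both approaches give the same result.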

Answer 2

Use ^ to match the start of the string and $ to match the end of the string:

Names$name <- '^Word1$'

Or, to apply this to the entire vector of names:

Names$name <- paste0('^', Names$name, '$')
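As a rough end-to-end sketch, the anchored names can then be collapsed with "|" and passed to grepl inside subset. The data frame mirrors the one in the question; the vector wanted, including the extra name Word2, is an assumption added for illustration.

```r
# Data frame as in the question.
Relationships <- data.frame(
  Results = c("Word1", "Word11", "Word12", "Word15"),
  Records = c("ThisIsTheResultIWant", "notThis", "notThis", "notThis")
)

# Hypothetical names to keep; Word2 does not occur in the data.
wanted <- c("Word1", "Word2")

# Anchor each name separately, then join: "^Word1$|^Word2$".
anchored <- paste0("^", wanted, "$")
pattern  <- paste(anchored, collapse = "|")

matched <- subset(Relationships, grepl(pattern, Results))
nrow(matched)
# [1] 1
```

Anchoring each name before collapsing matters: a single pair of anchors around the joined string, as in "^Word1|Word2$", would only anchor the first alternative at the start and the last one at the end.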

Answer 3

I think this is just:

Relationships[Relationships$Results == Names$name, ]

If you end up using ^Word1$, you're just doing a straight subset. If you have multiple names, use this instead:

Relationships[Relationships$Results %in% Names$name, ]
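The reason for switching to %in% with several names can be seen in a small sketch. The vectors results and wanted are made-up illustration data: == compares element by element and recycles the shorter vector, while %in% tests membership.

```r
# Hypothetical data: two wanted names, given in a different order than the data.
results <- c("Word1", "Word11", "Word12", "Word15")
wanted  <- c("Word15", "Word1")

# == recycles wanted to length 4 and compares position by position.
eq_hits <- results == wanted
eq_hits
# [1] FALSE FALSE FALSE FALSE

# %in% asks, for each result, whether it appears anywhere in wanted.
in_hits <- results %in% wanted
in_hits
# [1]  TRUE FALSE FALSE  TRUE
```

With a single name both forms give the same answer, which is why == is enough in the first snippet.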

Answer 1: score 11, posted 2017-09-11 11:12:08
Answer 2: score 5, posted 2017-09-11 10:56:18
Answer 3: score 3, accepted, posted 2017-09-11 11:18:10