Capturing and extracting a regex group in R

Question

I have a set of data that looks like this:

text_string <- structure(list(text_string = c("A Nanny-Back Up Care and Staffing Company-San Diego, OC, LA, San Francisco, Portland, Las Vegas, Phoenix, Seattle, Denver and NY. @jefffoes", 
"Creative Producer of @crwnmag  LA-NY-TX dereksith@googke.com Founded @marcusharper", 
"daily elements for life and style  texas transplant in california LA lauren@gmail.com read my blog + shop my instagram", 
"LIVE, LAUGH, LOVE")), class = "data.frame", row.names = c(NA, 
-4L))

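I am trying to capture every instance of "LA" in the strings and create a new field from it. With the regex I am using, the first three strings should each return a match on "LA" and the last one should return no match. You can see an example here.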

I thought this code would do the trick, but it does not:

text_string_new <- text_string %>% mutate(new_field = str_replace(string = text_string,
                                                           pattern = "(LA)(\\b|,)",
                                                           replacement = "\\1"))

All it seems to do is return an exact copy of the text_string field.

Answer 1

Using str_extract instead of str_replace seems to do the trick.

library(dplyr)
library(stringr)
text_string %>% mutate(new_field = str_extract(string = text_string,
                                               pattern = "(LA)(\\b|,)"))
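To see why the original attempt appeared to do nothing, it may help to compare the calls side by side. str_replace swaps the matched text for the replacement, and here the match is "LA" while the replacement "\\1" is also "LA", so the string comes back unchanged. str_extract returns the matched text itself, or NA when nothing matches. The sketch below uses shortened sample strings (an assumption made for brevity, not the asker's full data) and also shows str_match, which is one way to pull out a specific capture group. The column names replaced, extracted and group_1 are hypothetical.

```r
library(dplyr)
library(stringr)

# Shortened stand-ins for the four strings in the question (assumed for brevity)
text_string <- data.frame(
  text_string = c(
    "San Diego, OC, LA, San Francisco",
    "LA-NY-TX dereksith@googke.com",
    "texas transplant in california LA lauren@gmail.com",
    "LIVE, LAUGH, LOVE"
  ),
  stringsAsFactors = FALSE
)

pattern <- "(LA)(\\b|,)"

result <- text_string %>%
  mutate(
    # Replaces the match "LA" with group 1, which is also "LA": no visible change
    replaced = str_replace(text_string, pattern, "\\1"),
    # Returns the whole match, or NA when the pattern does not match
    extracted = str_extract(text_string, pattern),
    # str_match returns a matrix: column 1 is the whole match, column 2 is group 1
    group_1 = str_match(text_string, pattern)[, 2]
  )

result
```

In this pattern the "," alternative is never actually used, because \\b already matches between "A" and a following comma. A pattern such as "\\bLA\\b" would probably serve equally well if the intent is to match "LA" only as a standalone word.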

Answer 1 (2, accepted, 2022-10-06 17:49:11)
