Using lookbehinds to extract groups of strings in str_extract()

library(stringr)

I tried following the advice here but could not make it work for my problem. Using stringr, I need to extract all the characters that follow the first run of letters plus a single underscore.

The following extracts exactly what I don't want:

str_extract("mean_q4.8_addiction_critCount", "(^[a-z]*_)")

# [1] "mean_"

What I want is:

# [1] "q4.8_addiction_critCount"

Based on the link above, I tried a positive lookbehind:

str_extract("mean_q4.8_addiction_critCount", "(?<=^[a-z]*_)\\w+")

But I got the error:

# Error in stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) : 
#  Look-Behind pattern matches must have a bounded maximum length. (U_REGEX_LOOK_BEHIND_LIMIT)

And I couldn't work out how to constrain the maximum length.

Any advice much appreciated.

Can't you do the opposite instead? Remove everything up to and including the first underscore.

sub('.*?_', '', 'mean_q4.8_addiction_critCount')
#[1] "q4.8_addiction_critCount"
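Note that `.*?_` removes everything up to the first underscore, whatever precedes it. If the prefix must consist only of lowercase letters, as described in the question, an anchored pattern is a stricter alternative. A minimal sketch follows; the vector `x` and its second element are made up for illustration.

```r
library(stringr)

# Hypothetical inputs: the second one has no letters-only prefix.
x <- c("mean_q4.8_addiction_critCount", "q4.8_addiction_critCount")

# Lazy match: drops everything up to and including the first underscore.
lazy <- sub(".*?_", "", x)

# Anchored match: drops only a leading run of lowercase letters plus "_".
anchored <- str_remove(x, "^[a-z]*_")

print(lazy)
print(anchored)
```

With the lazy pattern the second string loses `q4.8_`, while the anchored pattern leaves it unchanged because `q4.8` is not made up of letters only.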

As far as the look-behind regex is concerned, you can extract everything after the first underscore:

stringr::str_extract("mean_q4.8_addiction_critCount", "(?<=_).*")
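On the original error: ICU, the regex engine behind stringr, rejects `*` and `+` inside a look-behind but accepts a bounded quantifier such as `{1,20}`. The sketch below assumes the letter prefix is never longer than 20 characters (an arbitrary limit chosen for illustration). It also shows a capture group with `str_match()`, which avoids the look-behind entirely. Both use `.*` rather than `\\w+`, because `\\w` does not match the `.` in `q4.8`.

```r
library(stringr)

x <- "mean_q4.8_addiction_critCount"

# Bounded look-behind: {1,20} is an assumed upper limit on the prefix length.
bounded <- str_extract(x, "(?<=^[a-z]{1,20}_).*")

# Capture group: column 2 of the str_match() result holds the first group.
captured <- str_match(x, "^[a-z]*_(.*)")[, 2]

print(bounded)
print(captured)
```

Both calls should return the part of the string after `mean_`. The capture-group version needs no length limit, so it may be the safer choice when the prefix length is unknown.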
