Extracting numbers from string in R using regex
I have a string like this:
myString <- "[0.15][4577896]blahblahblahblahwhatever"
I need to extract the number between the second pair of brackets.
Currently I am trying to use this:
str_extract(myString, "\\]\\[(\\d+)")
But this gives me:
][4577896
My desired result would be:
4577896
How could I achieve this?
No lookbehind is needed:
gsub(".*\\[(\\d+).*","\\1",myString)
[1] "4577896"
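One caveat with the substitution approach: when the pattern does not match, sub() and gsub() return the input unchanged rather than a missing value. The sketch below, which assumes a hypothetical second input without brackets, guards against that with grepl(). Because the leading .* is greedy, the pattern picks the last "[" that is followed by digits.

```r
# Hypothetical inputs: one with the expected brackets, one without.
x <- c("[0.15][4577896]blahblahblahblahwhatever", "no brackets here")
pattern <- ".*\\[(\\d+).*"

# sub() leaves non-matching strings untouched ...
out <- sub(pattern, "\\1", x)

# ... so replace them with NA explicitly.
out <- ifelse(grepl(pattern, x), out, NA_character_)
out
```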
You can try this:
(?<=\\]\\[)(\\d+)
Here is a demo: https://regex101.com/r/fvHW05/1
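The answer gives only the pattern, so here is a minimal sketch of one way to apply it in base R. Lookbehind is a PCRE feature, so perl = TRUE is needed; the variable names m and result are just for illustration.

```r
myString <- "[0.15][4577896]blahblahblahblahwhatever"

# Lookbehind requires the PCRE engine, hence perl = TRUE.
m <- regexpr("(?<=\\]\\[)\\d+", myString, perl = TRUE)

# regmatches() returns only the matched digits, not the "][" prefix.
result <- regmatches(myString, m)
result
# [1] "4577896"
```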
Here is another version with minimal or no regex:
qdapRegex::ex_between_multiple(myString, "[", "]")[[2]]
#[1] "4577896"
It extracts all the substrings between [ and ], and we select the value between the second pair of brackets. You can convert it to numeric or integer if needed.
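If qdapRegex is not available, the same idea can be sketched in base R: collect every bracketed chunk, strip the brackets, take the second element, and convert it. This is an illustrative equivalent, not the package's implementation; chunks, inside and second are names made up for the example.

```r
myString <- "[0.15][4577896]blahblahblahblahwhatever"

# Find every "[...]" chunk in the string.
chunks <- regmatches(myString, gregexpr("\\[[^]]*\\]", myString))[[1]]

# Drop the leading "[" and trailing "]" from each chunk.
inside <- substr(chunks, 2, nchar(chunks) - 1)

# Select the second value and convert it to a number.
second <- as.numeric(inside[2])
second
```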
You may use
^(?:[^\[\]]*\[[^\[\]]+\])[^\]\[]*\[([^\]\[]+).+
and replace this with the first captured group using gsub; see a demo on regex101.com. In base R:
myString <- "[0.15][4577896]blahblahblahblahwhatever"
pattern <- "^(?:[^\\[\\]]*\\[[^\\[\\]]+\\])[^\\]\\[]*\\[([^\\]\\[]+).+"
gsub(pattern, "\\1", myString, perl = TRUE)
# [1] "4577896"
An option using str_extract:
library(stringr)
str_extract(myString, "(?<=.\\[)([0-9]+)")
#[1] "4577896"
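For context, the attempt in the question returned ][4577896 because str_extract gives back the whole match, not the capture group. A minimal base R sketch of pulling out just the group, reusing the question's original pattern, is shown below; stringr's str_match returns capture groups in a similar way.

```r
myString <- "[0.15][4577896]blahblahblahblahwhatever"

# regexec() records the full match and each capture group.
m <- regmatches(myString, regexec("\\]\\[(\\d+)", myString))[[1]]

full_match <- m[1]  # the whole match, including "]["
digits <- m[2]      # the first capture group only
digits
# [1] "4577896"
```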