
Regular expression to extract numbers from a string

Can somebody help me construct this regular expression, please?

Given the following strings:

  • "April ( 123 widgets less 456 sprockets )"
  • "May (789 widgets less 012 sprockets)"

I need a regular expression that will extract the two numbers from the text. The month name will vary. The brackets and the "widgets less" and "sprockets" text are not expected to change between strings; however, it would be really useful if this text could vary as well.

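If you are sure that numbers appear in only two places in your string, and that they are the only thing you want to extract, then you should be able to simply use

\d+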
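
As a minimal sketch of this approach, assuming Python (the question does not name a regex engine, and the helper name `extract_numbers` is made up for illustration), `re.findall` returns every run of digits in the order it appears:

```python
import re

def extract_numbers(text):
    """Return every run of consecutive digits in text, as strings."""
    # \d+ matches one or more digits; findall collects all non-overlapping matches.
    return re.findall(r"\d+", text)

print(extract_numbers("April ( 123 widgets less 456 sprockets )"))  # ['123', '456']
print(extract_numbers("May (789 widgets less 012 sprockets)"))      # ['789', '012']
```

The results are kept as strings on purpose, so the leading zero in "012" survives; convert with `int()` only if the numeric value is what you need.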
Alternatively, this pattern should work:

^\s*(\w+)\s*\(\s*(\d+)\D+(\d+)\D+\)\s*$

After the match, backreference 1 will contain the month, backreference 2 the first number, and backreference 3 the second number.

Explanation:

^     # start of string
\s*   # optional whitespace
(\w+) # one or more word characters (letters, digits, underscore), capture the match
\s*   # optional whitespace
\(    # a (
\s*   # optional whitespace
(\d+) # a number, capture the match
\D+   # one or more non-digits
(\d+) # a number, capture the match
\D+   # one or more non-digits
\)    # a )
\s*   # optional whitespace
$     # end of string
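
The breakdown above can be tried out with a short sketch, assuming Python's `re` module (the engine is not stated in the answer; `PATTERN` and `parse_line` are hypothetical names). The `re.VERBOSE` flag lets the pattern keep its comments inline:

```python
import re

# Same pattern as in the answer, split across lines with re.VERBOSE.
PATTERN = re.compile(r"""
    ^\s*(\w+)     # month name, captured as group 1
    \s*\(\s*      # opening parenthesis with optional whitespace around it
    (\d+)\D+      # first number (group 2), then one or more non-digits
    (\d+)\D+      # second number (group 3), then one or more non-digits
    \)\s*$        # closing parenthesis, optional whitespace, end of string
""", re.VERBOSE)

def parse_line(text):
    """Return (month, first, second) as strings, or None if text does not match."""
    match = PATTERN.match(text)
    return match.groups() if match else None

print(parse_line("April ( 123 widgets less 456 sprockets )"))  # ('April', '123', '456')
print(parse_line("May (789 widgets less 012 sprockets)"))      # ('May', '789', '012')
```

Because the text between the numbers is matched with `\D+` rather than literal words, "widgets less" and "sprockets" can change freely as long as they contain no digits.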

You could use something like:

[^0-9]+([0-9]+)[^0-9]+([0-9]+).+

Then get the first and second capture groups.
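
A minimal sketch of reading those two groups, assuming Python (`PAIR` and `first_two_numbers` are illustrative names, not part of the answer):

```python
import re

PAIR = re.compile(r"[^0-9]+([0-9]+)[^0-9]+([0-9]+).+")

def first_two_numbers(text):
    """Return the two captured numbers as strings, or None if there is no match."""
    match = PAIR.match(text)  # match() anchors the pattern at the start of text
    if match is None:
        return None
    return match.group(1), match.group(2)

print(first_two_numbers("April ( 123 widgets less 456 sprockets )"))  # ('123', '456')
print(first_two_numbers("May (789 widgets less 012 sprockets)"))      # ('789', '012')
```

Note that this pattern needs at least one non-digit before the first number and at least one character after the second. If the string ends right after the second number, the trailing `.+` takes that number's last digit, so "a 12 b 34" yields ('12', '3'). Both sample strings end with a closing bracket, so they are not affected.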

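We can use \\b as a word boundary, which gives \\b\\d+\\b (the backslashes are doubled here, as they would be inside a string literal in a language such as Java; the regex itself is \b\d+\b).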
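
To show what the word boundaries change, here is a small sketch assuming Python, where a raw string avoids the doubled backslashes (`standalone_numbers` is a hypothetical helper name):

```python
import re

def standalone_numbers(text):
    """Return digit runs that are not attached to letters, digits, or underscores."""
    # \b requires a boundary between a word character and a non-word character,
    # so digits embedded in a token such as "item42" are skipped.
    return re.findall(r"\b\d+\b", text)

print(standalone_numbers("item42 costs 15 dollars"))                    # ['15']
print(standalone_numbers("April ( 123 widgets less 456 sprockets )"))   # ['123', '456']
```

For the strings in the question the result is the same as with plain `\d+`; the boundaries only matter when digits can appear inside longer identifiers.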

On BigQuery, make sure to put the 'r' (raw string) prefix before the expression:

REGEXP_EXTRACT(my_string, r'\d+')

This extracts the first run of digits from a string column. To get every run of digits, use REGEXP_EXTRACT_ALL, which returns an array of all matches.
