
Regex for getting all digits in a string after a character

I am trying to parse the following string and return all digits after the last square bracket:

C9: Title of object (foo, bar) [ch1, CH12,c03,4]

So the result should be:

1,12,03,4

The string and digits will change. The important thing is to get the digits after the '[' regardless of what characters (if any) precede it. (I need this in Python, so no atomic groups either!) I have tried everything I can think of, including:

 \[.*?(\d) = matches '1' only
 \[.*(\d) = matches '4' only
 \[*?(\d) = matches also include the '9' from the beginning

etc.

Any help is greatly appreciated!

EDIT: I also need to do this without using str.split().

Instead, you can find all the digits in the substring after the last '[' bracket:

>>> import re
>>> s = 'C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'
>>> # Get the substring after the last '['.
>>> target_string = s.rsplit('[', 1)[1]
>>>
>>> re.findall(r'\d+', target_string)
['1', '12', '03', '4']
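
If the input might contain no '[' at all, s.rsplit('[', 1)[1] raises an IndexError. A minimal sketch of a guard follows. The helper name digits_after_last_bracket is hypothetical (it is not from the original answer), and returning an empty list when there is no bracket is an assumption about the desired behaviour. It uses str.rpartition, so it also avoids split:

```python
import re

def digits_after_last_bracket(s):
    """Return all digit runs found after the last '[' in s (hypothetical helper)."""
    # rpartition returns (head, sep, tail); sep is '' when '[' is absent.
    _, sep, tail = s.rpartition('[')
    if not sep:
        return []  # assumption: no bracket means nothing to extract
    return re.findall(r'\d+', tail)

print(digits_after_last_bracket('C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'))
# ['1', '12', '03', '4']
print(digits_after_last_bracket('C9: no brackets here'))
# []
```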

If you can't use split, then this one works with a look-ahead assertion:

>>> s = 'C9: Title of object (fo[ 123o, bar) [ch1, CH12,c03,4]'
>>> re.findall(r'\d+(?=[^[]+$)', s)
['1', '12', '03', '4']

This finds every run of digits that is followed only by non-'[' characters until the end of the string.
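
One edge case is worth sketching, assuming the closing ']' can be missing. Because [^[]+ requires at least one character after the digits, a number at the very end of the string gets truncated. Using * in place of + avoids that. The names PLUS, STAR and truncated below are for illustration only:

```python
import re

PLUS = re.compile(r'\d+(?=[^[]+$)')  # pattern from the answer
STAR = re.compile(r'\d+(?=[^[]*$)')  # variant: allows nothing after the digits

s = 'C9: Title (fo[ 123o, bar) [ch1, CH12,c03,4]'
truncated = 'C9: Title [ch1, CH12'  # assumed input with no closing bracket

print(PLUS.findall(s))          # ['1', '12', '03', '4']
print(STAR.findall(s))          # ['1', '12', '03', '4']
print(PLUS.findall(truncated))  # ['1', '1']   (the trailing '12' is cut short)
print(STAR.findall(truncated))  # ['1', '12']
```

For strings that always end with ']', as in the question, both patterns give the same result.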

It may help to use the non-greedy quantifier ?. For example:

\[.*?(\d*?),.*?(\d*?),.*?(\d*?),.*?(\d*?)\]

And here's how it works (from https://regex101.com/r/jP7hM3/1 ):

"\[.*?(\d*?),.*?(\d*?),.*?(\d*?),.*?(\d*?)\]"
\[ matches the character [ literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
1st Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
, matches the character , literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
2nd Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
, matches the character , literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
3rd Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
, matches the character , literally
.*? matches any character (except newline)
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
4th Capturing group (\d*?)
\d*? match a digit [0-9]
Quantifier: *? Between zero and unlimited times, as few times as possible, expanding as needed [lazy]
\] matches the character ] literally
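
For reference, here is a sketch of how this pattern could be applied with Python's re.search. It assumes exactly four comma-separated items inside the brackets. It also anchors on the first '[' in the string rather than the last, so it only suits inputs shaped like the question's original example. The name PATTERN is for illustration only:

```python
import re

PATTERN = re.compile(r'\[.*?(\d*?),.*?(\d*?),.*?(\d*?),.*?(\d*?)\]')

s = 'C9: Title of object (foo, bar) [ch1, CH12,c03,4]'
match = PATTERN.search(s)
if match:
    # Each capturing group holds the digits just before a ',' or the final ']'.
    print(match.groups())  # ('1', '12', '03', '4')

# With only three items there are too few commas, so nothing matches.
print(PATTERN.search('[ch1, CH12,c03]'))  # None
```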

Although I have to agree with the others: this is a regex solution, but it's not a very Pythonic one.
