
Regex: How to capture a sequence of 6-12 digits that may be separated by spaces without capturing any trailing space

I am attempting to capture a sequence of 6-12 digits that may be separated by spaces, like the ones below. (The letter D at the end is just an example. The string may end right after the digits, or the digits may be followed by punctuation or a letter.)

123 345 4567 89 D
123 345456789 D

My current attempts are as follows:

Attempt 1: with the lazy quantifier *?:

"\b(?:\d *?){6,12}\b"

With this, the regex successfully returns all the digits in the string 123 345456789 D, but fails to fully capture the digits in 123 345 4567 89 D (only the first two groups are captured). I assume this is because the first two groups of digits (i.e., 123 345) already fulfill the minimum requirement of 6 digits and, because of the lazy quantifier, the regex stops once that minimum is fulfilled.
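The behavior described for attempt 1 can be reproduced with a short Python sketch. The helper name `first_match` is made up for illustration, and the pattern is written as a raw string so that `\b` is read as a word boundary rather than a backspace character.

```python
import re

# Pattern from attempt 1: the quantifier on the space is lazy (*?).
LAZY = re.compile(r"\b(?:\d *?){6,12}\b")

def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None

# Stops after six digits, because the lazy space is not consumed.
print(repr(first_match(LAZY, "123 345 4567 89 D")))  # '123 345'
# The only space comes before the sixth digit, so all 12 digits match.
print(repr(first_match(LAZY, "123 345456789 D")))    # '123 345456789'
```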

Attempt 2: without the lazy quantifier (just using *):

"\b(?:\d *){6,12}\b"

With this, all the groups of digits in the examples above are captured. However, this regex also captures the trailing space between the last digit and the letter D.
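A minimal sketch of what attempt 2 returns on the two example strings, using `repr` so the trailing space is visible. The helper name `first_match` is hypothetical.

```python
import re

# Pattern from attempt 2: the quantifier on the space is greedy (*).
GREEDY = re.compile(r"\b(?:\d *){6,12}\b")

def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None

# The closing \b is satisfied between the space and the letter D,
# so the space after the last digit ends up inside the match.
print(repr(first_match(GREEDY, "123 345 4567 89 D")))  # '123 345 4567 89 '
print(repr(first_match(GREEDY, "123 345456789 D")))    # '123 345456789 '
```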

So I wonder if there is a way to capture all the digits without including the trailing space. I am doing this in Python, so one thought was to use the second regex and strip any trailing space after a match is returned, but that seems really inelegant.
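For reference, the strip-afterwards workaround mentioned above might look like the following sketch. The function name `digit_runs` is an assumption made for this example.

```python
import re

# Greedy pattern from attempt 2; may leave a space at the end of a match.
GREEDY = re.compile(r"\b(?:\d *){6,12}\b")

def digit_runs(text):
    """Find greedy matches, then drop whitespace left at the end of each."""
    return [m.group(0).rstrip() for m in GREEDY.finditer(text)]

print(digit_runs("123 345 4567 89 D"))  # ['123 345 4567 89']
```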

This will do it: ((?:\d\s*){5,11}\d?)

See: https://regex101.com/r/qcRbip/1
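A quick check of this pattern in Python, as a sketch. Both 12-digit examples from the question come back without a trailing space. Note that the final \d? is optional, so an input with fewer than 12 digits can still end in whitespace. The string "123 456 D" below is a made-up input, not one from the question.

```python
import re

# Pattern from the answer above, written as a raw string.
ANSWER = re.compile(r"((?:\d\s*){5,11}\d?)")

def first_group(text):
    """Return capture group 1 of the first match, or None."""
    m = ANSWER.search(text)
    return m.group(1) if m else None

# 12 digits: the group repeats 11 times and \d? takes the last digit.
print(repr(first_group("123 345 4567 89 D")))  # '123 345 4567 89'
print(repr(first_group("123 345456789 D")))    # '123 345456789'
# 6 digits: \d? matches nothing, and \s* has already consumed the space.
print(repr(first_group("123 456 D")))          # '123 456 '
```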

The quantifier in (?:\d *) is greedy, so it matches a space whenever one is present, including the space after the last digit.

In (?:\d *?) the quantifier for the space is non-greedy, so once the minimum of 6 repetitions is reached and the next character is a space, the match ends there.

\b\d(?: *\d){5,11}\b
  • \b A word boundary
  • \d Match the first digit
  • (?: *\d){5,11} Repeat 5 to 11 times: optional spaces followed by a digit
  • \b A word boundary

Regex demo
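In Python, the pattern above could be used as in the following sketch. The function name `find_digit_runs` and the inputs "123 456." and "12345" are assumptions added for illustration. Because every repetition ends with a digit, a match can never end in a space.

```python
import re

# One leading digit, then 5 to 11 repetitions of "optional spaces + digit",
# for a total of 6 to 12 digits.
DIGIT_RUN = re.compile(r"\b\d(?: *\d){5,11}\b")

def find_digit_runs(text):
    """Return every 6-12 digit run in text; spaces may separate the digits."""
    return DIGIT_RUN.findall(text)

print(find_digit_runs("123 345 4567 89 D"))  # ['123 345 4567 89']
print(find_digit_runs("123 345456789 D"))    # ['123 345456789']
print(find_digit_runs("123 456."))           # ['123 456']
print(find_digit_runs("12345"))              # [] (only five digits)
```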

I cannot reproduce your problem. Your attempt 2 works fine for me. Here is my code:

import re

s = "123 345 4567 89 D"
# Note: this is the lazy (*?) pattern from attempt 1, without the \b anchors
re.findall(r"(?:\d *?){6,12}", s)

['123 345', '4567 89']


d = "123 345456789 D"
re.findall(r"(?:\d *?){6,12}", d)

['123 345456789']

"\b\d(?: *\d){5,11}\b"

