Regex case: more than one group with variable spaces

Question

I'm new to regex, but things seem to be going my way.

See https://regex101.com/r/Is8wZK/1: group 8 might contain more than one word, separated by spaces, but as you can see, so does group 5, and I've already used up my one (.+).

How can I rewrite my regex to detect group 8 in exactly the way group 5 is detected?

Answer 1

^(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+((?:[[:alpha:]]+)(?:\s+[[:alpha:]]+)*)\s+(\S+)\s+(\S+)\s+((?:[[:alpha:]]+)(?:\s+[[:alpha:]]+)*)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)$

Link: https://regex101.com/r/v4mEJK/1

Pretty much all you need to do is match a run of alphabetic characters followed by zero or more repetitions of whitespace plus alphabetic characters, which captures names that may or may not have more than one word. This is done by using

((?:[[:alpha:]]+)(?:\s+[[:alpha:]]+)*)

for groups 5 and 8.
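
As a rough sketch of how this pattern behaves, here is the same idea in Python. Two assumptions: Python's `re` module does not support the POSIX class `[[:alpha:]]`, so `[A-Za-z]` stands in for it (ASCII letters only), and the sample line is made up for illustration, since the real test data lives only on regex101.

```python
import re

# Assumption: [A-Za-z] replaces [[:alpha:]], which Python's re does not support.
WORDS = r"((?:[A-Za-z]+)(?:\s+[A-Za-z]+)*)"  # one word, then zero or more " word"
TOKEN = r"(\S+)"                             # any single non-whitespace token

# Same 12-group layout as the answer: groups 5 and 8 are the multi-word ones.
PATTERN = re.compile(
    "^" + r"\s+".join([TOKEN] * 4 + [WORDS] + [TOKEN] * 2 + [WORDS] + [TOKEN] * 4) + "$"
)

# Hypothetical input line, not taken from the original test case.
line = "05/17/2018 A123 10 20 John Smith 12ABC34 7 Blue Widget X 1.50 2.25 3.00"
match = PATTERN.match(line)
print(match.group(5))  # John Smith
print(match.group(8))  # Blue Widget
```

Group 8 first grabs "Blue Widget X" greedily, then backtracks to "Blue Widget" because four more tokens still have to match after it.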

The rest of the regex could possibly be made more specific, but there isn't really any need to add more complexity unless your input text is significantly more complex than your test case.

FWIW: It's far better to use \s+ instead of a raw space between groups, so that other delimiting whitespace (tabs, repeated spaces) also matches.
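
To illustrate the difference, a small Python sketch follows. The two-number pattern and the tab-separated input are hypothetical and much simpler than the real regex.

```python
import re

SPACE_ONLY = re.compile(r"^(\d+) (\d+)$")   # delimiter must be exactly one space
ANY_SPACE = re.compile(r"^(\d+)\s+(\d+)$")  # delimiter is any run of whitespace

tabbed = "10\t20"  # hypothetical tab-separated input

print(SPACE_ONLY.match(tabbed))          # None
print(ANY_SPACE.match(tabbed).groups())  # ('10', '20')
```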

Answer 2

I redid your generic capture groups into this:

^(\d+\/\d+\/\d+) ([A-Z]\d+) (\d+) (\d+) (.+) (\d+[A-Z]{3}\d+) (\d+) (.+) ([A-Z]) (\d+\.\d+) (\d+\.\d+) (\d+\.\d+)$

Breaking that down:

(\d+\/\d+\/\d+) : this matches the date
([A-Z]\d+) : this matches a capital letter followed by some digits
(\d+) : this matches a number
(\d+) : this matches a number
(.+) : this is the first general group
(\d+[A-Z]{3}\d+) : this matches one or more digits, followed by 3 capital letters, followed by one or more digits
(\d+) : this matches a number
(.+) : this is the second general group
([A-Z]) : this matches a single capital letter
(\d+\.\d+) : this matches a number with a decimal point
(\d+\.\d+) : this matches a number with a decimal point
(\d+\.\d+) : this matches a number with a decimal point

This should help you get what you want.
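
A quick way to try the pattern above outside regex101 is a short Python sketch like the one below. The sample line is hypothetical, invented only to fit the shape the breakdown describes. It shows why two (.+) groups can coexist: the specific groups around them pin down where each one has to stop.

```python
import re

# The 12-group pattern from this answer, split across lines for readability.
PATTERN = re.compile(
    r"^(\d+\/\d+\/\d+) ([A-Z]\d+) (\d+) (\d+) (.+) "
    r"(\d+[A-Z]{3}\d+) (\d+) (.+) ([A-Z]) "
    r"(\d+\.\d+) (\d+\.\d+) (\d+\.\d+)$"
)

# Hypothetical input line, not taken from the original test case.
line = "05/17/2018 A123 10 20 John Smith 12ABC34 7 Blue Widget X 1.50 2.25 3.00"
match = PATTERN.match(line)
print(match.group(5))  # John Smith
print(match.group(8))  # Blue Widget
```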

If you are only interested in groups 5 and 8, try non-capturing groups:

^(?:\d+\/\d+\/\d+) (?:[A-Z]\d+) (?:\d+) (?:\d+) (.+) (?:\d+[A-Z]{3}\d+) (?:\d+) (.+) (?:[A-Z]) (?:\d+\.\d+) (?:\d+\.\d+) (?:\d+\.\d+)$

Or only group what you need:

^\d+\/\d+\/\d+ [A-Z]\d+ \d+ \d+ (.+) \d+[A-Z]{3}\d+ \d+ (.+) [A-Z] \d+\.\d+ \d+\.\d+ \d+\.\d+$
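
As a sketch of what these two variants change in practice, the Python snippet below runs both against a hypothetical sample line (invented for illustration). In both cases the two name fields become groups 1 and 2 instead of 5 and 8.

```python
import re

# Variant with non-capturing groups around the fields you don't need.
NON_CAPTURING = re.compile(
    r"^(?:\d+\/\d+\/\d+) (?:[A-Z]\d+) (?:\d+) (?:\d+) (.+) "
    r"(?:\d+[A-Z]{3}\d+) (?:\d+) (.+) (?:[A-Z]) "
    r"(?:\d+\.\d+) (?:\d+\.\d+) (?:\d+\.\d+)$"
)

# Variant that groups only the two name fields.
UNGROUPED = re.compile(
    r"^\d+\/\d+\/\d+ [A-Z]\d+ \d+ \d+ (.+) "
    r"\d+[A-Z]{3}\d+ \d+ (.+) [A-Z] "
    r"\d+\.\d+ \d+\.\d+ \d+\.\d+$"
)

# Hypothetical input line, not taken from the original test case.
line = "05/17/2018 A123 10 20 John Smith 12ABC34 7 Blue Widget X 1.50 2.25 3.00"

print(NON_CAPTURING.match(line).groups())  # ('John Smith', 'Blue Widget')
print(UNGROUPED.match(line).groups())      # ('John Smith', 'Blue Widget')
```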

2 answers

Answer 1
2 · Accepted · 2018-05-17 23:36:25

Answer 2
1 · 2018-05-17 22:48:54
