How do I repeat a pattern in a Python regular expression?

Question

I'm writing a Python regex and have a working expression:

\n(?P<curve>\w+)(?:.+)(?P<unit>\.\S*)(?:\s+.\s+)(?P<desc>:.+)|\n(?P<curve2>\w+)(?:.+)(?P<unit2>\.\S*)|\n(?P<curve3>\w+)

I would like to know whether I can reuse the pattern from the first alternative, because I would rather not define a separate "curve" or "unit" group (curve2, unit2, curve3) for each case.

My test data is as follows:

#-------------
MD              
BMK_STA            .Mpsi                                   : Modulus
FANG        .                                   : Friction Angle
PR             .unitless                               :  
RHO           .g/cm3

The idea is to have MD and RHO captured in the "curve" group as well.
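
To make the problem concrete, here is a minimal Python 3 sketch that runs the original expression over the test data and records which named groups took part in each match. It assumes the data is exactly as pasted above, including the trailing spaces on the MD and PR lines; the names text, ORIGINAL and hits are illustrative only.

```python
import re

# Test data from the question; trailing spaces are assumed to be as pasted.
text = "\n".join([
    "#-------------",
    "MD              ",
    "BMK_STA            .Mpsi                                   : Modulus",
    "FANG        .                                   : Friction Angle",
    "PR             .unitless                               :  ",
    "RHO           .g/cm3",
])

# The original expression, split across lines only for readability.
ORIGINAL = re.compile(
    r"\n(?P<curve>\w+)(?:.+)(?P<unit>\.\S*)(?:\s+.\s+)(?P<desc>:.+)"
    r"|\n(?P<curve2>\w+)(?:.+)(?P<unit2>\.\S*)"
    r"|\n(?P<curve3>\w+)"
)

hits = []
for m in ORIGINAL.finditer(text):
    # Keep only the groups that actually took part in this match.
    hits.append({k: v for k, v in m.groupdict().items() if v is not None})

for hit in hits:
    print(hit)
```

With this data, MD is captured by curve3 and RHO by curve2, while the other three lines use curve. That scattering across differently named groups is what the question wants to avoid.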

Answer 1

I am not entirely sure what you mean, but the following may help:

If you want to find every match for a pattern, you can use re.findall(pattern, string).

It returns a list of the matches.

See the re module docs.
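
A minimal sketch of that suggestion, assuming the test data from the question and a deliberately simplified pattern, \n(\w+), which captures only the curve name and is not the asker's full expression. When a pattern has exactly one capturing group, re.findall returns the text of that group for each match; with several groups it returns a tuple per match.

```python
import re

# Test data from the question; trailing spaces are assumed to be as pasted.
text = "\n".join([
    "#-------------",
    "MD              ",
    "BMK_STA            .Mpsi                                   : Modulus",
    "FANG        .                                   : Friction Angle",
    "PR             .unitless                               :  ",
    "RHO           .g/cm3",
])

# Simplified pattern (an assumption for this sketch): a newline
# followed by a run of word characters, captured as the curve name.
curves = re.findall(r"\n(\w+)", text)
print(curves)  # ['MD', 'BMK_STA', 'FANG', 'PR', 'RHO']
```

The header line is skipped because "#" is not a word character, so nothing after the start of the string matches until the first newline.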

Answer 2

Regexes have no special syntax for avoiding this kind of repetition, so in the general case a certain amount of repetition is unavoidable. In your specific case, however, you should be able to solve the problem with optional groups:

\n(?P<curve>\w+)((?:.+)(?P<unit>\.\S*)((?:\s+.\s+)(?P<desc>:.+))?)?

It is probably better written in verbose mode (re.VERBOSE) as:

\n(?P<curve>\w+)
(
    .+
    (?P<unit>\.\S*)
    (
        \s+.\s+
        (?P<desc>:.+)
    )?
)?

This makes the group nesting easier to read. I have also removed the ?: groups, since they serve no purpose in this case.
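
As a quick check, the sketch below compiles both spellings and confirms that they find the same named groups on a two-line sample. The sample string and the variable names are illustrative; re.VERBOSE is the standard flag that makes the re module ignore whitespace inside the pattern.

```python
import re

# One-line form of the optional-group pattern.
COMPACT = re.compile(
    r"\n(?P<curve>\w+)((?:.+)(?P<unit>\.\S*)((?:\s+.\s+)(?P<desc>:.+))?)?"
)

# Same pattern in verbose mode: layout whitespace is ignored.
VERBOSE = re.compile(r"""
    \n(?P<curve>\w+)
    (
        .+
        (?P<unit>\.\S*)
        (
            \s+.\s+
            (?P<desc>:.+)
        )?
    )?
""", re.VERBOSE)

# Shortened sample (an assumption): one line without a unit, one with everything.
sample = "#-------------\nMD              \nBMK_STA            .Mpsi          : Modulus"

compact_rows = [m.groupdict() for m in COMPACT.finditer(sample)]
verbose_rows = [m.groupdict() for m in VERBOSE.finditer(sample)]

for row in verbose_rows:
    print(row)
print(compact_rows == verbose_rows)
```

For the MD line the optional groups are skipped, so unit and desc come back as None while curve is still filled in.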

Answer 3

Assuming your regex is correct, use the finditer() method to iterate over all the matches.

Example:

import re

for m in re.finditer(r'REGEX_GOES_HERE', text):
    print(m.group('curve'))
    print(m.group('unit'))

This way you get all the matches, and their named groups stay intact, as you wanted.
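
Here is a sketch of the same loop with a concrete pattern filled in. The placeholder REGEX_GOES_HERE is not specified in the answer, so using the optional-group expression from Answer 2 is an assumption, as are the names PATTERN and rows and the exact trailing spaces in the test data.

```python
import re

# Test data from the question; trailing spaces are assumed to be as pasted.
text = "\n".join([
    "#-------------",
    "MD              ",
    "BMK_STA            .Mpsi                                   : Modulus",
    "FANG        .                                   : Friction Angle",
    "PR             .unitless                               :  ",
    "RHO           .g/cm3",
])

# Optional-group pattern (assumed here in place of REGEX_GOES_HERE).
PATTERN = re.compile(
    r"\n(?P<curve>\w+)(.+(?P<unit>\.\S*)(\s+.\s+(?P<desc>:.+))?)?"
)

rows = []
for m in PATTERN.finditer(text):
    # Optional groups that did not take part in the match are None.
    rows.append((m.group("curve"), m.group("unit"), m.group("desc")))

for row in rows:
    print(row)
```

With this pattern every curve name, including MD and RHO, arrives in the single curve group; unit and desc are None on lines that lack them.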

3 answers

Answer 1
0 2014-03-20 20:58:53

Answer 2
0 Accepted 2014-03-20 21:28:53

Answer 3
0 2014-03-20 21:37:07
