Python regular expression grouping

Question

My regular expression goal:

"If the sentence has a '#' in it, group everything to the left of the '#' and group everything to the right of the '#'. If the sentence doesn't have a '#', then just return the entire sentence as one group."

Examples of the two cases:

A) '120x4#Words' -> ('120x4', 'Words')
B) '120x4@9.5' -> ('120x4@9.5')

I made a regular expression that parses case A correctly:

(.*)(?:#(.*))

# List the groups found
>>> r.groups()
(u'120x4', u'words')

But of course this won't work for case B: I need to make "# and everything to the right of it" optional.

So I tried to use the '?' ("zero or one") operator on that second group to indicate it's optional:
(.*)(?:#(.*))?

But it gives me bad results. The first group eats up the entire string.

# List the groups found
>>> r.groups()
(u'120x4#words', None)

I guess I'm either misunderstanding the zero-or-one '?' operator and how it works on groups, or I'm misunderstanding how the first group acts greedy and grabs the entire string. I did try to make the first group 'reluctant', but that gave me a total no-match:

(.*?)(?:#(.*))?


# List the groups found
>>> r.groups()
(u'', None)

Answer 1

Simply use the standard str.split function:

s = '120x4#Words'
x = s.split('#')  # ['120x4', 'Words']

If you still want a regex solution, use the following pattern:

([^#]+)(?:#(.*))?
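
To show how this pattern behaves on the two cases from the question, here is a minimal sketch assuming Python 3 and re.match. The helper name split_on_hash is made up for illustration and is not part of the answer. Because the first group can only consume non-'#' characters, it cannot swallow the '#', so the optional group gets a chance to match.

```python
import re

# Pattern from the answer above: a run of non-'#' characters, then an
# optional '#' followed by the rest of the string.
PATTERN = re.compile(r'([^#]+)(?:#(.*))?')


def split_on_hash(text):
    """Hypothetical helper: return the pattern's groups, or None if no match."""
    match = PATTERN.match(text)
    if match is None:
        return None
    return match.groups()


print(split_on_hash('120x4#Words'))  # ('120x4', 'Words')
print(split_on_hash('120x4@9.5'))    # ('120x4@9.5', None)
```

Note that the second group comes back as None when there is no '#', and that a string starting with '#' does not match at all, since [^#]+ needs at least one character.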

Answer 2

Use re.split:

>>> import re
>>> a='120x4#Words'
>>> re.split('#',a)
['120x4', 'Words']
>>> b='120x4@9.5'
>>> re.split('#',b)
['120x4@9.5']
>>>

Answer 3

(.*?)#(.*)|(.+)

This should work. See the demo:

http://regex101.com/r/oC3nN4/14
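
One thing to be aware of with this alternation is that it always defines three groups, and which ones are filled depends on the branch that matched. A rough sketch of how one might unpack that in Python 3 (the helper name split_on_hash is an assumption for illustration, not something from the answer):

```python
import re

# Pattern from the answer above: either "left#right", or the whole string.
PATTERN = re.compile(r'(.*?)#(.*)|(.+)')


def split_on_hash(text):
    """Hypothetical helper: normalise the three groups into a 1- or 2-tuple."""
    match = PATTERN.match(text)
    if match is None:
        return None
    left, right, whole = match.groups()
    if whole is not None:
        # Second branch matched: there was no '#' in the text.
        return (whole,)
    return (left, right)


print(PATTERN.match('120x4#Words').groups())  # ('120x4', 'Words', None)
print(PATTERN.match('120x4@9.5').groups())    # (None, None, '120x4@9.5')
print(split_on_hash('120x4#Words'))           # ('120x4', 'Words')
print(split_on_hash('120x4@9.5'))             # ('120x4@9.5',)
```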

Answer 4

Here's a verbose re solution. But you're better off using str.split.

import re

REGEX = re.compile(r'''
    \A
    (?P<left>.*?)
    (?:
        [#]
        (?P<right>.*)
    )?
    \Z
''', re.VERBOSE)


def parse(text):
    match = REGEX.match(text)
    if match:
        return tuple(filter(None, match.groups()))

print(parse('120x4#Words'))
print(parse('120x4@9.5'))
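
A note on why the end anchor matters, as a small sketch assuming Python 3: re.match only pins the start of the string, so a lazy (.*?) followed by an optional group can succeed by matching nothing at all. That is the ('', None) result from the question. Adding \Z forces the engine to keep extending the lazy group until the whole string is consumed.

```python
import re

text = '120x4#Words'

# Without an end anchor the lazy group stops immediately and the
# optional group is skipped, so the match is empty.
unanchored = re.match(r'(.*?)(?:#(.*))?', text)
print(unanchored.groups())  # ('', None)

# With \Z the match must reach the end of the string, so the lazy
# group grows up to the first '#' and the optional group takes the rest.
anchored = re.match(r'(.*?)(?:#(.*))?\Z', text)
print(anchored.groups())  # ('120x4', 'Words')

# The same anchored pattern still handles the case without '#'.
no_hash = re.match(r'(.*?)(?:#(.*))?\Z', '120x4@9.5')
print(no_hash.groups())  # ('120x4@9.5', None)
```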

Better solution

def parse(text):
    return text.split('#', maxsplit=1)

print(parse('120x4#Words'))
print(parse('120x4@9.5'))
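
For what it's worth, the split limit only makes a difference when the text contains more than one '#'. A quick sketch, assuming Python 3; parse_tuple is a made-up name, and wrapping the result in tuple() is just to mirror the tuples shown in the question:

```python
def parse_tuple(text):
    """Hypothetical variant of parse() that returns a tuple."""
    # A limit of 1 split keeps everything after the first '#' together.
    return tuple(text.split('#', 1))


print(parse_tuple('120x4#Words'))  # ('120x4', 'Words')
print(parse_tuple('120x4@9.5'))    # ('120x4@9.5',)
print(parse_tuple('a#b#c'))        # ('a', 'b#c')
print('a#b#c'.split('#'))          # ['a', 'b', 'c']
```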


Answer 1
3 2014-09-07 14:32:02

Answer 2
1 2014-09-07 14:29:54

Answer 3
1 accepted 2014-09-07 14:30:04

Answer 4
1 2014-09-07 14:42:25
