Regex pattern to match substring
I would like to find the following pattern in a string:
word-word-word++
or -word-word-word++
so that it repeats the -word or word- pattern until the end of the substring.
The string is quite large and contains many words with the patterns above. The following has been tried:
p = re.compile('(?:\w+\-)*\w+\s+=', re.IGNORECASE)
result = p.match(data)
but it returns None. Does anyone know the answer?
Your regex will only match the first pattern. match() only looks at the start of the string and finds at most one occurrence, and the pattern only matches when the words are immediately followed by some whitespace and an equals sign.
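To make the difference concrete, here is a small sketch using the pattern from the question. The sample string is made up for illustration and is not the asker's data; it only shows that match() gives up when the pattern does not fit at position 0, while search() scans forward.

```python
import re

# Hypothetical sample input, not taken from the question.
sample = "intro text foo-bar-baz = 1 and more"

# The pattern from the question, unchanged apart from the raw-string prefix.
original = re.compile(r'(?:\w+\-)*\w+\s+=', re.IGNORECASE)

# match() only tries at position 0: "intro" is followed by " t", not " =".
at_start = original.match(sample)

# search() scans forward and stops at the first place where the pattern fits.
anywhere = original.search(sample)

print(at_start)           # None
print(anywhere.group(0))  # foo-bar-baz =
```

Even with search(), only the first occurrence comes back, and the trailing whitespace and equals sign are still required.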
Also, in your example you implied you wanted three or more words, so here's a version that was changed in the following ways:
It matches both patterns, thanks to the optional leading -?
It matches only if the pattern has at least three words, using {2,} instead of *
It matches even if nothing follows the pattern (\b matches a word boundary. It is not really necessary here, since the preceding \w+ guarantees we are at a word boundary anyway)
Here's the code:
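#!/usr/bin/env python3
import re

data = r"foo-bar-baz not-this -this-neither nope double-dash--so-nope -yeah-this-even-at-end-of-string"
# optional leading "-", at least two "word-" groups, then a final word
p = re.compile(r'-?(?:\w+-){2,}\w+\b', re.IGNORECASE)
# findall() returns every non-overlapping match, not just the first
print(p.findall(data))
# prints ['foo-bar-baz', '-yeah-this-even-at-end-of-string']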
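Since the question mentions that the string is large, it may help to know where each match sits, not only its text. The sketch below is one possible way to do that with finditer(). The helper name find_hyphenated and the sample input are made up for illustration. The IGNORECASE flag is left out because \w already covers both upper and lower case.

```python
import re

# Same pattern as in the answer; IGNORECASE omitted since \w matches both cases.
PATTERN = re.compile(r'-?(?:\w+-){2,}\w+\b')

def find_hyphenated(text):
    """Return (start, end, matched_text) for each run of three or more hyphen-joined words."""
    return [(m.start(), m.end(), m.group(0)) for m in PATTERN.finditer(text)]

# Hypothetical sample input for illustration.
hits = find_hyphenated("a-b-c and -x-y-z but not p-q")
print(hits)  # [(0, 5, 'a-b-c'), (10, 16, '-x-y-z')]
```

finditer() yields match objects lazily, so for very large inputs you can loop over PATTERN.finditer(text) directly instead of building the whole list.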