Regular expression: how to match all occurrences

Question

I'm trying to get everything from a webpage up until the second occurrence of the word matchdate.

(.*?matchdate){2} is what I'm trying, but that's not doing the trick. The page has 14+ matches of "matchdate" and I only want to get everything up to the second one, and then nothing else.

https://regex101.com/r/Cjyo0f/1 <--- my saved regex.

What am I missing here?

Thanks.

Answer 1

There are a couple of ways you can do this:

If you can, remove the `g` flag

Without the global flag, the regex engine returns only the first match it finds.

https://regex101.com/r/Cjyo0f/2
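For readers working in Python rather than regex101, the effect of the `g` flag can be sketched roughly as the difference between `re.finditer` (keeps scanning) and `re.search` (stops at the first match). The sample text below is made up for illustration, and a non-capturing group is used in place of the original capturing one.

```python
import re

# Hypothetical sample text; "matchdate" appears four times.
text = "a matchdate b matchdate c matchdate d matchdate e"
pattern = re.compile(r"(?:.*?matchdate){2}")

# Rough analogue of the `g` flag: finditer keeps scanning after each match.
all_matches = [m.group() for m in pattern.finditer(text)]

# Rough analogue of no `g` flag: search stops at the first match.
first_match = pattern.search(text).group()

print(all_matches)   # ['a matchdate b matchdate', ' c matchdate d matchdate']
print(first_match)   # a matchdate b matchdate
```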

Add a `^` to the front of the regex

A caret forces the match to start at the beginning of the string, which rules out every later match. With the multiline flag set, `^` also matches at the start of each line, so this only works with that flag off.

https://regex101.com/r/Cjyo0f/3
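A small Python sketch of the anchoring idea, using a made-up sample string: even when every match is requested, the anchored pattern can only succeed at index 0, so a single result comes back.

```python
import re

# Hypothetical sample text; "matchdate" appears four times.
text = "a matchdate b matchdate c matchdate d matchdate e"

# Without the anchor, a global scan finds a second pair further along.
unanchored = [m.group() for m in re.finditer(r"(?:.*?matchdate){2}", text)]

# With `^` (and no re.MULTILINE flag) a match can only begin at index 0.
anchored = [m.group() for m in re.finditer(r"^(?:.*?matchdate){2}", text)]

print(len(unanchored))  # 2
print(anchored)         # ['a matchdate b matchdate']
```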

If Python is available, use `.split()` and `.join()`

If plain Python is available, I would recommend:

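string = "I like to matchdate, I want to eat matchdate for breakfast"
# Split on the word, keep the first two pieces, and rejoin them.
# The result stops just before the second "matchdate".
print("matchdate".join(string.split("matchdate")[:2]))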
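Note that the split/join one-liner above returns the text up to, but not including, the second "matchdate", whereas the regex in the question includes it. A possible generalization is sketched below; the helper name `up_to_nth` and the choice to return `None` when there are too few occurrences are assumptions made for illustration, not part of the original answer.

```python
def up_to_nth(text, word, n):
    """Return text up to and including the n-th occurrence of word."""
    parts = text.split(word)
    if len(parts) <= n:
        # Fewer than n occurrences: this sketch returns None (an assumption).
        return None
    # Rejoin the first n pieces and append the n-th occurrence itself.
    return word.join(parts[:n]) + word


sample = "I like to matchdate, I want to eat matchdate for breakfast"
print(up_to_nth(sample, "matchdate", 2))
# I like to matchdate, I want to eat matchdate
```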

Answer 2

You almost had it! (.*?matchdate){2} was actually correct. It just needs the re.DOTALL flag so that the dot matches newlines as well as other characters.

Here is a working test:

>>> import re

>>> s = '''First line
Second line
Third with matchdate and more
Fourth line
Fifth with matchdate and other
stuff you're
not interested in
like another matchdate
or a matchdate redux.
'''

>>> print(re.search('(.*?matchdate){2}', s, re.DOTALL).group())
First line
Second line
Third with matchdate and more
Fourth line
Fifth with matchdate
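To see why the flag matters, the following sketch runs the same pattern with and without it on a shorter, made-up string in which no single line contains "matchdate" twice. It also shows the inline `(?s)` form, which is equivalent to passing re.DOTALL and may be handy where flags cannot be passed separately.

```python
import re

# Hypothetical multi-line input; each line has at most one "matchdate".
s = "one\ntwo matchdate x\nthree matchdate y\nfour matchdate z\n"
pattern = r"(?:.*?matchdate){2}"

# Without DOTALL the dot cannot cross a newline, so nothing matches.
without_flag = re.search(pattern, s)

# With DOTALL the dot also matches "\n", so the match spans lines.
with_flag = re.search(pattern, s, re.DOTALL)

# The inline (?s) prefix turns on the same behaviour inside the pattern.
inline_flag = re.search(r"(?s)" + pattern, s)

print(without_flag)                  # None
print(repr(with_flag.group()))       # 'one\ntwo matchdate x\nthree matchdate'
print(inline_flag.group() == with_flag.group())  # True
```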
