Python 3.7: RegEx for a string between multi-line strings?

Question

I want to find 30,850 in the following:

  <div class='user-information__achievements-heading' data-test-points-title>
    Points
    </div>
    <div class='user-information__achievements-data' data-test-points-count>
    30,850
    </div>
    </div>

with:

^(?!<div class='user-information__achievements-data' data-test-points-count>
|<.div>)(.*)$

(which returns nothing)

Why does ^(?!START\-OF\-FIELDS|END\-OF\-FIELDS)(.*)$ work for:

START-OF-FIELDS
<div>
Line A
END-OF-FIELDS

(where it returns <div>)?

Answer 1

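I completely agree that you should never parse HTML with re (and that discussion is a very funny read, by the way). Still, if this piece of text is all you have and you need a quick re.search, a simple r'\d+,\d+' will do: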

import re

s = '''<div class='user-information__achievements-heading' data-test-points-title>
    Points
    </div>
    <div class='user-information__achievements-data' data-test-points-count>
    30,850
    </div>
    </div>'''

match = re.search(r'\d+,\d+', s)
print(match)          # <re.Match object; span=(179, 185), match='30,850'>
print(match.group())  # 30,850

Answer 2

You can do this without a regular expression:

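# s is the multi-line string defined in Answer 1
i = "    <div class='user-information__achievements-data' data-test-points-count>"
lines = s.splitlines()
# the value sits on the line right after the opening tag
print(lines[lines.index(i) + 1].lstrip())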

Output:

30,850
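
The lookup above only works when the marker line matches exactly, including its four leading spaces. A minimal sketch of an indentation-tolerant variant, assuming the value always sits on the line right after the opening tag (the names `marker`, `lines`, and `value` are illustrative, not from the original answer):

```python
s = """<div class='user-information__achievements-heading' data-test-points-title>
    Points
    </div>
    <div class='user-information__achievements-data' data-test-points-count>
    30,850
    </div>
    </div>"""

# Strip every line once so the comparison does not depend on indentation.
lines = [line.strip() for line in s.splitlines()]

# Assumed marker: the opening tag, written without leading whitespace.
marker = "<div class='user-information__achievements-data' data-test-points-count>"

# list.index raises ValueError if the marker line is missing.
value = lines[lines.index(marker) + 1]
print(value)  # 30,850
```

If the tag and the number could ever end up on the same line, this approach breaks, and a parser-based solution is probably the safer choice.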

Answer 3

You can also search the text with bs4:

from bs4 import BeautifulSoup

tx = """
  <div class='user-information__achievements-heading' data-test-points-title>
    Points
    </div>
    <div class='user-information__achievements-data' data-test-points-count>
    30,850
    </div>
    </div>
"""

bs = BeautifulSoup(tx,"lxml")
result = bs.find("div",{"class":"user-information__achievements-data"}).text
print(result.strip()) # 30,850
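
If installing bs4 and lxml is not an option, roughly the same idea can be sketched with the standard library's html.parser. This is only an illustration: the `PointsParser` class is a hypothetical helper, and it assumes the target tag contains plain text with no nested tags.

```python
from html.parser import HTMLParser


class PointsParser(HTMLParser):
    """Hypothetical helper: collects the text inside the tag that
    carries the data-test-points-count attribute."""

    def __init__(self):
        super().__init__()
        self.inside = False  # True while we are inside the target tag
        self.chunks = []     # text fragments seen inside the target tag

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; valueless attributes get None
        if any(name == "data-test-points-count" for name, _ in attrs):
            self.inside = True

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the capture (no nesting support)
        self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.chunks.append(data)


tx = """
  <div class='user-information__achievements-heading' data-test-points-title>
    Points
    </div>
    <div class='user-information__achievements-data' data-test-points-count>
    30,850
    </div>
    </div>
"""

parser = PointsParser()
parser.feed(tx)
parser.close()
result = "".join(parser.chunks).strip()
print(result)  # 30,850
```

Matching on the data-test-points-count attribute rather than the class name is a design choice made for this sketch; either would identify the tag in the sample.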

Answer 4

You need re.DOTALL because, by default, . does not match newlines (line breaks).

re.compile(YOUR_REGEX, flags=re.S)

You can also prefix the regular expression with (?s) to get the same effect.
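
To make the effect of the flag concrete, here is a minimal sketch run against the markup from the question. The pattern is an assumption made for illustration (it is not the asker's regex): it captures whatever lies between the opening tag carrying data-test-points-count and the next closing div.

```python
import re

html = """<div class='user-information__achievements-heading' data-test-points-title>
    Points
    </div>
    <div class='user-information__achievements-data' data-test-points-count>
    30,850
    </div>
    </div>"""

# Hypothetical pattern: lazily capture everything up to the next </div>.
pattern = r"<div[^>]*data-test-points-count>(.*?)</div>"

# Without DOTALL, "." stops at the newline after the opening tag,
# so the group can never reach </div> and there is no match.
no_flag = re.search(pattern, html)

# With DOTALL (alias re.S), "." also matches "\n".
with_flag = re.search(pattern, html, flags=re.DOTALL)

# The inline form (?s) at the start of the pattern is equivalent.
inline = re.search("(?s)" + pattern, html)

print(no_flag)                     # None
print(with_flag.group(1).strip())  # 30,850
print(inline.group(1).strip())     # 30,850
```

The lazy quantifier `.*?` matters here: with DOTALL enabled, a greedy `.*` would run on to the last closing div in the text.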

4 solutions

Solution 1
1 2018-10-05 09:39:26

Solution 2
1 2018-10-05 09:52:31

Solution 3
1 accepted 2018-10-06 02:12:31

Solution 4
0 2018-10-05 09:41:51
