Get the words between specific words in a Python string

Question

I'm trying to extract the text that sits between two known substrings in a string.

Following the post "Find string between two substrings", I managed to capture the text like this:

import re

s = 'asdf=5;iwantthis123jasd'
result = re.search(r'asdf=5;(.*)123jasd', s)
print(result.group(1))  # iwantthis

But it fails on the string below.

s = '''        <div class="prod-origin-price ">
        <span class="discount-rate">
            4%
        </span>
            <span class="origin-price">'''


result = re.search('<span class="discount-rate">(.*)</span>', s)
print(result.group(1))

I'm trying to get '4%'. Every other case works, and I don't understand why only this one fails. Any help is appreciated.

Answer 1

Try this (mind the whitespace and newlines around the value):

import re
s = '''        <div class="prod-origin-price ">
        <span class="discount-rate">
            4%
        </span>
            <span class="origin-price">'''


result = re.search(r'<span class="discount-rate">\s*(.*?)\s*</span>', s)
print(result.group(1))
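
To see why the whitespace matters: by default `.` does not match a newline, so the pattern from the question never reaches `</span>`, `re.search` returns `None`, and `result.group(1)` raises `AttributeError`. The following is a minimal sketch of that behavior; the helper name `extract_discount` is hypothetical and only illustrates guarding against a missing match.

```python
import re

s = '''        <div class="prod-origin-price ">
        <span class="discount-rate">
            4%
        </span>
            <span class="origin-price">'''

# Without re.DOTALL, "." stops at the newline right after the opening tag,
# so the pattern from the question cannot reach "</span>" and search() returns None.
original = re.search(r'<span class="discount-rate">(.*)</span>', s)

def extract_discount(text):
    """Hypothetical helper: return the discount text, or None if it is absent."""
    match = re.search(r'<span class="discount-rate">\s*(.*?)\s*</span>', text)
    return match.group(1) if match else None

print(original)
print(extract_discount(s))
```

Checking for `None` before calling `.group(1)` keeps the code from crashing on pages where the span is missing.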

Answer 2

Use the re.DOTALL flag so that `.` also matches newlines:

result = re.search('<span class="discount-rate">(.*)</span>', s, re.DOTALL)

Documentation: https://docs.python.org/3/library/re.html
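
Two things are worth keeping in mind with `re.DOTALL`: the captured group now includes the surrounding newlines and indentation, and a greedy `.*` runs to the last `</span>` in the input. The sketch below uses a made-up two-span sample (an assumption, not the asker's real page) to compare greedy and non-greedy matching.

```python
import re

# Hypothetical sample with two closed spans, to show the greedy pitfall.
s = '''<span class="discount-rate">
    4%
</span>
<span class="origin-price">
    100
</span>'''

# Greedy: ".*" extends to the LAST "</span>" in the string.
greedy = re.search(r'<span class="discount-rate">(.*)</span>', s, re.DOTALL)

# Non-greedy: ".*?" stops at the FIRST "</span>" after the opening tag.
lazy = re.search(r'<span class="discount-rate">(.*?)</span>', s, re.DOTALL)

# strip() removes the newlines and indentation captured around the value.
print(repr(greedy.group(1).strip()))
print(repr(lazy.group(1).strip()))
```

With the asker's original string there is only one `</span>`, so the greedy form happens to work there; the non-greedy form is the safer default once more markup follows.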

Answer 3

This is structured data, not just a string, so a library like Beautiful Soup can simplify tasks like this:

from bs4 import BeautifulSoup

s = '''        <div class="prod-origin-price ">
        <span class="discount-rate">
            4%
        </span>
            <span class="origin-price">'''

soup = BeautifulSoup(s, 'html.parser')  # name the parser explicitly to avoid a warning
value = soup.find(class_='discount-rate').get_text(strip=True)
print(value)

# Output:
4%
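
If installing Beautiful Soup is not an option, a similar idea can be sketched with the standard library's `html.parser` module. This is a different technique from the answer above, not a replacement for it; the class name `ClassTextCollector` is hypothetical, and the sketch assumes the target element contains no unclosed void tags such as a bare `<br>`.

```python
from html.parser import HTMLParser

class ClassTextCollector(HTMLParser):
    """Hypothetical helper: collect text inside elements with a given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0      # greater than 0 while inside a matching element
        self.chunks = []    # text fragments found inside matching elements

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get('class') or '').split()
        # Enter a matching element, or track nesting if already inside one.
        if self.depth or self.class_name in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)

s = '''        <div class="prod-origin-price ">
        <span class="discount-rate">
            4%
        </span>
            <span class="origin-price">'''

parser = ClassTextCollector('discount-rate')
parser.feed(s)
parser.close()
value = ''.join(parser.chunks).strip()
print(value)
```

For anything beyond a simple lookup like this, Beautiful Soup remains the more convenient choice.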

3 answers

Answer 1: 1, accepted, 2022-08-13 02:10:02

Answer 2: 1, 2022-08-13 02:10:42

Answer 3: 1, 2022-08-13 03:43:13
