Extracting a substring with a regular expression: re.match() always returns None

Question

I want to extract some information from a string with a regular expression, but the result is always None. The source code is as follows:

import re

line = '<meta content="Allrecipes" property="og:site_name"/>'
x = re.match(r'property=".+?"', line)
print(x)  # prints None

I want to extract the content and property values as a tuple. How can I fix this?

Answer 1

I would suggest a more suitable tool for this.

Use BeautifulSoup:

from bs4 import BeautifulSoup

line = '<meta content="Allrecipes" property="og:site_name"/>'
soup = BeautifulSoup(line, 'lxml')

print("Content: {}".format(soup.meta["content"]))
print("Property: {}".format(soup.meta["property"]))

Output:

Content: Allrecipes
Property: og:site_name
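
If installing BeautifulSoup or lxml is not an option, the same idea (let an HTML parser read the attributes instead of a regex) can be sketched with the standard library's html.parser module. This swaps in a different parser from the one used in the answer above, and the class name MetaExtractor is a hypothetical helper introduced only for illustration:

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Hypothetical helper: collects (content, property) pairs from <meta> tags."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) tuples.
        # Self-closing tags such as <meta .../> are also routed here by default.
        if tag == "meta":
            found = dict(attrs)
            if "content" in found and "property" in found:
                self.pairs.append((found["content"], found["property"]))


line = '<meta content="Allrecipes" property="og:site_name"/>'
parser = MetaExtractor()
parser.feed(line)
parser.close()

print(parser.pairs)
```

Each matching tag contributes one tuple to parser.pairs, so this also works when the input contains several meta tags.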

Answer 2

@DirtyBit's answer is a better approach than using a regular expression. However, if you still want to use a regex, this may help (RegexDemo):

import re

line = '<meta content="Allrecipes" property="og:site_name"/>'
# A raw string avoids the layered backslash escapes; non-greedy groups stop at the closing quote.
match = re.search(r'content="(?P<content>.*?)".*?property="(?P<prop>.*?)"\s*/>', line)
print(match.groups())

Output:

('Allrecipes', 'og:site_name')
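
As for why the code in the question prints None: re.match() only attempts a match at the very start of the string, and the line begins with "<meta", not "property=". re.search() scans the whole string instead. A minimal sketch of the difference follows, together with an order-independent variant that does not assume content appears before property. The attribute pattern is a simplification that assumes double-quoted values and is not a general HTML parser:

```python
import re

line = '<meta content="Allrecipes" property="og:site_name"/>'
pattern = r'property="(.+?)"'

# re.match anchors at position 0, where the text is '<meta', so it fails.
print(re.match(pattern, line))

# re.search scans forward until the pattern fits somewhere in the string.
found = re.search(pattern, line)
print(found.group(1))

# Order-independent variant: collect every name="value" pair into a dict.
# Assumption: attribute values are double-quoted and contain no '"'.
attrs = dict(re.findall(r'([\w:-]+)="([^"]*)"', line))
result = (attrs["content"], attrs["property"])
print(result)
```

Because the pairs are looked up by name, the last step gives the same tuple even if the two attributes are swapped in the tag.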

Answer 1: 0, 2019-03-26 08:04:56

Answer 2: 0, accepted, 2019-03-26 08:08:25
