Python Beautiful Soup: selecting text

Question

Here is a sample of the HTML I want to parse:

<html>
<body>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> Example BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
</body>
</html>

I am using Beautiful Soup to parse the HTML by selecting the style8 class, as shown below (where html holds the result of my HTTP request):

html = result.read()
soup = BeautifulSoup(html)

content = soup.select('.style8')

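In this example, the content variable holds a list of 4 tags. I want to check the text of each style8 item in the list and, if any of them contains Example, append Example to a variable. If the whole list has been traversed and Example does not appear anywhere, Not Present should be appended instead.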

So far I have the following:

foo = []

for i, tag in enumerate(content):
    if content[i].text == 'Example':
        foo.append('Example')
        break
    else:
        continue

This appends Example to foo only when it is found; if it never appears anywhere in the list, Not Present is not appended.

Any way of doing this would be appreciated, or better still, a cleaner way to search the whole result set for a string.

Answer 1

You can use find_all() to find all td elements with class='style8' and build the foo list with a list comprehension:

from bs4 import BeautifulSoup


html = """<html>
<body>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> Example BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
<td style="PADDING-LEFT:  5px"bgcolor="ffffff" class="style8"> BLAB BLAB BLAB </td>
</body>
</html>"""

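# Name the parser explicitly so the result does not depend on which parsers are installed.
soup = BeautifulSoup(html, "html.parser")

# One entry per td: "Example" when the cell text contains it, otherwise "Not Present".
foo = ["Example" if "Example" in node.text else "Not Present"
       for node in soup.find_all('td', {'class': 'style8'})]
print(foo)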

Prints:

['Example', 'Not Present', 'Not Present', 'Not Present']
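The question asks for a single entry rather than one per cell. A minimal sketch of that variant follows, assuming the cell texts have already been pulled out as plain strings (for instance with tag.get_text() on each selected tag). The helper name presence_label and the sample strings are hypothetical, not part of the original answer.

```python
def presence_label(texts, needle="Example"):
    """Return [needle] if any text contains needle, else ['Not Present'].

    presence_label is a hypothetical helper used for illustration only.
    """
    # any() stops at the first match, much like a break in a manual loop.
    if any(needle in text for text in texts):
        return [needle]
    return ["Not Present"]


# Stand-in for: [tag.get_text() for tag in soup.select('.style8')]
cell_texts = [" Example BLAB BLAB BLAB ", " BLAB BLAB BLAB "]
foo = presence_label(cell_texts)
print(foo)
```

The substring test (in) matters here because each cell's text carries surrounding whitespace and extra words, so an equality comparison against 'Example' would not match.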

Answer 2

If you only want to check whether it was found, you can use a simple boolean flag, like this:

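foo = []
found = False
for tag in content:
    # Substring test: the cell text is ' Example BLAB BLAB BLAB ',
    # so comparing it for equality with 'Example' would never match.
    if 'Example' in tag.text:
        found = True
        foo.append('Example')
        break
# The loop finished without a match, so record that it is absent.
if not found:
    foo.append('Not Present')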

If I understood what you want, this may be a simple way to do it, although alecxe's solution looks great.
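As a possible alternative to the flag, Python's for ... else clause runs its else branch only when the loop ends without a break. The sketch below works on plain strings standing in for the tag texts; the function name find_label is hypothetical.

```python
def find_label(texts, needle="Example"):
    """Append needle on the first match, or 'Not Present' if none matches.

    find_label is a hypothetical helper used for illustration only.
    """
    result = []
    for text in texts:
        if needle in text:
            result.append(needle)
            break
    else:
        # Reached only when the loop finished without hitting break.
        result.append("Not Present")
    return result


print(find_label([" BLAB BLAB BLAB ", " Example BLAB BLAB BLAB "]))
print(find_label([" BLAB BLAB BLAB "]))
```

With real Beautiful Soup output, the same loop could iterate over the selected tags and test each tag's text instead.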

Answer 1
3 2014-03-03 11:29:20

Answer 2
1 Accepted 2014-03-03 11:32:26
