Extracting text inside a span tag with BeautifulSoup

Question

I am trying to extract the text Weight: 16.5 pounds from the following HTML:

<div class="product__description__text">.........
<p dir="ltr"><span><strong>Dimensions:</strong> 39 x 17.3 x 32.2 inches</span></p><p dir="ltr"><span><strong>Weight:</strong> 16.5 pounds</span></p><p dir="ltr"><span><strong>Weight limit:</strong> 35 pounds</span></p><p dir="ltr"><span><strong>Height limit:</strong>&nbsp;32 inches</span></p></div>

Here is what I have tried so far:

results = soup.find_all('div', attrs={'class':'product'})
Weight_L = []
for result in results:
    if result.find('p', attrs={'dir':'ltr'})is not None:
        weight = result.span.text
    Weight_L.append(weight)

Answer 1

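If you are only looking for the weight, I suggest checking whether the keyword "Weight:" appears in each p tag. Note that find returns only the first match, so if the first p tag is not the "Weight" one, you will never reach it; use find_all and loop over the results instead. Also, the class name on your div is product__description__text, so the class name you search for must be product__description__text as well, not product.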

results = soup.find_all('div', attrs={'class':'product__description__text'})
Weight_L = []
for result in results:
    p_tags = result.find_all('p', attrs={'dir':'ltr'})
    for tag in p_tags:
        if "Weight:" in tag.text:
            weight = tag.text
            Weight_L.append(weight)

If the HTML you posted above is what soup contains, the result will be: ['Weight: 16.5 pounds']
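For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the HTML from the question is held in a string (the variable name html is introduced here for illustration) and that the built-in html.parser backend is used; the question does not say which parser was used.

```python
from bs4 import BeautifulSoup

# HTML copied from the question; the elided "........." is kept as plain text.
html = """<div class="product__description__text">.........
<p dir="ltr"><span><strong>Dimensions:</strong> 39 x 17.3 x 32.2 inches</span></p><p dir="ltr"><span><strong>Weight:</strong> 16.5 pounds</span></p><p dir="ltr"><span><strong>Weight limit:</strong> 35 pounds</span></p><p dir="ltr"><span><strong>Height limit:</strong>&nbsp;32 inches</span></p></div>"""

# Assumption: the standard-library parser; lxml or html5lib should behave the same here.
soup = BeautifulSoup(html, "html.parser")

Weight_L = []
for div in soup.find_all("div", attrs={"class": "product__description__text"}):
    for tag in div.find_all("p", attrs={"dir": "ltr"}):
        # "Weight:" (with the colon) does not occur in "Weight limit:",
        # so only the weight paragraph is kept.
        if "Weight:" in tag.text:
            Weight_L.append(tag.text)

print(Weight_L)  # ['Weight: 16.5 pounds']
```

Matching on "Weight:" with the colon is what keeps the "Weight limit:" paragraph out of the list.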

Answer 2

Weight: 16.5 pounds is in the second p tag under the parent with class .product__description__text, so you can use p:nth-child(2) to select that second p.

results = soup.select(".product__description__text p:nth-child(2)")
Weight_L = []
for result in results:
  Weight_L.append(result.text)
  print(result.text)
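One caveat: p:nth-child(2) depends on the weight paragraph staying in second position. As a hedged alternative (not part of the original answer), the sketch below compares the position-based selector with a label-based lookup that finds the strong tag whose text is exactly "Weight:" and reads its parent span. The variable names html, by_position and by_label are introduced here for illustration.

```python
from bs4 import BeautifulSoup

# HTML copied from the question; the elided "........." is kept as plain text.
html = """<div class="product__description__text">.........
<p dir="ltr"><span><strong>Dimensions:</strong> 39 x 17.3 x 32.2 inches</span></p><p dir="ltr"><span><strong>Weight:</strong> 16.5 pounds</span></p><p dir="ltr"><span><strong>Weight limit:</strong> 35 pounds</span></p><p dir="ltr"><span><strong>Height limit:</strong>&nbsp;32 inches</span></p></div>"""

soup = BeautifulSoup(html, "html.parser")

# Position-based: a <p> that is the second element child of its parent.
by_position = [
    p.get_text()
    for p in soup.select(".product__description__text p:nth-child(2)")
]

# Label-based: locate the <strong> label itself, then read the enclosing <span>.
by_label = [
    strong.parent.get_text()
    for strong in soup.select(".product__description__text strong")
    if strong.get_text(strip=True) == "Weight:"
]

print(by_position)
print(by_label)
```

On the HTML from the question both lists hold the same single string; the label-based version should keep working if the paragraphs are reordered, at the cost of depending on the exact label text.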

Answer 1
0 Accepted 2021-01-29 01:29:55

Answer 2
0 2021-01-29 13:50:19
