Removing items from a web page by comparing them against a list (Python)

Question

I have collected the links that need to be removed into a list. The code below builds that list:

target = "www.indigo.com"
hrefs = [links['href'] for links in getDetails.find_all('a', href=True) if target in links['href']]
print(hrefs)

It prints the following output:

['https://www.indigo.com/registration.html']
[]
['https://www.indigo.com/buservfcl.html', 'https://www.indigo.com/2021/07/agents.html']

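getDetails holds the full page source.

Now, how can I compare getDetails against the hrefs list and remove/decompose every item that appears in the list?

I tried this, but for some reason it does not work: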

hrefs = [links['href'] for links in getDetails.find_all('a', href=True) if target in links['href']]
print(hrefs)
for z in hrefs:
    getDetails.decompose()

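It removes all of the data in getDetails, but I only need to remove the elements that are in the list, not everything.

The output should be the complete HTML, except for the links that contain www.indigo.com.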

Answer 1

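You have to find the parent tag and then call the decompose() method on it. To do that, collect the tags themselves rather than only their href strings: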

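from bs4 import BeautifulSoup

html = """<div><a href="www.indigo.com"></a></div>"""

soup = BeautifulSoup(html, "html.parser")

target = "www.indigo.com"
# Keep the <a> tags themselves, not just their href strings
href_tags = [link for link in soup.find_all('a', href=True) if target in link['href']]

# Remove the parent element of every matching link
for tag in href_tags:
    tag.parent.decompose()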

Output:

soup will be empty, because the only <div> was the parent of the matching link.
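Decomposing the parent removes everything inside that parent, not only the link. If the goal is to keep the surrounding HTML, calling decompose() on the matching tag itself may be closer to what the question asks for. The sketch below compares both behaviours; the sample markup and the helper name `strip_links` are made up for illustration and are not part of the original question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, not taken from the question's page.
html = (
    '<div id="keep"><p>Intro text</p>'
    '<p>See <a href="https://www.indigo.com/registration.html">this link</a> for details.</p>'
    '<p><a href="https://example.org/other.html">unrelated</a></p></div>'
)
target = "www.indigo.com"


def strip_links(markup, target, remove_parent=False):
    """Return markup with matching <a> tags (or their parents) removed."""
    soup = BeautifulSoup(markup, "html.parser")
    matches = [a for a in soup.find_all("a", href=True) if target in a["href"]]
    for a in matches:
        # Decompose either the link itself or the element that contains it
        (a.parent if remove_parent else a).decompose()
    return str(soup)


only_links = strip_links(html, target)
with_parents = strip_links(html, target, remove_parent=True)
print(only_links)    # the paragraph text around the link survives
print(with_parents)  # the whole paragraph that held the link is gone
```

Which variant fits depends on whether the text surrounding the link should survive.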

From a URL:

import requests
from bs4 import BeautifulSoup

res = requests.get("https://www.assamcareer.com/2021/06/oil-india-limited.html")
soup = BeautifulSoup(res.text, "html.parser")
target = "www.assamcareer.com"
# Collect every <a> tag whose href contains the target
tags = [link for link in soup.find_all('a', href=True) if target in link['href']]
for tag in tags:
    tag.parent.decompose()

Updated answer:

for title in root:
    # ... your code ...
    href_tags = [links for links in getDetails.find_all('a', href=True) if target in links['href']]
    print(href_tags)

    # Decompose inside the loop so each getDetails is cleaned, not only the last one
    for i in href_tags:
        i.parent.decompose()
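The question's output shows three separate lists, which suggests the page is processed in several pieces. A minimal sketch of that per-piece cleanup follows, assuming each piece is available as an HTML string. The `fragments` list and the helper `remove_matching_links` are hypothetical names introduced here, and the sketch removes only the `<a>` tag itself rather than its parent.

```python
from bs4 import BeautifulSoup

target = "www.indigo.com"

# Hypothetical fragments standing in for each getDetails in the loop.
fragments = [
    '<p>Register <a href="https://www.indigo.com/registration.html">here</a></p>',
    '<p>No links here</p>',
    '<ul><li><a href="https://www.indigo.com/buservfcl.html">A</a></li>'
    '<li><a href="https://example.org/page.html">B</a></li></ul>',
]


def remove_matching_links(fragment_html, target):
    """Parse one fragment, drop <a> tags whose href contains target, return HTML."""
    soup = BeautifulSoup(fragment_html, "html.parser")
    # find_all returns a list built up front, so removing tags while looping is safe
    for a in soup.find_all("a", href=True):
        if target in a["href"]:
            a.decompose()
    return str(soup)


cleaned = [remove_matching_links(f, target) for f in fragments]
for html in cleaned:
    print(html)
```

Each cleaned string is the full fragment minus the matching links, so unrelated links and text are left in place.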

Solution 1
1 Accepted 2021-11-11 08:54:28