How can I iterate over multiple tags in soup.findAll('tag1', 'tag2', 'tag3')?

Question

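I'm trying to write a Python script that automatically modifies certain tags in multiple HTML files, run with a single command from the terminal.

I've built the codebase.

In my codebase, I do something like what's shown below. Is there a more convenient way to do this with less code?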

#modifying the 'src' of <img> tag in the soup obj
for img in soup.findAll('img'):
    img['src'] = '{% static ' + "'" + img['src'] + "'" + ' %}'

#modifying the 'href' of <link> tag in the soup obj
for link in soup.findAll('link'):
    link['href'] = '{% static ' + "'" + link['href'] + "'" + ' %}'

#modifying the 'src' of <script> tag in the soup obj
for script in soup.findAll('script'):
    script['src'] = '{% static ' + "'" + script['src'] + "'" + ' %}'

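For example, could I do this in a single for loop instead of three? It doesn't have to look like what I wrote below; I'm looking for any good-practice suggestions.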

for img, link, script in soup.findAll('img', 'link', 'script'):
    # rest of the code goes here...

Answer 1

Maybe use a dictionary to look up the appropriate attribute? Also, use CSS selectors, which are faster.

import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://stackoverflow.com/questions/66541098/how-can-i-iterate-over-multiple-tags-in-soup-findalltag1-tag2-tag3')
soup = bs(r.content, 'lxml')

lookup = {
    'img':'src',
    'link': 'href',
    'script':'src'
}

for i in soup.select('img, link, script'):
    var = lookup[i.name]
    if i.has_attr(var):
        i[var] = '{% static ' + "'" + i[var] + "'" + ' %}'
        print(i[var])
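
The same dictionary lookup can be tried offline, without fetching a page. The following is only a minimal sketch: the names `ATTR_BY_TAG`, `static_tag`, and `wrap_static`, the sample HTML, and the choice of the built-in `html.parser` are assumptions made for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Hypothetical mapping: tag name -> attribute that holds the asset URL.
ATTR_BY_TAG = {"img": "src", "link": "href", "script": "src"}


def static_tag(url: str) -> str:
    """Wrap a URL in a Django-style static template tag."""
    return f"{{% static '{url}' %}}"


def wrap_static(html: str) -> BeautifulSoup:
    """Rewrite the URL attribute of every <img>, <link> and <script> tag."""
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.select("img, link, script"):
        attr = ATTR_BY_TAG[tag.name]
        # Inline <script> blocks have no src, so skip tags without the attribute.
        if tag.has_attr(attr):
            tag[attr] = static_tag(tag[attr])
    return soup


# Made-up sample markup; the inline script has no src attribute.
sample = '<link href="css/site.css"><img src="img/logo.png"><script>var x = 1;</script>'
result = wrap_static(sample)
print(result)
```

Keeping the tag-to-attribute mapping in one dictionary means that supporting another tag is a one-line change rather than another loop.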

Answer 2

Yes, you can. You can pass a list of tag names to the findAll method:

for element in soup.findAll(['img', 'link', 'script']):  # use find_all for bs4

    # .get() returns None instead of raising KeyError when the attribute is missing
    if element.name == 'img':
        value = element.get('src')
    elif element.name == 'link':
        value = element.get('href')
    elif element.name == 'script':
        value = element.get('src')
    else:
        continue

    print(value)
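
The branch on the tag name can also be replaced by a dictionary lookup while still passing a list to `find_all`. This is a rough sketch under assumptions: the sample HTML and the names `ATTR_BY_TAG` and `values` are made up for illustration.

```python
from bs4 import BeautifulSoup

# Made-up sample markup; the second <script> has no src attribute.
html = (
    '<img src="a.png"><link href="c.css">'
    '<script src="d.js"></script><script>var x = 1;</script>'
)
soup = BeautifulSoup(html, "html.parser")

# Hypothetical mapping: tag name -> attribute to read.
ATTR_BY_TAG = {"img": "src", "link": "href", "script": "src"}

values = []
# A list of names matches any of the listed tags, in document order.
for element in soup.find_all(["img", "link", "script"]):
    value = element.get(ATTR_BY_TAG[element.name])
    if value is not None:  # skip tags that lack the attribute
        values.append(value)

print(values)
```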

Answer 1
2, accepted, 2021-03-09 05:35:05

Answer 2
0, 2021-03-09 06:21:09
