How can I iterate over multiple tags in soup.findAll('tag1', 'tag2', 'tag3')?

Question

I'm trying to write a Python script that automates modifying certain tags in multiple HTML files by running a single command from the terminal.

I have built the code base.

In my code base I've done something like the following. Is there a more convenient way to do this with less code?

#modifying the 'src' of <img> tag in the soup obj
for img in soup.findAll('img'):
    img['src'] = '{% static ' + "'" + img['src'] + "'" + ' %}'

#modifying the 'href' of <link> tag in the soup obj
for link in soup.findAll('link'):
    link['href'] = '{% static ' + "'" + link['href'] + "'" + ' %}'

#modifying the 'src' of <script> tag in the soup obj
for script in soup.findAll('script'):
    script['src'] = '{% static ' + "'" + script['src'] + "'" + ' %}'

For instance, can I do it in a single for loop instead of three? It doesn't have to look like what I wrote below; I'm looking for any good-practice suggestion.

for img, link, script in soup.findAll('img', 'link', 'script'):
    rest of the code goes here....

Answer 1

Perhaps use a dictionary to look up the appropriate attribute for each tag? Also, use CSS selectors, which can be faster.

import requests
from bs4 import BeautifulSoup as bs

r = requests.get('https://stackoverflow.com/questions/66541098/how-can-i-iterate-over-multiple-tags-in-soup-findalltag1-tag2-tag3')
soup = bs(r.content, 'lxml')

lookup = {
    'img':'src',
    'link': 'href',
    'script':'src'
}

for i in soup.select('img, link, script'):
    var = lookup[i.name]
    if i.has_attr(var):
        i[var] = '{% static ' + "'" + i[var] + "'" + ' %}'
        print(i[var])
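
The same dictionary lookup can be tried offline, without fetching a page. The following is a minimal sketch, not the answerer's code: the function name `wrap_static`, the mapping name `ATTR_FOR_TAG`, the sample HTML, and the use of the built-in `html.parser` are all assumptions made for illustration. It skips tags that lack the attribute (such as inline scripts) and values that already look wrapped.

```python
from bs4 import BeautifulSoup

# Hypothetical mapping: which attribute to rewrite for each tag name.
ATTR_FOR_TAG = {'img': 'src', 'link': 'href', 'script': 'src'}

def wrap_static(html):
    """Return a soup in which src/href values are wrapped in {% static '...' %}."""
    soup = BeautifulSoup(html, 'html.parser')
    for tag in soup.select('img, link, script'):
        attr = ATTR_FOR_TAG[tag.name]
        value = tag.get(attr)  # None when the attribute is absent
        # Skip missing attributes and values that are already template tags.
        if value and not value.startswith('{%'):
            tag[attr] = "{% static '" + value + "' %}"
    return soup

# Made-up sample document for illustration only.
sample = """
<html><head>
<link href="css/site.css" rel="stylesheet">
<script src="js/app.js"></script>
<script>console.log('inline');</script>
</head><body>
<img src="img/logo.png">
</body></html>
"""

result = wrap_static(sample)
print(result.find('img')['src'])
```

Using `tag.get(attr)` rather than `tag[attr]` avoids a `KeyError` on tags that have no such attribute.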

Answer 2

Yes, you can. You can pass a list of tag names to the findAll method:

for element in soup.findAll(['img', 'link', 'script']):  # use find_all for bs4

    # .get() returns None when the attribute is missing (e.g. an inline <script>)
    if element.name == 'img':
        value = element.get('src')
    elif element.name == 'link':
        value = element.get('href')
    elif element.name == 'script':
        value = element.get('src')
    else:
        continue

    print(value)
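
The brackets matter here. In `find_all`, the second positional argument is `attrs`, and a plain string passed there is treated as a CSS class filter, so the call from the question does not search for three tag names. A small sketch showing the difference, using a made-up HTML snippet and the built-in `html.parser` (both assumptions for illustration):

```python
from bs4 import BeautifulSoup

# Made-up sample document for illustration only.
html = """
<html><head>
<link href="css/site.css" rel="stylesheet">
<script src="js/app.js"></script>
</head><body>
<img src="img/logo.png">
<img class="link" src="img/arrow.png">
</body></html>
"""
soup = BeautifulSoup(html, 'html.parser')

# Positional strings: name='img', attrs='link' (read as class="link"),
# and 'script' lands in the `recursive` parameter.
positional = soup.find_all('img', 'link', 'script')

# A list of names: matches any tag whose name is in the list.
as_list = soup.find_all(['img', 'link', 'script'])

print([tag.name for tag in positional])
print([tag.name for tag in as_list])
```

With this sample, the positional call only matches the `<img>` that has `class="link"`, while the list form matches all four tags in document order.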
