Python BeautifulSoup: give multiple tags to findAll

Question

I'm looking for a way to use findAll to get two tags, in the order they appear on the page.

Currently I have:

import requests
from bs4 import BeautifulSoup

def get_soup(url):
    request = requests.get(url)
    page = request.text
    soup = BeautifulSoup(page, 'html.parser')
    # The call in question: intended to match both 'hr' and 'strong'
    get_tags = soup.findAll('hr' and 'strong')
    for each in get_tags:
        print(each)

If I use that on a page with only 'em' or 'strong' in it, it gets me all of those tags; if I use it on a page with both, it gets only the 'strong' tags.

Is there a way to do this? My main concern is preserving the order in which the tags are found.

Answer 1

You can pass a list to find any of the given tags:

tags = soup.find_all(['hr', 'strong'])
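
As a minimal sketch of why the list form helps (the sample markup below is made up for illustration, and the bs4 package with its built-in 'html.parser' is assumed): Python evaluates 'hr' and 'strong' before findAll ever sees it, and the result is just 'strong'. A list of names, by contrast, matches any of them and returns the tags in document order.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = "<p><strong>one</strong></p><hr/><p><strong>two</strong></p><hr/>"
soup = BeautifulSoup(html, "html.parser")

# `and` returns its last operand when the first one is truthy,
# so this expression is simply the string 'strong'.
single_name = 'hr' and 'strong'

# A list of names matches any of them, in the order they appear.
tags = soup.find_all(['hr', 'strong'])
names = [tag.name for tag in tags]

print(single_name)
print(names)
```

This is why the original call returned only the strong tags on pages that contained both.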

Answer 2

Use regular expressions:

import re
get_tags = soup.findAll(re.compile(r'(hr|strong)'))

The expression r'(hr|strong)' will find either hr tags or strong tags.
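
One caveat worth sketching: Beautiful Soup applies a compiled pattern to tag names with search(), so an unanchored pattern also matches any tag whose name merely contains hr or strong. The example below assumes bs4 with 'html.parser' and uses a made-up <thread> tag purely to show the difference; anchoring with ^ and $ restricts the match to the exact names.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup; <thread> is an invented tag whose name contains "hr".
html = "<strong>a</strong><thread>b</thread><hr/>"
soup = BeautifulSoup(html, "html.parser")

# Unanchored: matches any tag name containing "hr" or "strong".
loose = [t.name for t in soup.find_all(re.compile(r'(hr|strong)'))]

# Anchored: matches only tags named exactly "hr" or "strong".
exact = [t.name for t in soup.find_all(re.compile(r'^(hr|strong)$'))]

print(loose)
print(exact)
```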

Answer 3

To find multiple tags, you can use a CSS selector list, which lets you specify multiple tags separated by a comma (,).

To use a CSS selector, use the .select_one() method instead of .find(), or .select() instead of .find_all().

For example, to select all hr and strong tags, separate the tag names with a comma:

tags = soup.select('hr, strong')
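
A small sketch of both selector methods, assuming bs4 with 'html.parser' and sample markup invented for illustration. The matches come back in document order, which addresses the ordering concern in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = "<p><strong>one</strong></p><hr/><em>skip</em><strong>two</strong>"
soup = BeautifulSoup(html, "html.parser")

# select() returns every element matching either selector.
tags = soup.select('hr, strong')
names = [t.name for t in tags]

# select_one() returns only the first matching element.
first = soup.select_one('hr, strong')

print(names)
print(first.get_text())
```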

Answer 1
Score 109, accepted, 2013-12-18 04:01:04

Answer 2
Score 10, 2013-12-18 02:34:31

Answer 3
Score 0, 2021-07-07 18:49:26
