
How do I handle exceptions in Python

This answered my question, problem solved. You may delete this post.



You can apply the "look before you leap" (LBYL) principle and check the result of find(), which returns None if no matching element was found. You can then put the call into a loop and exit once you have a value, safeguarding yourself with a retry limit:

import requests
from bs4 import BeautifulSoup

RETRIES = 10

id = None
session = requests.Session()

for attempt in range(1, RETRIES + 1):
    # url is assumed to be defined earlier
    response = session.get(url)
    soup = BeautifulSoup(response.text, "lxml")

    # id=True matches only <a> elements that carry an id attribute
    element = soup.find('a', class_="class", id=True)
    if element is None:
        print("Attempt {attempt}. Element not found".format(attempt=attempt))
        continue
    else:
        id = element["id"]
        break

print(id)
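
As a small variation on the loop above (an aside, not part of the original answer), Python's for/else makes the "all retries exhausted" case explicit: the else block runs only when the loop finishes without hitting break. A time.sleep call also spaces out the attempts:

import time

import requests
from bs4 import BeautifulSoup

RETRIES = 10
session = requests.Session()

for attempt in range(1, RETRIES + 1):
    response = session.get(url)  # url assumed defined, as in the snippet above
    soup = BeautifulSoup(response.text, "lxml")
    element = soup.find('a', class_="class", id=True)
    if element is not None:
        print(element["id"])
        break
    time.sleep(1)  # brief pause before the next attempt
else:
    # the loop never hit break, so every attempt failed
    print("Element not found after {} attempts".format(RETRIES))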

A couple of notes:

  • id=True was set to find only elements with the id attribute present. You could also do the equivalent with the CSS selector soup.select_one("a.class[id]") (see the first sketch after this list)
  • Session() helps to improve performance when issuing requests to the same host multiple times, because it reuses the underlying connection. See more at Session Objects (a second sketch follows the list)
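
To illustrate the first note, here is a minimal sketch (the inline HTML document is invented for demonstration) showing that the CSS selector matches the same elements as find() with id=True:

from bs4 import BeautifulSoup

html = '<a class="class" id="link-1">first</a><a class="class">no id</a>'
soup = BeautifulSoup(html, "lxml")

# "a.class[id]" matches <a> tags with class "class" AND an id attribute,
# the same elements that soup.find('a', class_="class", id=True) matches
element = soup.select_one("a.class[id]")
print(element["id"])  # -> link-1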
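
Beyond connection reuse, a Session can also retry failed HTTP requests at the transport level via urllib3's Retry mounted on an HTTPAdapter. This is an aside, not part of the original answer, and it covers network and status-code failures rather than the missing-element case handled by the loop above; https://example.com is a placeholder URL:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# retry up to 3 times on common transient server errors, with backoff
retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[500, 502, 503, 504])
adapter = HTTPAdapter(max_retries=retry)

session = requests.Session()
session.mount("https://", adapter)
session.mount("http://", adapter)

response = session.get("https://example.com")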

If all you want to do is make that same request again, you could do something like this:

import requests
from bs4 import BeautifulSoup

def find_data(url):
    found_data = False
    while not found_data:
        r = requests.get(url)
        soup = BeautifulSoup(r.text, "lxml")
        try:
            # find() returns None when nothing matches,
            # so .get('id') raises AttributeError
            id = soup.find('a', class_="class").get('id')
            found_data = True
        except AttributeError:
            pass
    return id

This puts you at risk of an infinite loop if the data really aren't there. You can do this to avoid that infinite loop:

import requests
from bs4 import BeautifulSoup

def find_data(url, attempts_before_fail=3):
    found_data = False
    while not found_data:
        r = requests.get(url)
        soup = BeautifulSoup(r.text, "lxml")
        try:
            id = soup.find('a', class_="class").get('id')
            found_data = True
        except AttributeError:
            # element missing: count down the remaining attempts
            attempts_before_fail -= 1
            if attempts_before_fail == 0:
                raise ValueError("couldn't find data after all.")
    return id
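
For completeness, a hypothetical call site; https://example.com is a placeholder URL, not from the original post:

try:
    found_id = find_data("https://example.com", attempts_before_fail=5)
    print(found_id)
except ValueError as exc:
    print(exc)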
