
Python requests module gets stuck on requests.get() and gets timed out

I've been trying to web scrape the following site: "https://www.india.ford.com/cars/aspire/"

import requests
from bs4 import BeautifulSoup
import csv

response = requests.get("https://www.india.ford.com/cars/aspire/", timeout=5)

if response.status_code != 200:
    print("error!")
else:
    print(response.status_code)

The execution gets stuck indefinitely.

On using timeout=5, I get the following error:

(screenshot of the error traceback)

I'm new to this, so sorry if this is a noob question. Any help is highly appreciated. :P

A timeout needs to be handled with try/except.

Also, this page requires the request to disguise itself as a browser: without a browser-like User-Agent header, the server never responds and the request hangs until it times out.

try:
    headers = {
        'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36',
    }
    response = requests.get("https://www.india.ford.com/cars/aspire/", headers=headers, timeout=5)

    if response.status_code != 200:
        print("error!")
    else:
        print(response.status_code)
except requests.exceptions.Timeout as error:
    print('timed out:', error)
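If the site still times out intermittently even with the User-Agent header, retrying with backoff can help. This is a minimal sketch using requests' HTTPAdapter together with urllib3's Retry; the retry count, backoff factor, and status codes below are arbitrary example values, not recommendations from the original answer:

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Example retry policy: up to 3 retries with exponential backoff,
# also retrying on common transient HTTP status codes.
retry = Retry(total=3, backoff_factor=1,
              status_forcelist=[429, 500, 502, 503, 504])
adapter = HTTPAdapter(max_retries=retry)

session = requests.Session()
# Browser-like User-Agent, as in the answer above.
session.headers.update({
    'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36',
})
# Mount the retrying adapter for all http/https requests on this session.
session.mount('https://', adapter)
session.mount('http://', adapter)

# Usage (network call, so wrapped in the same try/except as before):
# try:
#     response = session.get("https://www.india.ford.com/cars/aspire/", timeout=5)
#     print(response.status_code)
# except requests.exceptions.Timeout as error:
#     print('timed out:', error)
```

A Session also reuses the underlying TCP connection across requests, which is generally faster when scraping several pages from the same host.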
