Web scraping a website does not work with Python

Question

My code just keeps running without producing any results.

import requests
import pandas as pd

url = 'http://www.cmegroup.com/markets/agriculture/livestock/pork-cutout.quotes.html'
Data = requests.get(url)
print(Data)

Answer 1

This seems to be an issue with that site specifically: it requires header information in the request. I found a solution in the following question that worked for me:

"requests.get in python giving connection timeout error"

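import requests
import pandas as pd

# Send a browser-like User-Agent, since the site appears to require one.
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36'}
url = 'http://www.cmegroup.com/markets/agriculture/livestock/pork-cutout.quotes.html'
# timeout=15 stops the call from waiting forever.
# verify=False turns off TLS certificate checks; drop it unless you need it.
Data = requests.get(url, timeout=15, verify=False, allow_redirects=True, headers=headers)
print(Data.content)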
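
The "keeps running" symptom fits the fact that requests applies no timeout by default, so a server that never answers leaves the call waiting indefinitely. Below is a minimal sketch of the same idea with explicit error handling. The helper name fetch_page and the constant BROWSER_HEADERS are made up for illustration, and it assumes a browser-like User-Agent is all the site needs.

```python
import requests

# Assumption: a desktop-browser User-Agent is enough for the site to respond.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36"
    )
}


def fetch_page(url, timeout=15):
    """Return the page body as text, or None if the request fails.

    fetch_page is a hypothetical helper, not part of requests.
    """
    try:
        response = requests.get(url, headers=BROWSER_HEADERS, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx replies into exceptions
    except requests.exceptions.Timeout:
        print(f"No reply from {url} within {timeout} seconds")
        return None
    except requests.exceptions.RequestException as exc:
        print(f"Request to {url} failed: {exc}")
        return None
    return response.text
```

Calling fetch_page(url) with the URL from the question would then either return the HTML or print why the request failed, instead of hanging silently.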

Answer 2

In this case, your program requests the page and stores the response in the Data variable. You do not process the variable afterwards. To do something with it, you can start with something like:

print(Data)

This will show you what is inside the variable. You can also add a breakpoint there and inspect the variable with a debugging tool, such as the debugger in VS Code.
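
Note that printing a requests response object directly only shows something like <Response [200]>, so it helps to look at individual attributes instead. Here is a rough sketch of that kind of inspection. The function summarize_response is a made-up helper, and the SimpleNamespace object is only a stand-in so the example runs without a network connection; a real response from requests.get exposes the same four attributes.

```python
from types import SimpleNamespace


def summarize_response(response, preview_chars=200):
    """Collect the fields that are usually worth checking first.

    summarize_response is a hypothetical helper; it only reads
    status_code, headers, content and text from the object it is given.
    """
    return {
        "status": response.status_code,
        "content_type": response.headers.get("Content-Type"),
        "size_bytes": len(response.content),
        "preview": response.text[:preview_chars],
    }


# Stand-in for a real response, used here so the sketch runs offline.
fake = SimpleNamespace(
    status_code=200,
    headers={"Content-Type": "text/html"},
    content=b"<html>ok</html>",
    text="<html>ok</html>",
)
print(summarize_response(fake))
```

With a live request you would pass the result of requests.get(url, headers=headers, timeout=15) in place of the stand-in object.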

Answer 1
0 2022-03-01 23:32:36

Answer 2
-1 2022-03-01 23:10:56
