Python error: urllib.error.HTTPError: HTTP Error 404: Not Found (Yahoo Finance)
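For my computing project, I am trying to build a financial forecasting website. One element of the code is a web-scraping API that pulls data from a company's income statement on Yahoo Finance.
However, even though the URL is correct, I keep getting a 404 error.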
import pandas as pd
import urllib.request as ur
from bs4 import BeautifulSoup
import warnings
import ssl
ssl._create_default_https_context = ssl._create_unverified_context
income_url = 'http://uk.finance.yahoo.com/quote/AAPL/financials?p=AAPL'
read_url = ur.urlopen(income_url).read()
income_soup = BeautifulSoup(read_url, 'lxml')
div_list = []
for div in income_soup.find_all('div'):
    div_list.append(div.string)
    if not div.string == div.get('title'):
        div_list.append(div.get('title'))
div_list = [incl for incl in div_list if incl not in
            ('Operating Expenses', 'Non-recurring Events', 'Expand All')]
div_list = list(filter(None, div_list))
div_list = [incl for incl in div_list if not incl.startswith('(function')]
income_list = div_list[13: -5]
income_list.insert(0, 'Breakdown')
income_data = list(zip(*[iter(income_list)]*6))
income_df = pd.DataFrame(income_data)
headers = income_df.iloc[0]
income_df = income_df[1:]
income_df.columns = headers
income_df.set_index('Breakdown', inplace=True, drop=True)
warnings.warn('Amounts are in thousands.')
print(income_df)
I keep getting this error:
urllib.error.HTTPError: HTTP Error 404: Not Found
How can I fix it?
This appears to be resolved by making sure you pass a User-Agent header with the request.
Using the requests module:
import requests
agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Safari/605.1.15'
headers = {'User-Agent': agent}
url = 'http://uk.finance.yahoo.com/quote/AAPL/financials?p=AAPL'
response = requests.get(url, headers=headers)
response.raise_for_status()
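Since the question uses urllib.request rather than requests, the same idea can be sketched with the standard library alone. This is a minimal sketch, not a guaranteed fix: build_request and fetch_page are hypothetical helper names, the User-Agent string is simply copied from the snippet above, and the 10-second timeout is an assumption. Whether Yahoo Finance accepts the request may also depend on factors other than this header.

```python
import urllib.request as ur

# User-Agent string copied from the requests example above; any
# browser-like value could be substituted (assumption).
USER_AGENT = (
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_5) AppleWebKit/605.1.15 '
    '(KHTML, like Gecko) Version/14.1.2 Safari/605.1.15'
)


def build_request(url, user_agent=USER_AGENT):
    """Hypothetical helper: wrap the URL in a Request carrying a User-Agent header."""
    return ur.Request(url, headers={'User-Agent': user_agent})


def fetch_page(url, timeout=10):
    """Hypothetical helper: fetch the raw page bytes.

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    """
    with ur.urlopen(build_request(url), timeout=timeout) as response:
        return response.read()
```

With these helpers, the line read_url = ur.urlopen(income_url).read() in the question could become read_url = fetch_page(income_url), leaving the BeautifulSoup parsing that follows unchanged.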