
urllib.error.HTTPError: HTTP Error 404: Not Found (Yahoo Finance)

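For my computing project, I am trying to build a financial forecasting website. One element of the code is a web-scraping API. It scrapes data from a company's income statement on Yahoo Finance.

However, I keep getting a 404 error even though the URL is correct.

My code: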

import pandas as pd
import urllib.request as ur
from bs4 import BeautifulSoup
import warnings
import ssl


ssl._create_default_https_context = ssl._create_unverified_context
income_url = 'http://uk.finance.yahoo.com/quote/AAPL/financials?p=AAPL'
read_url = ur.urlopen(income_url).read()
income_soup = BeautifulSoup(read_url, 'lxml')

div_list = []
for div in income_soup.find_all('div'):
    div_list.append(div.string)

    if not div.string == div.get('title'):
        div_list.append(div.get('title'))

div_list = [incl for incl in div_list if incl not in
            ('Operating Expenses', 'Non-recurring Events', 'Expand All')]
div_list = list(filter(None, div_list))
div_list = [incl for incl in div_list if not incl.startswith('(function')]
income_list = div_list[13: -5]
income_list.insert(0, 'Breakdown')

income_data = list(zip(*[iter(income_list)]*6))
income_df = pd.DataFrame(income_data)

headers = income_df.iloc[0]
income_df = income_df[1:]
income_df.columns = headers
income_df.set_index('Breakdown', inplace=True, drop=True)

warnings.warn('Amounts are in thousands.')
print(income_df)

I keep getting this error:

urllib.error.HTTPError: HTTP Error 404: Not Found

How can I fix this?

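This appears to be resolved by making sure you pass a User-Agent header. By default, urllib identifies itself as Python-urllib, which some sites seem to reject.

Using the requests module: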

import requests

agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 11_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Safari/605.1.15'
headers = {'User-Agent': agent}
url = 'http://uk.finance.yahoo.com/quote/AAPL/financials?p=AAPL'
response = requests.get(url, headers=headers)
response.raise_for_status()
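If you would rather stay with urllib, as in the question, the same idea can be sketched with urllib.request.Request. The names build_request, fetch_html and DEFAULT_AGENT below are hypothetical helpers made up for illustration, and whether Yahoo Finance accepts any particular User-Agent string is an assumption that may change over time.

```python
import urllib.request

# Assumed browser-like User-Agent (the same string as in the requests example).
DEFAULT_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 11_5) "
    "AppleWebKit/605.1.15 (KHTML, like Gecko) "
    "Version/14.1.2 Safari/605.1.15"
)


def build_request(url, user_agent=DEFAULT_AGENT):
    """Return a Request object that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, user_agent=DEFAULT_AGENT, timeout=10):
    """Fetch the page body as bytes.

    urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    so a 404 still surfaces as an exception.
    """
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

With this in place, read_url = fetch_html(income_url) could stand in for the ur.urlopen(income_url).read() line in the question, and the rest of the BeautifulSoup code would stay unchanged.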

