
Can't read pandas dataframe columns imported from JSON

I am trying to read a pandas DataFrame imported from a JSON file.

I get the following error:

The data does not contain a column named 'totalRevenue'.
The data does not contain a column named 'future_revenue'.

Traceback (most recent call last):
  File "/Users/Blake/PycharmProjects/Project/venv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3803, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 165, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5745, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5753, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'future_revenue'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/aidanschmidt/PycharmProjects/Project/main.py", line 66, in <module>
    financial_data['future_expenses'] = financial_data['future_revenue'] * expense_ratio
  File "/Users/Blake/PycharmProjects/Project/venv/lib/python3.9/site-packages/pandas/core/frame.py", line 3805, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/Users/Blake/PycharmProjects/Project/venv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3805, in get_loc
    raise KeyError(key) from err
KeyError: 'future_revenue'

"totalRevenue" is in both the JSON and the pandas DataFrame, so I'm unsure what the issue is.

I added error handling for totalRevenue, and that check fails too. The check for future_revenue also fails, because that column doesn't get created until later.

Here is my code:

import requests
import json
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Replace YOUR_API_KEY with your own Alpha Vantage API key
api_key = 'YOUR API KEY'

# Get the ticker symbol from the user
ticker_symbol = input("Enter the ticker symbol of the company: ")

# Gathering all three financial statements
functions = ['INCOME_STATEMENT', 'BALANCE_SHEET', 'CASH_FLOW']
financial_data = {}

for function in functions:
    # Specify the frequency of the data
    interval = 'annual'  # or 'quarter'

    # Send a request to the Alpha Vantage API to gather the financial data
    url = f'https://www.alphavantage.co/query?function={function}&symbol={ticker_symbol}&apikey={api_key}&interval={interval}'
    response = requests.get(url)

    # Check the response status code and the content of the response in case the API returns an error message
    if response.status_code != 200:
        print("Error: API request failed with status code", response.status_code)
        print(response.content)
        exit()

    try:
        financial_data[function] = json.loads(response.text)
    except json.decoder.JSONDecodeError as e:
        print("Error: Failed to parse JSON data from API response")
        print(e)
        print(response.content)
        exit()

# Load the financial data into a pandas DataFrame
financial_data = pd.DataFrame(financial_data)
# Example expense ratio (expenses as a proportion of revenue)
expense_ratio = 0.6

# Example growth rate
growth_rate = 0.03

# Check if the data contains the column 'revenue'
if 'totalRevenue' not in financial_data.columns:
    print("The data does not contain a column named 'totalRevenue'.")
    if 'future_revenue' not in financial_data.columns:
        print("The data does not contain a column named 'future_revenue'.")
    else:
        # Create a new column for future expenses by assuming a constant expense ratio
        financial_data['future_expenses'] = financial_data['future_revenue'] * expense_ratio
else:

    # Create a new column for future revenue by assuming a constant growth rate
    financial_data['future_revenue'] = financial_data['totalRevenue'].iloc[-1] * (1 + growth_rate)**(range(1, len(financial_data) + 1))

# Create a new column for future expenses by assuming a constant expense ratio
financial_data['future_expenses'] = financial_data['future_revenue'] * expense_ratio

# Assume a discount rate of 10%
discount_rate = 0.1

If there is neither totalRevenue nor future_revenue, you'll end up in a branch where the future_revenue column is never created, so you can't use it to compute future_expenses.
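To see that branch in isolation, here is a minimal sketch using a made-up DataFrame that has neither column. The column name `someOtherColumn` and its values are placeholders, not your data. Both membership checks report a missing column, and the unguarded final line then raises the same KeyError as in your traceback.

```python
import pandas as pd

# Hypothetical stand-in for the imported data: it has neither
# 'totalRevenue' nor 'future_revenue'.
financial_data = pd.DataFrame({"someOtherColumn": [1, 2, 3]})
expense_ratio = 0.6

messages = []
if "totalRevenue" not in financial_data.columns:
    messages.append("no totalRevenue")
    if "future_revenue" not in financial_data.columns:
        messages.append("no future_revenue")

# Nothing above created 'future_revenue', so this lookup fails.
try:
    financial_data["future_expenses"] = financial_data["future_revenue"] * expense_ratio
    raised = False
    missing_key = None
except KeyError as err:
    raised = True
    missing_key = err.args[0]
```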

The final part of your program seems to simplify to

if "future_revenue" in financial_data.columns and "future_expenses" not in financial_data.columns:
    financial_data["future_expenses"] = financial_data["future_revenue"] * expense_ratio
if "totalRevenue" in financial_data.columns:
    financial_data["future_revenue"] = financial_data["totalRevenue"].iloc[-1] * (1 + growth_rate) ** (range(1, len(financial_data) + 1))
financial_data["future_expenses"] = financial_data["future_revenue"] * expense_ratio

but you'll still need some way to derive future_revenue so you can compute future_expenses at all; right now that can't be done if there's no totalRevenue.
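One possible reason totalRevenue is missing: your dict is keyed by the statement names, so pd.DataFrame(financial_data) uses those names as its columns. The sketch below assumes each response nests its per-period records in a list under a key such as `annualReports`. That key, the symbol and the figures are assumptions for illustration, so check them against your actual response. `add_projections` is a hypothetical helper, not a pandas function. It fails early with a clear message when totalRevenue is absent, and it uses `np.arange` for the exponents because raising a float to a plain `range` object raises a TypeError.

```python
import numpy as np
import pandas as pd

# Hypothetical payload: the 'annualReports' key, the symbol, and all figures
# are assumptions made for illustration, not real API output.
raw = {
    "INCOME_STATEMENT": {
        "symbol": "XYZ",
        "annualReports": [
            {"fiscalDateEnding": "2021-12-31", "totalRevenue": "100"},
            {"fiscalDateEnding": "2022-12-31", "totalRevenue": "120"},
        ],
    }
}

# Wrapping the outer dict yields one column per statement name.
outer = pd.DataFrame(raw)
outer_columns = list(outer.columns)  # no 'totalRevenue' here

# Building the frame from the per-period records exposes the field as a column.
reports = pd.DataFrame(raw["INCOME_STATEMENT"]["annualReports"])


def add_projections(df, growth_rate, expense_ratio):
    """Return a copy of df with future_revenue and future_expenses columns."""
    if "totalRevenue" not in df.columns:
        raise ValueError(
            f"'totalRevenue' not found; available columns: {list(df.columns)}"
        )
    out = df.copy()
    # Coerce to numbers in case the values were loaded as strings.
    revenue = pd.to_numeric(out["totalRevenue"], errors="coerce")
    periods = np.arange(1, len(out) + 1)
    out["future_revenue"] = revenue.iloc[-1] * (1 + growth_rate) ** periods
    out["future_expenses"] = out["future_revenue"] * expense_ratio
    return out


projected = add_projections(reports, growth_rate=0.03, expense_ratio=0.6)
```

Calling `add_projections(outer, 0.03, 0.6)` on the outer frame raises the ValueError and lists the columns that are actually present, which is usually quicker to diagnose than a bare KeyError.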


