Pandas DataFrame - KeyError: 'date'

For a current project, I am working with a large Pandas DataFrame sourced from a JSON file.

As soon as I access specific fields of the JSON file within Pandas, I get key errors such as KeyError: 'date' for the line df['date'] = pd.to_datetime(df['date']).

I have already ruled out the naming of the identifier/object as a possible source of the error. Is there a simple tweak to make this code work?

The JSON file has the following structure:

[
{"stock_symbol": "AMG", "date": "2013-01-01", "txt_main": "ABC"}
]

And the corresponding code section looks like this:

import string
import json
import pandas as pd

# Loading and normalising the input file
file = open("sp500.json", "r")
data = json.load(file)
df = pd.json_normalize(data)
df = pd.DataFrame().fillna("")

# Datetime conversion
df['date'] = pd.to_datetime(df['date'])

Take a look at the documentation examples for the fillna function.

By doing df = pd.DataFrame().fillna("") you are overwriting your previous df with a new (empty) DataFrame. You can just apply it this way: df = df.fillna("").

In [42]: import string
    ...: import json
    ...: import pandas as pd
    ...:
    ...: # Loading and normalising the input file
    ...: #file = open("sp500.json", "r")
    ...: #data = json.load(file)
    ...: df = pd.json_normalize(a)  # a: presumably the parsed JSON list from the question
    ...: #df = pd.DataFrame().fillna("")
    ...:
    ...: # Datetime conversion
    ...: df['date'] = pd.to_datetime(df['date'])

In [43]: df
Out[43]:
  stock_symbol       date txt_main
0          AMG 2013-01-01      ABC
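
To see why the original line raises the error, here is a minimal sketch that compares the normalised frame with the one produced by pd.DataFrame().fillna(""). The names records and replaced are illustrative and not part of the original code; records stands in for the parsed contents of sp500.json.

```python
import pandas as pd

# Stand-in for the parsed JSON shown in the question (assumption: same keys).
records = [{"stock_symbol": "AMG", "date": "2013-01-01", "txt_main": "ABC"}]

# json_normalize keeps every key as a column.
df = pd.json_normalize(records)
print(list(df.columns))

# pd.DataFrame() builds a brand-new frame with no rows and no columns,
# so fillna("") has nothing to fill and the result is still empty.
replaced = pd.DataFrame().fillna("")
print(replaced.empty)
print("date" in replaced.columns)

# Looking up a column that does not exist is what raises KeyError: 'date'.
try:
    replaced["date"]
    raised = False
except KeyError:
    raised = True
print(raised)
```

Assigning that empty frame back to df discards everything json_normalize loaded, which is why the later df['date'] lookup fails.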

df = pd.DataFrame().fillna("") creates a new, empty DataFrame and replaces any NaN values in it with empty strings.

So, change that line to df = df.fillna("").

You are using df = pd.DataFrame().fillna(""), which creates a new DataFrame and fills its NaN values with an empty string.

Here the old df is replaced by an empty DataFrame, so there is no column named date. Instead, you can fill the NaN values using df.fillna("").

import string
import json
import pandas as pd

# Loading and normalising the input file
with open("sp500.json", "r") as file:  # the context manager closes the file afterwards
    data = json.load(file)
df = pd.json_normalize(data)
df = df.fillna("")

# Datetime conversion
df['date'] = pd.to_datetime(df['date'])
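
If sp500.json is not at hand, the corrected flow can be tried on an in-memory stand-in. This is only a sketch under assumptions: the io.StringIO buffer replaces the real file, the second record (symbol "XYZ", no txt_main) is made up to give fillna something to fill, and the column check before the conversion is an optional extra that the answers above do not include.

```python
import io
import json

import pandas as pd

# In-memory stand-in for sp500.json; the second record is hypothetical
# and deliberately lacks "txt_main" so that a NaN appears after normalising.
raw = io.StringIO(
    '[{"stock_symbol": "AMG", "date": "2013-01-01", "txt_main": "ABC"},'
    ' {"stock_symbol": "XYZ", "date": "2013-01-02"}]'
)

data = json.load(raw)
df = pd.json_normalize(data)

# Fill missing values on the existing frame instead of creating a new one.
df = df.fillna("")

# Optional guard: fail with a readable message if the column is missing.
if "date" not in df.columns:
    raise KeyError(f"'date' not found; available columns: {list(df.columns)}")

# Datetime conversion
df["date"] = pd.to_datetime(df["date"])
print(df.dtypes)
```

Printing df.columns right before the conversion is a quick way to spot this kind of problem, since an accidentally replaced frame shows up as an empty column list.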

Thank you
