KeyError pandas dataframe

The purpose of this script is to read a CSV file and create a dataframe from it.

The file contains forex historical data.

The file has 7 columns (Date, Time, Open, High, Low, Close and Volume) and around 600k rows.

Here is a data sample:

                        Open     High      Low    Close  Volume
Release Date                                                   
2020-02-05 01:50:00  109.450  109.452  109.449  109.451      79
2020-02-05 01:51:00  109.451  109.451  109.449  109.450      26
2020-02-05 01:52:00  109.451  109.453  109.449  109.449      29
2020-02-05 01:53:00  109.449  109.449  109.440  109.442      35
2020-02-05 01:54:00  109.443  109.443  109.432  109.432      49
2020-02-05 01:55:00  109.432  109.439  109.432  109.438      19
2020-02-05 01:56:00  109.439  109.450  109.439  109.449      56
2020-02-05 01:57:00  109.449  109.450  109.446  109.446      20
2020-02-05 01:58:00  109.446  109.451  109.446  109.448      33
2020-02-05 01:59:00  109.449  109.454  109.443  109.443      75

After parsing the date and time, the script must do some date-time calculations, such as extracting the month and day.

Then it runs some technical analysis using the TA-Lib library.

Each new step produces a dataframe.

All the new dataframes are stored in a list.

The last step is to concatenate all these dataframes into one final dataframe.

Here is the code:

import pandas as pd
import talib


class Data:
    def __init__(self):
        self.dfs = []
        self.names = ['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume']
        self.df = pd.DataFrame()
        self.Close = self.df['Close'].astype(float)

    def file(self, file):
        self.df = pd.read_csv(file, names=self.names,
                              parse_dates={'Release Date': ['Date', 'Time']})
        return self.dfs.append(self.df)

    def year(self):
        self.df['year'] = self.df['Release Date'].dt.year
        return self.dfs.append(self.df['year'])

    def month(self):
        self.df['month'] = self.df['Release Date'].dt.month
        return self.dfs.append(self.df['month'])

    def week(self):
        self.df['week'] = self.df['Release Date'].dt.week
        return self.dfs.append(self.df['week'])

    def day(self):
        self.df['day'] = self.df['Release Date'].dt.day
        return self.dfs.append(self.df['day'])

    def hour(self):
        self.df['hour'] = self.df['Release Date'].dt.hour
        return self.dfs.append(self.df['hour'])

    def minute(self):
        self.df['minute'] = self.df['Release Date'].dt.minute
        return self.dfs.append(self.df['minute'])

    def dema(self):
        self.df['DEMA'] = talib.DEMA(self.Close, timeperiod=30)
        return self.dfs.append(self.df['dema'])

    def ema(self):
        self.df['EMA'] = talib.EMA(self.Close, timeperiod=30)
        return self.dfs.append(self.df['ema'])

    def KAMA(self):
        self.df['KAMA'] = talib.KAMA(self.Close, timeperiod=30)
        return self.dfs.append(self.df['KAMA'])

    def ma(self):
        self.df['MA'] = talib.MA(self.Close, timeperiod=30, matype=0)
        return self.dfs.append(self.df['ma'])

    def action(self):
        self.year()
        self.month()
        self.week()
        self.day()
        self.hour()
        self.minute()
        self.dema()
        self.ema()
        self.KAMA()
        self.ma()

    def print(self):
        self.action()
        print(len(self.dfs))


x = Data()
x.file(r"D:\Projects\Project Forex\EURUSD.csv")
x.print()

Here is the error:

Traceback (most recent call last):

  File "C:\Users\Sayed\miniconda3\lib\site-packages\pandas\core\indexes\base.py", line 2646, in get_loc
    return self._engine.get_loc(key)

  File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'Close'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "C:/Users/Sayed/PycharmProjects/project/Technical Analysis.py", line 74, in <module>
    x = Data()

  File "C:/Users/Sayed/PycharmProjects/project/Technical Analysis.py", line 10, in __init__
    self.Close = self.df['Close'].astype(float)

  File "C:\Users\Sayed\miniconda3\lib\site-packages\pandas\core\frame.py", line 2800, in __getitem__
    indexer = self.columns.get_loc(key)

  File "C:\Users\Sayed\miniconda3\lib\site-packages\pandas\core\indexes\base.py", line 2648, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))

  File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'Close'

Your code has several problems, which Serge briefly pointed out in his comments.

In __init__ you try to define self.Close as self.Close = self.df['Close'].astype(float)

but at that point self.df has just been initialized as an empty dataframe and file() has not been called yet, so there is no column named 'Close'. That lookup is what raises the KeyError in the traceback.
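
A minimal sketch of this failure and one possible fix, assuming the intent is for self.Close to hold the closing prices once the file has been loaded. The Data class here is trimmed to the relevant parts and is not the original code, and the io.StringIO rows (including their date format) are hypothetical stand-ins for the real CSV file:

```python
import io

import pandas as pd

# Indexing a column on an empty DataFrame raises KeyError, as in the traceback.
empty = pd.DataFrame()
try:
    empty['Close']
    raised = False
except KeyError:
    raised = True
print('KeyError raised on empty frame:', raised)


class Data:
    def __init__(self):
        self.names = ['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume']
        self.df = pd.DataFrame()
        self.Close = None  # filled in by file(), once the column exists

    def file(self, file):
        self.df = pd.read_csv(file, names=self.names)
        # The 'Close' column is only available after the CSV has been read.
        self.Close = self.df['Close'].astype(float)
        return self.df


# Hypothetical sample rows standing in for the real CSV file.
sample = io.StringIO(
    "2020.02.05,01:50,109.450,109.452,109.449,109.451,79\n"
    "2020.02.05,01:51,109.451,109.451,109.449,109.450,26\n"
)
x = Data()
x.file(sample)
print(x.Close.tolist())
```

Deferring the assignment to file() is only one option. Reading self.df['Close'] directly inside each indicator method would avoid keeping a separate attribute in sync with the dataframe.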

You then proceed to define:

    def dema(self):
        self.df['DEMA'] = talib.DEMA(self.Close, timeperiod=30)
        return self.dfs.append(self.df['dema'])

    def ema(self):
        self.df['EMA'] = talib.EMA(self.Close, timeperiod=30)
        return self.dfs.append(self.df['ema'])

    def ma(self):
        self.df['MA'] = talib.MA(self.Close, timeperiod=30, matype=0)
        return self.dfs.append(self.df['ma'])

In these three cases you define a column with an uppercase name (e.g. 'MA')

but then try to append a column with a lowercase name (e.g. self.dfs.append(self.df['ma'])). Column labels are case-sensitive, so each of these lookups raises a KeyError as well.
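
A small sketch of the case mismatch and of collecting the columns under consistent names. To keep it runnable without TA-Lib, a plain pandas rolling mean stands in for talib.MA with matype=0. The toy prices and the window of 2 are assumptions for illustration, not values from the question:

```python
import pandas as pd

df = pd.DataFrame({'Close': [1.0, 2.0, 3.0, 4.0]})
dfs = []

# Stand-in for the TA-Lib moving average: a simple rolling mean over 2 rows.
df['MA'] = df['Close'].rolling(window=2).mean()

# Column labels are case-sensitive: 'ma' does not exist, only 'MA' does.
try:
    dfs.append(df['ma'])
except KeyError as err:
    print('KeyError:', err)

# Appending with the same label that was used on assignment works.
dfs.append(df['Close'])
dfs.append(df['MA'])

# Concatenate the collected columns side by side into one final dataframe.
final = pd.concat(dfs, axis=1)
print(final.columns.tolist())
```

Using one spelling for each column, both when assigning and when appending, is enough to remove these errors. Passing axis=1 to pd.concat places the collected Series next to each other as columns instead of stacking them.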

What do you want to accomplish with self.Close? Do you want a DataFrame with only the Close column?

If yes, try self_close instead of self.close.
