
Calendar pivot table in pandas raises KeyError

I have a daily time series of values, something like this:

date          value
2020-01-01    50000
2020-01-02    50130
...
2020-10-18    48763

The column 'date' is used as the index and is parsed when importing the CSV.

I'd like to put those values into a pivot table like this using pandas

       2018     2019     2020
------------------------------
jan   50000    32420    21488
feb   48237    38240    98783
mar   51682    21984    21984
apr   49956    14878    14847

where the values are summed per month within each year. I'm using the pandas and calendar libraries and the .pivot_table function.

I followed this guide, whose author uses these lines of code:

import calendar
import pandas as pd

all_month_year_df = pd.pivot_table(df, values="Open",
                               index=["month"],
                               columns=["year"],
                               fill_value=0,
                               margins=True)
# name months; non-integer labels such as the "All" margin row are left as-is
named_index = [[calendar.month_abbr[i] if isinstance(i, int) else i
                for i in list(all_month_year_df.index)]]
all_month_year_df = all_month_year_df.set_index(named_index)
all_month_year_df

but all I get is a KeyError for 'month', and I can't figure out the reason.

Can you help me figure out why? Where is this code wrong? I'm using Python 3.8.3 (64-bit) with VS Code on Ubuntu 20.04, if this info helps.

Thank you

Your input DataFrame only has two columns, date and value.

You need to add two columns for month and year, something like:

df['month'] = df['date'].dt.month
df['year'] = df['date'].dt.year

The problem is you don't have a month or year column yet (you only have a date column), so you need to create the columns first based on the date column as follows:

df['month'] = df.date.dt.month
df['year'] = df.date.dt.year
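
To make the difference concrete, here is a minimal sketch of creating the two columns in both situations: when 'date' is an ordinary column and when it is the index, as in the question. The sample data is made up for illustration, and the names by_column and by_index are hypothetical.

```python
import pandas as pd

# Hypothetical sample data shaped like the question's CSV
df = pd.DataFrame(
    {
        "date": ["2020-01-01", "2020-01-02", "2020-02-01"],
        "value": [50000, 50130, 48763],
    }
)
df["date"] = pd.to_datetime(df["date"])

# Case 1: 'date' is an ordinary datetime column, so the .dt accessor applies
by_column = df.copy()
by_column["month"] = by_column["date"].dt.month
by_column["year"] = by_column["date"].dt.year

# Case 2: 'date' is the index (as in the question); a DatetimeIndex
# exposes .month and .year directly, without .dt
by_index = df.set_index("date")
by_index["month"] = by_index.index.month
by_index["year"] = by_index.index.year
```

Either way, the frame ends up with the 'month' and 'year' columns that the pivot_table call looks up by name.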

The answers above won't work as written because you are using the 'date' column as the index, so df['date'] does not exist. Simply replace the index and columns arguments with this:

index=[df.index.month], columns=[df.index.year]

And since you have not cleaned your dataset, use

margins=False

This will definitely work, and it is also shorter than creating new month and year columns.
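
Putting it together, a minimal end-to-end sketch might look like the following. It assumes the value column is named 'value' (the guide's "Open" column does not exist in the question's data) and uses made-up sample numbers. aggfunc="sum" is added because the question asks for sums, while pivot_table averages by default. The rename calls only give the two groupers distinct labels and are optional.

```python
import calendar
import pandas as pd

# Hypothetical sample data; 'date' is the index, as in the question
df = pd.DataFrame(
    {"value": [100, 200, 300, 400, 500]},
    index=pd.to_datetime(
        ["2019-01-10", "2019-02-05", "2020-01-01", "2020-01-02", "2020-02-15"]
    ),
)
df.index.name = "date"

# Group directly on the index's month and year instead of on named columns
months = df.index.month.rename("month")
years = df.index.year.rename("year")

table = pd.pivot_table(
    df,
    values="value",
    index=months,
    columns=years,
    aggfunc="sum",   # the default would be the mean
    fill_value=0,
)

# Replace month numbers with abbreviated names (locale-dependent)
table.index = [calendar.month_abbr[int(m)] for m in table.index]
print(table)
```

With margins left at its default of False, every index label is a month number, so the renaming step does not need the isinstance check from the question's code.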
