
Creating a pivot_table fails on a pandas DataFrame

I have a DataFrame with columns year, month, source, ... There are multiple records per (year, month, source), and I need to generate a pivot table whose index is (year, month), whose columns are the source values, and whose values are the record counts per (year, month, source). I have the following code:

df.pivot_table(index=['year', 'month'], columns=['source'], aggfunc=np.size, fill_value=0)

Here is what my data looks like:

2001,02,A, ....
2001,02,A,....
2001,03,B,....
2001,03,B,....
2001,03,B,....

And this is the output I want:

           A  B
2001, 02,  2, 0
2001, 03,  0, 3

But it throws the following error message:

 Reindexing only valid with uniquely valued Index objects

What's wrong?

You can get the desired output by using aggfunc=len.

import pandas as pd

df = pd.DataFrame([[2001, '02', 'A'], [2001, '02', 'A'], [2001, '03', 'B'],
                   [2001, '03', 'B'], [2001, '03', 'B']],
                  columns=['Year', 'Month', 'Source'])

res = df.pivot_table(index=['Year', 'Month'], columns='Source',
                     aggfunc=len, fill_value=0)

print(res)

Source      A  B
Year Month      
2001 02     2  0
     03     0  3
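
As a cross-check on the counts above, here is a minimal sketch of two other ways to build the same table without passing an aggregation function to pivot_table. It reuses the sample frame from this answer; the names counts and xtab are only illustrative. It assumes the column names in the frame are unique and does not try to diagnose the original error.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame(
    [[2001, '02', 'A'], [2001, '02', 'A'], [2001, '03', 'B'],
     [2001, '03', 'B'], [2001, '03', 'B']],
    columns=['Year', 'Month', 'Source'])

# Option 1: count rows per (Year, Month, Source) group, then move
# Source from the row index into the columns, filling gaps with 0.
counts = (df.groupby(['Year', 'Month', 'Source'])
            .size()
            .unstack('Source', fill_value=0))

# Option 2: crosstab computes a frequency table by default.
xtab = pd.crosstab([df['Year'], df['Month']], df['Source'])

print(counts)
print(xtab)
```

Both variants count rows directly, so they do not depend on which value columns are left over after the index and column keys are chosen.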
