
Pandas: pivot dataframe and preserve additional non-numeric column

I have some data in a list format: figures for 150-odd organisations, with a figure for each of a 12-month series. In its raw form it looks like this:

Name Size   Date  Figure
Org1 Medium Jun16 8.36
Org1 Medium Jul16 7.55
Org1 Medium Aug16 8.57
...
Org1 Medium May17 9.41
Org2 Large  Jun16 12.12
Org2 Large  Jul16 11.44
...

So each organisation has a unique name, twelve months of data, and one of three sizes (small, medium, large). I've successfully pivoted these figures to give me a time series for each organisation, i.e.:

Name Jun16 Jul16 Aug16 Sep16 Oct16...
Org1 8.36  7.55  8.57  7.66  9.43
Org2 12.12 11.44 11.01 12.01 10.44...

But I want to include another column containing the size of each organisation. The code I've used for the pivot is:

dataPivot = dataRaw.pivot_table(index='Name', columns='Date',
                                aggfunc='sum', values='Figure').fillna(0)

where dataRaw is the raw data read in from a .csv. I've tried adding 'Size' to the columns field, but this just gives me 12 additional columns for each size!

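One way of doing that is to build a separate frame keyed by Name and concat it with the pivot table. Note that groupby('Name').size() counts the rows per organisation; it does not return the Size column, i.e.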

table = df.pivot_table(index='Name', columns ='Date', aggfunc='sum', values = 'Figure').fillna(0)

size = df.groupby('Name').size().to_frame().rename(columns={0:'size'})

ndf = pd.concat([table, size], axis=1)

Output based on sample data:

      Aug16  Jul16  Jun16  May17  size
Name                                  
Org1   8.57   7.55   8.36   9.41     4
Org2   0.00  11.44  12.12   0.00     2
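
If the aim is to carry over each organisation's Size label rather than a count of its rows, the same concat pattern can probably be adapted by taking the first Size seen for each Name. This is only a sketch: it assumes every organisation has exactly one Size, the sample rows are made up to mirror the question, and size_label and ndf_label are hypothetical names.

```python
import pandas as pd

# Hypothetical sample rows mirroring the layout in the question
df = pd.DataFrame({
    'Name':   ['Org1', 'Org1', 'Org1', 'Org2', 'Org2'],
    'Size':   ['Medium', 'Medium', 'Medium', 'Large', 'Large'],
    'Date':   ['Jun16', 'Jul16', 'Aug16', 'Jun16', 'Jul16'],
    'Figure': [8.36, 7.55, 8.57, 12.12, 11.44],
})

table = df.pivot_table(index='Name', columns='Date',
                       aggfunc='sum', values='Figure').fillna(0)

# Assumption: Size is constant within each Name, so the first value stands for the group
size_label = df.groupby('Name')['Size'].first()

# Both objects are indexed by Name, so concat lines them up row by row
ndf_label = pd.concat([table, size_label], axis=1)
print(ndf_label)
```

If an organisation could appear with more than one Size, first() would silently pick one of them, so it may be worth checking df.groupby('Name')['Size'].nunique() beforehand.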

If you mean to add the Size column already present in the dataframe, add that column name to the index parameter rather than to columns, i.e.

df.pivot_table(index=['Name', 'Size'], columns=['Date'], aggfunc='sum', values=['Figure']).fillna(0).reset_index()

Output:

      Name    Size Figure
Date                Aug16  Jul16  Jun16 May17
0     Org1  Medium   8.57   7.55   8.36  9.41
1     Org2   Large   0.00  11.44  12.12  0.00
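
To end up with ordinary single-level columns (Name, Size, then one column per month), one option is to pass values as a plain string instead of a list, which avoids the extra Figure level in the header. A minimal sketch under the same assumptions as before: the sample rows are invented and wide is a hypothetical name.

```python
import pandas as pd

# Hypothetical sample rows mirroring the layout in the question
df = pd.DataFrame({
    'Name':   ['Org1', 'Org1', 'Org1', 'Org2', 'Org2'],
    'Size':   ['Medium', 'Medium', 'Medium', 'Large', 'Large'],
    'Date':   ['Jun16', 'Jul16', 'Aug16', 'Jun16', 'Jul16'],
    'Figure': [8.36, 7.55, 8.57, 12.12, 11.44],
})

wide = (
    df.pivot_table(index=['Name', 'Size'], columns='Date',
                   values='Figure', aggfunc='sum')  # scalar values keeps columns flat
      .fillna(0)
      .reset_index()                                # Name and Size become regular columns
)
wide.columns.name = None  # drop the leftover 'Date' label on the column axis
print(wide)
```

The month columns come out in alphabetical order, as in the output above.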
