I have a dataframe:
index col1 col2 col3
0 X A 123
1 X A 456
2 X B 654
3 X B 321
4 X A 111
I'm trying to use pivot_table
to get this df into the following format:
index A B
0 123 654
1 456 321
2 111 NaN
I'm trying to use... pivot_table(index=df.index, columns=['col2'], values='col3')
, however this just ends up in the following state:
index A B
0 123 NaN
1 456 NaN
2 111 NaN
3 NaN 654
4 NaN 321
How can I avoid this?
Still using pivot_table
pd.pivot_table(df,index=df.groupby('col2').cumcount(),columns='col2',values='col3')
Out[163]:
col2 A B
0 123.0 654.0
1 456.0 321.0
2 111.0 NaN
Your implementation of pivot_table
is correct.
To combine records (in the pivoted dataframe), which you are asking about, you need first use pivot_table
and then simply drop the NaNs from the pivoted dataframe using apply
, as this answer shows :
# Perform spreadsheet-style pivot table
df2 = df.pivot_table(index=df.index, columns=['col2'], values='col3')
print df2
# Drop NaNs to shift rows up and combine records
df1 = df2.apply(lambda x: pd.Series(x.dropna().values))
print (df1)
The output is:
# df2
col2 A B
0 123.0 NaN
1 456.0 NaN
2 NaN 654.0
3 NaN 321.0
4 111.0 NaN
# df1
col2 A B
0 123.0 654.0
1 456.0 321.0
2 111.0 NaN
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.