I have a dataframe like the following:
     A     B   C Property Value
1  Bob  b100  X1    prop1     a
2  Bob  b100  X1    prop2     b
3  Bob  b100  X2    prop1     c
4  Bob  b100  Y1    prop1     a
5  Bob  b100   Z    prop9     b
6  Bob  b200  X1    prop1     c
7  Bob  b200  X1    prop2     d
How can I use unstack, pivot, or some other method to only unstack the Property and Value columns?
I am trying to obtain the following dataframe:
     A     B   C prop1 prop2 prop9
1  Bob  b100  X1     a     b     -
2  Bob  b100  X2     c     -     -
3  Bob  b100  Y1     a     -     -
4  Bob  b100   Z     -     -     b
5  Bob  b200  X1     c     d     -
In [115]: (df.pivot_table(index=['A','B','C'], columns='Property', values='Value',
     ...:                 aggfunc='first', fill_value='-')
     ...:    .reset_index()
     ...:    .rename_axis(None, axis=1))
     ...:
Out[115]:
     A     B   C prop1 prop2 prop9
0  Bob  b100  X1     a     b     -
1  Bob  b100  X2     c     -     -
2  Bob  b100  Y1     a     -     -
3  Bob  b100   Z     -     -     b
4  Bob  b200  X1     c     d     -
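For anyone who wants to run this outside IPython, here is a minimal self-contained sketch of the same pivot_table approach. The DataFrame construction is reconstructed from the table in the question and assumes every column holds plain strings; the name `wide` is just an illustrative choice.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed to be all strings).
df = pd.DataFrame({
    "A": ["Bob"] * 7,
    "B": ["b100"] * 5 + ["b200"] * 2,
    "C": ["X1", "X1", "X2", "Y1", "Z", "X1", "X1"],
    "Property": ["prop1", "prop2", "prop1", "prop1", "prop9", "prop1", "prop2"],
    "Value": ["a", "b", "c", "a", "b", "c", "d"],
})

wide = (
    df.pivot_table(index=["A", "B", "C"], columns="Property",
                   values="Value", aggfunc="first", fill_value="-")
      .reset_index()              # turn A, B, C back into ordinary columns
      .rename_axis(None, axis=1)  # drop the leftover 'Property' columns name
)
print(wide)
```

aggfunc='first' is needed because the default aggregation (mean) does not apply to string values.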
Or using unstack:
In [124]: (df.set_index(['A','B','C','Property'])
     ...:    ['Value'].unstack('Property', fill_value='-')
     ...:    .reset_index()
     ...:    .rename_axis(None, axis=1))
     ...:
Out[124]:
     A     B   C prop1 prop2 prop9
0  Bob  b100  X1     a     b     -
1  Bob  b100  X2     c     -     -
2  Bob  b100  Y1     a     -     -
3  Bob  b100   Z     -     -     b
4  Bob  b200  X1     c     d     -
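One practical difference between the two approaches shows up when the same (A, B, C, Property) combination occurs more than once. The sketch below uses a small hypothetical frame (`dup` is not part of the question's data) to illustrate it: pivot_table aggregates the duplicates, while unstack has no aggregation step and raises.

```python
import pandas as pd

# Hypothetical data: the same (A, B, C, Property) key appears twice.
dup = pd.DataFrame({
    "A": ["Bob", "Bob"],
    "B": ["b100", "b100"],
    "C": ["X1", "X1"],
    "Property": ["prop1", "prop1"],
    "Value": ["a", "z"],
})

# pivot_table aggregates duplicates; aggfunc='first' keeps the first value seen.
first = dup.pivot_table(index=["A", "B", "C"], columns="Property",
                        values="Value", aggfunc="first")

# unstack cannot reshape an index with duplicate entries, so it raises ValueError.
try:
    dup.set_index(["A", "B", "C", "Property"])["Value"].unstack("Property")
    unstack_failed = False
except ValueError as exc:
    unstack_failed = True
    print("unstack failed:", exc)
```

So if duplicates are possible in your data, pivot_table is the safer choice, at the cost of silently discarding the extra values.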
Here's another solution using pd.get_dummies. I also added some benchmarks for comparison.
import numpy as np
import pandas as pd

def jp(df):
    # One indicator column per Property value, joined onto the original frame.
    df = df.join(pd.get_dummies(df.Property))
    # Where the indicator is set, substitute the row's Value.
    for col in ['prop1', 'prop2', 'prop9']:
        df[col] = np.where(df[col], df.Value, df[col])
    # Per (A, B, C) group, keep the first truthy entry, or 0 if there is none.
    return (df.drop(columns=['Property', 'Value'])
              .groupby(['A', 'B', 'C'])
              .agg(lambda s: next((i for i in s if i), 0))
              .reset_index())

def maxu(df):
    return (df.pivot_table(index=['A','B','C'], columns='Property', values='Value',
                           aggfunc='first', fill_value='-')
              .reset_index().rename_axis(None, axis=1))

def maxu2(df):
    return (df.set_index(['A','B','C','Property'])['Value']
              .unstack('Property', fill_value='-')
              .reset_index().rename_axis(None, axis=1))
%timeit jp(df.copy()) # 14 ms ± 176 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit maxu(df.copy()) # 14.1 ms ± 181 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit maxu2(df.copy()) # 10.4 ms ± 1.98 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
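%timeit only works inside IPython or Jupyter. As a rough sketch of how the comparison could be repeated in a plain script, the standard-library timeit module can be used instead. The sample frame is rebuilt from the question, only the pivot_table and unstack variants are timed, and the loop count of 20 is an arbitrary choice; absolute numbers will differ by machine and pandas version.

```python
import timeit
import pandas as pd

# Sample data reconstructed from the question (assumed to be all strings).
df = pd.DataFrame({
    "A": ["Bob"] * 7,
    "B": ["b100"] * 5 + ["b200"] * 2,
    "C": ["X1", "X1", "X2", "Y1", "Z", "X1", "X1"],
    "Property": ["prop1", "prop2", "prop1", "prop1", "prop9", "prop1", "prop2"],
    "Value": ["a", "b", "c", "a", "b", "c", "d"],
})

def maxu(df):
    return (df.pivot_table(index=["A", "B", "C"], columns="Property",
                           values="Value", aggfunc="first", fill_value="-")
              .reset_index().rename_axis(None, axis=1))

def maxu2(df):
    return (df.set_index(["A", "B", "C", "Property"])["Value"]
              .unstack("Property", fill_value="-")
              .reset_index().rename_axis(None, axis=1))

loops = 20  # arbitrary; raise it for steadier numbers
timings = {}
for fn in (maxu, maxu2):
    total = timeit.timeit(lambda: fn(df.copy()), number=loops)
    timings[fn.__name__] = total / loops
    print(f"{fn.__name__}: {timings[fn.__name__] * 1e3:.2f} ms per loop")
```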