I have a dataframe with different data types like the one below.
df:
A B C
0 True X 1
1 False X 2
2 False Y 3
3 True Y 4
4 False X 5
5 True X 6
6 True Y 7
7 False Y 8
I need to take the values of C from the rows where A is True and put them in separate columns for each of X and Y in B. The desired df looks like this:
df_desired:
X_1 X_2 Y_1 Y_2
0 1 6 4 7
I was able to group df to get the values that are true in A for both X and Y in B using this code:
df1 = df.groupby(by=['B', 'A'])['C'].apply(list).reset_index()
df1:
B A C
0 X False [2, 5]
1 X True [1, 6]
2 Y False [3, 8]
3 Y True [4, 7]
Assigning the True rows to another dataframe suggests I'm on the right path, but I'm stuck on the last steps needed to reach the desired dataframe.
df2['X'] = df1[df1['A']].iloc[0]['C']
df2['Y'] = df1[df1['A']].iloc[1]['C']
df2:
X Y
0 1 4
1 6 7
I've tried df2.transpose(), but it does not work, especially when the shape of df2 is not square.
What would be the fastest way of doing this?
You can do:
d = df.loc[df['A']]
c = d['B'] + '_' + d.groupby('B').cumcount().add(1).astype(str)
d = pd.DataFrame([d['C'].values], columns=c).sort_index(axis=1)
Alternatively,
d = df.loc[df['A'], ['B', 'C']].copy()
d['B'] += '_' + d.groupby('B').cumcount().add(1).astype(str)
d = d.set_index('B').T.reset_index(drop=True).sort_index(axis=1)
print(d)
X_1 X_2 Y_1 Y_2
0 1 6 4 7
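For reference, the first approach can be run end to end as in the sketch below. It rebuilds the sample df from the question and passes axis to sort_index as a keyword; the names counter, labels and result are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [True, False, False, True, False, True, True, False],
    "B": ["X", "X", "Y", "Y", "X", "X", "Y", "Y"],
    "C": [1, 2, 3, 4, 5, 6, 7, 8],
})

# Keep only the rows where A is True.
d = df.loc[df["A"]]

# Number the rows within each B group, starting at 1.
counter = d.groupby("B").cumcount().add(1)

# Build one column label per remaining row, e.g. "X_1", "Y_1", "X_2", "Y_2".
labels = d["B"] + "_" + counter.astype(str)

# Lay the C values out as a single row and order the columns by label.
result = pd.DataFrame([d["C"].to_numpy()], columns=labels.to_numpy())
result = result.sort_index(axis=1)

print(result)
```

Note that the columns are sorted as strings, so a group with ten or more True rows would place X_10 before X_2.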
I think you need:
df2 = df.loc[df['A']]
new_df = df2.pivot_table(columns=['B', df2.groupby('B').cumcount().add(1)], values='C', aggfunc='first')
new_df = new_df.set_axis([f'{x}_{y}' for x, y in new_df.columns], axis=1)
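A self-contained sketch of the pivot_table approach, assuming the sample df from the question. The helper column n is a hypothetical name introduced here to hold the per-group counter, so that pivot_table only receives column names; the final reset_index is an optional extra step to replace the 'C' row label with 0.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A": [True, False, False, True, False, True, True, False],
    "B": ["X", "X", "Y", "Y", "X", "X", "Y", "Y"],
    "C": [1, 2, 3, 4, 5, 6, 7, 8],
})

# Keep the True rows and store the per-group counter in a helper column "n".
df2 = df.loc[df["A"]].copy()
df2["n"] = df2.groupby("B").cumcount().add(1)

# Pivot so that every (B, n) pair becomes a column holding its C value.
new_df = df2.pivot_table(columns=["B", "n"], values="C", aggfunc="first")

# Flatten the two-level column index into labels such as "X_1".
new_df = new_df.set_axis([f"{x}_{y}" for x, y in new_df.columns], axis=1)

# Optional: replace the "C" row label with a plain 0-based index.
new_df = new_df.reset_index(drop=True)

print(new_df)
```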