
Values from rows to columns of a pandas dataframe

I have a dataframe with different data types like the one below.

df:
    A     B     C
0   True  X     1
1   False X     2
2   False Y     3
3   True  Y     4
4   False X     5
5   True  X     6
6   True  Y     7
7   False Y     8
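For anyone who wants to follow along, the frame above can be rebuilt with something like the snippet below. This is only a sketch: the dtypes (bool, string, int) are assumed from how the table is printed.

```python
import pandas as pd

# Sample frame from the question; dtypes are assumed from the printed table.
df = pd.DataFrame({
    "A": [True, False, False, True, False, True, True, False],
    "B": ["X", "X", "Y", "Y", "X", "X", "Y", "Y"],
    "C": [1, 2, 3, 4, 5, 6, 7, 8],
})
print(df)
```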

I need to take the values from C for the rows where A is True and put them in separate columns for each value of B (X and Y). The desired df looks like this:

df_desired:
    X_1    X_2    Y_1    Y_2
0   1      6      4      7

I was able to group df to get the values that are true in A for both X and Y in B using this code:

df1 = df.groupby(by=['B', 'A'])['C'].apply(list).reset_index()
df1:
    B   A       C
0   X   False   [2, 5]
1   X   True    [1, 6]
2   Y   False   [3, 8]
3   Y   True    [4, 7]

Assigning the True rows to another dataframe suggests I'm on the right path, but I'm stuck on the last steps of getting to the desired dataframe.

df2 = pd.DataFrame()
df2['X'] = df1[df1['A']].iloc[0]['C']
df2['Y'] = df1[df1['A']].iloc[1]['C']
df2:
    X   Y
0   1   4
1   6   7

I've tried df2.transpose but it doesn't work, especially when the shape of df2 is not square.

What would be the fastest way of doing this?

You can do:

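# keep only the rows where A is True
d = df.loc[df['A']]
# label each row as <B>_<position within its B group>, counting from 1
c = d['B'] + '_' + d.groupby('B').cumcount().add(1).astype(str)
d = pd.DataFrame([d['C'].values], columns=c).sort_index(axis=1)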

Alternatively,

d = df.loc[df['A'], ['B', 'C']].copy()
d['B'] += '_' + d.groupby('B').cumcount().add(1).astype(str)
d = d.set_index('B').T.reset_index(drop=True).sort_index(axis=1)

print(d)

   X_1  X_2  Y_1  Y_2
0    1    6    4    7
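Since the question mentions trouble when the groups are not the same size, here is a small sketch of the same cumcount idea applied to uneven groups. The helper name `true_values_as_columns` and the `uneven` frame are made up for illustration; they are not from the original posts.

```python
import pandas as pd

def true_values_as_columns(df):
    """Hypothetical helper: one-row frame of C values where A is True, labelled B_n."""
    d = df.loc[df["A"], ["B", "C"]]
    # cumcount numbers the rows inside each B group 0, 1, 2, ...; add 1 to start at 1
    labels = d["B"] + "_" + d.groupby("B").cumcount().add(1).astype(str)
    out = pd.DataFrame([d["C"].to_numpy()], columns=labels.to_numpy())
    return out.sort_index(axis=1)

# Illustrative data: three True rows for X, two for Y
uneven = pd.DataFrame({
    "A": [True, True, False, True, True, True],
    "B": ["X", "Y", "X", "X", "Y", "X"],
    "C": [1, 2, 3, 4, 5, 6],
})
result = true_values_as_columns(uneven)
print(result)
```

Because the result is always a single row, the group sizes do not have to match. One caveat: the columns are sorted as strings, so with ten or more values per group X_10 would sort before X_2.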

I think you need:

df2 = df.loc[df['A']]
new_df = df2.pivot_table(columns=['B', df2.groupby('B').cumcount().add(1)], values='C', aggfunc='first')
new_df = new_df.set_axis([f'{x}_{y}' for x, y in new_df.columns], axis = 1)
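A self-contained variant of the pivot_table idea might look like the sketch below. It differs from the snippet above in one respect: the per-group counter is first stored in a named column (`n`, a name chosen here purely for illustration) rather than passed to `pivot_table` as a Series.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [True, False, False, True, False, True, True, False],
    "B": ["X", "X", "Y", "Y", "X", "X", "Y", "Y"],
    "C": [1, 2, 3, 4, 5, 6, 7, 8],
})

d = df.loc[df["A"]].copy()
# "n" is an illustrative column name: 1-based position within each B group
d["n"] = d.groupby("B").cumcount().add(1)
# no index is given, so the result is a single row with (B, n) column pairs
wide = d.pivot_table(values="C", columns=["B", "n"], aggfunc="first")
# flatten the two column levels into labels such as X_1
wide.columns = [f"{b}_{n}" for b, n in wide.columns]
wide = wide.reset_index(drop=True)
print(wide)
```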

import pandas as pd
a = pd.DataFrame([[True, 'X', 1], [False, 'X', 2], [False, 'Y', 3], [True, 'Y', 4], [False, 'X', 5], [True, 'X', 6], [True, 'Y', 7], [False, 'Y', 8]], columns=["A", "B", "C"])
# collect the C values into lists, one row per B value and one column per A value
a = pd.pivot_table(a, values='C', index=['B'], columns=["A"], observed=True, aggfunc=list)

