简体   繁体   中英

Pandas Dataframe - How to combine multiple rows to one

I have a dataset in the form:

    A            B
0  30  60538815980
1  30   7410811099
2  26   2238403510
3  26   2006613744
4  26   2006618783
5  26   2006613743

I want to combine the rows where the value of A matches and produce something like that

            C_1         C_2         C_3         C_4
A                                                  
26   2238403510  2006613744  2006618783  2006613743
30  60538815980  7410811099         NaN         NaN

I have tried expressing it in terms of join or merge but have failed so far. Is there any simple way to express that or will I have to use apply and create a new DataFrame?

First, create a groupby object based on column A . Then create a new dataframe df2 which uses ix to index column B of each group based on the value n from column A . Set the index of this dataframe equal to the key values from the groupby (ie the unique values from column A ).

Finally, use a list comprehension to set the new column values equal to C_1 , C_2 , ..., etc.

df = pd.DataFrame({'A': [30, 30, 26, 26, 26, 26], 
                   'B': [60538815980, 7410811099, 2238403510, 
                         2006613744, 2006618783, 2006613743]})

gb = df.groupby('A')
df2 = pd.DataFrame([df.ix[gb.groups[n], 'B'].values for n in gb.groups], 
                   index=gb.groups.keys())
df2.columns = ["C_" + str(i + 1) for i in df2.columns]
df2.index.name = "A"

>>> df2
            C_1         C_2         C_3         C_4
A
26   2238403510  2006613744  2006618783  2006613743
30  60538815980  7410811099         NaN         NaN

I'd also recommend using groupby but I think we can use pivot to simplify things. First, we create a new C column with the column labels we want to use, and then we call pivot :

>>> df["C"] = "C_" + (df.groupby("A").cumcount() + 1).astype(str)
>>> df.pivot(index="A", values="B", columns="C")
C           C_1         C_2         C_3         C_4
A                                                  
26   2238403510  2006613744  2006618783  2006613743
30  60538815980  7410811099         NaN         NaN

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM