简体   繁体   中英

adding first row for each group in pandas dataframe

These are my dataframes:

df1 = pd.DataFrame({'sym': ['msft', 'msft', 'amd', 'amd'], 'close': [102, 100, 35, 30]})

df2 = pd.DataFrame({'sym': ['msft', 'amd'], 'close': [103, 36]})

I want to combine df1 and df2 in a way that a first row is added for each group of sym in df1 . This is a simplified dataframe, my original has thousand of rows.

This is my desired outcome:

 sym  close
0  msft    103
1  msft    102
2  msft    100
3   amd     36
4   amd     35
5   amd     30

You can do this using pd.Categorical() which lets you set the order of a categorical column.

m=df2.append(df1,ignore_index=True)
m=m.assign(sym=pd.Categorical(m.sym,df1.sym.unique(),ordered=True)).sort_values('sym')

    sym  close
0  msft    103
2  msft    102
3  msft    100
1   amd     36
4   amd     35
5   amd     30

We can use pd.concat with sort_values descending:

df = pd.concat([df1, df2]).sort_values(['sym', 'close'], 
                                       ascending=False).reset_index(drop=True)

    sym  close
0  msft    103
1  msft    102
2  msft    100
3   amd     36
4   amd     35
5   amd     30

2nd example

based on anky_91's comment

df1 = pd.DataFrame({'sym': ['msft', 'msft', 'amd', 'amd','cmd'], 'close': [102, 100, 35, 30,40]})
df2 = pd.DataFrame({'sym': ['msft', 'cmd','amd'], 'close': [103, 23,36]})

    sym  close
0  msft    102
1  msft    100
2   amd     35
3   amd     30
4   cmd     40

    sym  close
0  msft    103
1   cmd     23
2   amd     36

df = pd.concat([df1, df2]).sort_values(['sym', 'close'], 
                                        ascending=False).reset_index(drop=True)

    sym  close
0  msft    103
1  msft    102
2  msft    100
3   cmd     40
4   cmd     23
5   amd     36
6   amd     35
7   amd     30

First concat both dataframes
Sort by both columns in descending order

df3 = pd.concat([df1,df2],sort=False)
df3.sort_values(['close', 'sym'], ascending=False,inplace=True)

This should work:

import pandas as pd

df1 = pd.DataFrame({'sym': ['msft', 'msft', 'amd', 'amd'], 'close': [102, 100, 35, 30]})
df2 = pd.DataFrame({'sym': ['msft', 'amd'], 'close': [103, 36]})

df3 = df1.append(df2)
df3 = df3.sort_values(['close', 'sym'], ascending=False)

Output:

    sym close
0   msft    103
0   msft    102
1   msft    100
1   amd 36
2   amd 35
3   amd 30

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM