Pandas: for each last row in a group, assign a column a value

How can I assign a value to the last row of each group (assuming I have already sorted the DataFrame)?

import pandas as pd

# data
df = pd.DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                  columns=['colA', 'colB'])

# create a new col
df['colC'] = 'Not Current'

# my attempt -- groupby col of interest, get last row, apply value to 'colC' column
df.loc[df.reset_index().groupby('colA').tail(1), 'colC'] = 'Current'

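.loc expects row labels (or a boolean mask), not a DataFrame, so you can fix it by passing the index of the rows returned by tail(1):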

df.loc[df.groupby('colA').tail(1).index, 'colC'] = 'Current'
df
Out[105]: 
  colA  colB         colC
0    a     1  Not Current
1    a     2      Current
2    b     1  Not Current
3    b     2      Current
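
To make the fix easier to follow, here is a minimal sketch that splits the one-liner into steps, using the same sample data as the question. The intermediate names last_rows and last_labels are only illustrative. It assumes the index labels are unique; with duplicate labels, .loc would also update other rows that share a label.

```python
import pandas as pd

df = pd.DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                  columns=['colA', 'colB'])
df['colC'] = 'Not Current'

# tail(1) returns the last row of each group as a DataFrame
# that keeps the original index labels.
last_rows = df.groupby('colA').tail(1)

# The index of that result holds the labels of the rows to update.
last_labels = last_rows.index

# .loc accepts those labels as the row selector.
df.loc[last_labels, 'colC'] = 'Current'
print(df)
```

Because tail(1) preserves the original labels, there is no need for reset_index() here.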

Use loc with a boolean mask built by duplicated:

df['colC'] = 'Not Current'
not_last_rows = df['colA'].duplicated(keep='last')
df.loc[~not_last_rows, 'colC'] = 'Current'
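
As a quick illustration of what the mask contains, this sketch prints it for the question's sample data. With keep='last', duplicated flags every occurrence of a value except the last one, so negating the mask selects the last row per value.

```python
import pandas as pd

df = pd.DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                  columns=['colA', 'colB'])

# True for every row whose colA value appears again further down.
not_last_rows = df['colA'].duplicated(keep='last')
print(not_last_rows.tolist())      # [True, False, True, False]

# Negating the mask leaves True only on the last row of each value.
print((~not_last_rows).tolist())   # [False, True, False, True]
```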

Or, in your case, use np.where (this needs import numpy as np):

not_last_rows = df['colA'].duplicated(keep='last')
df['colC'] = np.where(not_last_rows, 'Not Current', 'Current')

Output:

  colA  colB         colC
0    a     1  Not Current
1    a     2      Current
2    b     1  Not Current
3    b     2      Current
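
If the groups are defined by more than one column, the same idea should carry over through DataFrame.duplicated with subset. The sketch below wraps it in a hypothetical helper, mark_last_per_group, which is not part of pandas, and uses made-up data in which the groups are interleaved to show that "last" follows row order rather than contiguity.

```python
import numpy as np
import pandas as pd

def mark_last_per_group(df, keys, col='colC',
                        last='Current', other='Not Current'):
    """Hypothetical helper: label the last row of each key combination."""
    out = df.copy()
    # True for rows whose key combination appears again further down.
    not_last = out.duplicated(subset=keys, keep='last')
    out[col] = np.where(not_last, other, last)
    return out

# Illustrative data (an assumption, not from the question): groups interleaved.
df = pd.DataFrame([['a', 1], ['b', 1], ['a', 2], ['b', 2], ['c', 5]],
                  columns=['colA', 'colB'])

result = mark_last_per_group(df, ['colA'])
print(result)

# Cross-check against the groupby/tail approach.
expected_labels = df.groupby('colA').tail(1).index
print(list(result.index[result['colC'] == 'Current']) == list(expected_labels))
```

The helper returns a copy rather than modifying the input, which is a design choice for the sketch and not something the answers require.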
