Pandas: for each last row in a group, assign a column a value
How do I assign a value of my choosing to the last row of each group (assuming I have already sorted the DataFrame)?
import pandas as pd

# data
df = pd.DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                  columns=['colA', 'colB'])
# create a new col
df['colC'] = 'Not Current'
# my attempt -- groupby col of interest, get last row, apply value to 'colC' column
df.loc[df.reset_index().groupby('colA').tail(1), 'colC'] = 'Current'
You can fix it by calling .index, so that loc receives the row labels of the last rows rather than a DataFrame:
df.loc[df.groupby('colA').tail(1).index, 'colC'] = 'Current'
df
Out[105]:
  colA  colB         colC
0    a     1  Not Current
1    a     2      Current
2    b     1  Not Current
3    b     2      Current
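To see why the .index version works, it may help to split the one-liner into steps. The sketch below reuses the question's data; the intermediate names last_rows and last_labels are introduced here for illustration only and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                  columns=['colA', 'colB'])
df['colC'] = 'Not Current'

# tail(1) returns a DataFrame holding the last row of each group,
# keeping the original row labels of df.
last_rows = df.groupby('colA').tail(1)

# .loc selects rows by label, so pass it the labels, not the DataFrame.
last_labels = last_rows.index

df.loc[last_labels, 'colC'] = 'Current'
print(df)
```

The original attempt passed the DataFrame returned by tail(1) straight to loc, which is not a valid row indexer; taking its index gives loc the labels it expects.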
Use loc with duplicated:
df['colC'] = 'Not Current'
not_last_rows = df['colA'].duplicated(keep='last')
df.loc[~not_last_rows, 'colC'] = 'Current'
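As a rough illustration of what the mask contains, the sketch below prints the result of duplicated(keep='last') on the question's data. With keep='last', every occurrence of a value except the final one is flagged True, so negating the mask with ~ selects the last row of each group.

```python
import pandas as pd

df = pd.DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                  columns=['colA', 'colB'])

# True for every row that is NOT the last occurrence of its colA value.
not_last_rows = df['colA'].duplicated(keep='last')
print(not_last_rows.tolist())     # [True, False, True, False]

# Negating the mask leaves True only on the last row of each group.
print((~not_last_rows).tolist())  # [False, True, False, True]
```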
Or, in your case, np.where:
import numpy as np

not_last_rows = df['colA'].duplicated(keep='last')
df['colC'] = np.where(not_last_rows, 'Not Current', 'Current')
Output:
  colA  colB         colC
0    a     1  Not Current
1    a     2      Current
2    b     1  Not Current
3    b     2      Current
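As a quick sanity check, the groupby/tail approach and the duplicated/np.where approach can be compared on a slightly larger frame. This is only a sketch: the sample data (groups of different sizes, already sorted by colA) and the names via_tail and via_where are assumptions made for illustration, not part of the original answers.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: groups of different sizes, already sorted by colA.
df = pd.DataFrame({'colA': ['a', 'a', 'a', 'b', 'c', 'c'],
                   'colB': [1, 2, 3, 1, 1, 2]})

# Approach 1: row labels of the last row per group via groupby().tail(1).index
via_tail = pd.Series('Not Current', index=df.index)
via_tail.loc[df.groupby('colA').tail(1).index] = 'Current'

# Approach 2: duplicated(keep='last') combined with np.where
not_last_rows = df['colA'].duplicated(keep='last')
via_where = pd.Series(np.where(not_last_rows, 'Not Current', 'Current'),
                      index=df.index)

# Both approaches should label exactly the same rows.
print(via_tail.tolist() == via_where.tolist())
```

On data like this the two give the same labels; which one to use is mostly a matter of taste, although the duplicated version avoids the groupby step.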