How to create a new column based on row value in previous row in Pandas dataframe?
How to increment a column based off the value in the previous row while using groupby in Pandas dataframe?
I have the following dataframe:
claim diagnosis sequence
100 1 1.0
100 2 1.0
100 3 NaN
100 4 NaN
105 1 1.0
105 2 2.0
105 3 2.0
105 4 NaN
111 1 1.0
111 2 2.0
111 3 3.0
111 4 NaN
What I need is to replace every NaN, within each claim, with the previous row's value plus one:
claim diagnosis sequence
100 1 1.0
100 2 1.0
100 3 2.0
100 4 3.0
105 1 1.0
105 2 2.0
105 3 2.0
105 4 3.0
111 1 1.0
111 2 2.0
111 3 3.0
111 4 4.0
I tried cumcount, but I can't seem to get it to use the previous value. I also tried loc, but I don't understand it well enough yet.
import pandas as pd

# Rows that omit the 'sequence' key end up as NaN in the DataFrame
things = [{'claim': 100, 'diagnosis': 1, 'sequence': 1},
          {'claim': 100, 'diagnosis': 2, 'sequence': 1},
          {'claim': 100, 'diagnosis': 3},
          {'claim': 100, 'diagnosis': 4},
          {'claim': 105, 'diagnosis': 1, 'sequence': 1},
          {'claim': 105, 'diagnosis': 2, 'sequence': 2},
          {'claim': 105, 'diagnosis': 3, 'sequence': 2},
          {'claim': 105, 'diagnosis': 4},
          {'claim': 111, 'diagnosis': 1, 'sequence': 1},
          {'claim': 111, 'diagnosis': 2, 'sequence': 2},
          {'claim': 111, 'diagnosis': 3, 'sequence': 3},
          {'claim': 111, 'diagnosis': 4}]
df = pd.DataFrame(things)
df
I've been racking my brain over this for days, so any help would be great.
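Use cumsum to count how many NaN values have appeared so far within each claim (up to and including the current row), then add that count to the forward-filled (ffill) column: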
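# Running count of NaN rows within each claim (0 until the first NaN)
s1 = df['sequence'].isnull().groupby(df['claim']).cumsum()
# Forward-fill the last known value per claim, then add the NaN count
df['sequence'] = s1 + df.groupby('claim')['sequence'].ffill()
df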
Out[145]:
claim diagnosis sequence
0 100 1 1.0
1 100 2 1.0
2 100 3 2.0
3 100 4 3.0
4 105 1 1.0
5 105 2 2.0
6 105 3 2.0
7 105 4 3.0
8 111 1 1.0
9 111 2 2.0
10 111 3 3.0
11 111 4 4.0
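One caveat with the snippet above: the NaN count keeps accumulating for the whole claim, so a known value that comes after a NaN would also be shifted. That does not happen in the sample data, where every NaN sits at the end of its claim. The sketch below is a variation, not the answerer's code, that restarts the count at each known value. The function name `fill_incrementing` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np
import pandas as pd


def fill_incrementing(df, group_col="claim", value_col="sequence"):
    """Replace each NaN with (last known value in the group) + (NaNs since it)."""
    missing = df[value_col].isna()
    # Run id: goes up by one at every known value within a group, so a known
    # value and the NaN rows that follow it share the same id.
    run_id = (~missing).astype(int).groupby(df[group_col]).cumsum()
    # Number of NaN rows seen since the last known value (0 on known rows).
    offset = missing.astype(int).groupby([df[group_col], run_id]).cumsum()
    # Carry the last known value forward within each group.
    filled = df.groupby(group_col)[value_col].ffill()
    return filled + offset


# Same data as in the question, built column-wise.
df = pd.DataFrame({
    "claim": [100] * 4 + [105] * 4 + [111] * 4,
    "diagnosis": [1, 2, 3, 4] * 3,
    "sequence": [1, 1, np.nan, np.nan,
                 1, 2, 2, np.nan,
                 1, 2, 3, np.nan],
})
result = fill_incrementing(df)

# A case where a known value follows a NaN; the known 5 is left unchanged.
mixed = pd.DataFrame({"claim": [1, 1, 1, 1],
                      "sequence": [1, np.nan, 5, np.nan]})
mixed_result = fill_incrementing(mixed)
```

On the question's data this gives the same column as the answer. A NaN at the very start of a claim stays NaN, because there is no earlier value to build on.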