Make Source and Target column based on consecutive rows

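I have the following problem.

Person 1001 completes activity A and then activity C (after activity A). For each customer, I need to move the activity from the next consecutive row into a Target column.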

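import numpy as np
import pandas as pd
# Input: one row per completed activity, in order, for each customer
df = pd.DataFrame([[1001, 'A'], [1001, 'C'], [1004, 'D'], [1005, 'C'],
                   [1005, 'D'], [1010, 'A'], [1010, 'D'], [1010, 'F']], columns=['CustomerNr', 'Activity'])
# Desired output: each activity (Source) paired with the activity that follows it (Target)
expected = pd.DataFrame([[1001, 'A', 'C'], [1004, 'D', np.nan], [1005, 'C', 'D'],
                         [1010, 'A', 'D'], [1010, 'D', 'F']], columns=['CustomerNr', 'Source', 'Target'])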
CustomerNr  Source  Target
1001        A       C
1004        D       NaN
1005        C       D
1010        A       D
1010        D       F

You can use:

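# Pull the next row's activity and customer number up onto the current row.
# Note: despite its name, prev_CustomerNr holds the NEXT row's CustomerNr, because of shift(-1).
df['Target']=df['Activity'].shift(-1)
df['prev_CustomerNr']=df['CustomerNr'].shift(-1)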
print(df)
'''
   CustomerNr Activity Target  prev_CustomerNr
0        1001        A      C           1001.0
1        1001        C      D           1004.0
2        1004        D      C           1005.0
3        1005        C      D           1005.0
4        1005        D      A           1010.0
5        1010        A      D           1010.0
6        1010        D      F           1010.0
7        1010        F   None              NaN
'''
# The most recent activity of a customer has no target, so drop the last row for each CustomerNr (customers with a single row are kept by m2).

m1 = df.duplicated(['CustomerNr'], keep="last") #https://stackoverflow.com/a/70216388/15415267
m2 = ~df.duplicated(['CustomerNr'], keep=False)
df = df[m1|m2].copy()  # copy so the assignments below do not raise SettingWithCopyWarning

# Where CustomerNr and prev_CustomerNr differ, the shifted Target belongs to another customer, so replace it with NaN.
df['Target']=np.where(df['CustomerNr']==df['prev_CustomerNr'],df['Target'],np.nan)
df=df.drop(['prev_CustomerNr'],axis=1)

print(df)
'''
   CustomerNr Activity Target
0        1001        A      C
2        1004        D    NaN
3        1005        C      D
5        1010        A      D
6        1010        D      F
'''
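
For comparison, here is a shorter sketch of the same idea. It is not the method used in the answer above: it swaps the helper column and the duplicated() masks for groupby().shift(-1), which never shifts across customers. It assumes the rows are already in activity order within each CustomerNr, and the names out and size are only illustrative.

```python
import pandas as pd

# Same input as in the question: one row per completed activity.
df = pd.DataFrame([[1001, 'A'], [1001, 'C'], [1004, 'D'], [1005, 'C'],
                   [1005, 'D'], [1010, 'A'], [1010, 'D'], [1010, 'F']],
                  columns=['CustomerNr', 'Activity'])

# Treat each activity as the Source of a transition.
out = df.rename(columns={'Activity': 'Source'})

# Shift within each customer, so a Target never comes from another customer.
out['Target'] = out.groupby('CustomerNr')['Source'].shift(-1)

# Number of rows per customer, broadcast back onto every row.
size = out.groupby('CustomerNr')['Source'].transform('size')

# Keep rows that have a Target, plus customers with only one activity
# (those stay in the result with a NaN Target, as in the desired output).
out = out[out['Target'].notna() | (size == 1)].reset_index(drop=True)

print(out)
```

Because the shift happens per group, no comparison against the neighbouring row's CustomerNr is needed afterwards.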
