Pandas Join Two Dataframes According to Range and Date
I have two dataframes like this:
DATE MAX_AMOUNT MIN_AMOUNT MAX_DAY MIN_DAY RATE
01/09/2022 20 15 10 5 0.01
01/09/2022 25 20 15 10 0.02
03/09/2022 30 10 5 3 0.03
03/09/2022 40 30 20 5 0.04
04/09/2022 10 5 10 1 0.05
ID DATE AMOUNT DAY
1 01/09/2022 18 7
2 01/09/2022 22 11
3 01/09/2022 30 20
4 03/09/2022 35 10
5 04/09/2022 35 10
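I want to bring the RATE value into the second df according to DATE. In addition, the AMOUNT and DAY values for that DATE must fall within the corresponding ranges (MAX_AMOUNT & MIN_AMOUNT, MAX_DAY & MIN_DAY).
The desired output looks like this: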
ID DATE AMOUNT DAY RATE
1 01/09/2022 18 7 0.01
2 01/09/2022 22 11 0.02
3 01/09/2022 30 20
4 03/09/2022 35 10 0.04
5 04/09/2022 35 10
Could you help me with this?
# Merge df1 and df2 using your custom condition
match = df1.merge(df2, on="DATE").query("MIN_AMOUNT <= AMOUNT <= MAX_AMOUNT and MIN_DAY <= DAY <= MAX_DAY")
# Now join any matching rate back to df2
result = df2.merge(match[["ID", "RATE"]], on="ID", how="left")
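For anyone who wants to try this locally, here is a minimal self-contained sketch of the same merge-then-query idea. The data is copied from the question, and the names df1 (rate table) and df2 (records to label) are assumed to match the snippet above.

```python
import pandas as pd

# Sample data copied from the question; df1 holds the rate bands,
# df2 holds the records that need a RATE.
df1 = pd.DataFrame({
    "DATE": ["01/09/2022", "01/09/2022", "03/09/2022", "03/09/2022", "04/09/2022"],
    "MAX_AMOUNT": [20, 25, 30, 40, 10],
    "MIN_AMOUNT": [15, 20, 10, 30, 5],
    "MAX_DAY": [10, 15, 5, 20, 10],
    "MIN_DAY": [5, 10, 3, 5, 1],
    "RATE": [0.01, 0.02, 0.03, 0.04, 0.05],
})
df2 = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "DATE": ["01/09/2022", "01/09/2022", "01/09/2022", "03/09/2022", "04/09/2022"],
    "AMOUNT": [18, 22, 30, 35, 35],
    "DAY": [7, 11, 20, 10, 10],
})

# Pair every record with every rate band on the same DATE,
# then keep only the pairs where both ranges contain the record.
match = df1.merge(df2, on="DATE").query(
    "MIN_AMOUNT <= AMOUNT <= MAX_AMOUNT and MIN_DAY <= DAY <= MAX_DAY"
)

# Left join so that records without a matching band keep a NaN rate.
result = df2.merge(match[["ID", "RATE"]], on="ID", how="left")
print(result)
```

Note that if several bands match the same ID, the left join returns one row per match, so the result can end up with more rows than df2.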
First use merge and filter the rows with Series.between, then use Series.map to fill the RATE column with the first matching rate per ID (DataFrame.drop_duplicates keeps only the first match):
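# Pair each record with every rate band on the same DATE
df = df2.merge(df1, on='DATE')
# Keep only the pairs where AMOUNT and DAY fall inside the band (inclusive)
df = df[df['AMOUNT'].between(df['MIN_AMOUNT'], df['MAX_AMOUNT']) &
        df['DAY'].between(df['MIN_DAY'], df['MAX_DAY'])]
# Map the first matching RATE per ID back onto df2; unmatched IDs become NaN
df2['RATE'] = df2['ID'].map(df.drop_duplicates('ID').set_index('ID')['RATE'])
print(df2)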
ID DATE AMOUNT DAY RATE
0 1 01/09/2022 18 7 0.01
1 2 01/09/2022 22 11 0.02
2 3 01/09/2022 30 20 NaN
3 4 03/09/2022 35 10 0.04
4 5 04/09/2022 35 10 NaN
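Why drop_duplicates matters is easier to see with a record that sits on a shared boundary. The sketch below uses hypothetical data (ID 6 is not in the question): because Series.between is inclusive on both ends by default, AMOUNT=20 and DAY=10 fall inside both bands for 01/09/2022.

```python
import pandas as pd

# The two rate bands for 01/09/2022 from the question; they share
# the boundary values AMOUNT=20 and DAY=10.
df1 = pd.DataFrame({
    "DATE": ["01/09/2022", "01/09/2022"],
    "MAX_AMOUNT": [20, 25],
    "MIN_AMOUNT": [15, 20],
    "MAX_DAY": [10, 15],
    "MIN_DAY": [5, 10],
    "RATE": [0.01, 0.02],
})
# Hypothetical record that lies exactly on that boundary.
df2 = pd.DataFrame({"ID": [6], "DATE": ["01/09/2022"], "AMOUNT": [20], "DAY": [10]})

df = df2.merge(df1, on="DATE")
df = df[df["AMOUNT"].between(df["MIN_AMOUNT"], df["MAX_AMOUNT"])
        & df["DAY"].between(df["MIN_DAY"], df["MAX_DAY"])]
print(len(df))  # number of bands that match ID 6

# Keep the first match per ID so the lookup Series has a unique index.
first = df.drop_duplicates("ID").set_index("ID")["RATE"]
df2["RATE"] = df2["ID"].map(first)
print(df2)
```

Here "first" means first in df1's row order for that date. If a different tie-break is wanted (for example the highest rate), sorting df before drop_duplicates would be one way to get it.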