简体   繁体   中英

Update values in a DataFrame by values in another by date

I tried to make a code to inputting version of items. There are two dataframes, one with the items, one with the version info.

Here's a dummy data and a code I made:

import pandas as pd
from datetime import datetime

Item = pd.DataFrame({"ID":["A1","A1","A2","A2","A3","B1"],"DATE":["2021-07-05","2021-08-01","2021-02-02","2021-02-03","2021-01-01","2021-10-12"]})
Ver = pd.DataFrame({"ver_date" : ["2021-01-01","2021-07-07","2021-09-09"],"version":["1.1","1.2","1.3"]})

for id,dat in zip(Item["ID"],Item["DATE"]):
    dat2 = datetime.strptime(dat,'%Y-%m-%d')
    for dtc,ver in zip(Ver["ver_date"],Ver["version"]):
        dtc = datetime.strptime(dtc,'%Y-%m-%d')
        if dat2 >= dtc:
            Item.loc[(Item.ID == id) & (Item.DATE == dat),"VER"] = ver
            continue
        else:
            pass

So what I tried is this code. Actually this code works, but I think it is not efficient since it gives me a result in about 1~2 min. (for 200 items).

You could use merge_asof . It merges by key distance. By default, it uses a "backward" search, which selects the last row in the Ver whose "ver_date" is less than or equal to the "Date" of Item (which is what your if-else condition is checking for).

Also it expects the keys to be merged on to be sorted; so you could first convert the date columns to datetime objects and sort the DataFrames by them:

Item['DATE'] = pd.to_datetime(Item['DATE'])
Ver['ver_date'] = pd.to_datetime(Ver['ver_date'])
out = (pd.merge_asof(Item.sort_values(by='DATE'), 
                     Ver.sort_values(by='ver_date'), 
                     left_on='DATE', right_on='ver_date')
       .drop(columns='ver_date')
       .sort_values(by='ID')
       .rename(columns={'version':'VER'}))

Output:

   ID       DATE  VER
3  A1 2021-07-05  1.1
4  A1 2021-08-01  1.2
1  A2 2021-02-02  1.1
2  A2 2021-02-03  1.1
0  A3 2021-01-01  1.1
5  B1 2021-10-12  1.3

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM