
How to select records greater than the max timestamp stored in a file from the previous load, from a table in Pandas

I have a SQL script that extracts data and stores it in a DataFrame.

Current code

import pandas as pd

# `cursor` is an open DB-API cursor created elsewhere
cursor.execute("""
    SELECT * FROM ofs.ord_add oa
    WHERE oa.is_active = 'Y' AND oa.or_ad_id = 5820 AND oa.flag_value = 'Y'
""")
data3 = cursor.fetchall()
# Column names come from the cursor metadata
columns = [column[0] for column in cursor.description]
order_addition = pd.DataFrame(data3, columns=columns)

**Current output, Run 1**

ID     IS_ACTIVE   OR_AD_ID   FLAG_VALUE   INSERT TIMESTAMP
12300     Y          5820        Y         2020-01-06 08:12:53
14340     Y          5820        Y         2020-01-19 06:11:53

**Current output, Run 2**

ID     IS_ACTIVE   OR_AD_ID   FLAG_VALUE   INSERT TIMESTAMP
12300     Y          5820        Y         2020-01-06 08:12:53
14340     Y          5820        Y         2020-01-19 06:11:53
22368     Y          5820        Y         2020-01-22 08:12:53
34567     Y          5820        Y         2020-01-24 06:11:53

I want to add a condition to my current code such that:

1) The max insert timestamp from the last run is stored in a file.

2) On the next run of the SQL query, only records with a timestamp greater than the max timestamp stored in the file are loaded into the DataFrame.

Expected output after Run 2

  ID     IS_ACTIVE   OR_AD_ID   FLAG_VALUE   INSERT TIMESTAMP
  22368     Y          5820        Y         2020-01-22 08:12:53
  34567     Y          5820        Y         2020-01-24 06:11:53

How can this be done in Python?

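First, create a variable from the very first run, holding the IDs already loaded into the DataFrame:

# unique() is a method, so it needs parentheses; "ID" is the column name shown in the output above
unique_ids = tuple(order_addition["ID"].unique().tolist())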
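A Python variable does not survive between separate runs of a script, so the IDs would need to be written to disk and read back on the next run. A minimal sketch of that step, assuming a JSON file named `seen_ids.json` and two helper functions, `save_ids` and `load_ids`; all three names are hypothetical and not part of the original answer:

```python
import json
from pathlib import Path

# Hypothetical file name; any writable path works.
IDS_FILE = Path("seen_ids.json")


def save_ids(ids, path=IDS_FILE):
    """Write the IDs seen so far to disk as a sorted JSON list."""
    # int() converts NumPy integers into plain ints that json can serialize
    path.write_text(json.dumps(sorted(int(i) for i in ids)))


def load_ids(path=IDS_FILE):
    """Return the stored IDs as a tuple, or an empty tuple on the first run."""
    if not path.exists():
        return ()
    return tuple(json.loads(path.read_text()))
```

On the first run `load_ids()` returns an empty tuple. An empty tuple would render as `NOT IN ()`, which is not valid SQL, so the extra condition should be skipped in that case.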

Then, on the next run, execute the SQL query with the IDs passed in as a parameter:

# The query string and the parameter dict are two separate arguments
cursor.execute(
    """
    SELECT * FROM ofs.ord_add oa
    WHERE oa.is_active = 'Y' AND oa.or_ad_id = 5820 AND oa.flag_value = 'Y'
      AND oa.ID NOT IN %(unique_ids)s
    """,
    {"unique_ids": unique_ids},
)

Do test the placeholder syntax; I may have bungled it in this code. The whole idea is that before you run the query again, you get the variable and then pass it to the query. https://www.psycopg.org/docs/usage.html has a guide on how to pass variables in. I'd love to get your feedback on whether it works or not.
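The question asks for a filter on the timestamp rather than on a list of IDs. A minimal sketch of that variant follows, under several assumptions that are not in the original post: the column is named `insert_timestamp`, the timestamps are stored as ISO-formatted text, the watermark lives in a hypothetical file `last_max_ts.txt`, and a SQLite connection with a table named `ord_add` stands in for `ofs.ord_add` so the example is self-contained. With psycopg the placeholder would be `%s` instead of `?`.

```python
import sqlite3
from pathlib import Path

# Hypothetical file holding the max timestamp of the previous run
WATERMARK_FILE = Path("last_max_ts.txt")


def read_watermark(path=WATERMARK_FILE):
    """Return the stored max timestamp, or None on the first run."""
    return path.read_text().strip() if path.exists() else None


def load_new_rows(conn, path=WATERMARK_FILE):
    """Fetch rows newer than the stored watermark, then advance the watermark."""
    sql = (
        "SELECT id, is_active, or_ad_id, flag_value, insert_timestamp "
        "FROM ord_add "
        "WHERE is_active = 'Y' AND or_ad_id = 5820 AND flag_value = 'Y'"
    )
    params = ()
    last = read_watermark(path)
    if last is not None:
        # Only add the timestamp condition once a previous run has stored one
        sql += " AND insert_timestamp > ?"
        params = (last,)
    cur = conn.execute(sql, params)
    rows = cur.fetchall()
    columns = [c[0] for c in cur.description]
    if rows:
        # ISO-formatted timestamp strings sort chronologically
        path.write_text(max(r[-1] for r in rows))
    return columns, rows
```

The returned `columns` and `rows` can be passed to `pd.DataFrame(rows, columns=columns)` as in the question's code. When a run finds no new rows, the file is left untouched, so the previous watermark stays in effect.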
