Python - Filtering SQL query based on dates

I am trying to build a SQL query that filters on the system date (all sales made in the last 7 days):

import datetime
import pandas as pd
import psycopg2

con = psycopg2.connect(db_details)
cur = con.cursor()

df = pd.read_sql("""select store_name,count(*) from sales 
     where created_at between datetime.datetime.now() - (datetime.today() - timedelta(7))""",con=con)

I get an error:

psycopg2.NotSupportedError: cross-database references are not implemented: datetime.datetime.now

You are mixing Python syntax into your SQL query. SQL is parsed and executed by the database, not by Python, and the database knows nothing about datetime.datetime.now(), datetime.today() or timedelta()! The specific error you see comes from your Python code being interpreted as SQL. Read as SQL, datetime.datetime.now is a three-part qualified name, which PostgreSQL takes to be a reference into another database (one named datetime). That is a cross-database reference, which PostgreSQL does not implement, and psycopg2 reports the failure as NotSupportedError.

Instead, use SQL parameters to pass values from Python to the database. Use placeholders in the SQL to show the database driver where the values should go:

params = {
    # all rows at or after this timestamp, 7 days ago relative to 'now'
    'earliest': datetime.datetime.now() - datetime.timedelta(days=7),
    # if you must have a date *only* (no time component), use
    # 'earliest': datetime.date.today() - datetime.timedelta(days=7),
}
# the placeholder name must match the key in params
df = pd.read_sql("""
     select store_name, count(*) from sales
     where created_at >= %(earliest)s
     group by store_name""", params=params, con=con)

This uses placeholders as defined by the psycopg2 parameters documentation, where %(earliest)s refers to the earliest key in the params dictionary. datetime.datetime() instances are directly supported by the driver.
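The named-placeholder pattern can be tried out without a PostgreSQL server. The following is only a sketch under stated assumptions: the standard-library sqlite3 module stands in for psycopg2, so the placeholder is written :earliest rather than in psycopg2's %(name)s form, the table contents are made up, and a fixed "now" is used so the result is repeatable.

```python
import datetime
import sqlite3

# sqlite3 needs to be told how to store datetime objects;
# psycopg2 adapts them on its own.
sqlite3.register_adapter(datetime.datetime, lambda d: d.isoformat(sep=" "))

con = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL connection
con.execute("create table sales (store_name text, created_at text)")

now = datetime.datetime(2024, 1, 15, 12, 0, 0)  # fixed "now" (assumption)
rows = [
    ("north", now - datetime.timedelta(days=1)),
    ("north", now - datetime.timedelta(days=3)),
    ("south", now - datetime.timedelta(days=6)),
    ("south", now - datetime.timedelta(days=30)),  # older than 7 days
]
con.executemany("insert into sales values (?, ?)", rows)

# The value travels separately from the SQL text; the driver binds it.
params = {"earliest": now - datetime.timedelta(days=7)}
result = con.execute(
    """
    select store_name, count(*) from sales
    where created_at >= :earliest
    group by store_name
    order by store_name
    """,
    params,
).fetchall()

print(result)  # [('north', 2), ('south', 1)]
```

The SQL string never contains the date itself, so there is nothing to quote or format by hand.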

Note that I also fixed your "7 days ago" expression, and replaced your BETWEEN syntax with >=. Without a second date you are not querying for values between two dates, so use >= to limit the column to dates at or after the given date.
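If a closed range really is wanted, BETWEEN needs both bounds. A minimal sketch, assuming a hypothetical helper last_days_window and the parameter names earliest and latest (neither comes from the question):

```python
import datetime


def last_days_window(days, now=None):
    """Return both bounds of a look-back window that ends at `now`."""
    if now is None:
        now = datetime.datetime.now()
    return {"earliest": now - datetime.timedelta(days=days), "latest": now}


# BETWEEN takes two values and includes both of them.
BETWEEN_QUERY = """
    select store_name, count(*) from sales
    where created_at between %(earliest)s and %(latest)s
    group by store_name"""

# A fixed "now" keeps the example repeatable.
params = last_days_window(7, now=datetime.datetime(2024, 1, 15, 12, 0))
print(params["earliest"])  # 2024-01-08 12:00:00
print(params["latest"])    # 2024-01-15 12:00:00
```

With the connection from the question, this would be run as pd.read_sql(BETWEEN_QUERY, params=params, con=con).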

datetime.datetime.now() is not valid SQL syntax, and thus cannot be executed by read_sql(). I suggest either using the correct SQL syntax that computes the current time, or creating variables for datetime.datetime.now() and datetime.today() - timedelta(7) and substituting them into your string.

Edit: Do not follow the second suggestion. See the comments below by Martijn Pieters.
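The comments referred to are not reproduced here. One commonly given reason for avoiding string substitution is that the value becomes part of the SQL text, so it has to be quoted and formatted by hand, and hand-quoting is where injection bugs come from. A small sketch of the problem, assuming sqlite3 as a stand-in for PostgreSQL and a made-up one-row table:

```python
import datetime
import sqlite3

# Tell sqlite3 how to bind datetime objects (psycopg2 does this itself).
sqlite3.register_adapter(datetime.datetime, lambda d: d.isoformat(sep=" "))

con = sqlite3.connect(":memory:")
con.execute("create table sales (store_name text, created_at text)")
con.execute(
    "insert into sales values (?, ?)",
    ("north", datetime.datetime(2024, 1, 14, 9, 0)),
)

earliest = datetime.datetime(2024, 1, 8, 12, 0)

# String substitution: the SQL text ends in
#   ... >= 2024-01-08 12:00:00
# which the database parses as arithmetic followed by stray tokens.
try:
    con.execute("select count(*) from sales where created_at >= %s" % earliest)
    substituted_ok = True
except sqlite3.OperationalError:
    substituted_ok = False

# Parameter binding: the value is passed as data, never parsed as SQL.
bound_count = con.execute(
    "select count(*) from sales where created_at >= ?", (earliest,)
).fetchone()[0]

print(substituted_ok, bound_count)
```

The substituted query fails to parse, while the bound query finds the one matching row.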

Maybe you should remove the Python code from inside your SQL, compute your dates in Python, and then use the strftime function to convert them to strings.

Then you'll be able to use them in your SQL query.

Actually, you do not necessarily need any params or computations in Python. Just use the corresponding SQL statement, which should look like this:

select store_name,count(*)
from sales 
where created_at >= now()::date - 7
group by store_name

Edit: I also added a group by, which I think was missing.
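Note that now()::date - 7 is a date, not a timestamp. Assuming created_at is a timestamp column, the comparison then starts at midnight of the day 7 days ago, rather than exactly 7 * 24 hours before the current moment. A rough Python illustration of the two cutoffs, using a fixed "now" chosen only for this example:

```python
import datetime

now = datetime.datetime(2024, 1, 15, 12, 30)  # fixed "now" (assumption)

# Rolling window: exactly 7 * 24 hours before `now`.
rolling_cutoff = now - datetime.timedelta(days=7)

# Date-based window: midnight at the start of the day 7 days ago,
# which is what a date turns into when compared with a timestamp.
date_cutoff = datetime.datetime.combine(
    now.date() - datetime.timedelta(days=7), datetime.time.min
)

print(rolling_cutoff)  # 2024-01-08 12:30:00
print(date_cutoff)     # 2024-01-08 00:00:00
```

Which of the two is meant by "the last 7 days" is a choice for the query author; the date-based form includes the whole of the earliest day.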
