Python - Filtering SQL query based on dates

I am trying to build a SQL query that will filter based on system date (Query for all sales done in the last 7 days):

import datetime
import pandas as pd
import psycopg2

con = psycopg2.connect(db_details)
cur = con.cursor()

df = pd.read_sql("""select store_name,count(*) from sales 
     where created_at between datetime.datetime.now() - (datetime.today() - timedelta(7))""",con=con)

I get an error

psycopg2.NotSupportedError: cross-database references are not implemented: datetime.datetime.now

You are mixing Python syntax into your SQL query. SQL is parsed and executed by the database, not by Python, and the database knows nothing about datetime.datetime.now(), datetime.today() or timedelta()! The specific error you see is caused by your Python code being interpreted as SQL instead. As SQL, the three-part name datetime.datetime.now is read as a reference qualified by a database called datetime (the now object in its datetime schema), which is a cross-database reference. PostgreSQL does not implement those, and psycopg2 surfaces the server's error as NotSupportedError.

Instead, use SQL parameters to pass in values from Python to the database. Use placeholders in the SQL to show the database driver where the values should go:

params = {
    # all rows after this timestamp, 7 days ago relative to 'now'
    'earliest': datetime.datetime.now() - datetime.timedelta(days=7),
    # if you must have a date *only* (no time component), use
    # 'earliest': datetime.date.today() - datetime.timedelta(days=7),
}
df = pd.read_sql("""
     select store_name, count(*) from sales
     where created_at >= %(earliest)s
     group by store_name""", params=params, con=con)

This uses placeholders as defined by the psycopg2 parameters documentation, where %(earliest)s refers to the earliest key in the params dictionary. datetime.datetime() instances are directly supported by the driver. The query also needs group by store_name, because store_name is selected alongside the count(*) aggregate.

Note that I also fixed your 7 days ago expression, and replaced your BETWEEN syntax with >=; without a second date you are not querying for values between two dates, so use >= to limit the column to dates at or after the given date.
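As a minimal sketch of the parameter approach above, the cutoff calculation can be kept in a small helper so it can be checked without a database. The names RECENT_SALES_SQL and recent_sales_params are hypothetical, and the injectable now argument is an assumption added only to make the result reproducible:

```python
import datetime

# Query text with a psycopg2-style named placeholder; the driver fills it in.
RECENT_SALES_SQL = """
    select store_name, count(*)
    from sales
    where created_at >= %(earliest)s
    group by store_name
"""


def recent_sales_params(days=7, now=None):
    """Return the parameter dict for RECENT_SALES_SQL.

    `now` defaults to the current local time; passing it explicitly
    (a hypothetical convenience) makes the cutoff easy to verify.
    """
    if now is None:
        now = datetime.datetime.now()
    return {"earliest": now - datetime.timedelta(days=days)}


params = recent_sales_params(now=datetime.datetime(2024, 1, 10, 12, 0))
print(params["earliest"])  # 2024-01-03 12:00:00

# With a live psycopg2 connection `con`, as in the question:
# df = pd.read_sql(RECENT_SALES_SQL, con=con, params=params)
```

The SQL string never contains the date itself; the value travels separately in params, so the database only ever sees valid SQL plus a bound timestamp.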

datetime.datetime.now() is not valid SQL syntax, so the database cannot execute it when read_sql() sends the query. I suggest either using the correct SQL syntax that computes the current time, or creating Python variables for datetime.datetime.now() and datetime.datetime.today() - datetime.timedelta(7) and substituting them into your string.

edit: Do not follow the second suggestion. See comments below by Martijn Pieters.

Maybe you should remove that Python code from inside your SQL, compute your dates in Python, and then use the strftime function to convert them to strings.

Then you'll be able to use them in your SQL query.
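A rough sketch of that idea, assuming an ISO-style %Y-%m-%d format and a hypothetical helper name cutoff_string. Given the warning earlier in the thread about substituting values into the query string, the formatted date is handed over here as a bound parameter rather than pasted into the SQL text, which is my assumption about how to use it safely:

```python
import datetime


def cutoff_string(today, days=7):
    """Format the date `days` days before `today` as YYYY-MM-DD."""
    return (today - datetime.timedelta(days=days)).strftime("%Y-%m-%d")


cutoff = cutoff_string(datetime.date(2024, 1, 10))
print(cutoff)  # 2024-01-03

# The string still goes to the driver as a parameter, not via string formatting:
params = {"earliest": cutoff}
# df = pd.read_sql("... where created_at >= %(earliest)s ...", params=params, con=con)
```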

Actually, you do not necessarily need any params or computations in Python. Just use the corresponding SQL statement, which should look like this:

select store_name,count(*)
from sales 
where created_at >= now()::date - 7
group by store_name

Edit: I also added a group by which I think is missing.
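One thing to keep in mind with the SQL-only version: now()::date - 7 is a date, so when it is compared with a timestamp column the cutoff falls at midnight, while a Python-side now() - timedelta(days=7) keeps the time of day. The sketch below only illustrates that difference in plain Python; it assumes created_at is a timestamp without time zone, and both function names are hypothetical:

```python
import datetime


def rolling_cutoff(now, days=7):
    """Cutoff when Python passes now - 7 days: keeps the time of day."""
    return now - datetime.timedelta(days=days)


def date_cutoff(now, days=7):
    """Cutoff equivalent to now()::date - 7: midnight at the start of that day."""
    day = now.date() - datetime.timedelta(days=days)
    return datetime.datetime.combine(day, datetime.time.min)


now = datetime.datetime(2024, 1, 10, 15, 30)
print(rolling_cutoff(now))  # 2024-01-03 15:30:00
print(date_cutoff(now))     # 2024-01-03 00:00:00
```

So the SQL-only query also counts sales made earlier on the seventh day back, and it uses the database server's clock instead of the client's.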
