
Passing a Python list of dates to an SQL query WHERE clause

I have a date column in a data frame, which I convert to a list of dates:

begdtlist = df["BEG_DT"].tolist()

print(begdtlist) returns the following:

[Timestamp('2018-04-29 00:00:00'), Timestamp('2018-04-22 00:00:00'),
Timestamp('2018-04-22 00:00:00'), Timestamp('2018-04-29 00:00:00'),
Timestamp('2018-04-29 00:00:00')]

The dates are being cast to Timestamp.

I pass this list to an SQL query as shown below:

sql = ("select  calndr_dt,wk_of_mnth from DatabaseName where calndr_dt = cast {} as date").format(repr(begdtlist).replace('[','(').replace(']',')'))

However, my SQL is generated in the format below, which causes the query to fail:

"select  calndr_dt,wk_of_mnth from DatabaseName where calndr_dt = cast 
(Timestamp('2018-04-29 00:00:00'), Timestamp('2018-04-22 00:00:00'), 
Timestamp('2018-04-22 00:00:00'), Timestamp('2018-04-29 00:00:00'), 
Timestamp('2018-04-29 00:00:00')) as date"

I am not sure why it is coming as a Timestamp. I just need the date portion in quotes. Any guidance in accomplishing this would be appreciated.

You are using repr() to transform an object into a string. This function is not intended for your use case; it returns the internal representation of the underlying object, which is meant for debugging rather than display.
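The effect is easy to reproduce. Converting a list to a string always uses the repr() of each element, even when you call str() or print() on the list. The sketch below uses the standard library's datetime.datetime as a stand-in for a pandas Timestamp; that substitution is an assumption made only to keep the example free of dependencies.

```python
import datetime

# Stand-in for a pandas Timestamp; chosen only to avoid a pandas dependency.
dt = datetime.datetime(2018, 4, 29)

# repr() shows how the object could be constructed, not a display format.
element_repr = repr(dt)

# str() of a single object gives a human-readable form ...
element_str = str(dt)

# ... but str() of a list still uses repr() for each element.
list_str = str([dt])

print(element_repr)
print(element_str)
print(list_str)
```

Neither repr() nor str() of the list gives a string that is usable inside an SQL statement, which is why each element has to be formatted individually.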

What you really want to do is format each timestamp as a string that suits your needs. Just like regular Python datetime objects, pandas Timestamp objects have a method called strftime() that is used for string formatting.

import pandas

# This part is just to create an MWE; it mimics your dataframe, which we
# do not have at hand here.
df = pandas.DataFrame({
    'BEG_DT': [
        pandas.Timestamp('2018-04-29'),
        pandas.Timestamp('2019-04-22')]
    })

# This is what you did
begdtlist = df['BEG_DT'].tolist()
print(begdtlist)

# This is how you can format the date according to your needs
for dt in begdtlist:
    print(dt.strftime('%Y-%m-%d'))

This generates the following output:

[Timestamp('2018-04-29 00:00:00'), Timestamp('2019-04-22 00:00:00')]
2018-04-29
2019-04-22

You can see in the second and third lines of the output that the formatting has created the date strings you need for generating your SQL query. You can read up on the formatting options in the official Python docs.
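If you want all dates as quoted strings in one step, the loop can be folded into a small helper. This is only a sketch: quoted_date_list is a hypothetical name, and datetime.datetime objects stand in for the Timestamps, which support the same strftime() call. Duplicates are dropped here because they add nothing to a list of values to match against.

```python
import datetime

def quoted_date_list(dates):
    """Return unique dates as a comma-separated string of quoted 'YYYY-MM-DD' values."""
    formatted = [dt.strftime('%Y-%m-%d') for dt in dates]
    # dict.fromkeys() removes duplicates while preserving first-seen order.
    unique = list(dict.fromkeys(formatted))
    return ', '.join("'{}'".format(d) for d in unique)

# Hypothetical sample data mirroring the dates from the question.
begdtlist = [
    datetime.datetime(2018, 4, 29),
    datetime.datetime(2018, 4, 22),
    datetime.datetime(2018, 4, 22),
    datetime.datetime(2018, 4, 29),
]
result = quoted_date_list(begdtlist)
print(result)
```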

By the way, it is totally fine that pandas transforms your dates into its own Timestamp objects: DataFrame interaction, aggregation, and so on require interfaces that regular Python date or datetime objects do not provide.

Still, you cannot plug the list into your SQL query as it stands, because that generates invalid syntax:

SELECT calndr_dt, wk_of_mnth
FROM databasename
WHERE calndr_dt = CAST 2018-04-29, 2018-04-22 AS DATE
                                 ^
                   this will be your syntax error

Your approach is flawed in two ways:

  1. You are trying to plug a whole list into a single CAST call, which will fail
  2. You are testing a column for equality against a list of values, which will fail

Thus, you may also want to read up on SQL syntax and write a working example query by hand before trying to plug in values via Python.

It may look like this (untested, use with caution):

sql = '''
    SELECT calndr_dt, wk_of_mnth
    FROM databasename
    WHERE calndr_dt IN ({:s})
'''.format(', '.join(
    [
        "CAST('{:s}' AS DATE)".format(dt.strftime('%Y-%m-%d'))
        for dt in begdtlist
    ]
))

Which results in:

SELECT calndr_dt, wk_of_mnth
FROM databasename
WHERE calndr_dt IN (CAST('2018-04-29' AS DATE), CAST('2019-04-22' AS DATE))
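As an alternative to formatting the date literals into the SQL text yourself, most Python database drivers accept placeholders and take the values separately, which avoids quoting problems altogether. The sketch below illustrates the idea with the standard library's sqlite3 module and an in-memory database. The table name calendar, the sample rows, and the storage of dates as text are assumptions made for illustration; they are not taken from your setup.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real one (assumption).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE calendar (calndr_dt TEXT, wk_of_mnth INTEGER)')
conn.executemany(
    'INSERT INTO calendar VALUES (?, ?)',
    [('2018-04-22', 4), ('2018-04-29', 5), ('2018-05-06', 1)],  # made-up rows
)

# Date strings, e.g. produced with dt.strftime('%Y-%m-%d').
dates = ['2018-04-29', '2018-04-22']

# One "?" placeholder per value; the values never enter the SQL text itself.
placeholders = ', '.join('?' for _ in dates)
sql = (
    'SELECT calndr_dt, wk_of_mnth FROM calendar '
    'WHERE calndr_dt IN ({}) ORDER BY calndr_dt'.format(placeholders)
)
rows = conn.execute(sql, dates).fetchall()
print(rows)
conn.close()
```

The placeholder style depends on the driver (sqlite3 uses ?, others use %s or named parameters), so check the documentation of the driver for your database before adapting this.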
