How to pass tuple in read_sql 'where in' clause in pandas python

I am passing a tuple converted to a string in a read_sql method as

sql = "select * from table1 where col1 in " + str(tuple1) + " and col2 in " + str(tuple2)

df = pd.read_sql(sql, conn)

This works fine, but when a tuple has only one value the SQL fails with ORA-00936: missing expression, because the string form of a single-element tuple has a trailing comma.

For example

tuple1 = (4011,)
tuple2 = (23,24)

sql formed is as

select * from table1 where col1 in (4011,) and col2 in (23,24)
                                        ^
ORA-00936: missing expression

Is there a better way of doing this, other than removing the comma with string operations?

Is there a better way to parametrize the read_sql function?

There might be a better way to do it but I would add an if statement around making the query and would use .format() instead of + to parameterise the query.

Possible if statement:

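# Unwrap a single-element tuple so str() no longer emits the trailing comma.
# Checking for exactly one element avoids an IndexError on an empty tuple.
if len(tuple1) == 1:
    tuple1 = tuple1[0]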

This will vary based on what your input is. If you have a list of tuples you can do this:

tuples = [(4011,), (23, 24)]
new_t = []
for t in tuples:
    if len(t) == 1:
        new_t.append(t[0])  # unwrap single-element tuples
    else:
        new_t.append(t)     # keep longer tuples unchanged

Output:

[4011, (23, 24)]

A better way of parameterising queries, using .format():

sql = "select * from table1 where col1 in {} and col2 in {}".format(str(tuple1), str(tuple2))
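If the values are known to be trusted numbers, one way to avoid the trailing comma without stripping characters afterwards is to build the IN list with ", ".join() rather than relying on str() of the tuple. The following is only a minimal sketch: in_list is a hypothetical helper name, not part of pandas, and it deliberately rejects anything that is not a number.

```python
def in_list(values):
    """Render an iterable of numbers as a SQL IN list, e.g. (23, 24).

    Hypothetical helper for illustration. It only accepts int/float values,
    so it is not suitable for strings or untrusted input.
    """
    values = list(values)
    if not values:
        raise ValueError("an IN list needs at least one value")
    for v in values:
        # bool is a subclass of int, so reject it explicitly
        if isinstance(v, bool) or not isinstance(v, (int, float)):
            raise TypeError("only numeric values are supported: %r" % (v,))
    return "(" + ", ".join(str(v) for v in values) + ")"


tuple1 = (4011,)
tuple2 = (23, 24)
sql = "select * from table1 where col1 in {} and col2 in {}".format(
    in_list(tuple1), in_list(tuple2)
)
print(sql)  # select * from table1 where col1 in (4011) and col2 in (23, 24)
```

This still puts the values into the SQL text, so it is only reasonable when the values come from your own code rather than from user input.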

Hope this helps!

The reason you're getting the error is the SQL syntax.

When you have a WHERE col in (...) list, a trailing comma will cause a syntax error.

Either way, putting values into SQL statements using string concatenation is frowned upon, and will ultimately lead you to more problems down the line.

Most Python SQL libraries will allow for parameterised queries. Without knowing which library you're using to connect, I can't link exact documentation, but the principle is the same for psycopg2:

http://initd.org/psycopg/docs/usage.html#passing-parameters-to-sql-queries

This functionality is also exposed in pd.read_sql, so to achieve what you want safely, you would do this:

sql = "select * from table1 where col1 in %s and col2 in %s"

df = pd.read_sql(sql, conn, params=[tuple1, tuple2])

select * from table_name where 1=1 and (column_a, column_b) not in ((28,1),(25,1))
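Returning to the parameterised pd.read_sql approach: passing a whole tuple for a single %s placeholder relies on the driver expanding it (psycopg2 does this), and the ORA-00936 error suggests an Oracle database, whose Python drivers generally expect :name bind variables instead. A driver-independent alternative is to generate one placeholder per value. The sketch below assumes :name style binds; build_in_clause is a hypothetical helper name, and the commented-out pd.read_sql call assumes an open connection conn whose driver accepts a dict of named parameters.

```python
def build_in_clause(prefix, values):
    """Return (placeholders, params) with one named bind per value.

    Hypothetical helper: `prefix` must be a trusted identifier fragment,
    since it becomes part of the SQL text. The values themselves are
    passed separately as bind parameters.
    """
    values = list(values)
    if not values:
        raise ValueError("an IN list needs at least one value")
    names = ["{}{}".format(prefix, i) for i in range(len(values))]
    placeholders = "(" + ", ".join(":" + n for n in names) + ")"
    return placeholders, dict(zip(names, values))


tuple1 = (4011,)
tuple2 = (23, 24)

in1, params1 = build_in_clause("a", tuple1)
in2, params2 = build_in_clause("b", tuple2)

sql = "select * from table1 where col1 in {} and col2 in {}".format(in1, in2)
params = {**params1, **params2}

print(sql)     # select * from table1 where col1 in (:a0) and col2 in (:b0, :b1)
print(params)  # {'a0': 4011, 'b0': 23, 'b1': 24}

# Assumed usage, given an open connection `conn` whose driver accepts
# :name binds:
# df = pd.read_sql(sql, conn, params=params)
```

Only the placeholder names are formatted into the query; the values travel separately, so a single-element tuple needs no special handling.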
