
Python - Generating SQL WHERE/IN clause from string List

I am given a Python list of arbitrary length containing arbitrary strings. In particular, it can have strings with embedded single and/or double quotes. I have no control over the input, so I have to take what I am given.

For example:

    valueList = [ "hello'world", 'foo"bar', 'my\'name"is', "see\'you\"soon" ]

    Python shell:
        >>> valueList = [ "hello'world", 'foo"bar', 'my\'name"is', "see\'you\"soon" ]
        >>>
        >>> valueList
        ["hello'world", 'foo"bar', 'my\'name"is', 'see\'you"soon']
        >>>
        >>> valueList[0]
        "hello'world"
        >>>
        >>> valueList[1]
        'foo"bar'
        >>>
        >>> valueList[2]
        'my\'name"is'
        >>>
        >>> valueList[3]
        'see\'you"soon'

From this, I need to generate an SQL string such as:

    "SELECT * FROM myTable as mt
        WHERE mt."colName" IN ("hello'world", 'foo"bar', 'my\'name"is', 'see\'you"soon')

Any solution has to work with both SQLite and Postgres.

I have tried to generate the (...) portion of the clause using Python's join, but that just ends up making one big string with all single quotes escaped. For example:

    Python shell:
        >>> values = "','".join(valueList)
        >>> values
        'hello\'world\',\'foo"bar\',\'my\'name"is\',\'see\'you"soon'

        >>> values = "'" + "','".join(valueList) + "'"
        >>> values
        '\'hello\'world\',\'foo"bar\',\'my\'name"is\',\'see\'you"soon\''
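
A minimal sketch of a check on the output above, assuming the same list: the backslashes appear to come from how the interactive shell echoes a string (its repr), not from the string itself. Printing the value shows the actual characters. The variable names here are illustrative only.

```python
value_list = ["hello'world", 'foo"bar', 'my\'name"is', "see\'you\"soon"]

joined = "'" + "','".join(value_list) + "'"

# repr() is what the shell echoes; it adds backslashes so the text
# can be pasted back in as a valid Python literal.
echoed_by_shell = repr(joined)

# print() shows the real characters of the string.
print(joined)

# The string itself contains no backslash characters.
contains_backslash = "\\" in joined
print(contains_backslash)
```

Note that the joined string is still not a valid SQL list, because the single quotes inside each value are not escaped for SQL.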

Additional info: The code that I inherited uses SQLAlchemy and Pandas.

        import pandas as pd
        ...cut...cut...cut...
        my_df = pd.read_sql(sql, my_conn)

I do NOT want to use Pandas to do the filtering. In fact, my assigned task is to REMOVE the existing Pandas filtering and replace it with SQL that uses explicit WHERE/IN filters, for speed.

For example, replace this:

    my_df = pd.read_sql("SELECT * FROM myTable", my_conn)  # can return tens of thousands of rows
    my_df = my_df[my_df.loc[:, 'colName'].isin(myList)]  # ends up with a handful of rows

with this:

    my_df = pd.read_sql("SELECT * FROM myTable as mt WHERE mt."colName" IN ("hello'world", 'foo"bar', ...)", my_conn)

SQL injection protection is a plus, but at this point I'll be happy with any solution that works.

Well, the SQL specification defines a string literal as being delimited by single quotes, and to include a single quote inside a string literal you have to double it. (You can consult the syntax specifications of SQLite and PostgreSQL to see that they comply with that specification.) Based on that, here's my attempt:

value_list = [ "hello'world", 'foo"bar', """my'name"is""", """see'you"soon""" ]
value_list_escaped = [f"""'{x.replace("'", "''")}'""" for x in value_list]
query_template = "SELECT * FROM myTable as mt WHERE mt.colName IN ({})"
query = query_template.format(", ".join(value_list_escaped))
print(query)

Is that what you wanted?
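
As a rough check of the quote-doubling approach above, here is a minimal sketch that runs the generated query against an in-memory SQLite database. The helper name `quote_sql_literal`, the table contents, and the extra "unrelated" row are assumptions made for illustration. Only SQLite is exercised here; Postgres is not tested.

```python
import sqlite3

value_list = ["hello'world", 'foo"bar', """my'name"is""", """see'you"soon"""]

def quote_sql_literal(s):
    # Hypothetical helper: wrap in single quotes and double any embedded ones.
    return "'" + s.replace("'", "''") + "'"

in_list = ", ".join(quote_sql_literal(v) for v in value_list)
query = (
    'SELECT mt."colName" FROM myTable AS mt '
    f'WHERE mt."colName" IN ({in_list}) ORDER BY rowid'
)

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE myTable ("colName" TEXT)')
# Placeholders are used for the inserts so the test data itself is not in question.
conn.executemany(
    "INSERT INTO myTable VALUES (?)",
    [(v,) for v in value_list + ["unrelated"]],
)

# Rows come back in insertion order because of ORDER BY rowid.
rows = [r[0] for r in conn.execute(query)]
conn.close()
print(rows)
```

Building literals by hand like this offers no protection beyond the quote doubling itself, so bound parameters are generally the safer choice when the driver supports them.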

Here are code fragments from a functioning solution to my problem.

This function is very specific to my problem but demonstrates the parameter injection technique. It also demonstrates how to handle SQLite parameter injection vs. Postgres parameter injection.

def whereInjection(valueList, sqlDict):
    # sqlDict starts with just a "paramCount" key set to an initial value (typically 0 but could be any number).
    # As this function generates parameter injection strings, it generates a key/value pair for each parameter
    # in the form {"p_#": value} where # is the current "paramCount" and value is the value of the associated parameter.
    #
    # The end result for a valueList containing ["aaa", "bbb", "ccc'ddd", 'eee"fff'] will be:
    #   injStr = "(:p_0, :p_1, :p_2, :p_3)"
    #       Note: For Postgres, it has to be "(%(p_0)s, %(p_1)s, etc.)"
    #   sqlDict = {
    #       "paramCount": 4,
    #       "p_0": "aaa",
    #       "p_1": "bbb",
    #       "p_2": "ccc'ddd",
    #       "p_3": 'eee"fff'
    #   }
    localDebugPrintingEnabled = False

    # take into account whether the item values are presented as a list, tuple, set, single int, single string, etc.
    if isinstance(valueList, (list, tuple, set)):
        vList = list(valueList)
    elif isinstance(valueList, (int, str)):
        vList = [valueList]
    else:
        vList = valueList # unexpected type...

    sz = len(vList)
    pc = sqlDict["paramCount"]
    if db_type == 'SQLite': # db_type is not defined in this fragment; it is assumed to be set at module level
        injectStr = "(" + ",".join((":p_" + str(i + pc)) for i in range(0, sz)) + ")"
    else: # assume Postgres
        injectStr = "(" + ",".join(("%(p_" + str(i + pc) + ")s") for i in range(0, sz)) + ")"
    valueDict = {('p_' + str(i + pc)): vList[i] for i in range(0, sz)}

    sqlDict.update(valueDict) # add the valueDict just generated
    sqlDict["paramCount"] += sz # update paramCount for all parameters just added

    return injectStr

The invoking code would look like this. It assumes that you know how to create an engine connection to your DB.

sqlDict = {"paramCount": 0} # start with only the parameter counter, set to 0
sql = """SELECT * FROM myTable as mt WHERE mt."aColName" IN {0}""".format(whereInjection(itemList, sqlDict))
my_df = pd.read_sql(sql, engine_connection, params=sqlDict) # does the actual parameter injection
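
For anyone who wants to try the named-placeholder idea without Pandas or a database server, here is a cut-down sketch using only the standard-library sqlite3 module. The helper `named_in_clause`, the sample rows, and the way the next parameter index is derived from the size of the dictionary are assumptions for illustration and differ from the function above. It covers only the SQLite `:name` style, not the Postgres `%(name)s` style.

```python
import sqlite3

def named_in_clause(values, params, prefix="p_"):
    """Return "(:p_0,:p_1,...)" and add the matching entries to params."""
    start = len(params)  # assumes params holds only previously generated entries
    names = [f"{prefix}{start + i}" for i in range(len(values))]
    params.update(zip(names, values))
    return "(" + ",".join(":" + n for n in names) + ")"

item_list = ["aaa", "ccc'ddd", 'eee"fff']
params = {}
in_clause = named_in_clause(item_list, params)
sql = (
    'SELECT mt."aColName" FROM myTable AS mt '
    'WHERE mt."aColName" IN ' + in_clause + " ORDER BY rowid"
)

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE myTable ("aColName" TEXT)')
conn.executemany(
    "INSERT INTO myTable VALUES (?)",
    [("aaa",), ("bbb",), ("ccc'ddd",), ('eee"fff',)],
)

# The driver binds the values; no quoting or escaping is done in Python.
rows = [r[0] for r in conn.execute(sql, params)]
conn.close()
print(rows)
```

The values never become part of the SQL text, which is what gives the injection protection mentioned in the question. An empty list is not handled in this sketch.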
