Pandas GroupBy and SQL Where Clause parameters

I want to build SQL query strings from a Pandas DataFrame: group the rows with groupby, then put every value of one column from each group into the WHERE clause as multiple parameters. What is the best way to do this?

import pandas as pd

df = pd.DataFrame({
    'Contact Name':['John Doe','John Doe','Jane Doe','Jim Doe','Jim Doe'],
    'Email Address': ['john.doe@gmail.com','john.doe@gmail.com','jane.doe@gmail.com','jim.doe@gmail.com','jim.doe@gmail.com'],
    'Contract No':['2851','2852','2853','2854','2855'],
})

From the example above, I need three SQL queries, one per contact:

SELECT * FROM TABLE WHERE [Contract No] IN ('2851', '2852')
SELECT * FROM TABLE WHERE [Contract No] IN ('2853')
SELECT * FROM TABLE WHERE [Contract No] IN ('2854', '2855')

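Let's use parametrized SQL, which gives attackers one less way into the database. Note that the placeholder token depends on the database driver's paramstyle: %s is used below, while some drivers expect ? instead.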

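sqls = []
args = []
# One group per contact; each group yields one query and its parameter tuple.
for key, grp in df.groupby(['Contact Name', 'Email Address']):
    arg = tuple(grp['Contract No'])
    # One placeholder per contract number, e.g. '%s,%s' for two values.
    placeholders = ','.join(['%s'] * len(arg))
    sql = 'SELECT * FROM TABLE WHERE [Contract No] IN ({})'.format(placeholders)
    sqls.append(sql)
    args.append(arg)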

for sql, arg in zip(sqls, args):
    print(sql, arg)
    # SELECT * FROM TABLE WHERE [Contract No] IN (%s) ('2853',)
    # SELECT * FROM TABLE WHERE [Contract No] IN (%s,%s) ('2854', '2855')
    # SELECT * FROM TABLE WHERE [Contract No] IN (%s,%s) ('2851', '2852')

To execute the parametrized SQL, use the two-argument form of cursor.execute, which passes the values separately from the query text:

for sql, arg in zip(sqls, args):
    cursor.execute(sql, arg)
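For anyone who wants to try the two-argument form end to end, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory table `contracts` and its `amount` column are made up for illustration, and sqlite3 uses `?` as its placeholder rather than `%s`, so adjust the placeholder to whatever your own driver expects.

```python
import sqlite3

import pandas as pd

df = pd.DataFrame({
    'Contact Name': ['John Doe', 'John Doe', 'Jane Doe', 'Jim Doe', 'Jim Doe'],
    'Email Address': ['john.doe@gmail.com', 'john.doe@gmail.com', 'jane.doe@gmail.com',
                      'jim.doe@gmail.com', 'jim.doe@gmail.com'],
    'Contract No': ['2851', '2852', '2853', '2854', '2855'],
})

# Hypothetical in-memory table standing in for TABLE (names are assumptions).
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE contracts ("Contract No" TEXT, amount INTEGER)')
conn.executemany(
    'INSERT INTO contracts VALUES (?, ?)',
    [(no, i * 100) for i, no in enumerate(df['Contract No'], start=1)],
)

results = {}
for (name, email), grp in df.groupby(['Contact Name', 'Email Address']):
    params = tuple(grp['Contract No'])
    # sqlite3 uses the "qmark" paramstyle: one '?' per value.
    placeholders = ','.join(['?'] * len(params))
    sql = f'SELECT * FROM contracts WHERE "Contract No" IN ({placeholders})'
    # Values travel separately from the SQL text, so they are never spliced into it.
    results[name] = conn.execute(sql, params).fetchall()

for name, rows in results.items():
    print(name, rows)
```

Only the number of placeholders is formatted into the query string; the contract numbers themselves are always passed as parameters.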

I figured out a solution. I just needed to use a lambda function along with groupby to join the quoted contract numbers of each group into a single string.

import pandas as pd

df1 = pd.DataFrame({
    'Contact Name':['John Doe','John Doe','Jane Doe','Jim Doe','Jim Doe'],
    'Email Address':['john.doe@gmail.com','john.doe@gmail.com','jane.doe@gmail.com','jim.doe@gmail.com','jim.doe@gmail.com'],
    'Contract No':['2851','2852','2853','2854','2855'],
})

# Wrap each contract number in single quotes, then join the numbers of each group with commas.
df2 = (
    df1.groupby(['Contact Name', 'Email Address'])['Contract No']
    .apply(lambda x: ','.join("'" + x + "'"))
    .reset_index()
)

for index, row in df2.iterrows():
    print(f"SELECT * FROM TABLE WHERE [Contract No] IN ({row['Contract No']})")
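As a possible variation on the same idea, the contract numbers can be collected into a list per group with `agg(list)` and the query text built in a plain loop, which avoids `iterrows`. This is only a sketch; the `queries` list is a name introduced here for illustration. Like the lambda version, it places the values directly in the SQL text, so it is only reasonable when the contract numbers come from a trusted source.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Contact Name': ['John Doe', 'John Doe', 'Jane Doe', 'Jim Doe', 'Jim Doe'],
    'Email Address': ['john.doe@gmail.com', 'john.doe@gmail.com', 'jane.doe@gmail.com',
                      'jim.doe@gmail.com', 'jim.doe@gmail.com'],
    'Contract No': ['2851', '2852', '2853', '2854', '2855'],
})

# One list of contract numbers per (name, email) pair; groups are sorted by key.
grouped = df1.groupby(['Contact Name', 'Email Address'])['Contract No'].agg(list)

queries = []
for (name, email), contracts in grouped.items():
    # Quote each value and separate with ', ' to match the format asked for in the question.
    in_list = ', '.join(f"'{c}'" for c in contracts)
    queries.append(f'SELECT * FROM TABLE WHERE [Contract No] IN ({in_list})')

for q in queries:
    print(q)
```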
