
How to use Pandas.DataFrame as input of SQL query?

I am trying to use a Pandas.DataFrame as the intermediate result set between two consecutive SQL queries.

I imagine it looks like:

import pandas.io.sql as pisql
import pyodbc

SQL_command1 = """
                  select * from tab_A
              """
result = pisql.read_frame(SQL_command1)


SQL_command2 = """
                  select * 
                  from ? A
                  inner join B
                  on A.id = B.id
               """    
pyodbc.cursor.execute(SQL_command2, result)

SQL_command2 in the code above is only pseudocode: the ? is meant to take result as its input, with A as its alias.

This is my first time using Pandas, so I'm not sure whether my idea is feasible or efficient. Can anyone enlighten me, please?

Many thanks.

You can do the join in pandas itself. The pseudocode would look like this:

import pandas as pd

# Load both tables; pd.read_sql or any other pandas reader works the same way.
df_a = pd.read_csv('tab_a.csv')
df_b = pd.read_csv('tab_b.csv')

# Inner join on 'id', assuming both tables have an 'id' column.
result = pd.merge(left=df_a,
                  right=df_b,
                  how='inner',
                  on='id')
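For a version that runs end to end, here is a minimal sketch of the same merge with both tables read through pd.read_sql. It assumes an in-memory SQLite database as a stand-in for the real one, and the columns name and score and the sample rows are made up for illustration; only tab_A, B and id come from the question.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database standing in for the real data source.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab_A (id INTEGER, name TEXT);
    CREATE TABLE B (id INTEGER, score REAL);
    INSERT INTO tab_A VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO B VALUES (2, 0.5), (3, 1.5), (4, 2.5);
""")

# First query: its result lives in a DataFrame.
df_a = pd.read_sql("SELECT * FROM tab_A", conn)
df_b = pd.read_sql("SELECT * FROM B", conn)

# The second "query" is done in pandas instead of SQL.
result = pd.merge(left=df_a, right=df_b, how="inner", on="id")
print(result)
```

With pyodbc, the connection object would take the place of the sqlite3 connection here, although recent pandas versions may warn that only SQLAlchemy connectables and sqlite3 connections are officially tested.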

To select specific columns of a pandas DataFrame, index it with a list of column names, for example df_a[['col1','col2','col3']]
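If the second step really has to run as SQL, one possible approach is to write the DataFrame back to the database with DataFrame.to_sql and join against that table. The sketch below is not the method from the answer above. It again assumes an in-memory SQLite database, and the columns col1, col2 and score and the sample data are hypothetical.

```python
import sqlite3

import pandas as pd

# Hypothetical in-memory database that already holds table B.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE B (id INTEGER, score REAL)")
conn.executemany("INSERT INTO B VALUES (?, ?)", [(2, 0.5), (3, 1.5), (4, 2.5)])

# Intermediate result that would normally come from the first query.
df_a = pd.DataFrame({"id": [1, 2, 3],
                     "col1": ["x", "y", "z"],
                     "col2": [10, 20, 30]})

# Keep only the columns the second query needs.
subset = df_a[["id", "col1"]]

# Write the DataFrame to a table named A so that SQL can reference it.
subset.to_sql("A", conn, index=False, if_exists="replace")

joined = pd.read_sql(
    "SELECT A.id, A.col1, B.score "
    "FROM A INNER JOIN B ON A.id = B.id "
    "ORDER BY A.id",
    conn,
)
print(joined)
```

This costs a round trip through the database, so for data that already fits in memory the pd.merge route is usually simpler.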


 