Connect pandas output to Excel sheet via ODBC

The script below does the following: it connects to a SQL Server database via ODBC, runs a SQL script (many queries separated by ";"), creates two DataFrames from two specific query results, and then exports them to two tabs of an Excel workbook:

import pyodbc
import pandas as pd
import time

name = 'output' + time.strftime("%Y-%b-%d__%H_%M_%S", time.localtime())
print ("Connecting via ODBC")

conn = pyodbc.connect('DSN=Server DB Prod', autocommit=True)

print ("Connected!\n")

inputdir = 'H:\\Queries\\ADS'

#for script in os.listdir(inputdir):
with open(inputdir+'\\' + 'query' +'.sql','r') as inserts:
    sqlScript = inserts.read()
    for statement in sqlScript.split(';'):
        with conn.cursor() as cur:
            cur.execute(statement)

query1="Select * from #leadership"
data1=pd.read_sql_query(query1, conn).sort_values(['channel','terr_code'], ascending=[0,1]).reset_index(drop=True)
#print(data1.head(n=100))
query2="Select * from #ml"
data2=pd.read_sql_query(query2, conn).sort_values(['channel','terr_code','client_name'], ascending=[0,1,1]).reset_index(drop=True)

print('query finished')
conn.close()

with pd.ExcelWriter(name + '.xlsx') as writer:   # saves and closes the file on exit
    data1.to_excel(writer, sheet_name='Leadership Summary')
    data2.to_excel(writer, sheet_name='ML Detail')

print("Results were successfully exported")
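
Splitting a script on ";" leaves an empty final fragment whenever the script ends with a semicolon, and executing an empty statement may raise an error depending on the driver. A minimal sketch of a guard follows, assuming the script has no semicolons inside string literals or comments. `split_statements` is a hypothetical helper name, not part of pyodbc or pandas.

```python
def split_statements(script):
    """Split a SQL script on ';' and drop blank fragments.

    Deliberately naive: a ';' inside a string literal or a comment
    is still treated as a statement separator.
    """
    return [part.strip() for part in script.split(';') if part.strip()]


# Example script text (made up for illustration).
script = "SELECT 1 AS a INTO #t;\n\nSELECT * FROM #t;\n"
for statement in split_statements(script):
    print(repr(statement))
```

In the loop above, iterating over split_statements(sqlScript) instead of sqlScript.split(';') would be enough to skip the blank fragments.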

However, I would instead like to connect to an existing Excel file via ODBC so that I can update the tabs of my workbook dynamically without losing the formatting and graphs, which would allow real automation. Any other solution that achieves the same result would certainly work too.

Background: I am trying to automate a process in which I run a query in SQL Server (via Python) and have the output update the tabs of an existing Excel workbook; I was thinking of doing this by connecting via ODBC. That workbook has specific formatting, plus formulas and graphs built on the data.

Note: I don't have write permission, only read, so I can't connect a "final" SQL table to Excel via ODBC. I am also doing some additional data blending from other sources in Python (not shown), so connecting a SQL query to Excel via ODBC will not work.

Any help is greatly appreciated.

The best strategy is to open the Excel workbook and use Excel's own facilities rather than external tools, so that all other objects in the workbook are left untouched. Hence, consider the win32com client, which gives you access to the Excel object library, including its CopyFromRecordset method.

And instead of pyodbc as the database API, use the Windows ADODB API, which can use ODBC connections. Also, there is no need for pandas, as worksheet objects are used to hold the data. NOTE: this solution only works on Windows machines.

import win32com.client as win32

try:
    # INITIALIZE OBJECTS
    xlapp = win32.gencache.EnsureDispatch('Excel.Application')
    ado_conn = win32.gencache.EnsureDispatch('ADODB.Connection')
    ado_rst = win32.gencache.EnsureDispatch('ADODB.Recordset')

    # OPEN CONNECTION
    ado_conn.Open('DSN=Server DB Prod')

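    # RUN QUERIES
    inputdir = 'H:\\Queries\\ADS'     # SCRIPT FOLDER FROM THE QUESTION
    with open(inputdir+'\\' + 'query' +'.sql','r') as inserts:
        sqlScript = inserts.read()
        for statement in sqlScript.split(';'):
            if statement.strip():     # SKIP EMPTY STATEMENTS
                ado_conn.Execute(statement)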

    # OPEN WORKBOOK AND UPDATE SHEETS
    xlwb = xlapp.Workbooks.Open(r'C:\Full\Path\To\Workbook.xlsx')

    ls = xlwb.Worksheets('Leadership Summary') 
    ls.Cells.ClearContents()
    ado_rst.Open("Select * from #leadership", ado_conn)
    for i in range(ado_rst.Fields.Count):
       ls.Cells(1, i+1).Value = ado_rst.Fields(i).Name     # COLUMNS
    ls.Range("A2").CopyFromRecordset(ado_rst)              # DATA ROWS
    ado_rst.Close()

    ml = xlwb.Worksheets('ML Detail')
    ml.Cells.ClearContents()
    ado_rst.Open("Select * from #ml", ado_conn)         
    for i in range(ado_rst.Fields.Count): 
       ml.Cells(1, i+1).Value = ado_rst.Fields(i).Name     # COLUMNS
    ml.Range("A2").CopyFromRecordset(ado_rst)              # DATA ROWS
    ado_rst.Close()

    ado_conn.Close()
    xlapp.Visible = True        # OPENS WORKBOOK WITH ABOVE CHANGES TO SCREEN

except Exception as e:
    print(e)

finally:
    # RELEASE RESOURCES
    ls = None; ml = None
    ado_rst = None; ado_conn = None
    xlwb = None; xlapp = None

Hopefully, this dispels the idea that VBA (also a COM-interfaced language) is the only coding language for Excel!
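
Since the question mentions blending data in Python before the export, the same COM approach can probably also take rows that were prepared in Python rather than an ADODB recordset. The following is only a sketch under assumptions: `to_excel_rows` and `write_rows` are hypothetical helper names, and the sketch assumes every value is something COM can convert (plain numbers, strings, None), which is why NumPy scalars and NaN are normalized first. `write_rows` needs Windows, Excel and pywin32; `to_excel_rows` is plain Python.

```python
import math


def to_excel_rows(columns, records):
    """Build a tuple of tuples (header row first) to assign to Range.Value."""
    def clean(value):
        # NumPy scalars expose .item(), which returns a native Python value.
        if hasattr(value, 'item'):
            value = value.item()
        # Write NaN as an empty cell.
        if isinstance(value, float) and math.isnan(value):
            return None
        return value

    rows = [tuple(str(c) for c in columns)]
    rows.extend(tuple(clean(v) for v in record) for record in records)
    return tuple(rows)


def write_rows(workbook_path, sheet_name, rows):
    """Overwrite one sheet's cell contents and save the workbook (Windows only)."""
    import win32com.client as win32   # imported here so to_excel_rows works anywhere

    xlapp = win32.gencache.EnsureDispatch('Excel.Application')
    try:
        xlwb = xlapp.Workbooks.Open(workbook_path)
        ws = xlwb.Worksheets(sheet_name)
        ws.Cells.ClearContents()      # clears values and formulas, keeps formatting
        target = ws.Range(ws.Cells(1, 1), ws.Cells(len(rows), len(rows[0])))
        target.Value = rows           # one bulk assignment for the whole block
        xlwb.Save()
        xlwb.Close(False)
    finally:
        xlapp.Quit()                  # closes the Excel instance obtained above


# Example with made-up values; no Excel needed for this part.
rows = to_excel_rows(['channel', 'terr_code'],
                     [('Retail', 101), ('Web', float('nan'))])
print(rows)
```

With a DataFrame df, the rows could be built with to_excel_rows(df.columns, df.itertuples(index=False, name=None)) and then passed to write_rows together with the workbook path and a sheet name such as 'Leadership Summary'. Unlike the code above, this sketch saves and closes the workbook instead of leaving it open on screen.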
