Python - Postgres query using sqlalchemy returns "Empty Dataframe"

I am trying to query some data from a Postgres database and write the results to an Excel file with the Python code below (I connect to the server through an SSH tunnel and to the database using sqlalchemy):

from sshtunnel import SSHTunnelForwarder
from sqlalchemy.orm import sessionmaker 
from sqlalchemy import create_engine
import pandas as pd
from pandas import DataFrame
import xlsxwriter
import openpyxl

with SSHTunnelForwarder(
    ('<server_ip>', 22),
    ssh_username="<server_username>",
    ssh_private_key='<private_key_path>', 
    remote_bind_address=('localhost', 5432)) as server:
    server.start()
    print("server connected")

    #connect to DB
    local_port = str(server.local_bind_port)
    # host 127.0.0.1 is assumed here: the tunnel is bound on the local machine
    engine = create_engine('postgresql://<db_username>:<db_password>@127.0.0.1:' + local_port +'/<db_name>')
    Session = sessionmaker(bind=engine)
    s = Session()
    print('Database session created')

    not_empty_query = False #flag empty queries
    arg_query = "SELECT * from portalpage where id not in (select entityid from sharepermissions where entitytype='PortalPage')"
    query = s.execute(arg_query)
    print(query)
    for row in query: #check if the query is empty
        if (row[0] > 0):
            not_empty_query = True
            break
    if not_empty_query == True: #if the query is not empty add response into excel
        df = pd.DataFrame(pd.np.empty((0, 8)))
        df = DataFrame(query.fetchall())
        print(df)
        df.columns = query.keys()
        df.to_excel("out.xlsx", engine="openpyxl", sheet_name="Worksheet_Name")

s.close()

It works for most of the queries I have tried, but with the query above it raises the following error:

ValueError: Length mismatch: Expected axis has 0 elements, new values have 8 elements

While troubleshooting, I printed df and got an "Empty DataFrame". However, when I run the same query directly in my database, I get results.

I also noticed that in the result returned by my database some columns are empty (not sure if that makes any difference).

Please also see the screenshot of the code execution. [image]

The code above works if I remove the following piece:

for row in query: #check if the query is empty
    if (row[0] > 0):
        not_empty_query = True
        break
if not_empty_query == True:

However, if I remove this for loop, then for other queries (mainly those that return empty results) I get the same error. Please find an example below. [image]

Any ideas?

Please try this. The problem is the logic you use to check whether the query returns any data: looping over the result consumes rows, so the later fetchall() only returns what is left. I have modified the code to do that check first. If any rows are returned, it builds the dataframe and then exports it to Excel. Please let me know if it works.

from sshtunnel import SSHTunnelForwarder
from sqlalchemy.orm import sessionmaker 
from sqlalchemy import create_engine
import pandas as pd
from pandas import DataFrame
import xlsxwriter
import openpyxl

with SSHTunnelForwarder(
    ('<server_ip>', 22),
    ssh_username="<server_username>",
    ssh_private_key='<private_key_path>', 
    remote_bind_address=('localhost', 5432)) as server:
    server.start()
    print("server connected")

    #connect to DB
    local_port = str(server.local_bind_port)
    # host 127.0.0.1 is assumed here: the tunnel is bound on the local machine
    engine = create_engine('postgresql://<db_username>:<db_password>@127.0.0.1:' + local_port +'/<db_name>')
    Session = sessionmaker(bind=engine)
    s = Session()
    print('Database session created')
    arg_query = "SELECT * from portalpage where id not in (select entityid from sharepermissions where entitytype='PortalPage')"
    query = s.execute(arg_query)
    rows = query.fetchall()  # fetch once; do not iterate over query before this
    columns = query.keys()
    if len(rows) > 0:  # build and export the dataframe only when rows came back
        df = DataFrame(rows)
        df.columns = columns
        df.to_excel("out.xlsx", engine="openpyxl", sheet_name="Worksheet_Name")
    else:
        print("no data")
    s.close()
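
To see why looping over the result before calling fetchall() can leave nothing to fetch, here is a minimal sketch. It is not the asker's setup: an in-memory SQLite database from the standard library stands in for Postgres, and the table contents and the helper names check_then_fetch and fetch_then_check are made up for illustration. The cursor behaviour it shows (rows already iterated are not returned again) is the same idea.

```python
import sqlite3

# In-memory SQLite stands in for the Postgres database (an assumption for
# illustration only); the table layout and the single row are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE portalpage (id INTEGER, pagename TEXT)")
conn.execute("INSERT INTO portalpage VALUES (10000, 'Dashboard')")


def check_then_fetch(connection):
    """Mimic the question: loop over the result first, then call fetchall()."""
    result = connection.execute("SELECT * FROM portalpage")
    for row in result:
        if row[0] > 0:
            break  # the first row has already been consumed at this point
    return result.fetchall()  # only rows that were not yet iterated


def fetch_then_check(connection):
    """Mimic the answer: fetch everything once, then test the list length."""
    result = connection.execute("SELECT * FROM portalpage")
    return result.fetchall()


leftover = check_then_fetch(conn)
rows = fetch_then_check(conn)

print(leftover)  # empty: the only row was consumed by the loop
print(rows)      # the full result, safe to test with len(rows) > 0
```

With a single matching row, the loop takes that row and fetchall() has nothing left, which would explain an empty dataframe even though the query returns data in the database.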

Try creating an empty data frame first.

if not_empty_query == True: #if the query is not empty add response into excel
    df = pd.DataFrame(pd.np.empty((0, 8)))
    df = DataFrame(query.fetchall())
    print(df)
    df.columns = query.keys()
    df.to_excel("out.xlsx", engine="openpyxl", sheet_name="Worksheet_Name")
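
A related option, sketched here under assumptions, is to pass the column labels when the DataFrame is constructed instead of assigning df.columns afterwards. The helper name rows_to_frame and the sample rows and labels are hypothetical; in the code above the rows would come from query.fetchall() and the labels from query.keys(). The second half reproduces the length-mismatch error from the question on an empty frame.

```python
import pandas as pd


def rows_to_frame(rows, columns):
    """Build a DataFrame with its column labels set at construction time.

    rows_to_frame is a hypothetical helper: rows is a list of tuples and
    columns is an iterable of labels.
    """
    return pd.DataFrame(rows, columns=list(columns))


columns = ["id", "pagename"]  # made-up labels for illustration

empty = rows_to_frame([], columns)                      # no rows, labels kept
full = rows_to_frame([(10000, "Dashboard")], columns)   # one made-up row

print(empty.shape)
print(full.shape)

# Assigning labels after the fact to a frame built from no rows fails,
# because that frame has zero columns to rename.
raised = False
try:
    df = pd.DataFrame([])
    df.columns = columns
except ValueError as exc:
    raised = True
    print(exc)
```

Built this way, an empty result still gives a frame with the expected headers, so whether to write an empty sheet or skip the export becomes a separate decision.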
