Python - Postgres query using sqlalchemy returns "Empty Dataframe"

I am trying to query some data from a Postgres database and write the results to an Excel file with the Python code below (I connect to the server through an SSH tunnel and to the database using sqlalchemy):

from sshtunnel import SSHTunnelForwarder
from sqlalchemy.orm import sessionmaker 
from sqlalchemy import create_engine
import pandas as pd
from pandas import DataFrame
import xlsxwriter
import openpyxl

with SSHTunnelForwarder(
    ('<server_ip>', 22),
    ssh_username="<server_username>",
    ssh_private_key='<private_key_path>', 
    remote_bind_address=('localhost', 5432)) as server:
    server.start()
    print "server connected"

    #connect to DB
    local_port = str(server.local_bind_port)
    engine = create_engine('postgresql://<db_username>:<db_password>@localhost:' + local_port +'/<db_name>')
    Session = sessionmaker(bind=engine)
    s = Session()
    print 'Database session created'

    not_empty_query = False #flag empty queries
    arg_query = "SELECT * from portalpage where id not in (select entityid from sharepermissions where entitytype='PortalPage')"
    query = s.execute(arg_query)
    print(query)
    for row in query: #check if the query is empty
        if (row[0] > 0):
            not_empty_query = True
            break
    if not_empty_query == True: #if the query is not empty add response into excel
        df = pd.DataFrame(pd.np.empty((0, 8)))
        df = DataFrame(query.fetchall())
        print(df)
        df.columns = query.keys()
        df.to_excel("out.xlsx", engine="openpyxl", sheet_name="Worksheet_Name")

s.close()

It works for most of the queries that I tried to execute, but with the above query it returns the following error:

ValueError: Length mismatch: Expected axis has 0 elements, new values have 8 elements

While troubleshooting, I printed df and got an "Empty DataFrame". However, when I run the same query directly in my database, I get results.

I also noticed that some columns in the result, as shown in my database, are empty (not sure if that makes any difference).

Please also see the screenshot of the code execution. [screenshot]

The code above works if I remove the following piece:

for row in query: #check if the query is empty
    if (row[0] > 0):
        not_empty_query = True
        break
if not_empty_query == True:

However, if I remove this for loop, then I get the same error for other queries (mainly queries that return no rows). Please find an example below. [screenshot]

Any ideas?

Please try this. The problem is the logic you use to check whether the query returns any data: the for loop reads rows from the result, so they are no longer there when fetchall() runs. I have changed the code to fetch all rows first and check their count. If at least one row is returned, it builds the dataframe and exports it to Excel. Please let me know if it works.

from sshtunnel import SSHTunnelForwarder
from sqlalchemy.orm import sessionmaker 
from sqlalchemy import create_engine
import pandas as pd
from pandas import DataFrame
import xlsxwriter
import openpyxl

with SSHTunnelForwarder(
    ('<server_ip>', 22),
    ssh_username="<server_username>",
    ssh_private_key='<private_key_path>', 
    remote_bind_address=('localhost', 5432)) as server:
    server.start()
    print("server connected")

    #connect to DB
    local_port = str(server.local_bind_port)
    engine = create_engine('postgresql://<db_username>:<db_password>@localhost:' + local_port +'/<db_name>')
    Session = sessionmaker(bind=engine)
    s = Session()
    print('Database session created')
    arg_query = "SELECT * from portalpage where id not in (select entityid from sharepermissions where entitytype='PortalPage')"
    query = s.execute(arg_query)  # run the query on the session created above
    rows = query.fetchall()  # read all rows exactly once
    columns = query.keys()
    if len(rows) > 0:  # only build the dataframe when there is data
        df = DataFrame(rows)
        df.columns = columns
        df.to_excel("out.xlsx", engine="openpyxl", sheet_name="Worksheet_Name")
    else:
        print("no data")
    s.close()
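
To illustrate why the order of the check matters, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in for the Postgres connection. This is an assumption made only so the example runs without a server; the pagename column and the inserted value are made up. A sqlite3 cursor is a forward-only iterator, which seems to be the same behaviour the sqlalchemy result shows in the question: rows read by the for loop are not returned again by fetchall().

```python
import sqlite3

# In-memory database standing in for the real Postgres server (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE portalpage (id INTEGER, pagename TEXT)")
conn.execute("INSERT INTO portalpage VALUES (10000, 'only row')")

# Pattern from the question: peek at the result with a for loop, then fetchall().
cursor = conn.execute("SELECT * FROM portalpage")
not_empty_query = False
for row in cursor:
    if row[0] > 0:
        not_empty_query = True
        break
leftover = cursor.fetchall()  # the loop already consumed the single row

# Pattern from the answer: read the column names, fetch once, test the length.
cursor = conn.execute("SELECT * FROM portalpage")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(not_empty_query, leftover)
print(len(rows), columns)
conn.close()
```

With a single matching row, the flag is set but leftover is an empty list, which would explain the "Empty DataFrame" for a query that does return data.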

Try to create an empty data frame first.

if not_empty_query == True: #if the query is not empty add response into excel
        df = pd.DataFrame(pd.np.empty((0, 8)))   
        df = DataFrame(query.fetchall())
        print(df)
        df.columns = query.keys()
        df.to_excel("out.xlsx", engine="openpyxl", sheet_name="Worksheet_Name")
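
Note that in the snippet above the empty frame is overwritten on the very next line, so it probably has no effect. A possible alternative, sketched here under assumptions, is to pass the column names when the DataFrame is constructed, so that an empty result still gets the right columns instead of raising the length mismatch error. The helper name rows_to_frame and the sample column names and values are hypothetical; in the real code the inputs would be query.fetchall() and query.keys().

```python
import pandas as pd

def rows_to_frame(rows, columns):
    """Hypothetical helper: set column labels at construction time."""
    return pd.DataFrame([tuple(r) for r in rows], columns=list(columns))

columns = ["id", "pagename"]          # stand-in for query.keys()
empty = rows_to_frame([], columns)    # stand-in for an empty fetchall()
full = rows_to_frame([(10000, "dashboard")], columns)

print(empty.shape, list(empty.columns))
print(full.shape)
```

With this approach the same code path handles empty and non-empty results, and whether to skip writing the Excel file can be decided afterwards with df.empty.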
