Python pandas to_sql maximum 2100 parameters

Until a few days ago I routinely stored thousands of parameters in my database (SQL Server). I use Spyder (Python 3.6). A few days ago I updated all packages with conda update --all, and now I can no longer import my DataFrames into the database.

--- I don't want a workaround that splits the DataFrame into pieces of fewer than 2100 parameters ---

I would like to understand what has changed, why, and how to get back to a working setup.

Here is a simple example:

import pyodbc
import sqlalchemy
import numpy as np
import pandas as pd


c = pyodbc.connect("Driver={SQL Server};Server=**;Trusted_Connection=no;Database=*;UID=*;PWD=*;")
cursor = c.cursor()  
engine = sqlalchemy.create_engine('mssql+pyodbc://*:*/*?driver=SQL+Server')



df = pd.DataFrame(np.random.randn(5000))
df.to_sql('pr', engine, if_exists='append', index=False)

And this is the error:

ProgrammingError: (pyodbc.ProgrammingError) ('42000', '[42000] [Microsoft][ODBC SQL Server Driver][SQL Server]The incoming request has too many parameters. The server supports a maximum of 2100 parameters. Reduce the number of parameters and resend the request. (8003) (SQLExecDirectW)')

Thanks a lot

There is an open (as of 2018.06.01) issue for this bug in pandas 0.23.

You might want to downgrade to 0.22, which works as expected.

Try limiting the chunksize:

df.to_sql('pr',
          engine,
          chunksize=20,
          if_exists='append',
          index=False)

This worked for me. I think the math for choosing the right chunksize is: chunksize = 2100 / number of columns, rounded down, so that chunksize times the number of columns stays within the 2100-parameter limit.
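The arithmetic above can be sketched as a small helper. This is only an illustration: `safe_chunksize` and its `margin` argument are hypothetical names, not part of pandas, and the margin of one parameter is a cautious assumption because the thread does not confirm whether a request with exactly 2100 parameters is accepted.

```python
SQL_SERVER_MAX_PARAMS = 2100  # limit quoted in the error message


def safe_chunksize(n_columns, max_params=SQL_SERVER_MAX_PARAMS, margin=1):
    """Return a row count per chunk that keeps rows * columns below max_params.

    n_columns is the number of columns actually written (count the index
    as an extra column if it is written too). margin is an assumed safety
    buffer, not a documented requirement.
    """
    if n_columns < 1:
        raise ValueError("n_columns must be at least 1")
    # Integer division rounds down; never return less than one row per chunk.
    return max(1, (max_params - margin) // n_columns)


print(safe_chunksize(1))
print(safe_chunksize(10))
```

With the DataFrame from the question, the result would be passed as `chunksize=safe_chunksize(len(df.columns))` in the `df.to_sql(...)` call.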

Under the hood, pandas uses SQLAlchemy to talk to the database. SQLAlchemy prepares and executes what are known as parameterized queries: it builds the query as a string with ? as a placeholder for each value of your data. It will end up looking like this:

INSERT INTO my_table
VALUES (?,?,?,?,?, etc.)

To SQL Server, each ? is a parameter. SQL Server allows at most 2100 of these parameters in a single parameterized query.

The issue isn't so much the number of columns in your dataset; it's that SQL Server only allows you to include up to 2100 parameter values in one parameterized query. Parameterized queries are used because they separate what is SQL from what is data, an important protection against accidentally executing malicious SQL.
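To make the counting concrete, here is a minimal sketch that builds the text of such a statement and counts its placeholders. It assumes all rows of a chunk are sent in one multi-row INSERT, which is what the error for a one-column, 5000-row DataFrame suggests. `build_multirow_insert` is a hypothetical helper written for illustration, not a pandas or SQLAlchemy function, and the column names are made up.

```python
def build_multirow_insert(table, columns, n_rows):
    """Build a parameterized multi-row INSERT with one ? per value."""
    # One group of placeholders per row, e.g. "(?, ?)" for two columns.
    row = "(" + ", ".join("?" for _ in columns) + ")"
    values = ", ".join(row for _ in range(n_rows))
    return "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), values
    )


# Three rows of two columns need 3 * 2 = 6 parameters.
small = build_multirow_insert("pr", ["a", "b"], 3)
print(small)
print(small.count("?"))

# One column but 5000 rows needs 5000 parameters, well over 2100.
big = build_multirow_insert("pr", ["value"], 5000)
print(big.count("?") > 2100)
```

The parameter count is rows times columns, so even a single-column table hits the limit once a chunk holds more than about 2100 rows.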

As you've discovered, you can chunk the work such that you don't bump into the 2100 limit. Alternatively, you can construct the SQL yourself and execute it, but this is generally frowned upon as a security risk (SQL injection attack).
