Failing to bulk insert data from Pandas dataframe into Sybase database table using to_sql
The purpose of the code below is to get data from a RESTful service, normalize it, store it in a dataframe with the necessary columns, and finally load it into a Sybase table using Pandas' to_sql.
Error:
File "C:\Program Files\Anaconda3\lib\site-packages\sqlalchemy\engine\default.py", line 467, in do_executemany
    cursor.executemany(statement, parameters)
sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000] [Sybase][ODBC Driver][Adaptive Server Enterprise]Incorrect syntax near ','.\n (102) (SQLExecDirectW)") [SQL: 'INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") VALUES (?, ?, ?, ?)'] [parameters: (('0050/TAIEX', 'TAIEX', 'TWD', 0), ('035420/KORE', 'KORE', 'KRW', 0), ('0TL/LIF', 'LIF', 'NOK', 1), ('100FTSE/LIF', 'LIF', 'GBP', 0), ('101FTSE/LIF', 'LIF', 'GBP', 0), ('10STAT/OM', 'OM', 'SEK', 0), ('10TB/KFX', 'KFX', 'KRW', 0), ('10TBA/KFX', 'KFX', 'KRW', 0)... displaying 10 of 4525 total bound parameter sets... ('ZURF/DTB', 'DTB', 'CHF', 0), ('ZX/NYCE', 'NYCE', 'USD', 0))]
Process finished with exit code 1
Code:
from sqlalchemy.engine.url import *
from sqlalchemy.connectors.pyodbc import *
from sqlalchemy import create_engine
import urllib.request as request
import json
import pandas as pd
from pandas.io.json import json_normalize, DataFrame
response = request.urlopen('http://tfsdscsw5XX/mdsclass/CONTFUTURES--O.json')
output=response.read()
data=json.loads(output)
df=json_normalize(data)
df1=(df[['CONTRACT_ID','EXCHANGE_ID','CURRENCY','TRADING_CODE']])
df2=pd.DataFrame(df1)
print(df2)
print(df2.CONTRACT_ID)
connector = PyODBCConnector()
url = make_url("sybase+pyodbc://myhost/mydatabase?driver=Adaptive Server Enterprise&port=2306")
print(connector.create_connect_args(url))
engine=create_engine(url)
# it is failing here
df2.to_sql("contract_test",engine,index=False,if_exists="append",schema="dbo")
response.close()
Sample of data in dataframe df2:
CONTRACT_ID EXCHANGE_ID CURRENCY TRADING_CODE
0 0050/TAIEX TAIEX TWD 0
1 035420/KORE KORE KRW 0
2 0TL/LIF LIF NOK 1
3 100FTSE/LIF LIF GBP 0
4 101FTSE/LIF LIF GBP 0
Table contract_test definition:
CREATE TABLE contract_test (
CONTRACT_ID char(12) NOT NULL,
EXCHANGE_ID char(12),
CURRENCY char(4) NOT NULL,
TRADING_CODE smallint
)
GO
How can this be resolved? I am stuck here.
Your issue may simply be an incompatibility between Python database APIs. Pandas' to_sql is really running an executemany() call from pyodbc. This module is more commonly used with SQL Server, especially together with SQLAlchemy. However, integration with Sybase is not fully supported. As mentioned on the SQLAlchemy Sybase docs page:
Note
The Sybase dialect within SQLAlchemy is not currently supported. It is not tested within continuous integration and is likely to have many issues and caveats not currently handled. Consider using the external dialect instead.
Specifically, executemany appears to be running a multi-row VALUES insert, which is supported in SQL Server but not in Sybase (even though both dialects are variants of T-SQL with a shared history):
INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE")
VALUES ('0050/TAIEX', 'TAIEX', 'TWD', 0),
('035420/KORE', 'KORE', 'KRW', 0),
('0TL/LIF', 'LIF', 'NOK', 1),
...
Instead, Sybase requires classic ANSI SQL with multiple INSERT INTO statements:
INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE")
VALUES ('0050/TAIEX', 'TAIEX', 'TWD', 0)
INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE")
VALUES ('035420/KORE', 'KORE', 'KRW', 0)
INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE")
VALUES ('0TL/LIF', 'LIF', 'NOK', 1)
...
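The two statement shapes can be compared side by side with any DB-API driver. The following is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in (it is not Sybase, and unlike the behavior described above it accepts both forms), drops the dbo schema prefix, and uses the hypothetical table names multi_row and single_row. It shows that one multi-row VALUES statement and a series of single-row INSERT statements are meant to load the same data.

```python
import sqlite3

# Stand-in database: sqlite3 ships with Python. It is NOT Sybase ASE and is
# used here only to compare the two statement shapes.
conn = sqlite3.connect(":memory:")
ddl = """CREATE TABLE {name} (
    CONTRACT_ID  char(12) NOT NULL,
    EXCHANGE_ID  char(12),
    CURRENCY     char(4)  NOT NULL,
    TRADING_CODE smallint
)"""
conn.execute(ddl.format(name="multi_row"))    # hypothetical table name
conn.execute(ddl.format(name="single_row"))   # hypothetical table name

rows = [
    ("0050/TAIEX", "TAIEX", "TWD", 0),
    ("035420/KORE", "KORE", "KRW", 0),
    ("0TL/LIF", "LIF", "NOK", 1),
]
columns = "(CONTRACT_ID, EXCHANGE_ID, CURRENCY, TRADING_CODE)"

# Form 1: a single statement carrying several VALUES rows.
placeholders = ", ".join(["(?, ?, ?, ?)"] * len(rows))
flat_params = [value for row in rows for value in row]
conn.execute(f"INSERT INTO multi_row {columns} VALUES {placeholders}", flat_params)

# Form 2: one single-row INSERT statement per record.
for row in rows:
    conn.execute(f"INSERT INTO single_row {columns} VALUES (?, ?, ?, ?)", row)

multi = conn.execute("SELECT * FROM multi_row ORDER BY CONTRACT_ID").fetchall()
single = conn.execute("SELECT * FROM single_row ORDER BY CONTRACT_ID").fetchall()
print(multi == single)
```

On a server that rejects the first form, only the second shape is available, which is why the number of round trips grows with the number of rows.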
To resolve this, instead of Pandas' convenient to_sql method, consider a direct SQLAlchemy executemany call, passing the data frame rows as a list of parameters via DataFrame.to_numpy(). The code below assumes the contract_test table always exists beforehand.
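engine = create_engine(url)
# One parameterized statement; each row supplies one set of bound values
sql = """INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE")
         VALUES (?, ?, ?, ?)"""
with engine.connect() as connection:
    # Passing a list of row lists results in an executemany call
    result = connection.execute(sql, df2.to_numpy().tolist())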
If the above still hits the same issue, fall back to a for loop:
with engine.connect() as connection:
    # Execute the INSERT once per data frame row
    for row in df2.to_numpy().tolist():
        result = connection.execute(sql, row)
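The "try executemany first, then loop" idea can also be wrapped in one function. What follows is a rough sketch under stated assumptions, not a tested Sybase solution: insert_rows is a hypothetical helper, it works on a plain DB-API connection rather than a SQLAlchemy engine, and sqlite3 again stands in for the real database so the example can run anywhere. Collecting the rows that still fail is an addition for illustration; one row is deliberately invalid to trigger the fallback.

```python
import sqlite3


def insert_rows(connection, sql, rows):
    """Hypothetical helper: try one executemany call, else insert row by row.

    Returns a list of (row, error message) pairs for rows that still failed.
    """
    cursor = connection.cursor()
    try:
        cursor.executemany(sql, rows)
        connection.commit()
        return []
    except Exception:  # DB-API drivers raise their own error classes
        connection.rollback()  # discard any rows inserted before the failure

    failed = []
    for row in rows:
        try:
            cursor.execute(sql, row)
        except Exception as exc:
            failed.append((row, str(exc)))
    connection.commit()
    return failed


# Stand-in database (NOT Sybase); schema prefix dropped for sqlite3.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contract_test ("
    "CONTRACT_ID char(12) NOT NULL, EXCHANGE_ID char(12), "
    "CURRENCY char(4) NOT NULL, TRADING_CODE smallint)"
)
sql = (
    "INSERT INTO contract_test "
    "(CONTRACT_ID, EXCHANGE_ID, CURRENCY, TRADING_CODE) VALUES (?, ?, ?, ?)"
)
# Same shape as df2.to_numpy().tolist(): a list of row lists.
rows = [
    ["0050/TAIEX", "TAIEX", "TWD", 0],
    [None, "KORE", "KRW", 0],  # deliberately violates NOT NULL
    ["0TL/LIF", "LIF", "NOK", 1],
]

failed = insert_rows(conn, sql, rows)
inserted = conn.execute("SELECT COUNT(*) FROM contract_test").fetchone()[0]
print(inserted, len(failed))
```

Rolling back before the loop keeps rows from being inserted twice when the driver had already written part of the batch before raising.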
The external SAP ASE (Sybase) dialect is now the recommended SQLAlchemy dialect for Sybase, and it does support fast_executemany if you use the SAP ASE ODBC driver.