Bulk insert from a Pandas DataFrame into a Sybase database table fails when using to_sql

The code below gets data from a RESTful service, normalizes it, stores the necessary columns in a DataFrame, and finally loads it into a Sybase table using Pandas' to_sql.

Error:

File "C:\Program Files\Anaconda3\lib\site-packages\sqlalchemy\engine\default.py", line 467, in do_executemany
    cursor.executemany(statement, parameters)
sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000] [Sybase][ODBC Driver][Adaptive Server Enterprise]Incorrect syntax near ','.\n (102) (SQLExecDirectW)") [SQL: 'INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") VALUES (?, ?, ?, ?)'] [parameters: (('0050/TAIEX', 'TAIEX', 'TWD', 0), ('035420/KORE', 'KORE', 'KRW', 0), ('0TL/LIF', 'LIF', 'NOK', 1), ('100FTSE/LIF', 'LIF', 'GBP', 0), ('101FTSE/LIF', 'LIF', 'GBP', 0), ('10STAT/OM', 'OM', 'SEK', 0), ('10TB/KFX', 'KFX', 'KRW', 0), ('10TBA/KFX', 'KFX', 'KRW', 0)... displaying 10 of 4525 total bound parameter sets... ('ZURF/DTB', 'DTB', 'CHF', 0), ('ZX/NYCE', 'NYCE', 'USD', 0))]

Process finished with exit code 1

Code:

from sqlalchemy import create_engine
from sqlalchemy.connectors.pyodbc import PyODBCConnector
from sqlalchemy.engine.url import make_url
import urllib.request as request
import json
import pandas as pd
from pandas.io.json import json_normalize

# Fetch the JSON payload from the REST service and flatten it
response = request.urlopen('http://tfsdscsw5XX/mdsclass/CONTFUTURES--O.json')
output = response.read()
data = json.loads(output)
df = json_normalize(data)

# Keep only the columns that exist in the target table
df1 = df[['CONTRACT_ID', 'EXCHANGE_ID', 'CURRENCY', 'TRADING_CODE']]
df2 = pd.DataFrame(df1)
print(df2)
print(df2.CONTRACT_ID)

connector = PyODBCConnector()
url = make_url("sybase+pyodbc://myhost/mydatabase?driver=Adaptive Server Enterprise&port=2306")
print(connector.create_connect_args(url))
engine = create_engine(url)

# It fails here
df2.to_sql("contract_test", engine, index=False, if_exists="append", schema="dbo")

response.close()

Sample of data in dataframe df2:

      CONTRACT_ID EXCHANGE_ID CURRENCY  TRADING_CODE
0      0050/TAIEX       TAIEX      TWD             0
1     035420/KORE        KORE      KRW             0
2         0TL/LIF         LIF      NOK             1
3     100FTSE/LIF         LIF      GBP             0
4     101FTSE/LIF         LIF      GBP             0

Table contract_test definition:

CREATE TABLE contract_test (
    CONTRACT_ID char(12) NOT NULL,
    EXCHANGE_ID char(12),
    CURRENCY char(4) NOT NULL,
    TRADING_CODE smallint
) 
GO

How can this be resolved? I am stuck here.

Your issue may simply be an incompatibility between Python database APIs. Pandas' to_sql actually runs an executemany() call through pyodbc. This module is more commonly used with SQL Server, especially together with SQLAlchemy. Integration with Sybase, however, is not fully supported. As mentioned on the SQLAlchemy Sybase docs page:

Note

The Sybase dialect within SQLAlchemy is not currently supported. It is not tested within continuous integration and is likely to have many issues and caveats not currently handled. Consider using the external dialect instead.

Specifically, executemany appears to be running a multi-row VALUES insert, which SQL Server supports but Sybase does not (even though both dialects are variants of T-SQL with a shared history):

INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('0050/TAIEX', 'TAIEX', 'TWD', 0), 
       ('035420/KORE', 'KORE', 'KRW', 0), 
       ('0TL/LIF', 'LIF', 'NOK', 1), 
...

Instead, Sybase requires a separate INSERT INTO statement for each row:

INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('0050/TAIEX', 'TAIEX', 'TWD', 0) 
INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('035420/KORE', 'KORE', 'KRW', 0)
INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
VALUES ('0TL/LIF', 'LIF', 'NOK', 1)
...
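
As a rough illustration of the executemany() call mentioned above, the sketch below runs one parameterized INSERT with several parameter sets, which is the same statement/parameters shape shown in the traceback. It uses the standard-library sqlite3 module purely as a stand-in (an assumption, chosen because it shares the ? placeholder style with pyodbc). It does not reproduce Sybase's behavior, and the identifier quoting and dbo schema are dropped for simplicity.

```python
import sqlite3

# Stand-in database (assumption): sqlite3 is used only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contract_test ("
    "CONTRACT_ID char(12) NOT NULL, EXCHANGE_ID char(12), "
    "CURRENCY char(4) NOT NULL, TRADING_CODE smallint)"
)

# One parameterized statement, as in the [SQL: ...] part of the traceback
sql = (
    "INSERT INTO contract_test "
    "(CONTRACT_ID, EXCHANGE_ID, CURRENCY, TRADING_CODE) VALUES (?, ?, ?, ?)"
)

# A sequence of parameter sets, as in the [parameters: ...] part of the traceback
rows = [
    ("0050/TAIEX", "TAIEX", "TWD", 0),
    ("035420/KORE", "KORE", "KRW", 0),
    ("0TL/LIF", "LIF", "NOK", 1),
]

# executemany hands the statement and all parameter sets to the driver;
# how they are actually sent to the server is up to the driver.
cur = conn.cursor()
cur.executemany(sql, rows)
conn.commit()

inserted = conn.execute("SELECT COUNT(*) FROM contract_test").fetchone()[0]
print(inserted)
```

How the driver turns those parameter sets into server requests is driver-specific, which is why the same call can succeed against one backend and fail against another.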

To resolve this, instead of Pandas' convenient to_sql method, consider a direct SQLAlchemy executemany call, passing the data frame rows as a list of parameters via DataFrame.to_numpy(). The code below assumes the contract_test table already exists.

engine = create_engine(url)
sql = """INSERT INTO dbo.contract_test ("CONTRACT_ID", "EXCHANGE_ID", "CURRENCY", "TRADING_CODE") 
         VALUES (?, ?, ?, ?)"""

with engine.connect() as connection:
    result = connection.execute(sql, df2.to_numpy().tolist())

If the above still hits the same error, use a for loop:

with engine.connect() as connection:
    for row in df2.to_numpy().tolist():
        result = connection.execute(sql, row)
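
The two approaches above can be combined: try a single executemany call first and fall back to row-by-row inserts only if it fails. The following is a minimal sketch at the DB-API level, not a definitive implementation. The helper name insert_rows and the 'BAD/ROW' record are hypothetical, and sqlite3 stands in for the Sybase connection so the example is runnable; with SQLAlchemy, a DB-API connection could be obtained via engine.raw_connection().

```python
import sqlite3


def insert_rows(conn, sql, rows):
    """Try one executemany call; on failure, retry row by row.

    Hypothetical helper. Returns the rows that still failed individually.
    """
    cur = conn.cursor()
    try:
        cur.executemany(sql, rows)
        conn.commit()
        return []
    except Exception:
        # In real code, narrow this to the driver's Error class (e.g. pyodbc.Error).
        conn.rollback()

    failed = []
    for row in rows:
        try:
            cur.execute(sql, row)
        except Exception:
            failed.append(row)
    conn.commit()
    return failed


# Stand-in database (assumption): sqlite3 instead of Sybase
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contract_test ("
    "CONTRACT_ID char(12) NOT NULL, EXCHANGE_ID char(12), "
    "CURRENCY char(4) NOT NULL, TRADING_CODE smallint)"
)
sql = "INSERT INTO contract_test VALUES (?, ?, ?, ?)"
rows = [
    ("0050/TAIEX", "TAIEX", "TWD", 0),
    ("BAD/ROW", "LIF", None, 0),  # hypothetical row violating NOT NULL on CURRENCY
    ("0TL/LIF", "LIF", "NOK", 1),
]

failed = insert_rows(conn, sql, rows)
count = conn.execute("SELECT COUNT(*) FROM contract_test").fetchone()[0]
print(failed, count)
```

Returning the failed rows rather than raising makes it easier to see which records the server rejects, at the cost of one round trip per row in the fallback path.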

The external SAP ASE (Sybase) dialect is now the recommended SQLAlchemy dialect for Sybase, and it does support fast_executemany if you use the SAP ASE ODBC driver.
