
Inserting Data to SQL Server from a Python Dataframe Quickly

I have been trying to insert data from a Python dataframe into a table already created in SQL Server. The dataframe has 90K rows, and I want the best possible way to insert that data into the table quickly. I only have read, write and delete permissions on the server; I cannot create any tables.

Below is the code that inserts the data, but it is very slow. Please advise.

import pandas as pd
import pyodbc

df = pd.read_excel(r"Url path\abc.xlsx")
conn = pyodbc.connect('Driver={ODBC Driver 11 for SQL Server};'
                      'SERVER=Server Name;'
                      'Database=Database Name;'
                      'UID=User ID;'
                      'PWD=Password;'
                      'Trusted_Connection=no;')
cursor= conn.cursor()
#Deleting existing data in SQL Table:- 
cursor.execute("DELETE FROM database.schema.TableName")
conn.commit()
#Inserting data in SQL Table:- 
for index,row in df.iterrows():
    cursor.execute("INSERT INTO database.schema.TableName([A],[B],[C]) values (?,?,?)", row['A'], row['B'], row['C'])
conn.commit()
cursor.close()
conn.close()

To insert data much faster, try using sqlalchemy and df.to_sql. This requires you to create an engine with sqlalchemy; to make things even faster, use the option fast_executemany=True.

import urllib.parse
import sqlalchemy

connect_string = urllib.parse.quote_plus(f'DRIVER={{ODBC Driver 11 for SQL Server}};Server=<Server Name>,<port>;Database=<Database name>')
engine = sqlalchemy.create_engine(f'mssql+pyodbc:///?odbc_connect={connect_string}', fast_executemany=True)

with engine.connect() as connection:
    df.to_sql(<table name>, connection, if_exists='append', index=False)
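One detail worth noting for the asker's situation: by default, to_sql raises an error if the table already exists, so if_exists='append' is needed to insert into a pre-existing table. A minimal runnable sketch of that call shape, using an in-memory SQLite engine as a stand-in (the server details above are placeholders; in production you would keep the mssql+pyodbc engine with fast_executemany=True):

```python
import pandas as pd
from sqlalchemy import create_engine, text

# SQLite stands in for the SQL Server engine so this runs anywhere.
engine = create_engine("sqlite://")

df = pd.DataFrame({"A": [1, 2], "B": ["x", "y"], "C": [0.1, 0.2]})

# if_exists="append" inserts into the table as-is instead of trying to
# create or replace it -- important when you cannot create tables.
df.to_sql("TableName", engine, if_exists="append", index=False, chunksize=1000)

with engine.connect() as conn:
    count = conn.execute(text("SELECT COUNT(*) FROM TableName")).scalar()
print(count)  # prints 2
```

chunksize simply batches the insert so very large frames don't build one enormous statement in memory.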

This should do what you want... a very generic example...

# Insert from dataframe to table in SQL Server
import time
import pandas as pd
import pyodbc

# create timer
start_time = time.time()

df = pd.read_csv("C:\\your_path\\CSV1.csv")

conn_str = (
    r'DRIVER={SQL Server Native Client 11.0};'
    r'SERVER=Excel-PC\SQLEXPRESS;'
    r'DATABASE=NORTHWND;'
    r'Trusted_Connection=yes;'
)
cnxn = pyodbc.connect(conn_str)

cursor = cnxn.cursor()

for index, row in df.iterrows():
    cursor.execute('INSERT INTO dbo.Table_1([Name],[Address],[Age],[Work]) values (?,?,?,?)',
                   row['Name'],
                   row['Address'],
                   row['Age'],
                   row['Work'])
cnxn.commit()  # commit once after the loop, not per row
cursor.close()
cnxn.close()

# see total time to do insert
print("%s seconds ---" % (time.time() - start_time))

Try that and post back if you have additional questions/issues/concerns.

For one thing, replace df.iterrows() with df.apply(). Better still, remove the loop entirely for something much more efficient.

Here is the script; hope this works for you.

import pandas as pd
import pyodbc as pc

connection_string = "Driver=SQL Server;Server=localhost;Database={0};Trusted_Connection=Yes;"
cnxn = pc.connect(connection_string.format("DataBaseNameHere"), autocommit=True)
cur = cnxn.cursor()
df = pd.read_csv("your_filepath_and_filename_here.csv").fillna('')
query = 'insert into TableName({0}) values ({1})'
query = query.format(','.join(df.columns), ','.join('?' * len(df.columns)))
cur.fast_executemany = True
cur.executemany(query, df.values.tolist())
cnxn.close()
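The key step above is handing executemany one flat list of row tuples instead of looping in Python. The same conversion can also be done with itertuples; a small self-contained sketch with made-up data (the cursor calls are commented out because they need a live connection, and the table name is a placeholder):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": ["x", "y", "z"]})

# name=None yields plain tuples, exactly the parameter sequence
# shape that cursor.executemany() expects.
params = list(df.itertuples(index=False, name=None))
assert params == [(1, "x"), (2, "y"), (3, "z")]

# With a live pyodbc cursor you would then run:
# cur.fast_executemany = True
# cur.executemany("insert into TableName(A,B) values (?,?)", params)
```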

Try populating a temp table that has one index or none, then insert it into your real table all at once. This might speed things up, since the indexes don't have to be updated after each individual insert.
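That staging idea boils down to three statements. A sketch that just builds them (all table and column names here are hypothetical placeholders; SELECT TOP 0 ... INTO clones the target's columns into an index-free temp table):

```python
def staging_sql(target, columns):
    """Build the three T-SQL statements for temp-table staging."""
    col_list = ", ".join(f"[{c}]" for c in columns)
    marks = ", ".join("?" for _ in columns)
    return (
        # clone the target's column layout into an empty, index-free temp table
        f"SELECT TOP 0 {col_list} INTO #staging FROM {target}",
        # cheap bulk insert: no indexes to maintain per row
        f"INSERT INTO #staging ({col_list}) VALUES ({marks})",
        # one set-based move; the target's indexes are updated once
        f"INSERT INTO {target} ({col_list}) SELECT {col_list} FROM #staging",
    )

create_tmp, fill_tmp, flush = staging_sql("dbo.Table_1", ["A", "B", "C"])
print(fill_tmp)  # prints: INSERT INTO #staging ([A], [B], [C]) VALUES (?, ?, ?)

# With a pyodbc connection (cnxn) you would then run, roughly:
# cur = cnxn.cursor()
# cur.execute(create_tmp)
# cur.fast_executemany = True
# cur.executemany(fill_tmp, df.values.tolist())
# cur.execute(flush)
# cnxn.commit()
```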
