
Import CSV file into SQL Server using Python

I am having trouble uploading a CSV file into a table in MS SQL Server. The CSV file has 25 columns, and its header row has the same column names as the target table in SQL, which also has 25 columns. When I run the script it throws an error:

params arg (<class 'list'>) can be only a tuple or a dictionary

What is the best way to import this data into MS SQL? Both the CSV and the SQL table have exactly the same column names.

Here is the code:

import csv
import pymssql

conn = pymssql.connect(
    server="xx.xxx.xx.90",
    port = 2433,
    user='SQLAdmin',
    password='xxxxxxxx',
    database='NasrWeb'
)

cursor = conn.cursor()
customer_data = csv.reader('cleanNVG.csv') #25 columns with same header as SQL

for row in customer_data:
    cursor.execute('INSERT INTO zzzOracle_Extract([Customer Name]\
      ,[Customer #]\
      ,[Account Name]\
      ,[Identifying Address Flag]\
      ,[Address1]\
      ,[Address2]\
      ,[Address3]\
      ,[Address4]\
      ,[City]\
      ,[County]\
      ,[State]\
      ,[Postal Code]\
      ,[Country]\
      ,[Category ]\
      ,[Class]\
      ,[Reference]\
      ,[Party Status]\
      ,[Address Status]\
      ,[Site Status]\
      ,[Ship To or Bill To]\
      ,[Default Warehouse]\
      ,[Default Order Type]\
      ,[Default Shipping Method]\
      ,[Optifacts Customer Number]\
      ,[Salesperson])''VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,)',row)

conn.commit()
cursor.close()
print("Done")
conn.close()

This is what the first rows of the CSV file look like:

[screenshot of the first rows of the CSV file]

Try d6tstack, which has fast pandas-to-SQL functionality because it uses native DB import commands. It works for Postgres and MySQL; MS SQL support is experimental. Comment or raise an issue if it doesn't work.

import pandas as pd
import d6tstack.utils

df = pd.read_csv('cleanNVG.csv')
uri_mssql = 'mssql+pymssql://usr:pwd@localhost/db'
d6tstack.utils.pd_to_mssql(df, uri_mssql, 'table', 'schema') # experimental

It is also useful for importing multiple CSVs with data schema changes and/or preprocessing with pandas before writing to the database; see further down in the examples notebook.

import glob
import d6tstack.combine_csv

# apply_fun is your pandas preprocessing callback
d6tstack.combine_csv.CombinerCSV(glob.glob('*.csv'),
    apply_after_read=apply_fun).to_mssql_combine(uri_mssql, 'table')

You are using csv.reader incorrectly. The first argument to .reader is not the path to the CSV file; it is

[an] object which supports the iterator protocol and returns a string each time its __next__() method is called — file objects and list objects are both suitable.
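A quick way to see the misuse: a plain string is itself an iterable of strings, so csv.reader iterates over it character by character, and every character becomes its own one-field "row". A minimal, runnable illustration:

```python
import csv

# Passing a plain string to csv.reader treats each CHARACTER
# of the string as a separate input line.
rows = list(csv.reader('abc'))
print(rows)  # [['a'], ['b'], ['c']]
```

This is why iterating over csv.reader('cleanNVG.csv') never yields 25-column rows: it yields one single-character row per character of the filename.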

Hence, following the example in the documentation, you should be doing something like this:

import csv
with open('cleanNVG.csv', newline='') as csvfile:
    customer_data = csv.reader(csvfile)
    for row in customer_data:
        # sql is the INSERT statement from the question;
        # pymssql requires the parameters as a tuple, not a list
        cursor.execute(sql, tuple(row))

Check the data types on the table, and the sizes of each field as well. If a column is varchar(10) and your data is 20 characters long, the insert will throw an error.
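One way to catch oversized values before they reach the database is to check each field against the column sizes up front. A minimal sketch, assuming hypothetical varchar limits; the max_lengths dict and the inline sample data are made up for illustration:

```python
import csv
import io

# Hypothetical varchar(n) sizes for a couple of columns in the target table.
max_lengths = {'City': 10, 'Country': 5}

# Inline sample standing in for the real CSV file.
sample = io.StringIO('City,Country\nMinneapolis,US\n')
reader = csv.reader(sample)
columns = next(reader)

too_long = []  # collect (column, value) pairs that exceed their limit
for row in reader:
    for col, value in zip(columns, row):
        limit = max_lengths.get(col)
        if limit is not None and len(value) > limit:
            too_long.append((col, value))

print(too_long)  # [('City', 'Minneapolis')]
```

Running a pass like this over the whole file before inserting makes it obvious which rows would be rejected, instead of failing partway through the load.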

Also,

Consider building the query dynamically, to ensure that the number of placeholders matches your table and CSV file format. Then it's just a matter of ensuring your table and CSV file are correct, instead of checking that you typed enough ? placeholders in your code.

The following example assumes:

CSV file contains column names in the first line
Connection is already built
File name is test.csv
Table name is MyTable
Python 3

...
with open('test.csv', 'r') as f:
    reader = csv.reader(f)
    columns = next(reader)
    query = 'insert into MyTable({0}) values ({1})'
    query = query.format(','.join(columns), ','.join('?' * len(columns)))
    cursor = connection.cursor()
    for data in reader:
        cursor.execute(query, data)
    cursor.commit()

If column names are not included in the file:

...
with open('test.csv', 'r') as f:
    reader = csv.reader(f)
    data = next(reader)
    query = 'insert into dbo.Test values ({0})'
    query = query.format(','.join('?' * len(data)))
    cursor = connection.cursor()
    cursor.execute(query, data)
    for data in reader:
        cursor.execute(query, data)
    cursor.commit()
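For larger files, one execute call per row adds up; executemany lets the driver batch the inserts, with a single commit at the end. A sketch of the same dynamic-query idea, using sqlite3 and an in-memory table here only so it runs stand-alone (the real code would keep the pyodbc/pymssql connection and table):

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE MyTable (Name TEXT, City TEXT)')

# Inline sample standing in for test.csv.
sample = io.StringIO('Name,City\nAlice,Oslo\nBob,Cairo\n')
reader = csv.reader(sample)
columns = next(reader)
query = 'insert into MyTable({0}) values ({1})'.format(
    ','.join(columns), ','.join('?' * len(columns)))

cur.executemany(query, reader)  # reader yields one parameter sequence per row
conn.commit()

count = cur.execute('SELECT COUNT(*) FROM MyTable').fetchone()[0]
print(count)  # 2
```

Because csv.reader is itself an iterator of sequences, it can be passed straight to executemany without building an intermediate list.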

Basically, though, your code looks fine. Here is a generic sample.

cur = cnxn.cursor() # Get the cursor
csv_data = csv.reader(open('Samplefile.csv')) # Read the csv
for rows in csv_data: # Iterate through csv
    cur.execute("INSERT INTO MyTable(Col1,Col2,Col3,Col4) VALUES (?,?,?,?)", rows)
cnxn.commit()

Disclaimer: the technical posts on this site follow the CC BY-SA 4.0 license; if you repost, please credit this site or the original source.
