How do I get my values from a CSV file to load into my SQL database with Python?
I'm trying to figure out how to load my data from a CSV file into my SQL database. I am currently using sqlite3, because I couldn't install pymssql yet. This is my code so far:
import csv, sqlite3
con = sqlite3.connect("aerzte.db")
cur = con.cursor()
#cur.execute("""CREATE TABLE liste (id INTEGER PRIMARY KEY, Anrede TEXT, Titel TEXT, Titel2 TEXT, Vorname TEXT, Name TEXT, Praxis TEXT, Straße TEXT, PLZ TEXT, Ort TEXT);""")
with open('arztliste.csv', 'r') as f:
    file = csv.reader(f)
    columns = next(file)
    query = 'insert into liste({0}) values ({1})'
    query = query.format(','.join(columns), ','.join('?' * len(columns)))
    for data in file:
        cur.execute(query, data)
        cur.commit()
con.commit()
con.close()
My CSV file looks like this:
Anrede;Titel;Titel2;Vorname;Name;Praxis;Straße;PLZ;Ort;
Herr;Dr.;med.;Norbert;Braunisch;CoMedicum Landshuter Allee GmbH; Landshuter Allee 45;80637;München;
The first row is the header with the column names. After that follows the "real" data that should be inserted into those columns. I have also already created the database, the table and the columns.
I think the data can't be loaded because of the semicolons between the different column values. I already replaced them with commas, but then the semicolon that ends each line is missing.
I hope to get some advice soon. Thank you.
Using csv.DictReader makes the work simpler than using csv.reader. I also changed the delimiter from semicolons to commas; in case you are going to keep the semicolons, specify the delimiter in the reader object.
with open('arztliste.csv', 'r') as f:
    file = csv.DictReader(f)  # pass delimiter=';' here if the file keeps its semicolons
    csv_data = []
    for element in file:
        csv_data.append(element)
csv_data now contains a list of dictionaries, where the keys are the headers of your CSV file and the values are the "real" data.
Once you have the data in the right shape, it is simple to dump it into the SQL database:
# nine columns need nine placeholders; %s is the style used by drivers such as pymssql, sqlite3 expects ? instead
query = 'INSERT INTO table_name(Anrede,Titel,Titel2,Vorname,Name,Praxis,Straße,PLZ,Ort) VALUES(%s,%s,%s,%s,%s,%s,%s,%s,%s)'
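Loop through the values:
for data in csv_data:
    # pass the parameters as a single tuple, in the column order of the query;
    # the '...' stands for the remaining columns
    cur.execute(query, (data['Anrede'], data['Titel'], data['Titel2'], ..., data['Ort']))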
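For sqlite3 specifically, the same idea can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses an in-memory database and an in-memory copy of the sample line from the question (converted to commas, without a trailing delimiter) instead of aerzte.db and arztliste.csv, and it uses the ? placeholder style that sqlite3 expects.

```python
import csv
import io
import sqlite3

# One sample line from the question, with commas as the delimiter (assumption:
# the file was converted as described above and has no trailing delimiter).
sample = (
    "Anrede,Titel,Titel2,Vorname,Name,Praxis,Straße,PLZ,Ort\n"
    "Herr,Dr.,med.,Norbert,Braunisch,CoMedicum Landshuter Allee GmbH,"
    "Landshuter Allee 45,80637,München\n"
)

columns = ["Anrede", "Titel", "Titel2", "Vorname", "Name",
           "Praxis", "Straße", "PLZ", "Ort"]

con = sqlite3.connect(":memory:")  # in-memory database, just for the demo
cur = con.cursor()
cur.execute(
    "CREATE TABLE liste (id INTEGER PRIMARY KEY, "
    + ", ".join(f"{c} TEXT" for c in columns) + ")"
)

# sqlite3 uses '?' placeholders; one per column.
query = "INSERT INTO liste ({}) VALUES ({})".format(
    ", ".join(columns), ", ".join("?" * len(columns))
)

for data in csv.DictReader(io.StringIO(sample)):
    # Pass the values as ONE tuple, in the same order as the column list.
    cur.execute(query, tuple(data[c] for c in columns))

con.commit()  # commit on the connection, not on the cursor
rows = cur.execute("SELECT Vorname, Name, Ort FROM liste").fetchall()
print(rows)
con.close()
```

Building the tuple from the column list keeps the order of the values tied to the order of the column names in the query.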
The Python csv module allows you to declare the delimiter. And because your file has an additional semicolon at the end of each line, you will get an additional empty field in each row that you have to ignore.
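The effect of the trailing semicolon is easy to see in a small check. The sketch below parses the two lines from the question out of an in-memory string (an assumption made only to keep the example self-contained; the real data lives in arztliste.csv).

```python
import csv
import io

# Two lines copied from the question's CSV file, held in memory for the demo.
sample = (
    "Anrede;Titel;Titel2;Vorname;Name;Praxis;Straße;PLZ;Ort;\n"
    "Herr;Dr.;med.;Norbert;Braunisch;CoMedicum Landshuter Allee GmbH;"
    " Landshuter Allee 45;80637;München;\n"
)

reader = csv.reader(io.StringIO(sample), delimiter=";")
header = next(reader)
row = next(reader)

print(len(header), repr(header[-1]))  # 10 fields, the last one is ''
print(header[:-1])                    # the nine real column names
print(row[:-1])                       # the nine real values
```

Slicing with [:-1] drops that empty tenth field from both the header and the data rows.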
It does not make sense to commit a cursor: you only commit at the connection level. You must choose whether to commit after each line (uncommon), at the end of the file (may use memory), or after every n-th line (use a counter).
So your code should become (using this last option):
...
counter = 20 # commit every 20-th row
with open('arztliste.csv', 'r') as f:
    file = csv.reader(f, delimiter=";") # declare delimiter
    columns = next(file)[:-1] # ignore last (empty) field
    query = 'insert into liste({0}) values ({1})'
    query = query.format(','.join(columns), ','.join('?' * len(columns)))
    for data in file:
        cur.execute(query, data[:-1]) # ignore last (empty) field
        counter -= 1
        if counter == 0:
            con.commit()
            counter = 20
con.commit()
con.close()
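A possible variant of the same batching idea uses cursor.executemany on chunks of rows instead of a counter. This is a sketch, not part of the answer above: the batches helper is a hypothetical name, the database is in-memory, and the single data row from the question is simply repeated 45 times to have something to batch.

```python
import csv
import io
import sqlite3
from itertools import islice


def batches(rows, size):
    """Yield lists of at most `size` rows from an iterator (hypothetical helper)."""
    while True:
        batch = list(islice(rows, size))
        if not batch:
            return
        yield batch


header_line = "Anrede;Titel;Titel2;Vorname;Name;Praxis;Straße;PLZ;Ort;\n"
data_line = ("Herr;Dr.;med.;Norbert;Braunisch;CoMedicum Landshuter Allee GmbH;"
             " Landshuter Allee 45;80637;München;\n")
sample = header_line + data_line * 45  # the question's row repeated, demo only

con = sqlite3.connect(":memory:")  # in-memory database, just for the demo
cur = con.cursor()
cur.execute("CREATE TABLE liste (id INTEGER PRIMARY KEY, Anrede TEXT, Titel TEXT, "
            "Titel2 TEXT, Vorname TEXT, Name TEXT, Praxis TEXT, Straße TEXT, "
            "PLZ TEXT, Ort TEXT)")

reader = csv.reader(io.StringIO(sample), delimiter=";")
columns = next(reader)[:-1]  # ignore last (empty) field
query = "insert into liste({0}) values ({1})".format(
    ",".join(columns), ",".join("?" * len(columns)))

commits = 0
# Strip the empty trailing field lazily, then insert and commit 20 rows at a time.
for batch in batches((row[:-1] for row in reader), 20):
    cur.executemany(query, batch)
    con.commit()
    commits += 1

count = cur.execute("SELECT COUNT(*) FROM liste").fetchone()[0]
print(count, commits)  # 45 rows in 3 commits (20 + 20 + 5)
con.close()
```

Because the final partial batch is committed inside the loop, no extra commit is needed after it.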