The following code fails to run.
It goes through a CSV file, retrieves the values and formats them into a list of tuples (for an INSERT query) to be used later. The problem is that the last column of the CSV is sometimes a string and sometimes empty (as shown in the CSV sample below). The error follows. Can anyone help me with this?
    def csv_to_DB(csv_input):
        with open(csv_input, newline='') as csvfile:
            csv_data = csv.reader(csvfile, delimiter=',', quotechar='"')
            to_insert = []  # will be list of tuples
            insert_str = "INSERT INTO table (ID, user, user_version, value, description) VALUES "
            template = "('%s', '%s', '%s', '%s', '%s')"
            for row in csv_data:
                to_insert.append(tuple(row))  # convert row/list to tuple and add it to list
                query = insert_str + '\n'.join(template % to_insert)
                # use query for other operations...
CSV sample:
1,aaa,1,0.0,
2,bbb,1,0.13,
3,ccc,1,0.0,
4,ddd,3,1.0,Rom
5,eee,1,0.08,
Error:
query = insert_str + '\n'.join(template % to_insert)
TypeError: not enough arguments for format string
Note: this question is a followup from this question
UPDATE
To clarify: the goal is to create one INSERT with several values instead of several inserts. In this case:
INSERT INTO table (ID, user, user_version, value, description) VALUES
('1', 'aaa', '1', '0.0', ''),
('2', 'bbb', '1', '0.13', ''),
('3', 'ccc', '1', '0.0', ''),
('4', 'ddd', '3', '1.0', 'Rom'),
('5', 'eee', '1', '0.08', '')
to_insert will be:
[('1', 'aaa', '1', '0.0', ''), ('2', 'bbb', '1', '0.13', ''), ('3', 'ccc', '1', '0.0', ''), ('4', 'ddd', '3', '1.0', 'Rom'), ('5', 'eee', '1', '0.08', '')]
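A minimal sketch of how that single statement could be assembled from to_insert: apply the template to each row separately, then join the formatted rows with commas. The helper name build_insert and its default arguments are hypothetical, and plain string formatting like this does no escaping, so it is only reasonable for trusted data.

```python
def build_insert(rows, table="table",
                 columns="ID, user, user_version, value, description"):
    """Build one multi-row INSERT statement from a list of 5-tuples.

    build_insert is a hypothetical helper, not part of the original code.
    """
    template = "('%s', '%s', '%s', '%s', '%s')"
    header = "INSERT INTO {} ({}) VALUES\n".format(table, columns)
    # Format each row on its own, then join the rows with a comma and newline.
    return header + ",\n".join(template % row for row in rows)


to_insert = [('1', 'aaa', '1', '0.0', ''), ('4', 'ddd', '3', '1.0', 'Rom')]
query = build_insert(to_insert)
print(query)
```

The key point is that % is applied once per row tuple, never to the whole list.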
The desired output can be achieved with simple string additions without the need for a string template:
    def xing_csv_to_crmDB2(csv_input):
        query = ''
        with open(csv_input, 'r', newline='') as csvfile:
            csv_data = csv.reader(csvfile, delimiter=',', quotechar='"')
            insert_str = "INSERT INTO table (ID, user, user_version, value, description) VALUES "
            for row in csv_data:
                query += '\n' + str(tuple(row))
            insert_str += query
            # do something
This produces the following output:
INSERT INTO table (ID, user, user_version, value, description) VALUES
('1', 'aaa', '1', '0.0', '')
('2', 'bbb', '1', '0.13', '')
('3', 'ccc', '1', '0.0', '')
('4', 'ddd', '3', '1.0', 'Rom')
('5', 'eee', '1', '0.08', '')
UPDATE : according to @Tomerikoo's idea, an even more simplified version:
    def xing_csv_to_crmDB2(csv_input):
        with open(csv_input, 'r', newline='') as csvfile:
            csv_data = csv.reader(csvfile, delimiter=',', quotechar='"')
            insert_str = "INSERT INTO table (ID, user, user_version, value, description) VALUES "
            for row in csv_data:
                insert_str += '\n' + str(tuple(row))
            # do something
The output is still the same.
Your problem is with this expression:
    (template % to_insert)
template expects 5 arguments, but to_insert always counts as 1: since it is a list, it is treated as a single argument. Changing to_insert to tuple(to_insert) and moving the query outside the loop will solve your problem, depending on what you're trying to get.
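The difference between a list and a tuple on the right-hand side of % can be seen in isolation. This is only a small sketch; the variable names row, list_error and formatted are illustrative.

```python
template = "('%s', '%s', '%s', '%s', '%s')"
row = ['1', 'aaa', '1', '0.0', '']

# A list is treated as ONE argument, so four placeholders are left unfilled.
try:
    template % row
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# A tuple is unpacked, one element per placeholder.
formatted = template % tuple(row)

print(list_error)
print(formatted)
```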
Try changing your loop to something like this:
    for row in csv_data:
        to_insert.append(tuple(row))  # convert row/list to tuple and add it to list
    query = insert_str + '\n'.join(template % tuple(to_insert))
UPDATE: according to @JonyD's update, the template is simply not necessary, since it forces you to have exactly 5 rows. Additionally, you would want to pass join() a list and not a string. What you should do is:
    import csv

    def csv_to_DB(csv_input):
        with open(csv_input, newline='') as csvfile:
            csv_data = csv.reader(csvfile, delimiter=',', quotechar='"')
            to_insert = []  # will be list of tuples
            insert_str = "INSERT INTO table (ID, user, user_version, value, description) VALUES\n"
            for row in csv_data:
                to_insert.append(tuple(row))  # convert row/list to tuple and add it to list
            # join() only accepts strings, so convert each tuple with str() first
            query = insert_str + ',\n'.join(str(t) for t in to_insert)
Here is the answer to what I wanted. Feel free to use it. It's very fast: inserting 3.8 million records into an RDS MySQL instance takes 2 minutes with block_size=10000. Thanks to torresmateo.
    import csv
    import pymysql

    def csv2mysql(csv_input, db_opts, insert_conf, block_size=1000):
        """
        :param csv_input: the input csv file path
        :param db_opts: is a dictionary. Should be like the following example:
            tvnow_db_opts = {
                'user': db_conn.login,
                'password': db_conn.password,
                'host': db_conn.host,
                'database': db_conn.schema
            }
        :param insert_conf: see explanation below
            insert_conf = {
                'table_name': 'my_table',
                'columns': 'ID, field1, field2, field3, field_4',
                'values_template': "('%s', '%s', '%s', '%s', '%s')"
            }
            table_name: DB table name where data will be inserted
            columns: columns corresponding to csv; separated by comma.
                Example: "ID, field1, field2, field3, field_4"
            values_template: string with the following format "('%s', '%s', '%s', '%s', '%s')". The number of '%s' must be the same as the number of fields in the csv/columns in the table
        :param block_size: number of rows/records to be inserted per sql insert command. Default 1000
        """
        print("Inserting csv file {} to database {}".format(csv_input, db_opts['host']))
        conn = pymysql.connect(**db_opts)
        cur = conn.cursor()
        try:
            with open(csv_input, newline='') as csvfile:
                csv_data = csv.reader(csvfile, delimiter=',', quotechar='"')
                to_insert = []  # will be list of tuples
                # insert_conf is a dict, so its entries are read by key, not by attribute
                insert_str = "INSERT INTO {} ({}) VALUES ".format(insert_conf['table_name'], insert_conf['columns'])
                count = 0
                for row in csv_data:
                    count += 1
                    to_insert.append(tuple(row))  # convert row/list to tuple and add it to list
                    if count % block_size == 0:
                        query = insert_str + ',\n'.join([insert_conf['values_template'] % r for r in to_insert])
                        cur.execute(query)
                        to_insert = []
                        conn.commit()
                # commit/insert the remaining rows
                if len(to_insert) > 0:
                    query = insert_str + ',\n'.join([insert_conf['values_template'] % r for r in to_insert])
                    cur.execute(query)
                    conn.commit()
        finally:
            conn.close()
        print('Finished inserting csv file to database')
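Because the values are spliced into the SQL text with %, a field that contains a quote character would break the statement. A possible alternative, sketched here, is to let the driver bind the values through executemany instead of a string template. The sketch uses the standard-library sqlite3 module with an in-memory database purely so that it runs without a MySQL server; the helper name insert_in_blocks and the table layout are assumptions. With pymysql the placeholder would be %s rather than ?.

```python
import csv
import io
import sqlite3
from itertools import islice


def insert_in_blocks(conn, rows, block_size=1000):
    """Insert rows in blocks using bound parameters (hypothetical helper)."""
    sql = "INSERT INTO my_table VALUES (?, ?, ?, ?, ?)"
    rows = iter(rows)
    total = 0
    while True:
        block = list(islice(rows, block_size))  # take the next block_size rows
        if not block:
            break
        conn.executemany(sql, block)  # the driver handles quoting
        conn.commit()
        total += len(block)
    return total


# In-memory stand-ins for the csv file and the MySQL table (assumptions).
sample = "1,aaa,1,0.0,\n4,ddd,3,1.0,Rom\n5,e'e,1,0.08,\n"
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (ID, field1, field2, field3, field_4)")
inserted = insert_in_blocks(conn, csv.reader(io.StringIO(sample)), block_size=2)
stored = conn.execute("SELECT field1, field_4 FROM my_table ORDER BY ID").fetchall()
print(inserted, stored)
```

The third sample row contains a single quote on purpose: it is stored unchanged, whereas the template approach would produce invalid SQL for it.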
template = "('%s', '%s', '%s', '%s', '%s')" - FIVE arguments!
But you use:
1,aaa,1,0.0, - 4 args ( ERROR )
2,bbb,1,0.13, - 5 args ( OK )
3,ccc,1,0.0, - 4 args ( ERROR )
4,ddd,3,1.0,Rom - 5 args ( OK )
5,eee,1,0.08, - 4 args ( ERROR )
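A quick way to check these counts is to run the sample through csv.reader and look at the length of each row. In this sketch the sample is read from an in-memory string rather than a file. Note that csv.reader returns an empty string for an empty trailing field, so every sample row comes back with five fields, which matches the to_insert list shown in the question's update.

```python
import csv
import io

# Three rows from the question's sample, two of them with an empty last column.
sample = "1,aaa,1,0.0,\n2,bbb,1,0.13,\n4,ddd,3,1.0,Rom\n"
rows = list(csv.reader(io.StringIO(sample)))
lengths = [len(row) for row in rows]

print(rows[0])
print(lengths)
```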