Import Large CSV File into MySQL using Python

I'm trying to import one column of a large CSV file into MySQL using Python 3.7. This is a test run before importing the rest of the columns.

For now, I can't even get the one column into the database, and I was hoping to find some help.

I have set up a database with one table and only one field for the test data:

mysql> use aws_bill
Database changed

mysql> show tables;
+--------------------+
| Tables_in_aws_bill |
+--------------------+
| billing_info       |
+--------------------+

mysql> desc billing_info;
+----------+---------+------+-----+---------+-------+
| Field    | Type    | Null | Key | Default | Extra |
+----------+---------+------+-----+---------+-------+
| RecordId | int(11) | NO   |     | NULL    |       |
+----------+---------+------+-----+---------+-------+

When I run my code:

mydb = mysql.connector.connect(user='xxxx', password='xxxxx',
                            host='xxxxx',
                            database='aws_bill')
cursor = mydb.cursor()
try:
    with open(source) as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=',')
        sql = "INSERT INTO billing_info (RecordId) VALUES (%s)"
        for row in csv_reader:
            row = (', '.join(row))
            print(row)
            cursor.execute(sql, row)
except:
    mydb.rollback()
finally:
    mydb.close()

Only ONE line of the CSV column gets printed out:

python3 .\aws_billing.py
200176595756546201775238333

And nothing makes it into the database:

mysql> select RecordId from billing_info;
Empty set (0.00 sec)

If I comment out the SQL insert statement, cursor.execute(sql, row), then all of the lines of the CSV print out:

203528424494971448426778962
203529863341009197771806423
203529974021473640029260511
203530250722634745672445063
203525214761502622966710100
203525122527782254417348410
203529365278919207614044035
...continues to the end of the file

But none of the data makes it into the database, of course, because the SQL line is commented out. At least all of the lines of the CSV are printing out now; however, putting them into the database would be good!

Why is this happening? How can I get all the lines of the CSV into the database?
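A likely cause, offered as an assumption rather than a confirmed diagnosis: (', '.join(row)) produces a plain string, while cursor.execute expects its parameters as a tuple, list, or dict. The exception raised on the first row is then swallowed by the bare except:, which would explain why one value prints and nothing else happens. Separately, the 27-digit values are far larger than an INT or even a BIGINT column can hold, so the column would probably need a type such as DECIMAL(27,0) or VARCHAR. The snippet below checks both points in plain Python, using two values copied from the output above as a stand-in for the real file.

```python
import csv
import io

# Stand-in for the real file: two values copied from the question's output.
sample = io.StringIO("203528424494971448426778962\n203529863341009197771806423\n")

INT_MAX = 2**31 - 1      # largest value of a signed MySQL INT
BIGINT_MAX = 2**63 - 1   # largest value of a signed MySQL BIGINT

for row in csv.reader(sample):
    joined = (', '.join(row))   # parentheses alone do not make a tuple
    params = (row[0],)          # the trailing comma does
    print(type(joined).__name__, type(params).__name__)
    # Both comparisons are True: the value fits neither column type.
    print(int(row[0]) > INT_MAX, int(row[0]) > BIGINT_MAX)
```

Replacing the bare except: with except Exception as exc: followed by print(exc) would show the actual error message instead of hiding it.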

You can do it like this:

Change this line sql = "INSERT INTO billing_info (RecordId) VALUES (%s)" to

sql = "INSERT INTO billing_info (RecordId) VALUES ({})"

And this one: cursor.execute(sql, row) to cursor.execute(sql.format(row))
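An alternative that avoids building the SQL text with str.format (which leaves the statement open to SQL injection if the CSV is not trusted) is to keep the %s placeholder and pass each value as a one-element tuple. The sketch below is a minimal example under several assumptions: the function names insert_first_column and main, the batch size of 1000, and the choice of executemany are illustrative and not from the original post, and the RecordId column is assumed to have been widened so the 27-digit values fit. It also calls commit(), without which the inserts may never become visible.

```python
from itertools import islice

# Same statement as in the question; %s is filled in by the driver.
SQL = "INSERT INTO billing_info (RecordId) VALUES (%s)"


def insert_first_column(connection, rows, batch_size=1000):
    """Insert the first field of every non-empty row; return the row count.

    `connection` is a DB-API connection (mysql.connector is assumed) and
    `rows` is an iterable of CSV rows such as a csv.reader.
    """
    cursor = connection.cursor()
    # One-element tuples, so the driver binds each value itself.
    params = ((row[0],) for row in rows if row)
    total = 0
    try:
        while True:
            batch = list(islice(params, batch_size))
            if not batch:
                break
            cursor.executemany(SQL, batch)
            total += len(batch)
        connection.commit()      # make the inserts permanent
    except Exception:
        connection.rollback()
        raise                    # re-raise so the error is not hidden
    finally:
        cursor.close()
    return total


def main(source):
    """Hypothetical entry point; needs mysql-connector-python and a server."""
    import csv
    import mysql.connector

    mydb = mysql.connector.connect(user='xxxx', password='xxxxx',
                                   host='xxxxx', database='aws_bill')
    try:
        with open(source, newline='') as csv_file:
            count = insert_first_column(mydb, csv.reader(csv_file))
            print(count, 'rows inserted')
    finally:
        mydb.close()
```

Reading the file lazily and inserting in batches keeps memory use flat for a large CSV. For very large files, MySQL's LOAD DATA statement may be faster still, though that depends on server configuration.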
