Import huge data set csv in MySQL
I'm trying to import a huge data set from a csv file, ~400MB with 900000 rows. This file has the information of two relational tables. For example:

["primary_key","name","lastname","phone","work_id","work_name"]
For every row I have to check whether the primary key exists, and insert or update as needed. I also need to verify the work, because new works can appear in this dataset.

My person table has more columns than the csv file has, so I can't simply replace the rows with mysqlimport.

Any ideas on how to work with this?
Please read the documentation for LOAD DATA INFILE; it is a good choice for loading data, even very big files. Quoting from the Reference Manual, Speed of INSERT Statements:
When loading a table from a text file, use LOAD DATA INFILE. This is usually 20 times faster than using INSERT statements.
Assuming that your table has more columns than the .csv file, you'd have to write something like this:
load data local infile 'path/to/your/file.csv'
into table yourTable
fields terminated by ',' optionally enclosed by '"' lines terminated by '\n'
ignore 1 lines -- if it has column headers
(col1, col2, col3, ...) -- The matching column list goes here
See my own question on the subject and its answer.
Also, if you need faster inserts, you can execute

SET foreign_key_checks = 0;

before executing load data, and/or disable the table's indexes with

alter table yourTable disable keys;

before executing load data, rebuilding them afterwards with

alter table yourTable enable keys;
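Putting those speed tips together, a minimal sketch of the whole sequence might look like the following (table, file, and column names are placeholders, not from the question; note that DISABLE KEYS only affects non-unique indexes, and on InnoDB tables it has no effect):

```sql
-- Sketch: relax checks, disable index maintenance, bulk load, then restore.
SET foreign_key_checks = 0;

ALTER TABLE yourTable DISABLE KEYS;

LOAD DATA LOCAL INFILE 'path/to/your/file.csv'
INTO TABLE yourTable
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES          -- if it has column headers
(col1, col2, col3);     -- matching column list

ALTER TABLE yourTable ENABLE KEYS;  -- rebuilds the disabled indexes

SET foreign_key_checks = 1;
```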
Untested: If your .csv file has more columns than your table, I think that you can assign the "exceeding" columns in the file to temp variables:
load data local infile 'path/to/your/file.csv'
into table yourTable
fields terminated by ',' optionally enclosed by '"' lines terminated by '\n'
ignore 1 lines -- if it has column headers
(col1, col2, col3, @dummyVar1, @dummyVar2, col4) -- The `@dummyVarX` variables
-- are simply place-holders for
-- columns in the .csv file that
-- don't match the columns in
-- your table
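Since the question also needs insert-or-update per primary key, one common approach is to LOAD DATA into a staging table and then upsert from it. This is a sketch under assumptions: the staging table definition, column types, and the person/work table names here are hypothetical, inferred from the csv layout in the question.

```sql
-- Sketch: load the raw csv into a staging table, then upsert from it.
-- All names and types below are illustrative.
CREATE TABLE staging (
  primary_key INT,
  name        VARCHAR(255),
  lastname    VARCHAR(255),
  phone       VARCHAR(32),
  work_id     INT,
  work_name   VARCHAR(255)
);

LOAD DATA LOCAL INFILE 'path/to/your/file.csv'
INTO TABLE staging
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(primary_key, name, lastname, phone, work_id, work_name);

-- Insert any new works first, so person.work_id stays valid:
INSERT IGNORE INTO work (work_id, work_name)
SELECT DISTINCT work_id, work_name FROM staging;

-- Then insert-or-update each person on its primary key:
INSERT INTO person (primary_key, name, lastname, phone, work_id)
SELECT primary_key, name, lastname, phone, work_id FROM staging
ON DUPLICATE KEY UPDATE
  name     = VALUES(name),
  lastname = VALUES(lastname),
  phone    = VALUES(phone),
  work_id  = VALUES(work_id);

DROP TABLE staging;
```

This keeps the fast bulk load while letting the database handle the exists-or-update check in one set-based statement instead of 900000 individual lookups.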