
Import Multiple CSV files into MySQL with Python

I am trying to use examples that are fun and interesting to me in order to learn MySQL and Python. I am new to both, as I am a finance guy by trade. I have learned quite a bit along the way, but I have reached a point where I am not sure where to go next.

The data files I am looking to import are contained in a zip file, hosted here.

I extracted the directory, and there are well over 1000 files. In the code below, I am simply trying to reference one of the files individually and load it into MySQL.

I am fully aware that there has to be an easier way to go about this, but I am learning, and I am sure my code reflects it (you can see that I typed in some basic prints to check that my code was correct).

Any help you can provide will be GREATLY appreciated. Essentially, when it comes to Python, I am looking at it as a hobby and a way to get at the large amount of data on the web that I want to play with. I pasted my code below so you can see where I am coming from.

  • Brock

P.S. Sorry about the code below; I can't figure out a better way to post it.

The code below is my create table script:

DROP TABLE IF EXISTS `nba`.`event`;
CREATE TABLE  `nba`.`event` (
  `a1` varchar(45) DEFAULT NULL,
  `a2` varchar(45) DEFAULT NULL,
  `a3` varchar(45) DEFAULT NULL,
  `a4` varchar(45) DEFAULT NULL,
  `a5` varchar(45) DEFAULT NULL,
  `h1` varchar(45) DEFAULT NULL,
  `h2` varchar(45) DEFAULT NULL,
  `h3` varchar(45) DEFAULT NULL,
  `h4` varchar(45) DEFAULT NULL,
  `h5` varchar(45) DEFAULT NULL,
  `period` int(11) DEFAULT NULL,
  `time` time DEFAULT NULL,
  `team` varchar(3) DEFAULT NULL,
  `etype` varchar(15) DEFAULT NULL,
  `assist` varchar(45) DEFAULT NULL,
  `away` varchar(45) DEFAULT NULL,
  `block` varchar(45) DEFAULT NULL,
  `entered` varchar(45) DEFAULT NULL,
  `home` varchar(45) DEFAULT NULL,
  `left` varchar(45) DEFAULT NULL,
  `num` int(11) DEFAULT NULL,
  `opponent` varchar(45) DEFAULT NULL,
  `outof` varchar(45) DEFAULT NULL,
  `player` varchar(45) DEFAULT NULL,
  `points` int(11) DEFAULT NULL,
  `possession` varchar(45) DEFAULT NULL,
  `reason` varchar(50) DEFAULT NULL,
  `result` varchar(10) DEFAULT NULL,
  `steal` varchar(45) DEFAULT NULL,
  `type` varchar(30) DEFAULT NULL,
  `x` varchar(2) DEFAULT NULL,
  `y` varchar(2) DEFAULT NULL,
  `gameid` varchar(15) NOT NULL,
  `seqnum` int(11) NOT NULL AUTO_INCREMENT,
  `updated` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`seqnum`,`gameid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Datafiles of play by play from basketballgeek.com';

Here is my Python script. I put the table above in a schema called NBA. I set my table up so that, I believe, a sequence number is created for every record entered into the db. I pass it the gameid as a string, and I also assume that the timestamp will be created for every insert statement. I know there is an error, but I can't figure out what it is.

# `path`, `db`, and `cursor` are defined earlier in the script (not shown).
sql = """LOAD DATA INFILE '%s' INTO TABLE event FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n';""" % path
print(sql)
try:
    cursor.execute(sql)
    db.commit()
except:
    print("ERROR")
    db.rollback()
db.close()
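The table above has three columns that would not come from a CSV file itself: `gameid` (NOT NULL), `seqnum` (AUTO_INCREMENT), and `updated` (a timestamp declared `NULL DEFAULT NULL`, so it is not filled in automatically). A minimal sketch of one way to handle that is to list the 32 data columns explicitly and supply the other values with a `SET` clause. It assumes each file holds the data columns in table order, that the game id is the file name without its extension, and that the header row (if any) is skipped via the `skip_header` flag. None of that is stated in the question, and the helper names `game_id_from_path` and `build_event_load` are hypothetical.

```python
import os

# The 32 data columns of `nba`.`event`, in table order. They are wrapped in
# backticks below because `left` is a reserved word in MySQL.
DATA_COLUMNS = (
    ["a%d" % i for i in range(1, 6)]
    + ["h%d" % i for i in range(1, 6)]
    + ["period", "time", "team", "etype", "assist", "away", "block",
       "entered", "home", "left", "num", "opponent", "outof", "player",
       "points", "possession", "reason", "result", "steal", "type", "x", "y"]
)


def game_id_from_path(path):
    """Assumption: the game id is the file name without its extension."""
    return os.path.splitext(os.path.basename(path))[0]


def build_event_load(path, skip_header=False):
    """Return (sql, params) intended for cursor.execute(sql, params)."""
    columns = ", ".join("`%s`" % name for name in DATA_COLUMNS)
    sql = (
        "LOAD DATA LOCAL INFILE %s INTO TABLE event "
        "FIELDS TERMINATED BY ',' LINES TERMINATED BY '\\n' "
        + ("IGNORE 1 LINES " if skip_header else "")
        + "(" + columns + ") "
        "SET gameid = %s, updated = NOW()"
    )
    # The two %s placeholders are the file path and the game id.
    return sql, (path, game_id_from_path(path))
```

With a driver such as MySQLdb, the result would be used as `cursor.execute(*build_event_load(path))`. `seqnum` is left out on purpose so that AUTO_INCREMENT assigns it.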

Why not use MySQL's own CSV import functionality?

http://dev.mysql.com/doc/refman/5.0/en/load-data.html

LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n';
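Since the question involves well over 1000 files, the statement above can be generated once per file. The following is only a sketch: the function names, the `*.csv` pattern, and the default table name `event` are assumptions for illustration, and nothing here talks to the database.

```python
import glob
import os

LOAD_TEMPLATE = (
    "LOAD DATA {local}INFILE '{path}' INTO TABLE {table} "
    "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
    "LINES TERMINATED BY '\\n'"
)


def sql_path(path):
    """Return a path that can be embedded in a single-quoted SQL string."""
    # MySQL treats backslash as an escape character, so use forward slashes,
    # and double any single quote so it cannot end the string literal early.
    return path.replace("\\", "/").replace("'", "''")


def build_load_statement(path, table="event", local=False):
    """Build one LOAD DATA statement for a single CSV file."""
    return LOAD_TEMPLATE.format(
        local="LOCAL " if local else "",
        path=sql_path(path),
        table=table,
    )


def statements_for_directory(directory, table="event", local=False):
    """Return one statement per *.csv file in `directory`, sorted by name."""
    pattern = os.path.join(directory, "*.csv")
    return [build_load_statement(p, table, local)
            for p in sorted(glob.glob(pattern))]
```

Each returned string would then be passed to `cursor.execute()` in a loop, with a single `db.commit()` at the end or one per file, depending on how much work you are willing to redo after a failure.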

You'll want to output the actual error, like this:

try:
    cursor.execute(sql)
    db.commit()
except Exception as e:
    print(e)
    db.rollback()

For example, when I execute the above, I get the following output:

(29, "File '/opt/mysql/data/51/test/data.csv' not found (Errcode: 2)")

Note that you want to specify the LOCAL keyword for LOAD DATA. Without it, MySQL assumes that the file is located on the machine running your MySQL server, which might not be the same as the one running your script.
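LOCAL usually also has to be allowed on both ends: the server has a `local_infile` system variable, and the client connection has to opt in. A hedged sketch for MySQLdb, whose `connect()` accepts a `local_infile` argument, is shown below. The helper names are hypothetical, and the credentials in the usage comment are placeholders.

```python
def local_infile_options(host, user, password, database):
    """Connection keyword arguments with client-side LOAD DATA LOCAL enabled."""
    return {
        "host": host,
        "user": user,
        "passwd": password,
        "db": database,
        "local_infile": 1,  # allow LOAD DATA LOCAL INFILE from this client
    }


def connect_for_local_load(**options):
    """Open a MySQLdb connection; requires the MySQLdb (mysqlclient) package."""
    import MySQLdb  # imported lazily so the helper above works without it
    return MySQLdb.connect(**options)


# Usage with placeholder credentials (not from the question):
# db = connect_for_local_load(
#     **local_infile_options("localhost", "user", "secret", "nba"))
```

If the server still rejects the statement, checking the value of the server's `local_infile` variable is a reasonable next step.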

I guess you have an error similar to the one I had.

Try

LOAD DATA  LOCAL INFILE 'data.txt' INTO TABLE tbl_name
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n';

instead of

LOAD DATA INFILE 'data.txt' INTO TABLE tbl_name
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n';

The LOCAL keyword makes sure the file is read from the client machine rather than from the server.

