
Unable to read a large CSV in Python

I'm using Python 2.7 and trying to read the entries of a CSV file. I made a separate version of the original CSV that contains only the first 10 rows of data, and with the following code it works the way I want: I can edit the slice of Z passed to genfromtxt's "usecols" argument to read a specific range of columns from my CSV.

import numpy as np
import array

Z = array.array('i', (i for i in range(0, 40)))

with open('data/training_edit.csv','r') as f:
    data = np.genfromtxt(f, dtype=float, delimiter=',', names=True, usecols=(Z[0:32])) 
print(data)

But when I use this code with my original CSV (250,000 rows x 33 columns), I get the following output and I don't know why:

Traceback (most recent call last):
File "/home/user/PycharmProjects/H-B2/Read.py", line 74, in <module>
data = np.genfromtxt(f, dtype=float, delimiter=',', names=True,usecols=(Z[0:32]))
File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 1667, in genfromtxt  
raise ValueError(errmsg)
ValueError: Some errors were detected !

.
.
.   
Line #249991 (got 1 columns instead of 32)
Line #249992 (got 1 columns instead of 32)
Line #249993 (got 1 columns instead of 32)
Line #249994 (got 1 columns instead of 32)
Line #249995 (got 1 columns instead of 32)
Line #249996 (got 1 columns instead of 32)
Line #249997 (got 1 columns instead of 32)
Line #249998 (got 1 columns instead of 32)
Line #249999 (got 1 columns instead of 32)
Line #250000 (got 1 columns instead of 32)

Process finished with exit code 1

(I added the dots just to shorten the real output, but hopefully you get the point.)
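A message such as "got 1 columns instead of 32" means genfromtxt split that line into a single field rather than the expected number. One way to see which lines are affected is to count the fields per line with the standard csv module. The following is only a minimal sketch, written for Python 3: the helper name `find_bad_rows` and the in-memory sample data are hypothetical and not taken from the original file.

```python
import csv
import io


def find_bad_rows(lines, expected_columns, delimiter=','):
    """Return (line_number, field_count) for rows whose field count differs."""
    bad = []
    reader = csv.reader(lines, delimiter=delimiter)
    for line_number, row in enumerate(reader, start=1):
        if len(row) != expected_columns:
            bad.append((line_number, len(row)))
    return bad


# Hypothetical sample: line 3 contains no commas, so it parses as one field.
sample = io.StringIO(u"a,b,c\n1,2,3\n4;5;6\n7,8,9\n")
bad_rows = find_bad_rows(sample, expected_columns=3)
print(bad_rows)
```

For the real file, an open file object can be passed in place of the sample. Separately, genfromtxt accepts `invalid_raise=False`, which emits a warning and skips lines with the wrong number of columns instead of raising; that hides the offending lines rather than explaining them.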

Yes, I believe you just need to pass range(0,32) as usecols, as follows:

data = np.genfromtxt(f, dtype=float, delimiter=',', names=True, usecols=range(0,32))

I just figured that out for myself.
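To illustrate selecting columns with a plain range, here is a small self-contained sketch. It assumes Python 3 and a hypothetical 4-column CSV held in memory (the real file has 33 columns), so the column names and values are made up for the example.

```python
import io

import numpy as np

# Hypothetical 4-column CSV; stands in for the much larger real file.
csv_text = u"w,x,y,z\n1,2,3,4\n5,6,7,8\n"

# Read only the first three columns, selected with a plain range.
data = np.genfromtxt(io.StringIO(csv_text), dtype=float, delimiter=',',
                     names=True, usecols=range(0, 3))

print(data.dtype.names)  # names of the columns that were kept
print(data['x'])         # values of the column named "x"
```

Because names=True is set, the result is a structured array, so the selected columns are accessed by header name rather than by position.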
