Unable to read a large CSV in Python

Question

I'm using Python 2.7 and trying to read the entries of a CSV file. I made a separate version of the original CSV that contains only the first 10 rows of data. With the following code it works the way I want: I can change the slice of Z passed to genfromtxt's "usecols" argument to read a specific range of columns from the CSV.

import numpy as np
import array

Z = array.array('i', (i for i in range(0, 40)))

with open('data/training_edit.csv','r') as f:
    data = np.genfromtxt(f, dtype=float, delimiter=',', names=True, usecols=(Z[0:32])) 
print(data)

But when I run this code on my original CSV (250,000 rows x 33 columns), I get the following output, and I don't know why:

Traceback (most recent call last):
  File "/home/user/PycharmProjects/H-B2/Read.py", line 74, in <module>
    data = np.genfromtxt(f, dtype=float, delimiter=',', names=True,usecols=(Z[0:32]))
  File "/usr/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 1667, in genfromtxt
    raise ValueError(errmsg)
ValueError: Some errors were detected !

.
.
.   
Line #249991 (got 1 columns instead of 32)
Line #249992 (got 1 columns instead of 32)
Line #249993 (got 1 columns instead of 32)
Line #249994 (got 1 columns instead of 32)
Line #249995 (got 1 columns instead of 32)
Line #249996 (got 1 columns instead of 32)
Line #249997 (got 1 columns instead of 32)
Line #249998 (got 1 columns instead of 32)
Line #249999 (got 1 columns instead of 32)
Line #250000 (got 1 columns instead of 32)

Process finished with exit code 1

(I added the dots just to shorten the real output but you hopefully get the point)
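
The message "got 1 columns instead of 32" means that splitting those lines on ',' produced a single field. One way to check this is to tally how many fields each line yields, independently of NumPy. The following is a minimal sketch, not a definitive diagnosis: the helper name count_fields and the sample lines are hypothetical, and it assumes a plain comma-delimited file with no quoted fields.

```python
from collections import Counter


def count_fields(lines, delimiter=','):
    """Tally how many lines split into each number of fields (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line:  # blank lines are ignored in this sketch
            counts[len(line.split(delimiter))] += 1
    return counts


# Made-up stand-in for the real file: two well-formed lines and one
# line that contains no commas at all.
sample_lines = ["a,b,c", "1,2,3", "4;5;6", ""]
field_counts = count_fields(sample_lines)
print(field_counts)

# On the real file this could be run as (path taken from the question):
# with open('data/training_edit.csv', 'r') as f:
#     print(count_fields(f))
```

If the tally for the full file shows lines with 1 field, those lines do not contain the expected delimiter, and opening the file at the reported line numbers should show what they actually contain.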

Answer 1

Yes, I believe you just need to pass range(0,32) as usecols, as follows:

data = np.genfromtxt(f, dtype=float, delimiter=',', names=True,usecols=range(0,32))

I just figured that out for myself.
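
To try the range-based usecols call in isolation, here is a small self-contained sketch. The 4-column CSV text is made up for illustration and read from memory instead of from the real file, so it only shows how usecols selects the leading columns; it assumes a reasonably recent NumPy that accepts text streams.

```python
import io

import numpy as np

# Hypothetical 4-column CSV standing in for the real 33-column file.
csv_text = u"a,b,c,d\n1,2,3,4\n5,6,7,8\n"

# Read only the first three columns; list(range(...)) works on both
# Python 2 and Python 3.
data = np.genfromtxt(io.StringIO(csv_text), dtype=float, delimiter=',',
                     names=True, usecols=list(range(0, 3)))

print(data.dtype.names)
print(data['b'])
```

With names=True the result is a structured array, so each column is accessed by its header name, as in data['b'].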

solution1
0 2014-10-16 22:17:33
