Column y below should be ['Reg', 'Reg', 'Swp', 'Swp']
In [1]: pd.read_csv('/tmp/test3.csv')
Out[1]:
     x    y
0
1  NaN  NaN
2    I  Swp
3    I  Swp
In [2]: ! cat /tmp/test3.csv
x,y
^@^@^@,Reg
^@^@^@,Reg
I,Swp
I,Swp
In [3]: f = open('/tmp/test3.csv', 'rb'); print(repr(f.read()))
'x,y\n \x00\x00\x00,Reg\n \x00\x00\x00,Reg\nI,Swp\nI,Swp\n'
Yes, I could reproduce the problem, but I don't know how to fix it with pd.read_csv. Here is a workaround using np.genfromtxt:
In [46]: import numpy as np
In [47]: arr = np.genfromtxt('test3.csv', delimiter = ',', dtype = None, names = True)
In [48]: df = pd.DataFrame(arr)
In [49]: df
Out[49]:
   x    y
0     Reg
1     Reg
2  I  Swp
3  I  Swp
Note that with names = True, the first valid line of the CSV is interpreted as column names (and therefore does not affect the dtype of the values on the subsequent lines). Thus, if the CSV file contains numerical data such as
In [22]: with open('/tmp/test.csv','r') as f:
....: print(repr(f.read()))
....:
'x,y,z\n \x00\x00\x00,Reg,1\n \x00\x00\x00,Reg,2\nI,Swp,3\nI,Swp,4\n'
then genfromtxt will assign a numerical dtype to the third column (<i4 in this case).
In [19]: arr = np.genfromtxt('/tmp/test.csv', delimiter = ',', dtype = None, names = True)
In [20]: arr
Out[20]:
array([('', 'Reg', 1), ('', 'Reg', 2), ('I', 'Swp', 3), ('I', 'Swp', 4)],
dtype=[('x', '|S3'), ('y', '|S3'), ('z', '<i4')])
However, if the numerical data is intermingled with bytes such as '\x00', then genfromtxt will be unable to recognize the column as numerical and will fall back to a string dtype. Nevertheless, you can force the dtype of the columns by setting the dtype parameter manually. For example,
In [11]: arr = np.genfromtxt('/tmp/test.csv', delimiter = ',', dtype = [('x', '|i4'), ('y', '|S3')], names = True)
sets the first column x to have dtype |i4 (4-byte integers) and the second column y to have dtype |S3 (3-byte strings). See the NumPy documentation on data type objects for more information on available dtypes.
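Another option, separate from the genfromtxt workaround, is to strip the NUL bytes from the raw file contents before parsing. The following is only a minimal sketch under stated assumptions: the helper name strip_nul_bytes is hypothetical, the data is an in-memory copy resembling the file in the question, and it assumes the NUL bytes carry no meaning and can simply be dropped. It uses the standard-library csv module to check the result, so that it runs without pandas installed.

```python
import csv
import io


def strip_nul_bytes(raw: bytes) -> bytes:
    """Return a copy of `raw` with every NUL (0x00) byte removed.

    Hypothetical helper for illustration; not part of pandas or NumPy.
    """
    return raw.replace(b"\x00", b"")


# In-memory stand-in for the CSV file from the question (an assumption:
# three NUL bytes in column x on the first two data rows).
raw = b"x,y\n\x00\x00\x00,Reg\n\x00\x00\x00,Reg\nI,Swp\nI,Swp\n"

cleaned = strip_nul_bytes(raw)

# Parse the cleaned text with the csv module to confirm column y survives.
rows = list(csv.DictReader(io.StringIO(cleaned.decode("ascii"))))
y_values = [row["y"] for row in rows]
print(y_values)
```

Since pd.read_csv accepts file-like objects, the cleaned bytes could presumably be wrapped in io.BytesIO(cleaned) and passed to it instead of the csv module. In that case the now-empty x fields would be expected to be read as missing values.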