How to skip/ignore null bytes in a csv file using pd.read_csv?

Question

I have a .csv file with hundreds of lines/columns that look like this (small example; I couldn't copy/paste the null bytes, so I had to type them manually):

9142,16.04000000,14.65000000
<0x00><0x00><0x00>
9143,16.19000000,14.65000000

A small number of lines contain NULL bytes ("<0x00>"), and these give me trouble when I try to read the csv using pandas pd.read_csv.

When I run the command:

pd.read_csv(fname, header=None, na_values='-32768', names=binnams, engine='python')

I get the following error:

pandas.errors.ParserError: ("NULL byte detected. This byte cannot be processed in Python's native csv library at the moment, so please pass in engine='c' instead", 'occurred at index 16')

and when I switch to engine='c' I get:

TypeError: ('cannot unpack non-iterable NoneType object', 'occurred at index 16')

Is there a way to ignore these lines completely using pd.read_csv?

I think a workaround might be to open the files, loop through them, and delete any lines that contain the <0x00>, if they can even be read?

Any thoughts/suggestions are definitely appreciated.

EDIT - I tried to read the files line by line to see if I could delete these lines, but I'm not sure how to actually capture the null byte (using "<0x00>" obviously didn't work :D)
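One way the line-by-line idea could be sketched: open the file in binary mode, where a null byte can be matched as b"\x00", keep only the lines without it, and hand the result to pandas as an in-memory buffer. The helper names drop_null_byte_lines and read_csv_skipping_nulls are made up for this illustration, and the sketch assumes the file fits in memory.

```python
import io


def drop_null_byte_lines(path):
    """Return an in-memory buffer holding only the lines without a NUL byte."""
    # Binary mode: each line is a bytes object, so b"\x00" matches directly.
    with open(path, "rb") as fh:
        kept = [line for line in fh if b"\x00" not in line]
    return io.BytesIO(b"".join(kept))


def read_csv_skipping_nulls(path, **kwargs):
    """Hypothetical wrapper: filter the file, then let pandas parse the rest."""
    import pandas as pd  # imported here so the filter above needs only the stdlib

    return pd.read_csv(drop_null_byte_lines(path), **kwargs)
```

With the call from the question, this would be used as read_csv_skipping_nulls(fname, header=None, na_values='-32768', names=binnams). Whole lines are dropped rather than just the null bytes, since that is what the question asks for.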

Link to example file here: https://drive.google.com/open?id=1uEjMv0Be9Hu_AqXRzqB3enrWilzCTBvc

Answer 1

Try saving the csv file as UTF-16, and then try running the code:

pd.read_csv(fname, header=None, na_values='-32768', names=binnams, engine='python')
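The answer does not say why UTF-16 would help. One possible reading is that null bytes often show up when a UTF-16 file is read as if it were a single-byte encoding. A rough way to check for that case is to look for a UTF-16 byte order mark at the start of the file; looks_like_utf16 is a hypothetical helper, and a file without a BOM could still be UTF-16.

```python
import codecs


def looks_like_utf16(path):
    """Return True if the file begins with a UTF-16 byte order mark."""
    with open(path, "rb") as fh:
        head = fh.read(2)  # a UTF-16 BOM is exactly two bytes
    return head in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
```

If this returns True, passing encoding='utf-16' to pd.read_csv may be worth trying. If it returns False, the null bytes are more likely stray data in the file than an encoding issue.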

Answer 1
0 accepted 2019-08-27 18:33:32