Read zipped txt file as pandas dataframe
I am trying to read a zipped txt file as a pandas DataFrame. Although the unzipped file has a .txt extension, it contains comma-separated values.
Following the answer from here, I used:
import pandas as pd

path = 'data_folder/data.2020.ZIP'
df = pd.read_csv(path, compression='zip', header=None, sep=',')
print(df.head())
But it throws this error:
ParserError: Error tokenizing data. C error: Expected 37 fields in line 23, saw 80
I am using Python 3.6 with pandas version 0.24.2. Would upgrading pandas help?
This happened because the rows have an irregular number of columns. Since I don't want to drop any data, I passed the names argument with the maximum number of columns to fix the issue, like so:
path = 'data_folder/data.2020.ZIP'
df = pd.read_csv(path, compression='zip', header=None, sep=',', names=range(80))
print(df.head())
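If the maximum number of columns is not known in advance, it could be counted first instead of hard-coding 80. The following is only a rough sketch: the helper names max_field_count and read_ragged_zip are made up for illustration, and it assumes the archive contains a single UTF-8 encoded, comma-separated file.

```python
import csv
import io
import zipfile

import pandas as pd


def max_field_count(zip_path, encoding="utf-8"):
    """Return the largest number of comma-separated fields on any line."""
    with zipfile.ZipFile(zip_path) as zf:
        # Assumption: the archive holds exactly one data file.
        name = zf.namelist()[0]
        with zf.open(name) as raw:
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            return max((len(row) for row in csv.reader(text)), default=0)


def read_ragged_zip(zip_path):
    """Read a zipped, ragged CSV without dropping any fields."""
    width = max_field_count(zip_path)
    # Rows shorter than `width` are padded with NaN on the right.
    return pd.read_csv(
        zip_path, compression="zip", header=None, sep=",", names=range(width)
    )
```

Calling read_ragged_zip('data_folder/data.2020.ZIP') would then take the place of the hard-coded names=range(80). The file is read twice, which may be slow for very large archives.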