Reading a zipped txt file as a pandas DataFrame

Question

I am trying to read a zipped txt file as a pandas DataFrame. Although the unzipped file has a .txt extension, it contains comma-separated values.

Following the answer from here, I used:

import pandas as pd

path = 'data_folder/data.2020.ZIP'
# Read the comma-separated txt file directly from the zip archive
df = pd.read_csv(path, compression='zip', header=None, sep=',')
print(df.head())

But it throws this error:

ParserError: Error tokenizing data. C error: Expected 37 fields in line 23, saw 80

I am using Python 3.6 with pandas 0.24.2. Would upgrading pandas help?

Answer 1

This was happening because the rows have an irregular number of columns. Since I don't want to drop any data, I passed the names argument with the maximum number of columns, so that shorter rows are padded with NaN instead of raising an error:

import pandas as pd

path = 'data_folder/data.2020.ZIP'
# names=range(80) declares 80 columns, the widest row reported in the error
df = pd.read_csv(path, compression='zip', header=None, sep=',', names=range(80))
print(df.head())
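
If the widest row is not known in advance, it can be measured before calling read_csv. The following is only a minimal sketch, assuming the archive holds a single UTF-8 text file with ordinary comma-separated (optionally double-quoted) fields. max_field_count is a hypothetical helper name used for illustration, not part of pandas:

```python
import csv
import io
import zipfile


def max_field_count(zip_path, encoding="utf-8"):
    """Return the largest number of comma-separated fields on any line.

    Assumption: the archive contains exactly one text file; only the
    first member is inspected.
    """
    with zipfile.ZipFile(zip_path) as zf:
        member = zf.namelist()[0]
        with zf.open(member) as raw:
            # Decode the binary stream so csv.reader can split the fields
            text = io.TextIOWrapper(raw, encoding=encoding, newline="")
            return max((len(row) for row in csv.reader(text)), default=0)
```

The result could then replace the hard-coded 80, for example names=range(max_field_count(path)). This costs one extra pass over the file, which seems a reasonable trade for not having to guess the width.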

Answer 1: 0, accepted, 2021-03-01 16:04:23
