How to convert a text file (with "," as the delimiter) into a pandas DataFrame

Question

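I am trying to read a text file (about 1 million lines) with the following content:

First line: "column_header", "column_header", "column_header", "column_header"

From the second line on: "value", "value", "value", "value"

I tried the following approaches: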

''' try 1 '''
with open(file, 'rt') as f:
    contents = f.readlines()

for i in contents:
    print(i) # ->> seeing the text as ," value ", " value ", "
    x = [_.strip().replace('""', '').split(',') for _ in i]
    print(str(x)) # ->> getting bytes

''' try 2 '''
with open(file, 'rt') as f:
    contents = f.read()

    for i in contents:
        print(str(i)) # ->> text but cannot do anything

''' try 3 '''
frame = pd.read_csv(file, sep=',', doublequote=True, skip_blank_lines=True) # ->> utf parsing error

Answer 1

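I found that the text file I received was not UTF-8 encoded, so none of the attempts above worked. My solution: open the file and save it again as .txt with UTF-8 encoding. Then use the following Python code: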

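import pandas as pd

# folder_location is the directory that holds the re-saved file (defined elsewhere)
file = folder_location + 'report.txt'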

''' try 3 '''
frame = pd.read_csv(file, sep=',', doublequote=True, skip_blank_lines=True)
print(frame.head())
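
The manual "open and save as UTF-8" step can also be scripted. The following is only a minimal sketch: it assumes the file was originally in Windows-1252 (cp1252), which the post does not confirm, and the names reencode_to_utf8 and report_raw.txt are hypothetical.

```python
from pathlib import Path


def reencode_to_utf8(src, dst, source_encoding="cp1252"):
    """Copy src to dst, decoding with source_encoding and writing UTF-8.

    source_encoding is an assumption; the original post does not say
    which encoding the file actually used.
    """
    # newline="" disables newline translation so line endings are preserved.
    with open(src, "r", encoding=source_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        # Stream line by line so a file of about 1 million lines
        # never has to be held in memory all at once.
        for line in fin:
            fout.write(line)
    return Path(dst)


# Hypothetical usage:
# reencode_to_utf8("report_raw.txt", "report.txt")
```

After the conversion, the read_csv call from the answer should work on report.txt. If the source encoding is known, pandas' read_csv also accepts an encoding argument (for example encoding="cp1252"), which may make the re-save unnecessary.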

Answer 1 posted: 2020-03-13 15:29:46