
Python: How to import a csv like dat file with a control character delimiter

I have a data file that uses the DC4 control character as its delimiter. Here is the code I have so far (copied from someone else, not my own).

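import csv

# newline='' is what the csv module recommends when opening a file for csv.reader.
with open('Test.dat', newline='') as csv_file:
    # '\x14' is DC4, written as an escape because the raw control character does not display.
    csv_reader = csv.reader(csv_file, quotechar='þ', delimiter='\x14')
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            # The first row holds the column names.
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        else:
            print(f'\t{row[0]} works in the {row[1]} department, and was born in {row[2]}.')
            line_count += 1
    print(f'Processed {line_count} lines.')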

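As you can see, the character is displayed as a box, and so far only Notepad++ can read it. I found curses.ascii.isctrl(c), which seems to be able to detect this character in Python, and then show it in caret notation? https://docs.python.org/3.2/library/curses.ascii.html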

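I'm new to coding and not sure how to implement this, or whether it would even work for me. Below is a sample of the dat file I'm trying to read, as text and as a screenshot.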

þIdentifierþþColumn 2þþColumn 3þ
þXX_0012345þþRandom Data 1þþRandom Data 1þ
þXX_0012346þþRandom Data 6þþRandom Data 2þ
þXX_0012347þþRandom Data 1þþRandom Data 3þ
þXX_0012348þþRandom Data 8þþRandom Data 4þ
þXX_0012349þþRandom Data 1þþRandom Data 5þ
þXX_0012345þþRandom Data 9þþRandom Data 1þ

[Screenshot: text file showing the DC4 control character]

This is the output when running this code on Python 3.6.1. Everything looks fine except for the ¾ character (which is how the DC4 character is being read).

Column names are þIdentifierþ, þColumn 2þ, þColumn 3þ
    þXX_0012345þ works in the þRandom Data 1þ department, and was born in þRandom Data 1þ.
    þXX_0012346þ works in the þRandom Data 6þ department, and was born in þRandom Data 2þ.
    þXX_0012347þ works in the þRandom Data 1þ department, and was born in þRandom Data 3þ.
    þXX_0012348þ works in the þRandom Data 8þ department, and was born in þRandom Data 4þ.
    þXX_0012349þ works in the þRandom Data 1þ department, and was born in þRandom Data 5þ.
    þXX_0012345þ works in the þRandom Data 9þ department, and was born in þRandom Data 1þ.
Processed 7 lines.

Any help with this would be greatly appreciated. Thanks!

You can use an escape sequence for this. DC4 is ASCII 20 (0x14):

csv_reader = csv.reader(csv_file, quotechar='þ', delimiter='\x14')
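
To check the escape without needing the real file, here is a minimal sketch that parses an in-memory sample. The SAMPLE string and the read_dc4_rows helper are made up for illustration; they assume the layout described in the question (fields wrapped in þ and separated by DC4).

```python
import csv
import io

# Hypothetical sample mirroring the question's file: each field is wrapped
# in þ (the quote character) and fields are separated by DC4 (0x14).
SAMPLE = (
    "þIdentifierþ\x14þColumn 2þ\x14þColumn 3þ\n"
    "þXX_0012345þ\x14þRandom Data 1þ\x14þRandom Data 1þ\n"
    "þXX_0012346þ\x14þRandom Data 6þ\x14þRandom Data 2þ\n"
)


def read_dc4_rows(text_stream):
    """Return all rows from a DC4-delimited, þ-quoted text stream."""
    reader = csv.reader(text_stream, delimiter="\x14", quotechar="þ")
    return list(reader)


rows = read_dc4_rows(io.StringIO(SAMPLE, newline=""))
header, records = rows[0], rows[1:]
print(header)
for record in records:
    # Pair each value with its column name for easier reading.
    print(dict(zip(header, record)))
```

For a real file, the same helper should work on a handle from open('Test.dat', newline='', encoding=...). The right encoding is not stated in the question, so it has to match however þ is actually stored in the file.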

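It turns out this was a problem with my computer rather than with Python. Apparently I just can't see the character; it only shows up as a white box. Is there a way to configure Windows 10 to display this character?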
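
Leaving Windows settings aside, one possible way to see the character from Python is to print the line with repr(), or to swap control characters for their Unicode "Control Pictures" symbols. This is only a sketch: show_controls is a hypothetical helper, and whether the symbol itself renders still depends on the font in use.

```python
def show_controls(text, keep="\n"):
    """Replace ASCII control characters with Unicode Control Pictures.

    Characters listed in `keep` are passed through unchanged.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if ch in keep:
            out.append(ch)
        elif code < 0x20:
            # U+2400..U+241F are the visible symbols for 0x00..0x1F.
            out.append(chr(0x2400 + code))
        elif code == 0x7F:
            # DEL has its own symbol at U+2421.
            out.append("\u2421")
        else:
            out.append(ch)
    return "".join(out)


line = "þIdentifierþ\x14þColumn 2þ"
print(repr(line))           # repr() shows DC4 as the escape \x14
print(show_controls(line))  # DC4 becomes U+2414 (SYMBOL FOR DEVICE CONTROL FOUR)
```

repr() is the safer of the two, since its \x14 escape uses only plain ASCII characters.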

