Python: How to import a csv like dat file with a control character delimiter

I have a data file that uses the DC4 control character as its delimiter. This is the code I have right now (copied from someone else, not my own code).

import csv
with open('Test.dat') as csv_file:
    csv_reader = csv.reader(csv_file, quotechar='þ', delimiter='')  # delimiter is the raw DC4 control character (not visible here)
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        else:
            print(f'\t{row[0]} works in the {row[1]} department, and was born in {row[2]}.')
            line_count += 1
    print(f'Processed {line_count} lines.')

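As you can see, the character is displayed as a box, and so far only Notepad++ can read it. I found curses.ascii.isctrl(c), which seems to be able to detect this character in Python and then read it in caret notation? https://docs.python.org/3.2/library/curses.ascii.html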

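I'm new to coding and I'm not sure how to implement this, or whether it would even work for me. Below is a sample of the dat file I'm trying to read, as text and as a screenshot.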

þIdentifierþþColumn 2þþColumn 3þ
þXX_0012345þþRandom Data 1þþRandom Data 1þ
þXX_0012346þþRandom Data 6þþRandom Data 2þ
þXX_0012347þþRandom Data 1þþRandom Data 3þ
þXX_0012348þþRandom Data 8þþRandom Data 4þ
þXX_0012349þþRandom Data 1þþRandom Data 5þ
þXX_0012345þþRandom Data 9þþRandom Data 1þ

[Screenshot: text file showing the DC4 control character]

This is the output when running this code on Python 3.6.1. Everything looks fine except for the ¾ character (which is how the DC4 character is being read).

Column names are þIdentifierþ, þColumn 2þ, þColumn 3þ
    þXX_0012345þ works in the þRandom Data 1þ department, and was born in þRandom Data 1þ.
    þXX_0012346þ works in the þRandom Data 6þ department, and was born in þRandom Data 2þ.
    þXX_0012347þ works in the þRandom Data 1þ department, and was born in þRandom Data 3þ.
    þXX_0012348þ works in the þRandom Data 8þ department, and was born in þRandom Data 4þ.
    þXX_0012349þ works in the þRandom Data 1þ department, and was born in þRandom Data 5þ.
    þXX_0012345þ works in the þRandom Data 9þ department, and was born in þRandom Data 1þ.
Processed 7 lines.

Any help with this would be greatly appreciated. Thanks!

You can use an escape sequence for this. DC4 is ASCII 20 (0x14):

csv_reader = csv.reader(csv_file, quotechar='þ', delimiter='\x14')
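For a version that can be run without the original file, here is a minimal sketch. The helper name `read_dc4_rows` and the in-memory sample are made up for illustration, and the sample assumes the DC4 character sits between each pair of þ-quoted fields, as the question describes.

```python
import csv
import io

DC4 = '\x14'  # ASCII 20 (0x14), "Device Control 4"


def read_dc4_rows(stream):
    """Return every row of a DC4-delimited, þ-quoted text stream as a list of fields."""
    return list(csv.reader(stream, quotechar='þ', delimiter=DC4))


# Hypothetical sample mirroring the first two lines of the question's data,
# with the DC4 delimiter written explicitly as an escape sequence.
sample = (
    'þIdentifierþ\x14þColumn 2þ\x14þColumn 3þ\n'
    'þXX_0012345þ\x14þRandom Data 1þ\x14þRandom Data 1þ\n'
)

rows = read_dc4_rows(io.StringIO(sample))
print(rows[0])  # the header fields, with the þ quote characters stripped
```

With a real file, the same helper could be given `open('Test.dat', newline='', encoding='latin-1')`. The encoding is an assumption here; it has to match whatever the file was actually written in, otherwise þ will not decode to a single character.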

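It turns out this was a problem with my computer rather than with Python. Apparently I just can't see the character; it only shows up as a white box. Is there a way to configure Windows 10 to display this character?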
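This does not change how Windows renders the character, but one workaround is to make control characters visible from Python before printing. The sketch below uses caret notation (DC4 becomes ^T); the function name `show_controls` is hypothetical, and it is written in plain Python rather than with curses.ascii, since the curses package is typically not available in standard Windows builds of Python.

```python
def show_controls(text):
    """Replace ASCII control characters (except newline and tab) with caret notation."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 0x20 and ch not in '\n\t':
            # Caret notation: control code + 0x40 gives the letter, e.g. 0x14 -> 'T'
            out.append('^' + chr(code + 0x40))
        elif code == 0x7F:
            out.append('^?')  # DEL
        else:
            out.append(ch)
    return ''.join(out)


line = 'XX_0012345\x14Random Data 1'
print(show_controls(line))  # XX_0012345^TRandom Data 1
```

Printing each raw line of the file through a helper like this is a quick way to confirm where the DC4 delimiters actually are.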
