Programmatically convert a fixed-width .dat file with a .fmt file into a .csv file using Python or Python/Pandas
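I'm trying to learn Python, but I'm stuck here. Any help is appreciated.
I have 2 files.
One is a fixed-width .dat file with no column headers, containing multiple rows of data. The other is a .fmt file containing the column headers, column lengths, and data types.
.dat example: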
10IFKDHGHS34
12IFKDHGHH35
53IFHDHGDF33
.fmt example:
ID,2,n
NAME,8,c
CODE,2,n
Desired output as .csv:
ID,NAME,CODE
10,IFKDHGHS,34
12,IFKDHGHH,35
53,IFHDHGDF,33
First, I would parse the format file.
import csv

with open("format_file.fmt", newline="") as f:
    # csv.reader parses away the commas for us
    # and we get rows as nice lists
    reader = csv.reader(f)
    # this will give us a list of lists that looks like
    # [["ID", "2", "n"], ...] (blank lines are skipped)
    row_definitions = [row for row in reader if row]

# iterate and just unpack the headers
# this gives us ["ID", "NAME", "CODE"]
header = [name for name, length, dtype in row_definitions]
# [2, 8, 2]
lengths = [int(length) for name, length, dtype in row_definitions]
# define a generator (possibly as a closure) that simply slices
# into the string bit by bit -- this yields, for example, first
# characters 0 - 2, then characters 2 - 10, then 10 - 12
def parse_line(line):
    start = 0
    for length in lengths:
        yield line[start:start + length]
        start = start + length
data_file = "data_file.dat"  # placeholder path to the .dat file
with open(data_file) as f:
    # iterating over a file pointer gives us each line separately
    # we call list on each use of parse_line to force the generator to a list
    # (blank lines are skipped)
    parsed_lines = [list(parse_line(line)) for line in f if line.strip()]

# prepend the header
table = [header] + parsed_lines
# use the csv module again to easily output to csv
# newline="" keeps the csv module from writing extra blank lines on Windows
with open("my_output_file.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(table)
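Since the question also mentions Pandas, here is a minimal sketch of the same conversion using pandas.read_fwf, which reads fixed-width files given a list of column widths. The helper name fixed_width_to_csv and the in-memory sample strings are illustrative assumptions, not part of the original answer. It also assumes every .fmt row has the form NAME,WIDTH,TYPE, and it reads all fields as text, ignoring the type column.

```python
import csv
import io

import pandas as pd


def fixed_width_to_csv(fmt_file, dat_file, csv_file):
    """Hypothetical helper: convert fixed-width data to CSV using a .fmt layout."""
    # Each .fmt row is assumed to look like NAME,WIDTH,TYPE (e.g. "ID,2,n").
    rows = [row for row in csv.reader(fmt_file) if row]
    names = [row[0] for row in rows]
    widths = [int(row[1]) for row in rows]

    # header=None: the .dat file has no header row.
    # dtype=str: keep every field as text so values such as "01" are not altered.
    frame = pd.read_fwf(dat_file, widths=widths, names=names, header=None, dtype=str)
    frame.to_csv(csv_file, index=False)
    return frame


# In-memory stand-ins for the sample files shown in the question.
fmt_text = "ID,2,n\nNAME,8,c\nCODE,2,n\n"
dat_text = "10IFKDHGHS34\n12IFKDHGHH35\n53IFHDHGDF33\n"

output = io.StringIO()
frame = fixed_width_to_csv(io.StringIO(fmt_text), io.StringIO(dat_text), output)
print(output.getvalue())
```

With real files, fmt_file would be an open file object (for example from open("format_file.fmt", newline="")), while dat_file and csv_file can be plain paths, since read_fwf and to_csv accept either a path or a file-like object.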