txt to csv using Python

Question

I have a SQL dump in txt format. It looks like this:

"Date:","8/21/2015","","Time:","16:18:38","","Name:","NC.S.RHU10.BRD"
"System Name:","NC.S.RHU10.BRD"
"Operator:","SYSTEM"
"Action:","Trend data loss"
"Comment:"," trend definition data loss occurred at 10:21:05 AM on 8/21/2015"
"Revision:","6"
"Location:",""
"Seq Number:","1278738"
" ********************************************************************************"
"Date:","8/21/2015","","Time:","16:17:17","","Name:","SC.L.SIDESHOWBOB.MBC009"
"System Name:","SC.L.SIDESHOWBOB.MBC009"
"Operator:","SYSTEM"
"Action:","FLN device return from failure"
"Comment:","Z8 RETURN from failure in Cabinet 9, Lan 3, Drop 1."
"Revision:","81"
"Location:","SC.L.SIDESHOWBOB.MBC009"
"Seq Number:","1278737"
" ********************************************************************************"
"Date:","8/21/2015","","Time:","16:17:17","","Name:","NC.S.EHU07.EAT"
"System Name:","NC.S.EHU07.EAT"
"Operator:","ITWVSIEMP01\InsightSCH"
"Action:","Trend data collection The target object could not be found on the Field"
"Panel."
"Comment:","Trend COV (0.000)  Failed - The target object could not be found on the"
"Field Panel"
"Revision:","1318"
"Location:","ITWVSIEMP01"
"Seq Number:","1278735"
" ********************************************************************************"
"Date:","8/21/2015","","Time:","16:17:15","","Name:","NC.S.EHU03.TCFM"
"System Name:","NC.S.EHU03.TCFM"
"Operator:","ITWVSIEMP01\InsightSCH"
"Action:","Trend data collection"
"Comment:","COV                Data Loss Detected"
"Revision:","1481"
"Location:","ITWVSIEMP01"
"Seq Number:","1278734"
" ********************************************************************************

I want to convert it to columns using Python, with the following fields:

"Date","Time","Name","System Name","Operator","Action","Comment","Type","Revision","Location","Seq Number"

Is there a ready-made function in Python that does this?

Answer 1

import csv

# Assumes each input line is already one tab-separated record
with open('myfile.txt') as infile, open('out.csv', 'w') as outfile:
    c = csv.writer(outfile, delimiter=',')
    for line in infile:
        data = line.rstrip('\n').split('\t')
        # index: Date=0, Time=1, Name=2, System Name=3, Operator=4, Action=5,
        # Comment=6, Type=7, Revision=8, Location=9, Seq Number=10
        # writerow() takes a single sequence, not separate arguments
        c.writerow(data[:11])

Answer 2

import operator
import csv

with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
    data = {}
    # csv.reader strips the quotes and keeps commas inside quoted values intact
    reader = csv.reader(infile)
    writer = csv.writer(outfile, delimiter=',')
    writer.writerow(["Date","Time","Name","System Name","Operator","Action","Comment","Revision","Location","Seq Number"])
    fields = operator.itemgetter("Date","Time","Name","System Name","Operator","Action","Comment","Revision","Location","Seq Number")
    for parts in reader:
        if not parts:
            continue  # blank line
        if parts[0].startswith(' *'):
            # separator line: the current record is complete
            try:
                writer.writerow(fields(data))
            except KeyError:
                # itemgetter raises KeyError when a field is missing
                print('malformed input')
                raise
            data = {}
            continue

        if parts[0] == 'Date:':
            data['Date'] = parts[1]
            data['Time'] = parts[4]
            data['Name'] = parts[-1]
            continue

        if len(parts) < 2:
            continue  # wrapped continuation line such as "Panel."; skipped here

        name = parts[0].rstrip(":")
        value = parts[1]
        data[name] = value

Answer 3

I've just written a little utility here. Maybe it could help you.

I think the last line of your input file is missing a closing ". Please add it at the end so the delimiter is uniform.
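
A minimal sketch of that fix in Python 3, assuming the only damage is a missing closing quote at the end of a line. The helper name close_unbalanced_quote is hypothetical and not part of any library:

```python
def close_unbalanced_quote(line):
    """Append a closing double quote if the line has an odd number of them."""
    stripped = line.rstrip("\r\n")
    if stripped.count('"') % 2 == 1:
        stripped += '"'
    return stripped


# Shortened separator line, like the last line of the dump in the question
print(close_unbalanced_quote('" ****\n'))
# A well-formed line is returned unchanged (minus the line ending)
print(close_unbalanced_quote('"Revision:","6"\n'))
```

Counting quotes is enough here because doubled quotes inside a CSV field always come in pairs, so a well-formed line has an even count.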

Answer 4

The following script should work. It generates your header fields automatically and preserves their order in the CSV file, so it should still work if the format changes a bit:

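import csv

with open("sqldump.txt", "r") as f_input, open("output.csv", "wb") as f_output:
    csv_input = csv.reader(f_input)
    csv_output = csv.writer(f_output)

    # First pass: collect the header names from the first record
    headers = []
    for cols in csv_input:
        if len(cols) > 1:
            headers.extend([header.strip(":") for header in cols if header.endswith(':')])
        else:
            # A single-column row (the separator line) ends the first record
            break

    csv_output.writerow(headers)
    f_input.seek(0)  # rewind for the second pass

    # Second pass: build one CSV row per record
    entry = []
    for cols in csv_input:
        if cols[0] == 'Date:':
            # The first line of a record holds Date, Time and Name
            entry.extend([cols[1], cols[4], cols[-1]])
        elif len(cols) > 1:
            entry.append(cols[1])
        elif cols[0].startswith(' *'):
            # Separator line: write the finished record and start a new one
            csv_output.writerow(entry)
            entry = []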

This would give you an output CSV file looking like:

Date,Time,Name,System Name,Operator,Action,Comment,Revision,Location,Seq Number
8/21/2015,16:18:38,NC.S.RHU10.BRD,NC.S.RHU10.BRD,SYSTEM,Trend data loss, trend definition data loss occurred at 10:21:05 AM on 8/21/2015,6,,1278738
8/21/2015,16:17:17,SC.L.SIDESHOWBOB.MBC009,SC.L.SIDESHOWBOB.MBC009,SYSTEM,FLN device return from failure,"Z8 RETURN from failure in Cabinet 9, Lan 3, Drop 1.",81,SC.L.SIDESHOWBOB.MBC009,1278737
8/21/2015,16:17:17,NC.S.EHU07.EAT,NC.S.EHU07.EAT,ITWVSIEMP01\InsightSCH,Trend data collection The target object could not be found on the Field,Trend COV (0.000)  Failed - The target object could not be found on the,1318,ITWVSIEMP01,1278735
8/21/2015,16:17:15,NC.S.EHU03.TCFM,NC.S.EHU03.TCFM,ITWVSIEMP01\InsightSCH,Trend data collection,COV                Data Loss Detected,1481,ITWVSIEMP01,1278734
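
In the third data row above, the wrapped text ("Panel." and "Field Panel") is dropped, because those continuation lines have a single column and are ignored. A hedged Python 3 sketch that joins such lines onto the previous value follows. The function name parse_records is hypothetical, and treating every one-column row that does not start with ' *' as a continuation is an assumption about the dump format:

```python
import csv
import io


def parse_records(lines):
    """Yield one list of values per record, joining wrapped lines."""
    entry = []
    for cols in csv.reader(lines):
        if not cols:
            continue  # skip blank lines
        if cols[0] == "Date:":
            entry.extend([cols[1], cols[4], cols[-1]])
        elif len(cols) > 1:
            entry.append(cols[1])
        elif cols[0].startswith(" *"):
            yield entry
            entry = []
        elif entry:
            # Assumed continuation line: append it to the previous value
            entry[-1] = entry[-1] + " " + cols[0]


# Shortened excerpt of the third record from the question
sample = io.StringIO(
    '"Action:","Trend data collection The target object could not be found on the Field"\n'
    '"Panel."\n'
    '"Revision:","1318"\n'
    '" ****"\n'
)
records = list(parse_records(sample))
print(records)
```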

Tested using Python 2.7. If you are using Python 3, change the call that opens the output file to open("output.csv", "w", newline="").
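
For reference, a sketch of how the same script might look on Python 3. The logic is unchanged apart from opening both files in text mode with newline="" and creating a fresh reader after rewinding. The function name convert is hypothetical:

```python
import csv


def convert(in_path, out_path):
    """Python 3 variant of the script above: one CSV row per record."""
    with open(in_path, "r", newline="") as f_input, \
            open(out_path, "w", newline="") as f_output:
        csv_output = csv.writer(f_output)

        # First pass: header names come from the first record
        headers = []
        for cols in csv.reader(f_input):
            if len(cols) > 1:
                headers.extend(h.strip(":") for h in cols if h.endswith(":"))
            else:
                break
        csv_output.writerow(headers)

        # Second pass: rewind and build the data rows
        f_input.seek(0)
        entry = []
        for cols in csv.reader(f_input):
            if not cols:
                continue  # skip blank lines
            if cols[0] == "Date:":
                entry.extend([cols[1], cols[4], cols[-1]])
            elif len(cols) > 1:
                entry.append(cols[1])
            elif cols[0].startswith(" *"):
                csv_output.writerow(entry)
                entry = []
```

Calling convert("sqldump.txt", "output.csv") would then read and write the same file names as the script above.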

Note that there is no 'Type' field in your example data.
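
If the 'Type' column from the question is still wanted in the output, one option is csv.DictWriter with its restval argument, which fills in fields that a record does not have. A sketch in Python 3, assuming the records have already been collected as dicts. The helper write_rows is hypothetical, and the sample dict is copied from the first record in the question:

```python
import csv
import io

# Column order requested in the question, including the absent 'Type'
FIELDS = ["Date", "Time", "Name", "System Name", "Operator", "Action",
          "Comment", "Type", "Revision", "Location", "Seq Number"]


def write_rows(records, out_file):
    """Write dict records in FIELDS order; missing keys become empty cells."""
    writer = csv.DictWriter(out_file, fieldnames=FIELDS, restval="")
    writer.writeheader()
    writer.writerows(records)


record = {
    "Date": "8/21/2015", "Time": "16:18:38", "Name": "NC.S.RHU10.BRD",
    "System Name": "NC.S.RHU10.BRD", "Operator": "SYSTEM",
    "Action": "Trend data loss",
    "Comment": " trend definition data loss occurred at 10:21:05 AM on 8/21/2015",
    "Revision": "6", "Location": "", "Seq Number": "1278738",
}
buffer = io.StringIO()
write_rows([record], buffer)
print(buffer.getvalue())
```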

4 answers

solution1
1 2015-08-22 05:51:20

solution2
0 2015-08-22 06:02:56

solution3
0 2015-08-22 07:06:17

solution4
0 2015-08-22 08:08:40
