
Python: parsing a string into CSV format

I have a file containing a line in the following format:

aaa=A;bbb=B;ccc=C

I want to convert it to CSV so that the name and the value on either side of each equals sign become columns, and each semicolon acts as a row separator. I tried something like this:

import csv

f = open("aaa.txt", "r")
with open("ccc.csv", 'w') as csvFile:
    writer = csv.writer(csvFile)
    rows = []
    if f.mode == 'r':
        single = f.readline()
        lns = single.split(";")
        for item in lns:
            rows.append(item.replace("=", ","))
        writer.writerows(rows)
        f.close()
        csvFile.close()

But each character ends up in its own column, so the result looks like this:

a,a,a,",",A
b,b,b,",",B
c,c,c,",",C,"

The expected result is:

aaa,A
bbb,B
ccc,C

The parameter to writer.writerows() must be an iterable of rows, each of which must in turn be an iterable of strings or numbers. Since you pass it a list of strings, the characters of each string are treated as separate fields. You can obtain the proper list of rows by splitting the line first on ';', then on '=':

import csv

with open('in.txt') as in_file, open('out.csv', 'w', newline='') as out_file:
    writer = csv.writer(out_file)
    line = next(in_file).rstrip('\n')
    rows = [item.split('=') for item in line.split(';')]
    writer.writerows(rows)
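
To check the splitting step without touching any files, the same two-level split can be run against an in-memory buffer. This is only a sketch: the helper name parse_pairs and the use of io.StringIO are assumptions made for illustration, not part of the answer above.

```python
import csv
import io


def parse_pairs(line):
    """Split 'aaa=A;bbb=B' into [['aaa', 'A'], ['bbb', 'B']]."""
    # parse_pairs is a hypothetical helper name used only in this sketch
    return [item.split('=') for item in line.rstrip('\n').split(';')]


rows = parse_pairs('aaa=A;bbb=B;ccc=C\n')

# Write to an in-memory buffer instead of a real file
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Because every element of rows is a list of two strings, the writer emits one line per pair with two fields each.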

Just write the strings into the target file line by line:

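with open("aaa.txt", "r") as f, open("ccc.csv", "w") as csvFile:
    # Drop the trailing newline so the last value stays clean
    single = f.readline().rstrip("\n")
    for item in single.split(";"):
        # In text mode "\n" is translated to the platform line ending,
        # so os.linesep is not needed (it would yield "\r\r\n" on Windows)
        csvFile.write(item.replace("=", ",") + "\n")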

The output would be:

aaa,A
bbb,B
ccc,C

It helps to run the commands interactively and print the values, or to add debug prints to the code (to be removed or commented out once everything works). Here you would have seen that rows is ['aaa,A', 'bbb,B', 'ccc,C'], that is, three strings where there should be three sequences of fields.

Since a string is itself a (read-only) sequence of characters, writerows treats each character as a field.
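
The effect is easy to reproduce in isolation. The sketch below (variable names are illustrative) feeds a plain string to writerow through an in-memory buffer. It also suggests why the comma in the question's output is wrapped in quotes: the comma is a field of its own, and by default the writer quotes any field that contains the delimiter.

```python
import csv
import io

row = 'aaa,A'          # a string, not a list of fields

# Iterating over a string yields its characters one by one
fields = list(row)
print(fields)

buffer = io.StringIO()
csv.writer(buffer).writerow(row)   # each character becomes a field
written = buffer.getvalue()
print(written)
```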

So you do not want to replace the = with a comma (,) but to split on the equals sign:

        ...
        for item in lns:
            rows.append(item.split("=", 1))
        ...

In addition, the csv module requires the output file to be opened with newline='' to work properly.

So you should have:

with open("ccc.csv", 'w', newline='') as csvFile:
    ...
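
Putting both fixes together, a complete version might look like the sketch below. The function name convert_line is an assumption made for this example; the split and the newline='' argument follow the text above.

```python
import csv


def convert_line(in_path, out_path):
    """Convert the first line of in_path ('k=v;k=v;...') to a two-column CSV."""
    # convert_line is a hypothetical name chosen for this sketch
    with open(in_path) as in_file, open(out_path, 'w', newline='') as out_file:
        line = in_file.readline().rstrip('\n')
        # Split on the first '=' only, so each row is a [name, value] list
        rows = [item.split('=', 1) for item in line.split(';')]
        csv.writer(out_file).writerows(rows)
    return rows
```

With the file names from the question, the call would be convert_line("aaa.txt", "ccc.csv").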

The following one-line change worked for me:

rows.append(item.split('='))

instead of the existing line:

rows.append(item.replace("=", ","))

That way, rows becomes a list of lists that the writer can handle directly: [['aaa', 'A'], ['bbb', 'B'], ['ccc', 'C']] instead of ['aaa,A', 'bbb,B', 'ccc,C'].
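
One difference between the variants on this page is worth noting: split('=') and split('=', 1) behave identically for the sample input, but they diverge if a value itself contains an equals sign. The input string below is made up for illustration.

```python
item = 'expr=a=b'   # hypothetical item whose value contains '='

unlimited = item.split('=')      # splits on every '='
limited = item.split('=', 1)     # splits on the first '=' only

print(unlimited)
print(limited)
```

If values may contain '=', the limited form keeps each row at exactly two columns.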


 