Read CSV with comma as linebreak

I have a file saved as .csv

"400":0.1,"401":0.2,"402":0.3

Ultimately I want to save the data in a proper format in a csv file for further processing. The problem is that there are no line breaks in the file.

pathname = r"C:\pathtofile\file.csv"    

with open(pathname, newline='') as file:
    reader = file.read().replace(',', '\n')
    print(reader)
    with open(r"C:\pathtofile\filenew.csv", 'w') as new_file:
        csv_writer = csv.writer(new_file)
        csv_writer.writerow(reader)

The output of print(reader) looks exactly how I want (or at least it's a format I can further process).

"400":0.1
"401":0.2
"402":0.3

And now I want to save that to a new csv file. However the output looks like

"""",4,0,0,"""",:,0,.,1,"
","""",4,0,1,"""",:,0,.,2,"
","""",4,0,2,"""",:,0,.,3

I'm sure it would be intelligent to convert the format to

400,0.1
401,0.2
402,0.3

at this stage instead of doing later with another script.

The main problem is that my current code

with open(pathname, newline='') as file:
    reader = file.read().replace(',', '\n')
    reader = csv.reader(reader,delimiter=':')
    x = []
    y = []
    print(reader)
    for row in reader:
        x.append( float(row[0]) )
        y.append( float(row[1]) )           

print(x)
print(y)

works fine for the type of csv files I currently have, but doesn't work for these mentioned above:

y.append( float(row[1]) )
IndexError: list index out of range

So I'm trying to find a way to work with them too. I think I'm missing something obvious as I imagine that it can't be too hard to properly define the linebreak character and delimiter of a file.

with open(pathname, newline=',') as file:

yields

ValueError: illegal newline value: ,

The right way is to use the csv module directly, without the replace step or casting to float:

import csv

with open('file.csv', 'r') as f, open('filenew.csv', 'w', newline='') as out:
    reader = csv.reader(f)
    writer = csv.writer(out, quotechar=None)
    for r in reader:
        for i in r:
            writer.writerow(i.split(':'))

The resulting filenew.csv contents (matching your "intelligent" format):

400,0.1
401,0.2
402,0.3

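Notes:

  • csv.reader and csv.writer objects use the comma as the default delimiter, so there is no need for file.read().replace(',', '\n'). The reader also drops the double quotes around each key, so every field arrives as text like 400:0.1.

  • quotechar=None is passed to csv.writer to turn quoting off, so no double quotes are written around the saved values.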
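
The first note can be checked in memory. This is a minimal sketch, assuming the file holds exactly the single line shown in the question; io.StringIO stands in for the real file, and the x and y lists mirror the ones in the question's second snippet (the other variable names are illustrative only).

```python
import csv
import io

# Stand-in for the real file; assumes the single-line layout from the question.
sample = io.StringIO('"400":0.1,"401":0.2,"402":0.3\n', newline='')

x, y = [], []
for record in csv.reader(sample):   # comma is the default delimiter
    for field in record:            # quotes are already stripped: 400:0.1
        key, value = field.split(':')
        x.append(float(key))
        y.append(float(value))

print(x)
print(y)
```

If the real files can contain several lines or empty fields, the inner loop would need a guard before split(':'); that case is not covered here.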

You need to split the values to form a list to represent a row. Presently the code is splitting the string into individual characters to represent the row.

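import csv

pathname = r"C:\pathtofile\file.csv"

with open(pathname) as old_file:
    with open(r"C:\pathtofile\filenew.csv", 'w', newline='') as new_file:
        csv_writer = csv.writer(new_file, delimiter=',')
        # Each comma-separated chunk looks like "400":0.1
        text_rows = old_file.read().strip().split(",")
        for row in text_rows:
            items = row.split(":")
            # Strip the double quotes around the key before converting it
            csv_writer.writerow([int(items[0].strip('"')), items[1]])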

If you look at the documentation for writerow, it says:

Write the row parameter to the writer's file object, formatted according to the current dialect.

But in your code you are passing an entire string:

csv_writer.writerow(reader)

because reader is a string at this point. Now, the format you want to use in your CSV file is not clearly mentioned in the question. But as you said, if you can do some preprocessing to create a list of lists and pass each sublist to writerow() , you should be able to produce the required file format.
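
To illustrate the point about writerow, here is a small sketch that compares a string argument with a list argument. It writes to an in-memory io.StringIO buffer instead of a real file, and the sample value is taken from the question; the variable names are illustrative only.

```python
import csv
import io

text = '"400":0.1'

# Passing a string: writerow iterates over it, so every character becomes a field.
buf = io.StringIO(newline='')
csv.writer(buf).writerow(text)
as_string = buf.getvalue()

# Passing a list: each element becomes one field.
buf = io.StringIO(newline='')
key, value = text.split(':')
csv.writer(buf).writerow([key.strip('"'), value])
as_list = buf.getvalue()

print(repr(as_string))
print(repr(as_list))
```

The first result reproduces the character-by-character output from the question, while the second is the 400,0.1 row the asker wanted.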
