Split a CSV column entry on spaces using Python

I am trying to create a new CSV file using Python. The new file should be the same as the original, except that one entry is split on a space delimiter.

My method is to open the files with read and write access respectively, skip over the headers, then write out the specific column headings I want in the csv.

Then I iterate over the rows, amending the appropriate entry and writing each row to the new file with the .writerow function.

A single iteration produces a row like ['data1', 'data2', 'data3 data4', 'data5', 'data6', 'data7' etc. ]

So in this case I select row[2] to get the 'data3 data4' entry and try to split it so that the row becomes ['data1', 'data2', 'data3', 'data4', 'data5', 'data6', 'data7' etc. ]

I have tried .split, which gives me a list within a list, and slicing, which lets me show either data3 or data4 but not both. I've also tried .replace, which gives me ['data1', 'data2', 'data3,data4', etc.]. I'm quite frustrated and wondering if anyone might give me a hint as to the probably quite simple solution that I'm missing. Full code is below.

import csv

with open('filepath', mode="rU") as infile:
    with open('filepath', mode="w") as outfile:

        csv_f = csv.reader(infile)
        next(csv_f, None)  # skip the headers

        writer = csv.writer(outfile)
        writer.writerow(['dataheader1', 'dataheader2', 'dataheader3', 'dataheader4', 'dataheader5', 'dataheader6', 'dataheader7'])  # etc.

        for row in csv_f:
            # attempt: this only swaps the space for a comma inside the same cell
            row[2] = row[2].replace(' ', ',')
            print(row)
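Assign the result of the split to a slice of the row, so that the pieces replace the single cell instead of being nested inside it:

row[2:3] = row[2].split(' ')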

Demonstration:

>>> row = ['a', 'b', 'c d e f', 'g', 'h']
>>> row[2:3] = row[2].split(' ')
>>> row
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
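To tie the demonstration back to the CSV loop in the question, here is a minimal sketch under a few assumptions: it targets Python 3, uses in-memory io.StringIO objects in place of the real files, and the helper name split_column, the sample data, and the new header names are all made up for illustration. It also shows why plain index assignment produces the "list within a list" described in the question.

```python
import csv
import io


def split_column(row, index, sep=' '):
    """Return a copy of row with the cell at index replaced by its pieces."""
    new_row = list(row)
    # Slice assignment splices the pieces in place of the single cell.
    new_row[index:index + 1] = new_row[index].split(sep)
    return new_row


# Index assignment (not slice assignment) stores the whole list in one slot,
# which is the nested-list result described in the question.
nested = ['a', 'b', 'c d', 'e']
nested[2] = nested[2].split(' ')

# Hypothetical sample data standing in for the real input file.
sample = "h1,h2,h3,h4\ndata1,data2,data3 data4,data5\n"

infile = io.StringIO(sample)
outfile = io.StringIO()

reader = csv.reader(infile)
writer = csv.writer(outfile)

next(reader, None)  # skip the original header row
writer.writerow(['h1', 'h2', 'h3a', 'h3b', 'h4'])  # hypothetical new headers
for row in reader:
    writer.writerow(split_column(row, 2))

result = outfile.getvalue()
print(result)
```

With real files, the two StringIO objects would be replaced by open(...) calls; on Python 3 the csv documentation recommends passing newline='' when opening them.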

If you don't know where the cells with spaces are, then you're looking for itertools.chain.from_iterable, which flattens the lists produced by splitting every cell into a single row:

import csv
import itertools

# Python 2 file modes; on Python 3 open both files in text mode with newline=''
with open('filepath', mode='rU') as infile, \
     open('filepath2', mode='wb') as outfile:  # this changed slightly, look!
    csv_f = csv.reader(infile)
    writer = csv.writer(outfile)
    next(csv_f)  # skip headers
    row = next(csv_f)
    # row looks like
    # ['one', 'two', 'three four', 'five', ...]

    rewritten_row = itertools.chain.from_iterable(
        [cell.split() for cell in row])  # or map(str.split, row)
    # rewritten_row looks like
    # ['one', 'two', 'three', 'four', 'five', ...]

    writer.writerow(rewritten_row)
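The snippet above rewrites a single row. A sketch of the same idea applied to every row might look like the following. It assumes Python 3 and in-memory io.StringIO objects rather than real files, and the helper name split_all_cells and the sample data are hypothetical. One caveat worth knowing: cell.split() returns an empty list for an empty cell, so empty cells vanish from the output and later columns shift left.

```python
import csv
import io
from itertools import chain


def split_all_cells(row):
    """Split every cell on whitespace and flatten the pieces into one list."""
    return list(chain.from_iterable(cell.split() for cell in row))


# Hypothetical sample data; the second data row contains an empty cell.
sample = "h1,h2,h3\none,two,three four\nfive six,,seven\n"

reader = csv.reader(io.StringIO(sample))
outfile = io.StringIO()
writer = csv.writer(outfile)

next(reader, None)  # skip headers
for row in reader:
    writer.writerow(split_all_cells(row))

flattened = outfile.getvalue()
print(flattened)
```

If empty cells need to survive, splitting with an explicit separator, as in cell.split(' '), keeps them, because ''.split(' ') returns [''] rather than an empty list.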
