
Python read .csv file - access list elements in a field

I have a .csv file with "|" as separator in this format (one line sample):

01|a|b|c|(d,e,[(f,g,h)])

I have to convert it into a new .csv file, keeping only some elements from the last field, like this:

01|a|b|c|f|g

So far I tried reading it line by line with:

import csv

# Open the output .csv and write the header row
f = open('output.csv', 'wt')

writer = csv.writer(f, delimiter='|')

writer.writerow( 
                (
                'f1',
                'f2',
                'f3',
                'f4',
                'f5',
                'f6'
                )
                )

with open('input.csv', 'r') as csvfile:

    spamreader = csv.reader(csvfile, delimiter='|')

    for row in spamreader:

        writer.writerow(
                        (
                        row[0],
                        row[1],
                        row[2],
                        row[3],
                        row[4][2][0][1],
                        row[4][2][0][2]
                        )
                        )

f.close()

So I try to access the list elements of the last field of the input.csv file, but it raises:

row[4][2][0][1], IndexError: string index out of range

So the tuple cannot actually be accessed this way. Is there a way to do this? I would use pandas, but the file is too big, so I need to read it line by line.

With the code I see there, row would be:

['01', 'a', 'b', 'c', '(d,e,[(f,g,h)])']

So the last element is a string, not a tuple with a list inside, and indexing into it only yields single characters. You would have to parse this string yourself.
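As a rough illustration of why the indexing fails, the sketch below takes the last field exactly as csv.reader would hand it over (assuming the sample line from the question) and repeats the same chain of indexes on it. The variable names `last` and `error_message` are only for this example.

```python
# The fifth field as csv.reader returns it: a plain str, not a tuple
last = "(d,e,[(f,g,h)])"

print(type(last).__name__)   # str
print(repr(last[2]))         # third character of the string: ','
print(repr(last[2][0]))      # indexing a 1-character string gives that character again

try:
    last[2][0][1]            # a 1-character string has no index 1
except IndexError as exc:
    error_message = str(exc)
    print(error_message)     # string index out of range
```

Each index just picks a character out of the text, so the nested structure is never reached.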

Well, this is going to be a very hacky way to go about it. But if your format stays the same, you could try doing:

string = "01|a|b|c|(d,e,[(f,g,h)])"  # one raw line of input.csv
parts = string.split('|')
a = parts[:-1]                               # ['01', 'a', 'b', 'c']
b = parts[-1].split('(')[2].split(',')[:2]   # ['f', 'g']
result = a + b
print(result)

result: ['01', 'a', 'b', 'c', 'f', 'g']

Then you could use csv writer to write the list with: writer.writerow(result)
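Putting the pieces together, here is a minimal sketch of the whole conversion, assuming every line has exactly the layout of the sample and reusing the same split-based trick. The helper names `extract_first_two` and `convert` are made up for this example, and the demo runs on in-memory data instead of real files so it can be tried directly.

```python
import csv
import io


def extract_first_two(field):
    """Return the first two items of the inner tuple in '(d,e,[(f,g,h)])'."""
    inner = field.split('(')[2]      # 'f,g,h)])'
    return inner.split(',')[:2]      # ['f', 'g']


def convert(src, dst):
    """Copy rows from src to dst one at a time, so the file never sits in memory."""
    reader = csv.reader(src, delimiter='|')
    writer = csv.writer(dst, delimiter='|', lineterminator='\n')
    writer.writerow(['f1', 'f2', 'f3', 'f4', 'f5', 'f6'])
    for row in reader:
        writer.writerow(row[:4] + extract_first_two(row[4]))


# Demo on in-memory text standing in for input.csv / output.csv
src = io.StringIO("01|a|b|c|(d,e,[(f,g,h)])\n")
dst = io.StringIO()
convert(src, dst)
print(dst.getvalue())
```

With real files, the same function should work when given file objects opened with newline='' (for example open('input.csv', newline='') and open('output.csv', 'w', newline='')). If the last field can vary in shape, the split-based helper would need to be replaced by a proper parser.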

