
python read .csv file - access list elements in a field

I have a .csv file that uses "|" as the separator, in this format (one sample line):

01|a|b|c|(d,e,[(f,g,h)])

I have to convert it into a new .csv file, keeping only some elements from the last field, like this:

01|a|b|c|f|g

So far I have tried reading it line by line with:

import csv

# Open the output .csv file and set up a "|"-delimited writer
f = open('output.csv', 'wt')

writer = csv.writer(f, delimiter='|')

# Write the header row
writer.writerow(
                (
                'f1',
                'f2',
                'f3',
                'f4',
                'f5',
                'f6'
                )
                )

with open('input.csv', 'r') as csvfile:

    spamreader = csv.reader(csvfile, delimiter='|')

    for row in spamreader:

        writer.writerow(
                        (
                        row[0],
                        row[1],
                        row[2],
                        row[3],
                        row[4][2][0][1],
                        row[4][2][0][2]
                        )
                        )

f.close()

This tries to index into the list elements of the last field of input.csv, but it raises:

row[4][2][0][1], IndexError: string index out of range

So it is not actually possible to access the tuple this way. Is there a way to do this? I would use pandas, but the file is too big, so I need to read it line by line.

With the code shown, each row would be a list of strings. For the sample line:

['01', 'a', 'b', 'c', '(d,e,[(f,g,h)])']

So the last element is a string, not a tuple with a list inside it, and indexing it only returns single characters. You would have to parse this string yourself.
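To make that concrete, here is a minimal sketch of one way to parse the string. The helper name `first_two_inner_items` and the regular expression are illustrative assumptions, not part of the original code, and the sketch assumes the inner items never contain commas or parentheses themselves.

```python
import re

# Hypothetical pattern: captures the contents of the "[( ... )]" group.
# Assumes the inner items contain no commas or parentheses.
INNER_TUPLE = re.compile(r"\[\(([^()]*)\)\]")


def first_two_inner_items(field):
    """Return the first two items of the innermost tuple in the field."""
    match = INNER_TUPLE.search(field)
    if match is None:
        raise ValueError("unexpected field format: %r" % field)
    items = [item.strip() for item in match.group(1).split(",")]
    return items[:2]


row = ['01', 'a', 'b', 'c', '(d,e,[(f,g,h)])']

# row[4] is a str, so row[4][2] is the single character ','.
# Indexing that one-character string with [0][1] is what raises IndexError.
print(repr(row[4][2]))

print(first_two_inner_items(row[4]))  # ['f', 'g']
```

Raising on a non-matching field is a design choice for this sketch: with a very large file, it is usually better to find malformed lines early than to write wrong values silently.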

This is a very hacky way to go about it, but if your format always stays the same, you could try:

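# One raw line from the input file
line = "01|a|b|c|(d,e,[(f,g,h)])"
fields = line.split('|')
# Everything except the last field is kept as-is
a = fields[:-1]
# Take the text after the second "(" and keep its first two comma-separated items
b = fields[-1].split('(')[2:][0].split(',')[:2]
result = a + b
print(result)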

result: ['01', 'a', 'b', 'c', 'f', 'g']

Then you could use the csv writer to write the list with: writer.writerow(result)
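Putting the pieces together, the whole conversion could be sketched as below. The function name `convert` is hypothetical, the header names are taken from the question, and the split-based parsing assumes every line has exactly the shape of the sample. The file is still processed one row at a time, so it never has to fit in memory.

```python
import csv


def convert(in_path, out_path):
    """Stream in_path row by row and write the reduced rows to out_path."""
    with open(in_path, newline='') as src, \
            open(out_path, 'w', newline='') as dst:
        reader = csv.reader(src, delimiter='|')
        writer = csv.writer(dst, delimiter='|')
        writer.writerow(['f1', 'f2', 'f3', 'f4', 'f5', 'f6'])
        for row in reader:
            if not row:
                continue  # skip blank lines
            # The last field is a string such as "(d,e,[(f,g,h)])":
            # take the text after the second "(" and keep its first two items.
            inner = row[-1].split('(')[2].split(',')[:2]
            writer.writerow(row[:-1] + inner)


# Example call (file names as in the question):
# convert('input.csv', 'output.csv')
```

Opening both files with `newline=''` follows the csv module's recommendation and avoids extra blank lines in the output on Windows.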
