
How to reconstruct array from string representation of array?

I've got a huge CSV that was generated by a Python script. Some cells include arrays of data, while others include single-item arrays. Some examples:

cell01 == ['"July, 2002"', 'CUREE Publication No. CEA-01.', 'Project No. 3126', 'Prepared for Consortium of Universities for Research in Earthquake Engineering.']
cell02 == ['[Memorandum from Ralph J. Johnson on Andy Place].']
cell03 == ["Financial statements for the years ended March 31, 1991 and 1990 and independent auditors' report"]

Ideally, I'd like to parse all this data into structures that look like the following:

cell01_parsed[0] == '"July, 2002"'
cell01_parsed[1] == 'CUREE Publication No. CEA-01.'
cell01_parsed[2] == 'Project No. 3126'
cell01_parsed[3] == 'Prepared for Consortium of Universities for Research in Earthquake Engineering.'

cell02_parsed == '[Memorandum from Ralph J. Johnson on Andy Place].'

cell03_parsed == 'Financial statements for the years ended March 31, 1991 and 1990 and independent auditors\' report'

However, when I use csv.reader() or csv.DictReader(), these cells are parsed as strings, not arrays. What would be an easy way to do this? I can't use split(',') since some of the strings have commas in the middle of items.

You could try splitting your strings with a regex (find one that fits your data), like so:

import re

test_str = '"July, 2002", CUREE Publication No. CEA-01.'
# Split only on commas that have no double quote anywhere after them
pattern = re.compile(r',(?!.+")')
print(pattern.split(test_str))
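A different approach, if the cells really are the text of Python lists (which seems likely since the CSV came from a Python script, but that is an assumption about your data), is to let Python parse them with ast.literal_eval from the standard library. The sketch below is not from the original answer; parse_cell is a hypothetical helper name, and the sample cells are shortened copies of the ones in the question.

```python
import ast
import csv
import io


def parse_cell(cell):
    """Turn the text of a Python list literal back into Python objects.

    Assumes the cell holds a valid list literal; ast.literal_eval raises
    ValueError or SyntaxError if it does not.
    """
    items = ast.literal_eval(cell)
    # Unwrap single-item lists, as the question asks for
    return items[0] if len(items) == 1 else items


cell01 = """['"July, 2002"', 'CUREE Publication No. CEA-01.', 'Project No. 3126']"""
cell02 = "['[Memorandum from Ralph J. Johnson on Andy Place].']"
cell03 = '''["Financial statements for the years ended March 31, 1991 and 1990 and independent auditors' report"]'''

# Round trip through an in-memory CSV to mimic reading the real file
buf = io.StringIO()
csv.writer(buf).writerow([cell01, cell02, cell03])
buf.seek(0)
row = next(csv.reader(buf))

parsed = [parse_cell(cell) for cell in row]
print(parsed[0][0])
print(parsed[1])
```

Commas and quotes inside items are handled by the parser, so no hand-written splitting is needed. literal_eval only accepts literals, so it will not run arbitrary code from the file, but cells that are not valid literals need a try/except around the call.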
