Python: How to read a CSV file in which each line is a string?
I am trying to read a CSV file. I need to access the keys and values in each line.
"{id:495981,start:""2020-09-23"",end:""2020-09-23"",something:point({srid:4326, x:10.96791704, y:49.7989944})}"
"{id:49963,start:""2020-09-23"",end:""2020-09-23"",something:point({srid:4326, x:10.96791704, y:49.7989944})}"
As shown above, each line is a string. What I want to do is read the value of id in each line. Reading the file with "pandas.read_csv" returns something like this:
{id:495981 end:""2020-09-23"" start:""2020-09-23"" \
0 {id:49963 end:""2020-09-23"" start:""2020-09-23""
...
something:point({srid:4326 x:7.138 y:51.594})}
0 something:point({srid:4326 x:10.96791704 y:49.7989944})}
[31264 rows x 6 columns]
Any suggestions?
You could use regex here to pull each result out of the string, since splitting would include the extra characters that I'm assuming you want to exclude.
import re

data = {}
with open('mycsvfile.csv', 'r') as file:
    for line in file:
        # Lookbehind/lookahead keep only the text between the markers.
        line_id = re.search(r'(?<=id:)[0-9]+(?=,)', line).group(0)
        line_data = {
            'start': re.search(r'(?<=start:"").*(?="",end)', line).group(0),
            'end': re.search(r'(?<=end:"").*(?="",something)', line).group(0),
            'something': re.search(r'(?<=something:).*(?=}")', line).group(0),
        }
        data[line_id] = line_data
print(data)
This will result in a dict with all ids as keys, each key mapping to another dict with all the values in the string.
{'495981': {'start': '2020-09-23', 'end': '2020-09-23', 'something': 'point({srid:4326, x:10.96791704, y:49.7989944})'},
'49963': {'start': '2020-09-23', 'end': '2020-09-23', 'something': 'point({srid:4326, x:10.96791704, y:49.7989944})'}}
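If only the id values are needed, a shorter variant may be enough. The sketch below is not part of the original answer: it assumes every row is a single quoted CSV field, as in the sample lines from the question, and lets the standard csv module undo the quoting before a regex picks out the id. The names read_ids and ID_PATTERN are hypothetical, and rows without an id are skipped instead of raising an AttributeError.

```python
import csv
import io
import re

# Hypothetical pattern: \b keeps "srid:" from being mistaken for "id:".
ID_PATTERN = re.compile(r"\bid:(\d+)")


def read_ids(lines):
    """Return the id of every row as an int, skipping rows without an id."""
    ids = []
    # csv.reader removes the outer quotes and turns "" back into ".
    for row in csv.reader(lines):
        if not row:
            continue  # blank line
        match = ID_PATTERN.search(row[0])
        if match:
            ids.append(int(match.group(1)))
    return ids


# The two sample lines from the question, held in memory instead of a file.
sample = io.StringIO(
    '"{id:495981,start:""2020-09-23"",end:""2020-09-23"",'
    'something:point({srid:4326, x:10.96791704, y:49.7989944})}"\n'
    '"{id:49963,start:""2020-09-23"",end:""2020-09-23"",'
    'something:point({srid:4326, x:10.96791704, y:49.7989944})}"\n'
)
print(read_ids(sample))  # [495981, 49963]
```

To run it against a real file, pass an open file object instead of the in-memory sample, for example open('mycsvfile.csv', newline='').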