
Python dict in a comma-separated CSV file

A Python dict is in a format like this:

'{"a":1, "b":2, "c":3}'

Notice that it uses commas to separate the key:value pairs.

The problem is that I have a CSV file whose columns are also separated by commas:

'
  "id",   "gender",   "age",    "name"
 "001",     "male",    "14",    "{"first":"Mike", "last":"Green"}"
 "002",   "female",    "15",    "{"first":"Kate", "last":"Spear"}"
'

When I run
pandas.read_csv('csvfile.csv', sep = ',', names=["id", "gender", "age", "name"])

I get:

'
  "id",   "gender",   "age",    "name"
 "001",     "male",    "14",    "{"first":"Mike"
 "002",   "female",    "15",    "{"first":"Kate"
'

My guess is that the CSV reader treats the comma after the first name in the dict as a column separator. Since I specified only 4 column names ("id", "gender", "age", "name"), it drops the last names.

Any thoughts or possible solutions? Thanks!
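To make the guess concrete, here is a minimal sketch that splits one row of the file on every comma. It is only an illustration of the idea, not the tokenizer pandas actually uses; the variable names are made up for the example.

```python
# One data row from the question, copied as-is (inner quotes are unescaped).
line = '"001",     "male",    "14",    "{"first":"Mike", "last":"Green"}"'

# Split on every comma, the way a parser does when it does not
# recognise the last field as a single quoted value.
parts = [p.strip() for p in line.split(",")]

print(len(parts))  # 5 pieces, although only 4 columns were intended
for p in parts:
    print(p)
```

The dict is cut in two at its inner comma, so the fourth piece ends after "Mike" and the last name lands in an unexpected fifth piece.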

You can change the delimiter that read_csv uses. If you can change the CSV files to use a semicolon to separate columns, you can then call read_csv('file.csv', sep=';', ...).

Alternatively, you can fix the quoting, from
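A rough sketch of the semicolon idea, using only the standard library so it runs without pandas. The semicolon-delimited data below is a hypothetical rewrite of the question's file. With default quote handling the unescaped inner quotes may still be misread, so this sketch turns quote processing off and strips the outer quotes by hand; pandas.read_csv accepts the corresponding sep and quoting arguments.

```python
import csv
import io
import json

# Hypothetical semicolon-delimited version of the file from the question.
data = '''"id";"gender";"age";"name"
"001";"male";"14";"{"first":"Mike", "last":"Green"}"
"002";"female";"15";"{"first":"Kate", "last":"Spear"}"
'''

# QUOTE_NONE treats '"' as an ordinary character, so fields are split on ';' only.
reader = csv.reader(io.StringIO(data), delimiter=";", quoting=csv.QUOTE_NONE)

# Strip surrounding whitespace and the outer double quotes from every field.
header, *rows = [[field.strip().strip('"') for field in row] for row in reader]

records = []
for row in rows:
    record = dict(zip(header, row))
    # The name column now holds valid JSON text, so it can be decoded.
    record["name"] = json.loads(record["name"])
    records.append(record)

print(records[0]["name"]["last"])  # Green
```

This assumes no value contains a semicolon; if one does, a different delimiter would be needed.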

"001",     "male",    "14",    "{"first":"Mike", "last":"Green"}"

to

"001",     "male",    "14",    "{'first':'Mike', 'last':'Green'}"

Of course, both methods mean editing the CSV file.
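For the second option, once the inner quotes are single quotes the file can be read with a normal CSV parser. The sketch below uses the standard csv module on an in-memory copy of the corrected data (an assumption made for the example). skipinitialspace=True is needed because the file pads columns with spaces after each comma; pandas.read_csv has a parameter of the same name. The name field is then a Python literal rather than JSON, so ast.literal_eval is used to decode it.

```python
import ast
import csv
import io

# The question's file after the quoting fix (inner quotes are now single quotes).
data = '''"id",   "gender",   "age",    "name"
"001",     "male",    "14",    "{'first':'Mike', 'last':'Green'}"
"002",   "female",    "15",    "{'first':'Kate', 'last':'Spear'}"
'''

# skipinitialspace=True ignores the padding after each comma, so the
# double quote that follows is recognised as the start of a quoted field.
reader = csv.DictReader(io.StringIO(data), skipinitialspace=True)

people = []
for row in reader:
    # Single-quoted text is not valid JSON, but it is a valid Python literal.
    row["name"] = ast.literal_eval(row["name"])
    people.append(row)

print(people[1]["name"]["first"])  # Kate
```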

The second option looks sounder. The regular expression (\\{[^"]*)(")([^}]*\\}) could be used to match quotes inside braces. (untested)
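One way this regular expression might be applied is sketched below. It assumes the doubled backslashes are string-literal escaping, so the pattern is written as a raw string with single backslashes. Each match captures only one inner quote, so the substitution is repeated until nothing changes. The function name is made up for the example, and the sketch works one line at a time on the assumption that each line holds a single {...} value.

```python
import re

# The pattern from the comment, with \\ reduced to \ in a raw string.
# Group 2 is a single double quote located between '{' and '}'.
INNER_QUOTE = re.compile(r'(\{[^"]*)(")([^}]*\})')


def fix_inner_quotes(line):
    """Replace every double quote between { and } with a single quote."""
    while True:
        # Each pass fixes only the first remaining inner quote.
        line, count = INNER_QUOTE.subn(r"\1'\3", line)
        if count == 0:
            return line


raw = '"001",     "male",    "14",    "{"first":"Mike", "last":"Green"}"'
fixed = fix_inner_quotes(raw)
print(fixed)
```

Applying the function to a whole multi-line string at once would not be safe, because [^}]* can run across line breaks into the next row.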
