I have a CSV document which has a column where each cell contains a list of dicts. Any advice on how to extract that data while keeping it as lists of dicts would be appreciated. I've tried the usual json/pandas/csv read-ins and none of them seem to work properly (converts to strings/unicode, which isn't surprising but is still frustrating). Ultimately, I'd like the output to be a dataframe, where the header row is the keys and each following row is the data.
Sample CSV Section:
1 results
2 [{"y": 47, "type": "square"}, {"type": "square", "b": 49}, {"type": "square", "z": 29}, {"a": 69, "type": "square"}, {"type": "square", "x": 81}]
3 [{"type": "circle", "b": 90}, {"y": 12, "type": "circle"}, {"a": 78, "type": "circle"}, {"type": "circle", "c": 74}, {"type": "circle", "x": 14}, {"type": "circle", "z": 19}]
4 [{"type": "square", "b": 85}, {"type": "square", "x": 73}, {"type": "square", "c": 50}]
5 [{"type": "triangle", "c": 71}, {"type": "triangle", "z": 66}, {"type": "triangle", "x": 16}, {"type": "triangle", "b": 38}, {"y": 67, "type": "triangle"}, {"a": 80, "type": "triangle"}]
Sample Output:
       type    a   b    c   x    y    z
0    square   69  49  NaN  81   47   29
1    circle   78  90   74  14   12   19
2    square  NaN  85   50  73  NaN  NaN
3  triangle   80  38   71  16   67   66
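Evaluating each line in the file and merging the dicts on that line into a single dict gets you the desired result (note that `eval` executes whatever the line contains, so only use it on files you trust):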
    import pandas as pd

    with open(filename) as fobj:
        next(fobj)  # skip first line with word `results`
        data = [eval(line) for line in fobj if line.strip()]

    res = []
    for entry in data:
        # merge all dicts from one line into a single dict
        d = entry[0].copy()
        for x in entry[1:]:
            d.update(x)
        res.append(d)

    df = pd.DataFrame(res)
    # reindex_axis was removed from pandas; reindex returns a new frame
    df = df.reindex(columns=['type', 'a', 'b', 'c', 'x', 'y', 'z'])
    df
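If you would rather not call `eval` at all, the same merge can be sketched with `json.loads`. This assumes each line is valid JSON exactly as in the sample (double-quoted keys, no extra CSV quoting around the cell). The helper name `rows_from_lines` and the in-memory `sample` are made up for illustration and stand in for the real file:

```python
import io
import json

import pandas as pd


def rows_from_lines(lines):
    """Merge each line's list of dicts into one flat dict per line."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line or line == "results":  # skip blanks and the header line
            continue
        merged = {}
        for d in json.loads(line):  # assumes the line is a JSON array of objects
            merged.update(d)
        rows.append(merged)
    return rows


# Hypothetical stand-in for the real file, built from rows of the sample
sample = io.StringIO(
    'results\n'
    '[{"type": "square", "b": 85}, {"type": "square", "x": 73}, {"type": "square", "c": 50}]\n'
    '[{"type": "triangle", "c": 71}, {"y": 67, "type": "triangle"}]\n'
)
rows = rows_from_lines(sample)
df = pd.DataFrame(rows)
# Put `type` first and the remaining keys in alphabetical order
df = df.reindex(columns=["type"] + sorted(c for c in df.columns if c != "type"))
print(df)
```

Keys that are missing on a given line come out as NaN, which is the behaviour shown in the sample output. If the cells are quoted by a CSV writer (doubled quotes inside the cell), read them with the `csv` module first and pass the cell text to `json.loads`.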
If there is unwanted text on these lines, you can remove everything outside the []:
    eval('[' + line.split('[')[-1].split(']')[0] + ']')
Alternatively, you can use a regular expression:
    import re
    eval(re.findall(r'\[.*?\]', line)[0])
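A slightly more defensive version of the regular-expression idea might look like the sketch below. It is not the code from the answer: it swaps `eval` for `json.loads`, uses a greedy pattern so the match runs from the first `[` to the last `]` on the line, and returns `None` instead of raising when a line has no brackets. The name `parse_bracketed` and the example line are invented for illustration:

```python
import json
import re

# Greedy match from the first "[" to the last "]" on the line
BRACKETED = re.compile(r"\[.*\]")


def parse_bracketed(line):
    """Return the list found between the outermost [ ] on `line`, or None."""
    match = BRACKETED.search(line)
    if match is None:
        return None
    # Assumes the bracketed text is valid JSON, as in the sample rows
    return json.loads(match.group(0))


line = '2 [{"y": 47, "type": "square"}, {"type": "square", "b": 49}] trailing text'
print(parse_bracketed(line))
print(parse_bracketed("no list here"))
```

The greedy pattern only makes sense when each line holds a single list; if several separate lists can appear on one line, the non-greedy `\[.*?\]` form is the safer choice.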