List of dictionaries in a CSV

Question

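I have a CSV file containing a column in which each cell holds a list of dictionaries. Any suggestions on how to extract the data while keeping it as a list of dictionaries would be appreciated. I have tried the usual json / pandas / csv readers, but none of them seem to work (they convert the cells to strings/unicode, which is not surprising but still frustrating). Ultimately, I want the output to be a DataFrame in which the header row holds the keys and each subsequent row holds the data.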

Sample CSV section:

1    results

2    [{"y": 47, "type": "square"}, {"type": "square", "b": 49}, {"type": "square", "z": 29}, {"a": 69, "type": "square"}, {"type": "square", "x": 81}]

3    [{"type": "circle", "b": 90}, {"y": 12, "type": "circle"}, {"a": 78, "type": "circle"}, {"type": "circle", "c": 74}, {"type": "circle", "x": 14}, {"type": "circle", "z": 19}]

4    [{"type": "square", "b": 85}, {"type": "square", "x": 73}, {"type": "square", "c": 50}]

5    [{"type": "triangle", "c": 71}, {"type": "triangle", "z": 66}, {"type": "triangle", "x": 16}, {"type": "triangle", "b": 38}, {"y": 67, "type": "triangle"}, {"a": 80, "type": "triangle"}]

Sample output:

  type      a   b   c   x   y   z
0 square    69  49  NaN 81  47  29
1 circle    78  90  74  14  12  19
2 square    NaN 85  50  73  NaN NaN
3 triangle  80  38  71  16  67  66

Answer 1

Evaluating each line in the file and doing a bit of dictionary work gives the desired result:

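import ast

import pandas as pd

with open(filename) as fobj:
    next(fobj)  # skip the first line, which holds the word `results`
    # literal_eval parses each list of dicts without executing arbitrary code
    data = [ast.literal_eval(line) for line in fobj if line.strip()]

res = []
for entry in data:
    # merge all dicts of one row into a single dict
    d = entry[0].copy()
    for x in entry[1:]:
        d.update(x)
    res.append(d)

df = pd.DataFrame(res)
# put the columns in the desired order
df = df.reindex(columns=['type', 'a', 'b', 'c', 'x', 'y', 'z'])
df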
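
For a version that can be tried without a file on disk, the same merge can be sketched with `json.loads` on an in-memory sample. The names `SAMPLE`, `merge_row` and `load_rows` are made up for illustration, and the sketch assumes every data line is valid JSON with no leading row number.

```python
import io
import json

# Header plus two shortened rows based on the sample CSV in the question.
SAMPLE = '''results
[{"type": "square", "b": 85}, {"type": "square", "x": 73}, {"type": "square", "c": 50}]
[{"type": "circle", "b": 90}, {"y": 12, "type": "circle"}]
'''


def merge_row(dicts):
    """Merge a list of dicts into one dict; later keys overwrite earlier ones."""
    merged = {}
    for d in dicts:
        merged.update(d)
    return merged


def load_rows(fobj):
    """Skip the header line, then parse and merge each non-empty line."""
    next(fobj)  # header line containing the word `results`
    return [merge_row(json.loads(line)) for line in fobj if line.strip()]


rows = load_rows(io.StringIO(SAMPLE))
print(rows[0])
```

Passing `rows` to `pd.DataFrame(rows, columns=['type', 'a', 'b', 'c', 'x', 'y', 'z'])` should then build the frame, with NaN wherever a row lacks a key.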

If the lines contain extra text that you don't need, you can keep only the part between [ and ]:

import ast

ast.literal_eval('[' + line.split('[')[-1].split(']')[0] + ']')
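
As a quick check of the slicing approach, here is a sketch on one line that carries a leading row number. The helper name `extract_list` is hypothetical, and it assumes exactly one `[...]` list per line with no nested brackets.

```python
import ast


def extract_list(line):
    """Return the list literal between the last '[' and the ']' that follows it."""
    inner = line.split('[')[-1].split(']')[0]
    return ast.literal_eval('[' + inner + ']')


# A shortened sample line with a row number in front of the list.
line = '4    [{"type": "square", "b": 85}, {"type": "square", "x": 73}]\n'
parsed = extract_list(line)
print(parsed)
```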

Alternatively, you can use a regular expression:

import ast
import re

ast.literal_eval(re.findall(r'\[.*?\]', line)[0])
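
The regular-expression variant can be wrapped so that lines without a list, such as the header, are handled explicitly. This is only a sketch: `LIST_RE` and `parse_line` are illustrative names, `json.loads` is swapped in as the parser, and the non-greedy pattern assumes the list contains no nested brackets.

```python
import json
import re

# Non-greedy match from the first '[' to the first ']' after it.
LIST_RE = re.compile(r'\[.*?\]')


def parse_line(line):
    """Return the parsed list found in `line`, or None if there is none."""
    match = LIST_RE.search(line)
    if match is None:
        return None
    return json.loads(match.group(0))


header = parse_line('1    results')
row = parse_line('3    [{"type": "circle", "b": 90}, {"y": 12, "type": "circle"}]')
print(header, row)
```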

Solution 1
2 Accepted 2016-12-29 16:46:46
