Pandas groupby and get a list of dicts per group

Question

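I am trying to extract row data grouped by label, so that I can write it to another file and use the values to color the labels when plotting.

My dataframe looks like this.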

df = pd.DataFrame({'x': [1, 4, 5], 'y': [3, 2, 5], 'label': [1.0, 1.0, 2.0]})

    x   y   label
0   1   3   1.0
1   4   2   1.0
2   5   5   2.0

I want to get a list of records for each label:

{'1.0': [{'index': 0, 'x': 1, 'y': 3}, {'index': 1, 'x': 4, 'y': 2}],
 '2.0': [{'index': 2, 'x': 5, 'y': 5}]}

How can I do this?

Answer 1

import pandas as pd

df = pd.DataFrame({'x': [1, 4, 5], 'y': [3, 2, 5], 'label': [1.0, 1.0, 2.0]})
df['index'] = df.index
df
   label  x  y  index
0    1.0  1  3      0
1    1.0  4  2      1
2    2.0  5  5      2

df['dict']=df[['x','y','index']].to_dict("records")
df
   label  x  y  index                             dict
0    1.0  1  3      0  {u'y': 3, u'x': 1, u'index': 0}
1    1.0  4  2      1  {u'y': 2, u'x': 4, u'index': 1}
2    2.0  5  5      2  {u'y': 5, u'x': 5, u'index': 2}

df = df[['label','dict']].copy()  # copy so the assignment below does not trigger SettingWithCopyWarning
df['label'] = df['label'].apply(str)  # convert the float column 'label' to string
df = df.groupby('label')['dict'].apply(list)
desired_dict = df.to_dict()
desired_dict
    {'1.0': [{'index': 0, 'x': 1, 'y': 3}, {'index': 1, 'x': 4, 'y': 2}],
     '2.0': [{'index': 2, 'x': 5, 'y': 5}]}
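
The same steps can also be written without the helper 'dict' column. The following is only a sketch of one possible variant, not part of the original answer; the name `grouped` is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 4, 5], 'y': [3, 2, 5], 'label': [1.0, 1.0, 2.0]})

# Turn the index into a regular 'index' column, then build one list of
# records per label. The label is dropped from each record and used,
# converted to a string, as the dictionary key instead.
grouped = {
    str(label): group.drop(columns='label').to_dict('records')
    for label, group in df.reset_index().groupby('label')
}
print(grouped)
```

Dropping 'label' before calling to_dict leaves only integer columns in each group, so the record values stay integers.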

Answer 2

You can use collections.defaultdict together with to_dict:

from collections import defaultdict

# add 'index' series
df = df.reset_index()

# initialise defaultdict
dd = defaultdict(list)

# iterate and append
for d in df.to_dict('records'):
    dd[d['label']].append(d)

Result:

print(dd)

defaultdict(list,
            {1.0: [{'index': 0.0, 'x': 1.0, 'y': 3.0, 'label': 1.0},
                   {'index': 1.0, 'x': 4.0, 'y': 2.0, 'label': 1.0}],
             2.0: [{'index': 2.0, 'x': 5.0, 'y': 5.0, 'label': 2.0}]})

Usually there is no need to convert back to a regular dict, because defaultdict is a subclass of dict.
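
To illustrate the subclass point, here is a small sketch that passes the defaultdict straight to code expecting a dict. Using json as the output format is an assumption; the question only says the data goes to another file. The names `payload` and `restored` are illustrative.

```python
import json
from collections import defaultdict

import pandas as pd

df = pd.DataFrame({'x': [1, 4, 5], 'y': [3, 2, 5], 'label': [1.0, 1.0, 2.0]})

# Group the row records by label, as in the answer above.
dd = defaultdict(list)
for d in df.reset_index().to_dict('records'):
    dd[d['label']].append(d)

# defaultdict subclasses dict, so dict-consuming code accepts it directly.
is_dict = isinstance(dd, dict)

# json.dumps serializes the float keys as strings ('1.0', '2.0').
payload = json.dumps(dd)
restored = json.loads(payload)
print(sorted(restored))
```

An explicit dict(dd) is mainly useful for a plain printed representation, or to stop missing keys from being created silently on lookup.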

Answer 3

You can use itertuples and defaultdict:

itertuples returns named tuples for iterating over the dataframe:

for row in df.itertuples():
    print(row)
Pandas(Index=0, x=1, y=3, label=1.0)
Pandas(Index=1, x=4, y=2, label=1.0)
Pandas(Index=2, x=5, y=5, label=2.0)

So, making use of this:

from collections import defaultdict

dictionary = defaultdict(list)
for row in df.itertuples():
    # build a fresh dict for every row; reusing a single dict would make all entries alias the last row
    dummy = {'x': row.x, 'y': row.y, 'index': row.Index}
    dictionary[row.label].append(dummy)

dict(dictionary)
> {1.0: [{'x': 1, 'y': 3, 'index': 0}, {'x': 4, 'y': 2, 'index': 1}],
 2.0: [{'x': 5, 'y': 5, 'index': 2}]}

Answer 4

The fastest solution for what you want is almost the same as the one provided by @cph_sto:

>>> df.reset_index().to_dict('records')
[{'index': 0.0, 'label': 1.0, 'x': 1.0, 'y': 3.0}, {'index': 1.0, 'label': 1.0, 'x': 4.0, 'y': 2.0}, {'index': 2.0, 'label': 2.0, 'x': 5.0, 'y': 5.0}]

That is, convert the index into a regular column and then apply the 'records' version of to_dict. Another option that may be of interest:

>>> df.to_dict('index')
{0: {'label': 1.0, 'x': 1.0, 'y': 3.0}, 1: {'label': 1.0, 'x': 4.0, 'y': 2.0}, 2: {'label': 2.0, 'x': 5.0, 'y': 5.0}}

For more information, see the help on to_dict.
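
Both calls above return the rows ungrouped. As a minimal sketch (an addition, not part of the original answer), the flat records can be bucketed by label with a plain dict afterwards; the names `records`, `by_index` and `grouped` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 4, 5], 'y': [3, 2, 5], 'label': [1.0, 1.0, 2.0]})

# 'records' orientation: one dict per row, with the index as a regular column.
records = df.reset_index().to_dict('records')

# 'index' orientation: the outer keys are the index values.
by_index = df.to_dict('index')

# Bucket the flat records by label; pop() removes 'label' from each record
# so it only appears as the (string) key of the outer dict.
grouped = {}
for rec in records:
    label = rec.pop('label')
    grouped.setdefault(str(label), []).append(rec)
print(grouped)
```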

4 answers

Answer 1
5 2018-12-25 11:28:01

Answer 2
2 2018-12-25 12:50:54

Answer 3
1 accepted 2018-12-25 11:28:41

Answer 4
1 2018-12-25 11:52:45
