Dataframe to dictionary, grouped, with other columns as the keys/values

Question

Back on my BS, friends. I have a dataframe like so:

+-------+--------------------------+------+-------+------+
| index |                specialty | code | count | rank |
+-------+--------------------------+------+-------+------+
| 19    | Colon and Rectal Surgery | 1557 | 36    | 5.0  |
+-------+--------------------------+------+-------+------+
| 22    | Surgical Oncology        | 1557 | 22    | 14.0 |
+-------+--------------------------+------+-------+------+
| 147   | Hematology               | 2057 | 383   | 13.0 |
+-------+--------------------------+------+-------+------+
| 753   | Oncology                 | 1578 | 74    | 15.0 |
+-------+--------------------------+------+-------+------+
| 1089  | Dental General Practice  | 1257 | 6     | 2.5  |
+-------+--------------------------+------+-------+------+

There are multiple entries per specialty, i.e. I have the count and rank of codes for Specialty X up to Rank 25.

I'm trying to use a lambda function to group by specialty, but I can't figure out how to add the columns as the keys/values and create a list of dicts rather than just one giant dict.

d = (df2.groupby('specialty').apply(lambda x: dict(zip(x['code'], x['Rank']))).to_dict())

print(d)

{'Acute Care Hospital': {
    1562: 8.0, 
    1554: 11.0, 
    6095: 8.0, 
    119114: 1.0, 
    119117: 5.5, 
    284051: 4.0, 
    562577: 11.0, 
    582646: 8.0, 
    1631305: 2.0, 
    1641114: 5.5, 
    1751592: 3.0, 
    1873207: 11.0
}

How do I preserve the column names as the keys, like so, with one list per specialty:

[
    {'specialty': 'Acute Care Hospital', 
    [
        {'code': 1562, 'rank': 8.0, 
        'code': 1554, 'rank': 11.0, 
        'code': 6095, 'rank': 8.0, 
        'code': 119114, 'rank': 1.0, 
        'code': 119117, 'rank': 5.5, 
        'code': 284051, 'rank': 4.0, 
        'code': 562577, 'rank': 11.0, 
        'code': 582646, 'rank': 8.0, 
        'code': 1631305, 'rank': 2.0, 
        'code': 1641114, 'rank': 5.5, 
        'code': 1751592, 'rank': 3.0, 
        'code': 1873207, 'rank': 11.0}
    ]
    }
]

Answer 1

The outcome that you posted will not work, as it contains duplicate keys. The solutions below might be what you are after, as they make the code/rank pairs accessible from the dictionary.
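To see why the duplicate keys are a problem, here is a minimal sketch in plain Python. The three code/rank pairs are copied from the question; the variable name `desired` is just for illustration.

```python
# A dict literal with repeated keys is legal Python, but each repeated key
# overwrites the previous value, so only the last pair survives.
desired = {'code': 1562, 'rank': 8.0,
           'code': 1554, 'rank': 11.0,
           'code': 6095, 'rank': 8.0}

print(desired)       # {'code': 6095, 'rank': 8.0}
print(len(desired))  # 2
```

Each code/rank pair therefore needs its own dict (or its own position in a list or array) if all of them are to be kept.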

This one creates another dictionary level underneath the main specialty level, where matching code and rank values sit at the same index in the two arrays:

df.groupby('specialty').apply(lambda x: {'code':x['code'].values,'Rank':x['Rank'].values}).to_dict()

Or the next one simply takes each group from the groupby and places its code and Rank columns under the key code_rank_pair:

df.groupby('specialty').apply(lambda x: {'code_rank_pair':x.loc[:,['code','Rank']]}).to_dict()
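If what you want is closer to the shape in the question (a list with one entry per specialty, each holding a list of code/rank dicts), something along these lines may work. This is only a sketch: the sample frame is made up, the key name `codes` is my own choice, and it assumes the column is called `Rank` as in your code rather than `rank` as in the printed table.

```python
import pandas as pd

# Small illustrative frame using the question's column names (values are made up).
df = pd.DataFrame({
    'specialty': ['Acute Care Hospital', 'Acute Care Hospital', 'Hematology'],
    'code': [1562, 1554, 2057],
    'Rank': [8.0, 11.0, 13.0],
})

# One dict per specialty; 'codes' (a hypothetical key name) holds one
# {'code': ..., 'Rank': ...} record per row of that group.
result = [
    {'specialty': name, 'codes': group[['code', 'Rank']].to_dict('records')}
    for name, group in df.groupby('specialty')
]

print(result)
```

Because every pair lives in its own dict inside the list, no keys collide, and iterating over the groupby directly avoids going through apply.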

Answer 1 (accepted, 2020-08-06 08:23:28)