Populate a pandas DataFrame column based on another column and dictionary values

Question

I have a data frame that contains a column called DIAGNOSES. Each row of this column holds a list of one or more strings, and each string starts with a letter.

For every string in each row of DIAGNOSES, I want to take its first character, look that character up in a dictionary, and populate a DIAGNOSES_TYPE column with the resulting values.

Minimal example:

import pandas as pd

diagnoses = {'A': 'Arbitrary', 'B': 'Brutal', 'C': 'Cluster', 'D': 'Dropped'}

df = pd.DataFrame({'DIAGNOSES': [['A03'], ['A03', 'B23'], ['A30', 'B54', 'D65', 'C60']]})

              DIAGNOSES
0                 [A03]
1            [A03, B23]
2  [A30, B54, D65, C60]

To clarify what I am after: I want df['DIAGNOSES_TYPES'] populated with the dictionary value for each diagnosis in the row.

I approached it this way:

def map_diagnose(df)
    for col in len(range(df)):
        for d in df['DIAGNOSIS']:
            for diag in d:
                if diag[0] in diagnoses_dict.keys():
                    df['DIAGNOSES_TYPES'] = diagnoses_dict.get(diag[0])
                df['DIAGNOSES_TYPES'] = ''
    return df

Answer 1

Use explode, map and groupby:

diagnoses = {'A': 'Arbitrary', 'B': 'Brutal', 'C': 'Cluster', 'D': 'Dropped'}
df1 = df.explode('DIAGNOSES')
df1['SD'] = df1['DIAGNOSES'].str.extract(r'(\D)')
df1['DIAGNOSES_TYPES'] = df1['SD'].map(diagnoses)
df1.groupby(level=0).agg(list)

Output:

    DIAGNOSES                SD             DIAGNOSES_TYPES
0   [A03]                    [A]            [Arbitrary]
1   [A03, B23]               [A, B]         [Arbitrary, Brutal]
2   [A30, B54, D65, C60]     [A, B, D, C]   [Arbitrary, Brutal, Dropped, Cluster]

Column 'SD' holds the first letter of each diagnosis, which is used for the mapping; you can drop this column if you do not need it.
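As a self-contained sketch of the steps above, the snippet below rebuilds the example frame, removes the helper column before aggregating, and keeps the result in a variable. The name `result` and the use of `expand=False` (so that `str.extract` returns a Series rather than a one-column DataFrame) are choices made for this illustration, not part of the original answer.

```python
import pandas as pd

diagnoses = {'A': 'Arbitrary', 'B': 'Brutal', 'C': 'Cluster', 'D': 'Dropped'}
df = pd.DataFrame({'DIAGNOSES': [['A03'], ['A03', 'B23'], ['A30', 'B54', 'D65', 'C60']]})

# One row per diagnosis code; the original row index is repeated.
df1 = df.explode('DIAGNOSES')

# First non-digit character of each code, returned as a Series.
df1['SD'] = df1['DIAGNOSES'].str.extract(r'(\D)', expand=False)

# Look each letter up in the dictionary (unknown letters become NaN).
df1['DIAGNOSES_TYPES'] = df1['SD'].map(diagnoses)

# Drop the helper column, then collapse back to one row per original index.
result = df1.drop(columns='SD').groupby(level=0).agg(list)
print(result)
```

Note that `map` leaves NaN for any letter missing from the dictionary, so the aggregated lists may contain NaN if the data has codes outside A to D.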

Answer 2

You can explode the "DIAGNOSES" column, get the first character of each string using str, map the diagnoses dictionary to get the types, then groupby the index and aggregate to a list:

df['DIAGNOSES_TYPE'] = df['DIAGNOSES'].explode().str[0].map(diagnoses).groupby(level=0).apply(list)

Output:

              DIAGNOSES                         DIAGNOSES_TYPE
0                 [A03]                            [Arbitrary]
1            [A03, B23]                    [Arbitrary, Brutal]
2  [A30, B54, D65, C60]  [Arbitrary, Brutal, Dropped, Cluster]
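For comparison with the loop attempted in the question, here is a minimal sketch that avoids explode and builds each row's list directly in Python. The helper name `map_diagnoses` and the choice to use an empty string for unknown letters are assumptions made for this illustration; neither comes from the answers above.

```python
import pandas as pd

diagnoses = {'A': 'Arbitrary', 'B': 'Brutal', 'C': 'Cluster', 'D': 'Dropped'}
df = pd.DataFrame({'DIAGNOSES': [['A03'], ['A03', 'B23'], ['A30', 'B54', 'D65', 'C60']]})


def map_diagnoses(codes, lookup):
    """Hypothetical helper: map each code to a type via its first character.

    Letters missing from `lookup` (and empty strings) yield ''.
    """
    return [lookup.get(code[:1], '') for code in codes]


# Apply the helper to every row; each result is a list, stored as one cell.
df['DIAGNOSES_TYPE'] = df['DIAGNOSES'].apply(lambda codes: map_diagnoses(codes, diagnoses))
print(df)
```

This version runs a Python-level loop per row, so it is likely slower than the explode approach on large frames, but it keeps the row structure intact throughout and makes the handling of unknown letters explicit.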
