Use dictionary keys as row values of a new DataFrame column when a dictionary value is found in the given row

Question

I have a Pandas DataFrame with a column that contains a large number of unique values. I would like to group these values under a more general column. By doing so I expect to add hierarchies to my data and thus make analysis easier.

One thing that worked was to copy the column and replace the values as follows:

data.loc[data['new_col'].str.contains('string0|string1'), 'new_col']\
         = 'substitution'

However, I am trying to find a way to reproduce this easily without adding a condition for each entry.

I also tried the following methods, without success:

dict.items()
pd.df.replace()

I would appreciate advice on how to approach this.

import pandas as pd
# My DataFrame looks similar to this:
>>> df = pd.DataFrame({'A': ['a', 'w', 'c', 'd', 'z']})

# The dictionary where I store the generalization:
>>> subs = {'g1': ['a', 'b', 'c', 'd'],
...         'g2': ['w', 'x', 'y', 'z']}

# The result I would like to get:
>>> df
   A  H
0  a  g1
1  w  g2
2  c  g1
3  d  g1
4  z  g2

Answer 1

Create a new dict by swapping each key with the values in its list. Next, map df.A with the swapped dict.

swap_dict = {x: k for k, v in d.items() for x in v}

Out[1054]:
{'a': 's1',
 'b': 's1',
 'c': 's1',
 'd': 's1',
 'w': 's2',
 'x': 's2',
 'y': 's2',
 'z': 's2'}

df['H'] = df.A.map(swap_dict)

Out[1058]:
   A   H
0  a  s1
1  w  s2
2  c  s1
3  d  s1
4  z  s2
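
For reference, here is a minimal self-contained sketch of the same two steps. It assumes the `subs` dictionary exactly as posted in the question, so the groups come out as g1/g2 rather than the s1/s2 shown above. The extra row 'q' and the 'other' label are hypothetical additions of mine, included only to show what happens to a value that is in none of the lists.

```python
import pandas as pd

# 'q' is a hypothetical extra value that belongs to no group.
df = pd.DataFrame({'A': ['a', 'w', 'c', 'd', 'z', 'q']})
subs = {'g1': ['a', 'b', 'c', 'd'],
        'g2': ['w', 'x', 'y', 'z']}

# Invert the dictionary: every list element points back to its group key.
swap_dict = {x: k for k, v in subs.items() for x in v}

# Values missing from swap_dict map to NaN; fillna gives them a label
# ('other' is an arbitrary choice for this sketch).
df['H'] = df['A'].map(swap_dict).fillna('other')
print(df)
```

Without the `fillna` call, the row for 'q' would simply hold NaN in H, which may be preferable if unmatched values should stand out.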

Note: I directly use the keys of your dict as the values of H instead of g1, g2, ... because I think they are enough to identify each group of values. If you still want g1, g2, ..., it is easy to accomplish. Just let me know.
I also named your dict d in my code.
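
If the cells have to be matched by substring, as in the str.contains snippet from the question, the exact-match lookup with map will not apply. One possible sketch is to loop over the dictionary and build one pattern per group. The sample data, the `patterns` dictionary, and the `group` column name here are made up for illustration, and the result is written to a new column instead of overwriting the original one.

```python
import re
import pandas as pd

# Hypothetical sample data; only the column name 'new_col' comes from the question.
data = pd.DataFrame({'new_col': ['xx string0 xx', 'string1', 'string2 yy', 'none']})
patterns = {'substitution': ['string0', 'string1'],
            'other_sub': ['string2']}

data['group'] = None  # rows that match nothing stay empty
for label, words in patterns.items():
    # Escape each word so it is matched literally, then join with '|' (regex OR).
    regex = '|'.join(re.escape(w) for w in words)
    mask = data['new_col'].str.contains(regex)
    data.loc[mask, 'group'] = label

print(data)
```

If a cell matches patterns from more than one group, the group that comes later in the dictionary wins, so the order of the keys matters in this version.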

Solution 1
1 2019-08-17 01:36:12