
New DataFrame column using the key of a dictionary as row value when one of its values is found in a given row

I have a Pandas DataFrame with a large number of unique values. I would like to group these values under a more general column. By doing so I expect to add hierarchies to my data and thus make analysis easier.

One thing that worked was to copy the column and replace the values as follows:

data.loc[data['new_col'].str.contains('string0|string1'), 'new_col']\
         = 'substitution'

However, I am trying to find a way to reproduce this easily without adding a condition for each entry.

I also tried the following methods, without success:

  • dict.items()
  • pd.DataFrame.replace()

Those attempts were futile for me.

I would like to hear your advice on how to approach this.

import pandas as pd
# My DataFrame looks similar to this:
>>> df = pd.DataFrame({'A': ['a', 'w', 'c', 'd', 'z']})

# The dictionary where I store the generalization:
>>> subs = {'g1': ['a', 'b', 'c', 'd'],
...         'g2': ['w', 'x', 'y', 'z']}

# The result I would like to obtain:
>>> df
   A   H
0  a  g1
1  w  g2
2  c  g1
3  d  g1
4  z  g2

Create a new dict by swapping each key with the values in its list. Next, map df.A with the swapped dict.

swap_dict = {x: k for k, v in d.items() for x in v}

Out[1054]:
{'a': 's1',
 'b': 's1',
 'c': 's1',
 'd': 's1',
 'w': 's2',
 'x': 's2',
 'y': 's2',
 'z': 's2'}
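
For readers less used to nested comprehensions, the same inversion can be written as an explicit loop. This is only a sketch: it uses the question's `subs` dictionary instead of `d`, and the duplicate check is an addition that is not part of the answer. Without that check, a value listed under two groups would silently keep the last group, because dictionary keys are unique.

```python
subs = {'g1': ['a', 'b', 'c', 'd'],
        'g2': ['w', 'x', 'y', 'z']}

swap_dict = {}
for group, members in subs.items():
    for member in members:
        # Added safeguard (not in the answer): refuse ambiguous members.
        if member in swap_dict:
            raise ValueError(f"{member!r} is listed under more than one group")
        swap_dict[member] = group

# The loop builds the same mapping as the one-line comprehension.
assert swap_dict == {x: k for k, v in subs.items() for x in v}
```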

df['H'] = df.A.map(swap_dict)

Out[1058]:
   A   H
0  a  s1
1  w  s2
2  c  s1
3  d  s1
4  z  s2
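
Putting both steps together, here is a self-contained sketch that can be run as is. It assumes the dictionary is called `subs` and has the keys `g1` and `g2`, as in the question, so the resulting `H` column holds those keys.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'w', 'c', 'd', 'z']})
subs = {'g1': ['a', 'b', 'c', 'd'],
        'g2': ['w', 'x', 'y', 'z']}

# Invert the dictionary: each list element becomes a key pointing to its group.
swap_dict = {x: k for k, v in subs.items() for x in v}

# Look up every value of column A in the inverted dictionary.
df['H'] = df['A'].map(swap_dict)
print(df)
```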

Note: I directly use the keys of your dict as the values of H instead of g1, g2, ... because I think that is enough to identify each group of values. If you still want g1, g2, ..., it is easy to accomplish. Just let me know.
I also named your dict d in my code.
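
Two caveats that the answer does not cover, sketched here under my own assumptions. First, `Series.map` with a dict returns NaN for values that belong to no group, which `fillna` can replace. Second, `map` only matches whole cell values, whereas the `str.contains` approach in the question matches substrings. A loop over the dictionary keeps the substring behaviour without one condition per entry. The fallback label `'other'`, the value `'q'`, the `text_subs` dictionary and the sample strings are all hypothetical.

```python
import re
import pandas as pd

subs = {'g1': ['a', 'b', 'c', 'd'],
        'g2': ['w', 'x', 'y', 'z']}
swap_dict = {x: k for k, v in subs.items() for x in v}

# Exact matching: 'q' (hypothetical value) is in no group, so map() yields NaN.
df = pd.DataFrame({'A': ['a', 'w', 'q']})
df['H'] = df['A'].map(swap_dict).fillna('other')  # 'other' is an assumed label

# Substring matching, one pass per group instead of one condition per entry.
text_subs = {'substitution': ['string0', 'string1']}  # hypothetical groups
data = pd.DataFrame({'col': ['has string0 inside', 'string1!', 'neither']})
data['new_col'] = data['col']
for group, members in text_subs.items():
    # Escape regex metacharacters, then join the members into one alternation.
    pattern = '|'.join(re.escape(m) for m in members)
    mask = data['col'].str.contains(pattern)
    data.loc[mask, 'new_col'] = group
```

Rows that match no pattern keep their original value in `new_col`, which mirrors the copy-then-replace approach from the question.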


