How to filter pandas DataFrame rows based on dictionary keys and values?

Question

I have a dataframe and a dictionary in Python, shown below, and I need to filter the dataframe based on the dictionary. The keys of the dictionary correspond to the Customer_ID column and the values to the Category column. I want the subset of the dataframe whose rows match a key and one of its values, with the other columns kept.

df:

Customer_ID  Category    Type  Delivery
40275        Book        Buy   True
40275        Software    Sell  False
40275        Video Game  Sell  False
40275        Cell Phone  Sell  False
39900        CD/DVD      Sell  True
39900        Book        Buy   True
39900        Software    Sell  True
35886        Cell Phone  Sell  False
35886        Video Game  Buy   False
35886        CD/DVD      Sell  False
35886        Software    Sell  False
40350        Software    Sell  True
28129        Software    Buy   False

And the dictionary is:

d = {
 40275: ['Book','Software'],
 39900: ['Book'],
 35886: ['Software'],
 40350: ['Software'],
 28129: ['Software']
 }

And I need the following dataframe:

Customer_ID  Category  Type  Delivery
40275        Book      Buy   True
40275        Software  Sell  False
39900        Book      Buy   True
35886        Software  Sell  False
40350        Software  Sell  True
28129        Software  Buy   False

Answer 1

We can set_index to the Customer_ID and Category columns, build a list of tuples from the dictionary d, and reindex the DataFrame to include only the rows that match the list of tuples, then reset_index to restore the columns:

new_df = df.set_index(['Customer_ID', 'Category']).reindex(
    [(k, v) for k, lst in d.items() for v in lst]
).reset_index()

new_df:

   Customer_ID  Category  Type  Delivery
0        40275      Book   Buy      True
1        40275  Software  Sell     False
2        39900      Book   Buy      True
3        35886  Software  Sell     False
4        40350  Software  Sell      True
5        28129  Software   Buy     False

*Note: this only works if the MultiIndex is unique (as in the example shown). It will also add rows if the dictionary does not represent a subset of the DataFrame's MultiIndex (which may or may not be the desired behaviour).

Setup:

import pandas as pd

d = {
    40275: ['Book', 'Software'],
    39900: ['Book'],
    35886: ['Software'],
    40350: ['Software'],
    28129: ['Software']
}

df = pd.DataFrame({
    'Customer_ID': [40275, 40275, 40275, 40275, 39900, 39900, 39900, 35886,
                    35886, 35886, 35886, 40350, 28129],
    'Category': ['Book', 'Software', 'Video Game', 'Cell Phone', 'CD/DVD',
                 'Book', 'Software', 'Cell Phone', 'Video Game', 'CD/DVD',
                 'Software', 'Software', 'Software'],
    'Type': ['Buy', 'Sell', 'Sell', 'Sell', 'Sell', 'Buy', 'Sell', 'Sell',
             'Buy', 'Sell', 'Sell', 'Sell', 'Buy'],
    'Delivery': [True, False, False, False, True, True, True, False, False,
                 False, False, True, False]
})
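
If the two caveats in the note above matter (duplicate key pairs, or dictionary entries that are absent from the DataFrame), one alternative is a boolean mask built with MultiIndex.isin. The following is only a sketch under the same setup; the extra pair (40275, 'Toys') is a hypothetical entry added purely to show the difference, and the variable names are illustrative.

```python
import pandas as pd

d = {40275: ['Book', 'Software'], 39900: ['Book'], 35886: ['Software'],
     40350: ['Software'], 28129: ['Software']}

df = pd.DataFrame({
    'Customer_ID': [40275, 40275, 40275, 40275, 39900, 39900, 39900, 35886,
                    35886, 35886, 35886, 40350, 28129],
    'Category': ['Book', 'Software', 'Video Game', 'Cell Phone', 'CD/DVD',
                 'Book', 'Software', 'Cell Phone', 'Video Game', 'CD/DVD',
                 'Software', 'Software', 'Software'],
    'Type': ['Buy', 'Sell', 'Sell', 'Sell', 'Sell', 'Buy', 'Sell', 'Sell',
             'Buy', 'Sell', 'Sell', 'Sell', 'Buy'],
    'Delivery': [True, False, False, False, True, True, True, False, False,
                 False, False, True, False],
})

# Flatten the dictionary into (Customer_ID, Category) pairs.
pairs = [(k, v) for k, lst in d.items() for v in lst]

# Build a MultiIndex from the two key columns and test each row for membership.
keys = pd.MultiIndex.from_frame(df[['Customer_ID', 'Category']])
filtered = df[keys.isin(pairs)].reset_index(drop=True)

# Hypothetical pair that does not occur in df, to contrast the two approaches.
extra = pairs + [(40275, 'Toys')]

# reindex creates a row for the missing pair, filled with NaN.
reindexed = (df.set_index(['Customer_ID', 'Category'])
               .reindex(extra)
               .reset_index())

# The mask simply ignores the missing pair.
masked = df[keys.isin(extra)]

print(filtered)
print(len(reindexed), len(masked))
```

The mask keeps the original row order of df and never adds rows, whereas reindex returns rows in the order of the tuple list.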

Answer 2

You can use df.merge with pd.concat (DataFrame.append and the positional axis argument of drop were removed in pandas 2.0, so pd.concat and drop(columns=...) are used here):

In [444]: df1 = pd.DataFrame.from_dict(d, orient='index', columns=['Cat1', 'Cat2']).reset_index()

In [449]: res = df.merge(df1[['index', 'Cat1']], left_on=['Customer_ID', 'Category'], right_on=['index', 'Cat1']).drop(columns=['index', 'Cat1'])

In [462]: res = pd.concat([res, df.merge(df1[['index', 'Cat2']], left_on=['Customer_ID', 'Category'], right_on=['index', 'Cat2']).drop(columns=['index', 'Cat2'])]).sort_values('Customer_ID', ascending=False)

In [463]: res
Out[463]: 
   Customer_ID  Category  Type  Delivery
3        40350  Software  Sell      True
0        40275      Book   Buy      True
0        40275  Software  Sell     False
1        39900      Book   Buy      True
2        35886  Software  Sell     False
4        28129  Software   Buy     False
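
The approach above hard-codes two category columns (Cat1 and Cat2), so it assumes no customer has more than two categories in d. A possible generalisation, sketched here with illustrative names (lookup, slot), is to let from_dict create as many columns as needed and melt them into a single Category column before merging once.

```python
import pandas as pd

d = {40275: ['Book', 'Software'], 39900: ['Book'], 35886: ['Software'],
     40350: ['Software'], 28129: ['Software']}

df = pd.DataFrame({
    'Customer_ID': [40275, 40275, 40275, 40275, 39900, 39900, 39900, 35886,
                    35886, 35886, 35886, 40350, 28129],
    'Category': ['Book', 'Software', 'Video Game', 'Cell Phone', 'CD/DVD',
                 'Book', 'Software', 'Cell Phone', 'Video Game', 'CD/DVD',
                 'Software', 'Software', 'Software'],
    'Type': ['Buy', 'Sell', 'Sell', 'Sell', 'Sell', 'Buy', 'Sell', 'Sell',
             'Buy', 'Sell', 'Sell', 'Sell', 'Buy'],
    'Delivery': [True, False, False, False, True, True, True, False, False,
                 False, False, True, False],
})

# One row per customer, one column per list position; short lists are padded
# with missing values.
wide = pd.DataFrame.from_dict(d, orient='index').rename_axis('Customer_ID').reset_index()

# Reshape to one (Customer_ID, Category) pair per row and drop the padding.
lookup = (wide.melt(id_vars='Customer_ID', var_name='slot', value_name='Category')
              .dropna(subset=['Category'])
              .drop(columns='slot'))

# A single inner merge replaces the per-column merges.
res = (df.merge(lookup, on=['Customer_ID', 'Category'])
         .sort_values('Customer_ID', ascending=False, kind='stable'))

print(res)
```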

Answer 3

Flatten the dictionary into a new dataframe, then inner merge df with it:

df.merge(pd.DataFrame([{'Customer_ID': k, 'Category': i} 
                       for k, v in d.items() for i in v]))

   Customer_ID  Category  Type  Delivery
0        40275      Book   Buy      True
1        40275  Software  Sell     False
2        39900      Book   Buy      True
3        35886  Software  Sell     False
4        40350  Software  Sell      True
5        28129  Software   Buy     False
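
As a variation on the same idea, the dictionary can also be flattened with Series.explode, and the merge keys can be spelled out explicitly rather than inferred from the shared column names. This is a minimal sketch under the question's setup; lookup and result are illustrative names.

```python
import pandas as pd

d = {40275: ['Book', 'Software'], 39900: ['Book'], 35886: ['Software'],
     40350: ['Software'], 28129: ['Software']}

df = pd.DataFrame({
    'Customer_ID': [40275, 40275, 40275, 40275, 39900, 39900, 39900, 35886,
                    35886, 35886, 35886, 40350, 28129],
    'Category': ['Book', 'Software', 'Video Game', 'Cell Phone', 'CD/DVD',
                 'Book', 'Software', 'Cell Phone', 'Video Game', 'CD/DVD',
                 'Software', 'Software', 'Software'],
    'Type': ['Buy', 'Sell', 'Sell', 'Sell', 'Sell', 'Buy', 'Sell', 'Sell',
             'Buy', 'Sell', 'Sell', 'Sell', 'Buy'],
    'Delivery': [True, False, False, False, True, True, True, False, False,
                 False, False, True, False],
})

# Series of lists keyed by customer; explode yields one row per list element.
lookup = (pd.Series(d)
            .explode()
            .rename_axis('Customer_ID')
            .reset_index(name='Category'))

# Inner merge on both key columns, stated explicitly.
result = df.merge(lookup, on=['Customer_ID', 'Category'], how='inner')

print(result)
```

Naming the keys with on= guards against an accidental join on any other column the two frames might share later.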
