Pandas: filter df based on condition

Question

Let's say I have a dataframe like:

A B 
11 2             # PASS 
22 4             # FAIL
33 5             # FAIL
44 4             # PASS

And two dicts like:

B_column_dct = {2: [2,3,5], 4: [33,22,121], 5: [1,2,3]}    # the dict key will have multiple values in a list
A_column_dct = {11: [3], 22: [4], 33: [5], 44: [22]}  # the dict key will always have a single value in a list

Now I want to filter the above dataframe so that a row is kept only if the value that A_column_dct holds for the row's column A entry is present in the list that B_column_dct holds for the row's column B entry.

The final result df:

A B 
11 2            
44 4

Answer 1

Sorry to say, but I cannot completely make sense of your values and the filtered df that you're trying to create, primarily because a dict cannot hold duplicate keys (i.e. the value 4 in column B of the original df will not work properly). I tried to get it to work anyway, assuming that the key 4 in B_column_dct represents both of the column B values, but then I didn't arrive at the same filtered df as you did. Anyway, below is the code I used (possibly the longest one-liner I've written so far; I would advise rewriting it for readability):

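# org_df is the dataframe from the question.
# Flatten each dict's value lists into one list of unique values.
flat_a = list(set().union(*A_column_dct.values()))
flat_b = list(set().union(*B_column_dct.values()))

# True for a row when any of its A values appears anywhere in flat_b
# and any of its B values appears anywhere in flat_a.
filtering = [(any(elem_a in flat_b for elem_a in A_column_dct[i])) and (any(elem_b in flat_a for elem_b in B_column_dct[j])) for i, j in zip(org_df["A"], org_df["B"])]

filtered_df = org_df[filtering]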
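
For comparison, the condition in the question can also be read row by row: look up the single value stored under the row's A key and test whether it is in the list stored under that row's B key. The following is only a minimal sketch of that reading, not the answerer's method. It assumes the dataframe is named org_df as in the code above and that every value in columns A and B exists as a key in its dict; mask is just an illustrative name.

```python
import pandas as pd

# Data copied from the question.
org_df = pd.DataFrame({"A": [11, 22, 33, 44], "B": [2, 4, 5, 4]})
B_column_dct = {2: [2, 3, 5], 4: [33, 22, 121], 5: [1, 2, 3]}
A_column_dct = {11: [3], 22: [4], 33: [5], 44: [22]}

# Row-wise check: the single value stored under the row's A key
# must appear in the list stored under the same row's B key.
# Assumption: every A and B value is present as a key in its dict.
mask = [
    A_column_dct[a][0] in B_column_dct[b]
    for a, b in zip(org_df["A"], org_df["B"])
]

filtered_df = org_df[mask]
print(filtered_df)
```

With the sample data this keeps the rows (11, 2) and (44, 4), which matches the result df shown in the question. Because each row is checked against its own B list, the repeated 4 in column B is not a problem: both rows simply look up the same dict entry.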
