How to create a DataFrame from another DataFrame based on a condition

Question

My current DataFrame looks like this:

    Combinations           Count
1   ('IDLY', 'VADA')        3734
6   ('DOSA', 'IDLY')        2020
9   ('CHAPPATHI', 'DOSA')   1297
10  ('IDLY', 'POORI')       1297
11  ('COFFEE', 'TEA')       1179
13  ('DOSA', 'VADA')        1141
15  ('CHAPPATHI', 'IDLY')   1070
16  ('COFFEE', 'SAMOSA')    1061
17  ('COFFEE', 'IDLY')      1016
18  ('POORI', 'VADA')       1008

If I filter the DataFrame above by the keyword "DOSA", I get the output below:

    Combinations           Count
6   ('DOSA', 'IDLY')        2020
9   ('CHAPPATHI', 'DOSA')   1297
13  ('DOSA', 'VADA')        1141
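
For anyone who wants to reproduce the two frames above, here is a minimal sketch. It assumes the Combinations column holds Python tuples rather than strings (the question does not say which), and the membership filter shown is just one plausible way to get the "DOSA" rows:

```python
import pandas as pd

# Sample data copied from the table above; tuples are an assumption
df = pd.DataFrame(
    {
        "Combinations": [
            ("IDLY", "VADA"), ("DOSA", "IDLY"), ("CHAPPATHI", "DOSA"),
            ("IDLY", "POORI"), ("COFFEE", "TEA"), ("DOSA", "VADA"),
            ("CHAPPATHI", "IDLY"), ("COFFEE", "SAMOSA"), ("COFFEE", "IDLY"),
            ("POORI", "VADA"),
        ],
        "Count": [3734, 2020, 1297, 1297, 1179, 1141, 1070, 1061, 1016, 1008],
    },
    index=[1, 6, 9, 10, 11, 13, 15, 16, 17, 18],
)

# Keep the rows whose tuple contains the keyword
dosa_rows = df[df["Combinations"].apply(lambda pair: "DOSA" in pair)]
print(dosa_rows)
```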

But I want the output to look like the DataFrame below, which leaves out the filter keyword because it is common to every row:

    Combinations    Count
6   IDLY            2020
9   CHAPPATHI       1297
13  VADA            1141

Which pandas concept do I need here, and how can I do this?

Answer 1

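You can also try building a helper DataFrame from the tuples, masking the positions that match the keyword, and then stacking the result to drop the NaN values:

import pandas as pd

keyword = 'DOSA'

# Expand each tuple into its own columns, keeping the original index
m = pd.DataFrame(df['Combinations'].tolist(), index=df.index)
# Rows where any element equals the keyword
c = m.eq(keyword).any(axis=1)
# Mask the keyword as NaN, then stack; dropna() makes the NaN removal explicit
df[c].assign(Combinations=
             m[c].where(m[c].ne(keyword)).stack().dropna().droplevel(1))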

   Combinations  Count
6          IDLY   2020
9     CHAPPATHI   1297
13         VADA   1141
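
To make the mask-and-stack idea easier to follow, here is a step-by-step sketch on a small, hypothetical three-row subset of the question's data. The intermediate names `masked` and `partner` are introduced only for illustration:

```python
import pandas as pd

# Small hypothetical subset of the question's data
df = pd.DataFrame(
    {"Combinations": [("IDLY", "VADA"), ("DOSA", "IDLY"), ("CHAPPATHI", "DOSA")],
     "Count": [3734, 2020, 1297]},
    index=[1, 6, 9],
)
keyword = "DOSA"

# Step 1: one column per tuple position
m = pd.DataFrame(df["Combinations"].tolist(), index=df.index)
# Step 2: rows that contain the keyword in any position
c = m.eq(keyword).any(axis=1)
# Step 3: replace the keyword with NaN so only the partner item remains
masked = m[c].where(m[c].ne(keyword))
# Step 4: collapse to one value per row; dropna() covers the case where stack() keeps NaN
partner = masked.stack().dropna().droplevel(1)

print(masked)
print(partner)
```

Each kept row has exactly one non-NaN cell after step 3, which is why the stacked result lines up one-to-one with the filtered rows.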

If the column holds strings rather than tuples, you can convert them to tuples first:

import ast
df['Combinations'] = df['Combinations'].apply(ast.literal_eval)
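
As a quick check of that conversion, the sketch below uses a hypothetical two-row frame whose tuples arrived as strings (for example after a CSV round trip) and shows the element type before and after:

```python
import ast
import pandas as pd

# Hypothetical frame where each tuple is stored as its string representation
df = pd.DataFrame({"Combinations": ["('DOSA', 'IDLY')", "('DOSA', 'VADA')"],
                   "Count": [2020, 1141]})

print(type(df.loc[0, "Combinations"]))  # a string before conversion
df["Combinations"] = df["Combinations"].apply(ast.literal_eval)
print(type(df.loc[0, "Combinations"]))  # a tuple after conversion
```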

Answer 2

In general, storing lists, tuples, sets, and the like in a DataFrame is not ideal. It is better to keep one record per element where you need them.

You can use explode to convert Combinations into that form and then filter it:

keyword = 'DOSA'

s = df.explode('Combinations')

s.loc[s.Combinations.eq(keyword).groupby(level=0).transform('any') & s.Combinations.ne(keyword)]

Or chain the two steps with .loc[lambda ...]:

(df.explode('Combinations')
   .loc[lambda x: x.Combinations.ne(keyword) & 
            x.Combinations.eq(keyword).groupby(level=0).transform('any')]
)

Output:

   Combinations  Count
6          IDLY   2020
9     CHAPPATHI   1297
13         VADA   1141
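
To see why the groupby on the index is needed, here is a self-contained sketch on a hypothetical three-row subset of the question's data. The names `has_keyword` and `result` are illustrative only:

```python
import pandas as pd

# Small hypothetical subset of the question's data
df = pd.DataFrame(
    {"Combinations": [("IDLY", "VADA"), ("DOSA", "IDLY"), ("DOSA", "VADA")],
     "Count": [3734, 2020, 1141]},
    index=[1, 6, 13],
)
keyword = "DOSA"

# explode gives one row per tuple element; the original index label repeats
s = df.explode("Combinations")
print(s)

# True for every row of a group (same index label) that contains the keyword
has_keyword = s["Combinations"].eq(keyword).groupby(level=0).transform("any")
# Keep only the partner item from those groups
result = s[has_keyword & s["Combinations"].ne(keyword)]
print(result)
```

Because explode repeats the index label for both halves of a pair, grouping by `level=0` is what ties the partner item back to the row that contained the keyword.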

Answer 3

What I would do:

x=df.explode('Combinations')
x=x.loc[x.index[x.Combinations=='DOSA']].query('Combinations !="DOSA"')
x
   Combinations  Count
6          IDLY   2020
9     CHAPPATHI   1297
13         VADA   1141

Answer 4

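# Keep only the rows whose tuple contains the keyword
d = df[df['Combinations'].apply(lambda x: 'DOSA' in x)].copy()
# Remove the keyword from each pair and keep the remaining element
d['Combinations'] = d['Combinations'].apply(lambda x: set(x).difference(['DOSA']).pop())
print(d)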

Prints:

   ID Combinations  Count
1   6         IDLY   2020
2   9    CHAPPATHI   1297
5  13         VADA   1141
