How to create a DF from a DF based on a condition

My current DF looks like this:

    Combinations            Count
1   ('IDLY', 'VADA')        3734
6   ('DOSA', 'IDLY')        2020
9   ('CHAPPATHI', 'DOSA')   1297
10  ('IDLY', 'POORI')       1297
11  ('COFFEE', 'TEA')       1179
13  ('DOSA', 'VADA')        1141
15  ('CHAPPATHI', 'IDLY')   1070
16  ('COFFEE', 'SAMOSA')    1061
17  ('COFFEE', 'IDLY')      1016
18  ('POORI', 'VADA')       1008

Let's say I filter the data frame above by the keyword 'DOSA'. I get the output below:

    Combinations           Count
6   ('DOSA', 'IDLY')        2020
9   ('CHAPPATHI', 'DOSA')   1297
13  ('DOSA', 'VADA')        1141

But I would like the output to look like the df below, which drops the filter keyword because it is common to every row:

    Combinations    Count
6   IDLY            2020
9   CHAPPATHI       1297
13  VADA            1141

What concept of pandas needs to be used here? How can this be achieved?

You can also try creating a dataframe as a reference, then mask the cells where the keyword matches and use stack to drop the resulting NaN values:

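import pandas as pd

keyword = 'DOSA'

# Split each tuple into its own columns, keeping the original index
m = pd.DataFrame(df['Combinations'].tolist(), index=df.index)
# Rows where any element equals the keyword
c = m.eq(keyword).any(axis=1)
# Mask the keyword itself, then stack so only the other element remains
df[c].assign(Combinations=
             m[c].where(m[c].ne(keyword)).stack().dropna().droplevel(1))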

   Combinations  Count
6          IDLY   2020
9     CHAPPATHI   1297
13         VADA   1141
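
To make the intermediate steps easier to follow, here is a minimal self-contained sketch of the same idea. The sample frame is a four-row subset of the question's data, and the names `masked`, `partner` and `result` are only illustrative. The explicit `dropna()` is a defensive assumption, so the result does not depend on whether `stack()` itself discards the masked cells.

```python
import pandas as pd

# Four rows copied from the question; tuples are assumed to be real tuples, not strings.
df = pd.DataFrame(
    {
        "Combinations": [
            ("IDLY", "VADA"),
            ("DOSA", "IDLY"),
            ("CHAPPATHI", "DOSA"),
            ("DOSA", "VADA"),
        ],
        "Count": [3734, 2020, 1297, 1141],
    },
    index=[1, 6, 9, 13],
)
keyword = "DOSA"

# Reference frame: one column per tuple position (columns 0 and 1).
m = pd.DataFrame(df["Combinations"].tolist(), index=df.index)

# Boolean Series: True for rows that contain the keyword in any position.
c = m.eq(keyword).any(axis=1)

# In the matching rows, replace the keyword itself with NaN.
masked = m[c].where(m[c].ne(keyword))

# Long form with the NaN cells removed; dropping the column level
# leaves a Series indexed like the matching rows of df.
partner = masked.stack().dropna().droplevel(1)

result = df[c].assign(Combinations=partner)
print(result)
```

Printing `m`, `c` and `masked` one at a time is a quick way to see how each step narrows the data.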

If the Combinations column holds strings rather than tuples, you can convert it to tuples with:

import ast
df['Combinations'] = df['Combinations'].apply(ast.literal_eval)
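
As a quick check of what this conversion does, the sketch below uses a small hypothetical frame (`raw`) in which the tuples arrived as text, as might happen after a CSV round trip. The frame and its contents are assumptions for illustration only.

```python
import ast

import pandas as pd

# Hypothetical input: the tuples are stored as their string representation.
raw = pd.DataFrame(
    {
        "Combinations": ["('DOSA', 'IDLY')", "('COFFEE', 'TEA')"],
        "Count": [2020, 1179],
    }
)

# ast.literal_eval safely parses Python literals (it does not run arbitrary code).
parsed = raw.assign(Combinations=raw["Combinations"].apply(ast.literal_eval))

first_before = raw.loc[0, "Combinations"]
first_after = parsed.loc[0, "Combinations"]
print(type(first_before).__name__, "->", type(first_after).__name__)
```

After the conversion, membership tests such as `'DOSA' in value` match whole items instead of substrings of the text.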

In general, it's not ideal to have lists, tuples, sets, etc. inside a dataframe. It's better to have multiple records for each instance when needed.

You can use explode to turn Combinations into this form and filter on that:

keyword = 'DOSA'

s = df.explode('Combinations')

s.loc[s.Combinations.eq(keyword).groupby(level=0).transform('any') & s.Combinations.ne(keyword)]

Or chain the two commands with .loc[lambda ]:

(df.explode('Combinations')
   .loc[lambda x: x.Combinations.ne(keyword) & 
            x.Combinations.eq(keyword).groupby(level=0).transform('any')]
)

Output:

   Combinations  Count
6          IDLY   2020
9     CHAPPATHI   1297
13         VADA   1141
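
For readers who want to see the exploded form and the two boolean masks separately, here is a small sketch under the same assumptions (real tuples, a four-row subset of the question's data). The names `is_kw`, `row_has_kw` and `out` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Combinations": [
            ("IDLY", "VADA"),
            ("DOSA", "IDLY"),
            ("CHAPPATHI", "DOSA"),
            ("DOSA", "VADA"),
        ],
        "Count": [3734, 2020, 1297, 1141],
    },
    index=[1, 6, 9, 13],
)
keyword = "DOSA"

# One row per tuple element; each original index label now appears twice.
s = df.explode("Combinations")

# True on the rows that are the keyword itself.
is_kw = s["Combinations"].eq(keyword)

# True on every row whose original record contains the keyword somewhere.
row_has_kw = is_kw.groupby(level=0).transform("any")

# Keep the partners: rows from matching records that are not the keyword.
out = s.loc[row_has_kw & ~is_kw]
print(out)
```

Grouping by `level=0` is what ties the exploded rows back to their original record, since they share an index label.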

What I would do:

x=df.explode('Combinations')
x=x.loc[x.index[x.Combinations=='DOSA']].query('Combinations !="DOSA"')
x

   Combinations  Count
6          IDLY   2020
9     CHAPPATHI   1297
13         VADA   1141

Alternatively, keep the rows whose tuple contains the keyword, then remove the keyword from each tuple:

d = df[df['Combinations'].transform(lambda x: 'DOSA' in x)].copy()
d['Combinations'] = d['Combinations'].apply(lambda x: set(x).difference(['DOSA']).pop())
print(d)

Prints:

   ID Combinations  Count
1   6         IDLY   2020
2   9    CHAPPATHI   1297
5  13         VADA   1141
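
One caveat with the set-based version: `set.pop()` returns an arbitrary element, so it is only predictable when exactly one item is left after removing the keyword. If combinations longer than two items are possible, a variant along these lines might be safer. The helper `drop_keyword` and the three-item row with index 99 are hypothetical and not part of the original data.

```python
import pandas as pd


def drop_keyword(combo, keyword):
    """Return the items of `combo` other than `keyword`, in their original order."""
    rest = tuple(item for item in combo if item != keyword)
    # Unwrap a single leftover so pairs yield a plain string, as in the output above.
    return rest[0] if len(rest) == 1 else rest


df = pd.DataFrame(
    {
        "Combinations": [
            ("DOSA", "IDLY"),
            ("CHAPPATHI", "DOSA"),
            ("COFFEE", "TEA"),
            ("DOSA", "IDLY", "VADA"),  # hypothetical three-item combination
        ],
        "Count": [2020, 1297, 1179, 10],
    },
    index=[6, 9, 11, 99],
)
keyword = "DOSA"

# Keep only the rows whose tuple contains the keyword.
d = df[df["Combinations"].apply(lambda combo: keyword in combo)].copy()

# Remove the keyword from each remaining tuple.
d["Combinations"] = d["Combinations"].apply(lambda combo: drop_keyword(combo, keyword))
print(d)
```

For plain pairs this gives the same result as the set-based one-liner; the difference only shows up on longer combinations, where the leftover items are kept together as a tuple.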


