Select only those rows from a DataFrame where certain columns with a suffix have values not equal to zero

I want to select only those rows from a dataframe where certain columns with a suffix have values not equal to zero. There are also many such columns, so I need a generalised solution.

For example:

import pandas as pd
data = {
    'ID' : [1,2,3,4,5],
    'M_NEW':[10,12,14,16,18],
    'M_OLD':[10,12,14,16,18],
    'M_DIFF':[0,0,0,0,0],
    'CA_NEW':[10,12,16,16,18],
    'CA_OLD':[10,12,14,16,18],
    'CA_DIFF':[0,0,2,0,0],
    'BC_NEW':[10,12,14,16,18],
    'BC_OLD':[10,12,14,16,17],
    'BC_DIFF':[0,0,0,0,1]
}
df = pd.DataFrame(data)
df

The dataframe would be:

   ID  M_NEW  M_OLD  M_DIFF  CA_NEW  CA_OLD  CA_DIFF  BC_NEW  BC_OLD  BC_DIFF
0   1     10     10       0      10      10        0      10      10        0
1   2     12     12       0      12      12        0      12      12        0
2   3     14     14       0      16      14        2      14      14        0
3   4     16     16       0      16      16        0      16      16        0
4   5     18     18       0      18      18        0      18      17        1

The desired output is (because of the 2 in CA_DIFF and the 1 in BC_DIFF):

   ID  M_NEW  M_OLD  M_DIFF  CA_NEW  CA_OLD  CA_DIFF  BC_NEW  BC_OLD  BC_DIFF
0   3     14     14       0      16      14        2      14      14        0
1   5     18     18       0      18      18        0      18      17        1

This works using multiple conditions, but what if there are more DIFF columns, say 20? Can someone provide a general solution? Thanks.

You can do this:


...
# get all columns whose name ends with _DIFF
columns = df.columns[df.columns.str.endswith('_DIFF')]

# keep rows where any _DIFF column is non-zero
# (ne(0) also catches negative differences, which x > 0 would miss)
df[df[columns].ne(0).any(axis=1)]
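The same idea can be wrapped in a small reusable helper. What follows is only a minimal sketch: `select_nonzero_suffix` is a hypothetical name, and `sample` is made-up data (not the question's) that includes a negative difference. The sketch tests for "not equal to zero" rather than "greater than zero", and it resets the index so the rows are numbered 0, 1 as in the desired output.

```python
import pandas as pd


def select_nonzero_suffix(df, suffix="_DIFF"):
    """Return rows where at least one column ending in `suffix` is non-zero."""
    # Pick up every matching column, however many there are.
    cols = df.columns[df.columns.str.endswith(suffix)]
    # ne(0) is True for positive and negative differences alike.
    mask = df[cols].ne(0).any(axis=1)
    # Renumber the surviving rows from 0.
    return df.loc[mask].reset_index(drop=True)


# Hypothetical sample data with one positive and one negative difference.
sample = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "M_DIFF": [0, 0, 0, 0],
    "CA_DIFF": [0, 2, 0, 0],
    "BC_DIFF": [0, 0, 0, -1],
})

result = select_nonzero_suffix(sample)
print(result)
```

If the original row labels matter, the `reset_index(drop=True)` call can simply be dropped.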

You could use the function below, combined with pipe, to filter rows based on various conditions:

In [22]: def filter_rows(df, dtype, columns, condition, any_True = True):
    ...:     temp = df.copy()
    ...:     if dtype:
    ...:         temp = df.select_dtypes(dtype)
    ...:     if columns:
    ...:         booleans = temp.loc[:, columns].transform(condition)
    ...:     else:
    ...:         booleans = temp.transform(condition)
    ...:     if any_True:
    ...:         booleans = booleans.any(axis = 1)
    ...:     else:
    ...:         booleans = booleans.all(axis = 1)
    ...: 
    ...:     return df.loc[booleans]

In [24]: df.pipe(filter_rows,
                 dtype=None, 
                 columns=lambda df: df.columns.str.endswith("_DIFF"),
                 condition= lambda df: df.ne(0)
                 )

Out[24]: 
   ID  M_NEW  M_OLD  M_DIFF  CA_NEW  CA_OLD  CA_DIFF  BC_NEW  BC_OLD  BC_DIFF
2   3     14     14       0      16      14        2      14      14        0
4   5     18     18       0      18      18        0      18      17        1
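The `any_True` flag in `filter_rows` switches between `any(axis=1)` and `all(axis=1)`. The difference can be sketched directly on the question's `_DIFF` columns, without the helper. The variable names here are illustrative, and `df.filter(regex="_DIFF$")` is used as one possible way to pick the suffixed columns.

```python
import pandas as pd

# Only the ID and _DIFF columns from the question's data.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5],
    "M_DIFF": [0, 0, 0, 0, 0],
    "CA_DIFF": [0, 0, 2, 0, 0],
    "BC_DIFF": [0, 0, 0, 0, 1],
})

# Boolean frame: True where a _DIFF value is non-zero.
nonzero = df.filter(regex="_DIFF$").ne(0)

# any: at least one _DIFF column is non-zero in the row.
any_rows = df.loc[nonzero.any(axis=1)]

# all: every _DIFF column is non-zero in the row.
all_rows = df.loc[nonzero.all(axis=1)]

print(any_rows)
print(all_rows)
```

With this data `any_rows` keeps the rows with ID 3 and 5, while `all_rows` is empty because M_DIFF is zero everywhere.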


