pandas: create a column based on values in another column, grouped by columns selected as conditions

I have the following df:

id    match_type    amount    negative_amount
1     exact         10        False
1     exact         20        False
1     name          30        False
1     name          40        False
1     amount        15        True
1     amount        15        True 
2     exact         0         False
2     exact         0         False

I want to create a boolean column 0_amount_sum that indicates whether the sum of amount is <= 0 for each id within a particular match_type, where an amount counts as negative when negative_amount is True. For example, the following is the result df:

id    match_type    amount    0_amount_sum    negative_amount   
1     exact         10        False           False
1     exact         20        False           False
1     name          30        False           False
1     name          40        False           False
1     amount        15        True            True
1     amount        15        True            True
2     exact         0         True            False
2     exact         0         True            False

For id=1 and match_type=exact, the amount sum is 30, so 0_amount_sum is False. For id=1 and match_type=amount, both rows are flagged as negative, so the sum is -30 and 0_amount_sum is True. The code is as follows:

import numpy as np

# Filter into a separate copy per match_type; reassigning df itself
# would leave no rows for the later filters.
df_exact = df.loc[df.match_type == 'exact'].copy()

df_exact['0_amount_sum_'] = (df_exact.assign(
    amount_n=df_exact.amount * np.where(df_exact.negative_amount, -1, 1)).groupby(
    'id')['amount_n'].transform(lambda x: sum(x) <= 0))

df_name = df.loc[df.match_type == 'name'].copy()

df_name['0_amount_sum_'] = (df_name.assign(
    amount_n=df_name.amount * np.where(df_name.negative_amount, -1, 1)).groupby(
    'id')['amount_n'].transform(lambda x: sum(x) <= 0))

df_amount = df.loc[df.match_type == 'amount'].copy()

df_amount['0_amount_sum_'] = (df_amount.assign(
    amount_n=df_amount.amount * np.where(df_amount.negative_amount, -1, 1)).groupby(
    'id')['amount_n'].transform(lambda x: sum(x) <= 0))

I am wondering if there is a better, more efficient way to do this, especially when the values of match_type are not known in advance, so that the code automatically enumerates all the possible values and does the calculation for each.

I believe you need to group by two Series (columns) instead of filtering:

df['0_amount_sum_'] = ((df.amount * np.where(df.negative_amount, -1, 1))
                           .groupby([df['id'], df['match_type']])
                           .transform('sum')
                           .le(0))

   id match_type  amount  negative_amount  0_amount_sum_
0   1      exact      10            False          False
1   1      exact      20            False          False
2   1       name      30            False          False
3   1       name      40            False          False
4   1     amount      15             True           True
5   1     amount      15             True           True
6   2      exact       0            False           True
7   2      exact       0            False           True
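
For anyone who wants to reproduce the result above, here is a minimal self-contained sketch. It assumes the sample data from the question is typed in by hand; the names signed and group_sums are only illustrative and do not appear in the original answer. It also prints the signed total per (id, match_type) group, which shows why each flag comes out the way it does.

```python
import numpy as np
import pandas as pd

# Sample data copied by hand from the question.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1, 1, 2, 2],
    "match_type": ["exact", "exact", "name", "name",
                   "amount", "amount", "exact", "exact"],
    "amount": [10, 20, 30, 40, 15, 15, 0, 0],
    "negative_amount": [False, False, False, False,
                        True, True, False, False],
})

# Flip the sign of every amount flagged as negative (illustrative name).
signed = df["amount"] * np.where(df["negative_amount"], -1, 1)

# Signed total per (id, match_type) group, kept only for inspection.
group_sums = signed.groupby([df["id"], df["match_type"]]).sum()

# Broadcast each group's total back to its rows and test it against zero.
df["0_amount_sum_"] = (
    signed.groupby([df["id"], df["match_type"]]).transform("sum").le(0)
)

print(group_sums)
print(df)
```

Because the grouping keys are taken from the columns themselves, no match_type value has to be listed explicitly; any new value simply forms its own group.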
