
The pandas function any() doesn't return the result I want

I have the following DataFrame:

import pandas as pd

df = pd.DataFrame(
    {
        'class': ['0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0'],
        'item':  ['1','1','2','2','2','3','3','3','3','3','4','4','5','5','5','5','5','5','5'],
        'last_PO_code': ['103','103','103','104','103','103','104','105','106','103','103','104','103','103','104','105','105','106','1046'],
        'qty': [3,4,3,3,2,4,4,3,3,3,5,5,2,6,8,2,6,2,6],
    }
)

I apply the following rules to this DataFrame, for each unique item in the item column:

  1. last_PO_code has '103' only.

  2. last_PO_code has ( '103' & '104' ) and ( qty column of '103' > qty column of '104' ).

  3. last_PO_code has ( '103' & '104' & '105' & '106' ) and ( qty column of '105' == qty column of '106' ) and ( qty column of '103' > qty column of '104' ).

  4. last_PO_code doesn't have '103'.

  5. last_PO_code has ( '103' & '104' ) and ( qty column of '103' == qty column of '104' ).

  6. last_PO_code has ( '103' & '104' & '105' & '106' ) and ( qty column of '105' == qty column of '106' ) and ( qty column of '103' == qty column of '104' ).

I wrote the following code, but the result is not what I want.


regle1 = lambda x: True if x['last_PO_code'].eq('103').all() else False
regle2 = lambda x: True if x['last_PO_code'].eq('103').any() \
    and x['last_PO_code'].eq('104').any() \
    and x['last_PO_code'].eq('103').sum() > x['last_PO_code'].eq('104').sum() \
    else False
regle3 = lambda x: True if x['last_PO_code'].eq('103').any() \
    and x['last_PO_code'].eq('104').any() \
    and x['last_PO_code'].eq('105').any() \
    and x['last_PO_code'].eq('106').any() \
    and x['last_PO_code'].eq('103').sum() > x['last_PO_code'].eq('104').sum() \
    and x['last_PO_code'].eq('105').sum() == x['last_PO_code'].eq('106').sum() \
    else False
regle4 = lambda x: False if x['last_PO_code'].eq('103').any() else True

regle5 = lambda x: True if (x['last_PO_code'].eq('103').any() \
    and x['last_PO_code'].eq('104').any()) \
    and x['last_PO_code'].eq('103').sum() == x['last_PO_code'].eq('104').sum() \
    else False
regle6 = lambda x: True if x['last_PO_code'].eq('103').any() \
    and x['last_PO_code'].eq('104').any() \
    and x['last_PO_code'].eq('105').any() \
    and x['last_PO_code'].eq('106').any() \
    and x['last_PO_code'].eq('103').sum() == x['last_PO_code'].eq('104').sum() \
    and x['last_PO_code'].eq('105').sum() == x['last_PO_code'].eq('106').sum() \
    else False

df2 = df.groupby(['class','item']).apply(lambda x: pd.Series({'regle1' : regle1(x),
                                  'regle2': regle2(x),
                                  'regle3' : regle3(x)
                                  }))

Only regle1 does what I want for all items. To me, the problem seems to come from the any() function. Either I am using it incorrectly or I don't understand it well.

What I have:

           regle1   regle2  regle3  regle4  regle5  regle6
class   item                        
0       1   True    False   False   False   False   False
        2   False   True    False   False   False   False
        3   False   True    True    False   False   False
        4   False   False   False   False   True    False
        5   False   True    True    False   False   False

What I want:

           regle1   regle2  regle3  regle4  regle5  regle6
class   item                        
0       1   True    False   False   False   False   False
        2   False   True    False   False   False   False
        3   False   True    True    False   False   False
        4   False   False   False   False   True    False
        5   False   False   False   False   True    True

All the mistakes I noticed were on item 5, but I don't understand why.

The problem is that you are counting the rows that match each 'last_PO_code' instead of summing their 'qty'. In each lambda, you must have:

(x['last_PO_code'].eq('103')*x['qty']).sum()

or, as mozway suggested, even better:

x.loc[x['last_PO_code'].eq('103'), 'qty'].sum()

instead of:

x['last_PO_code'].eq('103').sum()
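To make the difference concrete, here is a minimal sketch on the rows of item '5' alone. It assumes the '1046' entry is a typo for '106' (as in PS2 below); the names item5, is_103, is_104 and the rows_/qty_ variables are made up for this illustration.

```python
import pandas as pd

# Rows of item '5' from the question, with the assumed typo '1046' corrected to '106'.
item5 = pd.DataFrame({
    'last_PO_code': ['103', '103', '104', '105', '105', '106', '106'],
    'qty': [2, 6, 8, 2, 6, 2, 6],
})

is_103 = item5['last_PO_code'].eq('103')
is_104 = item5['last_PO_code'].eq('104')

# Summing a boolean mask only counts the matching rows.
rows_103 = int(is_103.sum())  # 2 rows
rows_104 = int(is_104.sum())  # 1 row

# Selecting qty with the mask and summing gives the total quantity.
qty_103 = int(item5.loc[is_103, 'qty'].sum())  # 2 + 6 = 8
qty_104 = int(item5.loc[is_104, 'qty'].sum())  # 8

print(rows_103 > rows_104)  # True: why the original regle2 fired on item 5
print(qty_103 == qty_104)   # True: what regle5 is actually meant to test
```

So any() was doing its job; it was the .sum() on the boolean mask that compared row counts rather than quantities.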

The whole code:

regle1 = lambda x: True if x['last_PO_code'].eq('103').all() else False
regle2 = lambda x: True if x['last_PO_code'].eq('103').any() \
    and x['last_PO_code'].eq('104').any() \
    and (x['last_PO_code'].eq('103') * x['qty']).sum()  > (x['last_PO_code'].eq('104') * x['qty']).sum()  \
    else False
regle3 = lambda x: True if x['last_PO_code'].eq('103').any() \
    and x['last_PO_code'].eq('104').any() \
    and x['last_PO_code'].eq('105').any() \
    and x['last_PO_code'].eq('106').any() \
    and (x['last_PO_code'].eq('103')*x['qty']).sum() > (x['last_PO_code'].eq('104')*x['qty']).sum() \
    and (x['last_PO_code'].eq('105')*x['qty']).sum() == (x['last_PO_code'].eq('106')*x['qty']).sum() \
    else False
regle4 = lambda x: False if x['last_PO_code'].eq('103').any() else True

regle5 = lambda x: True if (x['last_PO_code'].eq('103').any() \
    and x['last_PO_code'].eq('104').any()) \
    and (x['last_PO_code'].eq('103')*x['qty']).sum() == (x['last_PO_code'].eq('104')*x['qty']).sum() \
    else False
regle6 = lambda x: True if x['last_PO_code'].eq('103').any() \
    and x['last_PO_code'].eq('104').any() \
    and x['last_PO_code'].eq('105').any() \
    and x['last_PO_code'].eq('106').any() \
    and (x['last_PO_code'].eq('103')*x['qty']).sum() == (x['last_PO_code'].eq('104')*x['qty']).sum() \
    and (x['last_PO_code'].eq('105')*x['qty']).sum() == (x['last_PO_code'].eq('106')*x['qty']).sum() \
    else False

df2 = df.groupby(['class','item']).apply(lambda x: pd.Series({'regle1' : regle1(x),
                                  'regle2' : regle2(x),
                                  'regle3' : regle3(x),
                                  'regle4' : regle4(x),
                                  'regle5' : regle5(x),
                                  'regle6' : regle6(x),
                                  }))

#               regle1  regle2  regle3  regle4  regle5  regle6
#class  item                        
#0      1       True    False   False   False   False   False
#       2       False   True    False   False   False   False
#       3       False   True    True    False   False   False
#       4       False   False   False   False   True    False
#       5       False   False   False   False   True    True

PS. At this point it may be time to use normal functions instead of lambdas, for cleaner code :D. Your lambdas also repeat the same chunks of code, which could easily be factored out.
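One possible way to do that refactoring is sketched below; it is only an illustration, not the only layout. The function name classify, the CODES list and the helper variables are invented here, and the data assumes '1046' has been corrected to '106'. The per-code quantity totals are computed once and reused by every rule.

```python
import pandas as pd

df = pd.DataFrame({
    'class': ['0'] * 19,
    'item': ['1','1','2','2','2','3','3','3','3','3','4','4','5','5','5','5','5','5','5'],
    # Last value is '106' here: assumed correction of the '1046' typo.
    'last_PO_code': ['103','103','103','104','103','103','104','105','106','103',
                     '103','104','103','103','104','105','105','106','106'],
    'qty': [3,4,3,3,2,4,4,3,3,3,5,5,2,6,8,2,6,2,6],
})

CODES = ['103', '104', '105', '106']

def classify(group):
    """Evaluate the six rules for one (class, item) group."""
    codes = group['last_PO_code']
    # Total qty per code; codes absent from the group count as 0.
    qty = group.groupby('last_PO_code')['qty'].sum().reindex(CODES, fill_value=0)
    present = {c: bool(codes.eq(c).any()) for c in CODES}
    has_pair = present['103'] and present['104']
    has_all = all(present.values())
    gt_103_104 = bool(qty['103'] > qty['104'])
    eq_103_104 = bool(qty['103'] == qty['104'])
    eq_105_106 = bool(qty['105'] == qty['106'])
    return pd.Series({
        'regle1': bool(codes.eq('103').all()),
        'regle2': has_pair and gt_103_104,
        'regle3': has_all and gt_103_104 and eq_105_106,
        'regle4': not present['103'],
        'regle5': has_pair and eq_103_104,
        'regle6': has_all and eq_103_104 and eq_105_106,
    })

# Selecting the two needed columns keeps the grouping keys out of each group.
result = df.groupby(['class', 'item'])[['last_PO_code', 'qty']].apply(classify)
print(result)
```

Each rule now reads close to its written description, and a change to how quantities are totalled only has to be made in one place.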

PS2. I assumed that your example data contains a typo (there should be '106' instead of '1046').
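If the typo is in the real data too, one way to correct it before grouping is Series.replace. This is a small sketch on a stand-in Series (codes and fixed are illustrative names), assuming '1046' really should be '106':

```python
import pandas as pd

codes = pd.Series(['105', '106', '1046'], name='last_PO_code')

# With scalar arguments, replace swaps whole values only,
# so existing '104' or '106' entries are left untouched.
fixed = codes.replace('1046', '106')

print(fixed.tolist())  # ['105', '106', '106']
```

On the question's DataFrame the same call would be applied to df['last_PO_code'] and assigned back to that column.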
