The pandas function any() doesn't return the result I want
I have the following DataFrame:
import pandas as pd

df = pd.DataFrame(
{
'class': ['0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0','0'],
'item': ['1','1','2','2','2','3','3','3','3','3','4','4','5','5','5','5','5','5','5'],
'last_PO_code': ['103','103','103','104','103','103','104','105','106','103','103','104','103','103','104','105','105','106','1046'],
'qty': [3,4,3,3,2,4,4,3,3,3,5,5,2,6,8,2,6,2,6],
}
)
I apply the following rules to this DataFrame for each unique item in the item column:
1. last_PO_code has '103' only.
2. last_PO_code has ('103' & '104') and (qty column of '103' > qty column of '104').
3. last_PO_code has ('103' & '104' & '105' & '106') and (qty column of '105' == qty column of '106') and (qty column of '103' > qty column of '104').
4. last_PO_code doesn't have '103'.
5. last_PO_code has ('103' & '104') and (qty column of '103' == qty column of '104').
6. last_PO_code has ('103' & '104' & '105' & '106') and (qty column of '105' == qty column of '106') and (qty column of '103' == qty column of '104').
I wrote the following code, but the result is not what I want.
regle1 = lambda x: True if x['last_PO_code'].eq('103').all() else False
regle2 = lambda x: True if x['last_PO_code'].eq('103').any() \
and x['last_PO_code'].eq('104').any() \
and x['last_PO_code'].eq('103').sum() > x['last_PO_code'].eq('104').sum() \
else False
regle3 = lambda x: True if x['last_PO_code'].eq('103').any() \
and x['last_PO_code'].eq('104').any() \
and x['last_PO_code'].eq('105').any() \
and x['last_PO_code'].eq('106').any() \
and x['last_PO_code'].eq('103').sum() > x['last_PO_code'].eq('104').sum() \
and x['last_PO_code'].eq('105').sum() == x['last_PO_code'].eq('106').sum() \
else False
regle4 = lambda x: False if x['last_PO_code'].eq('103').any() else True
regle5 = lambda x: True if (x['last_PO_code'].eq('103').any() \
and x['last_PO_code'].eq('104').any()) \
and x['last_PO_code'].eq('103').sum() == x['last_PO_code'].eq('104').sum() \
else False
regle6 = lambda x: True if x['last_PO_code'].eq('103').any() \
and x['last_PO_code'].eq('104').any() \
and x['last_PO_code'].eq('105').any() \
and x['last_PO_code'].eq('106').any() \
and x['last_PO_code'].eq('103').sum() == x['last_PO_code'].eq('104').sum() \
and x['last_PO_code'].eq('105').sum() == x['last_PO_code'].eq('106').sum() \
else False
df2 = df.groupby(['class','item']).apply(lambda x: pd.Series({'regle1' : regle1(x),
'regle2' : regle2(x),
'regle3' : regle3(x),
'regle4' : regle4(x),
'regle5' : regle5(x),
'regle6' : regle6(x),
}))
Only regle1 does what I want for all items. I think the problem comes from the any() function: either I am using it incorrectly or I don't understand it well.
What I have:
            regle1  regle2  regle3  regle4  regle5  regle6
class item
0     1       True   False   False   False   False   False
      2      False    True   False   False   False   False
      3      False    True    True   False   False   False
      4      False   False   False   False    True   False
      5      False    True    True   False   False   False
What I want:
            regle1  regle2  regle3  regle4  regle5  regle6
class item
0     1       True   False   False   False   False   False
      2      False    True   False   False   False   False
      3      False    True    True   False   False   False
      4      False   False   False   False    True   False
      5      False   False   False   False    True    True
All the mistakes I noticed are on item 5, but I don't understand why.
The problem is that you are counting the rows that match each 'last_PO_code' instead of summing their 'qty'. x['last_PO_code'].eq('103').sum() returns how many rows have the code '103', not the total quantity of those rows. In each lambda, you must have:
(x['last_PO_code'].eq('103')*x['qty']).sum()
or, as mozway suggested, even better:
x.loc[x['last_PO_code'].eq('103'), 'qty'].sum()
instead of:
x['last_PO_code'].eq('103').sum()
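To make the difference concrete, here is a small sketch that uses only the item 5 rows from the question's data. The variable names (item5, is_103, rows_103, qty_103 and so on) are made up for this illustration and do not appear in the original code.

import pandas as pd

# Rows of item 5, copied from the question's DataFrame.
item5 = pd.DataFrame({
    'last_PO_code': ['103', '103', '104', '105', '105', '106', '1046'],
    'qty': [2, 6, 8, 2, 6, 2, 6],
})

is_103 = item5['last_PO_code'].eq('103')
is_104 = item5['last_PO_code'].eq('104')

# Summing the boolean mask only counts the matching rows.
rows_103 = is_103.sum()   # 2 rows
rows_104 = is_104.sum()   # 1 row

# Summing qty over the matching rows gives the totals the rules compare.
qty_103 = item5.loc[is_103, 'qty'].sum()   # 2 + 6 = 8
qty_104 = item5.loc[is_104, 'qty'].sum()   # 8

# The mask-times-qty form gives the same totals.
assert (is_103 * item5['qty']).sum() == qty_103

print(rows_103 > rows_104)   # True: this is why regle2 fired for item 5
print(qty_103 == qty_104)    # True: this is what regle5 should test

So any() is not at fault here. The comparison was simply made on row counts (2 vs 1) rather than on quantities (8 vs 8).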
The whole code:
regle1 = lambda x: True if x['last_PO_code'].eq('103').all() else False
regle2 = lambda x: True if x['last_PO_code'].eq('103').any() \
and x['last_PO_code'].eq('104').any() \
and (x['last_PO_code'].eq('103') * x['qty']).sum() > (x['last_PO_code'].eq('104') * x['qty']).sum() \
else False
regle3 = lambda x: True if x['last_PO_code'].eq('103').any() \
and x['last_PO_code'].eq('104').any() \
and x['last_PO_code'].eq('105').any() \
and x['last_PO_code'].eq('106').any() \
and (x['last_PO_code'].eq('103')*x['qty']).sum() > (x['last_PO_code'].eq('104')*x['qty']).sum() \
and (x['last_PO_code'].eq('105')*x['qty']).sum() == (x['last_PO_code'].eq('106')*x['qty']).sum() \
else False
regle4 = lambda x: False if x['last_PO_code'].eq('103').any() else True
regle5 = lambda x: True if (x['last_PO_code'].eq('103').any() \
and x['last_PO_code'].eq('104').any()) \
and (x['last_PO_code'].eq('103')*x['qty']).sum() == (x['last_PO_code'].eq('104')*x['qty']).sum() \
else False
regle6 = lambda x: True if x['last_PO_code'].eq('103').any() \
and x['last_PO_code'].eq('104').any() \
and x['last_PO_code'].eq('105').any() \
and x['last_PO_code'].eq('106').any() \
and (x['last_PO_code'].eq('103')*x['qty']).sum() == (x['last_PO_code'].eq('104')*x['qty']).sum() \
and (x['last_PO_code'].eq('105')*x['qty']).sum() == (x['last_PO_code'].eq('106')*x['qty']).sum() \
else False
df2 = df.groupby(['class','item']).apply(lambda x: pd.Series({'regle1' : regle1(x),
'regle2' : regle2(x),
'regle3' : regle3(x),
'regle4' : regle4(x),
'regle5' : regle5(x),
'regle6' : regle6(x),
}))
#             regle1  regle2  regle3  regle4  regle5  regle6
# class item
# 0     1       True   False   False   False   False   False
#       2      False    True   False   False   False   False
#       3      False    True    True   False   False   False
#       4      False   False   False   False    True   False
#       5      False   False   False   False    True    True
PS. At this point it may be time to use normal functions instead of lambdas, to get cleaner code :D. Your lambdas also contain repeated chunks of code, which could easily be factored out.
PS2. I assumed that your example data contains a typo (there should be '106' instead of '1046').
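As a rough sketch of what the first PS suggests, the six lambdas could be replaced with a few ordinary functions. The helper names has_codes, qty_of and evaluate_rules are hypothetical, not part of the original code. The data uses '106' in place of '1046', following the assumption in PS2.

import pandas as pd

df = pd.DataFrame({
    'class': ['0'] * 19,
    'item': ['1','1','2','2','2','3','3','3','3','3','4','4','5','5','5','5','5','5','5'],
    # Assumption from PS2: the last code is '106', not '1046'.
    'last_PO_code': ['103','103','103','104','103','103','104','105','106','103',
                     '103','104','103','103','104','105','105','106','106'],
    'qty': [3,4,3,3,2,4,4,3,3,3,5,5,2,6,8,2,6,2,6],
})

def has_codes(group, codes):
    """True if every code in `codes` appears at least once in the group."""
    return all(group['last_PO_code'].eq(code).any() for code in codes)

def qty_of(group, code):
    """Total qty of the rows whose last_PO_code equals `code`."""
    return group.loc[group['last_PO_code'].eq(code), 'qty'].sum()

def evaluate_rules(group):
    """Evaluate the six rules for the rows of one (class, item) group."""
    has_2 = has_codes(group, ['103', '104'])
    has_4 = has_codes(group, ['103', '104', '105', '106'])
    q103, q104, q105, q106 = (qty_of(group, c) for c in ('103', '104', '105', '106'))
    return pd.Series({
        'regle1': bool(group['last_PO_code'].eq('103').all()),
        'regle2': bool(has_2 and q103 > q104),
        'regle3': bool(has_4 and q103 > q104 and q105 == q106),
        'regle4': not has_codes(group, ['103']),
        'regle5': bool(has_2 and q103 == q104),
        'regle6': bool(has_4 and q103 == q104 and q105 == q106),
    })

# Select only the columns the rules need before applying the function.
df2 = df.groupby(['class', 'item'])[['last_PO_code', 'qty']].apply(evaluate_rules)
print(df2)

Each rule now reads close to its written description, and the qty lookup lives in one place, so a mistake like counting rows instead of summing qty only has to be fixed once.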