Pandas / Pyspark for loop column subtract
One of the many ways to achieve this is with pandas.apply.
See if this helps:
import pandas as pd

data = {
    "name": ["tom", "jim"],
    "cummulative": [17, 15],
    "voucher1": [10, 0],
    "voucher2": [5, 5],
    "voucher3": [2, 10],
    "credit": [20, 10],
}
df = pd.DataFrame(data)

def change_order(row):
    # Redeem vouchers from voucher3 down to voucher1,
    # subtracting each one the remaining credit can cover.
    new_dict = row.to_dict()
    credit = row.credit
    cummulative = row.cummulative
    for i in range(3, 0, -1):
        current = row[f"voucher{i}"]
        if credit >= current:
            credit -= current
            cummulative -= current
            new_dict["credit"] = credit
            new_dict["cummulative"] = cummulative
            new_dict[f"voucher{i}"] = 0
    return pd.Series(new_dict)

df = df.apply(change_order, axis=1)
print(df)
Output:
  name  cummulative  voucher1  voucher2  voucher3  credit
0  tom            0         0         0         0       3
1  jim            5         0         5         0       0
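Stripped of pandas, the greedy rule that change_order encodes is simply: walk the vouchers from last to first and redeem each one the remaining credit can cover. A minimal sketch of that rule on its own (the helper name redeem_vouchers is mine, not from the answer):

```python
def redeem_vouchers(credit, vouchers):
    """Redeem vouchers from the last to the first while credit covers them.

    Returns the leftover credit and the list of unredeemed voucher values
    (their sum is the new cumulative total).
    """
    remaining = list(vouchers)
    for i in range(len(remaining) - 1, -1, -1):
        if credit >= remaining[i]:
            credit -= remaining[i]
            remaining[i] = 0
    return credit, remaining

print(redeem_vouchers(20, [10, 5, 2]))   # tom -> (3, [0, 0, 0])
print(redeem_vouchers(10, [0, 5, 10]))   # jim -> (0, [0, 5, 0])
```

This matches the table above: tom ends with credit 3 and cumulative 0, jim with credit 0 and cumulative 5 (the unredeemed voucher2).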
This is actually a really nice exercise.
It can be done in PySpark, though it feels like there should be a better way...
Input:
from pyspark.sql import functions as F

df = spark.createDataFrame(
    [('tom', 17, 10, 5, 2, 20),
     ('jim', 15, 0, 5, 10, 10)],
    ['name', 'cumulative_vouchers_used', 'voucher1', 'voucher2', 'voucher3', 'credit'])
Script:
c, v1, v2, v3 = F.col('credit'), F.col('voucher1'), F.col('voucher2'), F.col('voucher3')
subt_v3 = F.when(c >= v3, c - v3).otherwise(c)
new_v3 = F.when(c >= v3, 0).otherwise(v3)
subt_v2 = F.when((subt_v3 >= v2) & (subt_v3 != c), subt_v3 - v2).otherwise(subt_v3)
new_v2 = F.when(subt_v3 >= v2, 0).otherwise(v2)
subt_v1 = F.when((subt_v2 >= v1) & (subt_v2 != subt_v3), subt_v2 - v1).otherwise(subt_v2)
new_v1 = F.when(subt_v2 >= v1, 0).otherwise(v1)
new_cum = new_v1 + new_v2 + new_v3
df = df.select(
    'name',
    new_cum.alias('cumulative_vouchers_used'),
    new_v1.alias('voucher1'),
    new_v2.alias('voucher2'),
    new_v3.alias('voucher3'),
    subt_v1.alias('credit')
)
df.show()
# +----+------------------------+--------+--------+--------+------+
# |name|cumulative_vouchers_used|voucher1|voucher2|voucher3|credit|
# +----+------------------------+--------+--------+--------+------+
# | tom| 0| 0| 0| 0| 3|
# | jim| 5| 0| 5| 0| 0|
# +----+------------------------+--------+--------+--------+------+
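Because the columns are built by chaining F.when conditions, the same cascade can be traced in plain Python to check the logic against the pandas result. This is a cross-check sketch, not Spark code; the function name cascade is mine:

```python
def cascade(credit, v1, v2, v3):
    # Mirror of the when/otherwise chain above, one stage per voucher,
    # applied from voucher3 down to voucher1.
    subt_v3 = credit - v3 if credit >= v3 else credit
    new_v3 = 0 if credit >= v3 else v3
    subt_v2 = subt_v3 - v2 if (subt_v3 >= v2 and subt_v3 != credit) else subt_v3
    new_v2 = 0 if subt_v3 >= v2 else v2
    subt_v1 = subt_v2 - v1 if (subt_v2 >= v1 and subt_v2 != subt_v3) else subt_v2
    new_v1 = 0 if subt_v2 >= v1 else v1
    # (new cumulative, voucher1, voucher2, voucher3, credit)
    return new_v1 + new_v2 + new_v3, new_v1, new_v2, new_v3, subt_v1

print(cascade(20, 10, 5, 2))   # tom -> (0, 0, 0, 0, 3)
print(cascade(10, 0, 5, 10))   # jim -> (5, 0, 5, 0, 0)
```

Both rows reproduce the show() output above, which is one way to convince yourself the chained expressions match the row-by-row pandas loop for this data.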