How to calculate data using if/elif/else and minimum value in columns?
For example, I have this DataFrame:
import pandas as pd

a = [{'name': 'A', 'col_1': 5, 'col_2': 3, 'col_3': 1.5},
     {'name': 'B', 'col_1': 4, 'col_2': 2.5, 'col_3': None},
     {'name': 'C', 'col_1': 8, 'col_2': None, 'col_3': None},
     {'name': 'D', 'col_1': 7, 'col_2': 9, 'col_3': None}]
df = pd.DataFrame(a)
# Replace missing values with 0 so the comparisons below work on plain numbers
df['col_1'] = df['col_1'].fillna(0)
df['col_2'] = df['col_2'].fillna(0)
df['col_3'] = df['col_3'].fillna(0)
print(df)
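I am trying to compute the values of the df['col_4'] column, and I would like to do it in one line of code, although that may not be possible.
The calculation logic below works for df['name'] == 'A' and 'B', but not for df['name'] == 'C', so it needs to be extended: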
df['col_4'] = [i if i != 0 else x - (x * 0.75) for i, x, y in zip(df['col_3'], df['col_2'], df['col_1'])]
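The calculation needs to go one step further: if the values in both df['col_3'] and df['col_2'] are 0, then use y - (y * 0.75), where y comes from df['col_1'].
In the case of df['name'] == 'D', the values in df['col_1'] and df['col_2'] must first be compared and the smaller one chosen.
I need the following result: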
You could also use pandas operations, but I kept it in the style you wanted:
df['col_4'] = [t[0] if t[0] > 0 else
               (min(t[1], t[2]) - (min(t[1], t[2]) * 0.75) if bool(t[1]) else t[2] - 0.75 * t[2])
               for t in zip(df['col_3'], df['col_2'], df['col_1'])]
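If the one-liner is hard to read, the same branching can be spelled out with if/elif/else. This is only a sketch: `calc_col_4` is a hypothetical helper name that does not appear in the question, and it assumes missing values have already been replaced with 0 as in the setup code.

```python
import pandas as pd

def calc_col_4(col_1, col_2, col_3):
    """Hypothetical helper: the one-liner's branches written out explicitly."""
    if col_3 > 0:
        # col_3 is present: use it unchanged
        return col_3
    elif col_2 != 0:
        # col_3 is missing but col_2 is present: take the smaller of col_1 and col_2
        smallest = min(col_1, col_2)
        return smallest - (smallest * 0.75)
    else:
        # both col_3 and col_2 are missing: fall back to col_1
        return col_1 - (col_1 * 0.75)

df = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'col_1': [5, 4, 8, 7],
    'col_2': [3, 2.5, None, 9],
    'col_3': [1.5, None, None, None],
})
num_cols = ['col_1', 'col_2', 'col_3']
df[num_cols] = df[num_cols].fillna(0)

df['col_4'] = [calc_col_4(c1, c2, c3)
               for c1, c2, c3 in zip(df['col_1'], df['col_2'], df['col_3'])]
print(df)
```

The list comprehension stays on one line, while the rules themselves live in a function that can be tested separately.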
Full script to check:
import pandas as pd

a = [{'name': 'A', 'col_1': 5, 'col_2': 3, 'col_3': 1.5},
     {'name': 'B', 'col_1': 4, 'col_2': 2.5, 'col_3': None},
     {'name': 'C', 'col_1': 8, 'col_2': None, 'col_3': None},
     {'name': 'D', 'col_1': 7, 'col_2': 9, 'col_3': None}]
df = pd.DataFrame(a)
df['col_1'] = df['col_1'].fillna(0)
df['col_2'] = df['col_2'].fillna(0)
df['col_3'] = df['col_3'].fillna(0)
df['col_4'] = [t[0] if t[0] > 0 else
               (min(t[1], t[2]) - (min(t[1], t[2]) * 0.75) if bool(t[1]) else t[2] - 0.75 * t[2])
               for t in zip(df['col_3'], df['col_2'], df['col_1'])]
print(df)
Result:
  name  col_1  col_2  col_3  col_4
0    A      5    3.0    1.5  1.500
1    B      4    2.5    0.0  0.625
2    C      8    0.0    0.0  2.000
3    D      7    9.0    0.0  1.750
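For reference, the "Pandas operations" route mentioned above might look roughly like the following. It is a sketch rather than part of the original answer: it uses `numpy.select`, which picks the first matching condition per row, much like an if/elif/else chain, and it assumes missing values were filled with 0 beforehand.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'col_1': [5, 4, 8, 7],
    'col_2': [3, 2.5, None, 9],
    'col_3': [1.5, None, None, None],
})
num_cols = ['col_1', 'col_2', 'col_3']
df[num_cols] = df[num_cols].fillna(0)

# Row-wise minimum of col_1 and col_2
row_min = df[['col_1', 'col_2']].min(axis=1)

# Conditions are checked in order; the first True one decides the row
conditions = [
    df['col_3'] > 0,     # "if": col_3 is present
    df['col_2'] != 0,    # "elif": col_2 is present
    df['col_2'] == 0,    # "else": only col_1 is left
]
choices = [
    df['col_3'],
    row_min * 0.25,       # x - 0.75 * x is the same as 0.25 * x
    df['col_1'] * 0.25,
]
df['col_4'] = np.select(conditions, choices, default=np.nan)
print(df)
```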
You can simplify the logic to:
import numpy as np

df['col_4'] = np.where(df['col_3'].eq(0),
                       df[['col_1', 'col_2']].min(axis=1).mul(0.25),
                       df['col_3'])
Output:
  name  col_1  col_2  col_3  col_4
0    A      5    3.0    1.5  1.500
1    B      4    2.5    0.0  0.625
2    C      8    0.0    0.0  0.000
3    D      7    9.0    0.0  1.750
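Note that row C comes out as 0.000 here rather than 2.000: once col_2 has been filled with 0, that 0 becomes the row minimum. One possible way around this, sketched below as an assumption and not taken from the original answer, is to treat zeros as missing before taking the minimum, since `min(axis=1)` skips NaN by default.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'col_1': [5, 4, 8, 7],
    'col_2': [3, 2.5, None, 9],
    'col_3': [1.5, None, None, None],
})
num_cols = ['col_1', 'col_2', 'col_3']
df[num_cols] = df[num_cols].fillna(0)

# Turn the filled-in zeros back into NaN so they are ignored by min()
smallest = df[['col_1', 'col_2']].replace(0, np.nan).min(axis=1)

df['col_4'] = np.where(df['col_3'].eq(0),
                       smallest.mul(0.25),
                       df['col_3'])
print(df)
```

Skipping the `fillna(0)` step for col_1 and col_2 in the first place should have the same effect, as long as a genuine 0 is never a valid value in those columns.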