Pandas iterate over values of single column in data frame
I am a beginner to Python and pandas.
I have a 5000-row data frame that looks something like this:
INDEX COL1 COL2 COL3
0 10.0 12.0 15.0
1 14.0 16.0 153.8
2 18.0 20.0 16.3
3 22.0 24.0 101.7
I wish to iterate over the values in COL3 and carry out calculations, such that:
For each row in the data frame, if the value in COL3 is <= 100.0, multiply that value by 10 and assign it to the variable "New_Value"; else, multiply the value by 5 and assign it to the variable "New_Value".
I understand that an if statement cannot be applied directly to a data frame Series, as it leads to an ambiguous truth value error. However, I am stuck trying to find the right tool for this task, and would appreciate some guidance.
Cheers
Using np.where:
df['New_Value'] = np.where(df['COL3']<=100,df['COL3']*10,df['COL3']*5)
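For reference, here is a minimal self-contained sketch of the np.where approach. The imports and the DataFrame construction are assumptions added so the snippet runs on its own; it uses only the four sample rows from the question, not the full 5000-row frame.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; the real frame has 5000 rows.
df = pd.DataFrame({
    "COL1": [10.0, 14.0, 18.0, 22.0],
    "COL2": [12.0, 16.0, 20.0, 24.0],
    "COL3": [15.0, 153.8, 16.3, 101.7],
})

# Vectorized choice: where COL3 <= 100 take COL3 * 10, otherwise COL3 * 5.
df["New_Value"] = np.where(df["COL3"] <= 100, df["COL3"] * 10, df["COL3"] * 5)
print(df)
```

np.where evaluates the condition for the whole column at once, so no explicit loop over rows is needed.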
One liner
df.COL3.apply(lambda x: x*10 if x<=100 else 5*x)
For this example, you can use apply, which calls a function on each value in the column.
lambda is a quick way to define a small function inline. It differs slightly from a normal function: its body is a single expression.
The condition is x*10 if x<=100 else 5*x: for each x less than or equal to 100, multiply it by 10; else, multiply it by 5.
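To make the lambda less mysterious, it can also be written as an ordinary named function. The sketch below is only an illustration; the names scale and scale_lambda are hypothetical and do not come from the answer.

```python
# Named-function version of the rule; `scale` is a hypothetical name.
def scale(x):
    if x <= 100:
        return x * 10
    return x * 5

# The same rule as a lambda using a conditional expression.
scale_lambda = lambda x: x * 10 if x <= 100 else x * 5

# Both forms give the same result for values below, at, and above the threshold.
for value in [15.0, 100.0, 153.8]:
    assert scale(value) == scale_lambda(value)
```

Either form can be passed to apply; the lambda just saves defining a name for a function used once.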
Try this:
df['New_Value']=df.COL3.apply(lambda x: 10*x if x<=100 else 5*x)
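As a rough illustration of why apply works where a plain if does not, the sketch below first triggers the ambiguous truth value error and then runs the apply version. The DataFrame here is an assumption built from the COL3 sample values in the question, and the flag name raised is hypothetical.

```python
import pandas as pd

# COL3 sample values from the question.
df = pd.DataFrame({"COL3": [15.0, 153.8, 16.3, 101.7]})

# A plain `if` on the whole Series raises ValueError, because a Series of
# booleans has no single truth value.
try:
    if df["COL3"] <= 100:
        pass
    raised = False
except ValueError:
    raised = True

# apply calls the lambda once per value, so `x` is a scalar and `if` is fine.
df["New_Value"] = df["COL3"].apply(lambda x: 10 * x if x <= 100 else 5 * x)
print(df)
```

On a frame of a few thousand rows either approach should be fine; the vectorized np.where version generally scales better, since apply calls a Python function for every value.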