Update multiple column values based on condition in Python

I have a dataframe like this:

ID    00:00  01:00  02:00  ...   23:00   avg_value
22      4.7     5.3     6   ...    8         5.5
37       0      9.2    4.5  ...    11.2      9.2
4469     2      9.8    11   ...    2         6.4

Can I use np.where to apply a condition to multiple columns at once? I want to update the values in the columns from 00:00 to 23:00 to 0 or 1. If the value at a given time of day is greater than avg_value, I change it to 1, otherwise to 0.

I know how to apply this method to a single column:

np.where(df['00:00']>df['avg_value'],1,0)

Can I extend it to multiple columns?

The output should look like this:

ID    00:00  01:00  02:00  ...   23:00   avg_value
22      0     0       1    ...      1       5.5
37      0     0       0    ...      1       9.2
4469    0     1       1    ...      0       6.4

Select all columns except the last with DataFrame.iloc, compare them to avg_value with DataFrame.gt, cast the result to integers, and finally add the avg_value column back with DataFrame.join:

df = df.iloc[:, :-1].gt(df['avg_value'], axis=0).astype(int).join(df['avg_value'])
print (df)
      00:00  01:00  02:00  23:00  avg_value
ID                                         
22        0      0      1      1        5.5
37        0      0      0      1        9.2
4469      0      1      1      0        6.4
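
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the sample data from the question with only the four hour columns that are shown (the columns hidden behind "..." are left out), and the names flags and result are just illustrative.

```python
import pandas as pd

# Sample data from the question; the hours elided by "..." are omitted.
df = pd.DataFrame(
    {
        "00:00": [4.7, 0.0, 2.0],
        "01:00": [5.3, 9.2, 9.8],
        "02:00": [6.0, 4.5, 11.0],
        "23:00": [8.0, 11.2, 2.0],
        "avg_value": [5.5, 9.2, 6.4],
    },
    index=pd.Index([22, 37, 4469], name="ID"),
)

# Compare every hour column with avg_value row by row;
# axis=0 aligns the Series with the DataFrame's index.
flags = df.iloc[:, :-1].gt(df["avg_value"], axis=0).astype(int)

# Put the untouched avg_value column back on the right.
result = flags.join(df["avg_value"])
print(result)
```

Note that gt is a strict comparison, so a value equal to avg_value (9.2 at 01:00 for ID 37) maps to 0.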

Or use DataFrame.pop to extract the column:

s = df.pop('avg_value')
df = df.gt(s, axis=0).astype(int).join(s)
print (df)
      00:00  01:00  02:00  23:00  avg_value
ID                                         
22        0      0      1      1        5.5
37        0      0      0      1        9.2
4469      0      1      1      0        6.4
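
To answer the np.where part of the question directly: it can work on several columns at once if avg_value is reshaped so that NumPy broadcasts it across the columns. The following is only a sketch under the same assumptions as the question's sample (four visible hour columns, ID as the index); hour_cols, avg and flags are hypothetical names.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the hours elided by "..." are omitted.
df = pd.DataFrame(
    {
        "00:00": [4.7, 0.0, 2.0],
        "01:00": [5.3, 9.2, 9.8],
        "02:00": [6.0, 4.5, 11.0],
        "23:00": [8.0, 11.2, 2.0],
        "avg_value": [5.5, 9.2, 6.4],
    },
    index=pd.Index([22, 37, 4469], name="ID"),
)

hour_cols = list(df.columns.drop("avg_value"))

# Turn avg_value into a column vector of shape (n_rows, 1) so it
# broadcasts against the (n_rows, n_hours) array of hour values.
avg = df["avg_value"].to_numpy()[:, None]

flags = pd.DataFrame(
    np.where(df[hour_cols].to_numpy() > avg, 1, 0),
    index=df.index,
    columns=hour_cols,
).join(df["avg_value"])
print(flags)
```

The gt version is usually easier to read because pandas handles the alignment itself, but the np.where form is handy when the replacement values are something other than 0 and 1.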

A new DataFrame is built in both cases because assigning the result back to the same columns with iloc converts the integers to floats (arguably a bug):

df.iloc[:, :-1] = df.iloc[:, :-1].gt(df['avg_value'], axis=0).astype(int)
print (df)
      00:00  01:00  02:00  23:00  avg_value
ID                                         
22      0.0    0.0    1.0    1.0        5.5
37      0.0    0.0    0.0    1.0        9.2
4469    0.0    1.0    1.0    0.0        6.4
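
If you do want to overwrite the original columns, assigning by column labels instead of by iloc should replace the columns rather than write into the existing float blocks, so the integer dtype is kept. This is a sketch under the assumption that the sample looks like the question's data, and the behavior may vary between pandas versions, so check df.dtypes afterwards.

```python
import pandas as pd

# Sample data from the question; the hours elided by "..." are omitted.
df = pd.DataFrame(
    {
        "00:00": [4.7, 0.0, 2.0],
        "01:00": [5.3, 9.2, 9.8],
        "02:00": [6.0, 4.5, 11.0],
        "23:00": [8.0, 11.2, 2.0],
        "avg_value": [5.5, 9.2, 6.4],
    },
    index=pd.Index([22, 37, 4469], name="ID"),
)

hour_cols = list(df.columns[:-1])

# df[list_of_labels] = DataFrame sets each column in turn,
# so the new columns take the dtype of the assigned values.
df[hour_cols] = df[hour_cols].gt(df["avg_value"], axis=0).astype(int)
print(df.dtypes)
```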
