Update multiple column values based on a condition in Python

Question

I have a DataFrame like this:

ID    00:00  01:00  02:00  ...   23:00   avg_value
22      4.7     5.3     6   ...    8         5.5
37       0      9.2    4.5  ...    11.2      9.2
4469     2      9.8    11   ...    2         6.4

Can I use np.where to apply a condition to multiple columns at once? I want to replace the values in the columns 00:00 through 23:00 with 0 or 1: if the value at a given time of day is greater than avg_value, it becomes 1, otherwise 0.

I know how to apply this to a single column:

np.where(df['00:00']>df['avg_value'],1,0)

Can I extend it to multiple columns?

The output should look like this:

ID    00:00  01:00  02:00  ...   23:00   avg_value
22      0     0       1    ...      1       5.5
37      0     0       0    ...      1       9.2
4469    0     1       1    ...      0       6.4

Answer 1

Select all columns except the last with DataFrame.iloc, compare them with the averages using DataFrame.gt, cast the result to integers, and finally add the avg_value column back with DataFrame.join:

df = df.iloc[:, :-1].gt(df['avg_value'], axis=0).astype(int).join(df['avg_value'])
print (df)
      00:00  01:00  02:00  23:00  avg_value
ID                                         
22        0      0      1      1        5.5
37        0      0      0      1        9.2
4469      0      1      1      0        6.4

Or use DataFrame.pop to extract the column:

s = df.pop('avg_value')
df = df.gt(s, axis=0).astype(int).join(s)
print (df)
      00:00  01:00  02:00  23:00  avg_value
ID                                         
22        0      0      1      1        5.5
37        0      0      0      1        9.2
4469      0      1      1      0        6.4
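Note that DataFrame.pop removes avg_value from df in place. If the original frame should stay untouched, a non-mutating variant is possible. The following is only a sketch: the sample frame is rebuilt from the rows shown in the question (the elided hours are left out), and the names flags and result are illustrative, not from the answer.

```python
import pandas as pd

# Sample data rebuilt from the question; the hours hidden behind "..." are omitted.
df = pd.DataFrame(
    {
        "00:00": [4.7, 0.0, 2.0],
        "01:00": [5.3, 9.2, 9.8],
        "02:00": [6.0, 4.5, 11.0],
        "23:00": [8.0, 11.2, 2.0],
        "avg_value": [5.5, 9.2, 6.4],
    },
    index=pd.Index([22, 37, 4469], name="ID"),
)

# Drop avg_value by name (returns a copy), compare row-wise, cast to int.
flags = df.drop(columns="avg_value").gt(df["avg_value"], axis=0).astype(int)

# Attach the untouched averages again; df itself is not modified.
result = flags.assign(avg_value=df["avg_value"])
print(result)
```

Selecting by name rather than by position also keeps working if avg_value is not the last column.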

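The answer builds a new DataFrame instead of assigning in place because, when the result is assigned back to the same columns, the integers are converted to floats (the answer's author considers this a bug):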

df.iloc[:, :-1] = df.iloc[:, :-1].gt(df['avg_value'], axis=0).astype(int)
print (df)
      00:00  01:00  02:00  23:00  avg_value
ID                                         
22      0.0    0.0    1.0    1.0        5.5
37      0.0    0.0    0.0    1.0        9.2
4469    0.0    1.0    1.0    0.0        6.4
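To answer the original question literally: np.where can also handle all the hour columns at once if the averages are reshaped into a column vector so that NumPy broadcasts them across each row. This is a sketch under assumptions, not part of the accepted answer: the sample frame is rebuilt from the question's rows, and hour_cols, avg, flags and result are illustrative names.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; the hours hidden behind "..." are omitted.
df = pd.DataFrame(
    {
        "00:00": [4.7, 0.0, 2.0],
        "01:00": [5.3, 9.2, 9.8],
        "02:00": [6.0, 4.5, 11.0],
        "23:00": [8.0, 11.2, 2.0],
        "avg_value": [5.5, 9.2, 6.4],
    },
    index=pd.Index([22, 37, 4469], name="ID"),
)

hour_cols = df.columns.drop("avg_value")

# Shape (n_rows, 1) so each row is compared with its own average.
avg = df["avg_value"].to_numpy()[:, None]

# np.where works elementwise on the 2-D array of hour values.
flags = np.where(df[hour_cols].to_numpy() > avg, 1, 0)

# Wrap the integer array in a new DataFrame so the dtype stays integer.
result = pd.DataFrame(flags, index=df.index, columns=hour_cols).join(df["avg_value"])
print(result)
```

Building a new frame from the integer array, rather than writing into the existing float columns, avoids the float conversion shown above.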
