How do I change multiple values in a pandas df column to np.nan, based on a condition in another column?
I don't have much experience in coding and this is my first question, so please be patient with me. I need to find a way to change multiple values of a pandas df column to np.nan, based on a condition in another column. Therefore I have created copies of the required columns "Vorgabe" and "Temp".
Whenever the value in "Grad" isn't 0, I want to change the values in a defined area in "Vorgabe" and "Temp" to np.nan.
print(df)
OptOpTemp OpTemp BSP Grad Vorgabe Temp
0 22.0 20.0 5 0.0 22.0 20.0
1 22.0 20.5 7 0.0 22.0 20.5
2 22.0 21.0 8 1.0 22.0 21.0
3 22.0 21.0 6 0.0 22.0 21.0
4 22.0 23.5 7 0.0 22.0 20.0
5 23.0 21.5 1 0.0 23.0 21.5
6 24.0 22.5 3 1.0 24.0 22.5
7 24.0 23.0 4 0.0 24.0 23.0
8 24.0 25.5 9 0.0 24.0 25.5
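For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame from the printout above. The values are typed in by hand from that output, so treat the construction itself as an assumption about how the data was created.

```python
import pandas as pd

# Values copied from the printout above. "Vorgabe" and "Temp" are the
# working copies; row 4 of "Temp" is 20.0 in the printout and is kept as shown.
df = pd.DataFrame({
    "OptOpTemp": [22.0, 22.0, 22.0, 22.0, 22.0, 23.0, 24.0, 24.0, 24.0],
    "OpTemp":    [20.0, 20.5, 21.0, 21.0, 23.5, 21.5, 22.5, 23.0, 25.5],
    "BSP":       [5, 7, 8, 6, 7, 1, 3, 4, 9],
    "Grad":      [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    "Vorgabe":   [22.0, 22.0, 22.0, 22.0, 22.0, 23.0, 24.0, 24.0, 24.0],
    "Temp":      [20.0, 20.5, 21.0, 21.0, 20.0, 21.5, 22.5, 23.0, 25.5],
})
print(df)
```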
So I want to achieve something like this:
OptOpTemp OpTemp BSP Grad Vorgabe Temp
0 22.0 20.0 5 0.0 22.0 20.0
1 22.0 20.5 7 0.0 nan nan <-one row above
2 22.0 21.0 8 1.0 nan nan
3 22.0 21.0 6 0.0 nan nan <-one row below
4 22.0 23.5 7 0.0 22.0 20.0
5 23.0 21.5 1 0.0 nan nan
6 24.0 22.5 3 1.0 nan nan
7 24.0 23.0 4 0.0 nan nan
8 24.0 25.5 9 0.0 24.0 25.5
Does somebody have a solution to my problem?
EDIT: I may have been unclear. The goal is to change every value in "Vorgabe" and "Temp" in a defined area to nan. In my example the area would be one row above, the row with 1.0 in it, and one row below. So not only the row where 1.0 is located, but also the rows above and below it.
df.loc[df.Grad != 0.0, ['Vorgabe', 'Temp']] = np.nan
print(df)
Output:
OptOpTemp OpTemp BSP Grad Vorgabe Temp
0 22.0 20.0 5 0.0 22.0 20.0
1 22.0 20.5 7 0.0 22.0 20.5
2 22.0 21.0 8 1.0 NaN NaN
3 22.0 21.0 6 0.0 22.0 21.0
4 22.0 23.5 7 0.0 22.0 20.0
5 23.0 21.5 1 0.0 23.0 21.5
6 24.0 22.5 3 1.0 NaN NaN
7 24.0 23.0 4 0.0 24.0 23.0
8 24.0 25.5 9 0.0 24.0 25.5
You could use numpy.where.
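import numpy as np
# np.where(cond, value_if_true, value_if_false):
# NaN where Grad is non-zero, the original value otherwise
df['Vorgabe'] = np.where(df['Grad'] != 0, np.nan, df['OptOpTemp'])
df['Temp'] = np.where(df['Grad'] != 0, np.nan, df['OpTemp'])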
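Series.mask is another way to express "set to NaN where a condition holds". The sketch below uses a cut-down frame that is only an assumption for illustration. Note that it blanks only the rows where Grad itself is non-zero, not the neighbouring rows.

```python
import pandas as pd

# Small illustrative frame (hypothetical subset of the question's columns)
df = pd.DataFrame({
    "Grad":    [0.0, 0.0, 1.0, 0.0],
    "Vorgabe": [22.0, 22.0, 22.0, 22.0],
    "Temp":    [20.0, 20.5, 21.0, 21.0],
})

cond = df["Grad"] != 0
# mask() keeps values where cond is False and writes NaN where it is True
df["Vorgabe"] = df["Vorgabe"].mask(cond)
df["Temp"] = df["Temp"].mask(cond)
print(df)
```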
Chain three conditions with | (bitwise OR). For the rows above and below each 1, build the extra masks with shift:
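# Variant 1: match rows where Grad is exactly 1
mask1 = df['Grad'] == 1
mask2 = df['Grad'].shift() == 1
mask3 = df['Grad'].shift(-1) == 1
# Variant 2 (use instead of variant 1): match any non-zero Grad.
# fill_value=0 stops the NaN that shift adds at the edges from counting as non-zero.
mask1 = df['Grad'] != 0
mask2 = df['Grad'].shift(fill_value=0) != 0
mask3 = df['Grad'].shift(-1, fill_value=0) != 0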
mask = mask1 | mask2 | mask3
df.loc[mask, ['Vorgabe', 'Temp']] = np.nan
print (df)
OptOpTemp OpTemp BSP Grad Vorgabe Temp
0 22.0 20.0 5 0.0 22.0 20.0
1 22.0 20.5 7 0.0 NaN NaN
2 22.0 21.0 8 1.0 NaN NaN
3 22.0 21.0 6 0.0 NaN NaN
4 22.0 23.5 7 0.0 22.0 20.0
5 23.0 21.5 1 0.0 NaN NaN
6 24.0 22.5 3 1.0 NaN NaN
7 24.0 23.0 4 0.0 NaN NaN
8 24.0 25.5 9 0.0 24.0 25.5
General solution for multiple rows:
N = 1
# shift offsets from -N to N (0 is the matching row itself)
r = np.concatenate([np.arange(0, N+1), np.arange(-1, -N-1, -1)])
# one boolean mask per offset, combined with a logical OR via reduce
mask = np.logical_or.reduce([df['Grad'].shift(x) == 1 for x in r])
df.loc[mask, ['Vorgabe', 'Temp']] = np.nan
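The same window idea can also be sketched with a centered rolling maximum instead of a list of shifts. This is a swapped-in alternative, not the approach used above, and the three-column frame is an assumption for illustration.

```python
import numpy as np
import pandas as pd

# Illustrative frame with the question's Grad pattern
df = pd.DataFrame({
    "Grad":    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    "Vorgabe": [22.0, 22.0, 22.0, 22.0, 22.0, 23.0, 24.0, 24.0, 24.0],
    "Temp":    [20.0, 20.5, 21.0, 21.0, 20.0, 21.5, 22.5, 23.0, 25.5],
})

N = 1  # number of rows to blank on each side of a hit
hit = (df["Grad"] == 1).astype(int)
# Centered window of 2*N+1 rows; the max is 1 if any row in the window is a hit.
# min_periods=1 lets the shorter windows at the edges still produce a value.
mask = hit.rolling(window=2 * N + 1, center=True, min_periods=1).max().astype(bool)
df.loc[mask, ["Vorgabe", "Temp"]] = np.nan
print(df)
```

Changing N widens or narrows the blanked area without touching the rest of the code.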
EDIT:
If different values need different window sizes, you can build one mask per value and join the masks together:
N = 1
r1 = np.concatenate([np.arange(0, N+1), np.arange(-1, -N-1, -1)])
mask1 = np.logical_or.reduce([df['Grad'].shift(x) == 1 for x in r1])
N = 2
r2 = np.concatenate([np.arange(0, N+1), np.arange(-1, -N-1, -1)])
mask2 = np.logical_or.reduce([df['Grad'].shift(x) == 1.5 for x in r2])
# if == 1.5 does not match because of floating-point precision, use np.isclose instead:
#mask2 = np.logical_or.reduce([np.isclose(df['Grad'].shift(x), 1.5) for x in r2])
mask = mask1 | mask2
df.loc[mask, ['Vorgabe', 'Temp']] = np.nan
print (df)
OptOpTemp OpTemp BSP Grad Vorgabe Temp
0 22.0 20.0 5 0.0 22.0 20.0
1 22.0 20.5 7 0.0 NaN NaN
2 22.0 21.0 8 1.0 NaN NaN
3 22.0 21.0 6 0.0 NaN NaN
4 22.0 23.5 7 0.0 NaN NaN
5 23.0 21.5 1 0.0 NaN NaN
6 24.0 22.5 3 1.5 NaN NaN <- changed value to 1.5
7 24.0 23.0 4 0.0 NaN NaN
8 24.0 25.5 9 0.0 NaN NaN
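The two-mask pattern can be wrapped in a small helper so that each value gets its own window size. The sketch below is an illustration only: window_mask is a hypothetical name, and it uses np.isclose throughout to sidestep float comparison issues.

```python
import numpy as np
import pandas as pd

def window_mask(series, value, n):
    """True for rows within n rows of an entry close to `value` (hypothetical helper)."""
    # One boolean array per offset; NaN introduced by shift never matches.
    hits = [np.isclose(series.shift(k).to_numpy(), value) for k in range(-n, n + 1)]
    return pd.Series(np.logical_or.reduce(hits), index=series.index)

# Illustrative frame: row 6 holds 1.5, as in the output above
df = pd.DataFrame({
    "Grad":    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.5, 0.0, 0.0],
    "Vorgabe": [22.0, 22.0, 22.0, 22.0, 22.0, 23.0, 24.0, 24.0, 24.0],
    "Temp":    [20.0, 20.5, 21.0, 21.0, 20.0, 21.5, 22.5, 23.0, 25.5],
})

# 1 blanks one row on each side, 1.5 blanks two rows on each side
mask = window_mask(df["Grad"], 1, 1) | window_mask(df["Grad"], 1.5, 2)
df.loc[mask, ["Vorgabe", "Temp"]] = np.nan
print(df)
```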
You can use df.apply(f, axis=1) and define f to be whatever you want to do on each row. Your description seems to be saying you want
def f(row):
    if row['Grad'] != 0:
        row.loc[['Vorgabe', 'Temp']] = np.nan
    return row
However, your example seems to be saying you want something else.
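For completeness, a minimal sketch of how such a row function would be applied. The small frame is an assumption for illustration, and the copy inside f is added here to avoid mutating the object pandas passes in. It blanks only the matching rows, and row-wise apply is generally slower than a boolean mask on larger frames.

```python
import numpy as np
import pandas as pd

def f(row):
    row = row.copy()  # work on a copy rather than the object pandas passes in
    if row["Grad"] != 0:
        row.loc[["Vorgabe", "Temp"]] = np.nan
    return row

# Small illustrative frame (hypothetical subset of the question's columns)
df = pd.DataFrame({
    "Grad":    [0.0, 0.0, 1.0, 0.0],
    "Vorgabe": [22.0, 22.0, 22.0, 22.0],
    "Temp":    [20.0, 20.5, 21.0, 21.0],
})

df = df.apply(f, axis=1)  # axis=1 calls f once per row
print(df)
```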