
How do I change multiple values in pandas df column to np.nan, based on condition in other column?

I don't have much experience in coding and this is my first question, so please be patient with me. I need to find a way to change multiple values of a pandas df column to np.nan, based on a condition in another column. To do that, I have created copies of the required columns, "Vorgabe" and "Temp".

Whenever the value in "Grad" isn't 0, I want to change the values in a defined area of "Vorgabe" and "Temp" to np.nan.

print(df)  

    OptOpTemp  OpTemp  BSP  Grad  Vorgabe  Temp
0        22.0    20.0    5   0.0     22.0  20.0
1        22.0    20.5    7   0.0     22.0  20.5
2        22.0    21.0    8   1.0     22.0  21.0
3        22.0    21.0    6   0.0     22.0  21.0
4        22.0    23.5    7   0.0     22.0  20.0
5        23.0    21.5    1   0.0     23.0  21.5
6        24.0    22.5    3   1.0     24.0  22.5
7        24.0    23.0    4   0.0     24.0  23.0
8        24.0    25.5    9   0.0     24.0  25.5

So I want to achieve something like this:

    OptOpTemp  OpTemp  BSP  Grad  Vorgabe  Temp
0        22.0    20.0    5   0.0     22.0  20.0
1        22.0    20.5    7   0.0     nan   nan      <-one row above
2        22.0    21.0    8   1.0     nan   nan
3        22.0    21.0    6   0.0     nan   nan      <-one row below
4        22.0    23.5    7   0.0     22.0  20.0
5        23.0    21.5    1   0.0     nan   nan
6        24.0    22.5    3   1.0     nan   nan
7        24.0    23.0    4   0.0     nan   nan
8        24.0    25.5    9   0.0     24.0  25.5

Does somebody have a solution to my problem?

EDIT: I may have been unclear. The goal is to change every value in "Vorgabe" and "Temp" within a defined area to nan. In my example the area would be one row above, the row with 1.0 in it, and one row below. So not only the row where 1.0 is located, but also the rows above and below it.

Use loc:

df.loc[df.Grad != 0.0, ['Vorgabe', 'Temp']] = np.nan
print(df)

Output

   OptOpTemp  OpTemp  BSP  Grad  Vorgabe  Temp
0       22.0    20.0    5   0.0     22.0  20.0
1       22.0    20.5    7   0.0     22.0  20.5
2       22.0    21.0    8   1.0      NaN   NaN
3       22.0    21.0    6   0.0     22.0  21.0
4       22.0    23.5    7   0.0     22.0  20.0
5       23.0    21.5    1   0.0     23.0  21.5
6       24.0    22.5    3   1.0      NaN   NaN
7       24.0    23.0    4   0.0     24.0  23.0
8       24.0    25.5    9   0.0     24.0  25.5
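
For anyone who wants to run this, here is a minimal self-contained sketch. It rebuilds the frame from the values printed in the question (the construction code itself is an assumption, since the question only shows the printed frame) and applies the same loc assignment. Note that it blanks only the rows where Grad is non-zero, not their neighbours.

```python
import numpy as np
import pandas as pd

# Frame rebuilt from the printed values in the question (construction is assumed).
df = pd.DataFrame({
    "OptOpTemp": [22.0, 22.0, 22.0, 22.0, 22.0, 23.0, 24.0, 24.0, 24.0],
    "OpTemp": [20.0, 20.5, 21.0, 21.0, 23.5, 21.5, 22.5, 23.0, 25.5],
    "BSP": [5, 7, 8, 6, 7, 1, 3, 4, 9],
    "Grad": [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
    "Vorgabe": [22.0, 22.0, 22.0, 22.0, 22.0, 23.0, 24.0, 24.0, 24.0],
    "Temp": [20.0, 20.5, 21.0, 21.0, 20.0, 21.5, 22.5, 23.0, 25.5],
})

# Boolean row selector plus a list of columns: only matching rows are overwritten.
df.loc[df["Grad"] != 0, ["Vorgabe", "Temp"]] = np.nan

# Index labels of the rows that were blanked.
changed = df.index[df["Vorgabe"].isna()].tolist()
print(changed)
```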

You could use numpy.where.

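import numpy as np

# np.where(condition, value_if_true, value_if_false):
# NaN where Grad is non-zero, otherwise keep the source column's value
df['Vorgabe'] = np.where(df['Grad'] != 0, np.nan, df['OptOpTemp'])
df['Temp'] = np.where(df['Grad'] != 0, np.nan, df['OpTemp'])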
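
The argument order of np.where is easy to get backwards, so here is a small sketch on made-up three-row data (the series `grad` and `source` are illustrative names, not from the question). It also shows Series.mask, which may read more naturally here because it replaces values where the condition is True.

```python
import numpy as np
import pandas as pd

# Illustrative data only; not the frame from the question.
grad = pd.Series([0.0, 1.0, 0.0])
source = pd.Series([22.0, 22.0, 23.0])

# np.where(condition, value_if_true, value_if_false)
via_where = pd.Series(np.where(grad != 0, np.nan, source))

# Series.mask replaces entries where the condition is True (with NaN by default).
via_mask = source.mask(grad != 0)

print(via_where.tolist())
print(via_mask.tolist())
```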

Chain the 3 conditions with | (bitwise OR); for the rows above and below the 1, build the masks with shift:

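# Option 1: rows where Grad is exactly 1, plus the rows directly below and above
mask1 = df['Grad'] == 1
mask2 = df['Grad'].shift() == 1
mask3 = df['Grad'].shift(-1) == 1

# Option 2 (use instead of option 1): rows where Grad is any non-zero value.
# fill_value=0 stops the NaN that shift adds at the edges from counting as non-zero.
mask1 = df['Grad'] != 0
mask2 = df['Grad'].shift(fill_value=0) != 0
mask3 = df['Grad'].shift(-1, fill_value=0) != 0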

mask = mask1 | mask2 | mask3

df.loc[mask, ['Vorgabe', 'Temp']] = np.nan
print (df)
   OptOpTemp  OpTemp  BSP  Grad  Vorgabe  Temp
0       22.0    20.0    5   0.0     22.0  20.0
1       22.0    20.5    7   0.0      NaN   NaN
2       22.0    21.0    8   1.0      NaN   NaN
3       22.0    21.0    6   0.0      NaN   NaN
4       22.0    23.5    7   0.0     22.0  20.0
5       23.0    21.5    1   0.0      NaN   NaN
6       24.0    22.5    3   1.0      NaN   NaN
7       24.0    23.0    4   0.0      NaN   NaN
8       24.0    25.5    9   0.0     24.0  25.5

General solution for multiple rows:

N = 1
# offsets 0..N and -1..-N: how far to shift Grad in each direction
r = np.concatenate([np.arange(0, N+1), np.arange(-1, -N-1, -1)])
# compare each shifted copy of Grad with 1, then OR the boolean masks together
mask = np.logical_or.reduce([df['Grad'].shift(x) == 1 for x in r])

df.loc[mask, ['Vorgabe', 'Temp']] = np.nan
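
As an alternative to stacking shifted copies (a different technique, not the one used above), a centered rolling window can build the same kind of mask. This is a sketch under the assumption that any non-zero Grad counts as a hit; `grad` holds the Grad column from the question and `hit` is an illustrative name.

```python
import pandas as pd

# Grad column from the question.
grad = pd.Series([0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0])
N = 1  # number of rows to blank on each side of a hit

# 1 where Grad is non-zero, 0 elsewhere.
hit = grad.ne(0).astype(int)

# Centered window of 2*N + 1 rows; max() is 1 if any row in the window is a hit.
# min_periods=1 lets the shorter windows at the edges still produce a value.
mask = hit.rolling(2 * N + 1, center=True, min_periods=1).max().astype(bool)

print(mask.tolist())
```

The resulting mask can be used in the same way as before, for example `df.loc[mask, ['Vorgabe', 'Temp']] = np.nan`.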

EDIT:

You can join both masks together:

N = 1
r1 = np.concatenate([np.arange(0, N+1), np.arange(-1, -N-1, -1)])
mask1 = np.logical_or.reduce([df['Grad'].shift(x) == 1 for x in r1])

N = 2
r2 = np.concatenate([np.arange(0, N+1), np.arange(-1, -N-1, -1)])
mask2 = np.logical_or.reduce([df['Grad'].shift(x) == 1.5 for x in r2])
# if == 1.5 does not match because of floating-point precision, use np.isclose instead:
# mask2 = np.logical_or.reduce([np.isclose(df['Grad'].shift(x), 1.5) for x in r2])

mask = mask1 | mask2
df.loc[mask, ['Vorgabe', 'Temp']] = np.nan
print (df)
   OptOpTemp  OpTemp  BSP  Grad  Vorgabe  Temp
0       22.0    20.0    5   0.0     22.0  20.0
1       22.0    20.5    7   0.0      NaN   NaN
2       22.0    21.0    8   1.0      NaN   NaN
3       22.0    21.0    6   0.0      NaN   NaN
4       22.0    23.5    7   0.0      NaN   NaN
5       23.0    21.5    1   0.0      NaN   NaN
6       24.0    22.5    3   1.5      NaN   NaN <- changed value to 1.5
7       24.0    23.0    4   0.0      NaN   NaN
8       24.0    25.5    9   0.0      NaN   NaN
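
If several value/width pairs are needed, the repeated pattern could be wrapped in a small helper. The function name `window_mask` and its signature are hypothetical, and it always compares with np.isclose rather than ==; treat it as a sketch of the same shift-and-OR idea rather than a definitive implementation.

```python
import numpy as np
import pandas as pd

def window_mask(s, value, n):
    """Hypothetical helper: True for rows within n rows of an entry close to value."""
    hit = pd.Series(np.isclose(s.to_numpy(), value), index=s.index)
    # Shift the hits by every offset in -n..n; fill_value=False keeps the edges False.
    shifted = [hit.shift(k, fill_value=False) for k in range(-n, n + 1)]
    return np.logical_or.reduce(shifted)

# Grad column with row 6 changed to 1.5, as in the output above.
grad = pd.Series([0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.5, 0.0, 0.0])

# One row around each 1.0, two rows around each 1.5.
mask = window_mask(grad, 1.0, 1) | window_mask(grad, 1.5, 2)
print(mask.tolist())
```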

You can use df.apply(f, axis=1) and define f to be whatever you want to do on each row. Your description seems to say you want:

 def f(row):
     if row['Grad']!=0:
         row.loc[['Vorgabe','Temp']]=np.nan
     return row

However, your example seems to be saying you want something else.
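
A runnable sketch of the apply approach on made-up three-row data (the frame below is illustrative, not the one from the question). The extra `row.copy()` is my own precaution so the function does not modify the object pandas passes in. Like the description-based reading, it blanks only the row with the non-zero Grad, and row-wise apply is generally slower than boolean indexing on large frames.

```python
import numpy as np
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({
    "Grad": [0.0, 1.0, 0.0],
    "Vorgabe": [22.0, 22.0, 23.0],
    "Temp": [20.0, 21.0, 21.5],
})

def f(row):
    row = row.copy()  # work on a copy of the row (added precaution)
    if row["Grad"] != 0:
        row.loc[["Vorgabe", "Temp"]] = np.nan
    return row

# axis=1 calls f once per row; the returned rows are reassembled into a frame.
result = df.apply(f, axis=1)
print(result)
```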


