np.where to check for True on the 1st column, then on another
I have the DataFrame below, with an np.where condition:
df=pd.DataFrame(data = {'First':[1,2,4,6,2,7,8,9],'Second':[4,6,7,3,1,3,9,3]})
df['First_check']=np.where(df['First']==2,'T','F')
df
   First  Second First_check
0      1       4           F
1      2       6           T
2      4       7           F
3      6       3           F
4      2       1           T
5      7       3           F
6      8       9           F
7      9       3           F
Now I want to check for df['Second']==3, but only after df['First_check']=='T'. I also want only the first occurrence of the condition.
Below is my desired output:
   First  Second First_check Second_check
0      1       4           F            F
1      2       6           T            F
2      4       7           F            F
3      6       3           F            T
4      2       1           T            F
5      7       3           F            T
6      8       9           F            F
7      9       3           F            F
Edit: I want df['Second']==3 to become True, but df['First_check']=='T' must become True first; the two may or may not be on the same row. Say df['First_check'] is T at row 2: it should then check the following rows (2, 3, 4, ...) for df['Second']==3, and here it matches at row 4.
I think this is what you are looking for.
Criteria:
1. Look for the value in 'First_check'. If the value is 'T', reset the flag to check for 3 in 'Second'.
2. Check for 3 in 'Second'. Turn the value of the first 3 to 'T'.
3. All subsequent 3s should be 'F' until you get a new 'T' in 'First_check'.
4. When you get a new 'T', return to Step 1 and continue.
To do this, you need to look back at both 'First_check' and 'Second'.
Here's code to solve it.
import pandas as pd
import numpy as np
df=pd.DataFrame(data = {'First':[1,2,4,6,2,7,8,9],'Second':[4,6,7,3,1,3,9,3]})
df['First_check']=np.where(df['First']==2,'T','F')
print (df)
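# tempF: position of each row within its block, where every 'T' in First_check starts a new block
df['tempF'] = df.groupby((df['First_check'].eq('T')).cumsum()).cumcount()+1
# tempS: position of each row within its block, where every 3 in Second starts a new block
df['tempS'] = df.groupby((df['Second'].eq(3)).cumsum()).cumcount()+1
# flag a block start in tempS when this row's tempF equals the previous row's tempS
df['Second_check'] = np.where((df['tempS'] == 1) & (df['tempF'] == df['tempS'].shift(1)),'T','F')
# drop the helper columns
df.drop(['tempF','tempS'],axis=1,inplace=True)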
print (df)
The output matches your required output:
   First  Second First_check Second_check
0      1       4           F            F
1      2       6           T            F
2      4       7           F            F
3      6       3           F            T
4      2       1           T            F
5      7       3           F            T
6      8       9           F            F
7      9       3           F            F
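The criteria above can also be written more directly: number the blocks that start at each 'T' in First_check, then mark the first 3 in Second inside each block. This is only a sketch under that reading of the criteria. The helper name first_three_after_t is hypothetical, a 3 on the same row as the 'T' is assumed to count, and rows before the first 'T' are assumed never to match.

```python
import numpy as np
import pandas as pd


def first_three_after_t(df: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: flag the first Second == 3 on or after each 'T'."""
    # Each 'T' in First_check starts a new block; block 0 precedes any 'T'.
    block = df['First_check'].eq('T').cumsum()
    is_three = df['Second'].eq(3)
    # Running count of 3s inside each block; the first 3 has a count of 1.
    seen = is_three.astype(int).groupby(block).cumsum()
    return is_three & seen.eq(1) & block.gt(0)


df = pd.DataFrame({'First': [1, 2, 4, 6, 2, 7, 8, 9],
                   'Second': [4, 6, 7, 3, 1, 3, 9, 3]})
df['First_check'] = np.where(df['First'] == 2, 'T', 'F')
df['Second_check'] = np.where(first_three_after_t(df), 'T', 'F')
print(df)
```

Because the running count restarts with every block, the result does not depend on how many rows separate a 'T' from the next 3.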
Alternatively, you can create a new Series in which the 2s and 3s are filled in and every other position is NaN, then do a forward fill on it. This gives a Series containing only 2s and 3s (possibly with NaNs at the beginning). Finally, you can check whether each 3 in Second has a preceding value of 2 in the preceding Series:
# firstly merge two and three into a single Series and do a forward fill
preceding = df.First.shift(-1).where(
df.First.shift(-1).eq(2),
df.Second.where(df.Second.eq(3))
).ffill()
preceding
#0 2.0
#1 2.0
#2 2.0
#3 2.0
#4 2.0
#5 3.0
#6 3.0
#7 3.0
#Name: First, dtype: float64
# after the forward fill if a 3 is preceded by a 2, then it should be True
df.Second.eq(3) & preceding.shift().eq(2)
#0 False
#1 False
#2 False
#3 True
#4 False
#5 True
#6 False
#7 False
#dtype: bool
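To get the same 'T'/'F' column as in the question's desired output, the boolean result can be passed through np.where. The following is a small self-contained sketch of that last step. The intermediate names next_first and mask are introduced here for readability and are not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'First': [1, 2, 4, 6, 2, 7, 8, 9],
                   'Second': [4, 6, 7, 3, 1, 3, 9, 3]})

# 2 where the NEXT row has First == 2, else 3 where Second == 3, else NaN;
# the forward fill then carries the most recent marker down the Series.
next_first = df['First'].shift(-1)
preceding = next_first.where(next_first.eq(2),
                             df['Second'].where(df['Second'].eq(3))).ffill()

# Shifting down by one row puts each 2 back on its own row and each 3 one row
# later, so a 3 is flagged only when the latest marker it sees is a 2.
mask = df['Second'].eq(3) & preceding.shift().eq(2)
df['Second_check'] = np.where(mask, 'T', 'F')
print(df)
```

The shift(-1) followed by shift() is what lets a 2 and a 3 on the same row still count as a match, since the 2 lands on its own row while an earlier 3 only becomes visible one row later.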