I have a data frame like this:
col1 col2 col3
1 A T
2 A F
3 N N
4 N N
5 B T
6 N N
7 B F
8 N N
9 A T
10 N N
11 N N
12 A T
13 N N
14 N N
15 A T
16 N N
17 A F
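Now I want to create a new data frame from the one above such that, if there are consecutive N values in col2 and col3 between a T and the following F in col3, they are filled with the preceding non-N value in col2 and with T in col3. Runs of N that are not enclosed by a T above and an F below (for example, N after an F, or N between two T rows) should be left unchanged.
So the desired data frame looks like this: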
col1 col2 col3
1 A T
2 A F
3 N N
4 N N
5 B T
6 B T
7 B F
8 N N
9 A T
10 N N
11 N N
12 A T
13 N N
14 N N
15 A T
16 A T
17 A F
I could do this with a for loop, storing the indices by comparing each value with the next and previous ones, but that would take longer to execute. I am looking for a pythonic way or a pandas shortcut to do it efficiently.
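For reference, the loop-based baseline described above might look roughly like the following. This is only a sketch: the function name `fill_between_loop` and the `gap` parameter are hypothetical, and it assumes the columns hold the literal strings 'T', 'F' and 'N'.

```python
def fill_between_loop(col2, col3, gap="N"):
    """Plain-Python baseline: fill gap rows between a 'T' and the next 'F'."""
    col2, col3 = list(col2), list(col3)
    n = len(col3)
    i = 0
    while i < n:
        if col3[i] == "T":
            # Skip over the run of gap markers that follows this T.
            j = i + 1
            while j < n and col3[j] == gap:
                j += 1
            # Fill the run only if it is closed by an F.
            if j < n and col3[j] == "F":
                for k in range(i + 1, j):
                    col2[k] = col2[i]
                    col3[k] = "T"
            i = j
        else:
            i += 1
    return col2, col3


col2 = list("AANNBNBNANNANNANA")
col3 = list("TFNNTNFNTNNTNNTNF")
new_col2, new_col3 = fill_between_loop(col2, col3)
print("".join(new_col2))
print("".join(new_col3))
```

It visits each row a constant number of times, but the work happens in the Python interpreter rather than in vectorized pandas operations, which is why it tends to be slow on large frames.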
This is my approach:
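# keep only the T/F flags; every N becomes NaN
TFs = df['col3'].mask(df['col3'].eq('N'))
# nearest flag above each row and nearest flag below each row
after_T = TFs.ffill()
before_F = TFs.bfill()
# rows whose nearest flag above is T and whose nearest flag below is F
bt_TF = after_T.eq('T') & before_F.eq('F')
# blank out those rows, then forward-fill them from the row above
df['col2'] = df['col2'].mask(bt_TF).ffill()
df['col3'] = df['col3'].mask(bt_TF).ffill()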
Output:
col1 col2 col3
0 1 A T
1 2 A F
2 3 N N
3 4 N N
4 5 B T
5 6 B T
6 7 B F
7 8 N N
8 9 A T
9 10 N N
10 11 N N
11 12 A T
12 13 N N
13 14 N N
14 15 A T
15 16 A T
16 17 A F
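To try the approach end to end, here is a minimal self-contained sketch that rebuilds the sample frame and wraps the same mask-and-ffill steps in a function. The function name `fill_between_t_and_f` and its parameters are hypothetical, and the sketch assumes the gap marker is the literal string 'N' and that the columns contain no pre-existing NaN values (the final `ffill` would fill those too).

```python
import pandas as pd


def fill_between_t_and_f(df, value_col="col2", flag_col="col3", gap="N"):
    """Fill gap rows that lie between a 'T' row and the next 'F' row."""
    out = df.copy()
    # Keep only the real flags; gap markers become NaN.
    flags = out[flag_col].mask(out[flag_col].eq(gap))
    after_t = flags.ffill().eq("T")   # nearest flag above is T
    before_f = flags.bfill().eq("F")  # nearest flag below is F
    between = after_t & before_f      # true only for enclosed gap rows
    for col in (value_col, flag_col):
        # Blank the enclosed rows and copy the value from the row above.
        out[col] = out[col].mask(between).ffill()
    return out


df = pd.DataFrame({
    "col1": range(1, 18),
    "col2": list("AANNBNBNANNANNANA"),
    "col3": list("TFNNTNFNTNNTNNTNF"),
})
result = fill_between_t_and_f(df)
print(result)
```

Working on a copy leaves the original `df` untouched, which makes it easy to compare the two frames side by side.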