Fill in between rows of two column combinations in a pandas data frame
I have a data frame like this:
col1 col2 col3
1 A T
2 A F
3 N N
4 N N
5 B T
6 N N
7 B F
8 N N
9 A T
10 N N
11 N N
12 A T
13 N N
14 N N
15 A T
16 N N
17 A F
Now I want to create a new data frame from the one above such that, wherever a run of consecutive N rows in col2 and col3 sits between a T and the next F in col3, those rows are filled with the preceding non-N col2 value and with T in col3. Runs of N bounded by any other pair (for example, a T followed by another T, or anything after an F) should be left unchanged.
So the desired data frame should look like this:
col1 col2 col3
1 A T
2 A F
3 N N
4 N N
5 B T
6 B T
7 B F
8 N N
9 A T
10 N N
11 N N
12 A T
13 N N
14 N N
15 A T
16 A T
17 A F
I could do this with a for loop, storing the indices by comparing each row with the next and previous values, but that would take longer to execute. I am looking for a pythonic way or a pandas shortcut to do it efficiently.
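For reference, the loop-based approach mentioned above might look something like the sketch below. The function name `fill_between_loop` is hypothetical, and the sample frame is rebuilt from the table in the question. It assumes "between T and F" means a run of N rows whose nearest non-N row above is T and whose nearest non-N row below is F.

```python
import pandas as pd

def fill_between_loop(df):
    """Loop baseline (hypothetical helper): fill 'N' runs bounded by T above and F below."""
    out = df.copy()
    col2 = out["col2"].tolist()
    col3 = out["col3"].tolist()
    last_idx = None  # position of the most recent non-'N' row in col3
    for i, flag in enumerate(col3):
        if flag == "N":
            continue
        # An F whose previous non-'N' row is a T closes a gap that should be filled.
        if flag == "F" and last_idx is not None and col3[last_idx] == "T":
            for j in range(last_idx + 1, i):
                col2[j] = col2[last_idx]  # carry the col2 value down from the T row
                col3[j] = "T"
        last_idx = i
    out["col2"] = col2
    out["col3"] = col3
    return out

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": range(1, 18),
    "col2": list("AANNBNBNANNANNANA"),
    "col3": list("TFNNTNFNTNNTNNTNF"),
})
result = fill_between_loop(df)
print(result)
```

This is mainly useful as a correctness check against a vectorized version, since it walks the rows one by one in Python.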
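This is my approach:
# keep only the T and F markers; N rows become NaN
TFs = df['col3'].mask(df['col3'].eq('N'))
# nearest marker at or above each row
after_T = TFs.ffill()
# nearest marker at or below each row
before_F = TFs.bfill()
# rows lying strictly between a T and the next F
bt_TF = after_T.eq('T') & before_F.eq('F')
# blank out those rows and forward-fill them from the T row above
df['col2'] = df['col2'].mask(bt_TF).ffill()
df['col3'] = df['col3'].mask(bt_TF).ffill()
Output: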
col1 col2 col3
0 1 A T
1 2 A F
2 3 N N
3 4 N N
4 5 B T
5 6 B T
6 7 B F
7 8 N N
8 9 A T
9 10 N N
10 11 N N
11 12 A T
12 13 N N
13 14 N N
14 15 A T
15 16 A T
16 17 A F
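To try the approach above end to end, here is a self-contained sketch that rebuilds the sample frame from the question and wraps the same steps in a function. The name `fill_between_vectorized` and the `missing` parameter are illustrative assumptions, not part of the original answer; unlike the snippet above, it returns a copy instead of modifying `df` in place.

```python
import pandas as pd

def fill_between_vectorized(df, missing="N"):
    """Hypothetical wrapper around the mask/ffill/bfill steps shown above."""
    out = df.copy()
    # Keep only the T/F markers; placeholder rows become NaN.
    flags = out["col3"].mask(out["col3"].eq(missing))
    prev_flag = flags.ffill()  # nearest marker at or above each row
    next_flag = flags.bfill()  # nearest marker at or below each row
    # True only for placeholder rows with a T above and an F below.
    gap = prev_flag.eq("T") & next_flag.eq("F")
    # Blank out the gap rows, then carry the T row's values down into them.
    out["col2"] = out["col2"].mask(gap).ffill()
    out["col3"] = out["col3"].mask(gap).ffill()
    return out

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": range(1, 18),
    "col2": list("AANNBNBNANNANNANA"),
    "col3": list("TFNNTNFNTNNTNNTNF"),
})
result = fill_between_vectorized(df)
print(result)
```

Note that the final `ffill()` would also fill any genuine NaN values already present in col2 or col3, so this sketch assumes the only missing-value placeholder in the data is the string N.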