Most efficient way to enlarge the active area of a binary series in pandas?
I have a pandas dataframe df:
Car | Open | Time
---|---|---
Audi A5 | 0 | 0
Audi A5 | 0 | 1
Audi A5 | 0 | 2
Audi A5 | 1 | 3
Audi A5 | 1 | 4
Audi A5 | 0 | 5
Audi A5 | 0 | 6
Audi A5 | 0 | 7
Audi A5 | 1 | 8
Audi A5 | 1 | 9
Mercedes Class A | 1 | 0
Mercedes Class A | 1 | 1
Mercedes Class A | 1 | 2
Mercedes Class A | 0 | 3
Mercedes Class A | 0 | 4
Mercedes Class A | 1 | 5
Mercedes Class A | 1 | 6
Mercedes Class A | 0 | 7
Mercedes Class A | 0 | 8
Mercedes Class A | 1 | 9
I want to enlarge the active part of the binary series Open by n units, but only after grouping the dataframe by Car, so that each car is processed separately.
An active part is a run of consecutive 1s that is either surrounded by 0s, or has only 0s before it, or has only 0s after it. The case where the series contains only 1s is ignored.
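To make the definition concrete, here is a minimal sketch that numbers the active parts within each car. It assumes the same data as in the question; the names `is_open`, `run_start` and `run_id` are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Car": ["Audi A5"] * 10 + ["Mercedes Class A"] * 10,
        "Time": list(range(10)) * 2,
        "Open": [0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1],
    }
)

is_open = df["Open"].eq(1)

# Difference to the previous row of the same car (NaN on each car's first row).
step = df.groupby("Car")["Open"].diff()

# A run starts on a 1 that does not continue a 1 from the same car.
run_start = is_open & step.ne(0)

# Number the runs per car; rows where Open is 0 get the label 0.
run_id = run_start.astype(int).groupby(df["Car"]).cumsum().where(is_open, 0)

print(df.assign(run_id=run_id))
```

Each non-zero label marks one active part, and the numbering restarts for every car, so a run at the end of one car is never merged with a run at the start of the next.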
If n = 1, I want to get the following dataframe:
Car | Open | Time
---|---|---
Audi A5 | 0 | 0
Audi A5 | 0 | 1
Audi A5 | 1 | 2
Audi A5 | 1 | 3
Audi A5 | 1 | 4
Audi A5 | 0 | 5
Audi A5 | 0 | 6
Audi A5 | 1 | 7
Audi A5 | 1 | 8
Audi A5 | 1 | 9
Mercedes Class A | 1 | 0
Mercedes Class A | 1 | 1
Mercedes Class A | 1 | 2
Mercedes Class A | 0 | 3
Mercedes Class A | 1 | 4
Mercedes Class A | 1 | 5
Mercedes Class A | 1 | 6
Mercedes Class A | 0 | 7
Mercedes Class A | 1 | 8
Mercedes Class A | 1 | 9
I can get the index of all active parts using the following code:
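
    import pandas as pd

    df = pd.DataFrame(
        {
            "Car": ["Audi A5"] * 10 + ["Mercedes Class A"] * 10,
            "Time": list(range(10)) + list(range(10)),
            "Open": [0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1],
        }
    )

    def enlarge(dataframe: pd.DataFrame, sensor: str, n: int = 1) -> pd.DataFrame:
        # First index of a run of non-zero values; runs of length 1 yield None.
        get_group_indexes = (
            lambda x: x.index[0]
            if x.index[-1] - x.index[0] >= 1
            else None
        )
        # The cumulative count of zeros is constant inside a run of non-zero
        # values, so it serves as a key identifying each run.
        groups = (
            dataframe[sensor]
            .eq(0)
            .cumsum()[dataframe[sensor].ne(0)]
            .to_frame()
            .groupby(sensor)
            .apply(get_group_indexes)
            .dropna()
        )
        if groups.empty:
            return dataframe
        # Set the n rows before each run start to 1 (modifies dataframe in place).
        for index in groups:
            dataframe.loc[index - n:index, sensor] = 1
        return dataframe
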
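The run detection in the function above relies on the cumulative count of zeros staying constant inside a run of 1s. A small sketch on a single short series (a hypothetical `s`, not the full dataframe) shows the intermediate values:

```python
import pandas as pd

s = pd.Series([0, 0, 0, 1, 1, 0, 0, 0, 1, 1])

# Count the zeros seen so far, then keep only the positions where s is non-zero.
# All rows of the same run share the same count, which acts as a run key.
run_key = s.eq(0).cumsum()[s.ne(0)]

# First index label of every run, looked up by run key.
first_index = (
    pd.Series(run_key.index, index=run_key.to_numpy())
    .groupby(level=0)
    .first()
)

print(run_key.tolist())
print(first_index.tolist())
```

Because the count is taken over the whole column, the key does not reset between cars, which is why this approach needs a per-group step before it can respect the Car boundaries.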
It works when I don't have to group by Car, but I want to group by this column before performing this transformation. Does someone have an idea how to achieve this efficiently using pandas tricks? Thanks.
IIUC, you can bfill per group with a limit after masking the non-1 values:
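
    n = 1
    df['Open2'] = (df['Open']
                   .where(df['Open'].eq(1))               # keep the 1s, mask everything else as NaN
                   .groupby(df['Car']).bfill(limit=n)     # back-fill at most n rows, per car
                   .fillna(df['Open'], downcast='infer')  # restore the remaining original values
                  )
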
Output (as new column "Open2" for clarity):

                     Car  Time  Open  Open2
    0            Audi A5     0     0      0
    1            Audi A5     1     0      0
    2            Audi A5     2     0      1
    3            Audi A5     3     1      1
    4            Audi A5     4     1      1
    5            Audi A5     5     0      0
    6            Audi A5     6     0      0
    7            Audi A5     7     0      1
    8            Audi A5     8     1      1
    9            Audi A5     9     1      1
    10  Mercedes Class A     0     1      1
    11  Mercedes Class A     1     1      1
    12  Mercedes Class A     2     1      1
    13  Mercedes Class A     3     0      0
    14  Mercedes Class A     4     0      1
    15  Mercedes Class A     5     1      1
    16  Mercedes Class A     6     1      1
    17  Mercedes Class A     7     0      0
    18  Mercedes Class A     8     0      1
    19  Mercedes Class A     9     1      1
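
For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name `enlarge_per_group` and its parameters are made up here, and it casts back to the original dtype with `astype` instead of passing `downcast='infer'`, since recent pandas releases warn about that keyword as far as I know.

```python
import pandas as pd


def enlarge_per_group(df: pd.DataFrame, group_col: str, sensor: str, n: int = 1) -> pd.Series:
    """Return a copy of df[sensor] where the n rows before each run of 1s are set to 1."""
    # Keep the 1s and replace every other value with NaN.
    masked = df[sensor].where(df[sensor].eq(1))
    # Back-fill at most n NaNs before each run, without crossing group borders.
    filled = masked.groupby(df[group_col]).bfill(limit=n)
    # Put the original values back where nothing was filled, and restore the dtype.
    return filled.fillna(df[sensor]).astype(df[sensor].dtype)


df = pd.DataFrame(
    {
        "Car": ["Audi A5"] * 10 + ["Mercedes Class A"] * 10,
        "Time": list(range(10)) * 2,
        "Open": [0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 1],
    }
)

df["Open2"] = enlarge_per_group(df, "Car", "Open", n=1)
print(df)
```

Note that back-filling only extends each active part towards earlier rows, which matches the expected output in the question. Grouping by Car is what stops a 1 at the start of one car from being filled into the trailing 0s of the previous car.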