
Replace value with NaN based on previous and subsequent values in the time series
I am working with python pandas and a huge dataframe that contains multiple time series, similar to the following one, which consists of three time series:
import pandas as pd

df = pd.DataFrame({
'Year': [2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 2013, 2013],
'Week': [48, 49, 50, 51, 52, 1, 2, 3, 4, 5, 48, 49, 50, 51, 52, 1, 2, 3, 4, 5, 48, 49, 50, 51, 52, 1, 2, 3, 4, 5],
'Location': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
'Amount': [None, None, None, None, None, 46, None, None, None, 55, None, None, None, None, None, 29, 24, 65, 34, 34, 34, 23, 87, 56, 89, 23, 45, 63, 87, 89]})
Year Week Location Amount
0 2012 48 1 NaN
1 2012 49 1 NaN
2 2012 50 1 NaN
3 2012 51 1 NaN
4 2012 52 1 NaN
5 2013 1 1 46.0
6 2013 2 1 NaN
7 2013 3 1 NaN
8 2013 4 1 NaN
9 2013 5 1 55.0
10 2012 48 2 NaN
11 2012 49 2 NaN
12 2012 50 2 NaN
13 2012 51 2 NaN
14 2012 52 2 NaN
15 2013 1 2 29.0
16 2013 2 2 24.0
17 2013 3 2 65.0
18 2013 4 2 34.0
19 2013 5 2 34.0
20 2012 48 3 34.0
21 2012 49 3 23.0
22 2012 50 3 87.0
23 2012 51 3 56.0
24 2012 52 3 89.0
25 2013 1 3 23.0
26 2013 2 3 45.0
27 2013 3 3 63.0
28 2013 4 3 87.0
29 2013 5 3 89.0
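For each time series, I want to change the Amount in week 1 of 2013 to NaN if the three preceding weeks and the three following weeks are all NaN.
The result should look like this (the Amount for week 1 of 2013 at Location 1 is now NaN):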
Year Week Location Amount
0 2012 48 1 NaN
1 2012 49 1 NaN
2 2012 50 1 NaN
3 2012 51 1 NaN
4 2012 52 1 NaN
5 2013 1 1 NaN
6 2013 2 1 NaN
7 2013 3 1 NaN
8 2013 4 1 NaN
9 2013 5 1 55.0
10 2012 48 2 NaN
11 2012 49 2 NaN
12 2012 50 2 NaN
13 2012 51 2 NaN
14 2012 52 2 NaN
15 2013 1 2 29.0
16 2013 2 2 24.0
17 2013 3 2 65.0
18 2013 4 2 34.0
19 2013 5 2 34.0
20 2012 48 3 34.0
21 2012 49 3 23.0
22 2012 50 3 87.0
23 2012 51 3 56.0
24 2012 52 3 89.0
25 2013 1 3 23.0
26 2013 2 3 45.0
27 2013 3 3 63.0
28 2013 4 3 87.0
29 2013 5 3 89.0
What I tried, which does not work:
df.loc[((df['Year'] == 2012) & (df['Week'] == 50) & (df['Amount'] == None)) &
((df['Year'] == 2012) & (df['Week'] == 51) & (df['Amount'] == None)) &
((df['Year'] == 2012) & (df['Week'] == 52) & (df['Amount'] == None)) &
((df['Year'] == 2013) & (df['Week'] == 1) & (df['Amount'] >= 0)) &
((df['Year'] == 2013) & (df['Week'] == 2) & (df['Amount'] == None)) &
((df['Year'] == 2013) & (df['Week'] == 3) & (df['Amount'] == None)) &
((df['Year'] == 2013) & (df['Week'] == 4) & (df['Amount'] == None)), 'Amount'] = None
Any ideas on how to solve this?
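A note on why the attempt above probably selects nothing, sketched on a small hypothetical frame (`demo` is not part of the question). Each parenthesised group tests a different `Week` value on the same row, so AND-ing the groups can never be true, and `== None` does not match NaN, whereas `isna()` does.

```python
import pandas as pd

# Hypothetical three-row frame, used only to illustrate the two pitfalls.
demo = pd.DataFrame({'Week': [52, 1, 2], 'Amount': [None, 46, None]})

# Pitfall 1: None is stored as NaN in a float column, and `== None`
# never matches it; isna() is the test for missing values.
eq_none = demo['Amount'] == None
is_missing = demo['Amount'].isna()

# Pitfall 2: a single row cannot have Week == 52 and Week == 1 at once,
# so AND-ing conditions meant for different rows is False everywhere.
both_weeks = demo['Week'].eq(52) & demo['Week'].eq(1)

print(eq_none.tolist())
print(is_missing.tolist())
print(both_weeks.tolist())
```

Conditions about neighbouring rows therefore need a window or shift operation rather than a row-wise boolean expression.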
Use rolling.sum together with Series.groupby and Series.notna to create a mask, then apply it with Series.mask:
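# Count the non-missing values in a centered 7-row window within each Location;
# a count of at most 1 means the three neighbours on each side are all NaN.
m = (df['Amount'].notna()
       .groupby(df['Location'])
       .rolling(7, center=True).sum().le(1)
       .reset_index(level='Location', drop=True))
# Blank the value only for week 1 of 2013.
df['Amount'] = df['Amount'].mask(m & df['Year'].eq(2013) & df['Week'].eq(1))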
print(df)
Year Week Location Amount
0 2012 48 1 NaN
1 2012 49 1 NaN
2 2012 50 1 NaN
3 2012 51 1 NaN
4 2012 52 1 NaN
5 2013 1 1 NaN
6 2013 2 1 NaN
7 2013 3 1 NaN
8 2013 4 1 NaN
9 2013 5 1 55.0
10 2012 48 2 NaN
11 2012 49 2 NaN
12 2012 50 2 NaN
13 2012 51 2 NaN
14 2012 52 2 NaN
15 2013 1 2 29.0
16 2013 2 2 24.0
17 2013 3 2 65.0
18 2013 4 2 34.0
19 2013 5 2 34.0
20 2012 48 3 34.0
21 2012 49 3 23.0
22 2012 50 3 87.0
23 2012 51 3 56.0
24 2012 52 3 89.0
25 2013 1 3 23.0
26 2013 2 3 45.0
27 2013 3 3 63.0
28 2013 4 3 87.0
29 2013 5 3 89.0
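To see where such a rolling mask comes from, here is a minimal sketch that computes the centered count for a single location. It uses the Location 1 values from the question; the names `amount`, `counts` and `flag` are only illustrative, and the `astype(int)` cast is an extra precaution rather than part of the answer.

```python
import pandas as pd

# Amount values for Location 1 from the question (rows 0-9).
amount = pd.Series([None, None, None, None, None, 46, None, None, None, 55],
                   dtype='float64')

# Number of non-missing values in a 7-row window centred on each row
# (the row itself plus three rows on either side).
counts = amount.notna().astype(int).rolling(7, center=True).sum()

# With the default min_periods (equal to the window size), rows closer than
# three positions to either end have an incomplete window and yield NaN.
print(counts.tolist())

# A count of at most 1 means no neighbour in the window holds a value.
# NaN compares as False, so the edge rows are never flagged.
flag = counts.le(1)
print(flag.tolist())
```

Rows 3 and 4 are flagged as well, although they are already NaN, which is one reason the mask is combined with the Year and Week conditions before it is applied.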
To get a new DataFrame instead:
df.assign(Amount = df['Amount'].mask(m & df['Year'].eq(2013) & df['Week'].eq(1)))
You can do it like this:
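import numpy as np

# Number of NaN values in a centered 7-row window within each Location.
# Using .values relies on df already being sorted by Location.
s = pd.Series(df['Amount'].isna()
                .groupby(df['Location'])
                .rolling(7, center=True)
                .sum().values,
              index=df.index)
# Six NaNs around a non-NaN value mean all three neighbours on each side are missing.
df.loc[(s.ge(6) & df['Year'].eq(2013)
        & df['Week'].eq(1) & df['Amount'].notna()), 'Amount'] = np.nan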
Output:
Year Week Location Amount
0 2012 48 1 NaN
1 2012 49 1 NaN
2 2012 50 1 NaN
3 2012 51 1 NaN
4 2012 52 1 NaN
5 2013 1 1 NaN
6 2013 2 1 NaN
7 2013 3 1 NaN
8 2013 4 1 NaN
9 2013 5 1 55.0
10 2012 48 2 NaN
11 2012 49 2 NaN
12 2012 50 2 NaN
13 2012 51 2 NaN
14 2012 52 2 NaN
15 2013 1 2 29.0
16 2013 2 2 24.0
17 2013 3 2 65.0
18 2013 4 2 34.0
19 2013 5 2 34.0
20 2012 48 3 34.0
21 2012 49 3 23.0
22 2012 50 3 87.0
23 2012 51 3 56.0
24 2012 52 3 89.0
25 2013 1 3 23.0
26 2013 2 3 45.0
27 2013 3 3 63.0
28 2013 4 3 87.0
29 2013 5 3 89.0
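A more literal reading of the requirement (the three weeks before and the three weeks after are all missing) can also be sketched with groupby shifts instead of a rolling window. This is an alternative that does not come from either answer; the helper name `isolated_mask` and the small frame `small` are hypothetical, and rows are assumed to be in time order within each Location.

```python
import numpy as np
import pandas as pd

def isolated_mask(df, n=3):
    """True where Amount is present but the n rows before and the n rows
    after it, within the same Location, are all missing.
    Assumes rows are in time order inside each Location."""
    missing = df['Amount'].isna().astype(int)
    by_loc = missing.groupby(df['Location'])
    mask = df['Amount'].notna()
    for k in range(1, n + 1):
        # shift() leaves NaN where no neighbour exists; eq(1) turns that
        # into False, so rows near the edges of a series are never flagged.
        mask = mask & by_loc.shift(k).eq(1) & by_loc.shift(-k).eq(1)
    return mask

# Hypothetical reduced frame: two locations, seven weeks each.
small = pd.DataFrame({
    'Year': [2012, 2012, 2012, 2013, 2013, 2013, 2013] * 2,
    'Week': [50, 51, 52, 1, 2, 3, 4] * 2,
    'Location': [1] * 7 + [2] * 7,
    'Amount': [None, None, None, 46, None, None, None,
               None, None, None, 29, 24, None, None],
})

target = small['Year'].eq(2013) & small['Week'].eq(1)
small.loc[isolated_mask(small) & target, 'Amount'] = np.nan
print(small)
```

Here only the Location 1 value for week 1 of 2013 is blanked, because Location 2 has a value in the following week. Unlike the positional window, this version makes the "before" and "after" checks explicit, at the cost of one shift per neighbour.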