Pandas find and interpolate missing value
This question is pretty much a follow-up to "Pandas pivot or reshape dataframe with NaN".
When decoding videos, some frames go missing and that data needs to be interpolated.
Current df:
frame  pvol   vvol  area  label
0      NaN    109.8 120   v
2      NaN    160.4 140   v
0      23.1   NaN   110   p
1      24.3   NaN   110   p
2      25.6   NaN   112   p
Expected df:
frame  pvol   vvol   p_area  v_area
0      23.1   109.8  110     110
1      24.3   135.1  110     111     # Interpolated for label v
2      25.6   160.4  112     120
I know I can do df.interpolate() once current_df is reshaped to contain only the p frames (one row per frame). The reshaping is the issue.
Note: label p >= label v, meaning label p will always have all the frames, but v can have missing frames.
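To make the snippets in the answers reproducible, the "Current df" table can be rebuilt by hand. This is only a sketch, assuming integer frame numbers and the column names exactly as shown in the table; the `missing_v` check is an illustrative addition and not part of the original question.

```python
import numpy as np
import pandas as pd

# Rebuild the "Current df" table from the question by hand.
df = pd.DataFrame({
    "frame": [0, 2, 0, 1, 2],
    "pvol": [np.nan, np.nan, 23.1, 24.3, 25.6],
    "vvol": [109.8, 160.4, np.nan, np.nan, np.nan],
    "area": [120, 140, 110, 110, 112],
    "label": ["v", "v", "p", "p", "p"],
})

# Collect the set of frames seen for each label.
frames_by_label = df.groupby("label")["frame"].apply(set)

# Frames that exist for label p but were lost for label v.
missing_v = sorted(int(f) for f in frames_by_label["p"] - frames_by_label["v"])
print(missing_v)
```

With the data above, frame 1 is the only frame present for p and absent for v, which is the row that has to be interpolated.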
You can reshape and dropna as in the previous question, except that now you need to specify that you want to drop only empty columns, then interpolate:
out = (df.pivot(index='frame', columns='label')
         .dropna(axis=1, how='all')  # only drop empty columns
         .interpolate()              # interpolate
      )
out.columns = [f'{y}_{x}' for x, y in out.columns]
Output:
       p_pvol  v_vvol  p_area  v_area
frame
0        23.1   109.8   110.0   120.0
1        24.3   135.1   110.0   130.0
2        25.6   160.4   112.0   140.0
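The reason `how='all'` matters can be seen by comparing it with the default `dropna` behaviour on the pivoted frame. The sketch below assumes the same sample data as in the question, rebuilt by hand; the names `wide`, `any_cols` and `all_cols` are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's "Current df" table.
df = pd.DataFrame({
    "frame": [0, 2, 0, 1, 2],
    "pvol": [np.nan, np.nan, 23.1, 24.3, 25.6],
    "vvol": [109.8, 160.4, np.nan, np.nan, np.nan],
    "area": [120, 140, 110, 110, 112],
    "label": ["v", "v", "p", "p", "p"],
})

# One row per frame, columns are (value, label) pairs.
wide = df.pivot(index="frame", columns="label")

# Default how='any': every column containing at least one NaN is dropped,
# so the v columns with a gap at frame 1 disappear as well.
any_cols = list(wide.dropna(axis=1).columns)

# how='all': only columns that are entirely NaN are dropped,
# i.e. ('pvol', 'v') and ('vvol', 'p').
all_cols = list(wide.dropna(axis=1, how="all").columns)

print(any_cols)
print(all_cols)
```

With the default, the partially filled v columns would be gone before `interpolate()` ever sees them, so there would be nothing left to fill in.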
Changing the dropna call removes the issue:
s = df.set_index(['frame','label']).unstack().dropna(thresh=1,axis=1)
s.columns = s.columns.map('_'.join)
s = s.interpolate()
Out[279]:
       pvol_p  vvol_v  area_p  area_v
frame
0        23.1   109.8   110.0   120.0
1        24.3   135.1   110.0   130.0
2        25.6   160.4   112.0   140.0
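As a quick end-to-end check of the `set_index`/`unstack` approach, the following sketch rebuilds the question's sample data by hand and confirms that the gap at frame 1 is filled with the midpoint of its neighbours. The data construction and the final checks are assumptions added for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question's "Current df" table.
df = pd.DataFrame({
    "frame": [0, 2, 0, 1, 2],
    "pvol": [np.nan, np.nan, 23.1, 24.3, 25.6],
    "vvol": [109.8, 160.4, np.nan, np.nan, np.nan],
    "area": [120, 140, 110, 110, 112],
    "label": ["v", "v", "p", "p", "p"],
})

# Move 'label' into the columns, then keep every column that has
# at least one non-NaN value (thresh=1).
s = df.set_index(["frame", "label"]).unstack().dropna(thresh=1, axis=1)

# Flatten the (value, label) MultiIndex into names such as 'vvol_v'.
s.columns = s.columns.map("_".join)

# Default linear interpolation fills the missing v row at frame 1.
s = s.interpolate()

# Frame 1 should sit halfway between frames 0 and 2 for the v columns.
vvol_ok = np.isclose(s.loc[1, "vvol_v"], (109.8 + 160.4) / 2)
area_ok = np.isclose(s.loc[1, "area_v"], (120 + 140) / 2)
print(vvol_ok, area_ok)
```

Note that `interpolate()` with its default `method='linear'` treats the rows as equally spaced and ignores the index values. That suits this case because label p supplies every frame, so the rows are consecutive frames.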