pandas: fill missing data in data frame columns
I have the following pandas data frame:
import numpy as np
import pandas as pd
timestamps = [1, 14, 30]
data = dict(quantities=[1, 4, 9], e_quantities=[1, 2, 3])
df = pd.DataFrame(data=data, columns=data.keys(), index=timestamps)
which looks like this:
    quantities  e_quantities
1            1             1
14           4             2
30           9             3
However, the timestamps should run from 1 to 52:
index = pd.RangeIndex(1, 53)
The following line provides the timestamps that are missing:
series_fill = pd.Series(np.nan, index=index.difference(df.index)).sort_index()
How can I get the quantities and e_quantities columns to have NaN values at these missing timestamps?
I've tried:
df = pd.concat([df, series_fill]).sort_index()
but it adds another column (0) and swaps the order of the original data frame's columns:
     0  e_quantities  quantities
1  NaN           1.0         1.0
2  NaN           NaN         NaN
3  NaN           NaN         NaN
Thanks for any help here.
I think you are looking for reindex:
df = df.reindex(index)
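For reference, here is a minimal end-to-end sketch of the reindex approach applied to the data from the question. The variable name `filled` is only an illustrative choice and is not part of the original answer; everything else reuses the question's own names.

```python
import pandas as pd

# Data from the question: three known timestamps out of 52.
timestamps = [1, 14, 30]
data = dict(quantities=[1, 4, 9], e_quantities=[1, 2, 3])
df = pd.DataFrame(data=data, index=timestamps)

# Target index: every timestamp from 1 to 52 inclusive.
index = pd.RangeIndex(1, 53)

# reindex keeps the existing rows and inserts NaN rows for labels
# that are in `index` but not in df.index. `filled` is a
# hypothetical name used here for illustration.
filled = df.reindex(index)

print(filled.shape)
print(filled.loc[[1, 2, 14, 30]])
```

Because NaN is a floating-point value, the integer columns are converted to float64 after reindexing. The column order of the original data frame is preserved, and no extra column is added.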