pandas: fill missing data in data frame columns

I have the following pandas data frame:

import numpy as np
import pandas as pd
timestamps = [1, 14, 30]
data = dict(quantities=[1, 4, 9], e_quantities=[1, 2, 3])
df = pd.DataFrame(data=data, columns=data.keys(), index=timestamps)

which looks like this:

    quantities  e_quantities
1            1             1
14           4             2
30           9             3

However, the timestamps should run from 1 to 52:

index = pd.RangeIndex(1, 53)

The following line provides the timestamps that are missing:

series_fill = pd.Series(np.nan, index=index.difference(df.index)).sort_index()

How can I get the quantities and e_quantities columns to have NaN values at these missing timestamps?

I've tried:

df = pd.concat([df, series_fill]).sort_index()

but it adds another column (0) and swaps the column order of the original data frame:

     0  e_quantities  quantities
1  NaN           1.0         1.0
2  NaN           NaN         NaN
3  NaN           NaN         NaN

Thanks for any help here.

I think you are looking for reindex:

df = df.reindex(index)
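
As a minimal sketch of how this plays out with the data from the question: reindexing against the full range should keep the existing rows, keep the original column order, and insert NaN rows for every timestamp that was missing. The name `full` is only an illustrative choice, not something from the question.

```python
import numpy as np
import pandas as pd

# Data frame from the question, indexed by the three known timestamps.
timestamps = [1, 14, 30]
data = dict(quantities=[1, 4, 9], e_quantities=[1, 2, 3])
df = pd.DataFrame(data=data, index=timestamps)

# Target index: timestamps 1 through 52 inclusive.
index = pd.RangeIndex(1, 53)

# reindex aligns the rows to the new index; labels that are absent
# from df.index are filled with NaN in every column.
full = df.reindex(index)

print(full.shape)                        # rows and columns after reindexing
print(list(full.columns))                # column order is unchanged
print(full["quantities"].isna().sum())   # number of filled-in rows
print(full.dtypes)                       # ints are upcast to float to hold NaN
```

Note that the integer columns become float64, because NaN cannot be stored in a plain integer column. If integer values matter, one option is to convert the columns to pandas' nullable `Int64` dtype afterwards with `full.astype("Int64")`.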
