Add "missing" rows to multi-index groupby pandas dataframe
I have a DataFrame that looks like this:
                         numberSold
date | location | time
3/10   FL         12:00           4
                  1:00            1
                  4:00            5
3/11   FL         1:00            2
                  2:00            3
                  3:00            0
3/12   FL         2:00            6
                  5:00            6
It has a MultiIndex (date, location, time). I want the output to look as follows:
                         numberSold
date | location | time
3/10   FL         12:00           4
                  1:00            1
                  4:00            5
3/11   FL         12:00           4
                  1:00            2
                  2:00            3
                  3:00            0
                  4:00            5
3/12   FL         12:00           4
                  1:00            2
                  2:00            6
                  3:00            0
                  5:00            6
Here is the first DataFrame in dictionary format:
{'numberSold': {('3/10', 'FL', '12:00'): 4,
                ('3/10', 'FL', '1:00'): 1,
                ('3/10', 'FL', '4:00'): 5,
                ('3/11', 'FL', '1:00'): 2,
                ('3/11', 'FL', '2:00'): 3,
                ('3/11', 'FL', '3:00'): 0,
                ('3/12', 'FL', '2:00'): 6,
                ('3/12', 'FL', '5:00'): 6}}
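A dictionary in this shape can be turned back into the DataFrame directly, since pandas builds a MultiIndex from the tuple keys. The following is a minimal sketch; the variable names `data` and `df` are illustrative, and the level names are set by hand because the dictionary does not carry them.

```python
import pandas as pd

# Dictionary copied from the question: tuple keys become MultiIndex entries.
data = {'numberSold': {('3/10', 'FL', '12:00'): 4,
                       ('3/10', 'FL', '1:00'): 1,
                       ('3/10', 'FL', '4:00'): 5,
                       ('3/11', 'FL', '1:00'): 2,
                       ('3/11', 'FL', '2:00'): 3,
                       ('3/11', 'FL', '3:00'): 0,
                       ('3/12', 'FL', '2:00'): 6,
                       ('3/12', 'FL', '5:00'): 6}}

df = pd.DataFrame(data)

# The dictionary has no level names, so assign them explicitly.
df.index.names = ['date', 'location', 'time']

print(df)
```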
Basically, I want the table to build off of the previous entries. If an entry exists for the current date, use it (for example, 3/11 1:00 uses "2" and not "1"). If it doesn't exist, carry over the value from the previous date (for example, 3/11 gets its 4:00 value from 3/10).
I'm not sure how to do this with pandas. I feel like it should be pretty simple, but my attempts have all failed.
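To pin down the rule being asked for, here is a rough plain-Python sketch of the carry-forward logic, without pandas. The function name `carry_forward` and the variable `sales` are hypothetical, and the sketch assumes the date strings sort chronologically, which holds for this sample but not in general (for example, '3/9' sorts after '3/10').

```python
# Sample data from the question: (date, location, time) -> numberSold
sales = {('3/10', 'FL', '12:00'): 4,
         ('3/10', 'FL', '1:00'): 1,
         ('3/10', 'FL', '4:00'): 5,
         ('3/11', 'FL', '1:00'): 2,
         ('3/11', 'FL', '2:00'): 3,
         ('3/11', 'FL', '3:00'): 0,
         ('3/12', 'FL', '2:00'): 6,
         ('3/12', 'FL', '5:00'): 6}


def carry_forward(sales):
    """For each date, keep the most recent value seen for every time slot."""
    result = {}
    latest = {}  # location -> {time: most recent value so far}
    for date in sorted({d for d, _, _ in sales}):
        # Entries for the current date override anything carried over.
        for (d, loc, time), value in sales.items():
            if d == date:
                latest.setdefault(loc, {})[time] = value
        # Emit the running state as this date's rows.
        for loc, by_time in latest.items():
            for time, value in by_time.items():
                result[(date, loc, time)] = value
    return result


filled = carry_forward(sales)
print(filled[('3/11', 'FL', '1:00')])  # current entry wins
print(filled[('3/11', 'FL', '4:00')])  # carried over from 3/10
```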
You could use pivot + ffill to get the missing data, then stack to get the DataFrame back into its previous shape:
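# df is the DataFrame from the question; name the index levels so pivot can refer to them
df.index.names = ['date', 'location', 'time']
# keyword arguments are required by pivot in recent pandas; dropna() drops slots that never had a value
out = df.reset_index().pivot(index=['date', 'location'], columns='time', values='numberSold').ffill().stack().dropna().to_frame(name='numberSold')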
Output:
                     numberSold
date location time
3/10 FL       12:00         4.0
              1:00          1.0
              4:00          5.0
3/11 FL       12:00         4.0
              1:00          2.0
              2:00          3.0
              3:00          0.0
              4:00          5.0
3/12 FL       12:00         4.0
              1:00          2.0
              2:00          6.0
              3:00          0.0
              4:00          5.0
              5:00          6.0
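For anyone who wants to run this end to end, below is a self-contained sketch of the same reshape, forward-fill, reshape idea. It is a variation rather than the exact code above: it uses `unstack` in place of `reset_index` + `pivot`, and it forward-fills within each location via `groupby`. The per-location grouping is an assumption about the intent, since the sample has only one location; without it, values could be carried from one location into another. It also assumes the date strings sort in chronological order.

```python
import pandas as pd

# Sample data from the question.
data = {'numberSold': {('3/10', 'FL', '12:00'): 4,
                       ('3/10', 'FL', '1:00'): 1,
                       ('3/10', 'FL', '4:00'): 5,
                       ('3/11', 'FL', '1:00'): 2,
                       ('3/11', 'FL', '2:00'): 3,
                       ('3/11', 'FL', '3:00'): 0,
                       ('3/12', 'FL', '2:00'): 6,
                       ('3/12', 'FL', '5:00'): 6}}
df = pd.DataFrame(data)
df.index.names = ['date', 'location', 'time']

# Wide layout: one row per (date, location), one column per time slot.
wide = df['numberSold'].unstack('time')

# Carry values forward from earlier dates, separately for each location.
filled = wide.groupby(level='location').ffill()

# Back to the long layout; dropna() removes slots that have never had a value.
out = filled.stack().dropna().to_frame(name='numberSold')

print(out)
```

The values come back as floats because the intermediate wide table contains NaN; a trailing `.astype(int)` would restore integers if that matters.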