Insert zeros for missing data in a pandas.DataFrame

Question

I have a pandas.DataFrame like the following:

sales_with_missing = pd.DataFrame({'month':[1,2,3,6,7,8,9,10,11,12],'code':[111]*10,  'sales':[np.random.randint(1500) for _ in np.arange(10)]})

As you can see, the records for April and May are missing, and I'd like to insert rows with sales set to zero for those months:

sales = insert_zero_for_missing(sales_with_missing)
print(sales)

How can I implement the insert_zero_for_missing method?

Answer 1

Set the month as the index,
reindex to add rows for the missing months,
call fillna to fill the missing values with zero, and then
reset the index (to make month a column again):

import numpy as np
import pandas as pd

month = list(range(1,4)) + list(range(6,13))
sales = np.array(month)*100
df = pd.DataFrame(dict(month=month, sales=sales))
print(df.set_index('month').reindex(range(1,13)).fillna(0).reset_index())

yields

    month   sales
0       1   100.0
1       2   200.0
2       3   300.0
3       4     0.0
4       5     0.0
5       6   600.0
6       7   700.0
7       8   800.0
8       9   900.0
9      10  1000.0
10     11  1100.0
11     12  1200.0
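Applied to the DataFrame from the question, a plain fillna(0) would also set code to zero in the new rows, and reindex turns the integer sales column into floats because of the intermediate NaN values. The following is a minimal sketch of insert_zero_for_missing built on the same set_index/reindex idea. It assumes the frame has month, code and sales columns and only one code value; the months parameter is a hypothetical addition for illustration.

```python
import numpy as np
import pandas as pd


def insert_zero_for_missing(df, months=range(1, 13)):
    """Return a new frame with one row per month; missing months get sales=0.

    Assumption: df has 'month', 'code' and 'sales' columns and a single code.
    """
    out = df.set_index('month').reindex(months)
    # reindex introduces NaN (and a float dtype); fill and cast back to int
    out['sales'] = out['sales'].fillna(0).astype(int)
    # Assumption: every row shares one code, so reuse it for the new rows
    out['code'] = df['code'].iloc[0]
    return out.reset_index()


sales_with_missing = pd.DataFrame({
    'month': [1, 2, 3, 6, 7, 8, 9, 10, 11, 12],
    'code': [111] * 10,
    'sales': np.random.randint(1500, size=10),
})
sales = insert_zero_for_missing(sales_with_missing)
print(sales)
```

If the frame held several codes, each code would need its own reindex; this sketch does not cover that case.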

Answer 2

# create a Series of all months
all_months = pd.Series(range(1, 13))
# find the months missing from your DataFrame; in this example, 4 and 5
missing_months = all_months[~all_months.isin(sales_with_missing.month)]
# build a DataFrame for the missing months, to be concatenated with the original in the next step
missing_df = pd.DataFrame({'month': missing_months.values, 'code': 111, 'sales': 0})

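Out[36]:
code    month   sales
111        4    0
111        5    0

# then concatenate both DataFrames and sort by month
# (sort_values replaces the older sort_index(by=...), which current pandas no longer accepts)
pd.concat([sales_with_missing, missing_df]).sort_values(by='month')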

Out[39]:
code    month   sales
111        1    1028
111        2    1163
111        3    961
111        4    0
111        5    0
111        6    687
111        7    31
111        8    607
111        9    1236
111        10   0863
111        11   11233
111        12   2780
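The concatenation approach can also be wrapped in a function so that the code value is not hard-coded. This is a rough sketch under assumptions, not the answerer's own code: the name insert_zero_by_concat and the months parameter are hypothetical, and the frame is assumed to contain a single code. It uses sort_values, which current pandas expects in place of sort_index(by=...).

```python
import pandas as pd


def insert_zero_by_concat(df, months=range(1, 13)):
    """Append zero-sales rows for months absent from df, then sort by month.

    Assumption: df has 'month', 'code' and 'sales' columns and a single code.
    """
    all_months = pd.Series(list(months))
    missing = all_months[~all_months.isin(df['month'])]
    missing_df = pd.DataFrame({
        'month': missing.to_numpy(),
        'code': df['code'].iloc[0],  # reuse the single existing code
        'sales': 0,
    })
    combined = pd.concat([df, missing_df], ignore_index=True)
    # sort by the month column and renumber the rows from 0
    return combined.sort_values('month').reset_index(drop=True)


month = list(range(1, 4)) + list(range(6, 13))
df = pd.DataFrame({'month': month, 'code': 111,
                   'sales': [m * 100 for m in month]})
result = insert_zero_by_concat(df)
print(result)
```

Because no NaN values are ever introduced, the sales column keeps its integer dtype here.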

Answer 1: score 5, accepted, 2015-09-13 14:18:28

Answer 2: score 4, 2015-09-13 15:17:03
