Pandas: create rows for each unique value of a column, even with missing data
Note: I had difficulty wording the title of my question, so if you can think of something better to help other people with a similar question, please let me know and I will change it.
My data is stored as a Pandas DataFrame:
print(df)
week | site | vol
1 | a | 10
2 | a | 11
3 | a | 2
1 | b | 55
2 | b | 1
1 | c | 69
2 | c | 66
3 | c | 23
Notice that site b has no data for week 3. I would like the output to be:
week | site | vol
1 | a | 10
2 | a | 11
3 | a | 2
1 | b | 55
2 | b | 1
3 | b | 0
1 | c | 69
2 | c | 66
3 | c | 23
Essentially, I want to create rows for all of the unique combinations of week and site. If the original data doesn't have a vol for a week-site combo, then it gets a 0.
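To make the question reproducible, here is a minimal sketch that builds the sample data and lists the week-site combinations that are absent. The variable names all_pairs, present and missing are my own and purely illustrative; the column names and values come from the table above.

```python
import pandas as pd

# Sample data from the question: site b has no row for week 3
df = pd.DataFrame({
    "week": [1, 2, 3, 1, 2, 1, 2, 3],
    "site": ["a", "a", "a", "b", "b", "c", "c", "c"],
    "vol":  [10, 11, 2, 55, 1, 69, 66, 23],
})

# Every (week, site) pair that could exist, from the unique values of each column
all_pairs = pd.MultiIndex.from_product(
    [sorted(df["week"].unique()), sorted(df["site"].unique())],
    names=["week", "site"],
)

# The pairs that actually occur in the data
present = pd.MultiIndex.from_frame(df[["week", "site"]])

# Pairs that need a filler row; for this data, only week 3 / site b
missing = all_pairs.difference(present)
print(missing.tolist())
```

With 3 weeks and 3 sites there are 9 possible pairs and 8 rows, so exactly one row has to be created.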
Using stack with unstack:
df.set_index(['week','site']).unstack('week',fill_value=0).stack().reset_index()
Out[424]:
site week vol
0 a 1 10
1 a 2 11
2 a 3 2
3 b 1 55
4 b 2 1
5 b 3 0
6 c 1 69
7 c 2 66
8 c 3 23
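The one-liner above can be split into steps to see where the missing row comes from. This is a sketch against the sample data from the question; the intermediate names wide and tidy are illustrative, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    "week": [1, 2, 3, 1, 2, 1, 2, 3],
    "site": ["a", "a", "a", "b", "b", "c", "c", "c"],
    "vol":  [10, 11, 2, 55, 1, 69, 66, 23],
})

# Step 1: pivot week into the columns. Each site becomes one row, and any
# week a site lacks becomes an empty cell, which fill_value=0 fills in.
wide = df.set_index(["week", "site"]).unstack("week", fill_value=0)

# Step 2: move week back into the index, so every site now has a row
# for every week, then flatten the index into ordinary columns.
tidy = wide.stack().reset_index()

print(tidy)
```

Because the gap is filled during unstack rather than with a later fillna, vol never holds a NaN and keeps its integer dtype.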
You can use crosstab and stack:
pd.crosstab(df.site,df.week,df.vol, aggfunc='first').fillna(0).stack().reset_index(name='vol')
Output:
site week vol
0 a 1 10.0
1 a 2 11.0
2 a 3 2.0
3 b 1 55.0
4 b 2 1.0
5 b 3 0.0
6 c 1 69.0
7 c 2 66.0
8 c 3 23.0
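Note that vol comes back as float here, because crosstab first produces a NaN for the missing week-site cell and only then fills it. If integer volumes are wanted, one possible variation (my own addition, assuming every vol in the data is a whole number) is to cast after fillna:

```python
import pandas as pd

df = pd.DataFrame({
    "week": [1, 2, 3, 1, 2, 1, 2, 3],
    "site": ["a", "a", "a", "b", "b", "c", "c", "c"],
    "vol":  [10, 11, 2, 55, 1, 69, 66, 23],
})

out = (
    pd.crosstab(df["site"], df["week"], values=df["vol"], aggfunc="first")
    .fillna(0)      # the missing (b, 3) cell is NaN at this point
    .astype(int)    # assumption: all volumes are whole numbers
    .stack()
    .reset_index(name="vol")
)

print(out)
```

aggfunc='first' simply takes the single value per cell; if a week-site pair could appear more than once, 'sum' may be the more appropriate choice.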