How to generate a column from the unique groupby combinations in a cumulative way?
My data looks like this:
import pandas as pd

df_dict = {
    'Year': [2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022],
    'Week of Year': [1, 1, 2, 2, 10, 10, 11, 11]
}
df = pd.DataFrame(df_dict)
How can I generate a new column, say Week Order, that shows the unique Year, Week of Year combinations in a cumulative way? The resulting data set will look like this:
Year Week of Year Week Order
0 2021 1 1
1 2021 1 1
2 2021 2 2
3 2021 2 2
4 2022 10 3
5 2022 10 3
6 2022 11 4
7 2022 11 4
You can use pandas.factorize:
df['Week Order'] = df.agg(tuple, axis=1).factorize()[0]+1
print(df)
Year Week of Year Week Order
0 2021 1 1
1 2021 1 1
2 2021 2 2
3 2021 2 2
4 2022 10 3
5 2022 10 3
6 2022 11 4
7 2022 11 4
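The one-liner above builds its tuples from every column in the frame, so an unrelated column would change the result. A minimal sketch of the same factorize idea restricted to the key columns follows; the Sales column and the keys list are hypothetical additions for illustration, not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({
    'Year': [2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022],
    'Week of Year': [1, 1, 2, 2, 10, 10, 11, 11],
    'Sales': [5, 7, 5, 9, 4, 4, 8, 6],  # hypothetical extra column
})

# Assumption: only these two columns define a "week"
keys = ['Year', 'Week of Year']

# Turn each row of the key columns into a tuple, then let factorize
# assign an integer code to each distinct tuple
codes, uniques = df[keys].apply(tuple, axis=1).factorize()

# Codes start at 0, so add 1 to start the order at 1
df['Week Order'] = codes + 1
print(df)
```

Here the Sales values do not affect the numbering because they never enter the tuples.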
Here is one way to do it:
# Start from a column of ones, blank out the repeated rows, then count and forward-fill
df['week order'] = 1
df['week order'] = df['week order'].mask(df.duplicated()).cumsum().ffill().astype(int)
df
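Or, counting the first occurrence of each combination directly:
# The rows of each combination must be adjacent, as they are in the sample data
df['week order'] = (~df.duplicated(['Year', 'Week of Year'])).cumsum()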
Year Week of Year week order
0 2021 1 1
1 2021 1 1
2 2021 2 2
3 2021 2 2
4 2022 10 3
5 2022 10 3
6 2022 11 4
7 2022 11 4
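The mask, cumsum and ffill chain above can be unpacked step by step. This is a minimal sketch on a smaller, hypothetical frame with groups of different sizes; the intermediate names (is_repeat, ones, first_only, running) and the explicit keys list are mine, and it assumes that rows with the same combination sit next to each other.

```python
import pandas as pd

# Hypothetical data: groups of size 2, 1 and 2, already adjacent
df = pd.DataFrame({
    'Year': [2021, 2021, 2021, 2022, 2022],
    'Week of Year': [1, 1, 2, 10, 10],
})
keys = ['Year', 'Week of Year']  # assumption: explicit key columns

is_repeat = df.duplicated(keys)       # True for each repeat of an earlier combination
ones = pd.Series(1, index=df.index)   # a 1 for every row
first_only = ones.mask(is_repeat)     # 1 on first occurrences, NaN on repeats
running = first_only.cumsum()         # NaN rows are skipped, so the count only grows on firsts
df['week order'] = running.ffill().astype(int)  # repeats inherit the count of their first row
print(df)
```

Because the forward fill copies the previous row's value, the approach depends on repeats directly following the first occurrence of their combination.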
Another option is sort_values + duplicated + cumsum, i.e. every non-duplicated Year + Week combination increases the order by one:
cols = ['Year', 'Week of Year']
df['Week Order'] = (~df.sort_values(cols).duplicated(cols)).cumsum()
df
Year Week of Year Week Order
0 2021 1 1
1 2021 1 1
2 2021 2 2
3 2021 2 2
4 2022 10 3
5 2022 10 3
6 2022 11 4
7 2022 11 4
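The sort_values step matters once the rows are not already in order. The sketch below runs the same expression on hypothetical shuffled data with uneven group sizes; the cross-check with groupby(...).ngroup() is not from the answers above and is included only as an independent way to confirm the numbering.

```python
import pandas as pd

# Hypothetical unsorted data with groups of different sizes
df = pd.DataFrame({
    'Year': [2022, 2021, 2022, 2021, 2021],
    'Week of Year': [10, 2, 10, 1, 2],
})
cols = ['Year', 'Week of Year']

# Sorting puts equal combinations next to each other; the first row of
# each combination is the only one not flagged as a duplicate
is_first = ~df.sort_values(cols).duplicated(cols)

# The running count is computed in sorted order, and assignment aligns
# it back to the original rows by index
df['Week Order'] = is_first.cumsum()

# Cross-check (not from the original answers): ngroup numbers the groups
# in sorted key order starting from 0
check = df.groupby(cols).ngroup() + 1
print(df.assign(check=check))
```

With this variant the order follows the sorted Year and Week values, whereas pandas.factorize without sorting numbers the combinations in order of first appearance.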