Pandas - Create new column and assign values based on filter
Say I have a dataframe:
id category
1 A
2 A
3 B
4 C
5 A
And I want to create a new column with incremental values where category == 'A'. So it should be something like:
id category value
1 A 1
2 A 2
3 B NaN
4 C NaN
5 A 3
Currently I am able to do this with:
import numpy as np
df['value'] = np.nan
df.loc[df.category == "A", ['value']] = range(1, len(df[df.category == "A"]) + 1)
Is there a better/more pythonic way to do this (i.e. one where I don't have to initialize the value column with NaN)? Also, this method currently gives me a float dtype instead of the integer dtype I want.
The column value does not need to be initialized when the default value is NaN, as long as the column is selected without the list brackets ([]), i.e. 'value' instead of ['value']. Also, sum can be used to count the True values of the mask:
m = df.category == "A"
df.loc[m, 'value'] = range(1, m.sum() + 1)
df['value'] = df['value'].astype('Int64')
print (df)
id category value
0 1 A 1
1 2 A 2
2 3 B <NA>
3 4 C <NA>
4 5 A 3
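To see where the float dtype mentioned in the question comes from, here is a minimal, self-contained sketch of the same steps. The pd.DataFrame construction is an assumption, since the post only shows the printed table; the names dtype_before and dtype_after are introduced only for illustration.

```python
import pandas as pd

# Sample data reconstructed from the table in the question (assumed construction).
df = pd.DataFrame({"id": [1, 2, 3, 4, 5],
                   "category": ["A", "A", "B", "C", "A"]})

m = df["category"] == "A"   # boolean mask of the rows to number

# Assigning through .loc to a column that does not exist yet creates it
# and leaves the unselected rows as NaN.
df.loc[m, "value"] = range(1, m.sum() + 1)
dtype_before = df["value"].dtype   # NaN forces a floating-point column here

# Convert to the nullable integer dtype so missing rows show as <NA>.
df["value"] = df["value"].astype("Int64")
dtype_after = df["value"].dtype

print(dtype_before, dtype_after)
print(df)
```

The capital "I" in 'Int64' matters: it selects the pandas nullable integer dtype, whereas the lowercase NumPy 'int64' cannot hold missing values.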
If you also need the result as nullable integers:
m = df.category == "A"
df['value'] = m.cumsum().where(m).astype('Int64')
print (df)
id category value
0 1 A 1
1 2 A 2
2 3 B <NA>
3 4 C <NA>
4 5 A 3
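The one-liner above can be broken into its intermediate steps, which may make it easier to follow. This is only a sketch on the sample data from the question (the DataFrame construction and the names running and masked are assumptions added for illustration).

```python
import pandas as pd

# Sample data reconstructed from the table in the question (assumed construction).
df = pd.DataFrame({"id": [1, 2, 3, 4, 5],
                   "category": ["A", "A", "B", "C", "A"]})

m = df["category"] == "A"   # True for the rows that should be numbered

# Cumulative sum of a boolean mask is a running count of the True values.
running = m.cumsum()        # 1, 2, 2, 2, 3

# Keep the running count only where the mask is True; other rows become NaN.
masked = running.where(m)

# NaN makes the column float, so cast to the nullable integer dtype.
df["value"] = masked.astype("Int64")
print(df)
```

Because the counter is derived from the mask itself, no separate range or pre-initialized column is needed.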
Another way could be:
# Nullable Int64 lets the column hold pd.NA; .loc avoids chained assignment.
df['value'] = df['category'].eq('A').cumsum().astype('Int64')
df.loc[df['category'] != 'A', 'value'] = pd.NA