Pandas - Create a new column and assign values based on a filter

Question

Say I have a dataframe:

id  category
1   A        
2   A
3   B
4   C
5   A

And I want to create a new column with incremental values where category == 'A'. So it should look something like this:

id  category  value
1   A         1
2   A         2
3   B         NaN
4   C         NaN
5   A         3

Currently I am able to do this with:

import numpy as np
df['value'] = np.nan  # pandas has no pd.nan; NumPy provides np.nan
df.loc[df.category == "A", ['value']] = range(1, len(df[df.category == "A"]) + 1)

Is there a better, more Pythonic way to do this (i.e., without having to initialize the value column with NaN)? Also, this method currently gives me a float column instead of the integer column I want.

Answer 1

The value column does not need to be initialized when the default is NaN: assigning through df.loc with a plain column label ('value', without the []) creates it. To count the rows matched by the mask, use sum:

m = df.category == "A"
df.loc[m, 'value'] = range(1, m.sum() + 1)
df['value'] = df['value'].astype('Int64')

print (df)

   id category  value
0   1        A      1
1   2        A      2
2   3        B   <NA>
3   4        C   <NA>
4   5        A      3
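
For readers who want to run the approach above end to end, here is a minimal self-contained sketch. The sample frame is rebuilt from the question's table. The intermediate dtype is printed rather than assumed, since the column holds NaN for the unmatched rows until it is cast.

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({"id": [1, 2, 3, 4, 5],
                   "category": ["A", "A", "B", "C", "A"]})

m = df["category"] == "A"                    # boolean mask of the rows to number
df.loc[m, "value"] = range(1, m.sum() + 1)   # creates 'value'; unmatched rows stay missing
print(df["value"].dtype)                     # dtype before the cast (not an integer dtype)

df["value"] = df["value"].astype("Int64")    # nullable integer dtype keeps the gaps as <NA>
print(df)
```

The capital "I" in "Int64" matters: it selects the nullable integer extension dtype, whereas the lowercase "int64" cannot hold missing values.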

The same result, also as nullable integers, can be built in one step from the cumulative sum of the mask:

m = df.category == "A"
df['value'] = m.cumsum().where(m).astype('Int64')
print (df)
   id category  value
0   1        A      1
1   2        A      2
2   3        B   <NA>
3   4        C   <NA>
4   5        A      3
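
To make the one-liner easier to follow, the sketch below splits it into its three steps. The intermediate names running and masked are only illustrative and do not appear in the answer.

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({"id": [1, 2, 3, 4, 5],
                   "category": ["A", "A", "B", "C", "A"]})

m = df["category"].eq("A")      # True, True, False, False, True
running = m.cumsum()            # running count of True values: 1, 2, 2, 2, 3
masked = running.where(m)       # keep the count only where m is True, NaN elsewhere
df["value"] = masked.astype("Int64")   # cast to nullable integers
print(df)
```

Because the counter comes from cumsum rather than range, there is no need to know in advance how many rows match.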

Answer 2

Another way could be:

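# Running count of 'A' rows, stored as nullable integers so pd.NA fits
df['value'] = df['category'].eq('A').cumsum().astype('Int64')
# Blank out the rows that are not 'A' (.loc avoids chained assignment)
df.loc[df['category'] != 'A', 'value'] = pd.NA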
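
The same idea, count first and then blank out the other rows, can also be written as a single expression with Series.mask. This is a sketch and not the answerer's code. It assumes that casting to "Int64" before masking is acceptable, so the values never pass through a float dtype.

```python
import pandas as pd

# Sample data rebuilt from the question's table
df = pd.DataFrame({"id": [1, 2, 3, 4, 5],
                   "category": ["A", "A", "B", "C", "A"]})

is_a = df["category"].eq("A")
# Count the 'A' rows, switch to nullable integers, then hide every non-'A' row
df["value"] = is_a.cumsum().astype("Int64").mask(~is_a)
print(df)
```

Series.mask replaces values where the condition is True, so passing ~is_a removes the count from every row that is not 'A'.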

Solution 1
0 2020-08-19 08:56:52

Solution 2
0 2020-08-19 09:09:06
