Increase a value by a constant depending on the number of times it appears in another column

Question

I have:

import pandas as pd

df = pd.DataFrame({'col1': ['x', 'x', 'x', 'x', 'x', 'y', 'y', 'y', 'y', 'y', 'y', 'y'],
                   'value': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]})

I would like:

the value column to increase by a constant that depends on the number of times the row's col1 value has appeared. For each occurrence of x it increases by 100, and for each occurrence of y it increases by 150.

Answer 1

We'll start by getting the cumulative count for each item in col1:

df['value'] = df.groupby('col1').cumcount()
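
As a small illustration of what this step produces, the sketch below runs cumcount on a shortened, made-up frame (the five-row data here is an assumption for brevity, not the frame from the question). Each group is numbered from 0 in order of appearance:

```python
import pandas as pd

# Shortened toy data (hypothetical), same column name as in the question
df = pd.DataFrame({'col1': ['x', 'x', 'x', 'y', 'y']})

# cumcount numbers the rows of each group 0, 1, 2, ... in order of appearance
counts = df.groupby('col1').cumcount()
print(counts.tolist())  # [0, 1, 2, 0, 1]
```

Because the count starts at 0, the first occurrence of each item keeps the value 0 after the multiplication.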

Next, we need to multiply each count by its item's constant:

multiples = {
    'x': 100,
    'y': 150
}
# Scale the counts of each item by that item's constant
for col, value in multiples.items():
    index = df['col1'] == col  # boolean mask selecting this item's rows
    df.loc[index, 'value'] *= value

Giving the final result:

   col1  value
0     x      0
1     x    100
2     x    200
3     x    300
4     x    400
5     y      0
6     y    150
7     y    300
8     y    450
9     y    600
10    y    750
11    y    900
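
One way to sanity-check a result like this, sketched here as an optional extra rather than part of the original answer, is to confirm that consecutive rows within each group differ by that group's constant. The frame is rebuilt with the same data so the snippet runs on its own:

```python
import pandas as pd

df = pd.DataFrame({'col1': ['x'] * 5 + ['y'] * 7})
multiples = {'x': 100, 'y': 150}

# Same two steps as above: per-group running count, then per-item scaling
df['value'] = df.groupby('col1').cumcount()
for col, value in multiples.items():
    df.loc[df['col1'] == col, 'value'] *= value

# Differences between consecutive rows of each group (first row of a group is NaN)
steps = df.groupby('col1')['value'].diff().dropna()
# The constant each of those rows is expected to step by
expected = df.loc[steps.index, 'col1'].map(multiples)
assert (steps == expected).all()
```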

Answer 2

EDIT: SNygard beat me to it, but here is a solution that makes use of pandas' broadcasting and bypasses the inefficiencies of iteration.

Iterating over a DataFrame's rows gives up most of pandas' efficiency, because the library is built for operations on whole columns rather than for row-by-row loops.

Here's how I would do it:

import pandas as pd

col1_to_value_hash = {
    'x': 100,
    'y': 150
}

df = pd.DataFrame({
    'col1': ['x', 'x', 'x', 'x', 'x', 'y', 'y', 'y', 'y', 'y', 'y', 'y'],
    'value': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
})

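# Running count of each col1 value's occurrences: 0, 1, 2, ... within each group
cumcount = df.groupby('col1').cumcount()

# Look up each row's constant and multiply it by that row's count
df['value'] = cumcount * df['col1'].apply(lambda x: col1_to_value_hash[x])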
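
A possible variant, offered as a sketch and not as the answer author's method, replaces the apply call and its lambda with Series.map, which accepts the dict directly. One behavioural difference to be aware of: a label missing from the dict becomes NaN under map, whereas the lambda lookup would raise a KeyError.

```python
import pandas as pd

col1_to_value_hash = {'x': 100, 'y': 150}
df = pd.DataFrame({'col1': ['x'] * 5 + ['y'] * 7})

# Series.map looks each label up in the dict; unmapped labels would become NaN
step = df['col1'].map(col1_to_value_hash)

# Per-group running count times the per-row constant
df['value'] = df.groupby('col1').cumcount() * step
print(df['value'].tolist())
# [0, 100, 200, 300, 400, 0, 150, 300, 450, 600, 750, 900]
```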

Answer 1: score 1, accepted, 2022-12-07 16:02:58

Answer 2: score 0, 2022-12-07 16:13:27
