How to subtract previous row from current row in a pandas dataframe to create a new column restarting the process with each name?

I have a dataframe with three columns: the first is a categorical variable holding a person's name, the second is the date, and the third is the cumulative number of occurrences of a problem. I would like to generate a new column with the occurrences per day for each person.

**Name     Date          Cumulative**

John     01-01-2020    0
John     02-01-2020    5
John     03-01-2020    10
John     04-01-2020    12
Peter    01-01-2020    0
Peter    02-01-2020    3
Peter    03-01-2020    7
Peter    04-01-2020    10
James    01-01-2020    0
James    02-01-2020    10
James    03-01-2020    14
James    04-01-2020    18
Kirk     01-01-2020    0
Kirk     02-01-2020    12
Kirk     03-01-2020    12
Kirk     04-01-2020    15
Rob      01-01-2020    0
Rob      02-01-2020    11
Rob      03-01-2020    18
Rob      04-01-2020    23

If I use df['By Day'] = df.Cumulative.diff(), the result is mostly right, but the first row of each person gets a negative number instead of 0, because diff() subtracts the previous person's last cumulative value from that row. The result looks like this:

Name     Date          Cumulative  By Day

John     01-01-2020    0           0
John     01-02-2020    0           0
John     03-01-2020    5           5
John     04-01-2020    10          5
John     05-01-2020    12          2
Peter    01-01-2020    0           -12
Peter    02-01-2020    0           0
Peter    03-01-2020    3           3
Peter    04-01-2020    7           4
Peter    04-01-2020    10          3
James    01-01-2020    0           -10
James    02-01-2020    0           0
James    03-01-2020    10          10
James    04-01-2020    14          4
James    04-01-2020    18          4 
Kirk     01-01-2020    0           -18
Kirk     02-01-2020    0           0
Kirk     03-01-2020    12          12
Kirk     04-01-2020    15          3
Kirk     04-01-2020    19          4
Rob      01-01-2020    5           -14
Rob      02-01-2020    11          6
Rob      03-01-2020    18          7
Rob      04-01-2020    23          5
Rob      04-01-2020    27          4

I would like to compute the difference separately for each name, so that it starts from 0 every time the person changes. I've thought about iterating by name, but that would repeat the calculation 5 times for each entry. For example, for Rob I would want 0 6 7 5 4 instead of a series starting with -14 (Rob's first entry of 5 minus Kirk's last value of 19).

You should first use the groupby function on the Name column so that diff is applied separately to each person. Then you can use fillna(0) to replace the NaN values (which will appear in the first row of each person) with 0:

df["By Day"] = df.groupby("Name")["Cumulative"].diff().fillna(0)
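
For reference, here is a minimal, self-contained sketch of that approach using the data from the first table in the question. It assumes the rows are already sorted by date within each name; the final astype(int) is an optional addition (not part of the one-liner above) that only turns the float result back into integers.

```python
import pandas as pd

# Sample data copied from the first table in the question.
df = pd.DataFrame({
    "Name": ["John"] * 4 + ["Peter"] * 4 + ["James"] * 4 + ["Kirk"] * 4 + ["Rob"] * 4,
    "Date": ["01-01-2020", "02-01-2020", "03-01-2020", "04-01-2020"] * 5,
    "Cumulative": [0, 5, 10, 12,
                   0, 3, 7, 10,
                   0, 10, 14, 18,
                   0, 12, 12, 15,
                   0, 11, 18, 23],
})

# diff() runs inside each Name group, so the first row of every person
# becomes NaN instead of a difference against the previous person's last value.
# fillna(0) replaces those NaN values; astype(int) is optional (assumption:
# integer counts are wanted, since diff() returns floats when NaN is present).
df["By Day"] = df.groupby("Name")["Cumulative"].diff().fillna(0).astype(int)

print(df)
```

Because diff() on a groupby returns a result aligned with the original index, the new column can be assigned straight back to df without any merge or explicit loop over names.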
