Calculate value difference using the min and max dates

Question

I am trying to calculate value growth/decline using the minimum date and maximum date. My data currently looks like this:

  Code        Date    Value
0    A  2020-12-31  80122.0
1    A  2019-12-31  45472.0
2    A  2018-12-31  31917.0
3    A  2017-12-31  23432.0
4    B  2020-12-31      0.0

For Code A I need to keep the max (2020-12-31) and min (2017-12-31) dates as well as their values so I can calculate the difference later on. I have multiple codes and need to be able to apply the same logic to each one. Any suggestions on the best way to approach this?

Thanks

Answer 1

In your case, you want to sort by date, then groupby and extract the first and last values:

 df.sort_values(['Code','Date']).groupby('Code')['Value'].agg(['first','last'])

Output:

        first     last
Code                  
A     23432.0  80122.0
B         0.0      0.0
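
As a follow-up to the output above, the growth or decline the question asks about can be derived from the two columns. This is a minimal sketch, assuming the question's sample data with Date already converted to datetime; the column names `diff` and `pct_change` are illustrative and not part of the original answer:

```python
import pandas as pd

# Sample data from the question (Date parsed as datetime)
df = pd.DataFrame({
    'Code': ['A', 'A', 'A', 'A', 'B'],
    'Date': pd.to_datetime(['2020-12-31', '2019-12-31', '2018-12-31',
                            '2017-12-31', '2020-12-31']),
    'Value': [80122.0, 45472.0, 31917.0, 23432.0, 0.0],
})

# Value at the earliest and latest date per code (same call as in the answer)
res = (df.sort_values(['Code', 'Date'])
         .groupby('Code')['Value']
         .agg(['first', 'last']))

# Absolute change between the earliest and latest value
res['diff'] = res['last'] - res['first']

# Relative change in percent; a starting value of 0 yields NaN or inf,
# so codes like B may need separate handling
res['pct_change'] = res['diff'] / res['first'] * 100

print(res)
```

Whether a zero starting value should be reported as NaN, dropped, or treated some other way depends on what "growth" is supposed to mean for such codes.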

Answer 2

I would first sort_values, then drop_duplicates on 'Code'. Using different values for keep lets you get the first and last row (based on Date) within each 'Code', which you can then subtract to get the day difference and Value difference for each code.

df = df.sort_values(['Code', 'Date'])

(df.drop_duplicates('Code', keep='last').set_index('Code')
 - df.drop_duplicates('Code', keep='first').set_index('Code'))

#          Date    Value
#Code                   
#A    1096 days  56690.0
#B       0 days      0.0

Alternatively, if you need the rows themselves and not just the difference, I would concat them together instead of subtracting. The main reason to avoid the .first aggregation is that, in the case of null values, it does not guarantee the data come from the same rows (without specifying dropna).

pd.concat([df.drop_duplicates('Code', keep='last').set_index('Code'),
           df.drop_duplicates('Code', keep='first').set_index('Code')],
          keys=['Last', 'First'], axis=1)

#           Last               First         
#           Date    Value       Date    Value
#Code                                        
#A    2020-12-31  80122.0 2017-12-31  23432.0
#B    2020-12-31      0.0 2020-12-31      0.0
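
The caveat about `.first` and null values can be illustrated with a small example. This is a sketch on hypothetical data (not the question's), in which the earliest row for code A has a missing Value:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the earliest row for code A has a missing Value
df = pd.DataFrame({
    'Code': ['A', 'A', 'A'],
    'Date': pd.to_datetime(['2017-12-31', '2018-12-31', '2019-12-31']),
    'Value': [np.nan, 10.0, 20.0],
}).sort_values(['Code', 'Date'])

# GroupBy.first takes the first non-null entry of each column separately,
# so Date and Value can end up coming from different rows
agg_first = df.groupby('Code')[['Date', 'Value']].first()

# drop_duplicates keeps the whole first row, nulls included
row_first = df.drop_duplicates('Code', keep='first').set_index('Code')

print(agg_first)   # Date from the 2017 row, Value from the 2018 row
print(row_first)   # Date and Value both from the 2017 row (Value is NaN)
```

With the aggregation, the 2017 date ends up paired with the 2018 value, which would silently distort a growth calculation; the row-based approach keeps the NaN visible.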

Answer 3

Since you

need to keep the max (2020-12-31) and min (2017-12-31) dates as well as the values...

you can try:

import pandas as pd

df = pd.DataFrame({'Code': ['A', 'A', 'A', 'A', 'B'],
                   'Date': ['2020-12-31', '2019-12-31', '2018-12-31', '2017-12-31', '2020-12-31'],
                   'Value': [80122.0, 45472.0, 31917.0, 23432.0, 0.0]
                  })

# Assign the whole column so it actually becomes datetime64
df['Date'] = pd.to_datetime(df['Date'])

This is the df from the question:

    Code    Date    Value
0   A   2020-12-31  80122.0
1   A   2019-12-31  45472.0
2   A   2018-12-31  31917.0
3   A   2017-12-31  23432.0
4   B   2020-12-31  0.0

So another way can be:

dictionary = {}

for code in df['Code'].unique():
    rows = df.loc[df['Code'] == code]   # all rows for this code
    date_min = rows['Date'].min()
    date_max = rows['Date'].max()
    dictionary[code] = {'Date min': date_min,
                        'Value min': rows.loc[rows['Date'] == date_min, 'Value'].values[0],
                        'Date max': date_max,
                        'Value max': rows.loc[rows['Date'] == date_max, 'Value'].values[0]
                       }
resume = pd.DataFrame(dictionary)
resume = resume.transpose()
resume

That outputs:

    Date min    Value min   Date max    Value max
A   2017-12-31  23432.0   2020-12-31    80122.0
B   2020-12-31  0.0       2020-12-31    0.0
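
The same summary can probably be built without an explicit loop. The sketch below swaps in a different technique, `idxmin`/`idxmax` on the grouped Date column, and is not the method from this answer; the names `first_rows`, `last_rows`, and `summary` are illustrative, and Date is assumed to be datetime already:

```python
import pandas as pd

# Sample data from the question (Date parsed as datetime)
df = pd.DataFrame({
    'Code': ['A', 'A', 'A', 'A', 'B'],
    'Date': pd.to_datetime(['2020-12-31', '2019-12-31', '2018-12-31',
                            '2017-12-31', '2020-12-31']),
    'Value': [80122.0, 45472.0, 31917.0, 23432.0, 0.0],
})

# Index labels of the rows holding the earliest / latest date per code
grouped = df.groupby('Code')['Date']
first_rows = df.loc[grouped.idxmin()].set_index('Code')
last_rows = df.loc[grouped.idxmax()].set_index('Code')

# Side-by-side summary using the same column names as the loop version
summary = first_rows.join(last_rows, lsuffix=' min', rsuffix=' max')

print(summary)
```

Because whole rows are selected by index label, each date stays paired with the value from the same row, and no prior sorting is needed.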

3 answers

Answer 1
6, accepted, 2021-12-16 16:09:25

Answer 2
1, 2021-12-16 16:10:07

Answer 3
0, 2021-12-16 16:38:18
