Calculating the average number of days between a list of dates

Question

I'm trying to calculate the posting frequency of a user on Instagram, so I've built a list of their most recent post dates. Like this:

['01-23-2019', '01-19-2019', '01-12-2019', '12-30-2018', '12-28-2018', '12-20-2018', '11-21-2018', '11-09-2018', '10-26-2018', '10-12-2018', '09-30-2018', '09-16-2018', '09-06-2018', '08-31-2018', '08-15-2018', '08-12-2018', '08-09-2018', '07-30-2018', '07-27-2018', '07-24-2018', '07-20-2018', '07-17-2018', '07-14-2018', '07-08-2018', '07-06-2018', '06-30-2018', '06-26-2018', '06-13-2018', '06-08-2018', '06-06-2018', '05-28-2018', '05-21-2018', '05-19-2018', '05-11-2018', '05-08-2018', '05-03-2018', '05-01-2018', '04-12-2018', '04-05-2018', '03-31-2018', '03-27-2018', '03-10-2018', '03-06-2018', '02-25-2018', '02-21-2018', '02-18-2018', '02-16-2018', '02-11-2018', '02-06-2018', '02-03-2018']

Ideally, I want to get the average number of days between the post dates, so that I end up with a frequency figure, e.g. 'the user posts every n days'.

I'm taking the timestamp from the JSON code and converting it to something readable like this:

import datetime
# Prepare the timestamp for the frequency calculation
taken_on = post_details['taken_at_timestamp']
readable_post_date = datetime.datetime.fromtimestamp(taken_on).strftime('%m-%d-%Y')
post_dates.append(readable_post_date)

How should I best approach this to get a decimal result?

Answer 1

Just sum the differences between consecutive dates and divide by the number of differences. To keep it simple:

from datetime import datetime, timedelta

dates = ['01-23-2019', '01-19-2019', '01-12-2019', '12-30-2018', '12-28-2018', '12-20-2018', '11-21-2018', '11-09-2018', '10-26-2018', '10-12-2018', '09-30-2018', '09-16-2018', '09-06-2018', '08-31-2018', '08-15-2018', '08-12-2018', '08-09-2018', '07-30-2018', '07-27-2018', '07-24-2018', '07-20-2018', '07-17-2018', '07-14-2018', '07-08-2018', '07-06-2018', '06-30-2018', '06-26-2018', '06-13-2018', '06-08-2018', '06-06-2018', '05-28-2018', '05-21-2018', '05-19-2018', '05-11-2018', '05-08-2018', '05-03-2018', '05-01-2018', '04-12-2018', '04-05-2018', '03-31-2018', '03-27-2018', '03-10-2018', '03-06-2018', '02-25-2018', '02-21-2018', '02-18-2018', '02-16-2018', '02-11-2018', '02-06-2018', '02-03-2018']
sorted_dates = sorted(datetime.strptime(d, '%m-%d-%Y') for d in dates)

time_difference = timedelta(0)
counter = 0
for i in range(1, len(sorted_dates), 1):
    counter += 1
    time_difference += sorted_dates[i] - sorted_dates[i-1]


frequency = time_difference / counter
print(frequency.days)  # 7 (whole days only; frequency / timedelta(days=1) gives the decimal value)
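
Since the question asks for a decimal result, note that the .days attribute drops the fractional part of the averaged timedelta. One way around that is to divide the timedelta by a one-day timedelta, which yields a float. A minimal sketch, using a shortened three-date list (an assumption for illustration) in place of the full one:

```python
from datetime import datetime, timedelta

# Shortened sample list, for illustration only.
dates = ['01-23-2019', '01-19-2019', '01-12-2019']
sorted_dates = sorted(datetime.strptime(d, '%m-%d-%Y') for d in dates)

# Gaps between consecutive dates, oldest to newest.
gaps = [later - earlier for earlier, later in zip(sorted_dates, sorted_dates[1:])]

# sum() needs a timedelta start value; timedelta / int yields a timedelta.
frequency = sum(gaps, timedelta(0)) / len(gaps)

# timedelta / timedelta yields a float: the average expressed in days.
average_days = frequency / timedelta(days=1)
print(frequency.days)  # 5
print(average_days)    # 5.5
```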

Answer 2

Using statistics.mean with the datetime module (L is the list of date strings from the question):

from datetime import datetime
from statistics import mean

def to_dt(x):
    return datetime.strptime(x, '%m-%d-%Y')

res = mean((to_dt(x) - to_dt(y)).days for x, y in zip(L, L[1:]))  # 7.22

Answer 3

If you want the average as a float, you could do something like the following (this works because your list is already sorted, newest first).

from datetime import datetime

dates = ['01-23-2019', '01-19-2019', '01-12-2019', '12-30-2018', '12-28-2018', '12-20-2018', '11-21-2018', '11-09-2018', '10-26-2018', '10-12-2018', '09-30-2018', '09-16-2018', '09-06-2018', '08-31-2018', '08-15-2018', '08-12-2018', '08-09-2018', '07-30-2018', '07-27-2018', '07-24-2018', '07-20-2018', '07-17-2018', '07-14-2018', '07-08-2018', '07-06-2018', '06-30-2018', '06-26-2018', '06-13-2018', '06-08-2018', '06-06-2018', '05-28-2018', '05-21-2018', '05-19-2018', '05-11-2018', '05-08-2018', '05-03-2018', '05-01-2018', '04-12-2018', '04-05-2018', '03-31-2018', '03-27-2018', '03-10-2018', '03-06-2018', '02-25-2018', '02-21-2018', '02-18-2018', '02-16-2018', '02-11-2018', '02-06-2018', '02-03-2018']

def daysdiff(a, b):
    return (datetime.strptime(a, '%m-%d-%Y') - datetime.strptime(b, '%m-%d-%Y')).days

average = sum(daysdiff(a, b) for a, b in zip(dates, dates[1:])) / (len(dates) - 1)
print(average)
# OUTPUT
# 7.224489795918367

This iterates over consecutive pairs by zipping the list with a slice of itself offset by one, takes each date difference in days, and divides the total by the number of consecutive pairs to produce the average.
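
Because the consecutive differences telescope, their sum equals the newest date minus the oldest, so the same average can also be obtained from the two endpoints alone. A sketch under the assumption that the list holds at least two dates; the helper name average_gap_days is hypothetical:

```python
from datetime import datetime

def average_gap_days(date_strings, fmt='%m-%d-%Y'):
    """Average gap in days between consecutive dates, using only the endpoints."""
    parsed = [datetime.strptime(s, fmt) for s in date_strings]
    if len(parsed) < 2:
        raise ValueError('need at least two dates to measure a gap')
    # The sum of consecutive gaps equals newest - oldest, whatever the input order.
    span = max(parsed) - min(parsed)
    return span.days / (len(parsed) - 1)

print(average_gap_days(['01-23-2019', '01-19-2019', '01-12-2019']))  # 5.5
```

For the 50 dates in the question, the span is 354 days over 49 gaps, which gives the same 7.22 as above.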

Answer 4

Four things are used here: timedelta arithmetic (datetime), zip, a list comprehension, and mean (statistics). my_list is the list of date strings from the question.

from datetime import datetime
from statistics import mean

deltas = [(datetime.strptime(d1, '%m-%d-%Y') - datetime.strptime(d2, '%m-%d-%Y')).days 
          for d1, d2 in zip(my_list[:-1], my_list[1:])]
avg_days = mean(deltas)
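
To make the pairing step concrete, here is a small sketch that prints the intermediate values. The three-date my_list is a shortened sample, assumed for illustration:

```python
from datetime import datetime
from statistics import mean

# Shortened sample list (newest first), for illustration only.
my_list = ['01-23-2019', '01-19-2019', '01-12-2019']

# zip pairs each date with the one that follows it in the list.
pairs = list(zip(my_list[:-1], my_list[1:]))
print(pairs)  # [('01-23-2019', '01-19-2019'), ('01-19-2019', '01-12-2019')]

# Subtracting two datetimes gives a timedelta; .days is its whole-day count.
deltas = [(datetime.strptime(d1, '%m-%d-%Y') - datetime.strptime(d2, '%m-%d-%Y')).days
          for d1, d2 in pairs]
print(deltas)        # [4, 7]
print(mean(deltas))  # 5.5
```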

4 answers

Answer 1
2 2019-01-25 15:00:41

Answer 2
2 2019-01-25 15:25:41

Answer 3
1 accepted 2019-01-25 15:15:06

Answer 4
1 2019-01-25 15:33:19
