I have a data set that looks like this:
| userId           | cahtroomID     | msg_index_in_chat | time_difference_between_msg |
| 1234567891222222 | sdfbsjkfdsdklf | 1                 | 0 hours 0 minutes           |
| 9876543112252141 | sdfbsjkfdsdklf | 2                 | 0 hours 4 minutes           |
| 2374623982398939 | quweioqewiieio | 1                 | 0 hours 0 minutes           |
| 1234567891222222 | quweioqewiieio | 2                 | 0 hours 4 minutes           |
| 2374623982398939 | quweioqewiieio | 3                 | 1 hours 0 minutes           |
I need to calculate the average time between messages in every room and assign that value to every row of the room. To do so, I wrote this:
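from datetime import timedelta

df['avg_time'] = 0
for room in set(df.cahtroomID):
    # rows that belong to the current room
    table = df[['msg_index_in_chat', 'time_difference_between_msg']][df.cahtroomID == room]
    if len(table) > 1:
        times = table.time_difference_between_msg.tolist()
        # skip the first message of the room, which has no preceding message
        avg_time = sum(times[1:], timedelta(0)) / len(times[1:])
    elif len(table) == 1:
        avg_time = timedelta(hours=0)
    # write the room average to every row of that room
    df.loc[df.cahtroomID == room, 'avg_time'] = avg_time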
The problem is that this code takes a long time to run. Can you suggest a faster way to do this calculation?
Thank you!
Use GroupBy.transform with a custom lambda function:
f = lambda times: sum(times.iloc[1:], pd.Timedelta(0))/len(times.iloc[1:]) if len(times) > 1 else pd.Timedelta(0)
df['avg_time'] = df.groupby('cahtroomID')['time_difference_between_msg'].transform(f)
print (df)
             userId      cahtroomID  msg_index_in_chat  \
0  1234567891222222  sdfbsjkfdsdklf                  1
1  9876543112252141  sdfbsjkfdsdklf                  2
2  2374623982398939  quweioqewiieio                  1
3  1234567891222222  quweioqewiieio                  2
4  2374623982398939  quweioqewiieio                  3

   time_difference_between_msg  avg_time
0                     00:00:00  00:04:00
1                     00:04:00  00:04:00
2                     00:00:00  00:32:00
3                     00:04:00  00:32:00
4                     01:00:00  00:32:00
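If the lambda is still too slow on a large frame, one possible alternative is to avoid the Python-level function entirely. The sketch below is not part of the original answer. It assumes that msg_index_in_chat is 1 for the first message of every room (as in the sample data), so those rows can be masked out instead of being dropped with iloc[1:]. It also converts the timedeltas to seconds before averaging, which is an assumption made to keep the group mean on plain floats. The sample frame is rebuilt by hand here only to make the example self-contained.

```python
import pandas as pd

# Sample data rebuilt from the question (values typed in by hand).
df = pd.DataFrame({
    'userId': [1234567891222222, 9876543112252141, 2374623982398939,
               1234567891222222, 2374623982398939],
    'cahtroomID': ['sdfbsjkfdsdklf', 'sdfbsjkfdsdklf',
                   'quweioqewiieio', 'quweioqewiieio', 'quweioqewiieio'],
    'msg_index_in_chat': [1, 2, 1, 2, 3],
    'time_difference_between_msg': pd.to_timedelta([0, 4, 0, 4, 60], unit='m'),
})

# Work in seconds so the group mean runs on ordinary floats.
secs = df['time_difference_between_msg'].dt.total_seconds()

# Assumption: the first message of a room has msg_index_in_chat == 1.
# Masking it to NaN excludes it from the mean, like times.iloc[1:] does.
secs = secs.where(df['msg_index_in_chat'] > 1)

# The built-in 'mean' skips NaN; rooms with a single message yield NaN,
# which is replaced by 0 to match the pd.Timedelta(0) fallback.
mean_secs = secs.groupby(df['cahtroomID']).transform('mean').fillna(0)

df['avg_time'] = pd.to_timedelta(mean_secs, unit='s')
print(df[['cahtroomID', 'msg_index_in_chat', 'avg_time']])
```

On the sample data this gives the same averages as the lambda version (4 minutes for the first room, 32 minutes for the second). Whether it is actually faster depends on the number of rooms and rows, so it is worth timing both on the real data.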