
How to speed up this calculation?

I have a dataset that looks like this:

| userId           | cahtroomID     | msg_index_in_chat | time_difference_between_msg |
| 1234567891222222 | sdfbsjkfdsdklf | 1                 | 0 hours 0 minutes           |
| 9876543112252141 | sdfbsjkfdsdklf | 2                 | 0 hours 4 minutes           |
| 2374623982398939 | quweioqewiieio | 1                 | 0 hours 0 minutes           |
| 1234567891222222 | quweioqewiieio | 2                 | 0 hours 4 minutes           |
| 2374623982398939 | quweioqewiieio | 3                 | 1 hour 0 minutes            |
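For anyone who wants to reproduce this, here is a minimal sketch of the rows above as a pandas DataFrame. It assumes the time column is stored as a timedelta dtype and that `userId` is an integer; neither is stated explicitly, so adjust to the real data.

```python
import pandas as pd

# Sample rows copied from the table above.
# Assumption: time_difference_between_msg is a timedelta column.
df = pd.DataFrame({
    "userId": [
        1234567891222222,
        9876543112252141,
        2374623982398939,
        1234567891222222,
        2374623982398939,
    ],
    "cahtroomID": [
        "sdfbsjkfdsdklf",
        "sdfbsjkfdsdklf",
        "quweioqewiieio",
        "quweioqewiieio",
        "quweioqewiieio",
    ],
    "msg_index_in_chat": [1, 2, 1, 2, 3],
    # Minutes since the previous message in the same room.
    "time_difference_between_msg": pd.to_timedelta([0, 4, 0, 4, 60], unit="min"),
})
```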

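I need to compute the average time between messages in each room and assign that value to every row of the room. To do this, I wrote the following: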

from datetime import timedelta

df['avg_time'] = timedelta(0)
for room in set(df.cahtroomID):
    table = df[df.cahtroomID == room]
    times = table.time_difference_between_msg.tolist()
    if len(times) > 1:
        # Skip the first message of the room: its time difference is always 0.
        avg_time = sum(times[1:], timedelta(0)) / len(times[1:])
    else:
        avg_time = timedelta(0)
    # Write the room's average back to every row of that room.
    df.loc[df.cahtroomID == room, 'avg_time'] = avg_time

The problem is that this code takes a long time to run. Can you suggest a faster way to do this calculation?

Thanks!

Use GroupBy.transform with a custom lambda function:

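import pandas as pd

# Per room: mean of every time difference except the first; a single-message room gets 0.
f = lambda times: sum(times.iloc[1:], pd.Timedelta(0))/len(times.iloc[1:]) if len(times) > 1 else pd.Timedelta(0)
df['avg_time'] = df.groupby('cahtroomID')['time_difference_between_msg'].transform(f)
print(df)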
             userId      cahtroomID  msg_index_in_chat  \
0  1234567891222222  sdfbsjkfdsdklf                  1   
1  9876543112252141  sdfbsjkfdsdklf                  2   
2  2374623982398939  quweioqewiieio                  1   
3  1234567891222222  quweioqewiieio                  2   
4  2374623982398939  quweioqewiieio                  3   

  time_difference_between_msg avg_time  
0                    00:00:00 00:04:00  
1                    00:04:00 00:04:00  
2                    00:00:00 00:32:00  
3                    00:04:00 00:32:00  
4                    01:00:00 00:32:00  
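If the lambda is still too slow when there are many rooms, one possible variant avoids calling a Python function per group and relies on the built-in "mean" aggregation instead. This is only a sketch under assumptions not confirmed above: the first message of a room is the row with msg_index_in_chat equal to 1, and averaging in float seconds gives enough precision. The `secs` and `mean_secs` names are illustrative.

```python
import pandas as pd

# Reduced copy of the sample data (only the columns this sketch needs).
df = pd.DataFrame({
    "cahtroomID": ["sdfbsjkfdsdklf"] * 2 + ["quweioqewiieio"] * 3,
    "msg_index_in_chat": [1, 2, 1, 2, 3],
    "time_difference_between_msg": pd.to_timedelta([0, 4, 0, 4, 60], unit="min"),
})

# Work in float seconds so the built-in groupby mean can be used.
secs = df["time_difference_between_msg"].dt.total_seconds()

# Assumption: the first message of a room has msg_index_in_chat == 1.
# Replace its value with NaN so that mean() skips it.
secs = secs.where(df["msg_index_in_chat"] > 1)

# Per-room mean broadcast back to every row; a room with a single
# message has only NaN values, so its mean is NaN and becomes 0.
mean_secs = secs.groupby(df["cahtroomID"]).transform("mean").fillna(0)

df["avg_time"] = pd.to_timedelta(mean_secs, unit="s")
print(df)
```

Whether this is actually faster depends on the number of rooms and rows, so it is worth timing both versions on the real data.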

