
How to sum up a list of timestamps by the speaker name

I am working on a project. I have extracted data from a list and now have 3 lists:
List 1 - list of speaker names

['<M1>', '<M1>', '<M1>', '<M1>', '<M1>', '<M2>', '<M2>', '<M2>', '<M1>', '<M1>', '<M2>', '<M1>', '<M2>', '<M2>', '<M2>', '<M2>', '<M2>']

List 2 - list of timestamps at which each speech segment begins

['[00:00:00.000]', '[00:00:08.010]', '[00:00:16.890]', '[00:00:26.210]', '[00:00:39.980]', '[00:00:48.100]', '[00:00:56.770]', '[00:01:08.010]', '[00:01:21.120]', '[00:01:46.130]', '[00:01:47.180]', '[00:01:49.390]', '[00:01:50.670]', '[00:02:02.320]', '[00:02:16.010]', '[00:02:21.110]', '[00:02:27.610]']

List 3 - list of timestamps at which each speech segment ends

['[00:00:08.010]', '[00:00:16.290]', '[00:00:26.210]', '[00:00:39.980]', '[00:00:48.100]', '[00:00:56.770]', '[00:01:08.010]', '[00:01:20.250]', '[00:01:33.850]', '[00:01:47.150]', '[00:01:49.370]', '[00:01:50.140]', '[00:02:01.350]', '[00:02:16.010]', '[00:02:20.150]', '[00:02:27.610]', '[00:02:39.040]'] 

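What I need to do is this: whenever a speaker speaks several times in a row (for example, the first 5 elements of the lists), I need to change the end of the first segment from [00:00:08.010] to [00:00:48.100] and get rid of all the entries in between (turning 5 entries for one speaker into 1 entry), and then do the same for every speaker throughout the lists. If a speaker speaks only once, that entry needs to stay unchanged. Can someone help me find a way to do this in Python? Thanks!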

speakerOrder    = ['<M1>', '<M1>', '<M1>', '<M1>', '<M1>', '<M2>', '<M2>', '<M2>', '<M1>', '<M1>', '<M2>', '<M1>', '<M2>', '<M2>', '<M2>', '<M2>', '<M2>']
speakerBegin    = ['[00:00:00.000]', '[00:00:08.010]', '[00:00:16.890]', '[00:00:26.210]', '[00:00:39.980]', '[00:00:48.100]', '[00:00:56.770]', '[00:01:08.010]', '[00:01:21.120]', '[00:01:46.130]', '[00:01:47.180]', '[00:01:49.390]', '[00:01:50.670]', '[00:02:02.320]', '[00:02:16.010]', '[00:02:21.110]', '[00:02:27.610]']
speakerEnd      = ['[00:00:08.010]', '[00:00:16.290]', '[00:00:26.210]', '[00:00:39.980]', '[00:00:48.100]', '[00:00:56.770]', '[00:01:08.010]', '[00:01:20.250]', '[00:01:33.850]', '[00:01:47.150]', '[00:01:49.370]', '[00:01:50.140]', '[00:02:01.350]', '[00:02:16.010]', '[00:02:20.150]', '[00:02:27.610]', '[00:02:39.040]']


newSpeakerOrder = []
newSpeakerBegin = []
newSpeakerEnd   = []

currentSpeaker = None
for speakerIndex, speaker in enumerate(speakerOrder):
    if speaker != currentSpeaker:
        # A new speaker starts: close the previous speaker's turn, if any
        if currentSpeaker is not None:
            newSpeakerEnd.append(speakerEnd[speakerIndex - 1])
        # Record the new speaker and the time their turn began
        newSpeakerOrder.append(speaker)
        newSpeakerBegin.append(speakerBegin[speakerIndex])
        currentSpeaker = speaker

# Close the final turn (skipped when the input lists are empty)
if speakerEnd:
    newSpeakerEnd.append(speakerEnd[-1])

print(newSpeakerOrder)
print(newSpeakerBegin)
print(newSpeakerEnd)

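This is the solution I came up with. It is not perfect, but it should solve your problem. Just make sure beforehand that the original lists all have the same length.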
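The length check mentioned above can be made explicit. The following is a minimal sketch rather than part of the original answer: `merge_turns` is a hypothetical helper name, and it assumes the three lists are parallel (entry i of each list describes the same segment). It applies the same idea as the loop above but raises `ValueError` when the lengths differ, and it returns the merged turns as one list instead of three.

```python
def merge_turns(speakers, begins, ends):
    """Collapse consecutive entries by the same speaker into one turn.

    Hypothetical helper for illustration; not from the original answer.
    """
    if not (len(speakers) == len(begins) == len(ends)):
        raise ValueError("speakers, begins and ends must have the same length")

    merged = []  # each item is [speaker, begin, end]
    for speaker, begin, end in zip(speakers, begins, ends):
        if merged and merged[-1][0] == speaker:
            # Same speaker continues: push the current turn's end time forward
            merged[-1][2] = end
        else:
            # Speaker changed (or first entry): start a new turn
            merged.append([speaker, begin, end])
    return merged


# Shortened sample data in the same format as the question
turns = merge_turns(
    ['<M1>', '<M1>', '<M2>', '<M1>'],
    ['[00:00:00.000]', '[00:00:08.010]', '[00:00:16.890]', '[00:00:26.210]'],
    ['[00:00:08.010]', '[00:00:16.290]', '[00:00:26.210]', '[00:00:39.980]'],
)
for speaker, begin, end in turns:
    print(speaker, begin, end)
```

Keeping each turn as a single `[speaker, begin, end]` item means the three values cannot drift out of step, which is the failure mode the length check guards against.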

You can use the groupby function from itertools. Try this:

from itertools import groupby

l1 = ['<M1>', '<M1>', '<M1>', '<M1>', '<M1>', '<M2>', '<M2>', '<M2>', '<M1>', '<M1>', '<M2>', '<M1>', '<M2>', '<M2>', '<M2>', '<M2>', '<M2>']
l2 = ['[00:00:00.000]', '[00:00:08.010]', '[00:00:16.890]', '[00:00:26.210]', '[00:00:39.980]', '[00:00:48.100]', '[00:00:56.770]', '[00:01:08.010]', '[00:01:21.120]', '[00:01:46.130]', '[00:01:47.180]', '[00:01:49.390]', '[00:01:50.670]', '[00:02:02.320]', '[00:02:16.010]', '[00:02:21.110]', '[00:02:27.610]']
l3 = ['[00:00:08.010]', '[00:00:16.290]', '[00:00:26.210]', '[00:00:39.980]', '[00:00:48.100]', '[00:00:56.770]', '[00:01:08.010]', '[00:01:20.250]', '[00:01:33.850]', '[00:01:47.150]', '[00:01:49.370]', '[00:01:50.140]', '[00:02:01.350]', '[00:02:16.010]', '[00:02:20.150]', '[00:02:27.610]', '[00:02:39.040]'] 
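start_index = 0
for _, group in groupby(l1):
    # Index of the last consecutive entry by the same speaker
    end_index = start_index + len(list(group)) - 1
    start_time = l2[start_index]  # start of the first entry in the run
    end_time = l3[end_index]      # end of the last entry in the run
    start_index = end_index + 1
    print(start_time)
    print(end_time)
    print("============")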

Output:

[00:00:00.000]
[00:00:48.100]
============
[00:00:48.100]
[00:01:20.250]
============
[00:01:21.120]
[00:01:47.150]
============
[00:01:47.180]
[00:01:49.370]
============
[00:01:49.390]
[00:01:50.140]
============
[00:01:50.670]
[00:02:39.040]
============
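The output above drops the speaker names, and the title asks for times summed per speaker. Here is a hedged sketch of one way to get both, assuming every timestamp has the exact form `[HH:MM:SS.mmm]`. The names `parse_timestamp` and `total_speaking_time` are hypothetical, not from the original answer. Note that each merged turn is measured from its first start to its last end, so short gaps between consecutive segments of the same speaker are counted as speaking time.

```python
from collections import defaultdict
from datetime import timedelta
from itertools import groupby


def parse_timestamp(stamp):
    """Convert '[HH:MM:SS.mmm]' into a timedelta (assumes exactly this format)."""
    hours, minutes, rest = stamp.strip('[]').split(':')
    seconds, millis = rest.split('.')
    return timedelta(hours=int(hours), minutes=int(minutes),
                     seconds=int(seconds), milliseconds=int(millis))


def total_speaking_time(speakers, begins, ends):
    """Sum the length of each merged turn per speaker (hypothetical helper)."""
    totals = defaultdict(timedelta)  # missing speakers start at timedelta(0)
    rows = zip(speakers, begins, ends)
    # groupby yields one group per run of consecutive rows with the same speaker
    for speaker, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        turn_begin = parse_timestamp(group[0][1])   # begin of first row in run
        turn_end = parse_timestamp(group[-1][2])    # end of last row in run
        totals[speaker] += turn_end - turn_begin
    return dict(totals)


# Shortened sample data in the same format as the question
speakers = ['<M1>', '<M1>', '<M2>', '<M1>']
begins = ['[00:00:00.000]', '[00:00:08.010]', '[00:00:16.890]', '[00:00:26.210]']
ends = ['[00:00:08.010]', '[00:00:16.290]', '[00:00:26.210]', '[00:00:39.980]']

totals = total_speaking_time(speakers, begins, ends)
for speaker, total in totals.items():
    print(speaker, total.total_seconds(), "seconds")
```

Grouping over the zipped rows instead of over the speaker list alone removes the manual `start_index` bookkeeping, since each group already carries its own begin and end values.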
