Comparing a 3-tuple to a list of 3-tuples using only the first two parts of the tuple
I have a list of 3-tuples in a Python program that I'm building while reading through a file (so one tuple at a time), with the following setup:
(feature,combination,durationOfTheCombination),
such that if a unique pairing of feature and combination is found, it is added to the list. The list itself holds tuples of the same shape, but durationOfTheCombination is the sum of all durations that share the same (feature,combination) pair. Therefore, when deciding whether a tuple should be added to the list, I need to compare only the first two parts of the tuple, and if a match is found, add the duration to the corresponding list item.
Here's an example for clarity. If the input is
(ABC,123,10);(ABC,123,10);(DEF,123,5);(ABC,123,30);(EFG,456,30)
the output will be (ABC,123,50);(DEF,123,5);(EFG,456,30).
Is there any way to do this comparison?
You can do this with Counter, provided that tuples sharing the same (feature,combination) pair always carry the same duration:
In [42]: from collections import Counter
In [43]: lst = [('ABC',123,10),('ABC',123,10),('DEF',123,5)]
In [44]: [(i[0],i[1],i[2]*j) for i,j in Counter(lst).items()]
Out[44]: [('DEF', 123, 5), ('ABC', 123, 20)]
As the OP pointed out, if the durations differ, use groupby instead:
In [25]: from itertools import groupby
In [26]: lst = [('ABC',123,10),('ABC',123,10),('ABC',123,25),('DEF',123,5)]
In [27]: [tuple(list(n)+[sum([i[2] for i in g])]) for n,g in groupby(sorted(lst,key = lambda x:x[:2]), key = lambda x:x[:2])]
Out[27]: [('ABC', 123, 45), ('DEF', 123, 5)]
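For readability, the same groupby idea can be spelled out as a small function. This is only a sketch: the names sum_durations and records are made up for illustration, and it assumes the whole input is already available as a list, since it has to be sorted before grouping.

```python
from itertools import groupby
from operator import itemgetter


def sum_durations(records):
    """Sum the third field of each tuple, grouped by the first two fields."""
    key = itemgetter(0, 1)  # picks out (feature, combination)
    # groupby only merges adjacent items, so sort by the same key first.
    return [
        (feature, combination, sum(r[2] for r in group))
        for (feature, combination), group in groupby(sorted(records, key=key), key=key)
    ]


# The example input from the question.
records = [('ABC', 123, 10), ('ABC', 123, 10), ('DEF', 123, 5),
           ('ABC', 123, 30), ('EFG', 456, 30)]
print(sum_durations(records))
```

Because of the sort, the result comes back ordered by (feature, combination) rather than in order of first appearance.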
If you don't want to use Counter, you can use a dict instead.
setOf3Tuples = dict()

def add3TupleToSet(a):
    key = a[0:2]
    if key in setOf3Tuples:
        setOf3Tuples[key] += a[2]
    else:
        setOf3Tuples[key] = a[2]

def getRaw3Tuple():
    for k in setOf3Tuples:
        yield k + (setOf3Tuples[k],)

if __name__ == "__main__":
    add3TupleToSet(("ABC",123,10))
    add3TupleToSet(("ABC",123,10))
    add3TupleToSet(("DEF",123,5))
    print([i for i in getRaw3Tuple()])
It seems a dict is better suited than a list here, with the first two fields as the key. To avoid checking each time whether the key is already present, you can use a defaultdict.
from collections import defaultdict
d = defaultdict(int)
for t in your_list:
    d[t[:2]] += t[-1]
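To get back to the list of 3-tuples the question asks for, the dict can be flattened at the end. A minimal sketch, run on the example input from the question; the function name accumulate and the variable records are hypothetical:

```python
from collections import defaultdict


def accumulate(records):
    """Sum durations per (feature, combination) pair and return 3-tuples."""
    totals = defaultdict(int)  # missing keys start at 0
    for feature, combination, duration in records:
        totals[(feature, combination)] += duration
    # Rebuild (feature, combination, total) tuples from the dict entries.
    return [key + (total,) for key, total in totals.items()]


# The example input from the question.
records = [('ABC', 123, 10), ('ABC', 123, 10), ('DEF', 123, 5),
           ('ABC', 123, 30), ('EFG', 456, 30)]
result = accumulate(records)
print(result)
```

Since records is consumed one item at a time, it could just as well be a generator that yields tuples while the file is being read. On Python 3.7 and later, the result keeps the order in which each pair was first seen.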
Assuming your input is collected in a list as below, you can use pandas groupby to accomplish this quickly:
import pandas as pd
input = [('ABC',123,10),('ABC',123,10),('DEF',123,5),('ABC',123,30),('EFG',456,30)]
output = [tuple(x) for x in pd.DataFrame(input).groupby([0,1])[2].sum().reset_index().values]