
How to find duplicates and their indexes in a list?

I have a list:

l=['a','b','c','c','a','d']

The output should return all the duplicate elements and their indices in the list.

Output:

out = {'a': ['0', '4'], 'c': ['2', '3']}

I have tried:

def nextDuplicates(c):
    dupl_c = dict()
    sorted_ind_c = sorted(range(len(c)), key=lambda x: c[x])
    for i in range(len(c) - 1):
        if c[sorted_ind_c[i]] == c[sorted_ind_c[i+1]]:
            dupl_c[ sorted_ind_c[i] ] = sorted_ind_c[i+1]
    return dupl_c

A dict comprehension coupled with a list comprehension would work (even for more than 2 occurrences):

l = ["a", "b", "c", "c", "a", "d"]
out = {el: [i for i, x in enumerate(l) if x == el] for el in l if l.count(el) > 1}

I saw in your expected output that the indexes are strings. I don't understand why, but if you really want them as strings, replace i for i, x with str(i) for i, x.

More on list comprehensions

Try this:

l=['a','b','c','c','a','d']
o = {}
for i in range(len(l)):
    if (l[i] in o):
        o[l[i]].append(i)
    else:
        o[l[i]] = [i]
print({key:val for key, val in o.items() if len(val) > 1})

Use collections.defaultdict and iterate over a set for a faster lookup of counts greater than 1:

from collections import defaultdict

l = ['a','b','c','c','a','d']

result = defaultdict(list)

for x in set(l):
    if l.count(x) > 1:
        result[x].extend([i for i, y in enumerate(l) if y == x])

print(result)
# defaultdict(<class 'list'>, {'a': [0, 4], 'c': [2, 3]})
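Note that l.count(x) still rescans the whole list once per distinct value. As an alternative, here is a minimal single-pass sketch, assuming the elements are hashable; the function name duplicate_indexes is made up for illustration and does not come from the answers here:

```python
from collections import defaultdict


def duplicate_indexes(items):
    """Map each value that occurs more than once to the list of its indexes.

    Hypothetical helper name; assumes the elements are hashable.
    """
    positions = defaultdict(list)
    # One pass: record every index at which each value appears.
    for index, value in enumerate(items):
        positions[value].append(index)
    # Keep only the values that were seen at least twice.
    return {value: idxs for value, idxs in positions.items() if len(idxs) > 1}


l = ['a', 'b', 'c', 'c', 'a', 'd']
print(duplicate_indexes(l))
# {'a': [0, 4], 'c': [2, 3]}
```

Because the indexes are collected during a single enumerate pass, the list is never rescanned, and the final filter only looks at the length of each collected list.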

You can use this dict comprehension:

l = ["a", "b", "c", "c", "a", "d"]
out = {ele: [str(i) for i, x in enumerate(l) if x == ele] for ele in set(l) if l.count(ele) > 1}

# Output : {'c': ['2', '3'], 'a': ['0', '4']}

Iterating over set(l) rather than over the list itself gives a performance improvement, especially if there are many duplicates, because each distinct element is processed only once.

In your expected output you wanted a list of str as the value. If you need int, you can use i instead of str(i).
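If both forms are needed, the conversion could be made optional. A small sketch of that idea, where the function name duplicates_with_indexes and the as_str parameter are hypothetical names chosen for illustration:

```python
def duplicates_with_indexes(items, as_str=False):
    """Return {value: indexes} for values occurring more than once.

    as_str is a hypothetical flag: when True, indexes are emitted as
    strings (as in the question's expected output) instead of ints.
    """
    convert = str if as_str else int
    return {
        ele: [convert(i) for i, x in enumerate(items) if x == ele]
        for ele in set(items)  # each distinct element is handled once
        if items.count(ele) > 1
    }


l = ["a", "b", "c", "c", "a", "d"]
print(duplicates_with_indexes(l))               # int indexes
print(duplicates_with_indexes(l, as_str=True))  # str indexes
```

The key order in the printed dictionaries follows the iteration order of the set, so it may differ between runs; the contents are the same either way.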

l=['a','b','c','c','a','d']

result = {}

for element in l:
    if element not in result:
        indexes = [i for i, x in enumerate(l) if x == element]

        if len(indexes) > 1:
            result[element] = indexes

print(result)

Iterate through the list and check whether the element already exists in the dictionary. If it doesn't, get all the indexes for that element and, if there is more than one, add the element to the dictionary.
