
How to iterate through a string and find the duplicated values

I can't seem to find the right way to iterate through subjects2 and pick out the duplicated strings. Below is my attempt:

nosubjects = []
subjects2 = ["hi","hi","bi","ki","si","bi","li"]
for i in subjects2:
  if subjects2.count(i)==2:
    nosubjects.extend(i)
    print(nosubjects)

But when I print it out it appears like this:

['hi','hi']
['h', 'i', 'h', 'i','b', 'i']
['hi', 'i', 'h', 'i', 'b', 'i', 'b', 'i']

Please help, thanks!

Use collections.Counter to get the count of each element, and keep only the elements whose count exceeds 1:

from collections import Counter

subjects2 = ['hi', 'hi', 'bi', 'ki', 'si', 'bi', 'li']
nosubjects = [x for x, count in Counter(subjects2).items() if count > 1]

print(nosubjects)
# ['hi', 'bi']
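
To see what the comprehension above is filtering, it can help to look at the Counter itself. The following is only a small sketch; the names counts and duplicates are introduced here for illustration and are not part of the original answer.

```python
from collections import Counter

subjects2 = ['hi', 'hi', 'bi', 'ki', 'si', 'bi', 'li']

# Counter maps each distinct string to the number of times it occurs.
counts = Counter(subjects2)
print(counts['hi'])  # 2
print(counts['ki'])  # 1

# Keep only the keys seen more than once. Counter is a dict subclass,
# so keys come out in first-seen order on Python 3.7+.
duplicates = [word for word, n in counts.items() if n > 1]
print(duplicates)  # ['hi', 'bi']
```

Because the list is counted once up front, each distinct value is examined only once in the comprehension.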

Problems in your code:

  • You call count() for every element of the list, so each duplicated value is checked (and added) once per occurrence.
  • You print nosubjects inside the if block, so it is printed every time a match is found instead of once at the end.
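
One more detail explains why single characters show up in the printed output: list.extend() iterates over its argument, and iterating over a string yields its characters, whereas list.append() adds the argument as a single element. A minimal illustration (the lists letters and words are made up for this example):

```python
letters = []
letters.extend("hi")   # extend iterates over its argument; a str yields characters
print(letters)         # ['h', 'i']

words = []
words.append("hi")     # append adds the argument as one element
print(words)           # ['hi']
```

Note also that testing count(i) == 2 would skip a string that occurs three or more times, so a comparison such as > 1 is probably closer to what is wanted.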

Use a set. First get the unique elements of the list, then check whether each of them occurs more than once in the original list.

nosubjects = []
subjects2 = ['hi','hi','bi','ki','si','bi','li']

for i in set(subjects2):
  if subjects2.count(i)>=2:
    nosubjects.append(i)

print(nosubjects)

Using list comprehension:

subjects2 = ['hi','hi','bi','ki','si','bi','li']

nosubjects = [i for i in set(subjects2) if subjects2.count(i) >= 2]
print(nosubjects)
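
Two caveats about the set-based versions: a set has no guaranteed iteration order, so the result may come out as ['bi', 'hi'] or ['hi', 'bi'], and count() rescans the whole list for every distinct element. If order or list size matters, a single pass with two sets is one possible alternative. This is only a sketch, and find_duplicates is a hypothetical helper name, not something from the answers above.

```python
def find_duplicates(items):
    """Return each value that occurs more than once, in order of first repeat."""
    seen = set()       # values encountered at least once
    reported = set()   # values already added to the result
    duplicates = []
    for item in items:
        if item in seen and item not in reported:
            duplicates.append(item)
            reported.add(item)
        seen.add(item)
    return duplicates


subjects2 = ['hi', 'hi', 'bi', 'ki', 'si', 'bi', 'li']
print(find_duplicates(subjects2))  # ['hi', 'bi']
```

Each element is looked at once, and the result follows the order in which the repeats appear in the original list.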
