How to identify the odd item in a list of items using Python

My goal is to identify the odd element in the list below.

list_1=['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']

The odd item is taskb2, as the other four items are under taska.

They all have equal length, so discriminating with the len function will not work. Any ideas? Thanks.

If you simply want to find the item that does not start with 'taska', you could use the following list comprehension:

>>> list_1=['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']
>>> print([item for item in list_1 if not item.startswith('taska')])
['taskb2']

Another option is to use filter + lambda (in Python 3, filter returns an iterator, so wrap it in list):

>>> list(filter(lambda item: not item.startswith('taska'), list_1))
['taskb2']

This looks like an easy problem that an alphabetical sort solves, since 'taskb2' sorts after every 'taska' item:

print(sorted(list_1)[-1])

Don't want to sort? Try an O(n)-time, O(1)-space solution:

print(max(list_1))
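Note that both variants rely on the odd item happening to sort last. As a quick check, here is a sketch with a second list where the odd item sorts first instead; list_2 is a made-up example, not from the question:

```python
list_1 = ['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']
# Hypothetical counterexample: the odd item 'taska3' sorts first, not last
list_2 = ['taskb1', 'taskb2', 'taska3', 'taskb4', 'taskb7']

print(max(list_1))  # 'taskb2', the odd item
print(max(list_2))  # 'taskb7', which is NOT the odd item
```

So max (or sorting) is only safe when you already know the odd item compares greater than the rest.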

If you know what the basic structure of the items will be, then it's easy.
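For instance, assuming every item starts with a fixed-length group prefix (the first five characters here, which is an assumption about the data), a minimal sketch is to count the prefixes and return the items whose prefix is rarest. The helper name odd_by_prefix and its prefix_len parameter are invented for illustration:

```python
from collections import Counter

def odd_by_prefix(items, prefix_len=5):
    """Return the items whose first `prefix_len` characters are least common."""
    if not items:
        return []
    # Count how many items share each prefix, e.g. {'taska': 4, 'taskb': 1}
    counts = Counter(item[:prefix_len] for item in items)
    rarest = min(counts.values())
    # Keep every item whose prefix occurs the minimum number of times
    return [item for item in items if counts[item[:prefix_len]] == rarest]

list_1 = ['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']
print(odd_by_prefix(list_1))  # ['taskb2']
```

Unlike the startswith check, this does not need the majority prefix spelled out, only its length.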

If you don't know the structure of your items a priori, one approach is to score each item by its similarity to all the others. Drawing on another question about the standard library module difflib:

import difflib
import itertools

list_1=['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']

# Initialize a dict, keyed on the items, with 0.0 score to start
score = dict.fromkeys(list_1, 0.0)

# Arrange the items in pairs with each other
for w1, w2 in itertools.combinations(list_1, 2):
    # Compute the similarity ratio once per pair - see difflib docs
    ratio = difflib.SequenceMatcher(a=w1, b=w2).ratio()
    # Increment the "match" score for each item in the pair
    score[w1] += ratio
    score[w2] += ratio

# Print the results

>>> score
{'taska1': 3.166666666666667,
 'taska2': 3.3333333333333335,
 'taska3': 3.166666666666667,
 'taska7': 3.1666666666666665,
 'taskb2': 2.833333333333333}

It turns out that taskb2 has the lowest score!
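To pick the odd item programmatically rather than reading it off the printed dict, the same scoring can be wrapped in a small function. This is only a sketch: the name least_similar is made up here, and it assumes the list is non-empty and has no duplicate items (duplicates would share one dict key).

```python
import difflib
import itertools

def least_similar(items):
    """Return the item with the lowest total similarity to all the others."""
    score = dict.fromkeys(items, 0.0)
    for w1, w2 in itertools.combinations(items, 2):
        ratio = difflib.SequenceMatcher(a=w1, b=w2).ratio()
        score[w1] += ratio
        score[w2] += ratio
    # The key with the smallest accumulated score is the least similar item
    return min(score, key=score.get)

list_1 = ['taska1', 'taska2', 'taska3', 'taskb2', 'taska7']
print(least_similar(list_1))  # taskb2
```

Comparing every pair costs O(n^2) calls to SequenceMatcher, so this suits short lists better than large ones.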
