
Match similar item in list

I have two lists of hostnames:

foo=['some-router-1', 'some-switch-1', 'some-switch-2']

bar=['some-router-1-lo','some-switch-1','some-switch-2-mgmt','some-switch-3-mgmt']

I would expect output to be like...

out=['some-switch-3-mgmt']

I want to find the entries in bar that are not in foo. However, some names in bar have "-mgmt" or some other string appended that doesn't occur in foo. The length and number of dashes per list item vary greatly, so I'm not sure how successful a regex would be. I'm new to programming, so please provide some explanation if possible.

You could do this with a list comprehension and all:

>>> out = [i for i in bar if all(j not in i for j in foo)]    
>>> out
['some-switch-3-mgmt']

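Meaning: you select every element i in bar if, for every element j in foo, j is not contained in i. Note that "contained" here is a substring test, so j may appear anywhere inside i, not only at the start.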
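Since the question asks for some explanation, here is a sketch of the same logic written as explicit loops. The function name unmatched_loop and its parameter names are made up for illustration; the behaviour is intended to match the one-liner above.

```python
foo = ['some-router-1', 'some-switch-1', 'some-switch-2']
bar = ['some-router-1-lo', 'some-switch-1', 'some-switch-2-mgmt', 'some-switch-3-mgmt']

def unmatched_loop(candidates, known):
    """Return the candidates that contain none of the known names as a substring."""
    result = []
    for i in candidates:
        contains_known = False
        for j in known:
            if j in i:  # substring test: j may sit anywhere inside i
                contains_known = True
                break
        if not contains_known:
            result.append(i)
    return result

out = unmatched_loop(bar, foo)
print(out)  # ['some-switch-3-mgmt']
```

The inner loop is what all(j not in i for j in foo) does in a single expression: it stops as soon as one known name is found inside the candidate.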

You can also do it with filter:

>>> list(filter(lambda x: not any(x.startswith(f) for f in foo), bar))
['some-switch-3-mgmt']

Here startswith checks whether an element of bar starts with any element of foo; elements that do are dropped. In Python 3, filter returns an iterator, so it is wrapped in list() to show the result.

You can use startswith() to see if a string starts with another string. So something like:

out = [bar_string for bar_string in bar if not bar_string.startswith(tuple(foo))]
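One caveat with a plain prefix test: 'some-switch-1' is also a prefix of a name such as 'some-switch-10-mgmt', so that host would be treated as known. A possible stricter variant is sketched below, assuming suffixes are always attached with a dash (an assumption about the naming scheme, not something the question states). The helper is_known, the sep parameter, and the extra hostname 'some-switch-10-mgmt' are hypothetical.

```python
foo = ['some-router-1', 'some-switch-1', 'some-switch-2']
# Same data as the question, plus one hypothetical host to show the edge case.
bar = ['some-router-1-lo', 'some-switch-1', 'some-switch-2-mgmt',
       'some-switch-3-mgmt', 'some-switch-10-mgmt']

def is_known(name, known, sep='-'):
    """True if name equals a known host, or is a known host followed by sep."""
    return any(name == k or name.startswith(k + sep) for k in known)

# Plain prefix test: 'some-switch-10-mgmt' is dropped because it starts with 'some-switch-1'.
loose = [b for b in bar if not b.startswith(tuple(foo))]

# Stricter test: the known name must be followed by the separator (or match exactly).
strict = [b for b in bar if not is_known(b, foo)]

print(loose)   # ['some-switch-3-mgmt']
print(strict)  # ['some-switch-3-mgmt', 'some-switch-10-mgmt']
```

Whether the stricter test is the right one depends on how the hostnames are actually built.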

There are some problems with the solutions provided by @Jim and @bbkglb when elements are repeated in bar: the repeated entries also show up in the output. The results of those solutions should be converted to sets. I tested the solutions and their running times:

foo=['some-router-1', 'some-switch-1', 'some-switch-2']*1000
bar=['some-router-1-lo','some-switch-1','some-switch-2-mgmt','some-switch-3-mgmt']*10000

Using filter with lambda:

%timeit set(filter(lambda x: x if not any(x.startswith(f) for f in foo) else None, bar))
1 loop, best of 3: 7.65 s per loop

Using all:

%timeit set([i for i in bar if all(j not in i for j in foo)])
1 loop, best of 3: 7.97 s per loop

Using any:

%timeit set(b for b in bar if not any(b.startswith(f) for f in foo))
1 loop, best of 3: 7.97 s per loop
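A set removes the duplicates but loses the original order of bar. If the order matters, one possible sketch (not timed here, and assuming Python 3.7+ where dicts keep insertion order) is to deduplicate with dict.fromkeys and to collapse the repeated entries of foo before building the prefix tuple. The names prefixes and unique_out are illustrative.

```python
foo = ['some-router-1', 'some-switch-1', 'some-switch-2'] * 1000
bar = ['some-router-1-lo', 'some-switch-1', 'some-switch-2-mgmt', 'some-switch-3-mgmt'] * 10000

# Collapse repeated names in foo so each prefix is checked only once per element of bar.
prefixes = tuple(set(foo))

# dict.fromkeys keeps the first occurrence of each key, in order of appearance.
unique_out = list(dict.fromkeys(b for b in bar if not b.startswith(prefixes)))

print(unique_out)  # ['some-switch-3-mgmt']
```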
