
Find matching elements based on given string

Suppose I have this list, kept short for simplicity:

lst = ['db', 'ca', 'db', 'ae', 'ec', 'sa']    

and the string 'dbaeec', which always has an even length (usually 6 or 8).

I split the string into chunks of length 2, then look for the first chunk, 'db', in the list, with the condition that the element right after it must be 'ae' and the one after that 'ec'.

In the list above, 'db' appears at index 0 and at index 2.

The first occurrence is not followed by 'ae', so it should be ignored. The second one is followed by 'ae' and then 'ec', so the output should be 2.

This is what I tried so far,

for i, n in enumerate(lst): 
  if n == 'db': 
      if lst[i+1] == 'ae':
          if lst[i+2] == 'ec':
              print(i)
              break

but surely there must be a better, more Pythonic way?

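Take len(teststring)//2 elements at a time (3 here), join them into a string, and compare the result with teststring.

lst = ['db', 'ca', 'db', 'ae', 'ec', 'sa']
teststring = "dbaeec"
n = len(teststring) // 2  # number of list elements the string spans
# + 1 so the window ending at the last element is also checked
for i in range(len(lst) - n + 1):
    if "".join(lst[i:i + n]) == teststring:
        print(i, i + n)  # start index and exclusive end index
        break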

Output:

2 5
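
A variation on the same idea, as a sketch only: instead of joining each slice back into a string, split the test string into two-character chunks once and compare list slices directly. The helper name `find_chunks` and its `size` parameter are made up for illustration.

```python
def find_chunks(lst, teststring, size=2):
    """Return the first index where the chunks of teststring appear
    consecutively in lst, or None if they never do."""
    # Split the string into fixed-size chunks: 'dbaeec' -> ['db', 'ae', 'ec'].
    chunks = [teststring[i:i + size] for i in range(0, len(teststring), size)]
    n = len(chunks)
    # + 1 so that a match ending on the last element is also checked.
    for i in range(len(lst) - n + 1):
        if lst[i:i + n] == chunks:
            return i
    return None


lst = ['db', 'ca', 'db', 'ae', 'ec', 'sa']
print(find_chunks(lst, 'dbaeec'))  # 2
```

Comparing lists rather than joined strings means a match can only start on an element boundary.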

Here is a regex-based solution, written as a reusable function.

import re

test_lst = ['db', 'ca', 'db', 'ae', 'ec', 'sa']
test_pat = 'dbaeec'

def find_match_index(needle, haystack):
    # Escape the needle so regex metacharacters are matched literally.
    m = re.search(re.escape(needle), ''.join(haystack))
    if m is None:
        return None
    # Each element is 2 characters long, so halve the string offset.
    # Integer division keeps the result usable as a list index.
    return m.start() // 2

def test(pat, lst):
    match = find_match_index(pat, lst)
    if match is None:
        print("No match was found (function returned None)")
    else:
        print(f"Found match at list index {match}")

print("Test with test data")
test(test_pat, test_lst)
print("Test with non-matchable pattern")
test('x', test_lst)

#output
Test with test data
Found match at list index 2
Test with non-matchable pattern
No match was found (function returned None)

This relies on Python being dynamically typed: the function returns None when there is no match, and the caller must test for that. None is needed because index zero is a valid match, so zero cannot double as a "not found" flag.

Coming from a C background, I am not a fan of mixing return types this way. Python does not forbid it, but it carries a maintenance risk. If I had more time and this were a bigger project, I would write a "Result" class so that each variable keeps a single type.
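
A "Result" type along those lines might look like the sketch below. `MatchResult` and `find_match` are hypothetical names that do not come from the answer above, and `str.find` stands in for the regex to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchResult:
    """Hypothetical result type: the same type whether found or not."""
    found: bool
    index: int = -1  # only meaningful when found is True


def find_match(needle, haystack):
    # Plain str.find is used here instead of a regex (an assumption
    # made for brevity, not the original implementation).
    pos = ''.join(haystack).find(needle)
    if pos < 0:
        return MatchResult(found=False)
    # Assumes every element is exactly two characters long.
    return MatchResult(found=True, index=pos // 2)


result = find_match('dbaeec', ['db', 'ca', 'db', 'ae', 'ec', 'sa'])
if result.found:
    print(f"Found match at list index {result.index}")
else:
    print("No match was found")
```

The caller checks `result.found` instead of comparing against None, so a match at index 0 is never confused with "not found".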

You can do something like this. It checks whether the sequence dbaeec appears in the list at consecutive positions:

lst = ['db', 'ca', 'db', 'ae', 'ec', 'sa']    
s = 'dbaeec'

if s in ''.join(lst):
    print('yes')
else:
    print('no')

If you also want to find the index position in the list, you can do:

lst = ['db', 'ca', 'db', 'ae', 'ec', 'sa']    
s = 'dbaeec'

# Join the list into one string, then search that string for s.
i = ''.join(lst).find(s)

if i >= 0:
    # Each element is 2 characters long, so halve the string offset.
    print('Yes, {} is in the list starting index position {}'.format(s, i // 2))
else:
    print('{} is not in the list'.format(s))

The output will be:

Yes, dbaeec is in the list starting index position 2

Note that the code above only finds the first occurrence in the list. Finding every occurrence means repeating the search from just past each hit.
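
One way to collect every occurrence is sketched below. The function name `find_all` and the `size` parameter are illustrative assumptions, and the sample list is extended so that the pattern appears twice.

```python
def find_all(lst, s, size=2):
    """Return every list index at which s starts, assuming all
    elements of lst are exactly `size` characters long."""
    joined = ''.join(lst)
    hits = []
    pos = joined.find(s)
    while pos >= 0:
        # Keep only matches that start on an element boundary.
        if pos % size == 0:
            hits.append(pos // size)
        # Advance by one character so overlapping matches are not skipped.
        pos = joined.find(s, pos + 1)
    return hits


lst = ['db', 'ae', 'ec', 'ca', 'db', 'ae', 'ec', 'sa']
print(find_all(lst, 'dbaeec'))  # [0, 4]
```

An empty list as the return value plays the role of "not found", so no separate flag is needed.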

Try this:

y = "dbaeec"
lst = ['db', 'ca', 'db', 'ae', 'ec', 'sa']

x = ''.join(lst)    
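# Step by 2 so only offsets where a list element starts are tested.
for i in range(0, len(x), 2):
    if x[i:i+len(y)] == y:
        print(i // 2)  # integer division: i/2 would print 2.0 in Python 3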

output:

2
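
Stepping through the joined string two characters at a time is what keeps a match aligned to element boundaries. The sketch below uses made-up data, in which no element equals 'db', to show how a plain substring search differs.

```python
lst = ['ad', 'ba', 'ee', 'cx']   # made-up data: no element equals 'db'
y = 'dbaeec'
x = ''.join(lst)                 # 'adbaeecx'

# Plain substring search reports a hit that straddles element boundaries.
unaligned = x.find(y)

# Stepping by 2 only tests offsets where an element starts.
aligned = [i // 2 for i in range(0, len(x), 2) if x[i:i + len(y)] == y]

print(unaligned)  # 1
print(aligned)    # []
```

The odd offset 1 does not correspond to any list index, which is why the stepped loop reports nothing here.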


 