import re

strings_to_search = ['abc', 'def', 'fgh hello']
complete_list = ['abc abc dsss abc', 'defgj', 'abc fgh hello xabd', 'fgh helloijj']
for col_key in strings_to_search:
    print(list(map(lambda x: re.findall(col_key, x), complete_list)))
Running the program above produces the output shown below. 'abc' matches 4 times: 3 times in the string at index 0 and once in the string at index 2 of complete_list.
'def' matches 'defgj', but I want it to match only when it stands alone, as in 'def abc' or 'def' (that is, delimited by whitespace or by the start or end of the string).
Similarly, 'fgh hello' matches both 'abc fgh hello xabd' and 'fgh helloijj'. I want it to match only 'abc fgh hello xabd', where it is delimited by whitespace. Can anyone suggest how to achieve this in Python?
[['abc', 'abc', 'abc'], [], ['abc'], []]
[[], ['def'], [], []]
[[], [], ['fgh hello'], ['fgh hello']]
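Use word boundaries (\b) in your regular expression. \b matches at the position between a word character (a letter, digit, or underscore) and either a non-word character or the start or end of the string, so 'def' no longer matches inside 'defgj'.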
import re
strings_to_search = ['abc', 'def', 'fgh hello']
complete_list = ['abc abc dsss abc', 'defgj', 'abc fgh hello xabd', 'fgh helloijj']
for col_key in strings_to_search:
    # Wrap the search string in word boundaries so it only matches whole words
    word = r'\b{}\b'.format(col_key)
    print(list(map(lambda x: re.findall(word, x), complete_list)))
Output:
[['abc', 'abc', 'abc'], [], ['abc'], []]
[[], [], [], []]
[[], [], ['fgh hello'], []]
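One caveat: \b treats any non-word character as a boundary, so r'\bdef\b' also matches the 'def' in 'def-abc'. If you need the stricter rule from the question (delimited only by whitespace or by the start or end of the string), a lookaround-based pattern is one possible approach. The following is a minimal sketch, not part of the original answer; the helper name find_whitespace_delimited is made up for illustration, and re.escape is added on the assumption that the search strings should be treated as literal text rather than as regex syntax.

```python
import re

def find_whitespace_delimited(needle, haystack):
    """Hypothetical helper: find `needle` only where it is surrounded by
    whitespace or by the start/end of `haystack`."""
    # (?<!\S) : the previous character is not non-whitespace
    #           (so it is whitespace, or we are at the start of the string)
    # (?!\S)  : the next character is not non-whitespace
    #           (so it is whitespace, or we are at the end of the string)
    # re.escape makes the needle match literally, even if it contains
    # characters such as '.' or '+'.
    pattern = r'(?<!\S){}(?!\S)'.format(re.escape(needle))
    return re.findall(pattern, haystack)

strings_to_search = ['abc', 'def', 'fgh hello']
complete_list = ['abc abc dsss abc', 'defgj', 'abc fgh hello xabd', 'fgh helloijj']

for col_key in strings_to_search:
    print([find_whitespace_delimited(col_key, s) for s in complete_list])
```

For the lists in the question this gives the same results as the \b version. The two differ only on input such as 'def-abc', where \b finds 'def' and the lookaround version finds nothing.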