I have a list of strings, and I want to extract the word that follows a specific keyword in each string.
When I use a lambda function with filter to iterate over the list, I get the whole strings back instead of just the word after the keyword:
import re
s = ["The job ABS is scheduled by Bob.", "The job BFG is scheduled by Alice."]
user = filter(lambda i: re.search(r'(?<=The job )(\w+)', i), s)
print(*user)
output: The job ABS is scheduled by Bob. The job BFG is scheduled by Alice.
But when I try the same code on a single string, it gives the correct output:
import re
s = "The job ABS is scheduled by Bob."
user = re.search(r'(?<=The job )(\w+)', s)
print(user.group())
output: ABS
How can I get output like (ABS, BFG) from the first code snippet?
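The first snippet returns whole strings because filter only uses the lambda's result as a truth test: it keeps or drops each original item and never yields the match object itself. A minimal sketch of the difference, reusing the question's data (the names pattern, kept and words are illustrative, not from the question):

```python
import re

s = ["The job ABS is scheduled by Bob.", "The job BFG is scheduled by Alice."]
pattern = r'(?<=The job )(\w+)'

# filter() treats the match object as True/False and yields the ORIGINAL
# items, so the matched text is discarded.
kept = list(filter(lambda i: re.search(pattern, i), s))
print(kept)   # the two input strings, unchanged

# map() yields whatever the lambda returns, here the matched word.
# Note: this raises AttributeError if a string has no match.
words = list(map(lambda i: re.search(pattern, i).group(), s))
print(words)  # ['ABS', 'BFG']
```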
You can use
import re
s = ["The job ABS is scheduled by Bob.", "The job BFG is scheduled by Alice."]
rx = re.compile(r'(?<=The job )\w+')
user = tuple(map(lambda x: x.group() if x else "", map(rx.search, s)))  # "" when there is no match
print(user)
Alternatively, if there can be any amount of whitespace, use
rx = re.compile(r'The\s+job\s+(\w+)')
user = tuple(map(lambda x: x.group(1) if x else "", map(rx.search, s)))  # "" when there is no match
Output:
('ABS', 'BFG')
Here, map(rx.search, s) returns an iterator over the match objects (or None where a string has no match), and the outer map(...) call gets the value of the group (either the whole match with .group() or the Group 1 value with .group(1)), or returns an empty string if there was no match.
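The same logic can also be sketched with a small helper and a generator expression, which makes the no-match case explicit. The helper name first_job and the third input string are assumptions added for illustration; they are not part of the question.

```python
import re

rx = re.compile(r'The\s+job\s+(\w+)')

def first_job(text):
    """Return the word after 'The job', or '' if the pattern is absent."""
    m = rx.search(text)  # a match object, or None
    return m.group(1) if m else ""

s = [
    "The job ABS is scheduled by Bob.",
    "The job BFG is scheduled by Alice.",
    "Nothing is scheduled.",  # hypothetical input with no match
]
user = tuple(first_job(x) for x in s)
print(user)  # ('ABS', 'BFG', '')
```

Checking the match object before calling .group(1) matters because calling a method on None raises AttributeError.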
You can simplify this:
import re
arr = ["The job ABS is scheduled by Bob.", "The job BFG is scheduled by Alice."]
user = [re.findall(r'(?<=The job )\w+', s) for s in arr]  # one list of matches per string
print(user)
print(tuple(user))
Output:
[['ABS'], ['BFG']]
(['ABS'], ['BFG'])
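Note that re.findall returns a list per string, so the result above is nested. If the flat form ('ABS', 'BFG') from the question is wanted, one possible sketch flattens the per-string lists with itertools.chain.from_iterable (the use of itertools here is an addition, not part of the answer above):

```python
import re
from itertools import chain

arr = ["The job ABS is scheduled by Bob.", "The job BFG is scheduled by Alice."]
rx = re.compile(r'(?<=The job )\w+')

# findall gives a list of matches for each string; chain them into one sequence.
user = tuple(chain.from_iterable(rx.findall(s) for s in arr))
print(user)  # ('ABS', 'BFG')
```

Strings without a match contribute an empty list, so they simply disappear from the result instead of producing an empty string.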