Find next word after the matching keyword in a list of strings using regex in python
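I have a list of strings, and I want to extract the next word after a specific keyword in each string.
When I iterate over the list with a lambda function, I get the whole string rather than just the word after the keyword: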
import re
s = ["The job ABS is scheduled by Bob.", "The job BFG is scheduled by Alice."]
user = filter(lambda i:re.search('(?<=The job )(\w+)',i),s)
print(*user)
output: The job ABS is scheduled by Bob. The job BFG is scheduled by Alice.
However, when I try the same code on a single string, it gives the correct output:
import re
s = "The job ABS is scheduled by Bob."
user = re.search('(?<=The job )(\w+)',s)
print(user.group())
output: ABS
How can I get output like (ABS, BFG) from the first code snippet?
You can use
import re
s = ["The job ABS is scheduled by Bob.", "The job BFG is scheduled by Alice."]
rx = re.compile(r'(?<=The job )\w+')
user = tuple(map(lambda x: x.group() if x else "", map(rx.search, s)))
print(user)
See the Python demo.
Alternatively, if there can be any amount of whitespace, use
rx = re.compile(r'The\s+job\s+(\w+)')
user = tuple(map(lambda x: x.group(1) if x else "", map(rx.search, s)))
Output:
('ABS', 'BFG')
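Here, map(rx.search, s) returns an iterator over match objects, with None for any string that does not match. The outer map(lambda x: ..., ...) then pulls the matched text out of each one: the whole match with .group(), or the Group 1 value with .group(1). It is meant to yield an empty string when there is no match, so the None check has to happen before .group() is called (for example x.group() if x else ""), because calling .group() on None raises AttributeError.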
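The behaviour described above can be checked with a short sketch. It is not part of the original answer: the helper name next_word is hypothetical, and the third, non-matching string is an assumption added only to exercise the empty-string fallback.

```python
import re

# Pattern from the answer: capture the word after "The job",
# allowing any amount of whitespace between the words.
rx = re.compile(r'The\s+job\s+(\w+)')

def next_word(text):
    """Return the word after 'The job', or '' when the pattern does not match."""
    m = rx.search(text)  # a match object, or None
    return m.group(1) if m else ""

# The last string is an illustrative assumption with no match.
s = [
    "The job ABS is scheduled by Bob.",
    "The  job BFG is scheduled by Alice.",
    "Nothing is scheduled today.",
]

user = tuple(next_word(text) for text in s)
print(user)  # ('ABS', 'BFG', '')
```

A generator expression is used here instead of nested map calls purely for readability; both forms visit each string once.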
You can simplify this:
import re
arr = ["The job ABS is scheduled by Bob.", "The job BFG is scheduled by Alice."]
user = [re.findall(r'(?<=The job )\w+', s) for s in arr]
print(user)
print(tuple(user))
Output:
[['ABS'], ['BFG']]
(['ABS'], ['BFG'])
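Note that re.findall returns a list for every string, which is why the result above is nested. If a flat result such as ('ABS', 'BFG') is wanted, one possible variation (a sketch, not part of the original answer) is to flatten the per-string lists with a nested generator expression:

```python
import re

arr = ["The job ABS is scheduled by Bob.", "The job BFG is scheduled by Alice."]

# re.findall yields one list per string, so iterate over each list as well
# to collect the individual words into a single flat tuple.
user = tuple(word for s in arr for word in re.findall(r'(?<=The job )\w+', s))
print(user)  # ('ABS', 'BFG')
```

With this approach a string without a match contributes nothing to the result, and a string containing the keyword several times contributes one word per occurrence.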