Pythonic way to search a list using keywords
I am attempting to search for text between two keywords. My solution so far is to use split() to change the string into a list. It works, but I was wondering if there is a more efficient/elegant way to achieve this. Below is my code:
words = "Your meeting with Dr Green at 8pm"
list_words = words.split()
before = "with"
after = "at"
title = list_words[list_words.index(before) + 1]
name = list_words[list_words.index(after) - 1]
if title != name:
    var = title + " " + name
    print(var)
else:
    print(title)
Results:
>>> Dr Green
I'd prefer a solution that is configurable, as the text I'm searching for can be dynamic, so Dr Green could be replaced by a name with 4 words or 1 word.
Sounds like a job for regular expressions. This uses the pattern (?:with)(.*?)(?:at) to look for 'with' and 'at', and lazily match anything in between.
import re
words = 'Your meeting with Dr Green at 8pm'
start = 'with'
end = 'at'
pattern = r'(?:{})(.*?)(?:{})'.format(start, end)
match = re.search(pattern, words).group(1).strip()
print(match)
Output:
Dr Green
Note that the regex does actually match the spaces on either side of Dr Green, so I've included a simple .strip() on the captured group to remove the leading and trailing whitespace.
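One caveat with the pattern above: 'at' also matches inside a longer word, and re.search returns None when nothing is found. Here is a minimal sketch of a more defensive variant, assuming the keywords are plain words. The helper name text_between is hypothetical, not part of the original answer.

```python
import re

def text_between(text, start, end):
    """Return the text between the whole words `start` and `end`, or None."""
    # re.escape guards against keywords containing regex metacharacters;
    # \b stops 'at' from matching inside a word such as 'Matthew'.
    pattern = r'\b{}\b\s*(.*?)\s*\b{}\b'.format(re.escape(start), re.escape(end))
    match = re.search(pattern, text)
    # Return None instead of raising AttributeError when nothing matches.
    return match.group(1) if match else None

print(text_between('Your meeting with Dr Green at 8pm', 'with', 'at'))
print(text_between('Your meeting with Dr Matthew Green at 8pm', 'with', 'at'))
print(text_between('No keywords here', 'with', 'at'))
```

The \s* on either side of the group absorbs the surrounding spaces, so no separate .strip() is needed in this version.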
Using RE
import re
words = "Your meeting with Dr Green at 8pm"
before = "Dr"
after = "at"
result = re.search('%s(.*)%s' % (before, after), words).group(1)
print(before + result)
Output:
Dr Green
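Worth noting that (.*) is greedy, so it runs up to the last occurrence of the closing keyword, while (.*?) stops at the first. A small sketch of the difference, using a made-up sentence in which 'at' appears twice (the sentence is an assumption for illustration only):

```python
import re

# Hypothetical sentence with two occurrences of the closing keyword 'at'.
words = "Your meeting with Dr Green at 8pm at the clinic"
before = "with"
after = "at"

# Greedy: captures as much as possible, up to the LAST 'at'.
greedy = re.search('%s(.*)%s' % (before, after), words).group(1).strip()
# Lazy: captures as little as possible, up to the FIRST 'at'.
lazy = re.search('%s(.*?)%s' % (before, after), words).group(1).strip()

print(greedy)  # Dr Green at 8pm
print(lazy)    # Dr Green
```

For the original sentence both forms give the same result, since 'at' occurs only once after the opening keyword.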
How about slicing the string at start and end, then just splitting it?
words = "Your meeting with Dr Jebediah Caruseum Green at 8pm"
start = "with"
end = "at"
list_of_stuff = words[words.index(start):words.index(end)].replace(start, '', 1).split()
list_of_stuff
['Dr', 'Jebediah', 'Caruseum', 'Green']
You can do anything you like with the list. For example, I would parse for the title like this:
list_of_titles = ['Dr', 'Sr', 'GrandMaster', 'Pleb']
try:
    title = [i for i in list_of_stuff if i in list_of_titles][0]
except IndexError:
    # title not found, skipping
    title = ''
name = ' '.join([x for x in list_of_stuff if x != title])
print(title, name)
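Since words.index(end) searches substrings, it would also find an 'at' that sits inside a name or before the start keyword. A possible alternative is to split first and slice the list of tokens. This is only a sketch, assuming both keywords appear as separate whitespace-delimited words; the function name words_between is hypothetical.

```python
def words_between(sentence, start, end):
    """Return the list of words between the tokens `start` and `end`."""
    tokens = sentence.split()
    try:
        i = tokens.index(start)
        # Search for `end` only after `start`, comparing whole tokens.
        j = tokens.index(end, i + 1)
    except ValueError:
        # One of the keywords is missing: return an empty list.
        return []
    return tokens[i + 1:j]

print(words_between("Your meeting with Dr Jebediah Caruseum Green at 8pm", "with", "at"))
print(words_between("Your meeting with Green at 8pm", "with", "at"))
print(words_between("Your meeting with Dr Matthew Green at 8pm", "with", "at"))
```

The resulting list can be passed to the title parsing shown above, and it works the same whether the name has one word or four.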