Find the next/previous string after a Python regex match

Question

I need to find the names of people mentioned in a text, and I need to filter all the names using a list of keywords, for example:

key_words = ["magistrate","officer","attorney","applicant","defendant","plaintiff"...]

For example, in the text:

INPUT: "The magistrate DANIEL SMITH blalblablal, who was in a meeting with the officer MARCO ANTONIO 
and WILL SMITH, defendant of the judgment filed by the plaintiff MARIA FREEMAN "

OUTPUT:
(magistrate, DANIEL SMITH)
(officer, MARCO ANTONIO)
(defendant, WILL SMITH)
(plaintiff, MARIA FREEMAN)

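So I have two problems: first, the name is sometimes mentioned before the keyword, and second, I don't know how to build a regex that uses all the keywords and filters at the same time.

I have tried a few things: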

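import re

# text holds the INPUT string shown above
line = re.split("magistrate", text)[1]
name = []
for key in line.split():
    # collect the consecutive all-uppercase words right after the keyword
    if key.isupper():
        name.append(key)
    else:
        break
" ".join(name)
OUTPUT: 'DANIEL SMITH'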

Thanks!

Answer 1

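Do you have to use a regex? If not, here is my answer, since this can still be done without one.

You can use the split() method to split the line on spaces. It returns a list; assign that list to a variable and iterate over it. Try this: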

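key_words = ["magistrate","officer","attorney","applicant","defendant","plaintiff"]

line = "The magistrate DANIEL SMITH blalblablal, who was in a meeting with the officer MARCO ANTONIO and WILL SMITH, defendant of the judgment filed by the plaintiff MARIA FREEMAN"
line_words = line.split(" ")

# enumerate() gives the position directly; list.index() would always
# return the first occurrence of a repeated keyword
for i, word in enumerate(line_words):
    if word in key_words:
        # print the two words after the keyword; slicing avoids an
        # IndexError when the keyword is near the end of the line.
        # Note: for "defendant" this prints "of the", because the name
        # comes before the keyword in this sentence.
        print(word, *line_words[i + 1:i + 3])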
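The loop above always takes exactly two words after the keyword, so it cannot handle names of a different length or a name that comes before the keyword (as in "WILL SMITH, defendant"). A minimal sketch of one way to extend the same split-based idea, reusing the all-uppercase test from the question: it collects the run of uppercase words after the keyword and, if there is none, the run just before it. The helper names `upper_run` and `names_by_keyword` are hypothetical, and the sketch assumes names are always written in capitals and that two names are never directly adjacent.

```python
import string

KEY_WORDS = {"magistrate", "officer", "attorney", "applicant", "defendant", "plaintiff"}


def upper_run(tokens):
    """Collect leading tokens that are all-uppercase once punctuation is stripped."""
    run = []
    for token in tokens:
        cleaned = token.strip(string.punctuation)
        if not cleaned.isupper():
            break
        run.append(cleaned)
    return run


def names_by_keyword(line):
    """Return (keyword, name) pairs, looking after the keyword first, then before it."""
    words = line.split()
    pairs = []
    for i, word in enumerate(words):
        key = word.strip(string.punctuation)
        if key not in KEY_WORDS:
            continue
        # Prefer the uppercase run that follows the keyword ...
        name = upper_run(words[i + 1:])
        if not name:
            # ... otherwise walk backwards from the keyword and restore the order.
            name = upper_run(reversed(words[:i]))[::-1]
        if name:
            pairs.append((key, " ".join(name)))
    return pairs
```

Calling `names_by_keyword(line)` on the sentence above should yield one pair per keyword found, including `('defendant', 'WILL SMITH')`.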

Answer 2

I suggest using re.findall with two capturing groups, as follows:

import re
key_words = ["magistrate","officer","attorney","applicant","defendant","plaintiff"]
line = "The magistrate DANIEL SMITH blalblablal, who was in a meeting with the officer MARCO ANTONIO and WILL SMITH, defendant of the judgment filed by the plaintiff MARIA FREEMAN "
found = re.findall('('+'|'.join(key_words)+')'+r'\s+([ A-Z]+[A-Z])',line)
print(found)

Output:

[('magistrate', 'DANIEL SMITH'), ('officer', 'MARCO ANTONIO'), ('plaintiff', 'MARIA FREEMAN')]

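Explanation: using more than one capturing group (denoted by ( and )) in the pattern passed to re.findall produces a list of tuples (2-tuples in this case). The first group is built simply by joining the keywords with |, which works like OR in a pattern. Next comes one or more whitespace characters ( \s+ ); this part is outside any group, so it does not appear in the result. Finally, the second group consists of one or more spaces or ASCII uppercase letters ( [ A-Z]+ ) followed by a single ASCII uppercase letter ( [A-Z] ), so it does not capture a trailing space.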
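The pattern above only finds a name that follows its keyword, which is why ('defendant', 'WILL SMITH') is missing from the output. One possible extension, sketched here rather than taken from the answer, is to add a second alternative for "NAME, keyword" and read the results through named groups with re.finditer. The group names (`key_first`, `after`, `before`, `key_last`) and the function `find_names` are hypothetical, and the sketch assumes names are runs of ASCII uppercase words separated by single spaces.

```python
import re

key_words = ["magistrate", "officer", "attorney", "applicant", "defendant", "plaintiff"]

# re.escape is a precaution in case a keyword ever contains regex metacharacters.
KEYS = "|".join(map(re.escape, key_words))
# One or more uppercase words separated by single spaces.
NAME = r"[A-Z]+(?: [A-Z]+)*"

PATTERN = re.compile(
    rf"\b(?P<key_first>{KEYS})\b\s+(?P<after>{NAME})\b"    # keyword, then name
    rf"|\b(?P<before>{NAME}),?\s+(?P<key_last>{KEYS})\b"  # name, optional comma, then keyword
)


def find_names(text):
    """Return (keyword, name) pairs in the order they occur in text."""
    pairs = []
    for m in PATTERN.finditer(text):
        if m.group("after") is not None:
            pairs.append((m.group("key_first"), m.group("after")))
        else:
            pairs.append((m.group("key_last"), m.group("before")))
    return pairs
```

Because matches do not overlap, a name that sits between two keywords is assigned to the keyword on its left; whether that is the right choice depends on the texts being processed.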

Answer 1
0 2020-08-13 13:33:25

Answer 2
0 2020-08-13 13:40:48
