Extracting doctor names from a given text with a Python regular expression (re.findall)

Question

I have

text = 'he is Dr. alex dams. He puts up in Washington town since 1990. He has been a very good friend of Dr. kane Andeas and his family'

I want to get the following output using re.findall:

['Dr. alex dams', 'Dr. kane Andeas']

I am using the following code, but I only get ['Dr.'] in the output.

re.findall("Dr.[a-z\s]+",text)

Answer 1

If the doctors' names always follow the same format, you can search for them with \w+ for a word and \s for a space. Note that the dot after "Dr" is escaped as \. so that it matches a literal period rather than any character.

(Dr\.\s\w+\s\w+)

Code

import re

text = 'he is Dr. alex dams. He puts up in Washington town since 1990. He has been a very good friend of Dr. kane Andeas and his family'

re.findall(r'(Dr\.\s\w+\s\w+)', text)

#['Dr. alex dams', 'Dr. kane Andeas']
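The "same format" condition matters, because the pattern always consumes exactly two words after "Dr.". The sketch below wraps the pattern in a helper to show both the expected case and a case where the assumption fails; the name `find_doctors` and the second sample sentence are made up for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer: "Dr.", one whitespace character, then exactly two words.
DOCTOR_PATTERN = re.compile(r'Dr\.\s\w+\s\w+')

def find_doctors(text):
    """Hypothetical helper: return every 'Dr. <word> <word>' match in text."""
    return DOCTOR_PATTERN.findall(text)

sample = 'he is Dr. alex dams. He has been a friend of Dr. kane Andeas and his family'
print(find_doctors(sample))
# ['Dr. alex dams', 'Dr. kane Andeas']

# A single-word name breaks the two-word assumption: the next word is captured too.
print(find_doctors('Dr. House arrived late'))
# ['Dr. House arrived']
```

If names in your data can be one word or more than two words long, the pattern needs a different rule for where a name ends, for example a capitalization check.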

Answer 2

While PacketLoss's answer works, it will not catch hyphenated names (such as Pearl-Hopson), because \w does not match a hyphen.

I would go for:

import re

text = 'he is Dr. alex dams. He puts up in Washington town since 1990. He has been a very good friend of Dr. kane Andeas and his family'

re.findall(r'(Dr\.\s\S+\s\S+\b)', text)
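To make the difference between the two answers concrete, here is a small comparison on a hyphenated surname. The sample sentence and the variable names are assumptions for illustration only. The \S+ version matches any run of non-whitespace characters, and the trailing \b makes the match back off from closing punctuation such as a final period.

```python
import re

# Illustrative sentence with a hyphenated surname (not from the original question).
hyphen_text = 'She met Dr. anna Pearl-Hopson. They spoke briefly.'

# \w+ stops at the hyphen, so the surname is cut short.
word_pattern = r'Dr\.\s\w+\s\w+'
# \S+ accepts the hyphen; \b drops the trailing period after backtracking.
nonspace_pattern = r'Dr\.\s\S+\s\S+\b'

word_matches = re.findall(word_pattern, hyphen_text)
nonspace_matches = re.findall(nonspace_pattern, hyphen_text)

print(word_matches)      # ['Dr. anna Pearl']
print(nonspace_matches)  # ['Dr. anna Pearl-Hopson']
```

The trade-off is that \S+ is permissive: it also accepts digits and punctuation inside a name, so it may over-match on noisier text.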
