
Extract the doctor name from given text using regex (re.findall) in Python

I have:

text = 'he is Dr. alex dams. He puts up in Washington town since 1990. He has been a very good friend of Dr. kane Andeas and his family'

I want to get the following output using re.findall:

['Dr. alex dams', 'Dr. kane Andeas']

I am using the following code, but I only get ['Dr.'] in the output:

re.findall("Dr.[a-z\s]+",text)

If the doctor names always follow the same format, you can search for them with \w+ for a word and \s for a whitespace character. Note that the dot after "Dr" is escaped as \. so that it matches a literal period rather than any character.

(Dr\.\s\w+\s\w+)

Code


import re

text = 'he is Dr. alex dams. He puts up in Washington town since 1990. He has been a very good friend of Dr. kane Andeas and his family'

re.findall(r'(Dr\.\s\w+\s\w+)', text)

# ['Dr. alex dams', 'Dr. kane Andeas']
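To make it easier to see what each piece of the pattern does, here is a minimal sketch of the same regex written with re.VERBOSE. The name DOCTOR_PATTERN is just an illustrative choice, and the sketch assumes every name is exactly "Dr." followed by two words.

```python
import re

# Same pattern as in the answer, spread over several lines with re.VERBOSE
# so that each part can carry a comment. DOCTOR_PATTERN is a hypothetical name.
DOCTOR_PATTERN = re.compile(
    r"""
    Dr\.    # the literal title: "Dr" followed by an escaped dot
    \s      # one whitespace character
    \w+     # first name: one or more word characters
    \s      # one whitespace character
    \w+     # last name: one or more word characters
    """,
    re.VERBOSE,
)

text = ('he is Dr. alex dams. He puts up in Washington town since 1990. '
        'He has been a very good friend of Dr. kane Andeas and his family')

# With no capturing group in the pattern, findall returns the full matches.
names = DOCTOR_PATTERN.findall(text)
print(names)
```

Compiling the pattern once is mostly a readability choice here; calling re.findall with the one-line pattern behaves the same way.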

While PacketLoss's answer works, it will not catch hyphenated names (such as Pearl-Hopson), because \w does not match a hyphen.

I would go for:

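import re

text = 'he is Dr. alex dams. He puts up in Washington town since 1990. He has been a very good friend of Dr. kane Andeas and his family'

# \S+ matches any run of non-whitespace characters; the trailing \b drops punctuation such as the final period
re.findall(r'(Dr\.\s\S+\s\S+\b)', text)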
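As a quick way to see the difference between the two patterns, here is a small comparison on a made-up sentence. The sample text and the variable names are assumptions for illustration only; they are not from the original question.

```python
import re

# Hypothetical sample text containing a hyphenated surname.
sample = 'She was referred to Dr. mary Pearl-Hopson, who works with Dr. alex dams.'

# \w+ stops at the hyphen, so the surname is cut short.
word_based = re.findall(r'Dr\.\s\w+\s\w+', sample)

# \S+ takes any non-whitespace run; \b then backs off trailing punctuation.
non_space_based = re.findall(r'Dr\.\s\S+\s\S+\b', sample)

print(word_based)       # ['Dr. mary Pearl', 'Dr. alex dams']
print(non_space_based)  # ['Dr. mary Pearl-Hopson', 'Dr. alex dams']
```

One trade-off to keep in mind: \S accepts any non-whitespace character, so this pattern will also pick up other punctuation inside a name (for example an apostrophe or a stray bracket). Whether that is desirable depends on the input text.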
