How to extract the words next to a string using regular expressions in Python

Question

9.DATUM DER ERTEILUNG DER ZULASSUNG/VERLÄNGERUNG DER ZULASSUNG
10.STAND DER INFORMATION
Juni 2019
Rezeptpflicht/Apothekenpflicht
Rezept- und apothekenpflichtig, wiederholte Abgabe verboten.

This is my text, and I am trying to extract the date that always comes right after STAND DER INFORMATION. In the example text above, that is Juni 2019.

I have tried the string split method, but that doesn't work for me because I need only the dates.

Answer 1

If STAND DER INFORMATION always appears on the line directly before the date, as in your example, you can use the following.

Code

import re
re.findall(r'(?<=STAND DER INFORMATION\s)\D{3,4}\s\d{4}', s, re.MULTILINE)

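Explanation

# s is the text string to search
# (?<=STAND DER INFORMATION\s) - lookbehind: the match must be preceded by
#   STAND DER INFORMATION and one whitespace character (here, the newline)
# \D{3,4} - three or four non-digit characters (the month, e.g. Juni)
# \s - one whitespace character between month and year
# \d{4} - four digits (the year)
# re.MULTILINE - only changes how ^ and $ behave; this pattern has no anchors,
#   so the flag is optional here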
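
The lookbehind works here because its content has a fixed width, which Python's re module requires. A capture group is a possible alternative that avoids that restriction. The sketch below assumes the date still follows the heading directly; extract_stand_date is a hypothetical helper name, not part of the original answer.

```python
import re

def extract_stand_date(text):
    """Return the date that follows STAND DER INFORMATION, or None if absent."""
    # Capture group instead of lookbehind; \s* tolerates extra
    # whitespace (spaces or blank lines) after the heading.
    match = re.search(r'STAND DER INFORMATION\s*(\D{3,4}\s\d{4})', text)
    return match.group(1) if match else None
```

Returning None when the heading is missing lets the caller tell "no date found" apart from an empty result.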

Test

s = """9.DATUM DER ERTEILUNG DER ZULASSUNG/VERLÄNGERUNG DER ZULASSUNG
10.STAND DER INFORMATION
Juni 2019
Rezeptpflicht/Apothekenpflicht
Rezept- und apothekenpflichtig, wiederholte Abgabe verboten."""
dates = re.findall(r'(?<=STAND DER INFORMATION\s)\D{3,4}\s\d{4}', s, re.MULTILINE)
print(dates)

Output

['Juni 2019']
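
Note that \D{3,4} matches only three or four characters, so longer month names such as Januar or Dezember would not be found. A hedged variant is sketched below, assuming the month is a run of letters of any length followed by a four-digit year; DATE_AFTER_HEADING and find_dates are illustrative names.

```python
import re

# Assumption: the month is written as letters only (Mai, Juni, März, Dezember, ...).
# [^\W\d_]+ matches one or more letters, including umlauts, in Python 3 strings.
DATE_AFTER_HEADING = re.compile(r'(?<=STAND DER INFORMATION\s)[^\W\d_]+\s\d{4}')

def find_dates(text):
    """Return every 'Month YYYY' string that directly follows the heading."""
    return DATE_AFTER_HEADING.findall(text)

print(find_dates("10.STAND DER INFORMATION\nDezember 2020\nRezeptpflicht"))
```

Compiling the pattern once keeps the call sites short if many documents are processed.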

Solution 1
1 Accepted 2020-03-09 22:31:53
