Finding the right regular expression in Python to match a pattern and extract a substring

Question

I have a text that looks like this:

36] Smarandache F. (Editor), Proceedings of the First International Conference on Neutrosophics, Univ. of New Mexico, Gallup Campus, NM, USA, 1-3 Dec. 2001, Xiquan, Phoenix, 2002

I want to extract:

Proceedings of the First International Conference on Neutrosophics

I tried a regular expression pattern like this:

conference = re.search(",(.*)conference(.*),", str(r.lower()))

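I only get this output: Proceedings of the First International

My text will vary, but it will always contain a word like conference.

My question is how to write a pattern that finds the word conference in the text and extracts the substring from the first comma before the word conference to the first comma after it.

, xxxxxxxxxxxxxxxxxx conference xxxxxxxxxxxxxxxxxx,

Any help would be great.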

Answer 1

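You can use a negated character class, [^,], to match any character except a comma, and a single capture group around the part that contains Conference. Because [^,]* cannot cross a comma, the match stays between the two commas nearest to the word.

You can match Conference with an uppercase C to get the result, or use re.IGNORECASE to make the pattern case-insensitive.

If you lowercase the string with r.lower() (and the pattern matches a lowercase c, for example via re.IGNORECASE), the output will instead be:

proceedings of the first international conference on neutrosophics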

,\s*([^,]*\bConference\b[^,]*),


Example code:

import re
r = "36] Smarandache F. (Editor), Proceedings of the First International Conference on Neutrosophics, Univ. of New Mexico, Gallup Campus, NM, USA, 1-3 Dec. 2001, Xiquan, Phoenix, 2002"

conference = re.search(r",\s*([^,]*\bConference\b[^,]*),", r)
if conference:
    print(conference.group(1))

Output

Proceedings of the First International Conference on Neutrosophics
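For comparison, here is a minimal sketch that runs the greedy pattern from the question next to a case-insensitive version of the pattern above. The names text, CONFERENCE_RE and extract_conference are illustrative only and do not come from the original answer; the sketch also assumes the wanted phrase is always followed by a comma.

```python
import re

text = ("36] Smarandache F. (Editor), Proceedings of the First International "
        "Conference on Neutrosophics, Univ. of New Mexico, Gallup Campus, NM, "
        "USA, 1-3 Dec. 2001, Xiquan, Phoenix, 2002")

# Pattern from the question: each .* is greedy and may run across commas,
# so the text is split into two groups around the word "conference".
greedy = re.search(r",(.*)conference(.*),", text.lower())

# Pattern from the answer, compiled case-insensitively (hypothetical name).
# [^,]* cannot cross a comma, so the group stays within one comma-delimited field.
CONFERENCE_RE = re.compile(r",\s*([^,]*\bconference\b[^,]*),", re.IGNORECASE)


def extract_conference(s):
    """Return the comma-delimited field containing 'conference', or None."""
    m = CONFERENCE_RE.search(s)
    return m.group(1).strip() if m else None


if __name__ == "__main__":
    print(repr(greedy.group(1)))     # the part before "conference" only
    print(repr(greedy.group(2)))     # everything up to the last comma
    print(extract_conference(text))  # the whole field, original casing kept
```

With re.IGNORECASE there is no need to lowercase the input, so the extracted title keeps its original capitalization.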

Solution 1
2 Accepted 2020-08-07 10:29:54
