Write a regular expression to extract a word that follows a pattern
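I have a column containing data like the rows below. From each row I need to extract the one word that comes after "approved by", "approver is", "approval from", and so on, that is, the first word/name after the keyword "approv". The match on "approv" should be case-insensitive.
Example:
row 1 - incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors
row 2 - incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie
row 3 - incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing
row 4 - incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah
I tried \bApprov*\b([\w][A-Za-z]{4-7}), but it does not work.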
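A short sketch of why the attempted pattern finds nothing, using Python's standard `re` module (the sample string and variable names are chosen here for illustration). Three things appear to get in the way: `v*` means "zero or more v" rather than "anything after approv", `{4-7}` is not a range quantifier, and the pattern is case-sensitive unless a flag is supplied.

```python
import re

attempt = r"\bApprov*\b([\w][A-Za-z]{4-7})"
sample = "Ticket was approved by thors"

# `v*` only repeats the letter "v", so the following `\b` must sit right after
# "appro" or "approv". In "approved" the next character is "e", so no boundary.
no_match = re.search(attempt, sample, re.IGNORECASE)
print(no_match)

# `{4-7}` is not a quantifier; Python's `re` treats it as the literal text "{4-7}".
literal_braces = re.fullmatch(r"[A-Za-z]{4-7}", "x{4-7}")
print(literal_braces is not None)

# A range quantifier is written with a comma: {4,7}.
ranged = re.fullmatch(r"[A-Za-z]{4,7}", "thors")
print(ranged is not None)
```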
Here is a solution very similar to your attempt; I hope it works for you. At least for this particular case, it returns the output you need:
# The third-party `regex` package is required here: the standard-library
# `re` module does not support variable-width lookbehind.
import regex as re

string = """row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors
row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie
row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing
row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah"""

for row in string.split("\n"):
    if row.startswith("row"):
        # Lookbehind: "approv" followed by letters, whitespace, '-' or ':'.
        # Match: the first run of five or more letters after it.
        m = re.search(r"(?i)(?<=approv[A-Z\s\-\:]+)[A-Z]{5,}", row)
        if m:
            print(m.group(0))
Output:
thors
Wanda
spiderman
ironman
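The snippet above depends on the third-party `regex` package. A minimal sketch of an alternative that needs only the standard-library `re` module is shown below; it uses a capture group instead of a lookbehind. It assumes the connector word is always one of `by`, `is`, or `from`, which is inferred from the four sample rows and not stated in the question. `extract_approver` is a hypothetical helper name.

```python
import re

# "approv" + rest of the keyword, a connector word (assumed: by / is / from),
# any punctuation or spaces, then the captured name.
APPROVER_RE = re.compile(r"approv\w*\W+(?:by|is|from)\W+(\w+)", re.IGNORECASE)


def extract_approver(text):
    """Return the first word after the approval phrase, or None if absent."""
    m = APPROVER_RE.search(text)
    return m.group(1) if m else None


rows = [
    "row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors",
    "row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie",
    "row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing",
    "row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah",
]

for row in rows:
    print(extract_approver(row))
```

Unlike the `[A-Z]{5,}` match, this version does not depend on the name being at least five letters long, but it will miss rows that use a connector word outside the assumed list.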
Do you want to implement this in Python? If so, the code below may help.
Code:
rows = ['incident 12345, issue is so and so, solution is so and so.Ticket was approved by Thors',
        'incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie',
        'incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing',
        'incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah']

for row in rows:
    # Strip punctuation so that "by-" and "is :" become plain words.
    clean_row = row.translate({ord(x): None for x in ',.;:[]()-'})
    # Take the text after the last "approv" and pick the third token:
    # keyword suffix ("ed"/"er"/"al"), connector ("by"/"is"/"from"), then the name.
    split_row = clean_row.lower().split('approv')[-1].split()[2]
    print(split_row)
Output:
thors
wanda
spiderman
ironman
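The indexing approach above assumes every row contains "approv" followed by exactly two filler tokens before the name; a row without the keyword would silently return an unrelated word or raise `IndexError`. A minimal defensive sketch of the same idea follows. `word_after_approv` and its `skip` parameter are hypothetical names introduced here, and returning `None` for rows that do not fit is an assumption about the desired behavior.

```python
# Translation table that deletes the same punctuation as the snippet above.
PUNCTUATION = str.maketrans("", "", ",.;:[]()-")


def word_after_approv(row, skip=2):
    """Return the token `skip` positions after the last "approv", or None."""
    cleaned = row.translate(PUNCTUATION).lower()
    # rpartition splits on the last occurrence, like split('approv')[-1].
    _, keyword, after = cleaned.rpartition("approv")
    if not keyword:
        return None  # the row does not mention approval at all
    tokens = after.split()
    return tokens[skip] if len(tokens) > skip else None


sample_rows = [
    "incident 12345, issue is so and so, solution is so and so.Ticket was approved by Thors",
    "incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie",
    "incident 99999, issue is so and so, no sign-off recorded",
]

for row in sample_rows:
    print(word_after_approv(row))
```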
Using a callback function with replace can solve this issue.
txt.replace(/(?<=[a-z0-9]+)\s+[\:\-]/gi, x => x.trim()).match(/(?<=(approv)[a-z]+\s)[a-z\s\-\:]+/gi).join().split(' ')[1]
Explanation:
txt.replace(/(?<=[a-z0-9]+)\s+[\:\-]/gi, x => x.trim())
This is needed because the 2nd input string has an input problem: there is an extra space between 'approver is' and ':'. The call above removes that space.
row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie
txt.match(/(?<=(approv)[a-z]+\s)[a-z\s\-\:]+/gi).join().split(' ')[1]
This then finds the remaining words after the "approv*" keyword and prints the second word.
Code:
var ar = [`row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors`,
    `row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie`,
    `row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing`,
    `row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah`]
ar.forEach(txt => {
    console.log(
        txt.replace(/(?<=[a-z0-9]+)\s+[\:\-]/gi, x => x.trim())  // drop the space before ':' or '-'
            .match(/(?<=(approv)[a-z]+\s)[a-z\s\-\:]+/gi)        // text after "approv<letters> "
            .join()
            .split(' ')[1]                                       // second word is the name
    );
})
Output:
thors
Wanda
spiderman
ironman
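For comparison, the same two-step idea (remove the stray space, then take the second word after the keyword) can be sketched in Python. The standard-library `re` module does not accept the variable-width lookbehind used in the JavaScript version, so this sketch swaps in a one-character lookbehind and a capture group instead. `second_word_after_approv` is a hypothetical helper name.

```python
import re


def second_word_after_approv(txt):
    """Normalize ' :' / ' -' and return the second word after "approv<letters> "."""
    # Step 1: delete whitespace that sits between an alphanumeric character
    # and a ':' or '-', e.g. "approver is : Wanda" -> "approver is: Wanda".
    normalized = re.sub(r"(?<=[a-z0-9])\s+(?=[:\-])", "", txt, flags=re.IGNORECASE)
    # Step 2: capture the run of letters, whitespace, '-' and ':' that follows
    # the keyword and one whitespace character.
    m = re.search(r"approv[a-z]+\s([a-z\s\-:]+)", normalized, flags=re.IGNORECASE)
    if m is None:
        return None
    words = m.group(1).split()
    return words[1] if len(words) > 1 else None


inputs = [
    "row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors",
    "row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie",
    "row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing",
    "row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah",
]

for txt in inputs:
    print(second_word_after_approv(txt))
```

Calling `split()` without an argument, rather than `split(' ')` as in the JavaScript, is a deliberate choice here so that repeated spaces do not produce empty words.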