
Write a regular expression to extract a word that follows a pattern

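I have a column containing data like the rows below. From each row I need to extract one word: the first word/name that follows "approved by", "approver is", "approval from", and so on, i.e. the first name after the keyword "approv". The match on "approv" should be case-insensitive.

Example:

row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors

row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie

row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing

row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah

I tried \bApprov*\b([\w][A-Za-z]{4-7}), but it does not work.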
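
For reference, here is a small sketch of why the attempted pattern finds nothing, using Python's standard `re` module (the choice of Python and the sample string are assumptions for illustration). `v*` means "zero or more `v`", the `\b` that follows cannot match in the middle of "approved", and `{4-7}` is not a quantifier (the range form is `{4,7}`), so Python treats it as literal text.

```python
import re

attempt = r"\bApprov*\b([\w][A-Za-z]{4-7})"
row = "Ticket was approved by thors"

# The \b after "Approv*" needs a word boundary, but "approv" is followed by "ed".
print(re.search(attempt, row, re.IGNORECASE))  # None

# "{4-7}" is matched literally: one letter followed by the text "{4-7}".
literal = re.search(r"[A-Za-z]{4-7}", "x{4-7}")
print(literal.group(0))

# With a comma the braces become a real quantifier: 4 to 7 letters.
fixed = re.search(r"[A-Za-z]{4,7}", "thors")
print(fixed.group(0))
```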

Here is a solution very similar to yours; I hope it is useful. At least for this particular case, it returns the output you need:

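import regex as re  # third-party "regex" package (pip install regex); the stdlib "re" rejects variable-length lookbehind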

string = """row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors

row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie

row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing

row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah"""

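for row in string.split("\n"):
    if row.startswith("row"):
        # Variable-length lookbehind: skip "approv" plus any letters, spaces, '-' or ':',
        # then take the first run of 5+ letters (names shorter than 5 letters are missed).
        m = re.search(r"(?i)(?<=approv[A-Z\s\-\:]+)[A-Z]{5,}", row)
        if m:  # guard against rows that contain no match
            print(m.group(0))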

Output:

thors
Wanda
spiderman
ironman
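
If installing the third-party `regex` package is not an option, a capture group can stand in for the lookbehind. The following is only a sketch using the standard `re` module; it assumes the connecting word after "approv..." is one of `by`, `is` or `from` (the three seen in the sample rows), and the helper name `first_name_after_approval` is made up for illustration.

```python
import re

rows = [
    "row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors",
    "row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie",
    "row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing",
    "row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah",
]

# "approv" + rest of the word, separators, one connector word, separators, then the name.
# Assumption: the connector is always "by", "is" or "from".
APPROVER = re.compile(r"approv\w*\W+(?:by|is|from)\W+(\w+)", re.IGNORECASE)


def first_name_after_approval(row):
    """Return the first word after the approval phrase, or None if there is none."""
    m = APPROVER.search(row)
    return m.group(1) if m else None


names = [first_name_after_approval(r) for r in rows]
print(names)  # ['thors', 'Wanda', 'spiderman', 'ironman']
```

Unlike the `{5,}` version, this does not depend on the length of the name, but it will miss rows that use a connector outside the assumed list.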

Are you looking to implement this in Python? If so, the code below may help.

Code:

    rows = ['incident 12345, issue is so and so, solution is so and so.Ticket was approved by Thors'
            , 'incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie'
            , 'incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing'
            , 'incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah']

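    for row in rows:
        # Strip punctuation so ':' and '-' do not end up as separate tokens
        clean_row = row.translate({ord(x): None for x in ',.;:[]()-'})
        # Tokens after the last 'approv' are [suffix, connector, name, ...], so index 2
        # assumes exactly one connector word such as 'by', 'is' or 'from'
        split_row = clean_row.lower().split('approv')[-1].split()[2]
        print(split_row)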

Output:

    thors
    wanda
    spiderman
    ironman
    

You can solve this with a replace callback followed by a match:

txt.replace(/(?<=[a-z0-9]+)\s+[\:\-]/gi, x => x.trim()).match(/(?<=(approv)[a-z]+\s)[a-z\s\-\:]+/gi).join().split(' ')[1]

Explanation:

  1. txt.replace(/(?<=[a-z0-9]+)\s+[\:\-]/gi, x => x.trim()) is there because the second input string (row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie) has an extra space between 'approver is' and ':'; the callback trims that whitespace.
  2. txt.match(/(?<=(approv)[a-z]+\s)[a-z\s\-\:]+/gi).join().split(' ')[1] then captures the remaining words after the "approv*" keyword and returns the second of them.

Code:

var ar = [`row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors`,
`row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie`,
`row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing`,
`row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah`]

ar.forEach(txt => {
  console.log(txt.replace(/(?<=[a-z0-9]+)\s+[\:\-]/gi, x => x.trim()).match(/(?<=(approv)[a-z]+\s)[a-z\s\-\:]+/gi).join().split(' ')[1]);
})

Output:

thors
Wanda
spiderman
ironman
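
The same two-step idea (normalize the stray space before the punctuation, then take the second word after the keyword) can be sketched in Python with the standard `re` module. This is a loose port rather than an exact translation: the function name `second_word_after_keyword` is hypothetical, and the whitespace removal here does not require a preceding letter or digit as the JavaScript lookbehind does.

```python
import re


def second_word_after_keyword(txt):
    """Return the second word after an 'approv...' word, or None if not found."""
    # Step 1: drop whitespace that sits directly before ':' or '-'
    # ("approver is : Wanda" becomes "approver is: Wanda").
    normalized = re.sub(r"\s+(?=[:\-])", "", txt)
    # Step 2: capture the run of letters, spaces, '-' and ':' after the keyword.
    m = re.search(r"approv[a-z]+\s([a-z\s\-:]+)", normalized, re.IGNORECASE)
    if m is None:
        return None
    words = m.group(1).split(" ")
    # words[0] is the connector ("by", "is:", "from-"); words[1] is the name.
    return words[1] if len(words) > 1 else None


rows = [
    "row 1- incident 12345, issue is so and so, solution is so and so.Ticket was approved by thors",
    "row 2-incident 12900, issue is so and so, solution is so and so. approver is : Wanda Advocate worked julie",
    "row 3-incident 125790, issue is so and so, solution is so and so. Ticket was got approval from- spiderman, closing",
    "row 4-incident 125790, issue is so and so, solution is so and so. Ticket was approved by- ironman, blah blah",
]

results = [second_word_after_keyword(r) for r in rows]
print(results)  # ['thors', 'Wanda', 'spiderman', 'ironman']
```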
