Need help to construct regex

I am trying to find a regex for the following lines (see my attempt at the bottom of the post).

CLog_DMT_HPCC2_IWHT91731695_242_AFT1_2019-05-02T07.51.43

The regex works fine for this line. The result for the line above is:

  • programName=CLog
  • otherRegex=DMT_HPCC2
  • SerialNO=IWHT91731695 (Note: the serial number always starts with "I")
  • Version=242
  • operation=AFT1

which is the desired result.

But the regex does not work for this line:

CLOB_ABCD_6KW_SYSTEM_609-784_IWHT91831863_197_ACB_01_2019-05-02T07.03.27

I want the following result for the line above:

  • programName=CLOB
  • otherRegex=ABCD_6KW_SYSTEM_609-784
  • SerialNO=IWHT91831863
  • Version=197
  • operation=ACB_01

but what I am getting is shown below:

  • programName=CLOB
  • otherRegex=ABCD_6KW_SYSTEM_609-784
  • SerialNO=IWHT91831863_197
  • Version=ACB
  • operation=01

I have tried the following regex for the lines above:

(?<programName>[a-zA-Z0-9]+)_(?<other>.+)_(?<boardSN>I.+)_(?<entityNameProgramVersion>.+)_(?<operation>.+)_

In your pattern you use .+ which is greedy and will match until the end of the string. Then it will backtrack to fulfill the rest of the pattern. In this case it backtracks only as far as needed to fit the underscores that follow, so each group ends up holding more of the string than you intended.
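To see the greedy backtracking described above, here is a minimal sketch that runs the pattern from the question against the failing line. It is written in Python as an assumption (the question does not say which regex engine is used), so the named groups are spelled (?P<name>...) instead of (?<name>...); the pattern is otherwise unchanged.

```python
import re

# The question's original pattern. Only the group syntax is adapted,
# because Python's re module spells named groups (?P<name>...).
GREEDY = re.compile(
    r"(?P<programName>[a-zA-Z0-9]+)_(?P<other>.+)_(?P<boardSN>I.+)"
    r"_(?P<entityNameProgramVersion>.+)_(?P<operation>.+)_"
)

line = "CLOB_ABCD_6KW_SYSTEM_609-784_IWHT91831863_197_ACB_01_2019-05-02T07.03.27"
greedy_groups = GREEDY.match(line).groupdict()

# Each .+ gives back only as much as it must, so boardSN keeps "_197"
# and the last two groups shift one field to the right.
for name, value in greedy_groups.items():
    print(f"{name}={value}")
```

The printed groups should match what the question reports: boardSN holds IWHT91831863_197, the version group holds ACB, and operation holds 01.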

Instead you could use a negated character class [^_\n], which matches any character except an underscore or a newline, to limit each match to the current underscore-separated segment when several more follow.

For the other part you could make the quantifier non-greedy, (?<other>.+?), so that it matches as few characters as possible and stops as soon as _I can be matched.

^(?<programName>[a-zA-Z0-9]+)_(?<other>.+?)_(?<boardSN>I[^_\n]+)_(?<entityNameProgramVersion>[^_\n]+)_(?<operation>[^\n_]+(?:_[^\n]+)?)_

Explanation

  • ^ Start of string
  • (?<programName>[a-zA-Z0-9]+)_ Match 1+ characters from the character class, then an underscore
  • (?<other>.+?)_ Match any character except a newline 1+ times, non-greedy, then an underscore
  • (?<boardSN>I[^_\n]+)_ Match I, then a negated character class matching 1+ characters that are not _ or a newline
  • (?<entityNameProgramVersion>[^_\n]+)_ Negated character class, match 1+ characters that are not _ or a newline
  • (?<operation>[^\n_]+(?:_[^\n]+)?)_ Negated character class, match 1+ characters that are not _ or a newline, followed by an optional group that matches one underscore and then 1+ characters that are not a newline. After that, match a single underscore outside of the group. Note that [^\n]+ in the optional group can itself match underscores; use [^\n_]+ there if the group should cover only one extra segment.
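As a quick check of the pattern explained above, this sketch applies it to both sample lines from the question. Python is an assumption here, so the named groups use (?P<name>...); the names FIXED, samples and results are only illustrative.

```python
import re

# The answer's pattern, with named groups in Python's (?P<name>...) syntax.
FIXED = re.compile(
    r"^(?P<programName>[a-zA-Z0-9]+)_(?P<other>.+?)_(?P<boardSN>I[^_\n]+)"
    r"_(?P<entityNameProgramVersion>[^_\n]+)_(?P<operation>[^\n_]+(?:_[^\n]+)?)_"
)

samples = [
    "CLog_DMT_HPCC2_IWHT91731695_242_AFT1_2019-05-02T07.51.43",
    "CLOB_ABCD_6KW_SYSTEM_609-784_IWHT91831863_197_ACB_01_2019-05-02T07.03.27",
]

# Collect the named groups for each sample line.
results = [FIXED.match(s).groupdict() for s in samples]
for groups in results:
    print(groups)
```

For the two lines from the question, operation should come out as AFT1 and ACB_01 respectively, with the serial number and version each limited to a single segment.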

Regex demo

If the optional group at the end can only contain digits, you could use this part without the last underscore:

(?<operation>[^\n_]+(?:_\d+)?)
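The digits-only variant can be tried against the two sample lines in the same way. This is a rough sketch in Python (an assumption, hence the (?P<name>...) syntax); PREFIX and OPERATION are hypothetical names used to assemble the full pattern with and without the trailing underscore.

```python
import re

# Everything before the operation group, as in the full pattern above.
PREFIX = (
    r"^(?P<programName>[a-zA-Z0-9]+)_(?P<other>.+?)_(?P<boardSN>I[^_\n]+)"
    r"_(?P<entityNameProgramVersion>[^_\n]+)_"
)
# Digits-only optional suffix for the operation group.
OPERATION = r"(?P<operation>[^\n_]+(?:_\d+)?)"

with_underscore = re.compile(PREFIX + OPERATION + "_")
without_underscore = re.compile(PREFIX + OPERATION)

samples = [
    "CLog_DMT_HPCC2_IWHT91731695_242_AFT1_2019-05-02T07.51.43",
    "CLOB_ABCD_6KW_SYSTEM_609-784_IWHT91831863_197_ACB_01_2019-05-02T07.03.27",
]

ops_with = [with_underscore.match(s).group("operation") for s in samples]
ops_without = [without_underscore.match(s).group("operation") for s in samples]

print(ops_with)     # trailing underscore kept
print(ops_without)  # trailing underscore dropped
```

On these two lines, the version that keeps the trailing underscore gives AFT1 and ACB_01. Dropping it appears to let the optional _\d+ absorb _2019 from the timestamp of the first line, so it may be safer to keep the underscore when a date follows the operation field.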

Regex demo
