How can I match the given pattern using Regex in C#?

I have the following input:

-key1:"val1" -key2: "val2" -key3:(val3) -key4: "(val4)" -key5: val5 -key6: "val-6" -key-7: val7 -key-eight: "val 8"

The only assumption about the pattern is:

  • Every key starts with a - and is separated from its value by a :

How can I match and extract each key and its corresponding value?

So far I have come up with the following regex:

-(?<key>\S*):\s?(?<val>\S*)

However, it does not match the complete value of the last argument, because that value contains a space, and I cannot figure out how to match it.

The expected output should be:

  • key1 "val1"
  • key2 "val2"
  • key3 (val3)
  • key4 "(val4)"
  • key5 val5
  • key6 "val-6"
  • key-7 val7
  • key-eight val 8

Any help is much appreciated.

Guessing that you only want to allow whitespace characters that are not at the beginning or end of a value, change your regex to:

-(?<key>\S*):\s?(?<val>\S+(\s*[^-\s])*)

This assumes that a - preceded by whitespace always means that a new key is beginning; it cannot be part of a value.

For this example:

-key: value -key2: value with whitespace -key3: value-with-hyphens -key4: v

The matches are: -key: value, -key2: value with whitespace, -key3: value-with-hyphens, and -key4: v.

It also works perfectly well on your provided example. 它也适用于您提供的示例。
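To try that claim, here is a minimal C# sketch that runs the pattern above against the input from the question. The class name KeyValueDemo and the helper Parse are hypothetical names introduced for illustration; only the regex itself comes from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class KeyValueDemo
{
    // The pattern from the answer above, unchanged.
    private static readonly Regex Pattern =
        new Regex(@"-(?<key>\S*):\s?(?<val>\S+(\s*[^-\s])*)");

    // Hypothetical helper: returns the key/value pairs in input order.
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in Pattern.Matches(input))
        {
            pairs.Add(new KeyValuePair<string, string>(
                m.Groups["key"].Value, m.Groups["val"].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        // The input from the question, split over two literals for readability.
        const string input =
            "-key1:\"val1\" -key2: \"val2\" -key3:(val3) -key4: \"(val4)\" " +
            "-key5: val5 -key6: \"val-6\" -key-7: val7 -key-eight: \"val 8\"";

        foreach (var pair in Parse(input))
            Console.WriteLine($"{pair.Key} -> {pair.Value}");
    }
}
```

Note that the quotes and brackets stay part of each value, so the last pair comes out as key-eight -> "val 8". A value containing " -" would still be cut short, which is the assumption stated above.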

I presume you want to keep the brackets and quotation marks, as that's what you're doing in the example you gave? If so, then the following should work:

-(?<key>\S+):+\s?(?<val>\S+\s?\d+\)?\"?)

This does presume that every value ends with a number, though.

EDIT: Given that the value doesn't always end with a number, but guessing that it always starts with val, this is what I have:

-(?<key>\S+):+\s?(?<val>\"?\(?(val)+\s?\S+)

Seems to be working properly...

A low-tech (non-regex) solution, just as an alternative. Trim the leftover whitespace and call ToDictionary if you need to.

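// Requires: using System; using System.Linq;
var results = input.Split(new[] { " -" }, StringSplitOptions.RemoveEmptyEntries)
                   // TrimStart leaves hyphens at the end of a value alone; the count of 2 keeps any ':' inside a value.
                   .Select(x => x.TrimStart('-').Split(new[] { ':' }, 2));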


Output:

key1 -> "val1"
key2 ->  "val2"
key3 -> (val3)
key4 ->  "(val4)"
key5 ->  val5
key6 ->  "val-6"
key-7 ->  val7
key-eight ->  "val 8"
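The "trim and ToDictionary" step mentioned in this answer could look roughly like the sketch below. SplitDemo and ParseArgs are hypothetical names, and the sketch assumes that every piece contains a colon and that keys are unique (ToDictionary throws on duplicate keys).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitDemo
{
    // Hypothetical helper: splits on " -", then on the first ':' of each piece.
    public static Dictionary<string, string> ParseArgs(string input)
    {
        return input
            .Split(new[] { " -" }, StringSplitOptions.RemoveEmptyEntries)
            // Only the first piece still carries its leading '-'.
            .Select(part => part.TrimStart('-').Split(new[] { ':' }, 2))
            // Trim removes the optional space after the colon.
            .ToDictionary(kv => kv[0].Trim(), kv => kv[1].Trim());
    }

    public static void Main()
    {
        var parsed = ParseArgs("-key1:\"val1\" -key2: \"val2\" -key-eight: \"val 8\"");
        foreach (var pair in parsed)
            Console.WriteLine($"{pair.Key} -> {pair.Value}");
    }
}
```

As with the original snippet, a value that itself contains " -" would be split in the wrong place, so this only suits input shaped like the example.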

Try this regex with the Replace function:

(?:^|(?!\S)\s*)-|\s*:\s*

and replace with "\n". You should get the keys and values on separate lines.
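A rough sketch of how this replace-based approach might be used from C#. ReplaceDemo and Tokenize are hypothetical names; the sketch assumes the replacement is a real newline character and that no key or value contains a colon of its own.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceDemo
{
    // Hypothetical helper: turns each key and each value into its own token.
    public static string[] Tokenize(string input)
    {
        // Leading hyphens and colon separators are both replaced by '\n'.
        string replaced = Regex.Replace(input, @"(?:^|(?!\S)\s*)-|\s*:\s*", "\n");
        return replaced.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string[] tokens = Tokenize("-key1:\"val1\" -key-eight: \"val 8\"");

        // Tokens alternate: key, value, key, value, ...
        for (int i = 0; i + 1 < tokens.Length; i += 2)
            Console.WriteLine($"{tokens[i]} -> {tokens[i + 1]}");
    }
}
```

Hyphens inside a key or value (key-eight, val-6) are left alone because the first alternative only matches a hyphen at the start of the input or after whitespace.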

This should do the trick:

-(?<key>\S*):\s*(?<value>(?(?=")((")(?:(?=(\\?))\2.)*?\1))(\S*))

Basically, it does an if/then/else to detect whether the value starts with a ": in (?(?=")(true regex)(false regex), the false regex is your \S*, while the true regex, (")(?:(?=(\\?))\2.)*?\1), tries to match the opening and closing quotes.
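The same if/then/else idea can be illustrated with a deliberately simplified conditional. This is not the pattern from the answer: it swaps the escape-aware quote matcher for a plain "[^"]*" and therefore assumes quoted values contain no escaped quotes. ConditionalDemo and Parse are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ConditionalDemo
{
    // (?(?=")yes|no): if the value starts with a quote, match up to the
    // closing quote; otherwise fall back to a run of non-whitespace.
    // In a verbatim string each " is written as "".
    private static readonly Regex Pattern =
        new Regex(@"-(?<key>\S*):\s*(?<value>(?(?="")""[^""]*""|\S*))");

    // Hypothetical helper: returns the key/value pairs in input order.
    public static List<KeyValuePair<string, string>> Parse(string input)
    {
        var pairs = new List<KeyValuePair<string, string>>();
        foreach (Match m in Pattern.Matches(input))
        {
            pairs.Add(new KeyValuePair<string, string>(
                m.Groups["key"].Value, m.Groups["value"].Value));
        }
        return pairs;
    }

    public static void Main()
    {
        foreach (var pair in Parse("-key3:(val3) -key-eight: \"val 8\""))
            Console.WriteLine($"{pair.Key} -> {pair.Value}");
    }
}
```

Unquoted values still stop at the first whitespace here, so a value with spaces has to be quoted for this variant to capture it whole.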
