Best way to extract data from string

I have a string:

__cfduid=d2eec71493b48565be764ad44a52a7b191399561601015; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.planetminecraft.com; HttpOnly

I want to use a regex and get something like this:

[0] = __cfduid=d2eec71493b48565be764ad44a52a7b191399561601015
[1] = expires=Mon, 23-Dec-2019 23:50:00 GMT
[2] = path=/
[3] = domain=.planetminecraft.com
[4] = HttpOnly

I tried this regex:

[\A|;](.*?)[\Z|;]

I don't understand why \A works but [\A] does not. How can I create an alternation ( \A or ; )?

In its final form, I want this regex to extract the following from the string:

[0] = {
    [0] = __cfduid
    [1] = d2eec71493b48565be764ad44a52a7b191399561601015
}
[1] = {
    [0] = expires
    [1] = Mon, 23-Dec-2019 23:50:00 GMT
}
[2] = {
    [0] = path
    [1] = /
}
[3] = {
    [0] = domain
    [1] = .planetminecraft.com
}
[4] = {
    [0] = HttpOnly
}

Square brackets create a character class; you need parentheses for grouping, preferably non-capturing groups. You also need a positive lookahead assertion in place of the second group, because each semicolon can only be matched once:

(?:\A|;)(.*?)(?=\Z|;)
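
As a rough illustration of how this pattern behaves, the sketch below runs it over the string from the question in C#. The helper name SplitAttributes is hypothetical, and the Trim() call is an addition: the pattern itself leaves the space that follows each semicolon inside the captured group.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of the original answer): returns the
// semicolon-separated attributes of a cookie header.
static string[] SplitAttributes(string header)
{
    // (?:\A|;)   start of the string or a semicolon, not captured
    // (.*?)      the attribute text, as short as possible
    // (?=\Z|;)   stop before the next semicolon or the end, without consuming it
    return Regex.Matches(header, @"(?:\A|;)(.*?)(?=\Z|;)")
        .Cast<Match>()
        .Select(m => m.Groups[1].Value.Trim()) // drop the space after each ';'
        .ToArray();
}

string header = "__cfduid=d2eec71493b48565be764ad44a52a7b191399561601015; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.planetminecraft.com; HttpOnly";
foreach (string part in SplitAttributes(header))
{
    Console.WriteLine(part);
}
```

Because the lookahead does not consume the semicolon, the same semicolon is still available to start the next match, which is what the second bracketed group in the original attempt could not do.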

That still doesn't give you parameter/value pairs, so you might want to be more specific:

(?:\A|;\s*)([^=]*)(?:=([^;]*))?(?=\Z|;)

([^=]* matches any number of characters except =.)
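
A minimal C# sketch of how the two capture groups could be turned into the nested structure the question asks for. The helper name ParsePairs and the choice of a list of string arrays are assumptions made for illustration; only the pattern comes from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: one array per attribute, holding { name, value },
// or just { name } when the attribute has no '=' part (e.g. HttpOnly).
static List<string[]> ParsePairs(string header)
{
    var regex = new Regex(@"(?:\A|;\s*)([^=]*)(?:=([^;]*))?(?=\Z|;)");
    var result = new List<string[]>();
    foreach (Match m in regex.Matches(header))
    {
        string name = m.Groups[1].Value;
        // Group 2 only participates in the match when "=value" is present.
        result.Add(m.Groups[2].Success
            ? new[] { name, m.Groups[2].Value }
            : new[] { name });
    }
    return result;
}

string header = "__cfduid=d2eec71493b48565be764ad44a52a7b191399561601015; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.planetminecraft.com; HttpOnly";
foreach (string[] pair in ParsePairs(header))
{
    Console.WriteLine(string.Join(" -> ", pair));
}
```

One caveat with the pattern as written: [^=]* can also run across a semicolon, so a valueless attribute in the middle of the string (for example "a=1; Secure; b=2") would be merged with the following name. It works for the string in the question because the only valueless attribute comes last; using [^=;]* for the name avoids the problem.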

See it live on regex101.com.

You can try matching with this regex:

\s*([^=;]+)(?:=([^=;]+))?

Description:

\s*         # Match any spaces
([^=;]+)    # Match any non = or ; characters
(?:
  =         # Match an = sign
  ([^=;]+)  # Match any non = or ; characters.
)?          # Make this group optional

regex101 demo

In code:

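using System;
using System.Text.RegularExpressions;

string text = "__cfduid=d2eec71493b48565be764ad44a52a7b191399561601015; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.planetminecraft.com; HttpOnly";

// Group 1 is the attribute name; group 2 is the value, if there is one.
var regex = new Regex(@"\s*([^=;]+)(?:=([^=;]+))?");
var matches = regex.Matches(text);
foreach (Match match in matches)
{
    // For a valueless attribute such as HttpOnly, Groups[2].Value is an empty string.
    Console.WriteLine(match.Groups[1].Value + "\n" + match.Groups[2].Value + "\n");
}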

ideone demo


\A works but [\A] does not because \A loses its special meaning inside a character class, like most regex metacharacters. For instance, + and * also lose their meaning there. In [\A], the regex is no longer asserting the start of the string; since \A has no particular meaning in a character class, it is read as a literal A (some engines reject the escape outright instead).
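
The effect on + and * (and on the | used in the original attempt) can be checked with a short C# sketch. The sample strings are made up for illustration, and [\A] itself is left out because its handling differs between regex engines.

```csharp
using System;
using System.Text.RegularExpressions;

// Outside a class, + is a quantifier: "a+" matches a run of 'a' characters.
string run = Regex.Match("caaat", "a+").Value;

// Inside a class, + and * are ordinary characters, so [+*] matches
// a single literal plus or asterisk.
int operators = Regex.Matches("1+2*3", "[+*]").Count;

// | is literal inside a class as well, so [a|b] also matches a pipe
// rather than meaning "a or b".
bool pipeMatches = Regex.IsMatch("|", "[a|b]");

Console.WriteLine(run);
Console.WriteLine(operators);
Console.WriteLine(pipeMatches);
```

This is why [\A|;] in the question does not mean "start of string or semicolon": everything between the brackets is treated as a set of single characters, and alternation has to be written with a group such as (?:\A|;).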

Note: technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please cite this site's URL or the address of the original post.