
Best way to extract data from string

I have a string:

__cfduid=d2eec71493b48565be764ad44a52a7b191399561601015; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.planetminecraft.com; HttpOnly

I want to use regex and get something like this:

[0] = __cfduid=d2eec71493b48565be764ad44a52a7b191399561601015
[1] = expires=Mon, 23-Dec-2019 23:50:00 GMT
[2] = path=/
[3] = domain=.planetminecraft.com
[4] = HttpOnly

I tried this regex:

[\A|;](.*?)[\Z|;]

I don't understand why \A works but [\A] does not. How can I express (\A or ;)?

In its final form, the regex should extract this from the string:

[0] = {
    [0] = __cfduid
    [1] = d2eec71493b48565be764ad44a52a7b191399561601015
}
[1] = {
    [0] = expires
    [1] = Mon, 23-Dec-2019 23:50:00 GMT
}
[2] = {
    [0] = path
    [1] = /
}
[3] = {
    [0] = domain
    [1] = .planetminecraft.com
}
[4] = {
    [0] = HttpOnly
}

Square brackets create a character class; for grouping you need parentheses, preferably non-capturing groups. You also need a positive lookahead assertion in place of the second group, because each semicolon can only be matched once: if the closing semicolon is consumed, it cannot open the next match:

(?:\A|;)(.*?)(?=\Z|;)
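
To see why the lookahead matters, here is a minimal C# sketch (the variable names are only illustrative) that runs the pattern above next to a variant whose trailing group consumes the semicolon. It assumes the cookie string from the question and top-level statements (C# 9 or later).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string cookie = "__cfduid=d2eec71493b48565be764ad44a52a7b191399561601015; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.planetminecraft.com; HttpOnly";

// Trailing lookahead: the closing ';' is checked but not consumed,
// so the same ';' can still open the next match.
string[] withLookahead = Regex.Matches(cookie, @"(?:\A|;)(.*?)(?=\Z|;)")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

// Trailing group: the closing ';' is consumed, so the segment that
// follows it has no ';' left to start from and is skipped.
string[] withoutLookahead = Regex.Matches(cookie, @"(?:\A|;)(.*?)(?:\Z|;)")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

Console.WriteLine(withLookahead.Length);    // 5
Console.WriteLine(withoutLookahead.Length); // 3

foreach (string segment in withLookahead)
{
    Console.WriteLine("[" + segment + "]");
}
```

The captured segments keep the space that follows each semicolon, so trim them or extend the pattern if that matters. Each capture is still a whole name=value segment.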

That still doesn't get you your parameter/value pairs, so you might want to be more specific:

(?:\A|;\s*)([^=]*)(?:=([^;]*))?(?=\Z|;)

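([^=]* matches any number of characters except =. That includes ;, so a valueless attribute followed by another pair would be merged into a single name; use [^=;]* if such input can occur.)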

See it live on regex101.com.
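
As a rough sketch of how the name/value pattern above could be turned into the nested result from the question, the following C# collects one array per attribute. The names `pairPattern` and `pairs` are made up for this example, and it assumes top-level statements (C# 9 or later). `Group.Success` is used to tell a missing value (HttpOnly) apart from an empty one.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

string cookie = "__cfduid=d2eec71493b48565be764ad44a52a7b191399561601015; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.planetminecraft.com; HttpOnly";

// Pattern copied from the answer: group 1 is the name, group 2 the optional value.
var pairPattern = new Regex(@"(?:\A|;\s*)([^=]*)(?:=([^;]*))?(?=\Z|;)");

var pairs = new List<string[]>();
foreach (Match m in pairPattern.Matches(cookie))
{
    // Group 2 does not participate in the match when there is no "=value" part.
    pairs.Add(m.Groups[2].Success
        ? new[] { m.Groups[1].Value, m.Groups[2].Value }
        : new[] { m.Groups[1].Value });
}

for (int i = 0; i < pairs.Count; i++)
{
    Console.WriteLine("[" + i + "] = { " + string.Join(" | ", pairs[i]) + " }");
}
```

For the string in the question this yields five entries, the last of which holds only HttpOnly.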

You can try matching on this regex:

\s*([^=;]+)(?:=([^=;]+))?

Description:

\s*         # Skip any leading whitespace
([^=;]+)    # Group 1: the name, one or more characters other than = or ;
(?:
  =         # Match an = sign
  ([^=;]+)  # Group 2: the value, one or more characters other than = or ;
)?          # Make the =value part optional

regex101 demo

In code:

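using System;
using System.Text.RegularExpressions;

string text = "__cfduid=d2eec71493b48565be764ad44a52a7b191399561601015; expires=Mon, 23-Dec-2019 23:50:00 GMT; path=/; domain=.planetminecraft.com; HttpOnly";

// Group 1 is the attribute name; group 2 is the value (empty for flags such as HttpOnly).
var regex = new Regex(@"\s*([^=;]+)(?:=([^=;]+))?");
var matches = regex.Matches(text);
foreach (Match match in matches)
{
    // Print the name and the value on separate lines, followed by a blank line.
    Console.WriteLine(match.Groups[1].Value + "\n" + match.Groups[2].Value + "\n");
}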

ideone demo


\A works but [\A] does not because \A loses its special meaning inside a character class, as most regex metacharacters do; + and *, for instance, also become literal characters there. A character class always matches exactly one character, so \A cannot act as an anchor inside one. Some engines then read [\A] as a literal A, as described here; others reject the escape as invalid.
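
A small C# sketch of this behaviour, with variable names chosen only for illustration and top-level statements (C# 9 or later) assumed. The checks on +, * and | rely on ordinary character-class rules. Because the handling of [\A] differs between engines, that part is wrapped in try/catch and no particular outcome is assumed.

```csharp
using System;
using System.Text.RegularExpressions;

// Inside [...], + and * are ordinary characters, not quantifiers.
bool plusIsLiteral = Regex.IsMatch("1+1", @"[+*]");   // true: matches the '+'
bool lettersOnly = Regex.IsMatch("aaa", @"[+*]");     // false: no '+' or '*' in the input

// '|' is literal inside [...] as well, so [|;] matches a single '|' or ';'.
bool pipeIsLiteral = Regex.IsMatch("|", @"^[|;]$");   // true

// In a group, \A stays an anchor: "start of string OR a semicolon".
bool viaAnchor = Regex.IsMatch("abc", @"(?:\A|;)abc");      // true, through \A
bool viaSemicolon = Regex.IsMatch("x;abc", @"(?:\A|;)abc"); // true, through ';'

// How [\A] itself is treated depends on the engine, so nothing is assumed here.
string classVerdict;
try
{
    classVerdict = Regex.IsMatch("A", @"[\A]")
        ? "compiles and matches a literal A"
        : "compiles but does not match A";
}
catch (ArgumentException ex)
{
    classVerdict = "rejected by this engine: " + ex.Message;
}

Console.WriteLine(plusIsLiteral + " " + lettersOnly + " " + pipeIsLiteral);
Console.WriteLine(viaAnchor + " " + viaSemicolon);
Console.WriteLine("[\\A] " + classVerdict);
```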
