Regex for extracting key-value pair from HTTP Query String

I am using a data analysis package that exposes a Regex function for string parsing. I am trying to parse a response from a website that is in the format...

key1=val1&key2=val2&key3=val3 ...

[The keys and values may be percent-encoded, but the values currently returned are not: they are tokens and other alphanumeric data.]

I understand this data to be application/x-www-form-urlencoded, also known as query string format.

The objective is to extract the value for a given key when the order of the keys cannot be relied upon. For example, I might know that one of the keys I should receive is "token", so what regex pattern can I use to extract the value for the key "token"? I have searched for this but cannot find anything that does what I need. If this is a duplicate question, apologies in advance.

In Alteryx, you may use Tokenize with a regex containing a capturing group around the part you need to extract:

The Tokenize Method allows you to specify a regular expression to match on and that part of the string is parsed into separate columns (or rows). When using the Tokenize method, you want to match to the whole token, and if you have a marked group, only that part is returned.

The last clause of that description ("if you have a marked group, only that part is returned") shows that when the pattern contains a capturing group, only the group's contents are returned rather than the whole match.
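The same "only the group is returned" behavior can be observed outside Alteryx. The sketch below uses Python's `re.findall`, which is not Alteryx's Tokenize but behaves analogously when a pattern has exactly one capturing group. The sample string is an assumption made up for illustration.

```python
import re

# Hypothetical sample response, made up for illustration.
query = "key1=val1&token=abc123&key3=val3"

# No capturing group: the whole match is returned.
whole = re.findall(r"token=[^&]*", query)

# One capturing group: only the group's contents are returned.
group_only = re.findall(r"token=([^&]*)", query)

print(whole)       # ['token=abc123']
print(group_only)  # ['abc123']
```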

Thus, you may use

(?:^|[?&])token=([^&]*)

Replace token with whichever key you want to extract the value for.

See the regex demo.

Details

  • (?:^|[?&]) - the start of a string, ? or & (if the string is just a plain key-value pair string, you may omit ? and use (?:^|&) or (?<![^&]) )
  • token - the key
  • = - an equal sign
  • ([^&]*) - Group 1 (this will get extracted): 0 or more chars other than & (if you do not want to extract empty values, replace * with + quantifier).
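As a rough check of the pattern described above, here is a minimal Python sketch. The helper name `extract_value` and the sample strings are hypothetical. The percent-decoding step via `urllib.parse.unquote_plus` is an addition for the case where values are encoded; it is not part of the regex itself.

```python
import re
from urllib.parse import unquote_plus


def extract_value(query, key):
    """Return the value for `key` in a query string, or None if absent.

    `extract_value` is a hypothetical helper written for illustration.
    """
    # re.escape guards against keys containing regex metacharacters.
    pattern = r"(?:^|[?&])" + re.escape(key) + r"=([^&]*)"
    match = re.search(pattern, query)
    if match is None:
        return None
    # Group 1 holds the raw value; decode %XX escapes and '+' if present.
    return unquote_plus(match.group(1))


print(extract_value("key1=val1&token=abc123&key3=val3", "token"))  # abc123
print(extract_value("mytoken=x&token=y", "token"))                 # y
print(extract_value("key1=val1", "token"))                         # None
```

The `(?:^|[?&])` prefix is what stops a key such as `mytoken` from being matched when looking for `token`.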
