
Regular Expression Pattern Extract Log

- U031503@nttdata [11/Mar/2013:09:42:44 +0900] "GET /infovia/ga/ga004rp0002.action HTTP/1.1" 302 301 "https://tb-infovia.groupwide.net/infovia/ga/ga013rp0004.action?messageId=errors.Authentication.001" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET CLR 1.1.4322)"

The above is an access log line. It contains two action IDs. I want to extract only the first action ID, the one before HTTP, using a regex pattern. Right now I use the pattern ([^/\\"]*).action, but it matches both action IDs, wherever they appear in the line. I have been testing this for two days. Could you please help me?
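To make the symptom concrete, here is a minimal Python sketch of the behaviour described. Python and the `re` module are my assumption (the question does not say which regex engine is used), and the user-agent part of the log line is shortened for readability.

```python
import re

# Sample line from the question (user agent shortened).
line = (
    '- U031503@nttdata [11/Mar/2013:09:42:44 +0900] '
    '"GET /infovia/ga/ga004rp0002.action HTTP/1.1" 302 301 '
    '"https://tb-infovia.groupwide.net/infovia/ga/ga013rp0004.action'
    '?messageId=errors.Authentication.001" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0)"'
)

# The pattern from the question, unchanged.
pattern = re.compile(r'([^/\\"]*).action')

# findall scans the whole line, so both action IDs come back.
all_ids = pattern.findall(line)

# search stops at the first match, which is the one before HTTP.
first_id = pattern.search(line).group(1)

print(all_ids)   # ['ga004rp0002', 'ga013rp0004']
print(first_id)  # ga004rp0002
```

In an engine with a "find first" call like `re.search`, the original pattern may already be enough; the answers below matter when the tool always reports every match.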

This will match the first id:

action \S+" (\d+)

Get group 1 from the match.

Try this:


Or use this:


and get group 1.


*?  Matches the previous element zero or more times, but as few times as possible.
(?<=subexpression)  Zero-width positive lookbehind assertion.
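The two constructs above can be tried out on the log line with a small Python sketch. These patterns are my own illustrations, not the ones from the answer, and the lookbehind example also leans on a lookahead `(?=...)` to stop before `.action`. The user agent is shortened.

```python
import re

# Sample line from the question (user agent shortened).
line = (
    '- U031503@nttdata [11/Mar/2013:09:42:44 +0900] '
    '"GET /infovia/ga/ga004rp0002.action HTTP/1.1" 302 301 '
    '"https://tb-infovia.groupwide.net/infovia/ga/ga013rp0004.action'
    '?messageId=errors.Authentication.001" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0)"'
)

# Greedy .* swallows as much as it can, so the LAST action ID is captured.
greedy = re.search(r'.*/([^/"]+)\.action', line).group(1)

# Lazy .*? gives up as early as it can, so the FIRST action ID is captured.
lazy = re.search(r'.*?/([^/"]+)\.action', line).group(1)

# Lookbehind: require a "/" just before the ID without making it part of the match.
lookbehind = re.search(r'(?<=/)[^/"]+(?=\.action)', line).group(0)

print(greedy)      # ga013rp0004
print(lazy)        # ga004rp0002
print(lookbehind)  # ga004rp0002
```

Note that Python only accepts fixed-width lookbehinds, which `(?<=/)` satisfies; other engines may be more permissive.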

If I understand your question correctly, your problem is that there are two "action IDs" in the string, and you want to capture both. Your current regex does match both, but depending on how you evaluate it, you may only be getting the first match. So, to extract both with one match, you'll need to repeat the regex and consume everything between the parts you want to capture:


This is your regex, ([^/\\"]*).action, repeated twice, with .* in the middle, which matches any character any number of times. Then both actions are available in capturing groups one and two.
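Here is a rough Python sketch of the "repeat the regex" idea. It is a variant, not the exact pattern from this answer: I escaped the dot, used `+` instead of `*`, and made the middle part lazy (`.*?`). With a greedy `.*` and `*`, the engine is allowed to leave the second group empty, which the `naive` line below shows. The user agent is shortened.

```python
import re

# Sample line from the question (user agent shortened).
line = (
    '- U031503@nttdata [11/Mar/2013:09:42:44 +0900] '
    '"GET /infovia/ga/ga004rp0002.action HTTP/1.1" 302 301 '
    '"https://tb-infovia.groupwide.net/infovia/ga/ga013rp0004.action'
    '?messageId=errors.Authentication.001" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0)"'
)

# Literal repetition with a greedy middle: the second group ends up empty.
naive = re.search(r'([^/\\"]*).action.*([^/\\"]*).action', line).groups()

# Variant: escaped dot, at least one character per ID, lazy middle.
both = re.search(r'([^/\\"]+)\.action.*?([^/\\"]+)\.action', line)
first_id, second_id = both.groups()

print(naive)                # ('ga004rp0002', '')
print(first_id, second_id)  # ga004rp0002 ga013rp0004
```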

If you're sure it will always be followed by HTTP, you can use a lookahead:
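A lookahead along those lines could look like the following Python sketch. The exact pattern is my assumption: it takes the question's regex, escapes the dot, and adds `(?= HTTP)` so that only an action ID directly followed by " HTTP" qualifies. The user agent is shortened.

```python
import re

# Sample line from the question (user agent shortened).
line = (
    '- U031503@nttdata [11/Mar/2013:09:42:44 +0900] '
    '"GET /infovia/ga/ga004rp0002.action HTTP/1.1" 302 301 '
    '"https://tb-infovia.groupwide.net/infovia/ga/ga013rp0004.action'
    '?messageId=errors.Authentication.001" '
    '"Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0)"'
)

# (?= HTTP) checks that " HTTP" follows but does not consume it.
pattern = re.compile(r'([^/\\"]+)\.action(?= HTTP)')

# findall now yields a single ID, because the referer URL's
# ".action" is followed by "?messageId", not " HTTP".
ids = pattern.findall(line)

match = pattern.search(line)
action_id = match.group(1) if match else None

print(ids)        # ['ga004rp0002']
print(action_id)  # ga004rp0002
```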



