I am trying to write a regular expression, using the Python re module, that matches both of these strings:
"GET /images/launch-logo.gif HTTP/1.0"
"GET / HTTP/1.0 "
I tried the following expression:
"(\S+) (\S.*?)\s*(\S*)"
For the first string, this works as expected and returns the following:
1. GET
2. /images/launch-logo.gif
3. HTTP/1.0
However, for the second one it returns:
1. GET
2. / HTTP/1.0
3. ''
Instead, I would like it to return the following:
1. GET
2. /
3. HTTP/1.0
The second string also has a trailing space that needs to be removed. Could someone help me with the right regular expression?
You don't need to use a reluctant quantifier (*?) here. Use:
(\S+)\s+(\S+)\s+(\S+)\s*
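As a quick check, here is a minimal sketch of that pattern applied to the two request strings. It assumes the surrounding double quotes are not part of the text being matched, and the names REQUEST_RE and parse_request are only illustrative:

```python
import re

# Pattern from the answer: three whitespace-separated tokens,
# followed by optional trailing whitespace.
REQUEST_RE = re.compile(r'(\S+)\s+(\S+)\s+(\S+)\s*')

def parse_request(line):
    """Return (method, path, protocol), or None if the line does not match."""
    m = REQUEST_RE.fullmatch(line)
    return m.groups() if m else None

samples = [
    "GET /images/launch-logo.gif HTTP/1.0",
    "GET / HTTP/1.0 ",  # note the trailing space
]
for s in samples:
    print(parse_request(s))
# ('GET', '/images/launch-logo.gif', 'HTTP/1.0')
# ('GET', '/', 'HTTP/1.0')
```

Because the trailing whitespace is consumed by \s* outside the last group, the third group never contains the extra space.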
The problem with your original regex is the combination of .*? and \s*. Because . also matches spaces, the reluctant expression can keep matching across whitespace whenever a shorter match fails, while \s* doesn't have to match anything.
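The behaviour described in the question can be reproduced with a short script. This is a sketch under one assumption: the double quotes are literal characters in both the pattern and the input, which is consistent with the reported output. The names ORIGINAL_RE and original_groups are made up for the example:

```python
import re

# Assumption: the quotes are literal characters in the pattern and the input.
ORIGINAL_RE = re.compile(r'"(\S+) (\S.*?)\s*(\S*)"')

def original_groups(line):
    """Return the three captured groups of the original pattern, or None."""
    m = ORIGINAL_RE.search(line)
    return m.groups() if m else None

print(original_groups('"GET /images/launch-logo.gif HTTP/1.0"'))
# ('GET', '/images/launch-logo.gif', 'HTTP/1.0')

print(original_groups('"GET / HTTP/1.0 "'))
# ('GET', '/ HTTP/1.0', '')
```

In the second string the space before the closing quote means the short match (group 2 as /) cannot reach the final quote, so the engine backtracks and lets .*? grow until \s* can absorb the trailing space and (\S*) matches the empty string.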