
Python Regular Expression for IP Address and URL

I am trying to extract only the IP address and URL portion of a log containing data in this format:

153.12.123.123 - - [13/Nov/2014:15:06:43 -0700] "GET /icons/AHPS/0.06.png HTTP/1.1" 123 1234 "http://198.123.123.123/index.html" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/1234567 Firefox/33.0"

153.12.123.123 - - [13/Nov/2014:15:06:43 -0700] "GET /icons/AHPS/0.06.png HTTP/1.1" 123 1234 "http://abc.weatherabc.org/?Center=38.123456789" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/1234556 Firefox/33.0"

I am currently using this expression on the command line:

[^\"]*\"[^\"]*\"[^\"]*\"([^\"]*)\"

and it produces these as results:

http://198.123.123.123/index.html

http://abc.weatherabc.org/?Center=38.123456789

However, I want a regular expression that produces only this portion:

http://198.123.123.123/

http://abc.weatherabc.org/

or

http://198.123.123.123

http://abc.weatherabc.org

Please help. Thanks in advance!

"(http://[^/]+)

This searches for the literal http://, which both URLs have in common, and stops at the first / that follows it.

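  • " matches the literal double quote that opens the quoted field.
  • http:// matches the literal text http://.
  • [^/]+ matches one or more characters that are not /, so the match stops just before the first / after the host.
  • The parentheses form a capturing group, which is what extracts the required data. The " is outside the parentheses, so it is matched but not captured.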
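Since the question title mentions Python, here is a minimal sketch of how the pattern above could be applied with the standard re module. The two sample lines are copied from the question; the names LOG_LINES, PATTERN and extract_base_url are made up for this illustration.

```python
import re

# Sample log lines copied from the question.
LOG_LINES = [
    '153.12.123.123 - - [13/Nov/2014:15:06:43 -0700] '
    '"GET /icons/AHPS/0.06.png HTTP/1.1" 123 1234 '
    '"http://198.123.123.123/index.html" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/1234567 Firefox/33.0"',
    '153.12.123.123 - - [13/Nov/2014:15:06:43 -0700] '
    '"GET /icons/AHPS/0.06.png HTTP/1.1" 123 1234 '
    '"http://abc.weatherabc.org/?Center=38.123456789" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/1234556 Firefox/33.0"',
]

# A literal quote, then a group capturing "http://" plus everything
# up to (but not including) the next "/".
PATTERN = re.compile(r'"(http://[^/]+)')


def extract_base_url(line):
    """Return the captured scheme-and-host part, or None if nothing matches."""
    match = PATTERN.search(line)
    return match.group(1) if match else None


for line in LOG_LINES:
    print(extract_base_url(line))
```

For the two sample lines this prints http://198.123.123.123 and http://abc.weatherabc.org. The "GET ... field is skipped because the quote there is not followed by http://.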

If you need the / at the end, just add it inside the group:

"(http://[^/]+/)
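One edge case to be aware of: if the quoted URL has no path at all (for example "http://example.com"), [^/]+ will run past the closing quote until it meets the next / somewhere later in the line. The sketch below is one possible variant, not part of the original answer: it also excludes " from the character class and makes the trailing / optional. The names SAFE_PATTERN and extract_base_url_safe, the keep_slash parameter and the example.com line are assumptions made up for illustration.

```python
import re

# Variant of the answer's pattern (an assumption, not from the original post):
# [^/"]+ stops at either a slash or the closing quote, and /? captures the
# trailing slash only when one is present.
SAFE_PATTERN = re.compile(r'"(http://[^/"]+/?)')


def extract_base_url_safe(line, keep_slash=True):
    """Return scheme and host, with or without the trailing slash."""
    match = SAFE_PATTERN.search(line)
    if match is None:
        return None
    url = match.group(1)
    return url if keep_slash else url.rstrip("/")


# Hypothetical log line whose quoted URL has no path.
no_path = '1.2.3.4 - - [x] "GET / HTTP/1.1" 200 1 "http://example.com" "Mozilla/5.0"'
with_path = '1.2.3.4 - - [x] "GET / HTTP/1.1" 200 1 "http://198.123.123.123/index.html" "Mozilla/5.0"'

print(extract_base_url_safe(no_path))
print(extract_base_url_safe(with_path))
print(extract_base_url_safe(with_path, keep_slash=False))
```

If every line in your log is known to have a / after the host, the simpler pattern above is enough.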


 