Regex for matching portions of URL

I need help creating a regex in Java to match specific portions of a URL: the protocol, the hostname, and the port.

For example, given the URL http://hq.dev.test.domain:8080/ip/CreateRegex, I would like to pull out the following:

[Protocol]=http
[Hostname]=hq.dev.test.domain
[Port]=8080

Here is what I currently have, which works only for the protocol. I'll take any improvements to this regex as well.

var getProtocol= ^((http)?:\\/\\/).*\\w*(CreateRegex)$

Use the java.net.URL class for parsing:

URL url = new URL("https://google.com:443/search");
System.out.println(url.getProtocol()); // https
System.out.println(url.getHost()); // google.com
System.out.println(url.getPort()); // 443

Is there any particular reason to use a regex in your case? If not, you should use URL to parse the string.

URL url = new URL("http://hq.dev.test.domain:8080/ip/CreateRegex");
System.out.println("[Protocol]=" + url.getProtocol());
System.out.println("[Hostname]=" + url.getHost());
System.out.println("[Port]=" + url.getPort());
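One thing to watch for with this approach: getPort() returns -1 when the URL does not spell out a port, while getDefaultPort() gives the default for the protocol (80 for http). Here is a minimal sketch of handling both cases. The class name UrlPortExample and the helper effectivePort are made up for illustration, and going through URI.create(...).toURL() is my own choice, since the URL constructors are deprecated as of Java 20.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper class, not part of the original answer.
final class UrlPortExample {

    // Returns the port written in the URL, or the protocol's default port
    // when the URL does not contain an explicit one.
    static int effectivePort(String spec) {
        try {
            URL url = URI.create(spec).toURL();
            int port = url.getPort(); // -1 when no port is given in the URL
            return port != -1 ? port : url.getDefaultPort();
        } catch (MalformedURLException e) {
            // Wrap the checked exception so callers need no throws clause.
            throw new IllegalArgumentException("Not a usable URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(effectivePort("http://hq.dev.test.domain:8080/ip/CreateRegex")); // 8080
        System.out.println(effectivePort("http://hq.dev.test.domain/ip/CreateRegex"));      // 80
    }
}
```

Whether you want to fall back to the default port or report "no port" depends on what you do with the value afterwards.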

For the sake of completeness, though, here is how you'd go about it with a regex:

Pattern regex = Pattern.compile("^(\\w+)://([^:/]*)(?::(\\d+))?.*");
Matcher regexMatcher = regex.matcher("http://hq.dev.test.domain:8080/ip/CreateRegex");
if (regexMatcher.find()) {
    System.out.println("[Protocol]=" + regexMatcher.group(1));
    System.out.println("[Hostname]=" + regexMatcher.group(2));
    System.out.println("[Port]=" + regexMatcher.group(3)); // group(3) is null if the URL has no port
}
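Because the port group is optional, group(3) comes back as null for a URL without a port. Below is a self-contained sketch that reuses the same pattern and handles that case. The class name UrlRegexExample, the describe method, and the "(none)" placeholder are assumptions of mine for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern shown above.
final class UrlRegexExample {

    // Group 1: protocol, group 2: hostname, group 3: optional port digits.
    private static final Pattern URL_PARTS =
            Pattern.compile("^(\\w+)://([^:/]*)(?::(\\d+))?.*");

    // Formats the three parts on one line, or returns "no match".
    static String describe(String url) {
        Matcher m = URL_PARTS.matcher(url);
        if (!m.matches()) {
            return "no match";
        }
        // The port group did not participate in the match if it is null.
        String port = m.group(3) != null ? m.group(3) : "(none)";
        return "[Protocol]=" + m.group(1)
                + " [Hostname]=" + m.group(2)
                + " [Port]=" + port;
    }

    public static void main(String[] args) {
        System.out.println(describe("http://hq.dev.test.domain:8080/ip/CreateRegex"));
        System.out.println(describe("http://hq.dev.test.domain/ip/CreateRegex"));
    }
}
```

Keep in mind that this pattern is deliberately loose: it does not deal with user info (user@host) or IPv6 literals, which is another reason to prefer the URL class when you can.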
