
splitting URL regex after 4th slash

I'm trying to split a URL into chunks. What I want is everything up to and including the 5th /.

I've tried looking around but I'm new to regex and I'm getting overwhelmed a bit.

An example URL is:

http://daniel.mirimar.net.nz/Sites/reginald/DDD/CD

So what I'd like from here is: http://daniel.mirimar.net.nz/Sites/reginald/

How can I do this?

Short and concise is always nice

(?:.+?/){4}
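  • (?: -- open a non-capturing group
  • .+?/ -- lazily match one or more characters up to the next /
  • ) -- close the non-capturing group
  • {4} -- repeat four times

Because .+? needs at least one character, the second repetition cannot stop at the second slash of // and instead runs to the slash after the host, so four repetitions end at the fifth /.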
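As a minimal sketch of how this pattern might be applied in Java: Matcher.find() locates the first match and group() returns it. The class name LazyGroupDemo and the helper prefix are illustrative assumptions, not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyGroupDemo {
    // The pattern from the answer: four lazy "one or more characters, then /" groups.
    private static final Pattern FIRST_FOUR = Pattern.compile("(?:.+?/){4}");

    // Hypothetical helper: returns the matched prefix, or "" when the input
    // does not contain enough slashes for four repetitions.
    static String prefix(String url) {
        Matcher m = FIRST_FOUR.matcher(url);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        // prints http://daniel.mirimar.net.nz/Sites/reginald/
        System.out.println(prefix("http://daniel.mirimar.net.nz/Sites/reginald/DDD/CD"));
    }
}
```

The pattern is not anchored, so find() is used rather than matches(), which would require the whole URL to match.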

Use a regex like this (it matches up to, but not including, the slash after reginald; append \/ to include it):

^.*?\/\/[^\/]*\/[^\/]*\/[^\/]*

or

^.*?\/(\/[^\/]*){3}

To stop at line breaks (CR/LF) and also accept URLs with fewer parts:

^.*?\/(\/[^\/\n\r]*){1,3}

You can be more specific and require an http or https scheme:

^https?:\/(\/[^\/\n\r]*){1,3}
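To see how these variants behave, here is a rough Java sketch that runs two of them against the example URL and against a shorter one. The class name SegmentRegexDemo, the helper firstMatch, the appended \/ and the example.com URL are assumptions for illustration, not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SegmentRegexDemo {
    // Patterns copied from the answer; in Java source each backslash is doubled.
    static final String FIXED = "^.*?\\/(\\/[^\\/]*){3}";
    static final String FLEXIBLE = "^https?:\\/(\\/[^\\/\\n\\r]*){1,3}";

    // Hypothetical helper: returns the first match of regex in input, or "" if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        String url = "http://daniel.mirimar.net.nz/Sites/reginald/DDD/CD";
        // prints http://daniel.mirimar.net.nz/Sites/reginald
        System.out.println(firstMatch(FIXED, url));
        // With \/ appended, prints http://daniel.mirimar.net.nz/Sites/reginald/
        System.out.println(firstMatch(FIXED + "\\/", url));
        // A URL with fewer parts still matches: prints http://example.com/a
        System.out.println(firstMatch(FLEXIBLE, "http://example.com/a"));
    }
}
```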

Sometimes regex can be a little overwhelming, especially if you're not familiar with it. It can even make code more difficult to read (see "Disadvantages of using Regular Expressions"). Now, don't get me wrong, I like to use regex when the task is simple enough for it. IMO, you're better off solving this without regex. You can write a method that finds the index of the 5th "/" and then returns the substring up to and including it.

Something like:

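public class UrlPrefix {
    public static void main(String[] args) {
        String url = "http://daniel.mirimar.net.nz/Sites/reginald/DDD/CD";
        System.out.println(substringNthOccurrence(url, '/', 5));
    }

    // Returns the prefix of string up to and including the nth occurrence of c,
    // or "" if n <= 0 or c occurs fewer than n times.
    public static String substringNthOccurrence(String string, char c, int n) {
        if (n <= 0) {
            return "";
        }

        // Start at -1 so the first search begins at index 0; starting at 0
        // would skip an occurrence of c in the very first position.
        int index = -1;
        do {
            index = string.indexOf(c, index + 1);
        } while (--n > 0 && index != -1);
        return index > -1 ? string.substring(0, index + 1) : "";
    }
}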

Results:

http://daniel.mirimar.net.nz/Sites/reginald/


 