
Conditional replaceAll in Java

I have HTML code with img src attributes pointing to URLs. Some have mysite.com/myimage.png as src; others have mysite.com/1234/12/12/myimage.png. I want to replace these URLs with a cache file path. I'm looking for something like this:

String website = "mysite.com";
String text = webContent.replaceAll(website+ "\\d{4}\\/\\d{2}\\/\\d{2}", String.valueOf(cacheDir));

However, this code does not work when the URL does not have the extra date stamp at the end. Does anyone know how I might achieve this? Thanks!

Try this one

mysite\.com/(\d{4}/\d{2}/\d{2}/)?

Here, ? makes the preceding group optional, i.e. it matches zero or one occurrence.


Note: escape the dot (\. in the regex, written as \\. in a Java string literal) so that it matches a literal dot; an unescaped . matches any character in a regex.

Sample code :

String[] webContents = new String[] { "mysite.com/myimage.png",
        "mysite.com/1234/12/12/myimage.png" };

for (String webContent : webContents) {
    // The date segment is wrapped in an optional group, so both URL forms match.
    String text = webContent.replaceAll("mysite\\.com/(\\d{4}/\\d{2}/\\d{2}/)?",
            "mysite.com/abc/");
    System.out.println(text);
}

output:

mysite.com/abc/myimage.png
mysite.com/abc/myimage.png
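
As a minimal sketch of how the optional group could be applied to the original question, the version below builds the pattern from a website variable and replaces the match with a cache directory. The class name CacheUrlRewrite, the method rewrite, and the cache path /data/cache are assumptions made for illustration, not part of the original post. Matcher.quoteReplacement is used because replaceAll treats \ and $ in the replacement string specially, which matters if the cache path contains backslashes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CacheUrlRewrite {
    // Replaces "<website>/" plus an optional "dddd/dd/dd/" segment with "<cacheDir>/".
    // Hypothetical helper; the names are illustrative only.
    static String rewrite(String html, String website, String cacheDir) {
        // Pattern.quote makes the dot in the host name match literally.
        String regex = Pattern.quote(website) + "/(\\d{4}/\\d{2}/\\d{2}/)?";
        // quoteReplacement keeps '\' and '$' in the path from being interpreted.
        return html.replaceAll(regex, Matcher.quoteReplacement(cacheDir + "/"));
    }

    public static void main(String[] args) {
        String html = "<img src=\"mysite.com/myimage.png\"> "
                + "<img src=\"mysite.com/1234/12/12/myimage.png\">";
        // Both src values become /data/cache/myimage.png
        System.out.println(rewrite(html, "mysite.com", "/data/cache"));
    }
}
```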


You are missing a forward slash between mysite.com and the first four digits.

String text = webContent.replaceAll(Pattern.quote(website) + "/\\d{4}\\/\\d{2}\\/\\d{2}", String.valueOf(cacheDir));

I'd also recommend treating your website value as a literal (the Pattern.quote part), so that the dot in mysite.com is not interpreted as a regex wildcard.

Finally, you are also missing the forward slash after the last two digits, so it won't be replaced, but that may be on purpose.
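
To illustrate why Pattern.quote matters here, the following sketch compares the quoted and unquoted host against a look-alike URL. The class QuoteDemo, its helper methods, the host mysiteXcom, and the cache path /cache are all hypothetical and exist only for this demonstration.

```java
import java.util.regex.Pattern;

class QuoteDemo {
    // Date-stamp part of the URL, with the leading slash included.
    static final String DATE_PATH = "/\\d{4}/\\d{2}/\\d{2}";

    // Host used as a raw regex: '.' matches any character.
    static String replaceUnquoted(String text, String website, String cacheDir) {
        return text.replaceAll(website + DATE_PATH, cacheDir);
    }

    // Host wrapped in Pattern.quote: every character is matched literally.
    static String replaceQuoted(String text, String website, String cacheDir) {
        return text.replaceAll(Pattern.quote(website) + DATE_PATH, cacheDir);
    }

    public static void main(String[] args) {
        String lookalike = "mysiteXcom/1234/12/12/myimage.png";
        // Unquoted host also matches the look-alike: prints /cache/myimage.png
        System.out.println(replaceUnquoted(lookalike, "mysite.com", "/cache"));
        // Quoted host does not match, so the input is printed unchanged
        System.out.println(replaceQuoted(lookalike, "mysite.com", "/cache"));
    }
}
```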

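Try a lookbehind and a lookahead, which replace everything between the website name and the last forward slash on the line (the website name itself is kept):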

String text = webContent.replaceAll("(?<="+website+")(.*)(?=\\/)", 
                                    String.valueOf(cacheDir));


 