
How to do a replace only on parts of string matching my regex pattern?

Let's say I have this HTML code in a String variable:

String htmlCode = "<span class='test'>test</span>"
        + "<a href=\"http://foo.com?id=<span class='test'>test</span>\">link</a>";

The htmlCode variable would contain more links like that one, as well as more spans like that one.

I want to remove everything between <span and </span>, including the span tags themselves, but only when they occur inside an <a href tag. That is, I want to leave the first span alone and replace only the second one.

I know that regex can do that, but so far I have only managed this:

htmlCode = htmlCode.replaceAll("<span.*?</span>", "");

But how do I specify that the replacement should happen only when the span occurs inside the <a> tag? And is there a way to replace the span tags themselves along with their content?

If I understand your question correctly, you want to remove the span tags from the href value of your a tags.

In that case you can try something like this, which uses jsoup:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

String htmlCode = "<span class='test'>test</span>"
        + "<a href=\"http://foo.com?id=<span class='test'>test</span>\">link</a>"
        + "<a href=\"http://foo.com?id=test2\">link</a>";
Document doc = Jsoup.parse(htmlCode);
System.out.println(doc); // document before the change

// Select every <a> whose href attribute contains "<span"
for (Element el : doc.select("a[href*=<span]")) {
    // Parse the href value as HTML and keep only its text, which drops the span tags:
    // "http://foo.com?id=<span class='test'>test</span>" becomes "http://foo.com?id=test"
    el.attr("href", Jsoup.parse(el.attr("href")).text());
}

System.out.println("-----");
System.out.println(doc); // document after the change

Output (before/after):

<html>
 <head></head>
 <body>
  <span class="test">test</span>
  <a href="http://foo.com?id=<span class='test'>test</span>">link</a>
  <a href="http://foo.com?id=test2">link</a>
 </body>
</html>
-----
<html>
 <head></head>
 <body>
  <span class="test">test</span>
  <a href="http://foo.com?id=test">link</a>
  <a href="http://foo.com?id=test2">link</a>
 </body>
</html>
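
If you would rather stay with plain regex, as the question asks, one possible approach is to match each href value first and run the span replacement only inside that value. The following is a minimal sketch, not a general HTML parser: the class name HrefSpanStripper and the method removeSpansInsideHref are made up for illustration, and it assumes that href values are double-quoted, contain no double quotes themselves, and that spans are not nested and do not span multiple lines. Note that, unlike the jsoup version above, it removes the span together with its content, so the href ends in "id=" rather than "id=test".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; the names are not from the original posts.
final class HrefSpanStripper {

    // Double-quoted href attribute: group 1 = href=", group 2 = value, group 3 = closing quote.
    private static final Pattern HREF = Pattern.compile("(href=\")([^\"]*)(\")");

    // A span element including its tags (assumes no nested spans, no line breaks).
    private static final Pattern SPAN = Pattern.compile("<span\\b[^>]*>.*?</span>");

    static String removeSpansInsideHref(String html) {
        Matcher m = HREF.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Strip spans only within the matched href value.
            String cleaned = SPAN.matcher(m.group(2)).replaceAll("");
            // quoteReplacement keeps any '$' or '\' in the URL literal.
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1) + cleaned + m.group(3)));
        }
        m.appendTail(out); // copy the text after the last href unchanged
        return out.toString();
    }

    public static void main(String[] args) {
        String htmlCode = "<span class='test'>test</span>"
                + "<a href=\"http://foo.com?id=<span class='test'>test</span>\">link</a>";
        // The first span is kept; only the one inside the href is removed.
        System.out.println(removeSpansInsideHref(htmlCode));
    }
}
```

Spans outside an href are never touched because the inner replaceAll only ever sees the text captured in group 2. For HTML that is not this regular (single-quoted or unquoted attributes, nested spans), a parser such as jsoup is likely the safer choice.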
