
Get the part of a string that is not HTML in Java

In my Java application I have Strings that have to be edited. The problem is that these Strings can contain HTML tags/elements, which should not be edited (there is no id to retrieve an element by).

Scenario (prefix the text with -):

String a = "<span> <table> </table>  </span> <div></div> <div> text 2</div>";
should become: <span> <table> </table>  </span> <div></div> <div> -text 2</div>  

String b = "text";
should become: -text

String c = "<p> t </p>";
should become: <p> -t </p>  

My question is: how can I retrieve the text in a string that can contain HTML tags? (I cannot add an id or class.)

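You can use an XML parsing library (or an HTML parser, if the markup is not well-formed XML) to walk the nodes and change only their text.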

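String newText = null;
// document is the parsed input; Node, nodes() and text() are placeholders, not a real library API
for ( Node node : document.nodes() ) {
  // prefix only nodes that carry text; element tags are left untouched
  if ( node.text() != null ) newText = "-" + node.text();
}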

Note that this is pseudocode.

newText will now be -text, or whatever the text of the last node is, prefixed with -.
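For the three strings in the question, a rough sketch that needs no extra library is shown here. It swaps the parser for a regular expression, so it is not the approach described above, only a stand-in for simple input. The class name TextPrefixer and the method prefixText are made up for illustration. It assumes that every run of text between tags should get the prefix, and that the character > never appears inside text or attribute values; for real-world HTML a proper parser is the safer choice.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class TextPrefixer {

    // Group 1: start of input or the '>' that closes a tag.
    // Group 2: any whitespace that follows it.
    // Group 3: the first character of a text run (not '<' and not whitespace).
    private static final Pattern TEXT_START =
            Pattern.compile("(^|>)(\\s*)([^<\\s])");

    // Inserts prefix in front of the first visible character of every text run.
    // Assumption: '>' only ever appears as the end of a tag.
    static String prefixText(String html, String prefix) {
        String replacement = "$1$2" + Matcher.quoteReplacement(prefix) + "$3";
        return TEXT_START.matcher(html).replaceAll(replacement);
    }

    public static void main(String[] args) {
        String a = "<span> <table> </table>  </span> <div></div> <div> text 2</div>";
        String b = "text";
        String c = "<p> t </p>";
        System.out.println(prefixText(a, "-")); // <span> <table> </table>  </span> <div></div> <div> -text 2</div>
        System.out.println(prefixText(b, "-")); // -text
        System.out.println(prefixText(c, "-")); // <p> -t </p>
    }
}
```

Whitespace-only gaps between tags are left alone because the pattern requires a non-whitespace character before the next <. A string with several text runs gets the prefix on each of them.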

EDIT: Your question is a bit ambiguous about what "the text can contain HTML elements" means.
If a string doesn't contain HTML tags, you cannot use an XML parser on it, which raises the question: if it doesn't contain tags, why can't you just do...

String newString = "-" + a;
