Replace text in HTML within multiple tags using Jsoup in Java
I am reading an HTML file line by line using Java. Suppose I have this HTML line:
<p> Hi everyone. This is a <em>dead end.</em> Do not go!</p>
I want to change the text in the line to:
<p> Hi everyone. This is not a <em>dead end.</em>You may go!</p>
The inputs will be given as:
Change from: This is a dead end. Do not go!
Change to: This is not a dead end. You may go!
How can I do this without disturbing the HTML tags, using Jsoup or any other method in Java? Please help.
As an alternative to MCL's solution, here is a fully Jsoup-based one:
First, here's how Jsoup sees your HTML:
org.jsoup.nodes.TextNode: Hi everyone. This is a
org.jsoup.nodes.Element: <em>dead end.</em>
org.jsoup.nodes.TextNode: Do not go!
All three nodes are children of the <p>...</p> element.
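If you want to check that node structure yourself, something along these lines should print it. This is only a minimal sketch: the class name ListChildNodes and the helper describeChildren are made up for illustration, and it assumes Jsoup is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;

public class ListChildNodes {

    // Hypothetical helper: returns one "ClassName: html" line
    // for each direct child of the first <p> element.
    static List<String> describeChildren(String html) {
        Document doc = Jsoup.parseBodyFragment(html);
        Element pTag = doc.select("p").first();

        List<String> lines = new ArrayList<>();
        for (Node child : pTag.childNodes()) {
            lines.add(child.getClass().getName() + ": " + child.outerHtml());
        }
        return lines;
    }

    public static void main(String[] args) {
        String html = "<p> Hi everyone. This is a <em>dead end.</em> Do not go!</p>";
        describeChildren(html).forEach(System.out::println);
    }
}
```

The indexes used with childNode(...) follow this order, so index 0 is the text before the em-tag, index 1 is the em-element itself, and index 2 is the text after it.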
And here's the (very verbose) code:
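import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

final String html = "<p> Hi everyone. This is a <em>dead end.</em> Do not go!</p>";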
Document doc = Jsoup.parseBodyFragment(html); // Parse html into a document
Element pTag = doc.select("p").first(); // Select the p-element (there's just one)
// Text before 'em'-tag
TextNode preEM = (TextNode) pTag.childNode(0);
preEM.text(preEM.text().replace("This is a", "This is not a"));
// Text after 'em'-tag
TextNode postEM = (TextNode) pTag.childNode(2);
postEM.text("You may go!");
System.out.println(pTag); // Print result
Output:
<p> Hi everyone. This is not a <em>dead end.</em>You may go!</p>
This keeps all the HTML formatting intact, and it also works on full documents.
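If you don't want to hard-code the child indexes, one possible generalization is to walk every text node and apply the phrase replacements there. The sketch below is not part of the code above: the class TextNodeReplacer and the method replaceInTextNodes are hypothetical names, and it assumes each phrase sits entirely inside one text node, so the "change from" sentence has to be split at the tag boundaries first.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

public class TextNodeReplacer {

    // Hypothetical helper: applies every replacement to each text node
    // in the fragment and leaves the surrounding tags untouched.
    static String replaceInTextNodes(String html, Map<String, String> replacements) {
        Document doc = Jsoup.parseBodyFragment(html);
        doc.outputSettings().prettyPrint(false); // do not re-indent the output

        // select("*") matches the body itself and all of its descendants
        for (Element element : doc.body().select("*")) {
            for (TextNode textNode : element.textNodes()) {
                String text = textNode.getWholeText(); // raw text, whitespace kept
                for (Map.Entry<String, String> entry : replacements.entrySet()) {
                    text = text.replace(entry.getKey(), entry.getValue());
                }
                textNode.text(text);
            }
        }
        return doc.body().html();
    }

    public static void main(String[] args) {
        // Assumption: the input sentence was already split per text node.
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("This is a", "This is not a");
        replacements.put("Do not go!", "You may go!");

        String html = "<p> Hi everyone. This is a <em>dead end.</em> Do not go!</p>";
        System.out.println(replaceInTextNodes(html, replacements));
    }
}
```

Unlike the code above, this keeps the space after the closing em-tag, because only the matched phrase is replaced. A phrase that crosses a tag boundary (for example the whole sentence "This is a dead end. Do not go!") will not match any single text node, so it needs to be broken up before calling the helper.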