Replace text in HTML within multiple tags using Jsoup in Java

I am reading an HTML file line by line in Java. Suppose I have the following line of HTML:

<p> Hi everyone. This is a <em>dead end.</em> Do not go!</p>

I want to change the text in the line to:

<p> Hi everyone. This is not a <em>dead end.</em>You may go!</p>

The inputs will be given as:

  • Change From: This is a dead end. Do not go!
  • Change To: This is not a dead end. You may go!

How can I do this without disturbing the HTML tags, using Jsoup or any other method in Java? Please help.

As an alternative to MCL's solution, here is a fully Jsoup-based one:

First, here's how Jsoup sees your HTML:

org.jsoup.nodes.TextNode:    Hi everyone. This is a 
org.jsoup.nodes.Element:    <em>dead end.</em>
org.jsoup.nodes.TextNode:    Do not go!

All three nodes are children of the <p>...</p> element.
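
A listing like the one above can be produced by walking the direct children of the <p> element. This is only a minimal sketch and assumes Jsoup is on the classpath; the class name ChildNodeDump and the helper describeChildren are made up for illustration, and the exact whitespace in the printed HTML may vary between Jsoup versions.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of Jsoup.
public class ChildNodeDump {

    // Returns one "ClassName: html" line per direct child of the first <p>.
    static List<String> describeChildren(String html) {
        Document doc = Jsoup.parseBodyFragment(html);
        Element p = doc.select("p").first();
        List<String> lines = new ArrayList<>();
        for (Node child : p.childNodes()) {
            lines.add(child.getClass().getName() + ": " + child.outerHtml());
        }
        return lines;
    }

    public static void main(String[] args) {
        String html = "<p> Hi everyone. This is a <em>dead end.</em> Do not go!</p>";
        describeChildren(html).forEach(System.out::println);
    }
}
```

The point to notice is that the text before and after <em> lives in two separate TextNode objects, so a phrase that spans the tag never appears in a single node.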

And here's the (very verbose) code:

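import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

public class ReplaceText {
    public static void main(String[] args) {
        final String html = "<p> Hi everyone. This is a <em>dead end.</em> Do not go!</p>";

        Document doc = Jsoup.parseBodyFragment(html); // Parse the HTML into a document
        Element pTag = doc.select("p").first();       // Select the p element (there is just one)

        // Text before the em tag: childNode(0) is the first TextNode
        TextNode preEM = (TextNode) pTag.childNode(0);
        preEM.text(preEM.text().replace("This is a", "This is not a"));

        // Text after the em tag: childNode(2) is the second TextNode
        TextNode postEM = (TextNode) pTag.childNode(2);
        postEM.text("You may go!");

        System.out.println(pTag); // Print the result
    }
}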

Output:

<p> Hi everyone. This is not a <em>dead end.</em>You may go!</p>

This keeps all HTML formatting intact and also works on full documents.
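
If you would rather not hard-code child indexes, one option is to visit every text node and apply the replacements there. The following is a rough sketch under a few assumptions: Jsoup is on the classpath, the class TextNodeReplacer and its methods are hypothetical names, and the caller has already split "Change From" / "Change To" into fragments that each fit inside a single text node (a phrase that spans the <em> tag cannot match any one node).

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;

import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of Jsoup.
public class TextNodeReplacer {

    // Applies each from -> to pair to every text node under root.
    // Only text is rewritten; elements such as <em> are left alone.
    static void replaceInTextNodes(Element root, Map<String, String> replacements) {
        for (Element el : root.getAllElements()) {      // root plus all descendants
            for (TextNode node : el.textNodes()) {      // direct text children of el
                String text = node.getWholeText();      // raw text, whitespace preserved
                for (Map.Entry<String, String> r : replacements.entrySet()) {
                    text = text.replace(r.getKey(), r.getValue());
                }
                node.text(text);
            }
        }
    }

    // Convenience wrapper: parse a fragment, replace, return the body HTML.
    static String replaceInHtml(String html, Map<String, String> replacements) {
        Document doc = Jsoup.parseBodyFragment(html);
        replaceInTextNodes(doc.body(), replacements);
        return doc.body().html();
    }

    public static void main(String[] args) {
        // Fragments chosen by hand so that each fits inside one text node.
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put("This is a", "This is not a");
        replacements.put("Do not go!", "You may go!");

        String html = "<p> Hi everyone. This is a <em>dead end.</em> Do not go!</p>";
        System.out.println(replaceInHtml(html, replacements));
    }
}
```

Unlike the verbose version, this variant replaces only the matched fragment, so the space between </em> and "You may go!" is kept. Be aware that every pair is applied to every text node, so a fragment that occurs elsewhere in the document is replaced there too.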
