
Is the standard library my best option in Java to load/read, edit/modify, and save an HTML file with no reformatting?

I want to load/read, edit/modify, and save an HTML file located on my hard drive. I tried jsoup, but it kept reformatting the HTML file. I want to avoid reformatting.

I want to inject some JavaScript after the <script> tag and before var deviceReady = false; in the HTML file.

Do I need to parse the file?

Should I use default Java? (BufferedReader, FileReader, Scanner)

<!DOCTYPE html>
<html lang="en">
<head>
<meta name='viewport' content='initial-scale = 1, minimum-scale = 1, maximum-scale = 1'/>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta http-equiv="x-ua-compatible" content="IE=10">
<title>LX-XXX-KU</title>
<style type="text/css">#initialLoading{background:url(assets/htmlimages/loader.gif) no-repeat center center;background-color:#ffffff;position:absolute;margin:auto;top:0;left:0;right:0;bottom:0;z-index:10010;}</style>

"

<script>

var deviceReady = false;
var initCalled = false ;
var initialized = false;

function onBodyLoad()
{
 if(typeof window.device === 'undefined')
{
    document.addEventListener("deviceready", onDeviceReady, false);
}
else
 {
    onDeviceReady();
 }
}

JavaScript I want to add after the <script> tag and before var deviceReady = false;

`//adds numbers to TOC
window.addEventListener( 'moduleReadyEvent', function ( e )
{
var myText = document.getElementsByClassName('tocText');

for ( var i = 0; i < myText.length; i++ )
{
var getText = myText[ i ].childNodes;
var str = ( i + 1 ) + ' ' + getText[ 0 ].innerHTML;
getText[ 0 ].innerHTML = str;
}
});`

This can be accomplished like so:

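// Needs java.io.File, java.nio.file.Files and java.nio.charset.StandardCharsets.
File f = ...;
// Files.readAllBytes takes a Path, so convert the File; decode as UTF-8, the charset the page declares.
String contents = new String(Files.readAllBytes(f.toPath()), StandardCharsets.UTF_8);
int idx = contents.indexOf(insertBeforeStr);
if (idx < 0) throw new IllegalStateException("insertBeforeStr not found");
// Keep everything from idx onward; substring(idx + 1) would drop the first character of insertBeforeStr.
contents = contents.substring(0, idx) + contentToBeAdded + contents.substring(idx);

// write contents back to the disk.
Files.write(f.toPath(), contents.getBytes(StandardCharsets.UTF_8));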
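For reference, a self-contained sketch of the same idea using only the standard library. The class name ScriptInjector, its method names, and the temp-file demo in main are hypothetical and chosen for illustration; it also assumes the file is UTF-8, as the page's meta tag declares. Because the file is treated as one string and only the insertion point changes, the rest of the markup is written back byte for byte.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, not part of any library.
final class ScriptInjector {

    /** Returns contents with addition placed immediately before the first occurrence of marker. */
    static String insertBefore(String contents, String marker, String addition) {
        int idx = contents.indexOf(marker);
        if (idx < 0) {
            throw new IllegalArgumentException("Marker not found: " + marker);
        }
        // substring(idx) keeps the marker itself intact.
        return contents.substring(0, idx) + addition + contents.substring(idx);
    }

    /** Reads the file as UTF-8 (an assumption), inserts the addition, and writes the result back. */
    static void injectIntoFile(Path file, String marker, String addition) throws IOException {
        String contents = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        String updated = insertBefore(contents, marker, addition);
        Files.write(file, updated.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Demo on a temporary file so nothing on disk is overwritten by accident.
        Path file = Files.createTempFile("page", ".html");
        Files.write(file, "<script>\n\nvar deviceReady = false;\n".getBytes(StandardCharsets.UTF_8));
        injectIntoFile(file, "var deviceReady = false;", "// injected code goes here\n");
        System.out.print(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```

Throwing when the marker is missing is a design choice: it avoids silently writing back a file that was never modified as intended.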

If you turn off jsoup's pretty printing option, and use the XML parser instead of the validating HTML parser, the document and all of its text, including whitespace, is passed through verbatim and pretty much unmolested, other than syntax fixes for attributes, missing end tags, and the like.

See, for example, your input on Try jsoup: with pretty-printing off and the XML parser selected, the output is effectively the same as your original.

The code would be something like:

Document doc = Jsoup.parse("<script>\nSomething(); ", "", Parser.xmlParser());
doc.outputSettings().prettyPrint(false);

Element scriptEl = doc.selectFirst("script");
DataNode scriptData = scriptEl.dataNodes().get(0);
scriptData.setWholeData(scriptData.getWholeData() + "\nanotherFunction();");

System.out.println(doc.html());

Gives us (note that there's no HTML structure automatically created, due to using the XML parser):

<script>
Something(); 
anotherFunction();</script>

ControlAltDel's answer definitely works and means you can do it with just the Java base library. The benefit of using jsoup in this case (IMHO, as the author of jsoup) is that you're not trying to string-match HTML, and won't get caught by e.g. a <script> in a comment, or in this case a missing closing </script> tag, etc. But of course YMMV.
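If you stay with plain string matching, one cheap guard against that kind of surprise is to refuse to edit unless the marker occurs exactly once. The sketch below is an illustration only: the class MarkerCheck and its methods are hypothetical names, and the check does not make string matching understand HTML; it only turns an ambiguous match into an error.

```java
// Hypothetical helper class, not part of any library.
final class MarkerCheck {

    /** Counts non-overlapping occurrences of marker in contents. */
    static int countOccurrences(String contents, String marker) {
        if (marker.isEmpty()) {
            throw new IllegalArgumentException("marker must not be empty");
        }
        int count = 0;
        int from = 0;
        while ((from = contents.indexOf(marker, from)) >= 0) {
            count++;
            from += marker.length();
        }
        return count;
    }

    /** Inserts addition before marker, but only if marker appears exactly once. */
    static String insertBeforeUnique(String contents, String marker, String addition) {
        int n = countOccurrences(contents, marker);
        if (n != 1) {
            throw new IllegalStateException("Expected exactly one '" + marker + "', found " + n);
        }
        int idx = contents.indexOf(marker);
        return contents.substring(0, idx) + addition + contents.substring(idx);
    }

    public static void main(String[] args) {
        String page = "<!-- <script> in a comment -->\n<script>\n\nvar deviceReady = false;\n";
        System.out.println(countOccurrences(page, "<script>"));                 // prints 2
        System.out.println(countOccurrences(page, "var deviceReady = false;")); // prints 1
        System.out.print(insertBeforeUnique(page, "var deviceReady = false;", "// injected\n"));
    }
}
```

Here `<script>` is ambiguous because of the comment, while `var deviceReady = false;` is unique, which is why anchoring on the variable declaration is the safer choice for this particular file.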

Incidentally, once jsoup 1.14.1 is released (soon!) with change #1419 (which, for script elements, proxies text settings into data without escaping), the code will simplify to:

Element scriptEl = doc.selectFirst("script");
scriptEl.appendText("\nanotherFunction()");

