How to write UTF-8 data to an XML file using RandomAccessFile?
When trying to write some UTF-8 data to a file, I end up with garbage in the file. The code is as follows:
public static boolean saveToFile(StringBuffer buffer,
                                 String fileName,
                                 ArrayList exceptionList,
                                 String className)
{
    log.debug("In saveToFile for file [" + fileName + "]");
    RandomAccessFile raf = null;
    File file = new File(fileName);
    File backupFile = new File(fileName + "_bck");
    try
    {
        if (file.exists())
        {
            if (backupFile.exists())
            {
                backupFile.delete();
            }
            file.renameTo(backupFile);
        }
        raf = new RandomAccessFile(file, "rw");
        raf.writeBytes(buffer.toString());
        raf.close();
The output of buffer.toString() is:
<?xml version="1.0" encoding="UTF-8"?>
<ivr>
<version>1.1</version>
<templateName>αβγδεζη
The data in the file, however, is:
<?xml version="1.0" encoding="UTF-8"?>
<ivr>
<version>1.1</version>
<templateName>▒▒▒▒▒▒▒</templateName>
How can I make sure that the data in the file itself is UTF-8?
I'm not surprised you get garbage:
raf.writeBytes(buffer.toString())
The documentation for RandomAccessFile.writeBytes(String) says (emphasis added):
Writes the string to the file as a sequence of bytes. Each character in the string is written out, in sequence, by discarding its high eight bits.
In a few circumstances (text made up only of characters below U+0080, for example), that operation will result in a correctly encoded file. But in most it won't. That writeBytes() method is a foolish design by the Java developers. You need to correctly encode your text as bytes in UTF-8, and then write those bytes.
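To make the "discarding its high eight bits" behaviour concrete, here is a minimal sketch that writes the same three Greek letters both ways and compares the result. The class and helper names (WriteBytesDemo, viaWriteBytes, viaUtf8) are made up for illustration, and temporary files stand in for the real XML file.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class WriteBytesDemo {

    // Writes the text with writeBytes(), which keeps only the low 8 bits of each char.
    static byte[] viaWriteBytes(String text) throws IOException {
        File tmp = File.createTempFile("writeBytes", ".bin");
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
            raf.writeBytes(text);
        }
        return readAndDelete(tmp);
    }

    // Encodes the text as UTF-8 first, then writes the resulting bytes.
    static byte[] viaUtf8(String text) throws IOException {
        File tmp = File.createTempFile("utf8", ".bin");
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
            raf.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return readAndDelete(tmp);
    }

    // Reads the whole temporary file back, then removes it.
    private static byte[] readAndDelete(File f) throws IOException {
        try {
            return Files.readAllBytes(f.toPath());
        } finally {
            f.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "\u03b1\u03b2\u03b3"; // the Greek letters alpha, beta, gamma
        System.out.println(viaWriteBytes(text).length); // 3: one truncated byte per char
        System.out.println(viaUtf8(text).length);       // 6: two UTF-8 bytes per char
    }
}
```

The three bytes produced by writeBytes() are 0xB1, 0xB2 and 0xB3, the low halves of U+03B1 to U+03B3, which is not valid UTF-8 and shows up as garbage in a viewer.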
Do you really need to operate on the file as a random access file? If not, just manipulate it with a Writer wrapping an OutputStream.
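A minimal sketch of that Writer-over-OutputStream approach, assuming the file is always rewritten from scratch (as the backup-and-rename logic in the question suggests). The class name Utf8WriterDemo and the helper saveAsUtf8 are hypothetical, not part of the original code.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8WriterDemo {

    // Replaces the file's content with the text, encoded as UTF-8.
    static void saveAsUtf8(CharSequence text, File file) throws IOException {
        try (Writer out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), StandardCharsets.UTF_8))) {
            out.append(text);
        } // try-with-resources flushes and closes the stream even on failure
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the StringBuffer passed to saveToFile in the question.
        StringBuffer buffer = new StringBuffer("<templateName>\u03b1\u03b2\u03b3</templateName>");
        File file = File.createTempFile("ivr", ".xml");
        saveAsUtf8(buffer, file);
        System.out.println("Wrote " + file.length() + " bytes to " + file);
    }
}
```

Passing the charset explicitly to OutputStreamWriter matters here: without it the platform default encoding is used, which may not match the encoding="UTF-8" declared in the XML prolog.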
You could use Charset.encode(CharBuffer) to produce a ByteBuffer holding the encoded bytes, then write those bytes to the file:
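ByteBuffer bytes = StandardCharsets.UTF_8.encode(CharBuffer.wrap(buffer)); // a StringBuffer is a CharSequence, so it can be wrapped
raf.write(bytes.array(), bytes.arrayOffset() + bytes.position(), bytes.remaining()); // write only the encoded bytes; the backing array may be larger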
The Javadoc for RandomAccessFile says of writeBytes():
Writes the string to the file as a sequence of bytes. Each character in the string is written out, in sequence, by discarding its high eight bits. The write starts at the current position of the file pointer.
Assuming that discarding parts of your String isn't what you want, you should be using writeUTF():
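Writes a string to the file using modified UTF-8 encoding in a machine-independent manner.
Note that writeUTF() first writes a two-byte length and uses modified UTF-8, so it does not produce a plain UTF-8 text file. To write standard UTF-8, encode the string yourself: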
String txt = buffer.toString();
raf.write(txt.getBytes(StandardCharsets.UTF_8));
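The difference between writeUTF() and encoding the string yourself can be seen in a small sketch. The class and helper names (WriteUtfDemo, viaWriteUTF, viaGetBytes) are invented for this illustration, and temporary files are used in place of the real XML file.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

public class WriteUtfDemo {

    // Writes the text with writeUTF(): a 2-byte length, then modified UTF-8.
    static byte[] viaWriteUTF(String text) throws IOException {
        File tmp = File.createTempFile("writeUTF", ".bin");
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
            raf.writeUTF(text);
        }
        return readAndDelete(tmp);
    }

    // Writes the standard UTF-8 encoding of the text and nothing else.
    static byte[] viaGetBytes(String text) throws IOException {
        File tmp = File.createTempFile("getBytes", ".bin");
        try (RandomAccessFile raf = new RandomAccessFile(tmp, "rw")) {
            raf.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return readAndDelete(tmp);
    }

    // Reads the whole temporary file back, then removes it.
    private static byte[] readAndDelete(File f) throws IOException {
        try {
            return Files.readAllBytes(f.toPath());
        } finally {
            f.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "\u03b1\u03b2\u03b3"; // alpha, beta, gamma
        System.out.println(viaWriteUTF(text).length); // 8: 2 length bytes + 6 encoded bytes
        System.out.println(viaGetBytes(text).length); // 6: the UTF-8 bytes only
    }
}
```

An XML parser reading the writeUTF() output would see the two length bytes before the <?xml ...?> prolog, so the getBytes(StandardCharsets.UTF_8) route is the one that yields a file other tools can read as UTF-8.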