
XML encoding UTF-8 not working for Turkish characters

I have a method that creates an XML file and writes records to it, but it produces corrupted output: my Turkish characters are written as hexadecimal expressions. I am using UTF-8, yet I couldn't solve the problem. By the way, I checked the file in both the Sublime and Notepad++ editors.

public boolean add(BatFile batFile) throws Exception {
        File inputFile = new File(fileLocation);
        DocumentBuilderFactory docFactory = DocumentBuilderFactory
                .newInstance();
        DocumentBuilder docBuilder = docFactory.newDocumentBuilder();
        Document doc = docBuilder.parse(inputFile);

        Element rootElement = doc.getDocumentElement();

        Element batFileElement = doc.createElement("BatFile");
        rootElement.appendChild(batFileElement);

        Element batJobName = doc.createElement("Name");
        batJobName.appendChild(doc.createTextNode(batFile.getName()));
        batFileElement.appendChild(batJobName);

        Element batFileBriefDesc = doc.createElement("BriefDesc");
        batFileBriefDesc
                .appendChild(doc.createTextNode(batFile.getBriefDesc()));
        batFileElement.appendChild(batFileBriefDesc);

        Element batFileDesc = doc.createElement("Desc");
        batFileDesc.appendChild(doc.createTextNode(batFile.getDesc()));
        batFileElement.appendChild(batFileDesc);

        Element batFileName = doc.createElement("FileName");
        batFileName.appendChild(doc.createTextNode(batFile.getFileName()));
        batFileElement.appendChild(batFileName);

        Element batCommandArgs = doc.createElement("CommandArgs");

        for (int k = 0; k < batFile.getCommandArgs().size(); k++) {
            Element commandArg = doc.createElement("CommandArg");
            // commandArg.setAttribute("ID", String.valueOf(k));
            commandArg.appendChild(doc.createTextNode(batFile.getCommandArgs()
                    .get(k)));
            batCommandArgs.appendChild(commandArg);

        }
        batFileElement.appendChild(batCommandArgs);

        Element batCreationTime = doc.createElement("CreationTime");
        batCreationTime.appendChild(doc.createTextNode(batFile
                .getCreationTime()));
        batFileElement.appendChild(batCreationTime);

        Element batSchedulerPattern = doc.createElement("SchedulerPattern");
        batSchedulerPattern.appendChild(doc.createTextNode(batFile
                .getExecutionPattern()));
        batFileElement.appendChild(batSchedulerPattern);

        Element batTaskID = doc.createElement("TaskID");
        if (batFile.getTaskID() != null) {
            batTaskID.appendChild(doc.createTextNode(batFile.getTaskID()));
        }
        batFileElement.appendChild(batTaskID);

        TransformerFactory tFactory = TransformerFactory.newInstance();
        Transformer transformer = tFactory.newTransformer();
        DOMSource domSource = new DOMSource(doc);
        StreamResult result = new StreamResult(new FileWriter(inputFile));
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.transform(domSource, result);
        return true;

    }

When I test it with the code below:

    @Test
    public void testAddingTask() throws Exception {
        IBAO testBao = XMLBAO.getInstance();
        BatFile testBatFile = new BatFile();
        testBatFile.setName("ŞŞŞŞŞ");
        testBatFile.setBriefDesc("ÇÇÇÇÇÇ");
        testBatFile.setDesc("ĞĞĞĞĞĞ");
        testBatFile.setFileName("FileName");
        testBatFile.setCreationTime("Merhaba");
        testBatFile.setExecutionPattern("ööçöçöçüü");
        testBatFile.addCommandArgs("ZZZZZZZZ");
        testBatFile.setTaskID("ÜÜÜÜÜÜÜÜ");
        testBao.add(testBatFile);
    }

It produces this result:

<BatFiles>  
<BatFile>
<Name>???</Name>
<BriefDesc>???</BriefDesc>
<Desc>???</Desc>
<FileName>FileName</FileName>
<CommandArgs>
<CommandArg>ZZZZZZZZ</CommandArg>
</CommandArgs>
<CreationTime>Merhaba</CreationTime>
<SchedulerPattern>??????</SchedulerPattern>
<TaskID>????</TaskID>
</BatFile>
</BatFiles>

You're writing to a character stream, so the API is not in control of the encoding the data is written in. FileWriter uses the default platform encoding, which might not be UTF-8:

"The constructors of this class assume that the default character encoding and the default byte-buffer size are acceptable."

Use a FileOutputStream with the StreamResult (in a try-with-resources block), so that the Transformer writes the bytes itself and applies the UTF-8 encoding you set.
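A minimal sketch of that change, assuming the rest of the add method stays as in the question. The class name Utf8XmlWriter and the helper write(Document, File) are hypothetical names introduced only for illustration; the point is that the StreamResult wraps a byte stream, so the Transformer performs the UTF-8 encoding rather than a FileWriter.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of the original code.
class Utf8XmlWriter {

    // Serializes the document to the target file as UTF-8 bytes.
    static void write(Document doc, File target) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");

        // A byte stream, not a FileWriter: the Transformer encodes the characters.
        // try-with-resources closes the stream even if transform() throws.
        try (OutputStream out = new FileOutputStream(target)) {
            transformer.transform(new DOMSource(doc), new StreamResult(out));
        }
    }

    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("BatFiles");
        doc.appendChild(root);

        Element name = doc.createElement("Name");
        // Five times LATIN CAPITAL LETTER S WITH CEDILLA, written as escapes.
        name.appendChild(doc.createTextNode("\u015E\u015E\u015E\u015E\u015E"));
        root.appendChild(name);

        File target = File.createTempFile("batfiles", ".xml");
        write(doc, target);
        System.out.println("Wrote " + target.getAbsolutePath());
    }
}
```

In the question's method, this would correspond to replacing new StreamResult(new FileWriter(inputFile)) with a StreamResult built on a FileOutputStream for the same file.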


You might also be having issues due to Java source file encodings. Consider using Unicode escapes instead of literals. That is, "\u015E" instead of "Ş".
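As a rough illustration of the escape suggestion, the test strings from the question could be written as below. The class and constant names are hypothetical; the code points are the standard Unicode values for the Turkish letters involved. Because the source then contains only ASCII, the strings no longer depend on which encoding the compiler assumes for the source file.

```java
// Hypothetical holder class for the question's test values, written with escapes.
class UnicodeEscapes {

    // Five times U+015E, LATIN CAPITAL LETTER S WITH CEDILLA.
    static final String NAME = "\u015E\u015E\u015E\u015E\u015E";

    // Six times U+00C7, LATIN CAPITAL LETTER C WITH CEDILLA.
    static final String BRIEF_DESC = "\u00C7\u00C7\u00C7\u00C7\u00C7\u00C7";

    // Six times U+011E, LATIN CAPITAL LETTER G WITH BREVE.
    static final String DESC = "\u011E\u011E\u011E\u011E\u011E\u011E";

    // Eight times U+00DC, LATIN CAPITAL LETTER U WITH DIAERESIS.
    static final String TASK_ID = "\u00DC\u00DC\u00DC\u00DC\u00DC\u00DC\u00DC\u00DC";

    public static void main(String[] args) {
        // Print the code points rather than the characters, so the output
        // does not depend on the console encoding.
        for (String s : new String[] {NAME, BRIEF_DESC, DESC, TASK_ID}) {
            System.out.printf("length=%d first=U+%04X%n", s.length(), (int) s.charAt(0));
        }
    }
}
```

An alternative is to keep the literals and tell the compiler the source encoding explicitly, for example with javac -encoding UTF-8, provided the file really is saved as UTF-8.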


 