
Java cannot get NodeList with French accents inside my XML

I have an XML document like this one, with French characters in the attribute names:

<?xml version="1.0" encoding="ISO-8859-1"?>
<liste>
<produit code="311" prix="43.00" quantité= "28" />
<produit code="123" prix="39.00" quantité= "10"  />
<produit code="456" prix="36.00" quantité= "241"  />
</liste>

My Java code:

import org.w3c.dom.*;
import javax.xml.parsers.*;
import java.io.*;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;

public class test2 {
    public static void main(String[] args) throws Exception {
        System.setOut(new PrintStream(System.out, true, "Cp850"));
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder parser = factory.newDocumentBuilder();
        Document doc = parser.parse(args[0]);

        Element racine = doc.getDocumentElement();
        NodeList nl = racine.getElementsByTagName("produit");
    }
}

When I compile my Java code with javac and run it, the parser reports this error: Attribute name "Quantit╟" associated with an element type "produit" must be followed by the '=' character.

How can I read the French attribute names in my NodeList? Thanks.

Your document does not seem to use the character set it declares in its header. I can reproduce your problem if the XML document is actually encoded as UTF-8 while still declaring ISO-8859-1. The problem goes away if the document is really encoded as ISO-8859-1. Try it yourself:

public static void main(String[] args) throws Exception {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    //OutputStreamWriter osw = new OutputStreamWriter(baos, "UTF-8"); // causes described error
    OutputStreamWriter osw = new OutputStreamWriter(baos, "ISO-8859-1");
    PrintWriter pw = new PrintWriter(osw, true);
    pw.println("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"); 
    pw.println("<liste>"); 
    pw.println("<produit code='311' prix='43.00' quantité= '28' />"); 
    pw.println("<produit code='123' prix='39.00' quantité= '10'  />"); 
    pw.println("<produit code='456' prix='36.00' quantité= '241'  />"); 
    pw.println("</liste>");
    pw.close();

    System.setOut(new PrintStream(System.out, true, "Cp850"));
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    DocumentBuilder parser = factory.newDocumentBuilder();
    Document doc = parser.parse(new ByteArrayInputStream(baos.toByteArray()));

    Element racine = doc.getDocumentElement();
    NodeList nl = racine.getElementsByTagName("produit");
}

You should make the declared and the actual encoding of the XML document match.
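To see why the mismatch breaks the attribute name, here is a minimal sketch (the class and method names are made up for illustration). It encodes "quantité" as UTF-8 and then decodes those bytes as ISO-8859-1, which is roughly what a parser does when it trusts the declaration. The single "é" turns into two characters, U+00C3 and U+00A9. As far as I can tell, U+00A9 is not allowed in an XML name, so the parser ends the attribute name early and complains that no '=' follows.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Encodes the text as UTF-8, then decodes the same bytes as ISO-8859-1,
    // mimicking a reader that trusts a wrong encoding declaration.
    static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // \u00E9 is "é"; the escape keeps this source file independent of its own encoding.
        String seen = misdecode("quantit\u00E9");
        // Print code points instead of characters so the console encoding does not matter.
        for (int i = 0; i < seen.length(); i++) {
            System.out.printf("U+%04X ", (int) seen.charAt(i));
        }
        System.out.println();
    }
}
```

The last two code points printed are U+00C3 and U+00A9 instead of the single U+00E9 you would expect.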

Someone found my problem. My XML file was saved with Notepad, which used UTF-8 as its encoding. I opened it in Notepad++ and saved it as ISO-8859-1, and my code works fine now.
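For anyone who would rather do the same conversion in code than in an editor, here is a rough sketch. The class name, method name, and command-line arguments are hypothetical, and it assumes the input file really is UTF-8. It also drops a leading byte order mark if one is present, since ISO-8859-1 cannot represent it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class XmlTranscoder {

    // Decodes UTF-8 bytes and re-encodes them as ISO-8859-1.
    // Characters that ISO-8859-1 cannot represent are replaced with '?' by getBytes.
    static byte[] utf8ToLatin1(byte[] utf8Bytes) {
        String text = new String(utf8Bytes, StandardCharsets.UTF_8);
        // Remove a leading byte order mark (U+FEFF) if the editor wrote one.
        if (text.startsWith("\uFEFF")) {
            text = text.substring(1);
        }
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        Path in = Paths.get(args[0]);   // file that was saved as UTF-8
        Path out = Paths.get(args[1]);  // file to write as ISO-8859-1
        Files.write(out, utf8ToLatin1(Files.readAllBytes(in)));
    }
}
```

The other way to make the two sides agree is to leave the file as UTF-8 and change the declaration to encoding="UTF-8".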
