How can I read a mapping of special characters from XML in Java?

I'm not sure if this is possible, but I am writing a program that converts data from a database into XML. The issue is that some of the values in the database contain special characters. We have the typical XML special characters hardcoded in a map, but we would like to have a configurable XML mapping file that we read at run time.

    <mapping source="ÿ" target="&#255;"/>
    <mapping source="þ" target="&#254;"/>
    <mapping source="ý" target="&#253;"/>
    <mapping source="ü" target="&#252;"/>
    <mapping source="û" target="&#251;"/>
    <mapping source="ú" target="&#250;"/>

We are using XStream to read the XML.

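import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Map;

import com.thoughtworks.xstream.XStream;

public class CharMapping {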

    private static final String CHAR_MAPPING_FILE = "char_mapping.xml";
    private static final String XML_ROOT_ELEMENT = "mappings";

    private static String readXmlFile(String filename) {
        StringBuffer xmlContent = new StringBuffer();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(filename),"ISO-8859-1"))) {

            String currentLine;

            while ((currentLine = br.readLine()) != null) {
                xmlContent.append(currentLine);
            }

        } catch (IOException e) {
            e.printStackTrace();
        }
        return xmlContent.toString();
    }

    @SuppressWarnings("unchecked")
    public static Map<String, String> getCharMapping() {
        XStream xstream = new XStream();
        xstream.alias(XML_ROOT_ELEMENT, java.util.Map.class);
        xstream.registerConverter(new XMLConfigConverter("source", "target", null, null));

        String xml = readXmlFile(CHAR_MAPPING_FILE);
        Map<String, String> relationsMapping = (Map<String, String>) xstream.fromXML(xml);
        return relationsMapping;
    }
}
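
// XMLConfigConverter.java (separate source file)
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;

import com.thoughtworks.xstream.converters.Converter;
import com.thoughtworks.xstream.converters.MarshallingContext;
import com.thoughtworks.xstream.converters.UnmarshallingContext;
import com.thoughtworks.xstream.io.HierarchicalStreamReader;
import com.thoughtworks.xstream.io.HierarchicalStreamWriter;

public class XMLConfigConverter implements Converter {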

    private String keyAttribute;
    private String valueAttribute;
    private String filterAttribute;
    private String filterValue;

    public XMLConfigConverter(String keyAttribute, String valueAttribute, String filterAttribute,
            String filterValue) {
        this.keyAttribute = keyAttribute;
        this.valueAttribute = valueAttribute;
        this.filterAttribute = filterAttribute;
        this.filterValue = filterValue;
    }

    @SuppressWarnings("rawtypes")
    public boolean canConvert(Class clazz) {
        return AbstractMap.class.isAssignableFrom(clazz);
    }

    @Override
    public void marshal(Object arg0, HierarchicalStreamWriter writer, MarshallingContext context) {
    }

    @Override
    public Object unmarshal(HierarchicalStreamReader reader, UnmarshallingContext context) {
        Map<String, String> map = new HashMap<String, String>();

        while (reader.hasMoreChildren()) {
            reader.moveDown();
            if (reader.getNodeName().equals("mapping")) {
                if (filterAttribute != null && filterValue != null) {
                    if (reader.getAttribute(filterAttribute).equals(filterValue)) {
                        putValueInMap(reader, map);
                    }
                } else {
                    putValueInMap(reader, map);
                }
            }
            reader.moveUp();
        }
        for (String charKey : map.keySet()) {
            System.out.println("mapping: " + charKey + " - " + map.get(charKey));
        }
        return map;
    }

    private void putValueInMap(HierarchicalStreamReader reader, Map<String, String> map) {
        String key = reader.getAttribute(keyAttribute);
        String value = reader.getAttribute(valueAttribute);
        System.out.println("Key: " + key + " - Value: " + value);
        map.put(key, value);
    }

}

The output is:

Key: ?¿ - Value: ÿ
Key: ?? - Value: ?
Key: ?½ - Value: ?
Key: ?¼ - Value: ü
Key: ?» - Value: û
Key: ?º - Value: ú

I know it seems a bit weird to pull the mappings for an XML file from another XML file. If this isn't possible, is there any advice on a better solution? Would a CSV mapping be better?

Thanks!

Your bug is probably in the line

 new InputStreamReader(new FileInputStream(filename),"ISO-8859-1")))

where you are decoding the file yourself in Java, rather than letting the XML parser do the decoding. From the evidence of your output, it appears the file is not encoded in ISO-8859-1 but in UTF-8, and if you had left the decoding to the XML parser it would probably have got it right.
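To see why the output points to UTF-8, here is a minimal sketch that reproduces the effect with the JDK alone. The class and method names (`MojibakeDemo`, `misdecode`, `codePoints`) are made up for illustration. It encodes a character as UTF-8 and then decodes those bytes as ISO-8859-1, which is what `readXmlFile()` effectively does if the file is really UTF-8. The second resulting character for ÿ is ¿ (U+00BF) and for ü is ¼ (U+00BC), matching the second character of the keys in the posted output.

```java
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    /** Encodes text as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1. */
    public static String misdecode(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    /** Formats each char as U+XXXX so the result does not depend on the console encoding. */
    public static String codePoints(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Escapes are used so the encoding of this source file does not matter.
        System.out.println(codePoints(misdecode("\u00FF"))); // U+00C3 U+00BF
        System.out.println(codePoints(misdecode("\u00FC"))); // U+00C3 U+00BC
    }
}
```

Each single character becomes two because UTF-8 stores these characters in two bytes, and ISO-8859-1 maps every byte to exactly one character.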

I don't actually know XStream, but the Javadoc says there is a version of the fromXML() method that accepts a File as input. I suggest you use that version of the method, which is likely to get the decoding right, and get rid of your readXmlFile() method, which appears to be getting it wrong.
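With XStream, the analogous change would presumably be to pass `new File(CHAR_MAPPING_FILE)` to `fromXML()` instead of a pre-decoded string. The sketch below illustrates the same principle with a swapped-in technique, the JDK's built-in DOM parser rather than XStream, so that it runs without extra libraries. The parser is given the `File` and works out the encoding from the XML declaration itself. The class name `CharMappingReader` and the helper `writeSample()` are hypothetical, and the sample file content is invented for the demonstration. Note that the sample writes the targets as `&amp;#255;`: an XML parser expands a character reference such as `&#255;` inside an attribute value to the character itself, which would explain why the posted output shows `Value: ÿ` rather than the literal reference.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CharMappingReader {

    /** Parses the file directly; the XML parser decides how to decode the bytes. */
    public static Map<String, String> read(File file) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(file);
            NodeList nodes = doc.getElementsByTagName("mapping");
            Map<String, String> map = new LinkedHashMap<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element mapping = (Element) nodes.item(i);
                map.put(mapping.getAttribute("source"), mapping.getAttribute("target"));
            }
            return map;
        } catch (Exception e) {
            throw new IllegalStateException("Could not read " + file, e);
        }
    }

    /** Writes a small UTF-8 sample file (hypothetical content) to a temp location. */
    public static File writeSample() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<mappings>\n"
                + "  <mapping source=\"\u00FF\" target=\"&amp;#255;\"/>\n"
                + "  <mapping source=\"\u00FC\" target=\"&amp;#252;\"/>\n"
                + "</mappings>\n";
        try {
            File file = File.createTempFile("char_mapping", ".xml");
            file.deleteOnExit();
            Files.write(file.toPath(), xml.getBytes(StandardCharsets.UTF_8));
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> mapping = read(writeSample());
        System.out.println(mapping.size());          // 2
        System.out.println(mapping.get("\u00FF"));   // &#255;
    }
}
```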

It is possible, of course, that you will still have problems: perhaps the file is encoded in UTF-8 but declares its encoding as ISO-8859-1. But I think there's a good chance this change will fix it.
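If the problem persists, one way to test that hypothesis is to check whether the file's bytes decode strictly as UTF-8 and compare the result with the encoding named in the XML declaration. The following is a rough sketch using only the JDK; `Utf8Check` and `isValidUtf8` are illustrative names. Keep in mind that a pure-ASCII file passes this check as well, so a `true` result is only meaningful when the file contains non-ASCII characters.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {

    /** Returns true if the bytes form a well-formed UTF-8 sequence. */
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String line = "<mapping source=\"\u00FF\"/>";
        // The same text, stored in the two encodings under discussion.
        System.out.println(isValidUtf8(line.getBytes(StandardCharsets.UTF_8)));      // true
        System.out.println(isValidUtf8(line.getBytes(StandardCharsets.ISO_8859_1))); // false
    }
}
```

For a real file, the bytes could be obtained with `Files.readAllBytes(Paths.get("char_mapping.xml"))`. If the check returns `true` while the declaration says ISO-8859-1, the declaration is the part to correct.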
