
Parsing XML into Java object

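I am trying to determine the best way to parse an XML response from a web service call into Java objects. Using JAXB seems to be the simplest approach, but every example I have found requires a template Java class, which is the Java type the XML gets converted into. My XML is as follows: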

  <?xml version="1.0" encoding="utf-8" ?>
  <entry_list version="1.0">
      <entry id="main[1]"> <hw highlight="yes" hindex="1">main</hw> <sound><wav>main0001.wav</wav></sound> <pr>ˈmeɪn</pr> <fl>adjective</fl> <lb>always used before a noun</lb> <def><dt>:most important :<sx>chief</sx> <sx>principal</sx> <vi>the <it>main</it> idea/point</vi> <vi>the <it>main</it> goal/purpose</vi> <vi>Speed is the <it>main</it> advantage of this approach.</vi> <vi>The company's <it>main</it> office is located in New York.</vi> <vi>the novel's <it>main</it> character</vi> <vi>driving down the <it>main</it> road/highway</vi> <vi>the <it>main</it> gate/entrance</vi> <vi>This dish can be served as a <phrase>main course</phrase> or appetizer.</vi> <vi>And now for the <phrase>main event</phrase> of the evening!</vi></dt></def> <uro><ure>main*ly</ure> <fl>adverb</fl> <utxt><vi>The reviews have been <it>mainly</it> [=<it>mostly</it>] positive.</vi> <vi>a plant found <it>mainly</it> [=<it>chiefly</it>] in coastal regions</vi> <vi>I don't like the plan, <it>mainly</it> because I think it's too expensive.</vi> <vi>The problems have been <it>mainly</it> minor ones. [=most of the problems have been minor ones]</vi> <vi>They depend <it>mainly</it> on/upon fish for food.</vi></utxt></uro></entry>
      <entry id="main[2]"> <hw hindex="2">main</hw> <altpr>ˈmeɪn</altpr> <fl>noun</fl> <in><il>plural</il> <if>mains</if></in> <def><sn>1</sn> <sgram>count</sgram> <dt>:the largest pipe in a system of connected pipes <vi>a gas <it>main</it></vi> <vi>a water <it>main</it></vi></dt> <sn>2</sn> <bnote>the mains</bnote> <ssl>Brit</ssl> <sn>a</sn> <dt>:the system of pipes or wires for electricity, gas, or water <vi>My radio runs either off batteries or off <it>the mains</it>.</vi> <un>often used as <it>mains</it> before another noun <vi>We haven't had any <it>mains</it> water/electricity since the storm.</vi></un></dt> <sn>b</sn> <dt>:the place where electricity, gas, or water enters a building or room <vi>Turn off the water at <it>the mains</it>.</vi></dt></def> <dro><dre>in the main</dre> <def><dt>:in general <un>used to say that a statement is true in most cases or at most times <vi>The workers are <it>in the main</it> very capable. [=most of the workers are very capable]</vi> <vi>The weather has <it>in the main</it> been quite good. [=has been quite good most of the time]</vi></un></dt></def></dro></entry>
      <entry id="main clause"> <hw>main clause</hw> <fl>noun</fl> <in><il>plural</il> <if>⁓ clauses</if></in> <def><gram>count</gram> <sl>grammar</sl> <dt>:a clause that could be used by itself as a simple sentence but that is part of a larger sentence <ca>called also <cat>independent clause</cat></ca> <dx>compare <dxt>coordinate clause</dxt> <dxt>subordinate clause</dxt></dx></dt></def></entry>
      <entry id="main drag"> <hw>main drag</hw> <fl>noun</fl> <in><il>plural</il> <if>⁓ drags</if></in> <def><gram>count</gram> <sl>US</sl> <sl>informal</sl> <dt>:the main street in a town or city <vi>A carload of teenagers were cruising down the <it>main drag</it>.</vi></dt></def></entry>
      <entry id="main line"> <hw>main line</hw> <fl>noun</fl> <in><il>plural</il> <if>⁓ lines</if></in> <def><gram>count</gram> <dt>:an important highway or railroad line</dt></def></entry>
      <entry id="main man"> <hw>main man</hw> <fl>noun</fl> <in><il>plural</il> <if>⁓ men</if></in> <def><gram>count</gram> <sl>US</sl> <sl>informal</sl> <sn>1</sn> <dt>:someone's best male friend <vi>He's still her <it>main man</it>.</vi></dt> <sn>2</sn> <dt>:the most important or admired man in a group <vi>The team has many good players, but he is clearly the <it>main man</it>.</vi></dt></def></entry>
      <entry id="main squeeze"> <hw>main squeeze</hw> <fl>noun</fl> <in><il>plural</il> <if>⁓ squeezes</if></in> <def><gram>count</gram> <sl>chiefly US slang</sl> <dt>:someone's main girlfriend, boyfriend, or lover <vi>She's my <it>main squeeze</it>.</vi></dt></def></entry>
      <entry id="main street"> <hw>main street</hw> <fl>noun</fl> <in><il>plural</il> <if>⁓ streets</if></in> <def><sn>1</sn> <sgram>count</sgram> <dt>:the most important street of a U.S. town where there are many stores, banks, etc. <un>often used as a name <vi>The restaurant is at 257 <it>Main Street</it>.</vi></un></dt> <sn>2</sn> <bnote>Main Street</bnote> <sgram>noncount</sgram> <ssl>US</ssl> <dt><un>used to refer to middle-class people in the U.S. who have traditional beliefs and values <vi>What does <it>Main Street</it> think of this policy?</vi></un></dt></def></entry>
      <entry id="water main"> <hw>water main</hw> <fl>noun</fl> <in><il>plural</il> <if>⁓ mains</if></in> <def><gram>count</gram> <dt>:a large underground pipe that carries water <vi>The <it>water main</it> burst/broke and flooded the street.</vi></dt></def></entry>
  </entry_list>

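My question is: do I have to define the Java object that the XML will be converted into? If so, my concern is what happens when data is added to or removed from the XML response. I also tried loading the XML into a DOM and walking it that way, but again I wonder what happens if elements are added or removed.
I only want certain child nodes, and only when their parent node has a certain value, so any pointers on the simplest way to do this are welcome.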

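Such a class is often called a POJO, and yes, having one is a good idea (it may even be necessary). It defines how the data should be represented as an object. If data is missing from the XML, the corresponding fields of the Java object will be null. You should therefore define the Java object as the maximum coverage of all possible attributes.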

There may be libraries that put any additional attributes into a HashMap (I know Jackson can do this for JSON at least; I am not sure about XML).

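The only alternative is to parse it manually yourself, which guarantees that you capture every element, for example with a depth-first traversal of the nodes.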
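A manual depth-first walk of this kind could be sketched with the JDK's built-in DOM parser. This is only an illustration under assumptions: the class name `DomWalker`, the method `collectElementNames`, and the inline sample XML are hypothetical and do not come from the original answer.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: visits every element depth-first, so elements that
// are added to the response later are still seen without a dedicated class.
class DomWalker {

    static List<String> collectElementNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> names = new ArrayList<>();
            walk(doc.getDocumentElement(), names);
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Pre-order traversal: record the element, then recurse into its children.
    private static void walk(Node node, List<String> names) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip text nodes, comments, and so on
        }
        names.add(node.getNodeName());
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), names);
        }
    }

    public static void main(String[] args) {
        String xml = "<entry_list><entry id=\"main line\">"
                + "<hw>main line</hw><fl>noun</fl></entry></entry_list>";
        System.out.println(collectElementNames(xml));
    }
}
```

The walk itself never breaks when the schema changes; only the code that interprets specific element names would need attention.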

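You can use a SAX parser. Besides being fast and light on memory, this approach lets you ignore everything you do not want or need, so you do not care whether those parts change. You just capture the tags you need as they go by.

For example, if you are only interested in the "main clause" entry, your handler would look something like this: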

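import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class MyHandler extends DefaultHandler {

    // True while the parser is inside <entry id="main clause">
    private boolean inTargetEntry;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {
        // qName is compared because localName is empty unless the parser is namespace-aware
        if ("entry".equalsIgnoreCase(qName) &&
                "main clause".equalsIgnoreCase(attributes.getValue("id"))) {
            // Set the flag so we know how to process nested tags
            inTargetEntry = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if ("entry".equalsIgnoreCase(qName)) {
            // Unset the flag
            inTargetEntry = false;
        }
    }
}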
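To show how a handler of this kind is driven, here is a minimal sketch that feeds XML to a `SAXParser` and collects the text of the `<dt>` elements inside the matching entry. The names `SaxDemo`, `EntryTextHandler` and `definitionsFor`, and the decision to collect `<dt>` text, are assumptions made for illustration; the answer itself only sets and clears a flag.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: collects the text of every <dt> inside the <entry>
// whose id attribute equals targetId, and ignores everything else.
class SaxDemo {

    static class EntryTextHandler extends DefaultHandler {
        private final String targetId;
        private final List<String> definitions = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();
        private boolean inTargetEntry;
        private boolean inDt;

        EntryTextHandler(String targetId) {
            this.targetId = targetId;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("entry".equals(qName)) {
                inTargetEntry = targetId.equals(attrs.getValue("id"));
            } else if (inTargetEntry && "dt".equals(qName)) {
                text.setLength(0);
                inDt = true;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inDt) {
                text.append(ch, start, length); // also picks up text of nested tags such as <vi>
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (inDt && "dt".equals(qName)) {
                definitions.add(text.toString().trim());
                inDt = false;
            } else if ("entry".equals(qName)) {
                inTargetEntry = false;
            }
        }
    }

    static List<String> definitionsFor(String xml, String id) {
        try {
            EntryTextHandler handler = new EntryTextHandler(id);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.definitions;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<entry_list>"
                + "<entry id=\"main line\"><def><dt>:an important highway or railroad line</dt></def></entry>"
                + "<entry id=\"main drag\"><def><dt>:the main street in a town or city</dt></def></entry>"
                + "</entry_list>";
        System.out.println(definitionsFor(xml, "main line"));
    }
}
```

Elements the handler does not mention are simply skipped, so additions to the response do not affect it.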

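The simplest way to work with XML is to unmarshal it into objects.
You can do that with JAXB; here is a tutorial: mykong
You only need to define what the objects look like.
Here is an example: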

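import java.util.List;
import javax.xml.bind.annotation.*;

@XmlRootElement(name = "entry_list")
@XmlAccessorType(XmlAccessType.FIELD) // annotations sit on fields, so bind fields rather than getters
public class EntryList {

    @XmlElement(name = "entry")
    private List<Entry> entities;

    public List<Entry> getEntities() {
        return entities;
    }
    public void setEntities(List<Entry> entities) {
        this.entities = entities;
    }
}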

@XmlAccessorType(XmlAccessType.FIELD)
public class Entry {

    @XmlAttribute
    private String id;

    @XmlElement
    private Sound sound;

    // etc.
    // ...

    public String getId() {
        return id;
    }
    public void setId(String id) {
        this.id = id;
    }

    public Sound getSound() {
        return sound;
    }
    public void setSound(Sound sound) {
        this.sound = sound;
    }
}

Every element that has child elements must be a class, and if it occurs more than once, as entry and vi do, it should be a List.

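In my experience, when you have to deal with very complex XML documents, it can be easier to:

  1. Transform it into a simpler form
  2. Unmarshal that form into objects you can work with

That is, say you have a very complex XML document: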

<XML>
   <SomeElement>
       <MoreElements>
           <EvenMoreElements>text</EvenMoreElements>
       </MoreElements>
   </SomeElement>
</XML>

Step 1: Simplify it with XSLT

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:template match="/">
        <SimpleForm><xsl:value-of select="XML/SomeElement/MoreElements/EvenMoreElements/text()"/></SimpleForm>
    </xsl:template>
</xsl:stylesheet>

Step 2: Unmarshal your own SimpleForm XML into a Java object

This way you loosen the coupling between the external schema and your internal logic.
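Step 1 can be run from Java with the JDK's `javax.xml.transform` API. The following is a minimal sketch, assuming the stylesheet and the input document are both available as strings; the class name `SimplifyXml` and the method `transform` are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: applies an XSLT stylesheet to an XML string.
class SimplifyXml {

    static String transform(String xml, String xslt) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xslt)));
            // Leave out the <?xml ...?> header so only the simplified element is returned
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<XML><SomeElement><MoreElements>"
                + "<EvenMoreElements>text</EvenMoreElements>"
                + "</MoreElements></SomeElement></XML>";
        String xslt = "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
                + "<xsl:template match=\"/\">"
                + "<SimpleForm><xsl:value-of select=\"XML/SomeElement/MoreElements/EvenMoreElements/text()\"/></SimpleForm>"
                + "</xsl:template></xsl:stylesheet>";
        System.out.println(transform(xml, xslt));
    }
}
```

If the external schema changes, only the stylesheet needs updating; the class bound to SimpleForm can stay as it is.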

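I don't think JAXB is the best solution. The best solution is based on XPath, which lets you simplify the coding without sacrificing maintainability. As you can see in the code below, the navigation is just an XPath expression, and the whole program is little more than 10 lines of code using XPath and VTD-XML. By the way, the XML sample posted above is not well-formed.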

import com.ximpleware.*;
public class extractExample {

    public static void main(String[] args) throws VTDException {
        // Parse the file; the second argument (false) turns namespace awareness off
        VTDGen vg = new VTDGen();
        if(!vg.parseFile("d:\\xml\\sample.xml", false)){
            return;
        }
        VTDNav vn = vg.getNav();
        AutoPilot ap = new AutoPilot(vn);
        ap.selectXPath("/entry_list/entry/hw[following-sibling::fl='value']/text()");
        int i=0;
        while((i=ap.evalXPath())!=-1){
            System.out.println(" hw value are "+vn.toNormalizedString(i));
        }
    }

}
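If adding VTD-XML as a dependency is not an option, the same XPath idea can be tried with the JDK's built-in `javax.xml.xpath` API. Note that this is a swapped-in technique rather than the answer's own code: it builds a DOM behind the scenes, so it should not be expected to share VTD-XML's performance characteristics. The names `XPathDemo` and `headwordsWithLabel` are hypothetical, and the `fl` value is passed in as a parameter because the answer leaves it as the placeholder 'value'.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: returns the text of each <hw> whose following
// sibling <fl> equals the given label.
class XPathDemo {

    static List<String> headwordsWithLabel(String xml, String label) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind $label through a resolver instead of concatenating it into the expression
            xpath.setXPathVariableResolver(
                    name -> "label".equals(name.getLocalPart()) ? label : null);
            NodeList nodes = (NodeList) xpath.evaluate(
                    "/entry_list/entry/hw[following-sibling::fl=$label]",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<entry_list>"
                + "<entry id=\"main[1]\"><hw>main</hw><fl>adjective</fl></entry>"
                + "<entry id=\"main line\"><hw>main line</hw><fl>noun</fl></entry>"
                + "</entry_list>";
        System.out.println(headwordsWithLabel(xml, "noun"));
    }
}
```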
