
Trouble parsing self closing XML tags using SAX parser

I am having trouble parsing self-closing XML tags using SAX. I am trying to extract the link tag from the Google Base API. I am having reasonable success parsing regular tags.

Here is a snippet of the XML:

<entry>
  <id>http://www.google.com/base/feeds/snippets/15802191394735287303</id>
  <published>2010-04-05T11:00:00.000Z</published>
  <updated>2010-04-24T19:00:07.000Z</updated>
  <category scheme='http://base.google.com/categories/itemtypes' term='Products'/>
  <title type='text'>En-el1 Li-ion Battery+charger For Nikon Digital Camera</title>
  <link rel='alternate' type='text/html' href='http://rover.ebay.com/rover/1/711-67261-24966-0/2?ipn=psmain&amp;icep_vectorid=263602&amp;kwid=1&amp;mtid=691&amp;crlp=1_263602&amp;icep_item_id=170468125748&amp;itemid=170468125748'/>
.
.

and so on

I can parse the updated and published tags, but not the link and category tags.

Here are my startElement and endElement overrides

public void startElement(String uri, String localName, String qName,
    Attributes attributes) throws SAXException {
  if (qName.equals("title") && xmlTags.peek().equals("entry")) {
    insideEntryTitle = true;
  }
  xmlTags.push(qName);
}

public void endElement(String uri, String localName, String qName)
    throws SAXException {
  // If a "title" element is closed, we start a new line, to prepare
  // printing the new title.
  xmlTags.pop();
  if (insideEntryTitle) {
    insideEntryTitle = false;
    System.out.println();
  }
}

Declaration for xmlTags:

private Stack<String> xmlTags = new Stack<String>(); 

Any help guys?

This is my first post here. I hope I have followed the posting rules! Thanks a ton, guys.

Correction: endElement gets called. characters does not.

public void characters(char[] ch, int start, int length) throws SAXException
{
    if (insideEntryTitle)
    {
        // Text delivered by this call (may be only part of the element's content)
        String url = new String(ch, start, length);
        System.out.println("url=" + url);
        i++;
    }
}

What characters does is deliver the content between the XML element tags (in chunks, one chunk per method call). So if you have an XML element like

<Foo someattrib="" />

then characters doesn't get called, because there's no content there for the parser to tell you about.

If you are relying on your characters method being called here even when the tag is empty, that assumption will not hold.

The characters method should only add element text to a buffer. startElement and endElement need to be in charge of clearing and reading the buffer, because endElement is the place where you know you have received all of the element's text. It is fine for characters not to be called when there is nothing to read.

Because any single characters call may deliver only part of the content, there must not be any business logic in that method. If there is, your code will break at some point.
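The buffering approach described above could look roughly like the sketch below. It is not the asker's code: the class name EntryTitleHandler, the titles list, and the parseTitles helper are hypothetical names chosen for illustration, and it assumes a parser that is not namespace-aware (the SAXParserFactory default), so qName is the plain tag name.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of each <title> that sits directly inside an <entry>.
class EntryTitleHandler extends DefaultHandler {
    private final Deque<String> openTags = new ArrayDeque<>();
    private final StringBuilder text = new StringBuilder();
    final List<String> titles = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0);      // a new element starts: discard whatever was buffered
        openTags.push(qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // No business logic here: the parser may split one text node over several calls.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openTags.pop();
        // After the pop, peek() is the parent element (or null at the root).
        if (qName.equals("title") && "entry".equals(openTags.peek())) {
            titles.add(text.toString().trim());   // all text of this element has arrived
        }
        text.setLength(0);
    }

    // Convenience helper (an assumption for this sketch): parse a string and return the titles.
    static List<String> parseTitles(String xml) {
        EntryTitleHandler handler = new EntryTitleHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.titles;
    }
}
```

Clearing the buffer in startElement is enough for leaf elements such as title. An element that mixes text with child elements would need a buffer per nesting level instead.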

For how to implement characters, see this example. If what you want to do is read attribute values, see this example.
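For the self-closing link tag in the question, the data lives in attributes, which the parser hands to startElement. A minimal sketch of reading them follows. The class name LinkHrefHandler, the hrefs list, and the parseHrefs helper are hypothetical, and the parser is again assumed not to be namespace-aware.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the href attribute of every <link> element.
class LinkHrefHandler extends DefaultHandler {
    final List<String> hrefs = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (qName.equals("link")) {
            // getValue returns null when the attribute is absent.
            String href = attributes.getValue("href");
            if (href != null) {
                hrefs.add(href);
            }
        }
    }

    // Convenience helper (an assumption for this sketch): parse a string and return the hrefs.
    static List<String> parseHrefs(String xml) {
        LinkHrefHandler handler = new LinkHrefHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        return handler.hrefs;
    }
}
```

Neither characters nor endElement is needed for this case, which is why it works the same for <link .../> and <link ...></link>.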
