Java read XML with SAX Parsing

I started working with XML and the SAX parser, and now I'm trying to figure out how it works. I am familiar with JSON, but this doesn't seem to work like JSON. Here is the code I'm working with:

package com.myalbion.gamedataextractor.handlers;

import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Map;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

import com.myalbion.gamedataextractor.Main;
import com.myalbion.gamedataextractor.datatables.Language;
import com.myalbion.gamedataextractor.datatables.Localized;
import com.myalbion.gamedataextractor.datatables.XMLFile;

public class LocalizationXMLFileHandler extends DefaultHandler {

    private String temp;
    Localized localized;
    List<Localized> localizedList;
    Map<Language, String> tempMap;

    /*
     * When the parser encounters plain text (not XML elements),
     * it calls this method, which accumulates them in a string buffer
     */
    public void characters(char[] buffer, int start, int length) {
           temp = new String(buffer, start, length);
    }


    /*
     * Every time the parser encounters the beginning of a new element,
     * it calls this method, which resets the string buffer
     */ 
    public void startElement(String uri, String localName,
                  String qName, Attributes attributes) throws SAXException {
           temp = "";
           if (qName.equalsIgnoreCase("tu")) {
               localized = new Localized();
               localized.setUniqueName(attributes.getValue("tuid"));

           } else if(qName.equalsIgnoreCase("tuv")) {
               tempMap.put(Language.getLanguageFromCode(attributes.getValue("xml:lang")), )
           }
    }

    /*
     * When the parser encounters the end of an element, it calls this method
     */
    public void endElement(String uri, String localName, String qName)
                  throws SAXException {

           if (qName.equalsIgnoreCase("tu")) {
                  // add it to the list
                  accList.add(acct);

           } else if (qName.equalsIgnoreCase("Name")) {
                  acct.setName(temp);
           } else if (qName.equalsIgnoreCase("Id")) {
                  acct.setId(Integer.parseInt(temp));
           } else if (qName.equalsIgnoreCase("Amt")) {
                  acct.setAmt(Integer.parseInt(temp));
           }

    } 

}

I am trying to extract the data from this XML file into tempMap, which maps the Language enum to the localized name.

<?xml version="1.0"?>
<tmx version="1.4">
  <body>
    <tu tuid="@ACCESS_RIGHTS_ACCESS_MODE">
      <tuv xml:lang="EN-US">
        <seg>Access Mode</seg>
      </tuv>
      <tuv xml:lang="DE-DE">
        <seg>Zugriffsmodus</seg>
      </tuv>
      <tuv xml:lang="FR-FR">
        <seg>Mode d'accès</seg>
      </tuv>
      <tuv xml:lang="RU-RU">
        <seg>Доступ</seg>
      </tuv>
      <tuv xml:lang="PL-PL">
        <seg>Tryb dostępu</seg>
      </tuv>
      <tuv xml:lang="ES-ES">
        <seg>Modo de acceso</seg>
      </tuv>
      <tuv xml:lang="PT-BR">
        <seg>Modo de acesso</seg>
      </tuv>
      <tuv xml:lang="ZH-CN">
        <seg>权限模式</seg>
      </tuv>
      <tuv xml:lang="KO-KR">
        <seg>접근 모드</seg>
      </tuv>
    </tu>
  </body>
</tmx>

At line 49 of the Java code I am getting the language code from the xml:lang attribute of tuv, but I'm missing the localized name, which is in the seg element nested inside tuv. Can I read the parent's attribute and get the seg value in the same line?

You're overwriting your text buffer every time you hit a new text node, including whitespace-only text nodes like the ones between </seg> and </tuv>. You need to save the contents of the text buffer when processing the seg end tag, and pick it up when processing the tuv end tag.

Also, you should be aware that the content of a single text node can be supplied in a sequence of calls to characters(): the parser can break it up any way it likes (many parsers do this on entity boundaries). You need to accumulate the content by appending to a buffer.

Also note that XML is case-sensitive; you shouldn't really ignore case when testing element names.
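
Putting those three points together, a handler could look roughly like the sketch below. It is only an illustration under some assumptions: the class name TmxHandler and its parse helper are made up for this example, and plain strings stand in for the Language and Localized classes from the question, since their definitions aren't shown. It also assumes every tuv is nested inside a tu, and that the parser is the default (non-namespace-aware) one, so qName carries the element name and "xml:lang" can be looked up by qualified name.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects tuid -> (language code -> segment text).
public class TmxHandler extends DefaultHandler {

    private final Map<String, Map<String, String>> result = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();
    private Map<String, String> currentUnit; // translations of the current <tu>
    private String currentLang;              // xml:lang of the current <tuv>
    private String currentSeg;               // text saved when </seg> is seen

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // start with an empty buffer for each new element
        if (qName.equals("tu")) {          // exact match: XML names are case-sensitive
            currentUnit = new LinkedHashMap<>();
            result.put(attributes.getValue("tuid"), currentUnit);
        } else if (qName.equals("tuv")) {
            currentLang = attributes.getValue("xml:lang");
            currentSeg = null;
        }
    }

    @Override
    public void characters(char[] buffer, int start, int length) {
        // Append rather than overwrite: one text node may arrive in several calls.
        text.append(buffer, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("seg")) {
            // Save the text now, before the whitespace after </seg> is reported.
            currentSeg = text.toString();
        } else if (qName.equals("tuv") && currentSeg != null) {
            currentUnit.put(currentLang, currentSeg);
        }
    }

    public Map<String, Map<String, String>> getResult() {
        return result;
    }

    // Convenience helper (an assumption of this sketch): parse XML held in a string.
    public static Map<String, Map<String, String>> parse(String xml) throws Exception {
        TmxHandler handler = new TmxHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getResult();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<tmx version=\"1.4\"><body><tu tuid=\"@ACCESS_RIGHTS_ACCESS_MODE\">\n"
                + "  <tuv xml:lang=\"EN-US\">\n    <seg>Access Mode</seg>\n  </tuv>\n"
                + "  <tuv xml:lang=\"DE-DE\">\n    <seg>Zugriffsmodus</seg>\n  </tuv>\n"
                + "</tu></body></tmx>";
        System.out.println(parse(xml));
    }
}
```

To get back to the types in the question, the put in endElement would presumably become something like tempMap.put(Language.getLanguageFromCode(currentLang), currentSeg), with the Localized object added to the list when the tu end tag is reached.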

And when asking questions on SO, it helps to get the terminology right: referring to elements as attributes is going to confuse people.

