
Xml not parsing String as input with sax

I have a string input from which I need to extract simple information. Here is the sample XML (from mkyong):

<?xml version="1.0"?>
<company>
    <staff>
        <firstname>yong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
    <staff>
        <firstname>low</firstname>
        <lastname>yin fong</lastname>
        <nickname>fong fong</nickname>
        <salary>200000</salary>
    </staff>
</company>

Here is how I parse it in my code (my class has a field String name):

public String getNameFromXml(String xml) {
        try {

            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            DefaultHandler handler = new DefaultHandler() {

                boolean firstName = false;

                public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException {

                    if (qName.equalsIgnoreCase("firstname")) {
                        firstName = true;
                    }
                }

                public void characters(char ch[], int start, int length) throws SAXException {

                    if (firstName) {
                        name = new String(ch, start, length);
                        System.out.println("First name is : " + name);
                        firstName = false;
                    }

                }

            };

            saxParser.parse(xml.toString(), handler);

        } catch (Exception e) {
            e.printStackTrace();
        }

        return name;
    }

I'm getting a java.io.FileNotFoundException, and I see that it's trying to find a file named myprojectpath + the entireStringXML.

What am I doing wrong?

Addendum:

Here is my main method:

public static void main(String[] args) {
        Text tst = new Text("<?xml version=\"1.0\"?><company>   <staff>     <firstname>yong</firstname>     <lastname>mook kim</lastname>       <nickname>mkyong</nickname>     <salary>100000</salary> </staff>    <staff>     <firstname>low</firstname>      <lastname>yin fong</lastname>       <nickname>fong fong</nickname>      <salary>200000</salary> </staff></company>");
        NameFilter cc = new NameFilter();
        String result = cc.getNameFromXml(tst);
        System.out.println(result);
    }

You should replace the line saxParser.parse(xml.toString(), handler); with the following one:

saxParser.parse(new InputSource(new StringReader(xml)), handler);
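To show the fix in context, here is a minimal, self-contained sketch of the method from the question with that one line changed. The class name NameFilter is reused from the question's main method; returning only the first firstname (or null if there is none) and wrapping checked exceptions in an IllegalStateException are assumptions made for this illustration, not something the question specifies.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class NameFilter {

    // Returns the text of the first <firstname> element, or null if none exists.
    static String getNameFromXml(String xml) {
        final StringBuilder text = new StringBuilder();
        final String[] result = {null};

        DefaultHandler handler = new DefaultHandler() {
            private boolean inFirstName = false;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                // Only track the first <firstname>; later ones are ignored (an assumption).
                if (result[0] == null && qName.equalsIgnoreCase("firstname")) {
                    inFirstName = true;
                    text.setLength(0);
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inFirstName) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (inFirstName && qName.equalsIgnoreCase("firstname")) {
                    result[0] = text.toString();
                    inFirstName = false;
                }
            }
        };

        try {
            // The key change: wrap the XML text in an InputSource instead of
            // passing the String itself, which would be treated as a URI.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML string", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?><company>"
                + "<staff><firstname>yong</firstname></staff>"
                + "<staff><firstname>low</firstname></staff></company>";
        System.out.println(getNameFromXml(xml)); // prints "yong"
    }
}
```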

I'm going to highlight another issue, which you're likely to hit once you read your input correctly.

The method

public void characters(char ch[], int start, int length) 

won't always give you the complete text element. It's at liberty to give you the text element (content) 'n' characters at a time. From the doc:

SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks

So you should build up your text element string from each call to this method (e.g. using a StringBuilder) and only interpret/store that text once the corresponding endElement() method is called.

This may not impact you now. But it'll arise at some time in the future - likely when you least expect it. I've encountered it when moving from small to large XML documents, where buffering has been able to hold the whole small document, but not the larger one.

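An example (a sketch of the three handler callbacks, where builder is a StringBuilder field of the handler):

   public void startElement(String uri, String localName, String qName, Attributes attributes) {
      // a new element starts: discard any text collected so far
      builder.setLength(0);
   }
   public void characters(char ch[], int start, int length) {
      // may be called several times for one text node, so append each chunk
      builder.append(ch, start, length);
   }
   public void endElement(String uri, String localName, String qName) {
      // now do something with the collated text
      String text = builder.toString();
   }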
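The accumulate-then-read pattern can be tried out in isolation. The following is only a sketch: the class LeafTextCollector, its collect and entries methods, and the "name=value" output format are all made up for this illustration. It records the trimmed text of every element that directly contains non-whitespace text, and its main method feeds one text node in two chunks by hand to imitate a parser that splits character data.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects "elementName=text" for elements with text content.
class LeafTextCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private final List<String> entries = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        text.setLength(0); // forget whitespace or text seen before this element
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // one text node may arrive in several calls
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        String value = text.toString().trim();
        if (!value.isEmpty()) {
            entries.add(qName + "=" + value);
        }
        text.setLength(0); // do not leak this text into the parent element
    }

    List<String> entries() {
        return entries;
    }

    // Parses an XML string and returns the collected entries.
    static List<String> collect(String xml) {
        LeafTextCollector handler = new LeafTextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML string", e);
        }
        return handler.entries();
    }

    public static void main(String[] args) {
        // Simulate a parser delivering "yong" as two chunks.
        LeafTextCollector handler = new LeafTextCollector();
        handler.startElement("", "", "firstname", null);
        handler.characters("yo".toCharArray(), 0, 2);
        handler.characters("ng".toCharArray(), 0, 2);
        handler.endElement("", "", "firstname");
        System.out.println(handler.entries()); // prints "[firstname=yong]"
    }
}
```

A handler that assigned the value inside characters() would have stored only "yo" or "ng" in the simulated case, which is exactly the failure described above.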

Maybe this helps. It uses javax.xml.parsers.DocumentBuilder, which is easier than SAX:

public Document getDomElement(String xml) {
    Document doc = null;
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    try {
        DocumentBuilder db = dbf.newDocumentBuilder();

        InputSource is = new InputSource();
        is.setCharacterStream(new StringReader(xml));
        doc = db.parse(is);
    } catch (ParserConfigurationException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    } catch (SAXException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    } catch (IOException e) {
        Log.e("Error: ", e.getMessage());
        return null;
    }
    // return DOM
    return doc;
}

You can loop through the document using a NodeList and check each Node by its name.
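As a rough illustration of that loop, the sketch below parses the XML string into a DOM Document, walks the children of each staff element, and keeps the text of those named firstname. The class DomFirstNames and its firstNames method are hypothetical names chosen for this example, and it uses plain Java exceptions rather than the Android Log calls above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

class DomFirstNames {

    // Returns the text of every <firstname> child of every <staff> element.
    static List<String> firstNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(xml)));

            NodeList staffList = doc.getElementsByTagName("staff");
            for (int i = 0; i < staffList.getLength(); i++) {
                NodeList children = staffList.item(i).getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    // Skip whitespace text nodes; check element nodes by name.
                    if (child.getNodeType() == Node.ELEMENT_NODE
                            && "firstname".equals(child.getNodeName())) {
                        names.add(child.getTextContent());
                    }
                }
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML string", e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?>\n<company>\n"
                + "  <staff>\n    <firstname>yong</firstname>\n    <salary>100000</salary>\n  </staff>\n"
                + "  <staff>\n    <firstname>low</firstname>\n    <salary>200000</salary>\n  </staff>\n"
                + "</company>";
        System.out.println(firstNames(xml)); // prints "[yong, low]"
    }
}
```

The DOM approach loads the whole document into memory, so it trades the streaming behaviour of SAX for simpler code.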

It seems you took this example from here. That example passes a File (with an absolute path), not a String of XML, to the method SAXParser.parse(). Look at the example closely. The parse() method it uses is defined as follows:

public void parse(File f,
                  DefaultHandler dh)
           throws SAXException,
                  IOException

If you want to parse a string anyway, there is another method which takes an InputStream:

public void parse(InputStream is,
                  DefaultHandler dh)
           throws SAXException,
                  IOException

Then you need to convert your string to an InputStream. Here is how to do it.
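One possible way to do that conversion is to encode the string as bytes and wrap them in a ByteArrayInputStream, then hand the stream to parse(InputStream, DefaultHandler). This is only a sketch: the class StreamParseExample and its countElements method are invented for the illustration, and encoding as UTF-8 is an assumption that fits an XML declaration without an explicit encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class StreamParseExample {

    // Counts how many elements with the given name occur in the XML string.
    static int countElements(String xml, String elementName) {
        // Convert the String to an InputStream (UTF-8 is assumed here).
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        final int[] count = {0};

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                if (qName.equals(elementName)) {
                    count[0]++;
                }
            }
        };

        try {
            // Uses the parse(InputStream, DefaultHandler) overload.
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML string", e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<?xml version=\"1.0\"?><company>"
                + "<staff><firstname>yong</firstname></staff>"
                + "<staff><firstname>low</firstname></staff></company>";
        System.out.println(countElements(xml, "staff")); // prints "2"
    }
}
```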

You call parse with a String as the first parameter. According to the docu, that string is interpreted as the URI of your file.

If you want to parse your String directly, you first have to turn it into an InputStream and wrap that in an InputSource, for use with the parse(InputSource is, DefaultHandler dh) method (docu):

// transform from string to inputstream (UTF-8 via java.nio.charset.StandardCharsets)
ByteArrayInputStream in = new ByteArrayInputStream(xml.toString().getBytes(StandardCharsets.UTF_8));
InputSource is = new InputSource();
is.setByteStream(in);

// start parsing from the InputSource, not from the raw string
saxParser.parse(is, handler);
