How does this Java Program Run?
I read about DOMParser and SAXParser in Java. I have no trouble with DOMParser, and I understand that people prefer SAXParser over DOMParser because of the memory DOM takes. Although I understand the concept of SAXParser, I cannot understand this code:
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ReadXMLFileSAX {
    public static void main(String args[]) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            SAXParser saxParser = factory.newSAXParser();
            DefaultHandler handler = new DefaultHandler() {
                boolean bfname = false;
                boolean blname = false;
                boolean bnname = false;
                boolean bsalary = false;

                public void startElement(String uri, String localName,
                        String qName, Attributes attributes)
                        throws SAXException {
                    System.out.println("Start Element :" + qName);
                    if (qName.equalsIgnoreCase("FIRSTNAME")) {
                        bfname = true;
                    }
                    if (qName.equalsIgnoreCase("LASTNAME")) {
                        blname = true;
                    }
                    if (qName.equalsIgnoreCase("NICKNAME")) {
                        bnname = true;
                    }
                    if (qName.equalsIgnoreCase("SALARY")) {
                        bsalary = true;
                    }
                }

                public void endElement(String uri, String localName,
                        String qName)
                        throws SAXException {
                    System.out.println("End Element :" + qName);
                }

                public void characters(char ch[], int start, int length)
                        throws SAXException {
                    if (bfname) {
                        System.out.println("First Name : "
                                + new String(ch, start, length));
                        bfname = false;
                    }
                    if (blname) {
                        System.out.println("Last Name : "
                                + new String(ch, start, length));
                        blname = false;
                    }
                    if (bnname) {
                        System.out.println("Nick Name : "
                                + new String(ch, start, length));
                        bnname = false;
                    }
                    if (bsalary) {
                        System.out.println("Salary : "
                                + new String(ch, start, length));
                        bsalary = false;
                    }
                }
            };
            saxParser.parse("/home/anto/Groovy/Java/file.xml", handler);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
And the .xml file is:
<?xml version="1.0"?>
<company>
    <staff>
        <firstname>yong</firstname>
        <lastname>mook kim</lastname>
        <nickname>mkyong</nickname>
        <salary>100000</salary>
    </staff>
    <staff>
        <firstname>low</firstname>
        <lastname>yin fong</lastname>
        <nickname>fong fong</nickname>
        <salary>200000</salary>
    </staff>
</company>
And when I run the program I get output like this:
Start Element :company
Start Element :staff
Start Element :firstname
First Name : yong
End Element :firstname
Start Element :lastname
Last Name : mook kim
End Element :lastname
Start Element :nickname
Nick Name : mkyong
End Element :nickname
Start Element :salary
Salary : 100000
End Element :salary
End Element :staff
Start Element :staff
Start Element :firstname
First Name : low
End Element :firstname
Start Element :lastname
Last Name : yin fong
End Element :lastname
Start Element :nickname
Nick Name : fong fong
End Element :nickname
Start Element :salary
Salary : 200000
End Element :salary
End Element :staff
End Element :company
The output looks fine, but I'm confused by it! How is the order of the output determined? What handles this? Since this is the first time I have read about SAX and DOM, I could not figure it out. Kindly help me.

SAX is event-based. Each time the parser sees a start tag (together with its attributes), characters within a tag, an end tag, and so on, it calls the appropriate function of the handler.
So the flow here is:
- read the company start tag, call startElement for it
- read the staff start tag, call startElement for it
- read the firstname start tag, call startElement for it (which sets a boolean)
- read the characters, call the characters function for them (which sees which boolean is set, prints the appropriate message and clears the flag)
- read the firstname end tag, call the endElement function
- ...
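The flow above can be made visible by recording each callback as it happens. The following is only a minimal sketch: the class name SaxEventTrace and the helper trace are made up for illustration, and the XML is parsed from an in-memory string instead of the file used in the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventTrace {
    // Hypothetical helper: returns the callbacks in the order the parser made them.
    static List<String> trace(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attributes) {
                events.add("start " + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end " + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Skip whitespace-only text such as indentation between tags.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text " + text);
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company><staff><firstname>yong</firstname></staff></company>";
        trace(xml).forEach(System.out::println);
    }
}
```

The handler never decides the order itself. It only reacts, so the recorded list simply mirrors the order in which the tags and text appear in the document.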

By calling saxParser.parse("/home/anto/Groovy/Java/file.xml", handler); the SAX parser uses the DefaultHandler that you implemented (the handler you passed as a parameter) to do the XML parsing.
SAX is event-based: the events are raised as the parser traverses your XML document. When the SAX parser encounters the start of an element, for example <firstname>, it calls the startElement method. It then moves on to the body of that element and sees yong. Since this is not enclosed in a <> tag, it is treated as text, so the parser calls the characters method. If there were another XML element nested inside, it would call startElement again for that new element.
Finally, the SAX parser continues until it sees the end tag </firstname> and calls the endElement method.
All three methods, startElement, characters and endElement, are implemented by the developer (in your case, you).
Don't forget, SAX only traverses your XML document. It doesn't keep a record of which node is a parent or child of which node.
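Because the parser keeps no parent/child record, a handler that needs that information has to track it itself. One common way, shown here only as a sketch, is to push each element name onto a stack in startElement and pop it in endElement. The class name SaxPathTracker and the helper textPaths are invented for this illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxPathTracker {
    // Hypothetical helper: returns each piece of text together with the path of open elements.
    static List<String> textPaths(String xml) throws Exception {
        List<String> result = new ArrayList<>();
        Deque<String> open = new ArrayDeque<>(); // elements that are started but not yet ended
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attributes) {
                open.addLast(qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                open.removeLast();
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    result.add("/" + String.join("/", open) + " = " + text);
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company><staff><firstname>yong</firstname>"
                + "<salary>100000</salary></staff></company>";
        textPaths(xml).forEach(System.out::println);
    }
}
```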
Hope this helps!

The power of a SAX parser is its events. All you need to do is override/implement the proper methods; the onus is on the parsing library to call them in the right order.

The order looks fine to me. What's the issue?
If you're talking about the start and end elements, they just show the nesting of the XML tags. You can see that "company" comes before "staff", and "staff" before "firstname".
Finally, you have the data itself, inside the individual tags. That's why the last three lines are:
End Element :salary
End Element :staff
End Element :company
This is because the parser is leaving salary, salary is the last element of staff, and that is the final staff of the company.

As the parser reads the input XML, it calls startElement on every opening tag and endElement on every closing tag. If the parser meets the contents of a tag, like yong, it calls characters.
The code you posted tracks which tag is currently being parsed by using the state variables bfname, bsalary, etc. When characters is called, your code knows which entity it is being called for (first name, last name, nickname or salary), so it can interpret the raw character string properly.
So, while writing your SAX handler, you are in fact writing callbacks that track the state of the parser inside the XML: which part of the XML it is currently reading.
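One detail worth knowing when tracking state this way: a SAX parser is allowed to deliver a single run of text in several characters calls, so printing inside characters can in principle split a value. A hedged variation, not the code from the question, is to collect text in a buffer and use it in endElement. The names SaxTextBuffer and leafTexts are made up for this sketch, and it assumes elements contain either text or child elements, not both.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxTextBuffer {
    // Hypothetical helper: returns "name=text" for every element that directly contains text.
    static List<String> leafTexts(String xml) throws Exception {
        List<String> result = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                    String qName, Attributes attributes) {
                buffer.setLength(0); // a new element starts: forget earlier text
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                buffer.append(ch, start, length); // may be called more than once per text run
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                String text = buffer.toString().trim();
                if (!text.isEmpty()) {
                    result.add(qName + "=" + text);
                }
                buffer.setLength(0);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<staff><firstname>yong</firstname><lastname>mook kim</lastname></staff>";
        leafTexts(xml).forEach(System.out::println);
    }
}
```

Here the element name arrives with endElement, so no per-element boolean flags are needed.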
By contrast, when using a DOM parser, you get the whole XML document converted to a tree, so you can navigate from its root to the nodes, or backwards from the nodes to the root, in any manner you like.
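For comparison, here is a small sketch of that tree navigation using the standard javax.xml.parsers.DocumentBuilder API. The class name DomNavigation and the helper describe are invented for illustration, and the XML is a shortened in-memory version of the file from the question.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class DomNavigation {
    // Hypothetical helper: walks down from the root to a salary node, then back up to its parent.
    static String describe(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        Element root = doc.getDocumentElement();                    // top of the tree
        Node salary = root.getElementsByTagName("salary").item(0);  // root to node
        Node parent = salary.getParentNode();                       // node back towards root
        return root.getTagName() + " > " + parent.getNodeName() + " > "
                + salary.getNodeName() + " = " + salary.getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company><staff><firstname>yong</firstname>"
                + "<salary>100000</salary></staff></company>";
        System.out.println(describe(xml));
    }
}
```

The whole document is held in memory as a tree here, which is the memory cost mentioned in the question.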

A SAX parser just iterates through a document, one character at a time. The parse() method of the parser takes a handler object. The parser calls various methods of this object when it encounters certain sequences of characters in the document (an "event"). So every time the parser encounters a start tag, it calls the startElement method of the handler; when it encounters an end tag, it calls the endElement method, and so on. These methods in DefaultHandler are empty. It is up to you to subclass this class and provide your own implementation of these methods (in your code example above, DefaultHandler has been subclassed anonymously).
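The same subclassing can also be written as a named class instead of an anonymous one. The following is a minimal sketch under that assumption: ElementCounter and its count helper are hypothetical names, and only startElement is overridden, so the other callbacks keep their empty DefaultHandler behaviour.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ElementCounter extends DefaultHandler {
    private int count;

    // Called by the parser once for every start tag it reads.
    @Override
    public void startElement(String uri, String localName,
            String qName, Attributes attributes) {
        count++;
    }

    public int getCount() {
        return count;
    }

    // Hypothetical convenience method: parses the XML and returns the number of elements.
    static int count(String xml) throws Exception {
        ElementCounter counter = new ElementCounter();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), counter);
        return counter.getCount();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<company><staff><firstname>yong</firstname></staff></company>";
        System.out.println("Elements: " + count(xml));
    }
}
```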
Unlike a DOM parser, a SAX parser does not construct elements. It just fires the various handler methods as it encounters start tags, end tags and content characters. It is up to you to provide, within these methods, the logic that maps an end tag to a start tag and so on, which is what the conditional statements in the startElement method are doing. The boolean fields blname etc. just keep track of which element the parser is currently in, so that you know what the characters passed into the characters() method relate to.

Near the end, you'll notice that the saxParser.parse() method is given handler as a parameter. The handler is an instance of DefaultHandler that was defined earlier in the code. The SAXParser calls the appropriate method on the handler as it parses the XML document. See the Javadoc for DefaultHandler and SAXParser (in particular the documentation of the parse methods). As the XML document is parsed and each method in the handler is called in turn, the handler methods print out the values that were processed.