Parsing an XML file with Java

Question

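I have a question about the list in the sample code below. My professor adds the Document object to an ArrayList. It looks like this adds only the single Document object to the list, not each individual Node. But then, looking at the while loop, it seems he gets the item at index 0, parses its information, and then removes that item so he can look at the next one. So there seems to be more going on in the ArrayList than just one Document object. What is happening in the ArrayList / while loop section? I'm confused about how this code works. Thanks in advance!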

import java.io.*; 
import java.util.*; 
import javax.xml.parsers.*; 
import org.w3c.dom.*; 
import org.xml.sax.*; 


public class RSSReader {
    public static void main(String[] args) {
        File f = new File("testrss.xml");
        if (f.isFile()) {
            System.out.println("is File");
            RSSReader xml = new RSSReader(f);
        }
    }

    public RSSReader(File xmlFile) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(xmlFile);

            List<Node> nodeList = new ArrayList<Node>();
            nodeList.add(doc);

            // Keep going until every queued node has been processed.
            while(nodeList.size() > 0)
            {
                // Always look at the oldest entry in the list.
                Node node = nodeList.get(0);

                if (node instanceof Element) {
                    System.out.println("Element Node: " + ((Element)node).getTagName());
                    NamedNodeMap attrMap = node.getAttributes();
                    for(int i = 0; i < attrMap.getLength(); i++)
                    {
                        Attr attribute = (Attr) attrMap.item(i);
                        System.out.print("\tAttribute Key: " + attribute.getName()
                            + " Value: " + attribute.getValue());
                    }
                    if(node.hasAttributes())
                        System.out.println();
                }
                else if(node instanceof Text)
                    System.out.println("Text Node: " + node.getNodeValue());
                else
                    System.out.println("Other Type: " + node.getNodeValue());

                // Append this node's children to the end of the list for later processing.
                if(node.hasChildNodes())
                {
                    NodeList nl = node.getChildNodes();
                    for(int i = 0; i < nl.getLength(); i++)
                    {
                        nodeList.add(nl.item(i));
                    }
                }
                // Drop the node just processed; the next one moves to index 0.
                nodeList.remove(0);
            }
        }

        catch (IOException e) {
            e.printStackTrace();
        }
        catch (SAXException e) {
            e.printStackTrace();
        }
        catch (IllegalArgumentException e) {
            e.printStackTrace();
        }
        catch (ParserConfigurationException e) {
            e.printStackTrace();
        }
    }
}

Answer 1

I think the approach your professor is demonstrating here is called a breadth-first algorithm. The key block of code in the loop is

if(node.hasChildNodes()) 
{ 
    NodeList nl = node.getChildNodes(); 
    for(int i = 0; i < nl.getLength(); i++) 
    { 
        nodeList.add(nl.item(i)); 
    } 
}

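After an element from the list has been processed, this code checks whether that element has children that need processing. If it does, they are added to the end of the list of nodes to process. That is why the list holds more than the one Document object: it starts with the Document alone, and every pass through the loop may append more nodes.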

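With this algorithm, the root element is processed first, then its children, then their children, then the children below those, and so on until only the leaves of the tree are left.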
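To make that order concrete, here is a minimal sketch of the same breadth-first idea. It is not the professor's code: the class name `BreadthFirstDemo`, the `parse` helper, and the sample XML string are made up for illustration, and it uses an `ArrayDeque` as the queue instead of calling `get(0)` / `remove(0)` on an `ArrayList`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class BreadthFirstDemo {
    // Hypothetical helper for this sketch: parses an XML string into a DOM Document.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Returns the node names in the order a queue-driven loop visits them.
    public static List<String> breadthFirstNames(String xml) {
        List<String> visited = new ArrayList<>();
        Queue<Node> queue = new ArrayDeque<>();
        queue.add(parse(xml));                 // start with only the Document
        while (!queue.isEmpty()) {
            Node node = queue.remove();        // take the oldest node
            visited.add(node.getNodeName());
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                queue.add(children.item(i));   // children wait behind everything already queued
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Sample input is an assumption; it has no whitespace, so no Text nodes appear.
        System.out.println(breadthFirstNames(
                "<rss><channel><title/><item/></channel><extra/></rss>"));
        // prints [#document, rss, channel, extra, title, item]
    }
}
```

Note that `extra` is visited before `title` and `item`: everything at one depth is handled before anything one level deeper.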

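(Side note: this seems like the wrong approach for an XML document, especially an RSS feed. I would guess you want a depth-first algorithm so that the output is easier to follow. In that case you would use a stack instead of a list.)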
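A rough sketch of that depth-first variant, assuming a `Deque` used as the stack. The class name `DepthFirstDemo`, the `parse` helper, and the sample XML are hypothetical, and pushing the children in reverse is one possible way to keep document order, not the only one.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class DepthFirstDemo {
    // Hypothetical helper for this sketch: parses an XML string into a DOM Document.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Returns the node names in the order a stack-driven loop visits them.
    public static List<String> depthFirstNames(String xml) {
        List<String> visited = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(parse(xml));
        while (!stack.isEmpty()) {
            Node node = stack.pop();           // take the newest node
            visited.add(node.getNodeName());
            NodeList children = node.getChildNodes();
            // Push in reverse so the first child ends up on top and is visited next.
            for (int i = children.getLength() - 1; i >= 0; i--) {
                stack.push(children.item(i));
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        System.out.println(depthFirstNames(
                "<rss><channel><title/><item/></channel><extra/></rss>"));
        // prints [#document, rss, channel, title, item, extra]
    }
}
```

Here each element is followed directly by its own descendants, which matches the order in which the tags appear in the file.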

Answer 2

Every child of every node gets added to the List<Node> by the following code:

if(node.hasChildNodes()) 
{ 
    NodeList nl = node.getChildNodes(); 
    for(int i = 0; i < nl.getLength(); i++) 
    { 
        nodeList.add(nl.item(i)); 
    } 
}

Basically, this means that every node in the document will be visited.
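One way to check that claim is to count the nodes the list-driven loop visits and compare the result with a plain recursive count. The sketch below does that; the class name `VisitCountDemo`, its methods, and the sample XML are invented for illustration. One caveat: "every node" here means every node reachable through `getChildNodes()`. Attributes are not child nodes in the DOM, which is why the question's code reads them separately through `getAttributes()`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class VisitCountDemo {
    // Hypothetical helper for this sketch: parses an XML string into a DOM Document.
    public static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // The same list-driven loop as in the question, reduced to counting visits.
    public static int countWithList(Document doc) {
        List<Node> nodeList = new ArrayList<>();
        nodeList.add(doc);
        int visited = 0;
        while (!nodeList.isEmpty()) {
            Node node = nodeList.remove(0);    // returns and removes the first entry
            visited++;
            NodeList nl = node.getChildNodes();
            for (int i = 0; i < nl.getLength(); i++) {
                nodeList.add(nl.item(i));
            }
        }
        return visited;
    }

    // Reference count: this node plus all of its descendants.
    public static int countRecursively(Node node) {
        int total = 1;
        NodeList nl = node.getChildNodes();
        for (int i = 0; i < nl.getLength(); i++) {
            total += countRecursively(nl.item(i));
        }
        return total;
    }

    public static void main(String[] args) {
        Document doc = parse("<feed version=\"2.0\"><entry id=\"1\">hello</entry></feed>");
        System.out.println(countWithList(doc) + " " + countRecursively(doc));
    }
}
```

For this sample both counts should come out as 4 (the document, `feed`, `entry`, and the text node); the two attributes are not included in either count.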
