
How can I implement an algorithm that traverses an HTML tree in Java?

I have to walk a tree that I receive as a NodeList. I need an algorithm that traverses all nodes in order, most likely depth-first, but I don't know how to implement it. I think I need some recursion. Can anybody help?

The relevant part of the code is:

    NodeList nodeLista = documento.getElementsByTagName("html");

    for (int s = 0; s < nodeLista.getLength(); s++) {
        Node Raiz = nodeLista.item(s);

        ....

        for (int h = 0; h < nodeLista.getLength(); h++) {

            // Depth level 1.
            // The first iteration yields HEAD, the second yields BODY.
            Node Primer_Hijo = nodeLista.item(h);

            // Depth level 2.
            Element SegundoElemento = (Element) Primer_Hijo;
            NodeList ListadeNodos2 = SegundoElemento.getChildNodes();

    .....

Recursive descent is exactly what you are looking for.

http://en.wikipedia.org/wiki/Recursive_descent_parser

For parsing HTML I have used Jerry in the past.

It bills itself as jQuery for Java and lets you use CSS-style selectors. I think several libraries implement CSS-style selectors now.

It leads to more readable code, though it might not fit your use case.

This is the pseudocode:

    traverse_tree(node) {
        childNodes = node.getChildNodes();
        if (childNodes is empty) {
            print valueOf(node);
            return;
        }
        for each childNode in childNodes {
            traverse_tree(childNode);
        }
    }

Start the traversal by calling traverse_tree(rootNode), where rootNode is the root node of the tree.
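As a rough Java sketch of the pseudocode above, assuming the tree is a standard `org.w3c.dom` document: the class name `LeafPrinter`, the helpers `collectLeaves` and `leavesOf`, and the sample markup are all made up for illustration. The sample is well-formed XHTML because the JDK's built-in XML parser rejects ordinary tag-soup HTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LeafPrinter {

    // Depth-first walk: record a node only when it has no children (a leaf).
    static void collectLeaves(Node node, List<String> out) {
        NodeList children = node.getChildNodes();
        if (children.getLength() == 0) {
            // Text nodes carry their text in getNodeValue(); empty elements return null.
            String value = node.getNodeValue();
            out.add(value != null ? value : "<" + node.getNodeName() + "/>");
            return;
        }
        for (int i = 0; i < children.getLength(); i++) {
            collectLeaves(children.item(i), out);
        }
    }

    // Hypothetical helper: parse a well-formed XHTML string and return its leaves.
    static List<String> leavesOf(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            List<String> out = new ArrayList<>();
            collectLeaves(doc.getDocumentElement(), out);
            return out;
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not well-formed", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><head><title>Demo</title></head>"
                + "<body><p>one</p><br/><p>two</p></body></html>";
        leavesOf(xhtml).forEach(System.out::println);
    }
}
```

Like the pseudocode, this only reports leaves, so element names such as `body` never appear in the output. If every node should be processed, the work has to happen before the emptiness check rather than inside it.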

Something like this:

    public static void main(String[] args) {
        // get the nodeList
        //...
        for (int h = 0; h < nodeLista.getLength(); h++) {
            Node Primer_Hijo = nodeLista.item(h);
            navegate(Primer_Hijo);
        }

        // or (better) start from the root node
        navegate(rootNode);
    }

    // static so that it can be called from main
    static void navegate(Node node) {
        // do something with node
        node.getAttributes(); // null for anything that is not an Element
        //...

        // visit the children in document order
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            navegate(children.item(i));
        }
    }
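The recursive `navegate` above handles each node before its children, which is a pre-order walk. For very deep documents the same order can be produced with an explicit stack instead of recursion. The following is only a sketch under assumptions: the class `PreOrderWalk`, the helpers `elementNames` and `parse`, and the sample markup are invented for illustration, and the input must be well-formed XHTML for the JDK parser to accept it.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class PreOrderWalk {

    // Collect the names of all elements in document order, without recursion.
    static List<String> elementNames(Node root) {
        List<String> names = new ArrayList<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node node = stack.pop();
            // Skip text, comments and other non-element nodes.
            if (node.getNodeType() == Node.ELEMENT_NODE) {
                names.add(node.getNodeName());
            }
            // Push children last-to-first so the first child is popped next.
            for (Node child = node.getLastChild(); child != null;
                    child = child.getPreviousSibling()) {
                stack.push(child);
            }
        }
        return names;
    }

    // Hypothetical helper: parse a well-formed XHTML string into a DOM document.
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not well-formed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<html><head><title>Demo</title></head>"
                + "<body><p>one</p><p>two</p></body></html>");
        // prints [html, head, title, body, p, p]
        System.out.println(elementNames(doc.getDocumentElement()));
    }
}
```

Walking siblings with `getLastChild()` and `getPreviousSibling()` avoids calling `getChildNodes()` repeatedly, which some DOM implementations make expensive on long child lists.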


 