
Python - lxml - how to 'move' around the tree when building the tree

Basic question: how do you 'move' around in a tree while you are building it?

I can populate the first level:

import lxml.etree as ET

def main():
    root = ET.Element('baseURL')
    root.attrib["URL"]='www.com'
    root.attrib["title"]='Level Title'
    myList = [["www.1.com","site 1 Title"],["www.2.com","site 2 Title"],["www.3.com","site 3 Title"]]
    # enumerate() supplies the index for the tag name and unpacks each [URL, title] pair
    for i, (url, title) in enumerate(myList):
        ET.SubElement(root, "link_"+str(i), URL=url, title=title)

This gives me something like:

baseURL:
        link_0
        link_1
        link_2

from there, I want to add a subtree from each of the new nodes so it looks something like:

baseURL:
        link_0:
               link_A
               link_B
               link_C
        link_1
        link_2

I can't see how to 'point' the SubElement call at the next node down. I tried:

myList2 = [["www.A.com","site A Title"],["www.B.com","site B Title"],["www.C.com","site C Title"]]
for i in range(len(myList2)):
    # 'link_0' here is a plain string, not an element
    ET.SubElement('link_0', "link_"+str(i), URL=myList2[i][0], title=myList2[i][1])

But that throws the error:

TypeError: Argument '_parent' has incorrect type (expected lxml.etree._Element, got str)

as I am giving the SubElement call a string, not an element reference. I also tried it as a bare variable (i.e. link_0 rather than "link_0"), and that raises a NameError for an undefined global name, so my reference is obviously incorrect.

How do I 'point' my lxml builder to a child as a parent, and write a new child?

ET.SubElement(parent_node, tag) creates a new XML element as a child of parent_node. It also returns this new element.
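
As a minimal sketch of that return value in use (the tag names and attributes are borrowed from the question; the fallback import is an assumption added only so the snippet also runs where lxml is not installed):

```python
try:
    import lxml.etree as ET
except ImportError:
    # Assumption: the standard library module offers the same
    # Element/SubElement calls used in this snippet.
    import xml.etree.ElementTree as ET

root = ET.Element("baseURL")

# SubElement returns the element it just created, so keep the reference...
link_0 = ET.SubElement(root, "link_0", URL="www.1.com", title="site 1 Title")

# ...and pass that element (not the string "link_0") as the next parent.
link_A = ET.SubElement(link_0, "link_A", URL="www.A.com", title="site A Title")

parent_tag = root[0].tag     # tag of the first child of root
child_tag = root[0][0].tag   # tag of the first grandchild
```

The variable name link_0 is just a Python name holding the element; it has no connection to the tag string "link_0" beyond what you choose to call it.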

So you could do this:

import lxml.etree as ET

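def main():
    root = ET.Element('baseURL')
    myList = [1,2,3]
    children = []
    for x in myList:
        # keep each new element so it can serve as a parent later
        children.append(ET.SubElement(root, "link_"+str(x)))

    for y in myList:
        # children[0] is the first link element created above
        ET.SubElement(children[0], "child_"+str(y))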

But keeping track of the children is probably excessive since lxml already provides you with many ways to get to them.

Here's a way using lxml's built-in child indexing (an element behaves like a list of its children):

node = root[0]
for y in myList:
    ET.SubElement(node, "child_"+str(y))

Here's a way using XPath (possibly better if your XML is getting ugly):

# with myList = [1,2,3] the first child is tagged link_1; use link_0 if your tags start at 0
node = root.xpath("/baseURL/link_1")[0]
for y in myList:
    ET.SubElement(node, "child_"+str(y))
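
If the goal is simply to look a child up by its tag name, which is what passing 'link_0' in the question was attempting, find() is another option. The following is a sketch under assumptions: the tags are numbered from 0 as in the question, the letter suffixes are illustrative, and the fallback import is there only so it runs without lxml.

```python
try:
    import lxml.etree as ET
except ImportError:
    # Assumption: the standard library module supports the same
    # Element/SubElement/find calls used in this snippet.
    import xml.etree.ElementTree as ET

root = ET.Element("baseURL")
for i in range(3):
    ET.SubElement(root, "link_" + str(i))

# find() takes a path relative to the element it is called on and
# returns the first match, or None when nothing matches.
node = root.find("link_0")
if node is not None:
    for name in ["A", "B", "C"]:
        ET.SubElement(node, "link_" + name)

missing = root.find("link_9")  # no such child
child_tags = [child.tag for child in root[0]]
```

Checking for None before using the result avoids an error when the tag does not exist, which indexing the result of xpath() with [0] would not protect against.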

Found the answer: I should be using Python's list-style indexing, root[n], rather than trying to reach the node via link_0.
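
Applied to the data from the question, that could look roughly like the sketch below. The letter suffixes come from the desired layout in the question; the zip over "ABC" and the fallback import are assumptions made to keep the example short and runnable.

```python
try:
    import lxml.etree as ET
except ImportError:
    # Assumption: the standard library module supports the same calls.
    import xml.etree.ElementTree as ET

root = ET.Element("baseURL", URL="www.com", title="Level Title")
myList = [["www.1.com", "site 1 Title"],
          ["www.2.com", "site 2 Title"],
          ["www.3.com", "site 3 Title"]]
myList2 = [["www.A.com", "site A Title"],
           ["www.B.com", "site B Title"],
           ["www.C.com", "site C Title"]]

# First level: link_0, link_1, link_2 directly under the root.
for i, (url, title) in enumerate(myList):
    ET.SubElement(root, "link_" + str(i), URL=url, title=title)

# root[0] is the first child element (link_0); use it as the parent.
parent = root[0]
for letter, (url, title) in zip("ABC", myList2):
    ET.SubElement(parent, "link_" + letter, URL=url, title=title)

# Serialize to check the resulting structure.
xml_text = ET.tostring(root).decode("utf-8")
print(xml_text)
```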

