Remove XML tags with lxml
I want to strip the XML tags from the file "new.xml" and print the data with a label in front of each value.
Here is what I tried:
from lxml import etree
tree = etree.parse("C:\\Users\\name\\Desktop\\new.xml")
root = tree.getroot()
for text in root.iter():
    print(text.text)
The XML is:
<connection>
  <rhel>
    <runscript>y</runscript>
    <username>useranme</username>
    <password>passw</password>
    <store>None</store>
    <port>2</port>
    <host>192.168.73.56</host>
    <logdirectory>logs</logdirectory>
  </rhel>
</connection>
I get the following output:
yes
username
passw
None
2
192.168.73.56
logs
But I want it printed as:
is it a new connection: yes
username: username
password: passw
value: none
connections: 2
host: 192.168.73.56
log dir : logs
You need to parse the file according to its structure. To do that, loop over the child elements and look at each child's tag name and text.
from lxml import etree
tree = etree.parse("test.xml")
root = tree.getroot()
connections = []
for node in root.findall('rhel'):  # every 'rhel' node that is a child of the root 'connection' node
    # Build a dictionary mapping each child's tag to its text.
    connections.append({info.tag: info.text for info in node})
print(connections)
for conn in connections:
    print('=' * 20)
    print("""is it a new connection: {runscript}
username: {username}
password: {password}
value: {store}
connections: {port}
host: {host}
log dir : {logdirectory}""".format(**conn))
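As a variant of the same idea, here is a minimal sketch that keeps the tag-to-label mapping in a list of pairs, so the output order is fixed and a missing tag is skipped instead of raising KeyError. The names XML, LABELS and describe are hypothetical and not part of the original answer. The XML is inlined so the snippet runs on its own, and it falls back to the standard library's ElementTree (assumed compatible for these calls) if lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib API is compatible for fromstring/findall/findtext.
    import xml.etree.ElementTree as etree

XML = """<connection>
  <rhel>
    <runscript>y</runscript>
    <username>useranme</username>
    <password>passw</password>
    <store>None</store>
    <port>2</port>
    <host>192.168.73.56</host>
    <logdirectory>logs</logdirectory>
  </rhel>
</connection>"""

# (tag, label) pairs; the list order is the output order.
LABELS = [
    ("runscript", "is it a new connection"),
    ("username", "username"),
    ("password", "password"),
    ("store", "value"),
    ("port", "connections"),
    ("host", "host"),
    ("logdirectory", "log dir"),
]

def describe(node):
    """Return one 'label: text' line for each known child tag of node."""
    lines = []
    for tag, label in LABELS:
        text = node.findtext(tag)  # None when the child element is missing
        if text is not None:
            lines.append("%s: %s" % (label, text))
    return lines

root = etree.fromstring(XML)
for rhel in root.findall("rhel"):
    print("=" * 20)
    print("\n".join(describe(rhel)))
```

Keeping the labels in data rather than in a format string means a new tag only needs one extra pair in LABELS.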
You could also try repr(root) to see what is being printed. However, that is not recommended, for a number of reasons.
Hope this helps.
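Update:
On Python older than 2.7 you can use connections.append(dict((info.tag, info.text) for info in node))
in place of the dict-comprehension line, since dict comprehensions were only added in 2.7.
Or, as a last resort, you can build the dictionary with an explicit loop (inside the loop over the 'rhel' nodes):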
c = {}
for info in node:
    c[info.tag] = info.text
connections.append(c)
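To check that the three spellings really build the same dictionary, here is a small self-contained sketch. The inline two-element sample and the variable names are made up for illustration, and the fallback to the standard library's ElementTree is an assumption in case lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:
    # Assumption: the stdlib API behaves the same for this small example.
    import xml.etree.ElementTree as etree

# Hypothetical, shortened 'rhel' node.
node = etree.fromstring("<rhel><port>2</port><host>192.168.73.56</host></rhel>")

# 1. Dict comprehension (needs Python 2.7 or later).
by_comprehension = {info.tag: info.text for info in node}

# 2. dict() over a generator of (key, value) pairs (also works before 2.7).
by_constructor = dict((info.tag, info.text) for info in node)

# 3. Explicit loop.
by_loop = {}
for info in node:
    by_loop[info.tag] = info.text

assert by_comprehension == by_constructor == by_loop
print(by_loop)
```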
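Also, if str.format gives you trouble on an old Python (it was added in 2.6, so earlier versions do not have it), replace it with old-style % string formatting: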
print("""is it a new connection: %(runscript)s
username: %(username)s
password: %(password)s
value: %(store)s
connections: %(port)s
host: %(host)s
log dir : %(logdirectory)s""" % conn)
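The two formatting styles should produce identical text for a dictionary like this. A minimal sketch to confirm it, using a hypothetical two-key record rather than the full connection dictionary:

```python
# Hypothetical sample record; the real one comes from the parsed XML.
conn = {"host": "192.168.73.56", "port": "2"}

# New-style formatting: named fields filled from keyword arguments.
new_style = "host: {host}\nconnections: {port}".format(**conn)

# Old-style formatting: named fields filled from a mapping.
old_style = "host: %(host)s\nconnections: %(port)s" % conn

assert new_style == old_style
print(old_style)
```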