How to modify XML as text in lxml

I have an XML file generated by an IDE; however, it unfortunately outputs code with newlines as BRs and seems to randomly decide where to place newlines. Example:

if test = true
    foo;
    bar;
endif

becomes the following XHTML within an XML file:

<body>
    <p>if test = true<br />    foo;<br />    bar;<br />endif
    </p>
</body>

I am trying to write a pre-processor for these files in Python using lxml, to make them easier to version control. However, I cannot figure out how to modify the XML as text so that I can place each BR on its own line, like the following:

<body>
<p>if test = true
    <br />    foo;
    <br />    bar;
    <br />endif
</p>
</body>

How does one edit the XML as text or, failing that, is there another way to get results like the above?

One option would be to append a newline character to the p tag's text and to each br tag's tail. Example:

from lxml import html

data = """
<html>
<body>
<p>if test = true<br />    foo;<br />    bar;<br />endif
</p>
</body>
</html>
"""

tree = html.fromstring(data)

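# Append a newline to the text that precedes the first <br>
p = tree.find('.//p')
p.text = (p.text or '') + '\n'

# Append a newline to the text that follows each <br>;
# tail is None when a <br> has no trailing text, so guard against that
for element in tree.xpath('.//br'):
    element.tail = (element.tail or '') + '\n'

# encoding='unicode' returns a str rather than bytes on Python 3
print(html.tostring(tree, encoding='unicode'))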

Prints:

<html>
<body>
<p>if test = true
<br>    foo;
<br>    bar;
<br>endif

</p>
</body>
</html>
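
Note that lxml.html serializes as HTML, so the self-closing `<br />` comes back as `<br>`. If the file has to stay well-formed XML, the same idea can be applied with lxml.etree instead. The following is a minimal sketch, not a definitive implementation: the helper name `split_on_br` is hypothetical, and it assumes the elements carry no namespace (in a namespaced XHTML document the tag would need the `{namespace}br` form). It ends the text in front of each br with a newline, so every br starts its own line:

```python
from lxml import etree

data = """<body>
    <p>if test = true<br />    foo;<br />    bar;<br />endif
    </p>
</body>"""


def split_on_br(root):
    """Hypothetical helper: end the text preceding each <br> with a newline."""
    for br in root.iter('br'):
        prev = br.getprevious()
        if prev is not None:
            # Text before this <br> is the tail of the previous sibling
            prev.tail = (prev.tail or '').rstrip('\n') + '\n'
        else:
            # First child: text before this <br> is the parent's text
            parent = br.getparent()
            parent.text = (parent.text or '').rstrip('\n') + '\n'
    return root


root = etree.fromstring(data)
split_on_br(root)

# The XML serializer keeps <br/> self-closing
result = etree.tostring(root, encoding='unicode')
print(result)
```

Because trailing newlines are stripped before one is appended, running the pre-processor on an already processed file should leave it unchanged, which keeps version-control diffs quiet. lxml writes the empty element as `<br/>` rather than `<br />`; both are equivalent XML.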
