I want to extract the lines that lie between two tags in my XML file. Here is an example:
<userData code="viPartListRailML" value="1">
<partRailML s="0.0000000000000000e+00" id="0"/>
<partRailML s="2.0000000000000000e+01" id="1"/>
<partRailML s="9.4137883373059267e+01" id="2"/>
</userData>
Here is the code I tried:
import re
shakes = open("N:\SAJAT_MAPPAK\IGYULAVICS\/adhoc\pythonXMLread\probaxml\github_minta.xml", "r")
for x in shakes:
    if "userData" in x:
        print x
        continue
    if "/userData" in x:
        break
The problem is that it prints only the lines that contain <userData or </userData>. How can I modify it to get the lines between these two tags?
Assuming there is only one <userData> block in your file, you can extract the lines inside the block like this:
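shakes = open("./file.xml", "r")
inblock = False
for x in shakes:
    if "</userData" in x:
        inblock = False   # closing tag: stop printing
    elif inblock:
        print(x)
    elif "<userData" in x:
        # "<userData" does not match "</userData", so the closing
        # line cannot switch the flag back on
        inblock = True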
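The same flag logic can be wrapped in a small function that collects the lines instead of printing them. This is only a minimal sketch: `lines_between` and its parameters are hypothetical names, and it assumes that the opening and closing tags each sit on a line of their own, as in the example from the question.

```python
def lines_between(lines, start="<userData", end="</userData"):
    """Return the lines found strictly between a start and an end marker.

    Hypothetical helper: assumes each marker sits on its own line.
    """
    inside = False
    collected = []
    for line in lines:
        if end in line:
            inside = False          # closing tag: stop collecting
        elif inside:
            collected.append(line.rstrip("\n"))
        elif start in line:
            inside = True           # opening tag: collect from the next line on
    return collected


sample = """<userData code="viPartListRailML" value="1">
<partRailML s="0.0000000000000000e+00" id="0"/>
<partRailML s="2.0000000000000000e+01" id="1"/>
<partRailML s="9.4137883373059267e+01" id="2"/>
</userData>
<other/>
"""

for line in lines_between(sample.splitlines()):
    print(line)
```

Because the flag is reset at every closing tag, the function also copes with several <userData> blocks in one file; their lines simply end up in the same list.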
However, reading the file with an XML parser is more robust, for example:
import xml.etree.ElementTree as ET
tree = ET.parse('file.xml')
for data in tree.getroot().iter('userData'):
    for child in data:
        print(ET.tostring(child))
        # or something else, eg:
        # print(child.tag)
By the way, use Python 3 whenever possible; Python 2 has reached end of life.
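With a parser you usually want the attribute values rather than the raw lines. The sketch below reads the `s` and `id` attributes of each child element. The `<root>` wrapper and the `parts` list are assumptions made for illustration, since the question only shows the <userData> fragment.

```python
import xml.etree.ElementTree as ET

# Sample document; the <root> wrapper is an assumption, since the
# question only shows the <userData> fragment.
xml_text = """<root>
<userData code="viPartListRailML" value="1">
<partRailML s="0.0000000000000000e+00" id="0"/>
<partRailML s="2.0000000000000000e+01" id="1"/>
<partRailML s="9.4137883373059267e+01" id="2"/>
</userData>
</root>"""

root = ET.fromstring(xml_text)

parts = []
for data in root.iter("userData"):
    for child in data:
        # Attribute values are strings; convert them as needed.
        parts.append((int(child.get("id")), float(child.get("s"))))

print(parts)
```

For a file on disk, `ET.parse('file.xml').getroot()` would take the place of `ET.fromstring(xml_text)`.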
An easy way is to add a variable that tells you whether you are currently between those two tags:
# raw string, so the backslashes in the Windows path are kept literally
shakes = open(r"N:\SAJAT_MAPPAK\IGYULAVICS\/adhoc\pythonXMLread\probaxml\github_minta.xml", "r")
t = False
for x in shakes:
    if t:
        print(x)  # the closing </userData> line is printed as well
    if "/userData" in x:
        t = False
    elif "userData" in x:  # "userData" also matches "/userData", hence elif
        t = True
You can use itertools.dropwhile to reach the <userData part and then use itertools.takewhile to read until </userData:
import itertools as it
# text is assumed to hold the file contents as a single string
result = it.takewhile(
    lambda x: '</userData' not in x,
    it.dropwhile(
        lambda x: '<userData' not in x,
        text.splitlines()
    )
)
print('\n'.join(result))
If you want to skip the <userData element you can add itertools.islice:
result = it.takewhile(
    lambda x: '</userData' not in x,
    it.islice(it.dropwhile(
        lambda x: '<userData' not in x,
        text.splitlines()
    ), 1, None)
)
print('\n'.join(result))
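Since dropwhile, takewhile and islice accept any iterable, the same chain can also be fed an open file directly, so the file does not have to be read into a string first. Here is a self-contained sketch of that idea. The `io.StringIO` object stands in for an open file, and the `<header/>` and `<footer/>` lines are made-up surroundings added for illustration.

```python
import io
import itertools as it

# io.StringIO stands in for an open file here (an assumption made so that
# the sketch runs without a file on disk).
fake_file = io.StringIO(
    '<header/>\n'
    '<userData code="viPartListRailML" value="1">\n'
    '<partRailML s="0.0000000000000000e+00" id="0"/>\n'
    '<partRailML s="2.0000000000000000e+01" id="1"/>\n'
    '</userData>\n'
    '<footer/>\n'
)

# Skip everything before the opening tag.
after_open = it.dropwhile(lambda x: '<userData' not in x, fake_file)
body = it.takewhile(
    lambda x: '</userData' not in x,
    it.islice(after_open, 1, None),   # skip the <userData ...> line itself
)
lines = [line.rstrip('\n') for line in body]
print('\n'.join(lines))
```

Lines read from a file keep their trailing newline, which is why they are stripped before being joined.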