Let's say I have some XML data on an online product with multiple prices:
<Response>
    <TotalOffers>6</TotalOffers>
    <LowPrices>
        <LowPrice condition="new">
            <CurrencyCode>USD</CurrencyCode>
            <Amount>15.50</Amount>
        </LowPrice>
        <LowPrice condition="used">
            <CurrencyCode>USD</CurrencyCode>
            <Amount>22.86</Amount>
        </LowPrice>
    </LowPrices>
</Response>
My ultimate goal is to pass it through a function that parses the XML into the form of a simplified dict that looks something like this:
response = {
    'total_offers': 6,
    'low_prices': [
        {'condition': "new", 'currency': "USD", 'amount': 15.50},
        {'condition': "used", 'currency': "USD", 'amount': 22.86},
    ]
}
Using the lxml library, this is fairly simple to do. I just have to specify the XPath for each value and handle the cases where the expected data is missing. For example, to get the TotalOffers value (6), I would do something like this:
from lxml import etree

# convert xml to etree object
tree_obj = etree.fromstring(xml_text)
# use xpath to find values that I want in this tree object
matched_els = tree_obj.xpath('//TotalOffers')
# xpath matches are returned as a list
# since there could be more than one match grab only the first one
first_match_el = matched_els[0]
# extract the text and print to console
print(first_match_el.text)
# prints: 6 (the .text value is the string '6', not an int)
Now my thinking is that I could write a function like get_text(tree_obj, xpath_to_value). But what if I also want this function to convert the value into its appropriate type (e.g. string, float, or int)? Should I have a parameter that specifies the type, like get_text(tree_obj, xpath_to_value, type='float')?
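A minimal sketch of what such a helper might look like, assuming it is acceptable to pass the conversion function itself (int, float, str) rather than a type name. The `convert` and `default` parameters and the shortened sample XML are illustrative assumptions, not part of lxml; the helper also assumes the XPath selects elements or attributes, not a computed number or boolean:

```python
from lxml import etree

# Shortened copy of the sample document, used only for this illustration.
XML_TEXT = """<Response>
  <TotalOffers>6</TotalOffers>
  <LowPrices>
    <LowPrice condition="new">
      <CurrencyCode>USD</CurrencyCode>
      <Amount>15.50</Amount>
    </LowPrice>
  </LowPrices>
</Response>"""


def get_text(node, xpath_to_value, convert=str, default=None):
    """Return the first XPath match passed through `convert`, or `default`.

    Hypothetical helper: `convert` is any one-argument callable such as
    int, float, or str.
    """
    matches = node.xpath(xpath_to_value)
    if not matches:
        return default
    first = matches[0]
    # Element matches carry their value in .text; attribute matches
    # (e.g. './@condition') are already string-like objects.
    raw = first.text if etree.iselement(first) else first
    if raw is None:
        return default
    try:
        return convert(raw.strip())
    except ValueError:
        # The text was present but not convertible (e.g. int('abc')).
        return default


tree_obj = etree.fromstring(XML_TEXT)
total = get_text(tree_obj, '//TotalOffers', convert=int)
first_price = tree_obj.xpath('//LowPrice')[0]
amount = get_text(first_price, './Amount', convert=float)
condition = get_text(first_price, './@condition')
missing = get_text(tree_obj, '//NoSuchTag', convert=int)
```

Passing a callable avoids maintaining a mapping from type names to converters, and it leaves room for custom converters later (for example, one based on decimal.Decimal for prices).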
Because if I do that, my next step in creating the dict would be something like this:
low_prices = []
# select each LowPrice element, not the LowPrices container
low_prices_els = tree_obj.xpath('//LowPrices/LowPrice')
for el in low_prices_els:
    low_prices.append(
        {
            'condition': get_text(el, './@condition', type='str'),
            'currency': get_text(el, './CurrencyCode', type='str'),
            'amount': get_text(el, './Amount', type='float')
        }
    )
response = {
    'total_offers': get_text(tree_obj, '//TotalOffers', type='int'),
    'low_prices': low_prices
}
Is this the best way to accomplish what I am trying to do? I feel like I'm creating problems for myself in the future.
I think the tool you need is an XML-to-JSON converter, which converts the XML document to JSON format. You can test one at:
http://codebeautify.org/xmltojson
Output:
{"Response":{"TotalOffers":"6","LowPrices":{"LowPrice":[{"CurrencyCode":"USD","Amount":"15.50","_condition":"new"},{"CurrencyCode":"USD","Amount":"22.86","_condition":"used"}]}}}
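Note that every value in that output is still a string and the keys keep the XML names, so a small post-processing step is still needed to reach the simplified dict from the question. A rough sketch using only the standard library, where `simplify` is a hypothetical name and the single-child handling is an assumption about how such converters may behave:

```python
import json

# The converter output shown above, pasted as a string.
converter_output = (
    '{"Response":{"TotalOffers":"6","LowPrices":{"LowPrice":['
    '{"CurrencyCode":"USD","Amount":"15.50","_condition":"new"},'
    '{"CurrencyCode":"USD","Amount":"22.86","_condition":"used"}]}}}'
)


def simplify(raw):
    """Reshape the converter's nested dict into the simplified form."""
    body = raw["Response"]
    offers = body["LowPrices"]["LowPrice"]
    # Assumption: a converter may emit a lone child as a dict instead of
    # a one-item list, so normalise to a list before iterating.
    if isinstance(offers, dict):
        offers = [offers]
    return {
        "total_offers": int(body["TotalOffers"]),
        "low_prices": [
            {
                "condition": offer["_condition"],
                "currency": offer["CurrencyCode"],
                "amount": float(offer["Amount"]),
            }
            for offer in offers
        ],
    }


response = simplify(json.loads(converter_output))
```

This raises KeyError when an expected field is missing, so the missing-data handling mentioned in the question would still have to be added.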