简体   繁体   中英

Convert XML to nested JSON object in specific format

This is a follow up to the question I previously asked about deriving a totally flat structure out of an XML node: Converting an xml doc into a specific dot-expanded json structure .

Suppose I have the same XML to start with:

<Item ID="288917">
  <Main>
    <Platform>iTunes</Platform>
    <PlatformID>353736518</PlatformID>
  </Main>
  <Genres>
    <Genre FacebookID="6003161475030">Comedy</Genre>
    <Genre FacebookID="6003172932634">TV-Show</Genre>
  </Genres>
  <Products>
    <Product Country="CA">
      <URL>https://itunes.apple.com/ca/tv-season/id353187108?i=353736518</URL>
      <Offers>
        <Offer Type="HDBUY">
          <Price>3.49</Price>
          <Currency>CAD</Currency>
        </Offer>
        <Offer Type="SDBUY">
          <Price>2.49</Price>
          <Currency>CAD</Currency>
        </Offer>
      </Offers>
    </Product>
    <Product Country="FR">
      <URL>https://itunes.apple.com/fr/tv-season/id353187108?i=353736518</URL>
      <Rating>Tout public</Rating>
      <Offers>
        <Offer Type="HDBUY">
          <Price>2.49</Price>
          <Currency>EUR</Currency>
        </Offer>
        <Offer Type="SDBUY">
          <Price>1.99</Price>
          <Currency>EUR</Currency>
        </Offer>
      </Offers>
    </Product>
  </Products>
</Item>

Now I would like to convert it into a nested json object in a specific format (slightly different than the xmltodict library. Here is the structure I'd like to derive:

{
    "Item[@ID]": 288917,
    "Item.Main.Platform": "iTunes",
    "Item.Main.PlatformID": "353736518",
    "Item.Genres": [
        {
            "[@FacebookID]": "6003161475030",
            "Value": "Comedy"
        },
        {
            "[@FacebookID]": "6003161475030",
            "Value": "TV-Show"
        }
    ],
    "Item.Products": [
        {
            "[@Country]": "CA",
            "URL": "https://itunes.apple.com/ca/tv-season/id353187108?i=353736518",
            "Offers.Offer": [
                {
                    "[@Type]": "HDBUY",
                    "Price": "3.49",
                    "Currency": "CAD"
                }
                {
                    "[@Type]": "SDBUY",
                    "Price": "2.49",
                    "Currency": "CAD"
                }
            ]
        },
        {
            "[@Country]": "FR",
            "URL": "https://itunes.apple.com/fr/tv-season/id353187108?i=353736518",
            "Offers.Offer": [
                {
                    "[@Type]": "HDBUY",
                    "Price": "3.49",
                    "Currency": "EUR"
                }
                {
                    "[@Type]": "SDBUY",
                    "Price": "1.99",
                    "Currency": "EUR"
                }
            ]
        }
    ]
}

The main difference being instead of collapsing everything into a list of flat values, to allow lists of dicts. How could this be done?

While doing the above might be a nice challenge, xmltodic already does a great job with this and can do the job with slight altering.

And here are the changes to make in xmltodict :

  1. Change var cdata_key from #text to Value .
  2. Change var attr_prefix from @ to [@ .
  3. Add new var attr_suffix=']' to init method.
  4. Change attr_key to key = self.attr_prefix+self._build_name(key)+self.attr_suffix .

That should give the exact result you're looking for with a tested module:

>>> from lxml import etree
>>> import xmltodict
>>> import json
>>> from utils import xmltodict
>>> node= etree.fromstring(s)
>>> d=xmltodict.parse(etree.tostring(node))
>>> print(json.dumps(d, indent=4))
{
    "Item": {
        "[@ID]": "288917",
        "Main": {
            "Platform": "iTunes",
            "PlatformID": "353736518"
        },
        "Genres": {
            "Genre": [
                {
                    "[@FacebookID]": "6003161475030",
                    "Value": "Comedy"
                },
                {
                    "[@FacebookID]": "6003172932634",
                    "Value": "TV-Show"
                }
            ]
        },
        "Products": {
            "Product": [
                {
                    "[@Country]": "CA",
                    "URL": "https://itunes.apple.com/ca/tv-season/id353187108?i=353736518",
                    "Offers": {
                        "Offer": [
                            {
                                "[@Type]": "HDBUY",
                                "Price": "3.49",
                                "Currency": "CAD"
                            },
                            {
                                "[@Type]": "SDBUY",
                                "Price": "2.49",
                                "Currency": "CAD"
                            }
                        ]
                    }
                },
                {
                    "[@Country]": "FR",
                    "URL": "https://itunes.apple.com/fr/tv-season/id353187108?i=353736518",
                    "Rating": "Tout public",
                    "Offers": {
                        "Offer": [
                            {
                                "[@Type]": "HDBUY",
                                "Price": "2.49",
                                "Currency": "EUR"
                            },
                            {
                                "[@Type]": "SDBUY",
                                "Price": "1.99",
                                "Currency": "EUR"
                            }
                        ]
                    }
                }
            ]
        }
    }
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM