Python Parse XML to JSON

I am currently working with some interesting XML string responses. Essentially, the XML I'm receiving is nested, but it reads like a CSV file. Example:

xml = <?xml version="1.0" encoding="ISO-8859-1"?>
<ThisDocument protocol="OCI" xmlns="C" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<sessionId xmlns="">29348u29!!4nthisSucks!==</sessionId>
  <command echo="" xsi:type="GroupGetListInServiceProviderResponse" xmlns="">
    <groupTable>
      <colHeading>Group Id</colHeading>
      <colHeading>Group Name</colHeading>
      <colHeading>User Limit</colHeading>
      <row>
        <col>LRB7905</col>
        <col>Test1</col>
        <col>25</col>
      </row>
      <row>
        <col>LRB9294</col>
        <col>Test2</col>
        <col>100</col>
      </row>
      <row>
        <col>LRB8270</col>
        <col>Test3</col>
        <col>10</col>
      </row>
      <row>
        <col>LRB8212</col>
        <col>Test4</col>
        <col>25</col>
      </row>
      <row>
        <col>LRB8175</col>
        <col>Test5</col>
        <col>25</col>
      </row>
    </groupTable>
  </command>
</ThisDocument>

In the responses I receive from the server in question, each 'colHeading' is a key, and the 'col' elements in each 'row' are the corresponding values. It seems like an easy structure to map, but I cannot think of a Pythonic way to perform this task. The desired outcome is:

{
  "groupTable": [
    {
        "Group ID": "LRB7905",
        "Group Name": "Test1",
        "User Limit": "25"
    },
    {
        "Group ID": "LRB9294",
        "Group Name": "Test2",
        "User Limit": "100"
    },
    {
        "Group ID": "LRB8270",
        "Group Name": "Test3",
        "User Limit": "10"
    },
    {
        "Group ID": "LRB8212",
        "Group Name": "Test4",
        "User Limit": "25"
    },
    {
        "Group ID": "LRB8175",
        "Group Name": "Test5",
        "User Limit": "25"
    }
  ]
}

The information I really need is contained in the 'col' fields of the XML, and the number of colHeadings corresponds to the number of values in each 'row'. So far, I've been able to write the values out as CSV files, but ultimately I need to create JSON objects (dicts) of key-value pairs. I've tried different libraries and modules, but the best approach I've come up with is to break the colHeadings and values into two lists and then combine them.

Code so far:

import itertools
import xml.etree.ElementTree as ET

xmlroot = ET.fromstring(xml)

headings = []
values = []

def breakoutLists(root):
    # Collect every heading and every cell into two flat lists.
    for columnHeading in root.iter('colHeading'):
        headings.append(columnHeading.text)
    for column in root.iter('col'):
        values.append(column.text)
    return headings, values

breakoutLists(xmlroot)

# The values are passed first, so they become the dictionary keys.
zipped = dict(zip(values, itertools.cycle(headings)))
print(zipped)

This produces a dictionary, but the pairs come out as values: keys instead of keys: values.
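
A small sketch of that behaviour, using hard-coded lists in place of the parsed XML. The list contents are copied from the first and fourth rows of the sample; nothing else here comes from the server. It suggests two separate problems: the cell values become the keys, and all rows collapse into a single flat dict in which a repeated value such as "25" appears only once.

```python
import itertools

headings = ["Group Id", "Group Name", "User Limit"]
# Cells from two sample rows, flattened the same way the code above does.
values = ["LRB7905", "Test1", "25", "LRB8212", "Test4", "25"]

# zip() yields (value, heading) pairs and dict() uses the first item as key.
zipped = dict(zip(values, itertools.cycle(headings)))

# One flat dict keyed by cell value; the duplicate "25" is stored once.
print(zipped)
print(len(zipped))
```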

I'd appreciate any suggestions on the best way to approach this task. Thanks in advance!

EDIT: Thanks to the help of Eric, I was able to accomplish my goal!

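import json
import xml.etree.ElementTree as ET

root = ET.fromstring(xml)

groupResp = {'groupResponse': []}

def breakoutLists(root):
    # The headings are shared by every row, so collect them once.
    headings = [h.text for h in root.iter('colHeading')]

    # Build one dict per row, pairing each heading with that row's cell.
    return (
        {
            h: col.text
            for h, col in zip(headings, row.iter('col'))
        }
        for row in root.iter('row')
    )

data = list(breakoutLists(root))

for item in data:
    groupResp['groupResponse'].append(item)

print(json.dumps(groupResp))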

I can probably clean this up a bit by building the dictionary inside the function itself, but I'm good for now!
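
One possible shape for that cleanup, as a sketch rather than a tested replacement: the function returns the wrapped dictionary directly, so no module-level dict or append loop is needed. The name `group_response` and the trimmed `sample` string are made up for illustration; only the element names and the 'groupResponse' key come from the code above.

```python
import json
import xml.etree.ElementTree as ET

def group_response(root):
    """Return {'groupResponse': [row_dict, ...]} for a parsed response."""
    headings = [h.text for h in root.iter("colHeading")]
    rows = [
        # Pair each heading with the cell in the same position of this row.
        dict(zip(headings, (col.text for col in row.iter("col"))))
        for row in root.iter("row")
    ]
    return {"groupResponse": rows}

# Trimmed-down stand-in for the server response (hypothetical test input).
sample = """<groupTable>
  <colHeading>Group Id</colHeading>
  <colHeading>Group Name</colHeading>
  <row><col>LRB7905</col><col>Test1</col></row>
  <row><col>LRB9294</col><col>Test2</col></row>
</groupTable>"""

result = group_response(ET.fromstring(sample))
print(json.dumps(result))
```

Note that zip() stops at the shorter input, so a row with fewer cells than headings would silently produce a smaller dict.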

Your code flattens the data, which is unhelpful. You need to iterate over the row objects:

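def breakoutLists(root):
    # The headings are shared by every row, so collect them once.
    headings = [h.text for h in root.iter('colHeading')]

    # One dict per row: each heading is paired with that row's 'col' cell.
    return (
        {
            h: col.text
            for h, col in zip(headings, row.iter('col'))
        }
        for row in root.iter('row')
    )

data = list(breakoutLists(xmlroot))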
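
For completeness, a self-contained sketch that runs this row-by-row approach against a shortened copy of the sample response and wraps the rows under a "groupTable" key, as in the desired output. The two-row sample and the names `xml_bytes` and `rows_to_dicts` are illustrative assumptions. Because the `command` element resets the default namespace with `xmlns=""`, the table elements should be reachable by their plain tag names.

```python
import json
import xml.etree.ElementTree as ET

# Shortened copy of the sample response (two rows instead of five).
xml_bytes = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<ThisDocument protocol="OCI" xmlns="C" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <command echo="" xsi:type="GroupGetListInServiceProviderResponse" xmlns="">
    <groupTable>
      <colHeading>Group Id</colHeading>
      <colHeading>Group Name</colHeading>
      <colHeading>User Limit</colHeading>
      <row><col>LRB7905</col><col>Test1</col><col>25</col></row>
      <row><col>LRB9294</col><col>Test2</col><col>100</col></row>
    </groupTable>
  </command>
</ThisDocument>"""

def rows_to_dicts(root):
    # Headings apply to every row, so read them once up front.
    headings = [h.text for h in root.iter("colHeading")]
    for row in root.iter("row"):
        # Pair headings with this row's cells by position.
        yield {h: col.text for h, col in zip(headings, row.iter("col"))}

root = ET.fromstring(xml_bytes)
result = {"groupTable": list(rows_to_dicts(root))}
print(json.dumps(result, indent=2))
```

The keys follow the heading text in the XML ("Group Id"), so they differ in case from the "Group ID" shown in the desired output.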
