Parse nested XML into DataFrame

I get back a response from an API in XML form. It looks like:

<result_set>
    <status> ...
    </status>
    <copyright> ...
    </copyright>
    <results>
       <congress> ...
       </congress>
       <chamber> ...
       </chamber>
       <num_results> ...
       </num_results>
       <offset> ...
       </offset>
       <members>
         <member>
           <id> ... </id>
           <name> ... </name>
           ...
           ...
           ...
         </member>
         <member> ... </member>
         ...
         ... 
         ...
       </members>
    </results>
</result_set>

I need to get the data under each <member> tag into a pandas DataFrame of the form:

id name ... ... ...
1  JOHN ... ... ...
2  DOE  ... ... ...

I have tried ElementTree, but all of my attempts so far have failed.

Below is a working example:

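import pandas as pd
import xml.etree.ElementTree as ET

# Reduced version of the API response: only the <members> block.
xml = '''<r> 
         <members>
         <member>
           <id>1</id>
           <name>jack</name>
         </member>
         <member>
           <id>5</id>
           <name>dan</name>
         </member>

       </members>
       </r>'''

root = ET.fromstring(xml)
data = []
# './/member' matches every <member> element at any depth below the root.
members = root.findall('.//member')
for member in members:
    # One row per member: each child tag becomes a column, its text the value.
    data.append({c.tag: c.text for c in member})
df = pd.DataFrame(data)
print(df.head())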

Output:

  id  name
0  1  jack
1  5   dan
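
The same idea can be applied to the full nested layout from the question. The following is only a sketch under assumptions: the element values in `SAMPLE` are made up (the original response elides them with "..."), and `members_to_frame` is a hypothetical helper name. It uses an explicit path from the root instead of `.//member`, and treats empty tags as empty strings.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical sample values; the real API response elides them with "...".
SAMPLE = """<result_set>
    <status>OK</status>
    <results>
        <congress>116</congress>
        <chamber>Senate</chamber>
        <members>
            <member>
                <id>1</id>
                <name>JOHN</name>
            </member>
            <member>
                <id>2</id>
                <name>DOE</name>
            </member>
        </members>
    </results>
</result_set>"""


def members_to_frame(xml_text):
    """Build a DataFrame with one row per <member> element."""
    root = ET.fromstring(xml_text)
    rows = []
    # Explicit path relative to <result_set>, so only <member> elements
    # under results/members are collected.
    for member in root.findall("./results/members/member"):
        # Element.text is None for an empty tag; fall back to "" and
        # strip surrounding whitespace otherwise.
        rows.append({child.tag: (child.text or "").strip() for child in member})
    return pd.DataFrame(rows)


df = members_to_frame(SAMPLE)
print(df)
```

All values come back as strings, because ElementTree does no type conversion. If numeric columns are needed, something like `pd.to_numeric(df["id"])` can be applied afterwards.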
