XML ElementTree in Python: iterate through children and save each child as a CSV column

Question

I have been searching all over SO for a solution to my current issue but haven't been able to find anything that adequately solves it. I am attempting to iterate through the children of a root node in an XML document and pull the values of each of the sub-children within the iteration (e.g. iterate through LINKS in the XML below and pull each instance of COMPANY and ROLE). This is the last piece of a huge project and I am completely stuck; any and all help would be immensely appreciated.

<Personnel Personnel_ID = "123">
  <First_Name> First </First_Name>
  <Last_Name> Last </Last_Name>
  <User_ID> 123 </User_ID> 
  <Date> 2017-01-01 </Date>
  <INFO>
      <INFO_1>
        <PHONE> 555-555-5555 </PHONE>
      </INFO_1>
      <INFO_2>
        <EMAIL> name@email.com </EMAIL>
      </INFO_2>
  </INFO>     
  <LINKS>
      <LINK COMPANY = "Company 1" ROLE = "Role 1" />
      <LINK COMPANY = "Company 2" ROLE = "Role 2" />
      <LINK COMPANY = "Company 3" ROLE = "Role 3" />
       ....
      <LINK COMPANY = "Company n" ROLE = "Role n" />
  </LINKS>
  <TAGS>
      <TAG Term="Tag 1" />
      <TAG Term="Tag 2" />
      <TAG Term="Tag 3" />
      ...................
      <TAG Term="Tag n" />
  </TAGS>
  <Personnel_Field_1> Field 1 </Personnel_Field_1>
  <Personnel_Field_2> Field 2 </Personnel_Field_2>
</Personnel>

Example Code:

for contact in root.findall('Personnel'):
    Personnel_ID = contact.get('Personnel_ID')
    contact_info.append(Personnel_ID)

    First_Name = contact.find('First_Name').text
    contact_info.append(First_Name)

    Last_Name = contact.find('Last_Name').text
    contact_info.append(Last_Name)

    User = contact.find('User_ID').text
    contact_info.append(User)

    Date = contact.find('Date').text
    contact_info.append(Date)

    Email = contact.find( './/EMAIL' ).text
    contact_info.append(Email)

    Phone = contact.find( './/PHONE' ).text
    contact_info.append(Phone)

    Personnel_1 = contact.find('Personnel_Field_1').text.encode('utf-8')
    contact_info.append(Personnel_1)

    Personnel_2 = contact.find('Personnel_Field_2').text.encode('utf-8')
    contact_info.append(Personnel_2)

So far I have been successful in pulling the following and saving them into CSV columns: Personnel ID, First Name, Last Name, User ID, Date, Email, Phone, Personnel 1, Personnel 2.

What I am stuck on is iterating through the LINK elements to parse COMPANY and ROLE, and through the TAG elements to parse each Term. I need to save each company, role, and tag value as its own column as well. If anyone can help by simply showing me how to iterate through these elements, I will be able to save them into CSV columns.

Thanks in advance for any and all advice. This is the last piece of a huge project I am working on, and I feel like I've exhausted all the potential solutions I have found.

Answer 1

Simply add nested for loops to parse the LINK and TAG children.

for contact in root.findall('Personnel'):
    ...
    for link in contact.findall('.//LINK'):
        contact_info.append(link.get('COMPANY'))
        contact_info.append(link.get('ROLE'))

    for tag in contact.findall('.//TAG'):
        contact_info.append(tag.get('Term'))
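
For reference, here is a minimal end-to-end sketch of the nested loops above, run against a small in-memory sample. Several things in it are assumptions, not part of the original post: the `<Root>` wrapper element, the trimmed-down sample data, the helper names `build_rows` and `rows_to_csv`, the use of one fresh `contact_info` list per contact, and `io.StringIO` standing in for a real output file.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample modelled on the question's XML. The <Root> wrapper is an
# assumption so that root.findall('Personnel') has a parent to search in.
SAMPLE_XML = """
<Root>
  <Personnel Personnel_ID="123">
    <First_Name> First </First_Name>
    <Last_Name> Last </Last_Name>
    <LINKS>
      <LINK COMPANY="Company 1" ROLE="Role 1" />
      <LINK COMPANY="Company 2" ROLE="Role 2" />
    </LINKS>
    <TAGS>
      <TAG Term="Tag 1" />
      <TAG Term="Tag 2" />
      <TAG Term="Tag 3" />
    </TAGS>
  </Personnel>
</Root>
"""


def build_rows(root):
    """Return one flat list of values per Personnel element."""
    rows = []
    for contact in root.findall('Personnel'):
        # Start a fresh row per contact so values from different
        # contacts do not run together in one list.
        contact_info = [
            contact.get('Personnel_ID'),
            contact.findtext('First_Name', default='').strip(),
            contact.findtext('Last_Name', default='').strip(),
        ]
        # Each LINK contributes a COMPANY value and a ROLE value.
        for link in contact.findall('.//LINK'):
            contact_info.append(link.get('COMPANY'))
            contact_info.append(link.get('ROLE'))
        # Each TAG contributes its Term value.
        for tag in contact.findall('.//TAG'):
            contact_info.append(tag.get('Term'))
        rows.append(contact_info)
    return rows


def rows_to_csv(rows):
    """Serialise the rows to CSV text (StringIO stands in for a file)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator='\n')
    writer.writerows(rows)
    return buffer.getvalue()


root = ET.fromstring(SAMPLE_XML.strip())
rows = build_rows(root)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Note that contacts with different numbers of LINK or TAG elements will produce rows of different lengths. `csv.writer` accepts that, but the columns will not line up across rows unless the rows are padded to a common width first.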


Answer 1: accepted 2017-11-29 20:15:02
