XML Element Tree Python: iterate through child and save each subchild as CSV column

I have searched all over SO for a solution to my current issue but haven't found anything that adequately solves it. I am trying to iterate through the children of the root node in an XML document and pull the values of each of their subchildren (e.g. iterate through the XML below and pull each instance of COMPANY and ROLE). This is the last piece of a huge project and I am completely stuck, so any and all help would be immensely appreciated.

<Personnel Personnel_ID="123">
  <First_Name> First </First_Name>
  <Last_Name> Last </Last_Name>
  <User_ID> 123 </User_ID> 
  <Date> 2017-01-01 </Date>
  <INFO>
      <INFO_1>
        <PHONE> 555-555-5555 </PHONE>
      </INFO_1>
      <INFO_2>
        <EMAIL> name@email.com </EMAIL>
      </INFO_2>
  </INFO>
  <LINKS>
      <LINK COMPANY = "Company 1" ROLE = "Role 1" />
      <LINK COMPANY = "Company 2" ROLE = "Role 2" />
      <LINK COMPANY = "Company 3" ROLE = "Role 3" />
       ....
      <LINK COMPANY = "Company n" ROLE = "Role n" />
  </LINKS>
  <TAGS>
      <TAG Term="Tag 1" />
      <TAG Term="Tag 2" />
      <TAG Term="Tag 3" />
      ...................
      <TAG Term="Tag n" />
  </TAGS>
  <Personnel_Field_1> Field 1 </Personnel_Field_1>
  <Personnel_Field_2> Field 2 </Personnel_Field_2>
</Personnel>

Example code:

for contact in root.findall('Personnel'):
    Personnel_ID = contact.get('Personnel_ID')  
    contact_info.append(Personnel_ID)   

    First_Name = contact.find('First_Name').text
    contact_info.append(First_Name)

    Last_Name = contact.find('Last_Name').text
    contact_info.append(Last_Name)

    User = contact.find('User_ID').text
    contact_info.append(User)

    Date = contact.find('Date').text
    contact_info.append(Date)

    Email = contact.find( './/EMAIL' ).text
    contact_info.append(Email)

    Phone = contact.find( './/PHONE' ).text
    contact_info.append(Phone)

    Personnel_1 = contact.find('Personnel_Field_1').text.encode('utf-8')
    contact_info.append(Personnel_1)

    Personnel_2 = contact.find('Personnel_Field_2').text.encode('utf-8')
    contact_info.append(Personnel_2)

So far I have successfully pulled the following values and saved them into CSV columns: Personnel ID, First Name, Last Name, User ID, Date, Email, Phone, Personnel 1, Personnel 2.

What I am stuck on is iterating through the LINK elements to parse COMPANY and ROLE, and through the TAG elements to parse each Term. I need to save each company, role, and tag value as its own column as well. If anyone can simply show me how to iterate through these elements, I will be able to save them into CSV columns myself.

Thanks in advance for any and all advice. This is the last piece of a huge project I am working on, and I feel like I've exhausted all the potential solutions I have found.

Simply add nested for loops to parse the LINK and TAG children. Their values are stored as attributes rather than element text, so read them with .get() instead of .text.

for contact in root.findall('Personnel'):
    ...
    for link in contact.findall('.//LINK'):
        contact_info.append(link.get('COMPANY'))
        contact_info.append(link.get('ROLE'))

    for tag in contact.findall('.//TAG'):
        contact_info.append(tag.get('Term'))
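
One thing to watch for when these values become CSV columns: different Personnel entries may have different numbers of LINK and TAG children, so rows built by plain appending can end up with different widths. The following is only a minimal sketch of one way to handle that, not the asker's actual script. The sample XML, the helper names parse_contacts and to_rows, and the numbered column names (Company_1, Role_1, Tag_1, ...) are all assumptions made for illustration. It collects the repeated values per contact first, then pads each row to the widest contact.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, loosely modeled on the XML in the question.
SAMPLE_XML = """\
<Root>
  <Personnel Personnel_ID="123">
    <First_Name> First </First_Name>
    <Last_Name> Last </Last_Name>
    <LINKS>
      <LINK COMPANY="Company 1" ROLE="Role 1" />
      <LINK COMPANY="Company 2" ROLE="Role 2" />
    </LINKS>
    <TAGS>
      <TAG Term="Tag 1" />
    </TAGS>
  </Personnel>
  <Personnel Personnel_ID="456">
    <First_Name> Other </First_Name>
    <Last_Name> Person </Last_Name>
    <LINKS>
      <LINK COMPANY="Company 3" ROLE="Role 3" />
    </LINKS>
    <TAGS>
      <TAG Term="Tag 2" />
      <TAG Term="Tag 3" />
    </TAGS>
  </Personnel>
</Root>
"""


def parse_contacts(root):
    """Return one dict per <Personnel>, with lists for the repeated children."""
    contacts = []
    for contact in root.findall('Personnel'):
        contacts.append({
            'fixed': [
                contact.get('Personnel_ID'),
                contact.findtext('First_Name', default='').strip(),
                contact.findtext('Last_Name', default='').strip(),
            ],
            # Each LINK contributes a (COMPANY, ROLE) pair read from attributes.
            'links': [(link.get('COMPANY'), link.get('ROLE'))
                      for link in contact.findall('.//LINK')],
            'tags': [tag.get('Term') for tag in contact.findall('.//TAG')],
        })
    return contacts


def to_rows(contacts):
    """Build a header plus data rows, padded so every row has the same width."""
    max_links = max((len(c['links']) for c in contacts), default=0)
    max_tags = max((len(c['tags']) for c in contacts), default=0)

    header = ['Personnel_ID', 'First_Name', 'Last_Name']
    for i in range(1, max_links + 1):
        header += [f'Company_{i}', f'Role_{i}']
    header += [f'Tag_{i}' for i in range(1, max_tags + 1)]

    rows = [header]
    for c in contacts:
        row = list(c['fixed'])
        for company, role in c['links']:
            row += [company, role]
        # Pad missing LINK pairs and TAG values with empty cells.
        row += [''] * (2 * (max_links - len(c['links'])))
        row += c['tags'] + [''] * (max_tags - len(c['tags']))
        rows.append(row)
    return rows


root = ET.fromstring(SAMPLE_XML)
rows = to_rows(parse_contacts(root))

# Written to an in-memory buffer here; a real script would more likely
# use open('out.csv', 'w', newline='') instead.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
csv_text = buffer.getvalue()
```

If every contact is known to have the same number of links and tags, the padding step is unnecessary and the plain nested loops above are enough.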
