Parse comma separated list from xml in Python

I've spent hours searching for a solution to this problem but have come up empty-handed. I am trying to parse an XML document in Python and return its elements as comma-separated lists.

Here is an example of the XML document:

<?xml version="1.0" encoding="utf-8"?>
<Report xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://adcenter.microsoft.com/advertiser/reporting/v5/XMLSchema" ReportName="My DestinationUrl Performance Report" ReportTime="4/7/2014" TimeZone="Various" ReportAggregation="Daily" LastCompletedAvailableDay="4/8/2014 5:00:00 PM (GMT)" LastCompletedAvailableHour="4/8/2014 5:00:00 PM (GMT)" PotentialIncompleteData="false">
    <Column name="GregorianDate" />
    <Column name="AccountName" />
    <Column name="CampaignName" />
    <Column name="CampaignId" />
    <Column name="AdGroupName" />
    <Column name="AdGroupId" />
    <Column name="DestinationUrl" />
    <Column name="Impressions" />
    <Column name="Clicks" />
    <Column name="Spend" />
    <Column name="Conversions" />
      <GregorianDate value="4/7/2014" />
      <AccountName value="BingAccount" />
      <CampaignName value="Campaign#1" />
      <CampaignId value="12345678" />
      <AdGroupName value="Adgroup1" />
      <AdGroupId value="901234567" />
      <DestinationUrl value="www.example.com" />
      <Impressions value="8" />
      <Clicks value="0" />
      <Spend value="0.00" />
      <Conversions value="0" />
      <GregorianDate value="4/7/2014" />
      <AccountName value="BingAccount" />
      <CampaignName value="Campaign#2" />
      <CampaignId value="83984398493" />
      <AdGroupName value="Adgroup#2" />
      <AdGroupId value="3439843983" />
      <DestinationUrl value="www.example.co.uk" />
      <Impressions value="20" />
      <Clicks value="2" />
      <Spend value="0.10" />
      <Conversions value="0" />
  <Copyright>©2014 Microsoft Corporation. All rights reserved. </Copyright>

I want to return each row's values as a comma-separated list, so the desired result would be:

('4/7/2014','BingAccount','Campaign#1','12345678','Adgroup1','901234567','www.example.com','8','0','0.00','0')
('4/7/2014','BingAccount','Campaign#2','83984398493','Adgroup#2','3439843983','www.example.co.uk','20','2','0.10','0')

This is what I have so far, but I haven't been able to get any further:

from xml.dom import minidom

xmldoc = minidom.parse('file.xml')

rows = xmldoc.firstChild.childNodes[3].childNodes

for i in rows:
    print tuple(i.childNodes)

Try xml.etree.ElementTree, with the document above held in a string a:

In [4]: print a
<?xml version="1.0" encoding="utf-8"?>
<Report xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://adcenter.microsoft.com/advertiser/reporting/v5/XMLSchema" ReportName="My DestinationUrl Performance Report" ReportTime="4/7/2014" TimeZone="Various" ReportAggregation="Daily" LastCompletedAvailableDay="4/8/2014 5:00:00 PM (GMT)" LastCompletedAvailableHour="4/8/2014 5:00:00 PM (GMT)" PotentialIncompleteData="false">
    <Column name="GregorianDate" />
    <Column name="AccountName" />
    <Column name="CampaignName" />
    <Column name="CampaignId" />
    <Column name="AdGroupName" />
    <Column name="AdGroupId" />
    <Column name="DestinationUrl" />
    <Column name="Impressions" />
    <Column name="Clicks" />
    <Column name="Spend" />
    <Column name="Conversions" />
      <GregorianDate value="4/7/2014" />
      <AccountName value="BingAccount" />
      <CampaignName value="Campaign#1" />
      <CampaignId value="12345678" />
      <AdGroupName value="Adgroup1" />
      <AdGroupId value="901234567" />
      <DestinationUrl value="www.example.com" />
      <Impressions value="8" />
      <Clicks value="0" />
      <Spend value="0.00" />
      <Conversions value="0" />
      <GregorianDate value="4/7/2014" />
      <AccountName value="BingAccount" />
      <CampaignName value="Campaign#2" />
      <CampaignId value="83984398493" />
      <AdGroupName value="Adgroup#2" />
      <AdGroupId value="3439843983" />
      <DestinationUrl value="www.example.co.uk" />
      <Impressions value="20" />
      <Clicks value="2" />
      <Spend value="0.10" />
      <Conversions value="0" />
  <Copyright>©2014 Microsoft Corporation. All rights reserved. </Copyright>

In [5]: import xml.etree.ElementTree as ET

In [6]: root = ET.fromstring(a)

In [7]: [tuple([y.attrib['value'] for y in x]) for x in root[1]]
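For anyone who wants to run this outside an IPython session, here is a minimal, self-contained sketch of the same idea. It rests on assumptions: the sample as pasted shows no wrapper elements around the rows, so the ColumnHeaders, Table and Row names below are hypothetical stand-ins for whatever the real report uses, and the sample is trimmed to three columns. It looks elements up by namespaced tag rather than by position (root[1]), which should keep working if the order of the child elements changes.

```python
import xml.etree.ElementTree as ET

# The default namespace declared on <Report>; the "r" prefix is only a local alias.
NS = {"r": "http://adcenter.microsoft.com/advertiser/reporting/v5/XMLSchema"}

# Trimmed sample. ColumnHeaders/Table/Row are ASSUMED wrapper names,
# not confirmed by the document in the question.
SAMPLE = """\
<Report xmlns="http://adcenter.microsoft.com/advertiser/reporting/v5/XMLSchema">
  <ColumnHeaders>
    <Column name="GregorianDate" />
    <Column name="AccountName" />
    <Column name="Clicks" />
  </ColumnHeaders>
  <Table>
    <Row>
      <GregorianDate value="4/7/2014" />
      <AccountName value="BingAccount" />
      <Clicks value="0" />
    </Row>
    <Row>
      <GregorianDate value="4/7/2014" />
      <AccountName value="BingAccount" />
      <Clicks value="2" />
    </Row>
  </Table>
</Report>
"""


def rows_as_tuples(xml_text):
    """Return one tuple of 'value' attributes per (assumed) <Row> element."""
    root = ET.fromstring(xml_text)
    return [
        # A missing 'value' attribute becomes an empty string.
        tuple(cell.get("value", "") for cell in row)
        for row in root.findall("r:Table/r:Row", NS)
    ]


rows = rows_as_tuples(SAMPLE)
for row in rows:
    # Join each row into a single comma-separated line.
    print(",".join(row))
```

If the values themselves can contain commas or quotes, writing the rows with the standard csv module is probably safer than a plain join.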
