Parse comma separated list from xml in Python
I've spent hours searching for a solution to this problem but have come up empty-handed. I am trying to parse an XML document in Python and return each row's elements as a comma-separated list.
Here is an example of the XML document:
<?xml version="1.0" encoding="utf-8"?>
<Report xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://adcenter.microsoft.com/advertiser/reporting/v5/XMLSchema" ReportName="My DestinationUrl Performance Report" ReportTime="4/7/2014" TimeZone="Various" ReportAggregation="Daily" LastCompletedAvailableDay="4/8/2014 5:00:00 PM (GMT)" LastCompletedAvailableHour="4/8/2014 5:00:00 PM (GMT)" PotentialIncompleteData="false">
<DestinationUrlPerformanceReportColumns>
<Column name="GregorianDate" />
<Column name="AccountName" />
<Column name="CampaignName" />
<Column name="CampaignId" />
<Column name="AdGroupName" />
<Column name="AdGroupId" />
<Column name="DestinationUrl" />
<Column name="Impressions" />
<Column name="Clicks" />
<Column name="Spend" />
<Column name="Conversions" />
</DestinationUrlPerformanceReportColumns>
<Table>
<Row>
<GregorianDate value="4/7/2014" />
<AccountName value="BingAccount" />
<CampaignName value="Campaign#1" />
<CampaignId value="12345678" />
<AdGroupName value="Adgroup1" />
<AdGroupId value="901234567" />
<DestinationUrl value="www.example.com" />
<Impressions value="8" />
<Clicks value="0" />
<Spend value="0.00" />
<Conversions value="0" />
</Row>
<Row>
<GregorianDate value="4/7/2014" />
<AccountName value="BingAccount" />
<CampaignName value="Campaign#2" />
<CampaignId value="83984398493" />
<AdGroupName value="Adgroup#2" />
<AdGroupId value="3439843983" />
<DestinationUrl value="www.example.co.uk" />
<Impressions value="20" />
<Clicks value="2" />
<Spend value="0.10" />
<Conversions value="0" />
</Row>
</Table>
<Copyright>©2014 Microsoft Corporation. All rights reserved. </Copyright>
</Report>
I want to return each row's values as a comma-separated list, so the desired results would be:
('4/7/2014','BingAccount','Campaign#1','12345678','Adgroup1','901234567','www.example.com','8','0','0.00','0')
('4/7/2014','BingAccount','Campaign#2','83984398493','Adgroup#2','3439843983','www.example.co.uk','20','2','0.10','0')
This is what I have so far, but I haven't been able to get any further:
from xml.dom import minidom
xmldoc = minidom.parse('file.xml')
rows = xmldoc.firstChild.childNodes[3].childNodes
for i in rows:
    print(tuple(i.childNodes))
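For reference, here is a minimal minidom sketch of the same idea, assuming every child of a Row carries its data in a `value` attribute. `tuple(i.childNodes)` yields node objects (including the whitespace text nodes between tags) rather than attribute values, so the sketch keeps only element nodes and reads the attribute explicitly. `SAMPLE` and `row_values` are illustrative names, and the sample is a trimmed-down stand-in for the report, not the full file.

```python
from xml.dom import minidom

# Trimmed-down stand-in for the report (assumption: same nesting as the real file).
SAMPLE = """<Report xmlns="http://adcenter.microsoft.com/advertiser/reporting/v5/XMLSchema">
  <Table>
    <Row>
      <GregorianDate value="4/7/2014" />
      <CampaignName value="Campaign#1" />
    </Row>
    <Row>
      <GregorianDate value="4/7/2014" />
      <CampaignName value="Campaign#2" />
    </Row>
  </Table>
</Report>"""


def row_values(xml_text):
    """Return one tuple of `value` attributes per <Row> element."""
    doc = minidom.parseString(xml_text)
    result = []
    for row in doc.getElementsByTagName("Row"):
        # childNodes also contains whitespace text nodes; keep elements only.
        cells = [n for n in row.childNodes if n.nodeType == n.ELEMENT_NODE]
        result.append(tuple(c.getAttribute("value") for c in cells))
    return result


minidom_rows = row_values(SAMPLE)
for row in minidom_rows:
    print(row)
```

Looking rows up by tag name avoids depending on positions such as `childNodes[3]`, which shift whenever whitespace or sibling elements change.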
Try xml.etree:
In [4]: print a
<?xml version="1.0" encoding="utf-8"?>
<Report xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://adcenter.microsoft.com/advertiser/reporting/v5/XMLSchema" ReportName="My DestinationUrl Performance Report" ReportTime="4/7/2014" TimeZone="Various" ReportAggregation="Daily" LastCompletedAvailableDay="4/8/2014 5:00:00 PM (GMT)" LastCompletedAvailableHour="4/8/2014 5:00:00 PM (GMT)" PotentialIncompleteData="false">
<DestinationUrlPerformanceReportColumns>
<Column name="GregorianDate" />
<Column name="AccountName" />
<Column name="CampaignName" />
<Column name="CampaignId" />
<Column name="AdGroupName" />
<Column name="AdGroupId" />
<Column name="DestinationUrl" />
<Column name="Impressions" />
<Column name="Clicks" />
<Column name="Spend" />
<Column name="Conversions" />
</DestinationUrlPerformanceReportColumns>
<Table>
<Row>
<GregorianDate value="4/7/2014" />
<AccountName value="BingAccount" />
<CampaignName value="Campaign#1" />
<CampaignId value="12345678" />
<AdGroupName value="Adgroup1" />
<AdGroupId value="901234567" />
<DestinationUrl value="www.example.com" />
<Impressions value="8" />
<Clicks value="0" />
<Spend value="0.00" />
<Conversions value="0" />
</Row>
<Row>
<GregorianDate value="4/7/2014" />
<AccountName value="BingAccount" />
<CampaignName value="Campaign#2" />
<CampaignId value="83984398493" />
<AdGroupName value="Adgroup#2" />
<AdGroupId value="3439843983" />
<DestinationUrl value="www.example.co.uk" />
<Impressions value="20" />
<Clicks value="2" />
<Spend value="0.10" />
<Conversions value="0" />
</Row>
</Table>
<Copyright>©2014 Microsoft Corporation. All rights reserved. </Copyright>
</Report>
In [5]: import xml.etree.ElementTree as ET
In [6]: root = ET.fromstring(a)
In [7]: [tuple([y.attrib['value'] for y in x]) for x in root[1]]
Out[7]:
[('4/7/2014',
'BingAccount',
'Campaign#1',
'12345678',
'Adgroup1',
'901234567',
'www.example.com',
'8',
'0',
'0.00',
'0'),
('4/7/2014',
'BingAccount',
'Campaign#2',
'83984398493',
'Adgroup#2',
'3439843983',
'www.example.co.uk',
'20',
'2',
'0.10',
'0')]
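The `root[1]` index works because Table happens to be the second child of Report. As a rough sketch, assuming the same document layout, the rows can also be looked up by their namespaced tag, and each tuple joined into a comma-separated string. `NS`, `SAMPLE`, and `rows_from_report` are names made up for this illustration, and the sample is a trimmed-down stand-in for the report.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on <Report>; child elements inherit it.
NS = {"r": "http://adcenter.microsoft.com/advertiser/reporting/v5/XMLSchema"}

# Trimmed-down stand-in for the report (assumption: same nesting as the real file).
SAMPLE = """<Report xmlns="http://adcenter.microsoft.com/advertiser/reporting/v5/XMLSchema">
  <DestinationUrlPerformanceReportColumns>
    <Column name="GregorianDate" />
    <Column name="CampaignName" />
    <Column name="Clicks" />
  </DestinationUrlPerformanceReportColumns>
  <Table>
    <Row>
      <GregorianDate value="4/7/2014" />
      <CampaignName value="Campaign#1" />
      <Clicks value="0" />
    </Row>
    <Row>
      <GregorianDate value="4/7/2014" />
      <CampaignName value="Campaign#2" />
      <Clicks value="2" />
    </Row>
  </Table>
</Report>"""


def rows_from_report(xml_text):
    """Return one tuple of `value` attributes per <Row> element."""
    root = ET.fromstring(xml_text)
    return [
        # A missing `value` attribute becomes an empty string.
        tuple(cell.get("value", "") for cell in row)
        for row in root.findall("r:Table/r:Row", NS)
    ]


rows = rows_from_report(SAMPLE)
lines = [",".join(row) for row in rows]
for line in lines:
    print(line)
```

If a value could itself contain a comma, writing the tuples with the standard `csv` module instead of `",".join` would take care of quoting.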