Parsing XML in Python using lxml

Question

from lxml import etree

tree = etree.parse("pinnacle_feed.xml")

fdtime = tree.xpath('//rsp/fd/fdTime/text()')
global lasttime 
lasttime = fdtime[0]


for leagues in tree.getiterator('league'):
    leagueid = tree.xpath('//id/text()')

    for elt in leagues.getiterator('event'):
        startDateTime = elt.xpath('//startDateTime/text()')
        eventId = elt.xpath('//id/text()')
        homeTeam = elt.xpath('./homeTeam/name/text()')
        awayTeam = elt.xpath('./awayTeam/name/text()')
        homeTeamOdds = elt.xpath('./periods/period/moneyLine/homePrice/text()')
        awayTeamOdds = elt.xpath('./periods/period/moneyLine/awayPrice/text()')
        drawOdds = elt.xpath('./periods/period/moneyLine/drawPrice/text()')
        print full_iterator

That is the code I am currently using. The issue is that I need the ID of the 'current' league, because I use it while parsing the events in that league.

leagueid = tree.xpath('//id/text()')

returns a list of every ID in the document, not just the ID of the current league.

I hope I have explained myself clearly and that someone can give me a hand or some advice.

XML doc: http://pastebin.com/BDaJ7Ayx

Answer 1

I think this is what you need to get the ID from the current node referenced by the leagues variable:

leagueid = leagues.xpath('./id/text()')

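The XPath above looks for the child node <id> of the current <league> node. By contrast, an expression that starts with // is evaluated from the document root, even when it is called on an element, so it matches every <id> in the file.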
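To make the difference between the relative and the absolute expression concrete, here is a minimal sketch. The inline XML is a hypothetical, cut-down feed invented for illustration (the real structure in the pastebin may differ), and collect_events is an illustrative helper name, not something from the question. It also uses iter() rather than getiterator(), which lxml marks as deprecated.

```python
from lxml import etree

# Hypothetical miniature feed, made up for this example only.
SAMPLE = b"""
<rsp>
  <fd>
    <leagues>
      <league>
        <id>100</id>
        <events>
          <event><id>1</id><homeTeam><name>A</name></homeTeam></event>
          <event><id>2</id><homeTeam><name>B</name></homeTeam></event>
        </events>
      </league>
      <league>
        <id>200</id>
        <events>
          <event><id>3</id><homeTeam><name>C</name></homeTeam></event>
        </events>
      </league>
    </leagues>
  </fd>
</rsp>
""".strip()

root = etree.fromstring(SAMPLE)


def collect_events(root):
    """Return (league_id, event_id, home_team) tuples using relative XPath."""
    rows = []
    for league in root.iter("league"):
        # "./id" only matches the <id> that is a direct child of this <league>.
        league_id = league.xpath("./id/text()")[0]
        for event in league.iter("event"):
            # Same idea for the event: stay relative to the current element.
            event_id = event.xpath("./id/text()")[0]
            home = event.xpath("./homeTeam/name/text()")[0]
            rows.append((league_id, event_id, home))
    return rows


first_league = next(root.iter("league"))
# Absolute path: searched from the document root, so every <id> matches.
all_ids = first_league.xpath("//id/text()")
# Relative path: only the league's own child <id>.
own_id = first_league.xpath("./id/text()")

print(all_ids)
print(own_id)
print(collect_events(root))
```

The same reasoning applies to the //startDateTime and //id lookups inside the event loop in the question: written with a leading ./ they would stay scoped to the current event.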

Answer 1
0 accepted 2015-02-24 01:43:38