
Parsing XML in Python using LXML

from lxml import etree

tree = etree.parse("pinnacle_feed.xml")

fdtime = tree.xpath('//rsp/fd/fdTime/text()')
global lasttime 
lasttime = fdtime[0]


for leagues in tree.getiterator('league'):
    leagueid = tree.xpath('//id/text()')

    for elt in leagues.getiterator('event'):
        startDateTime = elt.xpath('//startDateTime/text()')
        eventId = elt.xpath('//id/text()')
        homeTeam = elt.xpath('./homeTeam/name/text()')
        awayTeam = elt.xpath('./awayTeam/name/text()')
        homeTeamOdds = elt.xpath('./periods/period/moneyLine/homePrice/text()')
        awayTeamOdds = elt.xpath('./periods/period/moneyLine/awayPrice/text()')
        drawOdds = elt.xpath('./periods/period/moneyLine/drawPrice/text()')
        print full_iterator

That is the code I am currently using. The issue is that I need to find out the 'current' league id, as it is needed when I parse through the events in that league.

leagueid = tree.xpath('//id/text()') 

returns a list of all the league ids and not just the 'current' one.

I hope I explained myself correctly and that someone can give me a hand or some advice.

XML doc: http://pastebin.com/BDaJ7Ayx

I think this is what you need to get the id from the current node referenced by the leagues variable:

leagueid = leagues.xpath('./id/text()')

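The XPath above looks for the child node <id> of the current <league> node. An expression that starts with // is evaluated from the document root no matter which element it is called on, which is why the original call returned every id in the feed.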
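As a rough illustration of the difference, here is a minimal sketch that applies the relative ./id form to both the league and the event loops. The SAMPLE document, the collect_events helper and the dictionary keys are made up for this example; the real feed at the pastebin link may be laid out differently. It also uses iter() instead of getiterator(), which lxml marks as deprecated.

```python
from lxml import etree

# Hypothetical, trimmed-down feed; element names follow the question's code,
# but the nesting is an assumption and the real document may differ.
SAMPLE = b"""<rsp>
  <fd>
    <fdTime>1400000000</fdTime>
    <league>
      <id>10</id>
      <events>
        <event>
          <id>501</id>
          <startDateTime>2014-05-01T18:00:00Z</startDateTime>
          <homeTeam><name>Alpha</name></homeTeam>
        </event>
      </events>
    </league>
    <league>
      <id>20</id>
      <events>
        <event>
          <id>601</id>
          <startDateTime>2014-05-02T18:00:00Z</startDateTime>
          <homeTeam><name>Gamma</name></homeTeam>
        </event>
      </events>
    </league>
  </fd>
</rsp>"""


def collect_events(root):
    """Return one dict per event, tagged with the id of its own league."""
    rows = []
    for league in root.iter('league'):
        # './id' is relative: only the <id> child of this <league>.
        league_id = league.xpath('./id/text()')[0]
        for event in league.iter('event'):
            rows.append({
                'league_id': league_id,
                # Relative paths again, so values come from this <event> only.
                'event_id': event.xpath('./id/text()')[0],
                'start': event.xpath('./startDateTime/text()')[0],
                'home': event.xpath('./homeTeam/name/text()')[0],
            })
    return rows


root = etree.fromstring(SAMPLE)
rows = collect_events(root)

# For comparison: '//id' searches the whole document, even though it is
# called on a single element, so it mixes league ids and event ids.
all_ids = root.xpath('//id/text()')

for row in rows:
    print(row)
print(all_ids)
```

The same reasoning applies to the //startDateTime and //id lookups inside the event loop of the question: as written they also search the whole document, so switching them to ./startDateTime and ./id keeps each value tied to its own event.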

