Parse XML in Python with lxml.etree
How can I parse this site ( http://www.tvspielfilm.de/tv-programm/rss/heute2015.xml ) with Python to get, for example, today's TV programme on SAT at 20:15? I tried the Python library lxml.etree, but it did not work:
#!/usr/bin/python
import lxml.etree as ET
import urllib2

response = urllib2.urlopen('http://www.tvspielfilm.de/tv-programm/rss/heute2015.xml')
xml = response.read()
root = ET.fromstring(xml)
for item in root.findall('SAT'):
    title = item.find('title').text
    print title
The method Element.findall takes an XPath-style path expression as its argument. The expression 'SAT' matches only direct children of the root node that are named SAT, and the root node here is 'rss'. If you need to find a tag anywhere in the document, use './/SAT'.
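To make the difference between the two path forms concrete, here is a minimal sketch on a small inline document. The sample XML is made up for illustration and only mimics the usual rss/channel/item nesting; the code is written for Python 3 and falls back to the standard library's xml.etree.ElementTree (which offers the same findall interface) if lxml is not installed.

```python
try:
    import lxml.etree as ET
except ImportError:  # same findall interface in the standard library
    import xml.etree.ElementTree as ET

# Made-up sample that only mimics the rss > channel > item nesting.
SAMPLE = """<rss version="2.0">
  <channel>
    <item><title>First</title></item>
    <item><title>Second</title></item>
  </channel>
</rss>"""

root = ET.fromstring(SAMPLE)  # root is the <rss> element

# A bare tag name matches direct children of <rss> only.
direct = root.findall('item')

# The './/' prefix matches descendants at any depth.
nested = root.findall('.//item')

print(len(direct))
print(len(nested))
for item in nested:
    print(item.find('title').text)
```

Because the item elements sit inside channel rather than directly under rss, the bare name finds nothing, while the './/' form finds both.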
The expression './/item' is what you are looking for:
#!/usr/bin/python
import lxml.etree as ET
import urllib2

response = urllib2.urlopen('some/url/to.xml')
xml = response.read()
root = ET.fromstring(xml)
for item in root.findall('.//item'):
    title = item.find('title').text
    print title
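To narrow the result down to one channel and start time, as the question asks, the titles can be filtered after they are collected. The following is only a sketch under a loud assumption: it supposes that each item title has the form "time | channel | programme", which is not confirmed by anything above and should be checked against the actual feed. The sample XML and the helper name find_titles are hypothetical, and the code targets Python 3.

```python
try:
    import lxml.etree as ET
except ImportError:  # same findall/findtext interface in the standard library
    import xml.etree.ElementTree as ET

# Hypothetical sample. The "time | channel | programme" title layout is an
# assumption, not something confirmed for the real feed.
SAMPLE = """<rss version="2.0">
  <channel>
    <item><title>20:15 | ARD | Example Show A</title></item>
    <item><title>20:15 | SAT.1 | Example Show B</title></item>
    <item><title>22:00 | SAT.1 | Example Show C</title></item>
  </channel>
</rss>"""


def find_titles(root, time, channel):
    """Return programme names whose item title matches time and channel."""
    matches = []
    for item in root.findall('.//item'):
        title = item.findtext('title', default='')
        parts = [part.strip() for part in title.split('|')]
        if len(parts) < 3:
            continue  # title does not follow the assumed layout
        # Prefix match, so 'SAT' also accepts a channel written as 'SAT.1'.
        if parts[0] == time and parts[1].upper().startswith(channel.upper()):
            matches.append(parts[2])
    return matches


root = ET.fromstring(SAMPLE)
print(find_titles(root, '20:15', 'SAT'))
```

If the real titles use a different separator or field order, only the split and comparison lines need to change; the './/item' lookup stays the same.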