Python & lxml / XPath: Parsing XML

Question

I need to get the value of FLVPath from this link: http://www.testpage.com/v2/videoConfigXmlCode.php?pg=video_29746_no_0_extsite

import requests
import lxml.html

sub_r = requests.get("http://www.testpage.co/v2/videoConfigXmlCode.php?pg=video_%s_no_0_extsite" % list[6])
sub_root = lxml.html.fromstring(sub_r.content)

for sub_data in sub_root.xpath('//PLAYER_SETTINGS[@Name="FLVPath"]/@Value'):
    print(sub_data.text)

But no data is returned.

Answer 1

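You are parsing the document with lxml.html, which makes lxml lowercase all element and attribute names (case does not matter in HTML). Attribute values are left unchanged, so you have to use: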

sub_root.xpath('//player_settings[@name="FLVPath"]/@value')

Alternatively, since you are parsing an XML file, you can use lxml.etree, which preserves the original case.
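To illustrate the difference, here is a minimal sketch that runs the same XPath queries against both parsers. The sample document and the example.com URL are made up for the demonstration; the real response from the page in the question may be laid out differently.

```python
import lxml.etree
import lxml.html

# Hypothetical sample document; the real file's structure is an assumption.
SAMPLE = b"""<CONFIG>
  <PLAYER_SETTINGS Name="FLVPath" Value="http://example.com/video.flv"></PLAYER_SETTINGS>
  <PLAYER_SETTINGS Name="AutoPlay" Value="true"></PLAYER_SETTINGS>
</CONFIG>"""

UPPER = '//PLAYER_SETTINGS[@Name="FLVPath"]/@Value'
LOWER = '//player_settings[@name="FLVPath"]/@value'

# HTML parser: element and attribute names are lowercased, values are not.
html_root = lxml.html.fromstring(SAMPLE)
html_upper = html_root.xpath(UPPER)
html_lower = html_root.xpath(LOWER)

# XML parser: names keep their original case.
xml_root = lxml.etree.fromstring(SAMPLE)
xml_upper = xml_root.xpath(UPPER)

print("lxml.html, original-case query:", html_upper)
print("lxml.html, lowercase query:    ", html_lower)
print("lxml.etree, original-case query:", xml_upper)
```

With the HTML parser only the lowercase query should match, while lxml.etree matches the query as originally written.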

Answer 2

You can try:

print(sub_data.attrib['Value'])
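Note that .attrib exists on elements only. An XPath ending in /@Value returns plain strings, so this suggestion applies when the query selects the element itself. A minimal sketch, assuming lxml.etree and a made-up sample document:

```python
import lxml.etree

# Hypothetical sample document; the real file's structure is an assumption.
SAMPLE = (
    b'<CONFIG>'
    b'<PLAYER_SETTINGS Name="FLVPath" Value="http://example.com/video.flv"/>'
    b'</CONFIG>'
)

root = lxml.etree.fromstring(SAMPLE)

values = []
# Select the matching elements (no trailing /@Value), then read the attribute.
for element in root.xpath('//PLAYER_SETTINGS[@Name="FLVPath"]'):
    values.append(element.attrib['Value'])

print(values)
```

element.get('Value') is an alternative that returns None instead of raising KeyError when the attribute is missing.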

Answer 3

import lxml.etree
import requests

url = "http://www.testpage.com/v2/videoConfigXmlCode.php?pg=video_29746_no_0_extsite"
response = requests.get(url)

# Use `lxml.etree` rather than `lxml.html`,
# and unicode `response.text` instead of `response.content`
doc = lxml.etree.fromstring(response.text)

for path in doc.xpath('//PLAYER_SETTINGS[@Name="FLVPath"]/@Value'):
    print(path)
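One caveat worth checking: if the XML starts with a declaration that names an encoding, at least some lxml versions reject an already-decoded string with a ValueError, and passing the raw bytes (response.content in requests) avoids that. The sketch below is an offline illustration with a made-up document, not the actual response from the page in the question.

```python
import lxml.etree

# Hypothetical response body that carries an encoding declaration.
xml_bytes = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<CONFIG>'
    b'<PLAYER_SETTINGS Name="FLVPath" Value="http://example.com/video.flv"/>'
    b'</CONFIG>'
)
xml_text = xml_bytes.decode("utf-8")  # roughly what response.text would hold

# Parsing the decoded string may fail, depending on the lxml version.
try:
    lxml.etree.fromstring(xml_text)
    text_parse_error = None
except ValueError as exc:
    text_parse_error = str(exc)

# Parsing the raw bytes lets lxml apply the declared encoding itself.
doc = lxml.etree.fromstring(xml_bytes)
paths = doc.xpath('//PLAYER_SETTINGS[@Name="FLVPath"]/@Value')

print("error when parsing str:", text_parse_error)
print("values from bytes:", paths)
```

If the response has no encoding declaration, both inputs should parse the same way.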

3 answers

Answer 1: 4, accepted, 2012-12-09 20:52:03
Answer 2: 2, 2012-12-09 20:49:16
Answer 3: 0, 2012-12-09 20:56:24
