Python XPath on a URL returns an empty list

Question

I am trying to access an element of the following URL using XPath: http://www.booking.com/searchresults.html?dest_id=2400&dest_type=region&offset=288

The specific element I am looking for is the div with class "sr_item_link_to_villas ". I have been using the following XPath to try to access it (in this example I am trying to access the second listing, but the full script loops through each listing), but it returns an empty list:

//*[@id="hotellist_inner"]/*[contains(@class,"sr_item")][2]//*[contains(@class,"sr_item_link_to_villas ")]

The full code is:

# Assumed import: the original snippet did not show where parse comes from
from lxml.html import parse

url = 'http://www.booking.com/searchresults.html?dest_id=2400&dest_type=region&offset=288'
page = parse(url).getroot()
pathstr = '//*[@id="hotellist_inner"]/*[contains(@class,"sr_item")][2]//*[contains(@class,"sr_item_link_to_villas ")]'
content = page.xpath(pathstr)

Answer 1

The following code may solve your problem. You have to send browser-like request headers (for example a User-Agent) with the request to get the data.

    # Python 2 code (urllib2 and the print statement)
    import urllib2
    from lxml import etree

    def get_HTML(url):
        # Send browser-like headers with the request
        header = {
            "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0",
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
            "Connection": "keep-alive",
        }
        req = urllib2.Request(url, None, header)
        return urllib2.urlopen(req).read()

    url = "http://www.booking.com/searchresults.html?dest_id=2400&dest_type=region&offset=288"

    read = get_HTML(url)
    tree = etree.HTML(read)
    # Link text inside each div whose class attribute is exactly 'sr_item_link_to_villas '
    data = tree.xpath("//div[@class='sr_item_link_to_villas ']/a/text()")
    print data
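
For readers on Python 3, here is a rough sketch of the same idea using urllib.request in place of urllib2. The names get_html, extract_villa_links, HEADERS and VILLA_LINK_XPATH are made up for this illustration. The XPath is also a swapped-in variant, not the expression from the answer: it matches the class as a whole token, so it should not depend on the trailing space in the class attribute. Whether the site still serves this markup is not something the sketch can promise.

```python
import urllib.request

from lxml import etree

# Browser-like headers, copied from the answer above.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:41.0) Gecko/20100101 Firefox/41.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Connection": "keep-alive",
}

# Assumption: match the class as a whole token instead of comparing the
# full attribute string, so extra classes or stray spaces do not matter.
VILLA_LINK_XPATH = (
    "//div[contains(concat(' ', normalize-space(@class), ' '), "
    "' sr_item_link_to_villas ')]/a/text()"
)


def get_html(url):
    """Fetch the page while sending the browser-like headers."""
    req = urllib.request.Request(url, None, HEADERS)
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def extract_villa_links(html):
    """Return the link texts found inside the villa-link divs."""
    tree = etree.HTML(html)
    return tree.xpath(VILLA_LINK_XPATH)


# Example usage (performs a real network request, so it is left commented out):
# url = "http://www.booking.com/searchresults.html?dest_id=2400&dest_type=region&offset=288"
# print(extract_villa_links(get_html(url)))
```

Keeping the fetch and the parse in separate functions makes it possible to try the XPath against a saved copy of the page, which helps tell apart "the server sent different HTML" from "the expression is wrong".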

Answer 1
0 2015-12-24 04:48:46
