Python, passing data from a list into xpath, in lxml

Having trouble passing data from a list into xpath, in lxml.

See these lines, below:

formatted_xpath = str(xpath_string[x])
current_data = xpath_html.xpath("string(formatted_xpath)")
# current_data = xpath_html.xpath("string(//*[@class='title-container clearfix']/h1)")

It works fine if I use this, with a hard-coded xpath in def brandScrapePage:

current_data = xpath_html.xpath("string(//*[@class='title-container clearfix']/h1)")

But it won't work if I use my formatted_xpath variable. It does not find anything. I know formatted_xpath is correct; it looks good when I print it. Not sure what the issue is.

CODE:

xpath_string = ["//*[@class='title-container clearfix']/h2"]


def pageSource(line):
    response = urllib2.urlopen(line)
    html = response.read()
    html = html.decode('utf-8', 'ignore').encode('utf-8')
    return html


def brandScrapePage(brand_html, x):
    xpath_html = etree.HTML(brand_html)
    formatted_xpath = str(xpath_string[x])
    current_data = xpath_html.xpath("string(formatted_xpath)")
    # current_data = xpath_html.xpath("string(//*[@class='title-container clearfix']/h1)")

def main():

    f = open(home_dir + url_file).readlines()
    for line in f:
        for x in xrange(0, len(field_names)):
            current_html = pageSource(line)
            brandScrapePage(current_html, x)


if __name__ == '__main__':
    main()

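The problem is that formatted_xpath inside the quotes is literal text, not your variable. lxml evaluates the XPath expression string(formatted_xpath), which looks for child elements named formatted_xpath and finds none, so you get an empty string. You meant to create a placeholder and format the string with the actual value: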

current_data = xpath_html.xpath("string(%s)" % formatted_xpath)

Or, with format():

current_data = xpath_html.xpath("string({})".format(formatted_xpath))
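For completeness, here is a minimal sketch of the same fix applied to every entry in a list of paths. It assumes Python 3 and that lxml is installed. SAMPLE_HTML, xpath_strings, string_expr and scrape_fields are hypothetical names made up for this illustration, not part of the original code.

```python
# Sketch only: the sample markup and all names below are hypothetical.
SAMPLE_HTML = """
<html><body>
  <div class="title-container clearfix"><h1>Brand</h1><h2>Tagline</h2></div>
</body></html>
"""

xpath_strings = [
    "//*[@class='title-container clearfix']/h1",
    "//*[@class='title-container clearfix']/h2",
]


def string_expr(path):
    """Wrap an XPath location path in the XPath string() function."""
    # The path is substituted into the expression text, so lxml sees
    # string(//...) rather than the literal name of a Python variable.
    return "string({})".format(path)


def scrape_fields(html, paths):
    """Return the string value matched by each path, in the order given."""
    # lxml is third-party; it is imported here so that string_expr
    # can be used on its own without it.
    from lxml import etree

    tree = etree.HTML(html)
    return [tree.xpath(string_expr(path)) for path in paths]
```

Calling scrape_fields(SAMPLE_HTML, xpath_strings) should then give the text of the h1 and h2 elements, one entry per path. Parsing the page once and looping over the paths also avoids downloading and re-parsing the same HTML for each field, which the nested loop in main() currently does.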
