Python, passing data from a list into xpath, in lxml
I'm having trouble passing data from a list into an XPath expression in lxml.
See these lines, below:
formatted_xpath = str(xpath_string[x])
current_data = xpath_html.xpath("string(formatted_xpath)")
# current_data = xpath_html.xpath("string(//*[@class='title-container clearfix']/h1)")
It works fine if I use this, with a hard-coded xpath in def brandScrapePage:
current_data = xpath_html.xpath("string(//*[@class='title-container clearfix']/h1)")
But it won't work if I use my formatted_xpath variable; it doesn't find anything. I know formatted_xpath is correct, because it looks good when I print it. I'm not sure what the issue is.
Code:
xpath_string = ["//*[@class='title-container clearfix']/h2"]

def pageSource(line):
    response = urllib2.urlopen(line)
    html = response.read()
    html = html.decode('utf-8', 'ignore').encode('utf-8')
    return html

def brandScrapePage(brand_html, x):
    xpath_html = etree.HTML(brand_html)
    formatted_xpath = str(xpath_string[x])
    current_data = xpath_html.xpath("string(formatted_xpath)")
    # current_data = xpath_html.xpath("string(//*[@class='title-container clearfix']/h1)")

def main():
    f = open(home_dir + url_file).readlines()
    for line in f:
        for x in xrange(0, len(field_names)):
            current_html = pageSource(line)
            brandScrapePage(current_html, x)

if __name__ == '__main__':
    main()
The problem is that "string(formatted_xpath)" is just a literal string: lxml receives the name formatted_xpath as part of the XPath expression, not the value of your Python variable. What you meant to do is create a placeholder and format the string with the actual value:
current_data = xpath_html.xpath("string(%s)" % formatted_xpath)
Or, with format():
current_data = xpath_html.xpath("string({})".format(formatted_xpath))
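To see the difference side by side, here is a minimal sketch. The HTML snippet and the "Example Brand" text are made up for illustration and are not from the original page; only the XPath expression and the variable names come from the question.

```python
from lxml import etree

# Hypothetical sample markup standing in for the scraped page.
sample_html = (
    "<html><body>"
    "<div class='title-container clearfix'><h2>Example Brand</h2></div>"
    "</body></html>"
)
xpath_html = etree.HTML(sample_html)

xpath_string = ["//*[@class='title-container clearfix']/h2"]
formatted_xpath = xpath_string[0]

# The variable name is inside the quotes, so XPath looks for child
# elements literally named <formatted_xpath>. None exist, and string()
# of an empty node-set is the empty string.
literal_result = xpath_html.xpath("string(formatted_xpath)")

# Here the value of the variable is substituted into the expression
# before lxml evaluates it.
percent_result = xpath_html.xpath("string(%s)" % formatted_xpath)
format_result = xpath_html.xpath("string({})".format(formatted_xpath))

print(repr(literal_result))  # ''
print(percent_result)        # Example Brand
print(format_result)         # Example Brand
```

This also explains why printing formatted_xpath looked fine: the variable itself was correct, it just never reached the expression that lxml evaluated.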