
Extracting the value of an A tag by XPath in Python

I have a simple Python script like this:

#!/usr/bin/python
import requests
from lxml import html
response = requests.get('http://site.ir/')
out=response.content
tree = html.fromstring(open(out).read())
print [e.text_content() for e in tree.xpath('//div[class="group"]/div[class="groupinfo"]/a/text()')]

I used XPath to get the value of the a tag, but the output is not what I expected.

UPDATE: I also get the following error:

Traceback (most recent call last):
  File "p.py", line 7, in <module>
    tree = html.fromstring(open(out).read())
IOError: [Errno 36] File name too long: '\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" ....

You need to put @ in front of the attribute name to address an attribute in XPath:

//div[@class="group"]/div[@class="groupinfo"]/a/text()
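A minimal sketch of how the corrected expression might be used with lxml follows. The markup in SAMPLE_HTML and the helper name extract_link_texts are assumptions made for illustration, since the real page is not shown in the post. The sketch also hands the markup straight to html.fromstring, which expects the HTML itself rather than a file name, and it returns the text() results directly because they are already strings and have no text_content() method.

```python
from lxml import html

# Hypothetical markup standing in for the real page, which is not shown in the post.
SAMPLE_HTML = """
<html><body>
  <div class="group">
    <div class="groupinfo"><a href="/first">First link</a></div>
    <div class="groupinfo"><a href="/second">Second link</a></div>
  </div>
</body></html>
"""


def extract_link_texts(page_source):
    """Return the text of every <a> inside div.group > div.groupinfo."""
    # Parse the markup directly; html.fromstring takes HTML, not a path to open().
    tree = html.fromstring(page_source)
    # @class addresses the attribute; text() yields plain strings,
    # so no further call on each result is needed.
    return tree.xpath('//div[@class="group"]/div[@class="groupinfo"]/a/text()')


print(extract_link_texts(SAMPLE_HTML))
```

In the script from the question, passing response.content to html.fromstring in place of open(out).read() would play the same role as SAMPLE_HTML here, and it should also avoid the "File name too long" error, which comes from treating the downloaded HTML as a path.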


 