How to extract the href for a given link text using lxml XPath and requests in Python
First of all, I am relatively new to Python. I need to extract a link from the text in a web page. I am using lxml with Python 3.5, but I can't figure it out. This is what I have so far:
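import requests
from lxml import html

url = someUrl  # placeholder for the page address
page = requests.get(url)
webpage = html.fromstring(page.content)
fulllinks = webpage.xpath('//a/@href')  # every href on the page
fulltext = webpage.xpath('//a/text()')  # every link text, no longer tied to its href
for line in fulltext:
    if line.startswith("SomethingHere"):
        pass  # get the link from SomethingHere and do other stuff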
where "SomethingHere" is the link text, and I want the link that belongs to that text (e.g. www.someweb.com.br/trends).
I'm kind of lost here. Thanks in advance.
Got what I was looking for. The answer is:
webpage.xpath("//a[starts-with(text(),'SomethingHere')]/@href")
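For anyone who wants to try that expression without hitting a real site, here is a minimal sketch. The SAMPLE markup and its URLs are made up for illustration, and requests is left out so it runs offline; with a real page you would pass page.content to html.fromstring as in the question.

```python
from lxml import html

# Hypothetical markup standing in for the downloaded page (an assumption).
SAMPLE = """
<html><body>
  <a href="http://www.someweb.com.br/trends">SomethingHere and more</a>
  <a href="http://www.someweb.com.br/other">Unrelated link</a>
</body></html>
"""

webpage = html.fromstring(SAMPLE)

# The predicate filters the <a> elements by their text, and /@href then
# returns the attribute of each element that passed the filter.
links = webpage.xpath("//a[starts-with(text(), 'SomethingHere')]/@href")

# xpath() always returns a list, so guard against the no-match case.
first_link = links[0] if links else None
print(first_link)
```

Because the filter and the attribute lookup happen in the same expression, the text and its href stay paired, which is what the two separate lists in the question could not do.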
Thanks anyway.
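One caveat with starts-with(text(), ...), as far as I understand XPath 1.0: when text() yields several nodes, only the first one is compared, and leading whitespace counts. So a link whose label sits inside a child tag, or starts on a new line, may be skipped. The sketch below compares it with normalize-space(.), which uses the full, whitespace-trimmed text of the element. The SAMPLE markup is invented for the comparison.

```python
from lxml import html

# Hypothetical markup with three variations of the same link label.
SAMPLE = """
<div>
  <a href="/plain">SomethingHere plain</a>
  <a href="/nested"><b>SomethingHere</b> in bold</a>
  <a href="/padded">
     SomethingHere after whitespace</a>
</div>
"""

doc = html.fromstring(SAMPLE)

# Compares only the first direct text node of each <a>, whitespace included.
strict = doc.xpath("//a[starts-with(text(), 'SomethingHere')]/@href")

# Compares the whole text of each <a>, descendants included, after trimming
# and collapsing whitespace.
tolerant = doc.xpath(
    "//a[starts-with(normalize-space(.), 'SomethingHere')]/@href"
)

print(strict)
print(tolerant)
```

If the pages you scrape always have plain, unpadded link text, the original expression is enough; the normalize-space(.) form is just a more forgiving variant.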