How to extract an href by its link text using lxml XPath and requests in Python

Question

First of all, I am relatively new to Python. I need to extract a link based on its text in a web page. I am using lxml with Python 3.5, but I can't figure it out. This is what I have so far:

import requests
from lxml import html

url = someUrl  # placeholder for the page URL
page = requests.get(url)
webpage = html.fromstring(page.content)
fulllinks = webpage.xpath('//a/@href')   # every href on the page
fulltext = webpage.xpath('//a/text()')   # every link text on the page


for line in fulltext:
    if line.startswith("SomethingHere"):
        pass  # get the link from SomethingHere and do other stuff

where "SomethingHere" is the link text, and I want the link that belongs to that text (e.g. www.someweb.com.br/trends).

I'm kind of lost here. Thanks in advance.

Answer 1

Got what I was looking for. The answer is:

webpage.xpath("//a[starts-with(text(),'SomethingHere')]/@href")
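For anyone who wants to try this without hitting a real site, here is a minimal sketch that runs the same expression against a made-up HTML string. The markup and the someweb.com.br URLs are only illustrative assumptions standing in for page.content. It also shows a variant using normalize-space(.), which may help when the link text is wrapped in a child tag, because starts-with(text(), ...) only looks at the first direct text node of the element.

```python
from lxml import html

# Hypothetical sample markup standing in for page.content from requests.get(url).
SAMPLE_HTML = """
<html><body>
  <a href="http://www.someweb.com.br/trends">SomethingHere and more</a>
  <a href="http://www.someweb.com.br/other">Unrelated link</a>
  <a href="http://www.someweb.com.br/nested"><b>SomethingHere</b> in bold</a>
</body></html>
"""

webpage = html.fromstring(SAMPLE_HTML)

# The XPath from the answer: matches <a> elements whose first direct
# text node starts with the given prefix.
direct = webpage.xpath("//a[starts-with(text(), 'SomethingHere')]/@href")

# Variant: normalize-space(.) takes the whole string value of the element
# (including text inside child tags) and trims surrounding whitespace.
tolerant = webpage.xpath(
    "//a[starts-with(normalize-space(.), 'SomethingHere')]/@href"
)

print(direct)
print(tolerant)
```

With this sample, the first expression should pick up only the plain-text link, while the second should also pick up the link whose text sits inside the b tag.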

Thanks anyway.

Answer 1, 2017-03-16 14:07:18
