How do I use lxml to return the href attribute in a path as a string?
I have working code that prints the text of the elements matched by
'//*[@id="all_TorontoBlueJayspitching"]/div/table/tbody/tr/th/a/text()'
from the page https://www.baseball-reference.com/boxes/CHA/CHA202206200.shtml
using this script:
import requests
from lxml import html

boxScore = "CHA/CHA202206200"
url = "https://www.baseball-reference.com/boxes/" + boxScore + ".shtml"
page = requests.get(url)

# Drop every line containing an HTML comment marker so that the tables the site
# wraps in <!-- ... --> become part of the parsed tree.
tree = html.fromstring(b''.join(line for line in page.content.splitlines() if b'<!--' not in line and b'-->' not in line))

# Team names as shown in the scorebox, e.g. "Toronto Blue Jays".
getTeams = tree.xpath('//*[@class="scorebox"]/div/div/strong/a/text()')

for team in getTeams:
    # The element id has no spaces, e.g. "all_TorontoBlueJayspitching".
    team = team.replace(" ", "")
    stringy = '"all_' + team + 'pitching"'
    stringx = '//*[@id=' + stringy + ']/div/table/tbody/tr/th/a/text()'
    tambellini = tree.xpath(stringx)
    print(tambellini)
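The problem is that I don't want to print this text; I want to print an attribute of the matched element. In other words, I am more or less trying to reach
'//*[@id="all_TorontoBlueJayspitching"]/div/table/tbody/tr/th/a'
and then read the value of href on that a element (in this case href="/players/b/berrijo01.shtml").
Any guidance here would be helpful. I know how to print the element text successfully, but I don't know how to get the href value itself as a string in a variable. Thank you.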
Change stringx so that the XPath ends in /@href instead of /text():
stringx = '//*[@id=' + stringy + ']/div/table/tbody/tr/th/a/@href'
This should print a list of href strings such as:
[
'/players/l/lynnla01.shtml',
'/players/l/lopezre01.shtml',
'/players/g/graveke01.shtml',
'/players/k/kellyjo05.shtml'
]
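To see the difference without hitting the site, here is a minimal sketch that runs against an inline HTML snippet. The SAMPLE markup, the second player entry, and the BASE_URL constant are assumptions made up for illustration; only the XPath shape and the berrijo01 href come from the question. It shows three things: selecting the attribute directly with /@href, selecting the a elements and calling .get('href') on each, and turning the relative paths into full URLs with urljoin.

```python
from urllib.parse import urljoin

from lxml import html

# Hypothetical markup that only mimics the shape of the pitching table;
# it is not copied from the real page.
SAMPLE = """
<div id="all_TorontoBlueJayspitching">
  <div>
    <table>
      <tbody>
        <tr><th><a href="/players/b/berrijo01.shtml">Jose Berrios</a></th></tr>
        <tr><th><a href="/players/x/example01.shtml">Example Pitcher</a></th></tr>
      </tbody>
    </table>
  </div>
</div>
"""

# Assumed site root, used only to build absolute URLs.
BASE_URL = "https://www.baseball-reference.com"

tree = html.fromstring(SAMPLE)
base = '//*[@id="all_TorontoBlueJayspitching"]/div/table/tbody/tr/th/a'

# Option 1: let XPath return the attribute values directly as strings.
hrefs = tree.xpath(base + '/@href')

# Option 2: select the <a> elements and read the attribute in Python,
# which keeps the link text and the href together.
links = [(a.text_content(), a.get('href')) for a in tree.xpath(base)]

# Relative paths can be resolved against the site root if full URLs are needed.
absolute = [urljoin(BASE_URL, h) for h in hrefs]

print(hrefs)
print(links)
print(absolute)
```

Option 1 is the smallest change to the original script. Option 2 is handy when both the player name and the link are needed for the same row, since .get('href') returns None instead of silently skipping an a element that has no href.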