Here's the Python code I've tried:
from lxml import html
import requests
page = requests.get('http://www.rsssf.com/tablese/eng2017det.html')
tree = html.fromstring(page.content)
print(tree.xpath('/html/body/table/tbody/tr[2]//text()'))
I always get [] as my output. I have also checked the HTML page, and the URL isn't broken.
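Do not use the tbody tag in your XPath. Page authors often omit this tag from the source, and the browser adds it automatically while rendering the page, so it can show up in the browser's inspector even though it is absent from the HTML that requests downloads.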
Simply try
print(tree.xpath('/html/body/table//tr[2]//text()'))
or
print([i for i in tree.xpath('/html/body/table//tr[2]//text()') if i.strip()])
to avoid printing whitespace-only strings such as newline characters.
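To see the difference without hitting the network, here is a minimal sketch that runs both XPath expressions against a small inline table. The SAMPLE markup and its team names are made up for illustration and are not taken from the rsssf.com page; the only assumption is that the real page, like this sample, has no tbody in its source.

```python
from lxml import html

# Hypothetical stand-in for the downloaded page: a table written
# without <tbody>, as many hand-written pages are.
SAMPLE = """<html><body>
<table>
  <tr><td>Team</td><td>Pts</td></tr>
  <tr>
    <td>Alpha</td>
    <td>10</td>
  </tr>
</table>
</body></html>"""

tree = html.fromstring(SAMPLE)

# lxml does not insert <tbody> the way a browser does, so this matches nothing.
with_tbody = tree.xpath('/html/body/table/tbody/tr[2]//text()')

# '//' matches tr at any depth, so it works with or without <tbody>.
without_tbody = tree.xpath('/html/body/table//tr[2]//text()')

# Drop whitespace-only text nodes and trim the rest.
cleaned = [t.strip() for t in without_tbody if t.strip()]

print(with_tbody)  # []
print(cleaned)     # ['Alpha', '10']
```

Because the second expression does not name tbody at all, it keeps working if the tag is present in the source after all.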