
Getting blank data while scraping a website using lxml

I am trying to scrape a simple website and have written some code, but I get a blank result (no data). Please advise me on what I am doing wrong. I first tried scraping it with lxml and then with Selenium, but neither worked.

from selenium import webdriver
import lxml.html as lh
import time

browser = webdriver.Firefox()

browser.get('http://usa.kyoceradocumentsolutions.com/americas/jsp/Kyocera/wheretobuy_result.jsp?cat=2&zipcode=98413&city=&state=NJ')   

time.sleep(5.0)

content = browser.page_source

tree = lh.fromstring(content)

for d in (tree.xpath('//table/tr/td[@class="bodytxt"]/b/text()')):
    print(d)

Frames are not handled by default, so the frame's content is missing from the page source you parse. I agree with Key that it is better to use the direct URL of that particular frame. Here are XPaths for the name and the address. For the name:

    //td[@class="bodytxt" and @align="left" and (b)]//b//text()

For the address:

    //td[@class="bodytxt" and @align="left" and (b)]/text()

For both:

    //td[@class="bodytxt" and @align="left" and (b)]//text()
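As a rough sketch of how these pieces could fit together, the snippet below first collects frame URLs from an outer page and then applies the XPaths above to the frame's own HTML. The markup in `OUTER_HTML` and `FRAME_HTML`, the dealer names, the `example.com` base URL, and the helper names `frame_urls` and `extract_dealers` are all made up for illustration; the real page's structure may differ, so treat this as a starting point rather than a tested solution for that site.

```python
from urllib.parse import urljoin

import lxml.html as lh

# Hypothetical outer page: the results live in a frame, so the outer
# document itself contains no matching <td> cells.
OUTER_HTML = """
<html><body>
  <iframe src="dealers_frame.jsp?zipcode=98413"></iframe>
</body></html>
"""

# Hypothetical frame document that mimics the structure the XPaths expect.
FRAME_HTML = """
<html><body><table>
  <tr><td class="bodytxt" align="left"><b>Acme Copiers</b><br>1 Main St<br>Newark, NJ 07102</td></tr>
  <tr><td class="bodytxt" align="left"><b>Best Office Supply</b><br>22 Oak Ave<br>Trenton, NJ 08608</td></tr>
  <tr><td class="bodytxt" align="right">no bold child, so it is skipped</td></tr>
</table></body></html>
"""

# Cells that carry class="bodytxt", align="left" and a <b> child.
CELL_XPATH = '//td[@class="bodytxt" and @align="left" and (b)]'


def frame_urls(outer_html, base_url):
    """Return absolute URLs of every <frame>/<iframe> in the outer page."""
    tree = lh.fromstring(outer_html)
    sources = tree.xpath('//frame/@src | //iframe/@src')
    return [urljoin(base_url, src) for src in sources]


def extract_dealers(frame_html):
    """Return (name, address_lines) pairs from the frame's HTML."""
    tree = lh.fromstring(frame_html)
    dealers = []
    for cell in tree.xpath(CELL_XPATH):
        # The name is the text inside <b>; the address is the cell's own text.
        name = " ".join(t.strip() for t in cell.xpath('./b//text()') if t.strip())
        address = [t.strip() for t in cell.xpath('./text()') if t.strip()]
        dealers.append((name, address))
    return dealers


if __name__ == "__main__":
    print(frame_urls(OUTER_HTML, "http://example.com/jsp/page.jsp"))
    for name, address in extract_dealers(FRAME_HTML):
        print(name, "|", ", ".join(address))
```

With a real browser session you would fetch the frame URL returned by `frame_urls` (for example with `browser.get(...)`) and pass that page's source to `extract_dealers`. Selenium's `browser.switch_to.frame(...)` is another way to reach the frame's content before reading `page_source`.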
