
I'm not able to scrape table data using Selenium and Beautiful Soup

I've gone as far as I can, but I can't seem to scrape data from a table. I've searched Stack Overflow for answers, but nothing seems to work. Essentially, the table comes up empty or I simply can't find elements within the table. I'm working with a table from Yahoo's daily fantasy webpage.

NOTE: the web address used here will likely change from week to week, so it may not be a valid address in the future.

Current code:

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC 
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait as wait

driver = webdriver.Chrome()
driver.get("https://sports.yahoo.com/dailyfantasy/contest/5416455/setlineup")

response = wait(driver, 10).until(EC.presence_of_element_located((By.TAG_NAME,"data-tst-player-id")))
driver.quit

soup = BeautifulSoup(response, 'lxml')
with open('test.txt','w', encoding='utf-8') as f_out:
    f_out.write(soup.prettify())

No element matches the locator you are providing in this line. By.TAG_NAME looks for an element whose tag is named data-tst-player-id, and that name looks like an attribute rather than a tag:

response = wait(driver, 10).until(EC.presence_of_element_located((By.TAG_NAME,"data-tst-player-id")))

There are, however, some tags with the attribute 'data-tst', so you can wait for one of those to make sure your page has loaded. Also, on this line

driver.quit

you are doing nothing: without parentheses the method is only referenced, not called, so you have to write driver.quit(). Working code:

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC 
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait as wait

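driver = webdriver.Chrome()
driver.get("https://sports.yahoo.com/dailyfantasy/contest/5416455/setlineup")
# Wait until at least one element carrying a data-tst attribute is present
wait(driver, 1).until(EC.presence_of_element_located((By.CSS_SELECTOR, "[data-tst]")))
# Grab the rendered HTML before closing the browser
response = driver.page_source
driver.quit()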

soup = BeautifulSoup(response, 'lxml')
with open('test.txt','w', encoding='utf-8') as f_out:
    f_out.write(soup.prettify())
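
Once the page source is saved, the table rows can be pulled out with the same kind of attribute selector. The following is only a sketch under assumptions: SAMPLE_HTML is a made-up stand-in for driver.page_source, the data-tst-player-id attribute on table rows is taken from the question and may not match the real Yahoo markup, and extract_player_rows and count_by_tag_name are hypothetical helper names. It also shows why searching by tag name finds nothing while an attribute selector does.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for driver.page_source; the real page markup may differ.
SAMPLE_HTML = """
<table>
  <tr data-tst-player-id="p1"><td>Player One</td><td>QB</td></tr>
  <tr data-tst-player-id="p2"><td>Player Two</td><td>RB</td></tr>
  <tr><td>Row without the attribute</td></tr>
</table>
"""


def extract_player_rows(html):
    """Return (player_id, cell_texts) for each row that has data-tst-player-id."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    # "tr[data-tst-player-id]" matches <tr> elements that carry the attribute,
    # whatever its value is.
    for row in soup.select("tr[data-tst-player-id]"):
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        rows.append((row["data-tst-player-id"], cells))
    return rows


def count_by_tag_name(html, name):
    """Count elements whose tag name equals `name` (a tag-name search)."""
    soup = BeautifulSoup(html, "html.parser")
    return len(soup.find_all(name))


if __name__ == "__main__":
    # Searching for a tag called "data-tst-player-id" finds no elements,
    # because that name is an attribute here, not a tag.
    print(count_by_tag_name(SAMPLE_HTML, "data-tst-player-id"))
    for player_id, cells in extract_player_rows(SAMPLE_HTML):
        print(player_id, cells)
```

In the working code above, the string passed to extract_player_rows would be response (the value of driver.page_source) instead of SAMPLE_HTML, and the selector would need adjusting to whatever attributes the live page actually uses.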


 