
Web scraping returns 'None'

I'm new to Python and I'm trying to build a web scraping algorithm.

I'm trying to scrape the 'href' URLs from the website's HTML.

My code:

URL = 'https://www.rotowire.com/basketball/team.php?team=UTA'
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
service = Service(ChromeDriverManager().install())
for link in soup.find_all({"aria-colindex" : "3"}):
    print(link.get('href'))
driver = webdriver.Chrome(service = service)

But this returns nothing. I also tried {'style' : "width: 96px; left: 190px; top: 0px;"} instead of {"aria-colindex" : "3"}, but that also returns 'None'. I'm not sure what I'm doing wrong, so any help would be much appreciated :)

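The data is loaded dynamically from an API, so it does not appear in the HTML that requests.get returns. It's easier to retrieve the links directly from the API. Here is a pandas implementation: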

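import pandas as pd
from bs4 import BeautifulSoup

# The schedule table is served as JSON by this endpoint
df = pd.read_json('https://www.rotowire.com/basketball/tables/team-schedule.php?team=UTA')
# Each 'score' cell is an HTML fragment; pull the href out of its <a> tag
df['url'] = df['score'].apply(lambda x: BeautifulSoup(x, 'html.parser').find('a')['href'])
df.to_csv('output.csv')  # export to CSV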
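The lambda above assumes every 'score' cell contains a link; if a cell has no <a> tag, find('a') returns None and the subscript raises a TypeError. Below is a minimal sketch of a more defensive helper. The name extract_href and the two sample cell values are made up for illustration and are not taken from the API's actual response.

```python
from typing import Optional

from bs4 import BeautifulSoup


def extract_href(cell_html: str) -> Optional[str]:
    """Return the href of the first <a> in an HTML fragment, or None."""
    anchor = BeautifulSoup(cell_html, "html.parser").find("a")
    if anchor is None:
        # No link in this cell (for example, a game without a box score)
        return None
    return anchor.get("href")


# Hypothetical cell contents, invented for this example
with_link = '<a href="/basketball/box-score.php?gid=2347768">W 120-110</a>'
without_link = "Postponed"

print(extract_href(with_link))
print(extract_href(without_link))
```

With this helper, the column could be built with df['score'].apply(extract_href), leaving None in rows that have no link.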

Based on your question, here is a working solution.

Code:

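import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service('chromedriver.exe'))
url = "https://www.rotowire.com/basketball/team.php?team=UTA"
driver.get(url)
time.sleep(8)  # crude wait for the JavaScript-rendered table to load

soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()  # the browser is no longer needed once the HTML is captured

# Select the links inside the centre-aligned grid columns
links = soup.select('div.webix_column.align-c div a')
for link in links:
    print('href_url:' + link['href'])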

Output:

href_url:/basketball/box-score.php?gid=2347768
href_url:/basketball/box-score.php?gid=2347767
href_url:/basketball/box-score.php?gid=2347765
href_url:/basketball/box-score.php?gid=2347764
href_url:/basketball/box-score.php?gid=2347762
href_url:/basketball/box-score.php?gid=2347760
href_url:/basketball/box-score.php?gid=2346563
href_url:/basketball/box-score.php?gid=2346562
href_url:/basketball/box-score.php?gid=2346561
href_url:/basketball/box-score.php?gid=2346420
href_url:/basketball/box-score.php?gid=2346295
href_url:/basketball/box-score.php?gid=2314246
href_url:/basketball/box-score.php?gid=2314315
href_url:/basketball/box-score.php?gid=2314159
href_url:/basketball/box-score.php?gid=2314155
href_url:/basketball/box-score.php?gid=2314153
href_url:/basketball/box-score.php?gid=2314144
href_url:/basketball/box-score.php?gid=2314220
href_url:/basketball/box-score.php?gid=2314333
href_url:/basketball/box-score.php?gid=2314142
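
The hrefs in the output are relative to the site root. If full URLs are needed, for example to fetch each box score afterwards, they can be resolved against the page URL with the standard library. This is only a sketch; the helper name to_absolute is hypothetical.

```python
from urllib.parse import urljoin

# Page the links were scraped from (taken from the question)
BASE_URL = "https://www.rotowire.com/basketball/team.php?team=UTA"


def to_absolute(href: str, base: str = BASE_URL) -> str:
    """Resolve a relative href against the URL of the page it came from."""
    return urljoin(base, href)


print(to_absolute("/basketball/box-score.php?gid=2347768"))
```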

