Web scraping using Beautiful Soup (sports data)
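When I try to run this code, I get two errors:
1: I can't scrape the data for name_text correctly.
2: I get an indentation error on team = name_text.div.text. I know this is probably easy to fix, but I've tried different indentations and nothing seems to work.
On the website, I want to scrape the team names and odds: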
<div class="size14_f7opyze Endeavour_fhudrb0 medium_f1wf24vo participantText_fivg86r" data-automation-id="participant-one">Orlando Magic</div>
<div class="priceText_f71sibe"><span class="size14_f7opyze medium_f1wf24vo priceTextSize_frw9zm9" data-automation-id="price-text">5.85</span></div>
The HTML above was copied from the website.
from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq
my_url = 'https://www.sportsbet.com.au/betting/basketball-us'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
soup = BeautifulSoup(page_html, "html.parser")
price_text = soup.findAll("div",{"class":"priceText_f71sibe"})
name_text = soup.findAll("div",{"class":"size14_f7opyze Endeavour_fhudrb0 medium_f1wf24vo participantText_fivg86r"})
filename = "odds.csv"
f = open(filename,"w")
headers = "Team, odds_team\n"
print(name_text)
f.write(headers)
for price_text in price_texts:
team = name_text.div.text
odds = price_text.span.text
print(odds)
print(team + odds)
f.write(team + "," + odds + "\n")
f.close()
Any help would be great. Cheers.
Your for loop is indented incorrectly. The correct indentation would be:
for price_text in price_texts:
    team = name_text.div.text
    odds = price_text.span.text
    print(odds)
    print(team + odds)
    f.write(team + "," + odds + "\n")
f.close()
There are 4 spaces before team and odds. Please read the Python ForLoop documentation.
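To see why the interpreter rejects the unindented version, here is a small sketch that compiles two loops from strings using the built-in compile(). The snippets and the names bad_source, good_source and compiles are made up for illustration and are not part of the original code.

```python
# Illustrative only: both snippets are hypothetical stand-ins for the loop in the question.
bad_source = (
    "for price in ['5.85', '1.12']:\n"
    "odds = price\n"        # loop body is not indented
)
good_source = (
    "for price in ['5.85', '1.12']:\n"
    "    odds = price\n"    # loop body is indented by 4 spaces
)


def compiles(source):
    """Return True if the source compiles, False if it has an indentation error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except IndentationError:
        return False


bad_ok = compiles(bad_source)
good_ok = compiles(good_source)
print(bad_ok, good_ok)
```

Every statement that belongs to the loop body has to share the same indentation, which is why a single misaligned line is enough to trigger the error.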
Also, there is no price_texts variable. When you call findAll you need to assign the result to it; you forgot the 's':
price_texts = soup.findAll("div",{"class":"priceText_f71sibe"})
One last thing: consider using with instead of open() and .close() to write to the file.
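A minimal sketch of that suggestion, assuming the scraped values are already available as (team, odds) pairs. The rows list, the second team name and the temporary output path are placeholders, not data from the site.

```python
import csv
import os
import tempfile

# Hypothetical sample rows standing in for scraped (team, odds) pairs.
rows = [("Orlando Magic", "5.85"), ("Example Team", "1.12")]

# Write into a temporary directory so the sketch does not touch real files.
out_path = os.path.join(tempfile.mkdtemp(), "odds.csv")

# The with block closes the file automatically, even if an exception is raised.
with open(out_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["Team", "odds_team"])
    writer.writerows(rows)

file_closed = f.closed  # True once the with block has exited

# Read the file back to check what was written.
with open(out_path, newline="", encoding="utf-8") as f:
    written = list(csv.reader(f))
print(written)
```

Using csv.writer rather than joining strings by hand also takes care of quoting if a team name ever contains a comma.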
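I was thinking that you could simply iterate, store the values in lists, and then write them to the file. Unfortunately I can't access the site from work, so I couldn't test the code, but I believe this should give you the output you need: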
from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq
import csv
from itertools import zip_longest

my_url = 'https://www.sportsbet.com.au/betting/basketball-us'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

soup = BeautifulSoup(page_html, "html.parser")

# Select by the data-automation-id attribute instead of the generated class names
price_text = soup.findAll("span", {"data-automation-id": "price-text"})
name_text = soup.findAll("div", {"data-automation-id": "participant-one"})

team_list = [name.text.strip() for name in name_text]
odds_list = [price.text.strip() for price in price_text]

# Pair teams with odds by position; pad with '' if the lists differ in length
d = [team_list, odds_list]
export_data = zip_longest(*d, fillvalue='')

# The with block closes the file on exit, so no explicit close() is needed
with open('odds.csv', 'w', encoding="ISO-8859-1", newline='') as myfile:
    wr = csv.writer(myfile)
    wr.writerow(("Team", "odds_team"))
    wr.writerows(export_data)
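As a quick illustration of the list-pairing step, the sketch below shows how zip_longest behaves when one list is shorter than the other. The team names other than Orlando Magic and all of the list contents are invented for the example; the real lists would come from the scraped page.

```python
from itertools import zip_longest

# Hypothetical values; the real lists would come from the scraped page.
team_list = ["Orlando Magic", "Example Team", "Another Team"]
odds_list = ["5.85", "1.12"]  # one price is missing on purpose

# zip() stops at the shorter list; zip_longest() pads the gap with fillvalue.
rows = list(zip_longest(team_list, odds_list, fillvalue=""))
truncated = list(zip(team_list, odds_list))

print(rows)
print(len(truncated))
```

Padding keeps every team in the output, but it also hides the fact that names and prices no longer line up, so it may be worth checking that the two lists have the same length before writing.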
Can you try this?
from bs4 import BeautifulSoup
from urllib.request import urlopen as uReq

my_url = 'https://www.sportsbet.com.au/betting/basketball-us'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

soup = BeautifulSoup(page_html, "html.parser")
price_texts = soup.findAll("div",{"class":"priceText_f71sibe"})
name_texts = soup.findAll("div",{"class":"size14_f7opyze Endeavour_fhudrb0 medium_f1wf24vo participantText_fivg86r"})

filename = "odds.csv"
f = open(filename,"w")
headers = "Team, odds_team\n"
print(name_texts)
f.write(headers)

# Assumes the n-th name on the page belongs with the n-th price
for name_text, price_text in zip(name_texts, price_texts):
    team = name_text.text
    odds = price_text.text
    print(odds)
    print(team + odds)
    f.write(team + "," + odds + "\n")
f.close()