Using BeautifulSoup to parse some HTML
I have the following piece of HTML, which is part of a page that shows football match results.
<div class = "schedules-list-matchup"></div>
<!-- <un inportant stuff -->
<div class=list-matchup-row-team>
<span class="team-name away lost">team1</span>
<span class="team-logo away team-name">...</span>
<span class="team-score away lost">2</span>
<span class="team-score home">3</span>
<span class="team-logo home team-name">...</span>
<span class="team-name home">team2</span>
</div>
<div class=list-matchup-row-team>
<span class="team-name away lost">team3</span>
<span class="team-logo away team-name">...</span>
<span class="team-score away lost">2</span>
<span class="team-score home">3</span>
<span class="team-logo home team-name">...</span>
<span class="team-name home">team4</span>
</div>
<!-- <ramainder of code> -->
I am trying to read it and create objects of this class:
class Game:
    def __init__(self, homeTeam, homeTeamScore, awayTeam, awayTeamScore):
        self.homeTeam = homeTeam
        self.homeTeamScore = homeTeamScore
        self.awayTeam = awayTeam
        self.awayTeamScore = awayTeamScore
What I thought I was doing was iterating over each <div class="list-matchup-row-team">.
My code:
from urllib.request import urlopen
from bs4 import BeautifulSoup

# baseUrl is defined elsewhere in my script
html = urlopen(baseUrl + '1')
bsObj = BeautifulSoup(html, 'lxml')
table = bsObj.find("ul", {"class": "schedules-table"})
for game in table.findAll("li", {"class": "schedules-list-matchup"}):
    for g in game.findAll("div", {"class": "list-matchup-row-team"}):
        for teams in g.findAll("span", {"class": "home"}):
            print(teams.find("span", {"class": "team-name"}))
            print(teams.find("span", {"class": "team-score"}))
        print('==========================')
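This prints a bunch of None objects. How can I iterate over every span element inside the <div class="list-matchup-row-team"> tag and check whether its class contains both "team-name" and "team-score"?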
I think you can go straight to the spans with the team-name class.
Try this:
table.findAll("span", {"class" : "team-name"})
Then pick out the home ones from the result.
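To expand on that a little: the loop in the question prints None because each span matched by class "home" has no child spans, and find() only searches descendants of the tag it is called on. Also note that in the posted HTML the team-logo spans carry the team-name class too, so a plain search on team-name returns the logos as well. Below is a minimal sketch of one way to check several classes at once and build the Game objects. It assumes the markup looks like the snippet in the question; SAMPLE_HTML, span_text and parse_games are names made up for this example, and it uses the built-in html.parser instead of lxml so it runs without extra dependencies.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (assumed to be representative).
SAMPLE_HTML = """
<div class="list-matchup-row-team">
  <span class="team-name away lost">team1</span>
  <span class="team-logo away team-name">...</span>
  <span class="team-score away lost">2</span>
  <span class="team-score home">3</span>
  <span class="team-logo home team-name">...</span>
  <span class="team-name home">team2</span>
</div>
<div class="list-matchup-row-team">
  <span class="team-name away lost">team3</span>
  <span class="team-logo away team-name">...</span>
  <span class="team-score away lost">2</span>
  <span class="team-score home">3</span>
  <span class="team-logo home team-name">...</span>
  <span class="team-name home">team4</span>
</div>
"""


class Game:
    def __init__(self, homeTeam, homeTeamScore, awayTeam, awayTeamScore):
        self.homeTeam = homeTeam
        self.homeTeamScore = homeTeamScore
        self.awayTeam = awayTeam
        self.awayTeamScore = awayTeamScore


def span_text(row, *wanted):
    """Text of the first non-logo span in `row` that has every class in `wanted`."""
    def matches(tag):
        # "class" is a multi-valued attribute, so bs4 returns it as a list.
        classes = tag.get("class") or []
        return (
            tag.name == "span"
            and set(wanted) <= set(classes)
            and "team-logo" not in classes  # logo spans also carry team-name
        )

    span = row.find(matches)
    return span.get_text(strip=True) if span is not None else None


def parse_games(html):
    soup = BeautifulSoup(html, "html.parser")
    games = []
    # One Game per <div class="list-matchup-row-team">.
    for row in soup.find_all("div", class_="list-matchup-row-team"):
        games.append(Game(
            span_text(row, "team-name", "home"),
            span_text(row, "team-score", "home"),
            span_text(row, "team-name", "away"),
            span_text(row, "team-score", "away"),
        ))
    return games


games = parse_games(SAMPLE_HTML)
for g in games:
    print(g.homeTeam, g.homeTeamScore, g.awayTeam, g.awayTeamScore)
```

The scores come back as strings here; convert them with int() if you need numbers. On the real page you would pass the downloaded HTML (or the table element's markup) in place of SAMPLE_HTML.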