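How to get text from next pages using BeautifulSoup in Python 3?
I am trying to get all the game results from every page for a team. So far I can get every opponent 1 vs. opponent 2 pairing along with both scores, but I don't know how to move to the next page to fetch the rest of the data. Would I find the next page and put that in a while loop? Here is the link to the team I want: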
http://www.gosugamers.net/counterstrike/teams/7397-natus-vincere/matches
This is what I have so far. It gets all of the team's matches and scores, but only for the first page.
def all_match_outcomes():
    for match_outcomes in match_history_url():
        rest_server(True)
        page = requests.get(match_outcomes).content
        soup = BeautifulSoup(page, 'html.parser')
        team_name_element = soup.select_one('div.teamNameHolder')
        team_name = team_name_element.find('h1').text.replace('- Team Overview', '')
        for match_outcome in soup.select('table.simple.gamelist.profilelist tr'):
            opp1 = match_outcome.find('span', {'class': 'opp1'}).text
            opp2 = match_outcome.find('span', {'class': 'opp2'}).text
            opp1_score = match_outcome.find('span', {'class': 'hscore'}).text
            opp2_score = match_outcome.find('span', {'class': 'ascore'}).text
            if match_outcome(True):  # If teams have past matches
                print(team_name, '%s %s:%s %s' % (opp1, opp1_score, opp2_score, opp2))
Get the last page number, then iterate page by page until you reach the last page.
Complete working code:
import re

import requests
from bs4 import BeautifulSoup

url = "http://www.gosugamers.net/counterstrike/teams/7397-natus-vincere/matches"

with requests.Session() as session:
    response = session.get(url)
    soup = BeautifulSoup(response.content, "html.parser")

    # locate the last page link
    last_page_link = soup.find("span", text="Last").parent["href"]
    # extract the last page number
    last_page_number = int(re.search(r"page=(\d+)$", last_page_link).group(1))

    print("Processing page number 1")
    # TODO: extract data

    # iterate over all pages starting from page 2 (since we are already on the page 1)
    for page_number in range(2, last_page_number+1):
        print("Processing page number %d" % page_number)
        link = "http://www.gosugamers.net/counterstrike/teams/7397-natus-vincere/matches?page=%d" % page_number
        response = session.get(link)
        soup = BeautifulSoup(response.content, "html.parser")
        # TODO: extract data
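One way to fill in the two "TODO: extract data" steps is to move the row parsing from the question into a helper and call it once per page. The following is only a sketch under stated assumptions: the names last_page_number, parse_matches, scrape_all_matches and MATCH_ROWS are hypothetical, the selectors are copied from the question and may no longer match the live site, and the code assumes that passing params={"page": n} produces the same "?page=n" URL used above. Rows that lack any of the four spans (for example a header row) are skipped rather than raising AttributeError.

```python
import re

from bs4 import BeautifulSoup

# Row selector copied from the question; an assumption about the page markup.
MATCH_ROWS = "table.simple.gamelist.profilelist tr"


def last_page_number(soup):
    """Return the page number in the 'Last' link, or 1 if there is none."""
    last_span = soup.find("span", string="Last")
    if last_span is None:
        return 1
    match = re.search(r"page=(\d+)$", last_span.parent.get("href", ""))
    return int(match.group(1)) if match else 1


def parse_matches(soup):
    """Return (opp1, opp1_score, opp2_score, opp2) tuples for one page."""
    results = []
    for row in soup.select(MATCH_ROWS):
        spans = [row.find("span", class_=name)
                 for name in ("opp1", "hscore", "ascore", "opp2")]
        if any(span is None for span in spans):
            continue  # header or incomplete row: skip it
        results.append(tuple(span.get_text(strip=True) for span in spans))
    return results


def scrape_all_matches(url, session):
    """Fetch every page of `url` and return all parsed match tuples.

    `session` is any object with a requests-style get(url, params=...) method,
    such as requests.Session().
    """
    soup = BeautifulSoup(session.get(url).content, "html.parser")
    matches = parse_matches(soup)  # page 1 is already loaded
    for page_number in range(2, last_page_number(soup) + 1):
        response = session.get(url, params={"page": page_number})
        soup = BeautifulSoup(response.content, "html.parser")
        matches.extend(parse_matches(soup))
    return matches
```

To use it against the real site, create a session with requests.Session() and call scrape_all_matches(url, session) with the url defined above. Taking the session as a parameter also lets the pagination logic be exercised offline with a stand-in object.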