BeautifulSoup: activate web button and continue scraping on new page
I have a university project and need to get data online. I would like to get some data from this website: https://www.footballdatabase.eu/en/transfers/-/2020-10-03
For the 3rd of October I managed to get the first 19 rows, but there are 6 pages and I'm struggling to activate the button that loads the next page.
This is the HTML code for the button:
<a href="javascript:;" class="inactive" onclick="showtransfers('1','2020-10-03','2','full');">2</a>
My code so far:
import requests
from bs4 import BeautifulSoup
import pandas as pd
headers = {'User-Agent':
           'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'}
page = "https://www.footballdatabase.eu/en/transfers/-/2020-10-03"
pageTree = requests.get(page, headers=headers)
pageSoup = BeautifulSoup(pageTree.content, 'html.parser')
Players = pageSoup.find_all("span", {"class": "name"})
Team = pageSoup.find_all("span", {"class": "firstteam"})
Values = pageSoup.find_all("span", {"class": "transferamount"})
Values[0].text
PlayersList = []
TeamList = []
ValuesList = []
# Collect the text of the first 20 rows on the page
for i in range(0, 20):
    PlayersList.append(Players[i].text)
    TeamList.append(Team[i].text)
    ValuesList.append(Values[i].text)
df = pd.DataFrame({"Players":PlayersList,"Team":TeamList,"Values":ValuesList})
Thank you very much!
You can use the requests module to simulate the Ajax call that the button triggers. For example:
import requests
from bs4 import BeautifulSoup
data = {
    'date': '2020-10-03',
    'pid': 1,
    'page': 1,
    'filter': 'full',
}
url = 'https://www.footballdatabase.eu/ajax_transfers_show.php'
for data['page'] in range(1, 7):  # <--- adjust number of pages here.
    soup = BeautifulSoup(requests.post(url, data=data).content, 'html.parser')
    for line in soup.select('.line'):
        name = line.a.text
        # Rows without a club link are reported as 'Free'
        first_team = line.select_one('.firstteam').a.text if line.select_one('.firstteam').a else 'Free'
        second_team = line.select_one('.secondteam').a.text if line.select_one('.secondteam').a else 'Free'
        amount = line.select_one('.transferamount').text
        print('{:<30} {:<20} {:<20} {}'.format(name, first_team, second_team, amount))
Prints:
Bruno Amione                   Belgrano             Hellas Vérone        1.7 M€
Ismael Gutierrez               Betis Deportivo      Atlético B           1 M€
Vitaly Janelt                  Bochum               Brentford            500 k€
Sven Ulreich                   Bayern Munich        Hambourg SV          500 k€
Salim Ali Al Hammadi           Baniyas              Khor Fakkan          Prêt
Giovanni Alessandretti         Ascoli U-20          Recanatese           Prêt
Gabriele Bellodi               AC Milan U-20        Alessandria          Prêt
Louis Britton                  Bristol City B       Torquay United       Prêt
Juan Brunetta                  Godoy Cruz           Parme                Prêt
Bobby Burns                    Barrow               Glentoran            Prêt
Bohdan Butko                   Shakhtar Donetsk     Lech Poznan          Prêt
Nicolò Casale                  Hellas Vérone        Empoli               Prêt
Alessio Da Cruz                Parme                FC Groningue         Prêt
Dalbert Henrique               Inter Milan          Rennes               Prêt
...and so on.
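The payload fields appear to mirror the arguments of the showtransfers(...) call in the button's onclick attribute. The sketch below pulls those arguments out of the button markup and turns them into a payload dict. The argument order (pid, date, page, filter) is an assumption, inferred only by comparing the button in the question with the payload above, and the helper name payload_from_onclick is made up for illustration.

```python
import re

# Button markup copied from the question.
BUTTON_HTML = (
    '<a href="javascript:;" class="inactive" '
    "onclick=\"showtransfers('1','2020-10-03','2','full');\">2</a>"
)

# Assumed order of the showtransfers() arguments, inferred by comparing the
# onclick handler with the payload used above; the site does not document it.
ASSUMED_ARG_ORDER = ("pid", "date", "page", "filter")


def payload_from_onclick(html):
    """Return a POST payload dict built from a showtransfers(...) call."""
    call = re.search(r"showtransfers\((.*?)\)", html)
    if call is None:
        raise ValueError("no showtransfers(...) call found")
    # Pull out every single-quoted argument, in order.
    args = re.findall(r"'([^']*)'", call.group(1))
    if len(args) != len(ASSUMED_ARG_ORDER):
        raise ValueError("unexpected number of arguments: %d" % len(args))
    return dict(zip(ASSUMED_ARG_ORDER, args))


payload = payload_from_onclick(BUTTON_HTML)
print(payload)
# {'pid': '1', 'date': '2020-10-03', 'page': '2', 'filter': 'full'}
```

The values come out as strings, which is fine for a form-encoded POST; the resulting dict could be passed as data=payload to requests.post in place of the hand-written one.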