Beautifulsoup: activate web button and continue scraping on new page

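I have a university project for which I need to collect data online. I want to get some data from this website: https://www.footballdatabase.eu/en/transfers/-/2020-10-03

For 3 October I managed to get the first 19 rows, but there are 6 pages in total, and I am struggling to activate the button that loads the next page.

Here is the HTML code of the button: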

<a href="javascript:;" class="inactive" onclick="showtransfers('1','2020-10-03','2','full');">2</a>

My code so far:

import requests
from bs4 import BeautifulSoup
import pandas as pd

# Send a browser-like User-Agent with the request.
headers = {'User-Agent':
           'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'}

page = "https://www.footballdatabase.eu/en/transfers/-/2020-10-03"
pageTree = requests.get(page, headers=headers)
pageSoup = BeautifulSoup(pageTree.content, 'html.parser')

# Player name, first team and transfer amount each sit in their own <span>.
Players = pageSoup.find_all("span", {"class": "name"})
Team = pageSoup.find_all("span", {"class": "firstteam"})
Values = pageSoup.find_all("span", {"class": "transferamount"})

PlayersList = []
TeamList = []
ValuesList = []

# zip() stops at the shortest list, so a page with fewer rows cannot raise IndexError.
for player, team, value in zip(Players, Team, Values):
    PlayersList.append(player.text)
    TeamList.append(team.text)
    ValuesList.append(value.text)

# This only covers the rows served with the first page.
df = pd.DataFrame({"Players": PlayersList, "Team": TeamList, "Values": ValuesList})

Thanks a lot!

You can use the requests module to simulate the Ajax call that the button triggers. For example:

import requests
from bs4 import BeautifulSoup


# Form fields mirroring the arguments of the button's showtransfers(...) call.
data = {
    'date':  '2020-10-03',
    'pid': 1,
    'page': 1,
    'filter': 'full',
}

url = 'https://www.footballdatabase.eu/ajax_transfers_show.php'


def team_name(line, selector):
    # Use the linked club name; fall back to 'Free' when the cell has no link.
    link = line.select_one(selector).a
    return link.text if link else 'Free'


for page in range(1, 7):  # <--- adjust number of pages here.
    data['page'] = page
    soup = BeautifulSoup(requests.post(url, data=data).content, 'html.parser')

    # Each transfer is one element with class "line".
    for line in soup.select('.line'):
        name = line.a.text
        first_team = team_name(line, '.firstteam')
        second_team = team_name(line, '.secondteam')
        amount = line.select_one('.transferamount').text

        print('{:<30} {:<20} {:<20} {}'.format(name, first_team, second_team, amount))

Prints:

Bruno Amione                   Belgrano             Hellas Vérone        1.7 M€
Ismael Gutierrez               Betis Deportivo      Atlético B           1 M€
Vitaly Janelt                  Bochum               Brentford            500 k€
Sven Ulreich                   Bayern Munich        Hambourg SV          500 k€
Salim Ali Al Hammadi           Baniyas              Khor Fakkan          Prêt
Giovanni Alessandretti         Ascoli U-20          Recanatese           Prêt
Gabriele Bellodi               AC Milan U-20        Alessandria          Prêt
Louis Britton                  Bristol City B       Torquay United       Prêt
Juan Brunetta                  Godoy Cruz           Parme                Prêt
Bobby Burns                    Barrow               Glentoran            Prêt
Bohdan Butko                   Shakhtar Donetsk     Lech Poznan          Prêt
Nicolò Casale                  Hellas Vérone        Empoli               Prêt
Alessio Da Cruz                Parme                FC Groningue         Prêt
Dalbert Henrique               Inter Milan          Rennes               Prêt

...and so on.
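
The POST fields used above line up with the arguments in the button's onclick attribute. As a rough sketch, they could be read from the page's HTML instead of being typed by hand. The argument order (pid, date, page, filter) is an assumption inferred from comparing the button in the question with the payload, not something taken from the site's JavaScript. The helper names payloads_from_html and last_page are made up for this illustration.

```python
import re

# Button markup copied from the question.
BUTTON_HTML = (
    '<a href="javascript:;" class="inactive" '
    "onclick=\"showtransfers('1','2020-10-03','2','full');\">2</a>"
)

# ASSUMPTION: showtransfers() takes (pid, date, page, filter) in this order.
ARG_NAMES = ("pid", "date", "page", "filter")

# Captures everything between the parentheses of a showtransfers(...) call.
ONCLICK_RE = re.compile(r"showtransfers\(([^)]*)\)")


def payloads_from_html(html):
    """Return one POST payload dict per showtransfers(...) call found in html."""
    payloads = []
    for args in ONCLICK_RE.findall(html):
        values = re.findall(r"'([^']*)'", args)  # the single-quoted arguments
        if len(values) == len(ARG_NAMES):
            payloads.append(dict(zip(ARG_NAMES, values)))
    return payloads


def last_page(payloads):
    """Highest page number among the payloads, or 1 if none were found."""
    return max((int(p["page"]) for p in payloads), default=1)


if __name__ == "__main__":
    found = payloads_from_html(BUTTON_HTML)
    print(found)             # payload for the "2" button
    print(last_page(found))  # highest page number seen in the markup
```

If the first page lists a link for every page, last_page could replace the hard-coded 7 in the loop above; if the site only shows some of the page links, it would undercount. To end up with a table rather than printed lines, one option is to append a dict per row to a list inside the loop and pass that list to pd.DataFrame.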
