
I am trying to scrape the table from the iplt20 website, but it keeps returning an empty list [].

from bs4 import BeautifulSoup
import requests

url = 'https://www.iplt20.com/stats/2021/most-runs'
source = requests.get(url)
soup = BeautifulSoup(source.text, 'html.parser')
soup.find_all('table', class_='np-mostruns_table')

The site renders its content entirely with JavaScript, and requests cannot execute JavaScript, so the table is never present in the HTML you download.

You have to use an automated browser such as Selenium or a similar tool.
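As a rough illustration of the automated-browser route, here is a minimal sketch using Selenium 4 with headless Chrome. The helper name fetch_rendered_html and its timeout default are hypothetical, the CSS class is taken from the question and may not match the live page, and a local Chrome installation is assumed.

```python
def fetch_rendered_html(url, css_selector, timeout=20):
    """Load url in headless Chrome and return the HTML once css_selector appears.

    Hypothetical helper: assumes Selenium 4 and a local Chrome install.
    """
    # Imported lazily so this module can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # run Chrome without a window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Block until the JavaScript has inserted the element, or time out.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Calling fetch_rendered_html('https://www.iplt20.com/stats/2021/most-runs', 'table.np-mostruns_table') should return the rendered markup, which can then be passed to BeautifulSoup in place of source.text.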

While scraping, I also suggest using a browser extension that toggles JavaScript on and off, so you can see what the page contains without it. For example:

Toggle JS
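The same check can be done in code: count how many matching tables exist in the HTML before any JavaScript runs. The sketch below uses only the standard library. Both HTML strings are made-up stand-ins (an empty JavaScript shell and a rendered page), not the real iplt20 markup, and the names TableClassFinder and count_tables are hypothetical.

```python
from html.parser import HTMLParser


class TableClassFinder(HTMLParser):
    """Counts <table> start tags that carry a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != "table":
            return
        # The class attribute may hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_tables(html, css_class):
    """Return the number of tables with css_class found in html."""
    finder = TableClassFinder(css_class)
    finder.feed(html)
    return finder.count


# Made-up examples: what requests would see vs. what a browser would build.
static_html = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'
rendered_html = '<html><body><table class="np-mostruns_table striped"><tr><td>1</td></tr></table></body></html>'

print(count_tables(static_html, "np-mostruns_table"))    # 0
print(count_tables(rendered_html, "np-mostruns_table"))  # 1
```

If count_tables(source.text, 'np-mostruns_table') gives 0 for the downloaded page, the table is being added by JavaScript and no BeautifulSoup selector will find it.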

If you are looking for a table with a class, you should use:

soup.find("table",{"class":"np-mostruns_table"})

You can't get the table because it is loaded dynamically. You need to find the request that loads the data and build your table from its response. The response has many more fields than the site shows, so you can add any others you need. This example uses only the fields shown on the site:

import requests
import json
import pandas as pd


# Feed that the stats page requests in the background
url = 'https://ipl-stats-sports-mechanic.s3.ap-south-1.amazonaws.com/ipl/feeds/stats/60-toprunsscorers.js?callback=ontoprunsscorers'
results = []
response = requests.get(url)
response.raise_for_status()
# The JSON is wrapped in a callback call; keep only what lies between the
# first '(' and the last ')' so a ')' inside the data cannot truncate it
text = response.text
json_data = json.loads(text[text.find('(') + 1:text.rfind(')')])
for player in json_data['toprunsscorers']:
    # Map the feed's field names to the column headers used on the site
    data = {
        'Player': player['StrikerName'],
        'Mat': player['Matches'],
        'Inns': player['Innings'],
        'NO': player['NotOuts'],
        'Runs': player['TotalRuns'],
        'HS': player['HighestScore'],
        'AVG': player['BattingAverage'],
        'BF': player['Balls'],
        'SR': player['StrikeRate'],
        '100': player['Centuries'],
        '50': player['FiftyPlusRuns'],
        '4s': player['Fours'],
        '6s': player['Sixes']
    }
    results.append(data)
df = pd.DataFrame(results)
print(df)

OUTPUT:

                  Player Mat Inns NO Runs    HS  ...   BF      SR 100 50  4s  6s
0            Jos Buttler  17   17  2  863   116  ...  579  149.05   4  4  83  45
1              K L Rahul  15   15  3  616  103*  ...  455  135.38   2  4  45  30
2        Quinton De Kock  15   15  1  508  140*  ...  341  148.97   1  3  47  23
3          Hardik Pandya  15   15  4  487   87*  ...  371  131.26   0  4  49  12
4           Shubman Gill  16   16  2  483    96  ...  365  132.32   0  4  51  11
..                   ...  ..  ... ..  ...   ...  ...  ...     ...  .. ..  ..  ..
157     Fazalhaq Farooqi   3    1  1    2    2*  ...    8   25.00   0  0   0   0
158   Jagadeesha Suchith   5    2  0    2     2  ...    8   25.00   0  0   0   0
159          Tim Southee   9    5  1    2    1*  ...   12   16.66   0  0   0   0
160  Nathan Coulter-Nile   1    1  1    1    1*  ...    2   50.00   0  0   0   0
161        Anrich Nortje   6    1  1    1    1*  ...    6   16.66   0  0   0   0
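The callback-unwrapping step in the code above can be isolated into a small helper that is easier to test. This is a sketch, not the feed's documented format: the sample string is invented for illustration, the names unwrap_jsonp, to_rows and COLUMNS are hypothetical, and only three of the columns are mapped to keep it short.

```python
import json
import re

# callbackName( ...payload... ) with an optional trailing semicolon
_JSONP = re.compile(r"^\s*[\w.$]+\s*\((.*)\)\s*;?\s*$", re.DOTALL)

# Column header -> field name in the feed (subset of the mapping used above)
COLUMNS = {"Player": "StrikerName", "Runs": "TotalRuns", "SR": "StrikeRate"}


def unwrap_jsonp(text):
    """Strip the callback wrapper and parse the JSON payload inside it."""
    match = _JSONP.match(text)
    if match is None:
        raise ValueError("response is not a JSONP callback")
    return json.loads(match.group(1))


def to_rows(feed, key="toprunsscorers"):
    """Build one dict per player, using the site's column headers."""
    return [
        {column: player.get(field) for column, field in COLUMNS.items()}
        for player in feed[key]
    ]


# Invented sample; note the ')' inside the name, which a naive
# text.find(')') slice would cut short.
sample = (
    'ontoprunsscorers({"toprunsscorers": [{"StrikerName": "Player (c)", '
    '"TotalRuns": "10", "StrikeRate": "125.00"}]});'
)
rows = to_rows(unwrap_jsonp(sample))
print(rows)
```

The resulting list of dicts can be passed straight to pd.DataFrame, as in the answer above.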
