
Python, BeautifulSoup4 TypeError: find() takes no keyword arguments

Hi everyone, I want to parse HTML with BeautifulSoup4 and wrote this code:

from selenium import webdriver
from django.core.management.base import BaseCommand
import datetime
from bs4 import BeautifulSoup as bs


url = "https://www.basketball-reference.com/leagues/NBA_2020.html"
main_url = "https://www.basketball-reference.com"
browser = webdriver.Chrome()
browser.set_window_size(1920, 1080)
browser.minimize_window()
browser.get(url)
soup = bs(browser.page_source, 'lxml')
team_urls = []
try:
    tables = soup.find('table', id='team-stats-per_game')
    for tr in tables.tbody:
        team_name = tr.find('a')
        try:
           if type(team_name) != int:
                if type(team_name) != 'NoneType':
                    team_url = team_name.get('href')
                    team_urls.append(team_url)
        except:
            pass
except Exception as e:
    print(e)
for team in team_urls:
    browser2 = webdriver.Chrome()
    browser2.minimize_window()
    browser2.get(main_url + team)
    team_soup = bs(browser2.page_source, 'lxml')
    team_op_stats = team_soup.find('table', id='team_and_opponent').find_all('tbody')
    for t1_stats in team_op_stats[0]:
        if t1_stats.find('th', attrs={'class', 'left'}):
            print(t1_stats)
        print("##" * 50)
    browser2.quit()
    break
browser.quit()

This code outputs:

File "C:\Users\ysfnm\PycharmProjects\denemee\denemee\apps\result\management\commands\nba.py", line 46, in handle
    if t1_stats.find('th', attrs={'class', 'left'}):
TypeError: find() takes no keyword arguments

From my research, the answer given to other people who got the same error was:

You aren't calling BeautifulSoup's .find(), you're calling it on an ordinary string object (the .text attribute from your BeautifulSoup object).

But:

            for t1_stats in team_op_stats[0]:
                print(t1_stats)
                print("##" * 50)

This code outputs:

<tr>
<th class="left" data-stat="player" scope="row">Team/G</th>
<td class="center iz" data-stat="g"></td>
<td class="center" data-stat="mp_per_g">240.7</td>
<td class="center" data-stat="fg_per_g">43.8</td>
<td class="center" data-stat="fga_per_g">91.0</td>
<td class="center" data-stat="fg_pct">.481</td>
<td class="center" data-stat="fg3_per_g">14.0</td>
<td class="center" data-stat="fg3a_per_g">39.1</td>
<td class="center" data-stat="fg3_pct">.359</td>
<td class="center" data-stat="fg2_per_g">29.7</td>
<td class="center" data-stat="fg2a_per_g">51.9</td>
<td class="center" data-stat="fg2_pct">.573</td>
<td class="center" data-stat="ft_per_g">17.7</td>
<td class="center" data-stat="fta_per_g">24.3</td>
<td class="center" data-stat="ft_pct">.727</td>
<td class="center" data-stat="orb_per_g">10.0</td>
<td class="center" data-stat="drb_per_g">41.5</td>
<td class="center" data-stat="trb_per_g">51.5</td>
<td class="center" data-stat="ast_per_g">26.0</td>
<td class="center" data-stat="stl_per_g">7.7</td>
<td class="center" data-stat="blk_per_g">6.5</td>
<td class="center" data-stat="tov_per_g">14.7</td>
<td class="center" data-stat="pf_per_g">19.3</td>
<td class="center" data-stat="pts_per_g">119.2</td>
</tr>

Where is my mistake?

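  • change: if t1_stats.find('th', attrs={'class', 'left'}):
  • to: if t1_stats.find('th', attrs={'class': 'left'}):

{'class', 'left'} is a set, not a dict; attrs expects a dict, so use a colon.

Then

  • change: for t1_stats in team_op_stats[0]:
  • to: for t1_stats in team_op_stats:

team_op_stats[0] is a single tbody tag, and iterating over a tag yields its children, including the whitespace strings between the rows. Those strings are str subclasses, so calling .find() on one with attrs=... reaches str.find(), which takes no keyword arguments. That is most likely where the TypeError comes from.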
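
A minimal offline sketch of why both changes matter, assuming a hand-written HTML fragment shaped like the row printed in the question (the fragment and the variable names are illustrative, not taken from the live page). It checks that {'class', 'left'} is a set, that iterating a tbody directly also yields the whitespace strings between rows, and that a plain str rejects keyword arguments to find():

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString, Tag

# Hand-written stand-in for one row of the team page (an assumption, not live data).
html = """<table id="team_and_opponent"><tbody>
<tr><th class="left" data-stat="player" scope="row">Team/G</th><td class="center" data-stat="pts_per_g">119.2</td></tr>
</tbody></table>"""

soup = BeautifulSoup(html, "html.parser")
tbody = soup.find("table", id="team_and_opponent").find("tbody")

# 1) Braces with a comma build a set; attrs needs a dict (braces with a colon).
wrong_attrs = {"class", "left"}
right_attrs = {"class": "left"}

# 2) Iterating a Tag walks its direct children, including whitespace strings.
children = list(tbody)
string_children = [c for c in children if isinstance(c, NavigableString)]
tag_children = [c for c in children if isinstance(c, Tag)]

# 3) A plain str has its own find(), which accepts no keyword arguments.
try:
    "\n".find("th", attrs=right_attrs)
    str_find_failed = False
except TypeError:
    str_find_failed = True

# Asking for the rows explicitly returns Tag objects only.
rows = tbody.find_all("tr")
labels = [row.find("th", attrs=right_attrs).get_text() for row in rows]
print(type(wrong_attrs).__name__, len(string_children), len(tag_children), labels)
```

Calling tbody.find_all('tr') is one way to skip the whitespace strings; filtering the children with isinstance(child, Tag) would be another.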

HOWEVER

Selenium is slow for this. The tables on these pages sit inside HTML comments. You can fetch the page with requests, use BeautifulSoup to pull out the comments, then parse the tables inside them with pandas. That will be much faster.
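
As a rough offline illustration of the comment trick, assuming a hand-written fragment in which the table is wrapped in an HTML comment (the fragment is made up for the example and is not the real page source): a direct find() misses the table, while re-parsing the comment text finds it.

```python
from bs4 import BeautifulSoup, Comment

# Hand-written stand-in: the table only exists inside an HTML comment.
html = """<div>
<!--
<table id="team_and_opponent"><tbody>
<tr><th class="left" data-stat="player" scope="row">Team/G</th><td class="center" data-stat="pts_per_g">119.2</td></tr>
</tbody></table>
-->
</div>"""

soup = BeautifulSoup(html, "html.parser")

# The parser treats the comment as one opaque string, so no table tag is found.
direct_hit = soup.find("table", id="team_and_opponent")

# Collect the comment nodes, then parse the text of each one as HTML.
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
hidden_tables = []
for comment in comments:
    if "<table" in comment:
        inner = BeautifulSoup(str(comment), "html.parser")
        hidden_tables.extend(inner.find_all("table"))

found_ids = [table.get("id") for table in hidden_tables]
print(direct_hit, found_ids)
```

The sketch uses a second BeautifulSoup pass instead of pandas only to stay dependency-light; handing the comment text to pandas.read_html gives a DataFrame directly.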

I'm not entirely sure which table you want, but from what you've shown above, it looks like the team stats:

Code:

from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup, Comment

url = "https://www.basketball-reference.com/leagues/NBA_2020.html"
main_url = "https://www.basketball-reference.com"

response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# Collect the team links from the 'division' blocks.
team_urls = []
teams = soup.find_all('div', {'class': 'division'})
for each in teams:
    links = each.find_all('a', href=True)
    for link in links:
        team_urls.append(main_url + link['href'])

for team in team_urls:
    response = requests.get(team)
    soup = BeautifulSoup(response.text, 'html.parser')

    # The h1 heading holds the season and the team name in its first two spans.
    spans = soup.find('h1').find_all('span')
    seas = spans[0].text
    teamName = spans[1].text

    # The stat tables are wrapped in HTML comments, so pull those out first.
    comments = soup.find_all(string=lambda text: isinstance(text, Comment))

    tables = []
    for each in comments:
        if 'table' in each:
            try:
                # StringIO avoids the literal-HTML deprecation warning in recent pandas.
                tables.append(pd.read_html(StringIO(str(each)))[0])
            except ValueError:
                # pandas found no parsable table in this comment
                continue
    print('%s %s' % (seas, teamName))
    print(tables[2].to_string())
    print("##" * 50)

Output sample:

2019-20 Toronto Raptors
   Unnamed: 0     G     MP     FG    FGA     FG%     3P    3PA    3P%      2P     2PA     2P%     FT    FTA    FT%    ORB   DRB   TRB    AST     STL    BLK    TOV     PF    PTS
0        Team  37.0   8955   1459   3270   0.446    489   1330  0.368     970    1940   0.500    680    850  0.800    381  1335  1716    915     305    198    549    775   4087
1      Team/G   NaN  242.0   39.4   88.4   0.446   13.2   35.9  0.368    26.2    52.4   0.500   18.4   23.0  0.800   10.3  36.1  46.4   24.7     8.2    5.4   14.8   20.9  110.5
2     Lg Rank   NaN     11     21     19  22.000      5      7  6.000      27      24  25.000     10     16  4.000     15     8     9     11       7      9     14     15     15
3   Year/Year   NaN  -0.2%  -6.5%  -0.8%  -0.027   6.8%   6.4%  0.001  -12.1%   -5.2%  -0.039   4.0%   4.5% -0.004   7.4%  1.3%  2.6%  -2.7%   -0.6%   0.4%   5.8%  -0.4%  -3.5%
4    Opponent  37.0   8955   1402   3313   0.423    470   1416  0.332     932    1897   0.491    615    811  0.758    432  1310  1742    921     249    208    610    744   3889
5  Opponent/G   NaN  242.0   37.9   89.5   0.423   12.7   38.3  0.332    25.2    51.3   0.491   16.6   21.9  0.758   11.7  35.4  47.1   24.9     6.7    5.6   16.5   20.1  105.1
6     Lg Rank   NaN     11      2     17   2.000     26     29  3.000       2       4   3.000     10     11  7.000     29    17    26     22       2     26      2     19      4
7   Year/Year   NaN  -0.2%  -5.9%  -0.1%  -0.026  18.1%  22.6% -0.013  -14.6%  -12.3%  -0.014  -2.6%  -1.7% -0.007  10.4%  3.5%  5.2%   1.4%  -11.3%  25.3%  10.4%  -2.0%  -3.0%
####################################################################################################
2019-20 Boston Celtics
   Unnamed: 0     G     MP     FG    FGA     FG%     3P    3PA     3P%     2P    2PA     2P%     FT    FTA    FT%    ORB    DRB    TRB     AST    STL    BLK   TOV    PF    PTS
0        Team  34.0   8185   1384   3036   0.456    403   1151   0.350    981   1885   0.520    598    749  0.798    372   1200   1572     784    277    210   474   720   3769
1      Team/G   NaN  240.7   40.7   89.3   0.456   11.9   33.9   0.350   28.9   55.4   0.520   17.6   22.0  0.798   10.9   35.3   46.2    23.1    8.1    6.2  13.9  21.2  110.9
2     Lg Rank   NaN     30     15     16  17.000     16     13  20.000     13     15  11.000     14     22  6.000      8     14     10      21      9      6     8    16     14
3   Year/Year   NaN  -0.2%  -3.3%  -1.4%  -0.009  -5.8%  -1.9%  -0.015  -2.2%  -1.0%  -0.006  12.5%  13.0% -0.004  11.6%   1.6%   3.8%  -12.3%  -5.4%  16.4%  8.7%  4.0%  -1.4%
4    Opponent  34.0   8185   1281   2932   0.437    397   1156   0.343    884   1776   0.498    561    761  0.737    346   1156   1502     775    233    187   539   711   3520
5  Opponent/G   NaN  240.7   37.7   86.2   0.437   11.7   34.0   0.343   26.0   52.2   0.498   16.5   22.4  0.737   10.2   34.0   44.2    22.8    6.9    5.5  15.9  20.9  103.5
6     Lg Rank   NaN     30      1      6   4.000     13     18   9.000      4      6   8.000      8     16  1.000     16      8      9       5      3     22     6    14      1
7   Year/Year   NaN  -0.2%  -4.6%  -2.1%  -0.012   1.4%   1.5%  -0.000  -7.1%  -4.3%  -0.015  -5.4%  -2.0% -0.027  -2.1%  -4.3%  -3.8%   -3.7%   1.1%  42.3%  4.7%  7.0%  -4.1%
####################################################################################################

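If you are only after team stats per game, you can get them in one request at https://www.basketball-reference.com/leagues/NBA_2020.html#all_team-stats-base . Interestingly, the stats differ between that single page and each team's individual page. I'm not sure why, although the values printed below match the Opponent/G rows in the per-team output above (for example, Boston's 103.5 PTS), so tables[1] may be the opponent per-game table.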

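from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup, Comment

url = 'https://www.basketball-reference.com/leagues/NBA_2020.html#all_team-stats-base'
response = requests.get(url)

soup = BeautifulSoup(response.text, 'html.parser')

# The tables sit inside HTML comments; collect the comments, then parse each one.
comments = soup.find_all(string=lambda text: isinstance(text, Comment))

tables = []
for each in comments:
    if 'table' in each:
        try:
            # StringIO avoids the literal-HTML deprecation warning in recent pandas.
            tables.append(pd.read_html(StringIO(str(each)))[0])
        except ValueError:
            # pandas found no parsable table in this comment
            continue

print(tables[1].to_string())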

Output:

print (tables[1].to_string())
      Rk                    Team   G     MP    FG   FGA    FG%    3P   3PA    3P%    2P   2PA    2P%    FT   FTA    FT%   ORB   DRB   TRB   AST  STL  BLK   TOV    PF    PTS
0    1.0          Boston Celtics  34  240.7  37.7  86.2  0.437  11.7  34.0  0.343  26.0  52.2  0.498  16.5  22.4  0.737  10.2  34.0  44.2  22.8  6.9  5.5  15.9  20.9  103.5
1    2.0          Denver Nuggets  36  242.1  39.0  86.1  0.453  10.7  32.6  0.327  28.3  53.5  0.529  16.6  21.9  0.757  10.1  33.4  43.5  24.0  7.1  4.7  14.4  20.6  105.2
2    3.0               Utah Jazz  36  240.0  39.2  89.3  0.439  10.8  31.8  0.341  28.4  57.6  0.493  16.7  21.7  0.769   9.8  34.3  44.0  20.5  7.9  5.0  12.6  21.2  105.9
3    4.0           Orlando Magic  37  240.0  38.8  86.4  0.449  11.9  33.6  0.354  26.9  52.8  0.509  14.3  19.0  0.754   9.8  36.8  46.5  23.1  6.9  4.3  15.1  19.2  103.8
4    5.0              Miami Heat  36  244.2  38.3  86.9  0.441  12.2  37.3  0.327  26.1  49.7  0.526  18.4  24.1  0.764   9.4  32.3  41.7  24.0  8.1  4.2  14.3  22.1  107.3
5    6.0      Los Angeles Lakers  37  240.7  38.1  87.1  0.437  11.0  32.6  0.337  27.1  54.5  0.498  17.7  22.2  0.797   9.7  32.3  42.0  22.9  8.2  4.1  16.0  21.3  104.9
6    7.0         Toronto Raptors  37  242.0  37.9  89.5  0.423  12.7  38.3  0.332  25.2  51.3  0.491  16.6  21.9  0.758  11.7  35.4  47.1  24.9  6.7  5.6  16.5  20.1  105.1
7    8.0          Indiana Pacers  37  242.0  39.2  88.6  0.442  10.8  32.3  0.336  28.3  56.3  0.503  17.0  21.8  0.780  10.4  34.6  45.1  23.4  6.5  4.8  14.3  18.9  106.2
8    9.0        Dallas Mavericks  36  242.1  40.6  90.9  0.447  11.4  33.8  0.337  29.2  57.1  0.511  16.7  21.6  0.773  11.1  34.5  45.6  23.3  7.2  3.9  12.7  21.4  109.3
9   10.0           Chicago Bulls  37  241.4  38.2  83.7  0.457  10.9  32.5  0.336  27.3  51.2  0.534  19.8  26.1  0.761  10.5  36.6  47.2  24.1  8.3  6.5  18.3  19.9  107.2
10  11.0   Oklahoma City Thunder  37  242.7  40.8  89.9  0.454  10.8  31.3  0.343  30.1  58.6  0.513  15.0  18.7  0.802  10.6  34.4  45.1  22.7  6.9  4.2  14.3  23.0  107.4
11  12.0         Houston Rockets  35  241.4  42.3  92.2  0.459  12.5  35.7  0.351  29.8  56.6  0.527  16.7  22.3  0.751  10.8  35.1  45.9  26.1  7.9  4.7  15.4  21.6  113.9
12  13.0       San Antonio Spurs  35  244.3  42.6  92.0  0.463  12.5  34.5  0.361  30.2  57.5  0.525  17.1  22.3  0.766   9.7  36.1  45.8  25.1  7.2  4.7  12.6  19.8  114.8
13  14.0           Brooklyn Nets  36  243.5  40.7  93.9  0.433  12.2  34.4  0.354  28.5  59.4  0.479  18.1  23.4  0.771  11.5  35.9  47.4  21.2  7.8  5.6  13.5  21.2  111.7
14  15.0      Philadelphia 76ers  38  241.3  39.1  85.6  0.457   9.8  27.6  0.355  29.3  57.9  0.505  18.0  24.3  0.738   8.2  32.5  40.7  21.9  7.4  4.0  14.2  20.9  105.9
15  16.0         Milwaukee Bucks  38  240.7  38.6  93.4  0.414  14.2  38.4  0.370  24.4  55.0  0.444  15.8  20.6  0.769   9.8  36.3  46.0  23.9  7.2  4.6  14.5  21.4  107.3
16  17.0  Minnesota Timberwolves  36  244.9  41.8  91.6  0.457  11.2  31.4  0.356  30.6  60.1  0.509  19.6  24.8  0.788  11.1  37.4  48.5  23.3  7.4  5.5  15.7  22.3  114.4
17  18.0        Sacramento Kings  38  242.6  39.5  84.9  0.465  11.7  33.4  0.349  27.8  51.5  0.540  17.9  22.5  0.796   9.3  33.8  43.1  24.3  8.1  4.3  15.3  19.0  108.5
18  19.0         New York Knicks  37  240.7  39.5  85.9  0.460  13.6  35.1  0.386  25.9  50.8  0.511  19.3  26.2  0.739  10.2  36.2  46.3  23.9  7.0  4.9  14.2  19.9  111.9
19  20.0    Los Angeles Clippers  38  240.7  39.4  89.7  0.439  11.9  34.4  0.346  27.5  55.3  0.497  19.0  24.7  0.768  10.9  34.4  45.3  22.8  8.4  5.0  15.3  23.5  109.8
20  21.0     Cleveland Cavaliers  37  240.7  43.4  89.7  0.484  12.7  33.7  0.377  30.7  56.0  0.549  14.1  18.3  0.770   9.9  33.5  43.5  25.9  8.8  6.6  12.9  19.6  113.6
21  22.0         Detroit Pistons  38  240.0  41.7  88.1  0.474  11.4  30.4  0.377  30.3  57.7  0.525  16.2  20.9  0.774  10.2  33.1  43.3  25.1  8.2  5.8  14.1  20.1  111.1
22  23.0            Phoenix Suns  37  242.0  41.8  87.5  0.477  11.9  32.0  0.373  29.8  55.4  0.538  19.7  25.6  0.771   9.0  36.0  45.1  23.8  7.6  5.6  16.2  23.4  115.2
23  24.0   Golden State Warriors  38  242.0  41.6  88.4  0.471  13.5  34.8  0.387  28.1  53.6  0.525  16.2  21.0  0.774  10.4  35.8  46.3  25.2  8.1  5.4  16.3  20.4  112.9
24  25.0  Portland Trail Blazers  38  240.7  40.8  91.9  0.444  12.4  34.3  0.361  28.4  57.5  0.494  19.6  25.6  0.767  11.8  36.2  47.9  23.6  7.2  5.3  13.0  19.9  113.6
25  26.0      Washington Wizards  36  240.7  43.4  89.1  0.487  12.3  33.3  0.369  31.1  55.8  0.557  21.1  26.9  0.786  10.6  35.9  46.5  25.6  7.0  5.5  15.6  21.3  120.1
26  27.0    New Orleans Pelicans  37  242.0  41.9  89.7  0.468  12.6  34.1  0.369  29.4  55.6  0.528  20.4  25.6  0.797   9.8  36.5  46.3  24.5  7.7  4.4  15.1  19.9  116.9
27  28.0       Charlotte Hornets  39  241.9  42.2  88.3  0.479  12.4  34.9  0.355  29.8  53.4  0.559  14.2  18.3  0.773  10.8  35.3  46.1  27.0  8.3  4.8  14.9  21.2  111.0
28  29.0           Atlanta Hawks  37  242.0  42.8  90.0  0.476  11.5  32.1  0.359  31.3  57.8  0.541  20.1  26.0  0.774  11.2  35.5  46.8  24.6  8.9  6.6  15.8  20.5  117.3
29  30.0       Memphis Grizzlies  38  240.7  41.8  90.1  0.464  12.3  33.8  0.365  29.5  56.3  0.524  20.3  25.7  0.790   9.9  35.1  44.9  25.1  7.9  5.4  14.6  19.8  116.3
30   NaN          League Average  37  241.7  40.4  88.9  0.455  11.9  33.6  0.355  28.5  55.3  0.516  17.6  22.9  0.771  10.3  35.0  45.3  24.0  7.6  5.1  14.8  20.8  110.4
