AttributeError: 'NoneType' object has no attribute 'text' - BeautifulSoup

Question

I have a short script that scrapes data from fbref (data link: https://fbref.com/en/comps/9/stats/Premier-League-Stats ). It used to work well, but some features now fail: I've checked that the fields that no longer work are "player", "nationality", "position", "squad", "age", and "birth_year". I have also checked that these fields still have the same names on the web page as before. Any ideas or help to solve the problem?

Many thanks!


import requests
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np
import re
import sys, getopt
import csv

def get_tables(url):
    res = requests.get(url)
    ## The next two lines get around the issue with comments breaking the parsing.
    comm = re.compile("<!--|-->")
    soup = BeautifulSoup(comm.sub("",res.text),'lxml')
    all_tables = soup.findAll("tbody")
    team_table = all_tables[0]
    player_table = all_tables[1]
    return player_table, team_table

def get_frame(features, player_table):
    pre_df_player = dict()
    features_wanted_player = features
    rows_player = player_table.find_all('tr')
    for row in rows_player:
        if(row.find('th',{"scope":"row"}) != None):
    
            for f in features_wanted_player:
                cell = row.find("td",{"data-stat": f})
                a = cell.text.strip().encode()
                text=a.decode("utf-8")
                if(text == ''):
                    text = '0'
                if((f!='player')&(f!='nationality')&(f!='position')&(f!='squad')&(f!='age')&(f!='birth_year')):
                    text = float(text.replace(',',''))
                if f in pre_df_player:
                    pre_df_player[f].append(text)
                else:
                    pre_df_player[f] = [text]
    df_player = pd.DataFrame.from_dict(pre_df_player)
    return df_player

stats = ["player","nationality","position","squad","age","birth_year","games","games_starts","minutes","goals","assists","pens_made","pens_att","cards_yellow","cards_red","goals_per90","assists_per90","goals_assists_per90","goals_pens_per90","goals_assists_pens_per90","xg","npxg","xa","xg_per90","xa_per90","xg_xa_per90","npxg_per90","npxg_xa_per90"]

def frame_for_category(category,top,end,features):
    url = (top + category + end)
    player_table, team_table = get_tables(url)
    df_player = get_frame(features, player_table)
    return df_player

top='https://fbref.com/en/comps/9/'
end='/Premier-League-Stats'
df1 = frame_for_category('stats',top,end,stats)

df1

Answer 1

I suggest loading the table with pandas' read_html. There is a direct link to this table under Share & Export --> Embed this Table.

import pandas as pd
df = pd.read_html("https://widgets.sports-reference.com/wg.fcgi?css=1&site=fb&url=%2Fen%2Fcomps%2F9%2Fstats%2FPremier-League-Stats&div=div_stats_standard", header=1)

This returns a list of DataFrames; the table can be accessed as df[0]. Output of df[0].head():

	Rk	Player	Nation	Pos	Squad	Age	Born	MP	Starts	Min	90s	Gls	Ast	G-PK	CrdY	Gls.1	Ast.1	G+A	G-PK.1	G+A-PK	xG	npxG	xA	npxG+xA	xG.1	xA.1	xG+xA	npxG.1	npxG+xA.1	Matches
0	1	Patrick van Aanholt	nl NED	DF	Crystal Palace	30-190	1990	16	15	1324	14.7	0	1	0	1	0	0.07	0.07	0	0.07	1.2	1.2	0.8	2	0.08	0.05	0.13	0.08	0.13	Matches
1	2	Tammy Abraham	eng ENG	FW	Chelsea	23-156	1997	20	12	1021	11.3	6	1	6	0	0.53	0.09	0.62	0.53	0.62	5.6	5.6	0.9	6.5	0.49	0.08	0.57	0.49	0.57	Matches
2	3	Che Adams	eng ENG	FW	Southampton	24-237	1996	26	22	1985	22.1	5	4	5	1	0.23	0.18	0.41	0.23	0.41	5.5	5.5	4.3	9.9	0.25	0.2	0.45	0.25	0.45	Matches
3	4	Tosin Adarabioyo	eng ENG	DF	Fulham	23-164	1997	23	23	2070	23	0	0	0	1	0	0	0	0	0	1	1	0.1	1.1	0.04	0.01	0.05	0.04	0.05	Matches
4	5	Adrián	es ESP	GK	Liverpool	34-063	1987	3	3	270	3	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	Matches
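
A frame loaded this way may still need some cleanup before analysis. The following is only a sketch, assuming the table repeats its header row partway through the body (long fbref tables tend to do this), which would leave every column as strings. The `raw` frame is a hand-built, hypothetical stand-in for df[0], and `clean_stats` is an illustrative helper, not part of pandas.

```python
import pandas as pd

# Hypothetical stand-in for df[0]; the third row imitates a repeated header row.
raw = pd.DataFrame(
    {
        "Rk": ["1", "2", "Rk", "3"],
        "Player": ["Patrick van Aanholt", "Tammy Abraham", "Player", "Che Adams"],
        "Min": ["1324", "1021", "Min", "1985"],
        "Gls": ["0", "6", "Gls", "5"],
    }
)


def clean_stats(frame, text_columns=("Player",)):
    """Drop repeated header rows and convert the non-text columns to numbers."""
    # Keep only rows whose Rk value is not the literal header label.
    body = frame[frame["Rk"] != "Rk"].copy().reset_index(drop=True)
    numeric_columns = [c for c in body.columns if c not in text_columns]
    # errors="coerce" turns anything unparsable into NaN instead of raising.
    body[numeric_columns] = body[numeric_columns].apply(pd.to_numeric, errors="coerce")
    return body


cleaned = clean_stats(raw)
print(cleaned.dtypes)
print(cleaned)
```

For the real table, text columns such as Nation, Pos, Squad, Age and Matches would need to be added to `text_columns` so they are not coerced to NaN.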

Answer 2

If you're only after the player stats, change player_table = all_tables[1] to player_table = all_tables[2], because at the moment you are feeding a team table into the get_frame function. The rows of that table have no cells for the player fields, so row.find returns None and cell.text raises the AttributeError.

I tried it, and it worked fine after that.
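
A hard-coded index will break again if the page gains or loses a table. As a hedged sketch of an alternative, the table can be picked by content and a missing cell can be guarded against. The HTML below is a made-up miniature of the page, and `find_player_table` and `cell_text` are hypothetical helper names, not BeautifulSoup functions.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature of the page: a team table followed by a player table.
HTML = """
<table><tbody>
  <tr><th scope="row">Arsenal</th><td data-stat="goals">37</td></tr>
</tbody></table>
<table><tbody>
  <tr><th scope="row">1</th><td data-stat="player">Tammy Abraham</td><td data-stat="goals">6</td></tr>
  <tr><th scope="row">2</th><td data-stat="player">Che Adams</td><td data-stat="goals"></td></tr>
</tbody></table>
"""


def find_player_table(soup):
    """Return the first tbody that contains a cell with data-stat="player"."""
    for tbody in soup.find_all("tbody"):
        if tbody.find("td", {"data-stat": "player"}) is not None:
            return tbody
    raise ValueError("no table with a 'player' column found")


def cell_text(row, feature, default="0"):
    """Return the stripped text of a cell, or default if it is missing or empty."""
    cell = row.find("td", {"data-stat": feature})
    if cell is None:
        return default
    return cell.get_text(strip=True) or default


soup = BeautifulSoup(HTML, "html.parser")

# The team table has no "player" cell, so find() returns None here;
# calling .text on that result is what raises the AttributeError.
team_row = soup.find_all("tbody")[0].find("tr")
print(team_row.find("td", {"data-stat": "player"}))

player_table = find_player_table(soup)
for row in player_table.find_all("tr"):
    print(cell_text(row, "player"), cell_text(row, "goals"))
```

In the question's code, `find_player_table(soup)` could stand in for `all_tables[1]`, and `cell_text(row, f)` for the `cell.text.strip()` lines in get_frame.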

2 answers

Answer 1: 2021-03-07 15:58:44

Answer 2 (accepted): 2021-03-07 16:20:13
