Beautifulsoup parsing the right value

https://www.transfermarkt.us/manchester-city/kader/verein/281/saison_id/2020/plus/1

From the website above, I am trying to parse the number, player name, and position of each player using Beautifulsoup.

For example, I want to print:

  1. Ederson Goalkeeper
  2. Arijanet Muric Goalkeeper ...

I tried something like this:

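import requests
from bs4 import BeautifulSoup as bs

url = 'https://www.transfermarkt.us/manchester-city/kader/verein/281/saison_id/2020/plus/1'

page = requests.get(url, headers={'User-Agent':'Mozilla/5.0'})
soup = bs(page.content, 'html.parser')
# Collect every link inside the squad table body
rows = soup.find("table", class_="items").find('tbody').find_all('a')
for row in rows:
    # Skip links that do not wrap an image
    if row.find('img') is None:
        continue
    print(row.find('img')['title'])
    print('\n')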

I started by printing the name, but the image's title attribute is not always the player's name, and sometimes it is empty. Getting the number and the position also seems impossible from this branch of the tree. How can I access the other branches at the same time to get the number and the position as well?

To get player numbers, names and positions, you can use this example:

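import requests
from bs4 import BeautifulSoup

url = 'https://www.transfermarkt.us/manchester-city/kader/verein/281/saison_id/2020/plus/1'

# Send a browser-like User-Agent header, as the question's code also does
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0'}
soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')

# Select only the table's own rows that contain data cells
for row in soup.select('table.items > tbody > tr:has(td)'):
    # Flatten all text in the row into a list of fragments
    data = row.get_text(strip=True, separator='|').split('|')
    # data[0] is the number and data[1] the name; the position is at index 3
    # when the row yields 10 fragments and at index 4 otherwise
    print('{:<5} {:<30} {}'.format(data[0], data[1], data[3] if len(data) == 10 else data[4]))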

Prints:

31    Ederson                        Goalkeeper
-     Zack Steffen                   Goalkeeper
49    Arijanet Muric                 Goalkeeper
33    Scott Carson                   Goalkeeper
14    Aymeric Laporte                Centre-Back
5     John Stones                    Centre-Back
6     Nathan Aké                     Centre-Back
50    Eric García                    Centre-Back
30    Nicolás Otamendi               Centre-Back
25    Fernandinho                    Centre-Back
34    Philippe Sandler               Centre-Back
24    Tosin Adarabioyo               Centre-Back
78    Taylor Harwood-Bellis          Centre-Back
22    Benjamin Mendy                 Left-Back
11    Oleksandr Zinchenko            Left-Back
12    Angeliño                       Left-Back
2     Kyle Walker                    Right-Back
27    João Cancelo                   Right-Back
-     Yan Couto                      Right-Back
16    Rodri                          Defensive Midfield
8     Ilkay Gündogan                 Central Midfield
47    Phil Foden                     Central Midfield
17    Kevin De Bruyne                Attacking Midfield
-     Luka Ilic                      Attacking Midfield
7     Raheem Sterling                Left Winger
-     Marlos Moreno                  Left Winger
20    Bernardo Silva                 Right Winger
26    Riyad Mahrez                   Right Winger
21    Ferran Torres                  Right Winger
-     Patrick Roberts                Right Winger
9     Gabriel Jesus                  Centre-Forward
10    Sergio Agüero                  Centre-Forward
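
The key idea in the example above is to iterate over table rows instead of individual links, so every cell of a player's row is available at once. Here is a minimal offline sketch of that idea. The SAMPLE_HTML markup is hypothetical and much simpler than the real page (only the table class `items` is taken from the question), and `parse_squad` is an illustrative helper name, not part of Beautiful Soup. It reads the first three direct cells of each row rather than counting text fragments, which assumes the number, name and position come in that order.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup standing in for the squad table.
# The real page has more columns and more nesting.
SAMPLE_HTML = """
<table class="items">
  <thead><tr><th>#</th><th>Player</th><th>Position</th></tr></thead>
  <tbody>
    <tr><td>31</td><td><a href="#">Ederson</a></td><td>Goalkeeper</td></tr>
    <tr><td>-</td><td><a href="#">Zack Steffen</a></td><td>Goalkeeper</td></tr>
  </tbody>
</table>
"""


def parse_squad(html):
    """Return a list of (number, name, position) tuples, one per table row."""
    soup = BeautifulSoup(html, "html.parser")
    players = []
    # Child combinators (>) keep the match to the table's own body rows.
    for row in soup.select("table.items > tbody > tr"):
        # Only direct <td> children, so cells of any nested table are ignored.
        cells = row.find_all("td", recursive=False)
        if len(cells) < 3:
            continue  # not a player row in this simplified layout
        number, name, position = (c.get_text(strip=True) for c in cells[:3])
        players.append((number, name, position))
    return players


players = parse_squad(SAMPLE_HTML)
for number, name, position in players:
    # Same left-aligned column layout as the example above.
    print("{:<5} {:<30} {}".format(number, name, position))
```

On the live page the column order and nesting would need to be checked in the browser's inspector before relying on fixed cell positions.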
