
Parse data with BeautifulSoup4

import requests
from bs4 import BeautifulSoup

request = requests.get("http://www.lolesports.com/en_US/worlds/world_championship_2016/standings/default")
content = request.content
soup = BeautifulSoup(content, "html.parser")
team_name = soup.findAll('text', {'class': 'team-name'})

print(team_name)

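I'm trying to parse data from the URL "http://www.lolesports.com/en_US/worlds/world_championship_2016/standings/default". Each team's name appears in an element like <text class="team-name">SK Telecom T1</text>. I want to extract that text (SK Telecom T1) and print it to the screen, but I get an empty list, []. What am I doing wrong?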

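The site relies on JavaScript to load its content. requests does not execute JavaScript, so the HTML it downloads does not yet contain the team names, and BeautifulSoup has nothing to find.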
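One way to check this diagnosis is to count the target elements in the HTML the server actually returns and compare that with what the browser shows after scripts have run. The following is a minimal sketch using only the standard library's html.parser; the class ClassCounter, the function count_matches, and both HTML strings are made up for illustration and are not copied from the real page.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags with a given name and CSS class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; class may hold several names.
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and self.css_class in classes:
            self.count += 1


def count_matches(html, tag="text", css_class="team-name"):
    """Return how many <tag class="css_class"> elements appear in html."""
    parser = ClassCounter(tag, css_class)
    parser.feed(html)
    parser.close()
    return parser.count


# Hypothetical stand-in for what requests receives: an empty container
# that a script is expected to fill in later.
RAW_HTML = (
    '<html><body><div id="app"></div>'
    '<script src="app.js"></script></body></html>'
)

# Hypothetical stand-in for the DOM after the browser has run the script.
RENDERED_HTML = (
    '<html><body><div id="app"><svg>'
    '<text class="team-name">SK Telecom T1</text>'
    '</svg></div></body></html>'
)

print(count_matches(RAW_HTML))       # 0
print(count_matches(RENDERED_HTML))  # 1
```

If the count is zero for the downloaded HTML but the element is visible in the browser's inspector, the content is being added by JavaScript after the page loads.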

For a site like this, Selenium is a better fit. It drives Firefox (or another browser driver) to load the whole page, JavaScript included.
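A rough sketch of the Selenium approach might look like the following. It assumes Selenium 4 and a working Firefox driver are installed, that the page still serves the team names in elements matching text.team-name, and that a 10-second implicit wait is long enough; none of these were verified against the live site. The helper name collect_team_names is made up for illustration.

```python
def collect_team_names(driver, url, css_selector="text.team-name", wait_seconds=10):
    """Load url in a Selenium WebDriver and return the non-empty team names.

    driver is any Selenium WebDriver instance, e.g. webdriver.Firefox().
    The selector and timeout defaults are assumptions, not verified values.
    """
    # Let find_elements keep polling for up to wait_seconds so the page's
    # JavaScript has time to insert the elements.
    driver.implicitly_wait(wait_seconds)
    driver.get(url)

    # "css selector" is the string value of selenium's By.CSS_SELECTOR.
    elements = driver.find_elements("css selector", css_selector)

    names = []
    for element in elements:
        # textContent is read instead of .text because the names sit in
        # <text> elements, which may be SVG nodes.
        text = (element.get_attribute("textContent") or "").strip()
        if text:
            names.append(text)
    return names


# Usage (requires Selenium and a Firefox driver):
#
#     from selenium import webdriver
#
#     driver = webdriver.Firefox()
#     try:
#         url = ("http://www.lolesports.com/en_US/worlds/"
#                "world_championship_2016/standings/default")
#         print(collect_team_names(driver, url))
#     finally:
#         driver.quit()
```

Passing the driver in, instead of creating it inside the function, keeps the browser's lifetime under the caller's control and lets the same helper work with Chrome or any other driver.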

You don't need Selenium. A simple GET request to http://api.lolesports.com/api/v1/leagues retrieves all of the dynamic content as JSON:

import requests

data = requests.get("http://api.lolesports.com/api/v1/leagues?slug=worlds").json()

It gives you a large amount of data; everything you want appears to be under data["teams"]. A snippet of it:

[{'id': 2, 'slug': 'bangkok-titans', 'name': 'Bangkok Titans', 'teamPhotoUrl': 'http://na.lolesports.com/sites/default/files/BKT_GPL.TMPROFILE_0.png', 'logoUrl': 'http://assets.lolesports.com/team/bangkok-titans-597g0x1v.png', 'acronym': 'BKT', 'homeLeague': 'urn:rg:lolesports:global:league:league:12', 'altLogoUrl': None, 'createdAt': '2014-07-17T18:34:47.000Z', 'updatedAt': '2015-09-29T16:09:36.000Z', 'bios': {'en_US': 'The Bangkok Titans are the undisputed champions of Thailand’s League of Legends esports scene. They achieved six consecutive 1st place finishes in the Thailand Pro League from 2014 to 2015. However, they aren’t content with just domestic domination.

Every team is listed there:

In [1]: import requests

In [2]: data = requests.get("http://api.lolesports.com/api/v1/leagues?slug=worlds").json()

In [3]: for d in data["teams"]:
   ...:     print(d["name"])
   ...:
Bangkok Titans
ahq e-Sports Club
SK Telecom T1
TSM
Fnatic
Cloud9 
Counter Logic Gaming
H2K
Edward Gaming
INTZ e-Sports
paiN Gaming
Origen
LGD Gaming
Invictus Gaming
Royal Never Give Up
Flash Wolves
Splyce
Samsung Galaxy
KT Rolster
ROX Tigers
G2 Esports
I May
Albus NoX Luna
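The loop above can be wrapped in a small helper that tolerates a missing "teams" key and strips stray whitespace from the names (note the trailing space after Cloud9 in the output). This is only a sketch: the function name team_names is made up, and SAMPLE is a heavily trimmed stand-in for the real response, keeping only a few fields from the snippet shown earlier.

```python
def team_names(payload):
    """Return the cleaned team names from a leagues API payload.

    payload is the decoded JSON dict; a missing or empty "teams" key
    yields an empty list instead of raising KeyError.
    """
    teams = payload.get("teams") or []
    return [team["name"].strip() for team in teams if team.get("name")]


# Trimmed stand-in for the API response, for illustration only.
SAMPLE = {
    "teams": [
        {"id": 2, "slug": "bangkok-titans", "name": "Bangkok Titans", "acronym": "BKT"},
        {"name": "Cloud9 "},   # trailing space, as in the printed output
        {"slug": "no-name"},   # entries without a name are skipped
    ]
}

print(team_names(SAMPLE))  # ['Bangkok Titans', 'Cloud9']
print(team_names({}))      # []
```

With a live response, assuming the endpoint still answers, the call would be team_names(requests.get("http://api.lolesports.com/api/v1/leagues?slug=worlds").json()).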

