
Organise .json data into a pandas DataFrame/Excel

I have made some progress scraping stats from https://liiga.fi/tilastot/joukkueet .

Edit: I have now found a better way. For example -> https://liiga.fi/api/v1/teams/stats/2017/runkosarja/

How can I scrape stats such as wins, goals scored, etc. from there and build a stats table?

I only get this:

id  season  ...              cacheUpdateDate buyTicketsUrl
0       1    2022  ...  2021-11-09T16:54:56.243678Z           NaN
1       4    2022  ...  2021-11-09T16:30:45.387198Z           NaN
2       3    2022  ...  2021-11-09T16:57:55.660414Z           NaN
3       2    2022  ...  2021-11-09T17:05:34.763264Z           NaN
4       6    2022  ...  2021-11-09T16:52:37.081451Z           NaN
..    ...     ...  ...                          ...           ...
518  6909    2022  ...  2021-11-09T17:26:48.886193Z           NaN
519  6898    2022  ...  2021-11-09T16:54:08.527658Z           NaN
520  6902    2022  ...  2021-11-09T16:30:00.998782Z           NaN
521  6910    2022  ...  2021-11-09T17:05:19.950635Z           NaN
522  6906    2022  ...  2021-11-09T16:39:19.910866Z           NaN

[523 rows x 11 columns]

You could load the data into a pandas DataFrame like this:

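import requests
import pandas as pd

url = "https://liiga.fi/api/v1/games?tournament=all&season=2021"
response = requests.get(url)
response.raise_for_status()  # stop early on an HTTP error
data = response.json()       # parsed JSON: one record per game

df = pd.DataFrame(data)

print(df)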
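The printout in the question hides most of the 11 columns behind "...". A minimal sketch of how to see what the DataFrame really contains, using a small made-up sample in place of the live response (the nested field names `teamId` and `goals` are assumptions, not taken from the API):

```python
import pandas as pd

# Hypothetical sample shaped like the API response; real field names may differ.
sample = [
    {"id": 1, "season": 2022,
     "homeTeam": {"teamId": "AAA", "goals": 3},
     "awayTeam": {"teamId": "BBB", "goals": 1}},
    {"id": 4, "season": 2022,
     "homeTeam": {"teamId": "CCC", "goals": 2},
     "awayTeam": {"teamId": "AAA", "goals": 2}},
]
df = pd.DataFrame(sample)

# List every column name instead of relying on the truncated printout.
print(df.columns.tolist())

# Find the columns whose cells are dicts (these need unpacking later).
nested_cols = [col for col in df.columns
               if df[col].map(lambda value: isinstance(value, dict)).any()]
print(nested_cols)

# Print a few rows without pandas eliding the middle columns.
with pd.option_context("display.max_columns", None, "display.width", 200):
    print(df.head())
```

Running the same three checks on the real `df` should show which columns hold nested data.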

Since you want to reproduce the standings table, you still need to analyse the raw data that you have. To make that easier, you could split the homeTeam and awayTeam columns into separate DataFrames:

# Each cell in these columns is expected to hold a dict, so a list of them expands into columns
df_homeTeam = pd.DataFrame(df['homeTeam'].tolist())
df_awayTeam = pd.DataFrame(df['awayTeam'].tolist())

print(df_awayTeam)
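An alternative that keeps everything in one table is `pd.json_normalize`, which flattens nested dicts into prefixed columns. This is only a sketch on made-up records; the nested keys `teamId` and `goals` are assumptions about the response, so check the real column names first:

```python
import pandas as pd

# Hypothetical records shaped like the API response; real field names may differ.
games = [
    {"id": 1, "season": 2022,
     "homeTeam": {"teamId": "AAA", "goals": 3},
     "awayTeam": {"teamId": "BBB", "goals": 1}},
    {"id": 4, "season": 2022,
     "homeTeam": {"teamId": "CCC", "goals": 2},
     "awayTeam": {"teamId": "AAA", "goals": 2}},
]

# Nested keys become columns such as homeTeam_goals and awayTeam_teamId.
flat = pd.json_normalize(games, sep="_")

print(flat.columns.tolist())
print(flat)
```

With the real data you would pass the parsed JSON list (the result of `.json()`) instead of `games`.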

With the home and away data split out like this, you can loop over the games to work out the score of each game, who won, the league-table points and so on.
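As a rough sketch of that calculation, the table can also be built without explicit loops by stacking home and away results and grouping by team. The column names and the 3/1/0 points rule below are assumptions made for illustration; the real league may award points differently (for example for overtime results):

```python
import pandas as pd

# Hypothetical flat game results; column names are illustrative only.
games = pd.DataFrame({
    "home": ["A", "B", "C", "A"],
    "away": ["B", "C", "A", "C"],
    "home_goals": [3, 1, 2, 0],
    "away_goals": [1, 1, 4, 2],
})

# One row per team per game: goals for (gf) and goals against (ga).
home = games.rename(
    columns={"home": "team", "home_goals": "gf", "away_goals": "ga"}
)[["team", "gf", "ga"]]
away = games.rename(
    columns={"away": "team", "away_goals": "gf", "home_goals": "ga"}
)[["team", "gf", "ga"]]
results = pd.concat([home, away], ignore_index=True)

# Classify each result from the team's point of view.
results["win"] = (results["gf"] > results["ga"]).astype(int)
results["draw"] = (results["gf"] == results["ga"]).astype(int)
results["loss"] = (results["gf"] < results["ga"]).astype(int)

# Assumed rule: 3 points for a win, 1 for a draw, 0 for a loss.
results["points"] = 3 * results["win"] + results["draw"]

# Sum per team, add games played, and sort into a standings table.
table = results.groupby("team")[["win", "draw", "loss", "gf", "ga", "points"]].sum()
table["played"] = results.groupby("team").size()
table = table.sort_values(["points", "gf"], ascending=False)

print(table)
```

To get the result into Excel, `table.to_excel("standings.xlsx")` writes the table to a workbook; it needs an Excel writer package such as openpyxl to be installed.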
