
Organise .json data into a pandas DataFrame/Excel

I have made some progress scraping stats from https://liiga.fi/tilastot/joukkueet .

Edit: I have now found a better way, for example https://liiga.fi/api/v1/teams/stats/2017/runkosarja/

How can I scrape stats such as wins, goals scored, etc. from there and build a stats table?

This is all I get:

       id  season  ...              cacheUpdateDate buyTicketsUrl
0       1    2022  ...  2021-11-09T16:54:56.243678Z           NaN
1       4    2022  ...  2021-11-09T16:30:45.387198Z           NaN
2       3    2022  ...  2021-11-09T16:57:55.660414Z           NaN
3       2    2022  ...  2021-11-09T17:05:34.763264Z           NaN
4       6    2022  ...  2021-11-09T16:52:37.081451Z           NaN
..    ...     ...  ...                          ...           ...
518  6909    2022  ...  2021-11-09T17:26:48.886193Z           NaN
519  6898    2022  ...  2021-11-09T16:54:08.527658Z           NaN
520  6902    2022  ...  2021-11-09T16:30:00.998782Z           NaN
521  6910    2022  ...  2021-11-09T17:05:19.950635Z           NaN
522  6906    2022  ...  2021-11-09T16:39:19.910866Z           NaN

[523 rows x 11 columns]

You could load the data into a pandas DataFrame like this:

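import requests
import pandas as pd

# Games endpoint of the Liiga API (all tournaments, season 2021)
url = "https://liiga.fi/api/v1/games?tournament=all&season=2021"
r = requests.get(url)
r.raise_for_status()  # stop early on an HTTP error
games = r.json()      # parsed JSON, one record per game

df = pd.DataFrame(games)

print(df)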

Since you want to reproduce the standings table, you still need to process the raw game data. To make that easier, you could split the homeTeam and awayTeam columns into separate DataFrames:

# Each cell of df['homeTeam'] / df['awayTeam'] is a nested record;
# turning the column into a list lets pandas expand it into columns
homeTeam_list = df['homeTeam'].tolist()
awayTeam_list = df['awayTeam'].tolist()

df_homeTeam = pd.DataFrame(homeTeam_list)
df_awayTeam = pd.DataFrame(awayTeam_list)

print(df_awayTeam)
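
As an alternative to building the two DataFrames by hand, pandas can flatten nested records in one step with pd.json_normalize. The sketch below uses a small made-up sample instead of the live API, and the field names inside homeTeam and awayTeam (teamName, goals) are assumptions; check df_homeTeam.columns to see what the real response contains.

```python
import pandas as pd

# Made-up sample shaped like the games response. The nested field names
# (teamName, goals) are assumptions, not taken from the real API.
games = [
    {"id": 1, "season": 2022,
     "homeTeam": {"teamName": "A", "goals": 3},
     "awayTeam": {"teamName": "B", "goals": 1}},
    {"id": 2, "season": 2022,
     "homeTeam": {"teamName": "B", "goals": 2},
     "awayTeam": {"teamName": "C", "goals": 4}},
]

# Nested dicts become prefixed columns such as homeTeam_goals
flat = pd.json_normalize(games, sep="_")

print(flat.columns.tolist())
print(flat)
```

This keeps one row per game, with the home and away fields side by side, which can be easier to work with than two separate DataFrames.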

With these you can loop over the games to work out each game's score, the winner, the standings points, and so on.
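
A rough sketch of that calculation, using groupby instead of explicit loops. It runs on made-up sample games, and the nested field names teamName and goals are assumptions about the response. It only counts games, wins and goals; real league points (overtime results and so on) would need extra rules.

```python
import pandas as pd

# Made-up sample games; teamName and goals are assumed field names.
games = [
    {"homeTeam": {"teamName": "A", "goals": 3},
     "awayTeam": {"teamName": "B", "goals": 1}},
    {"homeTeam": {"teamName": "B", "goals": 2},
     "awayTeam": {"teamName": "C", "goals": 4}},
    {"homeTeam": {"teamName": "C", "goals": 0},
     "awayTeam": {"teamName": "A", "goals": 2}},
]
df = pd.DataFrame(games)
df_homeTeam = pd.DataFrame(df["homeTeam"].tolist())
df_awayTeam = pd.DataFrame(df["awayTeam"].tolist())

# One row per team per game, seen from that team's side
home = pd.DataFrame({
    "team": df_homeTeam["teamName"],
    "goals_for": df_homeTeam["goals"],
    "goals_against": df_awayTeam["goals"],
})
away = pd.DataFrame({
    "team": df_awayTeam["teamName"],
    "goals_for": df_awayTeam["goals"],
    "goals_against": df_homeTeam["goals"],
})
results = pd.concat([home, away], ignore_index=True)
results["games"] = 1
results["win"] = (results["goals_for"] > results["goals_against"]).astype(int)

# Sum per team and sort by wins, then goals scored
table = (
    results.groupby("team")[["games", "win", "goals_for", "goals_against"]]
    .sum()
    .sort_values(["win", "goals_for"], ascending=False)
)

print(table)
```

If you need the result in Excel, table.to_excel("standings.xlsx") should work once an Excel writer such as openpyxl is installed; the file name here is just an example.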

