
Convert JSON data from Request into Pandas DataFrame

I'm trying to scrape some data from a web page and put it into a pandas DataFrame. I have read and tried many things, but I cannot get the result I want: a DataFrame with all the data in separate columns and rows. Below is my code.

import requests
import json
import pandas as pd

r = requests.get('http://www.starcapital.de/test/Res_Stockmarketvaluation_FundamentalKZ_Tbl.php')

# Parse the response body into Python objects (dicts and lists)
a = json.loads(r.text)

# pd.json_normalize replaces pandas.io.json.json_normalize, deprecated since pandas 1.0
res = pd.json_normalize(a)
##print(res)

df = pd.DataFrame(res)
print(df)

##df = pd.read_json(a)
##print(df)

pd.read_json(a) doesn't seem to work in any way.
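A likely reason, stated as an assumption since no error message is shown: `a` is already a parsed Python object (the result of `json.loads`), whereas `pd.read_json` expects JSON text supplied through a path, URL, or file-like object. A minimal sketch with a made-up JSON string (`raw_text` is hypothetical and only stands in for `r.text`; it is not the real response):

```python
import io
import json

import pandas as pd

# Hypothetical flat JSON text standing in for r.text (not the real response).
raw_text = '[{"Country": "Russia", "PE": 9.1}, {"Country": "China", "PE": 7.2}]'

# read_json takes JSON text through a file-like object (or a path/URL),
# not an object that has already been parsed.
df_from_text = pd.read_json(io.StringIO(raw_text))

# Once the text has been parsed with json.loads, pass the result
# to the DataFrame constructor instead.
parsed = json.loads(raw_text)
df_from_parsed = pd.DataFrame(parsed)

print(df_from_text)
print(df_from_parsed)
```

Both routes give the same two-column table for this flat input. A nested payload would still need flattening afterwards.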

Or, more simply:

import requests
import pandas as pd

r = requests.get('http://www.starcapital.de/test/Res_Stockmarketvaluation_FundamentalKZ_Tbl.php')

j = r.json()

df = pd.DataFrame.from_dict(j)

You can do it this way:

import requests
import pandas as pd

r = requests.get('http://www.starcapital.de/test/Res_Stockmarketvaluation_FundamentalKZ_Tbl.php')

j = r.json()

df = pd.DataFrame([[d['v'] for d in x['c']] for x in j['rows']],
                  columns=[d['label'] for d in j['cols']])

Result:

In [217]: df
Out[217]:
                   Country  Weight  CAPE    PE    PC   PB   PS   DY  RS 26W  RS 52W  Score
0                   Russia     1.1   5.9   9.1   5.1  1.0  0.9  3.7    1.22    1.35    1.0
1                    China     1.1  12.8   7.2   4.5  0.9  0.6  4.2    1.05    1.13    2.0
2                    Italy     1.0  12.7  31.5   5.7  1.2  0.6  3.3    1.13    1.11    3.0
3                  Austria     0.2  14.3  21.7   7.3  1.1  0.7  2.5    1.10    1.15    4.0
4                   Norway     0.4  12.8  32.4   7.4  1.6  1.2  4.0    1.10    1.17    5.0
5                  Hungary     0.0  12.5  49.8   7.5  1.4  0.7  2.3    1.12    1.19    6.0
6                    Spain     1.2  11.7  24.7   7.0  1.4  1.2  3.7    1.08    1.11    7.0
7                    Czech     0.0   8.9  13.6   6.1  1.3  1.0  6.7    1.03    1.05    8.0
8                   Brazil     1.3   9.8  42.1   7.4  1.6  1.2  3.0    1.06    1.24    9.0
9                 Portugal     0.1  11.3  29.0   4.8  1.5  0.7  3.9    1.05    1.06   10.0
..                     ...     ...   ...   ...   ...  ...  ...  ...     ...     ...    ...
42        EMERGING MARKETS    13.5  14.0  16.0   8.8  1.6  1.3  2.9    1.04    1.11    NaN
43        DEVELOPED EUROPE    22.4  16.6  26.5   9.9  1.8  1.1  3.2    1.06    1.08    NaN
44         EMERGING EUROPE     1.7   8.6  10.9   5.8  1.1  0.8  3.4    1.13    1.20    NaN
45        EMERGING AMERICA     3.0  15.2  30.1   9.4  1.9  1.2  2.4    1.03    1.11    NaN
46  DEVELOPED ASIA-PACIFIC    17.7   NaN  17.7   8.8  1.3  0.9  2.5    1.03    1.09    NaN
47   EMERGING ASIA-PACIFIC     6.9  14.9  15.1   9.1  1.8  1.4  2.7    1.01    1.08    NaN
48         EMERGING AFRICA     0.8   NaN  16.5  10.6  2.0  1.4  3.8    1.06    1.12    NaN
49             MIDDLE EAST     1.3   NaN  13.7  11.8  1.5  1.8  3.9    1.06    1.10    NaN
50                    BRIC     5.9  11.8  14.6   7.4  1.4  1.2  2.7    1.06    1.16    NaN
51     OTHER EMERGING MKT.     2.5   NaN  17.7  12.9  1.8  1.5  3.1    1.16    1.20    NaN

[52 rows x 11 columns]
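The list comprehension above relies on the payload shape implied by the keys it reads: a `cols` list whose items carry a `label`, and a `rows` list whose items hold a `c` list of `{'v': value}` cells. A small offline sketch of that shape, where `sample` is a hand-written stand-in (three columns and two rows copied from the output above, not the real response):

```python
import pandas as pd

# Hand-written stand-in for r.json(); the shape is inferred from the keys
# used in the answer ('cols', 'label', 'rows', 'c', 'v').
sample = {
    "cols": [{"label": "Country"}, {"label": "Weight"}, {"label": "CAPE"}],
    "rows": [
        {"c": [{"v": "Russia"}, {"v": 1.1}, {"v": 5.9}]},
        {"c": [{"v": "China"}, {"v": 1.1}, {"v": 12.8}]},
    ],
}

# One inner list per row: pull the 'v' value out of every cell.
# Column names come from the 'label' of each entry in 'cols'.
df_sample = pd.DataFrame(
    [[cell["v"] for cell in row["c"]] for row in sample["rows"]],
    columns=[col["label"] for col in sample["cols"]],
)
print(df_sample)
```

Running this without network access makes it easier to check the comprehension before pointing it at the live URL.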

This is one step simpler than Justin's (already helpful) answer: append .json() to the end of the r = requests.get line.

import requests
import pandas as pd

r = requests.get('http://www.starcapital.de/test/Res_Stockmarketvaluation_FundamentalKZ_Tbl.php').json()

df = pd.DataFrame.from_dict(r)

You may also want pd.json_normalize for cases where your data isn't shaped the way from_dict() expects.

For example:

data = [
    {
        "id": 1,
        "name": "Cole Volk",
        "fitness": {"height": 130, "weight": 60},
    },
    {"name": "Mark Reg", "fitness": {"height": 130, "weight": 60}},
    {
        "id": 2,
        "name": "Faye Raker",
        "fitness": {"height": 130, "weight": 60},
    },
]
pd.json_normalize(data, max_level=1)
    id        name  fitness.height  fitness.weight
0  1.0   Cole Volk             130              60
1  NaN    Mark Reg             130              60
2  2.0  Faye Raker             130              60
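To see what `pd.json_normalize` adds here, it can help to compare it with the plain constructor on the same kind of data. A minimal sketch, where `records` is a shortened copy of the example above: `pd.DataFrame` keeps each `fitness` dict in a single column, while `pd.json_normalize` expands it into dotted column names.

```python
import pandas as pd

# Shortened copy of the example data above.
records = [
    {"id": 1, "name": "Cole Volk", "fitness": {"height": 130, "weight": 60}},
    {"name": "Mark Reg", "fitness": {"height": 130, "weight": 60}},
]

# Plain constructor: the nested dict stays as one object-valued column.
plain = pd.DataFrame(records)

# json_normalize: nested keys become 'fitness.height' and 'fitness.weight'.
flat = pd.json_normalize(records)

print(list(plain.columns))
print(list(flat.columns))
```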
