
Get nested JSON into a pandas DataFrame

I have this JSON:

{
  "status": "ok",
  "stocks": [
    {
      "symbol": "TSLA",
      "price": [
        {
          "date": "2021-10-19",
          "close": 141.98
        }
      ]
    },
    {
      "symbol": "AMZN",
      "price": [
        {
          "date": "2021-10-19",
          "close": 3444.15
        }
      ]
    }
  ]
}

I need to create two pandas DataFrames from the price data, one from the first array and one from the second, without using the stock name to identify them.

Expected:

First array (which is based on TSLA), df1:

    date      close
2021-10-19   141.98

Second array (which is based on AMZN), df2:

    date      close
2021-10-19   3444.15

I explored the json_normalize function from pandas, but it seems that it can only flatten the first level of a JSON. How can I flatten the second level and get only my expected result?

EDIT:

I managed to get it partly working, though the result is not quite what I expected. Using this:

df = pd.json_normalize(data['stocks'], record_path='price', meta=['symbol'])

It returns:

    date      close    symbol
2021-10-19   141.98    TSLA
2021-10-19   3444.15   AMZN

Is there a way to split the two stocks when performing the json_normalize, so that the output matches my expected result?

In your example, set the symbol as the index:

df = df.set_index('symbol')

Then just call:

df.loc['TSLA']
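
One detail worth checking, sketched below with the sample data from the question: when the label is unique in the index, df.loc['TSLA'] returns a Series (a single row), not a DataFrame. Passing a list of labels keeps the result as a DataFrame. The variable names row and frame are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
data = {
    "status": "ok",
    "stocks": [
        {"symbol": "TSLA", "price": [{"date": "2021-10-19", "close": 141.98}]},
        {"symbol": "AMZN", "price": [{"date": "2021-10-19", "close": 3444.15}]},
    ],
}

df = pd.json_normalize(data["stocks"], record_path="price", meta=["symbol"])
df = df.set_index("symbol")

row = df.loc["TSLA"]      # scalar label, unique in the index: a Series
frame = df.loc[["TSLA"]]  # list of labels: a one-row DataFrame

print(type(row).__name__)
print(type(frame).__name__)
```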

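Or you can build a dict of DataFrames, one per symbol:

d = {x: y for x, y in df.groupby(df.index)}
# the keys are the index labels (the symbols), not 0 and 1
d['TSLA']
d['AMZN']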
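
Since the question asks for the split without referring to the stock names, here is a minimal sketch of two position-based alternatives, assuming the sample data from the question. Option A skips json_normalize and builds one DataFrame per entry of "stocks". Option B splits the normalized frame with groupby, where sort=False keeps the groups in order of first appearance. The names frames, flat, and groups are illustrative only.

```python
import pandas as pd

# Sample data copied from the question
data = {
    "status": "ok",
    "stocks": [
        {"symbol": "TSLA", "price": [{"date": "2021-10-19", "close": 141.98}]},
        {"symbol": "AMZN", "price": [{"date": "2021-10-19", "close": 3444.15}]},
    ],
}

# Option A: one DataFrame per stock, in the order the stocks appear in the JSON
frames = [pd.DataFrame(stock["price"]) for stock in data["stocks"]]
df1, df2 = frames  # unpacking assumes exactly two stocks

# Option B: normalize once, then split by symbol without naming any symbol
flat = pd.json_normalize(data["stocks"], record_path="price", meta=["symbol"])
groups = [
    g.drop(columns="symbol").reset_index(drop=True)
    for _, g in flat.groupby("symbol", sort=False)
]

print(df1)
print(df2)
```

Either list can then be indexed by position (frames[0], frames[1]), which avoids hard-coding 'TSLA' or 'AMZN'.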

