How to merge multiple JSON/Python dictionaries into 1 dataframe

I have the following JSON objects, which I am getting from an API call. I want to combine the data into one dataframe so I can write it to a CSV file using pandas.

Raw JSON:

{'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Life Sciences Tools & Services', 'ipo': '1999-11-18', 'logo': 'https://static.finnhub.io/logo/5f1f8412-80eb-11ea-bd05-00000000092a.png', 'marketCapitalization': 30719.97, 'name': 'Agilent Technologies Inc', 'phone': '14083458886', 'shareOutstanding': 308.309635, 'ticker': 'A', 'weburl': 'https://www.agilent.com/'}

{'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Metals & Mining', 'ipo': '2016-10-18', 'logo': '', 'marketCapitalization': 2727.509, 'name': 'Alcoa Corp', 'phone': '14123152900', 'shareOutstanding': 185.924291, 'ticker': 'AA', 'weburl': 'https://www.alcoa.com/global/en/home.asp'}

{'country': 'CN', 'currency': 'CNY', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'Diversified Consumer Services', 'ipo': '2008-01-29', 'logo': '', 'marketCapitalization': 35.81037, 'name': 'ATA Creativity Global', 'phone': '861065181133', 'shareOutstanding': 47.592384, 'ticker': 'AACG', 'weburl': 'http://www.ata.net.cn'}

{'country': 'US', 'currency': 'USD', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'N/A', 'ipo': '', 'logo': '', 'marketCapitalization': 738.99, 'name': 'Artius Acquisition Inc', 'phone': '12123097668', 'shareOutstanding': 87.54375, 'ticker': 'AACQU', 'weburl': ''}

Here is what I am trying to make the data look like:

ticker(as index)   country   currency   exchange   finnhubIndustry    ipo      logo   ...
    
A                 'US'      'USD'      'NEW YORK.. 'Life Science..'   1999-11  'http://..
AA                'US'      'USD'      'NYSE'      'Metals & Mi...'   2016-10  'http://..
AACG              'CN'      'CNY'      'NASDAQ'    'Diversified...'   2008-01  'http://..
AADR              'US'      'USD'      'NASDAQ'    'N/A'              ''       'http://..

cols = ['country', 'currency', 'exchange', 'finnhubIndustry', 'ipo', 'logo', 'marketCapitalization', 'name', 'phone', 'shareOutstanding', 'ticker', 'weburl']

When I did something similar in the past, I used

    import pandas
    import requests

    datastock = requests.get(url).json()  # url: the API endpoint (not shown here)
    cols = ['o', 'h', 'l', 'c', 'v', 't', 's']
    df = pandas.DataFrame(datastock, columns=cols)

but in that case the data already arrived combined in a single response, like this:

{'c': [10.35, 10.36, 10.37, 10.36, 10.44, 10.45, 10.4, 10.416, 10.37, 10.43, 10.4, 10.35, 10.3, 10.12, 10.04, 10.23, 10.1, 10.1, 10.13, 10.09, 10.2, 10.15, 10.15, 10.1, 10.15, 10.125, 10.08, 10.055, 10.03, 10.04, 10.01, 10.04, 10.03, 10.03, 10.05, 10.1, 10.2, 10.08, 10.44], 'h': [10.44, 10.41, 10.4, 10.42, 10.45, 10.49, 10.45, 10.46, 10.5, 10.5, 10.45, 10.45, 10.4, 10.28, 10.39, 10.25, 10.2, 10.16, 10.17, 10.15, 10.2, 10.17, 10.18, 10.13, 10.24, 10.22, 10.15, 10.097, 10.07, 10.1, 10.09, 10.08, 10.04, 10.07, 10.1, 10.12, 10.2, 10.2, 10.45], 'l': [10.3, 10.34, 10.33, 10.35, 10.37, 10.425, 10.38, 10.33, 10.35, 10.38, 10.37, 10.34, 10.23, 10.1, 10, 10.042, 10.05, 10.05, 10.05, 10.06, 10.07, 10.11, 10.11, 10.05, 10.03, 10.07, 10.05, 10.02, 9.97, 10, 10, 10.02, 10.02, 10.02, 10.01, 10.01, 10.03, 10.06, 10.18], 'o': [10.42, 10.4, 10.35, 10.35, 10.37, 10.46, 10.41, 10.46, 10.5, 10.38, 10.37, 10.45, 10.365, 10.28, 10.39, 10.05, 10.2, 10.1, 10.1, 10.1, 10.1, 10.15, 10.17, 10.125, 10.24, 10.22, 10.07, 10.09, 10.07, 10.1, 10.09, 10.045, 10.04, 10.07, 10.02, 10.01, 10.1, 10.157, 10.2], 's': 'ok', 't': [1594684800, 1594771200, 1594857600, 1594944000, 1595203200, 1595289600, 1595376000, 1595462400, 1595548800, 1595808000, 1595894400, 1595980800, 1596067200, 1596153600, 1596412800, 1596499200, 1596585600, 1596672000, 1596758400, 1597017600, 1597104000, 1597190400, 1597276800, 1597363200, 1597622400, 1597708800, 1597795200, 1597881600, 1597968000, 1598227200, 1598313600, 1598400000, 1598486400, 1598572800, 1598832000, 1598918400, 1599004800, 1599091200, 1599177600], 'v': [17017800, 2752500, 1143800, 391000, 446900, 484800, 682300, 79600, 1295100, 15616, 537200, 99700, 717200, 682300, 329229, 371700, 939100, 214000, 149700, 461200, 304200, 411900, 37200, 141800, 371200, 488900, 750300, 311800, 443000, 554029, 176300, 152400, 48700, 571900, 136227, 85200, 49300, 200700, 329555]}

I'm not sure whether my best route is to combine the JSON data so that it looks like this and then convert it, or whether there is an easier way.

I wonder if your "raw json" is really what you mean. Normally, a JSON file contains one single top-level object, whereas your example shows four separate ones. I would rather expect your raw JSON file to look like

[
{'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Life Sciences Tools & Services', 'ipo': '1999-11-18', 'logo': 'https://static.finnhub.io/logo/5f1f8412-80eb-11ea-bd05-00000000092a.png', 'marketCapitalization': 30719.97, 'name': 'Agilent Technologies Inc', 'phone': '14083458886', 'shareOutstanding': 308.309635, 'ticker': 'A', 'weburl': 'https://www.agilent.com/'},
{'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Metals & Mining', 'ipo': '2016-10-18', 'logo': '', 'marketCapitalization': 2727.509, 'name': 'Alcoa Corp', 'phone': '14123152900', 'shareOutstanding': 185.924291, 'ticker': 'AA', 'weburl': 'https://www.alcoa.com/global/en/home.asp'},
{'country': 'CN', 'currency': 'CNY', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'Diversified Consumer Services', 'ipo': '2008-01-29', 'logo': '', 'marketCapitalization': 35.81037, 'name': 'ATA Creativity Global', 'phone': '861065181133', 'shareOutstanding': 47.592384, 'ticker': 'AACG', 'weburl': 'http://www.ata.net.cn'},
{'country': 'US', 'currency': 'USD', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'N/A', 'ipo': '', 'logo': '', 'marketCapitalization': 738.99, 'name': 'Artius Acquisition Inc', 'phone': '12123097668', 'shareOutstanding': 87.54375, 'ticker': 'AACQU', 'weburl': ''}
]

which is then an array of objects. Or you may mean that you have multiple JSON files, each with one object. Depending on your file format, you may be able to use pandas.read_json.
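As a rough sketch of both file layouts mentioned above: the records below are a trimmed, illustrative subset of the fields in the question, `io.StringIO` stands in for a real file, and the temporary directory and file names are made up for the example. Note that valid JSON uses double quotes, unlike the Python dict output shown in the question.

```python
import io
import json
import tempfile
from pathlib import Path

import pandas

# Illustrative records only (a subset of the fields from the question).
records = [
    {"ticker": "A", "country": "US", "currency": "USD"},
    {"ticker": "AA", "country": "US", "currency": "USD"},
    {"ticker": "AACG", "country": "CN", "currency": "CNY"},
]

# Case 1: one file containing an array of objects.
# read_json accepts a path or a file-like object; StringIO stands in for a file.
df_from_array = pandas.read_json(io.StringIO(json.dumps(records)))

# Case 2: several files, each holding a single object.
# Load each file into a dict, then build the frame from the list of dicts.
with tempfile.TemporaryDirectory() as tmp:
    for rec in records:
        Path(tmp, f"{rec['ticker']}.json").write_text(json.dumps(rec))
    loaded = [json.loads(p.read_text()) for p in sorted(Path(tmp).glob("*.json"))]

df_from_files = pandas.DataFrame(loaded)

print(df_from_array)
print(df_from_files)
```

Both frames should end up with one row per object and one column per key.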

But if you have somehow massaged the objects into a Python list of dicts, you can just pass that list to pandas.DataFrame. The result is essentially what you want (call set_index('ticker') on it if the ticker should be the index):

>>> import pandas
>>> x = [
... {'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Life Sciences Tools & Services', 'ipo': '1999-11-18', 'logo': 'https://static.finnhub.io/logo/5f1f8412-80eb-11ea-bd05-00000000092a.png', 'marketCapitalization': 30719.97, 'name': 'Agilent Technologies Inc', 'phone': '14083458886', 'shareOutstanding': 308.309635, 'ticker': 'A', 'weburl': 'https://www.agilent.com/'},
... {'country': 'US', 'currency': 'USD', 'exchange': 'NEW YORK STOCK EXCHANGE, INC.', 'finnhubIndustry': 'Metals & Mining', 'ipo': '2016-10-18', 'logo': '', 'marketCapitalization': 2727.509, 'name': 'Alcoa Corp', 'phone': '14123152900', 'shareOutstanding': 185.924291, 'ticker': 'AA', 'weburl': 'https://www.alcoa.com/global/en/home.asp'},
... {'country': 'CN', 'currency': 'CNY', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'Diversified Consumer Services', 'ipo': '2008-01-29', 'logo': '', 'marketCapitalization': 35.81037, 'name': 'ATA Creativity Global', 'phone': '861065181133', 'shareOutstanding': 47.592384, 'ticker': 'AACG', 'weburl': 'http://www.ata.net.cn'},
... {'country': 'US', 'currency': 'USD', 'exchange': 'NASDAQ NMS - GLOBAL MARKET', 'finnhubIndustry': 'N/A', 'ipo': '', 'logo': '', 'marketCapitalization': 738.99, 'name': 'Artius Acquisition Inc', 'phone': '12123097668', 'shareOutstanding': 87.54375, 'ticker': 'AACQU', 'weburl': ''}
... ]
>>> pandas.DataFrame(x)
  country currency  ... ticker                                    weburl
0      US      USD  ...      A                  https://www.agilent.com/
1      US      USD  ...     AA  https://www.alcoa.com/global/en/home.asp
2      CN      CNY  ...   AACG                     http://www.ata.net.cn
3      US      USD  ...  AACQU                                          

[4 rows x 12 columns]
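Putting the pieces together for the original goal (ticker as index, then a CSV file), a minimal sketch might look like the following. `fetch_profile` and `SAMPLE_RESPONSES` are hypothetical stand-ins for the real API call, which is not shown in the question, and the data is trimmed to a few fields.

```python
import pandas

# Stand-in data: two of the responses from the question, trimmed to a few fields.
SAMPLE_RESPONSES = {
    "A": {"country": "US", "currency": "USD", "name": "Agilent Technologies Inc", "ticker": "A"},
    "AA": {"country": "US", "currency": "USD", "name": "Alcoa Corp", "ticker": "AA"},
}


def fetch_profile(symbol):
    """Hypothetical helper; real code would wrap something like requests.get(url).json()."""
    return SAMPLE_RESPONSES[symbol]


# Collect one dict per API call, then build the frame once at the end.
rows = [fetch_profile(symbol) for symbol in ["A", "AA"]]
df = pandas.DataFrame(rows).set_index("ticker")

# With no path, to_csv returns the CSV text; pass a filename to write a file instead.
csv_text = df.to_csv()
print(csv_text)
```

Building the list first and creating the DataFrame once is usually simpler than appending to a DataFrame inside the loop.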
