
A dictionary contains nested dictionaries, and I want to convert it to a DataFrame in Python so that the table has columns with sub-columns

Data = [{'endDate': {'raw': 1585612800, 'fmt': '2020-03-31'},
         'totalRevenue': {'raw': 67985000, 'fmt': '67.98M', 'longFmt': '67,985,000'},
         'costOfRevenue': {'raw': 0, 'fmt': None, 'longFmt': '0'},
         'grossProfit': {'raw': 67985000, 'fmt': '67.98M', 'longFmt': '67,985,000'},
         'sellingGeneralAdministrative': {'raw': 37779000, 'fmt': '37.78M'}},
        {'endDate': {'raw': 1577750400, 'fmt': '2019-12-31'},
         'totalRevenue': {'raw': 79115000, 'fmt': '79.11M', 'longFmt': '79,115,000'},
         'costOfRevenue': {'raw': 0, 'fmt': None, 'longFmt': '0'},
         'grossProfit': {'raw': 79115000, 'fmt': '79.11M', 'longFmt': '79,115,000'},
         'sellingGeneralAdministrative': {'raw': 36792000, 'fmt': '36.79M',
                                          'longFmt': '36,792,000'}}]
 

I want Data in this format:

Data = [{'endDate': {'fmt': '2020-03-31'},
         'totalRevenue': {'fmt': '67.98M'},
         'costOfRevenue': {'fmt': None},
         # ... and so on
         }]

That is, I want to remove 'raw' and 'longFmt', and after that convert the list of dicts to a DataFrame.

Here is what you can do to convert multiple dictionaries like that into a DataFrame:

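import pandas as pd

a = {...}  # first statement dict
b = {...}  # second statement dict

c = [a, b]
# One empty list per column to collect
f = {'grossProfit': [], 'incomeBeforeTax': [], 'incomeTaxExpense': []}
for e in c:
    for k in f:
        # Append this dict's value for column k
        f[k].append(e[k])

print(pd.DataFrame(f))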
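
For the data in the question specifically, a minimal sketch along the same lines could keep only the 'fmt' entry of each inner dict before building the DataFrame. The helper name `keep_fmt` is hypothetical, and `data` is a trimmed copy of the question's records (two fields each, for brevity). Note that each cell ends up holding the 'fmt' string directly rather than a one-key dict.

```python
import pandas as pd

# Trimmed copy of the question's data (assumption: two fields are enough to illustrate)
data = [
    {'endDate': {'raw': 1585612800, 'fmt': '2020-03-31'},
     'totalRevenue': {'raw': 67985000, 'fmt': '67.98M', 'longFmt': '67,985,000'}},
    {'endDate': {'raw': 1577750400, 'fmt': '2019-12-31'},
     'totalRevenue': {'raw': 79115000, 'fmt': '79.11M', 'longFmt': '79,115,000'}},
]

def keep_fmt(record):
    """Hypothetical helper: keep only the 'fmt' value of each inner dict."""
    # .get() returns None when an inner dict has no 'fmt' key
    return {key: inner.get('fmt') for key, inner in record.items()}

# One row per record, one column per top-level key
fmt_df = pd.DataFrame([keep_fmt(record) for record in data])
print(fmt_df)
```

Using `.get('fmt')` rather than `['fmt']` is a design choice so that a record with a missing 'fmt' produces an empty cell instead of a KeyError.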

pandas doesn't really support "sub-columns" in the way you seem to be asking for. It does, though, support flattening JSON-like objects so that {'a': {'b': 'value'}} gives you a column a.b = 'value'. The official function for this is pd.json_normalize, and it would be used like this:

import pandas as pd

income_statement_history = {
    "totalRevenue": {
        "raw": 67985000,
        "fmt": "67.98M",
        "longFmt": "67,985,000"
    },
    "costOfRevenue": {
        "raw": 0,
        "fmt": 'null',
        "longFmt": "0"
    },
    "grossProfit": {
        "raw": 67985000,
        "fmt": "67.98M",
        "longFmt": "67,985,000"
    },
    "totalOperatingExpenses": {
        "raw": 46790000,
        "fmt": "46.79M",
        "longFmt": "46,790,000"
    },
    "operatingIncome": {
        "raw": 21195000,
        "fmt": "21.2M",
        "longFmt": "21,195,000"
    }
}

df = pd.json_normalize(income_statement_history)

Printing df would then give you:

>>> df
   totalRevenue.raw totalRevenue.fmt totalRevenue.longFmt  costOfRevenue.raw costOfRevenue.fmt  ... totalOperatingExpenses.fmt  totalOperatingExpenses.longFmt operatingIncome.raw operatingIncome.fmt  operatingIncome.longFmt     
0          67985000           67.98M           67,985,000                  0              null  ...                     46.79M                      46,790,000            21195000               21.2M               21,195,000     

[1 rows x 15 columns]

You could then access those column values dynamically with:

>>> col = 'totalOperatingExpenses'
>>> subcol = 'longFmt'
>>> df[f'{col}.{subcol}']
0    46,790,000
Name: totalOperatingExpenses.longFmt, dtype: object
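
The same call also accepts a list of records like the one in the question, giving one row per record. As a rough sketch, assuming you only want the 'fmt' values, you could then keep just the columns whose name ends in '.fmt'. The `records` list below is a trimmed, illustrative copy of the question's data, and `flat` and `fmt_only` are names made up for this example.

```python
import pandas as pd

# Trimmed copy of the question's list of dicts (assumption: two fields per record)
records = [
    {'endDate': {'raw': 1585612800, 'fmt': '2020-03-31'},
     'grossProfit': {'raw': 67985000, 'fmt': '67.98M', 'longFmt': '67,985,000'}},
    {'endDate': {'raw': 1577750400, 'fmt': '2019-12-31'},
     'grossProfit': {'raw': 79115000, 'fmt': '79.11M', 'longFmt': '79,115,000'}},
]

# One row per record; nested keys become dotted column names such as 'endDate.fmt'
flat = pd.json_normalize(records)

# Keep only the columns whose name ends in '.fmt'
fmt_only = flat.filter(regex=r'\.fmt$')
print(fmt_only)
```

Filtering after flattening leaves the original records untouched, so the 'raw' and 'longFmt' columns are still available in `flat` if they are needed later.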

Choosing between this, a plain pd.DataFrame initialization as @Ann Zen's answer suggests, or whatever method you've been using depends on your exact need.

Is your goal a visually pleasing arrangement of columns based on JSON data? Is your goal a clear way of accessing a sub-column given its name and the name of the base column? Most answers I can think of are based on preference only, and the differences are minimal.
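
If the second goal is the one that matters, the closest thing to sub-columns that I'm aware of in pandas is a two-level column index (a MultiIndex). The following is only a sketch of that alternative, not something the answers above use: it splits the dotted names produced by json_normalize into (column, sub-column) pairs. The `records` list is a trimmed, illustrative copy of the question's data, and the sketch assumes every flattened name contains exactly one dot.

```python
import pandas as pd

# Trimmed copy of the question's list of dicts (assumption: two fields per record)
records = [
    {'endDate': {'raw': 1585612800, 'fmt': '2020-03-31'},
     'grossProfit': {'raw': 67985000, 'fmt': '67.98M', 'longFmt': '67,985,000'}},
    {'endDate': {'raw': 1577750400, 'fmt': '2019-12-31'},
     'grossProfit': {'raw': 79115000, 'fmt': '79.11M', 'longFmt': '79,115,000'}},
]

flat = pd.json_normalize(records)

# Turn 'grossProfit.fmt' into the pair ('grossProfit', 'fmt') for every column
flat.columns = pd.MultiIndex.from_tuples(
    [tuple(name.split('.')) for name in flat.columns]
)

# Selecting a base column returns all of its sub-columns as a DataFrame
print(flat['grossProfit'])

# Selecting a (column, sub-column) pair returns a single Series
print(flat[('grossProfit', 'fmt')])
```

Whether this reads better than the dotted names is, again, mostly a matter of preference.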
