簡體   English   中英

將嵌套的 json 轉換為特定格式的 pandas dataframe

[英]convert nested json to pandas dataframe of specific format

我從數據庫中提取了這個嵌套的 json 數據,我認為這是一個字典列表。 (我不確定,我是 python 新手)

我嘗試了許多關於堆棧溢出的代碼,但沒有一個解決了我的特定問題,我總是遇到錯誤......

數據比較大,總共有100多個usageId。 我只顯示第一個,它看起來像這樣:

[{'usageId': 'e83f43f8-ec4a-402d-a64e-d74b6f1df4a7',
  'assessment_status_date': '2022-03-28',
  'assessment_date': '2020-12-07',
  'usage_assessment': 'Level 1',
  'has_l3test': None,
  'compensating_control': None,
  'recommendations': None,
  'test_category': {'Usage Reconciliation': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Health and Welfare Plan + Usage Reconciliation': {'evidence_capture': None,
      'test_result_justification': 'Test out of scope',
      'latest_test_result_date': '2019-10-02',
      'last_updated_by': None,
      'test_execution_status': 'In Progress',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Data Agreements': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Health and Welfare Plan + Data Agreements': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2019-10-02',
      'last_updated_by': None,
      'test_execution_status': 'In Progress',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Data Elements': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Health and Welfare Plan + Data Elements': {'evidence_capture': None,
      'test_result_justification': 'The rationale provided for why the Usage contains no HPDEs appears valid',
      'latest_test_result_date': '2019-10-02',
      'last_updated_by': None,
      'test_execution_status': 'In Progress',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Computations': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Health and Welfare Plan + Computations': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Lineage': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Health and Welfare Plan + Lineage': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Metadata': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Health and Welfare Plan + Metadata': {'evidence_capture': None,
      'test_result_justification': 'Valid',
      'latest_test_result_date': '2020-07-02',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Data Quality Monitoring': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Health and Welfare Plan + Data Quality Monitoring': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2019-08-09',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'HPU Source Reliability': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Health and Welfare Plan + HPU Source Reliability': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2019-10-02',
      'last_updated_by': None,
      'test_execution_status': 'In Progress',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Change Notification': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Health and Welfare Plan + Change Notification': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None}},
  'assessment_status': 'In Progress',
  'recommendation_indicator': None,
  'assessment_justification': None,
  'revalidation_justification': None},
 {'usageId': 'b3c9cbbd-fb72-46df-a4a3-6dd1e1edce64',
  'assessment_status_date': '2022-03-28',
  'assessment_date': '2020-12-07',
  'usage_assessment': 'Level 1',
  'has_l3test': None,
  'compensating_control': None,
  'recommendations': None,
  'test_category': {'Usage Reconciliation': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'New or Changed Usage Reconciles with Prior Usage': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2020-10-23',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Data Agreements': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Data Agreement Reviewed and Approved in Last Year': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2020-07-21',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Data Elements': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'HPDEs Identified': {'evidence_capture': None,
      'test_result_justification': 'Valid',
      'latest_test_result_date': '2020-07-02',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'},
     'HPDE Justification is Documented and Reasonable': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2020-02-28',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'},
     'HPDE Identification Rationale is Valid': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2020-02-28',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'},
     'Usage Output is Documented and Metadata is Registered': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2020-08-07',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'},
     'Data Element Metadata is in Curated State': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': None,
      'last_updated_by': 'tat00000',
      'test_execution_status': 'In Progress',
      'test_result': None},
     'Secured Data Indicator Consistency': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None},
     'HPDE Metadata is in Curated State': {'evidence_capture': None,
      'test_result_justification': 'Valid',
      'latest_test_result_date': '2020-07-02',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Computations': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Usage Outcome is Accurate': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Lineage': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Lineage is Accurate Reflection of Run': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None},
     'Partial Lineage from Authoritiative or Acceptable Source to Originating Source Exists and is Reasonable': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None},
     'Lineage from Usage to Authoritative or Acceptable Source is Complete': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None},
     'Quality is Sufficient': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Metadata': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Usage Description is Valid': {'evidence_capture': None,
      'test_result_justification': 'xx',
      'latest_test_result_date': '2020-02-28',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'},
     "Usage SME's EID Status is Valid": {'evidence_capture': None,
      'test_result_justification': 'Valid',
      'latest_test_result_date': '2020-07-02',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'},
     'Usage AE': {'evidence_capture': None,
      'test_result_justification': 'Valid',
      'latest_test_result_date': '2020-07-02',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Data Quality Monitoring': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Data Defect Tracking Process is Reasonable': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2020-02-28',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'},
     'Data Movement is Reasonable': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2020-10-23',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Fail'},
     'DQ Rules and Thresholds are Comprehensive': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'In Progress',
      'test_result': 'Fail'},
     'Data Defect Tracking Process is Operating Effectively': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None},
     'DQ Monitoring Plan is Reasonable': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2020-10-22',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'HPU Source Reliability': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Usage Consumes from Acceptable Source': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2020-10-23',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Fail'},
     'Usage Consumes from Approved Source': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Change Notification': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Change Notification Process is Reasonable': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2020-07-21',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'},
     'Change Notification Process is Operating Effectively': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None}},
  'assessment_status': 'In Progress',
  'recommendation_indicator': None,
  'assessment_justification': None,
  'revalidation_justification': None},
 {'usageId': 'c67a1567-2de3-4826-97bb-99838b405acd',
  'assessment_status_date': '2022-03-28',
  'assessment_date': '2020-12-07',
  'usage_assessment': 'Level 1',
  'has_l3test': None,
  'compensating_control': None,
  'recommendations': None,
  'test_category': {'Usage Reconciliation': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Provided Health Insur Offer & Coverage Information Returns (1094C) + Usage Reconciliation': {'evidence_capture': None,
      'test_result_justification': 'Test out of scope',
      'latest_test_result_date': '2019-10-02',
      'last_updated_by': None,
      'test_execution_status': 'In Progress',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Data Agreements': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Provided Health Insur Offer & Coverage Information Returns (1094C) + Data Agreements': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2019-10-21',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Data Elements': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Provided Health Insur Offer & Coverage Information Returns (1094C) + Data Elements': {'evidence_capture': None,
      'test_result_justification': 'Valid',
      'latest_test_result_date': '2020-07-02',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Computations': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Provided Health Insur Offer & Coverage Information Returns (1094C) + Computations': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Lineage': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Provided Health Insur Offer & Coverage Information Returns (1094C) + Lineage': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Metadata': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Provided Health Insur Offer & Coverage Information Returns (1094C) + Metadata': {'evidence_capture': None,
      'test_result_justification': 'Valid',
      'latest_test_result_date': '2020-07-02',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Data Quality Monitoring': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Provided Health Insur Offer & Coverage Information Returns (1094C) + Data Quality Monitoring': {'evidence_capture': None,
      'test_result_justification': 'xxx',
      'latest_test_result_date': '2019-10-21',
      'last_updated_by': None,
      'test_execution_status': 'Completed',
      'test_result': 'Pass without Compensating Controls'}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'HPU Source Reliability': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Provided Health Insur Offer & Coverage Information Returns (1094C) + HPU Source Reliability': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None},
   'Change Notification': {'test_category_assessment_date': None,
    'last_updated_by': None,
    'test_execution': {'Provided Health Insur Offer & Coverage Information Returns (1094C) + Change Notification': {'evidence_capture': None,
      'test_result_justification': None,
      'latest_test_result_date': None,
      'last_updated_by': None,
      'test_execution_status': 'Not Started',
      'test_result': None}},
    'test_category_assessment': None,
    'test_category_status': None,
    'test_category_assessment_justification': None}},
  'assessment_status': 'In Progress',
  'recommendation_indicator': None,
  'assessment_justification': None,
  'revalidation_justification': None}]

我想將其轉換為 2 個表。 我在 excel 中創建了 2 個表,它會是這樣的。

第一張桌子

第二張桌子

如果我問了一個愚蠢的問題或我的問題格式,我深表歉意。

我做了以下處理步驟,最后得到了帶有相關列的df4 ,請看一下:

import pandas as pd

null = 'null'
data  = {
  "usageId": "1",
  "assessment_status_date": "2022-03-28",
  "assessment_date": "2020-12-07",
  "usage_assessment": "Lev...<truncated>   }

df1 = pd.DataFrame(d)
for usageId in set(df1['usageId']):    #total more than 100 usageId
    df2 = df1[df1['usageId'] == usageId]    #filter based on each usageId
    test_category = list(df2['test_category'][0]['test_execution'].keys())[0].split(' + ')[0]
    df3 = pd.DataFrame(pd.DataFrame(df2['test_category'][0]['test_execution']))
    for i in range(1, len(df2)):
        df3 = pd.concat([df3, pd.DataFrame(df1['test_category'][i]['test_execution'])], axis=1)
    df3.columns = [col.split(' + ')[1] for col in df3.columns]
    df3 = df3.T
    df3['test_category'] = test_category
df4 = pd.concat([df3, df1.drop(columns=['test_category'])], axis=1)
print(df4)

Output:

                        evidence_capture last_updated_by latest_test_result_date test_execution_status test_result test_result_justification            test_category usageId assessment_status_date assessment_date usage_assessment has_l3test compensating_control recommendations assessment_status recommendation_indicator assessment_justification revalidation_justification  
Change Notification                 null            null                    null           Not Started        null                      null  Health and Welfare Plan       1             2022-03-28      2020-12-07          Level 1       null                 null            null       In Progress                     null                     null                       null  
Computations                        null            null                    null           Not Started        null                      null  Health and Welfare Plan       1             2022-03-28      2020-12-07          Level 1       null                 null            null       In Progress                     null                     null                       null  
Data Agreements                     null            null              2019-10-02           In Progress        null         Due to the Policy  Health and Welfare Plan       1             2022-03-28      2020-12-07          Level 1       null                 null            null       In Progress                     null                     null                       null  
Data Elements                       null            null              2019-10-02           In Progress        null                       xxx  Health and Welfare Plan       1             2022-03-28      2020-12-07          Level 1       null                 null            null       In Progress                     null                     null                       null  
Data Quality Monitoring             null            null              2019-08-09             Completed         xxx                       xxx  Health and Welfare Plan       1             2022-03-28      2020-12-07          Level 1       null                 null            null       In Progress                     null                     null                       null  
HPU Source Reliability              null            null              2019-10-02           In Progress        null                      xxx.  Health and Welfare Plan       1             2022-03-28      2020-12-07          Level 1       null                 null            null       In Progress                     null                     null                       null  
Lineage                             null            null                    null           Not Started        null                      null  Health and Welfare Plan       1             2022-03-28      2020-12-07          Level 1       null                 null            null       In Progress                     null                     null                       null  
Metadata                            null            null              2020-07-02             Completed         xxx                     Valid  Health and Welfare Plan       1             2022-03-28      2020-12-07          Level 1       null                 null            null       In Progress                     null                     null                       null  
Usage Reconciliation                null            null              2019-10-02           In Progress        null         Test out of scope  Health and Welfare Plan       1             2022-03-28      2020-12-07          Level 1       null                 null            null       In Progress                     null                     null                       null  

我終於用下面的代碼解決了

for usage_json in assessment_json:
    if 'test_category' in usage_json:
        for tc in usage_json['test_category'].keys():
            if 'test_execution' in usage_json['test_category'][tc]:
                sub_te_df = pd.DataFrame(usage_json['test_category'][tc]['test_execution']).T
                sub_te_df.reset_index(inplace=True)
                sub_te_df = sub_te_df.rename(columns = {'index':'test_execution'})
                sub_te_df['usageId'] = usage_json['usageId']
                sub_te_df['test_category'] = tc
                usage_te_df_list.append(sub_te_df)

df_execution = pd.concat(usage_te_df_list).reset_index(drop=True)

暫無
暫無

聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.

 
粵ICP備18138465號  © 2020-2024 STACKOOM.COM